Does anybody remember MetaCreations Canoma, released in 1999?
It also worked with only one photo.
Extending this to use known third-party geometries of identifiable objects, instead of reconstructing them by hand, seems like a very logical step in retrospect.
As cited in the paper and by Canoma, this 1996 paper by Paul Debevec is really where it all started: Modeling and Rendering Architecture from Photographs: A hybrid geometry- and image-based approach
Darn, it looks like it won't be long before photo editing software can
(1) Find stock models for all objects in a scene
(2) Align them perfectly
(3) Let you manipulate them arbitrarily
(4) Render an output picture with all the changes applied that is virtually indistinguishable from a real photograph.
Once this happens (and it doesn't look like it'll take long) photography will no longer be an accurate reference for knowledge about the real world.
Still not there for me. The devil is in the detail - the sauce and the strawberries in the food picture, the books in the desk picture. Also, I still haven't seen a rendering of a human being that looks genuinely photorealistic.
When I was a kid, I loved that technology was pushing towards this point. Now it scares me. Maybe I'm getting old...
I stand by my point that what I've mentioned looks fake (especially the strawberries and the sauce), but I think that would be a great exercise. Your brain believes what it expects: I had the expectation that these were renderings, so it was easy to pick out flaws.
They are very good. Knowing they're renderings, though, you can see it in the face and hair.
DanBC made an interesting comment: it would be interesting to see renderings like these in a double-blind test against photographs and see how well they stack up.
Those are pictures of an adult, not a child. When you use the word "girl" to describe an adult woman you're implicitly belittling her. Don't be that guy.
> Those are pictures of an adult, not a child. When you use the word "girl" to describe an adult woman you're implicitly belittling her. Don't be that guy.
Sorry, I'm not a native English speaker (so I'm not confident enough to downvote or anything), but judging this use of “girl” as the female version of “boy”, while ignoring the overall mode of expression, doesn't seem adequate to me. I would have no objections if thomaseng's comment were more formal:
> I would consider these pictures of a girl quite lifelike
But it's not.
If I were the author, and the pictures were of a man, I'd totally say “guy”. Once you flip the gender, “guy” seems to become “girl”, not “woman”. (Again, given the overall informal style used.)
And as for the word “guy”, it doesn't sound in any way belittling to a grown-up man (and you just used it yourself).
I would say that the male equivalent of "girl" is "boy", not "guy".
The English language is often unhelpful in that exact equivalents of the word you want that exist for one gender don't exist for the other, or else carry other connotations. Master vs Mistress for instance.
The real-time V-Ray raytracing is slow because it is very high quality -- it is doing real global illumination with no precalculations, favoring quality over speed.
It would be cool if the position and rotation would be synchronized between the two, so I could choose a nice view in the Real-Time tab and switch over to Photo Realistic to get the raytraced version.
It works that way in the editor (although you need a Clara.io account and you need to be logged in) if you set up multiple viewports with the same camera.
You just described the holy grail of 3D vision, AI, and robotics. Being able to make sense of the world from a 2D image is one of the cornerstones of building an artificial brain.
Reminds me of the Running Man (1987) scene where, in supposed real-time, a video production editor synthetically composes Arnold Schwarzenegger's and Jesse Ventura's characters together in a deathmatch. One would have to go from rigid-component origami birds on static frames in this CMU paper to semi-solid human figures on moving frames in the movie. 3D models of famous actors' bodies are already made for special effects, painstakingly rendered and composited together in batch mode.
(Personal recollection: there was a solid model of Shaq's head at the 3D modeling company Viewpoint Datalabs back in the day. His head is huge.)
This is a neat approach. Basically it is a combination of:
(1) Fitting 3D stock models to objects in existing photographs using a simple but interactive ray-casting approach.
(2) Estimating soft lighting on objects fairly convincingly.
(3) Re-rendering the stock models using the artificial lighting and textures of the original photographs.
There are real limitations to this, but I think the automated lighting estimate alone is cool and has wide applications in the visual-effects space.
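The lighting-estimation step above can be sketched in miniature. This is not the paper's implementation: it assumes a toy order-1 spherical-harmonics shading model, L(n) = c0 + c1*nx + c2*ny + c3*nz, and recovers the four coefficients from observed shading by least squares. All names here are illustrative.

```python
import random

def estimate_lighting(normals, intensities):
    """Fit L(n) = c0 + c1*nx + c2*ny + c3*nz to observed shading by
    solving the 4x4 normal equations (A^T A) c = A^T b."""
    rows = [[1.0, nx, ny, nz] for (nx, ny, nz) in normals]
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(4)] for i in range(4)]
    atb = [sum(r[i] * b for r, b in zip(rows, intensities)) for i in range(4)]
    # Gaussian elimination with partial pivoting.
    for col in range(4):
        piv = max(range(col, 4), key=lambda r: abs(ata[r][col]))
        ata[col], ata[piv] = ata[piv], ata[col]
        atb[col], atb[piv] = atb[piv], atb[col]
        for r in range(col + 1, 4):
            f = ata[r][col] / ata[col][col]
            for c in range(col, 4):
                ata[r][c] -= f * ata[col][c]
            atb[r] -= f * atb[col]
    # Back substitution on the upper-triangular system.
    coeffs = [0.0] * 4
    for r in range(3, -1, -1):
        s = atb[r] - sum(ata[r][c] * coeffs[c] for c in range(r + 1, 4))
        coeffs[r] = s / ata[r][r]
    return coeffs

# Synthetic check: shade random unit normals with known lighting, recover it.
true_c = [0.6, 0.3, -0.2, 0.5]
random.seed(1)
normals = []
for _ in range(200):
    x, y, z = (random.uniform(-1, 1) for _ in range(3))
    m = (x * x + y * y + z * z) ** 0.5 or 1.0
    normals.append((x / m, y / m, z / m))
intensities = [true_c[0] + true_c[1] * nx + true_c[2] * ny + true_c[3] * nz
               for (nx, ny, nz) in normals]
recovered = estimate_lighting(normals, intensities)
```

On noiseless synthetic data the coefficients come back essentially exactly; real photos would add noise, shadows, and interreflections, which is where the paper's method has to work much harder.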
They never seem to mention that in the paper, at least not prominently (admittedly, I only skimmed it today). But Photoshop already has a built-in tool for this, so I guess they can just use the standard methods, which seem to work fairly well.
Judging from the YouTube videos, the novel part is that they can fill in the parts of the object that are occluded in the photo (either using textures from the 3D model, or by using InPaint), because they refer to earlier work that already lets you cut out and manipulate objects using 3D models.
This is very impressive, but were the fingers behind the paper crane drawn in by hand? I don't see how any algorithm could create that kind of content.
I'd really like to see a video of someone starting with an image and using these algorithms and tools to create one of these effects from start to finish.
In the full paper they say "We use a separately captured background photograph for the chair, while for all other photos, we fill the background using Context-Aware Fill in Photoshop."
So I think the fingers indeed were filled in algorithmically. This is plausible since, as best I can tell, current Content-Aware Fill algorithms are based on magic.
As per Wikipedia: imagination, also called the faculty of imagining, is the ability to form new images and sensations that are not perceived through senses such as sight or hearing. Imagination is magic. Everyone knows that. So any generative model is too, in general ;)
If you like this kind of effect, you should also check out VideoCopilot because inserting 3D objects on top of reference images or video is a recurring use of After Effects (it even ships with a lite version of Cinema4D now).
This and Photoshop's Content-Aware Fill (to help fill the holes left by removing the object from the reference image) are very handy for achieving such effects.
Take this approach, geared towards pre-made 2D still imagery, and apply it to stills rendered from a 3D model, and some serious MAGIC can take place!
In this scenario you already have all the 3D elements in hand, so there's no need to search for them, and you have the complete environment as well. Lots of things that used to call for re-rendering could be done with this approach post-render.
We need a way to do digital signatures on images such that they cannot be faked. It should verify the image, location, time, and serial number. I know, this seems impossible since someone (the camera) needs to know the private key and that could be compromised.
I've thought about this before, and it is actually pretty easy. You just take a hash of the image and push it into the Bitcoin blockchain as a transaction. Done. The only downsides are that it costs a wee bit of money and that you need to be connected to the internet at the point when you shoot the image (or at the point when you want the image verified). See:
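The hashing half of that scheme is a one-liner with the standard library. The blockchain transaction that would carry the digest (typically via an OP_RETURN output) is omitted here; the bytes below are placeholders, not a real photo.

```python
import hashlib

def image_fingerprint(image_bytes: bytes) -> str:
    """SHA-256 digest of the raw image file. This 32-byte value is what
    you would embed in a blockchain transaction to prove the image
    existed no later than that block's timestamp."""
    return hashlib.sha256(image_bytes).hexdigest()

# Changing even one byte yields a completely different digest.
original = b"\x89PNG...placeholder bytes standing in for a real photo..."
tampered = original + b"\x00"
print(image_fingerprint(original) == image_fingerprint(tampered))  # False
```

Note this only proves the image existed at anchoring time; as pointed out below, it says nothing about what happened to the image before the hash was taken.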
A composite solution:
The camera's chip produces two signatures for each image: one fairly resistant to cropping, color correction, rotation, etc., while still resistant to forging; and a second, pixel-perfect one. A photographer who wishes to be able to claim authenticity uploads the photos to the camera maker's website, which verifies the authenticity of both signatures and publishes both in a secure database indexed by the "editable" signature (and his name, if he wants proof of authorship).
Now, if the author wants to claim authenticity or ownership of a picture, he just has to present the original picture so that people can attest that it is not significantly modified and/or that he is indeed the author.
Of course, reading the private keys on the chip has to be very hard.
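A toy version of the in-camera signing step, with every name and key here an assumption for illustration. Python's standard library has no asymmetric signatures, so HMAC stands in for what a real scheme would do with something like ECDSA (so that verification wouldn't require sharing the secret):

```python
import hashlib
import hmac
import json

# Stand-in for the per-camera private key burned into the chip.
# A real design would use an asymmetric key pair; HMAC is used here
# only because it ships with the standard library.
CAMERA_KEY = b"hypothetical-secret-inside-the-camera-chip"

def sign_capture(image_bytes: bytes, serial: str, timestamp: str, gps: str) -> str:
    """Bind the pixel data to camera serial, time, and location in one MAC."""
    digest = hashlib.sha256(image_bytes).hexdigest()
    record = json.dumps({"sha256": digest, "serial": serial,
                         "time": timestamp, "gps": gps}, sort_keys=True)
    return hmac.new(CAMERA_KEY, record.encode(), hashlib.sha256).hexdigest()

def verify_capture(image_bytes, serial, timestamp, gps, signature) -> bool:
    """Recompute the MAC and compare in constant time."""
    return hmac.compare_digest(
        sign_capture(image_bytes, serial, timestamp, gps), signature)

sig = sign_capture(b"raw pixels", "SN-1234",
                   "2014-08-05T12:00:00Z", "40.44,-79.94")
print(verify_capture(b"raw pixels", "SN-1234",
                     "2014-08-05T12:00:00Z", "40.44,-79.94", sig))    # True
print(verify_capture(b"edited pixels", "SN-1234",
                     "2014-08-05T12:00:00Z", "40.44,-79.94", sig))    # False
```

Binding the serial number and timestamp into the signed record is what makes the "photo of a printed fake" attack below at least date-detectable.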
Oh right, I had only thought about proving that an image hasn't been tampered with after a certain date, but of course it could have been tampered with before that.
> We need a way to do digital signatures on images such that they cannot be faked.
This is almost impossible – the camera's processor can be tampered with and the environment can be altered (e.g. GPS spoofing).
The best we could do would be a notary service where a trusted third party produces a signature for a set of bits at a particular time. That would prevent someone from altering an unwilling third party's photos, or from back-dating images after an important event.
All of these are unreliable to an extent that suggests there will probably be a fair market for forensic photography software in the future…
You'd need to put the key into the camera's processor chip, where it would be hard (but not impossible) to compromise. A minor problem is that any cropping, gamma correction, etc would invalidate the signature. A bigger problem I see is that you could print the fake scene on a big piece of paper, take a picture of that, and the digital signature would be totally valid.
>> A bigger problem I see is that you could print the fake scene on a big piece of paper, take a picture of that, and the digital signature would be totally valid.
That's why I want the date and camera SN to be authenticated as well. A photo of an actual event could be shown to have the correct date, while a photo of an altered print would have a later date.
A mask in photo editing is usually a grayscale image that is white where you want to remove things, black where you want to keep things, and often in-between where you want soft edges.
To create these in Photoshop you can use the magic wand tool, which selects things with similar colors. But you can create these types of masks in a variety of ways.
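A minimal sketch of such a mask, built by hand rather than with any Photoshop tool: a white "remove" disc with a linear falloff to a black "keep" background (the function name and parameters are made up for the example):

```python
def soft_circle_mask(width, height, cx, cy, inner_r, outer_r):
    """Grayscale mask as a 2D list: 255 ("remove") inside inner_r,
    0 ("keep") beyond outer_r, and a linear ramp in between,
    which gives the soft edge described above."""
    mask = []
    for y in range(height):
        row = []
        for x in range(width):
            d = ((x - cx) ** 2 + (y - cy) ** 2) ** 0.5
            if d <= inner_r:
                v = 255
            elif d >= outer_r:
                v = 0
            else:  # soft edge: ramp from 255 down to 0
                v = round(255 * (outer_r - d) / (outer_r - inner_r))
            row.append(v)
        mask.append(row)
    return mask

m = soft_circle_mask(9, 9, 4, 4, 1.0, 4.0)
print(m[4][4])  # 255: the center is fully "remove"
print(m[0][0])  # 0: the corner is fully "keep"
```

Real tools build the same kind of array from a magic-wand selection plus feathering; only the way the values are chosen differs.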
That means the user marks the boundary of the object to be manipulated, and the rest of the image (including the object's shadow) is treated as the background. This separation is called a mask in image-processing lingo.
They fill hidden parts according to symmetries they find in the object's geometry, or they make use of the model's textures or user-defined input if there is no symmetry.
> For areas of the object that do not satisfy the criteria of geometric symmetry and appearance similarity, such as the underside of the taxi cab in Figure 1, the assignment defaults to the stock model appearance. The assignment also defaults to the stock model appearance when after several iterations, the remaining parts of the object are partitioned into several small areas where the object lacks structural symmetries relative to the visible areas. In this case, we allow the user to fill the appearance in these areas on the texture map of the 3D model using PatchMatch.
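The fallback logic in that passage can be caricatured on a flat 2D texture grid. This is an illustration of the idea only, not the paper's actual pipeline: the vertical mirror axis, the grid representation, and all names are assumptions, and the PatchMatch user-fill step is reduced to a plain stock-texture lookup.

```python
def fill_by_symmetry(texture, stock):
    """Fill unknown texels (None) with the value mirrored across the
    vertical symmetry axis; where the mirror texel is also unknown,
    fall back to the stock-model texture."""
    h, w = len(texture), len(texture[0])
    out = [row[:] for row in texture]  # don't mutate the input
    for y in range(h):
        for x in range(w):
            if out[y][x] is None:
                mirror = texture[y][w - 1 - x]
                out[y][x] = mirror if mirror is not None else stock[y][x]
    return out

# A 2x4 texture where None marks texels occluded in the photo.
observed = [[10, 20, None, None],
            [30, None, None, 40]]
stock = [[1, 2, 3, 4],
         [5, 6, 7, 8]]
filled = fill_by_symmetry(observed, stock)
# filled == [[10, 20, 20, 10], [30, 6, 7, 40]]:
# the top row is completed by its mirror; the middle of the bottom row
# has no visible mirror, so it falls back to the stock values.
```

The real method additionally checks appearance similarity before trusting a symmetry, which is what keeps, say, a taxi's lettered door from being mirrored onto the blank one.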
http://www.canoma.com/
http://digitalurban.blogspot.de/2006/12/great-software-from-...
http://www.pauldebevec.com/Research/debevec-csd-96-893.pdf
The video is still very impressive: https://www.youtube.com/watch?v=RPhGEiM_6lM