Hash, Inc. - Animation:Master

Tech Watch: Automated Obstruction Removal from Images


Rodney


  • Admin

This is pretty neat technology that will have many applications.

In a way it's not unlike the very different technology of 'seam stitching', though it takes a different approach and serves different applications.

 

Article:

http://www.npr.org/sections/thetwo-way/2015/08/05/429720274/obstructions-vanish-from-images-treated-with-new-software-from-mit-google

 

Video:

 

https://www.youtube.com/watch?v=xoyNiatRIh4

 

Of specific interest (and a bit scary... though sure to be useful for revealing who took a picture or video) is the ability to pull the obstruction out of the image as a separate alpha matte.
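For the curious, here's a minimal sketch of the kind of layered compositing model an alpha matte implies. This is my own illustration under a simple "obstruction over background" assumption; the function names and the single-frame inversion are invented, not the MIT/Google method itself:

```python
import numpy as np

def composite(background, obstruction, alpha):
    # What the camera records: the background seen through (or partly
    # covered by) an obstruction layer. All arrays are float images in
    # [0, 1]; alpha is the obstruction's per-pixel opacity.
    return alpha * obstruction + (1.0 - alpha) * background

def recover_background(observed, obstruction, alpha, eps=1e-6):
    # Invert the composite where the obstruction isn't fully opaque.
    # Pixels with alpha near 1 can't be recovered from a single frame,
    # which is why the technique leans on several frames with parallax.
    return (observed - alpha * obstruction) / np.maximum(1.0 - alpha, eps)
```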

 

For those who recall that movie's look at technology... shades of Blade Runner. ;)

 

 

Here's a paper that goes into some of the math/details:

 

http://people.csail.mit.edu/mrub/papers/ObstructionFreePhotograpy_SIGGRAPH2015.pdf


  • Admin
The limitation seems to be that the camera must move and must create a pretty substantial parallax change among the images.

 

Yes, it seemed odd that they emphasized that a smartphone be used and made no mention of standard cameras.

I noted that peculiarity but wasn't sure what to make of it (assuming it means anything).


  • Admin
I wonder if it is in any way similar to how our minds filter out the thing we aren't interested in when we look at such situations.

 

Abstractly I'd say yes but perhaps you are digging deeper than what I immediately consider.

Focus seems to be a primary way we filter objects/areas of interest.

The general approach here appears to be like our eyes, which filter out things directly in front of our face mostly because we are seeing from two perspectives simultaneously. The classic example is our nose, which we tend to perceive as being outside our field of view, but close one eye and... there it is.

 

If we take that vision to include many more origins of perspective, then we begin to be able to map the space in front of us with a sense of depth. The areas that move the most are nearer to us, while those that move less are farther away. This part of the equation could easily be tested by having objects in the 'distance' move faster.
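To make the parallax point concrete, here's a tiny back-of-the-envelope sketch. The focal length, baseline and depths are invented numbers, just to show the trend that nearer points shift more between frames:

```python
# Toy motion-parallax check: for a camera that slides sideways by `baseline`
# between two frames, a point's image shift is roughly focal * baseline / depth.
focal = 1000.0     # focal length in pixels (assumed)
baseline = 0.05    # sideways camera move in metres between frames (assumed)

for depth in (0.3, 1.0, 5.0, 50.0):   # metres: window pane, person, room, far background
    shift = focal * baseline / depth  # pixels of apparent motion
    print(f"depth {depth:5.1f} m -> shift {shift:7.2f} px")
```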

But there is more going on here than just that. The tech involved appears to operate at a pixel or even sub-pixel level.

 

A key to this technology, then, appears to be that the source is a short sequence of images.

I'll guess that other techniques must be used to achieve similar results with static/still/single frames of imagery (i.e. best guesses based on available data to reconstruct the missing elements for further processing, e.g. similar photographs, known dates and times, weather conditions, camera/lens types, etc.).

 

There are animation-related principles at play here (perhaps more appropriately labeled motion-related principles): a start, a stop and an inbetween 'frame' (literally a frame of reference) are established for purposes of analysis. A bit of data is then sampled from the start and stop frames in order to project an external/linear inbetween.

Now determine the differences between the actual (sourced) inbetween and the projected inbetween (i.e. is it the same?). Establish the source as the starting point and the projection as the other end of the spectrum, and iterate again. Contrast and compare. Where the data approaches zero/no change (or trivial differentiation), record that data, pick two more frames of reference, and run through the process all over again.

In short order we'll have a map of the flow of every pixel in every frame of every image in the sequence (assuming we want that many, which is only required if we want to fully process our maps without biased interpretation/interpolation).
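As a rough sketch of that compare-and-iterate idea (my own toy version of the reasoning above, not the paper's actual solver; the function names are invented):

```python
import numpy as np

def linear_inbetween(start, stop):
    # Project an 'external' inbetween by simple linear interpolation
    # between the start and stop frames.
    return 0.5 * (start + stop)

def disagreement_maps(frames):
    # frames: list of float numpy arrays of identical shape.
    # For each consecutive triple (start, middle, stop), compare the actual
    # middle frame against the projected inbetween. Pixels where the two
    # nearly agree are moving smoothly (or not at all); large disagreement
    # flags pixels whose motion needs another pass with new reference frames.
    maps = []
    for i in range(len(frames) - 2):
        start, middle, stop = frames[i], frames[i + 1], frames[i + 2]
        projected = linear_inbetween(start, stop)
        maps.append(np.abs(middle - projected))  # per-pixel difference
    return maps
```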

 

We've all heard of reverse engineering. This is a bit more like... reverse rendering. :)

 

 

Added: I should have at least mentioned the concept of motion parallax, as that appears to be at the root of this technology as well as a useful construct in animation (a la multiplane camera effects).
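As an aside, the multiplane idea is easy to sketch: each layer pans at a speed inversely proportional to its assigned depth, which is the same motion parallax this technique exploits. The layer names, depths and pan speed below are made up for illustration:

```python
# Minimal multiplane-camera sketch: nearer layers pan faster per frame.
layers = [("sky", 100.0), ("hills", 20.0), ("trees", 5.0), ("fence", 1.0)]
camera_pan = 2.0  # world units of camera movement per frame (assumed)

for frame in range(3):
    offsets = {name: round(frame * camera_pan / depth, 2) for name, depth in layers}
    print(frame, offsets)
```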

 

Also, I should add that one of the first things that came to mind upon seeing this technology was that it (as well as related technology) might make for an excellent analysis tool to extract/locate/identify keyframes/extremes from sequential imagery. I'm not sure why that came to mind, but it did. This relates to finding that proverbial needle in a haystack... especially the ones that prefer to stay hidden.

 

And yet another addition: Regarding still imagery, it occurs to me that after a bazillion alpha mattes are generated, those found to be most useful could then be used as filters on still images, with the narrowing down of useful filters aided by a few human participants who validate their usefulness. To put it another way, the distance from the camera lens to a reflection in an image is not infinite, so filters that work well on many images could reasonably be expected to work on a random (i.e. similarly obstructed) image.

