[Numpy-discussion] Help with interpolating missing values from a 3D scanner

Robert Kern robert.kern at gmail.com
Thu Jan 15 19:14:47 EST 2009


On Thu, Jan 15, 2009 at 16:55, David Bolme <bolme1234 at comcast.net> wrote:
>
> I am working on face recognition using 3D data from a special 3D
> imaging system.  For those interested, the data comes from the FRGC
> 2004 dataset.  The problem I am having is that for some pixels the
> scanner fails to capture depth information.  The result is that the
> image has missing values.  There are small regions on the face such as
> eyebrows and eyes that are missing the depth information.  I would
> like to fill in these regions by interpolating from nearby pixels, but I
> am not sure of the best way to do that.
>
> I currently have two arrays:
>
> * floating point array with depth information (missing data is encoded
> as a large negative number -999999.0)
> * boolean array that is a missing data mask
>
> I have some ideas on how to solve this problem but I am hoping that
> someone on this list may have more experience with this type of
> missing data problem.
>
> * Are there any utilities in scipy/numpy designed for this type of
> missing data problem?

You could toss it into the natural neighbor interpolator in
scikits.delaunay. It's designed for interpolating scattered (X,Y)
points onto a grid, but it works fine for interpolating a regular grid
with missing values, too.

Similarly, scipy.interpolate.Rbf should work, too.
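
For example, something along these lines (untested; a small synthetic
depth map stands in for your real depth array and missing mask):

import numpy as np
from scipy.interpolate import Rbf

# Synthetic stand-ins for the two arrays you described: a depth map
# with missing values encoded as -999999.0 and a boolean missing mask.
depth = np.fromfunction(lambda y, x: np.hypot(x - 16.0, y - 16.0), (32, 32))
missing = np.zeros(depth.shape, dtype=bool)
missing[10:14, 18:24] = True          # a small eyebrow-sized hole
depth[missing] = -999999.0

yy, xx = np.indices(depth.shape)
known = ~missing

# Rbf solves a dense linear system over all sample points, so you may
# want to subsample the known pixels for a full-sized image.
rbf = Rbf(xx[known], yy[known], depth[known], function='thin_plate')

filled = depth.copy()
filled[missing] = rbf(xx[missing], yy[missing])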

> * If not does any one have suggestions on how I should proceed?

Another approach (that you would have to code yourself) is to take a
Gaussian smoothing kernel of an appropriate size, center it over each
missing pixel, then average the known pixels under the kernel using
the kernel as a weighting factor. Place that average value into the
missing pixel. This is actually fairly similar to the Rbf method
above, but will probably be more efficient since you know that the
points are all gridded.
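
One way to write that without explicit Python loops (a sketch, reusing
the depth/missing arrays from the snippet above; the sigma is an
arbitrary choice) is a normalized Gaussian convolution via scipy.ndimage:

import numpy as np
from scipy import ndimage

known = (~missing).astype(float)
sigma = 2.0   # kernel width in pixels; tune it to the size of the holes

# Smooth the depth map with the missing pixels zeroed out, then divide
# by the smoothed mask so each output is a weighted average of only
# the known pixels under the kernel.
weighted = ndimage.gaussian_filter(np.where(missing, 0.0, depth), sigma)
weights = ndimage.gaussian_filter(known, sigma)

filled = depth.copy()
filled[missing] = weighted[missing] / weights[missing]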

If you have sizable contiguous regions of missing pixels like the
eyes, you may want to iterate that process, this time also including
the previous iteration's values for the missing pixels in the average.
Iterate until the deltas between iterations are within a desired
tolerance.

Why iterate? If you have sizable regions of missing pixels, you'll get
a fair bit of noise in the center since their values will be
controlled only by a few distant pixels at the edge of your kernel.
Just moving to the next missing pixel might get an entirely new set of
known pixels. Iterating spreads the "good" information from the edges
of the missing region into the center. This is roughly akin to solving
a PDE over the missing region using the known pixels as boundary
conditions. I have no particular references for this approach, but I
imagine you can dig up something in the literature about PDE-based
image processing.
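
A rough sketch of that iteration (again reusing the names from the
snippets above; the sigma, iteration cap, and tolerance are arbitrary):

import numpy as np
from scipy import ndimage

sigma = 2.0
tol = 1e-3

filled = np.where(missing, 0.0, depth)
valid = (~missing).astype(float)

for i in range(100):
    smoothed = ndimage.gaussian_filter(filled * valid, sigma)
    weights = ndimage.gaussian_filter(valid, sigma)
    new_vals = smoothed[missing] / np.maximum(weights[missing], 1e-12)
    delta = np.abs(new_vals - filled[missing]).max()
    filled[missing] = new_vals
    # From the second pass on, the filled-in pixels also contribute,
    # so the known values diffuse in from the edges of each hole while
    # the known pixels themselves are never modified.
    valid[missing] = 1.0
    if delta < tol:
        break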

This question came up at the first Open Training we held at Enthought.
Mark Marlett brought up the iteration approach. I pooh-poohed it at
the time, preferring the RBF analogy of the single-pass approach, but
if you need largish holes to be smoothed out, I think it should work
better.

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco


