[SciPy-Dev] Find points in delaunay triangulation : scipy.spatial vs. scipy.interpolation

Pablo Winant pablo.winant at gmail.com
Sun May 13 10:36:05 EDT 2012


On 13/05/2012 05:42, Pauli Virtanen wrote:
> On 13.05.2012 02:13, Pablo Winant wrote:
> [clip]
>>    - The LinearNDInterpolator object implemented in Cython requires a
>> list of points and a list of values to be created. But it is not
>> documented how to change the values of the interpolator without redoing
>> the mesh.
> [clip]
>
> This is a good suggestion --- the interpolator constructors should
> accept an existing triangulation.
>
> As a workaround, you can try modifying its ".values" attribute in-place,
>
> 	ip.values[...] = new_values
Thank you for the suggestion, it works.
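
For reference, here is a minimal sketch of how I use the workaround (the 
reshape is just a guess about how the values are stored internally):

    import numpy as np
    from scipy.interpolate import LinearNDInterpolator

    # Build the interpolator once: this is where the triangulation is computed.
    points = np.random.rand(1000, 2)
    values = np.sin(points[:, 0]) * points[:, 1]
    ip = LinearNDInterpolator(points, values)

    # Later, swap in new values without re-triangulating, by assigning in-place.
    # (ip.values may carry an extra trailing axis, hence the reshape.)
    new_values = np.cos(points[:, 0]) + points[:, 1]
    ip.values[...] = new_values.reshape(ip.values.shape)

    print(ip(np.random.rand(5, 2)))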

>> - I tried to use the Delaunay object from scipy and noticed a strange
>> thing: for a given set of coordinates it takes longer to get the indices
>> of the triangles containing the points than it takes to perform the
>> interpolation using the LinearND object. This is puzzling since apparently
>> the implementation of LinearND performs many calls to the Qhull library
>> to get these indices. Attached is a simple example demonstrating this anomaly.
> The example doesn't seem to be attached, so I'm not exactly sure what
> you are seeing?
Happens to me all the time :-) Here it is.
>
> Constructing a Delaunay triangulation is more expensive than performing
> the interpolation.
Yes. Unless I am mistaken, in my example the triangulation is done 
beforehand, so the costly operation becomes point location. That part 
seems to be faster with LinearNDInterpolator.
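
In case the attachment gets lost again, here is a rough sketch of the 
kind of comparison I mean (not the attached script itself; sizes are 
arbitrary):

    import time
    import numpy as np
    from scipy.spatial import Delaunay
    from scipy.interpolate import LinearNDInterpolator

    np.random.seed(0)
    points = np.random.rand(10000, 2)
    values = np.sin(points[:, 0]) + points[:, 1] ** 2
    queries = np.random.rand(100000, 2)

    tri = Delaunay(points)                     # triangulation done up front
    ip = LinearNDInterpolator(points, values)  # builds its own triangulation internally

    t0 = time.time()
    simplices = tri.find_simplex(queries)      # point location only
    t1 = time.time()
    interpolated = ip(queries)                 # point location + barycentric interpolation
    t2 = time.time()

    print("Delaunay.find_simplex: %.3f s" % (t1 - t0))
    print("LinearNDInterpolator:  %.3f s" % (t2 - t1))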

> There are also memory advantages in getting the point
> positions one-by-one (as during interpolation) rather than all-at-once.
I understand that one. But in order to play with variations of the 
interpolation scheme in Python, it becomes more efficient to operate on 
a list of points. For instance, applying the barycentric coefficients to 
the list of values on the vertices can be done in one single matrix 
operation, if I have extracted the list of point locations beforehand.
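
Roughly, what I have in mind is something like this (a sketch based on 
the documented layout of tri.transform, not tested code):

    import numpy as np
    from scipy.spatial import Delaunay

    np.random.seed(0)
    ndim = 2
    points = np.random.rand(1000, ndim)
    values = np.sin(points[:, 0]) + points[:, 1]
    tri = Delaunay(points)

    q = np.random.rand(500, ndim)
    s = tri.find_simplex(q)    # containing simplex per query point
                               # (points outside the hull get -1; ignored here)

    # tri.transform[i, :ndim, :] is the inverse affine map of simplex i,
    # tri.transform[i, ndim, :] is the offset vector.
    T_inv = tri.transform[s, :ndim, :]
    r = tri.transform[s, ndim, :]
    bary = np.einsum('ijk,ik->ij', T_inv, q - r)   # first ndim barycentric coords
    weights = np.c_[bary, 1.0 - bary.sum(axis=1)]

    # One vectorized step: weighted sum of the values at the simplex vertices.
    vertex_values = values[tri.simplices[s]]       # shape (nq, ndim + 1)
    interpolated = (weights * vertex_values).sum(axis=1)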

>
>> One last thing: I have written an interpolation object on sparse grids,
>> using a Smolyak product of Chebyshev polynomials. It is written in pure
>> Python (vectorized) and licensed under the BSD license. Currently it
>> lives in another library but I guess it would make more sense to have
>> something like that in a more general scientific lib. Let me know if you
>> are interested. (it is available there anyway:
>> https://github.com/albop/dynare-python/tree/master/dolo/src/dolo/numeric: chebychev.py
>> and smolyak.py)
> This seems slightly similar to the 2-D tensor approach in Fitpack,
> although that uses regular grids only.
This is the classical approach to get an interpolation in more than 
one dimension. But it becomes impractical with more than 3-4 
dimensions, while Smolyak tensor products can be used in higher 
dimensions. There are actually several ways to use sparse tensor grids 
to interpolate: one can do piecewise interpolation on each hypercube of 
the grid, or use the Lagrange polynomial on the grid (spectral method).

There is a survey paper describing many possible methods in detail: 
"Sparse grids", Bungartz and Griebel, 2004. Some of these methods were 
implemented in Matlab: 
http://www.ians.uni-stuttgart.de/spinterp/about.html
I actually found the method in an economics paper:

"Computing Stochastic Dynamic Economic Models with a Large Number of 
State Variables: A Description and Application of a Smolyak-Collocation 
Method", Malin, B., Krueger, D., Kubler, F.

>
> I can think of a couple of questions about this method that are not
> immediately clear to me:
>
> - Is it robust? I.e. how stable is the interpolant against
>    perturbations of the data, badly scattered data points etc?
>    Do you get problems with Chebyshev polynomials if a high order is
>    required?
It has roughly the same problems as Chebyshev polynomials: Chebyshev 
nodes are chosen so as to minimize the Runge phenomenon, and there is a 
similar property for sparse products of Chebyshev polynomials.
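
To be concrete about the Runge phenomenon, here is the usual 
illustration with Runge's function in plain numpy (just a sketch, not 
part of the toolbox):

    import numpy as np

    f = lambda x: 1.0 / (1.0 + 25.0 * x ** 2)   # Runge's function
    n = 20
    x_equi = np.linspace(-1.0, 1.0, n + 1)      # equispaced nodes
    x_cheb = np.cos(np.pi * (2 * np.arange(n + 1) + 1) / (2 * (n + 1)))  # Chebyshev nodes

    x_test = np.linspace(-1.0, 1.0, 1001)
    for name, nodes in [("equispaced", x_equi), ("Chebyshev", x_cheb)]:
        c = np.polynomial.chebyshev.chebfit(nodes, f(nodes), n)   # degree-n interpolant
        err = np.max(np.abs(np.polynomial.chebyshev.chebval(x_test, c) - f(x_test)))
        print("%-10s nodes: max error %.1e" % (name, err))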

The method is meant to represent a function which you can evaluate at 
the points of the grid: the grid points are precomputed, not taken from 
the data, so I don't know how to use it to interpolate scattered data.

It can achieve high precision for relatively smooth functions and fails 
badly if there are discontinuities. Outside of the approximation domain, 
the extrapolation is known to behave badly, but in my experiments it was 
still better than having a constant value everywhere.

True, I had some problems using very high order polynomials (with l=6, 
meaning Chebyshev polynomials of order up to 2^6 = 64), but I don't know 
where the limitation comes from (my implementation or the algorithm itself).

>
> - Is your grid selection adaptive, or does the user need to tinker with
>    grid parameters when interpolating?
In the current version the user only specifies a parameter l, which 
quantifies the precision of the grid (there are 2^l points in each 
dimension) and the interaction between dimensions. For instance, at 
l=1 there are two points in each dimension and no cross products. The 
main advantage is that increasing the number of dimensions while keeping 
l constant increases the number of grid points in a sub-exponential 
way: this breaks the curse of dimensionality.
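
To make the counts concrete (full tensor-product grids only; the exact 
sparse-grid count depends on the construction, so I leave it out):

    # A full tensor-product grid with 2**l points per dimension needs
    # (2**l)**d points, which explodes with d; the Smolyak grid keeps
    # only a sparse subset of these points.
    l = 3
    for d in (2, 3, 5, 10):
        print("d = %2d: full tensor grid would need %d points" % (d, (2 ** l) ** d))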

In the Matlab toolbox by Andreas Klimke, there are adaptive schemes to 
increase the number of points in each dimension, so it is at least 
theoretically possible.

>
> - How fast is this? As far as I see, you need to solve a global linear
>    least squares problem, which maybe is not so nice with a large number
>    of data points?
I would say it is relatively fast: for the same sparse grid I found it 
to be faster than the triangulation approach. And I use it in 5-6 
dimensions without problems, while I could not go that high with other 
interpolation schemes.
The global linear least squares problem is solved to find the Lagrange 
polynomials interpolating a given set of values. It is not solved again 
when the interpolation is computed at new points (cf. the discussion 
about LinearND and Delaunay above). It could be avoided altogether, as 
there are explicit combinatorial formulas doing the same thing (but at 
the time I found them too complex; they probably are not).
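
Schematically, it looks like this (a hypothetical 1-D illustration with 
a Chebyshev basis, not the actual smolyak.py code): the solve happens 
once per set of values, and new evaluation points only need a matrix 
product.

    import numpy as np

    grid = np.cos(np.pi * np.arange(9) / 8.0)            # fixed grid of 9 nodes
    deg = 8
    B = np.polynomial.chebyshev.chebvander(grid, deg)    # basis evaluated on the grid

    values = np.exp(grid)                                 # values known on the grid
    coeffs, *_ = np.linalg.lstsq(B, values, rcond=None)   # the global least squares solve

    # Evaluating at new points re-uses coeffs: no new solve is required.
    x_new = np.linspace(-1.0, 1.0, 5)
    print(np.polynomial.chebyshev.chebval(x_new, coeffs))  # interpolated values
    print(np.exp(x_new))                                    # reference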


>
> - There are a large number of possible interpolation methods we might
>    like to include. What are the advantages of this one? One seems to be
>    that it generalizes to n-D, and could be used to get smooth
>    interpolants.
>
>

Yes, absolutely. It is especially well suited when the number of 
dimensions is greater than 3-4, where existing methods (even 
Delaunay-based ones, as far as I know) incur a huge performance penalty.

There are two other advantages of the current implementation: it can 
efficiently compute the derivative of the interpolated function at given 
points, and it can also compute the derivative of the interpolated 
values with respect to the values on the grid.
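
The second point follows from linearity: the interpolated values are 
W v for a weight matrix W that depends only on the grid and the 
evaluation points, so the derivative with respect to the grid values is 
W itself. A hypothetical sketch in the same 1-D setting as above:

    import numpy as np

    grid = np.cos(np.pi * np.arange(9) / 8.0)
    deg = 8
    B_grid = np.polynomial.chebyshev.chebvander(grid, deg)   # basis on the grid
    x_new = np.linspace(-1.0, 1.0, 5)
    B_new = np.polynomial.chebyshev.chebvander(x_new, deg)   # basis at evaluation points

    W = B_new @ np.linalg.inv(B_grid)   # Jacobian of interpolated values w.r.t. grid values
    values = np.exp(grid)
    print(W @ values)                   # same result as interpolating directly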

What other interpolation schemes are you considering? I am very 
interested in these developments, as interpolation is a real 
practical/theoretical bottleneck for many of my problems. At some point 
I had a prototype of multilinear interpolation in dimension n. Would 
you be interested in it?

Best regards,

Pablo
-------------- next part --------------
A non-text attachment was scrubbed...
Name: example_linearnd_vs_delaunay.py
Type: text/x-python
Size: 1316 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/scipy-dev/attachments/20120513/6c48e6b6/attachment.py>

