[SciPy-Dev] Find points in delaunay triangulation : scipy.spatial vs. scipy.interpolation
Pablo Winant
pablo.winant at gmail.com
Sun May 13 10:36:05 EDT 2012
On 13/05/2012 05:42, Pauli Virtanen wrote:
> On 13.05.2012 02:13, Pablo Winant wrote:
> [clip]
>> - The LinearNDInterpolator object implemented in Cython requires a
>> list of points and a list of values at creation time. But it is not
>> documented how to change the values of the interpolator without
>> rebuilding the mesh.
> [clip]
>
> This is a good suggestion --- the interpolator constructors should
> accept an existing triangulation.
>
> As a workaround, you can try modifying its ".values" attribute in-place,
>
> ip.values[...] = new_values
Thank you for the suggestion, it works.
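A minimal sketch of the workaround (not from the original thread; the data and `new_values` are illustrative, and current NumPy/SciPy APIs are assumed):

```python
import numpy as np
from scipy.interpolate import LinearNDInterpolator

rng = np.random.default_rng(1)
pts = rng.random((50, 2))                      # scattered sites in the unit square
ip = LinearNDInterpolator(pts, np.zeros(50))   # the Delaunay mesh is built once, here

for k in range(3):
    new_values = np.cos(k + pts[:, 0] * pts[:, 1])
    # overwrite the stored values in place -- the triangulation is untouched
    ip.values[...] = new_values.reshape(ip.values.shape)
    out = ip(pts)                              # linear interpolant is exact at the sites
```

Because the triangulation is reused, only the cheap value update runs inside the loop.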
>> - I tried to use the Delaunay object from scipy and noticed a strange
>> thing: for a given set of coordinates, it takes longer to get the indices
>> of the triangles containing the points than it takes to perform the
>> interpolation using the LinearND object. This is puzzling since apparently
>> the implementation of LinearND performs many calls to the qhull library
>> to get these indices. Attached is a simple example demonstrating this anomaly.
> The example doesn't seem to be attached, so I'm not exactly sure what
> you are seeing?
Happens to me all the time :-) Here it is.
>
> Constructing a Delaunay triangulation is more expensive than performing
> the interpolation.
Yes. Unless I am mistaken, in my example the triangulation is done
beforehand, so the more costly operation becomes point location, which
seems to be faster with LinearND.
> There are also memory advantages in getting the point
> positions one-by-one (as in during interpolation) than all-at-once.
I understand that one. But in order to experiment with variations on the
interpolation scheme in Python, it is more efficient to operate on a
whole list of points at once. For instance, applying the barycentric
coefficients to the values on the vertices can be done in one single
matrix operation, once I have extracted the list of locations.
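The vectorized scheme described above can be sketched with `Delaunay.find_simplex` and the precomputed affine maps in the `transform` attribute (a sketch with made-up data, not the attached example):

```python
import numpy as np
from scipy.spatial import Delaunay
from scipy.interpolate import LinearNDInterpolator

rng = np.random.default_rng(0)
pts = rng.random((30, 2))                  # 2-D sites
vals = np.sin(pts[:, 0]) * pts[:, 1]      # values at the sites
tri = Delaunay(pts)

q = rng.random((200, 2))                   # query points
s = tri.find_simplex(q)                    # index of containing triangle (-1 if outside)

# barycentric weights from the affine maps precomputed in tri.transform
T = tri.transform[s, :2]                   # (n, 2, 2) inverse matrices
r = q - tri.transform[s, 2]                # offsets from the reference vertex
b = np.einsum('nij,nj->ni', T, r)          # first two barycentric coordinates
w = np.c_[b, 1.0 - b.sum(axis=1)]          # all three weights, summing to one

# linear interpolation over all query points as one matrix operation
interp = (w * vals[tri.simplices[s]]).sum(axis=1)

inside = s >= 0                            # weights are meaningless outside the hull
ref = LinearNDInterpolator(pts, vals)(q)   # reference result for comparison
```

Inside the convex hull this reproduces what LinearNDInterpolator computes point by point.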
>
>> One last thing: I have written an interpolation object on sparse grids,
>> using Smolyak products of Chebyshev polynomials. It is written in pure
>> Python (vectorized) and licensed under the BSD license. Currently it
>> lives in another library, but I guess it would make more sense to have
>> something like that in a more general scientific library. Let me know if
>> you are interested. (It is available there anyway:
>> https://github.com/albop/dynare-python/tree/master/dolo/src/dolo/numeric: chebychev.py
>> and smolyak.py)
> This seems slightly similar to the 2-D tensor approach in Fitpack,
> although that uses regular grids only.
This is the classical approach to interpolation in more than one
dimension. But it becomes impractical beyond 3 or 4 dimensions, while
Smolyak tensors can be used in many more. There are actually several
ways to use sparse tensor grids to interpolate: one can do piecewise
interpolation on each hypercube of the grid, or use the Lagrange
polynomial on the grid (a spectral method).
There is a survey paper describing many possible methods precisely:
"Sparse Grids" (2004), Bungartz and Griebel. Some of these methods were
implemented in Matlab:
http://www.ians.uni-stuttgart.de/spinterp/about.html
I actually found the method in an economics paper:
"Computing Stochastic Dynamic Economic Models with a Large Number of
State Variables: A Description and Application of a Smolyak-Collocation
Method", Malin, B., Krueger, D., Kubler, F.
>
> I can think of a couple of questions about this method that are not
> immediately clear to me:
>
> - Is it robust? I.e. how stable is the interpolant against
> perturbations of the data, badly scattered data points etc?
> Do you get problems with Chebychev polynomials if a high order is
> required?
It has roughly the same properties as Chebyshev interpolation: the nodes
are chosen so as to minimize Runge's phenomenon, and a similar property
holds for sparse products of Chebyshev polynomials.
The method is meant to represent a function which you can evaluate at
the points of the grid: those points are prescribed by the construction,
not taken from the data, so I don't know how to use it to interpolate
scattered data. It can achieve high precision for relatively smooth
functions and fail badly if there are discontinuities. Outside of the
approximation domain, the extrapolation is known to behave badly, though
in my experiments it was still better than extrapolating with a constant
value.
True, I had some problems using very high order polynomials (with l=6,
meaning Chebyshev polynomials of order up to 2^6 = 64), but I don't know
where the limitation comes from (my implementation or the algorithm).
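The stability point can be illustrated in 1-D with NumPy's Chebyshev utilities (a sketch, not the author's Smolyak code): interpolating Runge's function at Chebyshev nodes stays accurate, while the same degree on equispaced nodes diverges.

```python
import numpy as np
from numpy.polynomial import chebyshev as C

f = lambda t: 1.0 / (1.0 + 25.0 * t ** 2)   # Runge's function

deg = 16
xc = C.chebpts1(deg + 1)                    # Chebyshev nodes in [-1, 1]
cc = C.chebfit(xc, f(xc), deg)              # interpolates exactly at the nodes

xe = np.linspace(-1.0, 1.0, deg + 1)        # equispaced nodes, same degree
ce = np.polynomial.polynomial.polyfit(xe, f(xe), deg)

t = np.linspace(-1.0, 1.0, 2001)
err_cheb = np.max(np.abs(C.chebval(t, cc) - f(t)))
err_equi = np.max(np.abs(np.polynomial.polynomial.polyval(t, ce) - f(t)))
# err_cheb stays small; err_equi blows up near the ends of the interval
```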
>
> - Is your grid selection adaptive, or does the user need to tinker with
> grid parameters when interpolating?
In the current version the user only specifies a parameter l, which
quantifies the precision of the grid (there are on the order of 2^l
points in each dimension) and the interconnection between dimensions.
For instance, at l=1 there are two points in each dimension and no cross
products. The main advantage is that increasing the number of dimensions
while keeping l constant increases the number of grid points only
sub-exponentially: this breaks the curse of dimensionality.
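The growth rates can be checked with a small counting sketch. This assumes nested Clenshaw-Curtis-style node counts (1, 3, 5, 9, ... points per level), which may differ in detail from the implementation discussed here:

```python
from itertools import product
from math import prod

def m(i):
    """Number of nested Chebyshev (Clenshaw-Curtis) nodes at level i."""
    return 1 if i == 1 else 2 ** (i - 1) + 1

def smolyak_count(d, l):
    """Points in the level-l Smolyak grid in d dimensions (nested nodes)."""
    new = lambda i: m(i) - (m(i - 1) if i > 1 else 0)   # nodes added at level i
    total = 0
    for idx in product(range(1, l + 2), repeat=d):
        if sum(idx) <= d + l:                            # Smolyak index selection
            total += prod(new(i) for i in idx)
    return total

full = lambda d, l: m(l + 1) ** d   # full tensor grid with the same 1-D resolution
# e.g. in 5 dimensions at l=3 the full grid has 9**5 = 59049 points,
# while the sparse grid keeps only a small fraction of them
```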
In the Matlab toolbox by Andreas Klimke, there are adaptive schemes to
increase the number of points in each dimension, so it is at least
theoretically possible.
>
> - How fast is this? As far I see, you need to solve a global linear
> least squares problem, which maybe is not so nice with a large number
> of data points?
I would say it is relatively fast: for the same sparse grid I found it
faster than the triangulation approach, and I use it in 5-6 dimensions
without problems, while I could not go that high with any other
interpolation scheme.
The global linear least-squares problem is solved once, to find the
Lagrange polynomial interpolating a given set of values; it is not
solved again when the interpolant is evaluated at new points (cf. the
discussion about LinearND and Delaunay above). It could be avoided
altogether, since there are explicit combinatorial formulas for the same
coefficients (at the time I found them too complex; they are probably
not).
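The fit-once, evaluate-many pattern can be sketched in 1-D with a Chebyshev collocation matrix (an illustration, not the Smolyak code itself):

```python
import numpy as np
from numpy.polynomial import chebyshev as C

nodes = C.chebpts2(9)                    # Chebyshev-Lobatto grid points
B = C.chebvander(nodes, 8)               # basis matrix at the grid points

vals = np.exp(nodes)                     # data on the grid
coef, *_ = np.linalg.lstsq(B, vals, rcond=None)   # solved once per set of values

q = np.linspace(-1.0, 1.0, 500)          # many new evaluation points
approx = C.chebval(q, coef)              # cheap reuse: no new solve needed
```

Only the right-hand side changes when the grid values change, so the expensive solve never sits in the evaluation path.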
>
> - There are a large number of possible interpolation methods we might
> like to include. What are the advantages of this one? One seems to be
> that it generalizes to n-D, and could be used to get smooth
> interpolants.
>
>
Yes, absolutely. It is especially suited when the number of dimensions
is bigger than 3-4, where existing methods (even Delaunay-based ones, as
far as I know) incur a huge performance penalty.
There are two other advantages of the current implementation: it can
efficiently compute the derivative of the interpolated function at given
points, and it can also compute the derivative of the interpolated
values with respect to the values on the grid.
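Both derivative properties fall out of the polynomial representation; a 1-D illustration with NumPy's Chebyshev module (names and setup are hypothetical, not the author's API):

```python
import numpy as np
from numpy.polynomial import chebyshev as C

nodes = C.chebpts2(9)                    # Chebyshev-Lobatto grid
B = C.chebvander(nodes, 8)               # values -> coefficients is a linear map
Binv = np.linalg.inv(B)

vals = np.sin(nodes)                     # values on the grid
coef = Binv @ vals

x = np.array([-0.3, 0.1, 0.7])
deriv = C.chebval(x, C.chebder(coef))    # derivative of the interpolant, ~cos(x)

# the interpolant is linear in the grid values, so its Jacobian with
# respect to those values is a constant matrix, reusable for any data:
J = C.chebvander(x, 8) @ Binv
same = J @ vals                          # identical to evaluating the interpolant
```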
What other interpolation schemes are you considering? I am very
interested in these developments, as interpolation is a real practical
and theoretical bottleneck for many of my problems. At some point I had
a prototype of multilinear interpolation in dimension n. Would you be
interested in it?
Best regards,
Pablo
-------------- next part --------------
A non-text attachment was scrubbed...
Name: example_linearnd_vs_delaunay.py
Type: text/x-python
Size: 1316 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/scipy-dev/attachments/20120513/6c48e6b6/attachment.py>