best way(s) to fit a polynomial to points?

Cameron Laird claird at lairds.com
Thu Feb 19 10:43:53 EST 2004


In article <3of9g1-fuv.ln1 at jowls.lairds.org>,
Kyler Laird  <Kyler at news.Lairds.org> wrote:
			.
			.
			.
>I was generating the test points with a simple algorithm.  I
>finally discovered that the fitting algorithm fails if the
>points are (close to being) collinear.
			.
			.
			.
>I'll append the code that I successfully used.
>
>>Will the new (x,y)-s be interior to the convex hull formed by
>>the tabulated ones, or might they be outside?
>
>It's good practice to make them interior (have ground control
>points around the perimeter of the region of interest) but it
>isn't always the case.  The solution should be general enough
>to work for outside points but it's understood that the error
>for them is likely to be high.
>
>>How
>>soon does this thing need to go live?
>
>It's something I meant to do during my Remote Sensing class
>last semester and then someone asked about it in another news
>group recently.  It's not urgent.  I just wanted to solve it.
>
>(I have some aerial photos that I georeferenced using ERDAS
>Imagine but I want to use my own code so that I can do more
>with them.)
>
>I'm happy with the C code, but I'd still prefer a Python
>solution.
			.
		[few hundred lines of C]
			.
			.
Right.

The C source exhibited has no geodetic or cartographic content,
apart from a slight asymmetry in the model polynomial; it's
just a simple-minded least-squares regression against a few
leading terms (of what I hope are zero-based data--do you in
fact precondition your data?).  It looks to me like a very
straightforward exercise to translate this mechanically to Python.
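
By "precondition" I mean something as simple as the following--a
minimal sketch, assuming NumPy; the function name is mine, not
anything from your post:

    # Hedged sketch of preconditioning: zero-center the coordinates
    # before fitting, so higher-order terms don't swamp the arithmetic.
    import numpy as np

    def precondition(x, y):
        """Return zero-centered copies of x and y plus the offsets."""
        x0, y0 = x.mean(), y.mean()
        return x - x0, y - y0, x0, y0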

One ought to be able to do better, though.  I don't know the
scientific packages available to Python.  You mentioned previously
that you found a couple of good univariate polynomial solvers, but
no multivariate one.  The cheap alternative is to pick out a
least-squares solver--there surely are several and, if not, they're
easy to write--and feed it tabulations of the terms of the data:
1, x, y, (x ** 2) * y, x * (y ** 2), ...  The biggest hazard is
the temptation to use too many terms.
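
In Python the whole cheap alternative fits in a screenful.  This is
only a sketch: NumPy, the particular term list, and the sample data
are my assumptions, not anything from your code:

    import numpy as np

    def fit_terms(x, y, z, terms):
        """Least-squares fit of z against the given term functions."""
        # Design matrix: one column per polynomial term.
        A = np.column_stack([t(x, y) for t in terms])
        coeffs, residuals, rank, sv = np.linalg.lstsq(A, z, rcond=None)
        return coeffs

    # A few leading terms of a bivariate polynomial; resist the
    # temptation to add many more.
    terms = [
        lambda x, y: np.ones_like(x),   # 1
        lambda x, y: x,                 # x
        lambda x, y: y,                 # y
        lambda x, y: x * y,             # x*y
        lambda x, y: x ** 2,            # x**2
        lambda x, y: y ** 2,            # y**2
    ]

    # Illustrative data only: z = 2 + 3x - y + 0.5xy plus a little noise.
    rng = np.random.default_rng(0)
    x = rng.uniform(0.0, 10.0, 50)
    y = rng.uniform(0.0, 10.0, 50)
    z = 2 + 3*x - y + 0.5*x*y + rng.normal(0.0, 0.01, 50)

    print(fit_terms(x, y, z, terms))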

This makes for a robust solver.  It is far less expensive than the
analytic work required for a formally-correct-but-practically-
indistinguishable solution; in my experience it invariably yields
useful and/or instructive results; and, nicest of all, it
generalizes to appropriate transformations of the data.  If you've
designed the solver correctly, it's easy to experiment with
logarithms or square roots or such.  I think that's a healthy thing.
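
With a solver shaped like the sketch above, that experiment is one
edit to the term list (the transforms here are illustrative, not
recommendations):

    import numpy as np

    # Because the solver only sees a list of term functions, trying
    # transformed terms is a one-line change per term.
    transformed_terms = [
        lambda x, y: np.ones_like(x),
        lambda x, y: np.log(x),      # logarithmic term (needs x > 0)
        lambda x, y: np.sqrt(y),     # square-root term (needs y >= 0)
    ]
    # ... then fit_terms(x, y, z, transformed_terms) as before.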

It's probably also timely to read up on R <URL:
http://www-106.ibm.com/developerworks/linux/library/l-sc16.html >.
There have been at least a couple of Python-R bindings in the last
four years.  I haven't checked at all to assess their current 
health.

And, oh YES, (near) collinearity indeed makes things go kablooey:
the design matrix loses rank, and the normal equations become
hopelessly ill-conditioned.
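
A few lines show the failure mode (again a sketch, NumPy assumed):

    import numpy as np

    # With all control points on one line, even the planar design
    # matrix [1, x, y] is rank-deficient: y is a linear function of x.
    t = np.linspace(0.0, 1.0, 20)
    x, y = t, 2*t + 1                 # perfectly collinear points
    A = np.column_stack([np.ones_like(x), x, y])
    print(np.linalg.matrix_rank(A))   # 2, not 3--the fit is degenerate
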
-- 

Cameron Laird <claird at phaseit.net>
Business:  http://www.Phaseit.net


