[SciPy-User] Orthogonal distance regression in 3D

josef.pktd at gmail.com josef.pktd at gmail.com
Wed Mar 7 10:20:06 EST 2012


On Wed, Mar 7, 2012 at 7:30 AM, Robert Kern <robert.kern at gmail.com> wrote:
> On Wed, Mar 7, 2012 at 06:57, Владимир <draft2008 at bk.ru> wrote:
>>
>>
>>
>>> 06 March 2012, 15:15, from Robert Kern <robert.kern at gmail.com>:
>>> On Tue, Mar 6, 2012 at 08:22, Владимир <draft2008 at bk.ru> wrote:
>>> >
>>> > 05 March 2012, 14:59, from Robert Kern <robert.kern at gmail.com>:
>>> >> On Mon, Mar 5, 2012 at 10:26, Владимир <draft2008 at bk.ru> wrote:
>>> >> > 02 March 2012, 15:49, from Robert Kern <robert.kern at gmail.com>:
>>> >> >> On Fri, Mar 2, 2012 at 06:02, Владимир <draft2008 at bk.ru> wrote:
>>> >> >> > Hello!
>>> >> >> > I'm working with orthogonal distance regression (scipy.odr).
>>> >> >> > I'm trying to fit a curve to a point cloud (3D), but it doesn't work
>>> >> >> > properly; it returns wrong results.
>>> >> >> >
>>> >> >> > For example, I want to fit the simple model y = a*x + b*z + c to some
>>> >> >> > point cloud (y_data, x_data, z_data):
>>> >> >> >
>>> >> >> >
>>> >> >> >     import numpy as np
>>> >> >> >     from scipy.odr import Model, Data, ODR
>>> >> >> >
>>> >> >> >     def func(p, input):
>>> >> >> >         x, z = input
>>> >> >> >         x = np.array(x)
>>> >> >> >         z = np.array(z)
>>> >> >> >         return p[0]*x + p[1]*z + p[2]
>>> >> >> >
>>> >> >> >     initialGuess = [1, 1, 1]
>>> >> >> >
>>> >> >> >     myModel = Model(func)
>>> >> >> >     myData = Data([x_data, z_data], y_data)
>>> >> >> >     myOdr = ODR(myData, myModel, beta0=initialGuess)
>>> >> >> >     myOdr.set_job(fit_type=0)
>>> >> >> >     out = myOdr.run()
>>> >> >> >     print out.beta
>>> >> >> >
>>> >> >> > It works perfectly in 2D (two axes), but in 3D the results are not even
>>> >> >> > close to the real ones. Moreover, it is very sensitive to the initial
>>> >> >> > guess, so it returns a different result even if I change initialGuess
>>> >> >> > from [1,1,1] to [0.99,1,1].
>>> >> >> >
>>> >> >> > What do I do wrong?
>>> >> >>
>>> >> >> Can you provide a complete runnable example including some data? Note
>>> >> >> that if you do not specify any errors on your data, they are assumed
>>> >> >> to correspond to a standard deviation of 1 for all dimensions. If that
>>> >> >> is wildly different from the actual variance around the "true"
>>> >> >> surface, then it might lead the optimizer astray.
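>>> >> >>
>>> >> >> As a rough sketch (reusing func and the data arrays from your example;
>>> >> >> the sx/sy values below are placeholders, not real estimates), explicit
>>> >> >> uncertainties would be passed via RealData:
>>> >> >>
>>> >> >>     import numpy as np
>>> >> >>     from scipy.odr import Model, RealData, ODR
>>> >> >>
>>> >> >>     # placeholder standard deviations; replace with actual estimates
>>> >> >>     sx = np.vstack([0.1 * np.ones_like(x_data),
>>> >> >>                     0.1 * np.ones_like(z_data)])
>>> >> >>     sy = 0.05 * np.abs(y_data)      # e.g. a relative 5% error on y
>>> >> >>
>>> >> >>     myData = RealData([x_data, z_data], y_data, sx=sx, sy=sy)
>>> >> >>     myOdr = ODR(myData, Model(func), beta0=[1, 1, 1])
>>> >> >>     out = myOdr.run()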
>>> >> >>
>>> >> >> --
>>> >> >> Robert Kern
>>> >> >>
>>> >> >
>>> >> > I wonder why the result changes when I change the initial guess. As it is, the result depends directly on the initial guess, which is wrong.
>>> >> >
>>> >> > Here is an example (sorry for the huge array of data, but it's important to show what happens with it):
>>> >> >
>>> >> > import numpy as np
>>> >> > from scipy.odr import *
>>> >> > from math import *
>>> >>
>>> >> [snip]
>>> >>
>>> >> > def funcReturner(p, input):
>>> >> >        input = np.array(input)
>>> >> >        x = input[0]
>>> >> >        z = input[1]
>>> >> >        return 10**(p[0]*x + p[1]*z +p[2])
>>> >>
>>> >> Ah. 10**(p[0]*x+p[1]*z+p[2]) is a *lot* different from the linear
>>> >> problem you initially asked about. Setting the uncertainties
>>> >> accurately on all axes of your data is essential. Do you really know
>>> >> what they are? It's possible that you want to try fitting a plane to
>>> >> np.log10(y_data) instead.
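>>> >>
>>> >> Something along these lines (just a sketch, reusing the arrays from your
>>> >> example):
>>> >>
>>> >>     import numpy as np
>>> >>     from scipy.odr import Model, Data, ODR
>>> >>
>>> >>     def log_plane(p, inp):
>>> >>         x, z = inp
>>> >>         return p[0]*x + p[1]*z + p[2]       # this models log10(y)
>>> >>
>>> >>     myData = Data([x_data, z_data], np.log10(y_data))
>>> >>     myOdr = ODR(myData, Model(log_plane), beta0=[0.04, -0.02, 1.75])
>>> >>     out = myOdr.run()
>>> >>     print(out.beta)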
>>> >>
>>> >> > myModel = Model(funcReturner)
>>> >> > myData = Data([x_data,z_data], y_data)
>>> >> > myOdr = ODR(myData, myModel, beta0=[0.04, -0.02,  1.75])
>>> >> > myOdr.set_job(fit_type=0)
>>> >> > out = myOdr.run()
>>> >> > result = out.beta
>>> >> >
>>> >> > print "Optimal coefficients: ", result
>>> >> >
>>> >> > I tried to specify sx, sy, we, wd, delta, everything, and I get better results, but they are still not what I need. And they still depend directly on the initial guess as well.
>>> >> > If I set the initial guess to [1,1,1], it fails to find any close solution and returns a totally wrong result with a huge residual variance like 3.21014784829e+209.
>>> >>
>>> >> For such a nonlinear problem, finding reasonable initial guesses is
>>> >> useful. There is also a maximum iteration limit defaulting to a fairly
>>> >> low 50. Check out.stopreason to see if it actually converged or just
>>> >> ran into the iteration limit. You can keep calling myOdr.restart()
>>> >> until it converges. If I start with beta0=[1,1,1], it converges
>>> >> somewhere between 300 and 400 iterations.
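>>> >>
>>> >> For example, a sketch of that restart loop (with a safety cap):
>>> >>
>>> >>     out = myOdr.run()
>>> >>     for _ in range(20):              # safety cap on the number of restarts
>>> >>         if not any('iteration limit' in r.lower()
>>> >>                    for r in out.stopreason):
>>> >>             break
>>> >>         out = myOdr.restart()
>>> >>     print(out.stopreason)
>>> >>
>>> >> Alternatively, pass a larger limit up front, e.g.
>>> >> ODR(myData, myModel, beta0=initialGuess, maxit=1000).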
>>> >>
>>> >> --
>>> >> Robert Kern
>>> >>
>>> >
>>> > Yeah, increasing the number of iterations (the maxit parameter) makes the results slightly more accurate, but not really better. I mean, even when I get the stop reason to be "sum square convergence", the results are even worse. But I tried to fit the transformed function, like you recommended - np.log10(y_data) - and it gave me the proper results. Why does that happen, and is it possible to achieve these results without the transformation?
>>>
>>> As I mentioned before, in a nonlinear case, you really need to have
>>> good estimates of the uncertainties on each point. Since your Y
>>> variable varies over several orders of magnitude, I really doubt that
>>> the uncertainties are the same for each point. It's more likely that
>>> you want to assign a relative 10% (or whatever) uncertainty to each
>>> point rather than the same absolute uncertainty to each. I don't think
>>> that you have really measured both 1651.5+-1.0 and 0.05+-1.0, but
>>> that's what you are implicitly saying when you don't provide explicit
>>> estimates of the uncertainties.
>>>
>>> One problem that you are going to run into is that least-squares isn't
>>> especially appropriate for your model. Your Y output is strictly
>>> positive, but it goes very close to 0.0. The error model that
>>> least-squares fits is that each measurement follows a Gaussian
>>> distribution about the true point, and the Gaussian has infinite
>>> support (specifically, it crosses that 0 line, and you know a priori
>>> that you will never observe a negative value). For the observations
>>> ~1000.0, that doesn't matter much, but it severely distorts the
>>> problem at 0.05. Your true error distribution is probably something
>>> like log-normal; the errors below the curve are small but the errors
>>> above can be large. Transforming strictly-positive data with a
>>> logarithm is a standard technique. In a sense, the "log-transformed"
>>> model is the "true" model to be using, at least if you want to use
>>> least-squares. Looking at the residuals of both original and the
>>> log10-transformed problem (try plot(x_data, out.eps, 'k.'),
>>> plot(x_data, out.delta[0], 'k.'), etc.), it looks like the
>>> log10-transformed data does fit fairly well; the residuals mostly
>>> follow a normal distribution of the same size across the dataset.
>>> That's good! But it also means that if you transform these residuals
>>> back to the original space, they don't follow a normal distribution
>>> anymore, and using least-squares to fit the problem isn't appropriate
>>> anymore.
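>>>
>>> A minimal plotting sketch of those residual checks, assuming matplotlib
>>> and the fitted `out` objects from the two runs:
>>>
>>>     import matplotlib.pyplot as plt
>>>
>>>     plt.subplot(2, 1, 1)
>>>     plt.plot(x_data, out.eps, 'k.')         # residuals in y
>>>     plt.ylabel('eps (y residuals)')
>>>     plt.subplot(2, 1, 2)
>>>     plt.plot(x_data, out.delta[0], 'k.')    # orthogonal shifts in x
>>>     plt.ylabel('delta[0] (x shifts)')
>>>     plt.show()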
>>>
>>> > I could keep using the transformed function, but the problem is that I have a whole list of different functions to fit, and I'd like to create a universal fitter for all of them.
>>>
>>> Well, you will have to go through those functions (and their implicit
>>> error models) and determine if least-squares is truly appropriate for
>>> them. Least-squares is not appropriate for all models. However,
>>> log-transforming the strictly-positive variables in a model quite
>>> frequently is all you need to do to turn a least-squares-inappropriate
>>> model into a least-squares-appropriate one. You can write your
>>> functions in that log-transformed form and write a little adapter to
>>> transform the data (which is given to you in the original form).
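>>>
>>> For instance (a hypothetical adapter; the names here are made up just to
>>> illustrate the idea):
>>>
>>>     import numpy as np
>>>     from scipy.odr import Model, Data, ODR
>>>
>>>     def fit_in_log_space(log_func, x, z, y, beta0):
>>>         # log_func(p, [x, z]) must return log10 of the model prediction
>>>         data = Data([x, z], np.log10(y))
>>>         odr = ODR(data, Model(log_func), beta0=beta0)
>>>         return odr.run()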
>>>
>>> --
>>> Robert Kern
>>>
>>
>> Robert, thank you very much for the detailed answer; now I see what the problem is. I don't really have any uncertainties, and I guess it would be hard to compute them from the data. Moreover, this data is just a sample, and I will have different types of data in reality. The transformation actually helps for just a couple of functions; for instance, 10**(A*x + B*z + C) and C*(A)**X*(B)**Z fit just perfectly, but it doesn't work for any of the others (like A*lg(X) + B*Z + C or C/(1 + A*X + B*Z)).
>
> Right. That's why I said that you have to go through all of the
> functions to see if it's applicable or not.
>
>> I transform the function roughly like this:
>> y_data = np.log10(y_data)
>> function = np.log10(function)
>> Is that correct?
>
> Yes.
>
>> And what do you mean by little adapter to transform the data?
>
> I just meant the "y_data = np.log10(y_data)" part.
>
>> By the way, the problem appears only in 3D. When I use the same logarithmic data in 2D (no Z axis), it works perfectly for all functions, and no log10 transformation is needed (the transformation distorts the results and makes them worse in that case).
>
> Without knowing how you are getting the data to make that
> determination, I don't have much to say about it. The problem of
> inaccurate uncertainties will probably get larger as you increase the
> dimension of the inputs. Since you don't know them, it's probably not
> a good idea to keep trying to do ODR instead of ordinary least
> squares. Use myOdr.set_job(fit_type=2) to use OLS instead. You still
> have a problem with the unknown uncertainties on the Y output, but
> that narrows some of the problems down.
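>
> That is, something like this, reusing the objects from your example:
>
>     myOdr = ODR(myData, myModel, beta0=[0.04, -0.02, 1.75])
>     myOdr.set_job(fit_type=2)      # 2 = ordinary least squares
>     out = myOdr.run()
>     print(out.beta)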
>
>> Do you know any other fitting methods, available in python?
>
> None that free you from this kind of thoughtful analysis. Fitting
> functions isn't a black box, I'm afraid. You need to consider your
> models and your data before fitting, and you need to look at the
> residuals afterwards to check that the results make sense. Then you
> improve your model. Least-squares is not suitable for all models. You
> may need to roll your own error model and use the generic minimization
> routines in scipy.optimize. If you can formulate your models as
> generalized linear models (which you can for a couple of them, but not
> all), you could look at the functionality for this in the statsmodels
> package.
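>
> As one illustration of "rolling your own error model" (not necessarily
> the right model for your data, just a sketch): a log-normal
> maximum-likelihood fit of the 10**(...) model using scipy.optimize,
> with constants that don't depend on the parameters dropped:
>
>     import numpy as np
>     from scipy import optimize
>
>     def neg_loglike(p, x, z, y):
>         x, z, y = map(np.asarray, (x, z, y))
>         mu = np.log(10.0) * (p[0]*x + p[1]*z + p[2])   # log of the model
>         r = np.log(y) - mu
>         # p[3] is log(sigma), so sigma stays positive
>         return np.sum(0.5 * (r / np.exp(p[3]))**2 + p[3])
>
>     p0 = [0.04, -0.02, 1.75, 0.0]
>     phat = optimize.fmin(neg_loglike, p0, args=(x_data, z_data, y_data))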

some additional comments

If the nonlinear functions are common or standard (in a field), then it
may be possible to find predefined starting values, or to figure out
function-specific ways to (semi-)automate the estimation.

I didn't look very closely at them, but these packages had a large
collection of nonlinear functions (the last time I looked):

http://code.google.com/p/pyeq2/     (also has a global optimizer)
http://packages.python.org/PyModelFit/    (maybe 1d only)

R has a few "self-starting" non-linear functions, but AFAIK 1d only,
where it's possible to figure out good starting values from the data.

As Robert explained, one problem with least squares is that the
variance of the error might not be the same for all observations
(called heteroscedasticity in econometrics).

Besides guessing the right transformation, such as the log in this
example, there are ways to test for heteroscedasticity and to estimate
the transformation within a class of nonlinear transformations.

statsmodels has a few statistical tests, but they are currently only
intended and checked for the linear case (they should extend to
nonlinear functions).
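
For the linear case such a test looks roughly like this (a sketch
against a recent statsmodels; the exact import location of the test
has moved between versions):

    import numpy as np
    import statsmodels.api as sm
    from statsmodels.stats.diagnostic import het_breuschpagan

    # regress the (log-transformed) response on [1, x, z] and test the residuals
    exog = sm.add_constant(np.column_stack([x_data, z_data]))
    res = sm.OLS(np.log10(y_data), exog).fit()
    lm_stat, lm_pval, f_stat, f_pval = het_breuschpagan(res.resid, exog)
    print(lm_pval)     # a small p-value points to heteroscedasticity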

scipy.stats has an estimator for the Box-Cox transformation, but I
have looked at it only for the case of identically distributed
observations, not for function fitting. (statsmodels doesn't have any
parameterized transformations yet, only predefined transformations
like the log.) I don't know of anything in Python that could currently
be used directly for this.
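
For the identically distributed case, that estimator can be used like
this (a minimal sketch; it requires strictly positive data):

    import numpy as np
    from scipy import stats

    y_arr = np.asarray(y_data, dtype=float)
    y_transformed, lmbda = stats.boxcox(y_arr)
    print(lmbda)       # lmbda near 0 corresponds to a log transform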

Josef

> --
> Robert Kern
