[Tutor] equation solving (was: Re: Tutor Digest, Vol 114, Issue 73)

Oscar Benjamin oscar.j.benjamin at gmail.com
Fri Sep 6 11:27:59 CEST 2013


On 5 September 2013 21:59, Alan Gauld <alan.gauld at btinternet.com> wrote:
> On 05/09/13 20:13, I. Alejandro Fleischer wrote:
>>
>> I have a set of data to fit to a custom equation, y=a+b*exp(k*x), would
>> you advise me on how to do this, or point me to a tutorial?
>
> Can you be more precise? The more specific you are the easier it is
> to give specific answers.

I agree with Alan. You need to be more specific. I think that I
understand what you mean but perhaps you aren't aware that your
problem is ill-posed.

I'm going to assume that you have some data that gives paired
measurements of two quantities e.g. (x1, y1), (x2, y2), ... (xn, yn).
You want to find parameters a, b, and k so that y = a+b*exp(k*x) is a
good fit to your data. The problem is that there is no unique
definition of a "good" fit.

A well-posed optimisation problem identifies a single scalar quantity
that must be minimised (or maximised). The obvious choice in this kind
of thing is to treat one of your measured variables as the independent
variable (by convention this is called x) and the other as the
dependent variable (by convention y) and then define your error as the
sum of the squares of the residuals in estimating yi from xi:

Error = (1/2) sum[i=1..N] {   ((yi - (a+b*exp(k*xi)))**2)  }
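
As a rough sketch, that error could be written with numpy as follows.
The names model and sum_sq_error and the sample data are made up for
illustration; they are not from any library.

```python
import numpy as np

def model(x, a, b, k):
    """The proposed curve y = a + b*exp(k*x)."""
    return a + b * np.exp(k * x)

def sum_sq_error(params, x, y):
    """Half the sum of squared residuals in estimating y from x."""
    a, b, k = params
    residuals = y - model(x, a, b, k)
    return 0.5 * np.sum(residuals ** 2)

# Made-up, noise-free data generated from a=1, b=2, k=0.5
x = np.linspace(0.0, 2.0, 5)
y = model(x, 1.0, 2.0, 0.5)

err_true = sum_sq_error((1.0, 2.0, 0.5), x, y)  # parameters that made the data
err_off = sum_sq_error((1.0, 2.0, 0.6), x, y)   # slightly wrong k
```

The error is zero at the parameters that generated the data and grows
as soon as any parameter moves away from them.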

However this is an arbitrary choice. You could have tried to regress x
onto y instead and then used the residuals for x which would lead to
different answers. Similarly choosing the sum of squares of the
residuals is an arbitrary choice.

In your particular case the highly non-linear relationship between x
and y means that minimising this kind of error could lead to a poor
result. If some of the yi are very large - as they could easily be for
this exponential relationship - then your fit will end up being
dominated by the largest data-points. In the worst case you'd
basically be computing an exact fit to the three largest data-points.
So a better residual might be a relative one (assuming none of the yi
are zero), something like:

(yi - (a+b*exp(k*xi))) / yi

It's hard to say without knowing more about the data or the problem though.
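
To illustrate, here is a minimal sketch that minimises the sum of
squares of that relative residual. It assumes a reasonably recent
scipy that provides scipy.optimize.least_squares. The data, the true
parameters and the starting guess are all invented for the example.

```python
import numpy as np
from scipy.optimize import least_squares

def relative_residuals(params, x, y):
    """Residuals scaled by y, so large y values do not dominate the fit."""
    a, b, k = params
    return (y - (a + b * np.exp(k * x))) / y

# Invented noise-free data from a=1, b=2, k=0.8 (every y is non-zero)
x = np.linspace(0.0, 4.0, 30)
y = 1.0 + 2.0 * np.exp(0.8 * x)

start = np.array([0.5, 1.5, 1.0])  # rough initial guess, an assumption
fit = least_squares(relative_residuals, start, args=(x, y))
a_fit, b_fit, k_fit = fit.x
```

Whether the relative or the absolute residual is more appropriate
depends on how the noise in the data behaves, which I can't tell from
here.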

In any case your problem is just complicated enough that you need a
non-linear optimisation routine e.g. from scipy. If you knew the value
of a you could do a linear regression of log(y-a) onto x, since
log(y-a) = log(b) + k*x (this needs b > 0 so that y-a is positive).
Similarly if you knew the value of k you could do a linear regression
of y onto exp(k*x). If you don't know any of a, b, or k then you have a
non-linear regression problem and you'll probably want to use a
function for non-linear least squares or perhaps an arbitrary
non-linear optimisation routine.
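
The two special cases can be sketched with numpy.polyfit, which fits a
straight line when asked for degree 1. The data and the "known" values
are made up for illustration, and the log version assumes y - a > 0
for every point.

```python
import numpy as np

# Made-up noise-free data from a=1, b=2, k=0.5
a_true, b_true, k_true = 1.0, 2.0, 0.5
x = np.linspace(0.0, 4.0, 20)
y = a_true + b_true * np.exp(k_true * x)

# Case 1: k is known. y is linear in exp(k*x): slope b, intercept a.
b_fit, a_fit = np.polyfit(np.exp(k_true * x), y, 1)

# Case 2: a is known. log(y - a) = log(b) + k*x is linear in x:
# slope k, intercept log(b).
k_fit, log_b = np.polyfit(x, np.log(y - a_true), 1)
b_from_log = np.exp(log_b)
```

Note that regressing on log(y - a) minimises errors in the logarithm
rather than in y itself, so with noisy data the two approaches weight
the points differently.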

So your first step is probably to install scipy if you haven't already
and have a look at its optimize module. I can be more specific if you
explain a little more about what you're trying to do and what your
data looks like. Also as Alan says you need to explain how experienced
you are in the relevant maths, and in programming and Python to get
reasonable help.
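
In the meantime, here is a minimal sketch of the plain non-linear least
squares version using scipy.optimize.curve_fit. The data is synthetic
(generated from a=1, b=2, k=0.8 with a little noise) and the starting
guess p0 is an assumption. With real data you would need to choose a
sensible p0 yourself, since a poor one can stop the fit converging.

```python
import numpy as np
from scipy.optimize import curve_fit

def model(x, a, b, k):
    """The curve to fit: y = a + b*exp(k*x)."""
    return a + b * np.exp(k * x)

# Synthetic data: true parameters plus small Gaussian noise
rng = np.random.default_rng(0)
x = np.linspace(0.0, 3.0, 40)
y = model(x, 1.0, 2.0, 0.8) + rng.normal(scale=0.01, size=x.size)

# p0 is the initial guess for (a, b, k)
popt, pcov = curve_fit(model, x, y, p0=(0.5, 1.5, 1.0))
a_fit, b_fit, k_fit = popt

# Rough one-standard-deviation uncertainties from the covariance matrix
perr = np.sqrt(np.diag(pcov))
```

curve_fit minimises the sum of squared residuals in y, so the caveat
above about large data-points dominating the fit applies to it too.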


Oscar

