[SciPy-User] How to fit a curve/function?

Fri Jun 10 09:22:52 EDT 2011

-------- Original-Nachricht --------
> Datum: Thu, 9 Jun 2011 12:07:47 -0400
> Von: josef.pktd at gmail.com
> An: SciPy Users List <scipy-user at scipy.org>
> Betreff: Re: [SciPy-User] How to fit a curve/function?

> On Thu, Jun 9, 2011 at 11:52 AM, Johannes Radinger <JRadinger at gmx.at>
> wrote:
> > Hello again...
> >
> > i try no to fit a curve using integrals as conditions.
> > the scipy manual says that integrations to infinite are possible with
> Inf,
> >
> > I tried following but I fail (saying inf is not defined):
> >
> > cond2 = 5.0/10/2 - integrate.quad(pdf,35000,Inf,args=(s1,s2))[0]
> >
> > what causes the problem? Do I use quad/Inf in a wrong way?
> 
> numpy.inf    inf doesn't exist in python itself

Oh thank you for that! I was just confused because in the manual Inf is used itself in the example...anyway it works now..

just additoinal questions:

1) How can I fit the curve with other kind of data. Like, if got something
like a histogram (counts per category of x). I want to fit now the function already mentioned with these data...

2) Can I get a value how good the fit is?

/Johannes

> 
> Josef
> 
> >
> > The error is:
> > NameError: global name 'Inf' is not defined
> >
> > /Johannes
> >
> > -------- Original-Nachricht --------
> >> Datum: Wed, 8 Jun 2011 11:37:52 -0400
> >> Von: josef.pktd at gmail.com
> >> An: SciPy Users List <scipy-user at scipy.org>
> >> Betreff: Re: [SciPy-User] How to fit a curve/function?
> >
> >> On Wed, Jun 8, 2011 at 11:37 AM,  <josef.pktd at gmail.com> wrote:
> >> > On Wed, Jun 8, 2011 at 10:56 AM, Johannes Radinger <JRadinger at gmx.at>
> >> wrote:
> >> >>
> >> >> -------- Original-Nachricht --------
> >> >>> Datum: Wed, 8 Jun 2011 10:33:45 -0400
> >> >>> Von: josef.pktd at gmail.com
> >> >>> An: SciPy Users List <scipy-user at scipy.org>
> >> >>> Betreff: Re: [SciPy-User] How to fit a curve/function?
> >> >>
> >> >>> On Wed, Jun 8, 2011 at 10:27 AM, Johannes Radinger
> <JRadinger at gmx.at>
> >> >>> wrote:
> >> >>> >
> >> >>> > -------- Original-Nachricht --------
> >> >>> >> Datum: Wed, 8 Jun 2011 10:12:58 -0400
> >> >>> >> Von: josef.pktd at gmail.com
> >> >>> >> An: SciPy Users List <scipy-user at scipy.org>
> >> >>> >> Betreff: Re: [SciPy-User] How to fit a curve/function?
> >> >>> >
> >> >>> >> On Wed, Jun 8, 2011 at 9:41 AM, Johannes Radinger
> >> <JRadinger at gmx.at>
> >> >>> >> wrote:
> >> >>> >> >
> >> >>> >> > -------- Original-Nachricht --------
> >> >>> >> >> Datum: Wed, 8 Jun 2011 07:10:38 -0400
> >> >>> >> >> Von: josef.pktd at gmail.com
> >> >>> >> >> An: SciPy Users List <scipy-user at scipy.org>
> >> >>> >> >> Betreff: Re: [SciPy-User] How to fit a curve/function?
> >> >>> >> >
> >> >>> >> >> On Wed, Jun 8, 2011 at 6:52 AM, Johannes Radinger
> >> <JRadinger at gmx.at>
> >> >>> >> >> wrote:
> >> >>> >> >> > Hello,
> >> >>> >> >> >
> >> >>> >> >> > I've got following function describing any kind of animal
> >> >>> dispersal
> >> >>> >> >> kernel:
> >> >>> >> >> >
> >> >>> >> >> > def pdf(x,s1,s2):
> >> >>> >> >> >    return
> >> >>> >> >>
> >> >>> >>
> >> >>>
> >>
> (p/(math.sqrt(2*math.pi*s1**2))*numpy.exp(-((x-0)**(2)/(2*s1**(2)))))+((1-p)/(s2*math.sqrt(2*math.pi))*numpy.exp(-((x-0)**(2)/(2*s2**(2)))))
> >> >>> >> >> >
> >> >>> >> >> > On the other hand I've got data from literature with which
> I
> >> want
> >> >>> to
> >> >>> >> fit
> >> >>> >> >> the function so that I get s1, s2 and x.
> >> >>> >> >> > Ususally the data in the literature are as follows:
> >> >>> >> >> >
> >> >>> >> >> > Example 1: 50% of the animals are between -270m and +270m
> and
> >> 90%
> >> >>> >>  are
> >> >>> >> >> between -500m and + 500m
> >> >>> >> >> >
> >> >>> >> >> > Example 2: 84% is between - 5000 m and +5000m, and 73% are
> >> between
> >> >>> >> >> -1000m and +1000m
> >> >>> >> >> >
> >> >>> >> >> > So far as I understand an integration of the function is
> >> needed to
> >> >>> >> solve
> >> >>> >> >> for s1 and s2 as all the literature data give percentage
> (area
> >> under
> >> >>> >> the
> >> >>> >> >> curve) Can that be used to fit the curve or can that create
> >> ranges
> >> >>> for
> >> >>> >> s1
> >> >>> >> >> and s2.
> >> >>> >> >>
> >> >>> >> >> I don't see a way around integration.
> >> >>> >> >>
> >> >>> >> >> If you have exactly 2 probabilities, then you can you a
> solver
> >> like
> >> >>> >> >> scipy.optimize.fsolve to match the probabilites
> >> >>> >> >> eg.
> >> >>> >> >> 0.5 = integral pdf from -270 to 270
> >> >>> >> >> 0.9 = integral pdf from -500 to 500
> >> >>> >> >>
> >> >>> >> >> If you have more than 2 probabilities, then using
> optimization
> >> of a
> >> >>> >> >> weighted function of the moment conditions would be better.
> >> >>> >> >>
> >> >>> >> >> Josef
> >> >>> >> >
> >> >>> >> >
> >> >>> >> >
> >> >>> >> > Hello again
> >> >>> >> >
> >> >>> >> > I tried following, but without success so far. What do I have
> to
> >> do
> >> >>> >> excactly...
> >> >>> >> >
> >> >>> >> > import numpy
> >> >>> >> > from scipy import stats
> >> >>> >> > from scipy import integrate
> >> >>> >> > from scipy.optimize import fsolve
> >> >>> >> > import math
> >> >>> >> >
> >> >>> >> > p=0.3
> >> >>> >> >
> >> >>> >> > def pdf(x,s1,s2):
> >> >>> >> >    return
> >> >>> >>
> >> >>>
> >>
> (p/(math.sqrt(2*math.pi*s1**2))*numpy.exp(-((x-0)**(2)/(2*s1**(2)))))+((1-p)/(s2*math.sqrt(2*math.pi))*numpy.exp(-((x-0)**(2)/(2*s2**(2)))))
> >> >>> >> >
> >> >>> >> > def equ(s1,s2):
> >> >>> >> >    0.5==integrate.quad(pdf,-270,270,args=(s1,s2))
> >> >>> >> >    0.9==integrate.quad(pdf,-500,500,args=(s1,s2))
> >> >>> >> >
> >> >>> >> > result=fsolve(equ, 1,500)
> >> >>> >> >
> >> >>> >> > print result
> >> >>> >>
> >> >>> >> equ needs to return the deviation of the equations (I changed
> some
> >> >>> >> details for s1 just to try it)
> >> >>> >>
> >> >>> >> import numpy
> >> >>> >> from scipy import stats
> >> >>> >> from scipy import integrate
> >> >>> >> from scipy.optimize import fsolve
> >> >>> >> import math
> >> >>> >>
> >> >>> >> p=0.3
> >> >>> >>
> >> >>> >> def pdf(x,s1,s2):
> >> >>> >>     return
> >> >>> >>
> >> >>>
> >>
> (p/(math.sqrt(2*math.pi*s1**2))*numpy.exp(-((x-0)**(2)/(2*s1**(2)))))+((1-p)/(math.sqrt(2*math.pi*s2**2))*numpy.exp(-((x-0)**(2)/(2*s2**(2)))))
> >> >>> >>
> >> >>> >> def equ(arg):
> >> >>> >>     s1,s2 = numpy.abs(arg)
> >> >>> >>     cond1 = 0.5 - integrate.quad(pdf,-270,270,args=(s1,s2))[0]
> >> >>> >>     cond2 = 0.9 - integrate.quad(pdf,-500,500,args=(s1,s2))[0]
> >> >>> >>     return [cond1, cond2]
> >> >>> >>
> >> >>> >> result=fsolve(equ, [200., 1200])
> >> >>
> >> >> thank you for your last reply...seems that the parameters of the two
> >> normals are nearly identical... anyway just two small addtional
> questions:
> >> >>
> >> >> 1)in fsolve(equ, [200., 1200]) the 200 and 1200 are kind of start
> >> values so far as I understand...how should these be choosen? what is
> >> recommended?
> >> >
> >> > There is no general solution for choosing starting values, in your
> >> > case it should be possible to
> >> >
> >> >>>> q = np.array([0.5, 0.9])
> >> >>>> cr = x/stats.norm.ppf(0.5 + q/2.)
> >> >>>> x = [270, 500]
> >> >>>> q = np.array([0.5, 0.9])
> >> >>>> x = [270, 500]
> >> >>>> cr = x/stats.norm.ppf(0.5 + q/2.)
> >> >>>> stats.norm.cdf(500, scale=cr[1]) - stats.norm.cdf(-500,
> scale=cr[1])
> >> > 0.89999999999999991
> >> -------
> >> I forgot to remove the typos
> >> >>>> stats.norm.cdf(q[0], scale=cr[1]) - stats.norm.cdf(-q[0],
> >> scale=cr[0])
> >> > 0.0011545021185267457
> >> >>>> stats.norm.cdf(q[0], scale=cr[0]) - stats.norm.cdf(-q[0],
> >> scale=cr[0])
> >> > 0.000996601515122153
> >> ---------
> >> >>>> stats.norm.cdf(x[0], scale=cr[0]) - stats.norm.cdf(-x[0],
> >> scale=cr[0])
> >> > 0.5
> >> >>>> sol = fsolve(equ, np.sort(cr))
> >> >
> >> > there are some numerical problems finding the solution (???)
> >> >
> >> >>>> equ(sol)
> >> > array([-0.05361093,  0.05851309])
> >> >>>> from pprint import pprint
> >> >>>> pprint(fsolve(equ, np.sort(cr), xtol=1e-10, full_output=1))
> >> > (array([ 354.32616549,  354.69918062]),
> >> >  {'fjac': array([[-0.7373189 , -0.67554484],
> >> >       [ 0.67554484, -0.7373189 ]]),
> >> >  'fvec': array([-0.05361093,  0.05851309]),
> >> >  'nfev': 36,
> >> >  'qtf': array([  1.40019135e-07,  -7.93593929e-02]),
> >> >  'r': array([ -5.21390161e-04,  -1.21700831e-03,  
> 3.88274320e-07])},
> >> >  5,
> >> >  'The iteration is not making good progress, as measured by the \n
> >> > improvement from the last ten iterations.')
> >> >
> >> >>
> >> >> 2) How can that be solve if I have I third condition (overfitted)
> can
> >> that be used as well or how does the alternative look like?
> >> >
> >> > use optimize.leastsq on equ (I never tried this for this case)
> >> > use fmin on the sum of squared errors
> >> >
> >> > if the intervals for the probabilities are non-overlapping (interval
> >> > data), then there is an optimal weighting matrix, (but my code for
> >> > that in the statsmodels.sandbox is not verified).
> >> >
> >> > Josef
> >> >
> >> >
> >> >>
> >> >> /johannes
> >> >>
> >> >>> >>
> >> >>> >> print result
> >> >>> >>
> >> >>> >> but in the results I get the parameters are very close to each
> >> other
> >> >>> >> [-356.5283675   353.82544075]
> >> >>> >>
> >> >>> >> the pdf looks just like a mixture of 2 normals both with loc=0,
> >> then
> >> >>> >> maybe the cdf of norm can be used directly
> >> >>> >
> >> >>> >
> >> >>> > Thank you for that hint... First yes these are 2 superimposed
> >> normals
> >> >>> but for other reasons I want to use the original formula instead of
> >> the
> >> >>> stats.functions...
> >> >>> >
> >> >>> > anyway there is still a thing...the locator s1 and s2 are like
> the
> >> scale
> >> >>> parameter of stats.norm so the are both + and -. For fsolve above
> it
> >> seems
> >> >>> that I get only one parameter (s1 or s2) but for the positive and
> >> negative
> >> >>> side of the distribution. So in actually there are four parameters
> >> -s1,
> >> >>> +s1, -s2, +s2. How can I solve that? Maybe I can restrict the
> fsolve
> >> to look
> >> >>> for the two values only in the positive range...
> >> >>>
> >> >>> It doesn't really matter, if the scale only shows up in quadratic
> >> >>> terms, or as in my initial change I added a absolute value, so
> whether
> >> >>> it's positive or negative, it's still only one value, and we
> >> >>> interprete it as postive scale
> >> >>>
> >> >>> s1 = sqrt(s1**2)
> >> >>>
> >> >>> Josef
> >> >>>
> >> >>> >
> >> >>> > any guesses?
> >> >>> >
> >> >>> > /J
> >> >>> >
> >> >>> >>
> >> >>> >> >>> from scipy import stats
> >> >>> >> >>> stats.norm.cdf(270, scale=350) - stats.norm.cdf(-270,
> >> scale=350)
> >> >>> >> 0.55954705470577526
> >> >>> >> >>>
> >> >>> >> >>> stats.norm.cdf(270, scale=354) - stats.norm.cdf(-270,
> >> scale=354)
> >> >>> >> 0.55436474670960978
> >> >>> >> >>> stats.norm.cdf(500, scale=354) - stats.norm.cdf(-500,
> >> scale=354)
> >> >>> >> 0.84217642881921018
> >> >>> >>
> >> >>> >> Josef
> >> >>> >> >
> >> >>> >> >
> >> >>> >> > /Johannes
> >> >>> >> >>
> >> >>> >> >> >
> >> >>> >> >> > /Johannes
> >> >>> >> >> >
> >> >>> >> >> > --
> >> >>> >> >> > NEU: FreePhone - kostenlos mobil telefonieren!
> >> >>> >> >> > Jetzt informieren: http://www.gmx.net/de/go/freephone
> >> >>> >> >> > _______________________________________________
> >> >>> >> >> > SciPy-User mailing list
> >> >>> >> >> > SciPy-User at scipy.org
> >> >>> >> >> > http://mail.scipy.org/mailman/listinfo/scipy-user
> >> >>> >> >> >
> >> >>> >> >> _______________________________________________
> >> >>> >> >> SciPy-User mailing list
> >> >>> >> >> SciPy-User at scipy.org
> >> >>> >> >> http://mail.scipy.org/mailman/listinfo/scipy-user
> >> >>> >> >
> >> >>> >> > --
> >> >>> >> > NEU: FreePhone - kostenlos mobil telefonieren!
> >> >>> >> > Jetzt informieren: http://www.gmx.net/de/go/freephone
> >> >>> >> > _______________________________________________
> >> >>> >> > SciPy-User mailing list
> >> >>> >> > SciPy-User at scipy.org
> >> >>> >> > http://mail.scipy.org/mailman/listinfo/scipy-user
> >> >>> >> >
> >> >>> >> _______________________________________________
> >> >>> >> SciPy-User mailing list
> >> >>> >> SciPy-User at scipy.org
> >> >>> >> http://mail.scipy.org/mailman/listinfo/scipy-user
> >> >>> >
> >> >>> > --
> >> >>> > NEU: FreePhone - kostenlos mobil telefonieren!
> >> >>> > Jetzt informieren: http://www.gmx.net/de/go/freephone
> >> >>> > _______________________________________________
> >> >>> > SciPy-User mailing list
> >> >>> > SciPy-User at scipy.org
> >> >>> > http://mail.scipy.org/mailman/listinfo/scipy-user
> >> >>> >
> >> >>> _______________________________________________
> >> >>> SciPy-User mailing list
> >> >>> SciPy-User at scipy.org
> >> >>> http://mail.scipy.org/mailman/listinfo/scipy-user
> >> >>
> >> >> --
> >> >> NEU: FreePhone - kostenlos mobil telefonieren!
> >> >> Jetzt informieren: http://www.gmx.net/de/go/freephone
> >> >> _______________________________________________
> >> >> SciPy-User mailing list
> >> >> SciPy-User at scipy.org
> >> >> http://mail.scipy.org/mailman/listinfo/scipy-user
> >> >>
> >> >
> >> _______________________________________________
> >> SciPy-User mailing list
> >> SciPy-User at scipy.org
> >> http://mail.scipy.org/mailman/listinfo/scipy-user
> >
> > --
> > NEU: FreePhone - kostenlos mobil telefonieren!
> > Jetzt informieren: http://www.gmx.net/de/go/freephone
> > _______________________________________________
> > SciPy-User mailing list
> > SciPy-User at scipy.org
> > http://mail.scipy.org/mailman/listinfo/scipy-user
> >
> _______________________________________________
> SciPy-User mailing list
> SciPy-User at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-user

-- 
NEU: FreePhone - kostenlos mobil telefonieren!			
Jetzt informieren: http://www.gmx.net/de/go/freephone