[SciPy-Dev] Subversion scipy.stats irregular problem with source code example

Skipper Seabold jsseabold at gmail.com
Tue Sep 28 13:18:13 EDT 2010


On Tue, Sep 28, 2010 at 1:12 PM, James Phillips <zunzun at zunzun.com> wrote:
> Since I observed the following behavior in the SVN repository version
> of SciPy, it seemed proper to post to the dev mailing list.  I'm
> using Ubuntu Lucid Lynx (32-bit) and a fresh Git checkout of NumPy.
> I am not sure whether a Trac bug report needs to be entered.
>
>
> Below is some example code for fitting two statistical distributions.
> Sometimes the NumPy-generated data is fit successfully: the starting
> estimates and the fitted parameters differ.  Sometimes I receive many
> repeated messages on the command line like:
>
> Warning: invalid value encountered in absolute
> Warning: invalid value encountered in subtract
>
> and the estimated parameters equal the fitted parameter values,
> indicating no fitting took place.  Sometimes I receive on the command
> line:
>
> Traceback (most recent call last):
>  File "/home/zunzun/local/lib/python2.6/site-packages/scipy/stats/distributions.py",
> line 1987, in func
>    sk = 2*(b-a)*math.sqrt(a + b + 1) / (a + b + 2) / math.sqrt(a*b)
> ValueError: math domain error
> Traceback (most recent call last):
>  File "example.py", line 10, in <module>
>    fitStart_beta = scipy.stats.beta._fitstart(data)
>  File "/home/zunzun/local/lib/python2.6/site-packages/scipy/stats/distributions.py",
> line 1992, in _fitstart
>    a, b = optimize.fsolve(func, (1.0, 1.0))
>  File "/home/zunzun/local/lib/python2.6/site-packages/scipy/optimize/minpack.py",
> line 125, in fsolve
>    maxfev, ml, mu, epsfcn, factor, diag)
> minpack.error: Error occured while calling the Python function named func
>
> and program flow is stopped.
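
The ValueError in the first traceback can be reproduced in isolation. Below is a minimal sketch, assuming only the expression shown on the quoted line 1987: math.sqrt raises for a negative argument, so a trial point with a*b < 0 is enough to trigger the exception. beta_skew_guarded is a hypothetical variant for illustration, not SciPy code, and whether fsolve would converge with it is not tested here.

```python
import math


def beta_skew(a, b):
    # Same expression as the line shown in the traceback.
    return 2 * (b - a) * math.sqrt(a + b + 1) / (a + b + 2) / math.sqrt(a * b)


def beta_skew_guarded(a, b):
    # Hypothetical variant: return NaN instead of raising when the
    # shape parameters are outside the valid region a > 0, b > 0.
    if a <= 0 or b <= 0:
        return float("nan")
    return beta_skew(a, b)


print(beta_skew(2.0, 3.0))  # valid shape parameters: a finite value

try:
    beta_skew(-0.5, 1.0)  # a * b < 0, so math.sqrt raises
except ValueError as exc:
    print("ValueError:", exc)

print(beta_skew_guarded(-0.5, 1.0))  # nan
```
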
>
>
> In summary, three behaviors: (1) Fits OK (2) Many warnings with no
> fitting (3) minpack error.  Running the program 10 times or so will
> reproduce all three behaviors without fail from the "bleeding-edge"
> repository code.
>
>     James Phillips
>
>
> ########################################################
>
> import numpy, scipy, scipy.stats
>
> # test uniform distribution fitting
> data = numpy.random.uniform(2.0, 3.0, size=100)
> fitStart_uniform = scipy.stats.uniform._fitstart(data)
> fittedParameters_uniform = scipy.stats.uniform.fit(data)
>
> # test beta distribution fitting
> data = numpy.random.beta(2.0, 3.0, size=100)
> fitStart_beta = scipy.stats.beta._fitstart(data)
> fittedParameters_beta = scipy.stats.beta.fit(data)
>
> print
> print 'uniform._fitstart returns', fitStart_uniform
> print 'fitted parameters for uniform =', fittedParameters_uniform
> print
> print 'beta._fitstart returns', fitStart_beta
> print 'fitted parameters for beta =', fittedParameters_beta
> print

Is there an existing bug ticket for this?  If not, there probably should be.

I think the fitting code should be regarded as experimental.  It's
good that you caught that no fitting is actually done in these cases.
The problem stems (for the most part) from bad starting values: for
distributions with bounded support, the starting values can fall
outside the support.  I've tried to go through and fix this, giving
very naive (but correct) starting values to the fit methods, but I
haven't gotten much further than that.
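
One way to get starting values that stay inside the valid region is the method of moments. What follows is a minimal sketch for a beta distribution on [0, 1], using only the standard library. beta_start_values is a hypothetical helper, not what SciPy's _fitstart does, and the (1.0, 1.0) fallback is an assumption.

```python
import random
import statistics


def beta_start_values(data):
    """Method-of-moments shape estimates (a, b) for data in (0, 1)."""
    m = statistics.mean(data)
    v = statistics.variance(data)  # sample variance
    # The moment equations have a positive solution only when
    # 0 < v < m * (1 - m); otherwise fall back to beta(1, 1).
    if not (0.0 < m < 1.0) or not (0.0 < v < m * (1.0 - m)):
        return 1.0, 1.0  # assumed fallback, i.e. the uniform case
    common = m * (1.0 - m) / v - 1.0
    return m * common, (1.0 - m) * common


random.seed(0)
sample = [random.betavariate(2.0, 3.0) for _ in range(1000)]
a0, b0 = beta_start_values(sample)
print(a0, b0)  # both positive, so a valid starting point for a fit
```

In current SciPy releases, starting values for the shape parameters can be passed positionally to fit, for example scipy.stats.beta.fit(data, a0, b0), so an estimate like this could serve as the initial guess.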

I don't know if Travis or Josef have gone back to look at this.
Hopefully one of these days I will find some more time to look at this
and try to give a systematic fix.

Skipper


