[SciPy-Dev] Subversion scipy.stats irregular problem with source code example

Skipper Seabold jsseabold at gmail.com
Thu Dec 9 15:12:52 EST 2010


On Thu, Dec 9, 2010 at 2:34 PM, Charles <charles.moliere at gmail.com> wrote:
> Skipper Seabold <jsseabold <at> gmail.com> writes:
>
>>
>> On Tue, Sep 28, 2010 at 1:12 PM, James Phillips <zunzun <at> zunzun.com> wrote:
>> > Since I observed the following behavior in the SVN repository version
>> > of SciPy, it seemed to me proper to post to the dev mailing list.  I'm
>> > using Ubuntu Lucid Lynx 32 bit and a fresh GIT of Numpy.  I am not
>> > sure if a Trac bug report needs to be entered.
>> >
>> >
>> > Below is some example code for fitting two statistical distributions.
>> > Sometimes the numpy-generated data is fit, as I can see the estimated
>> > and fitted parameters are different.  Sometimes I receive many
>> > messages repeated on the command line like:
>> >
>> > Warning: invalid value encountered in absolute
>> > Warning: invalid value encountered in subtract
>> >
>> > and the estimated parameters equal the fitted parameter values,
>> > indicating no fitting took place.  Sometimes I receive on the command
>> > line:
>> >
>> > Traceback (most recent call last):
>> >  File "/home/zunzun/local/lib/python2.6/site-packages/scipy/stats/distributions.py", line 1987, in func
>> >    sk = 2*(b-a)*math.sqrt(a + b + 1) / (a + b + 2) / math.sqrt(a*b)
>> > ValueError: math domain error
>> > Traceback (most recent call last):
>> >  File "example.py", line 10, in <module>
>> >    fitStart_beta = scipy.stats.beta._fitstart(data)
>> >  File "/home/zunzun/local/lib/python2.6/site-packages/scipy/stats/distributions.py", line 1992, in _fitstart
>> >    a, b = optimize.fsolve(func, (1.0, 1.0))
>> >  File "/home/zunzun/local/lib/python2.6/site-packages/scipy/optimize/minpack.py", line 125, in fsolve
>> >    maxfev, ml, mu, epsfcn, factor, diag)
>> > minpack.error: Error occured while calling the Python function named func
>> >
>> > and program flow is stopped.
>> >
>> >
>> > In summary, three behaviors: (1) Fits OK (2) Many exceptions with no
>> > fitting (3) minpack error.  Running the program 10 times or so will
>> > reproduce these behaviors without fail from the "bleeding-edge"
>> > repository code.
>> >
>> >     James Phillips
>> >
>> >
>> > ########################################################
>> >
>> > import numpy, scipy, scipy.stats
>> >
>> > # test uniform distribution fitting
>> > data = numpy.random.uniform(2.0, 3.0, size=100)
>> > fitStart_uniform = scipy.stats.uniform._fitstart(data)
>> > fittedParameters_uniform = scipy.stats.uniform.fit(data)
>> >
>> > # test beta distribution fitting
>> > data = numpy.random.beta(2.0, 3.0, size=100)
>> > fitStart_beta = scipy.stats.beta._fitstart(data)
>> > fittedParameters_beta = scipy.stats.beta.fit(data)
>> >
>> > print
>> > print 'uniform._fitstart returns', fitStart_uniform
>> > print 'fitted parameters for uniform =', fittedParameters_uniform
>> > print
>> > print 'beta._fitstart returns', fitStart_beta
>> > print 'fitted parameters for beta =', fittedParameters_beta
>> > print
>>
>> Is there an existing bug ticket for this?  If not there probably should be...
>>
>> I think the fitting code should be looked at as experimental.  It's
>> good that you caught that no fitting is actually done in these cases.
>> The problem stems (for the most part) from bad starting values
>> (outside the support of the distribution for those with bounded
>> support).  I've tried to go through and fix this, giving very naive
>> (but correct) starting values to fit methods, but I haven't gotten
>> much further than that.
>>
>> I don't know if Travis or Josef have gone back to look at this.
>> Hopefully one of these days I will find some more time to look at this
>> and try to give a systematic fix.
>>
>> Skipper
>>
>
>
> Hi,
> I'm very sorry for entering the thread like this, but after a long search over
> the web, this thread is the most relevant to the problem I'm stuck on.
> I'm trying to fit a gamma distribution to a set of experimental
> values with gamma.fit() in scipy 0.8.0. Here is the very simple code I'm using
> with a sample of my data:
>
> ##########################
> import scipy as sp
> import scipy.stats as ss
>
> exp_data =[25.6,35.8,100.2,115.2,125.2,140.1,160.6,210.1,250.5,4500.3]
> data = sp.array(exp_data)
>
> fit_alpha, fit_loc, fit_beta = ss.gamma.fit(data)
> print(fit_alpha,fit_loc,fit_beta)
> #########################
>
> I then receive many messages on the command line:
> Warning: invalid value encountered in subtract
>
> Which ends with no fitting of the parameters:
> (1.0, 0.0, 1.0)
>
> With an earlier version of scipy (0.7.2), the error messages are absent but still
> no fitting is done. Apparently, it is the extreme value of "4500.3" that is
> causing the problem with the fitting in this case.
>
> I know you mentioned earlier that the fitting code should be considered
> experimental; however, I was wondering whether this should be considered a bug
> or whether I'm making a mistake. In either case, is there a fix that makes the
> fit method work with a gamma distribution?
>

It looks like Josef's recent changes have got this working.  I'm using the
most recent trunk, so you might want to upgrade or look at the changeset.

In [1]: import scipy as sp

In [2]: import scipy.stats as ss

In [3]:

In [4]: exp_data =[25.6,35.8,100.2,115.2,125.2,140.1,160.6,210.1,250.5,4500.3]

In [5]: data = sp.array(exp_data)

In [6]:

In [7]: fit_alpha, fit_loc, fit_beta = ss.gamma.fit(data)

In [8]: print(fit_alpha, fit_loc, fit_beta)
(0.37079887324711569, 25.599999999999998, 2459.7323873048508)
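
The fit above puts the location at the smallest observation. A minimal
sketch of one further thing to try, not taken from the changeset: hold the
location fixed so that only shape and scale are estimated. It assumes a
SciPy release whose fit() accepts the floc keyword, uses numpy.asarray in
place of sp.array, and assumes that a location of zero is reasonable for
this data.

```python
import numpy as np
from scipy import stats

# Sample data from the message above; note the single extreme value.
exp_data = [25.6, 35.8, 100.2, 115.2, 125.2, 140.1, 160.6, 210.1, 250.5, 4500.3]
data = np.asarray(exp_data)

# Assumption: the location is known to be zero. Fixing it with floc keeps
# every observation strictly inside the support (x > loc), so the fit
# cannot push loc up against the smallest data point.
fit_alpha, fit_loc, fit_beta = stats.gamma.fit(data, floc=0)

# With loc fixed at zero, the maximum-likelihood estimates satisfy
# shape * scale == sample mean, which gives a quick sanity check.
print(fit_alpha, fit_loc, fit_beta)
print(fit_alpha * fit_beta, data.mean())
```

If the two numbers on the last line disagree noticeably, the optimizer
probably did not converge and the returned parameters should not be trusted.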

Skipper


