[SciPy-dev] Sad sad sad... Was: Warning about remaining issues in stats.distributions ?

josef.pktd at gmail.com josef.pktd at gmail.com
Thu Mar 12 23:51:24 EDT 2009


On Thu, Mar 12, 2009 at 10:41 PM, Yaroslav Halchenko
<lists at onerussian.com> wrote:
> heh heh... very sad to see that the warning was simply ignored and 0.7.0
> still has this issue on exactly the same command (hence advice to include
> it to unittests was ignored as well):
>
>>>> print scipy.__version__
> 0.7.0
>>>> scipy.stats.rdist(1.32, 0, 1).cdf(-1.0+numpy.finfo(float).eps)
> Traceback (most recent call last):
>  File "<stdin>", line 1, in <module>
>  File "/usr/lib/python2.5/site-packages/scipy/stats/distributions.py", line 117, in cdf
>    return self.dist.cdf(x,*self.args,**self.kwds)
>  File "/usr/lib/python2.5/site-packages/scipy/stats/distributions.py", line 625, in cdf
>    place(output,cond,self._cdf(*goodargs))
>  File "/usr/lib/python2.5/site-packages/scipy/stats/distributions.py", line 528, in _cdf
>    return self.veccdf(x,*args)
>  File "/usr/lib/python2.5/site-packages/numpy/lib/function_base.py", line 1886, in __call__
>    _res = array(self.ufunc(*newargs),copy=False,
>  File "/usr/lib/python2.5/site-packages/scipy/stats/distributions.py", line 525, in _cdf_single_call
>    return scipy.integrate.quad(self._pdf, self.a, x, args=args)[0]
>  File "/usr/lib/python2.5/site-packages/scipy/integrate/quadpack.py", line 185, in quad
>    retval = _quad(func,a,b,args,full_output,epsabs,epsrel,limit,points)
>  File "/usr/lib/python2.5/site-packages/scipy/integrate/quadpack.py", line 249, in _quad
>    return _quadpack._qagse(func,a,b,args,full_output,epsabs,epsrel,limit)
>  File "/usr/lib/python2.5/site-packages/scipy/stats/distributions.py", line 3046, in _pdf
>    return pow((1.0-x*x),c/2.0-1) / special.beta(0.5,c/2.0)
> ZeroDivisionError: 0.0 cannot be raised to a negative power
>
> and my workaround doesn't work any more so I need to look for another
> one.
>
>
> On Tue, 09 Dec 2008, Yaroslav Halchenko wrote:
>
>> > * distributions that have problems for some range of parameters
>> so a good (imho) piece to add to unittests for the 'issues' to be fixed:
>
>> scipy.stats.rdist(1.32, 0, 1).cdf(-1.0+numpy.finfo(float).eps)
>
>> (tried on the SVN trunk to verify that it fails... discover the reason on
>> your own ;-))
>
>> For myself I resolved it with
>
>>     __eps = N.sqrt(N.finfo(float).eps)
>>     rdist = rdist_gen(a=-1.0+__eps, b=1.0-__eps, ....
>
>> but I am not sure if that is the cleanest way... and may be some other
>> distributions would need such tweakery to make them more stable.
> --
>                                  .-.
> =------------------------------   /v\  ----------------------------=
> Keep in touch                    // \\     (yoh@|www.)onerussian.com
> Yaroslav Halchenko              /(   )\               ICQ#: 60653192
>                   Linux User    ^^-^^    [175555]
>
>

Fixing numerical integration over the distance of a machine epsilon of
a function that has a singularity at the boundary was not very high on
my priority list.

If there is a real use case that requires this, I can do a temporary
fix. As far as I have seen, you use explicitly this special case as a
test case and not a test that would reflect a failing use case.
Overall, I prefer to have a general solution to the boundary problem
for numerical integration, instead of messing around with the
theoretically correct boundaries.

Also, I would like to know what the references for the rdist are.
Google search for r distribution is pretty useless, and I have not yet
found a reference or an explanation of the rdist and its uses.

Josef



More information about the SciPy-Dev mailing list