[SciPy-Dev] scipy.stats: algorithm for ticket 1493

Mon May 14 17:11:32 EDT 2012

On Mon, May 14, 2012 at 4:46 PM, nicky van foreest <vanforeest at gmail.com> wrote:
> Yes, you're right. I am trying to use the right version of scipy and
> stats, but I first have to figure out how to do that.

That was also directed at myself: I spent half an hour staring at the
online code looking for a problem that wasn't there. :)

I switch Python versions when I want to switch scipy versions (e.g. Python
2.5 with scipy 0.7, ...).
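
A quick way to confirm which scipy a given interpreter actually imports,
so that the code being read and the code being run match. This is only a
minimal sketch; the helper name report_scipy is made up for illustration.

```python
import scipy


def report_scipy():
    """Return the version and location of the scipy this interpreter imports."""
    return scipy.__version__, scipy.__file__


version, location = report_scipy()
print("scipy", version, "loaded from", location)
```

Running this under each Python installation shows which scipy release
that interpreter picks up.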

Josef

>
> Nicky
>
> On 14 May 2012 22:15,  <josef.pktd at gmail.com> wrote:
>> On Mon, May 14, 2012 at 3:51 PM,  <josef.pktd at gmail.com> wrote:
>>> On Mon, May 14, 2012 at 2:45 PM, nicky van foreest <vanforeest at gmail.com> wrote:
>>>>>> Nice example. The answer is negative, while it should be positive, but
>>>>>> the answer is within numerical accuracy I would say.
>>>>>
>>>>> oops, didn't we have a case with negative sign already ?
>>>>> maybe a check self.a <= p <= self.b  ?
>>>>
>>>> I included this. I also think that a check on whether left and right
>>>> stay within  self.a and self.b should be included, perhaps just for
>>>> safety reasons.
>>>>
>>>>>
>>>>>>
>>>>>>> I don't see anything yet to criticize in your latest version :(
>>>>>>
>>>>>> Ok. I just checked the tests in scipy/stats/tests.
>>>>>
>>>>> If you are curious, you could temporarily go closer to q=0 and q=1 in
>>>>> the tests for ppf, and see whether it breaks for any distribution.
>>>>
>>>> Good idea. Just to see what would happen I changed the following code
>>>> in test_continuous_basic.py:
>>>>
>>>> @_silence_fp_errors
>>>> def check_cdf_ppf(distfn,arg,msg):
>>>>    values = [-1.e-5, 0.,0.001,0.5,0.999,1.]
>>>>    npt.assert_almost_equal(distfn.cdf(distfn.ppf(values, *arg), *arg),
>>>>                            values, decimal=DECIMAL, err_msg= msg + \
>>>>                            ' - cdf-ppf roundtrip')
>>>
>>> roundtrip: looks like ppf should be ok, but cdf is not
>>>
>>> >>> stats.norm.ppf(-1e-5)
>>> nan
>>> >>> stats.norm.cdf(np.nan)
>>> 0.0
>>> >>> stats.norm.cdf(stats.norm.ppf(-1e-5))
>>> 0.0
>>>
>>> I'm using scipy 0.9, but I don't think this has changed, not that I know of.
>>>
>>> I'm trying to track down when this got changed.
>>> (github doesn't show changes in a file that has too many changes, need
>>> to dig out git)
>>
>> It would be better to run the same version as the code I'm looking at.
>> It's difficult to find a bug or understand the behavior if it's not
>> there anymore.
>>
>> switching to scipy 0.10
>>
>> >>> stats.norm.cdf(np.nan)
>> nan
>> >>> scipy.__version__
>> '0.10.0b2'
>>
>> nan propagation is not available in 0.9.0
>>
>> https://github.com/scipy/scipy/commit/96e39ecc6a2b671ed7f99a9c0375adc9238c6056#L0L1343
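
For reference, a small sketch of the roundtrip with an out-of-range q,
assuming a scipy version in which cdf propagates nan (0.10 or later,
going by the transcript above). In that case ppf returns nan for q
outside [0, 1], cdf passes the nan through, and the comparison against
the original q fails as one would expect. The names q, x, roundtrip and
roundtrip_ok are chosen here for illustration only.

```python
import numpy as np
import numpy.testing as npt
from scipy import stats

q = np.array([-1.e-5, 0.5])      # the first value lies outside [0, 1]
x = stats.norm.ppf(q)            # nan for the out-of-range entry
roundtrip = stats.norm.cdf(x)    # with nan propagation, nan stays nan

try:
    npt.assert_almost_equal(roundtrip, q, decimal=8)
    roundtrip_ok = True
except AssertionError:
    # the nan in roundtrip does not match the finite q[0]
    roundtrip_ok = False

print(x, roundtrip, roundtrip_ok)
```

Without nan propagation, cdf(nan) comes back as 0.0, which agrees with
-1e-5 to the tested number of decimals, so the same check passes
silently.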
>>
>> Josef
>>
>>>
>>>>
>>>>
>>>> Thus, I changed the values into an array. It should fail on the first
>>>> value, as it is negative, but I get a pass. Specifically, I ran:
>>>>
>>>> nicky at chuck:~/prog/scipy/scipy/stats/tests$ python test_continuous_basic.py
>>>> ..............................................................................................................................
>>>> ----------------------------------------------------------------------
>>>> Ran 126 tests in 93.990s
>>>>
>>>> OK
>>>>
>>>>>
>>>>
>>>> Weird result. If I add a q  = 1.0000001 I get a fail on the fourth
>>>> test, as expected.
>>>>
>>>>>> - repair for the cases q =  0 and q = 1 by means of an explicit test.
>>>>>
>>>>> isn't ppf (generic part) taking care of this, if not then it should, I think
>>>>
>>>> Actually, from the code in lines:
>>>>
>>>> https://github.com/scipy/scipy/blob/master/scipy/stats/distributions.py#L1529
>>>>
>>>> I am inclined to believe you. However, in view of the above test ...
>>>> Might it be that the conditions on L1529 have been added quite
>>>> recently, and did not yet make it to my machine? I'll check this right
>>>> now.... As a matter of fact, my distributions.py contains the same
>>>> check, i.e., cond1 = (q > 0) & (q < 1). Hmmm.
>>>>
>>>> Now I admit that I do not understand in all nitty-gritty detail the
>>>> entire implementation of ppf(), but I suspect that this is a bug.
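
The kind of masking that a condition like cond1 = (q > 0) & (q < 1)
implies can be sketched as below. This is not the code in
distributions.py, only a simplified illustration under stated
assumptions: the function name ppf_sketch and its arguments a, b and
inner are hypothetical, and scipy.special.ndtri (the standard normal
quantile function) stands in for a distribution-specific _ppf.

```python
import numpy as np
from scipy import special


def ppf_sketch(q, a=-np.inf, b=np.inf, inner=special.ndtri):
    """Mask-based quantile function: nan outside [0, 1], a at 0, b at 1."""
    q = np.atleast_1d(np.asarray(q, dtype=float))
    out = np.full(q.shape, np.nan)      # default: invalid q gives nan
    interior = (q > 0) & (q < 1)        # same shape of test as cond1
    out[q == 0] = a                     # lower end of the support
    out[q == 1] = b                     # upper end of the support
    out[interior] = inner(q[interior])  # only interior q reach the solver
    return out


result = ppf_sketch([-1.e-5, 0.0, 0.5, 1.0, 1.0000001])
print(result)
```

With this layout the endpoints and out-of-range values never reach the
numerical inversion, so a generic root finder only has to cope with
0 < q < 1.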
>>>>
>>>>>
>>>>> ppf(0) = self.a
>>>>> ppf(1) = self.b
>>>>
>>>> Good idea.
>>>
>>> this already looks correct in the generic ppf code
>>>
>>> >>> stats.beta.ppf(0, 0.5)
>>> 0.0
>>> >>> stats.beta.a
>>> 0.0
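
The same endpoint check can be written as a short script. This is a
sketch assuming a recent scipy, where beta has to be given both of its
shape parameters; the values 0.5 and 0.5 are arbitrary choices for
illustration.

```python
import numpy as np
from scipy import stats

# beta takes two shape parameters; 0.5 and 0.5 are illustrative only
lower = stats.beta.ppf(0, 0.5, 0.5)
upper = stats.beta.ppf(1, 0.5, 0.5)
print(lower, stats.beta.a)   # ppf(0) next to the lower support bound
print(upper, stats.beta.b)   # ppf(1) next to the upper support bound

# for a distribution with unbounded support the endpoints are infinite
norm_ends = stats.norm.ppf([0, 1])
print(norm_ends)
```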
>>>
>>> Josef
>>>>
>>>> I'll implement the code in my branch, and do a pull request.
>>>> _______________________________________________
>>>> SciPy-Dev mailing list
>>>> SciPy-Dev at scipy.org
>>>> http://mail.scipy.org/mailman/listinfo/scipy-dev