[SciPy-dev] nbinom.ppf

Mon Jul 27 19:10:11 EDT 2009

On Mon, Jul 27, 2009 at 6:20 PM, Pierre GM<pgmdevlist at gmail.com> wrote:
>
> On Jul 27, 2009, at 4:58 PM, Robert Kern wrote:
>
>> On Mon, Jul 27, 2009 at 15:29, <josef.pktd at gmail.com> wrote:
>>> On Mon, Jul 27, 2009 at 3:55 PM, Robert Kern<robert.kern at gmail.com>
>>> wrote:
>>>>
>>>> ceil() is correct; round() is not. round() would be okay if the only
>>>> inputs are expected to be outputs of the CDF, but one frequently
>>>> needs
>>>> the PPF to take all values in [0,1], like for example doing random
>>>> number generation via inversion.
>
> Fair enough, but a bit frustrating nevertheless in this particular case.
>
>>> Do you think it would be useful to round if we are epsilon (?) close
>>> to the next integer? It is more likely that users have a case like
>>> Pierre's where the answer might be the closest integer, instead of an
>>> epsilon below. It would move the floating point error, but to an, in
>>> actual usage, less likely location.
>
>
>
>>> I think, it is in scipy.special where I saw some code that treats
>>> anything that is within 1e-8 (?) of an integer as the integer.
>>
>> I think the code after the first line is an attempt to remedy this
>> situation:
>>
>>    def _ppf(self, q, n, pr):
>>        vals = ceil(special.nbdtrik(q,n,pr))
>>        vals1 = vals-1
>>        temp = special.nbdtr(vals1,n,pr)
>>        return where(temp >= q, vals1, vals)
>>
>> It is possible that "temp >= q" should be replaced with "temp >= q-
>> eps".
>
> Doesn't work in the case I was presenting, as temp is here an array of
> NaNs. Using
>  >>> vals = ceil(special.nbdtrik(q,n,pr)-eps)
> seems to work well enough, provided that eps is around 1e-8 as Josef
> suggested.
>

I finally remembered, that in the test for the discrete distribution
or for histogram with integers, I always use the 1e-8 correction, or
1e-14, depending on what is the more likely numerical error in the
calculation. I'm usually not sure what the appropriate correction is.

special.nbdtrik doesn't have any documentation, so I don't even know
what it is supposed to do.

The original failure case should make a nice roundtrip test for the
discrete distribution, after the correction. I think, until now I only
used floating point cases that are usually non-integers in the tests
of the discrete distribution, but not exact roundtrip tests as for the
continuous distributions.

I will test and change this as soon as possible.

Josef

>
>
>
> _______________________________________________
> Scipy-dev mailing list
> Scipy-dev at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-dev
>