[SciPy-User] norm.logpdf fails with float128?

Fri Feb 1 10:16:39 EST 2013

please reply at end of message or inline, this is a bottom-posting mailinglist

On Fri, Feb 1, 2013 at 2:31 AM, Jacob Biesinger
<jake.biesinger at gmail.com> wrote:
> I don't think it would solve the issues of having different dtypes supported
> differently by the various pdf functions, but would it be helpful to replace
> the hard coded 'd' data type of output with a kwargs.get('dtype', 'd') so
> the user could specify the output data type?

That would be another possibility

>
> On Jan 31, 2013 10:20 PM, <josef.pktd at gmail.com> wrote:
>>
>> On Thu, Jan 31, 2013 at 11:55 PM, Jacob Biesinger
>> <jake.biesinger at gmail.com> wrote:
>> > Hi!
>> >
>> > I know float128 support is a bit sketchy.  I've noticed the norm.pdf and
>> > norm.logpdf functions both choke when given float128 values.  The weird
>> > thing is the corresponding norm._pdf and norm._logpdf functions seem to
>> > work
>> > fine:
>> >
>> > # pdf function fails
>> >>>> m = sp.ones(5, dtype=sp.longdouble)
>> >>>> norm.pdf(m)
>> > ...
>> > TypeError: array cannot be safely cast to required type
>> >
>> >>>> norm.pdf(m, dtype=sp.longdouble)
>> > ...
>> > TypeError: array cannot be safely cast to required type
>> >
>> >
>> > # but the _pdf function works fine, with appropriate
>> > long-double-precision
>> >>>> norm._pdf(m*100)
>> > array([ 1.3443135e-2172,  1.3443135e-2172,  1.3443135e-2172,
>> >         1.3443135e-2172,  1.3443135e-2172], dtype=float128)
>> >
>> > Is this expected behaviour? Perhaps a problem with the `place` command?
>> > If
>> > it's not expected, I'm happy to tinker and submit a pull request.
>>
>> Is this recent behavior because numpy became more picky on downcasting
>> (which I welcome)?
>> I'm on Windows and cannot really check the behavior with float128.
>>
>> http://projects.scipy.org/scipy/ticket/1718
>>
>> the output array is defined as "d" float64
>>
>> https://github.com/scipy/scipy/blob/master/scipy/stats/distributions.py#L1179
>>
>> I never looked closely at what the behavior of the continuous
>> distributions for different dtypes.
>>
>> There is no direct casting of arguments. As you showed, the private,
>> distributions specific methods are calculating with whatever is given,
>> lower precision if float32,  or an exception if it cannot handle it
>>
>> >>> stats.norm._cdf(np.array([-1j, 0, 1, np.nan, np.inf]))
>> Traceback (most recent call last):
>>   File "<pyshell#58>", line 1, in <module>
>>     stats.norm._cdf(np.array([-1j, 0, 1, np.nan, np.inf]))
>>   File
>> "C:\Programs\Python33\lib\site-packages\scipy\stats\distributions.py",
>> line 2032, in _cdf
>>     return _norm_cdf(x)
>>   File
>> "C:\Programs\Python33\lib\site-packages\scipy\stats\distributions.py",
>> line 2003, in _norm_cdf
>>     return special.ndtr(x)
>> TypeError: ufunc 'ndtr' not supported for the input types, and the
>> inputs could not be safely coerced to any supported types according to
>> the casting rule ''safe''
>>
>> >>> stats.norm._pdf(np.array([-1j, 0, 1, np.nan, np.inf]))
>> array([ 0.65774462 -0.j,  0.39894228 +0.j,  0.24197072 +0.j,
>>                nan+nanj,         nan+nanj])
>>
>>
>> I don't know what we should do about it, some of the special functions
>> can only handle float64, and float32 I think.
>>
>> 1) status quo: let users decide, and raise an exception as now if
>> results cannot be cast to float64.
>>
>> 2) status quo with explicit type check of arguments: we can raise an
>> exception before doing the calculations.
>> We could return output as float32 if input is also float32
>>
>> 3) match output type to input type (at least a floating point, no
>> integers): Will work in some cases, like norm.pdf, will not work
>> (exception or implicit casting) if the supporting functions
>> (scipy.special, scipy.integrate) don't support it.
>> I don't know if all of them raise exceptions with downcasting.

4) add dtype argument to methods

implementation: the current *args, **kwds parsing would need to be
changed, since with two exceptions, we rely on loc scale being the
last arguments, Ralph is just cleaning up this a bit
https://github.com/scipy/scipy/pull/400

usability: for the usage in statsmodels, automatic (input dependent)
casting would be more convenient. Since we cannot hardcode the dtype,
because we would like different dtypes at different calls to the
function, we would have to choose the dtype case dependent. Would  be
nicer if that is in the methods directly.
edit: not really a problem, we could just do (..., dtype=x.dtype)

---
I think we should widen the supported dtypes, but I think that we are
too close to a release to change this now because the impact of any
change is not so easy to guess, and we might need more time to figure
out how to handle it.

float128 would be useful to have higher precision in some of the
methods, especially when they are struggling numerically close to
corner or extreme cases. (Although it won't help me on Windows.)
But maybe we have the precision already in the actual calculations, we
just don't return it.

Just to get a better background

What's your use case for using float128, why don't you cast to float64?

The test suite has a list of distributions and example parameters that
makes it easy to loop over all distributions. As a relatively quick
check, you could run them on the private methods, _pdf _cdf, and see
which return again float128.
I have no idea about the number of cases that would work correctly,
there might be exceptions and implicit casting in some of the helper
functions.

Josef

>>
>> other versions like always cast to float64 goes against the trend to
>> be more explicit.
>>
>> ???
>>
>> Josef
>>
>>
>> >
>> > --
>> > Jake Biesinger
>> > Graduate Student
>> > Xie Lab, UC Irvine
>> >
>> > _______________________________________________
>> > SciPy-User mailing list
>> > SciPy-User at scipy.org
>> > http://mail.scipy.org/mailman/listinfo/scipy-user
>> >
>> _______________________________________________
>> SciPy-User mailing list
>> SciPy-User at scipy.org
>> http://mail.scipy.org/mailman/listinfo/scipy-user
>
>
> _______________________________________________
> SciPy-User mailing list
> SciPy-User at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-user
>