[SciPy-dev] Statistics toolbox and nans

Pearu Peterson pearu at cens.ioc.ee
Fri Nov 1 09:54:04 EST 2002


On 1 Nov 2002, A.J. Rossini wrote:

> >>>>> "travis" == Travis Oliphant <oliphant.travis at ieee.org> writes:
> 
>     travis> Hello developers.
>     travis> What should we do about nan's and the stats toolbox.  Stats is one
>     travis> package where people may use nans to represent missing values.
> 
> Yech.  This is a hard issue, but NAN isn't the solution.

I agree that using NaNs to represent missing values cannot be
reliable. NaN behavior depends too much on the
local C library. For example, on Linux

>>> nan=float('nan')
>>> nan==nan
1
>>> nan==1
1

while on Windows nan==1 returns 0, as I have been told. See

http://groups.google.com/groups?hl=en&lr=&ie=UTF-8&threadm=mailman.1035055286.17772.python-list%40python.org&rnum=6&prev=/groups%3Fq%3DPearu%2BPeterson%26hl%3Den%26lr%3D%26ie%3DUTF-8%26scoring%3Dd

Tim Peters has explained these NaN issues several times on
Usenet; search Google for 'Tim Peters NaN'.
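
To see how a particular interpreter behaves, a small probe along the
following lines could be run. This is only a sketch: the function name
is made up for illustration, and it assumes that float('nan') parses at
all, which is not guaranteed on every platform. Under IEEE-754 rules
every ordered or equality comparison involving a NaN is false, and only
!= is true.

```python
def nan_comparisons_follow_ieee754():
    """Return True if NaN comparisons behave as IEEE-754 prescribes.

    Hypothetical helper, not part of SciPy. It assumes float('nan')
    is accepted by the local platform.
    """
    nan = float("nan")
    checks = [
        not (nan == nan),   # a NaN never equals itself
        nan != nan,         # the only comparison that is true
        not (nan == 1.0),   # nor does it equal an ordinary number
        not (nan < 1.0),    # ordered comparisons are all false
        not (nan > 1.0),
    ]
    return all(checks)


if __name__ == "__main__":
    print(nan_comparisons_follow_ieee754())
```

A result of False would mean that code relying on nan==nan or nan==1
cannot be trusted on that platform.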

Since "all IEEE-754 behavior visible from Python is a platform-dependent
accident" [T.P.], I don't see how NaNs could be used in SciPy for
anything useful in a platform-independent way. I would avoid using NaNs
and Infs as much as possible until they become less platform-dependent,
maybe by implementing special objects for Python instead of using
float('nan') and float('inf') (which should not even work on Win32).

Pearu




