[Numpy-discussion] Medians that ignore values
Peter Saffrey
pzs at dcs.gla.ac.uk
Thu Sep 18 07:27:49 EDT 2008
I have data from biological experiments that is represented as a list of
about 5000 triples. I would like to convert this to a list of the median
of each triple. I did some profiling and found that numpy was about
12 times faster for this application than regular Python lists with
a list-based median implementation. I'll be performing quite a few
mathematical operations on these values, so using numpy arrays seems
sensible.
The only problem is that my data has gaps in it - where an experiment
failed, a "triple" will not have three values. Some will have 2, 1 or
even no values. To keep the arrays regular so that they can be used by
numpy, is there some dummy value I can use to fill these gaps that will
be ignored by the median routine?
I tried NaN for this, but as far as median is concerned, it counts as
infinity:
>>> from numpy import *
>>> median(array([1,3,nan]))
3.0
>>> median(array([1,nan,nan]))
nan
Is this the correct behavior for median with nan? Is there a fix for
this, or am I going to have to settle for using lists?
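For reference, here is a minimal sketch of the behaviour I am after: a per-triple median that skips the gaps. It assumes a NumPy version that provides `numpy.nanmedian` (added in releases later than the one used above) and also shows the masked-array route via `numpy.ma.masked_invalid` and `numpy.ma.median`. The `triples` array is made-up illustration data, not my experimental values.

```python
import numpy as np

# Hypothetical data: one row per triple, with nan marking a failed measurement.
triples = np.array([
    [1.0, 3.0, 2.0],        # complete triple
    [1.0, 3.0, np.nan],     # one value missing
    [1.0, np.nan, np.nan],  # two values missing
])

# Option 1: nanmedian skips nan entries when computing each row's median.
medians = np.nanmedian(triples, axis=1)

# Option 2: mask the invalid entries and take the masked-array median.
masked = np.ma.masked_invalid(triples)
ma_medians = np.ma.median(masked, axis=1)

print(medians)
print(ma_medians)
```

A triple with no values at all has no defined median, so those rows would still need to be handled separately (for example, filtered out beforehand).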
Thanks,
Peter