[SciPy-User] removing multiple occurrences of a specific value (or range of values) from an array

Brennan Williams brennan.williams at visualreservoir.com
Mon Jan 10 20:57:54 EST 2011


On 11/01/2011 2:48 p.m., josef.pktd at gmail.com wrote:
> On Mon, Jan 10, 2011 at 8:42 PM, Brennan Williams
> <brennan.williams at visualreservoir.com>  wrote:
>> I have a numpy array and I use .min(), .max(), .std(), average(...),
>> median(...) etc to get various stats values.
>>
>> Depending on where the data originally came from, the array can contain
>> a null value which could be 1.0e+20 or similar (can vary from dataset to
>> dataset). Due to rounding errors this can sometimes appear as something
>> like 1.0000002004e+20 etc etc.
>>
>> So I want to be able to correctly calculate the stats values by ignoring
>> the null values.
>>
>> I also want to be able to replace the null values with another value
>> (for plotting/exporting).
>>
>> What's the best way to do this without looping over the elements of the
>> array?
> If you don't have anything large, then you could just do
>
> x[x>1e19]=np.nan
>
> or filter them out, or convert to masked array.
>
the array is usually <10,000 values, often <1000
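For what it's worth, a small sketch of the three options Josef mentions (toy data; the 1e19 cutoff is an assumption based on the ~1.0e+20 null values described above):

```python
import numpy as np

# Toy data with a couple of ~1.0e+20 null values mixed in
x = np.array([1.0, 2.0, 1.0000002004e+20, 3.0, 1.0e+20])

# Option 1: replace nulls with NaN, then use the NaN-aware reductions
y = x.astype(np.float64)
y[y > 1e19] = np.nan
print(np.nanmean(y), np.nanstd(y))  # nulls are ignored

# Option 2: filter the nulls out entirely
z = x[x < 1e19]
print(z.mean(), z.std())

# Option 3: masked array -- .min()/.max()/.mean()/.std() all honour the mask
m = np.ma.masked_greater(x, 1e19)
print(m.mean(), m.std())

# Replacing the nulls with another value for plotting/exporting
filled = m.filled(-999.0)
print(filled)
```

Note that plain .mean()/.std() on an array containing NaN return NaN, which is why option 1 pairs the NaN replacement with np.nanmean/np.nanstd; the masked-array route lets the ordinary method calls keep working.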

On a separate note, I found that .std() returned an invalid value when 
the array contains a lot of 1.0e+20's. I realise it is probably a 
single-precision overflow (squaring deviations from the mean on the 
order of 1.0e+20 gives ~1e40, well past the float32 range) and I 
probably won't need to worry about this in future, but I presume I 
should use .std(dtype=np.float64)?
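A quick sketch of that overflow on made-up data (assuming a float32 array dominated by 1.0e+20 values, as described above):

```python
import numpy as np

# float32 array mostly filled with the 1.0e+20 null value
x = np.full(1000, 1.0e+20, dtype=np.float32)
x[:10] = 1.0

# In single precision the squared deviations (~1e40) overflow to inf,
# so the result is not a usable number
print(x.std())

# Passing dtype asks numpy to accumulate in double precision instead
print(x.std(dtype=np.float64))  # finite result
```

Of course, once the null values are masked or filtered out as above, the remaining data is small enough that the precision of the accumulator rarely matters.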

> Josef
>
