[Numpy-discussion] replacing Nan's in a string array converted from a float array

Robert Kern robert.kern at gmail.com
Thu May 7 20:00:43 EDT 2009


On Thu, May 7, 2009 at 19:36, Brennan Williams
<brennan.williams at visualreservoir.com> wrote:
> I've created an array of strings using something like....
>
>              stringarray=self.karray.astype("|S8")
>
> If the array value is a Nan I get "1.#QNAN" in my string array.
>
> For cosmetic reasons I'd like to change this to something else, e.g.
> "invalid" or "inactive".
>
> My string array can be up to 100,000+ values.
>
> Is there a fast way to do this?

Well, there is a print option that lets you change how NaNs are
represented when arrays are printed. Arguably this setting should
also be used when converting to string arrays, but currently it is
not:

In [9]: %push_print --nanstr invalid
Precision:  8
Threshold:  1000
Edge items: 3
Line width: 75
Suppress:   False
NaN:        invalid
Inf:        Inf

In [10]: a = zeros(10)

In [11]: a[5] = nan

In [12]: a
Out[12]:
array([     0.,      0.,      0.,      0.,      0., invalid,      0.,
            0.,      0.,      0.])

In [13]: a.astype('|S8')
Out[13]:
array(['0.0', '0.0', '0.0', '0.0', '0.0', 'nan', '0.0', '0.0', '0.0', '0.0'],
      dtype='|S8')
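
For anyone not using that IPython magic, the same comparison can be sketched in plain NumPy. This assumes a NumPy recent enough to provide the np.printoptions context manager; the names printed and converted are just for illustration.

```python
import numpy as np

a = np.zeros(10)
a[5] = np.nan

# The nanstr print option changes only how the array is displayed.
with np.printoptions(nanstr='invalid'):
    printed = np.array2string(a)

# Converting to a string dtype does not consult the print options,
# so the NaN keeps whatever spelling the conversion produces.
converted = a.astype('S8')
```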


You will need to use the typical approach:

  mask = (stringarray == '1.#QNAN')
  stringarray[mask] = 'invalid'
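
If the original float array is still around, a variation is to build the mask from it with np.isnan, so the code does not depend on how the platform spells NaN ('nan' here, '1.#QNAN' on your machine). A minimal sketch, where karray is a small made-up stand-in for your self.karray:

```python
import numpy as np

# Hypothetical stand-in for self.karray from the question.
karray = np.array([1.5, np.nan, 3.0, np.nan])

stringarray = karray.astype('S8')

# Build the mask from the float data rather than from the strings,
# so the platform's textual spelling of NaN does not matter.
mask = np.isnan(karray)
stringarray[mask] = b'invalid'  # 7 bytes, fits in an S8 field
```

Note that the replacement has to fit in 8 characters, which both "invalid" and "inactive" do.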

This will be wasteful of memory, since the comparison builds a
temporary boolean mask as long as the whole array. With your large
array size, you might want to consider breaking it into chunks and
modifying the chunks in this way.
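
A rough sketch of the chunked version, assuming a 1-D string array. The function name replace_in_chunks and the default chunk_size are made up for illustration, not anything NumPy provides:

```python
import numpy as np

def replace_in_chunks(stringarray, old, new, chunk_size=10000):
    """Replace entries equal to `old` with `new` in a 1-D array, in place."""
    for start in range(0, stringarray.shape[0], chunk_size):
        # Basic slicing returns a view, so writes go to the original array.
        chunk = stringarray[start:start + chunk_size]
        # The temporary mask is only chunk_size elements long.
        mask = (chunk == old)
        chunk[mask] = new
    return stringarray

stringarray = np.array([b'0.0', b'1.#QNAN', b'2.0'], dtype='S8')
replace_in_chunks(stringarray, b'1.#QNAN', b'invalid', chunk_size=2)
```

The chunk size only bounds the size of the temporary mask; the result is the same as with the unchunked approach.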

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco


