[Numpy-discussion] `missing` argument in genfromtxt only a string?

Skipper Seabold jsseabold at gmail.com
Sun Sep 13 15:51:24 EDT 2009


On Sun, Sep 13, 2009 at 1:29 PM, Skipper Seabold <jsseabold at gmail.com> wrote:
> Is there a reason that the missing argument in genfromtxt only takes a string?
>
> For instance, I have a dataset that in most columns has a zero for
> some observations but in others it was just left blank, which is the
> equivalent of zero.  I would like to set all of the missing to 0 (it
> defaults to -1 now) when loading in the data.  I suppose I could do
> this with a converter, but I have too many columns for this.
>
> Before I try to work on a patch, I'd just like to know if I'm missing
> something, maybe there's already a way to do this (without using a
> mask)?
>
> -Skipper
>

To be a little more concrete, here are the two problems I am having right now.

from StringIO import StringIO
import numpy as np

s = StringIO('D01N01,10/1/2003  ,1, 1,  0, 400, 600,0,   0,  0,0,0,
0,0,0,    0,   0,0,0,   0,0,0,0,0,0,   0,   0,0,   0,0,   0,0,0,3,0,
50,  80,0,  0,0,0,0,0, 4,0, 3380, 1070,   0,  0,  0,0,0,0,1,0, 600,
900,0,   0,    0,0,0,0, 0,0,   0,   0,0,0,  0,0,  0,0, 0,0,   0,
0,0,0,0,  0,0,0,2,0,1000, 900,0,   0,   0,0,0,0,0,0,   0,   0,0,0,
0,0,0,0,0,0,   0,   0,0,0,0,0,0,0,0,0,  0,  0,0,  0,0,0,0,0,0,0,   0,
 0,0,0,0,0,0,0,0,0,  0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,    0,    0,0,   0,0,0,0,0,1,0, 500, 800,0,  0,
0,0,0,0,0,0,    0,    0,0,   0,   0,0,0,0,1,  0,  300,    0,0,   0,
0,0,0,0, 1,0, 1600,  900,   0,   0,0,0,   0,0,0,0,     0,
0,0,0,0,0,0,0,0,0,    0,    0,0,0,0,0,0,0, 0,0,    0,   0,
0,0,0,0,0,0,0,0,   0,   0,0,0,0,0,0,0, 0, 0,0,0,0, 0,0, 0,0,0,0,0,
0,0,0,0,0,0,0,0, 0,0,   0,  0,0, 0,0,0,0,0, 0,0,
0,0,0,0,0,0,0,0,0,0,   0,   0,0,0,0,0,0,0,0,0,  0,  0,0,0,0,0,0,0,0,0,
 0,0,0,  0,0,0,0,0, 0,0,    0,    0,    0,    0,0,   0,
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0\r\nL24U05,12/25/2003
,2,  ,   ,    ,    , ,    ,   , , ,   , , ,     ,    , , ,    , , , ,
, ,    ,    , ,    , ,    , , , , ,   ,    , ,   , , , , ,  , ,     ,
   ,    ,   ,   , , , , , ,    ,    , ,    ,     , , , ,  , ,    ,
, , ,   , ,   , ,  , ,    ,    , , , ,   , , , , ,    ,    , ,    ,
, , , , , ,    ,    , , ,   , , , , , ,    ,    , , , , , , , , ,   ,
 , ,   , , , , , , ,    ,    , , , , , , , , ,   , , , , , , , , , , ,
, , , , , , , , ,    , , , , , , , , , ,     ,     , ,    , , , , , ,
,    ,    , ,   ,    , , , ,0,0,    0,    0,0,   0,   0,0,0,0, ,   ,
  ,     , ,    ,   , , , ,  , ,     ,     ,    ,    , , ,    , , , ,
   ,     , , , , , , , , ,     ,     , , , , , , ,  , ,     ,    ,
, , , , , , , ,    ,    , , , , , , ,  ,  , , , ,  , ,  , , , , ,   ,
, , , , , , ,  , ,    ,   , ,  , , , , ,  , ,     , , , , , , , , , ,
  ,    , , , , , , , , ,   ,   , , , , , , ,0,0,  0,0,0,  0,0,0,0,0,
, ,     ,     ,     ,     , ,    ,   , , , , , , , , , , , , , , , , ,
, , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , ,
, , , , \r\n')

data = np.genfromtxt(s, dtype=None, delimiter=",", names=None)

All of the missing values in the second observation are now -1.  Also,
I'm having trouble defining a converter for my dates.
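On the first problem (blank fields coming back as -1), here is a minimal sketch of one possible approach. It assumes a NumPy release whose genfromtxt accepts the filling_values keyword; the signature shown in the traceback in this message does not list it, so this would not apply to the version used here. The two-row sample and the float dtype are made up for illustration.

```python
from io import StringIO  # Python 3 location of StringIO

import numpy as np

# Hypothetical two-row sample: the second row leaves some fields blank.
sample = StringIO("1, 2, 3\n4,  , \n")

# filling_values gives the value substituted for fields that cannot be
# converted because they are empty; autostrip removes the padding blanks.
data = np.genfromtxt(
    sample,
    delimiter=",",
    dtype=float,
    autostrip=True,
    filling_values=0,
)
```

With many columns this avoids writing one converter per column, since a single scalar is applied to every column.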

I have the function

from datetime import datetime

def str2date(date):
    day,month,year = date.strip().split('/')
    return datetime(*map(int, [year, month, day]))

conv = {1: str2date}
s.seek(0)
data = np.genfromtxt(s, dtype=None, delimiter=",", names=None, converters=conv)

I get

/usr/local/lib/python2.6/dist-packages/numpy/lib/io.pyc in genfromtxt(fname, dtype, comments, delimiter, skiprows, converters, missing, missing_values, usecols, names, excludelist, deletechars, case_sensitive, unpack, usemask, loose)
    990         if dtype is None:
    991             for (converter, item) in zip(converters, values):
--> 992                 converter.upgrade(item)
    993         # Store the values
    994         append_to_rows(tuple(values))

/usr/local/lib/python2.6/dist-packages/numpy/lib/_iotools.pyc in upgrade(self, value)
    469             # Raise an exception if we locked the converter...
    470             if self._locked:
--> 471                 raise ValueError("Converter is locked and cannot be upgraded")
    472             _statusmax = len(self._mapper)
    473             # Complains if we try to upgrade by the maximum

ValueError: Converter is locked and cannot be upgraded

Does anyone know what I'm doing wrong?
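One thing that can be checked independently of genfromtxt is whether str2date accepts both date strings in the sample. The sketch below repeats the day/month/year parsing from this message under a different name and adds a hypothetical month/day/year variant (str2date_mdy is not part of the original code). Whether a failing user converter is what surfaces as the "locked" error is an assumption here, not something this message confirms.

```python
from datetime import datetime


def str2date_dmy(date):
    # Same parsing as str2date in the message: day/month/year order.
    day, month, year = date.strip().split('/')
    return datetime(*map(int, [year, month, day]))


def str2date_mdy(date):
    # Hypothetical variant that reads the field as month/day/year.
    month, day, year = date.strip().split('/')
    return datetime(int(year), int(month), int(day))


# First row: parses, but is read as 10 January 2003.
first = str2date_dmy('10/1/2003  ')

# Second row: 25 is not a valid month, so datetime raises ValueError.
try:
    str2date_dmy('12/25/2003')
    dmy_error = None
except ValueError as exc:
    dmy_error = exc

# The month/day/year variant accepts the second row.
second = str2date_mdy('12/25/2003')
```

If the dates in the file are month/day/year, the first row would be silently misread and the second would fail inside the converter.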

Thanks,

Skipper


