[Numpy-discussion] Indexing bug in structured arrays(?)

Chris Fonnesbeck fonnesbeck at gmail.com
Sat Apr 2 18:12:56 EDT 2011


I am either getting a nasty bug when indexing structured arrays, or I
don't really understand how they work. I have imported some data using
genfromtxt and an associated list of dtypes:

ndtype=[('include', int), ('year', int), ('month', int), ('day', int),
('deg_day_north', int), ('deg_day_south', int), ('water_north',
float), ('water_south', float), ('obs1', int), ('obs2', int), ('obs3',
int), ('obs4', int)]

survey_data = genfromtxt("Data/man33.out", dtype=ndtype)

This yields a nice structured array as follows:

array([(1, 85, 1, 6, 34, 20, 20.0, 22.5, 6, 30, 26, 79),
       (1, 85, 1, 10, 87, 38, 17.0, 22.0, 11, 43, 40, 113),
       (1, 85, 1, 14, 137, 60, 14.0, 22.0, 4, 87, 18, 70),
       (1, 85, 1, 19, 126, 42, 16.0, 22.0, 31, 192, 32, 301),
       (1, 85, 1, 22, 170, 80, 13.0, 21.0, 6, 316, 2, 118),
       (1, 85, 1, 27, 170, 99, 13.0, 21.0, 2, 373, 12, 124),
       (0, 85, 2, 2, 106, 34, 17.0, 22.0, 0, 52, 93, 87),
  ...
       (1, 0, 1, 16, 18, 2, 18.0, 22.5, 45, 158, 18, 141),
       (1, 0, 1, 27, 103, 70, 15.0, 22.0, 86, 503, 40, 162),
       (1, 1, 12, 21, 64, 33, 18.0, 24.5, 24, 103, 51, 72),
       (1, 1, 1, 1, 113, 56, 15.0, 22.5, 65, 399, 32, 259),
       (1, 1, 1, 5, 169, 104, 13.0, 22.5, 120, 677, 24, 390),
       (1, 1, 1, 24, 68, 43, 16.0, 21.0, 282, 658, 6, 298)],
      dtype=[('include', '<i8'), ('year', '<i8'), ('month', '<i8'),
('day', '<i8'), ('deg_day_north', '<i8'), ('deg_day_south', '<i8'),
('water_north', '<f8'), ('water_south', '<f8'), ('obs1', '<i8'),
('obs2', '<i8'), ('obs3', '<i8'), ('obs4', '<i8')])

The first column ('include') is an indicator for inclusion in a
particular analysis, so I want to keep only the rows where it is 1. So, I tried:

In [30]: survey_data['include'==1]
Out[30]: (1, 85, 1, 6, 34, 20, 20.0, 22.5, 6, 30, 26, 79)

Is this the expected behavior? If so, please point me to any
documentation that explains it -- I did not see it on the pages
describing structured arrays on the Numpy site. I finally was able to
index these out using an array of booleans, but I would have thought
the above would work.
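
A minimal sketch of the boolean-array approach mentioned above, using a
small made-up array rather than the real survey data (the name toy and its
values are assumptions for illustration only). It also shows that Python
evaluates 'include'==1 to a plain False before NumPy ever sees it, so the
expression inside the brackets is not a per-row comparison:

```python
import numpy as np

# Hypothetical stand-in for survey_data: three fields, three made-up rows
toy = np.array([(1, 85, 6), (0, 85, 2), (1, 86, 11)],
               dtype=[('include', int), ('year', int), ('obs1', int)])

# Python compares the string 'include' with the integer 1 first;
# the result is an ordinary bool, not an array
key = ('include' == 1)

# Comparing the field's values instead gives one boolean per row
mask = toy['include'] == 1

# Indexing with that boolean array keeps only the rows where include is 1
included = toy[mask]
```

Here mask has one entry per row, so toy[mask] selects whole records; the
same pattern should apply to survey_data['include'].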

Thanks in advance.
cf


