[SciPy-user] Record Array: How to add a column?
John Hunter
jdh2358 at gmail.com
Tue Oct 14 12:55:25 EDT 2008
On Tue, Oct 14, 2008 at 11:28 AM, Pierre GM <pgmdevlist at gmail.com> wrote:
> John,
> Do you plan to have your modifications part of numpy.records ? In any case,
> I'll try to check whether it is easy to add support to missing data:
> MaskedArrays should now support flexible types.
I do not have concrete plans, but I have spoken with Jarrod about
moving some of these over, making some of them record array methods,
others available in the np.rec namespace. I think the consensus is
that these are useful and belong in numpy, but we are awaiting someone
to do the port.
On the subject of masked record arrays: we added masked array support
to mlab.csv2rec some time ago, and it has caused no shortage of
headaches because masked record arrays and regular recarrays expose
different interfaces to the objects they hold.
The following example shows a record array with a 'date' column which
is an O4 Python object type. Here is the behavior of the recarray:
In [212]: !cat test1.csv
date,age,name
2008-01-01,10,'tom'
2008-01-02,11,'dick'
2008-01-03,12,'harry'
In [213]: r1 = mlab.csv2rec('test1.csv')
In [214]: type(r1)
Out[214]: <class 'numpy.core.records.recarray'>
In [215]: r1.dtype
Out[215]: dtype([('date', '|O4'), ('age', '<i4'), ('name', '|S7')])
In [216]: print r1[0].date.year
2008
In particular, on a given row of the recarray, I can call object
methods and access object attributes.
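The recarray behavior above can be reproduced without the CSV file. The
following is a minimal sketch, assuming a current NumPy and Python 3; the
rows are typed in by hand as a stand-in for the output of mlab.csv2rec, and
the names `rows`, `dtype` and `first_year` are illustrative only.

```python
import datetime
import numpy as np

# Hand-built stand-in for mlab.csv2rec('test1.csv'); not the csv2rec code path.
rows = [
    (datetime.date(2008, 1, 1), 10, 'tom'),
    (datetime.date(2008, 1, 2), 11, 'dick'),
    (datetime.date(2008, 1, 3), 12, 'harry'),
]
dtype = [('date', object), ('age', 'i4'), ('name', 'U7')]

# View the structured array as a recarray so fields become attributes.
r1 = np.array(rows, dtype=dtype).view(np.recarray)

# A single row is a record; its 'date' field is the original Python object,
# so attributes and methods of datetime.date are available directly.
first_year = r1[0].date.year
print(first_year)
```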
In the next example, the data file has a missing value on the last row
in the 'age' column, so csv2rec returns a masked record array (r2
below, loaded from test2.csv in the same way as r1):
In [217]: !cat test2.csv
date,age,name
2008-01-01,10,'tom'
2008-01-02,11,'dick'
2008-01-03,,'harry'
In [218]: type(r2)
Out[218]: <class 'numpy.ma.mrecords.MaskedRecords'>
In [219]: print r2.dtype
[('date', '|O4'), ('age', '<i4'), ('name', '|S7')]
In [220]: r2[0].date.year
------------------------------------------------------------
Traceback (most recent call last):
File "<ipython console>", line 1, in ?
AttributeError: 'MaskedArray' object has no attribute 'year'
It would help us a lot in this regard if we could access the
underlying object. Is there a reason why the masked array behaves
differently when it comes to accessing the underlying object methods
and is there a sensible way to make them compatible?
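One possible way to reach the underlying object is to go through the
unmasked data of the array. The sketch below assumes a current NumPy and
builds the masked structured array by hand with np.ma.array rather than
through mlab.csv2rec or numpy.ma.mrecords, so it may not match the 2008
behavior shown above; the placeholder age of 0 and the names `rows`,
`mask`, `plain`, `year` and `age_missing` are illustrative only.

```python
import datetime
import numpy as np

dtype = [('date', object), ('age', 'i4'), ('name', 'U7')]
rows = [
    (datetime.date(2008, 1, 1), 10, 'tom'),
    (datetime.date(2008, 1, 2), 11, 'dick'),
    (datetime.date(2008, 1, 3), 0, 'harry'),  # 0 is a placeholder for the missing age
]
# One mask tuple per row, one flag per field; only the last 'age' is masked.
mask = [
    (False, False, False),
    (False, False, False),
    (False, True, False),
]
r2 = np.ma.array(np.array(rows, dtype=dtype), mask=mask)

# .data is the plain ndarray behind the masked array, so indexing it
# returns the stored Python object rather than a masked wrapper.
plain = r2.data
year = plain[0]['date'].year

# The mask is still available separately to tell which values are missing.
age_missing = bool(r2.mask['age'][2])
print(year, age_missing)
```

Going through `.data` ignores the mask, so a caller would need to check the
corresponding mask entry before trusting a value.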
Thanks,
JDH