[SciPy-user] Record array help

Mon May 19 10:55:35 EDT 2008

Hi Johann

2008/5/19 Johann Rohwer <jr at sun.ac.za>:
> Is there any extended documentation/tutorial on record arrays?

There is an introduction here:

http://www.scipy.org/RecordArrays

> 1. Is it possible to change the dtype of a field after the record array has
> been created?

It can be done by viewing the array with a different dtype, but that only
reinterprets the underlying bytes, so it is often not very useful:

In [3]: dt = np.dtype([('x',np.uint8),('y',np.uint8)])

In [4]: np.array([(1,2),(3,4)],dtype=dt)
Out[4]:
array([(1, 2), (3, 4)],
      dtype=[('x', '|u1'), ('y', '|u1')])

In [5]: _.view(np.uint16)
Out[5]: array([ 513, 1027], dtype=uint16)

(Each pair of uint8 fields is read as a single uint16; on a little-endian
machine that gives 1 + 2*256 = 513 and 3 + 4*256 = 1027.)

I suspect what you actually want is to change one 'column' from, say, int
to float and have the values converted.  For that, you'll need to make a
copy.
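
Something along these lines might do for the copy. It is only a sketch: it
reuses the small two-field array from above, and the names `new_dt`, `a` and
`b` are made up for illustration. The idea is to allocate an array with the
new dtype and assign field by field, so each value is converted rather than
its bytes reinterpreted.

```python
import numpy as np

dt = np.dtype([('x', np.uint8), ('y', np.uint8)])
a = np.array([(1, 2), (3, 4)], dtype=dt)

# New layout: same field names, but 'x' is now a float.
new_dt = np.dtype([('x', np.float64), ('y', np.uint8)])

# Allocate a fresh array and copy field by field; each assignment
# converts the values (1 -> 1.0) instead of reinterpreting bytes.
b = np.empty(a.shape, dtype=new_dt)
for name in a.dtype.names:
    b[name] = a[name]

print(b['x'].dtype, b['x'])
```

The original array `a` is left untouched, so the cost is one extra copy of
the data.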

> 2. The CSV file has missing data points - how do I turn these into python
> 'None' elements in the record array? (If I leave that element empty in the
> CSV file, then csv2rec complains about not being able to handle the import;
> if I put 'None' in the CSV file (without quotes), then the whole field
> including the 'None' and all the other float data is converted into a string
> dtype, rendering the numerical data useless).

Maybe `numpy.loadtxt` could be of some use: its `converters` argument can
supply a default value for empty fields.
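
A minimal sketch of reading a file with a gap, not tested against your data.
The sample CSV text and the helper `to_float` are invented for illustration.
Note that a float column cannot hold None, so NaN stands in for the missing
value here.

```python
import io
import numpy as np

# Made-up sample file: the second data row has an empty field in column 'a'.
csv_text = "t,a,b\n0.0,1.5,2.5\n1.0,,3.5\n2.0,4.5,5.5\n"

def to_float(field):
    # Hypothetical helper: empty field -> NaN, anything else -> float.
    # Depending on the NumPy version the field arrives as str or bytes;
    # float() accepts both.
    return float(field.strip() or 'nan')

arr = np.loadtxt(io.StringIO(csv_text), delimiter=',', skiprows=1,
                 converters={0: to_float, 1: to_float, 2: to_float})

# Boolean mask marking where data were missing in the file.
missing = np.isnan(arr)
print(arr)
print(missing.sum())
```

The mask can then be used to skip or fill the gaps before doing any numerical
work on the columns.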

> 3. Is it possible to obtain a subset of the original data (corresponding to
> two or more columns of the CSV file) as a conventional 2D numpy array, or
> can I access the data only individually by column (i.e. field in the record
> array)?

I hope someone comes up with a more elegant solution; otherwise you can make a copy:

numpy.array([data['field1'], data['field2']]).T
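
To make that concrete, here is a small self-contained sketch. The record array
`data` and its field names are placeholders, not your actual file. It runs the
expression above and, as an alternative spelling, `numpy.column_stack`, which
gives the same 2D result.

```python
import numpy as np

# Placeholder record array standing in for the imported CSV data.
data = np.array([(1.0, 2.0, 3.0), (4.0, 5.0, 6.0)],
                dtype=[('field1', float), ('field2', float), ('field3', float)])

# Stack the two fields as rows, then transpose so that each record
# becomes one row of an ordinary 2D array.
sub = np.array([data['field1'], data['field2']]).T

# Same result, with the selected fields placed side by side as columns.
sub2 = np.column_stack([data[name] for name in ('field1', 'field2')])

print(sub.shape)
```

Either way the result is a copy, so changes to `sub` are not reflected in
`data`.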

Regards
Stéfan