[Numpy-discussion] Automatic string length in recarray

Thomas Robitaille thomas.robitaille at gmail.com
Mon Nov 2 23:35:19 EST 2009


Hi,

I'm having trouble creating np.string_ fields in recarrays. If I
create a recarray using

np.rec.fromrecords([(1,'hello'),(2,'world')],names=['a','b'])

the result looks fine:

rec.array([(1, 'hello'), (2, 'world')], dtype=[('a', '<i8'), ('b', '|S5')])

But if I want to specify the data types:

np.rec.fromrecords([(1,'hello'),(2,'world')],dtype=[('a',np.int8),('b',np.str)])

the string field is set to a length of zero:

rec.array([(1, ''), (2, '')], dtype=[('a', '|i1'), ('b', '|S0')])

I need to specify datatypes for all numerical fields, since I care about
int8/16/32, etc., but I would like to keep the automatic string length
detection that works when I don't specify datatypes. I tried replacing
np.str with None, but with no luck. I know I can specify '|S5', for
example, but I don't know in advance what the string length should be.

Is there a way to solve this problem without manually examining the  
data that is being passed to rec.fromrecords?
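
One possible workaround, as a minimal sketch rather than a definitive
answer: let fromrecords infer every field first, then cast only the
numerical fields to the types that matter, keeping whatever string type
and length NumPy detected. The helper name fromrecords_auto_str and its
overrides argument are hypothetical and not part of NumPy. The sketch
assumes that casting a record array to a dtype with the same field names
and order via astype is acceptable here, and note that on Python 3 the
inferred string field is a unicode type ('<U5') rather than '|S5'.

```python
import numpy as np

def fromrecords_auto_str(records, names, overrides):
    """Hypothetical helper: infer all fields, then override selected dtypes."""
    # First pass: no dtype given, so NumPy detects the string length itself.
    inferred = np.rec.fromrecords(records, names=names)
    # Keep each inferred field dtype unless an override is supplied for it.
    descr = [(name, overrides.get(name, inferred.dtype[name]))
             for name in inferred.dtype.names]
    # Cast field by field; the string field keeps its detected length.
    return inferred.astype(np.dtype(descr))

data = [(1, 'hello'), (2, 'world')]
r = fromrecords_auto_str(data, names=['a', 'b'], overrides={'a': np.int8})
print(r.dtype)
```

This still examines the data, but NumPy does the examining, at the cost
of building the array twice.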

Thanks for any help,

Thomas


