[Tutor] list to numpy record array
Skipper Seabold
jsseabold at gmail.com
Tue Feb 23 18:55:07 CET 2010
On Mon, Feb 22, 2010 at 11:50 PM, Vincent Davis
<vincent at vincentdavis.net> wrote:
>
> I must be missing something simple. I have a list of lists, data = "[[' 0', ' 0', '234.0', '24.0', ' 25'], [' 1', ' 0', '22428.0', '2378.1', ' 25'],......", and want to make a record array from it, but it gets screwed up or I don't get it, maybe both. Notice that at this stage the items are strings, not numbers, and there is whitespace; I'm not sure whether this matters.
> Here is what is happening
> adata = numpy.array(data,numpy.float64)
>
> >>> adata
> array([[ 0.00000000e+00, 0.00000000e+00, 2.34000000e+02,
> 2.40000000e+01, 2.50000000e+01],
> ...,
> [ 4.77000000e+02, 4.77000000e+02, 2.07000000e+02,
> 4.58000000e+01, 2.50000000e+01]])
>
> This is what I would expect, except that it is not a record array.
> The result below is not what I expect. I think I have tried every variation, including using numpy dtypes such as numpy.int32, or bdata = numpy.array(data, dtype = [('x', int),('y', int),('mean',float),('stdv',float),('npixcels',int)])
> What am I missing?
>
> bdata = numpy.array(data, [('x', int),('y', int),('mean',float),('stdv',float),('npixcels',int)])
> >>> bdata
> array([[(3153952, 0, 0.0, 0.0, 0), (3153952, 0, 0.0, 0.0, 0),
> (206933603122, 0, 0.0, 0.0, 0), (808334386, 0, 0.0, 0.0, 0),
> (3486240, 0, 0.0, 0.0, 0)],
> [(3219488, 0, 0.0, 0.0, 0), (3153952, 0, 0.0, 0.0, 0),
> (13561617777439282, 0, 0.0, 0.0, 0),
> (54074581398322, 0, 0.0, 0.0, 0), (3486240, 0, 0.0, 0.0, 0)],
> [(3285024, 0, 0.0, 0.0, 0), (3153952, 0, 0.0, 0.0, 0),
> (206933931058, 0, 0.0, 0.0, 0), (925775666, 0, 0.0, 0.0, 0),
> (3486240, 0, 0.0, 0.0, 0)],
> ...,
> [(3487540, 0, 0.0, 0.0, 0), (3618612, 0, 0.0, 0.0, 0),
> (206933602866, 0, 0.0, 0.0, 0), (908996661, 0, 0.0, 0.0, 0),
> (3486240, 0, 0.0, 0.0, 0)],
> [(3553076, 0, 0.0, 0.0, 0), (3618612, 0, 0.0, 0.0, 0),
> (13561596370041137, 0, 0.0, 0.0, 0),
> (62870573495603, 0, 0.0, 0.0, 0), (3486240, 0, 0.0, 0.0, 0)],
> [(3618612, 0, 0.0, 0.0, 0), (3618612, 0, 0.0, 0.0, 0),
> (206933798962, 0, 0.0, 0.0, 0), (942552372, 0, 0.0, 0.0, 0),
> (3486240, 0, 0.0, 0.0, 0)]],
>
> dtype=[('x', '<i8'), ('y', '<i8'), ('mean', '<f8'), ('stdv', '<f8'), ('npixcels', '<i8')])
>
>
I neglected to reply to the whole list on my first try. For posterity's sake:
You should ask on the scipy-user list with a self-contained example.
It is heavily trafficked. http://www.scipy.org/Mailing_Lists
From the example you gave above, I am not sure what's going on unless
it's something in the casting from strings. Note though that you have
created a structured array and not a record array. The subtle
difference is that a record array allows attribute lookup, i.e., you
could do bdata.x instead of bdata['x']. Structured arrays are usually
faster, as the attribute lookup convenience of record arrays is
implemented in Python whereas structured arrays use C code.
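As a rough sketch of one way to get from the list of strings to both
kinds of array: NumPy treats each tuple as one record, so this converts
every row to a tuple of numbers before building the array. The two
sample rows are copied from the question; the names rows, dt and rdata
are only illustrative, and this has not been run against the full data
set.

```python
import numpy as np

# Sample rows copied from the question: strings with leading whitespace.
data = [[' 0', ' 0', '234.0', '24.0', ' 25'],
        [' 1', ' 0', '22428.0', '2378.1', ' 25']]

dt = [('x', int), ('y', int), ('mean', float),
      ('stdv', float), ('npixcels', int)]

# One tuple per record; int() and float() tolerate surrounding whitespace.
rows = [(int(x), int(y), float(m), float(s), int(n))
        for x, y, m, s, n in data]

bdata = np.array(rows, dtype=dt)      # structured array, shape (2,)
print(bdata['x'])                     # field access by name

rdata = bdata.view(np.recarray)       # same data viewed as a record array
print(rdata.stdv)                     # attribute access
# Caveat: a field named like an ndarray method (here 'mean') is shadowed
# by that method on a record array; use rdata['mean'] for it.
print(rdata['mean'])
```

Viewing the structured array as a recarray does not copy the data, so
you can keep bdata for the heavy lifting and use rdata only where the
attribute syntax is convenient.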
hth,
Skipper