[Tutor] list to numpy record array

Skipper Seabold jsseabold at gmail.com
Tue Feb 23 18:55:07 CET 2010


On Mon, Feb 22, 2010 at 11:50 PM, Vincent Davis
<vincent at vincentdavis.net> wrote:
>
> I must be missing something simple. I have a list of lists data = "[['  0', '  0', '234.0', '24.0', ' 25'], ['  1', '  0', '22428.0', '2378.1', ' 25'],......" and want to make a record array from it, but it gets screwed up or I don't get it, maybe both. Notice that at this stage the items are strings, not numbers, and there is whitespace; I am not sure whether this matters.
> Here is what is happening
> adata = numpy.array(data,numpy.float64)
>
> >>> adata
> array([[  0.00000000e+00,   0.00000000e+00,   2.34000000e+02,
>           2.40000000e+01,   2.50000000e+01],
>        ...,
>        [  4.77000000e+02,   4.77000000e+02,   2.07000000e+02,
>           4.58000000e+01,   2.50000000e+01]])
>
> This is what I would expect, except that it is not a record array.
> The result further down is not what I expect. I think I have tried every variation, including using numpy dtypes such as numpy.int32, or bdata = numpy.array(data, dtype = [('x', int),('y', int),('mean',float),('stdv',float),('npixcels',int)])
> What am I missing?
>
> bdata = numpy.array(data, [('x', int),('y', int),('mean',float),('stdv',float),('npixcels',int)])
> >>> bdata
> array([[(3153952, 0, 0.0, 0.0, 0), (3153952, 0, 0.0, 0.0, 0),
>         (206933603122, 0, 0.0, 0.0, 0), (808334386, 0, 0.0, 0.0, 0),
>         (3486240, 0, 0.0, 0.0, 0)],
>        [(3219488, 0, 0.0, 0.0, 0), (3153952, 0, 0.0, 0.0, 0),
>         (13561617777439282, 0, 0.0, 0.0, 0),
>         (54074581398322, 0, 0.0, 0.0, 0), (3486240, 0, 0.0, 0.0, 0)],
>        [(3285024, 0, 0.0, 0.0, 0), (3153952, 0, 0.0, 0.0, 0),
>         (206933931058, 0, 0.0, 0.0, 0), (925775666, 0, 0.0, 0.0, 0),
>         (3486240, 0, 0.0, 0.0, 0)],
>        ...,
>        [(3487540, 0, 0.0, 0.0, 0), (3618612, 0, 0.0, 0.0, 0),
>         (206933602866, 0, 0.0, 0.0, 0), (908996661, 0, 0.0, 0.0, 0),
>         (3486240, 0, 0.0, 0.0, 0)],
>        [(3553076, 0, 0.0, 0.0, 0), (3618612, 0, 0.0, 0.0, 0),
>         (13561596370041137, 0, 0.0, 0.0, 0),
>         (62870573495603, 0, 0.0, 0.0, 0), (3486240, 0, 0.0, 0.0, 0)],
>        [(3618612, 0, 0.0, 0.0, 0), (3618612, 0, 0.0, 0.0, 0),
>         (206933798962, 0, 0.0, 0.0, 0), (942552372, 0, 0.0, 0.0, 0),
>         (3486240, 0, 0.0, 0.0, 0)]],
>
>       dtype=[('x', '<i8'), ('y', '<i8'), ('mean', '<f8'), ('stdv', '<f8'), ('npixcels', '<i8')])
>
>

I neglected to reply to the whole list on my first try.  For posterity's sake:

You should ask on the scipy-user list with a self-contained example.
It is heavily trafficked. http://www.scipy.org/Mailing_Lists

From the example you gave above, I am not sure what's going on, unless
it's something in the casting from strings.  Note, though, that you have
created a structured array and not a record array.  The subtle
difference is that a record array allows attribute lookup, i.e., you
could do bdata.x instead of bdata['x'].  Structured arrays are usually
faster because the attribute-lookup convenience of record arrays is
implemented in Python, whereas plain structured arrays use C code.
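
To make the distinction concrete, here is a minimal sketch, not a
diagnosis of the original data.  It assumes the trouble comes from
passing lists of strings: NumPy expects each record of a structured
array as a tuple, so the sketch converts every row to a tuple of
numbers first.  The two sample rows are the ones quoted above; the
names rows and rdata are only illustrative.

```python
import numpy as np

# The two sample rows from the question; values are strings with whitespace.
data = [['  0', '  0', '234.0', '24.0', ' 25'],
        ['  1', '  0', '22428.0', '2378.1', ' 25']]

dtype = [('x', int), ('y', int), ('mean', float),
         ('stdv', float), ('npixcels', int)]

# Convert each row to a tuple of numbers. int() and float() ignore
# surrounding whitespace, so ' 25' and '  0' parse without stripping.
rows = [(int(x), int(y), float(mean), float(stdv), int(n))
        for x, y, mean, stdv, n in data]

# A list of tuples plus a structured dtype gives a 1-D structured array.
bdata = np.array(rows, dtype=dtype)
print(bdata['x'])

# Viewing the same memory as a recarray adds attribute lookup.
rdata = bdata.view(np.recarray)
print(rdata.x, rdata.stdv)

# A field called 'mean' collides with the ndarray.mean method, so that
# field is still read with item syntax even on the record array.
print(rdata['mean'])
```

With this layout bdata has one entry per input row, and both bdata['x']
and rdata.x refer to the same column.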

hth,

Skipper

