[Numpy-discussion] Structured array creation with list of lists and others

Kirill Balunov kirillbalunov at gmail.com
Sun Mar 26 12:44:11 EDT 2017


Allan, thank you for your draft! I agree with you that in the general case
(though not in mine) it would be hard to resolve all corner cases. I also
think that someone who reads the numpy reference linearly will get some
insight that a list of tuples is necessary (but that was not my case).

For me, one problem is that in some cases numpy allows a lot of freedom,
while in others it is unnecessarily strict. Another is the exception
messages (but this is certainly subjective).
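
For anyone who finds this thread later, here is a minimal sketch of the
workaround, assuming rows shaped like the `b` from my original message. The
sample values in the second row are made up for illustration, and the field
names f0..f3 are simply the defaults numpy assigns for a comma-separated
dtype string.

```python
import numpy as np

# Hypothetical sample rows, shaped like `b` in the original message.
b = [[1, 10.3, 12.1, 2.12],
     [2, 11.5, 13.0, 3.25]]

# One int32 field followed by three float32 fields (default names f0..f3).
dt = np.dtype('i4,f4,f4,f4')

# Convert each inner list to a tuple so that numpy reads it as one record
# rather than as an extra array dimension.
records = np.array([tuple(row) for row in b], dtype=dt)

# Without a structured dtype, the same nested list is a plain 2-D array.
plain = np.array(b)

print(records.shape, records.dtype.names)
print(plain.shape)
```

The only change to the input is `tuple(row)`: the outer list stays a list and
marks the array dimension, while each tuple marks one structured element.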



2017-03-24 19:48 GMT+03:00 Allan Haldane <allanhaldane at gmail.com>:

> On 03/23/2017 02:16 PM, Kirill Balunov wrote:
> > It was the first time I tried to create a structured array in numpy.
> > Usually I use pandas for heterogeneous arrays, but it is one more
> > dependency to my project.
> >
> > It took me some time (really, much more than some) to understand the
> > problem with structured array creation. As an example:
> >
> > I had a list of lists of this kind:
> > b=[[ 1, 10.3, 12.1, 2.12 ],...]
> >
> > And tried:
> > np.array(b, dtype='i4,f4,f4,f4')
> >
> > Which raises some weird exception:
> > TypeError: a bytes-like object is required, not 'int'
> >
> > Two hours later I found that I need a list of tuples. I didn't find any
> > help in the documentation and could not work out that the problem was
> > with the inner lists...
> >
> > Why is there such a restriction - a 'list of tuples' to create a
> > structured array? What is the idea behind it; why not a list of lists,
> > or a tuple of lists, or ...?
> >
> > Also the exception does not help at all...
> > p.s.: It looks like dtype also accepts only a list of tuples. But I
> > cannot grasp the idea behind these restrictions.
> >
>
> The problem is that numpy needs to distinguish between multidimensional
> arrays and structured elements. A "list of lists" will often trigger
> numpy's broadcasting rules, which is not what you want here.
>
> For instance, should numpy interpret your input list as a 2d array of
> dimension Lx4 containing integer elements, or a 1d array of length L of
> structs with 4 fields?
>
> In this particular case maybe numpy could, in principle, figure it out
> from what you gave it by calculating that the innermost dimension is
> the same length as the number of fields. But there are other cases (such
> as assignment) where similar ambiguities arise that are harder to
> resolve. So to preserve our sanity we want to require that structures be
> formatted as tuples all the time.
>
> I have a draft of potential updated structured array docs you can read
> here:
> https://gist.github.com/ahaldane/7d1873d33d4d0f80ba7a54ccf1052eee
>
> See the section "Assignment from Python Native Types (Tuples)", which
> hopefully better warns that tuples are needed. Let me know if you think
> something is missing from the draft.
>
> (WARNING: the section about multi-field assignment in the doc draft is
> incorrect for current numpy - that's what I'm proposing for the next
> release. The rest of the docs are accurate for current numpy)
>
> Agreed that the error message could be changed.
>
> Allan
>