[Numpy-discussion] Object array creation from sequences

Ed Schofield schofield at ftw.at
Wed May 3 04:28:15 EDT 2006


Hi all,

NumPy currently does the following:

>>> s = set([1, 100, 10])
>>> a = numpy.array(s)
>>> a
array(set([1, 100, 10]), dtype=object)
>>> a.shape
()

Many functions in NumPy's functional interface, like numpy.sort(),
inherit this behaviour:

>>> b = numpy.sort(s)
>>> b
array(set([1, 10, 100]), dtype=object)
>>> b.shape
()

I'd like to propose two modifications to improve array construction from
non-list sequences:

1. We inspect whether the data has a __len__ method.  If it does, and it
returns an integer, we construct an array out of the C equivalent of
list(data).

Others on this list have noted that NumPy also creates a rank-0 object
array from generators:

>>> c = numpy.array(i*2 for i in xrange(10))
>>> c
array(<generator object at 0xb6918b2c>, dtype=object)

Proposal 1 wouldn't affect this case, since generators do not have a
__len__ method.
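
To make proposal 1 concrete, here is a rough pure-Python sketch of the check it describes. The helper name listify_if_sized is hypothetical and is not part of NumPy; the real change would live in NumPy's C code. It is written for Python 3, where len() already raises TypeError if __len__ is missing or returns a non-integer.

```python
def listify_if_sized(data):
    """Return list(data) if data reports an integer length, else data unchanged.

    Hypothetical helper illustrating proposal 1; not a NumPy function.
    """
    try:
        # len() raises TypeError when there is no __len__ method or when
        # __len__ returns something that is not an integer.
        len(data)
    except TypeError:
        return data
    return list(data)


s = {1, 100, 10}
as_list = listify_if_sized(s)          # a list holding the three set elements

gen = (i * 2 for i in range(10))
untouched = listify_if_sized(gen)      # generators have no __len__, so no change
```

Under this sketch, passing listify_if_sized(s) to numpy.array would hand NumPy an ordinary list rather than the set itself, while a generator would still reach NumPy untouched.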


2. A stronger version of the above proposal: if the data has an __iter__
method, we construct an array out of the C equivalent of list(data). 
This would handle the generator case above correctly until we have a
more efficient implementation.  Creating an array from an infinite
iterator would loop forever, just as list(inf_generator) does.
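
Proposal 2 could be sketched the same way. Again the helper name listify_if_iterable is hypothetical and not part of NumPy, and the code assumes Python 3, where strings also have an __iter__ method and would therefore be split into characters by this check.

```python
import itertools


def listify_if_iterable(data):
    """Return list(data) if data has an __iter__ method, else data unchanged.

    Hypothetical helper illustrating proposal 2; not a NumPy function.
    """
    if hasattr(data, "__iter__"):
        # Consumes the iterable completely; never returns for an infinite one.
        return list(data)
    return data


doubled = listify_if_iterable(i * 2 for i in range(10))

# An infinite iterator has to be bounded by the caller before conversion.
first_five = listify_if_iterable(itertools.islice(itertools.count(), 5))
```

The itertools.islice call shows one standard-library way for a caller to cap an otherwise infinite iterator before conversion.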


-- Ed




