[Numpy-discussion] question about creating numpy arrays

Darren Dale dsdale24 at gmail.com
Thu May 20 11:53:09 EDT 2010


[Sorry, my last message got cut off.]

On Thu, May 20, 2010 at 11:37 AM, Darren Dale <dsdale24 at gmail.com> wrote:
> On Thu, May 20, 2010 at 10:44 AM, Benjamin Root <ben.root at ou.edu> wrote:
>>> I gave two counterexamples of why.
>>
>> The examples you gave aren't counterexamples.  See below...
>
> I'm not interested in arguing over semantics. I've discovered an issue
> with how numpy deals with lists of objects that derive from ndarray,
> and am concerned about the implications for classes that extend
> ndarray.
>
>> On Wed, May 19, 2010 at 7:06 PM, Darren Dale <dsdale24 at gmail.com> wrote:
>>>
>>> On Wed, May 19, 2010 at 4:19 PM,  <josef.pktd at gmail.com> wrote:
>>> > On Wed, May 19, 2010 at 4:08 PM, Darren Dale <dsdale24 at gmail.com> wrote:
>>> >> I have a question about creation of numpy arrays from a list of
>>> >> objects, which bears on the Quantities project and also on masked
>>> >> arrays:
>>> >>
>>> >>>>> import quantities as pq
>>> >>>>> import numpy as np
>>> >>>>> a, b = 12*pq.m, 1*pq.s
>>> >>>>> np.array([a, b])
>>> >> array([ 12.,   1.])
>>> >>
>>> >> Why doesn't that create an object array? Similarly:
>>> >>
>>
>>
>> Consider the use case of a person creating a 1-D numpy array:
>>  > np.array([12.0, 1.0])
>> array([ 12.,  1.])
>>
>> How is python supposed to tell the difference between
>>  > np.array([a, b])
>> and
>>  > np.array([12.0, 1.0])
>> ?
>>
>> It can't, and there are plenty of times when one wants to explicitly
>> initialize a small numpy array with a few discrete variables.
>>
>>
>>>
>>> >>>>> m = np.ma.array([1], mask=[True])
>>> >>>>> m
>>> >> masked_array(data = [--],
>>> >>             mask = [ True],
>>> >>       fill_value = 999999)
>>> >>
>>> >>>>> np.array([m])
>>> >> array([[1]])
>>> >>
>>
>> Again, this is expected behavior.  Numpy saw an array of an array,
>> therefore, it produced a 2-D array. Consider the following:
>>
>>  > np.array([[12, 4, 1], [32, 51, 9]])
>>
>> I, as a user, expect numpy to create a 2-D array (2 rows, 3 columns) from
>> that array of arrays.
>>
>>>
>>> >> This has broader implications than just creating arrays, for example:
>>> >>
>>> >>>>> np.sum([m, m])
>>> >> 2
>>> >>>>> np.sum([a, b])
>>> >> 13.0
>>> >>
>>
>>
>> If you wanted sums from each object, there are some better (i.e., more
>> clear) ways to go about it.  If you have a predetermined number of
>> numpy-compatible objects, say a, b, c, then you can explicitly call the sum
>> for each one:
>>  > a_sum = np.sum(a)
>>  > b_sum = np.sum(b)
>>  > c_sum = np.sum(c)
>>
>> Which I think communicates the programmer's intention better than (for a
>> numpy array, x, composed of a, b, c):
>>  > # As a numpy user, I would expect a scalar out of this, not an array:
>>  > object_sums = np.sum(x)
>>
>> If you have an arbitrary number of objects (which is what I suspect you
>> have), then one could easily produce an array of sums (for a list, x, of
>> numpy-compatible objects) like so:
>>  > object_sums = [np.sum(anObject) for anObject in x]
>>
>> Performance-wise, it should be no more or less efficient than having numpy
>> somehow produce an array of sums from a single call to sum.
>> Readability-wise, it makes more sense because when you are treating objects
>> separately, a *list* of them is more intuitive than a numpy.array, which is
>> more-or-less treated as a single mathematical entity.
>>
>> I hope that addresses your concerns.
>
> I appreciate the response, but you are arguing that it is not a
> problem, and I'm certain that it is. It may not be numpy

It may not be numpy's problem; I can accept that. But it is definitely
a problem for quantities. I'm trying to determine just how big a
problem it is. I had hoped that one day quantities might become a part
of numpy or scipy, but this appears to be a fundamental issue and it
makes me doubt that inclusion would be appropriate.
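
A minimal sketch of the issue, using plain numpy only. The class `Tagged` and
its `unit` attribute are hypothetical stand-ins for a unit-carrying ndarray
subclass; they are not the quantities API. The sketch assumes that building an
array from a list of such objects returns a base ndarray, so the extra
attribute is silently discarded:

```python
import numpy as np

class Tagged(np.ndarray):
    """Hypothetical ndarray subclass that carries a unit label."""

    def __new__(cls, value, unit):
        obj = np.asarray(value, dtype=float).view(cls)
        obj.unit = unit
        return obj

    def __array_finalize__(self, obj):
        # Propagate the label when numpy creates views or copies.
        self.unit = getattr(obj, "unit", None)

a = Tagged(12.0, "m")
b = Tagged(1.0, "s")

# The list is converted to a base ndarray; the units are not consulted.
combined = np.array([a, b])
print(type(combined).__name__, combined)

# The same conversion happens inside np.sum, so metres and seconds are added.
total = np.sum([a, b])
print(total)
```

Nothing in the conversion gives the subclass a chance to object to mixing
incompatible units, which is the part that matters for quantities.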

Thank you for the suggestion about calling the sum method instead of
numpy's function. That is a reasonable workaround.
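
A rough sketch of that workaround for the masked-array case, assuming a
reasonably recent numpy. Summing each object through its own method keeps the
mask in play, and `np.ma.concatenate` is one mask-aware way to combine the
objects when a single array is wanted:

```python
import numpy as np

m = np.ma.array([1], mask=[True])

# Plain conversion of the list: the mask is dropped.
plain = np.array([m])
lost = np.sum([m, m])

# Workaround 1: call each object's own sum method.
per_object = [obj.sum() for obj in (m, m)]

# Workaround 2: combine with the mask-aware function from numpy.ma.
kept = np.ma.concatenate([m, m])

print(type(plain).__name__, plain.tolist(), lost)
print(per_object)
print(kept)
```

Here `per_object` and `kept` still treat the masked entries as masked, while
`plain` and `lost` see only the underlying data.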

Darren


