Faster way to map numpy arrays

Oscar Benjamin oscar.benjamin at bristol.ac.uk
Tue Jun 26 09:51:38 EDT 2012


On 26 June 2012 04:20, Saurabh Kabra <skabra at gmail.com> wrote:

> Thanks guys
>
> I implemented a numpy array with fancy indices and got rid of the list and
> the loops. The time to do the mapping improved ~10x. As a matter of fact,
> the number of elements in array A to be summed and mapped was different for
> each element in B (which was the reason I was using lists). But I solved
> that problem by simply adding zero elements to make a regular 3D numpy
> array out of the list.
>

Is that good enough, or are you looking for more speedup?

Padding with zeros to create a larger-than-needed array may be less
time-efficient (and is definitely less memory-efficient) than extracting
each subarray in a loop. Consider the following:

>>> import numpy
>>> a = numpy.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
>>> a
array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])
>>> af = a.flatten()
>>> af
array([1, 2, 3, 4, 5, 6, 7, 8, 9])
>>> indices = [[1, 7, 8], [0, 1], [8, 4]]
>>> indices = [numpy.array(inds) for inds in indices]
>>> indices
[array([1, 7, 8]), array([0, 1]), array([8, 4])]
>>> for inds in indices:
...     print(inds, af[inds], af[inds].sum())
...
[1 7 8] [2 8 9] 19
[0 1] [1 2] 3
[8 4] [9 5] 14
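
For comparison, here is a rough sketch of what the zero-padded approach
might look like on the same data. This is my guess at the idea, not the
code from the original script: the names af_padded, pad_index and padded
are made up for illustration, and I pad the index array with the position
of one extra zero appended to the flattened array.

```python
import numpy

a = numpy.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
af = a.flatten()
indices = [[1, 7, 8], [0, 1], [8, 4]]

# Append a single zero so that short rows can be padded with an index
# that contributes nothing to the sum (an assumption about how to pad).
af_padded = numpy.append(af, 0)
pad_index = len(af)  # position of the extra zero

# Build a regular 2D index array, filled with the padding index.
width = max(len(inds) for inds in indices)
padded = numpy.full((len(indices), width), pad_index, dtype=numpy.intp)
for row, inds in enumerate(indices):
    padded[row, :len(inds)] = inds

# One fancy-indexing step and one sum along the padded axis.
padded_sums = af_padded[padded].sum(axis=1)

# The loop version from above, for comparison.
loop_sums = numpy.array([af[numpy.array(inds)].sum() for inds in indices])

print(padded_sums)  # [19  3 14]
print(loop_sums)    # [19  3 14]
```

Both give the same sums. The padded index array needs room for the
longest group in every row, which is where the extra memory goes when the
group sizes vary a lot.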

Which way is most efficient depends on a number of things. The first is
whether or not the operation you're describing is repeated. If, for
example, you keep doing this for the same indices but with a different
array A each time, then you should try precomputing all the indices as a
list of numpy arrays (as shown above).
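
If the indices really are fixed, one possible variation (a sketch, not
something I have timed on your data) is to precompute a single flat index
array plus the start offset of each group, and then let
numpy.add.reduceat do the per-group sums. The helper name map_sums is
hypothetical, and this assumes every group has at least one index, since
reduceat does not give zero for an empty group.

```python
import numpy

indices = [[1, 7, 8], [0, 1], [8, 4]]

# Precompute once: one flat index array plus the start of each group.
flat_index = numpy.concatenate(
    [numpy.asarray(inds, dtype=numpy.intp) for inds in indices])
lengths = numpy.array([len(inds) for inds in indices])
starts = numpy.concatenate(([0], numpy.cumsum(lengths)[:-1]))


def map_sums(a):
    """Sum the selected elements of a for each group (hypothetical helper)."""
    # ravel() gives a flat view where possible; flat_index selects the
    # elements and reduceat sums each consecutive group.
    return numpy.add.reduceat(a.ravel()[flat_index], starts)


# Reuse the same precomputed indices with different arrays A.
a1 = numpy.arange(1, 10).reshape(3, 3)
a2 = a1 * 10

print(map_sums(a1))  # [19  3 14]
print(map_sums(a2))  # [190  30 140]
```

This avoids both the padding and the Python-level loop, at the cost of
being a little harder to read.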

On the other hand, if you're repeating with the same matrix A but different
sets of indices, you'll need to think about how you are generating the
indices.

In my experience the fastest way to do something like this would be to use
Cython, as Stefan suggested earlier in the thread.

Oscar.


>
> Saurabh
>
> On 25 June 2012 08:24, Stefan Behnel
> <stefan_ml at behnel.de> wrote:
>
>> Saurabh Kabra, 25.06.2012 05:37:
>> > I have written a script to map a 2D numpy array (A) onto another array (B)
>> > of different dimension. More than one element (of array A) are summed and
>> > mapped to each element of array B. To achieve this I create a list where I
>> > store the index of array A to be mapped to array B. The list is the
>> > dimension of array B (if one can technically say that) and each element
>> > is a list of indices to be summed. Then I parse this list with a nested
>> > loop and compute each element of array B.
>> >
>> > Because of the nested loop and the big arrays the process takes a minute
>> > or so. My question is: is there a more elegant and significantly faster
>> > way of doing this in python?
>>
>> I'm sure there's a way to do this kind of transformation more efficiently
>> in NumPy. I faintly recall that you can use one array to index into
>> another, something like that might do the trick already. In any case,
>> using a NumPy array also for the mapping matrix sounds like a
>> straightforward thing to try.
>>
>>
>
> I can't tell from the description of the problem what you're trying to do
> but for the special case of summing along one axis of a numpy array of
> dimension N to produce a new numpy array of dimension N-1, there is fast
> builtin support in numpy:
>
> >>> import numpy
> >>> a = numpy.array([[1, 2], [3, 4]])
> >>> a
> array([[1, 2],
>        [3, 4]])
> >>> a.sum()   # sum of all elements
> 10
> >>> a.sum(axis=1)  # sum of each row
> array([3, 7])
> >>> a.sum(axis=0)  # sum of each column
> array([4, 6])
>
> If your problem is not in this form, you can use numpy's fancy indexing to
> convert it to this form, provided the number of elements summed from A is
> the same for each element of B (i.e. if each element of B is the result of
> summing exactly 10 elements chosen from A).
>
>
>> But you might also want to take a look at Cython. It sounds like a problem
>> where a trivial Cython implementation would seriously boost the
>> performance.
>>
>> http://docs.cython.org/src/tutorial/numpy.html
>
>
>>
>> Stefan
>>
>