Faster way to map numpy arrays

Saurabh Kabra skabra at gmail.com
Mon Jun 25 23:20:58 EDT 2012


Thanks guys

I replaced the list and the nested loops with a numpy array of fancy
indices. The mapping now runs ~10x faster. As a matter of fact, the number
of elements of array A to be summed and mapped was different for each
element of B (which was the reason I was using lists). I solved that
problem by simply padding with zero elements to make a regular 3D numpy
array out of the list.
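
A minimal sketch of one way such padding might look, not necessarily the
exact code used here. The names A, mapping, A_padded, index and B are
hypothetical. It assumes the mapping stores flat indices into A, and it
points every padding slot at a single zero appended to the flattened A so
that padding contributes nothing to the sum:

```python
import numpy as np

# Hypothetical source array A (2 x 3): [[0, 1, 2], [3, 4, 5]]
A = np.arange(6, dtype=float).reshape(2, 3)

# Hypothetical ragged mapping with the shape of B (2 x 2): each cell lists
# the flat indices of A that are summed into that element of B.
mapping = [[[0], [1, 2]],
           [[3, 4, 5], [0, 5]]]

nrows, ncols = len(mapping), len(mapping[0])
longest = max(len(cell) for row in mapping for cell in row)

# Append one zero to the flattened A; padding entries index that zero.
A_padded = np.append(A.ravel(), 0.0)
pad = A.size  # position of the appended zero

# Regular 3D index array, pre-filled with the padding index.
index = np.full((nrows, ncols, longest), pad, dtype=np.intp)
for i, row in enumerate(mapping):
    for j, cell in enumerate(row):
        index[i, j, :len(cell)] = cell

# One fancy-indexing gather plus a sum over the last axis yields B.
B = A_padded[index].sum(axis=-1)
```

The Python loop only builds the index array once; the gather and the sum
themselves run inside numpy.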

Saurabh





On 25 June 2012 08:24, Stefan Behnel <stefan_ml at behnel.de> wrote:

> Saurabh Kabra, 25.06.2012 05:37:
> > I have written a script to map a 2D numpy array (A) onto another array (B)
> > of different dimension. Several elements of array A are summed and
> > mapped to each element of array B. To achieve this I create a list where I
> > store the indices of array A to be mapped to array B. The list has the
> > dimension of array B (if one can technically say that) and each element is
> > a list of indices to be summed. Then I parse this list with a nested loop
> > and compute each element of array B.
>
>
> > Because of the nested loop and the big arrays the process takes a minute or
> > so. My question is: is there a more elegant and significantly faster way of
> > doing this in python?
>
> I'm sure there's a way to do this kind of transformation more efficiently
> in NumPy. I faintly recall that you can use one array to index into
> another; something like that might do the trick already. In any case, using
> a NumPy array also for the mapping matrix sounds like a straightforward
> thing to try.
>

I can't tell from the description of the problem what you're trying to do,
but for the special case of summing along one axis of a numpy array of
dimension N to produce a new numpy array of dimension N-1, there is fast
built-in support in numpy:

>>> import numpy
>>> a = numpy.array([[1, 2], [3, 4]])
>>> a
array([[1, 2],
       [3, 4]])
>>> a.sum()   # sum of all elements
10
>>> a.sum(axis=1)  # sum of each row
array([3, 7])
>>> a.sum(axis=0)  # sum of each column
array([4, 6])

If your problem is not in this form, you can use numpy's fancy indexing to
convert it to this form, provided the number of elements summed from A is
the same for each element of B (e.g. if each element of B is the result of
summing exactly 10 elements chosen from A).
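
As a rough illustration of that conversion, here is a small sketch in which
each element of B sums exactly 2 elements of A. The arrays rows and cols
are hypothetical index arrays made up for the example; rows[i, j, k] and
cols[i, j, k] give the position in A of the k-th element contributing to
B[i, j]:

```python
import numpy as np

# Example source array A (3 x 4), holding the values 0..11.
A = np.arange(12).reshape(3, 4)

# Hypothetical index arrays of shape (2, 2, 2): B is 2 x 2 and each of
# its elements is the sum of exactly 2 elements of A.
rows = np.array([[[0, 0], [0, 0]],
                 [[1, 2], [1, 2]]])
cols = np.array([[[0, 1], [2, 3]],
                 [[0, 0], [3, 3]]])

# Fancy indexing gathers the chosen elements into a (2, 2, 2) array ...
gathered = A[rows, cols]

# ... and summing along the last axis reduces it to B.
B = gathered.sum(axis=-1)
```

Here B[1, 0] is A[1, 0] + A[2, 0], and the other elements follow the same
pattern.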


> But you might also want to take a look at Cython. It sounds like a problem
> where a trivial Cython implementation would seriously boost the
> performance.
>
> http://docs.cython.org/src/tutorial/numpy.html


>
> Stefan
>