[SciPy-user] filling array without loop...

Anne Archibald peridot.faceted at gmail.com
Sun Apr 22 19:41:15 EDT 2007


On 22/04/07, fred <fredmfp at gmail.com> wrote:

> Ok, I think I got the trick.
> Very powerful ! ;-)

Broadcasting and elementwise operations are the basic raison d'être for numpy.

> I tried this:
>
>            cells_array =
> vstack([hstack(VTK_HEXAHEDRON_NB_POINTS*ones((nz-1,ny-1,nx-1),dtype=int)),
>
> hstack(k[:-1,:,:]*(nx*ny)+j[:,:-1,:]*nx+i[:,:,:-1]),
>
> hstack(k[:-1,:,:]*(nx*ny)+j[:,:-1,:]*nx+i[:,:,1:]),
>
> hstack(k[:-1,:,:]*(nx*ny)+j[:,1:,:]*nx+i[:,:,1:]),
>
> hstack(k[:-1,:,:]*(nx*ny)+j[:,1:,:]*nx+i[:,:,:-1]),
>
> hstack(k[1:,:,:]*(nx*ny)+j[:,:-1,:]*nx+i[:,:,:-1]),
>
> hstack(k[1:,:,:]*(nx*ny)+j[:,:-1,:]*nx+i[:,:,1:]),
>
> hstack(k[1:,:,:]*(nx*ny)+j[:,1:,:]*nx+i[:,:,1:]),
>
> hstack(k[1:,:,:]*(nx*ny)+j[:,1:,:]*nx+i[:,:,:-1])]).transpose().ravel()
>
> Written as this, I get a memory allocation error: for nx=ny=nz=200, it
> exceeds 2GB,
> altough it works fine when array() used. Any idea ?

Um, first of all, what are you trying to accomplish with all those
hstack()s? The stacking functions are a good way to make an array out
of a list or tuple, but if what you have is already an array, some
combination of reshape() and transpose() is almost certainly what you
want. If you're doing something simple, that probably won't copy your
data; if it does copy your data, rearrange how you fill your arrays to
start with.

at 7x200x200x200x8 bytes, the final array will be 427MiB, so any
duplication of it is liable to push you close to 2 GB. It's a bit
crazy to be working so close to the point at which you run out of RAM.

But you have a point; numpy's vector operations, if written in the
obvious way, can use far more memory than looping. The easiest
solution is a compromise: partially vectorize your function. Instead
of replacing all three loops by numpy vector operations, try replacing
only one or two. This will still cut your overhead drastically while
keeping the array sizes small.

If you need more speed from this, it's going to take more work and
produce uglier code. You can rewrite the code so all matrix operations
are done in place (using += or ufunc output arguments), or there are
tools like numexpr, pyrex, or weave.

Good luck,
Anne


More information about the SciPy-User mailing list