[SciPy-user] filling array without loop...

fred fredmfp at gmail.com
Mon Apr 23 07:14:22 EDT 2007


Anne Archibald a écrit :
>> I tried this:
>>
>>            cells_array =
>> vstack([hstack(VTK_HEXAHEDRON_NB_POINTS*ones((nz-1,ny-1,nx-1),dtype=int)),
>>
>> hstack(k[:-1,:,:]*(nx*ny)+j[:,:-1,:]*nx+i[:,:,:-1]),
>>
>>     
[snip]
>> Um, first of all, what are you trying to accomplish with all those
>> hstack()s? The stacking functions are a good way to make an array out
>> of a list or tuple, but if what you have is already an array, some
>> combination of reshape() and transpose() is almost certainly what you
>> want. If you're doing something simple, that probably won't copy your
>> data; if it does copy your data, rearrange how you fill your arrays to
>> start with.
>>     
Ok, maybe I misunderstood.

In this example

    x = x0 + arange(nx, dtype=float32)*dx
    y = y0 + arange(ny, dtype=float32)*dy
    z = z0 + arange(nz, dtype=float32)*dz
    c = vstack([hstack([x]*ny*nz), \
                repeat(hstack([y]*nz), nx), \
                repeat(z, nx*ny)]).transpose().ravel()
 
using hstack()/vstack() is much faster than using array()
for nx,ny,nz = 200,250,300
>> at 7x200x200x200x8 bytes, the final array will be 427MiB, so any
>> duplication of it is liable to push you close to 2 GB. It's a bit
>> crazy to be working so close to the point at which you run out of RAM.
>>     
It's only for testing ;-)
What I don't understand for this example (200x200x200) is why it works 
fine with array and does not with vstack/hstack ?
?stack() methods duplicate data that array() don't ?
>> But you have a point; numpy's vector operations, if written in the
>> obvious way, can use far more memory than looping. The easiest
>> solution is a compromise: partially vectorize your function. Instead
>> of replacing all three loops by numpy vector operations, try replacing
>> only one or two. This will still cut your overhead drastically while
>> keeping the array sizes small.
>>     
Ok.
>> If you need more speed from this, it's going to take more work and
>> produce uglier code. You can rewrite the code so all matrix operations
>> are done in place (using += or ufunc output arguments), or there are
>> tools like numexpr, pyrex, or weave.
>>     
I'll think of this when I'll really want to optimize my code ;-)

Thanks.

Cheers,

-- 
http://scipy.org/FredericPetit




More information about the SciPy-User mailing list