[SciPy-user] memory usage

Fri Aug 10 00:31:43 EDT 2007

On 09/08/07, Emanuele Zattin <emanuelez at gmail.com> wrote:
> yeah i noticed that as well.
> i think i'm on the right way now... i think this is all due to the
> heavy use of broadcasting i did in order to optimize performance... i
> might as well try to go with some inline C code with nested for loops.

If memory is your problem, rather than speed, I'd take a careful look
at how your numpy code is written. Minor rewriting of expressions can
often save a great deal of temporary space. (Memory allocation is
actually extremely fast, so unless you're *running out* of memory,
don't worry too much about it.)

It's worth distinguishing between temporary memory use - that is,
where the intermediate values in your calculation fill up all your RAM
but then disappear once the calculation is over - and memory use by
arrays you actually want.

Are the arrays you actually care about huge? If you've got 2000x2000x3
channels of float64s (numpy's default float), that's 96 MB per image;
I expect in a comet detection routine you're shuffling a number of
such things around? If this is the case, you're going to need to think
about your algorithm: can you get away with float32s, which are half
as big? Can you keep fewer images in memory at once?
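
To put numbers on that, here is a rough sketch of the size arithmetic.
The shape is the 2000x2000x3 example above; the names img64 and img32
are made up for illustration, and whether float32 precision is good
enough depends on your algorithm.

```python
import numpy as np

# Shape from the example above: 2000 x 2000 pixels, 3 channels.
shape = (2000, 2000, 3)

img64 = np.zeros(shape, dtype=np.float64)  # numpy's default float type
img32 = img64.astype(np.float32)           # same values, 4 bytes per element

# nbytes reports the size of the array data in bytes.
mb64 = img64.nbytes / 1e6
mb32 = img32.nbytes / 1e6
print("float64: %.0f MB, float32: %.0f MB" % (mb64, mb32))
```

Multiply by the number of images you hold at once to see whether the
arrays themselves, rather than the temporaries, are the problem.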

There are various fairly easy tricks to cut down on the creation of
temporaries. Bracketing expressions can make a huge difference:
2*(3*bigarray) makes a full-size temporary for 3*bigarray before
computing the result, while (2*3)*bigarray multiplies the scalars
first and allocates only the result (numpy has extremely limited scope
for expression optimization). The package numexpr is essentially an
expression compiler that will help with this sort of thing. You can
also use the output arguments of ufuncs: for example, you can apply
sine to an array in place, or add one array into an existing array.
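
A minimal sketch of the bracketing and output-argument tricks, assuming
a one-million-element bigarray; the names bigarray and work are only
for illustration.

```python
import numpy as np

bigarray = np.linspace(0.0, 1.0, 1000000)

# 3*bigarray is a full-size temporary; 2*(...) then allocates the result.
a = 2 * (3 * bigarray)

# 2*3 is an ordinary Python scalar, so only the result array is allocated.
b = (2 * 3) * bigarray

# Output arguments: the ufunc writes into an existing array
# instead of allocating a new one.
work = bigarray.copy()
np.sin(work, out=work)            # sine applied in place
np.add(work, bigarray, out=work)  # bigarray added into work, no temporary
```

The in-place forms overwrite their input, so they only help when you
no longer need the original values in work.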

Broadcasting isn't necessarily wasteful; A[...,newaxis] is just a view
and doesn't take any more space than A does. Of course, doing
something like (A[:,newaxis]*B[newaxis,:])*C does create a big array,
which you may be able to avoid. In particular, if you don't know about
dot, you should: for 1-d arrays dot(A,B) is sum(A*B) (it is a matrix
product for rank-2 arrays, and something analogous for higher ranks).
Not only does it avoid a large temporary array, it's implemented using
fast routines from BLAS/ATLAS.
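
A small sketch of the difference, with made-up shapes kept tiny so it
runs quickly; the names x, y, slow and fast are only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.random((50, 60))
B = rng.random((60, 40))

# Adding an axis gives a view onto the same data, not a copy.
view = A[..., np.newaxis]
is_view = np.shares_memory(A, view)

# Broadcasting route: builds a 50 x 60 x 40 temporary, then sums over
# the shared axis.
slow = (A[:, :, np.newaxis] * B[np.newaxis, :, :]).sum(axis=1)

# dot computes the same matrix product without the big temporary.
fast = np.dot(A, B)

# For 1-d arrays, dot is just the sum of the elementwise product.
x = rng.random(1000)
y = rng.random(1000)
inner = np.dot(x, y)
```

With real image-sized arrays the broadcast temporary grows as the
product of all three dimensions, which is usually where the memory
goes.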

Anne