NumPy Slow ??
David M. Cooke
cookedm at physics.mcmaster.ca
Mon Sep 18 18:28:30 EDT 2000
At some point, hdemers at venus.astro.umontreal.ca (Hugues Demers) wrote:
> Why is it that this line takes ~30 sec to execute on a 500 MHz Pentium III
> with 128 Meg RAM?
>
> data1 = choose(greater(data,z2),(data,z2))
>
> where data is a 2048x2080 array of float and z2 is a float.
>
> I thought that NumPy functions were written in C for faster execution. Or
> maybe I'm wrong and this is fast execution?
Was that a C float or a NumPy Float (which is a C double)? I'll assume
Float.
The code below took ~2 sec on my 550 MHz PIII with 256 Meg RAM:
>>> from Numeric import *
>>> from RandomArray import *
>>> data = random( (2048, 2080) )
>>> data1 = choose(greater(data, 0.5), (data, 0.5))
Note that greater creates a new array (of longs), and so does choose
(of doubles). There are 2048*2080=4 259 840 elements per array. A
double is 8 bytes and a long is 4, so the total memory taken by the
three arrays is 2048*2080*(8+4+8)= 81.25Meg. It's likely then that
your machine has to swap some of that in and out. At the end, though,
the array created by greater should be garbage collected.
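The greater/choose pair behaves like this on a tiny array (a sketch using NumPy, Numeric's modern successor, where both functions keep the same semantics; the values here are made up for illustration):

```python
import numpy as np

data = np.array([0.2, 0.7, 0.4, 0.9])
mask = np.greater(data, 0.5)           # intermediate array of 0/1 flags
data1 = np.choose(mask, (data, 0.5))   # pick data where 0, the cap 0.5 where 1
print(data1)                           # → [0.2 0.5 0.4 0.5]
```

Each step allocates a fresh array, which is exactly where the memory pressure on a 2048x2080 input comes from.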
Indeed, if I save the array created by greater, I can see that python
is using 83Meg of memory.
Using a Python loop, this took ~15 sec, without creating an intermediate
array:
>>> from copy import copy
>>> data1 = copy(data) # this is fast
>>> d1f = data1.flat
>>> for i in xrange(0, d1f.shape[0]):
...     if d1f[i] > 0.5: d1f[i] = 0.5
Obviously, you could also do this in place. If you really need more
speed, write a C extension.
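Before reaching for C, it's worth noting the in-place route: with modern NumPy (Numeric's successor, an assumption here since the original post predates it), the whole clamp can be done without allocating any intermediate array by writing the result back into the input:

```python
import numpy as np

# Same 2048x2080 array of doubles as in the post.
data1 = np.random.random((2048, 2080))

# Clamp in place: np.minimum with out= overwrites data1 directly,
# so no intermediate long or double arrays are created.
np.minimum(data1, 0.5, out=data1)
```

This keeps peak memory at one array instead of three, which on a 128 Meg machine is the difference between running in RAM and swapping.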
Moral of the story: to deal with a lot of data quickly, you need a lot
of memory.
--
|>|\/|<
----------------------------------------------------------------------------
David M. Cooke
cookedm at mcmaster.ca