[SciPy-user] Arrays and strange memory usage ...

Robert Kern robert.kern at gmail.com
Tue Sep 2 19:44:14 EDT 2008


On Tue, Sep 2, 2008 at 18:19, David Cournapeau <cournape at gmail.com> wrote:
> On Wed, Sep 3, 2008 at 2:11 AM, christophe grimault
> <christophe.grimault at novagrid.com> wrote:
>> Hi,
>>
>> I have an application that is very demanding in memory resources. So I
>> started to look closer at python + numpy/scipy as far as memory is
>> concerned.
>
> If you are really tight on memory, you will have problems with python
> and most programming languages which do not let you control memory in a
> fine grained manner. Now, it depends on what you mean by memory
> demanding: if you have barely enough memory for holding your data, it
> will be extremely difficult to do it in python, and difficult to do in
> any language, including C and other manually managed languages.
>
>>
>> I can't explain the following :
>>
>> I start my python, + import scipy. A 'top' in the console shows that :
>>
>>  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME COMMAND
>> 14791 grimault  20   0 21624 8044 3200 S    0  0.4   0:00.43 python
>>
>> Now after typing :
>>
>> z = scipy.arange(1000000)
>>
>> I get :
>> 14791 grimault  20   0 25532  11m 3204 S    0  0.6   0:00.44 python
>>
>> So the memory increased by ~ 7 Mb. I was expecting 4 Mb since the data
>> type is int32, giving 4*1000000 = 4 Mb of memory chunk (in C/C++ at
>> least).
>
> a = scipy.arange(1e6)
> a.itemsize * a.size
>
> Gives me 8e6 bytes. arange is float64 by default, and I get a similar
> memory increase (~ 8Mb).

No, the default is int (int32 on 32-bit systems, int64 on most 64-bit
systems) if you give it integer arguments and float64 if you give it
float arguments.
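
A quick way to check this on your own machine is sketched below. It assumes
a reasonably recent numpy imported as np; the variable names are only for
illustration, and the integer width you see depends on your platform.

```python
import numpy as np

# Integer argument: the result uses the platform's default integer type
# (int32 or int64, depending on the system).
int_arr = np.arange(1000000)

# Float argument: the result is float64, 8 bytes per element.
float_arr = np.arange(1e6)

# nbytes is the size of the data buffer, i.e. itemsize * size.
print(int_arr.dtype, int_arr.itemsize, int_arr.nbytes)
print(float_arr.dtype, float_arr.itemsize, float_arr.nbytes)
```

Comparing nbytes against what top reports separates the size of the array
data from whatever else the process has allocated.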

>> It gets even worse with complex float. I tried :
>> z = arange(1000000) + 1j*arange(1000000)
>>
>> Expecting 8 Mb,
>
> Again, this is strange, it should default to float128. Which version
> of numpy/scipy are you using ?

You mean complex128.

One thing to be aware of is that there are temporaries involved.
1j*arange(1000000) allocates almost 16 MB of memory just by itself
(complex128 takes 16 bytes per element), and the addition then allocates
another 16 MB for the result. When an object gets deallocated, its memory
may not be returned to the OS, although it will be reused by Python.
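
If the complex temporary matters, one possible workaround (a sketch only,
not something from the original question; the names n, z and z2 are made up
for illustration) is to allocate the complex result once and fill its real
and imaginary parts in place:

```python
import numpy as np

n = 1000000

# One-liner from the thread: 1j * arange(n) is a complex128 temporary,
# and the addition allocates a second complex128 array for the result.
z = np.arange(n) + 1j * np.arange(n)
print(z.dtype, z.nbytes)

# Alternative: allocate the complex128 result once, then assign the real
# and imaginary parts in place. The arange calls still create integer
# temporaries, but no extra complex-sized array is allocated.
z2 = np.empty(n, dtype=np.complex128)
z2.real = np.arange(n)
z2.imag = np.arange(n)
print(z2.dtype, z2.nbytes)
```

Both arrays hold the same values; the second form only lowers the peak
allocation, and whether top shows a difference still depends on how the
allocator returns memory to the OS.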

FWIW, here is what I get with SVN numpy on OS X:

>>> import numpy
45564 Python       0.0%  0:00.49   1    16    127 5172K  1292K  7960K    28M

>>> a = numpy.arange(1000000)
45564 Python       0.0%  0:00.50   1    16    128 9092K  1292K    12M    32M

>>> z = numpy.arange(1000000) + 1j * numpy.arange(1000000)
45564 Python       0.0%  0:00.60   1    16    129   24M  1292K    27M    47M

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
 -- Umberto Eco


