Limit on entries in dictionary data structure

MRAB python at mrabarnett.plus.com
Sun Jan 30 22:52:12 EST 2011


On 31/01/2011 02:43, Shruti Sanadhya wrote:
> Hi,
>
> I am running a script that uses dictionaries on Python 2.6.4 on Ubuntu
> 9.10. I notice that my script crashes with a MemoryError when my
> dictionary reaches 44739243th entry. My system has 3GB RAM (32-bit). I
> noticed that changing the key or value types also did not help my code.
> For simplicity I tried running this:
>
> #BEGIN:python code
> import sys
> f={}
> lsize=0
> try:
>      for k in range(0,4*44739243):
>          f[k]=k
>          if sys.getsizeof(f) > lsize:
>              lsize = sys.getsizeof(f)
>              print k, lsize
> except:
>      print k, lsize
>      raise
> #END:python code
>
> The code terminates with:
> "Traceback (most recent call last):
>    File "pydict-limit.py", line 6, in <module>
>      f[k]=k
> MemoryError"
>
> I have also tried running this on Ubuntu 9.10 with Python 2.6.6 with
> 3.5GB RAM (32-bit) and a 64-bit Linux cluster machine with 62GB RAM and
> Python 2.4. On both these machines it crashed at entry 44739243. The
> size of the data structure grows to 1.5GB. On another 64-bit cluster
> machine with 32GB RAM and Python 2.6.6 it crashed at entry 178956970. In
> this case the size of the data structure grew to 6GB.
>
> Has anybody encountered this before? Can somebody suggest any fix for
> this? I am trying to read some 10GB network traces into a hash
> table(hence dictionary) and need constant time lookup. Clearly
> increasing the machine RAM does not work. Any suggestions would be
> greatly appreciated.
>
sys.getsizeof(...) returns the size of the object itself, not the size
of that object plus the objects it references.

The dict in your example code occupied 1.5GB, but that figure covers
only the dict's own table of references. The int objects used as keys
and values take up additional memory on top of it.
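
A rough way to see the difference is to add the sizes of the keys and
values to the size of the dict itself. This is only a sketch: the helper
name shallow_and_deep_size is made up for illustration, it looks one
level deep (enough for ints), and the exact numbers depend on the Python
version and build.

```python
import sys

def shallow_and_deep_size(d):
    """Return (shallow, deep) byte counts for a dict of simple objects.

    shallow is what sys.getsizeof reports for the dict alone; deep adds
    the size of each distinct key and value object, one level down.
    """
    shallow = sys.getsizeof(d)
    deep = shallow
    seen = set()  # ids of objects already counted, to avoid double counting
    for k, v in d.items():
        for obj in (k, v):
            if id(obj) not in seen:
                seen.add(id(obj))
                deep += sys.getsizeof(obj)
    return shallow, deep

# A much smaller version of the dict from the original script.
d = dict((k, k) for k in range(100000))
shallow, deep = shallow_and_deep_size(d)
print("dict only:          %d bytes" % shallow)
print("dict + keys/values: %d bytes" % deep)
```

The second figure is the one that matters when estimating how much
memory the data will really need.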

As for the 64-bit Linux cluster machine, did you run a 32-bit or a
64-bit build of Python? A 32-bit build can't address more than 4GB
(2**32 bytes), no matter how much RAM the machine has.
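
One way to check which kind of build you are running is to look at the
size of a C pointer as the interpreter sees it. This is a small sketch
using the standard struct module; sys.maxsize gives the same answer on
Python 2.6 and later but is not available on 2.4.

```python
import struct

# "P" is the struct format code for a C pointer (void *), so its size
# in bytes tells you whether the interpreter is a 32-bit or 64-bit build.
pointer_bits = struct.calcsize("P") * 8

print("This Python is a %d-bit build" % pointer_bits)
print("Address space limit: 2**%d bytes" % pointer_bits)
```

If this reports 32 on the 64-bit machine, the process is limited to the
32-bit address space however much RAM is installed.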


