very large dictionary

Steven D'Aprano steve at REMOVE-THIS-cybersource.com.au
Mon Aug 4 18:52:43 EDT 2008


On Mon, 04 Aug 2008 07:02:16 -0700, Simon Strobl wrote:

> I created a python file that contained the dictionary. The size of this
> file was 6.8GB. 

Ah, that's what I thought you had done. That's not a dictionary. That's a 
text file containing the Python code to create a dictionary.
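
Just so we're talking about the same thing, I assume the file looks
something like this, only enormously bigger. Here's the pattern at toy
scale (file and variable names made up, untested):

# Generate a module whose entire body is one dict literal, then import
# it -- the same pattern as your 6.8GB file, but tiny.
f = open("bigdict.py", "w")
f.write("data = {\n")
for i in range(1000):
    f.write("    'key%d': %d,\n" % (i, i))
f.write("}\n")
f.close()

import bigdict        # Python has to parse and execute the whole source
print(len(bigdict.data))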

My guess is that a 7GB text file will require significantly more memory 
once converted to an actual dictionary: in my earlier post, I estimated 
about 5GB for pointers. Total size of the dictionary is impossible to 
estimate accurately without more information, but I'd guess that 10GB or  
20GB wouldn't be unreasonable.
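
If you want a less hand-wavy figure, you could measure a sample of your
data at a smaller scale and extrapolate. Something like this untested
sketch (needs sys.getsizeof, which is new in 2.6; keys here are made up):

import sys

# sys.getsizeof(d) reports only the dict's hash table (the array of
# entries holding pointers), not the keys and values themselves, so
# those have to be added separately. Shared or cached objects make
# this an over-estimate, but it's good enough for a ballpark figure.
d = dict(("key%d" % i, i) for i in range(1000000))

table = sys.getsizeof(d)
keys = sum(sys.getsizeof(k) for k in d)
values = sum(sys.getsizeof(v) for v in d.values())

print("hash table: %.1f MB" % (table / 1e6))
print("keys:       %.1f MB" % (keys / 1e6))
print("values:     %.1f MB" % (values / 1e6))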

Have you considered that the operating system imposes per-process limits 
on memory usage? You say that your server has 128 GB of memory, but that 
doesn't mean the OS will make anything like that available.
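
On Unix you can at least check what limits the kernel is enforcing on
your process with the resource module (untested sketch; RLIMIT_AS isn't
available on every platform):

import resource

# Per-process limits as enforced by the kernel, in bytes;
# resource.RLIM_INFINITY means "no limit set".
soft, hard = resource.getrlimit(resource.RLIMIT_AS)      # total address space
print("address space: soft=%r hard=%r" % (soft, hard))

soft, hard = resource.getrlimit(resource.RLIMIT_DATA)    # heap / data segment
print("data segment:  soft=%r hard=%r" % (soft, hard))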

And I don't know how to even start estimating how much temporary memory 
is required to parse and build such an enormous Python program. Not only 
is it a 7GB program, but it is 7GB in one statement.


> I thought it would be practical not to create the
> dictionary from a text file each time I needed it. I.e. I thought
> loading the .pyc-file should be faster. Yet, Python failed to create a
> .pyc-file

Probably a good example of premature optimization: the caching trick has 
turned out to be more trouble than the problem it was meant to solve. Out 
of curiosity, how long does it take to build the dictionary from a text 
file?
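
For comparison, building the dict directly from a flat text file is only
a few lines. Untested, and I'm guessing at your data format (one
tab-separated word/count pair per line, made-up file name):

import time

start = time.time()
freq = {}
for line in open("wordlist.txt"):       # made-up file name
    word, count = line.split("\t")
    freq[word] = int(count)             # int() ignores the trailing newline
print("built %d entries in %.1f seconds" % (len(freq), time.time() - start))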



-- 
Steven


