my computer is allergic to pickles

Peter Otten __peter__ at web.de
Wed Mar 9 15:30:01 EST 2011


Bob Fnord wrote:

> I'm using python to do some log file analysis and I need to store
> on disk a very large dict with tuples of strings as keys and
> lists of strings and numbers as values.
> 
> I started by using cPickle to save the instance of the class that
> contained this dict, but the pickling process started to write
> the file but ate so much memory that my computer (4 GB RAM)
> crashed so badly that I had to press the reset button. I've never
> seen out-of-memory errors do this before. Is this normal?
> 
> (I know from the output that got written before the crash that my
> program had finished building the dict and started the
> pickle. When I tried running the other program that reads the
> pickle and analyzes the data in it, it gave an error because the
> file was incomplete. So I know where in my code the crash
> happened.)
> 
> From searching the web, I get the impression that pickle uses a
> lot of memory because it checks for recursion and other things
> that could break other serialization methods. So I've switched to
> using marshal to save the dict itself (the only persistent thing
> in the class, which just has convenience methods for adding data
> to the dict and searching it for the second stage of analysis).
> 
> I found some references to h5 tables for getting around the
> pickling memory problem, but I got the impression they only work
> with fixed columns, not a somewhat complex data structure like
> mine.
> 
> Any comments, suggestions?

Have you seen this thread?

http://mail.python.org/pipermail/python-list/2008-July/1139855.html
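
In case it helps, here is a minimal sketch (not taken from that thread) of
one way to keep memory bounded: pickle the dict one item at a time instead
of in a single call, so each call starts with an empty memo. The names
dump_items and load_items are hypothetical, and the code assumes Python 3's
pickle module; on Python 2, cPickle has the same dump and load calls, and
d.iteritems() avoids building a list of all items first.

```python
import pickle


def dump_items(d, path):
    """Write each (key, value) pair of d as its own pickle record."""
    with open(path, "wb") as f:
        for item in d.items():
            # Every module-level pickle.dump() call creates a new pickler
            # with an empty memo, so the memo never spans the whole dict.
            pickle.dump(item, f, pickle.HIGHEST_PROTOCOL)


def load_items(path):
    """Yield the (key, value) pairs back, one record at a time."""
    with open(path, "rb") as f:
        while True:
            try:
                yield pickle.load(f)
            except EOFError:
                # No more records in the file.
                break
```

The analysis program could rebuild the whole dict with
dict(load_items(path)), or loop over load_items(path) directly if it only
needs one pass over the data. The trade-off is that objects shared between
different items are written once per item rather than once overall.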


