Memory usage steadily going up while pickling objects

Dave Angel davea at davea.name
Fri Jun 14 21:52:11 EDT 2013


On 06/14/2013 07:04 PM, Giorgos Tzampanakis wrote:
> I have a program that saves lots (about 800k) objects into a shelve
> database (I'm using sqlite3dbm for this since all the default python dbm
> packages seem to be unreliable and effectively unusable, but this is
> another discussion).
>
> The process takes about 10-15 minutes. During that time I see memory usage
> steadily rising, sometimes resulting in a MemoryError. Now, there is a
> chance that my code is keeping unneeded references to the stored objects,
> but I have debugged it thoroughly and haven't found any.
>
> So I'm beginning to suspect that the pickle module might be keeping an
> internal cache of objects being pickled. Is this true?
>

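You can learn quite a bit by using the sys.getrefcount() function.  If
you think an object has only one reference (if it had none, it'd be
very hard to test), call sys.getrefcount() on it and check whether
your assumption is right.  The number it reports is one higher than
you might expect, because the argument passed to getrefcount() is
itself a temporary reference.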

Note that if the object is part of a complex object, there may be 
several mutual references, so the count may be more than you expect. 
But you can still check the count before and after calling the pickle 
stuff, and see if it has increased.
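
A minimal sketch of that before/after check, assuming CPython.  The
Record class is a hypothetical stand-in for one of your stored objects,
and pickle.dumps() stands in for whatever pickling call your code
actually makes:

```python
import pickle
import sys


class Record:
    """Hypothetical stand-in for one of the stored objects."""

    def __init__(self, ident):
        self.ident = ident
        self.payload = list(range(100))


obj = Record(1)

# Only the difference between the two readings matters here,
# not the absolute numbers.
before = sys.getrefcount(obj)
data = pickle.dumps(obj)
after = sys.getrefcount(obj)

print("before:", before, "after:", after)
if after > before:
    print("something kept a reference to obj during pickling")
```

If the two numbers differ in your real program, whatever ran between
the two calls is holding on to the object.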

Note that even if it has not, that doesn't prove you don't have a problem.

Could the problem be the sqlite stuff?  Can you disable that part of the 
logic, and see whether just creating the data still produces the leak?
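
One way to try that, as a rough sketch only: create and pickle the
objects but throw the results away instead of storing them, and watch
whether memory still grows.  make_object() and pickle_only() are
hypothetical names, not anything from your program, and tracemalloc
assumes Python 3.4 or later:

```python
import pickle
import tracemalloc


def make_object(i):
    # Hypothetical stand-in for the real data being stored.
    return {"id": i, "values": list(range(50))}


def pickle_only(count):
    """Create and pickle `count` objects without storing them anywhere."""
    total_bytes = 0
    for i in range(count):
        total_bytes += len(pickle.dumps(make_object(i)))
    return total_bytes


tracemalloc.start()
pickle_only(1000)  # warm-up, so one-time allocations are not counted
baseline, _ = tracemalloc.get_traced_memory()
pickle_only(20000)
current, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()

growth = current - baseline
print("growth after pickling only:", growth, "bytes")
```

If growth stays near zero here but the full program still climbs, the
database layer (or your own code around it) is the more likely suspect.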


-- 
DaveA


