Memory usage steadily going up while pickling objects

Peter Otten __peter__ at web.de
Sat Jun 15 02:37:49 EDT 2013


Giorgos Tzampanakis wrote:

> I have a program that saves lots (about 800k) of objects into a shelve
> database (I'm using sqlite3dbm for this since all the default Python dbm
> packages seem to be unreliable and effectively unusable, but this is
> another discussion).
> 
> The process takes about 10-15 minutes. During that time I see memory usage
> steadily rising, sometimes resulting in a MemoryError. Now, there is a
> chance that my code is keeping unneeded references to the stored objects,
> but I have debugged it thoroughly and haven't found any.
> 
> So I'm beginning to suspect that the pickle module might be keeping an
> internal cache of objects being pickled. Is this true?

Pickler/Unpickler objects do use an internal cache (the memo) to maintain
object identity, but at least shelve in the standard library creates a new
Pickler/Unpickler for each set/get operation, so that cache never outlives a
single store or load.
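
To see what that memo does: within one Pickler, an object that occurs more
than once is written out only once and referenced afterwards, which is also
what preserves identity on unpickling. A minimal sketch with plain pickle,
no shelve involved:

>>> import pickle
>>> a = [1, 2, 3]
>>> x, y = pickle.loads(pickle.dumps((a, a)))
>>> x is y  # one Pickler, one memo: identity preserved
True
>>> pickle.loads(pickle.dumps(a)) is pickle.loads(pickle.dumps(a))
False

Two separate dumps() calls, which is what shelve does per key, each start
with an empty memo, so nothing can pile up there across your 800k stores.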

I don't have sqlite3dbm, but you can try the following:

>>> import shelve
>>> class A: pass
... 
>>> a = A()
>>> s = shelve.open("tmp.shelve")
>>> s["x"] = s["y"] = a
>>> s["x"] is s["y"]
False

If you are getting True there, a cache must be involved. One way to enable
such a cache yourself is shelve's writeback option:

>>> s = shelve.open("tmp.shelve", writeback=True)
>>> s["x"] = s["y"] = a
>>> s["x"] is s["y"]
True

You didn't do that, I guess?
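
If you had, that alone would explain the symptom: with writeback=True the
shelf keeps every entry it has seen in an in-memory cache until you call
sync() or close(), so storing 800k objects makes memory grow with every
write. You can watch it happen (cache is an internal attribute of
shelve.Shelf, peeked at here only for illustration):

>>> s = shelve.open("tmp.shelve", writeback=True)
>>> for i in range(1000):
...     s["key%d" % i] = i
... 
>>> len(s.cache)  # every stored entry is still held in memory
1000
>>> s.sync()  # writes the cached entries back to disk and empties the cache
>>> len(s.cache)
0
>>> s.close()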



