Memory usage steadily going up while pickling objects
Peter Otten
__peter__ at web.de
Sat Jun 15 02:37:49 EDT 2013
Giorgos Tzampanakis wrote:
> I have a program that saves lots (about 800k) objects into a shelve
> database (I'm using sqlite3dbm for this since all the default python dbm
> packages seem to be unreliable and effectively unusable, but this is
> another discussion).
>
> The process takes about 10-15 minutes. During that time I see memory usage
> steadily rising, sometimes resulting in a MemoryError. Now, there is a
> chance that my code is keeping unneeded references to the stored objects,
> but I have debugged it thoroughly and haven't found any.
>
> So I'm beginning to suspect that the pickle module might be keeping an
> internal cache of objects being pickled. Is this true?
Pickler/Unpickler objects use a cache to maintain object identity, but at
least shelve in the standard library uses a new Pickler/Unpickler for each
set/get operation.
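For what it's worth, the memo behavior is easy to observe with a plain Pickler. The sketch below (sizes are just illustrative) shows that a long-lived Pickler writes only a tiny back-reference for an object it has already seen, and that `clear_memo()` resets it; shelve sidesteps the growth by creating a fresh Pickler per store:

```python
import io
import pickle

# A long-lived Pickler memoizes every object it has pickled, so its
# memory footprint grows with the number of distinct objects stored.
data = {"key": "x" * 1000}
buf = io.BytesIO()
p = pickle.Pickler(buf)

p.dump(data)
first = buf.tell()            # full pickle of the dict

p.dump(data)                  # already in the memo: only a back-reference
second = buf.tell() - first

p.clear_memo()                # drop the cache, as a fresh Pickler would
p.dump(data)
third = buf.tell() - first - second

print(first, second, third)   # second is tiny; third matches first
```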
I don't have sqlite3dbm, but you can try the following:
>>> import shelve
>>> class A: pass
...
>>> a = A()
>>> s = shelve.open("tmp.shelve")
>>> s["x"] = s["y"] = a
>>> s["x"] is s["y"]
False
If you are getting True there must be a cache. One way to enable a cache
yourself is writeback:
>>> s = shelve.open("tmp.shelve", writeback=True)
>>> s["x"] = s["y"] = a
>>> s["x"] is s["y"]
True
You aren't doing that, are you?