Is shelve/dbm supposed to be this inefficient?

Marc 'BlackJack' Rintsch bj_666 at gmx.net
Thu Aug 2 01:53:44 EDT 2007


On Wed, 01 Aug 2007 15:47:21 -0800, Joshua J. Kugler wrote:

> My original data is 33MB.  When each row is converted to a Python list
> and inserted into a shelve DB, it balloons to 69MB.  Now, there is some
> additional data in there, namely a list of all the keys containing data
> (vs. the keys that contain version/file/config information), BUT if I
> copy all the data over to a dict and dump the dict to a file using
> cPickle, that file is only 49MB.  I'm using pickle protocol 2 in both
> cases.
> 
> Is this expected?  Is there really that much overhead to using shelve
> and dbm files?  Are there any similar solutions that are more
> space-efficient?  I'd use straight pickle.dump, but loading requires
> pulling the entire thing into memory, and I don't want to have to do
> that every time.

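For reference, the comparison being described can be sketched like this
(Python 2, to match the cPickle usage in the post; the row count and row
contents are invented stand-ins, since the real data isn't shown):

# Rough reproduction of the measurement above.  'rows.shelve' and
# 'rows.pickle' are arbitrary file names; the 100000 three-field rows
# are made up for illustration.
import glob
import os
import shelve
import whichdb
import cPickle as pickle

rows = dict(('key%06d' % i, [i, 'x' * 50, i * 0.5])
            for i in xrange(100000))

# Store every row in a shelf, pickle protocol 2 as in the original post.
db = shelve.open('rows.shelve', protocol=2)
for key, value in rows.iteritems():
    db[key] = value
db.close()

# Dump the same dict as one flat pickle, also protocol 2.
f = open('rows.pickle', 'wb')
pickle.dump(rows, f, 2)
f.close()

# Some dbm backends add an extension or split the data across several
# files (dumbdbm does), so sum over everything matching the base name.
shelve_size = sum(os.path.getsize(p) for p in glob.glob('rows.shelve*'))
print 'dbm backend:   ', whichdb.whichdb('rows.shelve')
print 'shelve on disk:', shelve_size
print 'pickle on disk:', os.path.getsize('rows.pickle')
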
You did not say how many records you store.  If the underlying DB used by
`shelve` works with a hash table, that kind of "bloat" is to be expected:
hash tables pre-allocate buckets and keep free slots so lookups stay fast.
It's a space vs. speed trade-off, then.
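
Which backend `shelve` actually picked can be checked with
`whichdb.whichdb()` as in the sketch above, and the choice can also be
forced by wrapping a specific dbm module in `shelve.Shelf`.  A sketch,
assuming the `gdbm` module is compiled in; its `reorganize()` call mainly
reclaims space after deletions, but it shows that some backends expose a
compaction knob:

# Bypass anydbm and pick the backend explicitly (here gdbm), then
# compact the file when done writing.  File name and rows are made up.
import os
import shelve
import gdbm

inner = gdbm.open('rows.gdbm', 'c')
db = shelve.Shelf(inner, protocol=2)
for i in xrange(100000):
    db['key%06d' % i] = [i, 'x' * 50, i * 0.5]
inner.reorganize()   # gdbm-specific: hand unused pages back to the OS
db.close()           # also closes the underlying gdbm file
print 'gdbm on disk:', os.path.getsize('rows.gdbm')

`shelve.BsdDbShelf` plays the same role for the bsddb btree and record
formats, if those are a better fit for the access pattern.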

Ciao,
	Marc 'BlackJack' Rintsch


