Large Dictionaries

Klaas mike.klaas at gmail.com
Wed May 24 19:42:38 EDT 2006


Chris:
> class StorageBerkeleyDB(StorageTest):
>    def runtest(self, number_hash):
>        db = bsddb.hashopen(None, flag='c', cachesize=8192)
>        for (num, wildcard_digits) in number_hash.keys():
>            key = '%d:%d' % (num, wildcard_digits)
>            db[key] = None
>        db.close()

BDBs can accomplish what you're looking to do, but they need to be
tuned carefully.  I won't get into too many details here, but you have
a few fatal flaws in that code.

1. 8 KB of cache is _pathetic_.  Give it a few hundred megs.  This is by
far your biggest problem.
2. Use BTREE unless you have a good reason to use HASH.
3. Use proper BDB environment creation instead of the hashopen API.
4. Insert your keys in sorted order.
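
As a rough sketch of the four points above (not a tuned or benchmarked
setup): the code below assumes the low-level db module, which is
bsddb.db in Python 2 or the third-party bsddb3 / berkeleydb packages
on Python 3.  The names make_sorted_keys, store_btree and numbers.db,
the 256 MB cache figure, and the empty-string values are all
illustrative assumptions, not part of the original code.

```python
def make_sorted_keys(number_hash):
    """Format keys as in the quoted code, then sort them as byte strings.

    A BTREE database compares keys byte by byte by default, so the
    formatted keys are sorted, not the numeric tuples.
    """
    keys = [('%d:%d' % (num, wildcard_digits)).encode('ascii')
            for (num, wildcard_digits) in number_hash]
    keys.sort()
    return keys


def store_btree(number_hash, home, cache_bytes=256 * 1024 * 1024):
    """Write the keys into a BTREE database inside a BDB environment.

    `home` must be an existing directory; it holds the environment's
    region files and the database file.
    """
    # Assumption: one of these bindings is installed.
    try:
        from bsddb import db          # Python 2 standard library
    except ImportError:
        try:
            from bsddb3 import db     # third-party, Python 3
        except ImportError:
            from berkeleydb import db  # third-party successor to bsddb3

    env = db.DBEnv()
    # The cache size must be set before the environment is opened.
    # Arguments are (gigabytes, bytes).
    env.set_cachesize(0, cache_bytes)
    env.open(home, db.DB_CREATE | db.DB_INIT_MPOOL)

    d = db.DB(env)
    d.open('numbers.db', dbtype=db.DB_BTREE, flags=db.DB_CREATE)
    try:
        for key in make_sorted_keys(number_hash):
            d.put(key, b'')  # empty value; only the keys matter here
    finally:
        d.close()
        env.close()
```

Note that the sort is on the formatted strings, so '100:0' comes before
'10:2', which comes before '9:1'.  That is the order the B-tree itself
uses unless a custom comparison function is installed.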

-Mike



