dict vs kjBuckets vs ???

Tim Peters tim_one at email.msn.com
Thu Jun 10 22:23:15 EDT 1999


[MK]
> In some book on algorithms I've read that after inserting a limited
> number of items, the performance of operations on hash tables
> drops dramatically.

Depends on the details.  What you read is true of some kinds of hash tables.
Python's dicts dynamically expand to keep the "load factor" under 2/3, so
what you read isn't applicable to Python in normal use.
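
A rough way to watch that expansion happen, assuming CPython, where
sys.getsizeof(d) reports the dict's own allocation: record the item counts
at which the reported size changes.  The helper name resize_points is made
up for illustration, and the exact growth schedule is an implementation
detail that differs between versions.

```python
import sys


def resize_points(n):
    """Return the item counts at which the dict's reported size changed.

    Assumption: CPython, where sys.getsizeof(d) tracks the size of the
    dict's internal table, so a change in size signals a table resize.
    """
    d = {}
    last_size = sys.getsizeof(d)
    points = []
    for i in range(n):
        d[i] = None
        size = sys.getsizeof(d)
        if size != last_size:
            # The table was reallocated while inserting this item.
            points.append(len(d))
            last_size = size
    return points


points = resize_points(100_000)
print(points)
```

The list should come out short relative to the number of inserts, since
the table grows in big steps rather than one slot at a time.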

> I plan to write a program that would store lots (in range of 10M or even
> more) of relatively small objects (a few hundred bytes at most), so what
> do you think I should use?

Let's do a little math <wink>:  10M objects * 100 bytes = ?, a lower bound on what you're
contemplating.  Do you have gigabytes of RAM?
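
Spelled out as a sketch: the figures are the ones from the question (10M
objects, 100 bytes as the low end of "a few hundred bytes"), and the
variable names are just for illustration.  The per-object overhead line
assumes CPython and only measures a bare bytes object, not the dict slot
or the key that would go with it.

```python
import sys

n_objects = 10_000_000      # "in range of 10M or even more"
bytes_per_object = 100      # low end of "a few hundred bytes"

# Raw payload only, before any interpreter overhead.
payload_bytes = n_objects * bytes_per_object
print(payload_bytes, "bytes of payload")
print(payload_bytes / 2**30, "GiB of payload")

# Assumption: CPython. Each object also carries a header, so the
# real footprint of a 100-byte payload is larger than 100 bytes.
overhead = sys.getsizeof(bytes(bytes_per_object)) - bytes_per_object
print(overhead, "bytes of per-object overhead for a bytes object")
```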

> I thought about dictionaries, kjBuckets, or maybe even a library called
> Metakit for Python (http://www.equi4.com/metakit/info/README-Python.html).
>
> what-do-you-think-ly y'rs

You don't really want to know <wink>.  Memory-based data structures aren't
going to work for the size of thing you have in mind.  If you can make it
fly at all, you'll likely require a powerful database, so of those choices
Metakit is the only approach that's not dead on arrival.
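
To make the memory-vs-disk distinction concrete, here is a minimal sketch
of a disk-backed mapping.  It is not Metakit: it uses the standard
library's shelve module as a stand-in, and the function names, key format,
and record contents are invented for the example.  It shows only that
records can live in a file and be fetched by key without holding the whole
collection in RAM; it says nothing about how either tool performs at 10M
records.

```python
import os
import shelve
import tempfile


def build_store(path, n):
    """Write n small records to a disk-backed mapping at path."""
    with shelve.open(path) as db:
        for i in range(n):
            # Hypothetical record: 100 bytes of filler per key.
            db[f"key{i}"] = b"x" * 100


def lookup(path, key):
    """Reopen the store and fetch a single record by key."""
    with shelve.open(path) as db:
        return db[key]


tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, "objects")

build_store(path, 1000)
record = lookup(path, "key42")
print(len(record))
```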

better-still-write-it-in-perl<wink>-ly y'rs  - tim





