very large dictionaries
William Park
opengeometry at yahoo.ca
Wed Jun 16 15:40:49 EDT 2004
robin <escalation746 at yahoo.com> wrote:
> I need to do a search through about 50 million records, each of which
> are less than 100 bytes wide. A database is actually too slow for
> this, so I thought of optimising the data and putting it all in
> memory.
>
> There is a single key field, so a dictionary is an obvious choice for
> a structure, since Python optimises these nicely.
>
> But is there a better choice? Is it worth building some sort of tree?
50M x 100 bytes = 5000 MB = 5 GB. Do you have 5 GB of memory? And that
is the raw data alone, before any dictionary overhead.
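To make the estimate concrete, here is a rough sketch of the arithmetic. The record count and width are the figures from the question; the sys.getsizeof() check is only a CPython-specific hint at per-object overhead, not a measurement of a real 50-million-entry dictionary.

```python
import sys

# Figures from the original question (upper bound on record width).
records = 50_000_000
bytes_per_record = 100

# Raw payload only: no keys, no hash table, no object headers.
raw_bytes = records * bytes_per_record
raw_gb = raw_bytes / 10**9
print(f"raw data: {raw_gb:.1f} GB")

# On CPython, a 100-byte value stored as a bytes object occupies more
# than 100 bytes, so an in-memory dict needs more than the raw total.
sample_value = b"x" * bytes_per_record
per_object_overhead = sys.getsizeof(sample_value) - bytes_per_record
print(f"extra bytes per value object: {per_object_overhead}")
```

The exact overhead depends on the interpreter build, so treat the second number as indicative only.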
Since you are talking about key/value records, you can choose from GDBM
(gdbm), Berkeley DB (dbhash), or a disk-based dictionary front-end
(shelve). You can now access a GDBM database from the Bash shell, too. :-)
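As a minimal sketch of the shelve option, assuming a current Python 3 (where gdbm lives under dbm.gnu and dbhash no longer exists), something like the following keeps the records on disk and looks them up by key. The helper names build_store and lookup, the sample records, and the temporary path are hypothetical and only there for illustration; shelve picks whichever dbm backend is available on your system.

```python
import os
import shelve
import tempfile


def build_store(path, records):
    """Write (key, value) pairs to a new disk-backed shelf at `path`."""
    # flag="n" always creates a fresh, empty database.
    with shelve.open(path, flag="n") as db:
        for key, value in records:
            db[key] = value  # shelve keys must be strings


def lookup(path, key, default=None):
    """Fetch one record by key without loading the whole store."""
    # flag="r" opens the existing database read-only.
    with shelve.open(path, flag="r") as db:
        return db.get(key, default)


# Tiny demonstration with made-up records in a throwaway directory.
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "records")
    build_store(path, [("alice", "record for alice"),
                       ("bob", "record for bob")])
    found = lookup(path, "alice")
    missing = lookup(path, "carol")
    print(found, missing)
```

Values are pickled on the way in and out, so this is slower than an in-memory dictionary, but the working set no longer has to fit in RAM. Whether it beats the database you already tried is something only a benchmark on your data can tell.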
--
William Park, Open Geometry Consulting, <opengeometry at yahoo.ca>
No, I will not fix your computer! I'll reformat your harddisk, though.