very large dictionaries

William Park opengeometry at yahoo.ca
Wed Jun 16 15:40:49 EDT 2004


robin <escalation746 at yahoo.com> wrote:
> I need to do a search through about 50 million records, each of which
> are less than 100 bytes wide. A database is actually too slow for
> this, so I thought of optimising the data and putting it all in
> memory.
> 
> There is a single key field, so a dictionary is an obvious choice for
> a structure, since Python optimises these nicely.
> 
> But is there a better choice? Is it worth building some sort of tree?

50M x 100 bytes = 5000M bytes = 5 GB.  Have you got 5 GB of memory?

Since you are talking about key/value records, you can choose from GDBM
(gdbm), Berkeley DB (dbhash), or a disk-based dictionary front-end
(shelve).  You can now access a GDBM database from the Bash shell. :-)
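
As a rough sketch of the shelve route (file name, key, and record
contents below are made up for illustration, not from your data):

    import shelve

    # Build the disk-based dictionary once; flag='c' creates the file
    # if it does not exist.  (Hypothetical file name.)
    db = shelve.open('records.db', flag='c')
    db['some_key'] = 'record data, up to ~100 bytes'
    db.close()

    # Later, look up records by key without loading all 50M of them
    # into RAM; flag='r' opens read-only.
    db = shelve.open('records.db', flag='r')
    print(db.get('some_key'))
    db.close()

The lookups hit disk, so they are slower than an in-memory dict, but
the dbm back-ends keep per-key access roughly constant-time.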

-- 
William Park, Open Geometry Consulting, <opengeometry at yahoo.ca>
No, I will not fix your computer!  I'll reformat your harddisk, though.
