Large Dictionaries

Paul McGuire ptmcg at austin.rr._bogus_.com
Mon May 15 14:15:14 EDT 2006


"Claudio Grondi" <claudio.grondi at freenet.de> wrote in message
news:e49va0$m72$1 at newsreader3.netcologne.de...
> Chris Foote wrote:
> > Hi all.
> >
> > I have the need to store a large (10M) number of keys in a hash table,
> > based on a tuple of (long_integer, integer).  The standard python
> > dictionary works well for small numbers of keys, but starts to
> > perform badly for me inserting roughly 5M keys:
> >
> > # keys   dictionary  metakit   (both using psyco)
> > ------   ----------  -------
> > 1M            8.8s     22.2s
> > 2M           24.0s     43.7s
> > 5M          115.3s    105.4s
> >
> > Has anyone written a fast hash module better suited to
> > large datasets ?
> >
> > p.s. Disk-based DBs are out of the question because most
> > key lookups will result in a miss, and lookup time is
> > critical for this application.
> >
> > Cheers,
> > Chris
> Python Bindings (\Python24\Lib\bsddb vers. 4.3.0) and the DLL for
> BerkeleyDB (\Python24\DLLs\_bsddb.pyd vers. 4.2.52) are included in the
> standard Python 2.4 distribution.
>
> "Berkeley DB was 20 times faster than other databases. It has the
> operational speed of a main memory database, the startup and shut down
> speed of a disk-resident database, and does not have the overhead of
> a client-server inter-process communication."
> Ray Van Tassle, Senior Staff Engineer, Motorola
>
> Please let me/us know if it is what you are looking for.
>
> Claudio
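Before reaching for a database at all, it may be worth squeezing the plain
dict further: since the keys are (long_integer, integer) tuples, packing each
pair into a single int avoids allocating and hashing a tuple object per key,
which can cut both memory and insert time. A rough sketch, assuming the
integer component fits in 32 unsigned bits (an assumption about the data,
not something stated in the original post):

```python
# Pack a (long_integer, integer) key into one int. The shift width of
# 32 bits is an assumption about the range of the second component.
def pack(hi, lo):
    return (hi << 32) | (lo & 0xFFFFFFFF)

d = {}
d[pack(123456789012345, 42)] = "some value"

print(pack(123456789012345, 42) in d)   # hit
print(pack(123456789012345, 43) in d)   # miss
```

Lookups then pack the query key the same way; a miss is just a failed
`in` test, so there is no per-miss cost beyond the hash probe.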

sqlite also supports an in-memory database - use pysqlite
(http://initd.org/tracker/pysqlite/wiki) to access this from Python.
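(The pysqlite bindings were later folded into the standard library as the
`sqlite3` module, from Python 2.5 on.) A minimal sketch of an in-memory
table keyed on the (long_integer, integer) pair -- illustrative only, not a
benchmark:

```python
import sqlite3

# ":memory:" gives a purely in-memory database; nothing touches disk.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE kv ("
    "  hi INTEGER, lo INTEGER, value INTEGER,"
    "  PRIMARY KEY (hi, lo)"
    ")"
)

# Bulk-insert (long_integer, integer) -> value rows in one call.
rows = [(i * 2**32, i % 1000, i) for i in range(10000)]
conn.executemany("INSERT INTO kv VALUES (?, ?, ?)", rows)

# Point lookup via the primary-key index; a miss returns no row.
hit = conn.execute(
    "SELECT value FROM kv WHERE hi = ? AND lo = ?",
    (5 * 2**32, 5),
).fetchone()          # (5,)
miss = conn.execute(
    "SELECT value FROM kv WHERE hi = ? AND lo = ?",
    (1, 1),
).fetchone()          # None
```

The composite PRIMARY KEY gives indexed lookups on the pair, and misses
cost only an index probe, which matters here since most lookups miss.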

-- Paul
