Populating a dictionary, fast [SOLVED SOLVED]

Chris Mellon arkanes at gmail.com
Thu Nov 15 11:51:08 EST 2007


On Nov 14, 2007 5:26 PM, Steven D'Aprano
<steve at remove-this-cybersource.com.au> wrote:
> On Wed, 14 Nov 2007 18:16:25 +0100, Hrvoje Niksic wrote:
>
> > Aaron Watters <aaron.watters at gmail.com> writes:
> >
> >> On Nov 12, 12:46 pm, "Michael Bacarella" <m... at gpshopper.com> wrote:
> >>>
> >>> > It takes about 20 seconds for me. It's possible it's related to
> >>> > int/long
> >>> > unification - try using Python 2.5. If you can't switch to 2.5, try
> >>> > using string keys instead of longs.
> >>>
> >>> Yes, this was it.  It ran *very* fast on Python v2.5.
> >>
> >> Um.  Is this the take away from this thread?  Longs as dictionary keys
> >> are bad?  Only for older versions of Python?
> >
> > It sounds like Python 2.4 (and previous versions) had a bug when
> > populating large dicts on 64-bit architectures.
>
> No, I found very similar behaviour with Python 2.5.
>
>
> >> Someone please summarize.
> >
> > Yes, that would be good.
>
>
> On systems with multiple CPUs or 64-bit systems, or both, creating and/or
> deleting a multi-megabyte dictionary in recent versions of Python (2.3,
> 2.4, 2.5 at least) takes a LONG time, of the order of 30+ minutes,
> compared to seconds if the system only has a single CPU. Turning garbage
> collection off doesn't help.
>
>

I can't reproduce this on a dual-CPU system (64-bit hardware, but
running a 32-bit OS in 32-bit mode). I added keys to a dict until I
ran out of memory (a bit over 22 million keys), and deleting the dict
took about 8 seconds (timed with a stopwatch, so not very precise, but
obviously far less than 30 minutes).

>>> d = {}
>>> idx = 0
>>> while idx < 1e10:
...   d[idx] = idx
...   idx += 1
...
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
MemoryError
>>> len(d)
22369622
>>> del d
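
For a more precise figure than a stopwatch, the same experiment can be
sketched as a small script. This is only a minimal sketch, assuming a
modern Python 3 where time.perf_counter is available (it does not exist
in the 2.x versions discussed in this thread). The function name
time_dict_teardown and the key count are hypothetical, chosen for
illustration, and the key count is kept small enough to avoid the
MemoryError shown above.

```python
import time


def time_dict_teardown(n_keys):
    """Build a dict with n_keys integer keys, then time its deletion.

    Returns (number of keys, build seconds, delete seconds).
    """
    d = {}
    start = time.perf_counter()
    for idx in range(n_keys):
        d[idx] = idx
    build_seconds = time.perf_counter() - start
    size = len(d)

    start = time.perf_counter()
    # d is the only reference, so in CPython the dict is freed right here.
    del d
    delete_seconds = time.perf_counter() - start
    return size, build_seconds, delete_seconds


if __name__ == "__main__":
    # Hypothetical key count; raise it to approach the sizes in this thread.
    size, build_s, delete_s = time_dict_teardown(1_000_000)
    print("keys: %d, build: %.2fs, delete: %.2fs" % (size, build_s, delete_s))
```

Running it on both an affected and an unaffected machine with the same
key count would show whether the slowdown is in building the dict,
in tearing it down, or in both.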


