list_to_dict()

Martin Maney maney at pobox.com
Mon Jan 20 02:56:48 EST 2003


Peter Abel <p-abel at t-online.de> wrote:
> >>> def myzip(keys,values):
> ....    res={}
> ....    map(lambda x,y:res.setdefault(x,y),keys,values)
> ....    return res
> 
>   And I think it goes with python 2.1. or earlier.
>   A little time_test(number_of_keys) - function which generates
>   simple 5-random-letters-keys and a value-list as range(number_of_keys)
>   will show an increase in speed.
> >>> time_test(50000)
> dzip  :  0.28100001812
> myzip :  0.140999913216
> >>> time_test(500000)
> dzip  :  3.31200003624
> myzip :  1.59399998188

That's odd.  When I dropped the map(...) code into my simple test, it
was the worst performer at all sizes.  :-(

Oh, wait - are you testing building single dictionaries with 50K keys? 
My largest test is with 1000 keys per dictionary, and the results that
actually matter would be with perhaps 5 to 10 keys.  Try building a dict
of 10 items 5K times instead.  Not that the trend I see at 1000
keys/dict and below makes the numbers you reported seem likely even
so...
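
To make the suggested comparison concrete, here is a minimal sketch of a
harness that builds a small dictionary many times instead of one huge
dictionary once.  The names build_dict and time_builds are made up for
illustration, the key strings are arbitrary, and time.process_time() is a
modern stand-in for whatever CPU timer the tests above used.  This is not
the actual test code.

```python
import time


def build_dict(keys, values):
    # dict(zip(...)) stands in for whichever builder is under test.
    return dict(zip(keys, values))


def time_builds(builder, n_keys, reps):
    """Return CPU seconds spent calling builder(keys, values) reps times."""
    # Hypothetical test data: n_keys distinct string keys, integer values.
    keys = ["k%d" % i for i in range(n_keys)]
    values = list(range(n_keys))
    start = time.process_time()
    for _ in range(reps):
        builder(keys, values)
    return time.process_time() - start


# Build a dict of 10 items 5K times, as suggested above.
elapsed = time_builds(build_dict, 10, 5000)
print("10 keys x 5000 builds: %.4f CPU seconds" % elapsed)
```

Passing the builder in as an argument keeps the loop overhead identical
for every variant being compared.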

Building dicts of 5 items 40000 times (CPU time per Kdict):
        traditional: 0.0190
traditional inlined: 0.0162
          dict(zip): 0.0255
  inlined dict(zip): 0.0223
                map: 0.0337
        inlined map: 0.0302
Building dicts of 10 items 20000 times (CPU time per Kdict):
        traditional: 0.0285
traditional inlined: 0.0260
          dict(zip): 0.0320
  inlined dict(zip): 0.0295
                map: 0.0535
        inlined map: 0.0500
Building dicts of 100 items 4000 times (CPU time per Kdict):
        traditional: 0.2000
traditional inlined: 0.2025
          dict(zip): 0.1650
  inlined dict(zip): 0.1600
                map: 0.4125
        inlined map: 0.4100
Building dicts of 1000 items 400 times (CPU time per Kdict):
        traditional: 2.0500
traditional inlined: 2.0500
          dict(zip): 1.5750
  inlined dict(zip): 1.5500
                map: 3.9000
        inlined map: 4.0750

The total number of keys inserted was varied to keep the
overall running times about the same - between two and three seconds
on the P2/233 I originally did this on.  As you can see, I've since
changed the format to report the time per 1000 dictionaries to make the
shape of the scaling apparent.
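
For anyone wanting to reproduce the normalization, the arithmetic can be
sketched as below.  The function name cpu_per_kdict is made up, and the
0.76 second total is a hypothetical input chosen only because it scales
to the 0.0190 figure in the first table; the repetition counts are the
ones reported above.

```python
def cpu_per_kdict(total_cpu_seconds, reps):
    """Scale a total CPU time to the time per 1000 dictionaries built."""
    return total_cpu_seconds * 1000.0 / reps


# Repetition counts from the tables above, keyed by keys per dictionary.
runs = {5: 40000, 10: 20000, 100: 4000, 1000: 400}

# Total number of keys inserted in each run (keys per dict * repetitions).
total_keys = {n: n * reps for n, reps in runs.items()}

# Hypothetical total of 0.76 CPU seconds over 40000 builds.
example = cpu_per_kdict(0.76, 40000)
print("%.4f" % example)
```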

The "inlined" versions simply place the body of the functions into the
body of the test loop, of course.  I can't now recall why I called the
dzip() code "traditional", unless it was just that that's what I have
been using in a couple of projects.
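
As a rough illustration of the variants and of what "inlined" means, here
is a sketch written for current Python.  The dzip() body is not shown in
this message, so the explicit loop below is only an assumed form of it;
dict_zip is a made-up name for the dict(zip) variant; and myzip() follows
the quoted code except for the list() wrapper, which is needed because
map() is lazy in Python 3.

```python
def dzip(keys, values):
    # Assumed form of the "traditional" builder: an explicit loop.
    res = {}
    for k, v in zip(keys, values):
        res[k] = v
    return res


def dict_zip(keys, values):
    # The dict(zip) variant.
    return dict(zip(keys, values))


def myzip(keys, values):
    # The quoted map/setdefault version.  list() forces the lazy
    # Python 3 map(); in Python 2 the bare map() call was enough.
    res = {}
    list(map(lambda x, y: res.setdefault(x, y), keys, values))
    return res


keys = ["a", "b", "c"]
values = [1, 2, 3]

# Function version: one call per dictionary built.
for _ in range(3):
    d = dzip(keys, values)

# "Inlined" version: the same body placed directly in the test loop.
for _ in range(3):
    res = {}
    for k, v in zip(keys, values):
        res[k] = v
```

One behavioural difference worth keeping in mind: with duplicate keys,
setdefault() keeps the first value seen, while the loop and dict(zip)
versions keep the last.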




