list_to_dict()
Peter Abel
p-abel at t-online.de
Mon Jan 20 14:14:19 EST 2003
Martin Maney <maney at pobox.com> wrote in message news:<b0ga40$nr8$1 at wheel2.two14.net>...
> Peter Abel <p-abel at t-online.de> wrote:
> >>>> def myzip(keys,values):
> > .... res={}
> > .... map(lambda x,y:res.setdefault(x,y),keys,values)
> > .... return res
> >
> > And I think it goes with python 2.1. or earlier.
> > A little time_test(number_of_keys) - function which generates
> > simple 5-random-letters-keys and a value-list as range(number_of_keys)
> > will show an increase in speed.
> >>>> time_test(50000)
> > dzip : 0.28100001812
> > myzip : 0.140999913216
> >>>> time_test(500000)
> > dzip : 3.31200003624
> > myzip : 1.59399998188
>
> That's odd. When I dropped the map(...) code into my simple test, it
> was the worst performer at all sizes. :-(
>
> Oh, wait - are you testing building single dictionaries with 50K keys?
> My largest test is with 1000 keys per dictionary, and the results that
> actually matter would with perhaps 5 to 10 keys. Try building a dict
> of 10 items 5K times instead. Not that the trend I see at 1000
> keys/dict and below makes the numbers you reported seem likely even
> so...
>
> Building dicts of 5 items 40000 times (CPU time per Kdict):
> traditional: 0.0190
> traditional inlined: 0.0162
> dict(zip): 0.0255
> inlined dict(zip): 0.0223
> map: 0.0337
> inlined map: 0.0302
> Building dicts of 10 items 20000 times (CPU time per Kdict):
> traditional: 0.0285
> traditional inlined: 0.0260
> dict(zip): 0.0320
> inlined dict(zip): 0.0295
> map: 0.0535
> inlined map: 0.0500
> Building dicts of 100 items 4000 times (CPU time per Kdict):
> traditional: 0.2000
> traditional inlined: 0.2025
> dict(zip): 0.1650
> inlined dict(zip): 0.1600
> map: 0.4125
> inlined map: 0.4100
> Building dicts of 1000 items 400 times (CPU time per Kdict):
> traditional: 2.0500
> traditional inlined: 2.0500
> dict(zip): 1.5750
> inlined dict(zip): 1.5500
> map: 3.9000
> inlined map: 4.0750
>
> The changes in the total numbers of keys inserted was done to keep the
> overall running times about the same - between two and three seconds
> on the P2/233 I originally did this on. As you can see, I've since
> changed the format to report the time per 1000 dictionaries to make the
> shape of the scaling apparent.
>
> The "inlined" versions simply place the body of the functions into the
> body of the test loop, of course. I can't now recall why I called the
> dzip() code "traditional", unless it was just that that's what I have
> been using in a couple of projects.
I give up!!
Until today I thought I could say that I know a little bit
about Python programming. But from now on I'm not quite sure
if I know how to write P Y T H O N ? ?
I tried the following code:
import random,time,gc

# An inline function to generate random keys with n letters
# between chr(65) and chr(90) [A-Z]
genkey=lambda n:reduce(lambda last_chars,next_char:last_chars+next_char,
                       map(lambda n:chr(random.randint(65,90)),range(n)),'')

def test(n_keys=1000,key_items=100,item_len=5):
    print '-'*60
    print "Key-Generation for %d keys with %d items/key and each item %d chars long"%(n_keys,key_items,item_len)
    # It's not very readable, but I hope it's quicker than a for-loop
    keys=[ tuple( map(lambda n:genkey(item_len),range(key_items) )) for k in range(n_keys)]
    # Values are simpler
    values=range(n_keys)
    # Here we go
    # Your dzip-function
    res = {}
    st1=time.time()
    for k,v in zip(keys, values):
        res[k] = v
    et1=time.time()
    dzip_time=et1-st1
    print 'dzip :','Time total:',dzip_time,'Time/key :',dzip_time/n_keys
    # Some garbage-collection
    # to have similar conditions
    res = {}
    gc.collect()
    # myzip-function
    st2=time.time()
    map(lambda x,y:res.setdefault(x,y),keys,values)
    et2=time.time()
    mytime=et2-st2
    print 'myzip:','Time total:',mytime,'Time/key :',mytime/n_keys
    # A measured time of 0.0 would raise ZeroDivisionError below
    if mytime:
        print 'Timerelation dzip/myzip',dzip_time/mytime
    else:
        print "Can't calculate Timerelation dzip/myzip -> ZeroDivisionError"

test()
test(500,50,10)
test(2000,200,2)
print
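
For readers on a current interpreter, here is a minimal sketch of the variants discussed in this thread as standalone functions. It assumes Python 3, not the Python 2 used for the script above; the name dict_zip is only an illustrative label for the dict(zip) variant. Two differences are worth keeping in mind: in Python 3 map() is lazy and has to be consumed before the dictionary is filled, and setdefault() keeps the first value seen for a repeated key, whereas plain assignment keeps the last.

```python
# Python 3 sketch; the original script targets Python 2.

def dzip(keys, values):
    """Explicit loop with item assignment."""
    res = {}
    for k, v in zip(keys, values):
        res[k] = v
    return res

def dict_zip(keys, values):
    """Pass the key/value pairs straight to the dict constructor."""
    return dict(zip(keys, values))

def myzip(keys, values):
    """map/setdefault variant from the thread."""
    res = {}
    # list(...) forces evaluation; Python 2's map() ran eagerly without it.
    list(map(lambda x, y: res.setdefault(x, y), keys, values))
    return res

keys = ["a", "b", "a"]          # note the repeated key
values = [1, 2, 3]

print(dzip(keys, values))       # {'a': 3, 'b': 2}
print(dict_zip(keys, values))   # {'a': 3, 'b': 2}
print(myzip(keys, values))      # {'a': 1, 'b': 2}
```

With random keys as long as the ones generated above, repeated keys are very unlikely, so the difference should not affect the timings.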
I ran it 3 times under W2K in a DOS box and, believe it
or not, got 3 different results, as you can see:
M:\Python\XML-XSL\Tests>list_to_dict.py
------------------------------------------------------------
Key-Generation for 1000 keys with 100 items/key and each item 5 chars long
dzip : Time total: 0.0199999809265 Time/key : 1.99999809265e-005
myzip: Time total: 0.00999999046326 Time/key : 9.99999046326e-006
Timerelation dzip/myzip 2.0
------------------------------------------------------------
Key-Generation for 500 keys with 50 items/key and each item 10 chars long
dzip : Time total: 0.00999999046326 Time/key : 1.99999809265e-005
myzip: Time total: 0.00999999046326 Time/key : 1.99999809265e-005
Timerelation dzip/myzip 1.0
------------------------------------------------------------
Key-Generation for 2000 keys with 200 items/key and each item 2 chars long
dzip : Time total: 0.0299999713898 Time/key : 1.49999856949e-005
myzip: Time total: 0.0299999713898 Time/key : 1.49999856949e-005
Timerelation dzip/myzip 1.0
M:\Python\XML-XSL\Tests>list_to_dict.py
------------------------------------------------------------
Key-Generation for 1000 keys with 100 items/key and each item 5 chars long
dzip : Time total: 0.0199999809265 Time/key : 1.99999809265e-005
myzip: Time total: 0.00999999046326 Time/key : 9.99999046326e-006
Timerelation dzip/myzip 2.0
------------------------------------------------------------
Key-Generation for 500 keys with 50 items/key and each item 10 chars long
dzip : Time total: 0.00999999046326 Time/key : 1.99999809265e-005
myzip: Time total: 0.00999999046326 Time/key : 1.99999809265e-005
Timerelation dzip/myzip 1.0
------------------------------------------------------------
Key-Generation for 2000 keys with 200 items/key and each item 2 chars long
dzip : Time total: 0.039999961853 Time/key : 1.99999809265e-005
myzip: Time total: 0.0199999809265 Time/key : 9.99999046326e-006
Timerelation dzip/myzip 2.0
M:\Python\XML-XSL\Tests>list_to_dict.py
------------------------------------------------------------
Key-Generation for 1000 keys with 100 items/key and each item 5 chars long
dzip : Time total: 0.0199999809265 Time/key : 1.99999809265e-005
myzip: Time total: 0.0 Time/key : 0.0
Can't calculate Timerelation dzip/myzip -> ZeroDivisionError
------------------------------------------------------------
Key-Generation for 500 keys with 50 items/key and each item 10 chars long
dzip : Time total: 0.00999999046326 Time/key : 1.99999809265e-005
myzip: Time total: 0.00999999046326 Time/key : 1.99999809265e-005
Timerelation dzip/myzip 1.0
------------------------------------------------------------
Key-Generation for 2000 keys with 200 items/key and each item 2 chars long
dzip : Time total: 0.039999961853 Time/key : 1.99999809265e-005
myzip: Time total: 0.0199999809265 Time/key : 9.99999046326e-006
Timerelation dzip/myzip 2.0
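
Every total reported above is close to 0.00, 0.01, 0.02, 0.03 or 0.04 seconds, which suggests that time.time() on this machine advanced in steps of roughly 10 ms, so a single pass over one or two thousand keys is too short to measure reliably. A minimal sketch of a steadier measurement follows, assuming Python 3 and the standard timeit module, which repeats the call many times on a high-resolution clock. The helpers make_keys and best_time are hypothetical names introduced for this illustration, and the key sizes are kept small so that it runs quickly.

```python
import random
import string
import timeit

def make_keys(n_keys, key_items, item_len):
    """Random tuple keys shaped like those in the test script (hypothetical helper)."""
    letters = string.ascii_uppercase
    return [
        tuple("".join(random.choices(letters, k=item_len)) for _ in range(key_items))
        for _ in range(n_keys)
    ]

def dzip(keys, values):
    res = {}
    for k, v in zip(keys, values):
        res[k] = v
    return res

def myzip(keys, values):
    res = {}
    # list(...) forces the lazy Python 3 map to run.
    list(map(lambda x, y: res.setdefault(x, y), keys, values))
    return res

def best_time(func, keys, values, number=20, repeat=5):
    """Best per-call time in seconds over `repeat` batches of `number` calls."""
    timer = timeit.Timer(lambda: func(keys, values))
    return min(timer.repeat(repeat=repeat, number=number)) / number

keys = make_keys(200, 20, 5)
values = list(range(len(keys)))

t_dzip = best_time(dzip, keys, values)
t_myzip = best_time(myzip, keys, values)
print("dzip :", t_dzip)
print("myzip:", t_myzip)
```

Taking the minimum over several batches is a common way to reduce noise from other processes; the absolute numbers will of course depend on the machine and interpreter.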
And the winner is . . . (forget it!!)
Cheers
Peter