Python and STL efficiency
Maric Michaud
maric at aristote.info
Wed Aug 23 04:24:48 EDT 2006
Le mardi 22 août 2006 23:15, Fredrik Lundh a écrit :
> Maric Michaud wrote:
> > The problem here, is that the strings in the set are compared by value,
> > which is not optimal, and I guess python compare them by adress ("s*n is
> > s*n" has the same complexity than "s*n == s*n" in CPython, right ?).
>
> wrong.
>
Ah ! wrong, thanks, but not in our case :
In [80]: for i in (10**e for e in range(5)) :
....: res1 = timeit.Timer('s == t', 's, t = "e"*%i, "e"*%i'%(i,
i)).timeit()
....: res2 = timeit.Timer('s is t', 's, t = "e"*%i, "e"*%i'%(i,
i)).timeit()
....: print res1/res2
....:
....:
1.10532866525
1.27507328965
1.90244004672
8.33974283485
89.5215441627
Indeed, it's wrong for two instances of str, but not if we compare an instance
to itself :
In [79]: for i in (10**e for e in range(9)) :
....: r1=timeit.Timer('s is p', "s, p = ('s'*%s,)*2" % i).timeit()
....: r2=timeit.Timer('s == p', "s, p = ('s'*%s,)*2" % i).timeit()
....: print r1/r2
....:
....:
0.854690643008
0.872682262181
0.851785060822
0.871193603744
0.890304121256
0.86925960859
0.846364097331
0.91614070798
0.825424114324
So my lastest c++ algorithm is the good one still, as the python code is
like :
In [29]: timeit.Timer('set(t)', 't=("e"*10,)*10').timeit()
Out[29]: 1.883868932723999
In [30]: timeit.Timer('set(t)', 't=("e"*100,)*10').timeit()
Out[30]: 1.8789899349212646
and not :
In [27]: timeit.Timer('set(t)', 't=["e"*10 for i in range(10)]').timeit()
Out[27]: 2.6000990867614746
In [28]: timeit.Timer('set(t)', 't=["e"*100 for i in range(10)]').timeit()
Out[28]: 4.1173238754272461
> > timeit -s"s='x'; n=1000" "s*n is n*s"
>
> 1000000 loops, best of 3: 1.9 usec per loop
>
> > timeit -s"s='x'; n=1000" "s*n == n*s"
>
> 100000 loops, best of 3: 4.5 usec per loop
>
> </F>
--
_____________
Maric Michaud
_____________
Aristote - www.aristote.info
3 place des tapis
69004 Lyon
Tel: +33 426 880 097
More information about the Python-list
mailing list