How smart is the Python interpreter?
Steven D'Aprano
steve at REMOVE-THIS-cybersource.com.au
Thu Jul 31 18:58:25 EDT 2008
On Thu, 31 Jul 2008 04:09:57 -0700, ssecorp wrote:
> def str_sort(string):
> s = ""
> for a in sorted(string):
> s+=a
> return s
>
>
> if i instead do:
>
> def str_sort(string):
> s = ""
> so = sorted(string)
> for a in so:
> s+=a
> return s
>
>
> will that be faster
Oh dear. Premature optimization, the root of all (programming) evil.
You can test which is faster by using the timeit module. In the
interactive interpreter, define the two functions above with different
names, and a string to supply as argument. Then call:
from timeit import Timer
t1 = Timer('str_sort1(s)', 'from __main__ import str_sort1, s')
t2 = Timer('str_sort2(s)', 'from __main__ import str_sort2, s')
t1.repeat(number=1000)
t2.repeat(number=1000)
I'll be hugely surprised if there was any meaningful difference.
> or the interpreter can figure out that it only has
> to do sorted(string) once? or that kind of cleverness is usually
> reserved for compilers and not interpreters?
Python uses a compiler. It doesn't do a lot of clever optimization, but
it does some. In this case, no, it doesn't optimize your function, so
technically the first may be a tiny bit faster. But, frankly, your
function is so painfully inefficient, doing a lot of useless work, that
you probably won't notice any real difference.
The correct way to do what you have done above is ''.join(sorted(s)).
Anything else is much slower.
>>> def sort_str(s):
... ss = ""
... for c in sorted(s):
... ss += c
... return ss
...
>>> s = "abcdefghijklmnopqrstuvwxyz"*100
>>> from timeit import Timer
>>> t1 = Timer('"".join(sorted(s))', 'from __main__ import s')
>>> t2 = Timer('sort_str(s)', 'from __main__ import sort_str, s')
>>> t1.repeat(number=1000)
[1.6792540550231934, 1.6882510185241699, 1.660383939743042]
>>> t2.repeat(number=1000)
[2.5500221252441406, 2.4761130809783936, 2.5888760089874268]
--
Steven
More information about the Python-list
mailing list