Abuse of subject, was Re: Abuse of Big Oh notation

Tue Aug 21 03:52:09 EDT 2012

wxjmfauth at gmail.com wrote:

> By chance and luckily, first attempt.

> c:\python32\python -m timeit "('€'*100+'€'*100).replace('€'
> , 'œ')"
> 1000000 loops, best of 3: 1.48 usec per loop
> c:\python33\python -m timeit "('€'*100+'€'*100).replace('€'
> , 'œ')"
> 100000 loops, best of 3: 7.62 usec per loop

OK, that is roughly a factor of 5. Let's see what I get:

$ python3.2 -m timeit '("€"*100+"€"*100).replace("€", "œ")'
100000 loops, best of 3: 1.8 usec per loop
$ python3.3 -m timeit '("€"*100+"€"*100).replace("€", "œ")'
10000 loops, best of 3: 9.11 usec per loop

That is a factor of 5, too. So I can replicate your measurement on an AMD64
Linux system with a self-built 3.3 versus the system 3.2.
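
For anyone who wants to repeat this from inside a script rather than the
shell, here is a minimal sketch using the timeit module. The helper name
best_usec and the loop count are my own choices for illustration, not
anything from the measurements above; run it under each interpreter and
compare the two numbers.

```python
import timeit

def best_usec(stmt, number=10000, repeat=3):
    """Best per-loop time in microseconds, similar in spirit to
    `python -m timeit` (hypothetical helper, fixed loop count)."""
    timings = timeit.repeat(stmt, number=number, repeat=repeat)
    return min(timings) / number * 1e6

# Same statement as in the shell commands above.
stmt = '("€"*100+"€"*100).replace("€", "œ")'
print("%.3g usec per loop" % best_usec(stmt))
```

Unlike the command-line tool, this does not pick the loop count
automatically, so the absolute numbers may differ slightly.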

> Note
> The used characters are not members of the latin-1 coding
> scheme (btw an *unusable* coding).
> They are however characters in cp1252 and mac-roman.

You seem to imply that the slowdown is connected to the inability of latin-1 
to encode "œ" and "€" (to take the examples relevant to the above 
microbench). So let's repeat with latin-1 characters:

$ python3.2 -m timeit '("ä"*100+"ä"*100).replace("ä", "ß")'
100000 loops, best of 3: 1.76 usec per loop
$ python3.3 -m timeit '("ä"*100+"ä"*100).replace("ä", "ß")'
10000 loops, best of 3: 10.3 usec per loop

Hm, the slowdown is even a tad bigger, closer to a factor of 6. So we can
safely dismiss your theory that an unfortunate choice of the 8-bit encoding
is causing it. Do you agree?
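
In case there is doubt about which of the four characters latin-1 can
actually encode, here is a small sketch that checks it. The helper name
encodable is made up for this example; it relies only on str.encode()
raising UnicodeEncodeError for characters outside the codec.

```python
def encodable(ch, codec):
    """Return True if the single character ch can be encoded with codec
    (hypothetical helper for this example)."""
    try:
        ch.encode(codec)
        return True
    except UnicodeEncodeError:
        return False

# The characters used in the two microbenchmarks above.
for ch in "€œäß":
    print("U+%04X latin-1: %s cp1252: %s"
          % (ord(ch), encodable(ch, "latin-1"), encodable(ch, "cp1252")))
```

"€" and "œ" fail for latin-1 but succeed for cp1252, while "ä" and "ß"
succeed for both, which is the distinction the second timing run depends on.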