[issue14419] Faster ascii decoding

STINNER Victor report at bugs.python.org
Tue Mar 27 14:03:35 CEST 2012


STINNER Victor <victor.stinner at gmail.com> added the comment:

New tests. I'm not convinced by the patch: it slows down the decoder for "short" strings. I don't understand which kind of ASCII-encoded strings (specific length or content?) the patch is supposed to optimize.
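
One way to probe the "specific length?" question is to sweep the input size with the timeit module instead of picking a few sizes by hand. The following is only a minimal sketch, not part of the patch: the helper names time_ascii_decode and sweep, the default sizes, and the space-only payload are assumptions chosen for illustration. Run it once with each build and compare the two outputs.

```python
import timeit


def time_ascii_decode(size, number=1000, repeat=3):
    """Return the best time in microseconds for one ASCII decode of `size` bytes.

    Hypothetical helper: the payload is pure ASCII (spaces only), so only the
    decoder's fast path is exercised.
    """
    data = b" " * size
    # The lambda adds a small constant call overhead to every measurement,
    # which matters most for the shortest inputs.
    timer = timeit.Timer(lambda: data.decode("ascii"))
    best = min(timer.repeat(repeat=repeat, number=number))
    return best / number * 1e6


def sweep(sizes=(10, 100, 1000, 10000, 100000)):
    """Map each input size in bytes to its best per-call time in microseconds."""
    return {size: time_ascii_decode(size) for size in sizes}


if __name__ == "__main__":
    for size, usec in sweep().items():
        print(f"{size:>8d} bytes: {usec:.3f} usec per call")
```

The sizes at which the patched build overtakes the unpatched one should show up as the point where the two columns cross, if they cross at all.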

Unpatched:

$ ./python -m timeit -n 50000 -r 100 -s 'data=open("README", "r").read().encode("ascii")' 'data.decode("ASCII")'
50000 loops, best of 100: 1.41 usec per loop

$ ./python -m timeit -n 1000 -s 'import codecs; d = codecs.getdecoder("ascii"); x = bytes(range(128))*10' 'd(x)'
1000 loops, best of 3: 0.564 usec per loop

$ ./python -m timeit -n 1000 -s 'import codecs; d = codecs.getdecoder("ascii"); x = bytes(range(128))*1000' 'd(x)'
1000 loops, best of 3: 24.4 usec per loop

$ ./python -m timeit -n 10 -s 'import codecs; d = codecs.getdecoder("ascii"); x = bytes(range(128))*100000' 'd(x)'
10 loops, best of 3: 10.9 msec per loop

$ ./python -m timeit -n 1000 -s 'enc = "ascii"; import codecs; d = codecs.getdecoder(enc); x = ("\u0020" * 1000000).encode(enc)' 'd(x)'
1000 loops, best of 3: 722 usec per loop

Patched:

$ ./python -m timeit -n 50000 -r 100 -s 'data=open("README", "r").read().encode("ascii")' 'data.decode("ASCII")'
50000 loops, best of 100: 1.74 usec per loop

$ ./python -m timeit -n 1000 -s 'import codecs; d = codecs.getdecoder("ascii"); x = bytes(range(128))*10' 'd(x)'
1000 loops, best of 3: 0.597 usec per loop

$ ./python -m timeit -n 1000 -s 'import codecs; d = codecs.getdecoder("ascii"); x = bytes(range(128))*1000' 'd(x)'
1000 loops, best of 3: 27.3 usec per loop

$ ./python -m timeit -n 10 -s 'import codecs; d = codecs.getdecoder("ascii"); x = bytes(range(128))*100000' 'd(x)'
10 loops, best of 3: 8.32 msec per loop

$ ./python -m timeit -n 1000 -s 'enc = "ascii"; import codecs; d = codecs.getdecoder(enc); x = ("\u0020" * 1000000).encode(enc)' 'd(x)'
1000 loops, best of 3: 479 usec per loop
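
To make the comparison easier to read, the relative change for each pair of runs can be computed from the figures above. This is just a sketch: the labels and the relative_change helper are hypothetical names, and the timings are copied from the output above, with the msec values converted to usec.

```python
# (label, unpatched usec, patched usec), copied from the timeit output above.
results = [
    ("README", 1.41, 1.74),
    ("bytes(range(128))*10", 0.564, 0.597),
    ("bytes(range(128))*1000", 24.4, 27.3),
    ("bytes(range(128))*100000", 10.9e3, 8.32e3),  # msec converted to usec
    ("1000000 spaces", 722.0, 479.0),
]


def relative_change(unpatched, patched):
    """Percentage change of patched vs. unpatched; positive means slower."""
    return (patched - unpatched) / unpatched * 100.0


changes = {label: relative_change(u, p) for label, u, p in results}

for label, pct in changes.items():
    verdict = "slower" if pct > 0 else "faster"
    print(f"{label:26s} {abs(pct):5.1f}% {verdict}")
```

With these numbers the three smaller inputs come out slower with the patch and the two largest come out faster.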

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue14419>
_______________________________________

