[New-bugs-announce] [issue14874] Faster charmap decoding
Serhiy Storchaka
report at bugs.python.org
Tue May 22 00:19:30 CEST 2012
New submission from Serhiy Storchaka <storchaka at gmail.com>:
Charmap decoders are not as important as UTF decoders, but they are still widely used. In Python 3.3, with the PEP 393 changes, they became about 4x slower. The proposed patch restores the performance.
Only the most common case is optimized: a decoder specified by a UCS2 table with length >= 256. Map-based decoders are translated to table-based ones. UCS1 tables are widened to UCS2 by adding a fake 257th character.
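To illustrate the difference between map-based and table-based decoders mentioned above, here is a minimal Python-level sketch. It is not the patch itself, which works inside the C decoder. The helper name map_to_table and the three sample mappings are assumptions made for illustration; codecs.charmap_decode is the existing standard library function, and U+FFFE is the marker it treats as "undefined" in a string table.

```python
import codecs

def map_to_table(decoding_map):
    """Hypothetical helper: turn {byte: code point} into a 256-char table."""
    chars = []
    for byte in range(256):
        code_point = decoding_map.get(byte)
        # U+FFFE marks a byte with no mapping in a string decoding table.
        chars.append("\ufffe" if code_point is None else chr(code_point))
    return "".join(chars)

# A tiny map-based decoder: ASCII plus two cp1251-style entries.
decoding_map = {i: i for i in range(128)}
decoding_map[0xA0] = 0x00A0
decoding_map[0x80] = 0x0402

table = map_to_table(decoding_map)
data = b"A\xa0\x80"

# Both forms are accepted by charmap_decode and give the same result.
from_map = codecs.charmap_decode(data, "strict", decoding_map)
from_table = codecs.charmap_decode(data, "strict", table)
```

With a string table the decoder can index directly by byte value, whereas the dict form needs a mapping lookup per byte.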
Benchmark results (the percentage in parentheses is the gain of the patched build over that column):

                          3.2           3.3(vanilla)   3.3(patched)
cp1251 'A'*10000          111 (+10%)    31 (+294%)     122
cp1251 '\xa0'*10000       111 (+8%)     29 (+314%)     120
cp1251 '\u0402'*10000     111 (+6%)     25 (+372%)     118
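A rough way to run a similar measurement is sketched below. The message does not say how the figures were obtained or what unit they are in, so the function name decode_rate, the repetition count, and the MB/s unit are all assumptions; only the three test strings and the cp1251 codec come from the results above.

```python
import timeit

def decode_rate(text, encoding="cp1251", number=200):
    """Hypothetical helper: decoding throughput in MB/s (assumed unit)."""
    data = text.encode(encoding)
    elapsed = timeit.timeit(lambda: data.decode(encoding), number=number)
    return len(data) * number / elapsed / 1e6

if __name__ == "__main__":
    # The same three inputs as in the benchmark rows.
    for label, text in [("'A'*10000", "A" * 10000),
                        ("'\\xa0'*10000", "\xa0" * 10000),
                        ("'\\u0402'*10000", "\u0402" * 10000)]:
        print("cp1251", label, round(decode_rate(text)))
```

Absolute numbers will differ by machine and Python build; only the ratio between builds on the same machine is meaningful.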
----------
components: Interpreter Core, Unicode
files: decode_charmap.patch
keywords: patch
messages: 161301
nosy: ezio.melotti, haypo, lemburg, pitrou, storchaka
priority: normal
severity: normal
status: open
title: Faster charmap decoding
type: performance
versions: Python 3.3
Added file: http://bugs.python.org/file25664/decode_charmap.patch
_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue14874>
_______________________________________