[New-bugs-announce] [issue14874] Faster charmap decoding

Serhiy Storchaka report at bugs.python.org
Tue May 22 00:19:30 CEST 2012


New submission from Serhiy Storchaka <storchaka at gmail.com>:

Charmap decoders are not as important as UTF decoders, but they are still widely used. In Python 3.3, with PEP 393, they became about 4x slower. The proposed patch restores the performance.

Only the most common case is optimized: a decoder specified by a UCS2 table with length >= 256. Map-based decoders are translated to table-based ones. UCS1 tables are widened to UCS2 by adding a fake 257th character.
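The two translations described above can be sketched in pure Python. This is only an illustration of the idea, not the code from the patch (which is in C): the helper names map_to_table and widen_table are hypothetical, and the use of U+FFFE as the padding character is an assumption. Under PEP 393 the storage width of a str is chosen by its largest code point, so one character above U+00FF is enough to make a table UCS2.

```python
import codecs
from encodings import cp1251

# Charmap decoding tables use U+FFFE to mark "no mapping for this byte".
UNDEFINED = "\ufffe"

def map_to_table(mapping, size=256):
    """Translate a {byte value: code point} map into a table string.

    Hypothetical helper: bytes missing from the map become UNDEFINED.
    """
    return "".join(
        chr(mapping[i]) if i in mapping else UNDEFINED for i in range(size)
    )

def widen_table(table):
    """Append one character above U+00FF so the str is stored as UCS2.

    Hypothetical helper; the choice of padding character is an assumption.
    """
    return table + UNDEFINED

# Table-based decoder: index = byte value, item = decoded character.
table = cp1251.decoding_table
# An equivalent map-based decoder, built here only for comparison.
mapping = {i: ord(ch) for i, ch in enumerate(table) if ch != UNDEFINED}

data = b"\xc0\xa0A\x80"
via_table = codecs.charmap_decode(data, "strict", table)[0]
via_map = codecs.charmap_decode(data, "strict", mapping)[0]

# A table holding only Latin-1 characters is stored one byte per character;
# the 257th character forces the wider representation.
latin1_table = "".join(map(chr, range(256)))
wide = widen_table(latin1_table)
```

Both decoders return the same text for the same bytes, and the extra 257th entry is never indexed because byte values stop at 255.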

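Benchmark results (higher is faster; the percentage in parentheses is the gain of the patched build over that column):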

                             3.2           3.3(vanilla)  3.3(patched)

cp1251    'A'*10000          111 (+10%)    31 (+294%)    122
cp1251    '\xa0'*10000       111 (+8%)     29 (+314%)    120
cp1251    '\u0402'*10000     111 (+6%)     25 (+372%)    118
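For anyone who wants to take comparable measurements, here is a minimal sketch using timeit. It assumes the samples above are encoded to cp1251 and then decoded repeatedly; the actual benchmark script is not part of this message, so the function names, the repeat counts and the MB/s unit are all assumptions. The gain helper reproduces the percentages in the table from the raw figures.

```python
import timeit

def decode_rate(text, encoding="cp1251", repeat=3, number=200):
    """Return the decoding throughput in MB/s (assumed unit, best of `repeat` runs)."""
    data = text.encode(encoding)
    best = min(
        timeit.repeat(lambda: data.decode(encoding), repeat=repeat, number=number)
    )
    return len(data) * number / best / 1e6

def gain(patched, baseline):
    """Percentage by which `patched` exceeds `baseline`, rounded to an integer."""
    return round((patched / baseline - 1) * 100)

if __name__ == "__main__":
    # The three samples from the table: ASCII, Latin-1 range, and above U+00FF.
    for label, char in (("'A'", "A"), ("'\\xa0'", "\xa0"), ("'\\u0402'", "\u0402")):
        rate = decode_rate(char * 10000)
        print("cp1251  %-10s*10000  %8.1f" % (label, rate))
```

For example, gain(122, 31) gives 294 and gain(122, 111) gives 10, matching the first row.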

----------
components: Interpreter Core, Unicode
files: decode_charmap.patch
keywords: patch
messages: 161301
nosy: ezio.melotti, haypo, lemburg, pitrou, storchaka
priority: normal
severity: normal
status: open
title: Faster charmap decoding
type: performance
versions: Python 3.3
Added file: http://bugs.python.org/file25664/decode_charmap.patch

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue14874>
_______________________________________

