[New-bugs-announce] [issue14654] More fast utf-8 decoding

Serhiy Storchaka report at bugs.python.org
Mon Apr 23 23:04:07 CEST 2012


New submission from Serhiy Storchaka <storchaka at gmail.com>:

The utf-8 decoder is already well optimized. I propose a patch that speeds it up further for some frequent cases (by 10-30%). In particular, decoding of 2-byte non-latin1 characters gets about 30% faster.
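
To check figures like these on a given build, a small timing script can compare decoding throughput across input classes. The following is only a sketch and not part of the patch: the sample characters, the sizes, and the helper names make_samples and bench_decode are assumptions chosen for illustration.

```python
import timeit


def make_samples(n_chars=10000):
    """Build one sample text per input class (classes chosen for illustration)."""
    return {
        "ascii": "a" * n_chars,                    # 1 byte per character
        "latin1": "\xe9" * n_chars,                # 2 bytes, code point below 0x100
        "2-byte non-latin1": "\u0439" * n_chars,   # 2 bytes, Cyrillic letter
        "3-byte": "\u20ac" * n_chars,              # 3 bytes, euro sign
    }


def bench_decode(text, repeat=5, number=200):
    """Return the best observed utf-8 decoding throughput in MB/s."""
    data = text.encode("utf-8")
    # Take the minimum over several runs to reduce noise from other processes.
    best = min(timeit.repeat(lambda: data.decode("utf-8"),
                             repeat=repeat, number=number))
    return len(data) * number / best / 1e6


if __name__ == "__main__":
    for name, text in make_samples().items():
        print(f"{name:>20}: {bench_decode(text):8.1f} MB/s")
```

Running the script with an unpatched and a patched interpreter and comparing the lines gives a rough idea of the gain per input class; absolute numbers depend on the machine and compiler.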

This is not the final result of the optimization. It may be possible to optimize the decoding of ascii and mostly-ascii text (up to the speed of memcpy) and of text with occasional errors, and to reduce code duplication. But I'm not sure it will succeed.
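
The idea behind the mostly-ascii case can be illustrated in pure Python. This is a hypothetical sketch, not code from the patch (which is in C): the function names ascii_prefix_len and decode_mostly_ascii are made up here. It relies on the fact that in utf-8 every byte below 0x80 is a complete character, so a leading ascii run can be copied as-is and only the rest needs the full decoder.

```python
def ascii_prefix_len(data: bytes) -> int:
    """Return the length of the leading run of bytes below 0x80."""
    for i, b in enumerate(data):
        if b >= 0x80:
            return i
    return len(data)


def decode_mostly_ascii(data: bytes) -> str:
    """Decode utf-8, handling the leading ascii run separately."""
    n = ascii_prefix_len(data)
    # ascii bytes map one-to-one to code points, so this part is a plain copy.
    head = data[:n].decode("ascii")
    # The first non-ascii byte starts a new character, so the split is safe.
    tail = data[n:].decode("utf-8")
    return head + tail
```

In Python this is of course slower than the built-in decoder; it only shows where a memcpy-like fast path could apply.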

Related issues:
[issue4868] Faster utf-8 decoding
[issue13417] faster utf-8 decoding
[issue14419] Faster ascii decoding
[issue14624] Faster utf-16 decoder
[issue14625] Faster utf-32 decoder

----------
components: Interpreter Core
files: decode_utf8.patch
keywords: patch
messages: 159080
nosy: haypo, pitrou, storchaka
priority: normal
severity: normal
status: open
title: More fast utf-8 decoding
type: performance
versions: Python 3.3
Added file: http://bugs.python.org/file25326/decode_utf8.patch

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue14654>
_______________________________________

