[issue4868] Faster utf-8 decoding

Thu Jan 8 16:22:32 CET 2009

Antoine Pitrou <pitrou at free.fr> added the comment:

> Attached patch (utf8decode4.patch) changes this and may enter the
> fast loop on the first character.

Thanks!

> Does this idea apply to the encode function as well?

Probably, although less efficiently: a long can hold only 1, 2 or 4
unicode characters depending on the build, versus 4 or 8 bytes when
decoding.
The unrolling part also applies to simple codecs such as latin1.
Unrolling PyUnicode_DecodeLatin1 a bit (4 copies per iteration) makes it
twice as fast on non-tiny strings. I'll experiment with utf16.
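
To illustrate the kind of unrolling meant here, a rough sketch follows.
It is not the actual patch: unichar_t and decode_latin1_unrolled are
hypothetical names standing in for Py_UNICODE (assumed 16 bits wide, as
on a narrow build) and for the copy loop in PyUnicode_DecodeLatin1.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for Py_UNICODE on a narrow (UCS-2) build. */
typedef uint16_t unichar_t;

/* Decode `size` Latin-1 bytes from `s` into `out`.
 * Each Latin-1 byte maps to the code point with the same value,
 * so decoding is a widening copy. `out` must have room for `size`
 * elements. */
static void
decode_latin1_unrolled(const unsigned char *s, size_t size, unichar_t *out)
{
    const unsigned char *end = s + size;

    /* Unrolled main loop: 4 copies per iteration. */
    while (end - s >= 4) {
        out[0] = s[0];
        out[1] = s[1];
        out[2] = s[2];
        out[3] = s[3];
        s += 4;
        out += 4;
    }

    /* Tail loop: the remaining 0 to 3 bytes. */
    while (s < end)
        *out++ = *s++;
}
```

The tail loop handles lengths that are not a multiple of 4, which is
also why the gain only shows up on non-tiny strings.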

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue4868>
_______________________________________