[issue18059] Add multibyte encoding support to pyexpat

Serhiy Storchaka report at bugs.python.org
Sat Mar 25 09:39:08 EDT 2017


Serhiy Storchaka added the comment:

Marc-Andre, there are at least two issues about supporting East Asian encodings (issue13612 and issue15877). I think this means that these encodings are used in XML in the wild. The current encoding support (8-bit + UTF-8 + UTF-16) is enough for my needs, but I have never had to deal with East Asian languages.

Currently the CodecInfo object has an optional flag _is_text_encoding. I think we can add more private attributes (flags and precomputed tables) for use with the expat parser. If they are not set (as with third-party encodings), the current autodetection code can be used as a fallback.
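
A rough sketch of the lookup-with-fallback idea, written in Python only for illustration (the real code would live in C in pyexpat). The attribute name _expat_byte_map and both helper functions are hypothetical and do not exist in CPython; the fallback simply probes each byte value through the codec's decoder, which only approximates what the current autodetection does.

```python
import codecs

# Hypothetical private attribute holding a precomputed table; not part of CPython.
HINT_ATTR = "_expat_byte_map"


def autodetect_byte_map(info):
    """Fallback: probe every byte value through the codec's decoder.

    Returns a list of 256 ints: the code point for bytes that decode on
    their own to a single character, or -1 otherwise (for example, the
    lead byte of a multibyte sequence).
    """
    table = []
    for value in range(256):
        try:
            text = info.decode(bytes([value]))[0]
        except UnicodeDecodeError:
            text = ""
        table.append(ord(text) if len(text) == 1 else -1)
    return table


def byte_map_from_info(info):
    """Prefer the precomputed table if the codec provides one."""
    # _is_text_encoding is private and optional, so default to True.
    if not getattr(info, "_is_text_encoding", True):
        raise LookupError("%s is not a text encoding" % info.name)
    hint = getattr(info, HINT_ATTR, None)
    if hint is not None:
        return hint
    # Third-party codecs without the attribute take the slow path.
    return autodetect_byte_map(info)


def byte_map_for(encoding):
    return byte_map_from_info(codecs.lookup(encoding))
```

With something like this, byte_map_for("shift_jis") marks lead bytes with -1, which is where additional flags or tables describing sequence lengths would be needed for real multibyte support.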

----------
nosy: +ncoghlan

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue18059>
_______________________________________

