[issue18059] Add multibyte encoding support to pyexpat
Serhiy Storchaka
report at bugs.python.org
Sat Mar 25 09:39:08 EDT 2017
Serhiy Storchaka added the comment:
Marc-Andre, there are at least two issues about supporting East Asian encodings (issue13612 and issue15877). I think this means that such encodings are used in XML in the wild. The current encoding support (8-bit + UTF-8 + UTF-16) is enough for my needs, but I have never had to deal with East Asian languages.
Currently the CodecInfo object has an optional flag _is_text_encoding. I think we can add more private attributes (flags and precomputed tables) for use with the expat parser. If they are not set (as with third-party encodings), the current autodetection code can be used as a fallback.
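
A rough sketch of the lookup-with-fallback idea, for illustration only. The _is_text_encoding flag already exists on CodecInfo; the _expat_table attribute and the expat_hints() helper are hypothetical names invented here and are not part of any patch or of CPython:

```python
import codecs

def expat_hints(encoding_name):
    """Collect per-codec hints that a parser binding might consult."""
    info = codecs.lookup(encoding_name)
    # Existing private flag; absent on some third-party CodecInfo objects,
    # in which case we assume a text encoding.
    is_text = getattr(info, "_is_text_encoding", True)
    # Hypothetical precomputed table; no codec defines this attribute today.
    table = getattr(info, "_expat_table", None)
    return {
        "name": info.name,
        "is_text": is_text,
        # No precomputed table: fall back to the current autodetection code.
        "use_autodetect": table is None,
    }
```

Since no codec sets the hypothetical attribute, every encoding would take the autodetection path here, which matches the fallback behaviour described above for third-party encodings.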
----------
nosy: +ncoghlan
_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue18059>
_______________________________________