[issue19519] Parser: don't transcode input string to UTF-8 if it is already encoded to UTF-8

Martin v. Löwis report at bugs.python.org
Thu Nov 7 16:44:39 CET 2013


Martin v. Löwis added the comment:

tok->enc and tok->encoding should always have the same value, except that tok->enc gets set earlier.

tok->enc is used when parsing from strings, to remember which codec to use. For file-based parsing, the codec object that gets created already knows which encoding to use; for string-based parsing there is no such object, so tok->enc stores the encoding instead.
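
To illustrate the difference between the two paths, here is a rough Python sketch. It is only a toy model, not the tokenizer's actual C code: the class ToyTokenizerState, its reader attribute, and the functions from_string and from_file are hypothetical names made up for this example. Only the field names enc and encoding mirror the ones discussed above.

```python
import codecs
import io


class ToyTokenizerState:
    """Hypothetical stand-in for the tokenizer state (not the real C struct)."""

    def __init__(self):
        self.enc = None       # set early; read back only on the string path
        self.encoding = None  # set once the encoding is settled
        self.reader = None    # codec reader; used only on the file path


def from_string(source, declared):
    """String-based path: the state itself has to remember the codec name."""
    tok = ToyTokenizerState()
    tok.enc = declared
    text = source.decode(tok.enc)  # decode using the remembered name
    tok.encoding = tok.enc
    return tok, text


def from_file(stream, declared):
    """File-based path: the codec reader object carries the encoding."""
    tok = ToyTokenizerState()
    tok.enc = declared
    tok.reader = codecs.getreader(declared)(stream)
    text = tok.reader.read()       # the reader decodes; tok.enc is not consulted
    tok.encoding = declared
    return tok, text


data = "x = 'é'\n".encode("latin-1")
tok_s, text_s = from_string(data, "latin-1")
tok_f, text_f = from_file(io.BytesIO(data), "latin-1")
```

In this sketch both paths end up with the same decoded text and the same value in enc and encoding; the only difference is whether the decoding step reads the name back from the state or relies on the reader object.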

If the code is to be simplified, unifying the cases of string-based parsing and file-based parsing might be a worthwhile goal.

----------
nosy: +loewis

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue19519>
_______________________________________

