[Python-Dev] Unicode literals in Python 2.7

Adam Bartoš drekin at gmail.com
Sat May 2 21:57:45 CEST 2015


I think I have found where the problem is. The encoding of interactive
input is determined by sys.stdin.encoding, but only if sys.stdin is a
file object (see
https://hg.python.org/cpython/file/d356e68de236/Parser/tokenizer.c#l890 and
the implementation of tok_stdin_decode). For example, on my system
sys.stdin has encoding cp852 by default.

>>> u'á'
u'\xe1' # correct
>>> import sys; sys.stdin = "foo"
>>> u'á'
u'\xa0' # incorrect
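
The two results above can be reproduced without a console by working on
explicit bytes. This is only a sketch: it assumes that the interpreter
falls back to Latin-1 when sys.stdin is not a file object, which is
consistent with the u'\xa0' shown above but is not stated in the linked
source. The names CONSOLE_ENCODING and FALLBACK_ENCODING are
illustrative, not part of any Python API.

```python
# Sketch: mimic how the same console byte is decoded in the two cases.

CONSOLE_ENCODING = 'cp852'     # sys.stdin.encoding reported on the author's system
FALLBACK_ENCODING = 'latin-1'  # assumption: used when sys.stdin is not a file

# Byte string the console would send for the character U+00E1.
raw = u'\xe1'.encode(CONSOLE_ENCODING)

# Case 1: the tokenizer honours sys.stdin.encoding.
correct = raw.decode(CONSOLE_ENCODING)

# Case 2: the same byte is interpreted with the assumed fallback encoding.
incorrect = raw.decode(FALLBACK_ENCODING)

print(repr(raw))
print(repr(correct))
print(repr(incorrect))
```

In cp852 the character is encoded as the single byte 0xA0, so decoding
that byte as Latin-1 gives U+00A0 (no-break space) instead of U+00E1.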

Even if sys.stdin contained a file-like object with a proper encoding
attribute, it wouldn't work, since sys.stdin has to be an instance of <type
'file'>. So the question is whether it is possible to make a file instance
in Python that is also customizable, so that it may call my code. As a
first step, how can the value of the encoding attribute of a file object
be changed?