[Python-ideas] Fix default encodings on Windows

Adam Bartoš drekin at gmail.com
Fri Aug 12 12:24:11 EDT 2016


On Fri Aug 12 11:33:35 EDT 2016, Random832 wrote:

> On Wed, Aug 10, 2016, at 15:08, Steve Dower wrote:
>> That's the hope, though that module approaches the solution differently
>> and may still have uses. An alternative way for us to fix this whole thing
>> would be to bring win_unicode_console into the standard library and use
>> it by default (or probably whenever PYTHONIOENCODING is not specified).
>
> I have concerns about win_unicode_console:
> - For the "text_transcoded" streams, stdout.encoding is utf-8. For the
> "text" streams, it is utf-16.

UTF-16 is the "native" encoding since it corresponds to the wide chars used
by Read/WriteConsoleW. UTF-8 is used only as a signal to the consumers
of PyOS_Readline.
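
To illustrate the point about wide chars, here is a minimal sketch in plain
Python (no console calls; the sample string is arbitrary). Each wide char is
one 16-bit UTF-16 code unit, so a BMP character takes two bytes and a
character outside the BMP takes a surrogate pair:

```python
text = "a\u0161\U0001F600"  # 'a', 'š', and one character outside the BMP

# Little-endian UTF-16 without a BOM, the layout of Windows wide chars
data = text.encode("utf-16-le")

# Number of 16-bit code units, i.e. how many wide chars the text occupies
code_units = len(data) // 2

print(len(text), len(data), code_units)  # prints: 3 8 4
```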

> - There is no object, as far as I can find, which can be used as an
> unbuffered unicode I/O object.

There is no buffer on those wrapping streams because the bytes I have
are not in UTF-8. Adding one would mean a fake buffer that just decodes and
writes to the text stream. AFAIK there is no guarantee that sys.std*
objects have a buffer attribute, and any code relying on that is incorrect.
But I understand that there may be such code and we may want to be
compatible.
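
For code that wants to stay correct when a std stream has no buffer
attribute, a defensive pattern could look like the sketch below. The helper
name write_bytes_as_text is made up for illustration, and io.StringIO stands
in for a replaced sys.stdout:

```python
import io

def write_bytes_as_text(stream, data, encoding="utf-8"):
    """Hypothetical helper: use stream.buffer only when it exists."""
    buffer = getattr(stream, "buffer", None)
    if buffer is not None:
        # A binary layer is available, so pass the bytes through.
        buffer.write(data)
    else:
        # No binary layer: decode and write to the text stream instead.
        stream.write(data.decode(encoding))

fake_stdout = io.StringIO()  # a text stream without a .buffer attribute
write_bytes_as_text(fake_stdout, "Bartoš\n".encode("utf-8"))
```

This is essentially the "fake buffer" idea turned around: the caller does the
decoding rather than the stream pretending to have a binary layer.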


> - raw output streams silently drop the last byte if an odd number of
> bytes are written.

That's not true: it never writes an odd number of bytes, and it returns the
correct number of bytes actually written, so nothing is dropped silently.
If only one byte is given, it raises a ValueError.
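
The behaviour described above can be sketched with a toy raw stream. This is
not the win_unicode_console code; the class name Utf16RawWriter and the
in-memory chunks list (standing in for the console) are assumptions made for
the example:

```python
import io

class Utf16RawWriter(io.RawIOBase):
    """Hypothetical raw stream that accepts UTF-16-LE bytes."""

    def __init__(self):
        super().__init__()
        self.chunks = []  # stands in for the console

    def writable(self):
        return True

    def write(self, b):
        data = bytes(b)
        if len(data) == 1:
            raise ValueError("cannot write half of a UTF-16 code unit")
        # Consume only whole 2-byte code units.
        n = len(data) - (len(data) % 2)
        self.chunks.append(data[:n].decode("utf-16-le", "surrogatepass"))
        return n  # a short count tells the caller the last byte is unwritten

raw = Utf16RawWriter()
written = raw.write("hi".encode("utf-16-le") + b"!")
print(written, raw.chunks)  # prints: 4 ['hi']
```

Because write() reports a short count rather than claiming all bytes were
written, the caller still holds the trailing byte and can retry it together
with the next one.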


> - The sys.stdout obtained via streams.enable does not support .buffer /
> .buffer.raw / .detach
> - All of these objects provide a fileno() interface.

Is this wrong? If I remember correctly, I provide it because of some check
-- maybe in input() -- that decides whether the object is viewed as a stdio
stream.


> - When using os.read/write for data that represents text, the data still
> should be encoded in the console encoding and not in utf-8 or utf-16.

I don't know what to do with this. Generally I wouldn't use bytes to
communicate textual data.


Regards,
Adam Bartoš