Discerning "Run Environment"

Eryk Sun eryksun at gmail.com
Thu May 19 00:18:34 EDT 2022


On 5/18/22, Chris Angelico <rosuav at gmail.com> wrote:
>
> Real solution? Set the command prompt to codepage 65001. Then it
> should be able to handle all characters. (Windows-65001 is its alias
> for UTF-8.)

I suggest using win_unicode_console for Python versions prior to 3.6:

https://pypi.org/project/win_unicode_console

This package uses the console's native 16-bit character support with
UTF-16 text, as does Python 3.6+. Compared to the console's incomplete
and broken support for UTF-8, the console's support for UTF-16 (or
just UCS-2 prior to Windows 10) is far more functional and reliable
across commonly used versions of Windows 7, 8, and 10.
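
As a rough sketch of how a script might opt in, assuming the package
is installed and that its enable() function is the entry point you
want (the helper name enable_unicode_console is made up for this
example):

```python
import sys


def enable_unicode_console():
    """Enable win_unicode_console where it is needed; return True if enabled."""
    # Python 3.6+ already talks to the Windows console in UTF-16,
    # and other platforms have no use for the package.
    if sys.platform != "win32" or sys.version_info >= (3, 6):
        return False
    try:
        import win_unicode_console
    except ImportError:
        # Package not installed; leave the standard streams alone.
        return False
    win_unicode_console.enable()
    return True


enabled = enable_unicode_console()
```

On Python 3.6+ or outside Windows this does nothing and returns False,
so it should be safe to call unconditionally at startup.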

Reading console input as UTF-8 is still limited to ASCII up to and
including Windows 11, which for me is a showstopper. Non-ASCII
characters are read as null bytes, which is useless. Support for
writing UTF-8 to the console screen buffer is implemented correctly in
recent builds of Windows 10 and 11, and is mostly correct in Windows 8.
Prior to Windows 8, writing UTF-8 to the console is badly broken. The
write call returns the number of UTF-16 code units written instead of
the number of bytes written, which confuses buffered writers into
retrying what they take to be unwritten bytes and thus writing a lot
of junk to the screen.
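
To illustrate the failure mode, here is a small simulation. It is not
the console API. buggy_console_write and naive_buffered_write are
hypothetical stand-ins, and I am assuming the old console displays
every byte it is handed but reports the UTF-16 code unit count:

```python
def buggy_console_write(screen, data):
    """Stand-in for the pre-Windows 8 behavior: show all of data,
    but report UTF-16 code units instead of bytes."""
    text = data.decode("utf-8", errors="replace")
    screen.append(text)
    return len(text.encode("utf-16-le")) // 2


def naive_buffered_write(write, data):
    """Typical retry loop: keep writing whatever was reported unwritten."""
    while data:
        written = write(data)
        data = data[written:]


screen = []
payload = ("\u00e9" * 4).encode("utf-8")  # 4 characters, 8 bytes
naive_buffered_write(lambda d: buggy_console_write(screen, d), payload)
shown = "".join(screen)
print(ascii(shown))
```

The four characters occupy 8 bytes but only 4 UTF-16 code units, so
the writer believes half of the buffer is still pending and sends the
tail again, eventually from the middle of a character.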

