cannot open file with non-ASCII filename

eryk sun eryksun at gmail.com
Mon Dec 14 13:45:31 EST 2015


On Mon, Dec 14, 2015 at 10:24 AM, Ulli Horlacher
<framstag at rus.uni-stuttgart.de> wrote:
> With Python 2.7.11 on Windows 7 my users cannot open/read files with
> non-ASCII filenames.
[...]
>     c = msvcrt.getch()

This isn't an issue with Python per se, and the same problem exists in
Python 3, using either getch or getwch. Microsoft's getwch function
isn't designed to handle the variety of ways the console host
(conhost.exe) encodes Unicode keyboard events. Their implementation
calls ReadConsoleInput and looks for a KEY_EVENT. If bKeyDown is set
it grabs the UnicodeChar field.

In an ideal world it would be that simple. However, the console also
supports the Alt+numpad sequences that allow entering characters by
code. So the input event sequence, for example, could be
+VK_MENU, +VK_NUMPAD7, -VK_NUMPAD7, +VK_NUMPAD6, -VK_NUMPAD6,
-VK_MENU, which is an "L". (Here "+" denotes key down and "-" denotes
key up.) The typed code may just be the closest approximation in the
system locale's codepage (ANSI). That doesn't matter because the
actual Unicode codepoint is set in the last event's UnicodeChar field.
That last event is a key-up event, though, so a reader that only
checks key-down events never sees the character.
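
To make the difference concrete, here is a rough sketch in plain
Python that simulates the event sequence instead of calling
ReadConsoleInput, so it runs anywhere. KeyEvent, naive_chars and
alt_aware_chars are hypothetical names made up for illustration; they
are not the CRT's code or pyreadline's. It also assumes the
intermediate events carry a NUL UnicodeChar, which is my assumption
about what the console reports.

```python
from collections import namedtuple

# Virtual-key codes as defined in the Windows headers.
VK_MENU = 0x12      # the Alt key
VK_NUMPAD6 = 0x66
VK_NUMPAD7 = 0x67

# Hypothetical stand-in for the relevant fields of a KEY_EVENT record.
KeyEvent = namedtuple("KeyEvent", "key_down vk unicode_char")


def naive_chars(events):
    """Take UnicodeChar from key-down events only (the getwch-style approach)."""
    return [e.unicode_char for e in events
            if e.key_down and e.unicode_char != "\0"]


def alt_aware_chars(events):
    """Also accept the character carried by the Alt key-up event."""
    chars = []
    for e in events:
        if e.unicode_char == "\0":
            continue  # no character attached to this event
        if e.key_down or e.vk == VK_MENU:
            chars.append(e.unicode_char)
    return chars


# Alt+76 as in the sequence above. Assumption: only the final
# Alt key-up event carries the resulting character.
alt_76 = [
    KeyEvent(True, VK_MENU, "\0"),
    KeyEvent(True, VK_NUMPAD7, "\0"),
    KeyEvent(False, VK_NUMPAD7, "\0"),
    KeyEvent(True, VK_NUMPAD6, "\0"),
    KeyEvent(False, VK_NUMPAD6, "\0"),
    KeyEvent(False, VK_MENU, "L"),
]

# An ordinary key press for comparison: down and up both carry "a".
plain_a = [KeyEvent(True, 0x41, "a"), KeyEvent(False, 0x41, "a")]
```

With these definitions naive_chars(alt_76) comes back empty, while
alt_aware_chars(alt_76) picks up the "L". Both return a single "a"
for plain_a, since the key-up of an ordinary key is still ignored.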

Try using the pyreadline module. IIRC, it does a better job decoding
the events from ReadConsoleInput.


