[Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces

"Martin v. Löwis" martin at v.loewis.de
Sat Apr 25 14:17:14 CEST 2009


Simon Cross wrote:
>> Unfortunately, for Windows, the situation would
>> be exactly the opposite: the byte-oriented interface cannot represent
>> all data; only the character-oriented API can.
> 
> Is the second part of this actually true? My understanding may be
> flawed, but surely all Unicode data can be converted to and from bytes
> using UTF-8?

[I assume that by "second part" you mean the part I left quoted above.]

It's true that UTF-8 could represent all Windows file names. However,
the byte-oriented Windows APIs do not use UTF-8; they use the Windows
ANSI code page (which varies with the installation).
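
As a rough illustration, the sketch below uses Python's own codecs to
model the point; it does not call any Windows API. The file name and
the helper fits_code_page are hypothetical, and cp1252 and cp1253 stand
in for two possible ANSI code pages on different installations.

```python
# Hypothetical file name made of Greek letters.
name = "\u03b1\u03b2\u03b3.txt"

# UTF-8 can always represent the name and round-trips it losslessly.
utf8_bytes = name.encode("utf-8")
assert utf8_bytes.decode("utf-8") == name


def fits_code_page(text, code_page):
    """Return True if every character of text exists in code_page."""
    try:
        text.encode(code_page)
        return True
    except UnicodeEncodeError:
        return False


# Whether a single-byte code page can represent the name depends on
# which code page the installation happens to use.
fits_western = fits_code_page(name, "cp1252")  # Western European
fits_greek = fits_code_page(name, "cp1253")    # Greek
```

Under these assumptions the name is representable in the Greek code
page but not in the Western European one, which is the sense in which
a byte-oriented interface cannot represent all data.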

> Given this, can't people who
> must have access to all files / environment data just use the bytes
> interface?

No, because the Windows API would interpret the bytes in the ANSI code
page rather than as UTF-8, and so would not find the right file.
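
A minimal sketch of that mismatch, again modelled with Python codecs
rather than the real Windows calls. It assumes the ANSI code page is
cp1252; the file name is made up for the example.

```python
# Hypothetical file name as it exists on disk.
name = "caf\u00e9.txt"

# Bytes a caller might pass if it assumed the bytes interface took UTF-8.
utf8_bytes = name.encode("utf-8")

# What a byte-oriented API would take those bytes to mean if it decoded
# them with the ANSI code page (assumed here to be cp1252).
seen_by_ansi_api = utf8_bytes.decode("cp1252")

# The interpreted name differs from the real one, so a lookup would miss.
names_match = (seen_by_ansi_api == name)
```

Here the two-byte UTF-8 sequence for the accented letter is read as two
separate cp1252 characters, so the name being looked up is no longer
the name of the file.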

Regards,
Martin

