[issue3187] os.listdir can return byte strings

Martin v. Löwis report at bugs.python.org
Fri Oct 3 12:43:52 CEST 2008


Martin v. Löwis <martin at v.loewis.de> added the comment:

> Which charset is used when you use bytes filename?

It's the "ANSI" code page, which is a system-wide, admin-modifiable
indirection to some real code page (changing it requires a reboot).
In the API, it's referred to as CP_ACP. It's also related to the
"multi-byte" API, which is why Mark Hammond named the codec that
invokes it "mbcs" (IOW, "mbcs" is always the codec name for the
file system encoding). The specific code page that CP_ACP denotes
can be found with locale.getpreferredencoding(). Using that codec
name (which might be e.g. "cp1252") is different from using "mbcs",
as it goes through a regular (table-driven) Python codec. In
particular, the Python codec will report errors on bytes it cannot
map, whereas the "mbcs" codec will substitute replacement characters.
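
To make the difference concrete, here is a minimal sketch. It assumes
"cp1252" as the example code page, and the helper name decode_strict is
hypothetical, introduced only for illustration. The "mbcs" codec exists
only on Windows, so that part is guarded and its result is not shown.

```python
import locale
import sys


def decode_strict(data, codec="cp1252"):
    """Decode with a table-driven Python codec; return None on failure.

    Hypothetical helper: "cp1252" is only an example of what CP_ACP
    might denote on a given system.
    """
    try:
        return data.decode(codec)
    except UnicodeDecodeError:
        return None


# The codec name the platform reports as preferred (on Windows this
# reflects the ANSI code page, e.g. "cp1252").
preferred = locale.getpreferredencoding()
print("preferred encoding:", preferred)

# 0xE9 is mapped in cp1252, so the table-driven codec decodes it.
print(decode_strict(b"caf\xe9"))

# 0x81 has no mapping in Python's cp1252 table, so the codec reports
# an error, which the helper turns into None.
print(decode_strict(b"\x81"))

if sys.platform == "win32":
    # Windows only: "mbcs" goes through the multi-byte API rather than
    # a Python table and does not raise here by default.
    print(repr(b"\x81".decode("mbcs")))
```

On non-Windows systems the last branch is skipped, and only the
table-driven behaviour is visible.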

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue3187>
_______________________________________

