[Python-Dev] Re: PEP 277: Unicode file name support for Windows NT, was PEP-time ? ...

Martin v. Loewis martin@v.loewis.de
Wed, 16 Jan 2002 08:08:33 +0100


>    Won't this lead to a less useful result as Py_FileSystemDefaultEncoding
> will be NULL on, for example, Linux, so if there are names containing
> non-ASCII characters then it will either raise an exception or stick '?'s in
> the names. So it would be better to use narrow strings there as that will
> pass through all file names.

On Linux, if the user has set LANG to a reasonable value, and the
Python application has invoked setlocale(),
Py_FileSystemDefaultEncoding will not be NULL.
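
As a rough illustration of the locale lookup involved, here is a minimal
Python-level sketch. The helper name current_codeset is hypothetical, and
the code only shows how the codeset follows from the user's locale
settings; it does not claim to reproduce how the interpreter fills in
Py_FileSystemDefaultEncoding.

```python
import locale
import sys


def current_codeset():
    """Hypothetical helper: report the codeset implied by the user's locale."""
    try:
        # Adopt the user's settings (LANG, LC_ALL, ...), like setlocale(LC_ALL, "") in C.
        locale.setlocale(locale.LC_ALL, "")
    except locale.Error:
        # The environment names a locale this system does not support;
        # keep whatever locale is currently active.
        pass
    if hasattr(locale, "nl_langinfo") and hasattr(locale, "CODESET"):
        # Available on POSIX systems only.
        return locale.nl_langinfo(locale.CODESET)
    # Fallback for platforms without nl_langinfo.
    return locale.getpreferredencoding(False)


if __name__ == "__main__":
    print("locale codeset:      ", current_codeset())
    print("file system encoding:", sys.getfilesystemencoding())
```

The output depends entirely on the environment the script runs in, so no
particular value should be expected.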

It still might happen that an individual file name cannot be decoded
from the file system encoding, e.g. if the locale is set to UTF-8, but
you have a Latin-1 file name (created by a different user). In that
exceptional case, I would expect neither an exception nor replacement
characters in the Unicode string, but instead a byte string
*for this specific file name*.
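
The per-name fallback can be sketched roughly as follows. This is only an
illustration of the idea, not the PEP's implementation: decode_filename is
a hypothetical helper, and the sample names are made up for the example.

```python
def decode_filename(raw, encoding):
    """Hypothetical helper: decode one raw file name, falling back to bytes.

    Returns a text string when the name decodes cleanly in the given
    encoding, and the untouched byte string otherwise.
    """
    if encoding is None:
        # No file system encoding known: pass the bytes through unchanged.
        return raw
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError:
        # Only this name stays a byte string; other names are unaffected.
        return raw


# Made-up directory contents in a UTF-8 locale; the last name was
# created by a user working in Latin-1.
names = [
    b"plain.txt",
    "caf\u00e9.txt".encode("utf-8"),
    "caf\u00e9.txt".encode("latin-1"),
]
decoded = [decode_filename(name, "utf-8") for name in names]
```

Here the first two entries come back as text strings, and only the
Latin-1 name is returned as the original bytes, so it can still be passed
back to the file system unchanged.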

Just because there is a rare chance that you cannot meaningfully
interpret a certain file name does not mean that all other
installations have to suffer.

>    You have probably already realised, but Windows 9x will also need a
> Unicode preserving listdir but it will have to encode using mbcs.

Exactly. Unfortunately, we cannot do anything to avoid replacement
characters here, since it is Windows itself that introduces them.
In turn, we know that decoding from "mbcs" will always succeed.

Regards,
Martin