LC_ALL and os.listdir()
Duncan Booth
duncan.booth at invalid.invalid
Thu Feb 24 05:57:37 EST 2005
Martin v. Löwis wrote:
> Serge Orlov wrote:
>> Shouldn't os.path.join do that? If you pass a unicode string
>> and a byte string it currently tries to convert bytes to characters
>> but it makes more sense to convert the unicode string to bytes
>> and return two byte strings concatenated.
>
> Sounds reasonable. OTOH, this would be the only (one of a very
> few?) occasion where Python combines byte+unicode => byte.
> Furthermore, it might be that the conversion of the Unicode
> string to a file name fails as well.
>
> That said, I still think it is a good idea, so contributions
> are welcome.
>
It would probably mess up those systems where filenames really are unicode
strings and not byte sequences.
Windows (when using NTFS) stores all filenames as Unicode, and Python
uses the Unicode API to implement listdir (when given a unicode path). This
means that the filename never gets encoded to a byte string, either by the
OS or by Python. If you use a byte string path, then the filename gets
encoded by Windows and Python just returns what it is given.
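
A rough sketch of that "type in, type out" behaviour, written for
present-day Python 3 rather than the Python of this thread (that is an
assumption on my part: there, str stands in for the unicode string and
bytes for the byte string). The helper name listdir_types and the file
name example.txt are made up for illustration.

```python
import os
import tempfile


def listdir_types(path):
    """Return the set of types of the names os.listdir() yields for path."""
    return {type(name) for name in os.listdir(path)}


with tempfile.TemporaryDirectory() as tmp:
    # Create one file so the directory listing is not empty.
    open(os.path.join(tmp, "example.txt"), "w").close()

    # Text path in: names come back as text (str).
    text_types = listdir_types(tmp)

    # Byte path in: names come back as bytes. os.fsencode() converts the
    # path using the filesystem encoding.
    byte_types = listdir_types(os.fsencode(tmp))

print(text_types)
print(byte_types)
```

The type of the path argument decides the type of the returned names,
which is why silently converting a unicode path to bytes inside
os.path.join could change what a later listdir call hands back.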