Encoding of file names

"Martin v. Löwis" martin at v.loewis.de
Fri Dec 9 17:13:30 EST 2005


Tom Anderson wrote:
> Isn't the key thing that Windows is applying a non-roundtrippable 
> character encoding?

This is a fact, but it is not a key thing. Of course Windows is
applying a non-roundtrippable character encoding. What else could it
do?

> Windows, however, maps that name to the 
> 8-bit string "double bucky backslash vertical bar"

Only if you ask it to. There are two sets of APIs: one that applies
when you ask for byte strings (FindFirstFileA), and one that applies
when you ask for Unicode strings (FindFirstFileW).

In the byte-string case it has to convert the name to the current code
page; in the Unicode case, it doesn't.
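
To illustrate why the conversion cannot round-trip, here is a minimal
sketch that emulates it in pure Python. The code page (cp1252), the
helper names, and the use of "?" as the replacement character are
assumptions made for illustration; the real code page depends on the
system locale, and Windows may pick different substitute characters.

```python
# Hypothetical emulation of the lossy step a byte-string (*A) call needs.
ANSI_CODE_PAGE = "cp1252"  # assumed; depends on the system locale


def to_ansi(name: str) -> bytes:
    """Encode a Unicode name; unencodable characters become b'?'."""
    return name.encode(ANSI_CODE_PAGE, errors="replace")


def from_ansi(raw: bytes) -> str:
    """Decode the byte string back to Unicode."""
    return raw.decode(ANSI_CODE_PAGE)


names = [
    "caf\u00e9.txt",                  # every character exists in cp1252
    "\u0444\u0430\u0439\u043b.txt",   # Cyrillic, not representable in cp1252
]

# Map each original name to what comes back after the conversion.
roundtrip = {name: from_ansi(to_ansi(name)) for name in names}

for original, result in roundtrip.items():
    print(repr(original), "->", repr(result), "same:", original == result)
```

The first name survives the trip; the second comes back as a different
name, and that name does not refer to any file that exists.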

> I don't know what Windows *should* do here. I know it shouldn't do this 
> - this leads to breaking of some very basic invariants about files and 
> directories, and so the kind of confusion utabintarbo suffered.

It always did this, and always will. Applications should stop using the
*A versions of the API. If they continue to do so, they will continue
to get bogus results in border cases.
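
From Python, the same choice shows up in the type of the path you pass
in. A minimal sketch, assuming Python 3: a str argument gives str names
back, and a bytes argument gives bytes names back. The helper name
list_names and the file name example.txt are made up for the example;
which Windows function ends up being called underneath is an
implementation detail of the interpreter and is not shown here.

```python
import os
import tempfile


def list_names(directory: str):
    """Return (unicode_names, byte_names) for the same directory."""
    unicode_names = os.listdir(directory)            # str in, str out
    byte_names = os.listdir(os.fsencode(directory))  # bytes in, bytes out
    return unicode_names, byte_names


with tempfile.TemporaryDirectory() as tmp:
    # Create one file with an ASCII name so both views agree.
    open(os.path.join(tmp, "example.txt"), "w").close()
    unicode_names, byte_names = list_names(tmp)

print(unicode_names)
print(byte_names)
```

Sticking to str paths throughout keeps the names in Unicode from the
directory listing to the call that opens the file.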

The real issue here is that there was a border case when there
shouldn't have been one.

Regards,
Martin


