[Python-Dev] Adding the 'path' module (was Re: Some RFE for review)

Neil Hodgson nyamatongwe at gmail.com
Sun Jul 17 10:26:33 CEST 2005


Martin v. Löwis:

> This appears to be based on the usedDefault return value of
> WideCharToMultiByte. I believe this is insufficient:
> WideCharToMultiByte might convert Unicode characters to
> codepage characters in a lossy way, without using the default
> character. For example, it converts U+0308 (combining diaeresis)
> to U+00A8 (diaeresis) (or something like that, I forgot the
> exact details). So if you have, say, "p-umlaut" (i.e. U+0070
> U+0308), it converts it to U+0070 U+00A8 (in the local code page).
> Trying to use this as a filename later fails.

   There is WC_NO_BEST_FIT_CHARS to defeat that. It says that it will
use the default character if the translation can't be round-tripped.
Available on WIndows 2000 and XP but not NT4. We could compare the
original against the round-tripped as described at
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/intl/unicode_2bj9.asp

   Neil


More information about the Python-Dev mailing list