[python-win32] opening files with names in non-english characters.

Tim Roberts timr at probo.com
Tue Feb 24 22:46:47 CET 2009


Vernon Cole wrote:
> 2) I suspect that Python is NOT returning question mark characters.
> Windows (or Komodo) is probably failing to display some characters
> that Python is returning.

Actually, you're wrong here.  The Windows API returns the file names in
Unicode.  When "os.listdir" is given an 8-bit string, it returns 8-bit
strings, so the Unicode names get converted using some encoding
(presumably the system's ANSI code page).  Characters that do not exist
in that encoding are replaced with "?" during the conversion.  So, by
the time os.listdir returns, the file names really DO have "?"
characters in them.
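A rough sketch of that lossy conversion, for illustration only.  The
file name and the choice of cp1252 are assumptions made for the example;
the code page actually used depends on the system.

```python
# Hypothetical file name: four Cyrillic letters followed by ".txt".
name = u"\u0444\u0430\u0439\u043b.txt"

# Convert to an 8-bit code page that has no Cyrillic letters (cp1252 is
# an assumed stand-in for the system code page).  The "replace" error
# handler substitutes "?" for every character it cannot represent.
ansi_name = name.encode("cp1252", "replace")

print(repr(ansi_name))
```

Once the name has been through a conversion like this, the original
characters cannot be recovered from the 8-bit string.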

> to see what is really in that object. You cannot expect your display
> to have the glyph for every arbitrary unicode code point, so repr()
> will display it with backslash escapes so that you can tell what is
> there.
>   

Yes, but in that case, you'll get an error when you try to print it, not
just "?" characters.

> 3) The documentation for "os.listdir(path)" says: "Changed in version
> 2.3: On Windows NT/2k/XP and Unix, if path is a Unicode object, the
> result will be a list of Unicode objects."  Make sure that your path
> argument is a unicode object.
>   

THIS is the solution to his problem.
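A minimal sketch of that approach.  The helper name "listdir_unicode"
and the temporary demo directory are made up for this example; the only
point is that the path handed to os.listdir is a Unicode object, so the
names come back as Unicode.

```python
import os
import sys
import tempfile

# The text type: "unicode" on Python 2, "str" on Python 3.
text_type = type(u"")

def listdir_unicode(path):
    """Hypothetical helper: always call os.listdir with a Unicode path."""
    if not isinstance(path, text_type):
        # Assumes the 8-bit path is in the file system encoding.
        path = path.decode(sys.getfilesystemencoding())
    return os.listdir(path)

# Demo on a scratch directory containing one file.
demo_dir = tempfile.mkdtemp()
open(os.path.join(demo_dir, "example.txt"), "w").close()

names = listdir_unicode(demo_dir)
print(names)
```

With a Unicode path, names containing characters outside the 8-bit
encoding should come back intact rather than with "?" substituted.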

-- 
Tim Roberts, timr at probo.com
Providenza & Boekelheide, Inc.


