pep 277, Unicode filenames & mbcs encoding &c.

Edward K. Ream edreamleo at charter.net
Tue Oct 21 10:32:00 EDT 2003


Am I reading PEP 277 correctly?  On Windows NT/XP, should filenames always
be converted to Unicode using the mbcs encoding?  For example,

myFile = unicode(__file__, "mbcs", "strict")

This seems to work, and I'm wondering whether there are any other details to
consider.

My experiments with IDLE for Python 2.2 indicate that os.path.join doesn't
work as I expect when one of the arguments is a Unicode string:  everything
before the Unicode string gets thrown away.  But this is probably moot:  PEP
277 implies Python 2.3...

Am I correct that conversions to Unicode (using "mbcs" on Windows) should be
done before passing arguments to os.path.join, os.path.split,
os.path.normpath, etc.?  Presumably os.path functions use the default
system encoding to convert strings to Unicode, which isn't likely to be
"mbcs" or anything else useful :-)
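
To make the question concrete, here is a minimal sketch of the "convert
first, then call os.path" approach.  The helper name to_unicode_path is made
up for illustration; it is not part of PEP 277 or the standard library.  It
assumes that sys.getfilesystemencoding() (added in Python 2.3) names a
suitable encoding for the platform, which is an assumption rather than
something the PEP promises.  The b"..." literals need a more recent Python
than 2.3.

```python
import os
import sys

# Hypothetical helper for illustration only.
def to_unicode_path(name, encoding=None):
    """Return name as a Unicode string, decoding byte strings strictly."""
    text_type = type(u"")  # unicode on Python 2, str on Python 3
    if isinstance(name, text_type):
        return name  # already Unicode; nothing to do
    if encoding is None:
        # getfilesystemencoding() can return None on some older Unix
        # builds, so fall back to the interpreter's default encoding.
        encoding = sys.getfilesystemencoding() or sys.getdefaultencoding()
    return name.decode(encoding, "strict")

# Decode every component first, so os.path only ever sees Unicode strings.
parts = [to_unicode_path(p) for p in (b"data", b"notes.txt")]
path = os.path.normpath(os.path.join(*parts))
```

Passing an explicit encoding, e.g. to_unicode_path(name, "mbcs") on Windows,
overrides the platform guess.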

Are there any situations where some other encoding should be used instead on
Windows?  What about other platforms?  For instance, does Linux allow
non-ASCII file names?  If so, what encoding should be specified when
converting to Unicode?  Thanks.

Edward
--------------------------------------------------------------------
Edward K. Ream   email:  edreamleo at charter.net
Leo: Literate Editor with Outlines
Leo: http://webpages.charter.net/edreamleo/front.html
--------------------------------------------------------------------





