Unicode filenames and os.path.* functions

Alex Martelli aleax at aleax.it
Fri Jan 4 04:46:59 EST 2002


"Michael Ebert" <michael.m.ebert at daimlerchrysler.com> wrote in message
news:14b92abf.0201040134.3f45fbe9 at posting.google.com...
> Hello,
>
>     I'm using Python 2.1.1 and have to work with unicode filenames on
> Windows 2000. The functions in os.path like
> os.path.exists(Unicode-Filename), os.path.getsize(...), etc. don't support
> this ("UnicodeError: ASCII encoding error: ordinal not in range(128)").
> What is the reason for that?

The default encoding method used in Python is 'ascii' -- 7-bit characters
only.  The reason for that is that the 'ascii' subset is the only one that
is in common among the vast majority of single-byte and multi-byte
encodings.  You can change this default for your own installation by
suitably editing site.py or sitecustomize.py (in C:\Python22\Lib if you
have a standard installation of Python 2.2 on Windows, for example).
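
As a rough sketch of what such an edit might look like (the choice of
'latin-1' is only an assumption for illustration; pick whatever encoding
suits your installation), a sitecustomize.py could contain something like:

```python
# sitecustomize.py -- imported automatically by site.py at startup.
import sys

# sys.setdefaultencoding is only reachable while site.py is still running;
# site.py removes it afterwards, so the call has to live here (or in
# site.py itself).  The hasattr guard keeps the file harmless on
# interpreters that do not provide the function at all.
if hasattr(sys, "setdefaultencoding"):
    sys.setdefaultencoding("latin-1")   # assumption: Latin-1 suits your data

# Whatever default encoding is in effect after the (possible) change.
current = sys.getdefaultencoding()
```

Keep in mind that this changes the default for every program run by that
installation, so code that relies on it may behave differently on other
machines.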

The reason os.path.exists, etc, take as their argument a string and not
a Unicode object is that this is the behavior of the underlying C libraries
(which must provide compatibility between different operating systems).
So, when you pass a Unicode object, it's converted to a string (via the
default encoding, if you don't specify one explicitly of course).
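
To illustrate the explicit route, here is a minimal sketch.  The helper
name encoded_path and the 'cp1252' default are my assumptions for the
example, not anything the library defines; on Windows the 'mbcs' codec
(which stands for the system's ANSI code page) may be the more natural
choice, but it exists on Windows only.

```python
import os.path

def encoded_path(upath, encoding="cp1252"):
    """Hypothetical helper: encode a Unicode path to a plain byte string."""
    return upath.encode(encoding)

name = u"M\u00fcller.txt"   # contains one non-ASCII character (u-umlaut)

# Encoding with 'ascii' fails for this name, which is the kind of error
# reported in the original question.
try:
    name.encode("ascii")
    ascii_ok = True
except UnicodeError:
    ascii_ok = False

# Encoding explicitly gives a byte string that the os.path functions accept.
raw = encoded_path(name)
exists = os.path.exists(raw)
```

This only works for names whose characters all exist in the chosen
encoding; anything outside it will still raise UnicodeError.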


> Is there a Unicode supporting file operation library?

I believe Mark Hammond's "win32all" extensions accept Unicode strings
(in particular as filenames and paths) and give you access to all of
the Win32 API file-related functionality.


Alex





