Printing Filenames with non-Ascii-Characters
vincent wehren
vincent at visualtrans.de
Tue Feb 1 16:26:22 EST 2005
Marian Aldenhövel wrote:
> Hi,
>
> I am very new to Python and have run into the following problem. If I do
> something like
>
> dir = os.listdir(somepath)
> for d in dir:
>     print d
>
> The program fails for filenames that contain non-ascii characters.
>
> 'ascii' codec can't encode characters in position 33-34:
If you read this carefully, you'll notice that Python has tried and
failed to *encode* a decoded ( = unicode) string using the 'ascii'
codec. IOW, d seems to be bound to a unicode string, which is unexpected
unless the argument passed to os.listdir (somepath) is a unicode
string, too. (If given a unicode string as argument, os.listdir will
return the names as a list of unicode strings.)
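To see the "type in, type out" behaviour for yourself, here is a minimal sketch. It assumes a current Python 3 interpreter rather than the 2.3.4 from the question: there the text/byte pair is called str/bytes instead of unicode/str, but os.listdir still mirrors the type of its argument. The helper name listdir_result_types and the throwaway directory are made up for illustration.

```python
import os
import tempfile


def listdir_result_types(path):
    """Return the sorted type names of the entries os.listdir(path) yields."""
    return sorted(set(type(name).__name__ for name in os.listdir(path)))


# Build a throwaway directory holding one file so there is something to list.
demo_dir = tempfile.mkdtemp()
open(os.path.join(demo_dir, "example.txt"), "w").close()

# Assumption: Python 3, where text paths are str and byte paths are bytes.
text_types = listdir_result_types(demo_dir)               # text path in
byte_types = listdir_result_types(os.fsencode(demo_dir))  # byte path in
```

If text_types and byte_types differ, the names you print are of whichever kind you asked for, and only the text kind has to be encoded on output.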
If you're printing to the console, modern Pythons will try to guess the
console's encoding (e.g. cp850). I would expect a UnicodeEncodeError if
the print fails because the characters do not map to the console's
encoding, not the error you're seeing.
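The difference between the two codecs can be checked without a console at all. The following is only a sketch, assuming Python 3 syntax; try_encode is a hypothetical helper and the filenames are invented sample data.

```python
def try_encode(text, encoding):
    """Return (encoded_bytes, None) on success or (None, error) on failure."""
    try:
        return text.encode(encoding), None
    except UnicodeEncodeError as exc:
        return None, exc


# Invented sample name containing one non-ASCII character (o-umlaut).
name = u"Aldenh\u00f6vel.txt"

# The 'ascii' codec cannot represent the o-umlaut, so this attempt fails.
ascii_bytes, ascii_err = try_encode(name, "ascii")

# cp850 does contain the o-umlaut, so the same name encodes fine.
cp850_bytes, cp850_err = try_encode(name, "cp850")

# A character that cp850 lacks still fails, this time in the console codec.
cjk_bytes, cjk_err = try_encode(u"\u4e2d.txt", "cp850")
```

A failure with 'ascii' on a name that cp850 can encode suggests that the console's encoding was never consulted in the first place.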
How *are* you running the program? In the console (cmd.exe)? Or from
some IDE?
>
> I have noticed that this seems to be a very common problem. I have read
> a lot of postings regarding it but not really found a solution. Is there
> a simple one?
>
> What I specifically do not understand is why Python wants to interpret the
> string as ASCII at all. Where is this setting hidden?
Don't be tempted to ever change the default encoding in site.py (the
value passed to sys.setdefaultencoding). That setting is site specific,
meaning that programs relying on it may fail on other people's Python
installations if you ever distribute them.
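A program can instead choose the target encoding itself at the point of output. What follows is a hedged sketch, assuming Python 3 syntax; encode_for_stream and its fallback parameter are hypothetical names, not part of any library.

```python
import sys


def encode_for_stream(text, stream=None, fallback="ascii"):
    """Encode text for the given stream without relying on a global default.

    Uses the stream's own 'encoding' attribute when it has one, otherwise
    the fallback. Unmappable characters become '?' instead of raising.
    """
    if stream is None:
        stream = sys.stdout
    # Assumption: a missing or None encoding means "unknown", so fall back.
    encoding = getattr(stream, "encoding", None) or fallback
    return text.encode(encoding, "replace")
```

The result is a byte string that is safe to write to that stream, and the decision stays in your program instead of in one machine's site.py.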
--
Vincent Wehren
>
> I am running Python 2.3.4 on Windows XP and I want to run the program on
> Debian sarge later.
>
> Ciao, MM