Help needed with filenames

pdenize denize.paul at gmail.com
Sun May 10 03:32:07 EDT 2009


I have a program that reads files using glob and puts them into an XML
file in UTF-8 using
  unicode(file, sys.getfilesystemencoding()).encode("UTF-8")
This all works fine including all the odd characters like accents etc.
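
For reference, the conversion described above can be sketched as a small helper. This is only an illustration, not the actual program: the function name filename_to_utf8 and its fs_encoding parameter are made up here, and it assumes the names arrive as byte strings, as they do from glob with a byte-string pattern.

```python
import sys

def filename_to_utf8(raw_name, fs_encoding=None):
    """Re-encode a byte-string filename as UTF-8 bytes.

    raw_name is assumed to be bytes in the filesystem encoding;
    fs_encoding is a hypothetical override, mainly useful for testing.
    """
    fs_encoding = fs_encoding or sys.getfilesystemencoding()
    # Decode to Unicode text first, then encode that text as UTF-8.
    return raw_name.decode(fs_encoding).encode("utf-8")
```

The decode step is equivalent to the unicode(file, encoding) call above; writing it as a method call keeps the two stages visible.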

However I also print what it is doing and someone pointed out that
many characters are not printing correctly in the Windows command
window.

I have tried to figure this out but simply get lost in the encoding
translation.
If I just use print filename, the output has characters that don't
match the ones in the filename (I sort of expected that).
So I tried print unicode(file, sys.getfilesystemencoding()), expecting
the correct result, but instead got:
UnicodeEncodeError: 'charmap' codec can't encode character u'\u2013'
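
The error suggests that the console's codec has no byte for u'\u2013' (an en dash). One possible workaround, sketched here under assumptions, is to encode for the console with the "replace" error handler so that unsupported characters become '?' instead of raising. The helper name console_safe is hypothetical, and "cp437" is only a guess at the console code page; the real value would normally come from sys.stdout.encoding.

```python
import sys

def console_safe(text, encoding=None):
    """Return text with characters the target encoding lacks replaced by '?'.

    encoding defaults to the console encoding reported by sys.stdout,
    falling back to ASCII when none is reported (e.g. output is piped).
    """
    encoding = encoding or getattr(sys.stdout, "encoding", None) or "ascii"
    # Round-trip through the console codec; "replace" substitutes '?'
    # for anything the codec cannot represent instead of raising.
    return text.encode(encoding, "replace").decode(encoding)

# "cp437" is an assumed code page, used here so the result is predictable.
print(console_safe(u"Report \u2013 final.txt", "cp437"))
```

This avoids the exception but loses information: the dash prints as '?', while accented letters that the code page does contain pass through unchanged.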

I did notice that when a Windows command window does a directory
listing of these files, the characters seem to be translated into close
approximations (long dash to minus, curly double quotes to plain
double quotes), while many of the accented characters are retained.  I
looked at translate to do this but did not know how to determine which
characters to map.
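
A rough sketch of the translate idea follows. The characters that need mapping can be found by trying to encode each one in the console's encoding and collecting the failures. The APPROXIMATIONS table is a hand-picked assumption, not the table Windows itself uses, and the function names and the "cp437" code page are likewise hypothetical.

```python
# Hand-picked, hypothetical substitutions: code point -> ASCII stand-in.
APPROXIMATIONS = {
    0x2013: u"-",    # en dash
    0x2014: u"-",    # em dash
    0x2018: u"'",    # left single quote
    0x2019: u"'",    # right single quote
    0x201C: u'"',    # left double quote
    0x201D: u'"',    # right double quote
    0x2026: u"...",  # ellipsis
}

def unencodable(text, encoding):
    """Return the set of characters in text that encoding cannot represent."""
    bad = set()
    for ch in text:
        try:
            ch.encode(encoding)
        except UnicodeEncodeError:
            bad.add(ch)
    return bad

def approximate(text, encoding):
    """Substitute known look-alikes, then replace anything still unencodable."""
    text = text.translate(APPROXIMATIONS)
    return text.encode(encoding, "replace").decode(encoding)

name = u"\u201cD\xe9j\xe0 vu\u201d \u2013 notes.txt"
print(sorted(unencodable(name, "cp437")))
print(approximate(name, "cp437"))
```

Running unencodable over the real filenames would show which entries the table actually needs; anything not in the table still falls back to '?'.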

Can anyone tell me what I should be doing here?


