LANG, locale, unicode, setup.py and Debian packaging

Donn donn.ingle at gmail.com
Sun Jan 13 13:48:08 EST 2008


Well, that didn't take me long... Can you help with this situation?
I have a file named "MÖgul.pog" in this directory:
/home/donn/.fontypython/

I set my LANG=C

Now, I want to open that file from Python, and I create a path with 
os.path.join() and an os.listdir() which results in this byte string:
paf = '/home/donn/.fontypython/M\xc3\x96gul.pog'

I *think* that the situation is impossible because the system cannot resolve 
the correct filename (due to the locale being ANSI and the filename being in 
another encoding), but I am not 100% sure.
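
The mismatch can be checked on the bytes alone, without touching the filesystem. This is a minimal sketch, assuming Python 3 syntax (newer than this thread) and assuming the name on disk was written as UTF-8, which the bytes \xc3\x96 suggest. The variable names are illustrative only.

```python
# The filename as reported by os.listdir(), kept as raw bytes.
raw_name = b"M\xc3\x96gul.pog"

# LANG=C typically implies an ASCII encoding, which cannot decode these bytes.
try:
    raw_name.decode("ascii")
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False

# Decoding as UTF-8 (an assumption about how the name was written) succeeds.
utf8_name = raw_name.decode("utf-8")

print(ascii_ok)                        # False
print(utf8_name == "M\u00d6gul.pog")   # True
```

So the bytes themselves are fine. What fails is any step that tries to turn them into text using the ASCII codec.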

So, I have been trying combinations of open:
1. f = codecs.open( paf, "r", "utf8" )
I had hopes for this one.
2. f = codecs.open( paf, "r", locale.getpreferredencoding())
3. f = open( paf, "r")

But none will open it - all get a UnicodeDecodeError. This aligns with my 
suspicions, but I wanted to bounce it off you to be sure.

This does not really mesh with our previous discussion about opening all files 
as bytestrings, since it means admitting failure to open this file.
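
For comparison, here is a minimal sketch of the bytes-only approach, assuming Python 3 and using a throwaway temporary directory rather than the real ~/.fontypython/. The idea is that a path which stays a byte string from os.listdir() through os.path.join() to open() never needs decoding, so the locale should not matter. In Python 2, one possible source of a UnicodeDecodeError is a unicode object being mixed into the join, which triggers an implicit ASCII decode of the byte string.

```python
import os
import tempfile

# Work in a throwaway directory instead of the real ~/.fontypython/.
base = os.fsencode(tempfile.mkdtemp())

# Create a file whose name contains the UTF-8 bytes for "Ö".
with open(os.path.join(base, b"M\xc3\x96gul.pog"), "wb") as f:
    f.write(b"font list\n")

# Passing bytes to os.listdir() makes it return bytes names, undecoded.
names = os.listdir(base)

# Joining bytes with bytes keeps the path a byte string throughout.
paf = os.path.join(base, names[0])

# The built-in open() accepts the bytes path as-is.
with open(paf, "rb") as f:
    data = f.read()

print(type(paf).__name__, data)
```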

Also, about this codecs.open(filename, "r", <encoding>) function:
1. Does it imply that the file will be opened with the name as given (whatever 
its type, i.e. bytestring or unicode) and its contents handled as <encoding>?
2. Or does it imply that the filename itself will be encoded via <encoding>, 
with the contents also handled as <encoding>?
It's fuzzy: how is the filename handled?
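
As far as I can tell, reading 1 is the right one: codecs.open() hands the filename to the built-in open() unchanged, and the encoding argument applies only to the data read from or written to the file. This is a minimal sketch to test that, assuming Python 3 and a temporary directory. The file name contains non-ASCII bytes while the contents are plain ASCII, so if "ascii" were applied to the name the call would fail.

```python
import codecs
import os
import tempfile
import warnings

base = os.fsencode(tempfile.mkdtemp())
paf = os.path.join(base, b"M\xc3\x96gul.pog")  # non-ASCII bytes in the name

# The contents are pure ASCII even though the name is not.
with open(paf, "wb") as f:
    f.write(b"hello\n")

with warnings.catch_warnings():
    # Newer Python versions may warn that codecs.open() is deprecated.
    warnings.simplefilter("ignore", DeprecationWarning)
    # "ascii" is applied to the data read from the file, not to paf.
    with codecs.open(paf, "r", "ascii") as f:
        text = f.read()

print(repr(text))
```

If the read succeeds, the encoding argument had no say in how the filename was resolved.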

\d


-- 
He has Van Gogh's ear for music. -- Billy Wilder

Fonty Python and other dev news at:
http://otherwiseingle.blogspot.com/


