4Suite / non-ASCII filenames

Martin v. Loewis martin at v.loewis.de
Sat Oct 12 18:50:51 EDT 2002


Thomas Korb <Doc at goodweb.de> writes:

> The file names come from various sources; e.g Tk - file select
> boxes, but also default filenames (defined in the script), 
> file names passed as arguments on the commandline (Linux only) etc. 

When they come from Tkinter, I doubt they are byte strings: Tkinter
will return Unicode strings in this case.

For things coming from the command line, it would be good to convert
them to Unicode.
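
A minimal sketch of such a conversion, assuming the byte strings are
in the encoding of the user's locale. The helper name to_text is made
up for illustration, and the code is written for current Python, where
bytes is the byte-string type:

```python
import locale


def to_text(name, encoding=None):
    """Return name as a Unicode string, decoding byte strings explicitly."""
    if isinstance(name, bytes):
        if encoding is None:
            # Assumption: the bytes are in the user's locale encoding.
            encoding = locale.getpreferredencoding()
        return name.decode(encoding)
    # Already a Unicode string (e.g. what Tkinter hands back): keep as-is.
    return name
```

Calling to_text on every file name at the point where it enters the
program means the rest of the script only ever deals with Unicode
strings.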

> > Can you arrange it to be a Unicode string instead?
> >
> 
> Would be no problem (but passing a Unicode-filename to the XSLT-
> processor leads to a similar error).

Yes. It would be good if you could report the precise place where this
causes an error. I'll assume that it is when passing the Unicode
string to the open call.

In this case, please call locale.setlocale(locale.LC_CTYPE, ""). If you
have set LANG to, say, de_DE.ISO-8859-1, this will cause the file
system default encoding to be set to latin-1. In turn, open will
accept Unicode file names as long as they can be converted to latin-1,
independently of the system default encoding.
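
As a rough sketch of the two pieces involved: the first helper makes
the setlocale call and tolerates an unusable environment setting, the
second checks whether a Unicode file name can be converted to a given
encoding at all. Both helper names are hypothetical, and latin-1 is
only the example encoding from above:

```python
import locale


def use_environment_locale():
    """Adopt LC_CTYPE from the environment; return the locale name or None."""
    try:
        return locale.setlocale(locale.LC_CTYPE, "")
    except locale.Error:
        # The environment names a locale this system does not provide.
        return None


def fits_encoding(name, encoding="latin-1"):
    """Return True if the Unicode string name can be encoded in encoding."""
    try:
        name.encode(encoding)
    except UnicodeError:
        return False
    return True
```

A name such as u"B\xe4r.xml" fits latin-1, while a name containing the
euro sign (u"\u20ac") does not, so the latter would still be rejected
under a latin-1 locale.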

> But I have the feeling that they rely on the default encoding when
> dealing with filenames. And since I do not want to use sitecustomize.py,
> I do not know how to solve this problem.

I assume they mix the URI with some Unicode object, so it would be
good if the URI were already a Unicode string.

> (Is there a good reason why sys.setdefaultencoding() is not allowed
> in scripts? This is not the first time that I would need something
> like that.)

If sys.setdefaultencoding were available, people could use it to
break libraries: libraries may not expect the encoding of their
byte strings to change from under them. Not offering this feature
ought to make life simpler for library authors.

Also, I'm personally convinced that there is no need to have the
system default encoding at any other value but ASCII; problems *can*
be solved, in an elegant way, without having to change global
variables.

Regards,
Martin



