encoding of sys.argv ?

"Martin v. Löwis" martin at v.loewis.de
Mon Oct 23 19:05:33 EDT 2006


Jiba wrote:
> I use a Linux box, with French UTF-8 locales and a UTF-8 filesystem.
> sys.getdefaultencoding() is "ascii" and sys.getfilesystemencoding()
> is "utf-8". However, sys.argv is neither in ASCII (since I can pass
> French accented characters), nor in UTF-8. It seems to be encoded
> in "latin-1", but why?

Let me second Leo Kislov's analysis. The arguments should be encoded
in the encoding returned by locale.getpreferredencoding(), which
should be UTF-8 on your system. Are you *sure* they aren't encoded
this way?
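
One way to check is to look at how the same bytes read under each
candidate encoding. The following is only a minimal sketch, and it
assumes Python 3 (in the Python 2 of this thread, the items of
sys.argv were already byte strings). The helper name decodings() is
made up for illustration; the two byte strings are "Löwis" encoded as
UTF-8 and as Latin-1.

```python
import locale


def decodings(raw, encodings=("utf-8", "latin-1")):
    """Map each encoding name to the decoded text, or None if decoding fails."""
    result = {}
    for name in encodings:
        try:
            result[name] = raw.decode(name)
        except UnicodeDecodeError:
            result[name] = None
    return result


# Encoding requested by the current locale (depends on the environment).
print(locale.getpreferredencoding())

# Hypothetical sample data: the same name in two different encodings.
utf8_bytes = decodings(b"L\xc3\xb6wis")   # bytes of a UTF-8 argument
latin1_bytes = decodings(b"L\xf6wis")     # bytes of a Latin-1 argument
```

UTF-8 bytes misread as Latin-1 still decode, but each accented letter
turns into two characters. Latin-1 bytes, on the other hand, usually
fail to decode as UTF-8 at all, which makes the two cases easy to
tell apart.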

On my Debian system, I get this:

martin@mira:~/tmp$ echo $LANG
de_DE.UTF-8
martin@mira:~/tmp$ cat a.py
import sys
print sys.argv

martin@mira:~/tmp$ python a.py Martin v. Löwis
['a.py', 'Martin', 'v.', 'L\xc3\xb6wis']

So clearly, my terminal application and shell pass them as UTF-8,
as they should: \xc3\xb6 is the UTF-8 encoding of "ö". The terminal
application is KDE Konsole; the shell is bash. The shell very likely
passes the arguments through exactly as read from the terminal, so
if you are not seeing UTF-8, your terminal is probably misconfigured.
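
To see what the terminal and shell really hand over, one can print
the raw bytes of each argument. This is a rough sketch assuming
Python 3 on a POSIX system, where os.fsencode() recovers the original
bytes from the already-decoded sys.argv strings; describe_args() is a
made-up helper name, not part of any library.

```python
import os
import sys


def describe_args(args):
    """Return a (raw_bytes, is_valid_utf8) pair for each argument."""
    report = []
    for arg in args:
        # Assumption: Python 3 on POSIX, where this round-trips to the
        # bytes the shell originally passed.
        raw = os.fsencode(arg)
        try:
            raw.decode("utf-8")
            valid = True
        except UnicodeDecodeError:
            valid = False
        report.append((raw, valid))
    return report


if __name__ == "__main__":
    for raw, valid in describe_args(sys.argv[1:]):
        print(repr(raw), "valid UTF-8" if valid else "not valid UTF-8")
```

If an accented argument is reported as not valid UTF-8 while $LANG
names a UTF-8 locale, the terminal is the first place to look.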

Regards,
Martin


