system(...) and unicode

andrew at acooke.org
Mon May 22 16:57:40 EDT 2006


Hi,

I'm seeing the following error:

  ...
  system(cmd)
  UnicodeEncodeError: 'ascii' codec can't encode character u'\xe3' in position 57: ordinal not in range(128)

and I think I vaguely understand what's going on: "cmd" is constructed
to include a file name that is UTF-8 encoded (I think so, because it
shows accents when I "ls" the file; this is on a recent SUSE Linux with
Python 2.4.2).  So I guess I need to specify the encoding used, right?
But (1) I don't know how to do this; (2) this string came from the
filesystem in the first place, so why isn't it handled in an
internally consistent way?; and (3) I have no explicit unicode strings
in my program.
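
A minimal sketch of the failure described above, assuming that "cmd" has
ended up as a unicode string and that Python 2 converts it with the
default 'ascii' codec before handing it to system().  The file name and
command are hypothetical; the explicit encode("ascii") stands in for
that implicit conversion so the snippet also runs on current Python:

```python
# Hypothetical file name containing U+00E3, the character named in the traceback.
name = u"S\xe3o Paulo.txt"
cmd = u"ls -l " + name  # hypothetical command line

message = None
position = None
try:
    # Stand-in for the implicit conversion assumed to happen inside system().
    cmd.encode("ascii")
except UnicodeEncodeError as exc:
    message = str(exc)
    position = exc.start  # index of the first character that cannot be encoded
```

Here "position" is the index of the accented character within "cmd",
which corresponds to the "position 57" reported in the real traceback.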

Looking at the docs (unicode howto) it seems like maybe I need to do
  system(cmd.encode(...))
but how do I know which locale's encoding to use, and what if cmd isn't
a unicode string (I didn't make it one!)?  I could force a decoding as
in the unicode howto ("filename.decode(encoding)"), but that seems to be
happening already (or is it not? Am I wrong to assume that?).

So can someone help me or point me to some more detailed instructions,
please?  At the command line, "locale" reports en_GB.UTF-8, but I'd like
this code to work whatever the locale is, if that makes sense.

Sorry for being stupid,
Andrew



