Yet another unicode WTF

Ron Garret rNOSPAMon at flownet.com
Thu Jun 4 21:18:24 EDT 2009


Python 2.6.2 on OS X 10.5.7:

[ron at mickey:~]$ echo $LANG
en_US.UTF-8
[ron at mickey:~]$ cat frob.py 
#!/usr/bin/env python
print u'\u03BB'

[ron at mickey:~]$ ./frob.py 
ª
[ron at mickey:~]$ ./frob.py > foo
Traceback (most recent call last):
  File "./frob.py", line 2, in <module>
    print u'\u03BB'
UnicodeEncodeError: 'ascii' codec can't encode character u'\u03bb' in 
position 0: ordinal not in range(128)


(That's supposed to be a small Greek lambda, but I'm using a 
brain-damaged news reader that won't let me set the character encoding.  
It shows up correctly in my terminal.)

According to what I thought I knew about Unix (and I had fancied myself 
a bit of an expert until just now) this is impossible.  Python is 
obviously picking up a different default encoding when its output is 
being redirected to a file, but I always thought one of the fundamental 
invariants of Unix processes was that there's no way for a process to 
know what's on the other end of its stdout.
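
For what it's worth, that invariant may not hold as stated: a process 
can ask the kernel what kind of object its stdout descriptor refers to, 
using isatty() and fstat(). A minimal sketch of such a check follows. 
The helper name describe_fd is made up for illustration, and whether 
Python's print makes its encoding decision exactly this way is an 
assumption on my part, not something confirmed above.

```python
import os
import stat
import sys


def describe_fd(fd):
    """Classify what a file descriptor is connected to (hypothetical helper)."""
    # isatty() is true only when the descriptor refers to a terminal device.
    if os.isatty(fd):
        return "tty"
    # fstat() reports the file type of whatever the descriptor points at.
    mode = os.fstat(fd).st_mode
    if stat.S_ISFIFO(mode):
        return "pipe"
    if stat.S_ISREG(mode):
        return "file"
    return "other"


def report_stdout():
    """Write the kind of stdout and its encoding to stderr, so the
    report stays visible even when stdout itself is redirected."""
    kind = describe_fd(sys.stdout.fileno())
    encoding = getattr(sys.stdout, "encoding", None)
    sys.stderr.write("stdout is a %s, encoding=%r\n" % (kind, encoding))
```

Calling report_stdout() from a script run once directly and once with 
"> foo" should show "tty" in the first case and "file" in the second. 
On Python 2 I would expect sys.stdout.encoding to be None in the 
redirected case, which would explain the fallback to the ascii codec.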

Clues appreciated.  Thanks.

rg


