How does unicode() work?

Carsten Haese carsten at uniqsys.com
Wed Jan 9 09:14:51 EST 2008


On Wed, 2008-01-09 at 13:44 +0100, Fredrik Lundh wrote:
> Robert Latest wrote:
> 
> > Here's a test snippet...
> > 
> > import sys
> > for k in sys.stdin:
> >     print '%s -> %s' % (k, k.decode('iso-8859-1'))
> > 
> > ...but it barfs when actually fed with iso8859-1 characters. How is this 
> > done right?
> 
> it's '%s -> %s' % (byte string, unicode string) that barfs.  try doing
> 
> import sys
> for k in sys.stdin:
>      print '%s -> %s' % (repr(k), k.decode('iso-8859-1'))
> 
> instead, to see what's going on.

If that really is the line that barfs, wouldn't it make more sense to
repr() the unicode object in the second position?

import sys
for k in sys.stdin:
    print '%s -> %s' % (k, repr(k.decode('iso-8859-1')))

Also, I'm not sure if the OP has told us the truth about his code and/or
his error message. The implicit str() call done by formatting a unicode
object with %s would raise a UnicodeEncodeError, not the
UnicodeDecodeError that the OP is reporting. So either I need more
coffee or there is something else going on here that hasn't come to
light yet.
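
To make the distinction between the two exceptions concrete, here is a minimal sketch. The names raw, text and error_name are made up for illustration, and the snippet only exercises the explicit encode and decode steps with the ASCII codec. It does not show what the % operator itself does with mixed str and unicode arguments.

```python
# Illustrative sketch only; raw, text and error_name are hypothetical names.
raw = b'\xe9'                      # e-acute as a single ISO-8859-1 byte
text = raw.decode('iso-8859-1')    # the same character as a unicode object

def error_name(func):
    """Call func and return the name of the UnicodeError it raises, if any."""
    try:
        func()
    except UnicodeError as exc:
        return type(exc).__name__
    return None

# unicode -> bytes with the ASCII codec fails while *encoding*
encode_error = error_name(lambda: text.encode('ascii'))

# bytes -> unicode with the ASCII codec fails while *decoding*
decode_error = error_name(lambda: raw.decode('ascii'))

print(encode_error)
print(decode_error)
```

So the exception name alone tells you the direction of the failed conversion: an encode error means something tried to turn a unicode object into bytes, and a decode error means something tried to turn bytes into a unicode object.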

-- 
Carsten Haese
http://informixdb.sourceforge.net




