[Tutor] unicode() bug?

A.M. Kuchling amk at amk.ca
Sun Nov 9 17:45:41 EST 2003


On Sun, Nov 09, 2003 at 05:07:37PM -0500, Jonathan Soons wrote:
> output.write(unicode(txt, "iso-8859-1", "ignore"))

unicode() returns a Unicode string, but the write() method wants an 8-bit
string.  You can see this by splitting the call into two statements:

     u = unicode(...)
     output.write(u)
	 
The conversion to Unicode is ignoring errors as you expect, but write()
doesn't know what encoding to use for writing out the Unicode string, so it
falls back to the default encoding (ASCII) and fails on any non-ASCII
character.  So, convert it to 8-bit via some encoding and be explicit:

     u = unicode(...)
     output.write(u.encode('iso-8859-1'))
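
For a version you can run end to end, here is a minimal sketch of the same
decode-then-encode round trip.  The sample bytes and the in-memory stream
standing in for the output file are my assumptions, not part of the original
script, and txt.decode(...) is used as the equivalent of unicode(txt, ...) so
that the snippet also runs on interpreters that have no unicode() built-in:

```python
import io

# Hypothetical sample input: the ISO-8859-1 bytes for "cafe" with an accented e.
txt = b"caf\xe9"

# Decode the 8-bit string into a Unicode string, ignoring undecodable bytes.
# This corresponds to unicode(txt, "iso-8859-1", "ignore") in the original.
u = txt.decode("iso-8859-1", "ignore")

# Stand-in for the real output file: an in-memory binary stream (assumption).
output = io.BytesIO()

# Encode explicitly before writing, so write() never has to guess an encoding.
output.write(u.encode("iso-8859-1"))

# The bytes that would have ended up in the file.
written = output.getvalue()
```

Since the data is decoded and re-encoded with the same codec, the bytes
written should match the input; pick a different codec in the encode() call
(for example 'utf-8') if the output needs another encoding.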

See http://effbot.org/zone/unicode-objects.htm for some additional notes.

--amk


