[Tutor] More string conversions.

Magnus Lyckå magnus@thinkware.se
Sun Jul 6 18:45:02 2003


At 21:44 2003-07-06 +0200, j2 wrote:
>Traceback (most recent call last):
>   File "./test.py", line 28, in ?
>     print s
>UnicodeError: ASCII encoding error: ordinal not in range(128)
>cookiemonster:~/scripts#

Your default character encoding is ASCII. That means you can't
print a unicode string containing non-ASCII characters. This is
Python, you know: "In the face of ambiguity, refuse the temptation
to guess." and "Explicit is better than implicit." (See "import this".)

You need to tell Python what encoding to use when you print the
unicode string. One way to do this is:

print s.encode('iso-8859-1')
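
To make the explicit approach concrete, here is a minimal sketch. It is
written for Python 3 (an assumption on my part; the rest of this message
uses Python 2 syntax), where every str is a unicode string and .encode()
returns bytes. The sample string and the variable names are made up for
illustration.

```python
# A text string containing one non-ASCII character (U+00E5, "a with ring").
s = "Lyck\u00e5"

# ASCII has no code for that character, so an explicit ASCII encode fails.
try:
    s.encode("ascii")
except UnicodeEncodeError as err:
    print("ASCII cannot represent this string:", err.reason)

# Naming an encoding that does contain the character works.
latin1_bytes = s.encode("iso-8859-1")  # one byte per character
utf8_bytes = s.encode("utf-8")         # two bytes for the last character
print(latin1_bytes, utf8_bytes)
```

The same string gives different bytes under different encodings, which
is why Python wants to be told which one to use.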

I seem to remember that there was some way to do that without
having to call .encode(encoding) for each print statement, but I
don't remember how. You can change your default encoding in
'site.py', but then your program might not work on any computer
other than your own, and it might break when you upgrade
Python, I guess.
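
One candidate for that half-remembered shortcut is the standard codecs
module: codecs.getwriter(encoding) returns a stream-writer class that
encodes everything written through it. The sketch below is for Python 3
(an assumption; this message otherwise uses Python 2 syntax) and uses an
in-memory io.BytesIO as a stand-in for the real output stream, so the
names raw, out and captured are only illustrative.

```python
import codecs
import io

# Stand-in for a byte-oriented output stream.
raw = io.BytesIO()

# getwriter() returns a StreamWriter class; calling it wraps the stream.
out = codecs.getwriter("iso-8859-1")(raw)

# Text written to the wrapper is encoded before it reaches the stream.
out.write("Lyck\u00e5\n")

captured = raw.getvalue()
print(captured)
```

In a real script the wrapped stream would be something like
sys.stdout.buffer rather than a BytesIO object.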

After a little experimentation, I seem to have found a way...
I'm sure there is a more kosher way to do this, but my code
seems to work...

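from email.Header import decode_header

# An RFC 2047 encoded header, folded over two lines
raw_text = """=?iso-8859-1?Q?Hej=2C_tack_f=F6r_bra_team_work_p=E5_ndc5_i_fredags_efterm?=
      =?iso-8859-1?Q?iddag!_/Rolf?="""

# decode_header returns a list of (string, encoding) pairs
header = decode_header(raw_text)

result = u''

for text, encoding in header:
    # encoding is None for parts that were not encoded; treat those as ASCII
    result += text.decode(encoding or 'ascii')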

# The rest is new...

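import sys

class UnicodeToEncoding:
    """File-like wrapper that encodes unicode strings before writing them."""
    def __init__(self, encoding, file_handle):
        self.enc = encoding
        self.f = file_handle

    def write(self, data):
        # print calls write() for the text and for the trailing newline
        return self.f.write(data.encode(self.enc))

# From here on, everything printed is encoded as ISO-8859-1
sys.stdout = UnicodeToEncoding('iso-8859-1', sys.stdout)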

print result
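
For comparison, a rough Python 3 sketch of the header-decoding part
follows. Assumptions: the module is spelled email.header in Python 3,
decode_header may hand back either bytes or str parts, and
header_to_text is a hypothetical helper name, not part of the email
package.

```python
from email.header import decode_header

# The same RFC 2047 header as above, folded over two lines.
raw_text = (
    "=?iso-8859-1?Q?Hej=2C_tack_f=F6r_bra_team_work_p=E5_ndc5_i_fredags_efterm?=\n"
    "      =?iso-8859-1?Q?iddag!_/Rolf?="
)


def header_to_text(raw):
    """Hypothetical helper: join the decoded parts of a header into one str."""
    parts = []
    for text, encoding in decode_header(raw):
        # Encoded parts arrive as bytes; unencoded headers can arrive as str.
        if isinstance(text, bytes):
            text = text.decode(encoding or "ascii")
        parts.append(text)
    return "".join(parts)


result = header_to_text(raw_text)

# ascii() escapes the non-ASCII characters, so this prints on any terminal.
print(ascii(result))
```

Once the text is a proper str, how it is encoded on output is a
separate decision, made when it is written.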


--
Magnus Lycka (It's really Lyckå), magnus@thinkware.se
Thinkware AB, Sweden, www.thinkware.se
I code Python ~ The Agile Programming Language