[Tutor] unicode problem

Michael Janssen Janssen@rz.uni-frankfurt.de
Mon Apr 28 04:50:24 2003


On Mon, 28 Apr 2003, Paul Tremblay wrote:

> When I use Sax, I am getting a unicode problem.
>
> If I put an "ö" in my file, then sax translates this to a
> unicode string:
>
> u'?' (some value)
>
> I then cannot parse the string. If I try to add to it:
>
> my_string = my_string + '\n'
>
> Then I get this error:
>
>
>  File "/home/paul/lib/python/paul/format_txt.py", line 159, in r_border
>     line = line + filler + padding + border + "\n"
> UnicodeError: ASCII decoding error: ordinal not in range(128)

I don't know whether this fits your situation, but encoding the unicode
string explicitly can solve "not in range" errors:

>>> u = u'ä'
>>> u
u'\xe4'
>>> print u

Traceback (most recent call last):
  File "<stdin>", line 1, in ?
UnicodeError: ASCII encoding error: ordinal not in range(128)
>>> print u.encode("latin-1")
ä

Here encode() converts the unicode string to an ordinary string in the
given encoding.
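For the traceback in the question, the same idea might look roughly like
this. This is only a sketch and it assumes a current Python 3 (the session
above is Python 2), where plain strings are already unicode and encode()
returns bytes. The names line and encoded are made up for illustration:

```python
# Python 3 sketch: str is unicode text, bytes is the encoded form.
line = "sch\u00f6n"              # text containing o-umlaut (U+00F6)

# Build the line as text; no implicit ASCII conversion takes place here.
line = line + "\n"

# Encode once, at the point where bytes are actually needed
# (for example just before writing to a file opened in binary mode).
encoded = line.encode("latin-1")

print(repr(encoded))
```

The point is to keep everything unicode while assembling the line and to
encode only at the output boundary.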

The encoding can also be named iso-8859-1. I suppose the legal encoding
names are those for which a module exists under
/path/to/python/lib/encodings. encode takes a second parameter that
controls how errors are handled; compare help("".encode).
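A small sketch of that second parameter, again assuming a current
Python 3 rather than the Python 2 shown above. The sample text and the
variable names are hypothetical; "strict", "replace" and "ignore" are
standard error handlers:

```python
import codecs

# a-umlaut plus the euro sign; the euro sign does not exist in latin-1
text = "\u00e4\u20ac"

# "latin-1" and "iso-8859-1" resolve to the same codec.
same_codec = codecs.lookup("latin-1").name == codecs.lookup("iso-8859-1").name

# "strict" is the default: an unencodable character raises an error.
try:
    text.encode("latin-1")
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True

# "replace" substitutes a question mark, "ignore" drops the character.
replaced = text.encode("latin-1", "replace")
ignored = text.encode("latin-1", "ignore")

print(same_codec, strict_failed, replaced, ignored)
```

Which handler is right depends on whether losing characters in the
output is acceptable for your data.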

Michael