Degree symbol (UTF-8 > ASCII)

Steven Taschuk staschuk at telusplanet.net
Wed Apr 16 23:01:53 EDT 2003


Quoth Peter Clark:
> [...] However, when I put it into the code, I get this error:
> 
>     w += [deg.encode('UTF-8') + scale.strip()]
> UnicodeError: ASCII decoding error: ordinal not in range(128)

Aha!  Yes, it's a *decoding* error.  Your 'deg' variable is a
normal string, right?

    >>> '\xb0'.encode('utf-8')
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    UnicodeError: ASCII decoding error: ordinal not in range(128)
    >>> '\xb0'.decode('latin-1').encode('utf-8')
    '\xc2\xb0'
    >>> '\xb0'.decode('latin-1') # produces Unicode string
    u'\xb0'
    >>> u'\xb0'.encode('utf-8') # uses Unicode string
    '\xc2\xb0'

See?  For .encode to work, it has to know what characters are in
the string being encoded.  Calling .encode on a normal string makes
Python first decode it to Unicode with the default ASCII codec;
bytes outside of range(128) aren't ASCII, so it quite properly
refuses to guess what they are.  But if you tell it the correct
encoding, it can construct the right sequence of characters (as a
Unicode string), and then encode that as instructed.
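
For anyone reading this on a newer interpreter, here is roughly the same decode-then-encode round trip as a Python 3 sketch (the session above is Python 2).  I'm assuming the incoming byte really is Latin-1; the helper name `reencode` is made up for illustration and isn't part of any library.

```python
# Python 3 sketch: bytes and str are separate types, so the decode
# step that Python 2 attempted implicitly (with ASCII) is explicit here.

raw = b'\xb0'                  # one byte, as it would sit in a Latin-1 file
text = raw.decode('latin-1')   # say what the byte means: U+00B0 DEGREE SIGN
utf8 = text.encode('utf-8')    # encoding is now well defined


def reencode(data, source='latin-1', target='utf-8'):
    """Hypothetical helper: decode bytes with a known source encoding,
    then encode the resulting text in the target encoding."""
    return data.decode(source).encode(target)


def decodes_as_ascii(data):
    """Return True if the bytes are plain ASCII, False otherwise."""
    try:
        data.decode('ascii')
    except UnicodeDecodeError:
        return False
    return True
```

The point is the same as above: the bytes alone don't say which characters they stand for, so the source encoding has to be named before re-encoding makes sense.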

>     Since the output is meant to be read to be displayed by a font
> which is in essentially latin-1 encoding, I need to restrict the
> manner in which the degree symbol is displayed to one byte. [...]

So just do something like either of

    print >>fileobject, chr(176)
    print >>fileobject, u'\N{DEGREE SIGN}'.encode('latin-1')

(Note that if you do want your file to be in ISO-8859-1, the XML
declaration should say that's what it is.)
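
To make that concrete, here is a small Python 3 sketch of writing the degree sign as a single Latin-1 byte under a matching XML declaration.  The function name `write_latin1_xml` and the `<temp>` element are invented for the example, since I don't know what the real document looks like; an in-memory buffer stands in for the file object.

```python
import io

DEGREE = '\N{DEGREE SIGN}'  # U+00B0, byte 0xB0 (176) in Latin-1


def write_latin1_xml(stream, temperature, scale):
    """Hypothetical writer: emit a tiny XML document encoded as
    ISO-8859-1 to a binary stream.  The element name is made up."""
    # The declaration names the encoding actually used for the bytes below.
    stream.write(b'<?xml version="1.0" encoding="ISO-8859-1"?>\n')
    body = '<temp>%s%s%s</temp>\n' % (temperature, DEGREE, scale.strip())
    stream.write(body.encode('latin-1'))


buf = io.BytesIO()              # stands in for a file opened in 'wb' mode
write_latin1_xml(buf, 21, ' C ')
output = buf.getvalue()
```

With the declaration and the byte encoding in agreement, a parser reads the single byte 0xB0 back as the degree sign instead of complaining about malformed UTF-8.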

-- 
Steven Taschuk                                                   w_w
staschuk at telusplanet.net                                      ,-= U
                                                               1 1




