Dr. Dobb's Python-URL! - weekly Python news and links (Dec 30)

Thomas Heller theller at python.net
Tue Jan 4 10:41:05 EST 2005


Skip Montanaro <skip at pobox.com> writes:

>     michele> BTW what's the difference between .encode and .decode ?
>
> I started to answer, then got confused when I read the docstrings for
> unicode.encode and unicode.decode:
>
>     >>> help(u"\xe4".decode)
>     Help on built-in function decode:
>
>     decode(...)
>         S.decode([encoding[,errors]]) -> string or unicode
>
>         Decodes S using the codec registered for encoding. encoding defaults
>         to the default encoding. errors may be given to set a different error
>         handling scheme. Default is 'strict' meaning that encoding errors raise
>         a UnicodeDecodeError. Other possible values are 'ignore' and 'replace'
>         as well as any other name registerd with codecs.register_error that is
>         able to handle UnicodeDecodeErrors.
>
>     >>> help(u"\xe4".encode)
>     Help on built-in function encode:
>
>     encode(...)
>         S.encode([encoding[,errors]]) -> string or unicode
>
>         Encodes S using the codec registered for encoding. encoding defaults
>         to the default encoding. errors may be given to set a different error
>         handling scheme. Default is 'strict' meaning that encoding errors raise
>         a UnicodeEncodeError. Other possible values are 'ignore', 'replace' and
>         'xmlcharrefreplace' as well as any other name registered with
>         codecs.register_error that can handle UnicodeEncodeErrors.
>
> It probably makes sense to one who knows, but for the feeble-minded like
> myself, they seem about the same.
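
For what it's worth, here is the distinction as I understand it: encode goes
from unicode text to a byte string, and decode goes the other way. A minimal
sketch, not an authoritative answer; the variable names are mine, and I use
explicit escapes so the console encoding doesn't get in the way:

```python
# -*- coding: ascii -*-
text = u"\xe4"                  # one unicode character (a-umlaut)

# encode: unicode text -> byte string in the named encoding
raw = text.encode("latin-1")

# decode: byte string -> unicode text, interpreting the bytes as latin-1
back = raw.decode("latin-1")

print(repr(raw))
print(back == text)
```

The two docstrings read almost the same because the signatures are the same;
only the direction of the conversion differs.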

The error messages don't seem too helpful either:

>>> "ä".encode("latin-1")
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
UnicodeDecodeError: 'ascii' codec can't decode byte 0x84 in position 0: ordinal not in range(128)
>>>

Hm, why does the 'encode' call complain about decoding?

Why do string objects have an encode method, and why do unicode objects
have a decode method? And what is this error message trying to tell me?

>>> u"ä".decode("latin-1")
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe4' in position 0: ordinal not in range(128)
>>>
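
My best guess at what is going on, and it is only a guess: calling encode on
a byte string first converts it to unicode with the default ('ascii') codec,
and calling decode on a unicode object first converts it to a byte string
the same way. That hidden first step is what fails. The two helpers below are
hypothetical names of my own that spell the two steps out; they are not
anything from the library:

```python
# -*- coding: ascii -*-

def str_encode(raw, encoding, default="ascii"):
    """Emulate encode() called on a byte string (assumed behaviour)."""
    text = raw.decode(default)      # hidden step: bytes -> unicode
    return text.encode(encoding)    # the step that was actually requested


def unicode_decode(text, encoding, default="ascii"):
    """Emulate decode() called on a unicode object (assumed behaviour)."""
    raw = text.encode(default)      # hidden step: unicode -> bytes
    return raw.decode(encoding)     # the step that was actually requested


# The same two inputs as in the sessions above.
for func, arg in ((str_encode, b"\x84"), (unicode_decode, u"\xe4")):
    try:
        func(arg, "latin-1")
    except UnicodeError as exc:
        # Report which kind of error the hidden step produced.
        print("%s: %s" % (func.__name__, type(exc).__name__))
```

If that reading is right, the 'ascii' in both messages comes from the hidden
conversion and not from the "latin-1" I asked for, which would explain why
encode complains about decoding and vice versa.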

Thomas


