[Python-Dev] bytes.from_hex() [Was: PEP 332 revival in coordination with pep 349?]

Ian Bicking ianb at colorstudy.com
Sat Feb 18 00:13:51 CET 2006


Martin v. Löwis wrote:
> Users do
> 
> py> "Martin v. Löwis".encode("utf-8")
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
> UnicodeDecodeError: 'ascii' codec can't decode byte 0xf6 in position 11:
> ordinal not in range(128)
> 
> because they want to convert the string "to Unicode", and they have
> found a text telling them that .encode("utf-8") is a reasonable
> method.
> 
> What it *should* tell them is
> 
> py> "Martin v. Löwis".encode("utf-8")
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
> AttributeError: 'str' object has no attribute 'encode'

I think it would be even better if they got "ValueError: utf8 can only
encode unicode objects".  AttributeError is not much clearer than the
UnicodeDecodeError.
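
As a rough illustration of the kind of error suggested above, here is a
minimal sketch in Python 3 syntax, where str is the text type. The helper
name strict_encode and the exact message wording are hypothetical; this
is not an existing API, only one way the check could look.

```python
def strict_encode(text, encoding="utf-8"):
    """Encode text, rejecting anything that is not a text (str) object.

    Hypothetical helper: the name and the message are illustrative only.
    """
    if not isinstance(text, str):
        # Fail with an explicit message instead of attempting an
        # implicit ASCII decode of byte input.
        raise ValueError(
            "%s can only encode unicode objects, got %s"
            % (encoding, type(text).__name__)
        )
    return text.encode(encoding)
```

Calling strict_encode(b"abc") raises the ValueError immediately, so the
user sees what went wrong rather than a decode error from an encode call.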

That str.encode(unicode_encoding) implicitly decodes strings seems like
a flaw in the unicode encodings, quite separate from the existence of
str.encode.  I for one really like s.encode('zlib').encode('base64') --
and if the zlib encoding raised an error when it was passed a unicode
object (instead of implicitly encoding the string with the ascii
encoding) that would be fine.
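
For comparison, a sketch of the same chain in Python 3, assuming the
codecs.encode/codecs.decode functions and the "zlib_codec" and
"base64_codec" codec names from the standard library. The pack and
unpack names are made up for this example. In Python 3 these
bytes-to-bytes codecs reject text input with a TypeError, which is
roughly the behaviour described above.

```python
import codecs


def pack(data):
    """Compress bytes with zlib, then base64-encode the result."""
    return codecs.encode(codecs.encode(data, "zlib_codec"), "base64_codec")


def unpack(blob):
    """Reverse pack(): base64-decode, then decompress."""
    return codecs.decode(codecs.decode(blob, "base64_codec"), "zlib_codec")


payload = b"spam and eggs " * 20
blob = pack(payload)
assert unpack(blob) == payload  # the round trip restores the input

try:
    # Text input is refused rather than implicitly encoded as ASCII.
    codecs.encode("text, not bytes", "zlib_codec")
except TypeError as exc:
    print("rejected:", exc)
```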

The pipe-like nature of .encode and .decode works very nicely for 
certain transformations, applicable to both unicode and byte objects. 
Let's not throw the baby out with the bath water.


-- 
Ian Bicking  /  ianb at colorstudy.com  /  http://blog.ianbicking.org

