[Python-Dev] bytes.from_hex() [Was: PEP 332 revival in coordination with pep 349?]

Sat Feb 18 00:51:32 CET 2006

"Martin v. Löwis" <martin at v.loewis.de> wrote:
> 
> Josiah Carlson wrote:
> > How are users confused?
> 
> Users do
> 
> py> "Martin v. Löwis".encode("utf-8")
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
> UnicodeDecodeError: 'ascii' codec can't decode byte 0xf6 in position 11:
> ordinal not in range(128)
> 
> because they want to convert the string "to Unicode", and they have
> found a text telling them that .encode("utf-8") is a reasonable
> method.

Removing functionality because some users read bad instructions
somewhere is a bit like kicking your kitten because your puppy peed on
the floor.  You are punishing the wrong group, and for something that
shouldn't result in punishment at all: it should result in education.

Users are always going to get bad instructions, and removing utility
because some users fail to think before they act, or complain when their
lack of thinking doesn't work, will give us a language where we are
removing features because *new* users have no idea what they are doing.

> What it *should* tell them is
> 
> py> "Martin v. Löwis".encode("utf-8")
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
> AttributeError: 'str' object has no attribute 'encode'

I disagree.  I think the original error was correct, and we should be
educating users to prefix their literals with a 'u' if they want unicode,
or they should get their data from a unicode source (wxPython with
unicode, StreamReader, etc.)
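
For anyone following along, here is a rough sketch of why the original
error is raised.  It is written for Python 3 and only emulates the 2.x
behaviour: the helper py2_style_encode is hypothetical, and I am assuming
the byte string arrived as latin-1 (which is where the 0xf6 comes from).
In 2.x, calling .encode() on a byte string first decodes it with the
default 'ascii' codec, and that implicit step is what fails.

```python
# Python 3 emulation of 2.x str.encode(); names here are illustrative only.
# Assumption: the terminal delivered latin-1 bytes, so "ö" is the byte 0xf6.
raw = "Martin v. L\u00f6wis".encode("latin-1")


def py2_style_encode(byte_string, encoding):
    """Hypothetical helper: implicit ASCII decode, then the requested encode."""
    return byte_string.decode("ascii").encode(encoding)


try:
    py2_style_encode(raw, "utf-8")
    error = None
except UnicodeDecodeError as exc:
    # Same failure the user sees: the byte 0xf6 is not valid ASCII.
    error = exc

# Starting from real text (a u'' literal in 2.x) there is nothing to decode,
# so the encode succeeds.
text = "Martin v. L\u00f6wis"
encoded = text.encode("utf-8")
```

The point being that the failure comes from the data not being unicode
to begin with, which is exactly what the message says.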

> > bytes.encode CAN only produce bytes.
> 
> I don't understand MAL's design, but I believe in that design,
> bytes.encode could produce anything (say, a list). A codec
> can convert anything to anything else.

That seems to me to be a little overkill...

In any case, I personally find data.encode('base-64') and
edata.decode('base-64') more convenient than
binascii.b2a_base64(data) and binascii.a2b_base64(edata).  Ditto for
hexlify/unhexlify, etc.
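
To make the comparison concrete, a small sketch of the two spellings side
by side.  It is written against Python 3's codecs.encode/codecs.decode
functions rather than the 2.x string methods, and the sample data is made
up; for short inputs like this one the two routes should agree byte for
byte.

```python
import binascii
import codecs

data = b"hello world"  # made-up sample payload

# Codec-style spelling: the transformation is picked by name.
edata = codecs.encode(data, "base64")
roundtrip = codecs.decode(edata, "base64")

# binascii spelling: one function per direction per format.
edata_binascii = binascii.b2a_base64(data)
roundtrip_binascii = binascii.a2b_base64(edata_binascii)

# Same story for hex.
hexed = codecs.encode(data, "hex")
hexed_binascii = binascii.hexlify(data)
unhexed = codecs.decode(hexed, "hex")
```

With the codec spelling, switching formats means changing one string
argument instead of remembering another pair of function names.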

> > If some users
> > can't understand this (passing different arguments to a function may
> > produce different output),
> 
> It's worse than that. The return *type* depends on the *value* of
> the argument. I think there is little precedent for that: normally,
> the return values depend on the argument values, and, in a polymorphic
> function, the return type might depend on the argument types (e.g.
> the arithmetic operations). Also, the return type may depend on the
> number of arguments (e.g. by requesting a return type in a keyword
> argument).

You only need to look at dictionaries: different keys passed to the same
lookup call may very well return values of different types, yet
dictionaries have never been restricted to mapping from a single key
type to a single value type.

Many dict-like interfaces for configuration files do this; calls like
config.get('remote_host') and config.get('autoconnect'), returning (say)
a string and a boolean respectively, are not uncommon.
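
A minimal sketch of what I mean, using a plain dict as a stand-in for such
a configuration object (the keys and values are invented for
illustration):

```python
# A plain dict standing in for a dict-like configuration interface.
# Keys and values are made up for the example.
config = {
    "remote_host": "example.org",
    "autoconnect": True,
}

# Same method, same argument type (str); the return type depends on the
# *value* of the argument.
host = config.get("remote_host")
autoconnect = config.get("autoconnect")
missing = config.get("no_such_option")  # .get() returns None for absent keys

return_types = [type(host), type(autoconnect), type(missing)]
```

Nobody seems to be confused by that.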

 - Josiah