[Python-ideas] Strings can sometimes convert to bytes without an encoding

Franklin Lee leewangzhong+python at gmail.com
Tue Jun 14 19:46:34 EDT 2016


On Tue, Jun 14, 2016 at 7:26 PM, Guido van Rossum <guido at python.org> wrote:
> -1. Such a check for the contents of the string sounds exactly like the
> Python 2 behavior we are trying to get away with.

But isn't it really just converting back and forth between two
representations of the same thing? A str with char width 1 is
conceptually an ASCII string; you're just changing how it's exposed to
the program.

As it stands, when you have an ASCII string stored as a str, you can
call str.encode() on it (which defaults to encoding='utf-8'), or you
can call `bytes(s, 'utf-8')` and pass in an argument which is
conceptually ignored. (Unless it is in fact not an ASCII string!) On
the other hand, `bytes(s)` would mean, "No encoding should be necessary."
That could be semantically useful: a non-ASCII string would trigger
an exception, while the other methods would just encode it.
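
To make the difference concrete, here is a rough sketch. The helper name
`to_ascii_bytes` is hypothetical and only approximates the proposed
`bytes(s)` by encoding strictly as ASCII; it is not an existing API, and
today `bytes(s)` with no encoding raises TypeError for any str.

```python
def to_ascii_bytes(s):
    """Hypothetical stand-in for the proposed bytes(s).

    Succeeds only when every character of s is ASCII; otherwise
    str.encode('ascii') raises UnicodeEncodeError.
    """
    return s.encode('ascii')


ascii_text = "abc"
non_ascii_text = "\u00e9"  # 'é', outside the ASCII range

# Existing spellings: both produce the same bytes for ASCII input.
assert ascii_text.encode() == b"abc"
assert bytes(ascii_text, 'utf-8') == b"abc"

# Existing spellings silently encode non-ASCII input as UTF-8.
assert non_ascii_text.encode() == b"\xc3\xa9"

# Current behavior: bytes(s) without an encoding is a TypeError.
try:
    bytes(ascii_text)
except TypeError:
    pass

# The sketch of the proposal: ASCII passes through, non-ASCII fails.
assert to_ascii_bytes(ascii_text) == b"abc"
try:
    to_ascii_bytes(non_ascii_text)
except UnicodeEncodeError:
    pass
```

The point of the sketch is only the contrast: the existing calls accept
any string, while the strict version documents the expectation that no
real encoding step is happening.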

