[Python-3000] Immutable bytes -- looking for volunteer

Guido van Rossum guido at python.org
Tue Sep 25 23:26:40 CEST 2007


OK, Jeffrey's and Adam's patches were helpful; it looks like the
damage done by making bytes immutable is pretty limited: plenty of
modules are affected, but the changes are straightforward and
localized.

So now I have an idea that goes a little further. It relates to
Talin's response (second message in this thread if you're using gmail)
and acknowledges that there are some good use cases for mutable bytes
as well (as I've always maintained).

How about we take the existing PyString implementation (Python 2's
str, currently still present as str8 in py3k), remove the locale and
Unicode mixing support, and call it bytes? Then the PyBytes type can
be renamed to buffer. It is well-documented that I don't care much
about the existing buffer() builtin; it can be renamed to memview for
all I care (that would be a more descriptive name anyway).

This would provide a much better transitional path for 2.x code
manipulating raw bytes using str instances: just change "..." into
b"..." and str into bytes. (Of course, 2.x code that is confused about
bytes vs. characters will fail hard in 3.0 as soon as a bytes and a
str instance meet -- this is already the case in the current 3.0 code
base and will remain unchanged.)
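
A minimal sketch of that transition, assuming a current Python 3
interpreter (the py3k branch at the time of this message may differ in
details). The names join_header and mixing_fails are hypothetical and
exist only for illustration.

```python
def join_header(name, value):
    # Raw-bytes code ported by writing b"..." wherever 2.x code had "...".
    return name + b": " + value + b"\r\n"

line = join_header(b"Host", b"example.org")

def mixing_fails():
    # Code that is confused about bytes vs. characters fails as soon as
    # a bytes and a str instance meet.
    try:
        b"Host: " + "example.org"
    except TypeError:
        return True
    return False
```

The point of the sketch is that the port is mechanical when the code
really deals in bytes, and that the failure is loud when it does not.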

It would mean more fixes beyond what Jeffrey and Adam did, since
iterating over a bytes instance would return a bytes instance of
length 1 instead of a small int, and the bytes constructor would
change accordingly (no more initializing a bytes object from a list of
ints).
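
To make the difference concrete, here is a rough comparison of the two
iteration models. It assumes a Python 3 interpreter in which iterating
over bytes yields small ints; the length-1 behaviour proposed above is
only emulated with slicing, and iter_length1 is a hypothetical helper,
not an existing API.

```python
data = b"abc"

# Small-int model: iterating yields integers, and the constructor
# accepts a list of ints.
as_ints = list(data)
rebuilt_from_ints = bytes(as_ints)

def iter_length1(b):
    """Hypothetical helper: yield length-1 bytes objects, emulating the
    iteration behaviour proposed in this message."""
    for i in range(len(b)):
        yield b[i:i + 1]

# Length-1 model: each item is itself a bytes object, so the pieces are
# rejoined by concatenation rather than by a list-of-ints constructor.
as_pieces = list(iter_length1(data))
rebuilt_from_pieces = b"".join(as_pieces)
```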

The (new) buffer object would also have to change to be more
compatible with the (new) bytes object -- bytes<-->buffer conversions
should be 1-1, and iterating over a buffer instance would also have to
return a length-1 buffer (or bytes???) instance.
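
A sketch of the 1-1 conversion requirement. It uses bytearray as a
stand-in for the proposed mutable buffer type; that substitution is an
assumption made so the example runs on a current Python 3, and is not
the name proposed in this message.

```python
original = b"spam"

# Stand-in for the proposed mutable "buffer" type (assumption: bytearray).
mutable = bytearray(original)
mutable[0:1] = b"S"          # in-place edit, not possible on immutable bytes
frozen = bytes(mutable)      # convert back to the immutable type

# A 1-1 conversion means a round trip changes nothing.
round_trip = bytes(bytearray(original))
```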

Thoughts?

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

