[Python-Dev] Python 3.x and bytes

Nick Coghlan ncoghlan at gmail.com
Thu May 19 09:49:47 CEST 2011


On Thu, May 19, 2011 at 5:10 AM, Eric Smith <eric at trueblade.com> wrote:
> On 05/18/2011 12:16 PM, Stephen J. Turnbull wrote:
>> Robert Collins writes:
>>
>>  > It's probably too late to change, but please don't try to argue that
>>  > it's correct: the continued confusion of folk running into this is
>>  > evidence that confusion *is happening*. Treat that as evidence and
>>  > think about how to fix it going forward.
>>
>> Sorry, Rob, but you're just wrong here, and Nick is right.  It's
>> possible to improve Python 3, but not to "fix" it in this respect.
>> The Python 3 solution is correct, the Python 2 approach is not.
>> There's no way to avoid discontinuity and confusion here.
>
> I don't think there's any connection between the way 2.x confused text
> strings and binary data (which certainly needed addressing) and the way
> that 3.x returns a different type for byte_str[i] than it does for
> byte_str[i:i+1]. I think it's the latter that's confusing to people.
> There's no particular requirement for different types that's needed to
> fix the byte/str problem.

It's a mental model problem. People try to think of bytes as
equivalent to 2.x str and that's just wrong, wrong, wrong. It's far
closer to array.array('c'). Strings are basically *unique* in
returning a length 1 instance of themselves for indexing operations.
For every other sequence type, including tuples, lists and arrays,
slicing returns a new instance of the same type, while indexing will
typically return something different.
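
As a minimal sketch of that difference (run under Python 3; the
array.array('B') typecode is assumed here as a stand-in, since the 'c'
typecode only exists in 2.x, and the helper name is purely
illustrative):

```python
import array

# One sample of each sequence type, all holding the same three values.
samples = [
    "abc",                       # str
    b"abc",                      # bytes
    bytearray(b"abc"),           # bytearray
    array.array("B", b"abc"),    # array of unsigned bytes (assumed stand-in)
    [97, 98, 99],                # list
    (97, 98, 99),                # tuple
]

def same_type_on_index(seq):
    """Return True if seq[0] has the same type as the slice seq[0:1]."""
    return type(seq[0]) is type(seq[0:1])

# Map each type name to whether indexing yields an instance of that type.
results = {type(s).__name__: same_type_on_index(s) for s in samples}

for name, same in results.items():
    print(f"{name:10} indexing returns same type as slicing: {same}")

# The specific case under discussion:
byte_str = b"abc"
first_item = byte_str[0]      # an int
first_slice = byte_str[0:1]   # a length 1 bytes object
```

Only str comes out True in that table; bytes lines up with the other
containers.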

Now, we definitely didn't *help* matters by keeping so many of the
default behaviours of bytes() and bytearray() coupled to ASCII-encoded
text, but that was a matter of practicality beating purity: there
really *are* a lot of wire protocols out there that are ASCII based.
In hindsight, perhaps we should have gone further in breaking things
to try to make the point about the mental model shift more forcefully.
(However, that idea carries its own problems.)
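
To illustrate the kind of ASCII coupling meant above, here is a rough
sketch under Python 3. The header line is a made-up example, not taken
from any particular protocol implementation:

```python
# A hypothetical ASCII-based wire protocol header, held as bytes.
header = b"Content-Length: 42"

# Text-like methods operate on the ASCII range of the byte values.
upper = header.upper()

# Splitting works with bytes separators, much like str methods.
name, _, value = header.partition(b": ")

# int() accepts ASCII digits supplied as bytes.
length = int(value)

# Bytes outside the ASCII range are left untouched by case conversion.
non_ascii = bytes([0xE9]).upper()

# The repr shows printable ASCII as characters rather than numbers.
shown = repr(b"\x48\x69")

print(upper, name, length, non_ascii, shown)
```

That convenience is exactly what makes bytes look like 2.x str, even
though indexing it gives you integers.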

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

