[Python-ideas] Stop displaying elements of bytes objects as printable ASCII characters in CPython 3

Ron Adam ron3200 at gmail.com
Thu Sep 11 03:36:53 CEST 2014



On 09/10/2014 05:09 PM, Nick Coghlan wrote:
>
> On 11 Sep 2014 06:30, "Chris Lasher" <chris.lasher at gmail.com> wrote:
>  >
>  > Put yourself in the shoes of a beginner.
>
> We often compromise the beginner experience for backwards compatibility
> reasons, or to provide a better developer experience in the long run (cf.
> changing print from a statement to a builtin function).
>
> In this case, I *agree* the current behaviour is confusing, since it
> recreates some of the old "is it binary or is it text?" confusion that was
> more endemic in Python 2.
>
> In Python 3, "bytes" is still a hybrid type that can hold:
> * arbitrary binary data
> * binary data that contains ASCII segments
>
> A pure teaching language wouldn't make that compromise. Python 3 isn't a
> pure teaching language though - it's a pragmatic professional programming
> language that is *also* useful for teaching.
>
> The problem is that for a lot of data it is *genuinely ambiguous* as to
> which of those it actually is (and it may change at runtime depending on
> the specific nature of the data).
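
To make the hybrid behaviour concrete, here is a small sketch of how the
default repr renders the different cases. The three sample values are made up
for illustration and are not taken from the thread.

```python
# Sketch: how the default repr renders three kinds of bytes values.
# The sample values below are illustrative assumptions.
ascii_only = b"Python"
pure_binary = bytes([0x00, 0xFF, 0x80])
mixed = b"PNG" + bytes([0x0D, 0x0A, 0x1A, 0x0A])

print(repr(ascii_only))   # b'Python'
print(repr(pure_binary))  # b'\x00\xff\x80'
print(repr(mixed))        # b'PNG\r\n\x1a\n'

# Whatever the repr looks like, indexing always yields an integer.
print(ascii_only[0])      # 80
```

The same object type produces output that looks like text in one case and
like escaped binary in another, which is where the confusion starts.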

Considering "genuinely ambiguous": if this were a new feature, we might quote...

   "In the face of ambiguity, refuse the temptation to guess."

It's interesting that there is nothing in the Zen of Python about change or
backward compatibility.  If there were, it might have said...

   "Changing too much, too fast, is often too disruptive."


> Both the default repr and the literal form assume the "binary data with
> ASCII compatible segments" case, which aligns with the behaviour of the
> Python 2 str type. That isn't going to change in Python, especially since
> we actually *did* try it for a while (prior to the 3.0 release) and really
> didn't like it.
>
> However, as others have noted, making it easier to get a pure hex
> representation is likely worth doing. There are lots of ways of doing that
> currently, but none that really qualify as "obvious".

When working with hex data, I prefer the way hex editors do it: pairs of
hex digits separated by spaces.

      "50 79 74 68 6f 6e"    b'Python'

But I'm not sure there's a way to make that work cleanly. :-/
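
For what it's worth, a rough sketch of getting that layout by hand is below.
The helper name spaced_hex is hypothetical (nothing like it is built in), and
this is only one of several possible spellings, not a proposal for the API.

```python
import binascii


def spaced_hex(data):
    """Hypothetical helper: render bytes as space-separated hex pairs."""
    return " ".join(format(byte, "02x") for byte in data)


data = b"Python"

print(spaced_hex(data))                 # 50 79 74 68 6f 6e
print(binascii.hexlify(data))           # b'507974686f6e' (no separators)

# bytes.fromhex() ignores spaces between pairs, so the text round-trips.
print(bytes.fromhex(spaced_hex(data)))  # b'Python'
```

The round trip works because bytes.fromhex() already tolerates spaces; the
awkward part is only the output direction.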


Cheers,
    Ron


