[Python-Dev] PEP 460 reboot

Mon Jan 13 13:08:00 CET 2014

On 13 January 2014 17:15, Guido van Rossum <guido at python.org> wrote:
> On Sun, Jan 12, 2014 at 10:59 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:
>> All it takes is to let go of the idea "I wish the Python 3 bytes type
>> was more like the Python 2 str type" and instead think "hmm, the
>> Python 3 bytes type doesn't seem like a great fit for my use case,
>> maybe I need a different type".
>
> Maybe you're letting your excitement about asciistr get the better of
> you? IMO we don't need more types. If you can refrain from using
> int(b), b.lower() and b += 'abc' when b isn't ASCII-encoded, why
> couldn't you also refrain from b += b'%s' % 42?

It's the fact that I'd feel obliged to refrain from using *any* of the
proposed interpolation methods when dealing with arbitrary binary data
if they include the assumption of ASCII compatibility. The reason
Antoine's updates to PEP 460 earned an immediate +1 from me (even
though I was initially dubious about the PEP in general) is that it
aligns *exactly* with how I usually use the bytes type in Python 3 -
as a pure container of arbitrary binary data, without making
assumptions about whether it is ASCII compatible or not.
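
As a small illustration of why ASCII-assuming operations are unsafe on
arbitrary binary data, here is a minimal sketch. The 16-bit big-endian
field is only an assumed example payload, not something taken from the
PEP:

```python
import struct

# Two raw bytes holding a 16-bit big-endian integer (not text).
payload = struct.pack(">H", 0x4142)
assert payload == b"AB"  # the bytes merely *look* like ASCII letters

# bytes.lower() treats the data as ASCII text and rewrites both bytes.
lowered = payload.lower()
(value,) = struct.unpack(">H", lowered)

# The stored integer has silently changed.
assert value == 0x6162
```

Nothing in the type distinguishes the binary field from ASCII text, so
the caller has to know which operations are safe for the data at hand.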

While I still occasionally have reservations about it, I think on
balance it's a good thing that the bytes type has so much support for
ASCII compatible data, but my specific concern with your more lenient
proposal is that it takes something that I liked and would use (the
current PEP 460 API) and turns it into something I would have to
avoid because it doesn't correctly support arbitrary binary data.

> I'll suppress the urge to quote verbatim from my first message in this
> thread (about the motivation for bytes) but I'll just recommend you
> re-read it.
>
> (It's too late here to write more, but it looks like we are in for a
> bitter fight. :-( )

I realised my problem was specifically with providing the ASCII
compatible version *without* providing a pure binary equivalent that
*doesn't* involve making the assumption of ASCII compatibility. This
means that adding formatb and formatb_map methods with the current
semantics of format and format_map from PEP 460 would cover the use
cases I care about, and I can then happily ignore the debates about
what the semantics of the ASCII compatible version will be.

The semantics of binary interpolation could potentially even be
simplified further, since the ASCII assuming versions would be
responsible for handling the 2/3 source compatibility problem.
b"{}".formatb(other) would also provide an alternative to calling the
bytes constructor that doesn't suffer from the
unexpected-int-input-is-handled-as-a-length failure mode.
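
The failure mode mentioned above, together with one way a pure binary
interpolation could avoid it, can be sketched as follows. The
`formatb` function here is a hypothetical stand-in written as a free
function (no such bytes method exists); it assumes only bare `{}`
placeholders and only arguments that support the buffer protocol:

```python
# The existing constructor treats an int as a length, not as data.
assert bytes(3) == b"\x00\x00\x00"


def formatb(template, *args):
    """Hypothetical sketch: substitute bytes-like args for b'{}' fields."""
    parts = template.split(b"{}")
    if len(parts) - 1 != len(args):
        raise ValueError("placeholder and argument counts differ")
    out = bytearray(parts[0])
    for arg, literal in zip(args, parts[1:]):
        # memoryview() raises TypeError for int, str and other objects
        # that do not expose the buffer protocol.
        out += memoryview(arg).tobytes()
        out += literal
    return bytes(out)


# Arbitrary binary data passes through untouched.
assert formatb(b"<{}>", b"\xff\x00") == b"<\xff\x00>"

# An unexpected int is rejected instead of becoming three NUL bytes.
try:
    formatb(b"{}", 3)
except TypeError:
    rejected = True
else:
    rejected = False
assert rejected
```

Under these assumptions an accidental int fails loudly at the call
site rather than producing zero-filled output.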

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia