[Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

Sat Jan 11 09:43:25 CET 2014

On 11 January 2014 12:28, Ethan Furman <ethan at stoneleaf.us> wrote:
> On 01/10/2014 06:04 PM, Antoine Pitrou wrote:
>>
>> On Fri, 10 Jan 2014 20:53:09 -0500
>> "Eric V. Smith" <eric at trueblade.com> wrote:
>>>
>>>
>>> So, I'm -1 on the PEP. It doesn't address the cases laid out in issue
>>> 3982. See for example http://bugs.python.org/issue3982#msg180432 .
>>
>>
>> Then we might as well not do anything, since any attempt to advance
>> things is met by stubborn opposition in the name of "not far enough".
>
>
> Heh, and here I thought it was stubborn opposition in the name of purity.
> ;)

No, it's "the POSIX text model is completely broken and we're not
letting people bring it back by stealth because they want to stuff
their esoteric use case back into the builtin data types instead of
writing their own dedicated type now that the builtin types don't
handle it any more".

Yes, we know we changed the text model and knocked wire protocols off
their favoured perch, and we're (thoroughly) aware of the fact that
wire protocol developers don't like the fact that the default model
now strongly favours the vastly more common case of application
development.

However, until Benno volunteered to start experimenting with
implementing an asciistr type yesterday, there have been *zero*
meaningful attempts at trying to solve the issues with wire protocol
manipulation outside the Python 3 core - instead there has just been a
litany of whining that Python 3 is different from Python 2, and a
complete and total refusal to attempt to understand *why* we changed
the text model.

The answer *should* be obvious: the POSIX based text model in Python 2
makes web frameworks easier to write at the expense of making web
applications *harder* to write, and the same is true for every other
domain where the wire protocol and file format handling is isolated to
widely used frameworks and support libraries, with the application
code itself operating mostly on text and structured data. With the
Python 3 text model, we decided that was a terrible trade-off, so the
core text model now *strongly* favours application code.

This means that it is now *boundary* code that may need additional helper
types, because the core types aren't necessarily going to cover all
those use cases any more. In particular, the bytes type is, and always
will be, designed for pure binary manipulation, while the str type is
designed for text manipulation. The weird kinda-text-kinda-binary
8-bit builtin type is gone, and *deliberately* so.
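
As a rough illustration of that split (a minimal sketch only;
build_status_line is a hypothetical helper invented for this example,
not an existing API), boundary code composes text as str and encodes
exactly once on the way out, while bytes behaves as plain binary data:

```python
# Illustrative sketch: build_status_line is a hypothetical helper,
# not part of the standard library.
def build_status_line(code: int, reason: str) -> bytes:
    # Compose the line as text, then encode once at the boundary.
    return "HTTP/1.1 {} {}\r\n".format(code, reason).encode("ascii")

line = build_status_line(200, "OK")

# Mixing bytes and str raises TypeError instead of guessing an encoding.
try:
    b"Content-Length: " + "42"
except TypeError:
    mixed_ok = False
else:
    mixed_ok = True

# Indexing bytes yields an integer, as expected for binary data.
first = line[0]
```

The single encode() call marks the boundary; everything before it is
text handling and everything after it is binary handling.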

I've been saying for years that people should experiment with creating
a Python 3 extension type that behaves more like the Python 2 str type.
For the standard library,
we've never hit a case where the explicit encoding and decoding was so
complicated that creating such a type seemed simpler, so *we're* not
going to do it. After discussing it with me at LCA, Benno Rice offered
to try out the idea, just to determine whether or not it was actually
possible. If there are any CPython bugs that mean the idea *doesn't*
currently work (such as interoperability issues in the core types),
then I'm certainly happy for us to fix *those*. But we're never ever
going to change the core text model back to the broken POSIX one, or
even take steps in that direction.

Regards,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia