[Python-Dev] PEP 467: Minor API improvements for bytes & bytearray

Mon Aug 18 01:08:09 CEST 2014

On 18 Aug 2014 03:07, "Raymond Hettinger" <raymond.hettinger at gmail.com>
wrote:
>
>
> On Aug 17, 2014, at 1:41 AM, Nick Coghlan <ncoghlan at gmail.com> wrote:
>
>> If I see "bytearray(10)" there is nothing there that suggests "this
>> creates an array of length 10 and initialises it to zero" to me. I'd
>> be more inclined to guess it would be equivalent to "bytearray([10])".
>>
>> "bytearray.zeros(10)", on the other hand, is relatively clear,
>> independently of user expectations.
>
>
> Zeros would have been great but that should have been done originally.
> The time to get API design right is at inception.
> Now, you're just breaking code and invalidating any published examples.

I'm fine with postponing the deprecation elements indefinitely (or just
deprecating bytes(int) and leaving bytearray(int) alone).

>
>>>
>>> Another thought is that the core devs should be very reluctant to
>>> deprecate anything we don't have to while the 2 to 3 transition is
>>> still in progress.  Every new deprecation of APIs that existed in
>>> Python 2.7 just adds another obstacle to converting code.
>>> Individually, the differences are trivial.  Collectively, they
>>> present a good reason to never migrate code to Python 3.
>>
>>
>> This is actually one of the inconsistencies between the Python 2 and 3
>> binary APIs:
>
>
> However, bytearray(n) is the same in both Python 2 and Python 3.
> Changing it in Python 3 increases the gulf between the two.
>
> The further we let Python 3 diverge from Python 2, the less likely that
> people will convert their code and the harder you make it to write code
> that runs under both.
>
> FWIW, I've been teaching Python full time for three years.  I cover the
> use of bytearray(n) in my classes and not a single person out of 3000+
> engineers has had a problem with it.   I seriously question the PEP's
> assertion that there is a real problem to be solved (i.e. that people
> are baffled by bytearray(bufsiz)) and that the problem is sufficiently
> painful to warrant the headaches that go along with API changes.

Yes, I'd expect engineers and networking folks to be fine with it. It isn't
how this mode of the constructor *works* that worries me, it's how it
*fails* (i.e. silently producing unexpected data rather than a type error).

Purely deprecating the bytes case and leaving bytearray alone would likely
address my concerns.
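
A minimal sketch of that failure mode, assuming a hypothetical helper
frame() that is meant to receive a bytes-like payload. Passing an integer
by mistake does not raise; it quietly yields that many zero bytes:

```python
def frame(payload):
    # Hypothetical helper (not from the PEP): wraps a payload in STX/ETX
    # markers. The author intends payload to be a bytes-like object.
    return b"\x02" + bytes(payload) + b"\x03"

# Intended use: a bytes-like payload is copied through unchanged.
print(frame(b"hi"))    # b'\x02hi\x03'

# Mistaken use: an integer is accepted and produces 3 zero bytes
# instead of raising TypeError.
print(frame(3))        # b'\x02\x00\x00\x00\x03'

# What the caller most likely meant: a single byte with value 3.
print(frame([3]))      # b'\x02\x03\x03'
```

With bytearray(bufsiz) the integer is usually a deliberate buffer size;
with bytes() an integer argument is more likely to be an accident.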

>
> The other proposal to add bytearray.byte(3) should probably be named
> bytearray.from_byte(3) for clarity.  That said, I question whether there
> is actually a use case for this.   I have never seen code that has a
> need to create a byte array of length one from a single integer.
> For the most part, the API will be easiest to learn if it matches what
> we do for lists and for array.array.

This part of the proposal came from a few things:

* many of the bytes and bytearray methods only accept bytes-like objects,
but iteration and indexing produce integers
* to mitigate the impact of the above, some (but not all) bytes and
bytearray methods now accept integers in addition to bytes-like objects
* ord() in Python 3 is only documented as accepting length 1 strings, but
also accepts length 1 bytes-like objects
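
The three points above can be seen in current Python 3 (a small sketch;
the variable name data is only for illustration):

```python
data = b"abc"

# Indexing and iteration produce integers, not length 1 bytes objects.
print(data[0])          # 97
print(list(data))       # [97, 98, 99]

# Slicing, by contrast, produces a bytes object.
print(data[0:1])        # b'a'

# Some methods and operators accept an integer as well as a
# bytes-like object.
print(data.find(b"b"))  # 1
print(data.find(98))    # 1
print(98 in data)       # True

# ord() accepts a length 1 bytes object as well as a length 1 string.
print(ord(b"a"))        # 97
print(ord("a"))         # 97
```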

Adding bytes.byte() makes it practical to document the binary half of ord's
behaviour, and eliminates any temptation to expand the "also accepts
integers" behaviour out to more types.

bytes.byte() thus becomes the binary equivalent of chr(), just as Python 2
had both chr() and unichr().
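
As a rough sketch of the symmetry being proposed, the function byte()
below is a hypothetical stand-in for bytes.byte(), written with the
existing bytes([value]) spelling. It is an assumption for illustration,
not an API that exists today:

```python
def byte(value):
    # Hypothetical stand-in for the proposed bytes.byte().
    # bytes([value]) raises ValueError unless 0 <= value < 256.
    return bytes([value])

# Text side: chr() and ord() are inverses for a single code point.
assert ord(chr(97)) == 97

# Binary side: byte() and ord() would be inverses for a single byte.
assert byte(97) == b"a"
assert ord(byte(97)) == 97

# Contrast with bytes(97), which builds 97 zero bytes.
assert len(bytes(97)) == 97
```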

I don't recall ever needing chr() in a real program either, but I still
consider it an important part of clearly articulating the data model.

> Sorry Nick, but I think you're making the API worse instead of better.
> This API isn't perfect but it isn't flat-out broken either.   There is
> some unfortunate asymmetry between bytes() and bytearray() in Python 2,
> but that ship has sailed.  The current API for Python 3 is pretty good
> (though there is still a tension between wanting to be like lists and like
> strings both at the same time).

Yes. It didn't help that the docs previously expected readers to infer the
behaviour of the binary sequence methods from the string documentation -
while the new docs could still use some refinement, I've at least addressed
that part of the problem.

Cheers,
Nick.