[Python-Dev] PEP 467: Minor API improvements for bytes & bytearray

Nick Coghlan ncoghlan at gmail.com
Sun Aug 17 10:41:05 CEST 2014


On 17 August 2014 18:13, Raymond Hettinger <raymond.hettinger at gmail.com> wrote:
>
> On Aug 14, 2014, at 10:50 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:
>
> > Key points in the proposal:
> >
> > * deprecate passing integers to bytes() and bytearray()
>
>
> I'm opposed to removing this part of the API.  It has proven useful
> and the alternative isn't very nice.   Declaring the size of fixed length
> arrays is not a new concept and is widely adopted in other languages.
> One principal use case for the bytearray is creating and manipulating
> binary data.  Initializing to zero is a common operation and should remain
> part of the core API (consider why we now have list.copy() even though
> copying with a slice remains possible and efficient).

That's why the PEP proposes adding a "zeros" method, based on the name
of the corresponding NumPy construct.

The status quo has some very ugly failure modes: when an integer is
passed unexpectedly, the constructor tries to create a large buffer
rather than raising a TypeError.
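
To make that failure mode concrete, here is a minimal sketch under
Python 3. The helper name "to_payload" is hypothetical and exists only
for illustration; the point is that a bare integer is silently accepted
as a length, while a str without an encoding is rejected:

```python
def to_payload(value):
    """Hypothetical helper: the caller intends to pass an iterable of ints."""
    return bytes(value)

# Intended use: an iterable of byte values.
assert to_payload([1, 2, 3]) == b"\x01\x02\x03"

# Accidental use: a bare integer is treated as a length, so this
# silently yields three zero bytes instead of raising TypeError.
assert to_payload(3) == b"\x00\x00\x00"

# By contrast, a str without an encoding does raise TypeError.
try:
    to_payload("abc")
except TypeError:
    str_rejected = True
else:
    str_rejected = False
```

With a small integer the result is merely wrong; with a large one the
same call attempts to allocate a buffer of that size.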

> I and my clients have taken advantage of this feature and it reads nicely.

If I see "bytearray(10)" there is nothing there that suggests "this
creates an array of length 10 and initialises it to zero" to me. I'd
be more inclined to guess it would be equivalent to "bytearray([10])".

"bytearray.zeros(10)", on the other hand, is relatively clear,
independently of user expectations.
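
As a rough illustration of the difference in readability, the sketch
below compares the two existing spellings with a module-level "zeros"
function. That function is a hypothetical stand-in for the proposed
method, not an existing API, and the explicit type check is my own
assumption about how strict such a helper might be:

```python
def zeros(length):
    """Hypothetical stand-in for the proposed bytearray.zeros() method."""
    if not isinstance(length, int):
        raise TypeError("length must be an integer")
    return bytearray(length)

sized = bytearray(10)     # integer argument: ten zero bytes
single = bytearray([10])  # iterable argument: one byte with value 10
explicit = zeros(10)      # same result as bytearray(10), but named

assert len(sized) == 10 and not any(sized)
assert list(single) == [10]
assert explicit == sized
```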

> The proposed deprecation would break our code and not actually make
> anything better.
>
> Another thought is that the core devs should be very reluctant to deprecate
> anything we don't have to while the 2 to 3 transition is still in progress.
> Every new deprecation of APIs that existed in Python 2.7 just adds another
> obstacle to converting code.  Individually, the differences are trivial.
> Collectively, they present a good reason to never migrate code to Python 3.

This is actually one of the inconsistencies between the Python 2 and 3
binary APIs:

Python 2.7.5 (default, Jun 25 2014, 10:19:55)
[GCC 4.8.2 20131212 (Red Hat 4.8.2-7)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> bytes(10)
'10'
>>> bytearray(10)
bytearray(b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00')
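
For comparison, a short sketch of the same two calls under Python 3,
where both constructors treat a bare integer as a length (the variable
names are purely illustrative):

```python
# Python 3: bytes() and bytearray() both interpret an integer as a length.
zero_bytes = bytes(10)
zero_array = bytearray(10)

assert zero_bytes == b"\x00" * 10
assert zero_array == bytearray(b"\x00" * 10)

# The Python 2 result '10' corresponds to formatting the number as text,
# which in Python 3 has to be spelled out explicitly:
as_text = str(10).encode("ascii")
assert as_text == b"10"
```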

Users wanting well-behaved binary sequences in Python 2.7 would be
well advised to use the "future" module to get a full backport of the
actual Python 3 bytes type, rather than the approximation that is the
8-bit str in Python 2. And once they do that, they'll be able to track
the evolution of the Python 3 binary sequence behaviour without any
further trouble.

That said, I don't really mind how long the deprecation cycle is. I'd
be fine with fully supporting both in 3.5 (2015), deprecating the main
constructor in favour of the explicit zeros() method in 3.6 (2017) and
dropping the legacy behaviour in 3.7 (2018).

Regards,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

