[Python-ideas] Adding bytes.frombuffer() constructor

INADA Naoki songofacandy at gmail.com
Sun Aug 7 15:09:31 EDT 2016


On Mon, Aug 8, 2016 at 12:47 AM, Nick Coghlan <ncoghlan at gmail.com> wrote:
> On 7 August 2016 at 15:08, Michael Selik <michael.selik at gmail.com> wrote:
>>
>>
>> On Sat, Aug 6, 2016 at 8:45 PM INADA Naoki <songofacandy at gmail.com> wrote:
>>>
>>> 1. bytes(bytearray[:n])
>>> 2. bytes(memoryview(bytearray)[:n])
>>>
>>> (1) is the simplest, but it produces a temporary bytearray of n bytes.
>>
>>
>> Does that actually make the difference between unacceptably inefficient
>> performance and acceptably efficient for an application you're working on?


Yes.  My targets are Tornado and asyncio.  Since they are frameworks,
we can't assume how large the data is -- it may be anywhere from a few
bytes to a few gigabytes.


>>>
>>> While (2) is more efficient than (1), it still uses a temporary memoryview
>>> object, and it looks a bit tricky.
>>
>>
>> Using the memoryview is nicely explicit whereas ``bytes.frombuffer`` could
>> be creating a temporary bytearray as part of its construction.
>
>
> It could, but it wouldn't (since that would be pointlessly inefficient).
>
> The main question to be answered here would be whether adding a dedicated
> spelling for "bytes(memoryview(bytearray)[:n])" actually smooths out the
> learning curve for memoryview in general, where folks would learn:
>
> 1. "bytes(mybytearray[:n])" copies the data twice for no good reason
> 2. "bytes.frombuffer(mybytearray, n)" avoids the double copy
> 3. "bytes(memoryview(mybytearray)[:n])" generalises to arbitrary slices
>
> With memoryview being a builtin, I'm not sure that argument can be made
> successfully - the transformation in going from step 1 direct to step 3 is
> just "wrap the original object with memoryview before slicing to avoid the
> double copy", and that's no more complicated than using a different
> constructor method.


I'm not sure either.
A memoryview may or may not be a bytes-like object that os.write or
socket.send accepts.
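
To illustrate one reading of that point, here is a minimal sketch.  It
assumes a POSIX-style pipe from os.pipe() as a stand-in for a socket, and
the names buf, contiguous_view and strided_view are only illustrative.  A
plain prefix slice of a memoryview is contiguous and can be handed straight
to os.write, while a strided slice is not contiguous and would generally be
rejected by APIs that want a simple buffer.

```python
import os

buf = bytearray(b"hello world")

# A prefix slice of a memoryview stays contiguous; no bytes are copied yet.
contiguous_view = memoryview(buf)[:5]
# A strided slice is a valid memoryview but is not contiguous.
strided_view = memoryview(buf)[::2]

r, w = os.pipe()
try:
    # os.write accepts the contiguous view directly as a bytes-like object.
    written = os.write(w, contiguous_view)
    data = os.read(r, written)
finally:
    os.close(r)
    os.close(w)

print(written, data)
print(contiguous_view.contiguous, strided_view.contiguous)
```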

But memoryview is the successor of buffer, so we should encourage
using it for zero-copy slicing.
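
For reference, a minimal sketch of the two existing spellings discussed in
this thread.  The names buf, n, a and b are illustrative; bytes.frombuffer
is only the proposal under discussion and does not exist, so it is not used
here.

```python
buf = bytearray(b"hello world, this is a receive buffer")
n = 5

# (1) Slicing the bytearray builds a temporary n-byte bytearray,
#     and bytes() then copies those n bytes a second time.
a = bytes(buf[:n])

# (2) Slicing a memoryview only creates a small view object;
#     the n bytes are copied once, by bytes().
b = bytes(memoryview(buf)[:n])

print(a, b)
```

Both spellings produce the same bytes object; the difference is only in how
many times the first n bytes are copied along the way.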


Thank you.

-- 
INADA Naoki  <songofacandy at gmail.com>

