[Python-Dev] Understanding the buffer API

Jeff Allen "ja...py" at farowl.co.uk
Sat Aug 4 01:34:16 CEST 2012


I'm implementing the buffer API and some of memoryview for Jython. I 
have read with interest, and mostly understood, the discussion in Issue 
#10181 that led to the v3.3 re-implementation of memoryview and 
much-improved documentation of the buffer API. Although Jython is 
targeting v2.7 at the moment, and 1-D bytes (there's no Jython NumPy), 
I'd like to lay a solid foundation that benefits from the recent CPython 
work. I hope that some of the complexity in memoryview stems from legacy 
considerations I don't have to deal with in Jython.

I am puzzled that PEP 3118 makes some specifications that seem 
unnecessary and complicate the implementation. Would those who know the 
API inside out answer a few questions?

My understanding is this: when a consumer requests a buffer from the
exporter, it specifies through flags how it intends to navigate it. If the
buffer actually needs more apparatus than the consumer proposes, the
request raises an exception. If the buffer needs less apparatus than the
consumer proposes, the exporter has to supply what was asked for. For
example, if the consumer sets PyBUF_STRIDES and the buffer can only be
navigated by using suboffsets (PIL-style), the request raises an exception.
Alternatively, if the consumer sets PyBUF_STRIDES and the buffer is
just a simple byte array, the exporter has to supply shape and strides
arrays (with trivial values), since the consumer is going to use those
arrays.
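
The negotiation described above can be sketched in a few lines of Python.
This is only a model, not CPython's or Jython's implementation: the names
negotiate() and export_bytes() and the dict standing in for Py_buffer are
hypothetical; the flag values are the ones defined in CPython's headers.

```python
# Flag values as defined in CPython's object.h.
PyBUF_SIMPLE = 0
PyBUF_FORMAT = 0x0004
PyBUF_ND = 0x0008
PyBUF_STRIDES = 0x0010 | PyBUF_ND
PyBUF_INDIRECT = 0x0100 | PyBUF_STRIDES


def negotiate(requested, required):
    """Hypothetical check: fail if the buffer needs more than was requested."""
    if (requested & required) != required:
        raise BufferError("buffer needs more apparatus than the consumer proposed")


def export_bytes(data, flags):
    """Hypothetical exporter for a simple 1-D array of unsigned bytes."""
    negotiate(flags, PyBUF_SIMPLE)  # a flat byte array needs nothing extra
    # The dict stands in for Py_buffer; None stands in for NULL.
    view = {"len": len(data), "itemsize": 1, "ndim": 1,
            "format": None, "shape": None, "strides": None, "suboffsets": None}
    if (flags & PyBUF_FORMAT) == PyBUF_FORMAT:
        view["format"] = "B"
    if (flags & PyBUF_ND) == PyBUF_ND:
        view["shape"] = (len(data),)
    if (flags & PyBUF_STRIDES) == PyBUF_STRIDES:
        view["strides"] = (1,)  # trivial stride, supplied because it was asked for
    return view
```

A PIL-style exporter would correspond to negotiate(flags, PyBUF_INDIRECT),
which raises BufferError when the consumer passes only PyBUF_STRIDES.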

Is there any harm in supplying shape and strides when they were not
requested? The PEP says: "PyBUF_ND ... If this is not given then shape
will be NULL". It doesn't stipulate that strides will be NULL if
PyBUF_STRIDES is not given, but the library documentation says so.
suboffsets is different, since even when requested it will be NULL if
not needed.

Similarly, and more simply, the PEP says "PyBUF_FORMAT ... If format is
not explicitly requested then the format must be returned as NULL (which
means "B", or unsigned bytes)". What would be the harm in returning "B"?
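
One way to see which fields an exporter actually leaves NULL is to call
PyObject_GetBuffer through ctypes. The sketch below is CPython-only and
assumes the Py_buffer field layout of CPython 3.3 and later (2.7 has an
extra smalltable field); probe() is a hypothetical helper name.

```python
import ctypes

PyBUF_SIMPLE = 0
PyBUF_FORMAT = 0x0004
PyBUF_ND = 0x0008
PyBUF_STRIDES = 0x0010 | PyBUF_ND
PyBUF_FULL_RO = 0x0100 | PyBUF_STRIDES | PyBUF_FORMAT


class Py_buffer(ctypes.Structure):
    # Assumed layout: CPython 3.3+ (Include/object.h).
    _fields_ = [
        ("buf", ctypes.c_void_p),
        ("obj", ctypes.c_void_p),
        ("len", ctypes.c_ssize_t),
        ("itemsize", ctypes.c_ssize_t),
        ("readonly", ctypes.c_int),
        ("ndim", ctypes.c_int),
        ("format", ctypes.c_char_p),
        ("shape", ctypes.POINTER(ctypes.c_ssize_t)),
        ("strides", ctypes.POINTER(ctypes.c_ssize_t)),
        ("suboffsets", ctypes.POINTER(ctypes.c_ssize_t)),
        ("internal", ctypes.c_void_p),
    ]


_get = ctypes.pythonapi.PyObject_GetBuffer
_get.argtypes = [ctypes.py_object, ctypes.POINTER(Py_buffer), ctypes.c_int]
_get.restype = ctypes.c_int
_release = ctypes.pythonapi.PyBuffer_Release
_release.argtypes = [ctypes.POINTER(Py_buffer)]
_release.restype = None


def probe(obj, flags):
    """Report the format and which of the array pointers are non-NULL."""
    view = Py_buffer()
    # ctypes.pythonapi re-raises any Python exception set by the call.
    _get(obj, ctypes.byref(view), flags)
    try:
        return {"format": view.format,
                "shape": bool(view.shape),
                "strides": bool(view.strides),
                "suboffsets": bool(view.suboffsets)}
    finally:
        _release(ctypes.byref(view))
```

For a bytes object, probe(b"abc", PyBUF_SIMPLE) should report format as
None and every pointer as NULL, while PyBUF_FULL_RO should yield b"B" with
shape and strides set and suboffsets still NULL.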

One place where this really matters is in the implementation of 
memoryview. PyMemoryView requests a buffer with the flags PyBUF_FULL_RO, 
so even a simple byte buffer export will come with shape, strides and 
format. A consumer (of the memoryview's buffer API) might specify 
PyBUF_SIMPLE: according to the PEP I can't simply give it the original 
buffer since required fields (that the consumer will presumably not 
access) are not NULL. In practice, I'd like to: what could possibly go 
wrong?
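
The two options for re-export can be put side by side in a small Python
model. The helper names below are hypothetical, and a dict again stands in
for Py_buffer; only the memoryview attributes are real API.

```python
PyBUF_SIMPLE = 0
PyBUF_FORMAT = 0x0004
PyBUF_ND = 0x0008
PyBUF_STRIDES = 0x0010 | PyBUF_ND


def full_view(obj):
    """What the memoryview holds: every field populated, even for plain bytes."""
    m = memoryview(obj)
    return {"format": m.format, "shape": m.shape, "strides": m.strides}


def reexport_strict(view, flags):
    """Follow the PEP wording: clear every field the consumer did not request."""
    out = dict(view)
    if (flags & PyBUF_FORMAT) != PyBUF_FORMAT:
        out["format"] = None
    if (flags & PyBUF_ND) != PyBUF_ND:
        out["shape"] = None
    if (flags & PyBUF_STRIDES) != PyBUF_STRIDES:
        out["strides"] = None
    return out


def reexport_passthrough(view, flags):
    """The shortcut in question: hand over the original view unchanged."""
    return view
```

For a PyBUF_SIMPLE consumer the two differ only in fields that the
consumer, by its own request, is not supposed to read.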

Jeff Allen


