[Python-Dev] Extended Buffer Interface/Protocol

Thu Mar 22 19:53:16 CET 2007

Greg Ewing wrote:
> Travis Oliphant wrote:
> 
> 
>>I'm talking about arrays of pointers to other arrays:
>>
>>i.e. if somebody defined in C
>>
>>float B[10][20]
>>
>>then B would be an array of pointers to arrays of floats.
> 
> 
> No, it wouldn't, it would be a contiguously stored
> 2-dimensional array of floats. An array of pointers
> would be
> 
>    float *B[10];
> 
> followed by code to allocate 10 arrays of 20 floats
> each and initialise B to point to them.
> 

You are right, of course; that example was not correct.  I think the 
point is still valid, though.   One could still use the shape to 
indicate how many levels of pointers-to-pointers there are (i.e. how 
many pointer dereferences are needed to select out an element).  Further 
dimensionality could then be reported in the format string.

This would not be hard to allow.  It also would not be hard to write a 
utility function that copies such shared memory into a contiguous segment, 
giving casual users a C-API that hides the details of memory layout when 
they are writing an algorithm that just uses the memory.
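
The copy utility mentioned above could look roughly like the following. 
This is only a sketch, written in Python rather than C for brevity. The 
name copy_to_contiguous and the use of nested lists to stand in for C 
pointers are assumptions made for illustration, not part of any proposed 
API, and it assumes the number of pointer levels is known from the shape.

```python
def copy_to_contiguous(segments, levels):
    """Copy pointer-to-pointer style storage into one contiguous bytearray.

    `segments` is a nested list standing in for C pointers, and `levels`
    is the number of list levels (pointer dereferences) above the leaf
    blocks. Hypothetical helper, for illustration only.
    """
    if levels == 0:
        # A leaf: one contiguous block of bytes.
        return bytearray(segments)
    out = bytearray()
    for child in segments:
        # Follow one "pointer" and append whatever it leads to.
        out += copy_to_contiguous(child, levels - 1)
    return out


# Two levels of "pointers": 2 x 2 leaf blocks of 3 bytes each.
nested = [[b"abc", b"def"], [b"ghi", b"jkl"]]
flat = copy_to_contiguous(nested, 2)
```

A consumer that only wants the bytes can then work on flat without 
knowing how the exporter arranged its segments.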

> I can imagine cases like that coming up in practice.
> For example, an image object might store its data
> as four blocks of memory for R, G, B and A planes,
> each of which is a contiguous 2d array with shape
> and stride -- but you want to view it as a 3d
> array byte[plane][x][y].

All we can do is have the interface actually be able to describe its 
data.  Users would have to take that information and write code 
accordingly.

In this case, for example, one possibility is that the object would 
raise an error if strides were requested.  It would also raise an error 
if contiguous data was requested (or I guess it could report the R 
channel only if it wanted to).   Only if segments were requested could 
it return an array of pointers to the four memory blocks.  It could then 
report itself as a 2-d array of shape (4, H), where H is the height. 
Each element of the array would be reported as "%sB" % W, where W is the 
width of the image (i.e. each element of the 2-d array would be a 1-d 
array of length W).

Alternatively, it could report itself as a 1-d array of shape (4,) with 
elements "(H,W)B".
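
To make the two descriptions concrete, here is a small Python sketch of 
the four-plane image. The dictionary keys and the helper get_row are 
hypothetical names chosen for illustration, and the sketch assumes each 
plane is stored contiguously in row-major order, which a real exporter 
need not do.

```python
H, W = 2, 3  # a tiny image: height 2, width 3

# Four separately allocated planes (R, G, B, A), each holding H*W bytes.
planes = [bytearray(p * 10 + i for i in range(H * W)) for p in range(4)]

# Description 1: a 2-d array of shape (4, H) whose elements are rows of W bytes.
desc_rows = {"shape": (4, H), "format": "%sB" % W}

# Description 2: a 1-d array of shape (4,) whose elements are (H, W) byte blocks.
desc_planes = {"shape": (4,), "format": "(%d,%d)B" % (H, W)}


def get_row(planes, plane, row, width):
    """Select element [plane][row] under description 1: one row of bytes."""
    start = row * width  # assumes contiguous, row-major planes
    return bytes(planes[plane][start:start + width])


row = get_row(planes, 2, 1, W)  # second row of the B plane
```

Either description covers the same four blocks of memory; they differ 
only in how much of the dimensionality goes into the shape and how much 
into the format string.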

A user would have to write the algorithm correctly in order to access 
the memory correctly.

Alternatively, a utility function that copies into a contiguous buffer 
would allow the consumer to "not care" about exactly how the memory is 
laid out.  But the buffer interface would allow the utility function 
to figure it out and do the right thing for each exporter.  This 
flexibility would not be available if we don't allow for segmented 
memory in the buffer interface.

So, I don't think it's that hard to at least allow the multiple-segment 
idea into the buffer interface (as long as all the segments are the same 
size, mind you).  It's only one more argument to the getbuffer call.

-Travis