[Numpy-discussion] Latest Array-Interface PEP

Fri Jan 12 11:21:28 EST 2007

On 1/12/07, Travis Oliphant <oliphant at ee.byu.edu> wrote:
>
> Neal Becker wrote:
> > I believe we are converging, and this is pretty much the same design as I
> > advocated.  It is similar to boost::ublas.
> >
> I'm grateful to hear that.   It is nice when ideas come from several
> different corners.
> > Storage is one concept.
> >
> > Interpretation of the storage is another concept.
> >
> > Numpy is a combination of a storage and interpretation.
> >
> > Storage could be dense or sparse.  Allocated in various ways. Sparse can be
> > implemented in different ways.
> >
> > Interpretation can be 1-d, 2-d.  Zero-based, non-zero based.  Also there is
> > question of ownership (slices).
> >
>
> How do we extend the buffer interface then?  Do we have one API that
> allows sharing of storage and another that handles sharing of
> interpretation?
>
> How much detail should be in the interface regarding storage detail?
> Is there a possibility of having at least a few storage models
> "shareable" so that memory can be shared by others that view the data in
> the same way?

I'm concerned about the direction that this PEP seems to be going. The
original proposal was borderline too complicated IMO, and now it seems
headed in the direction of more complexity.

Also, it seems that there are three different goals getting conflated here.
None are bad, but they don't all need to be, and probably shouldn't be,
addressed by the same PEP.

   1. Allowing producers and consumers of blocks of data to share blocks
   efficiently. This is half of what the original PEP proposed.
   2. Describing complex data types at the C level. This is the other
   half of the PEP[1].
   3. Things that act like arrays, but have different storage methods.
   The details of this still seem pretty vague, but to the extent that I can
   figure them out, it doesn't seem useful or necessary to tie this into the
   rest of the array interface PEP.  For example,
   "array_interface->get_block_from_slice()" has been mentioned. Why that
   instead of "PyObject_AsExtendedBuffer(PyObject_GetItem(obj, index), ....)"[2].
   I'll stop here, till I see some more details of what people have in mind,
   but at this point, I think that alternative memory models are a different
   problem that should be addressed separately.
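The "slice first, then ask the result for its buffer" alternative in item 3 can be sketched in Python. This is only an illustration under stated assumptions: Block and as_buffer() are hypothetical names invented here, and Python's built-in memoryview stands in for the extended buffer call, which has no settled signature in this thread.

```python
class Block:
    """Hypothetical array-like object: shared storage plus a (start, stop) window."""

    def __init__(self, storage, start=0, stop=None):
        self._storage = storage
        self._start = start
        self._stop = len(storage) if stop is None else stop

    def __len__(self):
        return self._stop - self._start

    def __getitem__(self, index):
        # Slicing returns another Block over the same storage; nothing is copied.
        if not isinstance(index, slice):
            raise TypeError("this sketch only supports slices")
        start, stop, step = index.indices(len(self))
        if step != 1:
            raise ValueError("this sketch only supports contiguous slices")
        stop = max(stop, start)  # an empty slice stays empty
        return Block(self._storage, self._start + start, self._start + stop)

    def as_buffer(self):
        # Stand-in for the buffer call: a zero-copy view of just this window.
        return memoryview(self._storage)[self._start:self._stop]


storage = bytearray(range(10))
sub_block = Block(storage)[2:5]   # the "GetItem" step
buf = sub_block.as_buffer()       # the "AsExtendedBuffer" step
buf[0] = 99                       # writes through to the original storage
```

The point of the sketch is that the slicing logic lives entirely in the object's own __getitem__, so the buffer interface itself never needs to know about slices.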

Sadly, I'm leaving town shortly and I'm running out of time, so I'll have
to leave my objections in this somewhat vague state.

Oh, the way that F. Lundh plans to expose PIL's data a chunk at a time is
mentioned in this python-dev summary:
http://www.python.org/dev/summary/2006-11-01_2006-11-15/
It doesn't seem necessary to have special support for this; all that is
necessary is for the object returned by acquire_view to support the extended
array protocol.

[1] Remind me again why we can't simply use ctypes for this? It's already in
the core. I'm sure it's less efficient, but you shouldn't need to parse the
data structure information very often. I suspect that something that
leveraged ctypes would meet less resistance.
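As a rough illustration of what leaning on ctypes for footnote [1] might look like, here is a minimal Python sketch. The Record layout and the describe() helper are hypothetical, invented for this example; only the ctypes and struct calls themselves are standard library.

```python
import ctypes
import struct


class Record(ctypes.Structure):
    # Hypothetical record layout, invented for this example.
    _fields_ = [
        ("x", ctypes.c_double),
        ("y", ctypes.c_double),
        ("count", ctypes.c_int32),
        ("flags", ctypes.c_int32),
    ]


def describe(struct_type):
    """Return (name, offset, size) for each field of a ctypes Structure."""
    return [
        (name, getattr(struct_type, name).offset, getattr(struct_type, name).size)
        for name, *_ in struct_type._fields_
    ]


layout = describe(Record)

# Interpret existing storage as a Record without copying it.
storage = bytearray(ctypes.sizeof(Record))
record = Record.from_buffer(storage)
record.x = 1.5
first_double = struct.unpack_from("d", storage, 0)[0]  # reads the shared bytes
```

Walking _fields_ like this is slower than reading a dedicated C struct, but as noted above, the type description should not need to be parsed very often.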

[2] Which reminds me. I never saw in the PEP what the actual call in the
buffer protocol was supposed to look like. Is it something like:
PyObject_AsExtendedBuffer(PyObject *obj, void **buffer, Py_ssize_t *buffer_len,
                          funcptr *bf_getarrayview, funcptr *bf_relarrayview)
?

-- 
//=][=\\

tim.hochberg at ieee.org