[Cython] buffer syntax vs. memory view syntax

Tue May 8 00:35:29 CEST 2012

On Mon, May 7, 2012 at 11:40 AM, Dag Sverre Seljebotn
<d.s.seljebotn at astro.uio.no> wrote:
>
> mark florisson <markflorisson88 at gmail.com> wrote:
>
>>On 7 May 2012 17:00, Dag Sverre Seljebotn <d.s.seljebotn at astro.uio.no>
>>wrote:
>>> On 05/07/2012 04:16 PM, Stefan Behnel wrote:
>>>>
>>>> Stefan Behnel, 07.05.2012 15:04:
>>>>>
>>>>> Dag Sverre Seljebotn, 07.05.2012 13:48:
>>>>>>
>>>>>> BTW, with the coming of memoryviews, Mark and I talked about just
>>>>>> deprecating the "mytype[...]" meaning buffers, and rather treating
>>>>>> np.ndarray, array.array etc. as some sort of "template types". That
>>>>>> is, we disallow "object[int]" and require some special declarations
>>>>>> in the relevant pxd files.
>>>>>
>>>>>
>>>>> Hmm, yes, it's unfortunate that we have two different types of syntax
>>>>> now, one that declares the item type before the brackets and one that
>>>>> declares it afterwards.
>>>>
>>>>
>>>> I actually think this merits some more discussion. Should we consider the
>>>> buffer interface syntax deprecated and focus on the memory view syntax?
>>>
>>>
>>> I think that's the very-long-term intention. Then again, it may be too early
>>> to really tell yet; we just need to see how the memory views play out in
>>> real life and whether they'll be able to replace np.ndarray[double] among
>>> real users. We don't want to shove things down users' throats.
>>>
>>> But the use of the trailing-[] syntax needs some cleaning up. Mark and I
>>> agreed we'd put this proposal forward when we got around to it:
>>>
>>>  - Deprecate the "object[double]" form, where [dtype] can be stuck on any
>>> extension type
>>>
>>>  - But, do NOT (for the next year at least) deprecate np.ndarray[double],
>>> array.array[double], etc. Basically, there should be a magic flag in
>>> extension type declarations saying "I can be a buffer".
>>>
>>> For one thing, that is sort of needed to open up things for templated cdef
>>> classes/fused types cdef classes, if that is ever implemented.
>>
>>Deprecating is definitely a good start. I think if you only allow two
>>types as buffers it will be at least reasonably clear when one is
>>dealing with fused types or buffers.
>>
>>Basically, I think memoryviews should live up to demands of the users,
>>which would mean there would be no reason to keep the buffer syntax.
>
> But they are different approaches -- use a different type/API, or just try to speed up parts of NumPy.

Part of the question here is whether using np.ndarray[...] currently
offers (or will offer) any additional functionality.

While we should likely start steering people toward memoryviews,
especially over object[...], it seems too soon to deprecate the
old-style buffer access.
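
As a rough illustration of the "different type/API" approach, here is a
sketch in plain Python rather than Cython, using the built-in memoryview
as a stand-in for Cython's typed memoryviews. The function name total is
made up for this example; the point is only that the view is its own
type and does not care which object exports the buffer.

```python
from array import array
import struct

def total(buf):
    """Sum a 1-D buffer of C doubles, whatever object exports it (hypothetical helper)."""
    view = memoryview(buf)  # works for any PEP 3118 buffer exporter
    if view.ndim != 1 or view.format != "d":
        raise TypeError("expected a 1-D buffer of doubles")
    return sum(view)

# The same function accepts different backing objects.
from_array = total(array("d", [1.0, 2.0, 3.5]))
raw = bytearray(struct.pack("3d", 1.0, 2.0, 3.5))
from_bytes = total(memoryview(raw).cast("d"))
```

The old-style syntax, by contrast, ties the declaration to one concrete
type such as np.ndarray.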

>>One thing to do is make memoryviews coerce cheaply back to the
>>original objects if wanted (which is likely). Writing
>>np.asarray(mymemview) is kind of annoying.
>>
>
>
> It is going to be very confusing to have type(mymemview), repr(mymemview), and so on come out as NumPy arrays, but not have the full API of NumPy. Unless you auto-convert on getattr to...
>
> If you want to eradicate the distinction between the backing array and the memory view and make it transparent, I really suggest you bring np.ndarray back to life (it can exist in some 'unrealized' state with delayed construction after slicing, and so on). The implementation is much the same either way; it is all about how it is presented to the user.
>
> Something like mymemview.asobject() could work though, and while not much shorter, it would have some polymorphism that np.asarray does not have (based probably on some custom PEP 3118 extension).

I think it's valuable to have a single name refer to both the Python
object (on which methods can be called, and a new one might have to be
created if there was slicing) and the memory view. In this light,
being able to specify that something is both a NumPy array (to use some
(overlay optimized?) methods on it) and a memory view (for fast
indexing) without having two different variables can result in much
cleaner code (and an easier transition from untyped NumPy).
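
To make the "a new one might have to be created if there was slicing"
part concrete, here is a minimal sketch in plain Python, not Cython. It
uses the built-in memoryview and array.array only as stand-ins, and
as_object is a hypothetical helper, not an existing or proposed API.

```python
from array import array

def as_object(view):
    """Return an array.array matching what the view currently covers (hypothetical)."""
    base = view.obj  # the object that exported the buffer
    covers_everything = (
        isinstance(base, array)
        and view.format == base.typecode
        and view.c_contiguous
        and view.nbytes == len(base) * base.itemsize
    )
    if covers_everything:
        # Unsliced view: hand back the original object, no copy needed.
        return base
    # Sliced or strided view: a new object has to be created (this copies).
    return array(view.format, view.tolist())

data = array("d", [0.0, 1.0, 2.0, 3.0])
view = memoryview(data)
whole = as_object(view)        # the original object itself
part = as_object(view[1:3])    # a fresh array holding only the slice
```

Note that this sketch copies on slicing, whereas a NumPy-backed version
could presumably return a sliced array sharing the same memory.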

>>Also, OT (sorry), but I'm kind of worried about the memoryview ABI. If
>>it changes (and I intend to change it), cython modules compiled with
>>different cython versions will become incompatible if they call each
>>other through pxds. Maybe that should be defined as UB...
>>
>>> The semantic meaning of trailing [] is still sort of like the C++ meaning;
>>> that it templates the argument types (except it's lots of special cases in
>>> the compiler for various things rather than a Turing-complete template
>>> language...)
>>>
>>> Dag
>>>
>>>>
>>>> The words-to-punctuation ratio of the latter may hurt the eyes when
>>>> encountering it unprepared, but at least it doesn't require two type
>>>> names, of which the one before the brackets (i.e. "object") is mostly
>>>> useless. (Although it does reflect the notion that we are dealing with an
>>>> object here ...)
>>>>
>>>> Stefan
>>>> _______________________________________________
>>>> cython-devel mailing list
>>>> cython-devel at python.org
>>>> http://mail.python.org/mailman/listinfo/cython-devel