[Cython] Automatic conversion with fixed-size C arrays

Fri Jul 25 22:44:49 CEST 2014

On Fri, Jul 25, 2014 at 1:30 PM, Kurt Smith <kwmsmith at gmail.com> wrote:
>
>
> On Fri, Jul 25, 2014 at 1:16 PM, Robert Bradshaw
> <robertwb at math.washington.edu> wrote:
>>
>> On Fri, Jul 25, 2014 at 8:24 AM, Kurt Smith <kwmsmith at gmail.com> wrote:
>>>
>>>
>>> On Thu, Jul 24, 2014 at 11:16 PM, Robert Bradshaw <robertwb at gmail.com>
>>> wrote:
>>>>
>>>> On Thu, Jul 24, 2014 at 7:13 PM, Kurt Smith <kwmsmith at gmail.com> wrote:
>>>> >
>>>> > On Thu, Jul 24, 2014 at 2:07 PM, Robert Bradshaw <robertwb at gmail.com>
>>>> > wrote:
>>>> >>
>>>> >> On Fri, Jul 18, 2014 at 12:44 PM, Robert Bradshaw
>>>> >> <robertwb at gmail.com>
>>>> >> wrote:
>>>> >> > On Fri, Jul 18, 2014 at 12:08 PM, Kurt Smith <kwmsmith at gmail.com>
>>>> >> > wrote:
>>>> >> >> On Wed, Jul 16, 2014 at 1:02 PM, Robert Bradshaw
>>>> >> >> <robertwb at gmail.com>
>>>> >> >> wrote:
>>>> >> >>>
>>>> >> >>>
>>>> >> >>> Yes, this'd be nice to have. One difficulty with arrays is that
>>>> >> >>> they
>>>> >> >>> can't be returned by value, and so the ordinary from_py_function
>>>> >> >>> mechanism (which gets called recursively) would need to be
>>>> >> >>> adapted.
>>>> >> >>> (Allowing to_py_function to optionally be called by reference
>>>> >> >>> instead of by value could be nice as well from a performance
>>>> >> >>> standpoint.)
>>>> >> >>
>>>> >> >>
>>>> >> >> OK, thanks for the pointers.  I'll put some time on this over the
>>>> >> >> weekend.
>>>> >> >> Should I just make a PR when things are ready to review, or should
>>>> >> >> I
>>>> >> >> put up
>>>> >> >> an issue first?
>>>> >> >
>>>> >> > I think this thread is sufficient; looking forward to a pull
>>>> >> > request.
>>>> >>
>>>> >> Don't know if you had time to look at this yet,
>>>> >
>>>> >
>>>> > Yes, I've put in some time.  Initial focus is getting
>>>> >
>>>> >     cdef int a[10] = obj
>>>> >
>>>> > working, and then using that for structs.
>>>>
>>>> Sounds good.
>>>>
>>>> > Took a little while to re-orient myself with the codebase.
>>>> >
>>>> > I'm working on getting the `to_py_function` and `from_py_function`
>>>> > infrastructure to take arguments by reference; right now I'm getting
>>>> > something hacked into place, and I'd appreciate your review to point
>>>> > out the
>>>> > right way (or at least a better way) to do it.
>>>>
>>>> from_py_function always takes its argument (a PyObject*) by reference.
>>>> It's used as an rvalue, so might not make sense for arrays.
>>>
>>>
>>> My bad, I worded my response poorly -- what I mean is that currently the
>>> generated code is something like:
>>>
>>>     __pyx_v_t1 = xxx_from_py_function_xxx(temp_py_obj);
>>>
>>>     __pyx_v_a = __pyx_v_t1;
>>>
>>> where __pyx_v_a and __pyx_v_t1 are declared as fixed-size C arrays.
>>>
>>> This won't work, for reasons pointed out.
>>>
>>> So I'm trying to generate instead:
>>>
>>>     err_code = xxx_from_py_function_xxx(temp_py_obj, &__pyx_v_a[0], 10);
>>>     if (err_code == -1) { ... }
>>
>>
>> FWIW I think you can just write &__pyx_v_a
>
>
> Yes, thanks, this form will also generalize to multidimensional fixed-size
> arrays.  For that matter, we can just pass in __pyx_v_a, which decays to
> &__pyx_v_a[0] and has the same address as &__pyx_v_a (K&R 5.3).
>
>>
>>
>>>
>>> Where the 10 is the array length.  The function would initialize the
>>> array internally.  The python object is passed by reference as before, and
>>> the array is also passed by reference.  The array is assigned to just once,
>>> minimizing copying.  We can return a status code to propagate errors, etc.
>>> This seems like the cleanest code to me.  Thoughts?
>>
>>
>> Yes, that makes sense.
>>
>> One of my concerns was whether xxx_from_py_function_xxx was always
>> assigned to an lvalue, enabling this transformation, but I think with error
>> checking it's always assigned to a temporary, so we're OK here.
>>
>
> Sorry, I didn't quite follow that.  The lhs is an lvalue at the C level
> already, right?  And we'd want to use the array variable directly in the
> xxx_from_py_function_xxx call, not a temporary, since the `__pyx_v_a =
> __pyx_v_t1` temp assignment won't work.  What am I missing here?

I was worried there might be a case that the return value of
xxx_from_py_function_xxx was not immediately assigned to an lvalue,
but I think that's never the case.
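
To make the shape of the proposed call concrete, here is a rough,
self-contained C sketch of a converter that fills the caller's array in
place and returns a status code, instead of returning the array by value.
Everything named here is hypothetical: `fake_sequence` stands in for the
Python object (real generated code would take a PyObject* and go through
the C-API), and `fake_seq_to_int_array` is not the name Cython generates.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a Python sequence of integers. */
typedef struct {
    const long *items;
    size_t length;
} fake_sequence;

/* Fill the caller's array `out` (of length `n`) from `src`.
 * Returns 0 on success and -1 if the lengths do not match, so the
 * caller can propagate the error. `out` is left untouched on failure
 * and is written exactly once on success. */
static int fake_seq_to_int_array(const fake_sequence *src, int *out, size_t n)
{
    size_t i;

    assert(src != NULL && out != NULL);
    if (src->length != n) {
        return -1;
    }
    for (i = 0; i < n; i++) {
        out[i] = (int)src->items[i];
    }
    return 0;
}
```

A call site would then look like `if (fake_seq_to_int_array(&seq, a, 10)
== -1) { ... }` with `int a[10]` declared by the caller, so no temporary
array and no array assignment are needed.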

>> Are you thinking of transforming all xxx_from_py_function_xxx to be of
>> this form?
>
>
> No, I was thinking this would kick in for just fixed-size arrays, structs,
> and perhaps others as makes sense.  Everything else would remain as-is.
>
>>
>> It could make sense for structs, and make for cleaner error checking, but
>> would there be more overhead for simple types like converting to a long
>> (which are currently macros)?
>
>
> Agreed.  Simple types should remain as-is.
>
>>
>> If the length is an argument, would one have to check for whether to pass
>> this at every use of the from_py_function?
>
>
> No, my thinking is that this transformation (and passing in the array size)
> would only apply for fixed-size arrays.

Sounds good.
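
On the earlier question of passing __pyx_v_a, &__pyx_v_a, or
&__pyx_v_a[0]: as far as I understand it, all three yield the same
address for a fixed-size array, but the pointer types differ, which
matters for the parameter type of the converter. A small sketch (the
helper names are made up for illustration):

```c
#include <stddef.h>

/* Returns 1 if a, &a and &a[0] all compare equal as addresses. */
static int same_address_1d(void)
{
    int a[10];
    const void *decayed = a;      /* decays to int *, points at a[0] */
    const void *whole   = &a;     /* int (*)[10], points at the array */
    const void *first   = &a[0];  /* int *, points at a[0] */

    return decayed == whole && whole == first;
}

/* Size of what each pointer type points at: one element versus the
 * whole array. sizeof does not evaluate its operand, so reading the
 * uninitialized array is not an issue here. */
static size_t pointee_size_decayed(void)
{
    int a[10];
    return sizeof(*a);    /* sizeof(int) */
}

static size_t pointee_size_whole(void)
{
    int a[10];
    return sizeof(*&a);   /* sizeof(int[10]) */
}
```

So a converter declared with an `int *` parameter accepts `a` or `&a[0]`
directly, while `&a` would need a parameter of type `int (*)[10]`.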

- Robert