[Cython] CEP1000: Native dispatch through callables

mark florisson markflorisson88 at gmail.com
Fri Apr 13 14:46:27 CEST 2012


On 13 April 2012 12:38, Stefan Behnel <stefan_ml at behnel.de> wrote:
> Robert Bradshaw, 13.04.2012 12:17:
>> On Fri, Apr 13, 2012 at 1:52 AM, Dag Sverre Seljebotn wrote:
>>> On 04/13/2012 01:38 AM, Robert Bradshaw wrote:
>>>> Have you given any thought as to what happens if __call__ is
>>>> re-assigned for an object (or subclass of an object) supporting this
>>>> interface? Or is this out of scope?
>>>
>>> Out-of-scope, I'd say. Though you can always write an object that detects if
>>> you assign to __call__...
>
> +1 for out of scope. This is a pure C level feature.
>
>
>>>> Minor nit: I don't think should_dereference is worth branching on, if
>>>> one wants to save the allocation one can still use a variable-sized
>>>> type and point to oneself. Yes, that's an extra dereference, but the
>>>> memory is already likely close and it greatly simplifies the logic.
>>>> But I could be wrong here.
>>>
>>>
>>> Those minor nits are exactly what I seek; since Travis will have the first
>>> implementation in numba<->SciPy, I just want to make sure that what he does
>>> will work efficiently with Cython.
>>
>> +1
>>
>> I have to admit building/invoking these var-arg-sized __nativecall__
>> records seems painful. Here's another suggestion:
>>
>> struct {
>>     void* pointer;
>>     size_t signature; // compressed binary representation, 95% coverage
>>     char* long_signature; // used if signature is not representable in
>>                           // a size_t, as indicated by signature = 0
>> } record;
>>
>> These char* could optionally be allocated at the end of the record*
>> for optimal locality. We could even dispense with the binary
>> signature, but having that option allows us to avoid strcmp for stuff
>> like d)d and ffi)f.
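
To make the record proposal above concrete, here is a minimal sketch of how a short signature such as d)d could be packed into a size_t so that matching avoids strcmp. The byte-per-character encoding, the type name native_record, and the helpers pack_signature and record_matches are assumptions made for illustration only; the thread does not specify the actual compressed representation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical record, following the struct quoted above. */
typedef struct {
    void *pointer;
    size_t signature;            /* 0 means: consult long_signature */
    const char *long_signature;  /* may be NULL when signature != 0 */
} native_record;

/* Assumed encoding: one byte per signature character, packed into a
 * size_t. Returns 0 when the string is empty or does not fit. */
static size_t pack_signature(const char *sig)
{
    assert(sig != NULL);
    size_t len = strlen(sig);
    size_t packed = 0;
    if (len == 0 || len > sizeof(size_t))
        return 0;
    for (size_t i = 0; i < len; i++)
        packed |= (size_t)(unsigned char)sig[i] << (8 * i);
    return packed;
}

/* Compare integers when both sides have a packed form; otherwise
 * fall back to comparing the long signature strings. */
static int record_matches(const native_record *rec, const char *wanted)
{
    size_t packed = pack_signature(wanted);
    if (packed != 0 && rec->signature != 0)
        return rec->signature == packed;
    if (rec->long_signature == NULL)
        return 0;
    return strcmp(rec->long_signature, wanted) == 0;
}
```

With this encoding, a caller that wants d)d computes the packed value once and the common case is a single integer comparison; only signatures longer than sizeof(size_t) characters take the strcmp path.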
>
> Assuming we use literals and a const char* for the signature, the C
> compiler would cut down the number of signature strings automatically for
> us. And a pointer comparison is the same as a size_t comparison.
>
> That would only apply at a per-module level, though, so it would require an
> indirection for the signature IDs. But it would avoid a global registry.
>
> Another idea would be to set the signature ID field to 0 at the beginning
> and call a C-API function to let the current runtime assign an ID > 0,
> unique for the currently running application. Then every user would only
> have to parse the signature once to adapt to the respective ID and could
> otherwise branch based on it directly.
>
> For Cython, we could generate a static ID variable for each typed call that
> we found in the sources. When encountering a C signature on a callable,
> either a) the ID variable is still empty (initial case), then we parse the
> signature to see if it matches the expected signature. If it does, we
> assign the corresponding ID to the static ID variable and issue a direct
> call. If b) the ID field is already set (normal case), we compare the
> signature IDs directly and issue a C call if they match. If the IDs do not
> match, we issue a normal Python call.
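
The two cases described above could be sketched roughly as follows. This is an illustration under stated assumptions, not Cython's generated code: the names native_callable, dispatch_kind and dispatch are hypothetical, the runtime is assumed to have already assigned a nonzero ID to each callable, and "parsing" the signature is reduced to a strcmp against the expected string.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical callable record: a runtime-assigned ID plus the
 * signature string it was derived from. */
typedef struct {
    void *pointer;
    size_t signature_id;     /* assumed > 0 once assigned by the runtime */
    const char *signature;
} native_callable;

typedef enum { CALL_NATIVE, CALL_PYTHON } dispatch_kind;

/* site_id points at the static ID variable of one typed call site;
 * 0 means the site has not yet seen a matching signature. */
static dispatch_kind dispatch(const native_callable *c, size_t *site_id,
                              const char *expected)
{
    assert(c != NULL && c->signature_id != 0);
    if (*site_id == 0) {
        /* a) initial case: inspect the signature string once */
        if (strcmp(c->signature, expected) == 0) {
            *site_id = c->signature_id;   /* remember the matching ID */
            return CALL_NATIVE;
        }
        return CALL_PYTHON;
    }
    /* b) normal case: a plain integer comparison */
    return (c->signature_id == *site_id) ? CALL_NATIVE : CALL_PYTHON;
}
```

After the first successful match, every later call through that site costs one comparison, and a callable with a different ID falls back to the normal Python call.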
>
>
>>> Right... if we do some work to synchronize the types for Cython modules
>>> generated by the same version of Cython, we're left with 3-4 types for
>>> Cython, right? Then a couple for numba and one for f2py; so on the order of
>>> 10?
>>
>> No, I think each closure is its own type.
>
> And that even applies to fused functions, right? They'd have one closure
> for each type combination.
>

Hm, there is only one type for the function (CyFunction), but there is
a different type for the closure scope of each closure. The same goes
for FusedFunction: there is only one type, and each instance contains
a dict of specializations (mapping signatures to PyCFunctions).

(But each module still has its own function types, of course.)

>>> An alternative is to do something funny in the type object to get across the
>>> offset-in-object information (abusing the docstring, or introduce our own
>>> flag which means that the type object has an additional non-standard field
>>> at the end).
>>
>> It's a hack, but the flag + non-standard field idea might just work...
>
> Plus, it wouldn't have to stay a non-standard field. If it's accepted into
> CPython 3.4, we could safely use it in all existing versions of CPython.
>
> Stefan

