[Cython] CEP1000: Native dispatch through callables
Robert Bradshaw
robertwb at gmail.com
Fri Apr 13 22:54:12 CEST 2012
On Fri, Apr 13, 2012 at 12:52 PM, Stefan Behnel <stefan_ml at behnel.de> wrote:
> Robert Bradshaw, 13.04.2012 20:21:
>> On Fri, Apr 13, 2012 at 10:26 AM, Robert Bradshaw wrote:
>>> On Fri, Apr 13, 2012 at 4:59 AM, Dag Sverre Seljebotn wrote:
>>>> On 04/13/2012 01:38 PM, Stefan Behnel wrote:
>>>>> That would only apply at a per-module level, though, so it would
>>>>> require an indirection for the signature IDs. But it would avoid a
>>>>> global registry.
>>>>>
>>>>> Another idea would be to set the signature ID field to 0 at the beginning
>>>>> and call a C-API function to let the current runtime assign an ID> 0,
>>>>> unique for the currently running application. Then every user would only
>>>>> have to parse the signature once to adapt to the respective ID and could
>>>>> otherwise branch based on it directly.
>>>>>
>>>>> For Cython, we could generate a static ID variable for each typed call
>>>>> that
>>>>> we found in the sources. When encountering a C signature on a callable,
>>>>> either a) the ID variable is still empty (initial case), then we parse the
>>>>> signature to see if it matches the expected signature. If it does, we
>>>>> assign the corresponding ID to the static ID variable and issue a direct
>>>>> call. If b) the ID field is already set (normal case), we compare the
>>>>> signature IDs directly and issue a C call it they match. If the IDs do not
>>>>> match, we issue a normal Python call.
>>>
>>> If I understand correctly, you're proposing
>>>
>>> struct {
>>> char* sig;
>>> long id;
>>> } sig_t;
>>>
>>> Where comparison would (sometimes?) compute id from sig by augmenting
>>> a global counter and dict? Might be expensive to bootstrap, but
>>> eventually all relevant ids would be filled in and it would be quick.
>
> Yes. If a function is only called once, the overhead won't matter. And
> starting from the second call, it would either be fast if the function
> signature matches or slow anyway if it doesn't match.
There's still data locality issues, including the cached id for the
caller as well as the callee.
>>> Interesting. I wonder what the performance penalty would be over
>>> assuming id is statically computed lots of the time, and using that to
>>> compare against fixed values. And there's memory locality issues as
>>> well.
>>
>> To clarify, I'd really like to have the following as fast as possible:
>>
>> if (callable.sig.id == X) {
>> // yep, that's what I thought
>> } else {
>> // generic call
>> }
>>
>> Alternatively, one can imagine wanting to do:
>>
>> switch (callable.sig.id) {
>> case X:
>> // I can do this
>> case Y:
>> // this is common and fast as well
>> ...
>> default:
>> // generic call
>> }
>
> Yes, that's the idea.
>
>
>> There is some question about how promotion should work (e.g. should
>> this flexibility reside in the caller or the callee (or both, though
>> that could result in a quadratic number of comparisons)?)
>
> Callees could expose multiple signatures (which would result in a direct
> call for each, without further comparisons), then the caller would have to
> choose between those. However, if none matches exactly, the caller might
> want to promote its arguments and try more signatures. In any case, it's
> the caller that does the work, never the callee.
>
> We could generate code like this:
>
> /* cdef int x = ...
> * cdef long y = ...
> * cdef int z # interesting: what if z is not typed?
> * z = func(x, y)
> */
>
> if (func.sig.id == id("[int,long] -> int")) {
> z = ((cast)func.cfunc) (x,y);
> } else if (sizeof(long) > sizeof(int) &&
> (func.sig.id == id("[long,long] -> int"))) {
> z = ((cast)func.cfunc) ((long)x, y);
> } etc. ... else {
> /* pack and call as Python function */
> }
>
> Meaning, the C compiler could reduce the amount of optimistic call code at
> compile time.
Interesting idea. Alternatively, I wonder if the signature could
reflect exactly-sized types rather than int/long/etc. Perhaps that
would make the code more complicated on both ends...
I'm assuming your id(...) is computed at compile time in this example,
right? Otherwise it would get a bit messier.
- Robert
More information about the cython-devel
mailing list