[Cython] CEP1000: Native dispatch through callables

Nathaniel Smith njs at pobox.com
Tue Apr 17 17:16:33 CEST 2012


On Tue, Apr 17, 2012 at 3:34 PM, Dag Sverre Seljebotn
<d.s.seljebotn at astro.uio.no> wrote:
> On 04/17/2012 04:20 PM, Nathaniel Smith wrote:
>> Since you've set this up... I have a suggestion for something that may
>> be worth trying, though I've hesitated to propose it seriously. And
>> that is, an API where instead of scanning a table, the C-callable
>> exposes a pointer-to-function, something like
>>   int get_funcptr(PyObject * self, PyBytesObject * signature,
>>                   struct c_function_info * out)
>
>
> Hmm. There's many ways to implement that function though. It shifts the
> scanning logic from the caller to the callee;

Yes, that's part of the point :-). Or, well, I guess the point is more
that it shifts the scanning logic from the ABI docs to the callee.
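
To make "scanning in the callee" concrete, here is a minimal sketch in
plain C. It is only an illustration under assumptions: the fields of
c_function_info, the sig_entry struct, the one-character-per-type
signature strings, and the use of a plain C string instead of
PyObject/PyBytesObject arguments are all made up here and are not part
of the proposal.

```c
#include <stddef.h>
#include <string.h>

/* Generic function pointer type; callers cast back to the real type. */
typedef void (*generic_fn)(void);

/* Hypothetical layout: the thread never spells out this struct's fields. */
struct c_function_info {
    generic_fn funcptr;         /* native entry point for the signature */
};

/* One specialization the callee already has compiled. */
struct sig_entry {
    const char *signature;      /* made-up encoding: return type, then args */
    generic_fn funcptr;
};

static double twice_double(double x) { return 2.0 * x; }
static long twice_long(long x) { return 2 * x; }

/* The table is private to the callee; callers never see its layout. */
static const struct sig_entry table[] = {
    { "dd", (generic_fn)twice_double },
    { "ll", (generic_fn)twice_long },
};

/* Returns 0 and fills *out on success, -1 if the signature is unsupported. */
static int get_funcptr(const char *signature, struct c_function_info *out)
{
    for (size_t i = 0; i < sizeof table / sizeof table[0]; i++) {
        if (strcmp(table[i].signature, signature) == 0) {
            out->funcptr = table[i].funcptr;
            return 0;
        }
    }
    return -1;
}
```

A caller would ask for "dd", check the return code, and cast funcptr
back to double (*)(double) before calling it. How the callee stores
and searches its signatures stays its own business.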

> you would need to call it
> multiple times for different signatures...

Yes, I'm not sure what I think about this -- there are arguments
either way for who should handle promotion. E.g., imagine the
following situation:

We have a JITable function
We have already JITed the int64 version of this function
Now we want to call it with an int32
Question: should we promote to int64, or should we JIT?

Later you write:
> if found in table:
>   do dispatch
> else if object supports get_funcptr:
>   call get_funcptr
> else:
>   python dispatch

If we do promotion during the table scanning, then we'll never call
get_funcptr and we'll never JIT an int32 version. OTOH, if we call
get_funcptr before doing promotion, then we'll end up calling
get_funcptr multiple times for different signatures regardless.
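
The two orderings can be compared with a toy model. This is a sketch,
not anyone's actual implementation: the callee struct, the jit_calls
counter, the single promotion rule ("i" for int32 to "l" for int64) and
the always-succeeding get_funcptr are all assumptions made up for the
example.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical callee: signatures already compiled, plus a JIT hook. */
struct callee {
    const char *ready[4];   /* NULL-terminated list of compiled signatures */
    int jit_calls;          /* how many times get_funcptr was invoked */
};

enum outcome { USED_TABLE, USED_PROMOTED_TABLE, USED_JIT, USED_PYTHON };

static int in_table(const struct callee *c, const char *sig)
{
    for (size_t i = 0; i < 4 && c->ready[i] != NULL; i++)
        if (strcmp(c->ready[i], sig) == 0)
            return 1;
    return 0;
}

/* Pretend JIT: accepts any signature and records that it was asked. */
static int get_funcptr(struct callee *c, const char *sig)
{
    (void)sig;
    c->jit_calls++;
    return 0;
}

/* Made-up promotion rule: int32 ("i") may be widened to int64 ("l"). */
static const char *promote(const char *sig)
{
    return strcmp(sig, "i") == 0 ? "l" : NULL;
}

/* Policy A: promote while scanning the table, ask the callee last. */
static enum outcome dispatch_promote_first(struct callee *c, const char *sig)
{
    if (in_table(c, sig)) return USED_TABLE;
    const char *p = promote(sig);
    if (p != NULL && in_table(c, p)) return USED_PROMOTED_TABLE;
    if (get_funcptr(c, sig) == 0) return USED_JIT;
    return USED_PYTHON;
}

/* Policy B: exact match, then ask the callee, promote only as a fallback. */
static enum outcome dispatch_jit_first(struct callee *c, const char *sig)
{
    if (in_table(c, sig)) return USED_TABLE;
    if (get_funcptr(c, sig) == 0) return USED_JIT;
    const char *p = promote(sig);
    if (p != NULL && in_table(c, p)) return USED_PROMOTED_TABLE;
    return USED_PYTHON;
}
```

With only "l" in the table and an "i" request, policy A settles on the
promoted int64 entry and never reaches get_funcptr, while policy B
reaches get_funcptr on every exact-match miss.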

OTOOH, there are a *lot* of possible coercions for, say, a 3-argument
function with return, so just enumerating them is not necessarily a
good strategy. Possibly if get_funcptr can't handle the initial
signature, it should return a table of signatures that it *is* willing
to handle... assuming that most callees will either be able to handle
a fixed set of types (cython variants) or else handle pretty much
anything (JIT), and only the former will reach this code path. Or we
could write down the allowed promotions (stealing from the C99 spec),
and require the callee to pick the best promotion if it can't handle
the initial request. Or we could put this part off until version 2,
once we see how eager callers are to actually implement a real
promotion engine.
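
The "return a table of signatures it is willing to handle" variant
might look roughly like the following. Again this is a sketch under
assumptions: the supported field, the signature strings and the plain
C string argument are hypothetical and chosen only to show the shape
of the idea.

```c
#include <stddef.h>
#include <string.h>

typedef void (*generic_fn)(void);

/* Hypothetical layout; the "supported" field is an assumption. */
struct c_function_info {
    generic_fn funcptr;             /* set on success, NULL on failure */
    const char *const *supported;   /* on failure: NULL-terminated list */
};

static long long square_i64(long long x) { return x * x; }
static double square_f64(double x) { return x * x; }

/* A fixed set of variants, as a Cython-style callee would have. */
static const char *const supported_sigs[] = { "l", "d", NULL };
static const generic_fn supported_fns[] = {
    (generic_fn)square_i64,
    (generic_fn)square_f64,
};

/* Returns 0 on an exact match. On a miss returns -1 and hands back the
   list of signatures the callee can serve, so the caller picks a
   promotion itself instead of probing one signature at a time. */
static int get_funcptr(const char *sig, struct c_function_info *out)
{
    for (size_t i = 0; supported_sigs[i] != NULL; i++) {
        if (strcmp(supported_sigs[i], sig) == 0) {
            out->funcptr = supported_fns[i];
            out->supported = NULL;
            return 0;
        }
    }
    out->funcptr = NULL;
    out->supported = supported_sigs;
    return -1;
}
```

A caller that asked for "i" would get -1 plus the list {"l", "d"}, and
could then retry with whichever entry its own promotion rules prefer.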

> But if the overhead can be shown to be minuscule then it does perhaps make
> the API nicer, even if it feels like paying for nothing at the moment. But
> see below.
>
> Will definitely not get around to this today; anyone else feel free...
>
>
>>
>> The rationale is, if we want to support JITed functions where new
>> function pointers may be generated on the fly, the array approach has
>> a serious problem. You have to decide how many array slots to allocate
>> ahead of time, and if you run out, then... too bad. I guess you get to
>
>
> Note that the table is jumped to by a pointer in the PyObject, i.e. the
> PyObject I've tested with is
>
> [object data, &table, table]

Oh, I see! I thought you were embedding it in the object, to avoid an
extra indirection (and potential cache miss). That's probably
necessary, for the reasons you say, but also makes the get_funcptr
approach potentially more competitive.

> So a JIT could have the table in a separate location on the heap, then it
> can allocate a new table, copy over the contents, and when everything is
> ready, then do an atomic pointer update (using the assembly instructions/gcc
> intrinsics, not pthreads or locking).
>
> The old table would need to linger for a bit, but could at latest be
> deallocated when the PyObject is deallocated.

IMHO we should just hold the GIL through lookups, which would simplify
this, but that's mostly based on the naive intuition that we shouldn't
be passing around Python boxes in no-GIL code. Maybe there are good
reasons to.
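
For reference, the copy-then-swap scheme Dag describes above could be
sketched with C11 atomics as below. This is an illustration under
assumptions, not the tested prototype: the struct names, the prev
chain used to keep old tables alive, and the restriction to a single
writer at a time (for instance a JIT that holds its own lock, or the
GIL) are all made up for the example.

```c
#include <stdatomic.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical table: old versions are chained so that readers still
   holding a stale pointer never touch freed memory. */
struct sig_table {
    struct sig_table *prev;     /* older table, freed with the owner */
    size_t count;
    const char *sigs[];         /* flexible array of signature strings */
};

struct callable {
    _Atomic(struct sig_table *) table;  /* readers load this lock-free */
};

/* Single writer assumed. Builds a bigger copy, then publishes it with
   one atomic store; the old table lingers on the prev chain. */
static int callable_add(struct callable *c, const char *sig)
{
    struct sig_table *old =
        atomic_load_explicit(&c->table, memory_order_acquire);
    size_t n = old ? old->count : 0;
    struct sig_table *fresh =
        malloc(sizeof *fresh + (n + 1) * sizeof fresh->sigs[0]);
    if (fresh == NULL)
        return -1;
    if (n > 0)
        memcpy(fresh->sigs, old->sigs, n * sizeof old->sigs[0]);
    fresh->sigs[n] = sig;
    fresh->count = n + 1;
    fresh->prev = old;
    atomic_store_explicit(&c->table, fresh, memory_order_release);
    return 0;
}

/* Reader side: one acquire load, then an ordinary scan. */
static int callable_has(struct callable *c, const char *sig)
{
    struct sig_table *t =
        atomic_load_explicit(&c->table, memory_order_acquire);
    for (size_t i = 0; t != NULL && i < t->count; i++)
        if (strcmp(t->sigs[i], sig) == 0)
            return 1;
    return 0;
}

/* Called when the owning object is deallocated: frees every version. */
static void callable_free(struct callable *c)
{
    struct sig_table *t = atomic_load(&c->table);
    while (t != NULL) {
        struct sig_table *next = t->prev;
        free(t);
        t = next;
    }
    atomic_store(&c->table, NULL);
}
```

Holding the GIL through lookups would make the atomics and the
lingering old tables unnecessary, since no reader could observe a
table while it is being replaced.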

- N

