[Python-Dev] C-level duck typing

Stefan Behnel stefan_ml at behnel.de
Wed May 16 14:16:05 CEST 2012


Stefan Behnel, 16.05.2012 13:13:
> Dag Sverre Seljebotn, 16.05.2012 12:48:
>> On 05/16/2012 11:50 AM, "Martin v. Löwis" wrote:
>>>> Agreed in general, but in this case, it's really not that easy. A C
>>>> function call involves a certain overhead all by itself, so calling into
>>>> the C-API multiple times may be substantially more costly than, say,
>>>> calling through a function pointer once and then running over a returned C
>>>> array comparing numbers. And definitely way more costly than running over
>>>> an array that the type struct points to directly. We are not talking about
>>>> hundreds of entries here, just a few. A linear scan in 64 bit steps over
>>>> something like a hundred bytes in the L1 cache should hardly be measurable.
>>>
>>> I give up, then. I fail to understand the problem. Apparently, you want
>>> to do something with the value you get from this lookup operation, but
>>> that something won't involve function calls (or else the function call
>>> overhead for the lookup wouldn't be relevant).
>>
>> In our specific case the value would be an offset added to the PyObject*,
>> and there we would find a pointer to a C function (together with a 64-bit
>> signature), and calling that C function (after checking the 64 bit
>> signature) is our final objective.
> 
> I think the use case hasn't been communicated all that clearly yet. Let's
> give it another try.
> 
> Imagine we have two sides, one that provides a callable and the other side
> that wants to call it. Both sides are implemented in C, so the callee has a
> C signature and the caller has the arguments available as C data types. The
> signature may or may not match the argument types exactly (float vs.
> double, int vs. long, ...), because the caller and the callee know nothing
> about each other initially, they just happen to appear in the same program
> at runtime. All they know is that they could call each other through Python
> space, but that would require data conversion, tuple packing, calling,
> tuple unpacking, data unpacking, and then potentially the same thing on the
> way back. They want to avoid that overhead.
> 
> Now, the caller needs to figure out if the callee has a compatible
> signature. The callee may provide more than one signature (i.e. more than
> one C call entry point), perhaps because it is implemented to deal with
> different input data types efficiently, or perhaps because it can
> efficiently convert them to its expected input. So, there is a signature on
> the caller side given by the argument types it holds, and a couple of
> signatures on the callee side that can accept different C data input. Then
> the caller needs to find out which signatures there are and match them
> against what it can efficiently call. It may even be a JIT compiler that
> can generate an efficient call signature on the fly, given a suitable
> signature on the callee side.
> 
> An example of this is an algorithm that evaluates a user-provided function
> on a large NumPy array. The caller knows what array type it is operating
> on, and the user-provided function may be designed to efficiently operate
> on arrays of int, float and double entries.
> 
> Does this use case make sense to everyone?
> 
> The reason why we are discussing this on python-dev is that we are looking
> for a general way to expose these C level signatures within the Python
> ecosystem. And Dag's idea was to expose them as part of the type object,
> basically as an addition to the current Python level tp_call() slot.

... and to finish the loop that I started here (sorry for being verbose):

The proposal that Dag referenced describes a more generic way to make this
kind of extension to type objects from user code. Basically, it allows
implementers to say "my type object has capability X", in a C-ish kind of
way. And the above C signature protocol would be one of those capabilities.
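
To make the signature matching described in the quoted text a bit more
concrete, here is a minimal sketch in C. It is not taken from the proposal:
the names (native_entry, native_table, find_native, map_doubles) and the
64-bit signature encoding are made up for illustration, and the table is
passed around directly rather than being found through the type object.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 64-bit signature keys. The encoding is an assumption made
 * for this sketch, not something defined by the proposal. */
#define SIG_D_D UINT64_C(0x6464)  /* double f(double) */
#define SIG_F_F UINT64_C(0x6666)  /* float  f(float)  */
#define SIG_I_I UINT64_C(0x6969)  /* int    f(int)    */

typedef void (*generic_fn)(void);

/* One C entry point of the callee, tagged with its signature. */
typedef struct {
    uint64_t signature;
    generic_fn func;
} native_entry;

/* The handful of entry points a callee exposes. */
typedef struct {
    size_t count;
    const native_entry *entries;
} native_table;

/* Linear scan over a few 64-bit keys; returns NULL if nothing matches. */
static generic_fn find_native(const native_table *table, uint64_t wanted)
{
    assert(table != NULL);
    for (size_t i = 0; i < table->count; i++) {
        if (table->entries[i].signature == wanted)
            return table->entries[i].func;
    }
    return NULL;
}

/* Callee side: one function, offered for two input types. */
static double square_d(double x) { return x * x; }
static float square_f(float x) { return x * x; }

static const native_entry square_entries[] = {
    { SIG_F_F, (generic_fn)square_f },
    { SIG_D_D, (generic_fn)square_d },
};
static const native_table square_table = { 2, square_entries };

/* Caller side: holds an array of doubles, so it asks for double f(double).
 * Returns 0 on success, -1 if the caller has to fall back to a call
 * through Python space. */
static int map_doubles(const native_table *callee, double *data, size_t n)
{
    generic_fn f = find_native(callee, SIG_D_D);
    if (f == NULL)
        return -1;
    /* Cast back to the real type before calling. */
    double (*fd)(double) = (double (*)(double))f;
    for (size_t i = 0; i < n; i++)
        data[i] = fd(data[i]);
    return 0;
}
```

A caller would use map_doubles(&square_table, data, n) and take the slow
path through Python only when it returns -1. The lookup happens once per
array, not once per element, which is why a plain linear scan is good enough.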

Personally, I wouldn't mind making the specific signature extension a
proposal in its own right instead of asking for a general extension mechanism
for arbitrary capabilities (although that still sounds tempting).

Stefan


