[Cython] CEP1000: Native dispatch through callables

Dag Sverre Seljebotn d.s.seljebotn at astro.uio.no
Sun Apr 15 10:15:53 CEST 2012


On 04/15/2012 09:39 AM, Robert Bradshaw wrote:
> On Sat, Apr 14, 2012 at 11:58 PM, Dag Sverre Seljebotn
> <d.s.seljebotn at astro.uio.no>  wrote:
>> Ah, Cython objects. Didn't think of that. More below.
>>
>>
>> On 04/14/2012 11:02 PM, Stefan Behnel wrote:
>>>
>>> Hi,
>>>
>>> thanks for writing this up. Comments inline as I read through it.
>>>
>>> Dag Sverre Seljebotn, 14.04.2012 21:08:
>>>>
>>>> each described by a function pointer and a signature specification
>>>> string, such as "id)i" for {{{int f(int, double)}}}.
>>>
>>>
>>> How do we deal with object argument types? Do we care on the caller side?
>>> Functions might have alternative signatures that differ in the type of
>>> their object parameters. Or should we handle this inside of the caller and
>>> expect that it's something like a fused function with internal dispatch in
>>> that case?
>>
>>>
>>> Personally, I think there is not enough to gain from object parameters
>>> that we should handle it on the caller side. The callee can dispatch
>>> those if necessary.
>>>
>>> What about signatures that require an object when we have a C typed value?
>>>
>>> What about signatures that require a C typed argument when we have an
>>> arbitrary object value in our call parameters?
>>>
>>> We should also strip the "self" argument from the parameter list of
>>> methods. That's handled by the attribute lookup before even getting at the
>>> callable.
>>
>> On 04/15/2012 07:59 AM, Robert Bradshaw wrote:
>>> It would certainly be useful to have special syntax for memory views
>>> (after nailing down a well-defined ABI for them) and builtin types.
>>> Being able to declare something as taking a
>>> "sage.rings.integer.Integer" could also prove useful, but could result
>>> in long (and prefix-sharing) signatures, favoring the
>>> runtime-allocated ids.
>>
>>
>> I don't think describing Cython objects in this cross-tool CEP would work
>> nicely; this is for standardized ABIs only (we can't do memoryviews either
>> until their ABI is standard).
>>
>> I think I prefer to a) exclude it now, and b) down the line we need another
>> cross-tool ABI to communicate vtables, and then we could put that into this
>> CEP then.
>>
>> I strongly believe we should go with the Go "duck-typing" approach for
>> interfaces, i.e. it is not the declared name that should be compared but the
>> method names and signatures.
>>
>> The only question that needs answering for CEP1000 is: Would this blow up
>> the signature string enough that interning is the only viable option?
>
> Exactly.
>
>> Some strcmp solutions:
>>
>>   a) Hash each vtable descriptor to 160 bits, and assume the hash is unique.
>> Still, a couple of interfaces would blow up the signature string a lot.
>>
>>   b) Modify approach B in CEP 1000 to this: If it is longer than 160 bits,
>> take a full cryptographic hash, and just assume there won't be hash
>> collisions (like git does). This still saves for short signature strings,
>> and avoids interning at the cost of doing 160-bit comparisons.
>>
>> Both of these require other ways of getting at the actual string data. But I
>> still like b) above better than interning.
>
> Requiring an implementation of (or at least access to) a cryptographic
> hash greatly complicates the spec. (On another note, even a simple
> hash as a prefix might be useful to prevent a lot of false partial
> matches, e.g. "sage.rings...") 160 * n bits starts to get large too
> (and we'd have to twiddle them to insert/avoid a "dash" every 16
> bytes).

Do you really think it complicates the spec? SHA-1 is pretty standard, 
and Python ships with hashlib (the hashing part isn't performance critical).

I prefer hashing to string-interning as it can still be done at
compile time etc. 160 bits isn't worse than the second-to-best strcmp 
case of a 256-bit function entry.
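
As a rough illustration of why the hashing side is cheap to specify, here is
a minimal Python sketch using hashlib. The helper name signature_digest and
the choice of ASCII encoding are assumptions made for this example; they are
not part of the CEP.

```python
import hashlib


def signature_digest(signature: str) -> bytes:
    """Return the 20-byte (160-bit) SHA-1 digest of a signature string.

    Hypothetical helper: the CEP does not define this function, and
    encoding the signature as ASCII is an assumption.
    """
    return hashlib.sha1(signature.encode("ascii")).digest()


# The digest depends only on the string, so a code generator could
# compute it ahead of time and emit it as a constant.
digest = signature_digest("id)i")
print(len(digest) * 8)
```

Since the digest is a pure function of the signature string, nothing has to
be registered or allocated at runtime, unlike with interning.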

Shortening the hash to 120 bits (truncation) we could have a spec like this:

  - Short signature: [64 bit encoded signature, 64 bit funcptr]
  - Long signature: [64 bit hash, 64 bit pointer to full signature,
                     8 bit guard byte, 56 bits remaining hash,
                     64 bit funcptr]
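
The two entry layouts above could be sketched with Python's struct module as
follows. This is only an illustration under stated assumptions: the
little-endian byte order, the guard byte value, taking the first 15 bytes of
the SHA-1 digest as the 120-bit truncation, and the function names are all
hypothetical and not fixed by anything in this thread.

```python
import hashlib
import struct

GUARD = 0x00  # hypothetical guard byte value; the proposal does not fix one


def short_entry(encoded_sig: int, funcptr: int) -> bytes:
    """Pack [64 bit encoded signature, 64 bit funcptr] (16 bytes)."""
    return struct.pack("<QQ", encoded_sig, funcptr)


def long_entry(signature: str, sig_ptr: int, funcptr: int) -> bytes:
    """Pack [64 bit hash, 64 bit pointer to full signature,
    8 bit guard byte, 56 bits remaining hash, 64 bit funcptr] (32 bytes).
    """
    # Truncate the 160-bit SHA-1 digest to 120 bits (15 bytes).
    h = hashlib.sha1(signature.encode("ascii")).digest()[:15]
    # "<" selects little-endian with no padding between fields.
    return struct.pack("<8sQB7sQ", h[:8], sig_ptr, GUARD, h[8:], funcptr)


entry = long_entry("id)i", 0x1000, 0x2000)
print(len(entry) * 8)
```

With these field sizes a long entry comes to 256 bits and a short entry to
128 bits.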


Anyway: Looks like it's about time to do some benchmarks. I'll try to 
get around to it next week.


Dag

