[Cython] Fwd: Re: SEP 201 draft: Native callable objects

Dag Sverre Seljebotn d.s.seljebotn at astro.uio.no
Thu May 31 22:06:33 CEST 2012


Forgot to CC this list...

-------- Original Message --------
Subject: Re: [Cython] SEP 201 draft: Native callable objects
Date: Thu, 31 May 2012 22:06:02 +0200
From: Dag Sverre Seljebotn <d.s.seljebotn at astro.uio.no>
Reply-To: numfocus at googlegroups.com
To: numfocus at googlegroups.com

On 05/31/2012 09:29 PM, Dag Sverre Seljebotn wrote:
> On 05/31/2012 08:50 PM, Robert Bradshaw wrote:
>> On Thu, May 31, 2012 at 7:04 AM, Dag Sverre Seljebotn
>> <d.s.seljebotn at astro.uio.no> wrote:
>>> [Discussion on numfocus at googlegroups.com please]
>>>
>>> I've uploaded a draft-state SEP 201 (previously CEP 1000):
>>>
>>> https://github.com/numfocus/sep/blob/master/sep201.rst
>>>
>>> """
>>> Many callable objects are simply wrappers around native code. This
>>> holds for
>>> any Cython function, f2py functions, manually written CPython
>>> extensions,
>>> Numba, etc.
>>>
>>> Obviously, when native code calls other native code, it would be nice to
>>> skip the significant cost of boxing and unboxing all the arguments.
>>> """
>>>
>>>
>>> The thread about this on the Cython list is almost endless:
>>>
>>> http://thread.gmane.org/gmane.comp.python.cython.devel/13416/focus=13443
>>>
>>> There was a long discussion on the key-comparison vs. interned-string
>>> approach. I've written both up in SEP 201 since it was the major
>>> point of
>>> contention. There were some benchmarks starting here:
>>>
>>> http://thread.gmane.org/gmane.comp.python.cython.devel/13416/focus=13443
>>>
>>> And why provide a table and not a get_function_pointer starting here:
>>>
>>> http://thread.gmane.org/gmane.comp.python.cython.devel/13416/focus=13443
>>>
>>> For those who followed that and don't want to read the entire spec, the
>>> aspect of flags is new. How do we avoid duplicating entries or checking
>>> against two signatures in cases like a GIL-holding caller wanting to
>>> call a nogil function? My take: For key-comparison you can compare
>>> under a mask, for interned-string we should have an additional flags
>>> field.
>>>
>>> The situation is a bit awkward: The Cython list consensus (well, me and
>>> Robert Bradshaw) decided on what is "Approach 1" (key-comparison) in SEP
>>> 201. I pushed for that.
>>>
>>> Still, now that a month has passed, I just think key-comparison is
>>> too ugly,
>>> and that the interning mechanism shouldn't be *that* hard to code up,
>>> probably 500 lines of C code if one just requires the GIL in a first
>>> iteration, and that keeping the spec simpler is more important.
>>>
>>> So I'm tentatively proposing Approach 2.
>>
>> I'm still not convinced that a hybrid approach, where signatures below
>> some cutoff are compiled down to keys, is not a worthwhile approach.
>> This gets around variable-length keys (both the complexity and
>> possible runtime costs for long keys) and allows simple libraries to
>> produce and consume fast callables without participating in the
>> interning mechanism.
>
> I still think this gives us the "worst of both worlds", all the
> disadvantages and none of the advantages.

Wait -- the complexity of the key approach is in the compilation, but
avoiding any encoding/decoding would remove a major source of spec
complexity. So this could just work:

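#include <stdint.h>  /* uintptr_t */

typedef struct {
     union {
         char *interned_sigptr;  /* signature interned elsewhere */
         char short_sig[8];      /* short signature stored inline */
     };                          /* anonymous union (C11) */
     uintptr_t flags;            /* records, among other things, which member is in use */
     void *funcptr;
} native_callable_entry;         /* hypothetical name */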

And then flags contains whether the signature is short.
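
To make the flag idea concrete, here is a minimal sketch of how a caller
might compare two table entries. Everything named here is hypothetical
and not taken from the SEP 201 draft: the struct name
native_callable_entry, the flag bit NATIVE_SIG_SHORT, and the helpers
make_short_entry and entry_matches. The sketch also assumes that short
signatures are NUL-padded to 8 bytes and that a signature which fits
inline is always stored inline, so a short entry never equals an
interned one.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical flag bit: set when the signature is stored inline. */
#define NATIVE_SIG_SHORT ((uintptr_t)1)

typedef struct {
    union {
        const char *interned_sigptr; /* canonical pointer from an interner */
        char short_sig[8];           /* inline bytes, NUL-padded (assumption) */
    };                               /* anonymous union (C11) */
    uintptr_t flags;
    void *funcptr;
} native_callable_entry;             /* hypothetical name */

/* Build an entry holding a short signature of at most 8 bytes. */
static native_callable_entry make_short_entry(const char *sig, void *funcptr)
{
    native_callable_entry e;
    size_t n = strlen(sig);

    assert(n <= sizeof e.short_sig);
    memset(&e, 0, sizeof e);         /* zero padding so memcmp is reliable */
    memcpy(e.short_sig, sig, n);
    e.flags = NATIVE_SIG_SHORT;
    e.funcptr = funcptr;
    return e;
}

/* Return 1 if both entries carry the same signature, 0 otherwise. */
static int entry_matches(const native_callable_entry *a,
                         const native_callable_entry *b)
{
    if ((a->flags ^ b->flags) & NATIVE_SIG_SHORT)
        return 0;                    /* one inline, one interned */
    if (a->flags & NATIVE_SIG_SHORT)
        return memcmp(a->short_sig, b->short_sig, sizeof a->short_sig) == 0;
    return a->interned_sigptr == b->interned_sigptr;
}
```

The extra cost over a pure pointer comparison is the flag test and a
branch, which is the "couple more instructions" trade-off.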

I think compiling down a signature that doesn't end with 0x00 is rather
complicated even if there's no Huffman. The point is to be able to hand
off a char* to a signature parsing routine easily, with no
decompilation. Using a flag avoids that but requires a couple more
instructions.
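
The "hand off a char*" point can be sketched as below. This assumes
(the draft does not say so) that an inline signature holds at most 7
characters plus a terminating NUL, so that both storage forms are
ordinary C strings. The names native_callable_entry, NATIVE_SIG_SHORT,
entry_sig_cstr and count_sig_chars are hypothetical; count_sig_chars is
only a stand-in for a real signature parser.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NATIVE_SIG_SHORT ((uintptr_t)1)  /* hypothetical flag bit */

typedef struct {
    union {
        const char *interned_sigptr;
        char short_sig[8];   /* assumed: at most 7 chars plus a NUL */
    };                       /* anonymous union (C11) */
    uintptr_t flags;
    void *funcptr;
} native_callable_entry;     /* hypothetical name */

/* Return a NUL-terminated signature string for either storage form. */
static const char *entry_sig_cstr(const native_callable_entry *e)
{
    assert(e != NULL);
    if (e->flags & NATIVE_SIG_SHORT)
        return e->short_sig;       /* inline bytes are already a C string */
    return e->interned_sigptr;     /* pointer owned by the interner */
}

/* Stand-in for a real signature parser: just counts characters. */
static size_t count_sig_chars(const char *sig)
{
    return strlen(sig);
}
```

No decoding step is needed in either branch; the flag only selects
which member of the union to read.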

Pro: Get somewhere without actually implementing interning (that's like
a 3-hour job if you require the GIL, a little more to make sure it's
forward-compatible)

Cons: Is more complicated. Couple of extra assembly instructions but
they probably don't matter.
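
For reference, the property interning has to provide is small: equal
strings map to one canonical pointer, so later comparisons are pointer
comparisons. A deliberately naive sketch follows, assuming all calls are
serialized (for example, made with the GIL held). The function
intern_signature and the table variables are hypothetical and not part
of any spec; a real implementation would use a hash table and a
forward-compatible way of sharing the table between modules.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical interner state: a growable array searched linearly.
 * Not thread-safe; callers are assumed to be serialized. */
static char **intern_table = NULL;
static size_t intern_count = 0;
static size_t intern_capacity = 0;

/* Return the canonical pointer for sig, or NULL on allocation failure. */
static const char *intern_signature(const char *sig)
{
    size_t i, len;
    char *copy;

    assert(sig != NULL);

    /* Already interned? Return the existing canonical copy. */
    for (i = 0; i < intern_count; i++) {
        if (strcmp(intern_table[i], sig) == 0)
            return intern_table[i];
    }

    /* Grow the table when it is full. */
    if (intern_count == intern_capacity) {
        size_t new_cap = intern_capacity ? intern_capacity * 2 : 16;
        char **grown = realloc(intern_table, new_cap * sizeof *grown);
        if (grown == NULL)
            return NULL;
        intern_table = grown;
        intern_capacity = new_cap;
    }

    /* Store a private copy so the caller's buffer may be freed. */
    len = strlen(sig) + 1;
    copy = malloc(len);
    if (copy == NULL)
        return NULL;
    memcpy(copy, sig, len);
    intern_table[intern_count++] = copy;
    return copy;
}
```

Two callers that intern equal signature strings from different buffers
get the same pointer back, which is what makes the interned_sigptr
comparison valid.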

Dag

>
> How many simple libraries are there really? Cython on one end, the
> magnificently complicated NumPy ufuncs on the other? Thinking big,
> perhaps PyPy and Julia? Cython, PyPy, Julia would all have to deal with
> long signatures anyway. And NumPy ufuncs are already complicated so even
> more low-level stuff wouldn't hurt.
>
>> It's unclear how to rendezvous on a common interning interface without
>> the GIL/Python, so perhaps requiring the GIL to use it is not too onerous.
>> An alternative is to acquire the GIL in the first/reference
>> implementation (which could allow the interning function pointers to
>> be cached by an external GIL-oblivious JIT for example). Presumably
>> some other locking mechanism would be required if the GIL is not used,
>> so the overhead would likely not be that great.
>
> Yes. I guess a goal could be to make sure there's no ABI breakage
> if/when the GIL requirement is lifted.
>
> Since modules can already have a reference to the interner by the time
> the first module interfacing with a GIL-less world is imported, this is
> non-trivial, but "every problem can be solved with another level of
> indirection", and particularly this one.
>
> Good idea on separating out interning as a separate spec; that's
> definitely useful for interfaces etc. as well down the line. I can get
> to work on a string interning spec and implementation as SEP 202 in
> spare minutes over the next month or so, but I won't bother unless SEP
> 201 will use interning. My role in that depends on Travis' timeline as
> well, as my ETA is so unpredictable.
>
> Dag


