[Cython] [Python-Dev] C-level duck typing

Thu May 31 16:20:18 CEST 2012

On 05/28/2012 05:59 PM, Dag Sverre Seljebotn wrote:
>
>
> Dag Sverre Seljebotn<d.s.seljebotn at astro.uio.no>  wrote:
>
>> On 05/28/2012 01:24 PM, Nathaniel Smith wrote:
>>> On Mon, May 28, 2012 at 12:09 PM, mark florisson
>>> <markflorisson88 at gmail.com>   wrote:
>>>> On 28 May 2012 12:01, Nathaniel Smith<njs at pobox.com>   wrote:
>>>>> On Mon, May 28, 2012 at 11:55 AM, mark florisson
>>>>> <markflorisson88 at gmail.com>   wrote:
>>>>>> On 28 May 2012 11:41, Nathaniel Smith<njs at pobox.com>   wrote:
>>>>>>> On Mon, May 28, 2012 at 10:13 AM, mark florisson
>>>>>>> <markflorisson88 at gmail.com>   wrote:
>>>>>>>> On 28 May 2012 09:54, mark florisson<markflorisson88 at gmail.com> wrote:
>>>>>>>>> On 27 May 2012 23:12, Nathaniel Smith<njs at pobox.com>   wrote:
>>>>>>>>>> On Sun, May 27, 2012 at 10:24 PM, Dag Sverre Seljebotn
>>>>>>>>>> <d.s.seljebotn at astro.uio.no>   wrote:
>>>>>>>>>>> On 05/18/2012 10:30 AM, Dag Sverre Seljebotn wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> On 05/18/2012 12:57 AM, Nick Coghlan wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> I think the main things we'd be looking for would be:
>>>>>>>>>>>>> - a clear explanation of why a new metaclass is considered too complex a
>>>>>>>>>>>>> solution
>>>>>>>>>>>>> - what the implications are for classes that have nothing to do with the
>>>>>>>>>>>>> SciPy/NumPy ecosystem
>>>>>>>>>>>>> - how subclassing would behave (both at the class and metaclass level)
>>>>>>>>>>>>>
>>>>>>>>>>>>> Yes, defining a new metaclass for fast signature exchange has its
>>>>>>>>>>>>> challenges - but it means that *our* concerns about maintaining
>>>>>>>>>>>>> consistent behaviour in the default object model and avoiding adverse
>>>>>>>>>>>>> effects on code that doesn't need the new behaviour are addressed
>>>>>>>>>>>>> automatically.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Also, I'd consider a functioning reference implementation using a custom
>>>>>>>>>>>>> metaclass a requirement before we considered modifying type anyway, so I
>>>>>>>>>>>>> think that's the best thing to pursue next rather than a PEP. It also
>>>>>>>>>>>>> has the virtue of letting you choose which Python versions to target and
>>>>>>>>>>>>> iterating at a faster rate than CPython.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> This seems right on target. I could make a utility code C header for
>>>>>>>>>>>> such a metaclass, and then the different libraries can all include it
>>>>>>>>>>>> and handshake on which implementation becomes the real one through
>>>>>>>>>>>> sys.modules during module initialization. That way an eventual PEP will
>>>>>>>>>>>> only be a natural incremental step to make things more polished, whether
>>>>>>>>>>>> that happens by making such a metaclass part of the standard library or
>>>>>>>>>>>> by extending PyTypeObject.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> So I finally got around to implementing this:
>>>>>>>>>>>
>>>>>>>>>>> https://github.com/dagss/pyextensibletype
>>>>>>>>>>>
>>>>>>>>>>> Documentation now in a draft in the NumFOCUS SEP repo, which I believe is a
>>>>>>>>>>> better place to store cross-project standards like this. (The NumPy
>>>>>>>>>>> docstring standard will be SEP 100).
>>>>>>>>>>>
>>>>>>>>>>> https://github.com/numfocus/sep/blob/master/sep200.rst
>>>>>>>>>>>
>>>>>>>>>>> Summary:
>>>>>>>>>>>
>>>>>>>>>>>    - No common runtime dependency
>>>>>>>>>>>
>>>>>>>>>>>    - 1 ns overhead per lookup (that's for the custom slot *alone*, no
>>>>>>>>>>> fast-callable signature matching or similar)
>>>>>>>>>>>
>>>>>>>>>>>    - Slight annoyance: Types that want to use the metaclass must be a
>>>>>>>>>>> PyHeapExtensibleType, to make the binary layout work with how CPython makes
>>>>>>>>>>> subclasses from Python scripts
>>>>>>>>>>>
>>>>>>>>>>> My conclusion: I think the metaclass approach should work really well.
>>>>>>>>>>
>>>>>>>>>> Few quick comments on skimming the code:
>>>>>>>>>>
>>>>>>>>>> The complicated nested #ifdef for __builtin_expect could be simplified to
>>>>>>>>>>    #if defined(__GNUC__) && (__GNUC__ > 2 || __GNUC_MINOR__ > 95)
>>>>>>>>>>
>>>>>>>>>> PyCustomSlots_Check should be called PyCustomSlots_CheckExact, surely?
>>>>>>>>>> And given that, how can this code work if someone does subclass this
>>>>>>>>>> metaclass?
>>>>>>>>>
>>>>>>>>> I think we should provide a wrapper for PyType_Ready, which just
>>>>>>>>> copies the pointer to the table and the count directly into the
>>>>>>>>> subclass. If a user then wishes to add stuff, the user can allocate a
>>>>>>>>> new memory region dynamically, memcpy the base class' stuff in there,
>>>>>>>>> and append some entries.
>>>>>>>>
>>>>>>>> Maybe we should also allow each custom type to set a deallocator,
>>>>>>>> since they are then heap types which can go out of scope. The
>>>>>>>> metaclass can then call this deallocator to deallocate the table.
>>>>>>>
>>>>>>> Custom types are plain old Python objects, they can use tp_dealloc.
>>>>>>>
>>>>>> If I set etp_custom_slots to something allocated on the heap, then the
>>>>>> (shared) metaclass would have to deallocate it. The tp_dealloc of the
>>>>>> type itself would be called for its instances (which can be used to
>>>>>> deallocate dynamically allocated memory in the objects if you use a
>>>>>> custom slot "pointer offset").
>>>>>
>>>>> Oh, I see. Right, the natural way to handle this would be to have each
>>>>> user define their own metaclass with the behavior they want. Another
>>>>> argument for supporting multiple metaclasses simultaneously I guess...
>>>>>
>>>>> - N
>>>>
>>>> That bludgeons your constant time type check.
>>>
>>> Not if you steal a flag, like the interpreter already does with
>>> Py_TPFLAGS_INT_SUBCLASS, Py_TPFLAGS_STRING_SUBCLASS, etc. I was
>>> referring to that argument I made earlier :-)
>>>
>>>> It's easier to just
>>>> reserve an extra slot for a deallocator pointer :) It would probably
>>>> be set to NULL in the common case anyway, since you allocate your
>>>> slots statically.
>>
>> Subclassing: Note that even if all types have to have a PyHeapTypeObject
>> structure, they are still statically allocated! So for statically
>> created subclasses (which should be the majority of the cases), there's
>> not going to be any deallocator...
>>
>> I agree that there should be a PyExtensibleType_Ready. To keep
>> allocating statically I propose that the subclass should leave some room
>> open for slots from the superclass:
>>
>> PyCustomSlot subclass_custom_slots[10] = {
>>    {SLOT_C, foo}, {SLOT_D, BAR}, {0,0}, ...
>> };
>>
>> Then, fill in etp_count=2, etp_custom_slots=subclass_custom_slots, and
>> then call PyExtensibleType_Ready(&Subclass_Type, 10); i.e., the number
>> of total elements in etp_custom_slots is passed in.
>>
>> One should always leave more room than one thinks one needs if the
>> superclass is from another library...
>>
>> Then, inheritance happens according to the following rules:
>>
>>   - Slots are inherited from superclass
>>   - Slots in subclass with the same ID overwrite those from the superclass
>>   - Slots from superclass are put before slots from subclass
>>   - Exception raised if the number of final slots is larger than the
>> limit passed in to PyExtensibleType_Ready.
>>
>> (Whenever this is not sufficient, you can always manually munge the
>> table after PyExtensibleType_Ready.)
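
The inheritance rules above can be sketched in plain C. This is only an
illustration, not the pyextensibletype code: sketch_slot stands in for
PyCustomSlot, sketch_inherit_slots is a hypothetical helper, and I am
assuming that an overridden slot stays in the subclass part of the table
rather than taking the superclass position.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for PyCustomSlot: an ID plus an opaque payload. */
typedef struct {
    uintptr_t id;
    void *data;
} sketch_slot;

/* Return 1 if any of the first n entries of slots carries the given ID. */
static int sketch_has_id(const sketch_slot *slots, size_t n, uintptr_t id)
{
    size_t i;
    for (i = 0; i < n; i++)
        if (slots[i].id == id)
            return 1;
    return 0;
}

/* Merge the superclass table into the subclass table in place.
 * sub holds nsub entries and has room for capacity entries in total.
 * Returns the new entry count, or -1 if the merged table would not fit
 * (in which case sub is left untouched). */
static int sketch_inherit_slots(const sketch_slot *base, size_t nbase,
                                sketch_slot *sub, size_t nsub, size_t capacity)
{
    size_t inherited = 0, i, k;

    assert(nsub <= capacity);

    /* Count superclass slots that the subclass does not override. */
    for (i = 0; i < nbase; i++)
        if (!sketch_has_id(sub, nsub, base[i].id))
            inherited++;

    if (inherited + nsub > capacity)
        return -1;

    /* Shift the subclass slots up, starting from the last one. */
    for (i = nsub; i > 0; i--)
        sub[i - 1 + inherited] = sub[i - 1];

    /* Put the non-overridden superclass slots in front, in order. */
    k = 0;
    for (i = 0; i < nbase; i++)
        if (!sketch_has_id(sub + inherited, nsub, base[i].id))
            sub[k++] = base[i];

    return (int)(inherited + nsub);
}
```

The -1 return corresponds to the "exception raised" rule; a real
PyExtensibleType_Ready would set a Python exception at that point.
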
>>
>> Question: How to deal with possible flag bits in the ID?
>>
>> Three approaches:
>>
>> a) Forget about the flags-in-ID idea; if you want flags, stick them in
>> the data
>>
>>   b) Embed a separate variable for flags in every PyCustomSlot
>>
>>   c) Standardize on a *hard* requirement on the bottom 8 bits being
>> flags while the top 24 bits indicate incompatible slots; so for the
>> purposes of inheritance, 0x12345601 would overwrite 0x12345600.
>>
>> To me, b) is OK, but the 32 bit ID space is already so ridiculously huge
>> that c) is a "why not"? -1 on a), it'd be rather tedious if the payload
>>
>
>
> I guess the sane thing to do is make the custom slot (id, flags, data), and
> have id and flags be 32 bits on all platforms. Otherwise 32 bits are wasted
> on padding on 64 bit platforms anyway.
>
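
To make the padding argument concrete, here is a small sketch comparing the
two layouts. The struct and function names are made up for illustration and
are not the ones in pyextensibletype; the numbers in the comments assume a
typical ABI where pointers are aligned to their own size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Layout with an explicit 32-bit flags field next to the 32-bit ID. */
typedef struct {
    uint32_t id;
    uint32_t flags;
    void *data;
} sketch_slot_with_flags;

/* Layout with only the 32-bit ID in front of the pointer. */
typedef struct {
    uint32_t id;
    void *data;
} sketch_slot_no_flags;

/* Bytes of padding in the ID-only layout: typically 4 when pointers are
 * 8 bytes, and 0 when pointers are 4 bytes. */
static size_t sketch_padding_bytes(void)
{
    /* In the three-field layout the pointer follows id and flags directly. */
    assert(offsetof(sketch_slot_with_flags, data) == 2 * sizeof(uint32_t));
    return sizeof(sketch_slot_no_flags) - sizeof(uint32_t) - sizeof(void *);
}
```

On a typical 64 bit platform both structs come out at 16 bytes, so the flags
field would occupy space that is otherwise padding.
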

SEP updated (to what I hope is the final form):

https://groups.google.com/forum/?fromgroups#!topic/numfocus/-XWwLMVgXBQ
https://github.com/numfocus/sep/blob/master/sep200.rst
https://github.com/dagss/pyextensibletype

Changes:

  - Remove the flags concept; option a) above

  - Use the tp_flags bit. (I benchmarked walking the type hierarchy, and
it costs nothing if you don't take the branch, but I'm much happier for
clients not to have to rendezvous on the metaclass, in particular if
this is used in the NumPy API).

  - All manually allocated IDs have the least significant bit set, so 
that one can also use 2-byte aligned pointers as IDs (e.g., objects 
representing interfaces or interned strings can be used as slot IDs).
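
A rough sketch of why the least-significant-bit rule keeps the two kinds of
ID apart. The macro and function names here are hypothetical, not taken from
the SEP, and the pointer case assumes the usual platforms where converting a
2-byte aligned pointer to uintptr_t gives an even number.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical: build a manually allocated ID from a small number n.
 * The lowest bit is always 1. */
#define SKETCH_MANUAL_ID(n) ((((uintptr_t)(n)) << 1) | (uintptr_t)1)

/* A manually allocated ID is recognised by its lowest bit. */
static int sketch_id_is_manual(uintptr_t id)
{
    return (int)(id & 1);
}

/* Use the address of an object (an interface object, an interned string)
 * as an ID. The object must be at least 2-byte aligned, so the lowest bit
 * is 0 and the ID cannot collide with a manually allocated one. */
static uintptr_t sketch_id_from_pointer(const void *p)
{
    uintptr_t id = (uintptr_t)p;
    assert((id & 1) == 0);
    return id;
}
```
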

Dag