[capi-sig] Creating type object dynamically in run-time

Thu May 10 15:55:27 CEST 2012

On 10 May 2012 14:01, Stefan Behnel <python_capi at behnel.de> wrote:
> Mateusz Loskot, 10.05.2012 14:37:
>> On 10 May 2012 13:00, Stefan Behnel wrote:
>>> Mateusz Loskot, 10.05.2012 13:47:
>>>> On 10 May 2012 12:28, Stefan Behnel wrote:
>>>>> Mateusz Loskot, 10.05.2012 13:24:
>>>>>>
>>>>>> I'm writing a fairly complex structure of Python extensions directly using
>>>>>> the Python C API (version 3+ only). I define some type objects statically,
>>>>>> according to the canonical Noddy example presented in the docs.
>>>>>> Some type objects have to be defined/composed dynamically in run-time.
>>>>>
>>>>> Do you really need a type or is a Python class enough? The latter would be
>>>>> much easier to create.
>>>>
>>>> I'm not sure I understand.
>>>> Aren't the terms "class" and "type" names for the same concept since Python 2.2?
>>>
>>> With "type" I meant "extension type", i.e. the one you define using C structs.
>>
>> So, I probably misused "object type" as a shortcut referring
>> to the pair of struct + PyTypeObject instance:
>>
>> typedef struct {
>>     PyObject_HEAD
>> } noddy_NoddyObject;
>>
>> static PyTypeObject noddy_NoddyType = { ... };
>>
>>> The alternative would be a simple Python class as you would get with
>>>
>>>   class MyClass(object): pass
>>
>> This is clear to me. I simply was confused because AFAIK there is no
>> class while using Python C API. As mentioned before, I follow Eli Bendersky's:
>>
>> "
>> Note that when we create new types using the C API of CPython,
>> there’s no "class" mentioned – we create a new "type", not a new "class".
>> "
>
> I think the emphasis is on "when we create new types". The sentence above
> sounds like a tautology to me (we obviously create a type and not a class
> when we create a type).

Thanks for the clarification.

>>>> In my system, I'm embedding Python and I add a custom Python extension
>>>> too (let's call it 'emb').
>>>> So, users of my embedded Python have access to the 'emb' module.
>>>> The 'emb' module defines a number of types; some are defined statically,
>>>> as the noddy_NoddyType, so users can instantiate them:
>>>>
>>>> n = emb.Noddy()
>>>> n.bar() # method defined statically in methods table of noddy_NoddyType
>>>>
>>>> Now, I'd like to add some types which are generated in run-time, way
>>>> before the 'emb' module is appended to inittab and embedded Python is
>>>> initialised.
>>>
>>> Why would you want to do that before initialising the Python runtime?
>>
>> The 'emb' must be added to inittab before Python is initialised (that's what the
>> manual says, isn't it?). So, the scheme is this:
>>
>> /* 1. Dynamically generate emb.GeneratedNoddy */
>> ... /* Trying to figure out how */
>>
>> /* 2. Register 'emb' module as built-in */
>> PyImport_AppendInittab("emb", &PyInit_emb);
>>
>> /* 3. Initialise Python */
>> Py_Initialize();
>>
>> /* 4. we're ready to use */
>> ...
>>
>> /* 5. Clean-up */
>> Py_Finalize();
>>
>>
>> Does it make sense?
>
> No. You don't need to declare all types beforehand. You can add a type (or
> function, or name, or whatever) to the module dict at any time, just as you
> can in Python. Whether that's a C implemented type or a normal Python class
> (or something else) doesn't matter at that point.
>
> The inittab mechanism is just there for convenience when everything *is*
> statically defined at compile time (or at least module init time).

Yes, I'd realised that my assumption was incorrect, so I followed up with
a response to my own post.
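
The point about extending a module's namespace after it has been created
can be sketched in plain Python. This is only an illustration under stated
assumptions: the module object is built with types.ModuleType as a stand-in
for the embedded 'emb' module, and the class bodies are placeholders. From C,
the attaching step would correspond to PyModule_AddObject or
PyObject_SetAttrString on the module object.

```python
import types

# Stand-in for the built-in 'emb' module (hypothetical; in the embedding
# scenario it would come from the inittab and "import emb").
emb = types.ModuleType("emb")


class Noddy(object):
    """Placeholder for the statically defined type."""

    def bar(self):
        return "Noddy.bar"


# Available from the start, like a type registered at module init time.
emb.Noddy = Noddy


class GeneratedNoddy(object):
    """Placeholder for a type composed later, at run-time."""

    def bar(self):
        return "GeneratedNoddy.bar"


# Attached to the already existing module at an arbitrary later point.
emb.GeneratedNoddy = GeneratedNoddy

n = emb.Noddy()
d = emb.GeneratedNoddy()
print(n.bar(), d.bar())
```

Nothing in the module has to be declared up front; the later assignment is
indistinguishable, for users of 'emb', from the earlier one.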

>>>> And, I'd like to enable users to instantiate them in the same way as
>>>> Noddy above:
>>>>
>>>> d = emb.GeneratedNoddy()
>>>>
>>>> or allow users to use and access
>>>>
>>>> d  = emb.foo()
>>>> d.bar() # added dynamically in run-time during emb.GeneratedNoddy composition
>>>>
>>>> where:
>>>>
>>>> type(d)
>>>> <class 'emb.GeneratedNoddy'>
>>>>
>>>> I hope it makes my intentions clear.
>>>
>>> Not clear enough. The question is what your dynamically created types
>>> should do and provide. Would they need to be implemented in C (not just
>>> their methods but the types themselves!) or would a Python implementation
>>> suffice, potentially with methods implemented in C?
>>
>> OK, I think I see where the gap in my explanation is.
>>
>> 0. The generated objects will define wrappers for existing C/C++ API.
>>
>> 1. I have defined a contained object which needs extra steps to
>> initialise within C/C++.
>> Namely, I need to initialise members like pointer_to_some_wrapped_api_element;
>> it may be some arbitrary data or a PyCapsule, etc.
>>
>> /* statically */
>> typedef struct {
>>     PyObject_HEAD
>>     /* members initialised */
>>     PyObject* pointer_to_some_wrapped_api_element;
>> } GeneratedNoddyObject;
>>
>> 2. The corresponding GeneratedNoddyType is generated dynamically,
>> with an added set of methods, etc.
>>
>> 3. The GeneratedNoddyType is exposed as emb.GeneratedNoddy.
>>
>> AFAIU, generating emb.GeneratedNoddy dynamically in C makes it easier to
>> compose the 'emb' module.
>
> Not at all. It's much easier to use a Python class than to do everything in
> lengthy C-API declarations.
>
>
>> Alternatively, I can generate Python classes (script in text form),
>> but there is one inconvenience.
>> The only way I know to expose such classes is to compile and import them
>> as a *separate* module using PyImport_ExecCodeModule (e.g. with name
>> '_privemb'). So, the resulting structure would be:
>>
>> emb
>> emb.Noddy
>> emb._privemb
>> emb._privemb.GeneratedNoddy
>
> Wrong again. You can execute any Python code from your C code. Look for the
> PyRun_*() functions.

Do you mean something similar to this approach?

/* dynamically generated lengthy class definition */
const char* c = "class A(object): pass";

PyObject* class_a = PyRun_StringFlags(c, ...);
PyObject_SetAttrString(module, "A", class_a);
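
A rough Python-level sketch of that idea, to make the data flow explicit.
One caveat worth stating: when a class statement is executed as a block of
statements (Py_file_input in the C API), the call itself yields None, and
the class object has to be looked up by name in the namespace dict that was
passed in. The names emb, source and namespace below are assumptions made
for the illustration, with types.ModuleType standing in for the real module.

```python
import types

# Stand-in for the 'emb' module object (hypothetical).
emb = types.ModuleType("emb")

# Dynamically generated class definition in textual form.
source = (
    "class A(object):\n"
    "    def bar(self):\n"
    "        return 'bar'\n"
)

# Setting __name__ makes the new class report 'emb' as its module.
namespace = {"__name__": "emb"}

# exec() returns None; the class is bound under its name in 'namespace'.
result = exec(source, namespace)

class_a = namespace["A"]
setattr(emb, "A", class_a)

a = emb.A()
print(type(a))  # <class 'emb.A'>
print(a.bar())  # bar
```

In C, the analogous steps would presumably be PyRun_String with
Py_file_input and a globals dict, then PyDict_GetItemString on that dict,
then PyObject_SetAttrString on the module.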

> Note that this is even easier in Cython, where you write Python code anyway
> (instead of C code).

Yes, I looked at the Cython docs.
Unfortunately, I can't use it. I have to stick to Python 3.2 dist.

>> Given that complication of loading classes from textual form through
>> an intermediate module,
>> I thought using the Python C API to generate extension types is better.
>> (I don't mind dealing with C code verbosity and complexity.)
>
> You should. The simpler the code, the easier it is to maintain it. Why else
> would you want to embed Python instead of using C for everything?

You are right of course.
It's clear to me too, but there seem to be many ways to achieve same/similar
results, so I'm trying to survey which one is most recommended in my case.

>> However, I'd really like to learn the canonical means of generating
>> extension types dynamically using the plain Python C API.
>> Long story short, I assume I'm looking for the Python C API equivalent of
>> using the type() function.
>
> Then call type().

It's possible, certainly.
But is there any plain C equivalent, perhaps using PyType_Type
and the Descriptor Objects?
(Why are the purpose and usage of the Descriptor Objects not documented?)
I'm seeking confirmation whether such an approach would be close to the
type() function.
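
For reference, this is the three-argument type() call in question, written
in Python. It is a minimal sketch: the attribute names payload and bar and
the "emb" module name are placeholders chosen for the illustration. Since
the built-in type is the same object as PyType_Type, the call can in
principle be issued from C by calling that object, for example with
PyObject_CallFunction((PyObject *)&PyType_Type, "s(O)O", name, base, dict);
that form is an assumption of this sketch, not something confirmed in the
thread.

```python
def bar(self):
    """Plain function that becomes a method once placed in the class dict."""
    return self.payload


# Class namespace: methods, default attributes and the reported module name.
attrs = {
    "bar": bar,
    "payload": None,
    "__module__": "emb",
}

# type(name, bases, dict) builds a new class object at run-time.
GeneratedNoddy = type("GeneratedNoddy", (object,), attrs)

d = GeneratedNoddy()
d.payload = 42

print(type(d))  # <class 'emb.GeneratedNoddy'>
print(d.bar())  # 42
```

The resulting object is an ordinary heap-allocated class, so instances get
a __dict__ and can carry extra per-instance state such as a capsule.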

To summarise:

1)
It seems that generating a script with Python classes in textual form
and then running it through the Python C API (PyRun_StringFlags, etc.)
is the recommended approach.

2)
There is an approach based on a direct call to the type() function
using the Python C API.

3)
Is a third option possible, one which replaces the type() call of 2)
with a chain of plain Python C API calls?
I know there are pros/cons related to amount of code, maintenance hassle,
code complexity, and such, but that is not really related to my question.

Best regards,
-- 
Mateusz Loskot, http://mateusz.loskot.net