[Import-SIG] PEP 489: Redesigning extension module loading
Petr Viktorin
encukou at gmail.com
Wed Mar 25 14:36:42 CET 2015
On 03/25/2015 01:11 PM, Nick Coghlan wrote:
> On 25 March 2015 at 02:34, Petr Viktorin <encukou at gmail.com> wrote:
>> I'll share my notes on an API with PEP 384-style slots, before attempting to
>> write it out in PEP language.
>>
>> I struggled to find a good name for the "PyType_Spec" equivalent, since
>> ModuleDef and ModuleSpec are both taken, but then I realized that, if the
>> docstring is put in a slot, I just need an array of slots...
>
> Because we're looking for an exported symbol, I think there's value in
> having a more clearly defined top level structure rather than just an
> array.
OK.
I'm not sure how well exporting data (rather than functions) from shared
libraries is supported across platforms, so I kept the hook as a function.
Perhaps I'm being too paranoid here?
> PyModule_Export or PyModule_Declare come to mind, with a preference
> for the former (since we're exporting a module definition for CPython
> to import)
That's the name I was looking for, thanks!
> typedef struct PyModule_Export {
> const char* doc;
> PyModule_Slot *slots; /* terminated by slot==0. */
> } PyModule_Export;
>
> I prefer this mostly because it's easier to document and hence to
> understand - you can cover the process of creating the overall module
> in relation to PyModule_Export, while PyModule_Slot docs can focus on
> defining the *content* of the module.
I don't think this is a problem. I can document module creation with the
PyModuleExport_<modulename> symbol, and then say that it's an array
of PyModule_Slot in the appropriate section.
> Having the docstring as the only expected field helps suggest that
> modules should at least define that much. Unlike types, we can leave
> the name out by default, as it will usually be implied by the file
> name (as is the case with Python modules).
The downside is that it's additional boilerplate. PyType_Spec has a
bunch of mandatory int fields, but here everything is a pointer.
Also, does the docstring always need to be specified (as a constant)? I
think some internal modules are fine without a docstring (see _hashlib,
_multiprocessing, _elementtree, _sqlite3, ...).
But if you're convinced a separate PyModule_Export structure is better,
I won't fight.
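To make the comparison concrete, here is a rough, self-contained sketch of the two shapes: a bare slot array where the docstring is just another slot, and a wrapper struct with a mandatory doc field. It deliberately avoids Python.h. Every sketch_* name and the slot numbers are placeholders I made up for illustration, not a proposed API.

```c
#include <stddef.h>

/* Placeholder slot IDs; the real names and numbers are still under discussion. */
enum { SKETCH_MOD_END = 0, SKETCH_MOD_DOC = 1, SKETCH_MOD_EXEC = 5 };

typedef struct {
    int slot;       /* SKETCH_MOD_END (0) terminates the array */
    void *value;
} sketch_slot;

/* Wrapper-struct shape: doc is a fixed field, everything else is a slot. */
typedef struct {
    const char *doc;
    sketch_slot *slots;
} sketch_export;

/* Return the value of the first slot with the given ID, or NULL if absent. */
static void *sketch_find_slot(const sketch_slot *slots, int id)
{
    for (; slots != NULL && slots->slot != SKETCH_MOD_END; slots++) {
        if (slots->slot == id)
            return slots->value;
    }
    return NULL;
}

/* Bare-array shape: the docstring is optional and looked up like any slot. */
static const char *sketch_doc_from_array(const sketch_slot *slots)
{
    return (const char *)sketch_find_slot(slots, SKETCH_MOD_DOC);
}

/* Wrapper-struct shape: the field always exists, but may still be NULL. */
static const char *sketch_doc_from_export(const sketch_export *exp)
{
    return exp->doc;
}

/* A module that provides a docstring through a slot. */
static sketch_slot sketch_example_slots[] = {
    {SKETCH_MOD_DOC, (void *)"An example module."},
    {SKETCH_MOD_END, NULL},
};

/* An internal-style module with no docstring: the wrapper needs an explicit NULL. */
static sketch_slot sketch_no_doc_slots[] = {
    {SKETCH_MOD_END, NULL},
};
static sketch_export sketch_example_export = {NULL, sketch_no_doc_slots};
```

With the bare array, a module without a docstring simply omits the slot. With the wrapper, it has to spell out the NULL, which is the extra boilerplate I mean.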
> You've sold me on the idea of using a slots based API, though.
> However, the PEP's going to need to spend a bit more time on how to
> map this to the existing PyModule_Create API for modules that also
> want to support older versions of Python, while using the new system
> on 3.5+.
Agreed.
>> Does the following look reasonable?
>>
>> in moduleobject.h:
>>
>> typedef struct PyModule_Slot{
>> int slot;
>> void *pfunc;
>> } PyModuleDesc_Slot;
>
> "pfunc" doesn't fit in this case, so I think a more generic field name
> like "value" would be needed.
>
>> typedef struct PyModule_StateDef {
>> int size;
>> traverseproc m_traverse;
>> inquiry m_clear;
>> freefunc m_free;
>> } PyModule_StateDef;
>>
>> #define Py_m_doc 1
>> #define Py_m_create 2
>> #define Py_m_methods 3
>> #define Py_m_statedef 4
>> #define Py_m_exec 5
>
> Py_mod_*, perhaps?
Sure.
> I'm also wondering if "exec" should move to be an "m_init" method in
> PyModule_StateDef, rather than an independent slot, replacing it with
> a PyType_Spec "types" slot as suggested below.
No. Sometimes the exec doesn't need C state. It can work with just the
module dict, for example to export some methods conditionally, or export
objects that aren't methods/classes/whatever there's a special slot for.
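A toy illustration of what I mean, again without Python.h: the "module" below is only a list of attribute names, its state pointer is never touched, and the exec hook decides at load time what to export. All names (sketch_module, sketch_exec, sketch_have_fast_path) are hypothetical stand-ins.

```c
#include <stddef.h>

#define SKETCH_MAX_ATTRS 4

/* Toy stand-in for a module: a namespace plus an (unused) C state pointer. */
typedef struct {
    const char *names[SKETCH_MAX_ATTRS];
    size_t count;
    void *state;    /* stays NULL: this module declares no C state */
} sketch_module;

/* Add a name to the namespace; returns 0 on success, -1 if it is full. */
static int sketch_set_attr(sketch_module *m, const char *name)
{
    if (m->count >= SKETCH_MAX_ATTRS)
        return -1;
    m->names[m->count++] = name;
    return 0;
}

/* Hypothetical result of a runtime feature probe. */
static int sketch_have_fast_path = 0;

/* Exec hook: works only with the namespace and exports one name conditionally. */
static int sketch_exec(sketch_module *m)
{
    if (sketch_set_attr(m, "always_there") < 0)
        return -1;
    if (sketch_have_fast_path && sketch_set_attr(m, "fast_path") < 0)
        return -1;
    return 0;
}
```

Nothing here would fit naturally as an init method of a state definition, since there is no state to initialize.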
[...]
>> I've thought about supporting multiple modules per extension, but I don't
>> see a clear way to do that. The standard ModuleSpec machinery assumes one
>> module per file, and it's not straightforward to get around that. To load
>> more modules from an extension, you'd need a custom finder or loader anyway.
>> So I'm going to implement helpers needed to load a module given an arbitrary
>> PyModuleDesc, and leave implementing multi-mod support to people who need it
>> for now.
>> So, an "inittab" is out for now.
>
> Symlinks should work for making the same binary file importable under
> different names in simple cases, and more complex cases are likely to
> need a custom finder and loader anyway.
>
>> Perhaps a slot for automatically adding classes (from array of PyType_Spec)
>> would help PyType_Spec adoption.
>
> Perhaps this one would be worth including in the initial proposal to
> help make it clear why we decided the slots based design was
> worthwhile?
>
>> And then a slot adding string/int/... constants from arrays of name/value
>> would mean most modules wouldn't need an exec function.
>
> For those cases, I think the module internally is likely to want fast
> C level access to the relevant constants - this note is the one that
> inspired my suggestion of moving the "exec" link into the statedef
> slot.
This is for wrapping constants that are already known at the C level.
For example _ssl has a long list of these calls:
    PyModule_AddIntConstant(m, "SSL_ERROR_ZERO_RETURN",
                            PY_SSL_ERROR_ZERO_RETURN);
    PyModule_AddIntConstant(m, "SSL_ERROR_WANT_READ",
                            PY_SSL_ERROR_WANT_READ);
    PyModule_AddIntConstant(m, "SSL_ERROR_WANT_WRITE",
                            PY_SSL_ERROR_WANT_WRITE);
    PyModule_AddIntConstant(m, "SSL_ERROR_WANT_X509_LOOKUP",
                            PY_SSL_ERROR_WANT_X509_LOOKUP);
    PyModule_AddIntConstant(m, "SSL_ERROR_SYSCALL",
                            PY_SSL_ERROR_SYSCALL);
    PyModule_AddIntConstant(m, "SSL_ERROR_SSL",
                            PY_SSL_ERROR_SSL);
    PyModule_AddIntConstant(m, "SSL_ERROR_WANT_CONNECT",
                            PY_SSL_ERROR_WANT_CONNECT);
... and so on. Many modules don't have proper error checking for this.
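A constants slot could replace such a list with a NULL-terminated table that the loader walks, checking every return value in one place. The sketch below shows only the table-driven idea in plain C. The sketch_* types are made up, sketch_add_int_constant merely mimics the 0 / -1 return convention of PyModule_AddIntConstant, and the numeric values are placeholders rather than the real PY_SSL_ERROR_* values.

```c
#include <stddef.h>

#define SKETCH_MAX_ATTRS 8

/* Toy stand-in for a module namespace holding integer constants. */
typedef struct {
    const char *names[SKETCH_MAX_ATTRS];
    long values[SKETCH_MAX_ATTRS];
    size_t count;
} sketch_module;

/* Same convention as PyModule_AddIntConstant: 0 on success, -1 on error. */
static int sketch_add_int_constant(sketch_module *m, const char *name, long value)
{
    if (m->count >= SKETCH_MAX_ATTRS)
        return -1;
    m->names[m->count] = name;
    m->values[m->count] = value;
    m->count++;
    return 0;
}

/* One table entry; an entry with name == NULL terminates the table. */
typedef struct {
    const char *name;
    long value;
} sketch_int_constant;

/* Add every constant in the table, stopping at the first failure. */
static int sketch_add_int_constants(sketch_module *m,
                                    const sketch_int_constant *table)
{
    for (; table->name != NULL; table++) {
        if (sketch_add_int_constant(m, table->name, table->value) < 0)
            return -1;
    }
    return 0;
}

/* What a module author would write instead of the repeated calls. */
static const sketch_int_constant sketch_ssl_errors[] = {
    {"SSL_ERROR_ZERO_RETURN", 1},   /* placeholder values */
    {"SSL_ERROR_WANT_READ", 2},
    {"SSL_ERROR_WANT_WRITE", 3},
    {NULL, 0},
};
```

The error check lives once in the loop, so individual modules can't forget it.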
>> And an "inittab" slot should be possible for package-style extensions.
>> I'll leave these ideas out for now, but possibilities for extending are
>> there.
>
> If I recall correctly, there's actually a longstanding RFE somewhere
> for builtin packages that this change may eventually be able to help
> with. It was something embedding the full Qt libraries I think.
There are probably more use cases, but let's stick to the basics for now.