[Import-SIG] Running C extension modules using -m switch

Mon May 22 05:33:50 EDT 2017

On 05/20/2017 06:36 AM, Nick Coghlan wrote:
> On 19 May 2017 at 21:43, Petr Viktorin <encukou at gmail.com> wrote:
>> On 05/19/2017 12:24 PM, Nick Coghlan wrote:
>>>
>>> On 18 May 2017 at 22:50,  <gmarcel.plch at gmail.com> wrote:
>>>>
>>>> Greetings,
>>>>
>>>> This has already been sent to python-ideas, but since I got no
>>>> response, I'm re-sending it to this SIG. I would welcome any
>>>> comments.
>>>
>>>
>> ...
>>>>
>>>>
>>>> This new method calls into the _imp module, which executes the module
>>>> as a script.
>>>> I can see two ways of doing this. Both expect that the module uses PEP
>>>> 489 multi-phase initialization.
>>>
>>>
>>> The main reason I didn't immediately reply is that I had a vague
>>> recollection of thinking this could be done *without* a new method on
>>> loaders, but I needed to refresh my memory of our plans in that
>>> regard.
>>>
>>> I've now done that, and I'm pretty sure the unwritten plan was to
>>> change runpy to do something like the following:
>>>
>>>       import importlib.util
>>>
>>>       spec = importlib.util.find_spec(modname)
>>>       created = spec.loader.create_module(spec)
>>>       if created is not None:
>>>           raise RuntimeError(
>>>               "Cannot use customised module instance as __main__")
>>>       # main_mod: the module object that is to act as __main__
>>>       spec.loader.exec_module(main_mod)
>>>
>>> That's oversimplified quite a bit, but it gives the general idea.
>>
>>
>> The problem here is that for extension modules,
>> `spec.loader.create_module()` returns None.
> 
> I'm guessing this was meant to be "doesn't return None". I thought I
> was forgetting something, and that would be it :)
> 
>> It can't: the PyModuleDef is
>> attached to the returned module, and that's where the Py_mod_exec function
>> is stored. This is unlike with source modules, where the code is always
>> looked up by module name.
>>
>> So I see these ways to make things work:
>> - Make spec.loader.create_module() return None if Py_mod_create is missing,
>> and either store the PyModuleDef on the loader (which doesn't really fit in
>> with how importlib works), or re-load it from the .so every time (which
>> seems wasteful and hacky).
>> - Make exec_module take two modules – the module in whose namespace to run,
>> and the module whose code should run. Or make it take a module and a spec of
>> a different module. This would be an API change, affecting all third-party
>> loaders, so it's out.
>> - Add a new loader method taking two modules (or module and spec) as above
>> - Add a new loader method to explicitly run as main
> 
> As a third variant on the last two options: add a new optional
> "exec_in_namespace" method - that could potentially be useful for
> generalising reload and lazy loading support, as well as making it
> easier for pdb, profile, etc, to support non-traditional modules.

A method that takes only a namespace won't be possible, since a
Py_mod_exec function expects a module object as its argument.
The main extra thing a module has in addition to its namespace dict is
the C-level module state (which I don't think is handled in the current
PoC – Marcel, can you add that?).
Given C module state, I think asking extension authors to always handle
reloading correctly is too much. But the other use cases should be
possible.
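
For illustration, here is a minimal sketch of the runpy-style approach discussed above, restricted to source modules, where it does work. It is not runpy's actual implementation: the helper name `run_source_module_as_main`, the `demo` function, and the throwaway module `hello_main_demo` are all hypothetical. It relies on source loaders returning None from `create_module()` and on their code being retrievable by name through `get_code()`. For an extension module the `create_module()` check would fail, which is the problem described in this thread.

```python
import importlib.util
import os
import sys
import tempfile
import types


def run_source_module_as_main(modname):
    """Run a source module's code in a fresh module named "__main__".

    Hypothetical helper for illustration; not how runpy is implemented.
    """
    spec = importlib.util.find_spec(modname)
    if spec is None:
        raise ImportError(f"No module named {modname!r}")
    # Source loaders return None here, meaning "use the default module
    # type". An extension module's loader returns a real module object,
    # which is the case this helper cannot handle.
    created = spec.loader.create_module(spec)
    if created is not None:
        raise RuntimeError("Cannot use customised module instance as __main__")
    main_mod = types.ModuleType("__main__")
    # For source modules the code is looked up by module name, so it can
    # be executed in whatever namespace we choose.
    code = spec.loader.get_code(spec.name)
    exec(code, main_mod.__dict__)
    return main_mod


def demo():
    """Write a tiny module to a temp dir and run it as __main__."""
    with tempfile.TemporaryDirectory() as tmp:
        with open(os.path.join(tmp, "hello_main_demo.py"), "w") as f:
            f.write("ran_as = __name__\n")
        sys.path.insert(0, tmp)
        try:
            importlib.invalidate_caches()
            return run_source_module_as_main("hello_main_demo").ran_as
        finally:
            sys.path.remove(tmp)
```

Calling `demo()` runs the temporary module with `__name__` set to `"__main__"`. The same trick does not carry over to extension modules, because the Py_mod_exec function is reachable only through the module object that `create_module()` returns.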