[New-bugs-announce] [issue40333] Request for multi-phase initialization API to run code after importlib init

Gregory Szorc report at bugs.python.org
Sun Apr 19 16:10:33 EDT 2020


New submission from Gregory Szorc <gregory.szorc at gmail.com>:

I'm porting PyOxidizer to the PEP 587 APIs. So far, it is mostly straightforward.

I was looking forward to adopting the multi-phase initialization API because I thought it would enable me to get rid of the very ugly hack PyOxidizer uses to inject its custom meta path importer. This hack is described in gory detail at https://docs.rs/pyembed/0.7.0/pyembed/technotes/index.html#our-importing-mechanism. The tl;dr is that we use a modified version of the `importlib._bootstrap_external` module to install a builtin extension module on sys.meta_path so it can handle all imports during initialization.
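
For readers unfamiliar with meta path importers, here is a minimal pure-Python sketch of the general idea. It is not PyOxidizer's implementation (which is a builtin extension module); the `InMemoryFinder` class, the `_SOURCES` table, and the `oxidized_demo` module name are all made up for illustration.

```python
import importlib.abc
import importlib.util
import sys

# Hypothetical in-memory module store, standing in for resources
# embedded in the application binary.
_SOURCES = {"oxidized_demo": "ANSWER = 42\n"}


class InMemoryFinder(importlib.abc.MetaPathFinder, importlib.abc.Loader):
    """Finder and loader that serves modules from the _SOURCES table."""

    def find_spec(self, fullname, path=None, target=None):
        # Claim only the names we know about; return None so that the
        # remaining sys.meta_path entries get a chance at everything else.
        if fullname in _SOURCES:
            return importlib.util.spec_from_loader(fullname, self)
        return None

    def create_module(self, spec):
        # None asks the import system to create a default module object.
        return None

    def exec_module(self, module):
        # Run the stored source in the new module's namespace.
        exec(_SOURCES[module.__name__], module.__dict__)


# Put the finder first so it is consulted before the default importers.
sys.meta_path.insert(0, InMemoryFinder())

import oxidized_demo
```

After this runs, `oxidized_demo.ANSWER` is available even though no file with that name exists on disk.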

The good news is the multi-phase initialization API gives us an injection point between "core" and "main." I _think_ I would be able to import the built-in extension here and register it on sys.meta_path.

However, the new multi-phase initialization API doesn't give us the total control that we need. Specifically, PyOxidizer's importer relies on a few functions in importlib._bootstrap_external, so that module needs to be available before the importer can be loaded and installed on sys.meta_path. The importer also wants total control over sys.meta_path, so we don't want importlib._bootstrap_external mucking with sys.meta_path, or any imports being performed, before PyOxidizer has a chance to readjust state.
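
To make "total control over sys.meta_path" concrete, here is a rough sketch of the kind of readjustment meant: keep the builtin and frozen importers, drop everything else (such as the path-based finder), and append a custom finder. The `plan_meta_path` helper is hypothetical and only computes the new list; it is an assumption about the shape of the operation, not PyOxidizer's actual code.

```python
import importlib.machinery

# The importers that serve builtin and frozen modules; these stay in place.
_KEEP = (
    importlib.machinery.BuiltinImporter,
    importlib.machinery.FrozenImporter,
)


def plan_meta_path(current, custom_finder):
    """Return a new meta path: builtin and frozen importers, then custom_finder.

    Hypothetical helper for illustration. It does not mutate `current`.
    """
    retained = [finder for finder in current if finder in _KEEP]
    return retained + [custom_finder]
```

An embedder would apply the result with something like `sys.meta_path[:] = plan_meta_path(sys.meta_path, finder)`, and would need to do so before any path-based import has been attempted.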

The critical feature that PyOxidizer needs is the ability to muck with sys.meta_path and importlib *after* importlib externals are initialized and *before* any non-builtin, non-frozen import is attempted. In the current state of the initialization code, we need to run custom code between init_importlib_external() and when the first non-builtin, non-frozen import is attempted (currently during _PyUnicode_InitEncodings()).
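
As a rough way to see which imports fall on the safe side of that line, the sketch below asks the builtin and frozen importers whether they can serve a given name. The `is_builtin_or_frozen` helper is made up for illustration; the set of builtin and frozen modules varies between Python versions and builds.

```python
import importlib.machinery


def is_builtin_or_frozen(name):
    """Return True if the builtin or frozen importer can provide `name`.

    Hypothetical helper for illustration. Any name for which this returns
    False has to be resolved by some other entry on sys.meta_path.
    """
    builtin = importlib.machinery.BuiltinImporter.find_spec(name)
    frozen = importlib.machinery.FrozenImporter.find_spec(name)
    return builtin is not None or frozen is not None
```

Any import for which this returns False is one that would have to wait until the custom importer is installed.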

Would it be possible to get a multi-phase initialization API that stops after init_importlib_external()?

If not, could we break up PyConfig._install_importlib into two pieces, so that just importlib._bootstrap_external can be disabled, and provide a supported mechanism to initialize the external mechanism between "core" and "main" initialization? (Although I'm not sure if this is possible, since "main" finishes initializing aspects of "sys" before init_importlib_external() and I'm not sure if it is safe to initialize importlib externals before this is done. I'm guessing there is a reason that code runs before importlib is fully initialized.)

I suppose I could change PyOxidizer's functionality a bit to work around the lack of an importlib._bootstrap_external module between "core" and "main" initialization. I'm pretty sure I could make this work. But my strong preference is to inject code after importlib external support is fully initialized but before any imports are performed with it.

Overall the PEP 587 APIs are terrific and a substantial improvement over what came before. Thank you for all your work on this feature, Victor!

----------
components: C API
messages: 366802
nosy: indygreg, vstinner
priority: normal
severity: normal
status: open
title: Request for multi-phase initialization API to run code after importlib init
versions: Python 3.8, Python 3.9

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue40333>
_______________________________________

