[Python-Dev] Allowing to run certain regression tests in subprocesses

Eli Bendersky eliben at gmail.com
Mon Aug 5 01:06:33 CEST 2013


On Sun, Aug 4, 2013 at 7:26 AM, Nick Coghlan <ncoghlan at gmail.com> wrote:
> On 4 August 2013 23:40, Eli Bendersky <eliben at gmail.com> wrote:
>> On Sat, Aug 3, 2013 at 7:08 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:
>>> Sure, it's just unusual to have a case where "importing is blocked by adding
>>> None to sys.modules" differs from "not actually available", so I'd like to
>>> understand the situation better.
>>
>> I must admit I'm confused by the behavior of import_fresh_module too.
>>
>> Snippet #1 raises the expected ImportError:
>>
>> sys.modules['pyexpat'] = None
>> import _elementtree
>>
>> However, snippet #2 succeeds importing:
>>
>> ET = import_fresh_module('_elementtree', blocked=['pyexpat'])
>> print(ET)
>
> /me goes and looks
>
> That function was much simpler when it was first created :P
>
> Still, I'm fairly confident the complexity of that dance isn't
> relevant to the problem you're seeing.
>
>> I believe this happens because import_fresh_module does an import of
>> the 'name' it's given before even looking at the blocked list. Then,
>> it assigns None to sys.modules for the blocked names and re-imports
>> the module. So in essence, this is somewhat equivalent to snippet #3:
>>
>> modname = '_elementtree'
>> __import__(modname)
>> del sys.modules[modname]
>> for m in sys.modules:
>>     if modname == m or m.startswith(modname + '.'):
>>         del sys.modules[m]
>> sys.modules['pyexpat'] = None
>> ET = importlib.import_module(modname)
>> print(ET)
>>
>> Which also succeeds.
>>
>> I fear I'm not familiar enough with the logic of importing to
>> understand what's going on, but it has been my impression that this
>> problem is occasionally encountered with import_fresh_module and C
>> code that imports stuff (the import of pyexpat is done by C code in
>> this case).
>
> I had missed it was a C module doing the import. Looking into the
> _elementtree.c source, the problem in this case is the fact that a
> shared library that doesn't use PEP 3121 style per-module state is
> only loaded and initialised once, so reimporting it gets the same
> module back (from the extension loading cache), even if the Python
> level reference has been removed from sys.modules. Non PEP 3121 C
> extension modules thus don't work properly with
> test.support.import_fresh_module (as there's an extra level of caching
> involved that *can't* be cleared from Python, because it would break
> things).
>
> To fix this, _elementree would need to move the pyexpat C API pointer
> to per-module state, rather than using a static variable (see
> http://docs.python.org/3/c-api/module.html#initializing-c-modules).
> With per-module state defined, the import machine should rerun the
> init function when the fresh import happens, thus creating a new copy
> of the module. However, this isn't an entirely trivial change for
> _elementree, since:
>
> 1. Getting from the XMLParser instance back to the module where it was
> defined in order to retrieve the capsule pointer via
> PyModule_GetState() isn't entirely trivial in C. You'd likely do it
> once in the init method, store the result in an XMLParser attribute,
> and then tweak the EXPAT() using functions to include an appropriate
> local variable definition at the start of the method implementation.
>
> 2. expat_set_error would need to be updated to accept the pyexpat
> capsule pointer as a function parameter
>

Thanks Nick; I suspected something of the sort is going on here, but
you provide some interesting leads to look at. I'll probably open an
issue to track this at some point.

Eli


More information about the Python-Dev mailing list