Doubly imported module caused devastating bug

Terry Reedy tjreedy at udel.edu
Thu Sep 24 17:26:03 EDT 2009


Zac Burns wrote:
> On Thu, Sep 24, 2009 at 1:38 PM, Carl Banks <pavlovevidence at gmail.com> wrote:
>> On Sep 24, 10:26 am, Zac Burns <zac... at gmail.com> wrote:
>>> Currently it is possible to import a file of one path to more than one
>>> 'instance' of a module. One notable example is "import __init__" from
>>> a package. See http://stackoverflow.com/questions/436497/python-import-the-containin...
>>>
>>> This recently caused a devastating bug in some of my code. What I have
>>> is support for the Perforce global options as a context for a perforce
>>> module. http://www.perforce.com/perforce/doc.072/manuals/cmdref/o.gopts.html#...
>>> This way one can call functions that call many perforce commands and
>>> have them execute on a different client for example.
>>>
>>> So, module A and module B both imported the Perforce module, but
>>> they turned out not to be the same module. Module A did "with
>>> Perforce.GlobalOptions(client=client): B.function()"
>>>
>>> B.function did not receive the new GlobalOptions because of this
>>> problem. As a result important files on the original client were
>>> overwritten (OUCH).
>>>
>>> I would like to propose that it be made impossible in the Python
>>> source to import two instances of the same module.
>> Impossible's a pretty strong word.
>>
>> It's a reasonable request, but with Python's importing the way it is
>> it'd be kind of hard to do.  A Python file can be visible in multiple
>> ways.
>>
>> However, anyone who does "import __init__" (or "from . import
>> __init__" with relative import) is asking for trouble. I can't think
>> of any valid reason to do it, and I wouldn't mind seeing it
>> forbidden, though it's simple to avoid.

/__init__.py is basically an implementation hack to make a directory 
also 'be' a file. Use at one's own risk, I say.


> There are corner cases. The corner case that I ran into was that there
> were two ways to find the module on PATH because one value of PATH was
> over another. Since then this problem has been removed and it wasn't
> too much trouble to work around - but finding the problem was a real
> pain.
> 
> I am not intimately familiar with the import code and trust your
> judgment that it is difficult. If people are in agreement that this
> should be changed though it could be put in a list somewhere waiting
> for some ambitious person to figure out the implementation, no?
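
For anyone who has not hit this, the corner case above can be reproduced 
roughly as follows. This is only a sketch: the names dupdemo_pkg and 
dupdemo_opts are made up for illustration and have nothing to do with the 
actual Perforce code. It puts both a package's parent directory and the 
package directory itself on sys.path, so one file is importable under two 
names.

```python
import importlib
import os
import sys
import tempfile


def demo_double_import():
    """Load one .py file as two distinct module objects via two sys.path entries."""
    root = tempfile.mkdtemp()
    pkg_dir = os.path.join(root, "dupdemo_pkg")  # hypothetical package name
    os.mkdir(pkg_dir)
    open(os.path.join(pkg_dir, "__init__.py"), "w").close()
    with open(os.path.join(pkg_dir, "dupdemo_opts.py"), "w") as f:
        f.write("options = {}\n")

    # Both the parent directory and the package directory are on the path,
    # so the same file is reachable under two different module names.
    sys.path[:0] = [root, pkg_dir]
    importlib.invalidate_caches()
    try:
        a = importlib.import_module("dupdemo_pkg.dupdemo_opts")
        b = importlib.import_module("dupdemo_opts")
    finally:
        sys.path.remove(root)
        sys.path.remove(pkg_dir)

    # State set through one name is invisible through the other.
    a.options["client"] = "other-client"
    return a, b
```

The two returned modules come from the same file but are separate 
objects, each with its own options dict, which is the same shape as the 
GlobalOptions problem described above.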

1. It would slow down all imports, at least a bit.

2. It would kill code that intentionally makes use of duplicate modules 
(but this could be considered exploitation of a bug, perhaps). It would 
also make forced module reloads harder, if not impossible. Currently, 
one just deletes the entry in sys.modules.

3. The language itself does not specify how and where from an 
implementation 'initializes' a module on first import. Indeed, CPython 
has at least three options (.py, .zip, and .dll or .pyd (Windows)), with 
hooks for more. Let's take the request as specifically preventing the 
creation of duplicate module objects from a particular .py file.
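
As an aside on point 2, the forced reload I mean looks roughly like this. 
The helper name force_reimport is mine, and the sketch assumes the module 
can still be found on the import path.

```python
import importlib
import sys


def force_reimport(name):
    """Drop the cached module object and import the module again from scratch."""
    # Deleting the sys.modules entry makes the next import re-run the file.
    old = sys.modules.pop(name, None)
    new = importlib.import_module(name)
    return old, new
```

Note that this deliberately creates a second module object from the same 
file, which is exactly what a strict "one file, one module" rule would 
have to forbid or special-case.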

One implementation *might* be to add a set to sys, say sys.mod_files, of 
the x.py or x.pyc files already used to initialize a module. The .py, 
.pyc, or .pyo suffix would be stripped, but the name should otherwise be 
the absolute path (including the drive letter on Windows).
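
A rough sketch of that bookkeeping, done after the fact rather than 
inside the import machinery: since sys.mod_files does not exist, the code 
below builds the equivalent mapping from sys.modules and reports files 
that back more than one module object. The function names are 
hypothetical, and this only detects duplicates, it does not prevent them.

```python
import os
import sys


def source_key(path):
    """Absolute path with the .py/.pyc/.pyo suffix stripped, or None for other files."""
    base, ext = os.path.splitext(os.path.abspath(path))
    if ext not in (".py", ".pyc", ".pyo"):
        return None
    return os.path.normcase(base)


def find_duplicate_modules(modules=None):
    """Return {source key: sorted module names} for files loaded as several modules."""
    modules = sys.modules if modules is None else modules
    seen = {}
    for name, mod in list(modules.items()):
        path = getattr(mod, "__file__", None)
        if not isinstance(path, str):
            continue  # built-in modules and odd sys.modules entries have no file
        key = source_key(path)
        if key is not None:
            # Key by object identity so that two names for one module object
            # (an alias, not a duplicate) are counted only once.
            seen.setdefault(key, {}).setdefault(id(mod), name)
    return {key: sorted(objs.values()) for key, objs in seen.items() if len(objs) > 1}
```

Calling find_duplicate_modules() with no argument checks the running 
interpreter; an empty dict means no file was found under two module 
objects by this path-based test.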

This would not cover the case when files are symlinked (or copied). For 
*nix, a set of inode numbers could be used, but not for Windows.  I 
suspect there might be other system-specific problems I have not thought of.
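
The inode idea could look something like the following. This is a sketch 
that assumes a platform where os.stat reports usable st_dev and st_ino 
values; the helper names are made up.

```python
import os


def file_identity(path):
    """Return (device, inode) for the file that path finally resolves to."""
    st = os.stat(path)  # os.stat follows symlinks
    return (st.st_dev, st.st_ino)


def same_source(path_a, path_b):
    """True if both paths name the same underlying file on disk."""
    return file_identity(path_a) == file_identity(path_b)
```

This sees through symlinks, but a plain copy of the file has its own 
inode and would still be treated as a different source.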

Terry Jan Reedy



