PyWart: "Python's import statement and the history of external dependencies"

Ian Kelly ian.g.kelly at gmail.com
Fri Nov 21 14:23:48 EST 2014


On Fri, Nov 21, 2014 at 11:24 AM, Rick Johnson
<rantingrickjohnson at gmail.com> wrote:
> Are you also going to call drivers "fools" because they bought
> a "certain brand" of car only to have the airbag explode in
> their face?

No, but I'll call them fools if they buy a car and the engine catches
fire because they never bothered to change the oil.

If you don't want to have module name collisions, then don't create
modules with names that are likely to collide when Python gives you an
excellent tool for avoiding collisions (namespaces). Don't go blaming
Python for "bad design" when you couldn't even be bothered to use the
tools made available to you.

>> Now you can drop as much stuff in there as you like, and
>> none of it will ever conflict with the standard library
>> (unless a standard "ricklib" module is added, which is
>> unlikely).
>
> Yes, and now we've solved one problem by replacing it with
> it's inverse -- try importing the *python lib* calendar
> module and all you will get is your local "intra-package"
> version. Now, the only way to get to the lib module is by
> mutilating sys.path, or using an import utility module to
> "import by filepath".

Um, no. If your calendar module is named ricklib.calendar, then
importing just calendar will import the standard library calendar.

The only exception is if you're doing "import calendar" from inside
the ricklib package, and you're using Python 2, and you don't have
"from __future__ import absolute_import" at the top of your module.
The solution to this is easy: just add that __future__ import to the
top of your module, and poof, implicit relative imports don't happen.
This is also fixed entirely in Python 3.
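
To make that concrete, here is a minimal sketch. It builds a throwaway ricklib package in a temporary directory (the layout and the file contents are invented for illustration) and checks which calendar each import actually produces. The __future__ line is a no-op on Python 3 and is only there to show where it would go in a Python 2 module.

```python
import importlib
import os
import sys
import tempfile

# Hypothetical layout: ricklib/__init__.py, ricklib/calendar.py, ricklib/user.py
root = tempfile.mkdtemp()
pkg = os.path.join(root, "ricklib")
os.mkdir(pkg)

with open(os.path.join(pkg, "__init__.py"), "w") as f:
    f.write("")
with open(os.path.join(pkg, "calendar.py"), "w") as f:
    f.write("WHO = 'ricklib'\n")
with open(os.path.join(pkg, "user.py"), "w") as f:
    f.write(
        "from __future__ import absolute_import\n"
        "import calendar                           # top-level (stdlib) module\n"
        "from . import calendar as local_calendar  # explicit relative import\n"
    )

sys.path.insert(0, root)
importlib.invalidate_caches()

import calendar          # the standard library module
import ricklib.calendar  # the package's own module
import ricklib.user

# Inside the package, a plain "import calendar" still found the stdlib one.
print(ricklib.user.calendar is calendar)
# The package's own calendar is reachable only through the package namespace.
print(ricklib.user.local_calendar is ricklib.calendar)
```

Both names coexist without any sys.path tricks; the namespace is what keeps them apart.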

This is the second rather blatant error you've made about Python's
import mechanism, which makes me suspect that you don't really know
very much about it.

> Anyone would expect that when *DIRECTLY* importing a
> package, if the __init__ file has code, then THAT code
> should be executed, HOWEVER,  not many would expect that
> merely "referencing" the package name (in order to import a
> more deeply nested package) would cause ALL the
> intermediate __init__ files to execute -- this is madness,
> and it prevents using an __init__ file as an "import hub"
> (without side-effects)!

The whole point of the __init__.py file, in case you didn't intuit it
from the name, is to host any initialization code for the package. Why
on earth would you expect to import a module from a package without
initializing the package?
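
A small sketch of what happens, assuming a made-up outerpkg/innerpkg/leaf.py layout created in a temporary directory: importing the leaf module runs every __init__.py on the way down and registers each package in sys.modules.

```python
import importlib
import os
import sys
import tempfile

# Hypothetical nested package: outerpkg/innerpkg/leaf.py
root = tempfile.mkdtemp()
inner = os.path.join(root, "outerpkg", "innerpkg")
os.makedirs(inner)

with open(os.path.join(root, "outerpkg", "__init__.py"), "w") as f:
    f.write("INITIALIZED = True\n")
with open(os.path.join(inner, "__init__.py"), "w") as f:
    f.write("INITIALIZED = True\n")
with open(os.path.join(inner, "leaf.py"), "w") as f:
    f.write("VALUE = 42\n")

sys.path.insert(0, root)
importlib.invalidate_caches()

# Only the leaf is named, but both parent packages get initialized first.
import outerpkg.innerpkg.leaf

loaded = [
    name
    for name in ("outerpkg", "outerpkg.innerpkg", "outerpkg.innerpkg.leaf")
    if name in sys.modules
]
print(loaded)
```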

> Because the alternative is messy. If i have a collection of
> modules under a package, sometimes i would like to import
> all the "exportable objects" into the __init__ file and use
> the package as an "import hub".

What is the point of putting things into a hierarchical namespace in
the first place if you're just going to turn around and subvert it
like this?

> But the current "global import search path" injections are
> just the inverse. You make changes to sys.path in one
> module, and if you fail to reset the changes before
> execution moves to the next module in the "import chain",
> then that module's import search path will be affected in
> implicit ways that could result in importing the wrong
> module.

No, because the trick you describe doesn't even work. If you edit
sys.path in one file in order to import the coconut module:

import sys
sys.path.insert(0, '/path/to/island')
import coconut

And then in another module change sys.path and try to import
a different coconut module:

sys.path[0] = '/path/to/other/island'
import coconut

You think the second import will produce the second coconut module? It
won't, because the sys.modules cache will already contain an entry for
'coconut' that points to the first module imported. In order to make
this work, you would have to not only modify sys.path but also clear
the sys.modules cache.

Hopefully by the time you've done that you will have realized that
you're abusing the import system by having multiple modules with the
same name, and that as a general rule modules shouldn't be
manipulating the value of sys.path *at all*. Instead, set your
sys.path correctly from the PYTHONPATH environment variable, and if
you really must modify sys.path, try to do it only from the main
script.
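
The sys.modules caching described above can be checked with a short sketch. The two "islands" are temporary directories created here for illustration, each holding its own coconut.py; the make_island helper and the ISLAND attribute are hypothetical names.

```python
import importlib
import os
import sys
import tempfile


def make_island(label):
    """Create a temp directory containing a coconut.py tagged with label."""
    directory = tempfile.mkdtemp()
    with open(os.path.join(directory, "coconut.py"), "w") as f:
        f.write("ISLAND = %r\n" % label)
    return directory


island = make_island("first")
other_island = make_island("second")

sys.path.insert(0, island)
importlib.invalidate_caches()
import coconut
first = coconut.ISLAND

# Point sys.path at the other directory and import again.
sys.path[0] = other_island
import coconut  # satisfied from sys.modules; sys.path is never searched
second = coconut.ISLAND

# Only after evicting the cached entry does the new path take effect.
del sys.modules["coconut"]
importlib.invalidate_caches()
import coconut
third = coconut.ISLAND

print(first, second, third)
```

The second import returns the module already cached under the name 'coconut'; the other island's module only shows up once the cache entry is removed.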

>> It's three lines of code to replace one. Two if you exclude the
>> importlib.machinery import that doesn't need to be repeated.  Is this
>> really any worse than something like:
>>
>> local_search_path.insert(0, "/path/to/local/module")
>> import my_local_module
>>
>> that you are proposing?
>
> If the changes were LOCAL, then i would have no problem to
> this type of mutation, But they are not.

The example of a direct loader call that you described as "boilerplate
hell", to which I was responding, doesn't change anything, locally or
globally. All it does is import a module from an arbitrary location.
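
The loader example from earlier in the thread isn't quoted in this message, so the following is not that code. It is a comparable sketch using importlib.util (available in Python 3.5 and later), with a made-up my_local_module.py written to a temporary directory. It loads the file by path and then checks that sys.path is unchanged.

```python
import importlib.util
import os
import sys
import tempfile

# Hypothetical module file living outside of sys.path.
directory = tempfile.mkdtemp()
module_path = os.path.join(directory, "my_local_module.py")
with open(module_path, "w") as f:
    f.write("GREETING = 'hello'\n")

path_before = list(sys.path)

# Build a spec from the file location, create the module, and execute it.
spec = importlib.util.spec_from_file_location("my_local_module", module_path)
my_local_module = importlib.util.module_from_spec(spec)
spec.loader.exec_module(my_local_module)

print(my_local_module.GREETING)
print(sys.path == path_before)  # the search path was never touched
```

Done this way, the module isn't registered in sys.modules either unless you put it there yourself, so nothing global changes.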
