[Python-Dev] New universal import mechanism ( Re: [Python-checkins] python/dist/src/Python import.c,2.210,2.211)

Just van Rossum just@letterror.com
Mon, 2 Dec 2002 12:37:37 +0100


Wiktor Sadowski wrote:

> I think Python should have a more universal import mechanism, which
> could be done by adding a new C_CUSTOM type to the definitions for
> dynamic loading (importdl.h), a new struct _customtab to import.h
> and a new function to Python API (import.c). Then any
> archive, database, internet import architecture could be implemented in
> C extensions without messing with Python core.

(Could you please post a normal diff of your patch? It looks very interesting.)

I think I agree with this concept as well. Yesterday I tried to review the
latest incarnation of the "import from zip file" patch and ran away screaming:
it's way too intrusive on import.c, touches too many files and is otherwise
simply incomprehensible to me.

I've been doing some thinking about the import mechanism, too, and my current
gripe is about the __import__ hook: I find it utterly useless that the
implementer of the hook is supposed to deal with sys.modules. What I wish for is
a hook that only gets invoked if the module isn't already in sys.modules.

Here's a quick idea. Say we add a new dynamic variable to sys named
"import_hooks". It's a list that is empty by default, but may contain import
hooks. If an import occurs, sys.modules is checked first, and *only* if the
module isn't yet loaded, something like this will be done:

    for hook in sys.import_hooks:
        mod = hook(absolutemodulename, parentpackage_or_None)
        if mod is not None:
            return mod
    return builtin_import_hook(absolutemodulename, parentpackage_or_None)

(Hm, instead of calling builtin_import_hook() explicitly, it could also be part
of the default sys.import_hooks list. Yeah, that's better, as it allows one to
completely disable builtin import.)
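
A minimal runnable sketch of the dispatch described above, under stated
assumptions: the list `import_hooks` stands in for the proposed
sys.import_hooks, the dict `modules` stands in for sys.modules, and
`import_module` and `demo_hook` are hypothetical names that are not part of any
Python API. It also assumes the dispatcher, not the hook, stores the result in
the cache, and that an ImportError is raised when every hook returns None.

```python
import types

# Stand-ins for the proposed sys.import_hooks and for sys.modules.
# Both names are hypothetical; a private dict keeps the sketch from
# touching the real module cache.
import_hooks = []
modules = {}


def import_module(absolutemodulename, parentpackage=None):
    """Check the module cache first; consult the hooks only on a miss."""
    if absolutemodulename in modules:
        return modules[absolutemodulename]
    for hook in import_hooks:
        mod = hook(absolutemodulename, parentpackage)
        if mod is not None:
            # Assumption: the dispatcher does the caching, so hooks
            # never have to deal with the module cache themselves.
            modules[absolutemodulename] = mod
            return mod
    raise ImportError("no hook could load %r" % absolutemodulename)


def demo_hook(absolutemodulename, parentpackage):
    """Hypothetical hook that only knows how to create a module named 'spam'."""
    if absolutemodulename != "spam":
        return None  # decline, so the next hook in the list gets a turn
    mod = types.ModuleType(absolutemodulename)
    mod.answer = 42
    return mod


import_hooks.append(demo_hook)
```

With this arrangement a hook is only a function that returns a module or None.
Emptying the list disables importing entirely, which matches the idea of making
the builtin import just one more entry in the default list.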

For an import from within a package, the mechanism is possibly invoked twice.
For example, if a submodule of Foo imports Baz, the mechanism is called once with
"Foo.Baz" and, if that fails (returns None), it's called again with "Baz".

I think this will make writing import hooks a) more attractive and b) far less
brittle, since there's so much less to implement.
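
The two-step lookup for an import from within a package could be sketched
roughly as follows. The names `candidate_names` and `resolve` are hypothetical,
and passing None as the parent package on the second (top-level) attempt is an
assumption; the proposal does not say what the hook receives in that case.

```python
def candidate_names(name, parentpackage=None):
    """Yield (absolute name, parent) pairs to try, package-relative first."""
    if parentpackage:
        yield parentpackage + "." + name, parentpackage
    # Assumption: the top-level attempt is made with no parent package.
    yield name, None


def resolve(name, parentpackage, hook):
    """Call `hook` once per candidate until one returns a module."""
    for absolutename, parent in candidate_names(name, parentpackage):
        mod = hook(absolutename, parent)
        if mod is not None:
            return mod
    raise ImportError("cannot import %r" % name)
```

For a submodule of Foo importing Baz, `resolve("Baz", "Foo", hook)` first asks
the hook for "Foo.Baz" and falls back to "Baz" only if the hook returned None.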

A new-style import hook could use sys.path if that makes sense for it, but for
(say) a zip file importer it makes more sense (to me at least) to just add a
hook per zip file. No need to allow non-string types on sys.path.
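
As an illustration of "one hook per zip file", here is a small sketch of a hook
factory. `make_zip_hook` is a hypothetical helper, not an existing API. It
assumes the archive holds plain .py source, maps a dotted name such as
"pkg.mod" to the member "pkg/mod.py", and ignores packages (__init__.py) and
compiled files.

```python
import types
import zipfile


def make_zip_hook(zip_path):
    """Return a hook that loads pure-Python modules from one zip archive."""
    def hook(absolutemodulename, parentpackage=None):
        member = absolutemodulename.replace(".", "/") + ".py"
        with zipfile.ZipFile(zip_path) as archive:
            try:
                source = archive.read(member)
            except KeyError:
                return None  # not in this archive; let the next hook try
        mod = types.ModuleType(absolutemodulename)
        mod.__file__ = "%s/%s" % (zip_path, member)
        # Run the module source in the new module's namespace.
        exec(compile(source, mod.__file__, "exec"), mod.__dict__)
        return mod
    return hook
```

Each archive gets its own closure, so supporting another zip file means
appending another hook to the list rather than putting a non-string entry on
sys.path.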

I'm not entirely sure yet how my idea and yours could be integrated, but I think
it's worth a look.

Just