[Python-Dev] New Import Hooks PEP, a first draft (and req. for PEP #)

Just van Rossum just@letterror.com
Fri, 20 Dec 2002 19:25:27 +0100


James C. Ahlstrom wrote:

> Although I originally found the find/load dichotomy annoying,
> I now think it should be kept.

It *is* kept, in every detail of the PEP.

> It solves the real
> world problem of finding files which are not Python modules.
> Currently Zope and others find configuration and data files
> either by looping over sys.path themselves, or by looking
> at the base name of module.__file__ (the file name of an
> imported module).  Both fail for zip imports.

That's why I introduced __importer__ and __importer__.get_data(). (The
details of their specification might change, though. Oh wait, get_data()
has no specification at all yet ;-)
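
To make the idea a bit more concrete, here is a rough sketch of what
get_data() on a zip-backed importer might look like. Since get_data() has
no specification yet, everything here is an assumption for illustration:
the class name ZipDataImporter, its constructor, the archive-relative
naming of members, and the choice of IOError for a missing member.

```python
import io
import zipfile


class ZipDataImporter:
    """Hypothetical importer-like object; not the PEP's specification."""

    def __init__(self, archive):
        # `archive` is a path or a file-like object holding a zip archive.
        self.archive = archive

    def get_data(self, name):
        # Return the raw bytes of an archive member, or raise IOError.
        with zipfile.ZipFile(self.archive) as zf:
            try:
                return zf.read(name)
            except KeyError:
                raise IOError("no such member: %r" % (name,))


# Build a small in-memory archive so the sketch is self-contained.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("pkg/site.conf", "debug = 1\n")

zip_importer = ZipDataImporter(buf)
conf_bytes = zip_importer.get_data("pkg/site.conf")
```

A module loaded by such an importer could then reach its data through
__importer__.get_data() instead of poking at __file__.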

> I too find the imp find/load function signatures hard to love.  But
> I also don't want to break any code which depends on them, nor
> make gratuitous changes which make them useless for zip imports.
> I think the imp module should Just Work for zip imports.

I didn't change anything there and I sure don't break any code, because
there's hardly any code out there that expects imp.find_module() to work
for zip files <0.5 wink>. However, for imp.find_module() to "just work"
and be *useful* for imports from importer objects, it will need to
return a loader object. I don't see how this can be done without
breaking imp.find_module(). I thought of returning a loader object
instead of a file object for "hooked" imports, but unless we add a dummy
close() method to all loader objects, this will break a common idiom:

    file, filename, stuff = imp.find_module(...)
    if file:
        file.close()

And that's just the beginning of the trouble. It's also rather hackish.
So I propose to add a new function to the imp module, as described in
the PEP. And no, this one doesn't solve the data file problem.
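
A small sketch of the breakage, without touching the real imp module:
hooked_find_module() and HypotheticalLoader below are made-up stand-ins
for "imp.find_module() returning a loader in the file slot", and the
filename and description values are placeholders.

```python
class HypotheticalLoader:
    """Stand-in for a loader object; note that it has no close() method."""

    def load_module(self, fullname):
        raise NotImplementedError("illustration only")


def hooked_find_module(name):
    # Hypothetical variant of imp.find_module() that returns a loader
    # in the slot where callers expect a file object.
    return HypotheticalLoader(), "<hooked %s>" % name, ("", "", 0)


def common_idiom(find):
    # The idiom quoted above, written against whatever `find` returns.
    file, filename, stuff = find("spam")
    if file:
        file.close()  # AttributeError if `file` is really a loader


try:
    common_idiom(hooked_find_module)
    idiom_survived = True
except AttributeError:
    idiom_survived = False
```

The loader is truthy, so the "if file:" guard doesn't help; the only way
to keep this idiom alive would be the dummy close() method.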

I could change imp.find_module() to return *something* for hooked
imports, but without a loader object it won't be useful for doing imports.

> I suggest we keep imp.find_module but add another argument:
> 
>       find_module(name [, path [, suffixes]])
> 
> where suffixes defaults to imp.get_suffixes().  This enables it
> to be used to search for files with different suffixes, for
> example suffixes=[".conf"].

I'm not sure I like this: imp.find_module() is designed for finding
modules, not arbitrary files. It's an interesting idea, but it
would necessarily complicate the importer protocol, as each importer
would have to deal with arbitrary suffixes. It implies file-system
semantics that importer objects can't necessarily satisfy; e.g. an
importer doesn't necessarily deal with suffixes at *all*.
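
As an illustration of an importer with no notion of suffixes, here is a
minimal sketch of one that serves modules from a dict of source strings.
It follows the find_module()/load_module() shape of the importer protocol,
but DictImporter, its `sources` mapping and the module name spam_from_dict
are invented for this example.

```python
import sys
import types


class DictImporter:
    """Hypothetical importer whose modules live in a dict of source strings."""

    def __init__(self, sources):
        self.sources = sources  # maps module name -> source text

    def find_module(self, fullname, path=None):
        # Return a loader (here: self) if we know the module, else None.
        return self if fullname in self.sources else None

    def load_module(self, fullname):
        if fullname in sys.modules:
            return sys.modules[fullname]
        module = types.ModuleType(fullname)
        module.__loader__ = self
        sys.modules[fullname] = module
        try:
            exec(self.sources[fullname], module.__dict__)
        except BaseException:
            # Don't leave a half-initialized module behind.
            del sys.modules[fullname]
            raise
        return module


dict_importer = DictImporter({"spam_from_dict": "answer = 42\n"})
loader = dict_importer.find_module("spam_from_dict")
module = loader.load_module("spam_from_dict")
```

There is no file name anywhere in this importer, so a suffixes=[".conf"]
argument would have nothing to match against.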

> To make this work, the returned file object would have to be
> either a real file object, or another object with a read() method.
> Python has lots of precedents for accepting file-like objects.  For
> example, sys.stdout can be replaced with any object which has a
> write method.
> 
> The returned file-like object must have a read() method and
> a close() method.  It could also have a stat() method if it
> is a zip file, because zip files record the file date, time
> and size.

(File objects don't have a stat method.)
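
For comparison, a quick sketch of where that metadata lives today: for a
real file it comes from os.fstat() on the descriptor, and for a zip member
from the archive's ZipInfo record. The file contents and the member name
site.conf are made up for the example.

```python
import io
import os
import tempfile
import zipfile

# Real file: metadata comes from os.fstat() on the descriptor,
# not from a method on the file object.
with tempfile.TemporaryFile() as f:
    f.write(b"hello")
    f.flush()
    real_size = os.fstat(f.fileno()).st_size
    has_stat = hasattr(f, "stat")

# Zip member: metadata comes from the archive's ZipInfo record.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("site.conf", "debug = 1\n")
with zipfile.ZipFile(buf) as zf:
    info = zf.getinfo("site.conf")
    member_size = info.file_size
    member_mtime = info.date_time  # (year, month, day, hour, minute, second)
```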

> So you call file.read() for a configuration file, or pass
> it to imp.load_module() if it is a Python module.

I'll think about this some more, but I'm not yet convinced.

Just