[Python-Dev] Import redesign [LONG]

Greg Stein gstein@lyra.org
Sat, 4 Dec 1999 13:59:00 -0800 (PST)


On Sat, 4 Dec 1999, James C. Ahlstrom wrote:
> Greg Stein wrote:
>...
> > To be explicit/clear and to be sure I'm hearing you right: sys.path may
> > contain Importer instances. Given the name FOO, the system will step
> > through sys.path looking for the first occurrence of FOO (looking in a
> > directory or delegating). FOO may be found with any number of
> > (configurable) file extensions, which are ordered (e.g. ".so" before
> > ".py" before ".isl").
> 
> This is basically a gripe about this design spec.  So if the answer
> turns out to be "we need this functionality so shut up" then just
> say that and don't flame me.
> 
> This spec is painful.  Suppose sys.path has 10 elements, and there
> are six file extensions.  Then the simple algorithm is slow:
>   for path in sys.path:             # Yikes, may not be a string!
>     for ext in file_extensions:     # e.g. ".so", ".py", ".isl"
>       name = module_name + ext      # extensions already include the dot
>       full_path = os.path.join(path, name)
>       if os.path.isfile(full_path):
>         pass                        # Process file here

This is the algorithm that Python uses today, and it is the one my
standard Importers follow.
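
For reference, here is a minimal sketch of that search, extended to cope
with non-string entries on the path. The names find_module and find()
are hypothetical, chosen only for illustration; they are not the actual
Importer interface.

```python
import os

def find_module(module_name, search_path, extensions):
    """Return (entry, location) for the first match, or None.

    search_path plays the role of sys.path; extensions is an ordered
    list such as [".so", ".py"], each including its leading dot.
    """
    for entry in search_path:
        if isinstance(entry, str):
            # Plain directory: try each extension in precedence order.
            for ext in extensions:
                full_path = os.path.join(entry, module_name + ext)
                if os.path.isfile(full_path):
                    return entry, full_path
        else:
            # Assumption: a non-string entry exposes find(name) and
            # returns None when it does not have the module.
            found = entry.find(module_name)
            if found is not None:
                return entry, found
    return None
```

The worst case is len(search_path) * len(extensions) calls to
os.path.isfile(), which is the cost being discussed here.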

> And sys.path can contain class instances
> which only makes things slower.

IMO, we don't know that it is slower, or whether any slowdown is
significant.

> You could do a readdir() and cache
> the results, but maybe that would be slower.  A better
> algorithm might be faster, but a lot more complicated.

Who knows. BUT: the import process is now in Python -- which makes it
*much* easier to run these experiments. We could not really do this when
the import process was "hard-coded" in C code.
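
As one example of such an experiment, the readdir()-and-cache idea could
be sketched roughly as below. ListingCache and find_cached are
hypothetical names for illustration, and whether this is actually faster
is exactly what would have to be measured.

```python
import os

class ListingCache:
    """Remember the result of os.listdir() for each directory."""

    def __init__(self):
        self._listings = {}

    def names_in(self, directory):
        try:
            return self._listings[directory]
        except KeyError:
            try:
                names = frozenset(os.listdir(directory))
            except OSError:
                # Missing or unreadable directory: treat it as empty.
                names = frozenset()
            self._listings[directory] = names
            return names

def find_cached(module_name, directories, extensions, cache):
    """Return the first matching path, using one listing per directory."""
    for directory in directories:
        names = cache.names_in(directory)
        for ext in extensions:
            filename = module_name + ext
            if filename in names:
                return os.path.join(directory, filename)
    return None
```

Note the trade-offs in this sketch: the cache goes stale if files are
added after the first listing, and a name match is not checked to be a
regular file.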

> In the context of archive files, it is also painful.  It prevents
> you from saving a single dictionary of module names.  Instead you
> must have len(sys.path) dictionaries.  You could try to
> save in the archive information about whether (say) a foo.dll was
> present in the file system, but the list of extensions is extensible.

I am not following this. What/where is the "single dictionary of module
names"? Are you referring to a cache? Or is this about building an
archive?

An archive would look just like we have now: map a name to a module. It
would not need multiple dictionaries.
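
A rough sketch of what that means, assuming an archive that is simply an
in-memory mapping from module name to source text. DictArchive and its
find()/load() methods are hypothetical, not an existing API.

```python
import types

class DictArchive:
    """An archive holding one dictionary: module name -> source text."""

    def __init__(self, sources):
        self._sources = dict(sources)

    def find(self, name):
        # One dictionary lookup, regardless of how long sys.path is.
        return name if name in self._sources else None

    def load(self, name):
        # Build a fresh module object and run the stored source in it.
        module = types.ModuleType(name)
        exec(self._sources[name], module.__dict__)
        return module
```

Such an object occupies a single slot in sys.path; the entries before it
on the path are consulted first, and it answers only for the names it
holds.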

> The above problem only exists to support equally-named modules; that
> is, to support a run-time choice of whether to load foo.pyc, foo.dll,
> foo.isl, etc.  I claim (without having written it) that the fastest
> algorithm to solve the unique-name case is much faster than the fastest
> algorithm to solve the choose-among-equal-names case.
> 
> Do we really need to support the equal-name case [Jim runs for
> cover...]?
> If so, how about inventing a new way to support it.  Maybe if equal
> names exist, these must be pre-loaded from a known location?

I don't understand what the problem is. I don't see one. We are still
mapping a name to a module. sys.path defines a precedence.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/