[Python-Dev] new imputil.py

Tue, 04 Jan 2000 19:36:00 +0100

Greg Stein wrote:
> 
> Comments:
> 
> On Mon, 3 Jan 2000, M.-A. Lemburg wrote:
> >...
> > The new importer does load everything in the test set
> > (top level modules, packages, extensions within packages)
> > without problems on Linux.
> 
> Great!
> 
> > Some comments:
> >
> > · Why is the sys.path.insert(0,imputil.BuiltinImporter())
> > needed in order to get b/w compatibility ?
> 
> Because I didn't want to build too much knowledge into the ImportManager.
> Heck, I think adding sys.path removed some of the design elegance; adding
> real knowledge of builtins... well, we'll just not talk about that. :-)
> 
> We could certainly do it this way; let's see what Guido says. I'm not
> truly averse to it, but I'd recommend against adding knowledge of
> BuiltinImporter to the ImportManager.

I was under the impression that the ImportManager should replace
the current implementation. In that light it should of course
provide all the needed techniques by default, without the
need to tweak sys.path.

> > · Why is there no __path__ aware code in imputil.py (this is
> > definitely needed in order to make it a drop-in replacement) ?
> 
> Because I don't like __path__ :-)  I don't think it would be too hard to
> add, though.
> 
> If Guido says we need __path__, then I'll add it. I do believe there was a
> poll a while back where he asked whether anybody truly used it. I don't
> remember the result and/or Guido's resolution of the matter.

AFAIK, JimF is using it in Zope. I will use it in the
b/w compatibility package for the soon to be released
mx Extensions packages (instead of using relative imports,
BTW -- can't wait for those to happen).

> > · Performance is still 50% of the Python builtin importer --
> > a bummer if you ask me. More aggressive caching is definitely
> > needed, perhaps even some recoding of methods in C.
> 
> I'm scared of caching and the possibility for false positives/negatives.
> 
> But yes, it is still slower and could use some analysis and/or recoding
> *if* the speed is a problem. Slower imports do not necessarily mean they
> are "too slow."

There has been some moaning about the current Python startup
speed, so I guess people already find the existing strategy
too slow.

Anyway, put the caching risks into the users' hands and let
them decide whether or not to enable caching. The important thing
is providing a standard approach to caching which all
importers can use and hook into rather than having three
or four separate cache implementations.
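
To make the idea concrete, here is a rough sketch of one shared,
opt-in cache that every importer hooks into. It is written in
present-day Python syntax, and all names in it (ImportCache,
CachingImporter, find, search) are hypothetical; none of them exist
in imputil.py.

```python
class ImportCache:
    """Hypothetical shared lookup cache; disabled unless the user opts in."""

    def __init__(self, enabled=False):
        self.enabled = enabled
        self._locations = {}

    def lookup(self, fqname):
        # Return the cached location, or None on a miss or when disabled.
        if not self.enabled:
            return None
        return self._locations.get(fqname)

    def remember(self, fqname, location):
        # Record where a module was found; a no-op when disabled.
        if self.enabled:
            self._locations[fqname] = location

    def forget(self, fqname=None):
        # Drop one entry, or everything, to recover from stale data.
        if fqname is None:
            self._locations.clear()
        else:
            self._locations.pop(fqname, None)


class CachingImporter:
    """Hypothetical importer base class that consults the shared cache."""

    cache = ImportCache()  # a single instance shared by all importers

    def find(self, fqname):
        location = self.cache.lookup(fqname)
        if location is None:
            location = self.search(fqname)
            if location is not None:
                self.cache.remember(fqname, location)
        return location

    def search(self, fqname):
        # Subclasses do the real (slow) lookup here.
        raise NotImplementedError
```

With something like this, users who accept the risk of stale entries
would set CachingImporter.cache.enabled = 1; everyone else gets the
uncached behaviour.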

> > · The old chaining code should be moved into a subclass of
> > its own.
> 
> Good thought. But really: I'd just rather torch it. This kind of depends
> on whether we can get away with saying the ImportManager is *the* gateway
> between the interpreter and Python-level import hooks. In other words,
> will ImportManager be the *only* Python code to ever be allowed to call
> sys.set_import_hook() ? If the ImportManager doesn't have to "play with
> other import hooks", then the chaining can be removed altogether.

Hmm, nuking the chains might cause some problems with code
using the old ni.py or other code such as my old ClassModules.py
module which emulates modules using classes (provides all the
cool __getattr__ and __setattr__ features to modules as well).

> > · The code should not import strop directly as this module
> > will probably go away RSN. Use string methods instead.
> 
> Yah. But I'm running this against 1.5.2 :-)
> 
> I might be able to do something where the string methods are used if
> available, and use the strop module if not.
> [ similar to the 'os' bootstrapping that is done ]
> 
> Finn Bock emailed me to say that JPython does not have strop, but does
> have string methods.

Since imputil.py targets 1.6 you can safely assume that string
methods are in place.

> > · The design of the ImportManager has some minor flaws: the
> > FS importer should be settable via class attributes,
> 
> The class or the object itself? Putting a class in there would be nice, or
> possibly passing it to the constructor (with a suitable default).
> 
> This is a good idea, though. Please clarify what you'd like to see, and
> I'll get it added.

I usually put these things into the class so that subclasses
can easily override the setting.
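
Roughly like this (only a sketch; ManagerSketch, fs_importer_class
and the importer classes are made-up stand-ins, not the actual
imputil.py classes):

```python
class FileSystemImporter:
    """Stand-in for a file system importer (hypothetical)."""

    def __init__(self, path):
        self.path = path


class ManagerSketch:
    """Stand-in for the manager; not the real ImportManager."""

    # Class attribute: subclasses override this to swap in another importer.
    fs_importer_class = FileSystemImporter

    def make_fs_importer(self, path):
        # Looked up through self, so a subclass setting wins automatically.
        return self.fs_importer_class(path)


class ArchiveAwareImporter(FileSystemImporter):
    """Hypothetical replacement importer."""


class MyManager(ManagerSketch):
    fs_importer_class = ArchiveAwareImporter
```

Passing the class to the constructor with a suitable default would
work as well; the class attribute just keeps the override in one
obvious place.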

> > deinstallation
> > should be possible,
> 
> Maybe. This is somewhat dependent upon whether it must "play nice."
> Deinstallation would be quite easy if we move to a sys.get/set style of
> interface, and it wouldn't be an issue to do de-install code.

I was thinking mainly of debugging situations where you play
around with new importer code -- it's probably not important
for production code.

> > a query mechanism to find the importer
> > used by a certain import would also be nice to be able to
> > verify correct setup.
> 
> module.__importer__ provides the importer that was used. This is defined
> behavior (the system relies on that being set to deal with packages
> properly).
> 
> Is this sufficient, or were you looking for something else?

I was thinking of situations like:

if <RelativeImporter is not installed>:
   <install RelativeImporter>

or

if <need SignedModuleImporter for modules xyz>:
   raise SystemError,'wrong setup'

Don't know if these queries are possible with the current
flags and attributes.
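
Spelled out, such queries might look like the sketch below. The
registry and its install/uninstall/is_installed methods are purely
hypothetical (in present-day Python syntax); imputil.py does not
offer them, and the two importer classes are empty placeholders.

```python
class ImporterRegistry:
    """Hypothetical record of which importers are currently installed."""

    def __init__(self):
        self._importers = []

    def install(self, importer):
        self._importers.append(importer)

    def uninstall(self, importer):
        # Handy while debugging new importer code.
        self._importers.remove(importer)

    def is_installed(self, importer_class):
        # True if any installed importer is an instance of importer_class.
        return any(isinstance(i, importer_class) for i in self._importers)


class RelativeImporter:
    """Placeholder importer class."""


class SignedModuleImporter:
    """Placeholder importer class."""


def check_setup(registry):
    # Install on demand ...
    if not registry.is_installed(RelativeImporter):
        registry.install(RelativeImporter())
    # ... or refuse to run with an incomplete setup.
    if not registry.is_installed(SignedModuleImporter):
        raise SystemError("wrong setup")
```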

> module.__ispkg__ is also set to 0/1 accordingly.
> 
> For backwards compat, __file__ and __path__ are also set. The __all__
> attribute in an __init__.py file is used for "from package import *".
> 
> > · py/pyc/pyo file piping hooks would be nice to allow
> > imports of signed (and trusted) code and/or encrypted code
> > (a mixin class for these filters would do the trick).
> 
> I'd happily accept a base SuffixImporter class for these "pipes". I don't
> believe that the ImportManager, Importer, or SuffixImporter base classes
> would need any changes, though.
> 
> Note that I probably will rearrange the _fs_import() and friends, per
> Guido's suggestion to move them into a base class. That may be a step
> towards having "pipes" available.

It would be nice to be able to use the concept of stackable streams
as the source for byte code and source code. For this to work one would
have to make the file reading process a little more abstract by using
e.g. a StreamReader instead (see the current unicode-proposal.txt version).
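
A minimal sketch of the stacking idea, in present-day Python syntax.
The reader classes and load_source() are hypothetical and are not the
StreamReader interface from the Unicode proposal; a SHA-256 checksum
stands in for a real signature check and zlib stands in for
decryption.

```python
import hashlib
import io
import zlib


class ChecksumReader:
    """Hypothetical filter: verify a digest before releasing any data."""

    def __init__(self, stream, expected_hexdigest):
        self.stream = stream
        self.expected = expected_hexdigest

    def read(self):
        data = self.stream.read()
        if hashlib.sha256(data).hexdigest() != self.expected:
            raise ImportError("module data failed the integrity check")
        return data


class DecompressReader:
    """Hypothetical filter: zlib stands in for decryption here."""

    def __init__(self, stream):
        self.stream = stream

    def read(self):
        return zlib.decompress(self.stream.read())


def load_source(stream):
    # The importer only needs an object with read(); it does not care
    # how many filters are stacked underneath.
    return stream.read().decode("utf-8")


# Stack the filters: raw bytes -> checksum check -> decompression.
packed = zlib.compress(b"x = 42\n")
digest = hashlib.sha256(packed).hexdigest()
stream = DecompressReader(ChecksumReader(io.BytesIO(packed), digest))
namespace = {}
exec(load_source(stream), namespace)
```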

> > · Wish list: a distutils importer hooked to a list of standard
> > package repositories, a module to file location mapper to
> > speed up file system based imports,
> 
> I'm not sure what the former would do. distutils is still a little
> nebulous to me right now.

Basically it should scan a set of URLs providing access to
package repositories which hold distutils installable package
archives. In case it finds a suitable package it should then
proceed to auto-install it and then continue the normal import
process.

> For a mapper, we can definitely have a custom Importer that knows where
> certain modules are found. However, I suspect you're looking for some kind
> of a cache, but there isn't a hook to say "I found <foo> at <this>
> location" (which would be used to build the mapping).

Right. I would like to see some standard mechanism used
throughout the ImportManager for this. One which all importers
can use and rely on. E.g. it would be nice to have an option
to load the cache from disk upon startup to reduce search times.
All this should be left for the user to configure with the
standard setting being no cache at all (to avoid confusion
and reduce support costs ;-).
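
As a sketch of what I mean (hypothetical names throughout, present-day
Python syntax, and JSON chosen only for illustration): importers
report what they found, and the mapping is read from and written to
disk only if the user configured a cache file.

```python
import json
import os


class LocationMap:
    """Hypothetical module name -> file location map, optionally persisted."""

    def __init__(self, cache_file=None):
        # cache_file=None means "no cache at all", the proposed default.
        self.cache_file = cache_file
        self.locations = {}
        if cache_file and os.path.exists(cache_file):
            with open(cache_file) as f:
                self.locations = json.load(f)

    def found(self, fqname, location):
        # Hook for importers: "I found <fqname> at <location>".
        if self.cache_file:
            self.locations[fqname] = location

    def locate(self, fqname):
        location = self.locations.get(fqname)
        # Guard against stale entries: the file may have moved or vanished.
        if location is not None and not os.path.exists(location):
            del self.locations[fqname]
            return None
        return location

    def save(self):
        # Write the mapping back so the next startup can skip the search.
        if self.cache_file:
            with open(self.cache_file, "w") as f:
                json.dump(self.locations, f)
```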

-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                             Happy New Century !
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/