[Python-Dev] New and Improved Import Hooks

Just van Rossum just@letterror.com
Wed, 4 Dec 2002 22:12:48 +0100


Gordon McMillan wrote:

> Code like this:
>  for p in sys.path:
>    x = os.path.join(p, ...)
>    ....
> is very common (I patched linecache.py for this after imputil went
> into the std lib). Since PYTHONPATH can consist only of strings, it
> seems wise to tackle the issue (dealing with strings that describe
> non-directory collections of modules) instead of postponing it. Also
> seems sensible to make it so that if X works on PYTHONPATH,
> sys.path.append(X) should work, too.

Erm, I'm not sure what you're saying... Are you saying that we should fix all
cases where non-strings on sys.path cause problems, or are you saying that
there's so much code out there assuming sys.path contains strings, and that we
therefore should stick with strings?

Both positions can be defended, and both have their problems.

A) Stick with strings. Hooks can be implemented by subclassing str. This is
great for hooks written in Python, but subclassing str in C is not
straightforward. Things can still break, though: e.g.
os.path.basename(strsubinst) will return a regular string, not an instance of
the subclass, which might be an issue.
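
A minimal sketch of the problem with option A. The class name ArchivePath is
hypothetical and only stands in for a hook implemented as a str subclass:

```python
import os.path


class ArchivePath(str):
    """Hypothetical str subclass standing in for an import hook."""


hook = ArchivePath("/path/to/my/archive.zip")

# The instance still passes the usual "is it a string?" check...
assert isinstance(hook, str)

# ...but os.path functions hand back a plain str, so the hook's
# identity is lost as soon as the path is processed.
base = os.path.basename(hook)
print(type(hook).__name__, "->", type(base).__name__)
```

Any code that derives a new path from the sys.path item and passes it on
would carry a plain string from that point, not the hook.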

B) Allow arbitrary objects on sys.path. Hooks are then easier to write (in C),
but some code breakage will occur. The std library we can fix (if needed), but
third-party code might break.
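
To illustrate the breakage under option B, here is a sketch of the common
loop from Gordon's message run against a path list holding a non-string
item. FakeImporter, search_path and candidates are hypothetical names, and
search_path is only a stand-in for sys.path. Skipping on TypeError is one
possible fix, not a proposal for what the std library should do:

```python
import os.path


class FakeImporter:
    """Hypothetical non-string object placed on the path list."""


# Stand-in for sys.path, so the real one is left untouched.
search_path = ["/usr/lib/python", FakeImporter()]


def candidates(name):
    """Join name onto every string item; collect the items that cannot be joined."""
    found, skipped = [], []
    for p in search_path:
        try:
            found.append(os.path.join(p, name))
        except TypeError:
            # os.path.join rejects objects that are not path-like.
            skipped.append(p)
    return found, skipped


found, skipped = candidates("linecache.py")
print(len(found), "joined,", len(skipped), "skipped")
```

Without the try/except, the loop raises TypeError on the first non-string
item, which is the kind of failure third-party code would see.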

I would very much prefer B, but if it turns out that we can't break the string
assumption, I'd still be happy with A (rather that than nothing!).

Regarding PYTHONPATH and sys.path.append("/path/to/my/archive.zip"): for now I'd
suggest that the sys.path traversing code checks for a .zip extension, and
replaces the item with a zipimporter instance. This check can be very cheap.
Later we could add a general extension-checking feature, where one could
register an import hook for a specific extension. This might be a case of YAGNI,
though...
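
A rough sketch of the extension check, written in Python for illustration
only. ZipImporterStub, extension_hooks and resolve_path_items are all
hypothetical names; the stub merely records the archive path and does no
importing. The registry shows how the later, more general feature might
look if it turns out to be needed:

```python
import os.path


class ZipImporterStub:
    """Hypothetical stand-in for a zipimporter; only records the archive path."""

    def __init__(self, archive):
        self.archive = archive


# Hypothetical registry: file extension -> callable that builds a hook.
extension_hooks = {".zip": ZipImporterStub}


def resolve_path_items(path):
    """Replace, in place, string items whose extension has a registered hook."""
    for i, item in enumerate(path):
        if isinstance(item, str):
            ext = os.path.splitext(item)[1].lower()
            factory = extension_hooks.get(ext)
            if factory is not None:
                path[i] = factory(item)
    return path


path = resolve_path_items(["/usr/lib/python", "/path/to/my/archive.zip"])
print([type(item).__name__ for item in path])
```

The check costs one splitext call and one dict lookup per string item, and
items that were already replaced are skipped on later traversals.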

Just