[Import-sig] Re: Proposal for a modified import mechanism.

eric ej at ee.duke.edu
Sat Nov 10 23:44:44 EST 2001


Hey Gordon,

> eric wrote:
>
> [Frederic Giacometti]
>
> > > because the obvious reason of increased name space collision,
> > > increased run-time overhead etc...
> >
> > I'm missing something here because I don't understand why this
> > increases name space collision.
>
> Currently, os.py in a package masks the real one from
> anywhere inside the package. This would extend that to
> anywhere inside any nested subpackage. Whether that's a
> "neat" or a "dirty" trick is pretty subjective. The wider the
> namespace you can trample on, the more it tends to be "dirty".

Yeah, I guess I come down on the "neat" side on this one.  If I have a
module or package called 'common' at the top level of a deep hierarchy,
I'd like all sub-packages to inherit it.  That seems intuitive to me and
in line with the concept of a 'package'.  Perhaps the hijacking in the
Numeric example strikes a nerve, but inheriting the 'common' module
shouldn't be so contentious.  Also, if someone has the gall to hijack
os.py at the top of your package directory structure, it seems very
likely they want this new behavior everywhere within the package.

I have a feeling this discussion has been around the block a few times
when packages were first being developed...
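For concreteness, the lookup order under discussion -- try the name in the
current package, then in each parent package, then absolutely -- could be
sketched like this.  This is purely an illustrative helper written against
the modern importlib machinery (the name `resolve` is mine, not the actual
hook from the proposal):

```python
import importlib.util

def resolve(name, package):
    """Find `name` relative to `package`, then in each parent package,
    then absolutely, returning the first dotted name that exists.

    Illustrative sketch of the proposed lookup order, not the real hook.
    """
    parts = package.split('.')
    while parts:
        # Try pkg.sub.name, then pkg.name, ... walking up the hierarchy.
        candidate = '.'.join(parts + [name])
        if importlib.util.find_spec(candidate) is not None:
            return candidate
        parts.pop()
    # Fall back to an absolute lookup.
    if importlib.util.find_spec(name) is not None:
        return name
    return None
```

So from inside `xml.sax`, `resolve('saxutils', 'xml.sax')` finds the
sibling module, while a name with no in-package match falls through to
the absolute one.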

>
> > If the objection is to the fact
> > that SciPy can have a version of Numeric in it that masks a
> > Numeric installed in site-packages, I guess I consider this a
> > feature, not a bug.  After all, this is already the behavior for
> > single level packages, extending it to multi-level packages seems
> > natural.  If this isn't your objection, please explain.
>
> Well, it's a feature that can crash Python. If the package
> (which the user has, and you have a hijacked, incompatible
> copy of) contains an extension module, all kinds of nasty
> things can happen when both are loaded.

This I need to know about.  How does this happen?  So you have
two extension modules with the same name, one living in a package
and the other living in site-packages.  If you import both of these, their
namespaces don't conflict in some strange way, do they?

Or are you talking about passing a structure (like a numeric array)
generated in one ext module into a routine in the other ext module
(which expects a different numeric array layout) and then getting
some strange behavior -- even a seg-fault?  Anyway, I'd like
a few more details, for reasons orthogonal to this discussion.
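To make the binary-incompatibility hazard concrete, here is a small
illustration using ctypes with purely hypothetical field layouts: the same
bytes read through two different struct definitions come out as garbage,
which is roughly what happens when a routine from one build of an
extension module receives an object created by an incompatible build.

```python
import ctypes

# Hypothetical layouts: two "versions" of the same array header, as two
# incompatible builds of an extension module might define it in C.
class ArrayV1(ctypes.Structure):
    _fields_ = [("nd", ctypes.c_int),       # number of dimensions first
                ("data", ctypes.c_void_p)]  # then the data pointer

class ArrayV2(ctypes.Structure):
    _fields_ = [("data", ctypes.c_void_p),  # fields reordered in "v2"
                ("nd", ctypes.c_int)]

a = ArrayV1(nd=2, data=None)

# Reinterpret a's bytes through the v2 layout -- effectively what happens
# when v2 code receives an object that v1 code created.
b = ArrayV2.from_buffer(a)
print(b.nd)  # not 2: this field now reads bytes from the old pointer slot
```

Here the misread only produces a wrong integer; when the misread field is
the pointer, dereferencing it in C is where the seg-fault comes from.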

> Submit patches to the package authors, or require a specific
> version, or write a wrapper that adapts to different versions or
> fork or do without.  This is definitely a dirty trick.

We've done the "require specific version" option here, and
"conveniently" upgraded the user's Numeric package for them.
The problem is that some people use old versions of Numeric
in production code, and don't want to risk an upgrade -- but still
want to try out SciPy.  I consider our solution a dirtier trick
than encapsulating things completely within SciPy.  I also don't
think any nasty things could happen in this specific situation of
having two relatively recent Numerics loaded up, but I could be wrong.

>
> > The current runtime overhead isn't so bad.
>
> Under anything near normal usage, no - packages structures
> are nearly always shallow. It wouldn't be much work to
> construct a case where time spent in import doubled, however.
>
> When the "try relative, then try absolute" strategy was
> introduced with packages, it added insignificant overhead. It's
> not so insignificant now. When (and if) the standard library
> moves to a package structure, it's possible it will be seen as a
> burden.

I haven't followed the discussion as to whether the standard library
will move to packages, but I'll be surprised if it does.  I very much
like the encapsulation offered by packages, but have found their
current incarnation harder to develop with than simple modules.
The current discussion concerns only one of the issues.

As for overhead, I thought I'd get a couple more data points from
distutils and xml since they are standard packages.  The distutils import
is pretty much a 0% hit.  However, the xml import is *much* slower --
a factor of 3.5.  That's a huge hit and worth complaining about.  I don't
know whether this can be optimized.  If not, it may be a show stopper,
even if the philosophical argument were uncontested.

eric

import speed numbers below:

C:\temp>python
ActivePython 2.1, build 210 (ActiveState)
based on Python 2.1 (#15, Apr 23 2001, 18:00:35) [MSC 32 bit (Intel)] on
win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import time
>>> t1 = time.time();import distutils.command.build; t2 = time.time();print t2-t1
0.519999980927
>>>

C:\temp>python
ActivePython 2.1, build 210 (ActiveState)
based on Python 2.1 (#15, Apr 23 2001, 18:00:35) [MSC 32 bit (Intel)] on
win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import my_import
>>> import time
>>> t1 = time.time();import distutils.command.build; t2 = time.time();print t2-t1
0.511000037193
>>>

C:\temp>python
ActivePython 2.1, build 210 (ActiveState)
based on Python 2.1 (#15, Apr 23 2001, 18:00:35) [MSC 32 bit (Intel)] on
win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import my_import
>>> import time
>>> t1 = time.time();import xml.sax.saxutils; t2 = time.time();print t2-t1
1.35199999809
>>>

C:\temp>python
ActivePython 2.1, build 210 (ActiveState)
based on Python 2.1 (#15, Apr 23 2001, 18:00:35) [MSC 32 bit (Intel)] on
win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import time
>>> t1 = time.time();import xml.sax.saxutils; t2 = time.time();print t2-t1
0.381000041962
>>>
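A note on the methodology above: each measurement runs in a fresh
interpreter, since a second import in the same process just hits the
sys.modules cache and measures nothing.  Something like this sketch (not
the script used above) automates that by spawning a child interpreter per
measurement:

```python
import subprocess
import sys

def cold_import_time(module):
    """Time the first import of `module` in a fresh interpreter, so the
    result isn't skewed by the sys.modules cache.

    Sketch only; the transcripts above were typed by hand.
    """
    code = ("import time; t0 = time.perf_counter(); "
            "import " + module + "; "
            "print(time.perf_counter() - t0)")
    result = subprocess.run([sys.executable, "-c", code],
                            capture_output=True, text=True, check=True)
    return float(result.stdout)
```

To reproduce the comparison above, one would prepend the experimental
hook (e.g. `import my_import; ` in `code`) and compare the two timings
for the same module.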
