[Python-Dev] Re: Proposal for a modified import mechanism.

eric ej@ee.duke.edu
Sun, 11 Nov 2001 00:40:54 -0500


Hey Frederic,

> But then, this is not an import problem.
> If you use Numeric, you call Numeric. If you call something other than
> Numeric, just give a different name, and all the confusion will go away.

This is certainly an option, but not a good one in my opinion.  The main
issue is that we want to force a specific version of Numeric for SciPy while
allowing people to keep their old standard version of Numeric available for
their production code.  Single-level packages provide a handy way of doing
this.  Multi-level packages (like SciPy) do not.  I guess I just don't see
why making a package multi-level should inherently make it harder to do
things.
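
To make the difference concrete, here is a minimal sketch (in present-day
Python, not the actual patch) of the lookup order a parent-walking import
could use.  It assumes the proposal tries the importing package first, then
each enclosing package, then the top level.  The helper name candidate_names
and the sub-package name scipy.stats are illustrative only.

```python
def candidate_names(importing_package, name):
    """Return the dotted names a parent-walking import would try, in order."""
    parts = importing_package.split(".") if importing_package else []
    candidates = []
    # Try the innermost package first, then each enclosing package.
    while parts:
        candidates.append(".".join(parts + [name]))
        parts.pop()
    # Finally fall back to an ordinary top-level lookup on sys.path.
    candidates.append(name)
    return candidates


# A module inside scipy.stats asking for Numeric would find a copy
# bundled in the top scipy directory before the one in site-packages.
print(candidate_names("scipy.stats", "Numeric"))

# A top-level module has no enclosing package, so nothing extra is tried.
print(candidate_names("", "Numeric"))
```

Under these assumptions only code living inside a package pays for the
extra lookups, and a bundled Numeric shadows the installed one for SciPy's
sub-packages alone.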

> If you're worried that you've already encoded the Numeric name 50 times
> into 300 files; run a python script over these 300 files; this will do the
> renaming of the 15.000 occurrences of the Numeric name.

Sure, but this is inconvenient, and something I think should be handled
by the packaging facility, not by running a renaming script.

>
> > That makes SciPy
> > self-contained and allows people to try it out without worrying that it
> > might break their current installation.  There are other solutions to
> > this problem, but Prabhu's fix is by far the easiest and most robust.
>
> And then, in maintenance/integration phase, sometimes 'Numeric' will call
> Numeric, some other times it will call your package ?

In the integration phase, users would need to change
"from Numeric import *" to "from scipy import *" (or search and replace
Numeric with scipy) in their code, as Numeric is completely subsumed
into scipy.  So, when not using SciPy, their legacy code can continue using
an old version of Numeric.  When switching to SciPy, they make the
replacement.  As I said, it's mainly a version issue (with a few minor
changes).
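
The search-and-replace step could look roughly like the sketch below.  It is
written in present-day Python, and the function name migrate_source is
hypothetical.  It is a plain whole-word substitution, so it would also touch
the name inside comments and string literals.

```python
import re

# Whole-word match, so names such as NumericX or Numeric_utils are left alone.
_NUMERIC = re.compile(r"\bNumeric\b")


def migrate_source(source):
    """Return source text with every whole-word 'Numeric' replaced by 'scipy'."""
    return _NUMERIC.sub("scipy", source)


legacy = "from Numeric import *\nimport Numeric\na = Numeric.array([1, 2])\n"
print(migrate_source(legacy))
```

Until that step is run, the legacy file keeps importing whatever Numeric is
installed in site-packages.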

>
> What if somebody, for some reason I know nothing of (e.g. probably some
> integration) wants to call Numeric and your Numeric package in the same
> module ?
> Wish them tough luck to sort out this poisoned gift....
>
> > Prabhu's import also has some other nice benefits.  Some of the
> > sub-packages in SciPy are useful outside of SciPy.  Also sometimes it is
> > easier to develop a package outside of the SciPy framework.  It would be
> > nice to be able to develop a module or package 'foo' outside of SciPy and
> > then move it into SciPy at a later date.  However, every SciPy
> > sub-package that referred to foo prior to its inclusion in SciPy now has
> > to be updated from 'import foo' to 'import scipy.foo'.  These kinds of
> > issues make it very painful and time consuming to rearrange package
> > structures or move modules and sub-packages in and out of the package.
>
> There are basic python scripts which do this painlessly. If you're really
> working on a large project, there's a project architect which normally
> would take care of such things, and for whom this should not be too much
> of a problem.

Hmmm.  I guess the "project architect" role in this case is jointly held by
Travis Oliphant and yours truly.  Neither of us is a packaging guru, but we
do have a fair amount of experience with Python.  We worked quite a while on
(and are still working on) packaging issues.  Incidentally, I have no idea
what Travis O.'s opinion is on this specific topic.

>
>
> > Simplifying this will improve
> > package development.
> >
> > > I'm personally against anything that enlarges the search path
> > > uselessly;
> >
> > Hopefully I've explained why it is useful for complex packages.
>
> Python helps in many areas, but expecting it to palliate for the package
> design and architecture flaws that inexorably surface anytime something
> non-trivial is developed, might be somehow at the edge. Python has not yet
> replaced the need for relevant software architects.

Them thars fightin' words. ; )  I'm biased, but I don't think SciPy's
architecture is flawed.  It is simply a *very* large package of integrated
sub-packages that also relies heavily on a 3rd evolving group of modules
(Numeric).  As such, it reveals the difficult issues that arise when trying
to build large packages of integrated sub-packages that rely on a 3rd
evolving group of modules...

>
> >
> > > because the obvious reason of increased name space collision,
> > > increased run-time overhead etc...
> >
> > I'm missing something here because I don't understand why this increases
> > name space collision.  If the objection is to the fact that SciPy can
> > have a version of Numeric in it that masks a Numeric installed in
> > site-packages, I guess I consider this a feature, not a bug.
>
> Actually, it is normally worse than a bug: it is a source of bugs tomorrow
> in your application - of all the bugs you'll have when your programmer
> will be confusing the two Numeric packages, as well as all the maintenance
> and integration problems you'll have down the line -.


I disagree and don't think that is true in this (and many other) situations.
People who want to use SciPy will migrate completely to it since it includes
Numeric.  What the sub-package option offers is a way to test SciPy and
optionally use it while keeping their standard Numeric around for their
production code.

>
> But by then, hopefully for you, you'll be somewhere else... The sad
> reality of most projects :((
>
> >  After all, this is already the
> > behavior for single level packages, extending it to multi-level packages
> > seems natural.  If this isn't your objection, please explain.
> >
> > The current runtime overhead isn't so bad.  Prabhu sent me a few numbers
> > on the SciPy import (which contains maybe 10-15 nested packages).  I
> > attached them below -- the overhead is less than 10%.  It should be
> > negligible for standard modules as only packages are really affected
> > (right Prabhu?).
>
> And that's how, when you accumulate the overheads for all new features,
> you get potentially +100-200% overhead on the new releases.
> Albeit all the efforts of the Python team, Python 2.0 is up to 70% slower
> than python 1.5.2; Python 2.1.1 is up to 30% slower than python 2.0, and
> so on...
> So, +10% on only such a minor feature is anything but negligible :(((


The computational cost of additional functionality is always a question of
what portion of a program is impacted.  If we were talking about a 10% hit
on looping structures or dictionary lookups or local variable lookups, then
yes, it needs extreme scrutiny.  Adding 10% to a rare event is not worthy of
note.  I expect (and see) 0% overhead for importing standard modules (by far
the most common case).  Adding 10% overhead to importing a very large
package with 10-15 nested sub-packages is just not a big deal.  The 350%
cost I saw (noted in a response to Gordon) is a *huge* deal and would need
to be solved (moving to C would help) before this became standard.
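
The "what portion is impacted" argument is just a product of two fractions.
The sketch below works it through for the two overhead figures mentioned
above.  The 1% share of run time spent importing is an assumed number for a
hypothetical program, not a measurement, and the function name is
illustrative.

```python
def whole_program_slowdown(import_fraction, import_overhead):
    """Fractional slowdown of a whole run caused by slower imports.

    import_fraction: share of total run time spent importing (assumed value).
    import_overhead: relative slowdown of the import step (0.10 means 10%).
    """
    return import_fraction * import_overhead


# Hypothetical program that spends 1% of its run time importing packages.
for overhead in (0.10, 3.50):
    total = whole_program_slowdown(0.01, overhead)
    print(f"{overhead:.0%} import overhead -> {total:.2%} overall")
```

The same overhead applied to loops or dictionary lookups would be multiplied
by a much larger fraction, which is why those cases deserve the scrutiny.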

eric

----- Original Message -----
From: "Frederic Giacometti" <frederic.giacometti@arakne.com>
To: "eric" <ej@ee.duke.edu>
Cc: <import-sig@python.org>; <prabhu@cyberwaveindia.com>;
<python-list@python.org>; <python-dev@python.org>
Sent: Saturday, November 10, 2001 4:43 PM
Subject: Re: Proposal for a modified import mechanism.


>
>
> eric wrote:
>
> > I have to agree with Prabhu on this one.  The current behavior of
> > import, while fine for standard modules and even simple packages with a
> > single level, is sub-optimal for packages that contain sub-packages.  The
> > proposed behavior solves the problem.
> >
> > Handling the packaging issues in SciPy was difficult, and even resulted
> > in a (not always popular) decision to build and overwrite the Numeric
> > package on machines that install SciPy.  Prabhu's import doesn't resolve
> > all the issues (I think packages may just be difficult...), but it would
> > have solved this one.  The proposed import allows us to put our own
> > version of Numeric in the top SciPy directory.  Then all SciPy
> > sub-packages would grab this one instead of an existing
> > site-packages/Numeric.
>
> But then, this is not an import problem.
> If you use Numeric, you call Numeric. If you call something other than
> Numeric, just give a different name, and all the confusion will go away.
> If you're worried that you've already encoded the Numeric name 50 times
> into 300 files; run a python script over these 300 files; this will do the
> renaming of the 15.000 occurrences of the Numeric name.
>
> > That makes SciPy
> > self-contained and allows people to try it out without worrying that it
> > might break their current installation.  There are other solutions to
> > this problem, but Prabhu's fix is by far the easiest and most robust.
>
> And then, in maintenance/integration phase, sometimes 'Numeric' will call
> Numeric, some other times it will call your package ?
>
> What if somebody, for some reason I know nothing of (e.g. probably some
> integration) wants to call Numeric and your Numeric package in the same
> module ?
> Wish them tough luck to sort out this poisoned gift....
>
> > Prabhu's import also has some other nice benefits.  Some of the
> > sub-packages in SciPy are useful outside of SciPy.  Also sometimes it is
> > easier to develop a package outside of the SciPy framework.  It would be
> > nice to be able to develop a module or package 'foo' outside of SciPy and
> > then move it into SciPy at a later date.  However, every SciPy
> > sub-package that referred to foo prior to its inclusion in SciPy now has
> > to be updated from 'import foo' to 'import scipy.foo'.  These kinds of
> > issues make it very painful and time consuming to rearrange package
> > structures or move modules and sub-packages in and out of the package.
>
> There are basic python scripts which do this painlessly. If you're really
> working on a large project, there's a project architect which normally
> would take care of such things, and for whom this should not be too much
> of a problem.
>
>
> > Simplifying this will improve
> > package development.
> >
> > > I'm personally against anything that enlarges the search path
> > > uselessly;
> >
> > Hopefully I've explained why it is useful for complex packages.
>
> Python helps in many areas, but expecting it to palliate for the package
> design and architecture flaws that inexorably surface anytime something
> non-trivial is developed, might be somehow at the edge. Python has not yet
> replaced the need for relevant software architects.
>
> >
> > > because the obvious reason of increased name space collision,
> > > increased run-time overhead etc...
> >
> > I'm missing something here because I don't understand why this increases
> > name space collision.  If the objection is to the fact that SciPy can
> > have a version of Numeric in it that masks a Numeric installed in
> > site-packages, I guess I consider this a feature, not a bug.
>
> Actually, it is normally worse than a bug: it is a source of bugs tomorrow
> in your application - of all the bugs you'll have when your programmer
> will be confusing the two Numeric packages, as well as all the maintenance
> and integration problems you'll have down the line -.
>
> But by then, hopefully for you, you'll be somewhere else... The sad
> reality of most projects :((
>
> >  After all, this is already the
> > behavior for single level packages, extending it to multi-level packages
> > seems natural.  If this isn't your objection, please explain.
> >
> > The current runtime overhead isn't so bad.  Prabhu sent me a few numbers
> > on the SciPy import (which contains maybe 10-15 nested packages).  I
> > attached them below -- the overhead is less than 10%.  It should be
> > negligible for standard modules as only packages are really affected
> > (right Prabhu?).
>
> And that's how, when you accumulate the overheads for all new features,
> you get potentially +100-200% overhead on the new releases.
> Albeit all the efforts of the Python team, Python 2.0 is up to 70% slower
> than python 1.5.2; Python 2.1.1 is up to 30% slower than python 2.0, and
> so on...
> So, +10% on only such a minor feature is anything but negligible :(((
>
> FG