[SciPy-dev] Package organization
Fernando Perez
Fernando.Perez at colorado.edu
Thu Oct 13 02:33:53 EDT 2005
Robert Kern wrote:
> I would like to see scipy's package organization become flatter and more
> oriented towards easy, lightweight, modular packaging rather than
> subject matter. For example, some people want bindings to SUNDIALS for
> ODEs. They could go into scipy.integrate, but that introduces a large,
> needless dependency for those who just want to compute integrals. So I
> would suggest that SUNDIALS bindings would be in their own
> scipy.sundials package.
>
> As another example, I might also suggest moving the simulated annealing
> module out into scipy.globalopt along with diffev.py and pso.py that are
> currently in my sandbox. They're all optimizers, but functionally they
> are unrelated to the extension-heavy optimizers that make up the
> remainder of scipy.optimize.
>
> The wavelets library that Fernando is working on would go in as
> scipy.wavelets rather than being stuffed into scipy.signal. You get the
> idea.
>
> This is also why I suggested making the scipy_core versions of fftpack
> and linalg be named scipy.corefft and scipy.corelinalg and not be
> aliased to scipy.fftpack and scipy.linalg. The "core" names reflect
> their packaging and their limited functionality. For one thing, this
> naming allows us to try to import the possibly optimized versions in the
> full scipy:
[snip excellent analysis of scipy's organizational status ]
Let me add a minor twist to your plan, which perhaps may help a little. How
about making a two-level distinction between 'scipy, the core package' and
'scipy, the collection of tools'? Here's how it could be organized, in terms
of namespaces and release policy: whatever is defined as the core is released
by the scipy package proper, and can be safely considered a dependency for the
rest. Note that this can still be split between scipy_core and 'full scipy',
where scipy_core is Travis' new Numeric/numarray and 'full scipy' contains
much more functionality.
But as for packages written by third-party authors, which can live under
the scipy namespace as an umbrella and benefit from scipy's build facilities
and core libraries, how about putting them all into a 'toolkits' namespace? The
actual name, for typing convenience, could be scipy.kits or scipy.tools
(something short).
This would then give us the following structure:
1. Scipy_core: the new Numeric/numarray package, which includes basic FFT,
linear algebra, random numbers and perhaps basic i/o (at least save/load
abilities), and whatever else I'm missing right now (I don't have it yet
installed on this laptop).
2. Scipy 'full': depends on (1), and exposes all the other scipy names:
scipy.{linalg,optimize,integrate,...}. These are libraries considered
officially part of scipy, so that even if they are maintained by others (much
like python's stdlib), there is a commitment to a common release cycle.
These can, if need be, have inter-dependencies, as they will always be
released as a whole.
1 and 2 both use the top-level scipy namespace. Then we have:
3. The scipy.{kits|tools} namespace (or whatever the chosen name). This is
where third parties can drop their own packages, which can depend either only
on (1) or on the full (2) system (their level of dependency should be explicitly
stated).
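
A toolkit that prefers the full scipy (2) but can still run on scipy_core (1) might resolve that dependency at import time. The following is only a rough sketch: the helper import_first_available is hypothetical, and scipy.linalg / scipy.corelinalg are simply the names proposed in Robert's message, not an existing guarantee. The demonstration uses standard-library modules so that it runs without scipy installed.

```python
import importlib


def import_first_available(*module_names):
    """Return the first module in module_names that imports successfully."""
    for name in module_names:
        try:
            return importlib.import_module(name)
        except ImportError:
            # This candidate is not installed; try the next one.
            continue
    raise ImportError("none of %r could be imported" % (module_names,))


# A toolkit might then write (names as proposed in this thread, hypothetical):
# linalg = import_first_available("scipy.linalg", "scipy.corelinalg")

# Demonstration with stand-in names: the first does not exist, so the
# helper falls back to the second.
json_mod = import_first_available("no_such_optimized_json", "json")
print(json_mod.__name__)
```

The order of the arguments states the preference, so a toolkit's stated level of dependency is visible in one line near the top of its code.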
The kits namespace may ship empty by default, or it could be populated with a
few things from current scipy if it is decided they are best moved there.
The only thing required for projects to live in the .kits namespace is really
to avoid top-level name collisions, so it would perhaps be worth having an
informal policy of people checking with scipy-dev for a name before using it.
This layout would allow the core team to work with relative freedom at the
top-level namespace, without worrying about toolkits taking names they may
need in the future. Similarly, toolkit authors will have a well-defined API
to build upon.
The criterion for deciding what goes in (2) should be one of generality: tools
likely to be needed across a wide range of scientific work, and which
provide a foundation for toolkit authors.
If this is combined with a CPAN-like system (eggs, PyPI, whatever), it should
be very easy for users, once they have the basic layers in place, to grab a
toolkit by issuing a single command or going to a website. I'd suggest, if
this were adopted, keeping a simple page at scipy with brief descriptions for
each toolkit, even if they are developed/distributed externally.
The current 'example package' (the ex-xxx package) could be the prototype for
a toolkit, used by new toolkit authors to get off the ground quickly, and by
scipy to establish coding and documentation policy for .kits members.
If we establish a few conventions to be followed by toolkits, we can ensure
that the top-level documentation/info facilities automatically register them
(dynamically).
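
To make the idea of dynamic registration concrete, here is a minimal sketch of how a top-level info facility could discover whatever is installed under a kits namespace. Everything specific in it is an assumption made for illustration: the attribute names __description__ and __requires_level__, the function discover_toolkits, and the throwaway demo_kits package it builds on disk are all hypothetical and not part of scipy.

```python
import importlib
import os
import pkgutil
import sys
import tempfile


def discover_toolkits(package):
    """Map each submodule/subpackage of `package` to its declared metadata."""
    found = {}
    for _finder, name, _ispkg in pkgutil.iter_modules(package.__path__):
        module = importlib.import_module(package.__name__ + "." + name)
        found[name] = {
            # Hypothetical conventions a toolkit would follow:
            "description": getattr(module, "__description__", "(undocumented)"),
            "requires": getattr(module, "__requires_level__", "unknown"),
        }
    return found


# Build a throwaway 'kits'-style package on disk to exercise the function.
root = tempfile.mkdtemp()
kits_dir = os.path.join(root, "demo_kits")
os.makedirs(os.path.join(kits_dir, "wavelets"))
open(os.path.join(kits_dir, "__init__.py"), "w").close()
with open(os.path.join(kits_dir, "wavelets", "__init__.py"), "w") as f:
    f.write('__description__ = "wavelet transforms"\n')
    f.write('__requires_level__ = "core"\n')

sys.path.insert(0, root)
importlib.invalidate_caches()
import demo_kits

toolkits = discover_toolkits(demo_kits)
print(toolkits)
```

Because the lookup walks the package's path at call time, a newly installed toolkit would show up without any central list being edited, as long as it follows the agreed conventions.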
Anyway, I've certainly far exceeded my opinion budget on this one, so I should
shut up now :)
Cheers,
f