[SciPy-Dev] Proposal for Scikit-Signal - a SciPy toolbox for signal processing

Wed Jan 4 13:24:22 EST 2012

On Wed, Jan 4, 2012 at 3:53 PM, <josef.pktd at gmail.com> wrote:

> On Wed, Jan 4, 2012 at 9:30 AM, Skipper Seabold <jsseabold at gmail.com>
> wrote:
> > On Wed, Jan 4, 2012 at 1:37 AM, Travis Oliphant <travis at continuum.io> wrote:
> > <snip>
> >> So, my (off the top of my head) take on what should be core scipy is:
> >>
> >> fftpack
> >> stats
> >> io
> >> special
> >> optimize
> >> linalg
> >> lib.blas
> >> lib.lapack
> >> misc
> >>
> >> I think the other packages should be maintained, built and distributed as
> >>
> >> scipy-constants
> >> scipy-integrate
> >> scipy-cluster
> >> scipy-ndimage
> >> scipy-spatial
> >> scipy-odr
> >> scipy-sparse
> >> scipy-maxentropy
> >> scipy-signal
> >> scipy-weave  (actually I think weave should be installed separately
> >> and/or merged with other foreign code integration tools like fwrap, f2py,
> >> etc.)
> >>
> >> Then, we could create a scipy superpack to install it all together.
> >> What issues do people see with a plan like this?
> >>
> >
> > My first thought is that what is 'core' could use a little more
> > discussion. We are using parts of integrate and signal in statsmodels
> > so our dependencies almost double if these are split off as a separate
> > installation. I'd suspect others might feel the same. This isn't a
> > deal breaker though, and I like the idea of being more modular,
> > depending on how it's implemented and how easy it is for users to grab
> > and install different parts.
>
> I think that breaking up scipy just gives us a lot more installation
> problems, and if it's merged together again into a superpack, then it
> wouldn't change a whole lot, but increase the work of the release
> management.
> I wouldn't mind if weave is split out, since it crashes and I never use it.
>
> The splitup is also difficult because of interdependencies.
> stats is an end-use subpackage and doesn't need to be in the core;
> it's not used by any other part, AFAIK, but
> it uses at least integrate.
>
> optimize using sparse is at least one other case I know of.
>
> I've been in favor of cleaning up imports for a long time, but
> splitting up scipy means we can only rely on a smaller set of
> functions without increasing the number of packages that need to be
> installed.
>
> What if stats wants to use spatial or signal?
>
>
I agree with Josef that splitting scipy will be difficult, and I suspect
that (a) it's not worth the pain and (b) it doesn't solve the issue that I
think Travis hopes it will solve (more development of the sub-packages).
Installation, dependency problems and effort of releasing will probably get
worse.
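
To make the interdependency point concrete, here is a minimal sketch of how
one could list which scipy sub-packages a given module imports. It is only an
illustration: the function name scipy_subpackage_deps and the example source
string are hypothetical (not actual scipy code), and it only sees absolute
imports, so relative imports inside scipy would be missed.

```python
import ast


def scipy_subpackage_deps(source, own_name):
    """Return the scipy sub-packages imported by `source`, excluding `own_name`.

    Hypothetical helper for illustration; only absolute imports are detected.
    """
    deps = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            # e.g. "import scipy.integrate"
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            if node.module == "scipy":
                # e.g. "from scipy import integrate"
                names = ["scipy." + alias.name for alias in node.names]
            else:
                # e.g. "from scipy.special import gammaln"
                names = [node.module]
        else:
            # Not an import, or a relative import (ignored in this sketch).
            continue
        for name in names:
            parts = name.split(".")
            if len(parts) >= 2 and parts[0] == "scipy" and parts[1] != own_name:
                deps.add(parts[1])
    return deps


# Made-up module source standing in for a file inside scipy.stats.
example = """
import numpy as np
from scipy import integrate
from scipy.special import gammaln
import scipy.stats.distributions
"""

print(sorted(scipy_subpackage_deps(example, "stats")))
```

Running something like this over every file of a sub-package would give a
rough picture of what each split-off package would have to depend on.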

Looking at Travis' list of non-core packages I'd say that sparse certainly
belongs in the core and integrate probably too. Looking at what's left:
- constants : very small and low cost to keep in core. Not much to improve
there.
- cluster : low maintenance cost, small. not sure about usage, quality.
- ndimage : difficult one. hard to understand code, may not see much
development either way.
- spatial : kdtree is widely used, of good quality. low maintenance cost.
- odr : quite small, low cost to keep in core. pretty much done as far as I
can tell.
- maxentropy : is deprecated, will disappear.
- signal : not in great shape, could be viable independent package. On the
other hand, if scikits-signal takes off and those developers take care to
improve and build on scipy.signal when possible, that's OK too.
- weave : no point spending any effort on it. keep for backwards
compatibility only, direct people to Cython instead.

Overall, I don't see many viable independent packages there. So here's an
alternative to spending a lot of effort on reorganizing the package
structure:
1. Formulate a coherent vision of what in principle belongs in scipy
(current modules + what's missing).
2. Focus on making it easier to contribute to scipy. There are many ways to
do this: having more accessible developer docs, having a list of "easy
fixes", adding info to tickets on how to get started on the reported
issues, etc. We can learn a lot from Sympy and IPython here.
3. Recognize that quality of code and especially documentation is
important, and fill the main gaps.
4. Deprecate sub-modules that don't belong in scipy (anymore), and remove
them for scipy 1.0. I think that this applies only to maxentropy and weave.
5. Find a clear (group of) maintainer(s) for each sub-module. For people
familiar with one module, responding to tickets and pull requests for that
module would not cost so much time.

In my opinion, spending effort on improving code/documentation quality and
attracting new developers (those go hand in hand) instead of reorganizing
will both have more impact and be more beneficial for our users.

Cheers,
Ralf