[SciPy-Dev] SciPy Goal

Charles R Harris charlesr.harris at gmail.com
Wed Jan 4 21:33:38 EST 2012


On Wed, Jan 4, 2012 at 6:43 PM, Travis Oliphant <travis at continuum.io> wrote:

> Thanks for the feedback. My point was to generate discussion and
> start the ball rolling on exactly the kind of conversation that has
> started.
>
> Exactly as Ralf mentioned, the point is to get development on sub-packages
> --- something that the scikits effort and other individual efforts have
> done very, very well. In fact, it has worked so well that it taught me a
> great deal about what is important in open source. My perhaps irrational
> dislike for the *name* "scikits" should not be interpreted as anything but
> a naming taste preference (and I am not known for my ability to choose
> names well anyway). I very much like and admire the community around
> scikits. I just would have preferred something easier to type (even just
> sci_* would have been better in my mind as high-level packages: sci_learn,
> sci_image, sci_statsmodels, etc.). I didn't feel like I was able to
> fully participate in that discussion when it happened, so you can take my
> comments now as simply historical and something I've been wanting to get
> off my chest for a while.
>
> Without better packaging and dependency management systems (especially on
> Windows and Mac), splitting out code doesn't help those who are not
> distribution dependent (who themselves won't be impacted much).   There are
> scenarios under which it could make sense to split out SciPy, but I agree
> that right now it doesn't make sense to completely split everything.
> However, I do think it makes sense to clean things up and move some things
> out in preparation for SciPy 1.0
>
> One thing that would be nice to know is the state of documentation and
> examples for the different packages. Where is work there most needed?
>
>
> Looking at Travis' list of non-core packages I'd say that sparse certainly
> belongs in the core and integrate probably too. Looking at what's left:
> - constants : very small and low cost to keep in core. Not much to improve
> there.
>
>
> Agreed.
>
> - cluster : low maintenance cost, small. not sure about usage, quality.
>
>
> I think cluster overlaps with scikits-learn quite a bit.   It basically
> contains a K-means vector quantization code with functionality that I
> suspect  exists in scikits-learn.   I would recommend deprecation and
> removal while pointing people to scikits-learn for equivalent functionality
> (or moving it to scikits-learn).
>
>
I disagree. Why should I go to scikits-learn for basic functionality like
that? It is hardly specific to machine learning. Same with various matrix
factorizations.
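
For reference, a minimal sketch of the kind of basic use in question,
assuming the existing scipy.cluster.vq interface (kmeans2 and vq). The data
and the choice of starting centroids are made up for illustration.

```python
import numpy as np
from scipy.cluster.vq import kmeans2, vq

# Made-up data: two well-separated blobs of 2-D points.
rng = np.random.RandomState(0)
blob_a = rng.normal(loc=0.0, scale=0.1, size=(50, 2))
blob_b = rng.normal(loc=5.0, scale=0.1, size=(50, 2))
points = np.vstack([blob_a, blob_b])

# Seed k-means with one point from each blob so the run is deterministic.
init = points[[0, 50]]
centroids, labels = kmeans2(points, init, minit='matrix')

# Vector quantization: map every point to its nearest centroid.
codes, dists = vq(points, centroids)
```

Nothing in this is tied to a learning framework: it takes an array of
observations and returns a codebook plus an index per observation.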

> - ndimage : difficult one. hard to understand code, may not see much
> development either way.
>
>
> This overlaps with scikits-image but has quite a bit of useful
> functionality on its own.   The package is fairly mature and just needs
> maintenance.
>
>
Again, pretty basic stuff in there, but I could be persuaded to go to
scikits-image since it *is* image specific and might be better maintained.

> - spatial : kdtree is widely used, of good quality. low maintenance cost.
>
>
>
Indexing of all sorts tends to be fundamental. But not everyone knows they
want it ;)
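
As an illustration of the indexing meant here, a small sketch assuming
scipy.spatial.cKDTree with its query and query_ball_point methods. The
points are invented for the example.

```python
import numpy as np
from scipy.spatial import cKDTree

# Made-up 2-D points to index.
points = np.array([[0.0, 0.0],
                   [1.0, 0.0],
                   [0.0, 1.0],
                   [5.0, 5.0]])
tree = cKDTree(points)

# Nearest neighbour of a query point: distance and index into `points`.
dist, idx = tree.query([0.9, 0.1])

# Indices of all points within radius 1.5 of the origin.
near_origin = sorted(tree.query_ball_point([0.0, 0.0], r=1.5))
```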

> Good to hear maintenance cost is low.
>
> - odr : quite small, low cost to keep in core. pretty much done as far as
> I can tell.
>
>
> Agreed.
>
> - maxentropy : is deprecated, will disappear.
>
>
> Great.
>
> - signal : not in great shape, could be viable independent package. On the
> other hand, if scikits-signal takes off and those developers take care to
> improve and build on scipy.signal when possible, that's OK too.
>
>
> What are the needs of this package? What needs to be fixed / improved?
> It is a broad field and I could see fixing scipy.signal with a few simple
> algorithms (the filter design, for example), and then pushing a separate
> package to do more advanced signal processing algorithms. This sounds
> fine to me. It looks like I can put attention to scipy.signal then, as it
> was one of the areas I was most interested in originally.
>
>
Filter design could use improvement. I also have a remez algorithm that
works for complex filter design that belongs somewhere.
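
For context, a sketch of the real-valued design that scipy.signal.remez
already provides (this is not the complex variant mentioned above). The band
edges and tap count are arbitrary choices for illustration; frequencies are
in cycles per sample, so Nyquist is 0.5.

```python
import numpy as np
from scipy.signal import remez, freqz

# 31-tap equiripple lowpass: pass 0 to 0.1, stop 0.2 to 0.5 (cycles/sample).
taps = remez(31, [0, 0.1, 0.2, 0.5], [1, 0])

# Frequency response on 512 points from 0 to pi rad/sample.
w, h = freqz(taps, worN=512)
gain = np.abs(h)

passband = gain[w <= 2 * np.pi * 0.1]
stopband = gain[w >= 2 * np.pi * 0.2]
```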

> - weave : no point spending any effort on it. keep for backwards
> compatibility only, direct people to Cython instead.
>
>
> Agreed. Any way we can deprecate this for SciPy 1.0?
>
>
> Overall, I don't see many viable independent packages there. So here's an
> alternative to spending a lot of effort on reorganizing the package
> structure:
> 1. Formulate a coherent vision of what in principle belongs in scipy
> (current modules + what's missing).
>
>
> O.K.  so SciPy should contain "basic" modules that are going to be needed
> for a lot of different kinds of analysis to be a dependency for other more
> advanced packages.  This is somewhat vague, of course.
>
> What do others think is missing?  Off the top of my head:   basic wavelets
> (dwt primarily) and more complete interpolation strategies (I'd like to
> finish the basic interpolation approaches I started a while ago).
> Originally, I used GAMS as an "overview" of the kinds of things needed in
> SciPy.   Are there other relevant taxonomies these days?
>
> http://gams.nist.gov/cgi-bin/serve.cgi
>
>
> 2. Focus on making it easier to contribute to scipy. There are many ways
> to do this; having more accessible developer docs, having a list of "easy
> fixes", adding info to tickets on how to get started on the reported
> issues, etc. We can learn a lot from Sympy and IPython here.
>
>
> Definitely!
>
> 3. Recognize that quality of code and especially documentation is
> important, and fill the main gaps.
>
>
> Is there a write-up of recognized gaps here that we can start with?
>
> 4. Deprecate sub-modules that don't belong in scipy (anymore), and remove
> them for scipy 1.0. I think that this applies only to maxentropy and weave.
>
>
> I think it also applies to cluster as described above.
>
> 5. Find a clear (group of) maintainer(s) for each sub-module. For people
> familiar with one module, responding to tickets and pull requests for that
> module would not cost so much time.
>
>
> Is there a list where this is kept?
>
>
> In my opinion, spending effort on improving code/documentation quality and
> attracting new developers (those go hand in hand) instead of reorganizing
> will have both more impact and be more beneficial for our users.
>
>
>
Chuck