[SciPy-dev] toward scipy 1.0

jh at physics.ucf.edu
Tue Nov 4 13:39:24 EST 2008


This message is about cleaning up scipy and releasing 1.0.  Tied up in
this topic is the issue of how code is brought into scipy.  We
recognize that there is work to do before scipy is 1.0-ready, and much
of that work is not merely adding to what is there.  So, I also think
that we need to solve the problems that got us here before 1.0.

Since this is a message about solving a problem in our process, some
may think I am placing blame or don't support our developers.  So, let
me state at the outset that we could not have come as far as we have
without a lot of dedicated work by many people, particularly the
current set of developers.  Further, they have all recognized and
spoken about the organizational issues we have had, and about greater
community input as a route out.  All I'm intending here is to move us
in that direction.

"Jarrod Millman" <millman at berkeley.edu> wrote in another thread:

> I imagine that a project could easily start as a
> scikit and mature there.  Then a number of developer decide that it
> belongs in scipy proper.

I think this is part of the problem.  We need to define a clear route
to inclusion for new packages.  That route should include a period of
community review and then a vote, and not merely be the decision of a
few developers in private discussion.  This should be a formalized
process, so that if we have a disagreement, the result is clear and we
all agree to live with it.  I feel that the small number of people
involved in decisions, and the lack of community review, have brought
scipy's organization to where it is today.  We are all human.  As
recently as the 1.2.0 numpy release we had significant API changes
angling for inclusion at the last minute, and only last-minute
community outcry let cooler heads prevail.  To solve this problem, we
need to open the process and make it more deliberate and deliberative.
We don't need scipy to be nimble, we need it to be stable and well
thought-out.

From discussions with Stefan, I understand that some parts of scipy
are not maintained and others don't hang together or mesh well with
the rest of the package.  Some module names don't make sense
(scipy.stsci was recently pointed out).  Docs are arcane (stats) or
lacking.  We now have a decade or more of use experience with scipy
and its predecessors, enough to make something coherent and long-term
stable out of it, but this refactoring has not yet happened.

So, I propose a community reassessment of scipy and a refactoring and
doc effort before 1.0.  The goal is that in 6.0 your 1.0-based code
still works well, and we look at it and say, "That structure really
stood the test of time.  It still makes sense today."

Here's what I propose:

1. reassess what's there:
- break it into components for discussion purposes
- decide for each whether it's:
  - used
  - self-consistent
  - complete at some level
  - maintained
  - documented for normal users
  - a build problem
  - well integrated into the rest of scipy
This part of the process would be done by small teams and would result
in a short report for each package stating whether it:
  - is worth keeping as is
  - needs specific work (docs, build stuff, tests, etc.)
  - needs a maintainer
  - needs to be refactored, removed, or merged into something else
Any component for which a team does not volunteer to do the review
would be a good candidate for removal based on lack of use.

2. hunt for looming incompatible API changes:
- in the code as it exists
- that result from the reassessment above
These would be collected on one or more pages for specific community
review and comment.

3. community comments on the collection of reports online and looks
for consensus on any overall restructuring

4. do the work resulting from the reassessment
Yes, this is the hard part.  Likely we will pick up some new
developers/maintainers from step 1, and they will pick up much of the work.

5. present the refactored package in a declared-unstable release (0.7.99?)

6. stabilize it and release it as 0.8

7. run a doc marathon based on that release (I can pay one or more
writers)

8. allow significant additions, if any, until 0.9, but only after they
have been scikits for a year *and* pass a community-based acceptance
process

9. release scipy 1.0 as a cleaned-up package with community testing
and full docs, but no significant new code after 0.9

10. include new code in future releases after it has been a scikit
for a year *and* passes a community-based acceptance process
I don't expect most scikits to be accepted after 1 year, and expect
that some never will be.

I think this is a 2-3-year process, of which steps 1-4 could happen in
6-8 months, say by next June.  My guess is that, except for docs, most
of the components would come through pretty much unchanged, but a few
would get a real working over or be removed.

We'd need a web tool similar to the doc wiki for this.  Or, we could
use a wiki and a lot of discipline.  We would also need to define some
sort of voting process after the assessment period.  Whoever gets to
vote, I propose a 2/3 majority required for inclusion of a new
package, and an 80% majority for a major API change (which waits for
the next major release).
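
As a rough sketch only, the two thresholds above could be tallied like
this.  The names (THRESHOLDS, passes) are made up for illustration, and
two details are assumptions that the proposal leaves open: abstentions
are ignored, and a vote that lands exactly on the threshold passes.

```python
from fractions import Fraction

# Thresholds from the proposal; the keys are hypothetical labels.
THRESHOLDS = {
    "new_package": Fraction(2, 3),       # 2/3 majority for inclusion
    "major_api_change": Fraction(4, 5),  # 80% majority for a major API change
}


def passes(kind, yes, no):
    """Return True if the yes votes reach the threshold for this kind of change.

    Assumptions (not settled in the proposal): abstentions are not
    counted, and the threshold is inclusive.
    """
    total = yes + no
    if total == 0:
        # No votes cast: treat as not passed.
        return False
    # Exact rational arithmetic avoids rounding trouble at exactly 2/3.
    return Fraction(yes, total) >= THRESHOLDS[kind]
```

With 10 votes for and 5 against, a new package would just pass (exactly
2/3), while a major API change with the same tally would not.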

There is much detail to fill in, such as what this community review
process really looks like, not to mention whether any of this is a
good idea at all.  Let me know your thoughts.

--jh--