[SciPy-dev] Scipy workflow (and not tools).

Tue Feb 24 19:20:16 EST 2009

2009/2/24 Robert Kern <robert.kern at gmail.com>:
> On Tue, Feb 24, 2009 at 15:13, Charles R Harris
> <charlesr.harris at gmail.com> wrote:
>
>> I think at this point we would be better off trying to recruit at least one
>> person to "own" each package. For new packages that is usually the person
>> who committed it but we also need ownership of older packages. Someone with
>> a personal stake in a package is likely to do more for quality assurance at
>> this point than any amount of required review.
>
> "Ownership" has a bad failure mode. Case in point: nominally, I am the
> "owner" of scipy.stats and numpy.random and completely failed to move
> Josef's patches along.

It seems to me that scipy's development model is a classic open-source
"scratch an itch": it bothered me that people were forever asking
questions that needed spatial data structures, so I took a weekend and
wrote some. I don't foresee this changing without some major change
(e.g. a company suddenly hiring ten people to work full-time on
scipy). So the question is how to make this model produce reliable
code.

Suggestions people have made to accomplish this:

(1) Don't allow anything into SVN without tests and documentation.
(2) Make sure everything gets reviewed before it goes in.
(3) Appoint owners for parts of scipy.

Of these, I strongly approve of (1). It's really not a barrier.
Writing tests is easy. Every programmer does *some* testing (well,
maybe not Knuth, but everybody else) to make sure the code does what
it's supposed to. Writing these tests in nose-compatible form really
isn't hard. Documentation is more of an obstacle, just because it's
extra work. But I think it's not too much to ask.
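
To illustrate how little a nose-compatible test asks for, here is a minimal sketch. The function nearest_neighbor is a hypothetical brute-force stand-in written for this example, not code from scipy.spatial; the only convention assumed is that nose collects module-level functions whose names start with "test".

```python
def nearest_neighbor(points, query):
    """Return the index of the point closest to `query` (brute force).

    Hypothetical helper for illustration only; returns None for no points.
    """
    best_index, best_dist2 = None, float("inf")
    for i, p in enumerate(points):
        # Squared Euclidean distance is enough for comparing candidates.
        d2 = sum((a - b) ** 2 for a, b in zip(p, query))
        if d2 < best_dist2:
            best_index, best_dist2 = i, d2
    return best_index


def test_nearest_neighbor_finds_closest_point():
    points = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
    assert nearest_neighbor(points, (0.9, 1.2)) == 1


def test_nearest_neighbor_empty_input():
    assert nearest_neighbor([], (0.0, 0.0)) is None
```

The quick check most people would type at the interpreter is already most of a test; naming the function test_something and using a plain assert is all the extra ceremony.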

(2) I'm not so sure of. For example, a few days ago I fixed a
couple of spatial bugs. In both cases, the bug fix was a one-line
change to scipy proper, plus a unit test that would have caught the
bug but now passes. What would be gained by waiting until somebody
else got around to looking at those fixes before committing them?
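
The shape of such a fix, a one-line change plus a test that fails before it and passes after, might look like the sketch below. Everything in it is hypothetical: count_within and the boundary bug are invented for illustration and are not the actual spatial bugs mentioned above.

```python
def count_within(points, center, r):
    """Count points whose distance from `center` is at most `r`.

    Hypothetical function for illustration only.
    """
    r2 = r * r
    count = 0
    for p in points:
        d2 = sum((a - b) ** 2 for a, b in zip(p, center))
        if d2 <= r2:  # hypothetical one-line fix: this comparison was `<`
            count += 1
    return count


def test_count_within_includes_boundary_points():
    # Regression test: (3, 4) lies exactly 5 units from the origin, so it
    # must be counted; (6, 8) is 10 units away and must not be.
    assert count_within([(3, 4), (6, 8)], (0, 0), 5) == 1
```

With the comparison written as `<`, the test fails; with `<=`, it passes. The test itself documents what was broken, which is much of what a reviewer would want to know.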

I am tempted to suggest a weaker standard: optional code review. If
you want to submit a piece of code to scipy and don't have SVN access,
or do but want someone else to take a look at it (as, e.g., I did for
scipy.spatial as a whole), post it; people can review it and when it's
been adequately reviewed it goes in. Of course, here we return to
infrastructure: as far as I know we don't have any reasonable tool for
doing these reviews, or for connecting them to bug reports.

(3) I am highly dubious of. Certainly we'll have informal owners - I
fixed the bugs in spatial in part because I wrote the code and was
embarrassed to see it broken. I know the spatial code pretty well, so
I will probably have an easier time assessing patches to it. But I am
often busy - if those spatial bugs had been reported a month earlier I
would not have been able to get to them any sooner. Making it my fault
if patches don't get into scipy.spatial - which is, really, what
we're talking about - is a recipe for driving people like me away from
developing scipy. Don't do it.

Anne