[SciPy-dev] The future of SciPy and its development infrastructure

Thu Feb 26 02:04:46 EST 2009

Hi,

On Mon, Feb 23, 2009 at 8:04 AM, Stéfan van der Walt <stefan at sun.ac.za> wrote:
> [If you only have 30 seconds to read this email, read the bold text only]
>
> Dear SciPy developers
>
> The past while has seen a rocky ride with the SciPy servers, but yesterday
> Peter Wang announced that he is attending to the situation.  This, then,
> seems like the perfect time to stand back and take a look at our
> infrastructure, and whether we should continue with the current setup.
>
> To put this conversation into context, we have to face the facts: SciPy has
> a large user community relative to the number of developers.  A big library
> of code, used by many scientists, is supported by a small handful of people
> all over the world.  We cannot afford a high barrier to contribution, and we
> have to lower the effort it takes for a developer to merge contributed code.
>
> I'd like to propose two changes to the status quo:
>
> 1. Change to a distributed revision control system, encouraging more open
> collaboration.
> 2. Determine guidelines for code acceptance, in terms of unit tests,
> documentation and peer review.
>
> Allow me to motivate these changes, and then suggest practical approaches
> for their implementation:
>
> Subversion allows only a selected group of developers to change the SciPy
> source code.  This does not encourage a culture of meritocracy, but worse,
> has practical implications, in that users cannot merge their own patches.  I
> won't discuss the advantages of distributed revision control here, but note
> that it shifts responsibility from the current core developers to
> contributors; that benefits us all!
>
> This ties in with my second point: code review.  The current developers have
> access to SVN because they are experienced programmers with knowledge of
> SciPy's scientific domains of application.  We are unable to employ this
> scarce resource fully, because it simply takes too long to merge a patch
> from Trac, review it, *bring it up to scratch*, and commit it.  We have to
> put a system in place which allows contributors to take responsibility for
> their own patches, and core developers to guide and advise during this
> process.  As it is, we have many patches waiting on Trac for up to a year or
> more without any feedback; that is not acceptable.
>
> My view on testing is simple: untested code is probably broken code (and I
> can show examples from the past year's commit logs to corroborate this
> statement).  As for documentation, we cannot afford to be without it.
>
> Implementation:
>
> Enthought generously hosts SciPy, and I hope they will continue doing so.
> New software will need to be installed on the server, but we have many hands
> willing to tackle that task: David Cournapeau and myself included.  Before
> deploying to scipy.org, we will configure a different server as a proof of
> concept.
>
> 1) Distributed revision control system: David Cournapeau and I have
> been test driving Git [1] on SciPy and NumPy for a while.  It is fast, well
> supported, has great branch support, and is simple to use for the average
> contributor, while allowing powerful patch-carving for the more adventurous.
>
> 2) Ticketing back-end: David is exploring RedMine [2], and I'd like to take
> a look at InDefero [3], but we'll do a careful analysis of trac-git (like
> FedoraHosted) too.
>
> Thank you for taking the time to deliberate on SciPy's future.  I would love
> to hear your comments.

I read through the whole thread; I fully agree with Stefan and I
support him. +1 for Git, I think it's the best tool these days.

I have noticed several times that Stefan had to fix patches committed by
other people, and that is very, very bad. It wastes Stefan's time,
and I just think that broken patches should never be allowed to get
in.

I also think that peer review is absolutely necessary, and if there is
a proper process for it, I can promise that I will be reviewing too. In
fact, I have suggested this in the past. So, Travis, I don't think you
need to be afraid that the code will stall. Besides, it works for
Sage and other projects as well. We have used it in sympy too -- and we
have far fewer developers in sympy than there are in scipy. So if we
can do it, imho scipy can too.

So, +1 to what Stefan said. Judging from this thread, I also think that
most people agree with Stefan.

Ondrej