[SciPy-dev] Scipy workflow (and not tools).

Thu Feb 26 03:38:57 EST 2009

On Thu, Feb 26, 2009 at 5:42 AM, Neil Martinsen-Burrell
<nmb at wartburg.edu> wrote:
> Rob Clewley <rob.clewley <at> gmail.com> writes:
>
> [...]
>
>> So, can't there be informal teams of curatorship so that not everyone
>> involved has to be really familiar with the tools discussed in the
>> other thread?! Unfortunately I cannot afford the time to ride the
>> waves of changing fashion in VCS, etc.
>>
>> Wouldn't this help to get more people involved? ... those many people
>> that Gael correctly assumes are out there but staying silent!
>
> I am the kind of person that you want developing code for Scipy.  I prove the
> existence of a non-empty class of people who are out here but stay silent (no
> longer!).  I am a persistent lurker on these lists. I'm a heavy user of Numpy
> and Scipy in my research.  I use Numpy and Scipy in the classes I teach.  I
> contribute to other Python-based OSS projects in my small spare time.  When
> you folks talk about attracting people to work on Scipy, I should be the kind
> of person you are thinking about (and I am legion?).  I'd like to share some
> of my thoughts on the issues of code review, tests, documentation and
> workflow in the hopes of offering a non-insider perspective.
>
> 1) Code review is very helpful for me as a new contributor.  I am much more
> likely to contribute in a context in which I feel that whatever code I *can*
> produce is going to be reviewed and I can work on it to bring it up to Scipy
> standards.  If I feel that I have to produce picture-perfect Python on my
> first try, I am much less likely to try in the first place.  Code review is a
> perfect place for interested people (me!) to learn how to be active people.
> It is also a positive-feedback loop, as other interested people see the
> mentoring process that someone else has gone through with code review and feel
> themselves up to the task of trying to contribute.  For this reason, I think
> it is a benefit for code reviews to take place in public fora such as mailing
> lists, not exclusively in special code-review applications/domains.
>
> 2) Unit testing is also important for me as a new contributor.  If I would
> like to mess around with something that I don't understand in order to learn
> something, unit testing allows me to experiment effectively.  Without unit
> tests, I cannot be an effective experimentalist in my hacking.  In addition,
> other projects have trained me to unit test my contributions, so that is
> what I would most likely be doing if I were to contribute and I would like to
> feel that my effort to write tests is valued.
>
> 3) Documenting code seems like a very important standard to uphold for new
> contributors.  As someone who *might* contribute, I don't yet have a fixed
> notion of what is good enough code.  So, if I do decide to send something up
> for public consumption, then I am easy to convince that I need to do more
> documentation.
>
> 4) Workflow and tools are extremely important for me as a new contributor.
> One of the things that keeps me from developing even small patches for Scipy
> is SVN.  If I want to make a change, I have to check out the trunk and then
> develop my change *completely without the benefit of version control*.  I am not
> allowed to make any intermediate commits while I learn my way through the coding
> process.  I must submit a fully formed patch without ever being able
> to checkpoint my own progress.  This is basically a deal-breaker for me.  I
> don't enjoy coding without a safety net, especially large changes, especially
> test-driven changes and especially heavily documented changes.  I want to be
> able to polish my patch using the power of version control.  Not having this
> makes me enjoy scipy development less, which makes me less likely to
> contribute.
>
> As a fairly early convert to DVCS, I am used to being able to use my local
> branch of the project however I need to in my own development process.  Being
> able to commit to a local branch as I see fit also helps produce
> well-tested and well-documented code *and* enables effective multi-step code
> review.  Particularly with Bazaar's bundle concept where the history of a
> local branch can be swapped via email (not just the patch), reviewers can
> merge a bundle from an email and review directly in the branch as I developed
> it.  Their suggestions can then be incorporated into new revisions in my
> local branch, which can then be submitted again for more polishing.  (I
> imagine git and Mercurial have similar lightweight capabilities for
> exchanging branches;  I just don't have experience with them.)
>
>
> I hope that my thoughts help clarify this group's thinking about what sort of
> things can help bring in new contributors.  (Oh, and I've got some ideas for
> scipy.stats ;)
>
> -Neil

As another long-time lurker, I would also support everything Neil said.

I also wanted to add that what stops me from recommending scipy more
widely to my colleagues is not a lack of code in it; it is that it is
not stable enough to rely on for their work. That is perhaps a bit
harsh, but I am sure that the first time one of my colleagues lost
half a day because of a scipy bug (as I have done quite a few times),
they would be back to MATLAB.

So I would agree with Stefan and the others that the priority is not
getting more code in per se, but improving the quality and frequency
of releases to get a platform whose stability compares with MATLAB
before adding more stuff.

Cheers

Robin