[SciPy-dev] Scipy workflow (and not tools).

Thu Feb 26 02:15:07 EST 2009

On Wed, Feb 25, 2009 at 9:42 PM, Neil Martinsen-Burrell
<nmb at wartburg.edu> wrote:
> Rob Clewley <rob.clewley <at> gmail.com> writes:
>
> [...]
>
>> So, can't there be informal teams of curatorship so that not everyone
>> involved has to be really familiar with the tools discussed in the
>> other thread?! Unfortunately I cannot afford the time to ride the
>> waves of changing fashion in VCS, etc.
>>
>> Wouldn't this help to get more people involved? ... those many people
>> that Gael correctly assumes are out there but staying silent!
>
> I am the kind of person that you want developing code for Scipy.  I prove the
> existence of a non-empty class of people who are out here but stay silent (no
> longer!).  I am a persistent lurker on these lists. I'm a heavy user of Numpy
> and Scipy in my research.  I use Numpy and Scipy in the classes I teach.  I
> contribute to other Python-based OSS projects in my small spare time.  When
> you folks talk about attracting people to work on Scipy, I should be the kind
> of person you are thinking about (and I am legion?).  I'd like to share some
> of my thoughts on the issues of code review, tests, documentation and
> workflow in the hopes of offering a non-insider perspective.
>
> 1) Code review is very helpful for me as a new contributor.  I am much more
> likely to contribute in a context in which I feel that whatever code I *can*
> produce is going to be reviewed and I can work on it to bring it up to Scipy
> standards.  If I feel that I have to produce picture-perfect Python on my
> first try, I am much less likely to try in the first place.  Code review is a
> perfect place for interested people (me!) to learn how to be active people.
> It is also a positive-feedback loop, as other interested people see the
> mentoring process that someone else has gone through with code review and feel
> themselves up to the task of trying to contribute.  For this reason, I think
> it is a benefit for code reviews to take place in public fora such as mailing
> lists, not exclusively in special code-review applications/domains.
>
> 2) Unit testing is also important for me as a new contributor.  If I would
> like to mess around with something that I don't understand in order to learn
> something, unit testing allows me to experiment effectively.  Without unit
> tests, I cannot be an effective experimentalist in my hacking.  In addition,
> other projects have trained me to unit test my contributions, so that is
> what I would most likely be doing if I were to contribute and I would like to
> feel that my effort to write tests is valued.
>
> 3) Documenting code seems like a very important standard to uphold for new
> contributors.  As someone who *might* contribute, I don't yet have a fixed
> notion of what is good enough code.  So, if I do decide to send something up
> for public consumption, then I am easy to convince that I need to do more
> documentation.
>
> 4) Workflow and tools are extremely important for me as a new contributor.
> One of the things that keeps me from developing even small patches for Scipy
> is SVN.  If I want to make a change, I have to check out the trunk and then
> develop my change *completely without the benefit of version control*.  I am not
> allowed to make any intermediate commits while I learn my way through the coding
> process.  I must submit a fully formed patch without ever being able
> to checkpoint my own progress.  This is basically a deal-breaker for me.  I
> don't enjoy coding without a safety net, especially large changes, especially
> test-driven changes and especially heavily documented changes.  I want to be
> able to polish my patch using the power of version control.  Not having this
> makes me enjoy scipy development less which makes me less likely to
> contribute.
>
> As a fairly early convert to DVCS, I am used to being able to use my local
> branch of the project however I need to in my own development process.  Being
> able to commit to a local branch as I see fit also helps produce
> well-tested and well-documented code *and* enables effective multi-step code
> review.  Particularly with Bazaar's bundle concept where the history of a
> local branch can be swapped via email (not just the patch), reviewers can
> merge a bundle from an email and review directly in the branch as I developed
> it.  Their suggestions can then be incorporated into new revisions in my
> local branch, which can then be submitted again for more polishing.  (I
> imagine git and Mercurial have similar lightweight capabilities for
> exchanging branches;  I just don't have experience with them.)
>
>
> I hope that my thoughts help clarify this group's thinking about what sort of
> things can help bring in new contributors.  (Oh, and I've got some ideas for
> scipy.stats ;)

Yes, +1 to everything you said. Also I agree with what Stefan said and I
think most of the others sort of agree with this too. So I hope some
change will happen soon in this direction. :)

E.g. DVCS and peer review.

I also agree with what Robert Kern said about the experimentalist in
the corner. I think we should just start peer review and see what
happens.  Make it easy for people like me to see what patches are
waiting for review so that I can go through them and do the review
(= taking responsibility for the patches if I say they are OK, or
otherwise offering suggestions).

Ondrej