[SciPy-User] peer review of scientific software

josef.pktd at gmail.com
Sat Jun 1 23:29:26 EDT 2013


On Sat, Jun 1, 2013 at 5:39 PM, Matthew Brett <matthew.brett at gmail.com> wrote:
> Hi,
>
> On Sat, Jun 1, 2013 at 8:35 AM,  <josef.pktd at gmail.com> wrote:
>> On Tue, May 28, 2013 at 10:34 PM, Matthew Brett <matthew.brett at gmail.com> wrote:
>>> Hi,
>>>
>>> On Tue, May 28, 2013 at 7:18 PM, Paulo Jabardo <pjabardo at yahoo.com.br> wrote:
>>>> I'm an engineer working in research but I spend a good deal of time coding.
>>>> What I've seen with most of my colleagues and friends is that they will only
>>>> code whenever it is extremely necessary for an immediate application in an
>>>> experiment or for their PhD. The problem starts very early, when I was
>>>> beginning my studies, we were taught C (and that is still the case almost 20
>>>> years later). A small percentage of the students (10%?) enjoy programming
>>>> and they will profit. I really loved pointers and doing neat tricks. For the
>>>> rest it was torture, plain and simple torture. And completely useless. Most
>>>> students couldn't do anything useful with programming. All their suffering
>>>> was for nothing. What happened later was obvious: they would avoid
>>>> programming at all costs and if they had to do something they would use
>>>> MS-Excel. The spreadsheets I've seen... I still have nightmares. The things
>>>> they accomplished humbles me, proves that I'm a lower being. I've seen
>>>> people solve partial differential equations where each cell was an element
>>>> in the solution and it was colored according to the result. Beautiful but
>>>> I'd rather suffer acute physical pain than do something like that, or
>>>> worse, debug such a "program". By the way, this sort of application was not
>>>> a joke or a neat hack, it was actually the only way those guys knew how to
>>>> solve a problem.
>>>>
>>>> 15 years later... I have a physics undergraduate student working with me.
>>>> Very smart and interested. They still learn C and later on when they need to
>>>> do something, what is it they do? Most professors use Origin. A huge
>>>> improvement over Excel, but still. A couple of months ago, he had to turn in
>>>> a report and since we don't have Origin, he was using Excel. I kind of felt
>>>> sorry for him and I helped him out to do it in Python. He couldn't believe
>>>> it.
>>>
>>> Oh - dear; you probably saw this stuff?
>>>
>>> http://blog.stodden.net/2013/04/19/what-the-reinhart-rogoff-debacle-really-shows-verifying-empirical-results-needs-to-be-routine/
>>
>> I think that's a good example that peer review works.
>
> It's a good example of how peer-review should work, but it's very
> uncommon for the reviewer to have the original spreadsheet, and that
> was the key to the problem.

The spreadsheet mistake was only one factor driving the result; the
rest were modelling decisions.
Even without access to their original work, the analysis can be redone
independently, and it shows that there is no "big" effect.
Even in their own results, using robust measures like the median doesn't
show much of an effect. So the conclusion rests mainly on a few outliers
(or coding mistakes). <advertising for robust statistics>
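(To make that concrete, a toy illustration with made-up numbers, not the
Reinhart-Rogoff data: a couple of outliers, or coding mistakes, can drag
the mean a long way while barely moving the median.)

    import numpy as np

    # Made-up "growth" numbers: most observations cluster around 2%,
    # plus two extreme outliers.
    growth = np.array([2.1, 1.8, 2.3, 1.9, 2.2, 2.0, 1.7, -7.9, -10.1])

    print("mean:  ", np.mean(growth))    # pulled negative by the outliers
    print("median:", np.median(growth))  # stays close to 2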

My favorite example outside economics:
http://www.genomesunzipped.org/2012/03/questioning-the-evidence-for-non-canonical-rna-editing-in-humans.php

(One advantage of economics is that there have always been "schools of
thought" partly aligned with political orientation.
The consequence is that if one side finds something "good", the
other side tries to disprove it, and the compensating biases might
uncover which findings are robust.)

>
>>>> I did my Masters and PhD in CFD. Most other students had almost no
>>>> background in programming and did most things using Excel! When they had to
>>>> modify some code, it was almost by accident that things worked. You can
>>>> imagine what sort of code comes out of this. The professors didn't know
>>>> programming much better. Just getting them to understand the concept of
>>>> version control took a while.
>>>>
>>>> In my opinion, if schools taught, at the beginning, something like
>>>> Python/Octave/R instead of C, students would be able to use this knowledge
>>>> easily and productively throughout their courses and eventually learn C when
>>>> they really needed it.
>>>
>>> That's surely one of the big arguments for Python - it is a great
>>> first language, and it is capable across a wider range than Octave or
>>> R - or even Excel :)
>>
>> We can make mistakes in any language
>>
>> I just read this
>>
>> """
>> Abstract
>>
>>     [Correction Notice: An Erratum for this article was reported in
>> Vol 17(4) of Psychological Methods (see record 2012-33502-001). The R
>> code for arriving at adjusted p values for one of the methods is
>> incorrect. The specific changes that need to be made are provided in
>> the erratum.]
>> """
>>
>> It's still functioning peer review if a mistake is found after an
>> article has been published, or after a pull request has landed in
>> master.
>
> The problem is that the peers don't get to review what has been done,
> in general, they get to review what the author said had been done.
>
> Donoho's point - about computational science - is that this can be
> very different.
>
> The question is then: does this matter?  Are most published
> research findings false?

Following the link from the PLOS editorial statement:
http://www.plosmedicine.org/article/info:doi/10.1371/journal.pmed.0020124

I think the entire premise "are research findings false" is completely
misguided. It just continues the magic 0.05 tradition.
(However I think it makes a good polemic to illustrate a point.)
Disclaimer: I never read the applied part of any paper outside of
economics, and I can only imagine from second-hand readings that some
articles really report only a p-value, or whether their result is
statistically significant or not.

For several months now I have been reading articles criticizing
research traditions, and editorial recommendations to improve
statistical reporting in various fields, starting with psychological
methods and behavioral research. The general recommendation is to
report effect sizes and confidence intervals instead of, or in addition
to, p-values. Then we can actually see how large this statistically
(non-)significant effect is, and learn from it. Maybe the interval is
not completely "false".
And there are other problems in some fields with the majority of the
research: the studies are underpowered, they ignore multiple testing
problems, ... (according to some editorials and reports)
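(A rough sketch of what that kind of reporting looks like in Python, with
simulated data and arbitrary sample sizes, not any real study: an effect
size and a confidence interval next to the p-value, plus a
multiple-testing adjustment with statsmodels' multipletests when many
outcomes are tested.)

    import numpy as np
    from scipy import stats
    from statsmodels.stats.multitest import multipletests

    rng = np.random.default_rng(12345)
    # Simulated outcomes for two groups with a small true difference.
    treat = rng.normal(loc=0.2, scale=1.0, size=40)
    control = rng.normal(loc=0.0, scale=1.0, size=40)

    t_stat, p_value = stats.ttest_ind(treat, control)

    # Effect size (Cohen's d) and a 95% confidence interval for the mean
    # difference: how big the effect is, not just whether p < 0.05.
    n1, n2 = len(treat), len(control)
    diff = treat.mean() - control.mean()
    pooled_var = ((n1 - 1) * treat.var(ddof=1)
                  + (n2 - 1) * control.var(ddof=1)) / (n1 + n2 - 2)
    cohens_d = diff / np.sqrt(pooled_var)
    se = np.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    t_crit = stats.t.ppf(0.975, n1 + n2 - 2)
    print("p = %.3f, Cohen's d = %.2f, 95%% CI [%.2f, %.2f]"
          % (p_value, cohens_d, diff - t_crit * se, diff + t_crit * se))

    # When many outcomes are tested, adjust the p-values, e.g. with
    # Benjamini-Hochberg FDR control.
    many_p = [stats.ttest_ind(rng.normal(size=40), rng.normal(size=40))[1]
              for _ in range(20)]
    reject, p_adjusted, _, _ = multipletests(many_p, alpha=0.05,
                                             method="fdr_bh")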

Where open access to research methodology comes in is in undermining
the reputation of researchers who systematically bias (Ioannidis)
their results.
In economics this debate happened a few years ago after some famous
failures to (independently) replicate results, and now most, I think
all, top economics journals require that the data and source code be
published.

>
>> ---------
>> in general:
>>
>> in the research areas that I know, the vast majority of researchers
>> use Windows, and everything that is not core task is point and click.
>> As long as Matlab, Stata and GAUSS, or whatever else, doesn't have
>> version control built in, VC won't be used by the majority of
>> researchers that I know. We didn't grow up when version control was
>> popular. And we don't have IT guys to manage it for us.
>> (There is the old fashioned version control of starting new
>> directories at crucial stages, or for specific conference talks and
>> paper submissions.)
>> (DVCS are only a few years old, and it will take a few more years for
>> diffusion to "non-programmers" to happen.)
>
> We get taught some complicated things when we are training - calculus,
> algebra...
>
> Does it make sense that we don't teach less complicated things like
> version control and programming?

(I don't know about the teaching of computer programming in American
undergraduate programs; I'm a resident alien.)

Programming within economics is not directly part of the curriculum.
Students (undergraduates, once they are beyond Excel!) learn
programming in statistics; in my PhD program it was applied
statistics/econometrics and computational economics (simulating the
macroeconomy) where we learned to program, and got paid for it as
research assistants (with no requirement for unit tests or version
control).

My impression is that for "non-programmers", the behavioral pattern
for using the tools is acquired in the applied fields that use
computer programming. Once unit/functional testing and version control
are used there by professors and teaching assistants, and required as
part of the best practice for doing your work, then they will stick.

Otherwise it's like calculus: some need it for most of their lives,
the others forget about it as soon as the exams are over.
(But you cannot learn calculus and statistics just by doing, and
students have only a limited amount of time. More statistics, please.)

-----
Two more points:

Version control systems are not really usable for word-processor
documents, which rules out version control for large parts of the
actual work.

Network and peer effects:
One reason I think that version control will be the standard in a few
more years (if usability gets better) is that you just need one or a
few "programming types" in a group to spread it like an infection. You
need those guys both as advertising, to show how to do things in a
better way, and as support when a new user gets lost.
I only found git acceptable because I knew that I had the rescue and
support team on the mailing lists. (Thanks for that.)

Josef

>
>> Even after using git for some time, I only find it usable because I can
>> do all the regular stuff with git gui (and for unusual stuff I can use
>> the command line and git gui at the same time).
>>
>>
>> ---------
>> (just in case I'm misunderstood:
>> I'm all in favor of best practices and unit and functional tests, but
>> I don't expect that researchers will adopt them (fast) if they go
>> against their usual patterns of using tools.
>> example: If you teach a software carpentry course that uses Linux,
>> then I wouldn't be surprised if some users go back to their office and
>> the first thing they do is use Excel. :)
>
> In general as you know I agree completely that it doesn't make sense
> to persuade people to switch from Windows to Linux at the same time as
> persuading them to use good software tools.  We should teach people
> stuff that they will and can use, and it's a common theme among
> software-carpentry types that it would be better to teach Windows
> people how to best use Windows rather than teaching them on a virtual
> machine that they are unlikely to use for their work.
>
> Cheers,
>
> Matthew
> _______________________________________________
> SciPy-User mailing list
> SciPy-User at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-user


