[SciPy-dev] 2-review system on doc wiki

Sun Feb 14 14:56:25 EST 2010

On Sun, Feb 14, 2010 at 2:07 PM, Joe Harrington <jh at physics.ucf.edu> wrote:
> On 14 February 2010 17:12, Bruce Southey <bsouthey at gmail.com> wrote:
>> I just think that the 'bar' here is set too high for a volunteer
>> project. Also I think that this 'new version' is asking too much
>> especially when people have been working under a rather different
>> approach. Also there is no conflict resolution between all the steps
>> involved.
>
> Sorry for the length here.  Hopefully this clarifies a lot of
> questions.  See, in particular, example 3 if you're not convinced we
> need this.
>
> I agree with Stefan that this really isn't that complicated.  David
> and I have discussed the two-review system here, in doc telecons, at
> the SciPy09 conference, and in its proceedings; this is nothing new.
> The motivation is simple: I read a number of the reviewed pages and
> found problems that should not have passed review.  The plan is a
> slight modification of our one-review plan.  David pointed to that
> already (thanks, David).
>
> There are no differences of approach other than the change in the
> review system.  Since only a tiny fraction (8%) of the pages has
> undergone any level of review, and only 4% have passed review, the
> change will not cause a major upset to what we are doing.
>
> As always, we resolve conflicts by discussion and use of the comment
> field on each page.
>
> We are aiming at a product equal or superior in quality to similar
> manuals for software such as IDL or Matlab.  Whether this can all be
> done by volunteers is an irrelevant question.  I expect that the
> number of reviewers will be much smaller than the number of writers.
> We will identify and vet technical and presentation reviewers, and if
> necessary we can seek funds to pay them.  Of course, we'll try the
> volunteer way first.  I hope that we can find volunteer technical
> reviewers from among the developers.  Presentation reviewers will
> likely have substantial technical writing experience; we have a list
> of a few potentials already.  A professional copy editor will proof
> the doc the first time we have fully-reviewed pages and hopefully for
> each major release thereafter, but that's a future problem.
>
> I give some examples and clarification on the review roles below.
>
> EXAMPLES
>
> 1. numpy.core.umath.sqrt does not define the "out" argument (technical
> omission) and uses language "branch cut", "continuous from above on
> it" that will confuse the majority of readers who have not taken a
> course in complex variables, such as high-school students and perhaps
> many of their teachers (presentation review).  This could be solved
> with an external reference, which is missing, or even just a rewording
> of the sentence, like:
>
> In the terminology of complex-variable calculus (ref), sqrt has a
> branch cut [-inf, 0) and is continuous from above on it.
>
> This is what I call "introducing an expert section".  It signifies to
> our target audience (one level below the likely users of a function)
> that we're about to go over their heads, where to go to come up to
> speed, and otherwise not to sweat it if they don't get it.  (Actually,
> in this particular case, it's not clear to me why we need to document
> the analytic properties of taking roots.  There's *lots* more one
> could say about roots, and trig functions, and....  We should leave
> that to the textbooks.)
>
> 2. Most routines are missing pointers to relevant pages of the
> numpy.doc package that discuss things like "along an axis" or "out".
> In many cases, that's because these pages didn't exist when the
> function docstrings were written.
>
> 3. From scipy, some of the ready-for-review pages in scipy.stats are
> likely technically good, but are totally impenetrable to anyone
> without several semesters' equivalent college education in statistics.
> While you may need that level of description to use all the tests to
> their fullest, a beginner should be able to do things like plot,
> evaluate, and integrate standard PDFs within a few minutes of starting
> to read the docs there.  If two stats experts wrote all the pages and
> reviewed each other's writing, such improvements would never be
> suggested.  Yet, a single presentation-oriented reviewer might not
> catch technical errors.  That's why we need two types of reviewers.
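
To make example 1 concrete, here is a minimal sketch of the two things
the sqrt docstring is said to leave unexplained: the "out" argument and
the branch cut. The array values and the 1e-12 offsets are arbitrary
choices for illustration, not taken from the docstring.

```python
import numpy as np

# "out": write the result into a preallocated array instead of a new one.
x = np.array([1.0, 4.0, 9.0])
buf = np.empty_like(x)
result = np.sqrt(x, out=buf)      # the array passed as out is also returned
same_object = result is buf

# Branch cut on the negative real axis: approaching -4 from just above
# and from just below the axis gives two different limits.
above = np.sqrt(complex(-4.0, 1e-12))    # close to +2j
below = np.sqrt(complex(-4.0, -1e-12))   # close to -2j

# "Continuous from above on it": the value on the cut itself matches
# the limit taken from above.
on_cut = np.sqrt(complex(-4.0, 0.0))
```

A reader who has not met branch cuts can still see from this that
results for negative real inputs depend on which side they are
approached from, which is all the docstring sentence claims.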

I agree with all the reviewing proposals, but I have two
qualifications for specialized parts in scipy:

Most scipy subpackages have tutorials, and a user who needs an
introduction should read the tutorial. I went through them when I
looked at the parts of scipy that I didn't know. A basic introduction
cannot be included in every docstring. I think the presentation review
for accessibility to readers with less background should also focus
more on the tutorials. For scipy.stats.distributions, I tried to do
this in the stats.tutorial (although my plots seem to have disappeared
again).
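
As a rough sketch of the beginner tasks mentioned in example 3 (plot,
evaluate, and integrate a standard PDF), something like the following
is what a tutorial could open with. The choice of the standard normal,
the grid, and the helper name plot_pdf are assumptions made for
illustration only.

```python
import numpy as np
from scipy import stats, integrate

dist = stats.norm(loc=0.0, scale=1.0)    # frozen standard normal

# Evaluate the PDF on a grid.
x = np.linspace(-4.0, 4.0, 201)
pdf = dist.pdf(x)

# Integrate the PDF over [-1, 1] two ways: via the CDF and numerically.
prob_cdf = dist.cdf(1.0) - dist.cdf(-1.0)
prob_quad, abs_err = integrate.quad(dist.pdf, -1.0, 1.0)


def plot_pdf():
    """Plot the PDF (hypothetical helper; matplotlib is imported only here)."""
    import matplotlib.pyplot as plt
    plt.plot(x, pdf)
    plt.xlabel("x")
    plt.ylabel("pdf(x)")
    plt.show()
```

Calling plot_pdf() draws the curve; the two integrals should agree to
within the error estimate that quad returns.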

For some functions it will be difficult to determine what one level
below the likely user actually means. I recently fixed a problem with
http://docs.scipy.org/scipy/docs/scipy.signal.signaltools.hilbert/
but I still don't know what the analytic signal is used for or
what it really means. I don't think a random user will bump into
signal.hilbert, for example. And for a basic introduction, Wikipedia is
more informative than a docstring can be.
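
One use commonly given in signal-processing texts is recovering the
envelope of a modulated signal. A minimal sketch, with a made-up test
signal and not taken from the docstring:

```python
import numpy as np
from scipy.signal import hilbert

# Made-up test signal: a 100 Hz carrier with a slow 3 Hz amplitude
# modulation, sampled over exactly one second so it is periodic.
t = np.linspace(0.0, 1.0, 1000, endpoint=False)
envelope_true = 1.0 + 0.5 * np.cos(2 * np.pi * 3 * t)
x = envelope_true * np.cos(2 * np.pi * 100 * t)

# hilbert() returns the complex analytic signal: its real part is the
# input, and its magnitude is the instantaneous amplitude (envelope).
analytic = hilbert(x)
envelope = np.abs(analytic)
```

For this clean, periodic input the recovered envelope matches the
modulation used to build the signal; real data will be less tidy,
especially near the ends of the record.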

Josef

>
> TECHNICAL REVIEW
>
> A technical review ensures that all the features, API points,
> underlying methods that affect the results, and limitations of the
> item are noted properly in the docstring.  It implies familiarity with
> (or at least a good, hard look at) the source code and the general
> topic (e.g., fitting, stats, etc.).  In the ideal case, an expert
> should be able to take the doc and write a more-or-less equivalent
> routine.  This review also should check that internal cross-references
> are complete and that external references are sufficient (and
> long-lived).
>
> PRESENTATION REVIEW
>
> A presentation review ensures that our target audience - which we long
> ago defined as one level *below* that of a likely user of a given
> routine - can read and understand all but the expert parts of the
> document, that the doc follows the docstring format, that it is as
> clear as reasonably possible, that, if expert sections are needed,
> they are properly introduced as such, that the examples are the right
> ones to have and that they work, etc.
>
> --jh--