[SciPy-dev] Fwd: [sage-devel] numpy in SAGE, etc.

Fernando Perez fperez.net at gmail.com
Wed Dec 6 21:39:05 EST 2006


Hi Scipy-dev,

I'm forwarding this from William Stein, the lead developer of SAGE
(http://modular.math.washington.edu/sage/).

William, I won't have any email access until next Monday, but
hopefully others may pitch in. It's probably also worth mentioning
these two pages:

http://scipy.org/Numpy_Example_List_With_Doc
http://www.hjcb.nl/python/Arrays.html

Their contents may serve as the starting point for material for
docstring examples (this has been suggested several times before; it
just hasn't happened).

Cheers,

f

---------- Forwarded message ----------
From: William Stein <wstein at gmail.com>
Date: Dec 5, 2006 11:41 PM
Subject: [sage-devel] numpy in SAGE, etc.
To: "sage-devel at googlegroups.com" <sage-devel at googlegroups.com>
Cc: "Fernando.Perez at colorado.edu" <Fernando.Perez at colorado.edu>,
oliphant at ee.byu.edu, pearu at cens.ioc.ee



Hello,

In case you don't know, SAGE-1.5 (http://sage.math.washington.edu/sage)
will include numpy by default.  Inclusion of the scipy distribution
might not be far off either.  We will also definitely continue to
include GSL in SAGE and develop its unique functionality.

I want SAGE to develop into a truly viable alternative to MATLAB (in
addition to everything else it is), and it's clear to me that
numpy/scipy/vtk/mayavi/gsl are crucial pieces of software if there is
any hope of success.

Anyway, to the first point of this email.  I want to try some functions in
numpy, so I type

     numpy.[tab],

then say

    numpy.array?

and I see some _minimal_ documentation but ABSOLUTELY NO EXAMPLES.
The same is true for tons of the functions in numpy (and Numeric),
and even Python for that matter.  Anyway, this is simply *not* up
to snuff for what is needed for SAGE.    For SAGE my goal is that
every mathematical function in the system is illustrated by examples
that the user can paste into the interpreter and have work (and
moreover, they are autotested).  Currently there are about 12000
lines of such input already.
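
As a rough illustration of what "examples that can be pasted into the
interpreter and are autotested" can look like in plain Python, here is a
minimal sketch using the standard doctest module.  The function mean()
and the helper run_examples() are hypothetical names made up for this
example; they are not part of SAGE or numpy, and SAGE's own test runner
may well work differently.

```python
import doctest


def mean(values):
    """Return the arithmetic mean of a non-empty sequence of numbers.

    >>> mean([1, 2, 3, 4])
    2.5
    >>> mean([10])
    10.0
    """
    return sum(values) / len(values)


def run_examples(func):
    """Run the examples in func's docstring; return (failed, attempted)."""
    parser = doctest.DocTestParser()
    # Build a test from the docstring; the function itself is the only
    # name the examples need in their namespace.
    test = parser.get_doctest(
        func.__doc__, {func.__name__: func}, func.__name__, None, 0
    )
    runner = doctest.DocTestRunner(verbose=False)
    results = runner.run(test)
    return results.failed, results.attempted


if __name__ == "__main__":
    failed, attempted = run_examples(mean)
    print(f"{attempted} examples run, {failed} failed")
```

The same text serves as documentation (it shows up under mean?) and as a
regression test, which is the property described above.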

I think this is unrelated to the whole issue with the
official numpy documentation being commercial.   Given the extremely
limited number of examples in Numeric, numpy, and the official
Python docs, it must be a conscious design decision to *not*
have lots and lots of doctests.  In SAGE, files often have way
more docs and doctests than actual code -- again this is a design
decision.  The question, then, is what to do if numpy is to be included
in SAGE in a way that satisfies our design goals?   Some options
include:

(1) The file numpy/add_newdocs.py in the numpy distribution
defines somehow docstrings for a lot of the numpy constructors.
A SAGE developer could simply add tons of examples to this file,
based on playing around, and reading the numpy book to learn what
is relevant to illustrate.    As each numpy distribution is
released, we would *merge* this file with the one in the new
numpy distribution (e.g., very easily using Mercurial).

(2) You might think it would be possible to change the docstrings
at runtime, but I think they may be hardcoded in (many are for
code defined in extension classes).
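
The concern in (2) can be checked with a small experiment.  The sketch
below uses only built-in objects, with the C-implemented int type
standing in for an extension class; try_set_doc() and
pure_python_function() are hypothetical names for this illustration.  On
CPython the first assignment is accepted and the second is refused, but
this says nothing definite about any particular numpy object.

```python
def try_set_doc(obj, text):
    """Try to replace obj.__doc__; return True on success, False if refused."""
    try:
        obj.__doc__ = text
    except (AttributeError, TypeError):
        # C-implemented types and built-in functions reject the assignment.
        return False
    return True


def pure_python_function():
    """Original docstring."""


if __name__ == "__main__":
    # A function defined in Python accepts a new docstring at runtime.
    print(try_set_doc(pure_python_function, "Replaced, now with EXAMPLES."))
    # The built-in int type, implemented in C, does not.
    print(try_set_doc(int, "Replaced."))
```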

OK, so I don't have many options.  Thoughts?  Does anybody
want to help?   Any person who wants to learn numpy could
probably easily write these examples along the way.  Instead of
just learning numpy, you could more systematically learn numpy and
at the same time contribute tons of useful doctests.

And finally, am I just wrong -- would Travis, etc., want these
docstrings with tons of examples?  Travis -- since I cc'd you,
maybe you can just answer.   I can completely understand if you
don't want tons of doctests; it's fine if your design goals are
different.  By the way, SAGE Days 3 is in LA at IPAM Feb 17-21,
and I hope both Fernando and Travis will consider coming. Some
travel funding is available.

By the way, most of the remarks above also apply to Networkx -- its
docs seem to have almost no examples.   Actually, I don't think I know
of *any* Python packages that do have much in the way of examples in
the docstrings, at least nothing on the order of SAGE.

-------------------------

Here's the official statement about the scipy module documentation,
in the DEVELOPERS.txt file of the scipy distribution:

"Currently there are

* A SciPy tutorial by Travis E. Oliphant.  This is maintained using LyX.
   The main advantage of this approach is that one can use mathematical
   formulas in documentation.

* I (Pearu) have used reStructuredText formated .txt files to document
   various bits of software. This is mainly because ``docutils`` might
   become a standard tool to document Python modules. The disadvantage
   is that it does not support mathematical formulas (though, we might
   add this feature ourself using e.g. LaTeX syntax).

* Various text files with almost no formatting and mostly badly out
   dated.

* Documentation strings of Python functions, classes, and modules.
   Some SciPy modules are well-documented in this sense, others are very
   poorly documented. Another issue is that there is no consensus on how
   to format documentation strings, mainly because we haven't decided
   which tool to use to generate, for instance, HTML pages of
   documentation strings."

--------

So evidently the scipy people have for some reason not even decided
on a format for their documentation, which is partly why they don't
have it.  For the record, in SAGE there is a very precise documentation
format that is systematically and uniformly applied throughout the
system:

      * We liberally use latex in the docstrings.  I wrote a Python function
        that preparses this to make it human readable, when one types
            foo?

      * The format of each docstring is as follows:
function header
"""
1-2 sentences summarizing what the function does.

INPUT:
     var1 -- type, defaults, what it is
     var2 -- ...
OUTPUT:
     description of output var or vars (if tuple)

EXAMPLES:
     a *bunch* of examples, often a whole page.

NOTES:
     misc other notes

ALGORITHM:
     notes about the implementation or algorithm, if applicable

AUTHORS:
     -- name (date): notes about what was done
"""

The INPUT and OUTPUT blocks are typeset as a latex verbatim
environment.  The rest is typeset using normal latex.
It's good to use the itemize environment when necessary
for lists, also.
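
To make the template concrete, here is a sketch of a small function
documented in that layout.  The function gcd_of_list() is a hypothetical
example, not taken from SAGE, and the examples use the plain Python >>>
prompt so that the standard doctest module can run them; the AUTHORS
entry is left as a placeholder.

```python
from math import gcd


def gcd_of_list(values):
    r"""
    Return the greatest common divisor of a list of integers.

    INPUT:
         values -- list of int
    OUTPUT:
         int -- the nonnegative gcd of all entries (0 for an empty list)

    EXAMPLES:
         >>> gcd_of_list([12, 18, 30])
         6
         >>> gcd_of_list([7])
         7
         >>> gcd_of_list([0, 5])
         5

    NOTES:
         Zero entries do not affect the result.

    ALGORITHM:
         Fold the two-argument gcd over the list, starting from 0.

    AUTHORS:
         -- name (date): placeholder entry for this illustration
    """
    result = 0
    for value in values:
        result = gcd(result, value)
    return result
```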

Essentially all the documentation in SAGE has exactly this format.
VERY often I rewrite documentation for code people send me so
that it is formatted as above.  (So if you send me code, please
format it exactly as above!!!)

The SAGE reference manual is autogenerated from the source code
using a script I wrote that basically extracts the documentation
(mainly using Python introspection), and puts it in the same format
as the standard Python documentation suite, then runs the standard
Python documentation tools on it.   Note, however, that the
structure of the reference manual is laid out by hand (i.e., order
of chapters, some heading text, etc.), which I think is very important.
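
The general idea (docstrings pulled out by introspection, ordering
chosen by hand) can be sketched in a few lines of Python.  This is not
the SAGE script; document_module() is a hypothetical helper, the
standard json module is used only as a stand-in for a package to
document, and the output is plain text rather than the Python
documentation format.

```python
import inspect
import json  # stand-in for the package being documented


def document_module(module, names):
    """Return plain-text reference entries for names, in the order given."""
    sections = []
    for name in names:
        obj = getattr(module, name)
        # inspect.getdoc() returns the docstring with indentation cleaned up.
        doc = inspect.getdoc(obj) or "(no documentation)"
        try:
            signature = str(inspect.signature(obj))
        except (TypeError, ValueError):
            # Some C-implemented callables expose no signature.
            signature = "(...)"
        title = f"{module.__name__}.{name}{signature}"
        sections.append(f"{title}\n{'-' * len(title)}\n{doc}\n")
    return "\n".join(sections)


if __name__ == "__main__":
    # The list of names is the hand-written part: it fixes the chapter order.
    print(document_module(json, ["loads", "dumps"]))
```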

I suggest scipy consider a similar strategy.

Hey, I just got:

"Successfully installed scipy-2006-12-05
Now cleaning up tmp files."

on my MacBookPro, after installing the new Intel fortran compiler.


  -- William



