[SciPy-user] Docstring standards for NumPy and SciPy
Edward Loper
edloper at gradient.cis.upenn.edu
Tue Jan 16 13:52:53 EST 2007
[I sent this 5 days ago, but it's been held because I was not
subscribed -- so I decided to just go ahead & subscribe and resend
it. Apologies if it ends up being a dup.]
I'm glad to hear that you're making a push towards using standardized
markup in docstrings -- I think this is a worthy goal. I wanted to
respond to a few points that have come up, though.
First, I'd pretty strongly recommend against inventing your own
markup language. It increases the barrier for contributions, makes
life more difficult for tools, and takes up that much more brain
space that could be devoted to better things. Plus, it's
surprisingly hard to do right, even if you're translating from your
markup to an existing one -- there are just too many corner cases to
consider. I know Travis has reservations about the amount of 'line
noise,' but believe me, there are good reasons why that 'line noise'
is there, and the authors of ReST have done a *very* good job at
keeping it to a minimum.
Given the expressive power that's needed for scipy docs, I would
recommend using ReST. Epytext is a much simpler markup language, and
most likely won't be expressive enough. (e.g., it has no support for
tables.)
Whatever markup language you settle on, be sure to indicate it by
setting module-level __docformat__ variables, as described in PEP
258. __docformat__ should be a string containing the name of the
module's markup language. The name of the markup language may
optionally be followed by a language code (such as en for English).
Conventionally, the definition of the __docformat__ variable
immediately follows the module's docstring. E.g.:
    __docformat__ = 'restructuredtext'
Other standard values include 'plaintext' and 'epytext'.
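As a rough illustration of the convention just described, here is a minimal sketch of a module that declares its markup with an optional language code, plus a small helper that splits the value apart. The helper name parse_docformat is hypothetical; it is not part of epydoc or docutils, and it assumes the space-separated form given in PEP 258.

```python
"""Example module docstring (hypothetical module)."""

# By convention, __docformat__ immediately follows the module docstring.
# An optional language code may follow the markup name, separated by a space.
__docformat__ = 'restructuredtext en'


def parse_docformat(value, default='plaintext'):
    """Split a __docformat__ string into (markup, language).

    Hypothetical helper for illustration only.
    """
    # Fall back to the default when the value is missing or blank.
    parts = (value or '').split() or [default]
    markup = parts[0].lower()
    language = parts[1] if len(parts) > 1 else None
    return markup, language


markup, language = parse_docformat(__docformat__)
```

A tool reading this module would see the markup name 'restructuredtext' and the language code 'en'; a module with no __docformat__ at all would be treated here as plaintext.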
As for extending ReST and/or epydoc to support any specializations
you want to make, I don't think it'll be that hard. E.g., adding
'input' and 'output' as aliases for 'parameters' and 'returns' is
pretty simple. And adding support for generating latex-math should
be pretty straightforward. I think concerns about the markup for
marking latex-math are perhaps exaggerated, given that the *contents*
of latex-math expressions are quite likely to look like line-noise to
the uninitiated. :) I've patched my local version of docutils to
support inline math with `x=12`:math: and block math with:
    .. math:: F(x,y;w) = \langle w, \Phi(x,y) \rangle
And I've been pretty happy with how well it reads. And for people
who aren't latex gurus, it may be more obvious what's going on if
they see :math:`..big latex expr..` than if they just see $..big
latex expr..$.
If you really think that's too line-noise-like, then you could set
the default role to be math, so `x=12` would render as math. But
then you'd need to explicitly mark crossreferences, so I doubt that
would be a win overall.
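To show how this reads inside an actual docstring, here is a minimal sketch using the inline role and the block directive in the syntax given above. The function and its arguments are hypothetical, and whether a given docutils installation accepts a math role or directive is an assumption, since the support described here is a local patch. The docstring is a raw string so the backslashes reach the parser unchanged.

```python
def score(w, phi):
    r"""Return the linear score of a feature vector.

    The score is written :math:`F(x,y;w)` and is defined by:

    .. math:: F(x,y;w) = \langle w, \Phi(x,y) \rangle
    """
    # Plain inner product of the weight vector and the feature vector.
    return sum(wi * pi for wi, pi in zip(w, phi))
```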
[Alan Isaac]
> Must items (e.g., parameters) in a consolidated field be
> marked as interpreted text (with back ticks)?
> Yes. It does seem redundant, so I will ask why.
>
I wouldn't mind changing this to work both with & without the
backticks around parameter names. At the time when I implemented it,
I just checked what the standard practice within docutils for writing
consolidated fields was, and wrote a parser for that.
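For readers who have not seen one, a consolidated field with the backticks currently required looks roughly like the sketch below. The function itself is made up for illustration, and the exact set of field names a given epydoc version accepts should be checked against its documentation.

```python
__docformat__ = 'restructuredtext'


def fox_speed(size, weight):
    """Return a rough speed estimate for a fox.

    :Parameters:
      - `size`: the size of the fox, in meters
      - `weight`: the weight of the fox, in kilograms

    :Returns: the estimated speed, in meters per second
    """
    # Arbitrary formula; only the docstring layout matters here.
    return 10.0 * size / max(weight, 1.0)
```

Each parameter name is wrapped in backticks as interpreted text, which is the redundancy the question refers to.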
[Alan Isaac]
> Would it not be nice to have :Inputs: and :Outputs:
> consolidated fields as synonyms for :Parameters:
> and :Returns:?
> Yes! Perhaps Ed Loper would be willing to add this.
>
The only concern might be if other projects have defined
custom :input: and :output: fields that they use for other purposes --
I'll try to check if this is the case. In the meantime, the
following should do what you want:
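    # Handle an 'output' field the same way as a 'return' field.
    from epydoc.docstringparser import *
    register_field_handler(process_return_field, 'output')
    # Handle a consolidated 'input' field the same way as 'parameters'.
    from epydoc.markup import restructuredtext as epytext_rst
    epytext_rst.CONSOLIDATED_FIELDS['input'] = 'param'
    epytext_rst.CONSOLIDATED_DEFLIST_FIELDS.append('input')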
[Alan Isaac]
> Is Epydoc easily customizable?
> In what ways? It is easy to add new fields
> (see above), but I do not know about new
> consolidated fields.
>
I intend for epydoc to be easily customizable, but at the moment it's
only customizable in those places where I've thought to make it
customizable. If you find there's some customization you'd like to
do, but there's no hook for it, let me know & I can try to think
about what kind of hook would be appropriate.
[Alan Isaac]
> Is table support adequate in reST?
>
See <http://docutils.sourceforge.net/docs/ref/rst/restructuredtext.html#tables>
If ReST table support isn't expressive enough for you, then you must
be using some pretty complex tables. :)
[Alan Isaac]
> math, so we could inline `f(x)=x^2` rather than
> :latex-math:`f(x)=x^2`.
>
As I noted above, this would mean you'd have to explicitly mark
crossreferences to python objects with some tag -- rst can't read
your mind to know whether `foo` refers to a math expression or a
variable.
> It may be worth asking whether
> epydoc developers would be willing to pass $f(x)=x^2$
> as latex-math.
>
Overall, I'm reluctant to make changes to the markup language(s)
themselves that aren't supported by the markup language's own
extension facilities.
> Why use underlining to define sections?
> So that they are really sections.
> The indented examples will display fine
> but will not give access to sectioning controls.
>
If you don't use underlining, you'll get definition lists instead of
sections. It would be possible to register a transformation w/ ReST
that checks for top-level definition lists & transforms them to
sections, but I doubt it's worth it. In my experience, the only time
when you need to add section headings within a docstring is if the
docstring is quite long, and in that case the underlining doesn't
bother me too much.
[Gary Ruben]
> Currently epydoc generates far too much
> information (2371 pages worth when I ran it on the numpy source a few
> days ago) and seems unable to be easily modified to reduce its output.
>
If you can explicitly specify what you'd like included in the output,
and how you'd like it formatted, then I can give you an idea of how
hard that would be to produce. You are right that, at the moment,
epydoc's output generators are not terribly customizable. And the
latex output isn't as pretty as I'd like. :)
[Gary Ruben]
> The other thing we want is to be able to generate examples from
> heavily marked-up example modules a la what FiPy does. I don't think
> epydoc even allows that without modification.
>
For this, I highly recommend writing stand-alone doctest files, which
can be run through docutils as-is to generate marked-up examples; and
can be run through doctest to verify that all examples are correct.
E.g., see:
<http://epydoc.sourceforge.net/doctest/index.html>
Each of the files linked from that page is generated from an
rst-formatted doctest file.
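A minimal sketch of the verification half of that workflow, using only the standard doctest module. The example text stands in for the contents of a stand-alone file (the name examples.txt is hypothetical), and run_examples is an illustrative helper, not part of epydoc.

```python
import doctest

# Contents of a hypothetical stand-alone doctest file, e.g. "examples.txt".
# The same text is valid ReST, so it can also be fed to docutils as-is.
EXAMPLE_TEXT = """\
Adding numbers
==============

Plain integers add as expected:

    >>> 1 + 2
    3

So do lists, by concatenation:

    >>> [1] + [2, 3]
    [1, 2, 3]
"""


def run_examples(text, name='examples.txt'):
    """Run the doctest examples found in `text`; return (failed, attempted)."""
    parser = doctest.DocTestParser()
    test = parser.get_doctest(text, globs={}, name=name,
                              filename=name, lineno=0)
    runner = doctest.DocTestRunner(verbose=False)
    runner.run(test)
    return runner.summarize(verbose=False)


failed, attempted = run_examples(EXAMPLE_TEXT)
```

For a file that actually lives on disk, doctest.testfile('examples.txt') does the same job in one call.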
[Perry Greenfield]
> Any reason ipython can't use epydoc or some other tool to format the
> markup in ascii (I forget if epydoc does ascii output) so that the
> user doesn't see the 'line noise' when using the ipython
> introspection features?
>
If you add this to ipython, please be sure to check the __docformat__
variable before deciding how to convert the docstring! (If you
encounter an unknown markup, then just render it as plaintext.)
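The check could be sketched along these lines. Everything here is an assumption about how such a feature might be structured: RENDERERS, render_plaintext, and render_docstring are hypothetical names, and a real tool would register its own ReST or epytext converters in the table.

```python
import inspect
import textwrap


def render_plaintext(doc):
    """Fallback renderer: show the docstring as-is, minus indentation."""
    return textwrap.dedent(doc).strip()


# Hypothetical registry mapping markup names to renderer callables.
# Converters for 'restructuredtext' or 'epytext' would be added here.
RENDERERS = {'plaintext': render_plaintext}


def render_docstring(obj):
    """Render obj's docstring according to its module's __docformat__."""
    doc = inspect.getdoc(obj) or ''
    module = inspect.getmodule(obj)
    docformat = getattr(module, '__docformat__', None) or 'plaintext'
    # The markup name is the first word; a language code may follow it.
    parts = str(docformat).split()
    markup = parts[0].lower() if parts else 'plaintext'
    # Unknown markup falls back to plaintext rather than guessing.
    renderer = RENDERERS.get(markup, render_plaintext)
    return renderer(doc)
```

Because unknown names fall through to render_plaintext, a module that declares a markup the tool has never heard of still displays its docstrings unchanged.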
As a final note, it's probably true that epydoc may currently be
missing some of the hooks that you'd need to specialize ReST without
doing some monkey-patching. If you find that this is the case,
please let me know what hooks you'd like to see added to epydoc. Or
if the construction you're trying to add is one that's likely to be
useful to other epydoc users (e.g., latex-math), then it could
certainly be added to epydoc itself.
-Edward
(disclaimer: I'm not subscribed to scipy-user; I just read the thread
from the archives. So please cc me on responses.)