[Doc-SIG] Comments on the DPS texts

David Goodger dgoodger@bigfoot.com
Sun, 05 Aug 2001 13:26:28 -0400


on 2001-08-03 6:01 AM, Tony J Ibbs (Tibs) (tony@lsl.co.uk) wrote:
> PEP 256 (DPS framework) version 1.1.1.2 of 2001/07/19
> =====================================================
> 256:`Abstract`_[1]_, paragraph 1:
> 
>     You don't explicitly mention methods.

Added.

>         inside their definition
          ^^^^^^

Done.

> 256:`Rationale`_
> 
>     I think that you should add Edward Loper to the list of previous
>     attempts at writing systems.

Yes, I'd been meaning to; will do so now. He promised to send the most
up-to-date URLs, but then disappeared. Are you out there, Edward?

> 256:`Specification`_
> 
>     The diagram at the end of this is indeed much too wide, and really
>     needs redrawing, preferably as an image (even though PEPs don't
>     currently support that).

Added to the dps-notes.txt To Do list. Will do when (a) it's possible
and (b) I have time; not necessarily in that order.

> .. [1] Hmm - it is useful in a document like this to be able to refer to
> targets in another document. I've "invented" the `namespace` directive
> for this purpose, but I don't pretend to like it, nor do I particularly
> think that I've got the syntax for *using* a namespace correct. Regard
> it as a naff convention for these documents only...

So you're suggesting a "namespace" (working name) directive which
dynamically defines a base hyperlink, to which the hyperlink itself
is appended? Would this apply to hyperlinks only, or also to interpreted
text roles?

I think something like this would be useful. Garth Kidd brought it up
recently as well. I don't know how it could be implemented though. I've
downloaded the XLink and XPointer docs, and have a past familiarity with
HyTime. Added to the rst-notes.txt To Do list.

> We should explicitly allow control-L (formfeed) in documents.

^L is whitespace, and thus permitted. I'll explicitly define what
characters constitute whitespace, and make sure the parser knows too.
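
As a quick illustration in present-day Python, the built-in string
methods already treat formfeed as whitespace. The ``WHITESPACE`` set
and ``is_blank`` helper below are assumptions made up for this sketch,
not the set the parser spec will define.

```python
# Hypothetical whitespace set for illustration; the actual spec may differ.
WHITESPACE = " \t\n\r\f\v"


def is_blank(line):
    """Return True if the line holds only whitespace characters."""
    return all(ch in WHITESPACE for ch in line)


print("\f".isspace())       # formfeed counts as whitespace for str.isspace()
print(is_blank(" \f\t"))    # a line of spaces, formfeed and tab is blank
print("a\fb".split())       # str.split() with no argument splits on formfeed
```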

> PEP 257 (Docstring conventions) version 1.2.1.1 of 2001/07/19
> =============================================================
> 
> .. namespace:: 257 is pep-0257
> 
> .. comment:: yes, I know that's a duplicate declaration - I assume
>    that that is allowed and will work? (i.e., have no effect if the
>    same, and override if different).

Since the "namespace" directive doesn't exist yet, its semantics can
be anything you like! Obvious caveat: to get your semantics, *you've*
got to implement it!

> 257:`What is a Docstring?`_
> 
>     In the second paragraph after the little two item list, you
>     describe """..""" and r"""..""" strings, but not the Unicode
>     variants thereof.

I've added u"""Unicode triple-quoted strings""". Are ur"""raw Unicode
strings""" useful? How would you get the Unicode escapes in?

> 257:`One-line Docstrings`_
> 
>     Hmm. When discussing the "signature" in C functions - shouldn't the
>     DPS mandate that this signature line should be (a) of a particular
>     format [the obvious one!], and (b) that the tool "looking for" and
>     interpreting code structure should make use of said signature if
>     present.

(a) Sounds good. Care to provide the wording?

(b) PEP 257 doesn't care about the DPS, but this could be added to PEP
    258.

> 257:`Multi-line Docstrings`_
> 
>     Since presumably we will ultimately be wanting to have an reST mode
>     for [X]Emacs, as a subsidiary mode within Python mode (or however
>     one describes it), I think that this requirement should be removed.

From dps-notes.txt's To Do list:

- Rework PEP 257, separating style from spec wrt DPS. See Doc-SIG
  from 2001-06-19/20.

I think this is another aspect, "tools". I've updated the To-Do.

> OK, I remembered why I sometimes want non-breaking spaces. In contexts
> such as ``PEP 258``, ...
> 
> On the other hand, how to get round it? Maybe we just need to allow an
> escaped space to be a non-breaking space - since I lost the backslash
> battle, I might as well lose it thoroughly. In which case my examples
> would be written "PEP\ 258" and "ISO/IEC\ 8211".

Now the question is how to represent it internally. I suppose we could
store all strings as Unicode internally, and use the Unicode
non-breaking space character. I haven't gotten into the Unicode
encodings, so I'm inclined to put this one off (or let someone else
implement it -- someone who cares ;-).
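
One possible shape for this, sketched in present-day Python where all
strings are Unicode: treat a backslash-escaped space as U+00A0
(NO-BREAK SPACE). The function name ``escaped_spaces_to_nbsp`` is
hypothetical, and this follows the convention proposed in the quoted
text, not anything the parser currently does.

```python
import re

NBSP = "\u00a0"  # Unicode NO-BREAK SPACE


def escaped_spaces_to_nbsp(text):
    """Replace each backslash-escaped space with NO-BREAK SPACE.

    A doubled backslash is treated as an escaped backslash and left alone.
    """
    def replace(match):
        token = match.group(0)
        return NBSP if token == "\\ " else token

    # Consume backslash-plus-character pairs so that an escaped backslash
    # followed by a space is not misread as an escaped space.
    return re.sub(r"\\.", replace, text, flags=re.DOTALL)


print(repr(escaped_spaces_to_nbsp(r"PEP\ 258 and ISO/IEC\ 8211")))
```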


> PEP 258 (DPS generic implementation details)
> version 1.1.1.3 of 2001/07/19
> ============================================
> 
> .. comment:: OK, so I've found an instance of a multi-line header...

I guess it does have to be supported. Oh, well.

> 258:`Docstring extraction rules`_ item 3
> 
>     At the end of the first paragraph, you say "Of course, standard
>     Python parsing tools such as the 'parser' library module should be
>     used.".
> 
>     I think that is too strong a statement - it should either say "may
>     be used" or it should say something more like "are likely to be used
>     in general".

Changed to "may".

This is an area where I'm looking for help: a docstring extraction
module. For interpreted text to work, it should also extract
namespaces, to a point. See docstring/dps/spec/dps-notes.txt, section
"Docstring Extractor".

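To give a rough idea of what such an extractor might look like, here is
a minimal sketch. It assumes the standard ``ast`` module of current
Python releases (the ``parser`` module mentioned above is no longer
part of them), and ``extract_docstrings`` is a hypothetical name, not
part of the DPS. It collects only module, class and function
docstrings, keyed by dotted name.

```python
import ast


def extract_docstrings(source):
    """Map dotted names to docstrings for a module, its classes and functions."""
    found = {}

    def visit(node, prefix):
        doc = ast.get_docstring(node)
        if doc is not None:
            found[prefix or "<module>"] = doc
        for child in node.body:
            if isinstance(child, (ast.ClassDef, ast.FunctionDef,
                                  ast.AsyncFunctionDef)):
                name = f"{prefix}.{child.name}" if prefix else child.name
                visit(child, name)

    visit(ast.parse(source), "")
    return found


SAMPLE = '''
"""Module docstring."""

class Spam:
    """Class docstring."""

    def eggs(self):
        """Method docstring."""
'''

for name, doc in extract_docstrings(SAMPLE).items():
    print(name, "->", doc)
```

The dotted keys are one simple way to carry the namespace information
along with each docstring.
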
> 258:`Attribute docstrings`_ item 1
> 
>     Does this really mean to leave functions out? That is, can one
>     really not do::
> 
>         def fred():
>             a = 1
>             """``a`` is a silly name for a value."""

'a' is a local variable, and not of interest to the outside world.
It's an implementation detail, best documented with a comment (or,
better yet, with a self-documenting, descriptive name). In other
words, 'a' is only of interest to someone reading the code itself,
not to someone calling the function, so there's no need to document
it externally.

> 258:`Additional docstrings`_
> 
>     When you say "this breaks 'from __future__ import'" can you give
>     more context...

Context added.

>     If it means what I think it means (that the __future__ statement
>     must be the first statement in a module, excepting a docstring (and
>     presumably another __future__!)), it would seem that option 2 is the
>     obvious one to adopt.

That would require a change to the Python core, something I'm trying
to avoid. I think option 3 ("ignore the problem") is good enough for
now. If it comes back to bite us later (after reStructuredText has
taken the community by storm and popular demand is undeniable), we
can revisit the issue.
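
The conflict can be seen with the built-in ``compile()``. This is a
sketch against current CPython behaviour, where only a single module
docstring may precede a ``__future__`` import; the helper name
``compiles`` is made up for the example.

```python
def compiles(source):
    """Return True if the source compiles as a module, False on SyntaxError."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError:
        return False
    return True


ONE_DOCSTRING = (
    '"""Module docstring."""\n'
    "from __future__ import division\n"
)
TWO_DOCSTRINGS = (
    '"""Module docstring."""\n'
    '"""Additional docstring."""\n'
    "from __future__ import division\n"
)

print(compiles(ONE_DOCSTRING))   # the future statement follows the docstring
print(compiles(TWO_DOCSTRINGS))  # an additional string literal intervenes
```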

>     If there *are* going to be additional docstrings, can I ask that we
>     have some way of *identifying* them...

I think the use of docstrings as metadata is dying out, now that we
have function attributes. I included additional docstrings as a
mechanism to reduce runtime usage for voluminous documentation.
They're not an absolute requirement though: easy to drop, and easy to
put off until later.

Sorry, I'm not convinced. Do we really need to identify additional
docstrings? Can you come up with a convincing rationale? And then, a
good way of identifying them.

> 258:`Choice of Docstring format`_
> 
>     The default of "plaintext" seems sensible - indeed, it's not clear
>     to me that one need change it *ever*, since plain text is always a
>     valid form of documentation.

If some syntax (reStructuredText or another) becomes *the* official
docstring syntax, it would be useful to change it. But I agree,
"plaintext" is a good default. The only problem is that we *must*
specify ``__docformat__ = 'restructuredtext'`` (or whatever) to use
the markup. This could be a problem for 3rd party modules. The DPS
program could have a syntax-override option.
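
A minimal sketch of how such an override might interact with
``__docformat__``. The function ``docformat_for`` and its parameters
are hypothetical; only the ``__docformat__`` variable and the
"plaintext" default come from the discussion above.

```python
import types


def docformat_for(module, override=None, default="plaintext"):
    """Pick the format: explicit override, then __docformat__, then default."""
    if override is not None:
        return override
    return getattr(module, "__docformat__", default)


# Stand-ins for a third-party module without the variable and one that sets it.
legacy = types.ModuleType("legacy")
marked = types.ModuleType("marked")
marked.__docformat__ = "restructuredtext"

print(docformat_for(legacy))
print(docformat_for(marked))
print(docformat_for(legacy, override="restructuredtext"))
```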

>     I suggest format names are something like:
> 
>       - "plaintext" (the default)
>       - "reST" (reStructuredText)
>       - "ST" (StructuredText - for Zope compatibility,
>         and thus maybe deprecated)
>       - "STNG" (StructuredTextNG - ditto)

I've been considering a mapping of names, so syntaxes can register
their full names and aliases. Names would be case-insensitive.
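
Such a mapping could be as small as the following sketch. The class
``FormatRegistry`` and its method names are assumptions for
illustration; the format names are the ones suggested in the quoted
text.

```python
class FormatRegistry:
    """Case-insensitive mapping from format names and aliases to full names."""

    def __init__(self, default="plaintext"):
        self._names = {}
        self.default = default
        self.register(default)

    def register(self, full_name, *aliases):
        """Register a syntax under its full name and any aliases."""
        for name in (full_name, *aliases):
            self._names[name.lower()] = full_name

    def resolve(self, name=None):
        """Return the full name for `name`, or the default when it is None."""
        if name is None:
            return self.default
        return self._names[name.lower()]  # KeyError for unknown names


registry = FormatRegistry()
registry.register("restructuredtext", "reST")
registry.register("structuredtext", "ST")
registry.register("structuredtextng", "STNG")

print(registry.resolve("ReST"))
print(registry.resolve())
```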

> Other names
>     OK, so the DPS is responsible (in some sense) for the __docformat__
>     name. Should it also be aware of some of the other (semi) standard
>     names that people use - the following are ones I'm aware of:
> 
>      * __author__
>      * __version__
>      * __history__ (this is less common)
>      * __copyright__ (I've just made this one up)

I don't like to see such namespace pollution. Once we open the door on
these names (and admittedly it's already at least partially open), we
can't close it. That's one reason why I added field lists to the spec,
so they could be leveraged for bibliographic information.

Should the DPS support the above? Maybe. I'd prefer that it didn't.
That information is documentation, and doesn't need to take up more of
the global namespace. If you look at the pydoc output for modules
which have these variables, it's very ugly. I'd want to avoid
contributing to the further proliferation of these variables. If we
support them, any of them, it just encourages the practice.

I'd rather leave this up to a BDFL pronouncement, if Guido is willing.

> 258:`Intermediate data structure`_
> 
>     It should be made clear(er) that the DOM tree is intermediate
>     between the input parser and the output formatter - it is not a
>     requirement for the *internal* workings of either. The first
>     sentence of this section reads as if it means that the input
>     parser *must* use the DOM tree inside itself.

Made explicit.

Speaking of DOM, I tried using it at first, but decided to take your
advice and create a class library, dps.nodes, with one class per
element type. The classes have ``asdom()`` methods which convert to
xml.dom.minidom.
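
The general idea, as a simplified sketch and not the actual dps.nodes
code: one small class per element type, sharing an ``asdom()`` method
that builds the equivalent ``xml.dom.minidom`` element. The class
names and constructor signature below are assumptions.

```python
from xml.dom import minidom


class Node:
    """Base element class; the tag name is taken from the class name."""

    def __init__(self, *children, **attributes):
        self.children = list(children)
        self.attributes = attributes

    def asdom(self, document=None):
        """Return an xml.dom.minidom element mirroring this node."""
        if document is None:
            document = minidom.Document()
        element = document.createElement(type(self).__name__)
        for key, value in self.attributes.items():
            element.setAttribute(key, str(value))
        for child in self.children:
            if isinstance(child, Node):
                element.appendChild(child.asdom(document))
            else:
                # Anything that is not a Node is treated as text.
                element.appendChild(document.createTextNode(str(child)))
        return element


class section(Node):
    pass


class title(Node):
    pass


class paragraph(Node):
    pass


tree = section(title("Abstract"), paragraph("Some text."), name="abstract")
print(tree.asdom().toxml())
```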

I think the next step for PEP 258 is to move it away from the
"generic" toward the "specific", adding details from the reference
implementation. In PEP 256 I'd specified that there would be an
implementation-specific PEP as well as the generic one, but somehow I
doubt anybody else would implement an alternate DPS.

> 258:`Output management`_
> 
>     The final sentence: "Use a directory hierarchy ... couldn't run on
>     MacOS)" doesn't make sense to me - please explain it.

Explained:

    (The files generated by pythondoc used compound file names, like
    'packagename.modulename.classname.html', which were often too long
    for the 31-character MacOS file name length limit. This is one of
    the reasons pythondoc couldn't run on MacOS).
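
A sketch of the directory-hierarchy alternative, with a hypothetical
``output_path`` helper: each dotted component becomes a directory, so
no single file name has to carry the whole compound name. For the
example name above, the flat file name is 37 characters long, while
the longest component of the hierarchical path is 14.

```python
from pathlib import PurePosixPath


def output_path(dotted_name, suffix=".html"):
    """Map 'package.module.Class' to 'package/module/Class.html'."""
    *parents, leaf = dotted_name.split(".")
    return PurePosixPath(*parents, leaf + suffix)


name = "packagename.modulename.classname"
path = output_path(name)

print(path)
print(len(name + ".html"))                     # length of the flat file name
print(max(len(part) for part in path.parts))   # longest single path component
```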

> 258:`Error handling`_
> 
>     Good. Of course, these correspond to the VMS information, warning,
>     error and fatal message levels (and those might be good names to use
>     for them).

That's news to me. Interesting. Good names, better than what I'd come
up with. I did a Google search and came up with this:

+-------+-------+--------------+------------------------------------+
| DPS   | VMS   |              |                                    |
| Level | Value | Severity     | Response                           |
+=======+=======+==============+====================================+
| n/a   | 1     | Success      | Execution continues, expected      |
|       |       |              | results                            |
+-------+-------+--------------+------------------------------------+
| 0     | 3     | Information  | Execution continues, informational |
|       |       |              | message displayed                  |
+-------+-------+--------------+------------------------------------+
| 1     | 0     | Warning      | Execution continues, unpredictable |
|       |       |              | results                            |
+-------+-------+--------------+------------------------------------+
| 2     | 2     | Error        | Execution continues, erroneous     |
|       |       |              | results                            |
+-------+-------+--------------+------------------------------------+
| 3     | 4     | Severe error | Execution terminates, no output    |
+-------+-------+--------------+------------------------------------+

(Source:
http://www.openvms.compaq.com:8000/73final/5841/5841pro_027.html#error_cond_severity)

Also, the "Response" column gives a good explanation of the effect of
the warnings. I've added this to the dps/dps-notes.txt To Do list.
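
The DPS levels in the table could be modelled along these lines. This
is a sketch only: the ``Severity`` enum, the ``report`` function and
its ``threshold`` parameter are assumptions, and only the level
numbers, names and responses come from the table.

```python
from enum import IntEnum


class Severity(IntEnum):
    """DPS levels from the table above, named after the VMS severities."""
    INFORMATION = 0
    WARNING = 1
    ERROR = 2
    SEVERE = 3


class SevereError(Exception):
    """Raised when processing must terminate without output."""


def report(level, message, threshold=Severity.WARNING, log=None):
    """Record messages at or above `threshold`; terminate on SEVERE."""
    level = Severity(level)
    if log is not None and level >= threshold:
        log.append(f"{level.name}: {message}")
    if level is Severity.SEVERE:
        raise SevereError(message)
    return level


messages = []
report(Severity.INFORMATION, "parsed 3 docstrings", log=messages)
report(Severity.WARNING, "unknown target name", log=messages)
print(messages)
```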

-- 
David Goodger    dgoodger@bigfoot.com    Open-source projects:
 - Python Docstring Processing System: http://docstring.sourceforge.net
 - reStructuredText: http://structuredtext.sourceforge.net
 - The Go Tools Project: http://gotools.sourceforge.net