[Doc-SIG] numbered headings in reST

David Goodger goodger@users.sourceforge.net
Sat, 10 Aug 2002 22:28:37 -0400


[David]
>>>> In the spec/notes.txt file there is this idea for a "sectnum"
>>>> directive:
>>>>
>>>>     _`parts.sectnum` (automatic section numbering; add support to
>>>>     the "contents" directive; could be cmdline option also)
>>>>
>>>> By this I'm thinking of an option to automatically number sections;
>>>> the user wouldn't have to write or maintain the numbering.  Would
>>>> that be a better solution for you?

[Dmitry]
>>> How exactly would that look and work? Automatic section numbering
>>> would be a good thing for me.

[David]
>> Something like this::
>> 
>>     .. sectnum::
>> 
>>     Section One
>>     ===========
>> 
>>     Section Two
>>     ===========
>> 
>>     Subsection One
>>     --------------
>> 
>> When processed, the numbers "1", "2", and "2.1" would be prefixed to
>> the titles automatically.  The directive name could be "sectnum" or
>> "section-numbers" or "section-numbering", perhaps with a ":global:"
>> attribute.

[Dmitry]
> OK. Docutils is still quite new to me, and I didn't understand at
> once that sectnum will be a global transform specified once in the
> document, and not in every section title.

Actually, I'd first thought of "sectnum" altering how sections were
handled by the parser, but this way is probably better.  It's
certainly simpler!
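
To make the idea concrete, here's a rough sketch of what such a
transform boils down to.  It is not the code from the patch: the
function name and the plain (title, subsections) tuples are stand-ins
I've made up for illustration, in place of the real document tree.

```python
def number_sections(sections, prefix=()):
    """Return a copy of the tree with dotted numbers prefixed to titles.

    `sections` is a list of (title, subsections) pairs; this plain
    structure is a hypothetical stand-in for the real document tree.
    """
    numbered = []
    for index, (title, children) in enumerate(sections, start=1):
        numbers = prefix + (index,)
        label = ".".join(str(n) for n in numbers)
        # Number the title, then recurse with the extended prefix.
        numbered.append((label + " " + title,
                         number_sections(children, numbers)))
    return numbered


doc = [
    ("Section One", []),
    ("Section Two", [("Subsection One", [])]),
]
result = number_sections(doc)
```

The source text is left alone; only the processed titles pick up the
"1", "2", and "2.1" prefixes.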

> And after all, document transforms seemed easier to figure out than
> list parsing code, and so the patch implementing the sectnum
> directive is now at:
> 
> 
> http://sourceforge.net/tracker/index.php?func=detail&aid=593461&group_id=38414&atid=422032

Thank you!  The patch looks good, but there are some issues we should
consider.

> A test-case is included with the patch.

That's great; thank you.  How are you at documentation? ;-)

The first issue is,

> The directive doesn't have any attributes (should there be any?
> which?).

Perhaps, as with the "contents" directive, there could be a "depth"
attribute?  Would it be useful to limit the depth of numbered
sections?  Sections too deeply nested wouldn't be numbered.

Also, should there be a "local" attribute?  Perhaps sections from a
certain level down could have numbering enabled, where sections in the
rest of the document are not numbered.  For example, a reference
section of a manual might be numbered, but not the rest.  OTOH, an
all-or-nothing approach would probably be enough.

Perhaps there should be some way to control the enumeration sequences?
For example, in a long document chapters are usually numbered, but
appendices use letters (A, B, C, ...).  Perhaps multiple "sectnum"
directives could exist, each specifying the style of the remainder of
the document.  So just before the first appendix, there could be a "..
sectnum:: :enum: A" directive.  Should this apply only to the first
part of the section number, or should each part be adjustable?  Of
course, this doesn't have to be implemented now; we can add it to the
to-do list until someone actually needs it.
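
If each part of the number were adjustable, the formatting step might
look something like the sketch below.  Everything here is an
assumption for the sake of discussion: the function names, the
``styles`` tuple, and the two style codes ("1" for arabic, "A" for
upper-case letters) are invented, and nothing like this is in the
patch.

```python
import string


def format_component(n, style):
    """Render the 1-based counter `n` in the given style ("1" or "A")."""
    if style == "1":
        return str(n)
    if style == "A":
        if not 1 <= n <= 26:
            raise ValueError("this sketch only handles A through Z")
        return string.ascii_uppercase[n - 1]
    raise ValueError("unknown enumeration style: %r" % (style,))


def format_number(numbers, styles=("1",)):
    """Join counters with dots; levels beyond `styles` fall back to arabic."""
    parts = []
    for level, n in enumerate(numbers):
        style = styles[level] if level < len(styles) else "1"
        parts.append(format_component(n, style))
    return ".".join(parts)


chapter = format_number((2, 1))                   # arabic throughout
appendix = format_number((1, 3), styles=("A",))   # letter for first part
```

With only the first part given as "A", the third subsection of the
first appendix comes out as "A.3", which is the simpler of the two
behaviors asked about above.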

In any case, the directive as is doesn't do anything with its ``data``
parameter.  If it doesn't accept attributes, it should signal an error
if there *is* any directive data.
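
Roughly what I have in mind, as a sketch only.  The handler below is
hypothetical: it takes just the directive data as a string (the real
directive functions take more parameters), and the dictionary is a
stand-in for the ``pending`` element.

```python
def sectnum_directive(data):
    """Hypothetical handler: return (pending, messages).

    `data` is assumed to be the raw text following ".. sectnum::"
    (an empty string when there is none).
    """
    leftover = data.strip()
    if leftover:
        message = ('Error: the "sectnum" directive takes no data; '
                   'got %r' % leftover)
        # Refuse to number anything rather than silently ignore input.
        return None, [message]
    # Stand-in for the ``pending`` element the real directive inserts.
    return {"transform": "sectnum"}, []


ok = sectnum_directive("")
bad = sectnum_directive(":enum: A")
```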

The second issue is the interaction of "sectnum" with the "contents"
directive.  Try processing a document with both directives; the order
of the directives is significant.  If the "contents" directive comes
before "sectnum", the section numbers are not reflected in the table
of contents.  If "sectnum" comes first, the section numbers do make it
to the table of contents.  I've put examples on the web:

- "contents" first: http://docutils.sourceforge.net/tools/test1.html
- "sectnum" first: http://docutils.sourceforge.net/tools/test2.html

The reason for this is that the "sectnum" directive inserts a
``pending`` element, which is fine.  The "contents" directive does
too.  Both are labeled "last reader", meaning they're both triggered
after the standard reader transforms are finished; a simple scheduling
mechanism.  But within "last reader", it's first come, first served.
Should we leave the behavior dependent on the order of directives, or
choose one and enforce it?  It would be easy to enforce either order
by labeling one of the ``pending`` elements as "first writer" instead
of "last reader".
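
Here's a toy model of that scheduling, to show why relabeling works.
It is only a sketch: the list of stages and their relative order are
my assumption for illustration, and the real mechanism lives in the
reader and writer, not in a sort.

```python
# Assumed stage order, earliest first (illustrative only).
STAGES = ["first reader", "last reader", "first writer", "last writer"]


def run_order(pending):
    """Return transform names in the order they would fire.

    `pending` lists (name, stage) pairs in document order.  Python's
    sort is stable, so entries sharing a stage keep document order:
    first come, first served.
    """
    ranked = sorted(pending, key=lambda item: STAGES.index(item[1]))
    return [name for name, _stage in ranked]


# Both at "last reader": the outcome follows the order of the directives.
same_stage = run_order([("contents", "last reader"),
                        ("sectnum", "last reader")])

# Relabel "contents" as "first writer": "sectnum" now always runs first.
enforced = run_order([("contents", "first writer"),
                      ("sectnum", "last reader")])
```

In the first case "contents" fires before "sectnum" and the table of
contents misses the numbers; in the second, "sectnum" fires first no
matter where the directives appear in the document.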

(Actually, in the test documents there's a small local table of
contents under "Directives", so in test1.html you can see *both*
styles!)

It's clear that the ``pending`` elements need to be documented: what
they're used for and when they're triggered ("first reader", "last
writer", etc.).

Notice the table of contents in the test2.html document.  Currently
it looks something like this::

    * 1. Structural Elements
      - 1.1. Section Title
      - 1.2. Transitions
    * 2. Body Elements
      - 2.1. Paragraphs

I think it should look like this (perhaps with extra indentation)::

    1. Structural Elements
    1.1. Section Title
    1.2. Transitions
    2. Body Elements
    2.1. Paragraphs

An alternative could look like this::

    1. Structural Elements
       1. Section Title
       2. Transitions
    2. Body Elements
       1. Paragraphs

To enable a different type of table of contents, the section titles
should have an extra attribute, such as "auto".
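
For comparison, a small sketch that produces both of the proposed
layouts from the same section tree.  The function, the "full" and
"local" style names, and the (title, subsections) tuples are all
invented for illustration; this is not how the "contents" directive
is written.

```python
def toc_lines(sections, style="full", prefix=(), indent_step=3):
    """Render a table of contents as a list of text lines.

    style="full":  every entry carries its whole number ("1.2.").
    style="local": entries are indented and numbered within their parent.
    """
    if style not in ("full", "local"):
        raise ValueError("unknown style: %r" % (style,))
    lines = []
    for index, (title, children) in enumerate(sections, start=1):
        numbers = prefix + (index,)
        if style == "full":
            label = ".".join(str(n) for n in numbers) + "."
            pad = ""
        else:
            label = "%d." % index
            pad = " " * (indent_step * len(prefix))
        lines.append("%s%s %s" % (pad, label, title))
        lines.extend(toc_lines(children, style, numbers, indent_step))
    return lines


doc = [
    ("Structural Elements", [("Section Title", []), ("Transitions", [])]),
    ("Body Elements", [("Paragraphs", [])]),
]
full = toc_lines(doc, style="full")
local = toc_lines(doc, style="local")
```

Either way the numbers themselves replace the bullets, which is the
point of marking the titles as automatically numbered.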

The third issue has to do with references:

> Also, the auto-generated numbers are not added to the section IDs,
> and only to the title text visible in the document.

I think this is the correct approach.

>> On a related note, in a 2001-07-10 post to Doc-SIG, I wrote:
>> 
>>     I'm also toying with the idea of removing leading numbers from
>>     implicit link names, so a section titled "3. Conclusion" can be
>>     referred to by "Conclusion_" (i.e., without the "3.").
>> 
>> I've added it to the to-do list, but with a "?", so it's low-to-no
>> priority.
> 
> I could fix this too if you like. This should be consistent with
> section auto-numbering, I think - either the section auto-numberer
> should add the numbers to link names, or the numbers specified in
> link names by the user should be removed. The second option looks
> better.

What I meant by the statement quoted was that, given a manually numbered
section title like this::

    1. Introduction
    ===============

Currently (if the parser worked properly), we would only be able to
refer to this section with a reference like this::

    `1. Introduction`_

This is a bit unnatural.  It would be better to be able to refer to
it more simply::

    Introduction_

I think with the "sectnum" directive, we've got the right solution.
It's independent of hyperlinks.  We'd probably have to leave both
reference options open.  Fiddling with the reference names doesn't
feel right to me; too much opportunity for edge-case screw-ups.
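
To show the kind of thing that worries me, here's a sketch of the
number-stripping idea.  The regular expression and the function name
are made up for this message (nothing like it is in the parser), and
the name collision at the end is just one conceivable edge case.

```python
import re

# A leading "1." or "2.1." style number followed by whitespace.
LEADING_NUMBER = re.compile(r"^\d+(?:\.\d+)*\.\s+")


def implicit_name(title):
    """Drop a leading section number from a manually numbered title."""
    return LEADING_NUMBER.sub("", title, count=1)


names = [implicit_name(t) for t in
         ("1. Introduction", "2.1. Overview", "3.1. Overview")]
# Two differently numbered sections now share one reference name.
collision = names[1] == names[2]
```

"Introduction_" works as hoped, but "Overview_" has become ambiguous
even though the two titles were distinct as written.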

Other than that, the directive looks great!

> Please review the code and check it in if you consider it worthy.

Better yet, I'll add you as a developer and let you check it in
yourself.  Let's hash out the issues raised above though.  The
reference issue (third) can be ignored for now.  The table of contents
interaction (second issue) must be dealt with.  And we should at least
decide which attributes could be useful, and which should be tossed
(first issue).

On a related note, the small patch to test/package_unittest.py (which
we discussed off-list) inspired me to finally fix this bug:

    - Fix tests to run standalone.  I.e., allow::
    
          cd test/test_rst
          test_inline_markup.py
    
      Raises an exception with path processing on GNU/Linux (but only
      sometimes???).

I'd been frustrated before because the failure seemed to be random,
not reproducible.  Well, I finally found a way to reproduce it.  Using
code earlier than today's (which no longer has the bug; the 0.2
release does), execute the following shell commands::

    cd docutils
    find . -name '*.pyc' | xargs rm
    cd test
    alltests.py
    cd test_transforms
    for f in test_*.py ; do ls -l $f ; $f ; done

An interaction with the ``inspect`` module causes the test to crash.
``inspect`` is used to get the path of the calling module, and under
conditions that are unclear to me, sometimes the module's path is not
available.  I suspect it has to do with whether the module was
compiled or not, and/or if the module was imported or run as a script.
It remains a bit of a mystery, but the changes appear to have fixed
the tests.  Perhaps Garth Kidd (the original author of the
``package_unittest`` code) can shed some light?
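
For what it's worth, here is the sort of defensive lookup I mean when
I say the module's path may not be available.  This is a sketch only,
not the change that went into the tests; the function name and the
fallback to the current directory are my own assumptions.

```python
import inspect
import os


def calling_module_dir(fallback=None):
    """Best-effort directory of the module that called this function.

    Falls back to `fallback` (or the current directory) when the
    caller's module has no usable ``__file__``, e.g. in an interactive
    session or under ``exec``.
    """
    frame = inspect.currentframe()
    caller = frame.f_back if frame is not None else None
    try:
        filename = caller.f_globals.get("__file__") if caller else None
    finally:
        del frame, caller  # don't keep frames alive longer than needed
    if not filename:
        return fallback or os.getcwd()
    return os.path.dirname(os.path.abspath(filename))


test_dir = calling_module_dir()
```

The point is only that the lookup never assumes the path is there; it
doesn't explain why the path goes missing in the first place.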

> I don't give up on correct list parsing yet - I'll try to solve the
> list parsing problem tomorrow.

I look forward to the results!

And thanks again!

-- 
David Goodger  <goodger@users.sourceforge.net>  Open-source projects:
  - Python Docutils: http://docutils.sourceforge.net/
    (includes reStructuredText: http://docutils.sf.net/rst.html)
  - The Go Tools Project: http://gotools.sourceforge.net/