[Doc-SIG] Cleaning up HTML output (part 1 - 'name' and numerical ids)

David Goodger goodger@users.sourceforge.net
Mon, 08 Jul 2002 21:48:26 -0400


[fantasai]
>>> And numerical ids aren't very user-friendly.

[David Goodger]
>> Docutils actually uses names wherever possible.  I don't know that
>> it could be improved much, but if you do, please let me know.

[fantasai]
> Well, let's take an example from your test.txt output:
> 
>  | <div class="section" id="structural-elements"
>  |  name="structural-elements">
>  | <h1><a href="#id21">Structural Elements</a></h1>
> 
> 'name' is not a valid attribute for <div>, <li>, or most of the
> other elements in HTML. It's used in forms, and it's used in the
> anchor tag.

I see from the HTML spec that that's true.  I guess I implemented
"name" and "id" everywhere as an over-liberal interpretation of the
XHTML 1.0 spec (Appendix C, section 8 "Fragment Identifiers"), where
it says:

    In XML, URIs [RFC2396] that end with fragment identifiers of the
    form "#foo" do not refer to elements with an attribute name="foo";
    rather, they refer to elements with an attribute defined to be of
    type ID, e.g., the id attribute in HTML 4. Many existing HTML
    clients don't support the use of ID-type attributes in this way,
    so identical values may be supplied for both of these attributes
    to ensure maximum forward and backward compatibility (e.g., <a
    id="foo" name="foo">...</a>).

    (http://www.w3.org/TR/2000/REC-xhtml1-20000126#guidelines)

Patches to rectify this and any other oversights/mistakes/bugs are
always welcome.

> The above code should be written one of three ways:
> 
> <div class="section" id="structural-elements">
> <h1>Structural Elements</h1>
...
> The first only works in browsers that support the 'id' attribute for
> targets, but it is a cleaner syntax.

Should we be supporting older browsers?  Or can we write code to the
latest & greatest specs exclusively?

> <div class="section">
> <h1><a id="structural-elements" name="structural-elements">
> Structural Elements</a></h1>
...
> The second is redundant.

But it's better for older browsers.  Also, it would be tricky to
implement, since the ID would have to move from the container (<div>,
a Docutils section) to the header/title.

> <div class="section">
> <h1><a name="structural-elements">Structural Elements</a></h1>
...
> The third is not ideal, but it works in every HTML browser I have
> ever come across.

The 'name' attribute on anchors is also deprecated in the latest specs
(XHTML 1.0).  Will future browsers ever stop supporting it?
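
For side-by-side comparison, the three forms quoted above can be
sketched as one small Python function.  This is only an illustration,
not the Docutils HTML writer: the function name and the "style"
parameter are hypothetical.

```python
from html import escape


def render_section_heading(section_id, title, style="id-only"):
    """Return the opening markup of a section in one of three styles.

    Hypothetical helper for illustration; not part of Docutils.
    """
    safe_id = escape(section_id, quote=True)
    safe_title = escape(title)
    if style == "id-only":
        # First form: the target is the 'id' on the container <div>.
        return (f'<div class="section" id="{safe_id}">\n'
                f'<h1>{safe_title}</h1>')
    if style == "id-and-name":
        # Second form: both attributes on an anchor inside the heading.
        return ('<div class="section">\n'
                f'<h1><a id="{safe_id}" name="{safe_id}">'
                f'{safe_title}</a></h1>')
    if style == "name-only":
        # Third form: a plain named anchor, understood by old browsers.
        return ('<div class="section">\n'
                f'<h1><a name="{safe_id}">{safe_title}</a></h1>')
    raise ValueError(f"unknown style: {style!r}")


print(render_section_heading("structural-elements", "Structural Elements"))
```

Note that in the second and third forms the identifier lives on the
anchor inside the heading rather than on the section container.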

> You use the numerical ids to have headings refer back to their
> respective section entries in the table of contents. I don't see
> that this is a particularly necessary behavior to have--you can just
> link back to the table of contents as a whole, if you really want
> to.

That could be optional behavior, specified either by a command-line
option or as a "contents" directive attribute (or both; perhaps both
would be best).  I'll enter it in the "To Do" list; patches are
welcome.  I modeled the current TOC behavior on GNU HTML documents.

> Ideally, you'd put in a <link> to the table of contents in the
> <head>:
>     <link rel="toc" href="#table-of-contents"
>      title="Table of Contents">
> 
> and leave it at that.

I don't understand how that is supposed to work.  Could you supply an
example or a reference?

> Unfortunately, most browsers don't support <link>.

So perhaps it's a non-issue?

> If linking back to the corresponding toc entry is important to you,
> then identify the entries with the section's id preceded by 'toc-'.
> For example:
> 
>   <li id="toc-structural-elements"><a href="#structural-elements">
>       Structural Elements</a></li>
> 
> or
> 
>   <li><a href="#structural-elements" name="toc-structural-elements">
>       Structural Elements</a></li>

Since I normally don't read the raw HTML, and it's not *intended* to
be read in raw form, I don't think the form of the IDs is that
important.  If you do... patches are welcome.
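
As a starting point for such a patch, here is a minimal sketch of the
'toc-' naming scheme suggested above, written as standalone Python
rather than against the actual Docutils writer.  The names TOC_PREFIX,
toc_entry and section_heading are hypothetical.

```python
from html import escape

TOC_PREFIX = "toc-"  # assumed prefix, taken from the suggestion above


def toc_entry(section_id, title):
    """Render a contents entry whose id is derived from the section id."""
    sid = escape(section_id, quote=True)
    return (f'<li id="{TOC_PREFIX}{sid}">'
            f'<a href="#{sid}">{escape(title)}</a></li>')


def section_heading(section_id, title):
    """Render a section whose heading links back to its contents entry."""
    sid = escape(section_id, quote=True)
    return (f'<div class="section" id="{sid}">\n'
            f'<h1><a href="#{TOC_PREFIX}{sid}">{escape(title)}</a></h1>')


print(toc_entry("structural-elements", "Structural Elements"))
print(section_heading("structural-elements", "Structural Elements"))
```

Because both identifiers are derived from the section name, the
back-links need no separate numerical ids.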

-- 
David Goodger  <goodger@users.sourceforge.net>  Open-source projects:
  - Python Docutils: http://docutils.sourceforge.net/
    (includes reStructuredText: http://docutils.sf.net/rst.html)
  - The Go Tools Project: http://gotools.sourceforge.net/