[Doc-SIG] Approaches to structuring module documentation

Manuel Gutierrez Algaba <irmina@ctv.es>
Fri, 12 Nov 1999 17:12:17 +0000 (GMT)


A heavily technical document!! Far from my usual "way of thinking".

On Thu, 11 Nov 1999, Fred L. Drake, Jr. wrote:

> 
>   DOCUMENT-ORIENTED CONTENT:  Documents which are structured similarly

Is this the LaTeX one, or the "traditional" XML?

> 
>   DOCUMENT-CENTRIC APPROACH: The human-read document is the primary

Is this TeEncontreX's approach? Is the "module reference material" the
"\indexpython" things?

>   MICRODOCUMENT APPROACH:  Multiple DTDs are used to encode
> document-level information and module reference material.  Let's only

What's this ?

> 
> Document-centric Approach
> -------------------------
<description of things closely related to TeEncontreX, I think>

> 
> Microdocument Approach
> ---------------------- 
>   Using a separate DTD to document modules offers advantages when it
> comes time to extract information programmatically.  Creating skeleton 
> module references from the current documentation would be harder and
> would certainly require more code to be written, but the payoffs are
> potentially very high. 

In short: "A lot of work coding _details_". Just a comment:
Python is **much** better than C++, for example, because you
don't need to declare every type and every detail. You can even
have large parts of a Python program broken, parts that a C++
compiler would flag as errors, and still run the rest.

> To really make it work, a lot of attention
> would have to be applied to the result of the first-stage conversion
> to check the accuracy of the results, make the various bits of text
> actually land in the right place (since everything is pretty much
> thrown together now), and encode a lot of additional information about 
> types, parameters, exceptions thrown, etc. 

More heavy work !

> On the other hand, getting 
> this information into the documents in the document-centric approach
> also requires a lot of this work.

But perhaps in a freer way. Let's say that the document-centric approach
(at least TeEncontreX)
seems more robust for mark-up. Anyone may mark things wrongly, but you can
readjust, or define similarities among different markings.
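
One way to "define similarities among different markings" is a small alias
table that folds variant or mistyped index keys onto one canonical key. This
is only a sketch in present-day Python; the table contents and the name
`canonical` are assumptions made for illustration, not part of TeEncontreX.

```python
# Hypothetical alias table: variant or mistyped key -> canonical index key.
ALIASES = {
    "mdmf": "MMDF",
    "mmdf": "MMDF",
    "mail dir": "Maildir",
    "maildir": "Maildir",
}


def canonical(key):
    """Return the canonical form of an index key, or the key itself."""
    cleaned = key.strip()
    return ALIASES.get(cleaned.lower(), cleaned)


print(canonical("MDMF"))
print(canonical("Babyl"))
```

Keys that are not in the table pass through unchanged, so wrong markings can
be repaired later by adding one line to the table.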

> Comparison
> ----------
>   That's a 200% increase in line count and a 150% increase in file
> size.  The later isn't much of an issue, but the former is because it
> seriously impacts readability.
>   This explosion of markup is of most concern for authors; a lot of
> markup is required to encode enough information to justify changing
> the approach.  As more markup is required, it is increasingly
> difficult to get contributions because it takes the authors more time
> to document their work. 

The biggest problem I see here is that either you get very good documentation
(thanks to the huge amount of work) or you get nothing (the author
doesn't document at all).

It'd be wise to provide several levels of mark-up, so people
can mark up little by little, the important things first, and so on.

This is the "TeEncontreX" version of the mailbox documentation; it should
work if you have AnalizaToo.py:

\newcommand{\indexmoduleinbox}{\index{module}\index{mail}\index{inbox}}
\newcommand{\indexshortdescription}{\index{description}}
\newcommand{\indexdescription}{\index{description}}
\newcommand{\indexreturnvalue}{\index{returnvalue}\index{protocol}}
\newcommand{\indexclassdefinition}{\index{classdefinition}\index{protocol}}
\newcommand{\indexMmdfMailbox}{\index{MmdfMailbox}\index{MMDF}}
\newcommand{\indexMHMailbox}{\index{MHMailbox}\index{MH}}
\newcommand{\indexMaildir}{\index{Mail}\index{dir}}
\newcommand{\indexBabylMailbox}{\index{Babyl}\index{BabylMailbox}}
\newcommand{\indexMMDF}{\index{MMDF}}

\jiji
mailbox
Read various mailbox formats.

\indexmoduleinbox \indexname \indexshortdescription
\jiji
This module defines a number of classes that allow easy and
 uniform access to mail messages in a mailbox.  Most of the
 supported mailbox formats come from the Unix world.

None of the classes defined in this module lock the
 mailboxes that are accessed; this needs to be handled by
 application code.

\indexdescription \indexmoduleinbox
\jiji
 The next message in the mailbox.  The message's 
 ("rfc822.Message") fp will be a  file object, but not a real
 file object.  If no messages have been  read, this will
 be the first message.  If all messages have
 been read, None will be returned.

\indexmoduleinbox  \indexreturnvalue  \indexrfc822
\jiji
UnixMailbox

Access a classic Unix-style mailbox, where all messages are
 contained in a single file and separated by "From name
 time" lines.

The file object fp points to the mailbox file.

Initialize the mailbox object and point to the first
message in the mailbox.

\indexmoduleinbox \indexUnixMailBox \indexclassdefinition
\jiji
MmdfMailbox

Access an MMDF-style mailbox, where all
 messages are contained in a single file and separated by lines
 consisting of four control-A characters.

 The file object fp points to the mailbox file.

 Initialize the mailbox object and point to the first
 message in the mailbox.


\indexmoduleinbox \indexMmdfMailbox \indexclassdefinition
\jiji
MHMailbox

 Access an MH mailbox, a directory with
 each message in a separate file with a numeric name.  Messages
 that are added to the mailbox after the instance is created
 are not accessible; a new instance is needed to access newly
 added messages.

\indexmoduleinbox \indexMHMailbox \indexclassdefinition
\jiji
Maildir

Access a Qmail mail directory.  All new and current mail
 for the mailbox is made available.  Messages that are added to
 the mailbox after the instance is created are not accessible;
 a new instance is needed to access newly added messages.

The name of the mailbox directory.

Initialize the list of messages that can be loaded from
 the mailbox.
The dirname parameter points to the mailbox directory.

\indexmoduleinbox  \indexMaildir \indexclassdefinition
\jiji
BabylMailbox

Access a Babyl mailbox, which is similar to an
MMDF mailbox.  Mail messages start with a
 line containing only <literal>'*** EOOH ***'</literal> and end 
 with a line containing only <literal>'\037\014'</literal>.

 A file object fp that  points to the mailbox file.

Initialize the mailbox object and point to the first
 message in the mailbox.

\indexmoduleinbox  \indexBabylMailbox \indexclassdefinition \indexMMDF
\jiji
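
To make the layout above concrete, here is a minimal sketch in present-day
Python of how such a file could be read: collect the \newcommand definitions,
split the text on \jiji lines, and attach to each chunk the keywords its
index macros expand to. This is not AnalizaToo.py; the names `parse_macros`
and `split_chunks` and the fallback for undefined macros are assumptions.

```python
import re

# One definition per line: \newcommand{\indexFoo}{\index{a}\index{b}}
NEWCOMMAND = re.compile(r"\\newcommand\{\\(index\w+)\}\{(.*)\}\s*$")
INDEX_KEY = re.compile(r"\\index\{([^{}]*)\}")
MACRO_USE = re.compile(r"\\(index\w+)")


def parse_macros(text):
    """Map each index macro name to the list of keywords it expands to."""
    macros = {}
    for line in text.splitlines():
        match = NEWCOMMAND.match(line.strip())
        if match:
            macros[match.group(1)] = INDEX_KEY.findall(match.group(2))
    return macros


def split_chunks(text, macros):
    """Split on lines holding only the jiji delimiter.

    Returns a list of (body, keywords) pairs, one per non-empty chunk.
    """
    chunks = []
    for raw in re.split(r"(?m)^\\jiji[ \t]*$", text):
        body_lines, keywords = [], []
        for line in raw.splitlines():
            stripped = line.lstrip()
            if stripped.startswith("\\newcommand"):
                continue  # definitions are not part of any chunk
            if stripped.startswith("\\index"):
                for name in MACRO_USE.findall(line):
                    # Undefined macros fall back to their name minus "index".
                    keywords.extend(macros.get(name, [name[len("index"):]]))
            else:
                body_lines.append(line)
        body = "\n".join(body_lines).strip()
        if body:
            chunks.append((body, keywords))
    return chunks


SAMPLE = r"""
\newcommand{\indexmoduleinbox}{\index{module}\index{mail}\index{inbox}}
\jiji
mailbox
Read various mailbox formats.

\indexmoduleinbox \indexname
\jiji
"""

macros = parse_macros(SAMPLE)
chunks = split_chunks(SAMPLE, macros)
for body, keywords in chunks:
    print(body.splitlines()[0], keywords)
```

The fallback matters because the sample uses some macros, such as \indexname,
that have no \newcommand in this message.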

Just some comments:
- Thinking about it, I mentioned the need for an apropos utility
a year ago. If you think about it, this IS the apropos utility!!
- If one name fits TeEncontreX, it'd be: pico-documentation.
Every chunk of info, delimited by \jiji, is independent of
the rest of the universe, and you can keep the chunks in different files.
Their only links to the world are the \newcommand definitions.
Every chunk of info is very, very small, although it could be
very big. You have absolute freedom in size, and you can
refine the info as much (or as little) as you want.
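
The apropos idea can be sketched as a keyword search over chunks. The code
below is present-day Python and assumes each chunk is already available as a
(body, keywords) pair; the sample data and the name `apropos` are made up for
illustration.

```python
def apropos(chunks, word):
    """Return the bodies of chunks whose keywords or text mention word."""
    needle = word.lower()
    hits = []
    for body, keywords in chunks:
        in_keywords = needle in (key.lower() for key in keywords)
        if in_keywords or needle in body.lower():
            hits.append(body)
    return hits


# Hypothetical, hand-built chunks: (body text, index keywords).
CHUNKS = [
    ("MmdfMailbox\n\nAccess an MMDF-style mailbox.",
     ["module", "mail", "inbox", "MmdfMailbox", "MMDF"]),
    ("Maildir\n\nAccess a Qmail mail directory.",
     ["module", "mail", "inbox", "Mail", "dir"]),
]

for body in apropos(CHUNKS, "mmdf"):
    print(body.splitlines()[0])
```

Because every chunk is independent, the list can be built from any number of
files and searched in one pass.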

(VERY IMPORTANT POINT):
- Because every chunk is tiny, you can analyze typical
chunks to infer the "obvious" marking:

\jiji
BabylMailbox ( this would be marked as name)

(this would be marked as description)
Access a Babyl mailbox, which is similar to an
MMDF mailbox.  Mail messages start with a
 line containing only <literal>'*** EOOH ***'</literal> and end
 with a line containing only <literal>'\037\014'</literal>.

(these as params)
 A file object fp that  points to the mailbox file.

Initialize the mailbox object and point to the first
 message in the mailbox.
...
Or you can simply use "positional" marking within these very
small chunks.
You can have a bunch of small Python programs for intelligent
analysis of typical chunks; because the chunks hold so little, I guess
the programs would be easy to write.
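
A minimal sketch of positional marking in present-day Python, assuming the
convention suggested above: a one-word first paragraph is the name, the next
paragraph is the description, and whatever follows is treated as parameters.
The function name and the returned field names are hypothetical.

```python
import re


def mark_by_position(chunk_text):
    """Guess name, description and params from paragraph order alone."""
    paragraphs = [
        " ".join(part.split())  # collapse line breaks and extra spaces
        for part in re.split(r"\n\s*\n", chunk_text.strip())
        if part.strip()
    ]
    marked = {"name": None, "description": None, "params": []}
    # A single-word first paragraph is taken to be the name.
    if paragraphs and len(paragraphs[0].split()) == 1:
        marked["name"] = paragraphs.pop(0)
    if paragraphs:
        marked["description"] = paragraphs.pop(0)
    marked["params"] = paragraphs
    return marked


CHUNK = """BabylMailbox

Access a Babyl mailbox, which is similar to an
MMDF mailbox.

 A file object fp that  points to the mailbox file.

Initialize the mailbox object and point to the first
 message in the mailbox.
"""

result = mark_by_position(CHUNK)
print(result["name"])
print(result["description"])
```

A chunk with no one-word first paragraph simply gets no name, so the guess
degrades gracefully instead of mislabelling the description.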

(VERY IMPORTANT POINT):
- As they're very small, you can include them in docstrings, or simply
as comments (anywhere).
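
As a rough illustration of the docstring case, the sketch below uses the
standard `ast` module from present-day Python to pull \jiji-delimited chunks
out of the docstrings in a source file. It covers docstrings only, not
comments, and the function name `chunks_from_docstrings` is an assumption.

```python
import ast

DOC_OWNERS = (ast.Module, ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef)


def chunks_from_docstrings(source):
    """Collect jiji-delimited chunks from every docstring in source."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, DOC_OWNERS):
            continue
        doc = ast.get_docstring(node)
        if doc and "\\jiji" in doc:
            for piece in doc.split("\\jiji"):
                if piece.strip():
                    found.append(piece.strip())
    return found


# Hypothetical module source; the docstring is raw so the backslashes survive.
SOURCE = r'''
class UnixMailbox:
    r"""
    \jiji
    UnixMailbox

    Access a classic Unix-style mailbox.
    \jiji
    """
'''

print(chunks_from_docstrings(SOURCE))
```

Docstrings without a \jiji marker are ignored, so ordinary documentation and
chunk-style documentation can live side by side.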

(DEFINITE POINT):
- Using a mix of chunks of data and low-intelligence Python scripts
that decide on structure, position, "hidden marks", and so on,
you can generate XML. So XML can be considered
a low-level form of these chunks of info.
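
A sketch of that last step in present-day Python, using the standard
`xml.etree.ElementTree` module. The element names (chunk, name, description,
param, index) are invented for the example; they do not come from any DTD
discussed in this thread.

```python
import xml.etree.ElementTree as ET


def chunk_to_xml(name, description, params, keywords):
    """Serialize one already-marked chunk as a small XML fragment."""
    chunk = ET.Element("chunk")
    ET.SubElement(chunk, "name").text = name
    ET.SubElement(chunk, "description").text = description
    for param in params:
        ET.SubElement(chunk, "param").text = param
    for key in keywords:
        ET.SubElement(chunk, "index", {"key": key})
    return ET.tostring(chunk, encoding="unicode")


xml_text = chunk_to_xml(
    "Maildir",
    "Access a Qmail mail directory.",
    ["The name of the mailbox directory."],
    ["Mail", "dir"],
)
print(xml_text)
```

The XML is derived output here: the chunk stays the thing people edit, and the
script can be rerun whenever the target structure changes.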

(Object-oriented point):
- Chunks are nothing less than objects, with info and the processes
related to it. Do you like objects? Or do you like the
Pascal-like syntax of XML?  Down with XML!!!
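
Read literally, a chunk as an object could look like the sketch below, in
present-day Python. The class name, its fields and its two methods are
assumptions chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Chunk:
    """One piece of documentation plus the index keys attached to it."""
    body: str
    keywords: list = field(default_factory=list)

    def title(self):
        """Return the first non-empty line of the body."""
        for line in self.body.splitlines():
            if line.strip():
                return line.strip()
        return ""

    def matches(self, word):
        """True if word is one of this chunk's index keys (case-insensitive)."""
        return word.lower() in (key.lower() for key in self.keywords)


maildir = Chunk("Maildir\n\nAccess a Qmail mail directory.", ["Mail", "dir"])
print(maildir.title(), maildir.matches("mail"))
```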

Regards/Saludos
Manolo
-------------
My addresses / mis direcciones: 
a="www.ctv.es/USERS/irmina"
b=[("Lritaunas Peki Project", ""),
   ("Spanish users of LaTeX(en Espanyol)", "/pyttex.htm" ),
   ("page of drawing utility for tex ", "/texpython.htm" ),
   ("CrossWordsLand","/cruo/cruo.html")
   ]
for i in b:
  print i[0],":", a+i[1]

  You have to run as fast as you can just to stay where you are. If you want to get anywhere, you'll have to run much faster. -- Lewis Carroll