[DOC-SIG] Library reference manual debate

Fred L. Drake, Jr. <fdrake@acm.org>
Fri, 21 Nov 1997 09:56:48 -0500


(In this post I quote from a message John sent directly to me; he gave 
permission to quote this message to this forum.  I quote more than I
normally would since the original message wasn't sent to the doc-sig
list.)

I wrote:
 > >  I think it's possible to support multiple submission formats, but
 > >only if clearly repeatable conversions to the canonical format are
 > >possible in an automated fashion.  

John Skaller wrote:
 >         There doesn't have to be a "canonical" format.
 > (Although it may be useful)

  I disagree.  To allow maintaining the documentation, we need to have 
a common format which provides an interlingua for conversion.  This
does not require that the submission format(s) are the same as the
canonical format.

I wrote:
 > >  Who doesn't have access to at least the free tutorial?  It's
 > >available in several formats.

John replied:
 >         I agree. And it is a good document. 
 > 
 >         But it is not enough. 
 > 
 >         I am continually seeking information -- being
 > a Python newbie -- and I'm having a LOT of trouble finding it.

  This is a problem.  If you can tell us what you're looking for and
how you went about the search (esp. what you tried first), we can see
about improving the situation.  But that's hard to do without the
feedback, especially for those of us who've been at it a few years.

I asked:
 > >  Are you asking that documents get added as they are submitted?  

And John replied:
 >         Yes. Immediately. Naturally, they should be
 > classified "unmoderated" or whatever. 
 > 
 >         If this is NOT done, I will be greatly discouraged
 > from submitting articles. 

  This is an interesting approach.  It looks like what's needed is a
"knowledgebase" in addition to the standard distribution.  I think for 
now we've been looking only at dealing with the Library Reference, but 
your comment introduces a couple of aspects that should be addressed:

1.  Providing revisions to the Library Reference as they become
    available.  I think this is a good idea, though I'm not sure
    whether "immediately" needs to mean immediately or just "within a
    short period of time"; as far as python.org and the Doc-SIG are
    concerned, we're all volunteers.

    I think that as sections are checked and placed/replaced in the
    Library Reference, the online HTML can be updated and the
    distribution archives can be updated on a periodic basis (monthly,
    perhaps?).  This is a good reason to provide a separation between
    the documentation and source/library archives.

    The conversions from canonical format to distribution formats must 
    be completely automated for this to be feasible, or the cost in
    person-hours is too high.  Tarball & Zipball(?) creation must also
    be automated, but that's trivial once the conversions are
    automated.

2.  An online knowledgebase of HOW-TO articles, FAQs, and the like
    needs to be available.  This could be updated in a more continuous 
    fashion, with distribution packages produced in a similar way to
    the primary documentation packages.

    This probably lends itself to a simpler input format, perhaps
    allowing HTML and structured text as inputs, with conversion to an 
    internal format done behind the scenes.
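
The automated tarball and zipball step mentioned in item 1 could look
roughly like the following.  This is only a minimal sketch: the
function name build_archives, the directory layout, and the "lib-ref"
base name are assumptions made for illustration, not part of any
existing python.org tooling, and it assumes the conversion step has
already written the distribution files into a directory.

```python
import tarfile
import zipfile
from pathlib import Path


def build_archives(doc_dir, out_dir, base_name):
    """Pack every file under doc_dir into a .tar.gz and a .zip archive.

    All names here are hypothetical; doc_dir is assumed to hold the
    output of an earlier, already automated conversion step.
    """
    doc_dir = Path(doc_dir)
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    # Sort so that the archive contents come out in a repeatable order.
    files = sorted(p for p in doc_dir.rglob("*") if p.is_file())

    tar_path = out_dir / (base_name + ".tar.gz")
    zip_path = out_dir / (base_name + ".zip")

    with tarfile.open(tar_path, "w:gz") as tar:
        for path in files:
            # Store each file under a top-level directory named base_name.
            arcname = (Path(base_name) / path.relative_to(doc_dir)).as_posix()
            tar.add(path, arcname=arcname)

    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in files:
            arcname = (Path(base_name) / path.relative_to(doc_dir)).as_posix()
            zf.write(path, arcname=arcname)

    return tar_path, zip_path
```

A periodic job could then call something like
build_archives("html", "dist", "lib-ref") after each conversion run.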

I said:
 > >That may be 
 > >possible with a shared submission format, but only if the submitted
 > >documents can be verified.  So far, SGML is the only format which
 > >allows this.  

John said:
 >         HTML is verifiable, isn't it?

  Yes.  It is an SGML application.  The problem with using HTML as the 
canonical format for the Library Reference is that it is
insufficiently structured.  While HTML 4.0 might allow the structure to 
be imposed using CLASS=<???> attributes all over the place, that would 
require custom verification software to be written; this should be
avoided if at all possible.
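
To give a sense of what "custom verification software" would involve,
here is a minimal sketch of a checker for CLASS attributes.  The set
of allowed class names is purely hypothetical, and a real checker
would also have to verify nesting and required elements, which this
one does not attempt.

```python
from html.parser import HTMLParser

# Hypothetical vocabulary of structural classes; not from any real DTD.
ALLOWED_CLASSES = {"module", "function", "exception", "see-also"}


class ClassChecker(HTMLParser):
    """Collect CLASS attribute values outside the allowed vocabulary."""

    def __init__(self):
        super().__init__()
        self.errors = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        for name, value in attrs:
            if name != "class":
                continue
            for cls in (value or "").split():
                if cls not in ALLOWED_CLASSES:
                    line, _col = self.getpos()
                    self.errors.append((line, tag, cls))


def check_classes(html_text):
    """Return a list of (line, tag, class) tuples for unknown classes."""
    checker = ClassChecker()
    checker.feed(html_text)
    checker.close()
    return checker.errors
```

Even this small amount of code only checks the vocabulary, which is
part of why relying on an SGML parser and a DTD is attractive.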

 > >  Aside from the technical issue, there are other reasons not to
 > >publish documents which have not been checked by a person.
 > 
 >         Yes, but there are levels of checking.

  I wasn't referring so much to format checking (which should be done
in software whenever possible), or to accuracy checking (which would
be nice, but can be handled by responding to bug reports).  I was
thinking more of the malicious user (not a Python user, certainly!)
posting something obscene or not related to Python.  This sort of
thing is not something that software can effectively check for and
*must* be checked before a document can be made available on
python.org.  No matter what disclaimers are in place, a presentation
including such garbage would do nothing but damage Python's
reputation.  This is something which must be guarded against very
carefully, and this can (at this time) only be done by human
inspection of documents.

  It sounds as if there's a lot of work ahead.  Does anyone know of
existing "knowledgebase" systems that allow document updates and
interdocument linking, at least in the form of See-Also's?  I think it 
would be nice to have something better than dejanews searches.

  My expectation is that the Library Reference project needs to be
done first, primarily because the problem is better understood and we
have more concrete notions about what should be done about it.
Discussion on these other ideas doesn't need to wait, however.  Ideas, 
anyone?  ;-)


  -Fred

--
Fred L. Drake, Jr.
fdrake@cnri.reston.va.us
Corporation for National Research Initiatives
1895 Preston White Drive
Reston, VA    20191-5434
