[Doc-SIG] Evolution of library documentation

Mon, 12 Mar 2001 13:35:23 -0000

(do we still need to spam everyone individually? hmm - just once more)

Ka-Ping Yee wrote, in response to my earlier message (I'm thus the bit
double-chevroned):
> > Hmm. I've had this argument before.
>
> Okay.  Well, i think it's good to have this particular debate.  It's
> worth discussing, so please bear with me as i argue it out with you.

No, debates are good - in particular, in the context of Doc-SIG, I've
changed my mind at least once because of debate (and significantly so,
since I was initially vehemently against ST).

And I understand that you're trying to improve things (I just worry that
it will turn into improoooovement) - if I get too argumentative below,
please understand that I tend to talk too loud when excited (and
documentation issues have been getting me worried/excited for well-on 20
years now, I'm afraid).

> > Not *necessarily* a bad goal (although I would point out it's
> > *significantly* easier to "have TeX" than, for instance, to
> > "have CVS",
>
> This really surprised me.  CVS is installed by default, i believe,
> on all modern Linux distributions, and i have yet to install TeX,
> which is much bigger and more complex.  How is it that you perceive
> the opposite?

Well. CVS is not present by default on Windows, and it is not present at
all where I work (from where I type). We use RCS (with in-house
jacketing - don't ask Eddie!) for Unix software, and a Microsoft thingy
for NT work, at the moment. Moreover, I can't just install new software
on a machine 'cos I want to. At home, whilst I have Linux, I have a
crappy modem connection (so CVS is a poor choice 'cos it assumes good
connectivity), and also when I briefly tried installing CVS (my Debian
setup did *not* initially have it installed, 'cos I didn't ask for it),
I cocked up what I asked it to do, I *think*.

Regardless, I don't have the *time* to learn CVS, or the time to install
it at home - those hours could be spent doing other things (and hours
are *very* precious - yes, I know I keep harping on about that, sorry,
but it's true).

TeX, now. <doddery_old_man_voice="I remember the days when..." />

You download the package (on Linux you can probably omit this step, he
says, throwing back the comment about what is already present on
"modern" distributions). You tell it to install itself. It uses up lots
of space on your system. You run the appropriate thingy on the files.
You use dvi2ps or whatever and presto bongo.

As I understand it, nowadays *installing* TeX and friends on a PC is a
doddle, and it's pretty easy on Unix - one probably doesn't even need to
compile stuff. And *using* TeX and friends is fairly easy too.
*Understanding* what is going on when something goes wrong may be
another matter, but I've already been there and done that, so the
learning curve is pretty flat.

So basically we've got a "my package is harder to install than your
package" argument here, and I expect we'll both lose to someone using
something less popular. The advantage TeX and friends have is that
they're designed (nowadays) to be installed relatively easily by people
who are not system admin types, and don't want to be.

> > > This would address the duplication problem and
> > > also keep all of a module's documentation in one place
> > > together with the module.
> >
> > Now, if you said "package" I'd be happy, but since it's
> > "module", I'll gripe.
>
> But the library reference manual is arranged by module, and there
> is a chapter of documentation on each individual module.  It also
> makes sense since the modules are the organizational units that you
> import and name in your code.

Accident of history, that, surely? We didn't use to have packages, so
all of the existing documentation more or less had to be by the module.
Now that packages are around, that constraint no longer applies, and
indeed we begin to get documentation for the XML package and so on (and
if there *isn't* grand scope documentation for these, then that's the
fault of normal lack-of-volunteer-itis, surely?).

Hmm. "the duplication problem". Eddie notwithstanding, I'm not convinced
it always *is* a problem. I don't always *want* my documentation to
reflect truth-in-implementation - sometimes the documentation is
deliberately behind (or even ahead of) the code.

> > Aagh! No, sorry, my problem wouldn't be with paging
> > (although that *is* a problem ...
>
> I do think the inconvenience is mitigated by putting the docs at the
> end -- but i acknowledge that having bigger files is a concern.
> I don't see this as a 100% win myself -- it just seems that keeping
> the code and docs in the same file has advantages large enough to
> outweigh the inconvenience.

I suspect that we have two major differences, and this is one. I believe
that putting the "full" documentation at the end of the file is bad, and
not a win - I don't believe that this "same file" idea is either a win,
or even a Good Idea, particularly. But I said that. Philosophically, I
*want* to be able to point to a different file and say "that's
documentation".

(hmm - on the type-SIG some while back all sorts of people seemed happy
with a separate interface file (mind you, *there* I disagreed, for the
same reason I wouldn't like to put the *docstrings* in a separate
file).)

Incidentally, for a *package*, where do you stand? Are you more willing
for a separate file there, or do you want one file to be magically
decided on as "special" and to have the documentation therein?

> > Tutorial, reference and other "grander scope" documentation
> > relates to the source code as a whole.
>
> Can you delineate clearly what you consider "grander scope"
> documentation as opposed to "point" documentation on a particular
> module? I'd like to better understand what you mean by "different"
> in the sense of different enough that something should be in a
> separate file.

Well, at the moment we have the language reference manual, the tutorial
and the library reference manual. Oh, and HOW-TOs. All overlap each
other (heh, so they should share common source!!! - erm, no).

I would hope that in future, for a package we might have the following:

* docstrings (at least) - this serves source code readers/IDEs, and
*may* provide input for other things in the absence of anything else.
But people like me are going to write stuff in docstrings you don't
*want* in other documents. Regardless.
* for a "standard" module/package, its entry in the library manual
* for a non-standard module/package, equivalent (one hopes!)
* for many packages (and some modules), a HOW-TO document - regular
expressions are an example here, where AMK has written such.
* for some packages, a tutorial document, perhaps a subsection *for* the
tutorial

The docstrings I termed "point" documentation because each docstring
refers to a particular point in the source code - a particular object or
whatever (heck, pydoc uses this to find them!).
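
To make the "point" idea concrete, here is a minimal sketch in
present-day Python. The Stack class and the point_docs helper are made
up for illustration; only inspect.getdoc is a real library call. It
shows that each docstring is reachable only through the object it
documents, which is the sense in which it sits at a "point" in the code.

```python
import inspect


class Stack:
    """A last-in, first-out container (hypothetical example class)."""

    def __init__(self):
        self._items = []

    def push(self, item):
        """Add item to the top of the stack."""
        self._items.append(item)

    def pop(self):
        """Remove and return the top item."""
        return self._items.pop()


def point_docs(cls):
    """Map the class and each documented public method to its docstring."""
    docs = {cls.__name__: inspect.getdoc(cls)}
    for name, member in vars(cls).items():
        if name.startswith("_"):
            continue  # skip private and special names such as __init__
        if inspect.isfunction(member) and member.__doc__:
            docs[cls.__name__ + "." + name] = inspect.getdoc(member)
    return docs
```

Calling point_docs(Stack) gives one entry per documented object and
nothing that describes the class's wider context, which is the gap the
"grander scope" documents would have to fill.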

The "grander scope" documentation is anything that looks at a package as
a whole. Such things should be written in a different mood, and quite
possibly (if one can) by different people (as the HOW-TOs are frequently
written by someone who didn't write the code). I mean, would you want
*me* to write tutorial user documentation for STpy? Wouldn't it tend to
be a bit too long?

Anyway, back to the point. If a module *does* have all of those, which
one do you choose to put at the back of the source file? And if it only
has one, what happens when it gains another?

> > Also, *because* one might have more than one sort of "grander scope"
> > documentation for a module/package, you will have to consider
> > *supporting* more than one.
>
> Could you give an example?

Hmm. distutils is one. docutils/STpy/pydoc/whatever will be another
(surely they deserve integrated documentation for *some* purposes, and
docutils alone already has at least two documentation files, close on
three, all for different purposes). HOW-TOs are another, as a class (and
*they* sometimes span packages, even - a "string" HOW-TO will need to
talk about Unicode and string.py and maybe buffer.py or whatever it is
called, and so on).

> By Guido-of-the-markup-languages did you mean "benevolent dictator" or
> "good designer" or "long-term keeper of the faith" or something else?

Oh, sorry, I meant the "good designer" sense (although the others are
needed after that, but they weren't what I meant).

> Although it may seem surprising, i don't immediately conclude that
> there are so many tags that we can't possibly design a reasonably
> useful markup syntax.  Many of the tags are redundant or produce
> shades of meaning finer than i consider really necessary.

Hmm. Unfortunately, losing shades of meaning early on means you can
never regain them, and one person's shade of meaning is another's
heartfelt "but they're not the same".

But this, I think, is the second place of disagreement (the "new markup
language" disease). Strangely enough, I worry less about this than the
other one - I trust Fred and co. to ensure that we can *produce*
documentation for printing out that is worth using, and I've used too
many different methods of markup to worry overmuch about what good or
rubbish system I'm required to use - it's unlikely to be as bad as DSR
(Digital Standard Runoff). Although if we *want* to do proper markup, I
wish I could convince people that you actually *do need a proper markup
language* (heck, we already all know that if you want to do proper
programming you need a proper programming language, so why is this so
hard to convey?)

Interestingly, I've seen this game (of reducing tags because they're
"not needed") played out in an entirely different arena - Great
Britain's national mapping agency (OS(GB) - that is, Ordnance
Survey(GB) - Northern Ireland has its own) reduced the number of feature
codes they use to distinguish map objects drastically some years back,
mainly to enable cost effective digitising of the non-digital map base
(and they *did* have rather an over-presence of railway related codes -
shows who *used* to be important in the country!). I still believe
that's going to cause them grief (trans. "money"), and in the not so
distant future either - but that's related to work...

> I'm not claiming it's possible until i really give it a try, but
> i do think it's worth a serious attempt.

Please do bear in mind the "philosophy" of ST then, whilst trying.

Hmm - best stop fiddling with this, and send it - my tummy is
rumbling...

[[[Last thought:

I wonder if the ability of pydoc to reproduce (something that *looks*
like) the library documentation for some (maybe even many) modules -
case in point, string - is an influence on your stance?

My second thought on that is, beware of thinking that the appearance is
all. To those who love/want markup, the appearance may be important, but
the *meaning* is what they want to be able to extract from the text.

My first thought is that, actually, I *don't* particularly have an
objection to a module's library documentation being generated directly
from its docstrings if

 a. that's all there is,
 b. Fred et al say that is sufficient
   (which I suspect they'd rather not, if they want
    better markup, but I'll let them decide), and
 c. no one volunteers to write more (more is generally
    not a bad thing, mind, even for the string module).
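
As a rough sketch of what "generated directly from its docstrings"
could look like, using present-day pydoc: the shout function and the
reference_entry wrapper are hypothetical, while pydoc.render_doc and
pydoc.plaintext are real. This is only an illustration of the
mechanism, not a claim about how the library manual is or should be
built.

```python
import pydoc


def shout(text):
    """Return text in upper case with a trailing exclamation mark."""
    return text.upper() + "!"


def reference_entry(obj):
    """Render a plain-text reference entry for obj from its docstring alone."""
    # pydoc.plaintext avoids the overstrike "bold" used for terminal pagers.
    return pydoc.render_doc(obj, renderer=pydoc.plaintext)


entry = reference_entry(shout)
```

Printing entry shows the signature followed by the docstring text, and
nothing else: whatever the docstring omits, the generated entry omits
too.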

Hmm. Must stop thinking and eat.]]]

Tibs

--
Tony J Ibbs (Tibs)      http://www.tibsnjoan.co.uk/
Give a pedant an inch and they'll take 25.4mm
(once they've established you're talking a post-1959 inch, of course)
My views! Mine! Mine! (Unless Laser-Scan ask nicely to borrow them.)