[Doc-SIG] Cross-reference proposal

Tony J Ibbs (Tibs) tony@lsl.co.uk
Thu, 10 Feb 2000 12:30:39 -0000


Ka-Ping Yee reiterated a list of possible things to autodetect:
>     1. dotted.references to classes or functions in other modules
>     2. dotted.references to class methods or attributes
>     3. references() to class methods or functions
>     4. references to class names in the local module
>     5. references to argument names in function and method docstrings

> The question to ask, then, is: Of these cases, for which are there
> likely to exist situations where the cross-reference is misleading?

(He then guesses my answers, but I'm going to ignore that...)

Hmm. There are two answers:

T1. In "marked up" docstrings, none of the above should be autodetected.
I'll get into why below, but basically mark it down as "I'm paranoid" for
the moment.

T2. In docstrings with no markup, I think you should be able to do what you
like, and although 3..5 may be risky, the benefit probably outweighs the
problem (i.e., I'll put up with spurious references IN THIS CASE so I can
get the correct ones as well). It might be polite to put a note at the top
of the page saying that the markup is autogenerated, though, so that people
don't blame the original author of the doc string for any mistakes (that's
politeness to those of us with ego!).

Expanding on T1 above. As you posit, 5 is generally dangerous. I hope I
demonstrated (elsewhere, "London") that 4 is equally dangerous, and I reckon
I can come up with other cases (besides "OS(GB)") where 3 is dangerous too.
But note that 1 and 2 are dangerous as well: English allows acronyms with "."
as well as without (so NASA and BBC don't have dots, but N.B. does; there
are better examples which unfortunately I can't call to mind, but any good
publishing style guide is sure to have some).
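
To make the danger in cases 1 and 2 concrete, here is a minimal sketch,
assuming a naive regular expression for "dotted.references". The pattern and
the name guess_dotted_refs are hypothetical, not anything proposed on this
list, and "e.g." is an extra abbreviation added for illustration alongside
N.B.:

```python
import re

# Hypothetical naive pattern: two or more identifier-like words joined by dots.
DOTTED = re.compile(r"\b[A-Za-z_]\w*(?:\.[A-Za-z_]\w*)+")

def guess_dotted_refs(text):
    """Return every substring that merely *looks* like a dotted name."""
    return DOTTED.findall(text)

sample = "N.B. os.path.join builds paths, e.g. for temporary files."
print(guess_dotted_refs(sample))
# prints ['N.B', 'os.path.join', 'e.g']
```

Only one of the three hits is a real reference. A stop-list of abbreviations
would catch these two, but not whatever the next author happens to write.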

My *general* point is that, given the nature of the English language, it is
*impossible* to *guarantee* that you will get it right, however complex the
rules you produce. Now, for case T2 that's OK - we're trying to generate
something from nothing, and I personally am willing to accept mistakes. But
for T1 you're messing with someone's (hopefully) lovingly crafted text, and
I refer you back to my previous points (and Eddy's too).

[In fact, given that this is *programming* we're documenting, it is almost
certain that someone will want to do *very* odd things in documentation that
we haven't and probably *can't* think of, which render non-markup schemes
untenable if one wants a decent result.]

Hmm - and it's just occurred to me that if a programmer is marking up their
text, they might also have a *reason* for suppressing a cross-reference -
one example is if they're talking about how one *might* have implemented
fred.spam (definitely don't want a reference here) rather than how they
actually *have* implemented #fred.spam# (to use Eddy's annotation, which I
still hate).
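
A rough sketch of what "explicit markup only" could look like, assuming the
annotation is simply a dotted name wrapped in '#' marks (only the single
example #fred.spam# is known here, so the exact syntax is a guess, and
explicit_refs is a made-up name):

```python
import re

# Assumed form of the annotation: a (possibly dotted) name between '#' marks.
MARKED = re.compile(r"#([A-Za-z_]\w*(?:\.[A-Za-z_]\w*)*)#")

def explicit_refs(docstring):
    """Return only the names the author explicitly marked as references."""
    return MARKED.findall(docstring)

doc = ("One might have implemented fred.spam recursively, "
       "but #fred.spam# is actually a loop.")
print(explicit_refs(doc))
# prints ['fred.spam']
```

The unmarked fred.spam is left alone, so the author's decision to suppress
that cross-reference is respected.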

Summary: if it's autogenerated from no markup, the more guessing one can do
the better, and your rules are well thought out, but if it's written by an
author with intentional markup, then that markup is *all* one is *allowed*
to infer.
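
The two-way policy in this summary might be sketched as follows. Everything
here is an assumption for illustration: the '#name#' syntax, the guessing
pattern, the name find_refs, and especially the test for "has markup", which
is just "contains at least one marked reference".

```python
import re

# Both patterns are illustrative assumptions, not an agreed syntax.
MARKED = re.compile(r"#([A-Za-z_]\w*(?:\.[A-Za-z_]\w*)*)#")
GUESSED = re.compile(r"\b[A-Za-z_]\w*(?:\.[A-Za-z_]\w*)+")

def find_refs(docstring):
    """Return (references, autogenerated) for one docstring."""
    marked = MARKED.findall(docstring)
    if marked:
        # The author used markup: infer nothing beyond it (T1).
        return marked, False
    # No markup found: guess freely, and flag the result so the
    # generated page can carry an "autogenerated" notice (T2).
    return GUESSED.findall(docstring), True

print(find_refs("See #fred.spam#, not fred.eggs."))
print(find_refs("Uses os.path.join internally."))
```

A real tool would need a better signal for "this docstring is marked up",
since a marked-up docstring containing no references at all would be treated
as unmarked by this sketch.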

Tibs

(And yes, I'm bringing in experience as a reader and writer of English, of
being a pedant, of being a standards writer and user (very different
experiences!), and also of producing a fanzine/magazine (using TeX)
containing contributions from English and American speakers, where I needed
to keep the original voice, spelling, etc., intact - how's that for spurious
appeals to authority!) (Oh - and I'm a programmer as well.)


--
Tony J Ibbs (Tibs)      http://www.tibsnjoan.demon.co.uk/
Feet first with 5 wheels... (although not enough in the past few weeks)
My views! Mine! Mine! (Unless Laser-Scan ask nicely to borrow them.)