From goodger@users.sourceforge.net Thu Nov 1 04:08:20 2001
From: goodger@users.sourceforge.net (David Goodger)
Date: Wed, 31 Oct 2001 23:08:20 -0500
Subject: [Doc-SIG] Inline Substitutions
Message-ID:

.. Here's another essay, excerpted from
   http://structuredtext.sourceforge.net/spec/alternatives.txt.
   Comments, criticism, alternatives, suggestions welcome.

Inline Substitutions
====================

Inline substitutions arose out of a Doc-SIG thread begun on 2001-10-28
by Alan Jaffray, "reStructuredText inline markup". It reminded me of a
missing piece of the reStructuredText puzzle, first referred to in my
contribution to "Documentation markup & processing / PEPs" (Doc-SIG
2001-06-21).

Inline substitutions allow the power and flexibility of directives to
be shared by inline text. They are a way to allow arbitrarily complex
inline objects, while keeping the details out of the flow of text.
They are the equivalent of SGML/XML's named entities. For example, an
inline image (using inline syntax alternative 1 & substitution
alternative 1)::

    The `biohazard`:sub: symbol must be used on containers used to
    dispose of medical waste.

    .. sub:: biohazard
       .. image:: biohazard.png
          [height=20 width=20]

```biohazard`:sub:`` would be replaced in-line by whatever the
``sub:: biohazard`` directive generates.

A "sub" or "substitution" directive would contain the substitution
name as the directive's data, followed by a directive block containing
either replacement text (one paragraph) or a nested inline-compatible
directive, such as "image". A transform would be required to handle
the substitution itself.

Syntax alternatives for the inline part:

1. Use the existing interpreted text syntax, with a predefined role
   such as "sub"::

       The `biohazard`:sub: symbol...

   Advantages: existing syntax, explicit. Disadvantages: verbose,
   obtrusive.

2. Use a variant of the interpreted text syntax, with a new suffix
   akin to the underscore in phrase-link references::

       `name`@ or `name`# or `name`& or `name`/ or `name`<

   Due to incompatibility with other constructs and ordinary text
   usage, the following are not possible::

       `name`:: and `name`:

3. Use interpreted text syntax with a fixed internal format::

       `:name:` or `name:` or `name::` or `::name::` or `%name%`
       or `#name#` or `/name/` or `&name&` or even `<name>` or `&name;`

   (To avoid ML confusion those last two are definitely out.) The
   ```/name/``` syntax is reminiscent of substitution.

4. Use specialized syntax, something new::

       #name# or @name@ or /name/ or...

   "#" and "@" are obtrusive. "/" without backquotes looks just like
   a POSIX path; it is likely for such usage to appear in text.

Syntax alternatives for the substitution part:

1. Use the existing directive syntax, with a predefined directive such
   as "sub". It contains either replacement text or another directive
   resolving to an inline-compatible object::

       .. sub:: biohazard
          .. image:: biohazard.png
             [height=20 width=20]

       .. sub:: parrot
          That bird wouldn't *voom* if you put 10,000,000 volts
          through it!

   The advantages and disadvantages are the same as in inline
   alternative 1.

2. Use syntax as in #1, but compressed. If the substitution content
   is a directive, append it to the substitution directive marker::

       .. sub:: biohazard image:: biohazard.png
          [height=20 width=20]

   Replacement text could also be (optionally) compressed::

       .. sub:: parrot That bird wouldn't *voom* if you put 10,000,000
          volts through it!

   This is a bit better than alternative 1, but still too much.

3. Use a variant of directive syntax, incorporating the substitution
   name, obviating the need for a directive name. If we assume inline
   alternative 3 (slashes), the matching substitutions would look like
   this::

       .. /biohazard/ image:: biohazard.png
          [height=20 width=20]

       .. /parrot/ That bird wouldn't *voom* if you put 10,000,000
          volts through it!

   There is potential conflict with short paths in comments, but that
   can be safely ignored.

At first blush, my favorite combination is inline alternative 3
(slashes) with substitution alternative 3::

    The `/biohazard/` symbol...

    .. /biohazard/ image:: biohazard.png
       [height=20 width=20]

This syntax seems consistent and suggestive of its intended purpose.

I imagine that the next question might be: can we combine inline
substitutions with hyperlinks, so that we could click on an
image-link? Let's try::

    The `/biohazard/`_ symbol...

    .. /biohazard/ image:: biohazard.png
       [height=20 width=20]
    .. _biohazard: http://www.cdc.gov/

That seems to work well. Anonymous hyperlinks would also work. I don't
know about "anonymous substitutions" though; it seems that
substitutions ought to be fully spelled out, always.

-- 
David Goodger    goodger@users.sourceforge.net    Open-source projects:
 - Python Docstring Processing System: http://docstring.sourceforge.net
 - reStructuredText: http://structuredtext.sourceforge.net
 - The Go Tools Project: http://gotools.sourceforge.net

From jaffray@pobox.com Tue Nov 6 06:56:30 2001
From: jaffray@pobox.com (Alan Jaffray)
Date: Tue, 6 Nov 2001 01:56:30 -0500 (EST)
Subject: [Doc-SIG] Alternative inline markup
Message-ID:

Here are my suggested changes to the current inline markup system.

1) Inline markup can be nested: ::

       `*US vs. Miller* (1939)`_

   does what you expect. Ambiguity is resolved with close tags first,
   then maximal munch. Simple.

   If you are sick enough to try::

       ***Strong enclosing emphasis***
       **Strong enclosing *emphasis***
       *Emphasis enclosing **strong***

   then the first two will work and the third won't. If the user
   expects any of them to work without consulting documentation,
   they're foolish. Besides, if you want the third, you can do::

       `Emphasis enclosing **strong**`__

       __ emphasis

   See below.

2) An underscore suffix currently modifies the preceding text by
   making it a link. This notion is extended - the suffix indicates
   that the text is to be tagged in some way, indicated by a
   directive or destination URL in the target::

       I had lunch with Jonathan_ today. We talked about Zope_.

       .. _Jonathan: lj [user=jhl]
       .. _Zope: http://www.zope.org/

   becomes the equivalent of::

       I had lunch with
       .. lj:: [user=jhl]
          Jonathan
       today. We talked about
       .. link:: [refuri=`http://www.zope.org/`]
          Zope
       .

   in the current syntax (if directives were inlined, which they
   aren't). Link targets which are also legal directive names must be
   enclosed in backquotes.

3) Substitution becomes a directive: ::

       +----------------+
       |      TOP_      |
       +-------+--------+
       | LEFT_ | RIGHT_ |
       +-------+--------+

       .. _top: sub
          My First Playbook

       .. _left: sub
          Hail Mary

       .. _right: sub
          - 16!
          - 4!
          - 23!
          - hut!

   is equivalent to::

       +--------------------+
       | My First Playbook  |
       +-----------+--------+
       | Hail Mary | - 16!  |
       |           | - 4!   |
       |           | - 23!  |
       |           | - hut! |
       +-----------+--------+

4) Inside markup delimited by backquotes or curly braces, curly
   braces may be used as delimiters equivalent to backquotes::

       `Photos of {Bob Johnson}_ and {Sue Fernandez}_ dancing`__

       .. _Bob Johnson: grad 1989
       .. _Sue Fernandez: grad 1991
       __ images/dscn0018.jpg

   This is because backquotes don't nest. Open-curly-brace must be
   escaped to be literal in single-backquoted text.

5) Roles can go away. We don't need them. Optionally if we want
   the ability to put short directive names inline, we could
   declare ::

       `foo:: bar bar bar`

   to be the equivalent of ::

       `bar bar bar`__

       __ foo

   Compared to the current options ::

       `bar bar bar`:foo:
       :foo:`bar bar bar`

   we're way ahead on both readability and lack of ambiguity either
   way.

Summary:

- We gain nesting.
- We gain arbitrary extensibility of inline markup.
- We gain substitutions.
- We retain unobtrusive markup.
- We lose by occasionally having to escape a curly brace inside
  backquotes, or quote a hyperlink target with no ``/`` or ``#``
  characters to distinguish it from a directive name.
Alan From tony@lsl.co.uk Tue Nov 6 11:11:44 2001 From: tony@lsl.co.uk (Tony J Ibbs (Tibs)) Date: Tue, 6 Nov 2001 11:11:44 -0000 Subject: [Doc-SIG] Alternative inline markup In-Reply-To: Message-ID: <00bc01c166b3$d1687eb0$545aa8c0@lslp862.int.lsl.co.uk> Immediate off-the-cuff comments - but for inline markup usage, I think that's actually what one *wants*... Alan Jaffray wrote: > 1) Inline markup can be nested: OK - not a bad idea in and of itself, but we *do* know it can be difficult to work out the implications. > If you are sick enough to try:: > > ***Strong enclosing emphasis*** > **Strong enclosing *emphasis*** > *Emphasis enclosing **strong*** > > then the first two will work and the third won't. And *that* is unacceptable (not the "sick enough" bit, since I don't think a user trying to apply rules they've, presumably, been given *is* sick!), since it is quite clear to a user what the second and third mean (so they are the ones that *must* work), whilst the first is ambiguous (and thus the one I could cope with not allowing - although in fact it doesn't matter in many cases whether it is strong emphasis or emphasised strongness, so the user won't care, so it's easy to special case). As with other facets of reST, I believe that what we *want* should come before the implementation choice - David (and Edward before him) have argued this very cogently in the past. I'd still vote for leaving nested inline markup as a "possible future enhancement" - not having it gets us some large percentage of the way to a perfect tool, and *definitely* leaves us with a *usable* tool (for the vast majority of cases). > If the user expects any of them to work without consulting > documentation, they're foolish. Then count me as a fool (heh, do I get a hat with bells on?), since (as I say) I regard the last two as perfectly obvious in meaning, and the first one as castable either way without mattering. 
I'm mostly leaving the other items alone for now, since they actually require more thought, except to say: 1. We want a *simple* markup scheme, so it is easy to learn and easy to remember (all of). I think David's latest suggestion about quoted slashmarky things (about which I am undecided) is pushing the very edge of that. *Unless* Alan's scheme *does* simplify overall, that added complexity makes it a non-starter for me. 2. We do *not* have to get the whole thing right at the start, so long as any extensions/additions can be carefully added at a later stage. For this sort of purpose, having things like directives, roles, and so on, is a *good* thing - they allow one to extend without changing the format. (i.e., losing roles isn't necessarily such a good thing as it sounds). 3. Previous rounds of the Doc-SIG have died partly because people kept trying to jam things in. (which isn't to say one shouldn't try to get it right, but I just get a rather uncomfortable feeling). Now, as a user I am primarily interested in producing documentation within docstrings. I'm also likely to use reST for producing very simple HTML pages (not *much* simpler than the sort I normally produce, mind you!), and for providing a neat way of integrating text and doctest outwith Python files. The first and last of these uses *require* reST to be readable easily in the raw form (one of the Main Precepts). None of these uses require complicated substitution schemes (so I am doubtless not the best person to comment on them positively!). Because I'm not target audience for Alan's changes, and because I *am* a different audience who doesn't particularly want them, I'd want a lot more explanation of *why* they are valuable to him, particularly taking into consideration the read-as-raw issue. And I would want a *very* good usage case for Alan's item 3, where he has a table whose contents is *very* hard to discern! 
Random other notes: Having to quote things because they "happen" to be a directive is not a good idea, since directives are not predictably named (i.e., one cannot easily tell which directives may exist at a given time). David has used punctuation to discriminate in the places where such a thing might be an issue, previously. Having to quote { and } in Python-related text is surely a non-starter (heck, we don't want to have to quote < and > because one might have to talk about *ML elements, so not being able to backquote dictionaries is *surely* out). Roles are supremely useful in Python docstrings, where one wants to qualify :class:`Fred` as opposed to :attribute:`Fred` (to use them incorrectly - but that's another argument). A more complex scheme for doing this is *not* a good thing (and splattering the information about the document is "more complex"). Eagerly awaiting David's comments... Tibs -- Tony J Ibbs (Tibs) http://www.tibsnjoan.co.uk/ "I'm a little monster, short and stout Here's my horns and here's my snout When you come a calling, hear me shout I will ROAR and chase you out" My views! Mine! Mine! (Unless Laser-Scan ask nicely to borrow them.) From Paul.Moore@atosorigin.com Tue Nov 6 11:54:21 2001 From: Paul.Moore@atosorigin.com (Moore, Paul) Date: Tue, 6 Nov 2001 11:54:21 -0000 Subject: [Doc-SIG] Alternative inline markup Message-ID: <714DFA46B9BBD0119CD000805FC1F53B01B5B0AC@UKRUX002.rundc.uk.origin-it.com> From: Tony J Ibbs (Tibs) [mailto:tony@lsl.co.uk] > Immediate off-the-cuff comments - but for inline markup usage, I think > that's actually what one *wants*... I agree with pretty much all of this - I'll add a few specific comments. > Alan Jaffray wrote: > > If you are sick enough to try:: > > > > ***Strong enclosing emphasis*** > > **Strong enclosing *emphasis*** > > *Emphasis enclosing **strong*** > > > > then the first two will work and the third won't. 
> > And *that* is unacceptable (not the "sick enough" bit, > since I don't think a user trying to apply rules they've, > presumably, been given *is* sick!), since it is quite > clear to a user what the second and third mean Agreed entirely. I had to work to understand why Alan's rules made the third one illegal. You *can* make rules which "work" (something like closest opener is closed first), but we're back to trying to codify do-what-I-mean rules, which is too complex for this type of application. > > If the user expects any of them to work without consulting > > documentation, they're foolish. > > Then count me as a fool Me too. One of the crucial things about reST to my mind is that it should be possible to (quickly) get enough of a grip on the rules to use them "naturally" in normal text. A key example of this is the usage of reST in E-Mail in this group. It isn't valid reST (normal E-Mail quoting constructs simply don't work), but it's *very* readable, and adds useful structure to "raw" text. And it's *not* (quite) "standard" usage, as the standard is to use ``_underline_`` and ``*bold*`` rather than ``*emphasized (conceptually italic)*`` and ``**strong (bold)**``. So it is "learned"... > 3. Previous rounds of the Doc-SIG have died partly > because people kept trying to jam things in. > (which isn't to say one shouldn't try to get it > right, but I just get a rather uncomfortable > feeling). To me, this is a serious uncomfortable feeling. I think reST *as it stands, right now* is "just right". I'm emphatically **not** saying that it is perfect for all application areas. But we'll break it in the attempt to "make it better" if we go *any* further. (IMHO) > Now, as a user I am primarily interested in producing documentation > within docstrings. I'm also likely to use reST for producing > very simple HTML pages [...] 
My personal goal is to get it seen as a "normal mindset" for people who want to add a little structure to their plain text writing - wherever that may be. If people use it "naturally", it will start turning up in the oddest places[1]_[2]_. And the "normal" usage will tend to be marked up plain text, read "raw". .. [1] Docstrings being one such case. And even there, you need the non-reST converts to "accept" reST as raw text. I've seen (current) docstrings written in various forms of markup - even DocBook (!) - and my normal reaction is "grr - stupid markup, I can't read this direct from the source". The fact that I read E-Mails in reST *without* that reaction is a very significant point. (I intend to start posting in reST on other lists, to see the reaction...). .. [2] Annoyingly, I couldn't recall the footnote syntax straight off here. And I was fairly sure I'd guess wrong. That implies to me that we're close to the complexity limit. (Or that I'm getting old :-). Interestingly, I looked through old E-Mails for example usage, rather than going to the spec, or even the refcard. Make of that what you will... The Perl people (based on Larry Wall's thinking) tend to talk in terms of "memes", and view ideas as existing in a space where there is some sort of natural selection. In that context, I'd like to see reST taking over the ecological niche currently held by ``_underline_`` and ``*bold*`` (and "oh, heck - I don't know how to lay this out" :-)) That means that people need to be able to use it "by example" - just looking at other people's markup, and getting it right without manuals. Tibs' refcard is the *absolute maximum* level of documentation that can be expected to capture this audience. On that basis, reST is currently just about right. My view (in summary) is: Let's just implement what we've got. It may not be perfect, but I'll take implemented and imperfect over perfect but theoretical any day. 
Heck, there are millions of people using POD, on the sole basis that it *exists* :-) I want a full implementation **now**. (And if I don't get one, I'll thcream and thcream until I'm thick. [As 2 short planks. Oops - didn't mean to say that :-)]) With the existing parser converting to XML, and the XSLT stylesheets, we're a good way there. Once I learn XSLT a bit, I'm planning on doing a prototype converter to LaTeX based on one of the existing XSLT stylesheets. (A pure-Python implementation might be better, but I'm getting bogged down in technical issues over tree walks, and as I say, I want it now...) This went on too long. But I'm convinced - we should freeze the design now, and work on implementation. We can't hit a moving target. (At the very least, the DOM needs to be frozen so that work on output has a firm basis...) Paul. From tony@lsl.co.uk Tue Nov 6 12:22:06 2001 From: tony@lsl.co.uk (Tony J Ibbs (Tibs)) Date: Tue, 6 Nov 2001 12:22:06 -0000 Subject: [Doc-SIG] Alternative inline markup In-Reply-To: <714DFA46B9BBD0119CD000805FC1F53B01B5B0AC@UKRUX002.rundc.uk.origin-it.com> Message-ID: <00c001c166bd$a6329730$545aa8c0@lslp862.int.lsl.co.uk> I like people who agree with me (unfortunately for simplicity, I also have a lot of people who I like who *don't*, but still). Paul Moore wrote: > A key example of this is the usage of reST in > E-Mail in this group. It isn't valid reST (normal > E-Mail quoting constructs simply don't work), I suspect that if/when the other aspects of reST/DPS are finished being implemented, we'll see an email mode - it's just too tempting to do, and the "quoted text" thing is the only really difficult bit (and we can *steal* ideas about how to handle that). > And it's *not* (quite) "standard" usage, as the standard is to use > ``_underline_`` and ``*bold*`` rather than ``*emphasized (conceptually > italic)*`` and ``**strong (bold)**``. So it is "learned"... 
I would actually argue that only *this* is "standard" usage - the underline thing seems to be fashionable in some places, at some times, but not everywhere. > .. [2] Annoyingly, I couldn't recall the footnote syntax straight off > here. And I was fairly sure I'd guess wrong. Strangely, since David simplified the footnote usage (and explained one mistake I'd made!) I've found it very easy to remember. I've probably been confusing people in other venues by using it in other emails - they probably wonder why there's a trailing underscore after the ``[1]`` in the body of the text... > The Perl people (based on Larry Wall's thinking) tend to talk > in terms of "memes", ...rest of paragraph omitted... Yes, I like that idea. And I agree that the "refcard" limit is about true - certainly for myself. > On that basis, reST is currently just about right. > > My view (in summary) is: Let's just implement what we've got. > It may not be perfect, but I'll take implemented and imperfect > over perfect but theoretical any day. Yep, I'd agree (although bear in mind that *David* has *got* just about all of it implemented - people like me are the slow coaches - hmm, maybe we should insist he does documentation instead of coding!!!!) For what it's worth, HTML output from Python code is progressing (albeit slowly). Another release may come in the next few days - or not, depending on what happens regarding replacing our now-broken washing machine, and the possible minor rebuild of the kitchen cabinets thus necessitated (mutter, mutter, built-in-appliances, mutter). I'd certainly hope for a useful release by Christmas or soon after (especially if I get that second-hand laptop as a Christmas present). Hmm - should we aim for a formal release (alpha? beta?) early in the new year? From here it looks doable... Tibs -- Tony J Ibbs (Tibs) http://www.tibsnjoan.co.uk/ Give a pedant an inch and they'll take 25.4mm (once they've established you're talking a post-1959 inch, of course) My views! 
Mine! Mine! (Unless Laser-Scan ask nicely to borrow them.) From Juergen Hermann" Message-ID: On Tue, 6 Nov 2001 12:22:06 -0000, Tony J Ibbs (Tibs) wrote: >For what it's worth, HTML output from Python code is progressing What about XML output? ;) From goodger@users.sourceforge.net Wed Nov 7 04:31:11 2001 From: goodger@users.sourceforge.net (David Goodger) Date: Tue, 06 Nov 2001 23:31:11 -0500 Subject: [Doc-SIG] Alternative inline markup In-Reply-To: <00c001c166bd$a6329730$545aa8c0@lslp862.int.lsl.co.uk> Message-ID: [Alan] > Here are my suggested changes to the current inline markup system. Thanks for your input. I wish it had come earlier though! If you've been following the \*-checkins lists, you've noticed that I've already written the specs and checked in the implementation of:: Substitutions: `/text/` `/picture/`. .. /text/ If you're happy and you know it .. /picture/ image:: clapping_hands.png Except for Tony's neutral note, I haven't seen any reaction to this construct & syntax yet (posted last week as "Inline Substitutions"). Alan? Oh well. All I have to do is make a concerted effort to be objective when reviewing the suggestions. [Wrenching noise due to ego separation.] OK, done. > 1) Inline markup can be nested:: ... > Ambiguity is resolved with close tags first, then maximal munch. I don't follow. Clarify please? The easiest way I see to implement this is to first identify the outer inline markup as we do now, then recursively scan for nested inline markup. It won't work in the general case though, as explained below. Changing the parse algorithm to be fully nested-inline-markup-friendly could be difficult and/or ambiguous. I'm not sure the general case *can* be done, without a lot of exceptions and special rules, which complexity I'm not willing to add to reStructuredText. 
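The outer-to-inner idea can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not the reStructuredText parser: the names ``INLINE`` and ``parse`` are made up here, only ``*emphasis*`` and ``**strong**`` are handled, and the rules about what may surround a start-string or end-string are reduced to a few regex lookarounds.

```python
import re

# Hypothetical, simplified recognition rule (an assumption of this sketch):
# - the start-string must not follow a word character or another asterisk,
# - the content must begin and end with a non-space character,
# - the end-string must not be followed by a word character or an asterisk.
# "**" is tried before "*", so strong wins over emphasis at the same position.
INLINE = re.compile(r"(?<![\w*])(\*\*|\*)(?=\S)(.+?)(?<=\S)\1(?![\w*])")


def parse(text):
    """Return a list of plain strings and (kind, children) tuples.

    The outermost markup is identified first; its content is then
    scanned recursively for nested markup.
    """
    nodes = []
    pos = 0
    for match in INLINE.finditer(text):
        if match.start() > pos:
            nodes.append(text[pos:match.start()])
        kind = "strong" if match.group(1) == "**" else "emphasis"
        nodes.append((kind, parse(match.group(2))))
        pos = match.end()
    if pos < len(text):
        nodes.append(text[pos:])
    return nodes


if __name__ == "__main__":
    for sample in (
        "**Strong enclosing *emphasis* in the middle.**",
        "*Emphasis enclosing **strong** in the middle.*",
    ):
        print(sample, "->", parse(sample))
```

With these assumed rules, strong enclosing emphasis nests cleanly, while emphasis enclosing strong in mid-sentence closes early at the second asterisk of the inner ``**`` and leaves the tail of the sentence as plain text. That is the kind of case that makes a general solution awkward.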
> If you are sick enough to try::
>
>     ***Strong enclosing emphasis***
>     **Strong enclosing *emphasis***
>     *Emphasis enclosing **strong***
>
> then the first two will work and the third won't.

Actually, all of those would work with the outer-to-inner recursive
algorithm. The current definition of inline markup treats as
significant the whitespace, punctuation, or bracketing before the
start-string and after the end-string. These would work too::

    **Strong enclosing *emphasis* in the middle.**
    ***Emphasis* inside strong.**

These wouldn't work with the current definition though::

    *Emphasis enclosing **strong** in the middle.*
    ***Strong** inside emphasis.*

For the first example, the last asterisk of the closing "**" of
"strong" would be recognized as a closing "*". For the second, strong
emphasis is recognized first, without lookahead; the "*" after
"emphasis." wouldn't be significant. We'd have to refine/redefine the
algorithm to work with such cases.

> If the user expects any of them to work without consulting
> documentation, they're foolish.

With a little bit of experience, I'd say such expectations would be
common. Nested inline markup should be obvious and orthogonal in the
general case, or nonexistent.

> Besides, if you want the third, you can do::
>
>     `Emphasis enclosing **strong**`__
>
>     __ emphasis

This goes against the design goals of reStructuredText. So, not on my
watch. ;-)

> 2) An underscore suffix currently modifies the preceding text by
>    making it a link. This notion is extended - the suffix indicates
>    that the text is to be tagged in some way, indicated by a
>    directive or destination URL in the target::
>
>        I had lunch with Jonathan_ today. We talked about Zope_.
>
>        .. _Jonathan: lj [user=jhl]
>        .. _Zope: http://www.zope.org/

Interesting idea, putting arbitrary constructs in the link target.
However, for consistency that depends on two things:

1. The link text remains behind, untouched except for being
   "activated" in some way.

2. There must *be* a link target. Corollary: the reference must *be*
   a reference.

What will "Jonathan" become? A clickable hyperlink to something? Or a
user image from the database? For example, say I want Jonathan's user
icon to appear in my paragraph::

    I had lunch with [Jonathan's icon here] today.

How do I do this *without* having a hyperlink at the same time?

On the other hand, we could say that the trailing-underscore syntax
doesn't signify a hyperlink reference, but only indicates a "tagging
reference". A tagging reference becomes a hyperlink reference if the
contents of the "tag" resolve to a hyperlink. And how do we do the
straight icon-substitution example? Would we *replace* the reference
text depending on the contents of the "tag"? This seems too indirect
and complicated for easy comprehension. It's too much.

I've asked this before, and I really would like to know: what does the
"lj" tag *do* in the end? Can you show us some HTML output?

> Link targets which are also legal directive names must be
> enclosed in backquotes.

The frequency of link targets would far outweigh directives, so markup
would suffer from extra syntax on targets. I thought of this
alternative syntax::

    I had lunch with Jonathan_ today. We talked about Zope_.

    .. _Jonathan: lj:: user=jhl
    .. _Zope: http://www.zope.org/

But it suffers from the same conceptual problems: the references in
the text sometimes become links, sometimes not; we don't know *at the
markup*.

> 3) Substitution becomes a directive:

... combined with a hyperlink target. No, I don't think so.
Substitutions are going to be relatively rare, and should have
distinct syntax. The syntax you're proposing is internally
inconsistent.

> 4) Inside markup delimited by backquotes or curly braces, curly
>    braces may be used as delimiters equivalent to backquotes::
...
> This is because backquotes don't nest.

There's no difference between backquotes and asterisks with regard to
nesting.
Unless you're referring to double-backquotes: ``no further processing of `backquotes` in inline literals``? Why the fixation on curly braces? :> > 5) Roles can go away. We don't need them. Optionally if we want > the ability to put short directive names inline, we could > declare :: > > `foo:: bar bar bar` Similar syntax has already been considered and rejected. See http://structuredtext.sf.net/spec/alternatives.txt, "Interpreted Text 'Roles'" alternative 1. > Summary: > > - We gain nesting. Not without significant work, though. If it's even possible unambiguously, it can be added independently later. > - We gain arbitrary extensibility of inline markup. > - We gain substitutions. But at the expense of complicating hyperlinks. > - We retain unobtrusive markup. Debatable, since it adds complexity to the underlying concepts. > - We lose by occasionally having to escape a curly brace inside > backquotes, or quote a hyperlink target with no ``/`` or ``#`` > characters to distinguish it from a directive name. That last one is a significant loss. [Tony] > Immediate off-the-cuff comments - but for inline markup usage, I > think that's actually what one *wants*... I don't follow. > 1. We want a *simple* markup scheme, so it is easy to > learn and easy to remember (all of). I think David's > latest suggestion about quoted slashmarky things > (about which I am undecided) is pushing the very > edge of that. *Unless* Alan's scheme *does* simplify > overall, that added complexity makes it a non-starter > for me. Which added complexity? > 2. We do *not* have to get the whole thing right at > the start, so long as any extensions/additions > can be carefully added at a later stage. For this > sort of purpose, having things like directives, > roles, and so on, is a *good* thing - they allow > one to extend without changing the format. > (i.e., losing roles isn't necessarily such a > good thing as it sounds). They're meant as a last resort anyhow. 
Substitutions provide more flexibility with less obtrusive markup, albeit indirectly. > 3. Previous rounds of the Doc-SIG have died partly > because people kept trying to jam things in. > (which isn't to say one shouldn't try to get it > right, but I just get a rather uncomfortable > feeling). Substitutions (or equivalent, whatever the syntax) filled a gap in the reStructuredText specification. Directives allow arbitrary block-level structures. Substitutions allow arbitrary text-level (inline) structures. Without them, every time someone wants a specific new inline structure, they'd have to petition for a syntax change. With them plus existing syntax, any inline structure can be coded without new syntax (or with only directive-local syntax). This was the goal of interpreted text roles also, but roles have limited functionality and the syntax is obtrusive. They're most useful if the role can be inferred by the system. The provisional syntax I've chosen for substitutions isn't particularly elegant, but that's OK: I don't expect substitutions to be used often enough to be painfully noticeable. On the contrary, I think noticeable syntax is appropriate for this construct. [Paul] > To me, this is a serious uncomfortable feeling. I think reST *as it > stands, right now* is "just right". I'm emphatically **not** saying > that it is perfect for all application areas. But we'll break it in > the attempt to "make it better" if we go *any* further. (IMHO) With the addition of substitutions, I consider reStructuredText to be pretty much complete. There are a couple of details remaining in rst-notes.txt (multi-line titles, an external hyperlink mechanism with a finer resolution, and ``\ `` as non-breaking space), but they're not significant in the grand scheme. > If people use it "naturally", it will start turning up in the oddest > places[1]_[2]_. A usage note: footnote references require a preceding space (or brackets, etc.). 
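
That context rule can be illustrated with a small check. The sketch below assumes, for illustration only, that a footnote reference such as ``[1]_`` counts when it sits at the start of the text or directly after whitespace or an opening bracket, and is not followed by a word character. The names ``FOOTNOTE_REF`` and ``footnote_refs`` are hypothetical, and the actual recognition rules in the parser are more detailed than this.

```python
import re

# Assumed, simplified rule: "[digits]_" is a footnote reference only when the
# preceding character (if any) is whitespace or one of ( [ { <, and the
# following character (if any) is not a word character.
FOOTNOTE_REF = re.compile(r"(?<![^\s(\[{<])\[(\d+)\]_(?!\w)")


def footnote_refs(text):
    """Return the labels of the footnote references recognized in text."""
    return FOOTNOTE_REF.findall(text)


if __name__ == "__main__":
    # References glued to the preceding word, as in the quoted sentence.
    print(footnote_refs("the oddest places[1]_[2]_."))
    # The same sentence with a space before each reference.
    print(footnote_refs("the oddest places [1]_ [2]_."))
```

Under this assumption the glued form yields no references at all, while the spaced form yields both labels.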
> (I intend to start posting in reST on other lists, to > see the reaction...). Good idea. How about adding a line to our signatures? :: Marked up with reStructuredText: http://structuredtext.sf.net/ > The Perl people (based on Larry Wall's thinking) tend to talk in > terms of "memes", and view ideas as existing in a space where there > is some sort of natural selection. In that context, I'd like to see > reST taking over the ecological niche currently held by > ``_underline_`` and ``*bold*`` (and "oh, heck - I don't know how to > lay this out" :-)) That means that people need to be able to use it > "by example" - just looking at other people's markup, and getting it > right without manuals. I think that's a possibility with the simpler, more often-used parts of the markup. > Tibs' refcard is the *absolute maximum* level of documentation that > can be expected to capture this audience. Actually, I think the quickref needs to be broken up (at least internally) into "basic" and "advanced" parts, to make the introduction easier. Perhaps each construct with an advanced aspect should have an "Advanced Usage" subsection. > (A pure-Python implementation might be better, but I'm getting > bogged down in technical issues over tree walks, and as I say, I > want it now...) I'm about ready to put the parser to bed; enough fiddling already. It does need lots of internal documentation and some refactoring, but functionally it's complete. The next thing is to tackle a Reader component and the transforms (including Ueli Schlaepfer's patch). > This went on too long. But I'm convinced - we should freeze the > design now, and work on implementation. We can't hit a moving > target. (At the very least, the DOM needs to be frozen so that work > on output has a firm basis...) I agree with the sentiment, but even if the parser is frozen the document tree model is still subject to change. 
Its functionality is only construction-oriented (parser) now; as you well know, there's not much support for transformations, tree walking, and whatever else output needs. I also want to take a good look at HappyDoc and others before reinventing any more wheels. [Tony] > Yep, I'd agree (although bear in mind that *David* has *got* just > about all of it implemented - people like me are the slow coaches - > hmm, maybe we should insist he does documentation instead of > coding!!!!) And how are you going to make me? Coding is much more fun! > Hmm - should we aim for a formal release (alpha? beta?) early in > the new year? From here it looks doable... If it's ready, we will. I won't commit any further than that. -- David Goodger goodger@users.sourceforge.net Marked up with reStructuredText: http://structuredtext.sf.net/ From tony@lsl.co.uk Wed Nov 7 09:53:27 2001 From: tony@lsl.co.uk (Tony J Ibbs (Tibs)) Date: Wed, 7 Nov 2001 09:53:27 -0000 Subject: [Doc-SIG] Alternative inline markup In-Reply-To: Message-ID: <010901c16772$0c6d31e0$545aa8c0@lslp862.int.lsl.co.uk> Juergen Hermann wrote: > On Tue, 6 Nov 2001 12:22:06 -0000, Tony J Ibbs (Tibs) wrote: > >For what it's worth, HTML output from Python code is progressing > > What about XML output? ;) The capability to do XML output is provided "for free" by DPS itself. Thus pydps has an --xml switch which outputs XML. The only issue there is that the rest of the program isn't finished - the exact information to be in the output is not yet finalised, I may be missing some information, and the final form of the DPS tree is not yet determined. But as soon as *anything* sensible is released, XML output will be there (because it doesn't cost me anything!). Tibs -- Tony J Ibbs (Tibs) http://www.tibsnjoan.co.uk/ .. "equal" really means "in some sense the same, but maybe not .. the sense you were hoping for", or, more succinctly, "is .. confused with". (Gordon McMillan, Python list, Apr 1998) My views! Mine! Mine! 
(Unless Laser-Scan ask nicely to borrow them.) From tony@lsl.co.uk Wed Nov 7 10:17:00 2001 From: tony@lsl.co.uk (Tony J Ibbs (Tibs)) Date: Wed, 7 Nov 2001 10:17:00 -0000 Subject: [Doc-SIG] Alternative inline markup In-Reply-To: Message-ID: <010a01c16775$567f6a70$545aa8c0@lslp862.int.lsl.co.uk> David Goodger saved me having to learn new syntax by disliking it... David then wrote: > Except for Tony's neutral note, I haven't seen any reaction to this > construct & syntax yet (posted last week as "Inline Substitutions"). I'm still way behind on internalising all of the ways links have changed recently, and I made a firm decision to leave this one on trust until I could think more about it. It depends a bit on whether I can think of reasons *I* might want to use it. I assume that the silence means that (a) people like Paul Moore, Ueli Schlaepfer, Garth Kidd and so on are on holiday, or (b) they don't have a great feeling either way. At which point it becomes an executive decision... > [Tony] > > Immediate off-the-cuff comments - but for inline markup usage, I > > think that's actually what one *wants*... > > I don't follow. I simply meant that if one couldn't understand how inline markup worked *immediately* (essentially by looking at it, given it *is* markup, even if one doesn't know its *meaning*) then we're probably losing. In other words, complex UNINTUITIVE schemes on what is legal to nest will not work. > > 1. We want a *simple* markup scheme, ... > > *Unless* Alan's scheme *does* simplify > > overall, that added complexity makes it > > a non-starter for me. > > Which added complexity? Sorry - the added complexity of Alan's proposals. > > 2. We do *not* have to get the whole thing right at > > the start, so long as any extensions/additions > > can be carefully added at a later stage. For this > > sort of purpose, having things like directives, > > roles, and so on, is a *good* thing - they allow > > one to extend without changing the format. 
> > (i.e., losing roles isn't necessarily such a > > good thing as it sounds). > > They're meant as a last resort anyhow. Substitutions provide more > flexibility with less obtrusive markup, albeit indirectly. ... > Substitutions (or equivalent, whatever the syntax) filled a gap in the > reStructuredText specification. Directives allow arbitrary block-level > structures. Substitutions allow arbitrary text-level (inline) > structures. Without them, every time someone wants a specific new > inline structure, they'd have to petition for a syntax change. With > them plus existing syntax, any inline structure can be coded without > new syntax (or with only directive-local syntax). This was the goal of > interpreted text roles also, but roles have limited functionality and > the syntax is obtrusive. They're most useful if the role can be > inferred by the system. And that's a good summary of why they should be in - as I say, I still need to work out a use case for myself, but I'm quite prepared to trust David until I do (that *doesn't* mean I'm thinking "oh yes, that's more or less OK, so I'll trust David" - it means "ah - I'm not at all sure about that, but since I don't have time to think deeply about it now, and since David has a good track record in the past, I shall trust his judgement is still working" (or, putting it another way, "hmm_, if I decide I don't like this, past evidence shows David has a good chance of changing my mind, since he's awkward that way")). > With the addition of substitutions, I consider reStructuredText to be > pretty much complete. There are a couple of details remaining in > rst-notes.txt (multi-line titles, an external hyperlink mechanism with > a finer resolution, and ``\ `` as non-breaking space), but they're not > significant in the grand scheme. All of which are clearly things that can be added later, so and celebrations! > > (I intend to start posting in reST on other lists, to > > see the reaction...). > > Good idea. 
How about adding a line to our signatures? :: > > Marked up with reStructuredText: http://structuredtext.sf.net/ Ooh, sneaky. And that means someone will *have* to produce an email mode... > > Tibs' refcard is the *absolute maximum* level of documentation that > > can be expected to capture this audience. > > Actually, I think the quickref needs to be broken up (at least > internally) into "basic" and "advanced" parts, to make the > introduction easier. Perhaps each construct with an advanced aspect > should have an "Advanced Usage" subsection. I suspect that it needs to have two forms (at least for my own sake) - a one page form that has everything (i.e., the current thing with the missing details added), and something more like what you describe - which is essentially a quick tutorial as well as a reminder. > I also want to take a good look at HappyDoc and others before > reinventing any more wheels. Good. > [Tony] > > hmm, maybe we should insist he does documentation instead of > > coding!!!!) > > And how are you going to make me? Coding is much more fun! By withholding those candy treats you ask for in the README files (how *are* we meant to get them to you without a postal address?). .. _hmm: or, perhaps, by egregious compliments until he gives in... Tibs -- Tony J Ibbs (Tibs) http://www.tibsnjoan.co.uk/ Give a pedant an inch and they'll take 25.4mm (once they've established you're talking a post-1959 inch, of course) My views! Mine! Mine! (Unless Laser-Scan ask nicely to borrow them.) From Paul.Moore@atosorigin.com Wed Nov 7 10:24:14 2001 From: Paul.Moore@atosorigin.com (Moore, Paul) Date: Wed, 7 Nov 2001 10:24:14 -0000 Subject: [Doc-SIG] Alternative inline markup Message-ID: <714DFA46B9BBD0119CD000805FC1F53B01B5B0B5@UKRUX002.rundc.uk.origin-it.com> I've said most of what I think already, so I'll keep this short (if I can)... 
From: David Goodger [mailto:goodger@users.sourceforge.net] > [Alan] > > Here are my suggested changes to the current inline markup system. > > Thanks for your input. I wish it had come earlier though! If you've > been following the \*-checkins lists, you've noticed that I've already > written the specs and checked in the implementation of:: > > Substitutions: `/text/` `/picture/`. > > .. /text/ If you're happy and you know it > .. /picture/ image:: clapping_hands.png > > Except for Tony's neutral note, I haven't seen any reaction to this > construct & syntax yet (posted last week as "Inline Substitutions"). > Alan? I didn't comment, mainly due to lack of time to think about it, and lack of any real motivation to *use* it. Some comments: 1. I'm not keen on the overloading of the \` character in the syntax. Substitutions look like interpreted text. For some reason, hyperlinks don't have this problem, even though they reuse the \` character as well. Maybe some form of postfix notation would help for substitutions, as well? [1]_ 2. I'm not 100% clear on the semantics, either. To my mind, pure text substitutions are a bad thing - put the text inline instead. Otherwise, you are making the marked up text *less* readable. `/swim/`? And the semantics of anything else is effectively output-processor dependent. Generally, I'm against any markup which makes my text output dependent. Even your image example (the least contentious possibility for this sort of thing) may not be renderable in certain output formats (ASCII text, for instance!!!). 3. Overall, I'd like to see clearer examples of what all this could be *for*. Your example of interpreted text roles in Python documentation (can't recall where I saw this - things like ``:class:`zipfile``` and ``:variable:`filename``` to distinguish types of identifier). I'd like to see a "motivation" section, with more of this sort of example. .. /swim/ See what I mean .. 
[1] Actually, having used the syntax in my example above, I find I quite like it. But point (2) still stands - it's not clear (to me) that the construct is *useful*. > > If the user expects any of them to work without consulting > > documentation, they're foolish. > > With a little bit of experience, I'd say such expectations would be > common. Nested inline markup should be obvious and orthogonal in the > general case, or nonexistent. And in my view, the corner cases kill the possibility of obviousness, so I vote for nonexistent. > I've asked this before, and I really would like to know: what does the > "lj" tag *do* in the end? Can you show us some HTML output? That gets back to my point - what's the motivation? And for me, I'd like to see more than just HTML output. What would such a document show in PDF/PostScript intended for printing? (If the answer is "you don't use it in that context", then we're getting too domain-specific). > > 5) Roles can go away. We don't need them. Optionally if we want > > the ability to put short directive names inline, we could > > declare :: > > > > `foo:: bar bar bar` The example:: :class:`zipfile` is, to me, a telling argument in *favour* of roles. I can see the point, it reads clearly in markup form, and I can imagine output formatters rendering class names specially (or putting them in an index, or whatever). Of course, getting that level of flexibility in the output code is the next exciting step... :-) > [Tony] > > Immediate off-the-cuff comments - but for inline markup usage, I > > think that's actually what one *wants*... > > I don't follow. What I understood Tony to mean was that if you had to stop and think about it, it wasn't clear and simple enough to be what we want. > Substitutions (or equivalent, whatever the syntax) filled a gap in the > reStructuredText specification. Directives allow arbitrary block-level > structures. Substitutions allow arbitrary text-level (inline) > structures. 
OK, that sounds like a killer argument for substitutions. And the converse is, that *any* request for extra features should be addressable with "use a (substitution|directive)". If not, then we need to rethink these constructs, to understand why they aren't doing their job properly. > This was the goal of interpreted text roles also, but roles have > limited functionality and the syntax is obtrusive. They're most > useful if the role can be inferred by the system. Hmm, by my own argument, that implies that roles should be *replaced* by substitutions. Maybe the fact that they can't means that there is still something to address here. Substitutions don't take "parameters" (the interpreted text part of a role). Roles don't have the "supplemental information" (the directive-like bit in the ``..`` section - I don't know what to call it) that substitutions do. > I agree with the sentiment, but even if the parser is frozen the > document tree model is still subject to change. Its functionality is > only construction-oriented (parser) now; as you well know, there's not > much support for transformations, tree walking, and whatever else > output needs. Sorry - that's actually closer to what I was trying to say. Freeze the parser (and hence, certain key parts of the document tree model, like what types of node exist - effectively the DTD side of it), and work on the other parts. With the document tree at the centre of it all, there's no way you can completely freeze it when only one of its "clients" has been frozen. > > Hmm - should we aim for a formal release (alpha? beta?) early in > > the new year? From here it looks doable... > > If it's ready, we will. I won't commit any further than that. No commitment, but it sounds like a worthwhile target to aim at. 
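The substitution mechanism discussed above (```/name/``` references resolved against ``.. /name/`` definitions) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the actual reStructuredText implementation: the regular expressions, the function names ``collect_definitions``, ``render_plain`` and ``substitute``, and the ``[image: ...]`` plain-text placeholder are all hypothetical, and a nested directive is handled only in the simplest one-line form.

```python
import re

# Hypothetical patterns modelled on the examples quoted in this thread;
# they are assumptions, not taken from the real reStructuredText parser.
DEFINITION = re.compile(r"^\.\. /(?P<name>[^/]+)/ (?P<body>.+)$")
REFERENCE = re.compile(r"`/(?P<name>[^/`]+)/`")
DIRECTIVE = re.compile(r"^(?P<kind>\w+):: (?P<arg>.+)$")


def collect_definitions(lines):
    """Separate ordinary text lines from ``.. /name/ ...`` definitions."""
    definitions = {}
    body = []
    for line in lines:
        match = DEFINITION.match(line)
        if match:
            definitions[match.group("name")] = match.group("body")
        else:
            body.append(line)
    return body, definitions


def render_plain(replacement):
    """Render a replacement for plain-text output (assumed placeholder form)."""
    match = DIRECTIVE.match(replacement)
    if match:
        # A nested directive such as "image:: file.png" has no text form,
        # so this sketch falls back to a bracketed placeholder.
        return "[%s: %s]" % (match.group("kind"), match.group("arg"))
    return replacement


def substitute(text, definitions):
    """Replace each `/name/` reference; unknown names are left untouched."""
    def replace(match):
        name = match.group("name")
        if name in definitions:
            return render_plain(definitions[name])
        return match.group(0)
    return REFERENCE.sub(replace, text)


source = [
    "Clap your hands: `/text/` `/picture/`.",
    ".. /text/ If you're happy and you know it",
    ".. /picture/ image:: clapping_hands.png",
]
body, definitions = collect_definitions(source)
resolved = [substitute(line, definitions) for line in body]
```

The image case shows the output-dependence concern raised in point 2: what replaces ```/picture/``` has to be decided separately for each output format, and the placeholder here is just one possible choice for plain text.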
-- Paul Moore (paul.moore@atosorigin.com) Marked up with reStructuredText: http://structuredtext.sf.net/ From jaffray@pobox.com Wed Nov 7 15:53:49 2001 From: jaffray@pobox.com (Alan Jaffray) Date: Wed, 7 Nov 2001 10:53:49 -0500 (EST) Subject: [Doc-SIG] Alternative inline markup In-Reply-To: <00bc01c166b3$d1687eb0$545aa8c0@lslp862.int.lsl.co.uk> Message-ID: On Tue, 6 Nov 2001, Tony J Ibbs (Tibs) wrote: > Alan Jaffray wrote: > > If you are sick enough to try:: > > > > ***Strong enclosing emphasis*** > > **Strong enclosing *emphasis*** > > *Emphasis enclosing **strong*** > > > > then the first two will work and the third won't. > > And *that* is unacceptable OK, so forbid all three. General rule: A nested markup element cannot be delimited by the same characters as its parent. That's probably a better design decision. Do we lose anything? No. In the current spec you can't nest at all. In the proposed spec there are happier alternatives for all three, since ``*`` and ``**`` become mere sugar for tagged content. The reason I said "sick" is because I don't know a semantic meaning for "emphasized strong text" other than "the author wants to demonstrate a case where nesting is difficult to parse". :-) "Perverse" would have been a more accurate adjective. > I'd still vote for leaving nested inline markup as a "possible future > enhancement" - not having it gets us some large percentage of the way to > a perfect tool, and *definitely* leaves us with a *usable* tool (for the > vast majority of cases). Nesting is a fundamental feature. It's not going to become easier to add it later. It's going to become more difficult. Meanwhile, attempts to get around the need to add it will complicate and clutter the language, while adding it now can simplify matters. > Having to quote things because they "happen" to be a directive is not a > good idea, since directives are not predictably named (i.e., one cannot > easily tell which directives may exist at a given time). 
David has used > punctuation to discriminate in the places where such a thing might be an > issue, previously. Oh, I agree entirely. That's what I meant - URLs which contain punctuation (such as ``/``) making them illegal directive names don't have to be backquoted. If we were willing to forbid ``.`` and ``:`` in directive names, this would become the *vast* majority of cases. As it is, it's just a considerable majority. It misses relative links to files in the current directory and mailto URLs (or other similar schemes without ``//``). > Having to quote { and } in Python-related text is surely a non-starter > (heck, we don't want to have to quote < and > because one might have to > talk about *ML elements, so not being able to backquote dictionaries is > *surely* out). Really? Some of our documents talk about Python, and some of them talk about Perl which uses curly braces even more heavily, but I didn't think it'd be an issue. I would expect code fragments to be in *literal* blocks much more often than single backquotes. > Roles are supremely useful in Python docstrings, where one wants to > qualify :class:`Fred` as opposed to :attribute:`Fred` (to use them > incorrectly - but that's another argument). A more complex scheme for > doing this is *not* a good thing (and splattering the information about > the document is "more complex"). You still have those. Instead of ``:attribute:`Fred``` you write ```attribute:: Fred```. Directives are slightly expanded to subsume the role of roles. And if you ever want to give that "class" directive/role some arguments or otherwise make it more complex, you can do that, which you couldn't with the current roles. 
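The general rule proposed earlier in this message ("a nested markup element cannot be delimited by the same characters as its parent") can be sketched as a simple check. This is a rough illustration only: the names ``may_nest`` and ``valid_nesting`` are hypothetical, and "same characters" is read here as "shares any delimiter character", which is the reading under which all three of the ``*``/``**`` combinations above are forbidden.

```python
def may_nest(parent_delimiter, child_delimiter):
    """Return True if a child element may appear directly inside its parent.

    Assumed reading of the rule: nesting is forbidden whenever the two
    delimiters share at least one character, so "*" and "**" clash.
    """
    return not (set(parent_delimiter) & set(child_delimiter))


def valid_nesting(delimiters):
    """Check a chain of delimiters, listed from outermost to innermost.

    Only each parent/child pair is compared, as the rule mentions
    nothing beyond the immediate parent.
    """
    return all(
        may_nest(parent, child)
        for parent, child in zip(delimiters, delimiters[1:])
    )


# Strong enclosing emphasis and emphasis enclosing strong are both rejected.
strong_around_emphasis = may_nest("**", "*")
emphasis_around_strong = may_nest("*", "**")
# Emphasis inside backquoted (tagged) text is accepted.
emphasis_in_backquotes = may_nest("`", "*")
```

Under this reading the check depends only on the delimiter characters, not on what the elements mean.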
Alan From Paul.Moore@atosorigin.com Wed Nov 7 16:37:32 2001 From: Paul.Moore@atosorigin.com (Moore, Paul) Date: Wed, 7 Nov 2001 16:37:32 -0000 Subject: [Doc-SIG] Alternative inline markup Message-ID: <714DFA46B9BBD0119CD000805FC1F53B01B5B0BA@UKRUX002.rundc.uk.origin-it.com> From: Alan Jaffray [mailto:jaffray@pobox.com] > On Tue, 6 Nov 2001, Tony J Ibbs (Tibs) wrote: > > Alan Jaffray wrote: > > > If you are sick enough to try:: > > > > > > ***Strong enclosing emphasis*** > > > **Strong enclosing *emphasis*** > > > *Emphasis enclosing **strong*** > > > > > > then the first two will work and the third won't. > > > > And *that* is unacceptable > > OK, so forbid all three. General rule: A nested markup element cannot > be delimited by the same characters as its parent. That's probably a > better design decision. That makes the nesting rules dependent on the markup characters involved. That's a *very* odd distinction to make - it implies that there is an argument for changing the markup for strong text to, say ``!strong!``, as that makes it nestable (!!). > Do we lose anything? No. In the current spec you can't nest at all. > In the proposed spec there are happier alternatives for all three, > since ``*`` and ``**`` become mere sugar for tagged content. We lose consistency, which is what I am arguing is crucial. > The reason I said "sick" is because I don't know a semantic > meaning for "emphasized strong text" other than "the author > wants to demonstrate a case where nesting is difficult to > parse". :-) Bold italic, in most web browsers. That's not to say I feel the need to support it, but it's a *perfectly* sensible thing. > Nesting is a fundamental feature. It's not going to become easier > to add it later. It's going to become more difficult. Meanwhile, > attempts to get around the need to add it will complicate and > clutter the language, while adding it now can simplify matters. I probably agree here - nesting is a fundamental issue. 
It's just that I disagree that that fact makes it necessary to support it. On the contrary, I'd say that lack of nesting is a distinguishing, simplifying, feature of the design. You can't get around it - the language doesn't support nesting, and unless that is changed, it means that it simply isn't *possible* to use emphasized strong text in reST. More relevantly, it means that you can't emphasize parts of a hyperlink. This is a more realistic requirement, but I *still* don't see it as so earth-shattering that we have to accept either inconsistent or complex nesting rules (all options I've seen so far are one or the other of these...) just to support it. > > Having to quote things because they "happen" to be a > > directive is not a good idea, since directives are not > > predictably named > > Oh, I agree entirely. That's what I meant - URLs which contain > punctuation (such as ``/``) making them illegal directive names > don't have to be backquoted. If we were willing to forbid ``.`` > and ``:`` in directive names, this would become the *vast* > majority of cases. As it is, it's just a considerable majority. > It misses relative links to files in the current directory and > mailto URLs (or other similar schemes without ``//``). Regardless of the actual rules, there's still the restriction that certain things *require* quoting, where others don't. Inconsistency again. In fact, making the special cases rarer is arguably worse, as it makes it less likely that people will remember the exceptions. But we disagree on whether people should be able to write reST without reading the spec. I believe that things should be deducible from examples, you feel that attempting to use markup without knowing the rules is foolish. (I hope I didn't misrepresent you - I'm not trying to argue that your position is wrong, just point out that we have differing perspectives). > Really? 
Some of our documents talk about Python, and some of > them talk about Perl which uses curly braces even more heavily, > but I didn't think it'd be an issue. I would expect code > fragments to be in *literal* blocks much more often than single > backquotes. Agreed (about use of literal blocks), but I think keeping the special characters to a minimum is an important goal. In particular, code becomes unreadable if it needs *any* escaping at all. ("Is that markup, or am I supposed to type that?") And while braces-as-delimiters are probably only seen in literal blocks, it's quite reasonable to talk about dictionaries (ie, {'a':1, 'b':2}) inline... > You still have those. Instead of ``:attribute:`Fred``` you write > ```attribute:: Fred```. Directives are slightly expanded to subsume > the role of roles. And if you ever want to give that "class" > directive/role some arguments or otherwise make it more complex, > you can do that, which you couldn't with the current roles. I thought that interpreted text (ie, anything in \`...\`, with the exception of substitutions and hyperlinks, which have extra delimiting characters) was entirely application-defined. You seem to be suggesting some standard interpretation of the contents to cover attributes. [1]_ .. [1] By the way, you do realise that in advocating nesting, you are making the construct:: ```attribute:: Fred``` which you just used, illegal? (Or at least different.) At the moment it is a literal display of markup. What would it be with nesting? Or is \`\`...\`\` [2]_ another example of an exception to the nesting rule? .. [2] Boy, it's hard to discuss markup using marked up text... Paul. 
-- Paul Moore (paul.moore@atosorigin.com) Marked up with reStructuredText: http://structuredtext.sf.net/ From jaffray@pobox.com Wed Nov 7 19:05:25 2001 From: jaffray@pobox.com (Alan Jaffray) Date: Wed, 7 Nov 2001 14:05:25 -0500 (EST) Subject: [Doc-SIG] Alternative inline markup In-Reply-To: Message-ID: On Tue, 6 Nov 2001, David Goodger wrote: > Thanks for your input. I wish it had come earlier though! Sorry. Combination of illness and much-needed vacation. > > 1) Inline markup can be nested:: > ... > > Ambiguity is resolved with close tags first, then maximal munch. > > I don't follow. Clarify please? Forget it, bad idea. :) > The easiest way I see to implement this is to first identify the outer > inline markup as we do now, then recursively scan for nested inline > markup. It won't work in the general case though, as explained below. > Changing the parse algorithm to be fully nested-inline-markup-friendly > could be difficult and/or ambiguous. I'm not sure the general case > *can* be done, without a lot of exceptions and special rules, which > complexity I'm not willing to add to reStructuredText. Is "markup can't be delimited with the same character as its parent" too complicated? > > 2) An underscore suffix currently modifies the preceding text by > > making it a link. This notion is extended - the suffix indicates > > that the text is to be tagged in some way, indicated by a > > directive or destination URL in the target:: > > > > I had lunch with Jonathan_ today. We talked about Zope_. > > > > .. _Jonathan: lj [user=jhl] > > .. _Zope: http://www.zope.org/ > > Interesting idea, putting arbitrary constructs in the link target. > However, for consistency that depends on two things: > > 1. The link text remains behind, untouched except for being > "activated" in some way. > 2. There must *be* a link target. Corollary: the reference must *be* > a reference. I agree with (2) but not (1). 
Here's the principle I'm going on: A reStructuredText-to-plaintext converter should modify the non-directive parts of the document as little as possible. The marked-up text should "read" like non-marked-up text. > What will "Jonathan" become? ``Jonathan`` or some such. After that, it's an output format issue. For the given application I would expect the default text output to be ``Jonathan`` and the default HTML output to be:: Jonathan but I believe the HTML can be customized through the style engine. It has also been suggested that the lj-user tags could be used to track "who am I talking about" or "who's talking about whom". > For example, say I want Jonathan's user > icon to appear in my paragraph:: > > I had lunch with [Jonathan's icon here] today. > > How do I do this *without* having a hyperlink at the same time? The way you'd write this paragraph in plaintext is:: I had lunch with Jonathan today. This implies that the reStructuredText paragraph should be:: I had lunch with Jonathan_ today. Then follow it with:: .. _Jonathan: lj-icon jhl or the like. If you're really referring to the icon itself, rather than referring to Jonathan but using his icon in graphical output, then you'd say something like:: Jonathan has a goofy icon: `Jonathan's icon`__ __ lj-icon jhl > On the other hand, we could say that the trailing-underscore syntax > doesn't signify a hyperlink reference, but only indicates a "tagging > reference". Yes. > A tagging reference becomes a hyperlink reference if the contents of > the "tag" resolve to a hyperlink. Or, rather, a hyperlink *is* a type of tag, and ``__ http://python.org`` is just sugar for ``__ link http://python.org``. We're not adding a construct. We're replacing a construct with a more general one. > > Link targets which are also legal directive names must be > > enclosed in backquotes. > > The frequency of link targets would far outweigh directives, so > markup would suffer from extra syntax on targets. 
Anything with a slash or an at-sign doesn't need to be escaped. This is the vast majority of cases. In all the reStructuredText documentation, you don't have a single target that would need quoting. The fraction of links on my site that would require quoting is also tiny. > > 4) Inside markup delimited by backquotes or curly braces, curly > > braces may be used as delimiters equivalent to backquotes:: > ... > > This is because backquotes don't nest. > > There's no difference between backquotes and asterisks with regard to > nesting. True. Asterisks don't nest either. :-) I guess I wasn't clear. I should have said "inside tagged content, curly braces may be used to delimit tagged content". I'm referring solely to:: `Putting {a tag}_ inside another tag`_ > Why the fixation on curly braces? :> We only have four pairs of nesting characters on the keyboard, and everyone wants them. I think braces are more available than the other three. (And I say this despite wanting to use reST for other programming languages besides Python; Python is relatively light on braces.) > > 5) Roles can go away. We don't need them. Optionally if we want > > the ability to put short directive names inline, we could > > declare :: > > > > `foo:: bar bar bar` > > Similar syntax has already been considered and rejected. See > http://structuredtext.sf.net/spec/alternatives.txt, "Interpreted Text > 'Roles'" alternative 1. Alternative 1 is more ambiguous than what I'm suggesting, and does not have the benefit of consistency with out-of-line directives. However, which syntax to use for simple inline directives is a minor side issue, and I shouldn't have combined it with this proposal. More important is whether "roles" and "directives" and "tags" should all be unified. I think they should. It adds both simplicity and power. > > Summary: > > > > - We gain nesting. > > Not without significant work, though. If it's even possible > unambiguously, it can be added independently later. 
I don't mind writing code, but I'd rather not fork to do it. > > - We gain arbitrary extensibility of inline markup. > > - We gain substitutions. > > But at the expense of complicating hyperlinks. > > - We lose by occasionally having to escape a curly brace inside > > backquotes, or quote a hyperlink target with no ``/`` or ``#`` > > characters to distinguish it from a directive name. > > That last one is a significant loss. I really don't think so. Cases where you have to quote the target are few and far between. Alan From jaffray@pobox.com Wed Nov 7 19:20:46 2001 From: jaffray@pobox.com (Alan Jaffray) Date: Wed, 7 Nov 2001 14:20:46 -0500 (EST) Subject: [Doc-SIG] Alternative inline markup In-Reply-To: <714DFA46B9BBD0119CD000805FC1F53B01B5B0AC@UKRUX002.rundc.uk.origin-it.com> Message-ID: On Tue, 6 Nov 2001, Moore, Paul wrote: > My personal goal is to get it seen as a "normal mindset" for people who want > to add a little structure to their plain text writing - wherever that may be. We're on the same page here. In fact, that's exactly how I explained it to a coworker yesterday. He asked "so this'll make it easier to mark up stuff we send around in email to put it on the intranet?" I replied "Ideally, you shouldn't have to mark it up at all! It's simple enough that you should be able to just use it in everyday messages and process them as is, I'm already doing that on one of my mailing lists." He thought that was nifty. :-) I realize that what I'm suggesting sounds more complicated. However, that's largely because I'm explaining it relative to the current spec, not from scratch. I'm keeping in mind "does this make it easier or more difficult to explain how to use the language in under two minutes" (or under 40 lines of email) because I'm going to have to do just that. I don't think I'm compromising that goal; if anything what I'm suggesting brings us *closer*. > I want a full implementation **now**. (And if I don't get one, I'll thcream > and thcream until I'm thick.
Hell, me too. :) I'm already writing rST and converting it to XML and HTML, and I'm writing a Zope product akin to the existing STXDocument to use rST in Zope without explicitly invoking converters. I'm planning to train people on it and start having them use it by next *week*. (Freeze or no freeze - if the spec changes on me after that, well, I'll suffer.) > This went on too long. But I'm convinced - we should freeze the design now, > and work on implementation. We can't hit a moving target. (At the very > least, the DOM needs to be frozen so that work on output has a firm > basis...) Can we at least add support for inline nested markup to the DOM before freezing, even if the current parser doesn't support it, so those of us who want to add it can do so without breaking everything in existence? Surely that wouldn't be too difficult. Alan From jaffray@pobox.com Wed Nov 7 20:05:39 2001 From: jaffray@pobox.com (Alan Jaffray) Date: Wed, 7 Nov 2001 15:05:39 -0500 (EST) Subject: [Doc-SIG] Alternative inline markup In-Reply-To: <714DFA46B9BBD0119CD000805FC1F53B01B5B0BA@UKRUX002.rundc.uk.origin-it.com> Message-ID: On Wed, 7 Nov 2001, Moore, Paul wrote: > From: Alan Jaffray [mailto:jaffray@pobox.com] > > > > The reason I said "sick" is because I don't know a semantic > > meaning for "emphasized strong text" other than "the author > > wants to demonstrate a case where nesting is difficult to > > parse". :-) > > Bold italic, in most web browsers. That's not semantics or structure, it's presentation. Honestly, I don't mind the idea of having "bold" and "italic" tags in the language, but if structural purity is a goal, then we shouldn't treat "emphasis" and "strong emphasis" as euphemisms for "italic" and "bold". If "emphasized strong emphasis" has a meaning, it's not a terribly important one. :-) > You can't get around it - the language doesn't support nesting, and unless > that is changed, it means that it simply isn't *possible* to use emphasized > strong text in reST. 
More relevantly, it means that you can't emphasize > parts of a hyperlink. This is a more realistic requirement, but I *still* > don't see it as so earth-shattering that we have to accept either > inconsistent or complex nesting rules (all options I've seen so far are one > or the other of these...) just to support it. I think being able to link anything - emphasized text, class and attribute names, other inline interpreted/tagged text - is a basic need. I don't think "you can't put emphasis inside strong emphasis" is too great an inconsistency to accept to support the vast majority of nesting cases. > But we disagree on whether people should be able to write reST without > reading the spec. Actually, no, I agree with you completely on this. What we might disagree on is whether people should be able to learn **all** of reST without reading some documentation. I don't intend to teach my users nesting in the two-minute rundown. I'll tell them how to do titles and lists and links and emphasis and literals and literal blocks and what to escape. They'll probably try to link emphasized text on their own at some point, since it's a natural thing to do; it'd be nice if that Just Worked. If I get another five minutes, I'll tell them about field lists and tables and footnotes and definition lists and nested markup and transitions and whatever else. > I believe that things should be deducible from examples, > you feel that attempting to use markup without knowing the rules is foolish. I shouldn't have made the "foolish" comment. What I meant was this: If you were a user and you wanted to write emphasized strong emphasized text, and you typed ``***foo***``, wouldn't you look at the ``***`` just a little funny and maybe wonder whether or not it would work, and either check the docs or try it out before being really surprised if it didn't work? I know that I looked at ``````` a little funny the first time I used it. 
(And if you didn't raise an eyebrow at my inline literal triple backquotes, you're far more unflappable than I am. :-) ) > (I hope I didn't misrepresent you - I'm not trying to argue that your > position is wrong, just point out that we have differing perspectives). > > > Really? Some of our documents talk about Python, and some of > > them talk about Perl which uses curly braces even more heavily, > > but I didn't think it'd be an issue. I would expect code > > fragments to be in *literal* blocks much more often than single > > backquotes. > > Agreed (about use of literal blocks), but I think keeping the special > characters to a minimum is an important goal. I agree. One cool thing about more powerful inline tagging is that we don't have to add any more special characters. If we want to add substitution, or inline images, or the Spanish Inquisition, we can do that with the tag syntax rather than more punctuation. > In particular, code becomes > unreadable if it needs *any* escaping at all. ("Is that markup, or am I > supposed to type that?") And while braces-as-delimiters are probably only > seen in literal blocks, it's quite reasonable to talk about dictionaries > (ie, {'a':1, 'b':2}) inline... I misspoke again. I meant "literal text", not just literal blocks. I'd write that ``{'a':1, 'b':2}`` whether or not braces had any special meaning, just out of habit, both for output formatting and to avoid any problems with backquotes or backslashes. But even with my suggestion, you can still write {'a':1, 'b':2}. What you can't write without escaping is:: `{'a':1, 'b':2}`__ __ /docs/dictionary.html I really can't think of a common case where link text would contain curly braces, even when you're talking about code a lot. > > You still have those. Instead of ``:attribute:`Fred``` you write > > ```attribute:: Fred```. Directives are slightly expanded to subsume > > the role of roles. 
And if you ever want to give that "class" > > directive/role some arguments or otherwise make it more complex, > > you can do that, which you couldn't with the current roles. > > I thought that interpreted text (ie, anything in \`...\`, with the exception > of substitutions and hyperlinks, which have extra delimiting characters) was > entirely application-defined. You seem to be suggesting some standard > interpretation of the contents to cover attributes. [1]_ To cover roles. In general I was suggesting that ``:rolename:`text``` could be written as ```rolename:: text```. However, as I wrote in my reply to David, this is a side issue; keeping the ``:rolename:`text``` syntax wouldn't really change things. Fergetaboutit. > .. [1] By the way, you do realise that in advocating nesting, > you are making the construct:: > > ```attribute:: Fred``` > > which you just used, illegal? (Or at least different.) > At the moment it is a literal display of markup. What would > it be with nesting? It'd be a literal display of markup. As the spec says: No markup interpretation (including backslash-escape interpretation) is done within inline literals. Anyway, summary: I think we have the same goals. But I think the suggestions I'm making aren't as difficult or complex as you seem to believe. Alan From gustav@morpheus.demon.co.uk Wed Nov 7 22:27:55 2001 From: gustav@morpheus.demon.co.uk (Paul Moore) Date: Wed, 07 Nov 2001 22:27:55 +0000 Subject: [Doc-SIG] Alternative inline markup In-Reply-To: References: <714DFA46B9BBD0119CD000805FC1F53B01B5B0BA@UKRUX002.rundc.uk.origin-it.com> Message-ID: On Wed, 7 Nov 2001 15:05:39 -0500 (EST), Alan Jaffray wrote: >On Wed, 7 Nov 2001, Moore, Paul wrote: >> Bold italic, in most web browsers. > >That's not semantics or structure, it's presentation. You're right. Sorry. >I think being able to link anything - emphasized text, class and attribute >names, other inline interpreted/tagged text - is a basic need.
I don't >think "you can't put emphasis inside strong emphasis" is too great an >inconsistency to accept to support the vast majority of nesting cases. Hmm. I'll have to think about this. I'm not sure I agree (that linking anything is a basic need). I suspect that there are very few cases where I'd try to link anything other than straight text - in fact, other than a single word or *very* short phrase. And where I was inclined to link something more complex, a rewrite would probably remove the need (and simplify the text at the same time). But I'm not sure, and my needs aren't the only ones involved. >What we might disagree on is whether people should be able to learn >**all** of reST without reading some documentation. I don't think that people should be able to do this. But you're offering a two-minute rundown. What is "obvious" depends strongly on what goes into that rundown [1]_. If nesting isn't allowed, it takes a couple of seconds to say "nesting isn't allowed - anywhere". But if you don't say it, I agree people might well expect it to work. It depends on what counts as basic tenets... .. [1] I know I'm making too much of your passing comment - I hope my point is still valid. >> In particular, code becomes >> unreadable if it needs *any* escaping at all. ("Is that markup, or am I >> supposed to type that?") And while braces-as-delimiters are probably only >> seen in literal blocks, it's quite reasonable to talk about dictionaries >> (ie, {'a':1, 'b':2}) inline... > >I misspoke again. I meant "literal text", not just literal blocks. > >I'd write that ``{'a':1, 'b':2}`` whether or not braces had any special >meaning, just out of habit, both for output formatting and to avoid any >problems with backquotes or backslashes. Possibly I would, too. But I'd rather be doing it because I thought it was clearer in raw form, than because I wasn't sure what the effect of the non-literal version was. >But even with my suggestion, you can still write {'a':1, 'b':2}.
What >you can't write without escaping is:: > > `{'a':1, 'b':2}`__ > > __ /docs/dictionary.html Snmfrglph. Can you encapsulate the rule that makes this true in a simple statement? David's substitutions can be easily encapsulated as "starts with ```/`` and ends with ``/```". Are you defining a construct which starts with ```{`` and ends with ``}`__``? If so, please clarify what it is (I've lost context here, and forgotten). If not, what *is* the rule? >I really can't think of a common case where link text would contain >curly braces, even when you're talking about code a lot. See above. I can't see link text containing curly braces ever. But that's not the point - I'm not so much worried about losing the option, as understanding what I'm losing it *for*. Consistency is high on my list of priorities. I don't really like David's ```/subst/``` syntax, either, because it loses the "anything in \` characters is interpreted text" rule. (Which links don't break for me, as the rule "anything ending in ``_`` is a link" overrides it) [2]_. I can't find a rule matching your syntax here... .. [2] Note - these "rules" are just my mental heuristics, not specs. >Anyway, summary: I think we have the same goals. But I think the >suggestions I'm making aren't as difficult or complex as you seem >to believe. You may well be right. This message (yours) has certainly eased some of my concerns, in theory. But I'd like to see: 1. Clearer definitions of the syntax rules, in the context of the reST spec (so I can see what are exceptions, and what fall naturally into the overall scheme). 2. More use cases and examples, in a context others can relate to. Your user example does nothing for me, as I can't see how it would fit into my model (marked up text documents being processed into a printable form, but readable in "raw" form as well). 3. Better separation of distinct cases. I think there is more than one concept being discussed at once, here...
Of course, whether you feel the need to provide for these unreasonable demands is entirely up to you... Thanks for keeping up with this - we seem to be getting closer. Paul. From jaffray@pobox.com Wed Nov 7 23:07:28 2001 From: jaffray@pobox.com (Alan Jaffray) Date: Wed, 7 Nov 2001 18:07:28 -0500 (EST) Subject: [Doc-SIG] Alternative inline markup In-Reply-To: Message-ID: On Wed, 7 Nov 2001, Paul Moore wrote: > On Wed, 7 Nov 2001, Alan Jaffray wrote: > > >But even with my suggestion, you can still write {'a':1, 'b':2}. What > >you can't write without escaping is:: > > > > `{'a':1, 'b':2}`__ > > > > __ /docs/dictionary.html > > Snmfrglph. Can you encapsulate the rule that makes this true in a simple > statement. David's substitutions can be easily encapsulated as "starts > with ```/`` and ends with ``/```". Are you defining a construct which > starts with ```{`` and ends with ``}`__``? Ack, no! I'm saying that in the existing construct :: `content content content`__ curly braces in the content would have to be escaped. So if the content was ``{'a':1, 'b':2}`` you'd have to write ``\{'a':1, 'b':2\}`` instead. This is certainly ugly, the same way escaping backslashes and backquotes in single-backquote-delimited content is ugly. Fortunately it should be exceedingly rare. > You may well be right. This message (yours) has certainly eased some of > my concerns, in theory. But I'd like to see: > > 1. Clearer definitions of the syntax rules, in the context of the > reST spec (so I can see what are exceptions, and what fall naturally > into the overall scheme. > > 2. More use cases, and examples. In context others can relate to. Your > user example does nothing for me, as I can't see how it would fit > into my model (marked up text documents being processed into a > printable form, ut eradable in "raw" form as well). > > 3. Better separation of distinct cases. I think there is more than > one concept being discussed at once, here... Oh, certainly! These are all necessary. 
They'll be easier to provide now that you and others have offered feedback. I knew there were issues with my latest proposal, but it seemed better to get *something* out there rather than continue to ponder in isolation. I'll try to send out an edited and clearer proposal with a wider variety of examples within a day or two. > Thanks for keeping up with this - we seem to be getting closer. Agreed. Thanks for taking the time to comment. Alan From Paul.Moore@atosorigin.com Thu Nov 8 09:55:13 2001 From: Paul.Moore@atosorigin.com (Moore, Paul) Date: Thu, 8 Nov 2001 09:55:13 -0000 Subject: [Doc-SIG] Alternative inline markup Message-ID: <714DFA46B9BBD0119CD000805FC1F53B01B5B0BB@UKRUX002.rundc.uk.origin-it.com> From: Alan Jaffray [mailto:jaffray@pobox.com] > Ack, no! I'm saying that in the existing construct :: > > `content content content`__ > > curly braces in the content would have to be escaped. So if > the content was ``{'a':1, 'b':2}`` you'd have to write > ``\{'a':1, 'b':2\}`` instead. Question: precisely what markup are you referring to here? See the spec, but we have:: `xxxxx`_ - named hyperlink reference (type 2) `xxxxx`__ - anonymous hyperlink reference (type 2) _`xxxx` - inline hyperlink targets `/xxx/` - substitution reference `xxxxx` - interpreted text (assuming we limit ourselves to markup which limits itself to using single "`" characters). As you can see, there's a wealth of different possibilities here. I'll assume you *really* mean that in **hyperlink references** (a specific type of markup, as defined in the spec) that curly braces have special meaning. That's fine - I'm not keen, as I see no value in this, but if that *is* what you mean, you should say so. And you'll also have to *define* that meaning. That's the hard bit. 
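Paul's list of single-backquote constructs can be read as an ordered set of rules: the forms with extra delimiting characters are tried first, and anything left over is plain interpreted text. The following is a minimal Python sketch of that reading, not the reST parser's actual implementation. The names ``classify`` and ``PATTERNS`` are hypothetical, and the patterns assume the construct has already been isolated as a single token with no backquotes inside its content.

```python
import re

# One entry per construct in Paul's list. Order matters: the forms with
# extra delimiters must be tried before plain interpreted text, which would
# otherwise match a substitution reference as well.
# (Hypothetical names; this is an illustration, not the reST parser.)
PATTERNS = [
    ("anonymous hyperlink reference", re.compile(r"`[^`]+`__")),
    ("named hyperlink reference", re.compile(r"`[^`]+`_")),
    ("inline hyperlink target", re.compile(r"_`[^`]+`")),
    ("substitution reference", re.compile(r"`/[^`/]+/`")),
    ("interpreted text", re.compile(r"`[^`]+`")),
]


def classify(token):
    """Return the name of the first construct that matches the whole token.

    Returns None when the token is not a single-backquote construct at all.
    """
    for name, pattern in PATTERNS:
        if pattern.fullmatch(token):
            return name
    return None


if __name__ == "__main__":
    for sample in ["`xxxxx`_", "`xxxxx`__", "_`xxxx`", "`/xxx/`", "`xxxxx`"]:
        print(sample, "->", classify(sample))
```

Under this reading the delimiters alone decide the construct, so a token such as `{'a':1, 'b':2}`__ is an anonymous hyperlink reference whatever its content. Any special meaning for curly braces would therefore have to be defined as a rule about the content of that construct.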
You seem to be saying that, within a hyperlink reference, constructs with start-string="{" and end-string="}_" should *also* be named hyperlink references, and that constructs with start-string="{" and end-string="}__" should be anonymous hyperlink references. Yes? No? But how can you word the rationale for this ("because the normal \` character doesn't nest") in such a way that it sounds reasonable in the context of the spec? Once you put it in pseudo-legalese like this, the contradiction becomes clear. You are proposing a second syntax for hyperlink references, which is *only* valid within current-syntax hyperlink references, in order to get over the fact that the existing syntax doesn't nest. This is bogus. Either we should fix the nesting issue at the "top level", or we should not make special-case changes. The general feeling seems to be that nesting is *not* worth adding. It's interesting to note that in HTML, <A> elements *don't* nest - so prior art here says that nesting isn't useful for links. Apologies if I've misinterpreted what you propose - hopefully, the misinterpretation is valuable in pointing out where you are still not getting your point across :-) Paul. From tony@lsl.co.uk Thu Nov 8 10:27:20 2001 From: tony@lsl.co.uk (Tony J Ibbs (Tibs)) Date: Thu, 8 Nov 2001 10:27:20 -0000 Subject: [Doc-SIG] Alternative inline markup In-Reply-To: Message-ID: <011901c1683f$f2d8b000$545aa8c0@lslp862.int.lsl.co.uk> Alan and Paul seem to be teasing meaning out of/into the debate whilst I sleep, but I can't resist making *some* comments... Despite my earlier comment about complexity worries, I *do* believe that raising issues and worrying around them causes a "language" to be stronger, even if the result is eventually to reject the proposals - provided one keeps a record of *why* it was rejected, of course (which David seems to be doing for such issues).
Alan Jaffray wrote: > The reason I said "sick" is because I don't know a semantic > meaning for "emphasized strong text" other than "the author > wants to demonstrate a case where nesting is difficult to > parse". :-) "Perverse" would have been a more accurate adjective. Well, it's quite clear to me what the author is trying to do, if one thinks in the final presentation terms - they presumably want a bold italic text. Not at all unreasonable, surely, if one allows both bold and italic? (and yes, I *know* they're not technically indicating presentation with *..* and **..**, but I bet most people forget that when they're actually typing!) In reply to Paul saying much the same, Alan responded: > That's not semantics or structure, it's presentation. Honestly, > I don't mind the idea of having "bold" and "italic" tags in the > language, but if structural purity is a goal, then we shouldn't > treat "emphasis" and "strong emphasis" as euphemisms for "italic" > and "bold". If "emphasized strong emphasis" has a meaning, it's > not a terribly important one. :-) But worrying about it because they're *called* "emphasis" and "strong emphasis" is non-sensible, too - would you be happier if I called them "flurgle" and "splurgle"? Surely one couldn't object to someone wanting to flurgle their splurgled text, for clarity? (I guess my *real* point is that the provision of two forms is *actually* derived from the two typographic forms. To then say that back-applying the semantic distinction to the typographic distinction doesn't make sense may be true theoretically, but it isn't much use in real life, in the mindspace that people *want* to work in.) > Nesting is a fundamental feature. True. > It's not going to become easier to add it later. True, possibly (modulo experience gained in doing such things). > It's going to become more difficult. I don't know the innards of David's parser well enough to comment on that, but I doubt it. 
Certainly, if *I* were writing a parser for reST, and could use any tool I like (so it would be mxTextTools, then), it would not be any harder to add it in later on. > Meanwhile, attempts to get around the need to add it will > complicate and clutter the language, while adding it now > can simplify matters. Hmm. So far as *I'm* concerned, I can live without it - but then I'm mostly typing, well, text. Your needs seem to be more complex, but I'm not convinced that David's suggestions don't go a good way to coping with them (OK, maybe not terribly well, that's for you to explain), and they certainly extend the language about as far as I can cope with, for now. As to adding the ability to the DTD, so that other implementations can do it (was that what Alan meant?) - there's a problem if the *core* (reference) implementation doesn't handle something this fundamental - I don't like the idea that the distributed-with-Python tool would grumble about texts that appear to conform to the format they're meant to be supporting - I don't think that would fly. Hmm - I like Paul Moore's summary of backquoted markup possibilities: > `xxxxx`_ - named hyperlink reference (type 2) > `xxxxx`__ - anonymous hyperlink reference (type 2) > _`xxxx` - inline hyperlink targets > `/xxx/` - substitution reference > `xxxxx` - interpreted text I'll have to keep that to hand... Alan Jaffray wrote, in response to me: > > Having to quote { and } in Python-related text is surely a > > non-starter (heck, we don't want to have to quote < and > > > because one might have to talk about *ML elements, so not > > being able to backquote dictionaries is *surely* out). > > Really? Some of our documents talk about Python, and some of > them talk about Perl which uses curly braces even more heavily, > but I didn't think it'd be an issue. I would expect code > fragments to be in *literal* blocks much more often than single > backquotes. Historical decision. 
In one thread of discussion (looking at maybe using < and > to delimit URIs), Guido made the point that he often wanted to write XML elements within his documentation text, and it would not be acceptable, to him, to have to quote them. This seemed like a valid point (heck, it was the BDFL who said it). I believe (although I can't cite cases from memory) that there has also been a similar feeling that *having* to quote elements of code, particularly Python code (since we *did* start out looking at docstrings) would be a Bad Thing (it also has the advantage that a surprising number of docstrings are probably already close to being valid reST). Thus we *want* people still to be able to write:: This function takes a and splits it into a series of elements. To do this, it uses a dictionary which is something like {nl:'\n',cr='\r'}. *Why* it wants to do this is beyond my ken. You can, of course, immediately see why there was a loud argument about whether backslash could/should be used as an escape character for reST! Tibs -- Tony J Ibbs (Tibs) http://www.tibsnjoan.co.uk/ "No one trike will do everything... buy the whole set!" - Rob Hague My views! Mine! Mine! (Unless Laser-Scan ask nicely to borrow them.) From jaffray@pobox.com Thu Nov 8 18:33:00 2001 From: jaffray@pobox.com (Alan Jaffray) Date: Thu, 8 Nov 2001 13:33:00 -0500 (EST) Subject: [Doc-SIG] Zope product: RSTDocument 0.1 Message-ID: Not sure how many Zopistas are on the list... Let me emphasize that this is not a general release. I want to get feedback from people on this list before going further. reStructuredText Document ========================= The RSTDocument_ product provides a document class for reStructuredText_ documents. It is in *very* preliminary form. It depends on Xalan_, it can't be customized from ZMI, there's no help, no error handling, etc. There's also the minor issue that rST itself hasn't yet been finalized. But hey, it's better than nothing. 
You can create an RSTDocument_, then call it (or the index_html or html methods) to get HTML output, or call the xml or rst or text methods to get output in those formats. It works. Or at least it works for me. Known Issues ------------ It sucks. To Do ----- Everything. Credits ------- The code has been blatantly ripped off from philiKON_'s STXDocument_ class, and modified using the latest sophisticated find-and-replace coding techniques. STXDocument_ is in turn derived largely from the built-in DTMLDocument class. Blame ----- The author of this mess is `Alan Jaffray`_. Send comments, patches, or other feedback to him; any suggestions are welcome, since I'm a Zope newbie and need all the help I can get. Alternately, disregard this code entirely, take a couple of hours and write something better. .. _RSTDocument: http://pobox.com/~jaffray/rst/RSTDocument.tgz .. _reStructuredText: http://structuredtext.sourceforge.net .. _Xalan: http://xml.apache.org/xalan-c/ .. _philiKON: http://www.philikon.de .. _STXDocument: http://www.zope.org/Members/philikon/STXDocument .. _Alan Jaffray: http://pobox.com/~jaffray/ From jaffray@pobox.com Thu Nov 8 18:56:17 2001 From: jaffray@pobox.com (Alan Jaffray) Date: Thu, 8 Nov 2001 13:56:17 -0500 (EST) Subject: [Doc-SIG] Zope product: RSTDocument 0.1 In-Reply-To: Message-ID: Oh, you'll need to install Xalan (no relation) and change XALAN_PATH in RSTDocument/RSTDocument.py for this to work.
From jaffray@pobox.com Thu Nov 8 19:48:08 2001 From: jaffray@pobox.com (Alan Jaffray) Date: Thu, 8 Nov 2001 14:48:08 -0500 (EST) Subject: [Doc-SIG] Alternative inline markup In-Reply-To: <714DFA46B9BBD0119CD000805FC1F53B01B5B0BB@UKRUX002.rundc.uk.origin-it.com> Message-ID: On Thu, 8 Nov 2001, Moore, Paul wrote: > The general feeling seems to be that nesting is *not* worth adding. It's > interesting to note that in HTML, <A> elements *don't* nest - so prior art > here says that nesting isn't useful for links.
Uh, that's not how I read the `HTML4 DTD`__ ... :: <!ELEMENT A - - (%inline;)* -(A) -- anchor --> <!ENTITY % inline "#PCDATA | %fontstyle; | %phrase; | %special; | %formctrl;"> <!ENTITY % special "A | IMG | OBJECT | BR | SCRIPT | MAP | Q | SUB | SUP | SPAN | BDO"> __ http://www.w3.org/TR/html4/struct/links.html#edef-A > Apologies if I've misinterpreted what you propose - hopefully, the > misinterpretation is valuable in pointing out where you are still not > getting your point across :-) Well, it's become clear that I need to provide much stronger arguments for why richer inline markup is important. I feel pretty strongly that this is something that will come to bite us later, and not very much later, if we want rST to thrive as a general markup. I have use for it now, and I have relatively simple applications and have barely started to use the language. And yeah, it can be added later, but I *really* don't want to head down the road of writing reStructuredTextWithNesting while someone else writes reStructuredTextWithNestingDoneSomewhatDifferent and someone else writes reStructuredTextWithChocolateSprinkles and the whole incompatible dialect-proliferation fiasco that STX has gotten itself into. Long email with more examples coming soon, I suppose. Alan From Juergen Hermann" Message-ID: On Thu, 8 Nov 2001 14:48:08 -0500 (EST), Alan Jaffray wrote: >Uh, that's not how I read the `HTML4 DTD`__ ... :: > > You did not read the "-(A)". From jaffray@pobox.com Thu Nov 8 20:53:14 2001 From: jaffray@pobox.com (Alan Jaffray) Date: Thu, 8 Nov 2001 15:53:14 -0500 (EST) Subject: [Doc-SIG] Alternative inline markup In-Reply-To: Message-ID: On Thu, 8 Nov 2001, Juergen Hermann wrote: > On Thu, 8 Nov 2001 14:48:08 -0500 (EST), Alan Jaffray wrote: > > >Uh, that's not how I read the `HTML4 DTD`__ ... :: > > > > > > You did not read the "-(A)". Ah, thanks. I can understand not being able to nest links inside links. Not nesting anything inside links, or anything inside a more generalized tag construct, is another matter... Alan From fdrake@acm.org Fri Nov 9 08:23:16 2001 From: fdrake@acm.org (Fred L.
Drake) Date: Fri, 9 Nov 2001 03:23:16 -0500 (EST) Subject: [Doc-SIG] [development doc updates] Message-ID: <20011109082316.750AD28697@beowolf.digicool.com> The development version of the documentation has been updated: http://python.sourceforge.net/devel-docs/ A variety of miscellaneous updates have been made over the past several days, most involving typographical or grammatical corrections. Some small clarifications have been made. From Paul.Moore@atosorigin.com Fri Nov 9 09:25:55 2001 From: Paul.Moore@atosorigin.com (Moore, Paul) Date: Fri, 9 Nov 2001 09:25:55 -0000 Subject: [Doc-SIG] Alternative inline markup Message-ID: <714DFA46B9BBD0119CD000805FC1F53B01B5B0C2@UKRUX002.rundc.uk.origin-it.com> From: Alan Jaffray [mailto:jaffray@pobox.com] > > It's interesting to note that in HTML, <A> elements > > *don't* nest - so prior art here says that nesting > > isn't useful for links. > > Uh, that's not how I read the `HTML4 DTD`__ ... :: > > <!ELEMENT A - - (%inline;)* -(A) -- anchor --> > <!ENTITY % inline "#PCDATA | %fontstyle; | %phrase; | > %special; | %formctrl;"> > <!ENTITY % special "A | IMG | OBJECT | BR | SCRIPT | MAP > | Q | SUB | SUP | SPAN | BDO"> Sorry, I didn't look at the spec. Pragmatist that I am, I tried it :-) On IE5.5 (not a bastion of standards-compliance, I know...) nested links confuse it horribly. In <a href="1">first<a href="2">nested</a>second</a>, "first" becomes a link to "1", "nested" becomes a link to "2" and "second" is not a link at all! Even ignoring IE5.5, what would you expect "nested" to link to? (I know, "2", but where's the nesting in that?) So what I'm saying is that nesting is only semantically meaningful in certain cases (assuming your point that there's no semantic meaning to nested strong and emphasized, then I'm not sure we actually have a case of meaningful nesting at all, but ignore that for now...). We need to codify when nesting makes sense, *before* we decide how to spell it in those cases. Your mission, should you choose to accept it [:-)], is to write down, in excruciating legalese, precisely which inline constructs (as per the reST spec) should nest. My answer is fairly simple - none.
(I would have said that strong and emphasized could nest in each other, but I've been convinced that I'm thinking in display terms rather than semantics. And I can see reasons for allowing strong/emphasized to go inside links, but I don't think those reasons are strong enough to make the case.) Paul. From Paul.Moore@atosorigin.com Fri Nov 9 09:27:27 2001 From: Paul.Moore@atosorigin.com (Moore, Paul) Date: Fri, 9 Nov 2001 09:27:27 -0000 Subject: [Doc-SIG] Alternative inline markup Message-ID: <714DFA46B9BBD0119CD000805FC1F53B01B5B0C3@UKRUX002.rundc.uk.origin-it.com> From: Juergen Hermann [mailto:jh@web.de] > On Thu, 8 Nov 2001 14:48:08 -0500 (EST), Alan Jaffray wrote: > > >Uh, that's not how I read the `HTML4 DTD`__ ... :: > > > > > > You did not read the "-(A)". Spoilsport! I've just sent a message explaining that even though the standard allows it, I don't think it's sensible. Now you tell me the standard was on my side after all. There's no fun left in life... :-) Paul. From tony@lsl.co.uk Fri Nov 9 10:22:59 2001 From: tony@lsl.co.uk (Tony J Ibbs (Tibs)) Date: Fri, 9 Nov 2001 10:22:59 -0000 Subject: [Doc-SIG] Alternative inline markup In-Reply-To: Message-ID: <012501c16908$81480480$545aa8c0@lslp862.int.lsl.co.uk> Congrats to Alan on announcing RSTDocument - not something I'm likely to use in the foreseeable future, but still a Good Thing! Meanwhile, back at the discussion on reST itself... Alan Jaffray wrote: > Well, it's become clear that I need to provide much stronger > arguments for why richer inline markup is important. I feel > pretty strongly that this is something that will come to bite > us later, and not very much later, if we want rST to thrive > as a general markup. I have use for it now, and I have > relatively simple applications and have barely started to > use the language. A feeling like that is very valuable, but, as you say, the rest of us need articulation to understand the basis for it. 
Sufficient examples will lead to either agreement, or "aha - but the way you do it is this", or, at worst, a decision that the problem isn't going to be addressed, but at least we understand why not. All of those are stronger positions to be in. > And yeah, it can be added later, but I *really* don't want > to head down the road of writing reStructuredTextWithNesting > while someone else writes reStructuredTextWithNestingDone > SomewhatDifferent and someone else writing > reStructuredTextWithChocolateSprinkles and the whole incompatible > dialect-proliferation fiasco that STX has gotten itself into. I hope that won't happen (judicial use of modes should prevent some of it). But even with the best will in the world you can't stop it entirely (in my paid field of work we have an example of a large company that was involved in standards work, with one head, developing an incompatible variant of the same standard with another head - if one head had talked to the other, the standard would almost certainly have been extended in the relevant manner (the requirements were real and sensible), but unfortunately it didn't happen. You can't stop people from acting without foresight.). So, yes, it would be nice to solve it - but the case is still to be made that we *have* to do it. And I *do* think (from historical experience) that *deciding* how to treat nested inline markup is difficult - I have generally had a very laissez-faire attitude to the problem (so I would allow almost anything that might make sense to someone, and expect the parser to cope), whereas David and Edward Loper have had a much stronger wish to be able to *describe* in a simple manner what is and is not allowed. > Long email with more examples coming soon, I suppose. Interesting times to come! Tibs -- Tony J Ibbs (Tibs) http://www.tibsnjoan.co.uk/ "No one trike will do everything... buy the whole set!" - Rob Hague My views! Mine! Mine! (Unless Laser-Scan ask nicely to borrow them.) 
From ping@lfw.org Fri Nov 9 10:52:25 2001 From: ping@lfw.org (Ka-Ping Yee) Date: Fri, 9 Nov 2001 02:52:25 -0800 (PST) Subject: [Doc-SIG] Alternative inline markup In-Reply-To: <012501c16908$81480480$545aa8c0@lslp862.int.lsl.co.uk> Message-ID: On Fri, 9 Nov 2001, Tony J Ibbs (Tibs) wrote: > And I *do* think (from historical experience) that *deciding* how to > treat nested inline markup is difficult - I have generally had a very > laissez-faire attitude to the problem (so I would allow almost anything > that might make sense to someone, and expect the parser to cope), > whereas David and Edward Loper have had a much stronger wish to be able > to *describe* in a simple manner what is and is not allowed. I agree with the latter position. My two cents, in short: If you have to run the system in order to know exactly what output you will get, the system design is broken. -- ?!ng From tony@lsl.co.uk Fri Nov 9 11:17:38 2001 From: tony@lsl.co.uk (Tony J Ibbs (Tibs)) Date: Fri, 9 Nov 2001 11:17:38 -0000 Subject: [Doc-SIG] Alternative inline markup In-Reply-To: Message-ID: <012901c16910$23c7bb90$545aa8c0@lslp862.int.lsl.co.uk> Ka-Ping Yee wrote: > On Fri, 9 Nov 2001, Tony J Ibbs (Tibs) wrote: > > And I *do* think (from historical experience) that *deciding* how to > > treat nested inline markup is difficult - I have generally > > had a very laissez-faire attitude to the problem (so I would allow > > almost anything that might make sense to someone, and expect the > > parser to cope), whereas David and Edward Loper have had a much > > stronger wish to be able to *describe* in a simple manner what > > is and is not allowed. > > I agree with the latter position. My two cents, in short: > > If you have to run the system in order to know exactly > what output you will get, the system design is broken. Oh, I agree with that as well, but that isn't *quite* what I was trying to say (words - great for communicating, but somehow not sufficient to get meaning across). 
David and Edward have seemed (to me) to want to determine formal rules first, and work outwards from them. Whereas I have wanted to establish what people expect to work, and work the formal rules out backwards from that (possibly not explicitly). This tends to mean that I believe constructs like ``***fred* some text *jim***`` are required to be allowed a priori, and that if this is difficult to implement, then tough. But it does land me in difficulty explaining what I actually want! And my favourite example of why nested inline markup is going to be difficult to decide is now:: ```something``` - is that an interpreted literal, or a literal with single backquotes at each end? And if it is one, how does one get the other? (this is *much* more interesting than triple asterisks). *Maybe* (just maybe) this is an insoluble problem, in the general case. Which would mean that use cases to define what is *actually* needed (if anything) become essential. Tibs (sorry - I'm actually concentrating on something else in the background, so forgive any vagueness) -- Tony J Ibbs (Tibs) http://www.tibsnjoan.co.uk/ "No one trike will do everything... buy the whole set!" - Rob Hague My views! Mine! Mine! (Unless Laser-Scan ask nicely to borrow them.) From goodger@users.sourceforge.net Sat Nov 10 03:20:53 2001 From: goodger@users.sourceforge.net (David Goodger) Date: Fri, 09 Nov 2001 22:20:53 -0500 Subject: [Doc-SIG] My address Message-ID: > > [Tony] > > > hmm, maybe we should insist he does documentation instead of > > > coding!!!!) > > > > And how are you going to make me? Coding is much more fun! > > By withholding those candy treats you ask for in the README files > (how *are* we meant to get them to you without a postal address?). Here it is: David Goodger 41 Morrison Road Kitchener, Ontario N2A 2W6 Canada I'll be watching my mailbox with anticipation! 
-- David Goodger goodger@users.sourceforge.net Open-source projects: - Python Docstring Processing System: http://docstring.sourceforge.net - reStructuredText: http://structuredtext.sourceforge.net - The Go Tools Project: http://gotools.sourceforge.net From goodger@users.sourceforge.net Sat Nov 10 03:23:16 2001 From: goodger@users.sourceforge.net (David Goodger) Date: Fri, 09 Nov 2001 22:23:16 -0500 Subject: [Doc-SIG] nested inline markup (was RE: Alternative inline markup) In-Reply-To: <012901c16910$23c7bb90$545aa8c0@lslp862.int.lsl.co.uk> Message-ID: [I'm splitting up the discussions because they really are independent issues. Keeping up one thread is confusing. ("My brain hurts!") Please keep the threads separated, or my brain will explode. Thanks. ("Oh my God he's burst his brain!") This post contains replies to a bunch of other posts. I've tried to pull them into some semblance of coherence, but there may be redundancy, ramblings, and misquotes. Apologies in advance.] [Alan] > Nesting is a fundamental feature. It's not going to become easier to > add it later. It's going to become more difficult. Meanwhile, > attempts to get around the need to add it will complicate and > clutter the language, while adding it now can simplify matters. The problem is that in the what-you-see-is-more-or-less-what-you-get markup language that is reStructuredText, the symbols used for inline markup ("*", "**", "`", "``", etc.) may preclude nesting. I've thought over how we might implement nested inline markup. The first algorithm ("first identify the outer inline markup as we do now, then recursively scan for nested inline markup") won't work; counterexamples were given in my last post. 
The second algorithm makes my head hurt::

    while 1:
        scan for start-string
        if found:
            push on stack
            scan for start or end string
            if new start string found:
                recurse
            elif matching end string found:
                pop stack
            elif non-matching end string found:
                if it's a markup error:
                    generate warning
                elif the initial start-string was misinterpreted:
                    # e.g. in this case: ***strong** in emphasis*
                    restart with the other interpretation
                    # but it might be several layers back
                ...
        ...

This is similar to how the parser does section title recognition, but
sections are much more regular and deterministic.

Bottom line is, I don't think the benefits are worth the effort, even
if it is possible. I'm not going to try to write the code, at least
not now. If somebody codes up a consistent, working, general
solution, I'll be happy to consider it.

[Paul]
> I probably agree here - nesting is a fundamental issue. It's just
> that I disagree that that fact makes it necessary to support it. On
> the contrary, I'd say that lack of nesting is a distinguishing,
> simplifying, feature of the design.
>
> You can't get around it - the language doesn't support nesting, and
> unless that is changed, it means that it simply isn't *possible* to
> use emphasized strong text in reST. More relevantly, it means that
> you can't emphasize parts of a hyperlink. This is a more realistic
> requirement, but I *still* don't see it as so earth-shattering that
> we have to accept either inconsistent or complex nesting rules (all
> options I've seen so far are one or the other of these...) just to
> support it.

Well said.

[Paul]
> That makes the nesting rules dependent on the markup characters
> involved. That's a *very* odd distinction to make - it implies that
> there is an argument for changing the markup for strong text to, say
> ``!strong!``, as that makes it nestable (!!).

Good point.

[Alan]
> > Do we lose anything? No. In the current spec you can't nest at
> > all.
> > In the proposed spec there are happier alternatives for all
> > three, since ``*`` and ``**`` become mere sugar for tagged
> > content.

[Paul]
> We lose consistency, which is what I am arguing is crucial.

I agree.

> > The reason I said "sick" is because I don't know a semantic
> > meaning for "emphasized strong text" other than "the author
> > wants to demonstrate a case where nesting is difficult to
> > parse". :-)
>
> Bold italic, in most web browsers. That's not to say I feel the need
> to support it, but it's a *perfectly* sensible thing.

We could add a new inline markup construct: ***strongemph***. Still
not nestable, but it gives us bold-italic. Then we could add
****fluorescent****, *****blinking*****, and ******SCREAMING****** as
well. Or use asterisk strings whose lengths are powers of 2:
****fluorescent****, ********blinking********,
****************SCREAMING****************. This would allow arbitrary
combinations (5 asterisks means fluorescent emphasis, 26 means strong
blinking screaming, etc.).

But we won't.

[Alan]
> That's not semantics or structure, it's presentation. Honestly, I
> don't mind the idea of having "bold" and "italic" tags in the
> language, but if structural purity is a goal, then we shouldn't
> treat "emphasis" and "strong emphasis" as euphemisms for "italic"
> and "bold". If "emphasized strong emphasis" has a meaning, it's not
> a terribly important one. :-)

That's a debate that has raged for years and never been properly
resolved as far as I know. My take on it is, "emphasis" is often
typographically represented by italics, but it could just as easily be
represented by the colour red or blinking or a different typeface or
by *asterisks*. The term "emphasis" doesn't tie you to any one
representation, so you're free to choose what suits best. Same for
"strong".

[Paul]
> But we disagree on whether people should be able to write reST
> without reading the spec.
I believe that things should be deducible > from examples, you feel that attempting to use markup without > knowing the rules is foolish. Learning by example without reference to the spec (or at least the quickref) is possible, up to a point. Once the more advanced features are encountered and used, the reference materials are a necessity. > .. [1] By the way, you do realise that in advocating nesting, > you are making the construct:: > > ```attribute:: Fred``` > > which you just used, illegal? No, it's the inline equivalent of a literal block: :: `attribute:: Fred` > At the moment it is a literal display of markup. What would > it be with nesting? Same. Inline literals ("``") explicitly do no further processing of their contents. > .. [2] Boy, it's hard to discuss markup using marked up text... Use inline literals or (especially in this case) literal blocks. And when inline markup start-strings are not in a start-string context, they're not recognized anyway. That's a feature! [Alan] > Is "markup can't be delimited with the same character as its parent" > too complicated? No, but it is limiting. [Alan] > > > 4) Inside markup delimited by backquotes or curly braces, curly > > > braces may be used as delimiters equivalent to backquotes:: > > ... > > > This is because backquotes don't nest. [David] > > There's no difference between backquotes and asterisks with regard > > to nesting. [Alan] > True. Asterisks don't nest either. :-) I guess I wasn't clear. I > should have said "inside tagged content, curly braces may be used to > delimit tagged content". I'm referring solely to:: > > `Putting {a tag}_ inside another tag`_ This is too much. -1 (just on braces for tags in tags, independent of the "tagging reference" proposal) [Alan] > > > Summary: > > > > > > - We gain nesting. [David] > > Not without significant work, though. If it's even possible > > unambiguously, it can be added independently later. 
[Alan] > I don't mind writing code, but I'd rather not fork to do it. No fork necessary. If it can be done cleanly, and if you or someone else does code it, we'll incorporate it into the parser. > Can we at least add support for inline nested markup to the DOM > before freezing, even if the current parser doesn't support it, so > those of us who want to add it can do so without breaking everything > in existence? Surely that wouldn't be too difficult. Not difficult, and valid for other markup syntaxes... done. See "Inline Elements" in http://docstring.sf.net/spec/gpdi.dtd, especially read the caveats. [Tony] > As to adding the ability to the DTD, so that other implementations > can do it (was that what Alan meant?) I've added the ability to the DTD so that other *markups* can do inline nesting, possibly reStructuredText as well. The DTD represents the internal data structure, not the input markup. (Of course, I've modified it based on the needs of reStructuredText, the only input markup available. That, combined with my experience, intuition, and desire for a generic data structure, has shaped the DTD.) [Alan] > I think being able to link anything - emphasized text, class and > attribute names, other inline interpreted/tagged text - is a basic > need. I don't think "you can't put emphasis inside strong emphasis" > is too great an inconsistency to accept to support the vast majority > of nesting cases. It may be a basic need, but for now at least, it is a basic limitation of reStructuredText that you can't do it. [Alan] > And yeah, it can be added later, but I *really* don't want to head > down the road of writing reStructuredTextWithNesting while someone > else writes reStructuredTextWithNestingDoneSomewhatDifferent and > someone else writing reStructuredTextWithChocolateSprinkles and the > whole incompatible dialect-proliferation fiasco that STX has gotten > itself into. 
I think the main reason for the StructuredText dialect proliferation,
which reStructuredText hopes to put an end to (but may be seen as just
another example of), is that there was no easy way to add your own
little extensions. reStructuredText is much richer in terms of
supplied constructs, and thanks to directives and substitutions
(thanks to Alan for lighting the fire under that last one),
application-specific extensions *are* easy to add.

[Tony]
> And my favourite example of why nested inline markup is going to be
> difficult to decide is now::
>
>     ```something```
>
> - is that an interpreted literal, or a literal with single
> backquotes at each end?

Something implicit in the parser and never spelled out explicitly in
the spec (will do so), is the "order of operations" for resolving
inline markup. "``" and "`/" and "_`" are checked before "`"; "**" is
checked before "*"; standalone URIs are checked for last of all. So
the above example resolves to::

    `something`

> And if it is one, how does one get the other?

It seems to be impossible to get::

    something

Although::

    `were nesting allowed, ``something`` like this would be OK`

might result in::

    were nesting allowed, something like this would be OK

The text between the interpreted start-string (`) and the literal
start-string (``) would be enough to differentiate. It's the same
problem as ``***emphasized strong or strong emphasis?***``, except
that unlike emphasis & strong, interpreted & literal are not
commutable.
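
To make the "order of operations" concrete, here is a minimal Python
sketch. It is not the actual parser code: the table ``START_STRINGS``,
the function ``classify_start``, and the kind labels are names invented
for this illustration, and standalone URI recognition (checked last of
all) is left out entirely.

```python
# Start-strings in the priority order described above: the longer or more
# specific strings are tried before the shorter ones they contain.
# The kind labels are illustrative names, not parser identifiers.
START_STRINGS = [
    ("``", "literal"),
    ("`/", "substitution reference"),
    ("_`", "inline target"),
    ("`", "interpreted text"),
    ("**", "strong"),
    ("*", "emphasis"),
]


def classify_start(text, pos=0):
    """Return (start_string, kind) for the markup beginning at pos, or None."""
    for start, kind in START_STRINGS:
        if text.startswith(start, pos):
            return start, kind
    return None


if __name__ == "__main__":
    for sample in ["```something```", "***text***", "`/name/`", "plain"]:
        print(repr(sample), "->", classify_start(sample))
```

Under these assumptions, three leading backquotes are claimed by the
"``" entry before "`" is ever tried, which is why Tony's example comes
out as a literal rather than as interpreted text.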
-- David Goodger goodger@users.sourceforge.net Open-source projects: - Python Docstring Processing System: http://docstring.sourceforge.net - reStructuredText: http://structuredtext.sourceforge.net - The Go Tools Project: http://gotools.sourceforge.net From goodger@users.sourceforge.net Sat Nov 10 03:25:24 2001 From: goodger@users.sourceforge.net (David Goodger) Date: Fri, 09 Nov 2001 22:25:24 -0500 Subject: [Doc-SIG] substitutions or tagging references or inline directives (was RE: Alternative inline markup) In-Reply-To: <012901c16910$23c7bb90$545aa8c0@lslp862.int.lsl.co.uk> Message-ID: [I'm splitting up the discussions because they really are independent issues. Keeping up one thread is confusing. ("My brain hurts!") Please keep the threads separated, or my brain will explode. Thanks. ("Oh my God he's burst his brain!") This post contains replies to a bunch of other posts. I've tried to pull them into some semblance of coherence, but there may be redundancy, ramblings, and misquotes. Apologies in advance.] [Paul] > 1. I'm not keen on the overloading of the \` character in the > syntax. Substitutions look like interpreted text. The slashes don't stand out enough? Perhaps a more obtrusive markup is in order? ;-) Seriously though, suggestions for alternative syntax are welcome. (Note: You don't actually have to escape the backquote in these and other cases: ` "`" (`). Try it with the parser!) > 2. I'm not 100% clear on the semantics, either. To my mind, pure > text substitutions are a bad thing - put the text inline instead. > Otherwise, you are making the marked up text *less* readable. > `/swim/`? > .. /swim/ See what I mean Yes. They're easily abused. Perhaps any use is abuse. We could remove the replacement text aspect and leave the directive only. That would mean that an extra level of indirection plus a directive is needed for the abuse:: Do you `/swim/`? .. 
/swim/ text:: See what I mean

This type of abuse cannot be prevented, but the extra step would
discourage potential abusers. Yes, you've convinced me to remove
"replacement text" from the spec. That makes substitutions easier to
understand as well: just one case.

> And the semantics of anything else is effectively
> output-processor dependent.

I don't see how. As it stands, given this input::

    `/Peace/`, `/love/`, and `/Tux/`.
    (It's an IBM ad. Tux is the Linux penguin.)

    .. /peace/ image:: peace.png
    .. /love/ image:: heart.png
    .. /tux/ image:: tux.png

The parser will turn it into::

    Peace , love , and Tux .
    (It's an IBM ad. Tux is the Linux penguin.)

After transformation, this will turn into::

    , , and .
    (It's an IBM ad.)

This is still output-processor-independent. User-defined directives
may well be output-processor-dependent though; it can't be avoided.

> Even your image example (the least contentious possibility for
> this sort of thing) may not be renderable in certain output
> formats (ASCII text, for instance!!!).

That's not a valid argument. Converting to ASCII text almost
invariably loses information. Almost *none* of the reStructuredText
constructs are directly representable in ASCII text. 90% of web
content depends on graphics. Somebody could insert one of those
bitmap-to-ASCII-matrix converters, resulting in huge ASCII graphics
displays that you have to stand back from and squint to see.

> 3. Overall, I'd like to see clearer examples of what all this could
> be *for*. ... I'd like to see a "motivation" section, with more
> of this sort of example.

So would I! As I get around to it, I will add them. Documentation
patches are welcome!

> .. [1] Actually, having used the syntax in my example above,
> I find I quite like it. But point (2) still stands - it's not
> clear (to me) that the construct is *useful*.

I think substitutions will be useful at times, but I haven't had a
real need for them yet. Now that the syntax is there, I may find
occasion to use it.
Most importantly, substitutions tie up the last loose end that would prevent the reStructuredText spec from being "complete". There are sets of structural and body elements, with directives to extend them as required. There's a set of inline elements, now with substitutions to extend *them* too. > > I've asked this before, and I really would like to know: what does > > the "lj" tag *do* in the end? Can you show us some HTML output? > > That gets back to my point - what's the motivation? And for me, I'd > like to see more than just HTML output. What would such a document > show in PDF/PostScript intended for printing? (If the answer is "you > don't use it in that context", then we're getting too > domain-specific). If the "lj" tag is meant for an internal Wiki application, then the lack of printed-output support is not really an issue. I suggested HTML because I wanted to understand the problem better. > And the converse is, that *any* request for extra features should be > addressable with "use a (substitution|directive)". If not, then we > need to rethink these constructs, to understand why they aren't > doing their job properly. Yes. > Hmm, by my own argument, that implies that roles should be > *replaced* by substitutions. Maybe the fact that they can't means > that there is still something to address here. Substitutions don't > take "parameters" (the interpreted text part of a role). Roles don't > have the "supplemental information" (the directive-like bit in the > ``..`` section - I don't know what to call it) that substitutions > do. See my post "Clarification: interpreted text vs. directives vs. substitutions". [Alan] > Instead of ``:attribute:`Fred``` you write ```attribute:: Fred```. > Directives are slightly expanded to subsume the role of roles. And > if you ever want to give that "class" directive/role some arguments > or otherwise make it more complex, you can do that, which you > couldn't with the current roles. 
See my post "Clarification: interpreted text vs. directives vs. substitutions". [Alan] > > > 2) An underscore suffix currently modifies the preceding text by > > > making it a link. This notion is extended - the suffix > > > indicates that the text is to be tagged in some way, > > > indicated by a directive or destination URL in the target:: > > > > > > I had lunch with Jonathan_ today. We talked about Zope_. > > > > > > .. _Jonathan: lj [user=jhl] > > > .. _Zope: http://www.zope.org/ [David] > > Interesting idea, putting arbitrary constructs in the link target. > > However, for consistency that depends on two things: > > > > 1. The link text remains behind, untouched except for being > > "activated" in some way. > > 2. There must *be* a link target. Corollary: the reference must > > *be* a reference. [Alan] > I agree with (2) but not (1). That's the opposite of what I first expected to hear. Just to confirm: what I meant was, given this input:: this is a `link to something`_ The output would contain the (possibly formatted but otherwise *unaltered*) text:: this is a link to something For hyperlink references, I absolutely require (1). I'd be willing to entertain an extension such that (2) is no longer true (i.e., it depends on the processed contents of the "target), but not the other way around. > Here's the principle I'm going on: A reStructuredText-to-plaintext > converter should modify the non-directive parts of the document as > little as possible. The marked-up text should "read" like > non-marked-up text. I don't see how this follows from the above. > > What will "Jonathan" become? > > ``Jonathan`` or some such. > After that, it's an output format issue. > > For the given application I would expect the default text output to > be ``Jonathan`` and the default HTML output to be:: > > > Jonathan So they *are* hyperlinks, in this case at least. 
> > For example, say I want Jonathan's user > > icon to appear in my paragraph:: > > > > I had lunch with [Jonathan's icon here] today. > > > > How do I do this *without* having a hyperlink at the same time? > > The way you'd write this paragraph in plaintext is:: > > I had lunch with Jonathan today. > > This implies that the reStructuredText paragraph should be:: > > I had lunch with Jonathan_ today. > > Then follow it with:: > > .. _Jonathan: lj-icon jhl > > or the like. If you're really referring to the icon itself, rather > than referring to Jonathan but using his icon in graphical output, > then you'd say something like:: > > Jonathan has a goofy icon: `Jonathan's icon`__ > > __ lj-icon jhl It still looks like a hyperlink to me. My objection remains: if we change the meaning of "_", then when looking at a *reference*, I can't tell if it's a hyperlink or not. (BTW, what does "lj" stand for?) > > On the other hand, we could say that the trailing-underscore > > syntax doesn't signify a hyperlink reference, but only indicates a > > "tagging reference". > > Yes. OK, I think I understand your proposal now. I've updated "Inline Substitutions" in http://structuredtext.sf.net/spec/alternatives.txt to reflect this understanding. Please let me know if I'm mistaken. > > A tagging reference becomes a hyperlink reference if the contents > > of the "tag" resolve to a hyperlink. > > Or, rather, a hyperlink *is* a type of tag, and ``__ > http://python.org`` is just sugar for ``__ link http://python.org``. > We're not adding a construct. We're replacing a construct with a > more general one. We'd be replacing a well-understood, specific construct, hyperlinks, for which the syntax heuristics and mnemonics are well-understood: - "_" denotes a reference when it's a suffix, and a dereference when it's a prefix. Think of "_" as a right-pointing arrow. - "_" was chosen from Setext partly because it goes well with what we see in web pages. Typically, a hyperlink is underlined. 
The problem I'm having with this is that we're redefining the syntax for a relatively straightforward concept, hyperlinks, to a much more complex one. Hyperlink refs become a special case of "tagged references". Whether ``this_`` would become a hyperlink reference, an icon, or something else entirely, is only apparent from examination of the target: ``.. _this:``. We could say that ``this_`` is "*almost always* a hyperlink reference (except when the target happens to contain a directive)", but that's too vague; it's just a thin veil. When I say ``this_``, I'm making a reference whose text contains "this" (the text "this" would be underlined in the HTML). If ``this_`` were a "tagged reference" instead, the text "this" might be replaced by some other text entirely, or a graphic, or nothing at all. If hyperlink references are just one special case of tagged references, I don't like the path that the text "this" takes: - "this" is ripped out of the surrounding text. - Later in the processing (a post-parse transformation), something else (the product of the "tagging target" or "substitution") is put in its place. - Elsewhere, there's a ``.. _this: http://some.uri`` target which is really a shortcut for ``.. _this: link_target:: http://some.uri``. This is processed into a which is stored, waiting for the aforementioned transformation to pick it up. One problem is, the "this" in the reference may not match the "this" in the target. The reference may be ``This_`` or ``THIS_``, while the target is ``.. _this: ...`` or ``.. _tHiS: ...``. The will be constructed at parse time, so its contents will get the *target's* text. Anonymous targets wouldn't know what text to use. OK, so to solve this problem, when we do the substitution transformation, we take the text from the ``This_`` reference and plug it in to the elements. This seems far too complicated and roundabout. Hyperlink references are common enough to require their own unique syntax and direct conceptual resolution. 
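
The name-matching problem above can be sketched in a few lines of
Python. This is only an illustration, assuming that reference names are
matched to target names case-insensitively (collapsing runs of
whitespace is a further assumption made for this sketch); ``normalize``
and ``resolve`` are hypothetical helpers, not parser functions. The
point is that the reference keeps its own text and only the URI comes
from the target, which is the direct resolution argued for here.

```python
def normalize(name):
    """Fold case and collapse whitespace so This, THIS and tHiS all match."""
    return " ".join(name.lower().split())


def resolve(reference_text, targets):
    """Return (display_text, uri) for a hyperlink reference.

    The display text is the reference's own text, unaltered; only the
    URI is looked up, using the normalized name as the key.
    """
    key = normalize(reference_text)
    if key not in targets:
        raise KeyError("no target for reference %r" % reference_text)
    return reference_text, targets[key]


if __name__ == "__main__":
    # Equivalent of ``.. _tHiS: http://some.uri`` in the source text.
    targets = {normalize("tHiS"): "http://some.uri"}
    for ref in ["this", "This", "THIS"]:
        print(resolve(ref, targets))
```

With a direct lookup like this, the text of the target never has to be
copied anywhere, so the ``This_`` versus ``.. _tHiS:`` mismatch never
arises.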
> > > Link targets which are also legal directive names must be > > > enclosed in backquotes. > > > > The frequency of link targets would far outweigh directives, so > > markup would suffer from extra syntax on targets. > > Anything with a slash or an at-sign doesn't need to be escaped. > This is the vast majority of cases. That's not acceptable. However, changing the target syntax as I showed earlier removes this restriction:: .. _name: directive:: data So it's not really an issue. I already added one special case when we added the indirect target syntax, ``.. _target: reference_``. Simple URI's cannot end with an underscore unless it's escaped. The value of indirect targets outweighed the cost of the special case. In this case however, the cost is too high, so the extra syntax ("::") to *avoid* the cost is worthwhile. [Alan] > > > 5) Roles can go away. We don't need them. Optionally if we > > > want the ability to put short directive names inline, we > > > could declare :: > > > > > > `foo:: bar bar bar` [David] > > Similar syntax has already been considered and rejected. See > > http://structuredtext.sf.net/spec/alternatives.txt, "Interpreted > > Text 'Roles'" alternative 1. [Alan] > Alternative 1 is more ambiguous than what I'm suggesting, and does > not have the benefit of consistency with out-of-line directives. > > However, which syntax to use for simple inline directives is a minor > side issue, and I shouldn't have combined it with this proposal. > More important is whether "roles" and "directives" and "tags" should > all be unified. I think they should. It adds both simplicity and > power. I don't think they should. Inline text is another form of inline markup, the role usually inferred from context. Explicit "roles" are like adjectives: strong, literal, red, cold. (Not all role names are adjectives though.) Roles describe their text but don't alter it; they're for extending the built-in set of descriptive inline tags. 
"Directives" are a constructive extension mechanism. They're for arbitrary structures where there is no built-in syntax. I understand your concept of "tags" to be the inline equivalent of directives. But the "tagged" text *is* inline, and we're striving for readability, so we don't want all the details to be inline. "Tags" could be called "indirect inline directive references". So "roles" and "tags" are not compatible; they can't be unified. "Tags" and "directives" *are* compatible though. "Tags" are just inline references to directives. > I realize that what I'm suggesting sounds more complicated. However, > that's largely because I"m explaining it relative to the current > spec, not from scratch. I'm keeping in mind "does this make it > easier or more difficult to explain how to use the language in under > two minutes" (or under 40 lines of email) because I'm going to have > to do just that. I don't think I'm compromising that goal; if > anything what I'm suggesting brings us *closer*. I find the overloading of what is now "hyperlink reference" syntax is very confusing and adds a lot of conceptual complexity. Heck, it's taken this long for *me* to "get it", to understand what you're proposing. > I'm writing a Zope product akin to the existing STXDocument to use > rST in Zope without explicitly invoking converters. I'm planning to > train people on it and start having them use it by next *week*. > (Freeze or no freeze - if the spec changes on me after that, well, > I'll suffer.) That's great. Feedback from actual use will be much appreciated. > One cool thing about more powerful inline tagging is that we don't > have to add any more special characters. If we want to add > substitution, or inline images, or the Spanish Inquisition, we can > do that with the tag syntax rather than more punctuation. At the expense of shoehorning inline tagging onto the same syntax as hyperlink references. [Paul] > Consistency is high on my list of priorities. 
> I don't really like
> David's ```/subst/``` syntax, either, because it loses the "anything
> in \` characters is interpreted text" rule.

The ```/subst/``` syntax is just the best I could come up with. If I
hear a better suggestion, I'll take it.

[Paul]
> > Are you defining a construct which starts with ```{`` and ends
> > with ``}`__``?

[Alan]
> Ack, no! I'm saying that in the existing construct ::
>
>     `content content content`__
>
> curly braces in the content would have to be escaped. So if the
> content was ``{'a':1, 'b':2}`` you'd have to write ``\{'a':1,
> 'b':2\}`` instead.

To reiterate, I can't see ever accepting this embedded curly brace
syntax, so you might as well drop it. It's confusing the main issue
anyhow.

[Alan]
> Well, it's become clear that I need to provide much stronger
> arguments for why richer inline markup is important. I feel pretty
> strongly that this is something that will come to bite us later, and
> not very much later, if we want rST to thrive as a general markup.

reStructuredText will never be a general-purpose markup like XML or
TeX. It's a limited, what-you-see-is-pretty-close-to-what-you-get
markup syntax, for converting plain text to structured formats. As
such, we must accept certain limitations. Interpreted text,
directives, and substitutions all extend the basic set of constructs
so that any "almost there" structure can be represented.

> I'll try to send out an edited and clearer proposal with a wider
> variety of examples within a day or two.

Looking forward to it.

--
David Goodger goodger@users.sourceforge.net
Open-source projects:
- Python Docstring Processing System: http://docstring.sourceforge.net
- reStructuredText: http://structuredtext.sourceforge.net
- The Go Tools Project: http://gotools.sourceforge.net

From goodger@users.sourceforge.net Sat Nov 10 03:29:15 2001
From: goodger@users.sourceforge.net (David Goodger)
Date: Fri, 09 Nov 2001 22:29:15 -0500
Subject: [Doc-SIG] Clarification: interpreted text vs.
directives vs. substitutions Message-ID: I think there may be a fundamental misunderstanding as to the intended function of interpreted text, directives, and substitutions. Easy to understand, since the spec is lacking in explanation. I will update the spec based on these discussions. "Interpreted text" is text that is meant to be related, indexed, linked, summarized, or otherwise processed, but the text itself is left alone. The text is "tagged" directly, in-place. Interpreted text may become a hyperlink reference, or may be character-formatted, or may become an entry in an index, but the words within the backquotes in the source text will not be altered. The "role" determines how the text is interpreted. Roles are simply extensions of the available inline constructs; to emphasis, strong, literal, and reference, we can add "index entry" or "acronym" or "class" or "red" or "blinking" or anything else we want. Originally, I intended interpreted text to do some of what substitutions now do (I had mentioned a ``:graphic:`picture.png``` example at some point). I think that was a mistake. By adding substitutions, the meaning of interpreted text becomes clear. The "role" should be inferred from context whenever possible. Using the eventual Python source "Reader" component, the role of each bit of interpreted text can be entirely inferred from the namespace of the context:: class B(A): """Use the `run()` method to do the work.""" def __init__(self): """Set `self.a`.""" self.a = 1 def run(self): """Extend `A.run()`. `A` is an abstract base class.""" pass In the "B" class docstring, the reference to "run()" gets a role of "method". In the "B.__init__()" docstring, "self.a" is an "instance_attribute". In the "B.run()" docstring, "A.run()" is a "method" and "A" is a "class". At each point in the definition, the role is inferred from the local namespace, an analog to the way Python itself does namespace lookup. 
The processed docstrings might represent these as links to their definitions. Another possible application is for index entries:: Attach the `gizmo`:index: to the `widget`:index: with the `whatsit`:index:. After processing, the interpreted text itself would be unchanged:: Attach the gizmo to the widget with the whatsit. But an index may have been generated by the processing system. (Admittedly, the ":index:" suffixes are an eyesore. The role could simply be ":x:", but that's not much better. If there's a better idea out there, I'm all ears.) "Directives" are meant for the arbitrary processing of their contents (the directive data & text block), which can be transformed into something possibly unrelated to the original text. Directives are a constructive extension mechanism. They're for arbitrary structures where there is no built-in syntax. Directives typically produce structural or block-level (body) elements, but they can also affect processing in other ways (not yet explored). Example:: .. figure:: picture.png This is the caption of the figure (a simple paragraph). +-----------------------+-----------------------+ | Symbol | Meaning | +=======================+=======================+ | .. image:: tent.png | Campground | +-----------------------+-----------------------+ | .. image:: waves.png | Lake | +-----------------------+-----------------------+ | .. image:: peak.png | Mountain | +-----------------------+-----------------------+ "Substitutions" are a text-level (inline) construct, a way of getting a directive into the middle of some text. The substitution reference encloses a name, which must match with the name on the substitution itself. The substitution itself is a named, indirect directive, which must produce an inline-compatible object (text or inline elements):: The `/biohazard/` symbol must be used on containers used to dispose of medical waste. .. /biohazard/ image:: biohazard.png Note that I've dropped the "replacement text" aspect of substitutions. 
With this change, "substitution references" could be renamed "indirect inline directive references" (or "inline directive references" or "directive references" for short). "Substitutions" could be renamed "indirect directives" or "indirect directive definitions". Again, I'm open to any better ideas for syntax out there. Lay 'em on me! -- David Goodger goodger@users.sourceforge.net Open-source projects: - Python Docstring Processing System: http://docstring.sourceforge.net - reStructuredText: http://structuredtext.sourceforge.net - The Go Tools Project: http://gotools.sourceforge.net From jaffray@pobox.com Sat Nov 10 22:08:20 2001 From: jaffray@pobox.com (Alan Jaffray) Date: Sat, 10 Nov 2001 17:08:20 -0500 (EST) Subject: [Doc-SIG] Block quotes In-Reply-To: <714DFA46B9BBD0119CD000805FC1F53B01B5B0AC@UKRUX002.rundc.uk.origin-it.com> Message-ID: On a completely different note... On Tue, 6 Nov 2001, Moore, Paul wrote: > Me too. One of the crucial things about reST to my mind is that it should > be possible to (quickly) get enough of a grip on the rules to use them > "naturally" in normal text. A key example of this is the usage of reST in > E-Mail in this group. It isn't valid reST (normal E-Mail quoting constructs > simply don't work), but it's *very* readable, and adds useful structure to > "raw" text. Having a usable natural markup for email seems to be a common goal; several of us have mentioned it. But the email quoting construct means rST doesn't actually work. Can we fix this? What if any section of text 1) where every line starts with the same prefix, either ">" or "> " 2) with blank lines after it 3) with either blank lines, or a line ending with a colon, before it were treated as a block quote? This "does the right thing" with most common email quoting, and I'm having trouble thinking of an example where it would result in ambiguity. Nor does it seem complicated. 
And with minimal effort (being sure to either add the colon or the blank line), our messages would actually be valid rST. For example, the top of this message could be treated identically to:: On Tue, 6 Nov 2001, Moore, Paul wrote: Me too. One of the crucial things [...] Having a usable natural [...] I don't see the need for a new DTD element, but it might be worth adding a type attribute to block_quote to indicate what sort of quote it is, in case the desired output styling is different. Thoughts? Alan From goodger@users.sourceforge.net Sun Nov 11 23:29:35 2001 From: goodger@users.sourceforge.net (David Goodger) Date: Sun, 11 Nov 2001 18:29:35 -0500 Subject: [Doc-SIG] Block quotes In-Reply-To: Message-ID: Alan Jaffray wrote: > What if any section of text > > 1) where every line starts with the same prefix, either ">" or "> " > 2) with blank lines after it > 3) with either blank lines, or a line ending with a colon, before it > > were treated as a block quote? I think an "email Reader" is appropriate, to establish context. Your rules handle one construct found in email messages. But there's another one, just as important: RFC822 headers. They are ambiguous without a known email context. Also, there are signatures, usually (but not always) preceded by "-- " on a line by itself. I can't point to examples, since none exist yet. Once there's an official Reader or two in the project, it will be a lot easier to make new ones. A "standalone document Reader" is on my to-do list, but I haven't gotten around to it yet. Go ahead & write some code if you're keen. > This "does the right thing" with most common email quoting, and I'm having > trouble thinking of an example where it would result in ambiguity. Doctest blocks? :: >>> print ">>> hello" >>> hello Contrived, yes, but possible. 
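For anyone who wants to experiment, here is a minimal sketch of the three-rule test quoted above, applied to plain text outside any Reader. The function name ``find_block_quotes`` is hypothetical, and treating the start and end of the text as equivalent to a blank line is an assumption the rules do not spell out. It is not part of any existing parser.

```python
def find_block_quotes(text):
    """Return (start, end) line-index pairs (end exclusive) for runs of
    lines that satisfy the three proposed email-quote rules."""
    lines = text.splitlines()
    quotes = []
    i = 0
    while i < len(lines):
        if not lines[i].startswith(">"):
            i += 1
            continue
        # Rule 1: a run of lines that all start with ">" (this covers "> " too).
        start = i
        while i < len(lines) and lines[i].startswith(">"):
            i += 1
        # Rule 2: a blank line after the run (assumption: end of text counts).
        followed_ok = i == len(lines) or not lines[i].strip()
        # Rule 3: a blank line, or a line ending with a colon, before the run
        # (assumption: start of text counts as blank).
        prev = lines[start - 1] if start > 0 else ""
        preceded_ok = not prev.strip() or prev.rstrip().endswith(":")
        if followed_ok and preceded_ok:
            quotes.append((start, i))
    return quotes


email = "On Tue, 6 Nov 2001, Moore, Paul wrote:\n> Me too.\n> One of the crucial things\n\nHaving a usable natural markup\n"
doctest_text = 'Doctest blocks? ::\n\n>>> print ">>> hello"\n>>> hello\n\nContrived, yes.\n'

email_quotes = find_block_quotes(email)
doctest_quotes = find_block_quotes(doctest_text)
```

Run on the two samples, the sketch accepts the ordinary email quote, and it also accepts the doctest block, since every doctest line starts with ">". That is the ambiguity in question: without known context, the rules alone cannot tell the two constructs apart.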
-- David Goodger goodger@users.sourceforge.net Open-source projects: - Python Docstring Processing System: http://docstring.sourceforge.net - reStructuredText: http://structuredtext.sourceforge.net - The Go Tools Project: http://gotools.sourceforge.net From tony@lsl.co.uk Mon Nov 12 10:14:32 2001 From: tony@lsl.co.uk (Tony J Ibbs (Tibs)) Date: Mon, 12 Nov 2001 10:14:32 -0000 Subject: [Doc-SIG] Status of transformers? In-Reply-To: Message-ID: <015801c16b62$d2893920$545aa8c0@lslp862.int.lsl.co.uk> It looks like I may have (at least part of) Friday off (waiting for a new washing machine) and available to coding pydps/pysource. That should mean getting "understanding" of all the DPS tree into the HTML outputter, and writing a draft of why we need a tag, and don't need all of the "special" Python tags that David currently defines (I shan't try to explain now!). However, what is the current status on other transformers (the obvious special cases being the "find me a title" and "explicitly index my automatic footnotes") - have they been coded? Tibs -- Tony J Ibbs (Tibs) http://www.tibsnjoan.co.uk/ "No one trike will do everything... buy the whole set!" - Rob Hague My views! Mine! Mine! (Unless Laser-Scan ask nicely to borrow them.) From tony@lsl.co.uk Mon Nov 12 10:15:15 2001 From: tony@lsl.co.uk (Tony J Ibbs (Tibs)) Date: Mon, 12 Nov 2001 10:15:15 -0000 Subject: [Doc-SIG] Clarification: interpreted text vs. directives vs. substitutions In-Reply-To: Message-ID: <015901c16b62$ec21a700$545aa8c0@lslp862.int.lsl.co.uk> ... whole message deleted - see below! ... Thanks - this was a good summary, both of where we are now and why we are there. I'll even get to like inline substitutions soon (hah - repetition as tuition [1]_). As to the other two threads (the ones David split off to prevent imminent head explosion) - thanks, I'll side with David there. Tibs .. [1] Anyone else here remember/know of the "teapot" joke on I'm Sorry I'll Read That Again [2]_? .. 
[2] Old British comedy series, from when I was a child - summary of joke: if any phrase (*any* phrase) is repeated often enough, it will become funny [3]_ (I believe the normal statistic is about the third time round). The instance in this case being the interjection of the word "teapot" at random intervals, finally culminating in an audience rendered helpless by constant repetition of the word. .. [3] Something that any 5 or 6 year old is *quite* willing to believe! (and their parents less so). -- Tony J Ibbs (Tibs) http://www.tibsnjoan.co.uk/ Give a pedant an inch and they'll take 25.4mm (once they've established you're talking a post-1959 inch, of course) My views! Mine! Mine! (Unless Laser-Scan ask nicely to borrow them.) From goodger@users.sourceforge.net Mon Nov 12 23:27:48 2001 From: goodger@users.sourceforge.net (David Goodger) Date: Mon, 12 Nov 2001 18:27:48 -0500 Subject: [Doc-SIG] Re: Status of transformers? In-Reply-To: <015801c16b62$d2893920$545aa8c0@lslp862.int.lsl.co.uk> Message-ID: Tony J Ibbs (Tibs) wrote: > It looks like I may have (at least part of) Friday off (waiting for a > new washing machine) and available to coding pydps/pysource. That should > mean getting "understanding" of all the DPS tree into the HTML > outputter, and writing a draft of why we need a tag, and don't > need all of the "special" Python tags that David currently defines (I > shan't try to explain now!). I'm aquiver with anticipation! > However, what is the current status on other transformers (the obvious > special cases being the "find me a title" and "explicitly index my > automatic footnotes") - have they been coded? Short answer: no. Long answer: I ripped the original code out of the parser some time ago. Ueli Schlaepfer sent me a reworked version, but I need to do some work on it first and I haven't gotten to it yet. I may be able to put something together this week, but can't promise. 
-- David Goodger goodger@users.sourceforge.net Open-source projects: - Python Docstring Processing System: http://docstring.sourceforge.net - reStructuredText: http://structuredtext.sourceforge.net - The Go Tools Project: http://gotools.sourceforge.net From jaffray@pobox.com Tue Nov 13 01:16:44 2001 From: jaffray@pobox.com (Alan Jaffray) Date: Mon, 12 Nov 2001 20:16:44 -0500 (EST) Subject: [Doc-SIG] Block quotes In-Reply-To: Message-ID: On Sun, 11 Nov 2001, David Goodger wrote: > Alan Jaffray wrote: > > What if any section of text > > > > 1) where every line starts with the same prefix, either ">" or "> " > > 2) with blank lines after it > > 3) with either blank lines, or a line ending with a colon, before it > > > > were treated as a block quote? > > I think an "email Reader" is appropriate, to establish context. This worries me a bit. To what extent are we going to be dependent on known context? I'd like a user to be able to copy text out of email, stick it on a web site or in a docstring or on a Wiki, and have the software do the right thing knowing only that it's reStructuredText, not that it's rST-for-email as opposed to rST-for-python or rST-for-wikis or whatever. > Your rules handle one construct found in email messages. But there's > another one, just as important: RFC822 headers. Let me clarify - having a usable natural markup for email *bodies* would seem to be a common goal. I don't need a markup for email headers. RFC822 is a format in itself, and plenty of tools exist for it already, so it seems unnecessary for rST to concern itself with it. (I tend to think of the "document" of an email message as being the body, while the headers are more akin to metadata about that document and its delivery through the mail system; the document is what needs markup.) > Also, there are signatures, usually (but not always) preceded by > "-- " on a line by itself. It's impossible to handle signatures anywhere close to reliably. 
Even if you follow that informal standard, which many (like me) don't, the trailing whitespace will often be munged in transit. Fortunately, it's not very important to do so; it doesn't affect the usability of the text. The same is not true of email quotes. Making this message into valid rST would be a major pain. > > This "does the right thing" with most common email quoting, and I'm having > > trouble thinking of an example where it would result in ambiguity. > > Doctest blocks? :: > > >>> print ">>> hello" > >>> hello > > Contrived, yes, but possible. Ah! I knew I was forgetting something. This actually brings up something else I was thinking about, and I swear I'm not just bringing it up because it affects the email quote thing. :-) Do doctest blocks really belong in core rST? As a construct which is only used in one application, they seem like a prime candidate for an application-specific directive:: .. doctest:: >>> print ">>> hello" >>> hello For anyone who intends to use reStructuredText for purposes other than documenting Python code, doctest blocks are just language clutter, even aside from the ambiguity issues. (It'd be lovely for Java programmers if rST could recognize javadoc comments, for instance, but surely we'd never consider putting that in the core spec...) Alan From jaffray@pobox.com Tue Nov 13 01:36:55 2001 From: jaffray@pobox.com (Alan Jaffray) Date: Mon, 12 Nov 2001 20:36:55 -0500 (EST) Subject: [Doc-SIG] Clarification: interpreted text vs. directives vs. substitutions In-Reply-To: Message-ID: I think my (now defunct) revised proposal and what you've done with substitutions are pretty similar. :-) 1) Does the "indirect directive" have access to the substitution reference text? Can it take a block argument? How exactly does this work? 2) I dislike the slashes. To me they mean either "italics" or "path" or "regex". The latter two are also sources of ambiguity. I can't think of a *good* syntax, but I think `` `|text here|` `` and ``..
|text here| directive:: args`` would be better. 3) "Directive references" and "directive targets" were the terms I used in my revised proposal. We're certainly way past "substitution" now. 4) I dislike the double-colon after the directive name - we already know that it's a directive name, it's part of a directive target. I'd remove it or make it optional. However, I expect everyone else will disagree. :-) Cheers, Alan On Fri, 9 Nov 2001, David Goodger wrote: > "Substitutions" are a text-level (inline) construct, a way of getting > a directive into the middle of some text. The substitution reference > encloses a name, which must match with the name on the substitution > itself. The substitution itself is a named, indirect directive, which > must produce an inline-compatible object (text or inline elements):: > > The `/biohazard/` symbol must be used on containers used to > dispose of medical waste. > > .. /biohazard/ image:: biohazard.png > > Note that I've dropped the "replacement text" aspect of substitutions. > With this change, "substitution references" could be renamed "indirect > inline directive references" (or "inline directive references" or > "directive references" for short). "Substitutions" could be renamed > "indirect directives" or "indirect directive definitions". > > Again, I'm open to any better ideas for syntax out there. Lay 'em on me! From goodger@users.sourceforge.net Tue Nov 13 03:38:48 2001 From: goodger@users.sourceforge.net (David Goodger) Date: Mon, 12 Nov 2001 22:38:48 -0500 Subject: [Doc-SIG] Block quotes In-Reply-To: Message-ID: > On Sun, 11 Nov 2001, David Goodger wrote: >> I think an "email Reader" is appropriate, to establish context. Alan Jaffray wrote: > This worries me a bit. To what extent are we going to be dependent on > known context? Depending on the ambiguity otherwise, possibly considerably. 
> I'd like a user to be able to copy text out of email, stick it on a web > site or in a docstring or on a Wiki, and have the software do the right > thing knowing only that it's reStructuredText, not that it's rST-for-email > as opposed to rST-for-python or rST-for-wikis or whatever. Reasonable, to certain extent. Eventually, ambiguity rears its ugly head. >> Your rules handle one construct found in email messages. But there's >> another one, just as important: RFC822 headers. > > Let me clarify - having a usable natural markup for email *bodies* would > seem to be a common goal. I don't need a markup for email headers. > RFC822 is a format in itself, and plenty of tools exist for it already, > so it seems unnecessary for rST to concern itself with it. That depends on how much of the email message you copy, doesn't it? If the copying includes the "To:", "From:", "Date:", and "Subject:" headers as well, we'd want those converted as well (to field lists). That absolutely requires context. >> Also, there are signatures, usually (but not always) preceded by >> "-- " on a line by itself. > > It's impossible to handle signatures anywhere close to reliably. True. But an email Reader could easily match a "^-- ?$" pattern within 10 lines of the end of the text, requiring the author or user to ensure that "--" is placed there properly. Useful. >>> This "does the right thing" with most common email quoting, and I'm having >>> trouble thinking of an example where it would result in ambiguity. >> Doctest blocks? :: > Ah! I knew I was forgetting something. Look at the first of the three quote blocks above. Looks just like a doctest block, don't it? > This actually brings up something else I was thinking about, and I swear > I'm not just bringing it up because it affects the email quote thing. :-) > > Do doctest blocks really belong in core rST? 
As a construct which is > only used in one application, they seem like a prime candidate for an > application-specific directive:: Yes, they belong in core rST, but perhaps turned off by default, since email quoting would probably have a bigger audience. However, I would imagine that some email messages would have both constructs. Tricky. No, not as an application-specific directive, for the simple reason that the "Python source docstrings" and "Writings about Python" applications are the most important to reStructuredText at present. But it might be a directive in another context, such as the "Email Reader". > (It'd be lovely for Java programmers if rST could recognize javadoc comments, > for instance, but surely we'd never consider putting that in the core spec...) I get your meaning, but that's not a valid comparison. The text has to be *extracted* from the comments before it can be processed, which is clearly the responsibility of the Java Source Reader. -- David Goodger goodger@users.sourceforge.net Open-source projects: - Python Docstring Processing System: http://docstring.sourceforge.net - reStructuredText: http://structuredtext.sourceforge.net - The Go Tools Project: http://gotools.sourceforge.net From goodger@users.sourceforge.net Tue Nov 13 04:05:07 2001 From: goodger@users.sourceforge.net (David Goodger) Date: Mon, 12 Nov 2001 23:05:07 -0500 Subject: [Doc-SIG] Clarification: interpreted text vs. directives vs. substitutions In-Reply-To: Message-ID: Alan Jaffray wrote: > 1) Does the "indirect directive" have access to the substitution > reference text? I assume you mean the substitution name, the text between the "`/" and "/`"? Currently no, it doesn't. I don't see why it would need any such access. > Can it take a block argument? The directive part of the substitution can take a block argument; that's up to the directive:: Here's a contrived `/example/`. .. /example/ contrived-directive:: The directive does something with all this text.
Don't ask me what. 23 17 5 ... hut! > How exactly does this work? The substitution construct is recognized, which triggers the recognition of an embedded directive (it's an error if there is no directive). The directive's code is run, and the result becomes the contents of the substitution. A transform later replaces any matching substitution references with said contents. > 2) I dislike the slashes. To me they mean either "italics" or "path" > or "regex". The latter two are also sources of ambiguity. I can't > think of a *good* syntax, but I think `` `|text here|` `` and > ``.. |text here| directive:: args`` would be better. Decent alternative. I'll take it under advisement. (Now I've got judiciary delusions!) Another possibility is `[name]`. Each of `/name/` and `[name]` and `|name|` effectively limit (albeit only slightly) interpreted text though. Other (more radical) alternatives? > 3) "Directive references" and "directive targets" were the terms I used > in my revised proposal. We're certainly way past "substitution" now. Perhaps. I'd have to add "inline" though: "inline directive references" & "inline directive targets". "Targets" doesn't seem right though, perhaps "definitions" instead. I'll have a go through my thesaurus (an invaluable reference on my techref shelf), see if anything else rings true. Both the syntax and the name of substitutions (& references) may change. > 4) I dislike the double-colon after the directive name - we already > know that it's a directive name, it's part of a directive target. > I'd remove it or make it optional. They're a useful reminder of what's intended. Think of it as the "/subname/" being wedged into an existing directive. > However, I expect everyone else will disagree. :-) Bingo! 
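As a rough illustration of the mechanism described in this message (recognize the substitution, run its embedded directive, then let a transform replace matching references), here is a minimal sketch. It assumes the `/name/` syntax under discussion and handles single-line definitions only, so block arguments are out of scope. The names ``resolve_substitutions``, ``image_directive`` and ``DIRECTIVES`` are hypothetical and do not correspond to any real parser code, and the string returned by the stand-in directive is only a placeholder for a real inline element.

```python
import re

# ".. /name/ directive:: args" on a line of its own (single-line form only).
DEF_RE = re.compile(r"^\.\. /(?P<name>[^/\n]+)/ *(?P<rest>.*)$", re.MULTILINE)
DIRECTIVE_RE = re.compile(r"^(?P<directive>[\w-]+):: *(?P<args>.*)$")
# "`/name/`" inside running text.
REF_RE = re.compile(r"`/(?P<name>[^/`\n]+)/`")


def image_directive(args):
    # Hypothetical stand-in: a real directive would build an inline element.
    return "<image uri=%r>" % args.strip()


DIRECTIVES = {"image": image_directive}


def resolve_substitutions(text):
    definitions = {}

    def collect(match):
        parsed = DIRECTIVE_RE.match(match.group("rest"))
        if parsed is None:
            # It is an error if the substitution contains no directive.
            raise ValueError(
                "substitution %r has no embedded directive" % match.group("name"))
        run = DIRECTIVES[parsed.group("directive")]
        # The directive's result becomes the contents of the substitution.
        definitions[match.group("name")] = run(parsed.group("args"))
        return ""

    body = DEF_RE.sub(collect, text)
    # The "transform": replace each matching reference with the stored contents.
    return REF_RE.sub(lambda m: definitions[m.group("name")], body)


source = "The `/biohazard/` symbol.\n\n.. /biohazard/ image:: biohazard.png\n"
result = resolve_substitutions(source)
```

Note that the directive never sees the substitution name in this sketch, which matches the "currently no" answer above. An unknown directive or an unmatched reference simply raises ``KeyError`` here; a real implementation would presumably report a proper error instead.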
-- David Goodger goodger@users.sourceforge.net Open-source projects: - Python Docstring Processing System: http://docstring.sourceforge.net - reStructuredText: http://structuredtext.sourceforge.net - The Go Tools Project: http://gotools.sourceforge.net From jaffray@pobox.com Tue Nov 13 04:45:08 2001 From: jaffray@pobox.com (Alan Jaffray) Date: Mon, 12 Nov 2001 23:45:08 -0500 (EST) Subject: [Doc-SIG] Clarification: interpreted text vs. directives vs. substitutions In-Reply-To: Message-ID: On Mon, 12 Nov 2001, David Goodger wrote: > Alan Jaffray wrote: > > 1) Does the "indirect directive" have access to the substitution > > reference text? > > I assume you mean the substitution name, the text between the "`/" and "/`"? > Currently no, it doesn't. I don't see why it would need any such access. Well, going back to that inline image example:: The `/biohazard/` symbol is scary-looking. .. /biohazard/ image:: biohazard.png The intended HTML output is:: The biohazard symbol is scary-looking. That alt text needs to come from somewhere. > > 2) I dislike the slashes. To me they mean either "italics" or "path" > > or "regex". The latter two are also sources of ambiguity. I can't > > think of a *good* syntax, but I think `` `|text here|` `` and > > ``.. |text here| directive:: args`` would be better. > > Decent alternative. I'll take it under advisement. (Now I've got judiciary > delusions!) Delusions? That's no delusion. :) > Another possibility is `[name]`. Each of `/name/` and `[name]` and `|name|` > effectively limit (albeit only slightly) interpreted text though. Other > (more radical) alternatives? I'm surprised you'd be willing to go for `` `[name]` ``. That would be my preference, but I'm not sure what we'd do about the referent, since ``.. [name]`` is footnote syntax. > > [...] > > However, I expect everyone else will disagree. :-) > > Bingo! I'm getting used to it. 
:-) Alan From jaffray@pobox.com Tue Nov 13 06:10:17 2001 From: jaffray@pobox.com (Alan Jaffray) Date: Tue, 13 Nov 2001 01:10:17 -0500 (EST) Subject: [Doc-SIG] Readers In-Reply-To: Message-ID: On Sun, 11 Nov 2001, David Goodger wrote: > I can't point to examples, since none exist yet. Once there's an official > Reader or two in the project, it will be a lot easier to make new ones. A > "standalone document Reader" is on my to-do list, but I haven't gotten > around to it yet. Go ahead & write some code if you're keen. Could you explain in more detail what you mean by "Reader", or specifically a "standalone document Reader"? Thanks, Alan From tony@lsl.co.uk Tue Nov 13 10:24:24 2001 From: tony@lsl.co.uk (Tony J Ibbs (Tibs)) Date: Tue, 13 Nov 2001 10:24:24 -0000 Subject: [Doc-SIG] Clarification: interpreted text vs. directives vs.substitutions In-Reply-To: Message-ID: <017301c16c2d$5ddcbcc0$545aa8c0@lslp862.int.lsl.co.uk> David Goodger wrote: > Alan Jaffray wrote: > > 2) I dislike the slashes. To me they mean either "italics" > > or "path" or "regex". The latter two are also sources of > > ambiguity. I can't think of a *good* syntax, but I think > > `` `|text here|` `` and ``.. |text here| directive:: args`` > > would be better. > > Decent alternative. I'll take it under advisement. (Now I've > got judiciary delusions!) Hmm. To me, the slashes remind me of something like:: s/this/that/g (I've probably got that slightly wrong) in ed-like languages, and since that's the effect we want, I would vote for slashes over the other alternatives (I have an idea the *actual* character used there doesn't have to be a slash, but it's what just about everyone uses by default). > Another possibility is `[name]`. Ah, but that breaks the precedent that all inline escape sequences have matched start and end characters (humph, I never thought I'd be defending that policy as a principle!), so I'd argue against it on "least startlement" grounds. 
Tibs -- Tony J Ibbs (Tibs) http://www.tibsnjoan.co.uk/ Give a pedant an inch and they'll take 25.4mm (once they've established you're talking a post-1959 inch, of course) My views! Mine! Mine! (Unless Laser-Scan ask nicely to borrow them.) From goodger@users.sourceforge.net Wed Nov 14 02:59:36 2001 From: goodger@users.sourceforge.net (David Goodger) Date: Tue, 13 Nov 2001 21:59:36 -0500 Subject: [Doc-SIG] Clarification: interpreted text vs. directives vs.substitutions In-Reply-To: <017301c16c2d$5ddcbcc0$545aa8c0@lslp862.int.lsl.co.uk> Message-ID: [Tony] > Hmm. To me, the slashes remind me of something like:: > > s/this/that/g > > (I've probably got that slightly wrong) in ed-like languages Me too. (You got it right.) The only disadvantage with using this syntax is that interpreted text and phrase-hyperlink references can't start with an unescaped slash. That's a bit of a wart. [Alan] > > > 1) Does the "indirect directive" have access to the substitution > > > reference text? [David] > > I assume you mean the substitution name, the text between the "`/" > > and "/`"? Currently no, it doesn't. I don't see why it would need > > any such access. [Alan] > Well, going back to that inline image example:: > > The `/biohazard/` symbol is scary-looking. > > .. /biohazard/ image:: biohazard.png > > The intended HTML output is:: > > The biohazard symbol is > scary-looking. > > That alt text needs to come from somewhere. I remember shelving fleeting thoughts about the "alt" text; time to revisit them. OK, there are three possible sources for the alt text: 1. The substitution name. 2. The source URI or file name. 3. Explicit text in the "image" directive. I think (2) is the worst option, (3) is the best, and (1) is in-between. (1) and/or (2) could be fallbacks when (3) isn't given. (3) will require support from the "image" directive; I'll add it. (1) would also require changes, and would only be applicable to inline images (block-level images won't have a substitution name). 
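The preference order just described could be sketched as follows. This is only an illustration: ``choose_alt_text`` and its parameter names are hypothetical, and ranking the substitution name ahead of the URI as a fallback is an assumption drawn from the "best / in-between / worst" ordering above, not a settled design.

```python
def choose_alt_text(uri, explicit_alt=None, substitution_name=None):
    """Pick "alt" text for an image from the three candidate sources."""
    if explicit_alt:
        # (3) Explicit text in the "image" directive: the preferred source.
        return explicit_alt
    if substitution_name:
        # (1) The substitution name: only available for inline images.
        return substitution_name
    # (2) The source URI or file name: the last resort.
    return uri


inline_alt = choose_alt_text("biohazard.png", substitution_name="biohazard")
block_alt = choose_alt_text("picture.png")
```

A block-level image has no substitution name, so without explicit text it falls straight through to the URI.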
> > Another possibility is `[name]`. > > I'm surprised you'd be willing to go for `` `[name]` ``. That would > be my preference, but I'm not sure what we'd do about the referent, > since ``.. [name]`` is footnote syntax. True. That rules out that syntax. Didn't think it through. > > > However, I expect everyone else will disagree. :-) > > > > Bingo! > > I'm getting used to it. :-) The suggestions *are* useful. You've managed to get several ideas adopted, although not always with your suggested syntax. Thanks for helping to make reStructuredText more complete! -- David Goodger goodger@users.sourceforge.net Open-source projects: - Python Docstring Processing System: http://docstring.sourceforge.net - reStructuredText: http://structuredtext.sourceforge.net - The Go Tools Project: http://gotools.sourceforge.net From goodger@users.sourceforge.net Wed Nov 14 02:59:02 2001 From: goodger@users.sourceforge.net (David Goodger) Date: Tue, 13 Nov 2001 21:59:02 -0500 Subject: [Doc-SIG] Re: Readers In-Reply-To: Message-ID: Alan Jaffray wrote: > Could you explain in more detail what you mean by "Reader", > or specifically a "standalone document Reader"? See the "DPS Components" thread beginning 2001-09-18: http://mail.python.org/pipermail/doc-sig/2001-September/002221.html. I have yet to update the PEPs. (They're long overdue.) -- David Goodger goodger@users.sourceforge.net Open-source projects: - Python Docstring Processing System: http://docstring.sourceforge.net - reStructuredText: http://structuredtext.sourceforge.net - The Go Tools Project: http://gotools.sourceforge.net From jaffray@pobox.com Wed Nov 14 09:04:52 2001 From: jaffray@pobox.com (Alan Jaffray) Date: Wed, 14 Nov 2001 04:04:52 -0500 (EST) Subject: [Doc-SIG] Substitution syntax In-Reply-To: Message-ID: On Tue, 13 Nov 2001, David Goodger wrote: > [Tony] > > Hmm. To me, the slashes remind me of something like:: > > > > s/this/that/g > > > > (I've probably got that slightly wrong) in ed-like languages > > Me too. 
(You got it right.) My experience with ed-like languages is precisely what leads me to the *wrong* conclusion looking at that syntax. In ed (and sed and ex and vi and awk and perl), text between slashes is a regexp. So ed has /re/d (delete the next line matching the regexp re), and g/re/p (global-search for lines matching re, print each one), and s/re/foo (substitute foo for the first match of re on this line). Vi is similar. Awk and Perl both use /re/ throughout the language to denote regexps, and as a standalone expression meaning "match the current line against this regexp." So, seeing the syntax:: .. /foo/ name:: args I'd think it had something to do with applying a directive to things that match the regexp /foo/. It's similar to "do things when this matches" constructs in other languages. IMHO the intended association is more misleading than useful. > The only disadvantage with using this > syntax is that interpreted text and phrase-hyperlink references can't > start with an unescaped slash. That's a bit of a wart. Using pipes instead of slashes would be a less ugly wart, as they rarely start text. In contrast, absolute paths are quite common. Alan From jaffray@pobox.com Wed Nov 14 17:03:43 2001 From: jaffray@pobox.com (Alan Jaffray) Date: Wed, 14 Nov 2001 12:03:43 -0500 (EST) Subject: [Doc-SIG] Elucidative Programming Message-ID: I just found the `Elucidative Programming home page`__ in an entry on the `Lambda weblog`__. It's a 21st-century riff on Literate Programming a la Knuth; the reference implementation is in Scheme but the concepts could be applied to any language. Interesting stuff, take a look if you haven't seen it.
Alan __ http://www.cs.auc.dk/~normark/elucidative-programming/index.html __ http://lambda.weblogs.com/ From goodger@users.sourceforge.net Thu Nov 15 01:23:37 2001 From: goodger@users.sourceforge.net (David Goodger) Date: Wed, 14 Nov 2001 20:23:37 -0500 Subject: [Doc-SIG] Re: Substitution syntax In-Reply-To: Message-ID: [Alan, re the `/subname/` syntax] > IMHO the intended association is more misleading than useful. I understand your concern. That alone wouldn't necessarily sway the decision though. > Using pipes instead of slashes would be a less ugly wart, as they > rarely start text. In contrast, absolute paths are quite common. I'm beginning to be swayed. Vertical bars ('|') instead of slashes do solve the problem of confusion with absolute POSIX paths. They don't solve the problem of limiting interpreted text though (can't begin with a vertical bar). Thinking about removing this limitation brought me back to reference syntax alternative 4. I had three alternatives already:: (a) #name# (b) @name@ (c) /name/ To which I have added a fourth:: (d) |name| The resulting example looks like this:: The |biohazard| symbol... .. |biohazard| image:: biohazard.png [height=20 width=20] Simpler and doesn't conflict with interpreted text in any way. I can't think of any significant conflicts that shouldn't be in inline literals anyhow (such as shell command pipelines). The only example of a problem that I can come up with on short notice is contrived, and involves tables:: +--------------+----------+------------+------------+ | row 1, col 1 | column 2 | column 3 | column 4 | +--------------+----------+------------+------------+ | row 2 | This cell and the one below span | | | from column 2 to column 4. | +--------------+----------+------------+------------+ | row 3 | Here's a |substitution|; or is it?
| +--------------+----------+------------+------------+ | row 4 | | | | +--------------+----------+------------+------------+ The cell in row 3 spanning columns 2 through 4 contains ``|substitution|``, but because of the way the vertical bars line up with the cell corner markers ("+"), it would be recognized as three cells (pretend row 2 isn't there). There is no requirement in tables for left or right margins (the margins that are there are just my preference), and adding such a requirement would be onerous. The vertical bar syntax is not too obtrusive; is it obtrusive enough? Is it aesthetically pleasing? Is the table issue a showstopper? (Note that the same table issue would also exist for "`|" syntax.) -- David Goodger goodger@users.sourceforge.net Open-source projects: - Python Docstring Processing System: http://docstring.sourceforge.net - reStructuredText: http://structuredtext.sourceforge.net - The Go Tools Project: http://gotools.sourceforge.net From goodger@users.sourceforge.net Thu Nov 15 03:25:24 2001 From: goodger@users.sourceforge.net (David Goodger) Date: Wed, 14 Nov 2001 22:25:24 -0500 Subject: [Doc-SIG] image "alt" text In-Reply-To: Message-ID: [Alan] >> The intended HTML output is:: >> >> The biohazard symbol is >> scary-looking. >> >> That alt text needs to come from somewhere. [David] > I remember shelving fleeting thoughts about the "alt" text; time to > revisit them. OK, there are three possible sources for the alt text: > > 1. The substitution name. > 2. The source URI or file name. > 3. Explicit text in the "image" directive. > > I think (2) is the worst option, (3) is the best, and (1) is > in-between. (1) and/or (2) could be fallbacks when (3) isn't given. > (3) will require support from the "image" directive; I'll add it. (1) > would also require changes, and would only be applicable to inline > images (block-level images won't have a substitution name). I've implemented "alt" text for images:: .. 
image:: picture.png [alt="this is the alt text"] I've decided against any implicit fallbacks; it's explicit or nothing. Also, I've extracted the attribute parsing code into dps.utils, and extended it to handle more general cases. Now the same attribute syntax can be used by other directives. Docs, code, & test cases updated, available by CVS or snapshot. -- David Goodger goodger@users.sourceforge.net Open-source projects: - Python Docstring Processing System: http://docstring.sourceforge.net - reStructuredText: http://structuredtext.sourceforge.net - The Go Tools Project: http://gotools.sourceforge.net From tony@lsl.co.uk Thu Nov 15 12:37:48 2001 From: tony@lsl.co.uk (Tony J Ibbs (Tibs)) Date: Thu, 15 Nov 2001 12:37:48 -0000 Subject: [Doc-SIG] Re: Substitution syntax In-Reply-To: Message-ID: <019801c16dd2$55677d90$545aa8c0@lslp862.int.lsl.co.uk> I can't say I'm fussed whether we use "/" or "|" as the substitution character - the "|" may be slightly more visually pleasing, perhaps. As to David's showstopper:: > +--------------+----------+------------+------------+ > | row 1, col 1 | column 2 | column 3 | column 4 | > +--------------+----------+------------+------------+ > | row 2 | This cell and the one below span | > | | from column 2 to column 4. | > +--------------+----------+------------+------------+ > | row 3 | Here's a |substitution|; or is it? | > +--------------+----------+------------+------------+ > | row 4 | | | | > +--------------+----------+------------+------------+ It *is* a showstopper, but only if you decide it is so (i.e., it depends how much you actually care about the problem, as Designer in Chief). Otherwise, it's just an awkwardness that needs documenting carefully. The "obvious" workaround is, of course:: +--------------+----------+------------+------------+ | | | | row 3 | Here's a |substitution|; or is it? 
| | | | +--------------+----------+------------+------------+ or perhaps the (maybe simpler):: +--------------+----------+------------+-------------+ | row 3 | Here's a |substitution|; or is it? | +--------------+----------+------------+-------------+ (I assume either will work?). Regardless, careful documentation of this edge case (and how to get around it) *should* be enough for most people. So, that's a vote either way, then... Tibs -- Tony J Ibbs (Tibs) http://www.tibsnjoan.co.uk/ Well we're safe now....thank God we're in a bowling alley. - Big Bob (J.T. Walsh) in "Pleasantville" My views! Mine! Mine! (Unless Laser-Scan ask nicely to borrow them.) From fdrake@acm.org Thu Nov 15 17:35:24 2001 From: fdrake@acm.org (Fred L. Drake) Date: Thu, 15 Nov 2001 12:35:24 -0500 (EST) Subject: [Doc-SIG] [development doc updates] Message-ID: <20011115173524.741DD28697@cj42289-a.reston1.va.home.com> The development version of the documentation has been updated: http://python.sourceforge.net/devel-docs/ Added a chapter on Tkinter & friends, contributed by Mike Clarkson. There are a few additional updates as well. From garth@deadlybloodyserious.com Thu Nov 15 18:30:00 2001 From: garth@deadlybloodyserious.com (Garth T Kidd) Date: Fri, 16 Nov 2001 05:30:00 +1100 Subject: [Doc-SIG] Re: Substitution syntax In-Reply-To: Message-ID: > The resulting example looks like this:: > > The |biohazard| symbol... > > .. |biohazard| image:: biohazard.png > [height=20 width=20] +1, now it's clean enough to match the rest of the spec. :) Regards, Garth. 
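A minimal sketch of how the ``|name|`` proposal quoted above might be resolved. Everything here is an assumption made for illustration: the ``resolve_substitutions`` function, the regular expression, and the plain dictionary of definitions are hypothetical, not part of the DPS or reStructuredText code. Real substitutions would be handled by a transform on the doc tree, not on raw text.

```python
import re

# Hypothetical pattern: any run of text between two vertical bars on one line.
SUBSTITUTION_REF = re.compile(r'\|([^|\n]+)\|')


def resolve_substitutions(text, definitions):
    """Replace each |name| with definitions[name]; leave unknown names alone."""
    def replace(match):
        name = match.group(1).strip()
        # Fall back to the original text when no definition exists.
        return definitions.get(name, match.group(0))
    return SUBSTITUTION_REF.sub(replace, text)


# Stand-in for what ".. |biohazard| image:: biohazard.png" would generate.
definitions = {'biohazard': '[image: biohazard.png]'}
print(resolve_substitutions('The |biohazard| symbol...', definitions))
```

Leaving unknown names untouched is a design choice of this sketch only; a parser could just as well report them as errors.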
From goodger@users.sourceforge.net Thu Nov 15 23:35:28 2001 From: goodger@users.sourceforge.net (David Goodger) Date: Thu, 15 Nov 2001 18:35:28 -0500 Subject: [Doc-SIG] Re: Substitution syntax In-Reply-To: Message-ID: After I added a fourth syntax option ("|name|", option (d) below) for substitution reference syntax alternative 4 (specialized), I reexamined the keyboard for any and all other possible characters:: (a) #name# (b) @name@ (c) /name/ (d) |name| (e) <> (f) //name// (g) ||name|| (h) ^name^ (i) [[name]] (j) ~name~ (k) !name! (l) =name= (m) ?name? (n) >name< Looking back at http://structuredtext.sf.net/spec/problems.txt, I recalled that I'd done the same type of search for the syntax of inline literals. The "runner up" was carets (^), which look pretty good for substitution syntax (option (h) above). I'd say they're at least equal aesthetically, and suggestive of "insertion" (also suggestive of a name for this thing...). Carets have no syntax conflicts with tables or anything else, a bonus. The resulting example looks like this:: The ^biohazard^ symbol... .. ^biohazard^ image:: biohazard.png [height=20 width=20] For comparison, here's the vertical bar syntax:: The |biohazard| symbol... .. |biohazard| image:: biohazard.png [height=20 width=20] Hmm. I like the carets better. Any objections or dissenting opinions? -- David Goodger goodger@users.sourceforge.net Open-source projects: - Python Docstring Processing System: http://docstring.sourceforge.net - reStructuredText: http://structuredtext.sourceforge.net - The Go Tools Project: http://gotools.sourceforge.net From jaffray@pobox.com Fri Nov 16 00:38:56 2001 From: jaffray@pobox.com (Alan Jaffray) Date: Thu, 15 Nov 2001 19:38:56 -0500 (EST) Subject: [Doc-SIG] Re: Substitution syntax In-Reply-To: Message-ID: I'd considered carets as well. They seem uglier and more unnatural to me than bars, but I'd be hard-pressed to explain why.
Maybe I perceive them as binary operators, and thus surrounding text with them feels wrong, whereas bars feel more like generic separators or delimiters? Maybe the greater width makes them feel more intrusive? (You know, examining my personal feelings, history, and relationships with punctuation marks, on a public mailing list, feels very silly.) Anyway, register one vote for bars over carets. :) Alan On Thu, 15 Nov 2001, David Goodger wrote: > > The resulting example looks like this:: > > The ^biohazard^ symbol... > > .. ^biohazard^ image:: biohazard.png > [height=20 width=20] > > For comparison, here's the vertical bar syntax:: > > The |biohazard| symbol... > > .. |biohazard| image:: biohazard.png > [height=20 width=20] > > Hmm. I like the carets better. Any objections or dissenting opinions? From jaffray@pobox.com Fri Nov 16 03:40:14 2001 From: jaffray@pobox.com (Alan Jaffray) Date: Thu, 15 Nov 2001 22:40:14 -0500 (EST) Subject: [Doc-SIG] Use cases for inline directive references Message-ID: I'm playing fast and loose with the hypothetical directive names and syntax; this is just intended to get the ideas across. Objects ------- Inline directive references may be used to associate ambiguous text with a unique object identifier. For example, many sites may wish to implement an inline user directive:: |Michael| and |Jon| are our widget-wranglers. .. |Michael| user:: mjones .. |Jon| user:: jhl Depending on the needs of the site, this may be used to index the document for later searching, to hyperlink the inline text in various ways (mailto, homepage, mouseover Javascript with profile and contact information, etc), or to customize presentation of the text (include username in the inline text, include an icon image with a link next to the text, make the text bold or a different color, etc). The same approach can be used in documents which frequently refer to a particular type of objects with unique identifiers but ambiguous common names. 
Movies, albums, books, photos, court cases, and laws are a few that come to mind for me. :: |The Transparent Society| offers a fascinating alternate view on privacy issues. .. |The Transparent Society| book:: isbn=0738201448 Classes or functions, in contexts where the module or class names are unclear, are another possibility:: 4XSLT has the convenience method |runString|, so you don't have to mess with DOM objects if all you want is the transformed output. .. |runString| function:: module=xml.xslt class=Processor Images ------ Images are a common use for inline directive references:: West led the |H| 3, covered by dummy's |H| Q, East's |H| K, and trumped in hand with the |S| 2. .. |H| image:: src=/images/heart.png height=11 width=11 .. |S| image:: src=/images/spade.png height=11 width=11 * |Red light| means stop. * |Green light| means go. * |Yellow light| means go really fast. .. |Red light| image:: src=red_light.png .. |Green light| image:: src=green_light.png .. |Yellow light| image:: src=yellow_light.png | -><- | is the official symbol of POEE_. .. | -><- | image:: discord.png .. _POEE: http://www.poee.org/ Styles ------ Inline directive references may be used to associate inline text with an externally defined presentation style:: Even |the text in Texas| is big. .. |the text in Texas| style:: big The style name may be meaningful in the context of some particular output format (CSS class name for HTML output, LaTeX style name for LaTeX, etc), or may be ignored for other output formats (often for plain text). Interpreted text is unsuitable for this purpose because the set of style names cannot be predefined - it is the domain of the content author, not the author of the parser and output formatter - and there is no way to associate a stylename argument with an interpreted text style role. Also, it's desirable to use the same mechanism for styling blocks:: .. style:: motto At Bob's Underwear Shop, we'll do anything to get in your pants. .. 
style:: disclaimer All rights reversed. Reprint what you like. Templates --------- Inline markup may be used for later processing by a template engine. For example, a Zope author might write :: Welcome back, |name|! .. |name| tal:: replace user/getUserName and get the ZPT output :: Welcome back, name! which Zope would then transform to something like :: Welcome back, David! during a session with an actual user. Substitution ------------ Inline directive references may be used for simple macro substitution. This may be appropriate when the replacement text is repeated many times throughout one or more documents, especially if it may need to change later. A short example is unavoidably contrived:: |RST| is a little annoying to type over and over, especially when writing about |RST| itself, and spelling out the bicapitalized word |RST| every time isn't really necessary for |RST| source readability. .. |RST| replace:: reStructuredText_ .. _reStructuredText: http://structuredtext.sourceforge.net/ Substitution is also appropriate when the replacement text cannot be represented using other inline constructs, or is obtrusively long:: But still, that's nothing compared to a name like |j2ee-cas|. .. |j2ee-cas| replace:: .. link:: http://developer.java.sun.com/developer/earlyAccess/j2eecas/ the Java`TM`:super: 2 Platform, Enterprise Edition Client Access Services From jaffray@pobox.com Fri Nov 16 04:12:22 2001 From: jaffray@pobox.com (Alan Jaffray) Date: Thu, 15 Nov 2001 23:12:22 -0500 (EST) Subject: [Doc-SIG] Alt text In-Reply-To: Message-ID: On Tue, 13 Nov 2001, David Goodger wrote: > I remember shelving fleeting thoughts about the "alt" text; time to > revisit them. OK, there are three possible sources for the alt text: > > 1. The substitution name. > 2. The source URI or file name. > 3. Explicit text in the "image" directive. > > I think (2) is the worst option, (3) is the best, and (1) is > in-between. (1) and/or (2) could be fallbacks when (3) isn't given. 
It'd be nice to be able to explicitly specify alt text, but really, I can't think of a situation where I would want anything but the substitution name as the alt text. In HTML4, alt is a required attribute for img, and is intended to contain a text equivalent of the inline graphic, ensuring readability when the output format or user agent does not support graphics. We want the source document to be readable, and display of the source document does not support graphics... so why not put the text inline? (Yes, I know not everyone cares about HTML4; but since source readability is a goal of reStructuredText, having usable nongraphical representations of the document isn't an output-specific concern.) Alan From goodger@users.sourceforge.net Fri Nov 16 04:17:45 2001 From: goodger@users.sourceforge.net (David Goodger) Date: Thu, 15 Nov 2001 23:17:45 -0500 Subject: [Doc-SIG] Use cases for inline directive references In-Reply-To: Message-ID: Great writeup, Alan. May I use it in the spec? A few details need work, but overall you state the case very well. As originally specified & implemented, substitution names were simple reference names (i.e., a single alphanumeric word with internal ".-_" chars but no whitespace). But I've been thinking (and your examples have confirmed) that single words are too limited, that arbitrary text should be allowed. I'll make the mods. I'm leaning towards carets for syntax though, sorry. 
:-) -- David Goodger goodger@users.sourceforge.net Open-source projects: - Python Docstring Processing System: http://docstring.sourceforge.net - reStructuredText: http://structuredtext.sourceforge.net - The Go Tools Project: http://gotools.sourceforge.net From goodger@users.sourceforge.net Fri Nov 16 04:29:57 2001 From: goodger@users.sourceforge.net (David Goodger) Date: Thu, 15 Nov 2001 23:29:57 -0500 Subject: [Doc-SIG] Re: Alt text In-Reply-To: Message-ID: Alan Jaffray wrote: > It'd be nice to be able to explicitly specify alt text, but really, > I can't think of a situation where I would want anything but the > substitution name as the alt text. > > In HTML4, alt is a required attribute for img So it is (... having checked my HTML reference). Having declared that we'll allow arbitrary text as substitution names, I'm willing to add them as a fallback if the "alt" attribute isn't explicitly given. It will require a change to the directive API; no big deal. -- David Goodger goodger@users.sourceforge.net Open-source projects: - Python Docstring Processing System: http://docstring.sourceforge.net - reStructuredText: http://structuredtext.sourceforge.net - The Go Tools Project: http://gotools.sourceforge.net From Paul.Moore@atosorigin.com Fri Nov 16 09:09:39 2001 From: Paul.Moore@atosorigin.com (Moore, Paul) Date: Fri, 16 Nov 2001 09:09:39 -0000 Subject: [Doc-SIG] Use cases for inline directive references Message-ID: <714DFA46B9BBD0119CD000805FC1F53B01B5B0CC@UKRUX002.rundc.uk.origin-it.com> From: Alan Jaffray [mailto:jaffray@pobox.com] > I'm playing fast and loose with the hypothetical directive names > and syntax; this is just intended to get the ideas across. Thank you for putting the effort in to produce these! You have me convinced - I see the need, and I understand how the syntax is going to achieve it, now. (BTW, I think I prefer |...| over ^...^, as well.) 
I'm happy with most of this - the only one I dislike is the last one (your "replace" directive), which offers scope for the sort of abuse I pointed out earlier, which prompted David to remove the "replacement text" part of the construct. But you can't avoid allowing this, as processors can implement any directives they like, I guess - so it's just something that *can* be abused, but *shouldn't* be. But your "j2ee-cas" example confuses me - a "replace" directive with an embedded "link" directive plus some text? I can't parse that so it works. Which means that it isn't readable (to me, at least) in "raw" form, so there's some form of issue here (if only that directive implementers should exercise restraint in what they allow...) I'm +1 on most of this, and even +0 on the borderline cases (I favour consistency and generality over special-casing them out of existence)... Paul. From jaffray@pobox.com Fri Nov 16 23:15:31 2001 From: jaffray@pobox.com (Alan Jaffray) Date: Fri, 16 Nov 2001 18:15:31 -0500 (EST) Subject: [Doc-SIG] Use cases for inline directive references In-Reply-To: <714DFA46B9BBD0119CD000805FC1F53B01B5B0CC@UKRUX002.rundc.uk.origin-it.com> Message-ID: On Fri, 16 Nov 2001, Moore, Paul wrote: > I'm happy with most of this - the only one I dislike is the last one (your > "replace" directive), which offers scope for the sort of abuse I pointed out > earlier, which prompted David to remove the "replacement text" part of the > construct. But you can't avoid allowing this, as processors can implement > any directives they like, I guess - so it's just something that *can* be > abused, but *shouldn't* be. Yeah. I wasn't especially happy with my first example - but it's difficult to come up with a *short* example using "boilerplate text" in any reasonable way. > But your "j2ee-cas" example confuses me - a "replace" directive with an > embedded "link" directive plus some text? The block argument to "replace" can be any valid RST. 
In this case, it's a single directive, a hypothetical "link" directive, which takes a URL argument (the link target) and a block argument (the link text). That block argument can then be any valid RST; in this case it included a hypothetical "super" interpreted text role to make "TM" a superscript. > I can't parse that so it works. > Which means that it isn't readable (to me, at least) in "raw" form, so > there's some form of issue here (if only that directive implementers should > exercise restraint in what they allow...) Well, if you need to write nested inline markup, you gotta do it somehow; and if it's complicated markup, it's gonna look complicated. It'd be more readable to do it entirely with nested inline directive references, but those have been vetoed. So instead we do the nesting in a block, where there are no issues with syntactic ambiguity, and substitute it into the inline text. You certainly don't want to do this often, I agree. But having it as a fallback is useful, and it's consistent with the rest of the language. Alan From jaffray@pobox.com Sat Nov 17 00:51:49 2001 From: jaffray@pobox.com (Alan Jaffray) Date: Fri, 16 Nov 2001 19:51:49 -0500 (EST) Subject: [Doc-SIG] scope of the parser Message-ID: Since the rules for which references are associated with which targets are defined by reStructuredText, shouldn't the parser explicitly state which target is to be used for each reference? Ditto for footnotes, inline directive references, etc. It seems misplaced for an output formatter to be dealing with details of autonumbering, ambiguous refnames, etc. Meanwhile, since the meaning of directives and the set of meaningful directive names is *not* defined by reStructuredText, shouldn't the rST parser output most directives (including unknown ones) untouched? It seems advantageous to be able to decouple directive processing and implementation from reStructuredText parsing and implementation.
Alan From goodger@users.sourceforge.net Sat Nov 17 05:52:02 2001 From: goodger@users.sourceforge.net (David Goodger) Date: Sat, 17 Nov 2001 00:52:02 -0500 Subject: [Doc-SIG] Use cases for inline directive references In-Reply-To: Message-ID: [Paul] >> But your "j2ee-cas" example confuses me - a "replace" directive with an >> embedded "link" directive plus some text? [Alan] > The block argument to "replace" can be any valid RST. Maybe so, but there are easier ways to do it. Here's the original example:: But still, that's nothing compared to a name like |j2ee-cas|. .. |j2ee-cas| replace:: .. link:: http://developer.java.sun.com/developer/earlyAccess/ j2eecas/ the Java`TM`:super: 2 Platform, Enterprise Edition Client Access Services I had to split up the URI; Outlook Express wouldn't allow such a long line. Which shows why a construct like that "link" directive wouldn't work as is: in many cases, URIs will need to be split across lines. How do you tell the difference between URI continuation and the link text? It would need some added syntax (like a blank line in-between). Anyhow, the example above can be represented much more easily as follows:: But still, that's nothing compared to a name like |j2ee-cas|__. .. |j2ee-cas| replace:: the Java`TM`:super: 2 Platform, Enterprise Edition Client Access Services __ http://developer.java.sun.com/developer/earlyAccess/j2eecas/ Substitution references are allowed to be hyperlink references too. That removes the need for a "link" directive, simplifying the construct greatly. But the above still won't work. All inline markup needs to begin after whitespace or a quote or one of "([{<". The "Java`TM`:super:" won't work, because the initial backquote is not in start-string context. A work-around would be to add a space:: .. |j2ee-cas| replace:: the Java `TM`:super: 2 Platform, Enterprise Edition Client Access Services But this is probably not acceptable.
At present, there is no way to do character-level inline markup; it's limited to word-level. A severe limitation, yes, but intentional: it allows text like "2*x" and "2 * x" and "*" (with or without quotes) to exist without causing parser errors. This certainly would create problems for a Japanese or Chinese application of reStructuredText, since these languages do not use spaces between words. Put on your thinking caps, everybody! Any way to allow character-level inline markup without sacrificing the parser's current forgiving nature? [#]_ If not, then we'll just have to live with the limitation. .. [#] I have one (somewhat ugly) idea, but I'll keep it to myself for now to avoid contaminating the creative process. -- David Goodger goodger@users.sourceforge.net Open-source projects: - Python Docstring Processing System: http://docstring.sourceforge.net - reStructuredText: http://structuredtext.sourceforge.net - The Go Tools Project: http://gotools.sourceforge.net From goodger@users.sourceforge.net Sat Nov 17 06:22:40 2001 From: goodger@users.sourceforge.net (David Goodger) Date: Sat, 17 Nov 2001 01:22:40 -0500 Subject: [Doc-SIG] scope of the parser In-Reply-To: Message-ID: Alan Jaffray wrote: > Since the rules for which references are associated with which > targets are defined by reStructuredText, shouldn't the parser > explicitly state which target is to be used for each reference? > Ditto for footnotes, inline directive references, etc. It seems > misplaced for an output formatter to be dealing with details of > autonumbering, ambiguous refnames, etc. This is what I call "linking" (just like compilers... I remember those... glad I don't have to deal with them any more). The writer doesn't do the linking. The parser can't do the linking either, because it may not have complete information during (or at the end of) any one run. It's the Reader component that does the linking. Linking is one example of a transform. 
Transforms will be controlled by the Reader component. For example, when processing Python source docstrings, there will probably be a lot of cross-references between individual docstrings. Each docstring is essentially a separate and independent document. The "Python Source Reader" extracts the docstrings, sends them through the parser, knits the results together into a single coherent doc tree, and *then* the linking can take place. Looks like it's time for another... DPS Components Diagram! Here's my current thinking:: +--------+ +--------+ | READER | ----------------> | WRITER | +--------+ +--------+ / \ / \ / .... \ / \ / / \ \ / \ +--------+ +------------+ +--------+ +------------+ | PARSER | | transforms | | stylist | | deployment | +--------+ | | +--------+ +------------+ | - docinfo | (?) | - titles | | - linking | | - lookups | | - etc. | +------------+ UPPERCASE names are major DPS components; only one of each type is used per document. They are chosen either by the user or based on the input. Lowercase names are groups of common services used dynamically as required. The dotted line between the parser and the transforms indicates that the choice and order of transforms used will depend on the parser as well as the reader. Some transforms used on doc trees generated from reStructuredText will not be required for doc trees generated from other markup. In addition, the transforms used will depend on the reader. Some transforms will be used only by the Python Source Reader, others by the Standalone Document Reader. There will be some overlap as well. I believe that if required at all, "stylists" will be specific to each Writer. They'll transform documents into different layouts. The "deployment" services will comprise at least: output to a single file, output to multiple files in a directory structure, and output to objects in memory. A lot of this is still up in the air, until concrete implementations of each part are complete.
> Meanwhile, since the meaning of directives and the set of meaningful > directive names is *not* defined by reStructuredText, shouldn't the > rST parser output most directives (including unknown ones) > untouched? They're not necessarily defined in the reStructuredText *spec*, but they *are* parser constructs; directives are taken care of by the parser, and only by the parser. Directive-handling code must be present at parse time; if the parser encounters an unknown directive, an error is generated. Directives must transform their data & blocks to doc tree elements at parse time. If any further processing is to be done down the road, the directive generates specialized elements which can be processed by a transform or at some later stage. But once it leaves the parser, there's no longer a "directive" as such. For example, say you have a directive "TOC" which is meant to generate a table of contents. The table of contents typically appears at the head of a document, but you can't generate it until the entire document has been processed. So we insert a ".. toc::" directive where we want the TOC to be, and the TOC directive generates a placeholder element. A transform further downstream could recognize the placeholder element, generate and substitute a full table of contents in place. Such directives would be two- (or more) step processes. Perhaps I should remove the "directive" element from the DTD and dps/nodes.py. I'm only using it for testing right now. All true directives generate proper doc tree elements. Yes, I think I will remove it. Its presence is confusing at best. > It seems advantageous to be able to decouple directive > processing and implementation from reStructuredText parsing and > implementation. We can't do that, because directives are specifically allowed to reuse the parser for their own purposes. For example, the admonition directives (note, caution, danger, etc.) recursively generate and run a parser state machine to do most of their work for them.
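The two-step TOC process described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the flat list of dictionaries standing in for a doc tree, and the names ``toc_directive``, ``toc_transform`` and ``toc_placeholder``, are all hypothetical and are not taken from dps/nodes.py or any other DPS module.

```python
# Hypothetical doc tree: a flat list of nodes, each a dict with a "type".


def toc_directive():
    """Parse-time step: emit a placeholder instead of the finished TOC."""
    return {'type': 'toc_placeholder'}


def toc_transform(tree):
    """Later step: swap each placeholder for a list of section titles."""
    titles = [node['title'] for node in tree if node['type'] == 'section']
    return [
        {'type': 'toc', 'entries': titles}
        if node['type'] == 'toc_placeholder' else node
        for node in tree
    ]


# The placeholder sits at the head of the document, before any section
# has been seen; the transform runs once the whole tree is available.
tree = [
    toc_directive(),
    {'type': 'section', 'title': 'Introduction'},
    {'type': 'section', 'title': 'Syntax'},
]
resolved = toc_transform(tree)
print(resolved[0])
```

The point of the split is that the directive itself never needs to see the rest of the document; only the transform does.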
-- David Goodger goodger@users.sourceforge.net Open-source projects: - Python Docstring Processing System: http://docstring.sourceforge.net - reStructuredText: http://structuredtext.sourceforge.net - The Go Tools Project: http://gotools.sourceforge.net From jaffray@pobox.com Sat Nov 17 21:29:14 2001 From: jaffray@pobox.com (Alan Jaffray) Date: Sat, 17 Nov 2001 16:29:14 -0500 (EST) Subject: [Doc-SIG] scope of the parser In-Reply-To: Message-ID: On Sat, 17 Nov 2001, David Goodger wrote: > Looks like it's time for another... DPS Components Diagram! Thanks for the explanation and the diagram. It cleared things up for me. > > It seems advantageous to be able to decouple directive > > processing and implementation from reStructuredText parsing and > > implementation. > > We can't do that, because directives are specifically allowed to reuse > the parser for their own purposes. For example, the admonition > directives (note, caution, danger, etc.) recursively generate and run > a parser state machine to do most of their work for them. That's why I said "be able to". 95% of the time the desired behavior for directives will be to process the block argument as RST, and not do any other RST processing. In those cases, you can decouple. If you want something else, you have to muck with the inside of the parser. Alan From goodger@users.sourceforge.net Sat Nov 17 22:33:17 2001 From: goodger@users.sourceforge.net (David Goodger) Date: Sat, 17 Nov 2001 17:33:17 -0500 Subject: [Doc-SIG] scope of the parser In-Reply-To: Message-ID: Alan Jaffray wrote: > 95% of the time the desired behavior for directives will be to process the > block argument as RST, and not do any other RST processing. In those cases, > you can decouple. If you want something else, you have to muck with the > inside of the parser. I don't follow. Can you give an example of a directive in the other 5%? What do you mean by "decouple"? 
-- David Goodger goodger@users.sourceforge.net Open-source projects: - Python Docstring Processing System: http://docstring.sourceforge.net - reStructuredText: http://structuredtext.sourceforge.net - The Go Tools Project: http://gotools.sourceforge.net From fdrake@acm.org Sun Nov 18 05:10:58 2001 From: fdrake@acm.org (Fred L. Drake) Date: Sun, 18 Nov 2001 00:10:58 -0500 (EST) Subject: [Doc-SIG] [development doc updates] Message-ID: <20011118051058.E6F6328697@beowolf.digicool.com> The development version of the documentation has been updated: http://python.sourceforge.net/devel-docs/ Update to docs beyond Python 2.2 beta 2: Clarified a couple of points in the SAX API descriptions for startElement() and startElementNS(). Better description of gc.garbage value (gc module). Cleaned up & slightly modernized some sample code in the Python/C API and Extending & Embedding manuals. From jaffray@pobox.com Sun Nov 18 12:54:56 2001 From: jaffray@pobox.com (Alan Jaffray) Date: Sun, 18 Nov 2001 07:54:56 -0500 (EST) Subject: [Doc-SIG] scope of the parser In-Reply-To: Message-ID: David Goodger wrote: > Alan Jaffray wrote: > > 95% of the time the desired behavior for directives will be to process the > > block argument as RST, and not do any other RST processing. In those cases, > > you can decouple. If you want something else, you have to muck with the > > inside of the parser. > > I don't follow. Can you give an example of a directive in the other 5%? A directive indicating an indexed and captioned code fragment:: .. example:: index=2.2 caption=`Recognizing inline markup` inline.openers = '\'"([{<' inline.closers = '\'")]}>' inline.start_string_prefix = (r'(?:(?<=^)|(?<=[ \n%s]))' % re.escape(inline.openers)) inline.end_string_suffix = (r'(?:(?=$)|(?=[- \n.,:;!?%s]))' % re.escape(inline.closers)) A directive indicating raw output for a given format:: .. raw:: format=latex \documentclass[twocolumn]{article} > What do you mean by "decouple"? 
Do directive processing outside the parser, and in fact outside the entire DPS/RST codebase.

If a user comes along and decides they want a new directive, they ought to be able to do this while treating RST as a black box. User feeds in ``.. foo:: bar``, gets back ``<foo args="bar"/>``, and can then process the XML into whatever they want, or treat it as a literal, or output a warning, or whatever.

Alan

From jaffray@pobox.com Mon Nov 19 02:09:46 2001
From: jaffray@pobox.com (Alan Jaffray)
Date: Sun, 18 Nov 2001 21:09:46 -0500 (EST)
Subject: [Doc-SIG] Use cases for inline directive references
In-Reply-To:
Message-ID:

On Sat, 17 Nov 2001, David Goodger wrote:
> At present, there is no way to do character-level inline markup; it's
> limited to word-level. A severe limitation, yes, but intentional: it
> allows text like "2*x" and "2 * x" and "*" (with or without quotes) to
> exist without causing parser errors.

I'm uncomfortable with the current "forgiving" rules. It's clear that you've put a lot of thought into them, and they're clever, and they handle many cases well. But they're not clever enough to prevent the user from being surprised if they use unquoted characters without knowing the rules, and they're sufficiently complicated and ad-hoc that knowing all the rules is difficult. ::

    Then delete *.bak and *~ to clean up the backup files.

    Instead of **argv, Python uses sys.argv.

    The anonymous phrase link syntax is `` `text text text`__ ``.

(I used that last one recently, despite having read the inline markup rules a couple of times already. I only realized it wasn't legal when I reread them again while writing this message.)

I would prefer a simpler, more consistent set of rules, something I can explain in a short sentence. "Backquote and asterisk and non-internal underscores are magic, escape or quote them if you want them literally." If that allows us to do character-level markup, so much the better.
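To make the context-sensitive behaviour concrete, here is a rough sketch of how a word-level rule for single-asterisk emphasis might treat the sentences above. It is a deliberately simplified approximation and an assumption throughout: the opener and closer character sets are borrowed from the parser fragment quoted earlier in this thread, but the regular expression itself and the ``find_emphasis`` helper are invented for illustration, and the real parser also handles strong emphasis, escapes and warnings that this ignores.

```python
import re

# Character sets taken from the parser fragment quoted in this thread.
OPENERS = "'\"([{<"
CLOSERS = "'\")]}>"

EMPHASIS = re.compile(
    # Before the opening "*": start of text, whitespace, or an opener.
    r"(?<![^\s" + re.escape(OPENERS) + r"])"
    # The opening "*" must not be followed by whitespace or another "*".
    r"\*(?![\s*])"
    r"(.+?)"
    # The closing "*" must not be preceded by whitespace.
    r"(?<!\s)\*"
    # After the closing "*": end of text, whitespace, punctuation, or a closer.
    r"(?![^\s\-.,:;!?" + re.escape(CLOSERS) + r"])"
)


def find_emphasis(text):
    """Return the spans this simplified rule would treat as emphasis."""
    return EMPHASIS.findall(text)


for sample in (
    "Then delete *.bak and *~ to clean up the backup files.",
    "Then delete *.bak and backup.* to clean up the backup files.",
    "Instead of **argv, Python uses sys.argv.",
    "2*x and 2 * x",
    "This is *emphasis*.",
):
    print(repr(sample), "->", find_emphasis(sample))
```

Under this approximation only the second and the last sample produce a span; the others are left as plain text because no asterisk pair satisfies both the outside and the inside whitespace conditions.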
Alan From goodger@users.sourceforge.net Mon Nov 19 03:55:50 2001 From: goodger@users.sourceforge.net (David Goodger) Date: Sun, 18 Nov 2001 22:55:50 -0500 Subject: [Doc-SIG] Use cases for inline directive references In-Reply-To: Message-ID: Alan Jaffray wrote: > I'm uncomfortable with the current "forgiving" rules. It's clear that > you've put a lot of thought into them, and they're clever, and they > handle many cases well. But they're not clever enough to prevent > the user from being surprised if they use unquoted characters without > knowing the rules, and they're sufficiently complicated and ad-hoc > that knowing all the rules is difficult. :: > > Then delete *.bak and *~ to clean up the backup files. > > Instead of **argv, Python uses sys.argv. Each of these will generate level-1 system warnings, which ought to be reported to stderr but filtered out of the final output. I consider such processing behavior successful. We may change the system warnings to level-0 if they pop up too often or are too distracting (level-0 warnings are intended to be silently discarded unless a "verbose" or "strict" option is in effect). The "*" in ``*.bak`` is not ended by the "*" in ``*~`` (there's whitespace before it, and non-whitespace after), so there will be no inline elements at all. In order to get such unintended inline elements, you'd need something like this:: Then delete *.bak and backup.* to clean up the backup files. I would suggest that those ought to be literals anyhow. > The anonymous phrase link syntax is `` `text text text`__ ``. > > (I used that last one recently, despite having read the inline markup > rules a couple of times already. I only realized it wasn't legal when > I reread them again while writing this message.) Not recognized because of whitespace. Simply add "no whitespace just inside the delimiters" to the explanation. > I would prefer a simpler, more consistent set of rules, something I can > explain in a short sentence. 
"Backquote and asterisk and non-internal > underscores are magic, escape or quote them if you want them literally." This explanation is fine, if over-protective. It won't cause any surprises, but doesn't inform of the context that allows for the "exceptions" that make life easier and avoid nasty surprises when you're not expecting them. For a beginner's intro, how about: Non-internal asterisk, backquote, vertical bar [#]_, and underscore are inline magic delimiters, ``\`` escape or `` quote them if you want them literally. Asterisk, backquote, and vertical bar act like quote marks; matching delimiters are required at each end of the marked-up text, you need whitespace or other quoting outside them, and you can't have whitespace just inside them. Two (OK, maybe three) sentences, yes, but they say a lot more. The quickref will have examples, a better description, and a link to the spec itself for those who require the gory details. Don't forget, the quickref is the only attempt so far at user documentation; the spec is not. .. [#] Yes, I've given in to the vertical bar syntax. I took a look at the two side by side and carets do lose aesthetically. The likelihood of unintended interaction with table syntax is acceptably small, especially since it requires quite a fancy and unlikely table structure. The potential interaction is well explained in the spec. Also, I've settled on the names: "substitution references" and "substitution definitions". See the latest alternatives.txt for a lengthy discussion of all the alternatives I considered. I'm beyond embarrassment discussing punctuation in public. It comes with the territory. This project is an enormous pedantic exercise. It has to be. :-) > If that allows us to do character-level markup, so much the better. Removing the context would allow character-level markup, but I believe that it would also increase both the likelihood of surprising output and the requirement for escaping to unacceptable levels. 
I haven't done any research to back up this belief, which is partly based on the way StructuredText handled inline markup. I'd be interested to see proof one way or the other, such as the results of a survey of a large and varied body of applicable texts. -- David Goodger goodger@users.sourceforge.net Open-source projects: - Python Docstring Processing System: http://docstring.sourceforge.net - reStructuredText: http://structuredtext.sourceforge.net - The Go Tools Project: http://gotools.sourceforge.net From Paul.Moore@atosorigin.com Mon Nov 19 09:05:34 2001 From: Paul.Moore@atosorigin.com (Moore, Paul) Date: Mon, 19 Nov 2001 09:05:34 -0000 Subject: [Doc-SIG] Use cases for inline directive references Message-ID: <714DFA46B9BBD0119CD000805FC1F53B01B5B0CF@UKRUX002.rundc.uk.origin-it.com> From: Alan Jaffray [mailto:jaffray@pobox.com] > I'm uncomfortable with the current "forgiving" rules. It's clear that > you've put a lot of thought into them, and they're clever, and they > handle many cases well. But they're not clever enough to prevent > the user from being surprised if they use unquoted characters without > knowing the rules, and they're sufficiently complicated and ad-hoc > that knowing all the rules is difficult. I agree that they are somewhat confusing to *write*, but they are very natural to *read*. In other words, I agree that there are many borderline cases which I, as a non-expert, would quote because I don't understand that I don't need to. On the other hand, when I read text written by other people which use things like 2*x inline, with no quoting, I find it very natural to read, and much less obtrusive. So, in my view, I would say that the current rules are good, as they enhance the goal of *allowing* marked-up text which is readable in raw form. (The fact that they don't quite manage to *encourage* writing of such text, because the rules are a bit more complex than the minimum possible, is a shame, but worth it). 
It boils down to whether you like DWIM-type rules (which are never 100% perfect, by their nature). But I don't mind much about the character-level markup issue... Paul From tony@lsl.co.uk Mon Nov 19 11:16:18 2001 From: tony@lsl.co.uk (Tony J Ibbs (Tibs)) Date: Mon, 19 Nov 2001 11:16:18 -0000 Subject: [Doc-SIG] Last Friday, "groups", washing machines Message-ID: <01c701c170eb$9c2a06f0$545aa8c0@lslp862.int.lsl.co.uk> Humph. OK, Friday happened, and we have a new washing machine, but for reasons mostly to do with the vagaries of how such things get delivered and the difficulties of arranging childcare and pickup around them, I actually got less than two hours of computer time - and essentially none over the weekend for other reasons. Mind you, we're now building up a nice backlog of ironing, so it's not all downside! I managed to write something on and "style", but it's still a bit hazy, and I haven't had time to make it shorter. Still, rather than put it off, here goes: .. begin:: reST included text .. ############################################################ Adding , losing and their ilk ====================================================== :Author: Tibs :Date: 2001-11-18 Background ---------- I am currently writing software that will take information from Python source files, produce a DPS node tree therefrom, and allow the user to generate HTML from that. My initial implementation produced a *variant* of the DPS node tree, which contained many structures that related closely to the information derived from Python - for instance, something like:: This is a very silly class. For various reasons, the (implicit) DTD wasn't shaping up very like that proposed by David Goodger, so I asked about the possibility of amending the "standard" DTD. 
This led to a discussion of how the flow of information through a DPS processor should actually work, with the result being David's diagram [#diagram]_:: +--------+ +--------+ +------------+ +--------+ | READER | --> | linker | --> | transforms | --> | WRITER | +--------+ +--------+ +------------+ +--------+ | TOC, index, | | etc. (optional) | +--------+ +--------+ | PARSER | | filer | +--------+ +--------+ David also made the point that, within this plan, the result of the ``linker`` phase is a normal DPS tree, which can be transformed with any of the "standard" transformation tools (for instance, to resolve automatic footnotes), and then output with any writer. Whilst David's diagram is not *quite* how I see the process, it's close enough for this purpose. Thus pydps [#pydps]_ might be shown as:: +--------+ +--------------+ +------------+ +---------+ | READER | --> | transform.py | --> | transforms | --> | html.py | +--------+ +--------------+ +------------+ +---------+ | | +----------+ +---------------+ | visit.py | | buildhtml.py | +----------+ +---------------+ The "READER" is implicit in the main utility (currently ``pydps.py``), and locates the relevant Python files. It then uses ``visit.py`` to generate a tree representing the Python code (the Python ``compiler`` module, standard with Python 2.2 and above, but in ``Tools`` before that, is used). ``transform.py`` (which, by David's diagrams, should maybe be called ``link``) transforms that information into a proper DPS node tree. At the time of writing, no transformations are done. Finally, HTML is output from the DPS node tree by ``html.py``. So, in summary: 1. ``transform.py`` generates a *normal* DPS tree. It doesn't use any "odd" nodes (except - but we'll discuss that later on). This means that it should be possible to plug in any other writer, and produce a different format as output - a very significant advantage. 2. ``html.py`` expects to be given a *normal* DPS tree. 
This means that it should be usable by any other utility that also provides a normal DPS tree - again an advantage. The problem ----------- But there is a clash in requirements here. Whilst it is very nice to be able to represent the Python information as "normal" DPS (guaranteeing that anyone can do useful things with it), there is some need to transfer information that goes beyond that. There are two main reasons for wanting to do this: * Data mining * Presentation For the first, although DPS/reST/pydps is primarily about producing human-viewable documentation, it might also be nice to be able to extract parts of it automatically, for various purposes - for instance, retrieve just the information about which classes inherit from other. This information will, in part, be obvious from the text chosen within the document (a title like "Class Fred" might be taken to be useful, for instance!), but it would be nice to give a bit more help. For the second, it's relatively difficult to produce better layout for DPS Python documentation without more information to work on. If one uses the (rather garish) default presentation produced by pydps (and no, I'm not saying that's a *nice* presentation, but it is the one I've got as an example), it is clearly useful to be able to: 1. group together the package/class/method/etc title and its full name/path/whatever 2. group together a method or function's signature and its docstring David's original approach to this was to introduce a host of Python specific tags into ``nodes.py`` [#nodes]_ - for instance:: package_section module_section ... instance_attribute_section ... parameter_item parameter_tuple ... package module class_attribute There are several problems with approach. Perhaps the most serious is that *all* generic DPS writers need to understand this host of elements that are only relevant to Python. Clearly, someone writing a writer for other purposes may be reluctant to go to this (to them) redundant effort. 
>From my point of view, an immediate problem is that the set of elements is not *quite* what I want - which means working towards a set of patches for ``nodes.py`` and the relevant DTD, and getting David to agree to them (well, that last is a useful process to have in place, but still). Since I'm not likely to get it right immediately, this is a repetitive process. Lastly, one might imagine someone from another programming language domain adopting DPS/reST. One can expect them to be frustrated if the set of constructs provided for Python doesn't quite match the set of constructs required to describe their language in a natural manner. Luckily (!), I have a counter proposal, which hopefully keeps the baby and just loses the bath water. Groups and "style" ------------------ The first thing that I realised was that, for convenience of output at least, I wanted to be able to group elements together - or, in terms of the DPS tree, insert an arbitrary "subroot" within the tree to 'abstract' a branch of the tree. This is particularly useful for linking together the parts of the text that deal with (for instance) attribution, or unusual globals, without having to embed yet another section. There is, of course, precedent. HTML provides
<div> and <span> - one for "structural" elements, and one for inline elements (I forget which is which), and TeX and LaTeX are well-known for their use of grouping (e.g., the ``\begin{xxx}`` and ``\end{xxx}`` in LaTeX). I don't consider it worth making the <div>/<span> distinction in the context of a tree - it is perfectly evident what is being grouped from the elements themselves.

Once one has a <group> element, it is natural to annotate it with *what* it is a group of/for. I chose the arbitrary term "style" - partly because it is not used in HTML/XML for any purpose I am aware of.

And once one has the "style" attribute, it becomes possible to use it elsewhere - most notably in <section> elements, saving the need for a myriad of different sections for different purposes. In these instances, it is perhaps acting more like the "class" attribute in HTML - indicating, to some extent, meaning, but aimed mostly at presentation (via CSS).

The other obvious place to use it is in the automatically generated text for things like names, where (at least pre-"combing"/transformation) one is "pretending" to have assigned a role (just as a person might in a docstring) (but see [#style]_).

Summary
-------
Current DPS defines many element types for use in Python code representation.

However, there are major advantages in only using the "simple" DPS nodes for all purposes.

This becomes simple and practical given a single extra, general purpose, element: <group>.

Furthermore, adding a standard attribute called "style" (or perhaps "role" - see [#style]_) seems to fulfil any other outstanding requirements.

References
----------
.. [#diagram] in email by David Goodger to Doc-SIG, dated 21 September 2001 04:31, "Re: [Doc-SIG] DPS components".

.. [#style] Hmm - maybe "style" should be "role", to match with the way that a :role:`of something` gets handled...

.. [#pydps] Normally to be found at http://www.tibsnjoan.co.uk/reST/pydps.tgz, although note that this is not currently up-to-date.

.. [#nodes] dps/dps/nodes.py in the DPS distribution (``dps.nodes`` if one is importing it).

.. ############################################################

.. end:: reST included text

Tibs

--
Tony J Ibbs (Tibs) http://www.tibsnjoan.co.uk/
"No one trike will do everything... buy the whole set!" - Rob Hague
My views! Mine! Mine! (Unless Laser-Scan ask nicely to borrow them.)
From tony@lsl.co.uk Mon Nov 19 14:45:33 2001
From: tony@lsl.co.uk (Tony J Ibbs (Tibs))
Date: Mon, 19 Nov 2001 14:45:33 -0000
Subject: [Doc-SIG] Last Friday, "groups", washing machines - addendum
In-Reply-To: <01c701c170eb$9c2a06f0$545aa8c0@lslp862.int.lsl.co.uk>
Message-ID: <01c801c17108$d788fa40$545aa8c0@lslp862.int.lsl.co.uk>

One important thing I forgot to say:

It is not possible to insert a <group> into a DPS tree by use of reST alone (well, subject to not doing sophisticated/complicated substitutions a la Jaffray - but that's such an escape clause I shall ignore it). That is, there is no reST "spelling" of <group>.

<group> elements are *only* intended to be inserted by things which manipulate the DPS tree directly - i.e., linkers and transformers.

Furthermore, a "plain" DPS writer is perfectly at liberty (and may perhaps be expected) to treat a <group> element as if it had no representation - for instance, in my own "plain" HTML Writer class::

    def write_group(self, element, stream):
        """By default, we don't do anything with <group> tags.
        """
        # Write the children straight through; the group itself emits nothing.
        for node in element:
            self.write_html(node, stream)

Alternatively (in HTML output) a reasonable response would be to translate it to <div> or <span> as appropriate - but that is *not required*.

Tibs

--
Tony J Ibbs (Tibs) http://www.tibsnjoan.co.uk/
Marked up with reStructuredText: http://structuredtext.sf.net/
My views! Mine! Mine! (Unless Laser-Scan ask nicely to borrow them.)

From goodger@users.sourceforge.net Tue Nov 20 04:01:39 2001
From: goodger@users.sourceforge.net (David Goodger)
Date: Mon, 19 Nov 2001 23:01:39 -0500
Subject: [Doc-SIG] scope of the parser
In-Reply-To:
Message-ID:

Alan:
> > > 95% of the time the desired behavior for directives will be to
> > > process the block argument as RST, and not do any other RST
> > > processing. In those cases, you can decouple. If you want
> > > something else, you have to muck with the inside of the parser.

David:
> > I don't follow. Can you give an example of a directive in the
> > other 5%?

Alan:
> A directive indicating an indexed and captioned code fragment::
>
>     .. example:: index=2.2 caption=`Recognizing inline markup`
>        inline.openers = '\'"([{<'
>        inline.closers = '\'")]}>'
>        inline.start_string_prefix = (r'(?:(?<=^)|(?<=[ \n%s]))'
>                                      % re.escape(inline.openers))
>        inline.end_string_suffix = (r'(?:(?=$)|(?=[- \n.,:;!?%s]))'
>                                    % re.escape(inline.closers))

I don't see how this is an example of a non-RST (or post-RST) directive. It seems like a simple variant of a literal block. Perhaps it depends on what the "index" and "caption" do? However, the second example is easier to understand:

> A directive indicating raw output for a given format::
>
>     .. raw:: format=latex
>        \documentclass[twocolumn]{article}

I assume this is to be passed through to the LaTeX writer untouched. Easy to accomplish. The "raw" directive should produce something like this::

    \documentclass[twocolumn]{article}

Even better would be if the element name was "raw-latex". Either way, there are two possibilities here:

1. A directive becomes widespread enough to warrant built-in support. The DTD, dps.nodes, and all writers learn about it (some may learn to ignore it).

2. It's of limited application-specific use. The application's code needs a dps.nodes.Node subclass (probably subclass of TextElement) to handle it, the application may need a DTD extension [#]_, and the application's writer variants need support.

.. [#] Currently, the DTD is used only as a formalized notation of the DPS doc tree implemented in dps.nodes. In future, there might be direct doc tree validation support added to the code, which may use the DTD or some other notation (such as TREX or RELAX).

> > What do you mean by "decouple"?
>
> Do directive processing outside the parser, and in fact outside the
> entire DPS/RST codebase.

Do you mean "processing using code outside the DPS/RST project codebase, but still run interleaved with it" or "processing after the DPS/RST codebase has finished its work"? The former can be done now. The latter, I'm not sure about.

> If a user comes along and decides they want a new directive, they
> ought to be able to do this while treating RST as a black box.

I don't agree. You need a degree of understanding of the inner workings of a system in order to extend it. Adding a directive might require a less than complete understanding of the internals, but not none.

> User feeds in ``.. foo:: bar``, gets back ``<foo args="bar"/>``,
> and can then process the XML into whatever they
> want, or treat it as a literal, or output a warning, or whatever.

The problem with this is that error checking goes out the window. A directive with a misspelled type name would trigger no warnings in the parser, and possibly not even in the writer.

If application-specific directives appear that cannot or should not be added to the core parser, the current code can't handle it.
In order to add a directive, you have to put the directive implementation module into the restructuredtext/directives directory, and edit project source files: restructuredtext/directives/__init__.py for a canonical directive name to (module name, function name) mapping, and restructuredtext/languages/en.py for a language-specific to canonical name mapping. This shortcoming could be remedied to allow runtime additions to the "directive registry", with implementation modules outside of the parser's package tree. -- David Goodger goodger@users.sourceforge.net Open-source projects: - Python Docstring Processing System: http://docstring.sourceforge.net - reStructuredText: http://structuredtext.sourceforge.net - The Go Tools Project: http://gotools.sourceforge.net From goodger@users.sourceforge.net Tue Nov 20 04:24:05 2001 From: goodger@users.sourceforge.net (David Goodger) Date: Mon, 19 Nov 2001 23:24:05 -0500 Subject: [Doc-SIG] Re: Adding , losing and their ilk In-Reply-To: <01c801c17108$d788fa40$545aa8c0@lslp862.int.lsl.co.uk> Message-ID: (I've dropped the washing machine reference from the title because it's TOO SILLY!) Tony J Ibbs (Tibs) wrote: > Whilst David's diagram is not *quite* how I see the process, it's > close enough for this purpose. Thus pydps [#pydps]_ might be shown > as:: > > +--------+ +--------------+ +------------+ +---------+ > | READER | --> | transform.py | --> | transforms | --> | html.py | > +--------+ +--------------+ +------------+ +---------+ > | | > +----------+ +---------------+ > | visit.py | | buildhtml.py | > +----------+ +---------------+ > > The "READER" is implicit in the main utility (currently > ``pydps.py``), and locates the relevant Python files. It then > uses ``visit.py`` to generate a tree representing the Python code I'd consider ``visit.py`` to be part of the Reader. > So, in summary: > > 1. ``transform.py`` generates a *normal* DPS tree. It doesn't > use any "odd" nodes (except - but we'll discuss > that later on). 
This means that it should be possible to > plug in any other writer, and produce a different format as > output - a very significant advantage. Substitute "The Reader" for "transform.py" and: Yes. In the revised model, the Reader subsumes the transforms. > 2. ``html.py`` expects to be given a *normal* DPS tree. This > means that it should be usable by any other utility that > also provides a normal DPS tree - again an advantage. Ditto "The Writer" for "html.py": Yes. > The problem > ----------- > But there is a clash in requirements here. Whilst it is very nice > to be able to represent the Python information as "normal" DPS > (guaranteeing that anyone can do useful things with it), there is > some need to transfer information that goes beyond that. There > are two main reasons for wanting to do this: > > * Data mining > * Presentation > > For the first, although DPS/reST/pydps is primarily about > producing human-viewable documentation, it might also be nice to > be able to extract parts of it automatically, for various > purposes - for instance, retrieve just the information about > which classes inherit from other. This information will, in part, > be obvious from the text chosen within the document (a title like > "Class Fred" might be taken to be useful, for instance!), but it > would be nice to give a bit more help. Wouldn't that be better left to a tool accessing the information directly? (Correct me if I'm wrong...) Your visit.py extracts structural information from the AST via compiler.py, and transforms it into the trunk and major branches of the doc tree, onto which the parsed docstrings get attached like minor branches and leaves. The doc tree is one step removed from the AST; I would think that a data mining application would want to start as close to the source AST data as possible. I wouldn't want to try to extract such structural information from textual clues. There are certain limits on such operations (as we all well know). 
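As a rough illustration of a tool that mines the source structure directly rather than the doc tree, the sketch below collects class inheritance information straight from the parsed source. It is an assumption-laden example: it uses the present-day standard library ``ast`` module (with ``ast.unparse``, available from Python 3.9) as a stand-in for the ``compiler`` module mentioned in this thread, and ``class_bases`` is a hypothetical helper name.

```python
import ast


def class_bases(source):
    """Map each class name found in `source` to the list of its base classes.

    The information comes from the syntax tree itself, so no textual clues
    (such as a title like "Class Fred") are needed.
    """
    tree = ast.parse(source)
    result = {}
    for node in ast.walk(tree):
        if isinstance(node, ast.ClassDef):
            # ast.unparse turns each base expression back into source text.
            result[node.name] = [ast.unparse(base) for base in node.bases]
    return result


SAMPLE = """
class Fred:
    pass

class Jim(Fred):
    pass

class Sheila(Jim, dict):
    pass
"""

print(class_bases(SAMPLE))
```

A data-mining tool built this way never depends on how a Reader or Writer chose to present the same information.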
> For the second, it's relatively difficult to produce better > layout for DPS Python documentation without more information to > work on. If one uses the (rather garish) default presentation > produced by pydps (and no, I'm not saying that's a *nice* > presentation, but it is the one I've got as an example), it is > clearly useful to be able to: > > 1. group together the package/class/method/etc title and its > full name/path/whatever > > 2. group together a method or function's signature and its > docstring I would say that's a job for the Reader and its tranforms, not for the Writer. Let me elaborate a bit more on the latest diagram:: 1,3,5 6,8 +--------+ +--------+ | READER | =======================> | WRITER | +--------+ (purely presentational) +--------+ // \ / \ // \ / \ 2 // 4 \ 7 / 9 \ +--------+ +------------+ +------------+ +---------------+ | PARSER |...| reader | | writer |...| deployment | +--------+ | transforms | | transforms | | | | | | | | - one file | | - docinfo | | - styling | | - many files | | - titles | | - writer- | | - object data | | - linking | | specific | | structure | | - lookups | | - etc. | +---------------+ | - reader- | +------------+ | specific | | - parser- | | specific | | - layout | | - etc. | +------------+ I've added double-width lines between reader & parser and between reader & writer, meaning that data sent along these paths should be standard (pure & unextended) DPS doc trees. Single-width lines signify that internal tree extensions are OK (but must be supported internally at both ends), and may in fact be totally unrelated to the DPS doc tree structure. I've added "reader-specific" and "layout" transforms to the list of transforms. BTW, these transforms are not necessarily all in one directory; it's a nebulous grouping (it's hard to draw ASCII clouds). I've also added numbers to show the path a document would take through the code. 
> David's original approach to this was to introduce a host of > Python specific tags into ``nodes.py`` [#nodes]_ - for instance:: > > package_section > module_section > ... > instance_attribute_section > ... > parameter_item > parameter_tuple > ... > package > module > class_attribute > > There are several problems with approach. Perhaps the most > serious is that *all* generic DPS writers need to understand > this host of elements that are only relevant to Python. Clearly, > someone writing a writer for other purposes may be reluctant to > go to this (to them) redundant effort. I think the Python-specific extensions should be removed from dps.nodes and relocated to the PySource Reader for Reader-internal use only. The DTDs have been split since the beginning: gpdi.dtd for generic elements, and pdpi.dtd for Python-specific stuff. Writer modules should support only the standard doc tree elements, and should be considered "presentation only". I'm sure there are lots of changes to the doc tree structure that can be made for the benefit of Writers. The only reason there aren't any presentation-oriented attributes on elements is that nobody's added them yet. > From my point of view, an immediate problem is that the set of > elements is not *quite* what I want - which means working towards > a set of patches for ``nodes.py`` and the relevant DTD, and > getting David to agree to them (well, that last is a useful > process to have in place, but still). Since I'm not likely to get > it right immediately, this is a repetitive process. If we extract the Python-specific stuff out of dps.nodes, you'll have a free hand to do whatever you like within the PyDPS/PySource Reader. > Lastly, one might imagine someone from another programming > language domain adopting DPS/reST. One can expect them to be > frustrated if the set of constructs provided for Python doesn't > quite match the set of constructs required to describe their > language in a natural manner. 
So they write the LanguageXSource Reader with their own language-specific extensions. They'd have to anyway since compiler.py won't help them. > Groups and "style" > ------------------ > The first thing that I realised was that, for convenience of > output at least, I wanted to be able to group elements together - > or, in terms of the DPS tree, insert an arbitrary "subroot" > within the tree to 'abstract' a branch of the tree. > > This is particularly useful for linking together the parts of the > text that deal with (for instance) attribution, or unusual > globals, without having to embed yet another section. That was the purpose of the Python-specific elements in ppdi.dtd. But they were just my first guess at what would be needed; feel free to modify as necessary. > Once one has a element, it is natural to annotate it with > *what* it is a group of/for. I chose the arbitrary term "style" - > partly because it is not used in HTML/XML for any purpose I am > aware of. > > And once one has the "style" attribute, it becomes possible to > use it elsewhere - most notably in
elements, saving the > need for a myriad of different sections for different purposes. I can see using a "style" attribute to communicate formatting information between the Reader and the Writer (which would be free to ignore the advice if not understood). I'd much rather have a bunch of different section-level classes than one class with a "style" attribute. With section-level classes, we can use polymorphism to advantage. Plus, the classes are significantly different. Here's one:: The "package" element is just a replacement for a generic section's "title". It is easier to hang boilerplate text or formatting onto a specific element though. But here's another:: The "class" element contains the class name; it's the section title. But then there's an inheritance list and parameter list; much more interesting. If you had a generic "group" element, *and* you wanted an "inheritance_list" sometimes, you'd have to allow it anywhere, even when it's not applicable. You'd either end up with a freeform doc tree structure or one that's impossible to validate. > Summary > ------- > Current DPS defines many element types for use in Python code > representation. Gut 'em. > However, there are major advantages in only using the "simple" > DPS nodes for all purposes. True. > This becomes simple and practical given a single extra, general > purpose, element: . Nah. (For the same reason we don't use DOM: too general.) > Furthermore, adding a standard attribute called "style" (or > perhaps "role" - see [#style]_) seems to fulfil any other > outstanding requirements. Could be... 
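To make the polymorphism argument above concrete, here is a minimal sketch in Python. It is an illustration only: the class names (``Section``, ``PackageSection``, ``ClassSection``) and the ``render_all`` helper are hypothetical and are not taken from dps.nodes or any DTD. It assumes a section is just a title plus some child text.

```python
# Hypothetical sketch; these classes are NOT the real dps.nodes API.

class Section:
    """Generic section: a title plus child text."""

    def __init__(self, title, children=()):
        self.title = title
        self.children = list(children)

    def header(self):
        return self.title

    def render(self):
        return "\n".join([self.header()] + self.children)


class PackageSection(Section):
    """Python-specific section that hangs boilerplate onto the title."""

    def header(self):
        return "Package: " + self.title


class ClassSection(Section):
    """Python-specific section: a title plus an inheritance list."""

    def __init__(self, title, bases=(), children=()):
        super().__init__(title, children)
        self.bases = list(bases)

    def header(self):
        if self.bases:
            return "Class: %s(%s)" % (self.title, ", ".join(self.bases))
        return "Class: " + self.title


def render_all(sections):
    # The caller never inspects a "style" attribute; each class
    # decides for itself how it is presented.
    return [section.render() for section in sections]
```

In this sketch only ``ClassSection`` can carry an inheritance list, so the structure stays checkable. A single generic group class with a "style" attribute would have to accept the list everywhere.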
-- David Goodger goodger@users.sourceforge.net Open-source projects: - Python Docstring Processing System: http://docstring.sourceforge.net - reStructuredText: http://structuredtext.sourceforge.net - The Go Tools Project: http://gotools.sourceforge.net From tony@lsl.co.uk Tue Nov 20 11:26:59 2001 From: tony@lsl.co.uk (Tony J Ibbs (Tibs)) Date: Tue, 20 Nov 2001 11:26:59 -0000 Subject: [Doc-SIG] RE: Adding , losing and their ilk In-Reply-To: Message-ID: <01e601c171b6$44988f80$545aa8c0@lslp862.int.lsl.co.uk> David Goodger wrote: > (I've dropped the washing machine reference from the title because > it's TOO SILLY!) That's OK - I hadn't thought about people's "subject" fields when they're *replying* to the message! There's a lot for me to digest in your reply - I shall have to mull it over properly. Clearly I've still to understand the deep meaning of the DPS way - but then working this sort of thing out leads one ever closer. By the way - the diagram is getting better every time! Specific replies/comments another time. Tibs -- Tony J Ibbs (Tibs) http://www.tibsnjoan.co.uk/ "No one trike will do everything... buy the whole set!" - Rob Hague My views! Mine! Mine! (Unless Laser-Scan ask nicely to borrow them.) From goodger@users.sourceforge.net Tue Nov 20 13:19:13 2001 From: goodger@users.sourceforge.net (David Goodger) Date: Tue, 20 Nov 2001 08:19:13 -0500 Subject: [Doc-SIG] Re: Adding , losing and their ilk In-Reply-To: <01e601c171b6$44988f80$545aa8c0@lslp862.int.lsl.co.uk> Message-ID: Tony J Ibbs (Tibs) wrote: > By the way - the diagram is getting better every time! It's got a nice symmetry to it now, doesn't it? 
-- David Goodger goodger@users.sourceforge.net Open-source projects: - Python Docstring Processing System: http://docstring.sourceforge.net - reStructuredText: http://structuredtext.sourceforge.net - The Go Tools Project: http://gotools.sourceforge.net From amos@zope.com Thu Nov 22 00:52:19 2001 From: amos@zope.com (Amos Latteier) Date: Wed, 21 Nov 2001 16:52:19 -0800 Subject: [Doc-SIG] Getting started with restructured text Message-ID: <3BFC4C43.7000603@zope.com> Hi, I don't know much about restx, but what I've seen so far impresses me. The spec seems well thought out. I currently use structured text quite a bit. In fact I've written a book with it. I am painfully aware of shortcomings in structured text. I'd like to evaluate restx as an alternative for writing articles and books. I downloaded the latest cvs snapshot and tried to figure out how to get going. I ran into a couple problems: * distutils doesn't add __init__.py files to the dps and parsers directories that it creates. This means that they aren't considered packages. I don't know if this is a distutils bug or a problem with how restx uses distutils. * restx seems to require that the entire dps package be installed. * There is no obvious way to create HTML output. Should I write an HTML formatter along the lines shown in dps.formatters.model? Or is XSLT preferred? Thanks for any pointers. -Amos -- Amos Latteier amos@zope.com Zope Corporation http://www.zope.com/ From goodger@users.sourceforge.net Thu Nov 22 03:46:22 2001 From: goodger@users.sourceforge.net (David Goodger) Date: Wed, 21 Nov 2001 22:46:22 -0500 Subject: [Doc-SIG] Getting started with restructured text In-Reply-To: <3BFC4C43.7000603@zope.com> Message-ID: Hi Amos, > I don't know much about restx, but what I've seen so far impresses me. > The spec seems well thought out. Thanks! We aim to please. > * distutils doesn't add __init__.py files to the dps and parsers > directories that it creates. This means that they aren't considered > packages. 
I don't know if this is a distutils bug or a problem with how
> restx uses distutils.

That's odd. There *are* __init__.py files in those directories in CVS. I just checked the latest CVS snapshot, and the files are there too. The snapshots are generated daily but not archived (CVS is the archive); it's possible that there may have been a glitch in the snapshot you downloaded. Glitches have been known to happen, but they've generally been all-or-nothing (the SourceForge CVS repository is unavailable when the script is run, etc.).

Perhaps it is a distutils thing? I can't say; I've been running code from my local CVS tree. I'll make a point of periodically making sure setup.py/install.py work correctly.

Those two files are just empty package indicators though. Others are not.

> * restx seems to require that the entire dps package be installed.

Yes. reStructuredText is a part of the DPS project. I felt it best to keep them separate as I wanted to allow for alternate syntaxes (StructuredText classic/NG, Setext, TeX, XML, or anything else anybody wanted to hack on). DPS is currently little more than support for the reStructuredText parser though. I plan to merge the two projects at some point into the "docutils" project, but I want to leave it until the ideas and APIs have matured. The parser is nearly complete, but the rest of the system isn't there yet.

> * There is no obvious way to create HTML output. Should I write an HTML
> formatter along the lines shown in dps.formatters.model? Or is XSLT
> preferred?

That's up to you! Tony Ibbs has produced a first stab at what I call a "Reader", a Python source analyzer, and an HTML "Writer", still available at http://www.tibsnjoan.co.uk/reST/pydps.tgz. It's no longer being maintained in favour of a fabled new implementation (which I'd love to see one of these days). Several contributors have produced XSL stylesheets; they're available in the reStructuredText "sandbox". Unfortunately, they're not up to date.
I plan to create a standalone Reader (i.e., input consists of standalone reStructuredText source files without context), but haven't had time yet.

This is definitely a work in progress. Contributions would be most appreciated!

--
David Goodger goodger@users.sourceforge.net
Open-source projects:
- Python Docstring Processing System: http://docstring.sourceforge.net
- reStructuredText: http://structuredtext.sourceforge.net
- The Go Tools Project: http://gotools.sourceforge.net

From Juergen Hermann" Message-ID:

On Wed, 21 Nov 2001 22:46:22 -0500, David Goodger wrote:

>Perhaps it is a distutils thing? I can't say; I've been running code from my
>local CVS tree. I'll make a point of periodically making sure
>setup.py/install.py work correctly.
>
>Those two files are just empty package indicators though. Others are not.

Given this info, it's a WinZIP thing, which NEVER unpacks empty files. Just add a docstring, as you should anyway ;), and the problem is solved.

Ciao, Jürgen

--
Jürgen Hermann, Developer (jhe@webde-ag.de)
WEB.DE AG, http://webde-ag.de/

From tony@lsl.co.uk Thu Nov 22 10:26:42 2001
From: tony@lsl.co.uk (Tony J Ibbs (Tibs))
Date: Thu, 22 Nov 2001 10:26:42 -0000
Subject: [Doc-SIG] Getting started with restructured text
In-Reply-To:
Message-ID: <021301c17340$2dd9bc70$545aa8c0@lslp862.int.lsl.co.uk>

David Goodger wrote:

> Tony Ibbs has produced a first stab at what I call a
> "Reader", a Python source analyzer, and an HTML "Writer",
> still available at
> http://www.tibsnjoan.co.uk/reST/pydps.tgz. It's no longer
> being maintained in favour of a fabled new implementation
> (which I'd love to see one of these days).

Erm, well, I wouldn't say "no longer being maintained" - I still haven't changed the name. I *did* try to upload a new version this morning, but something has gone wrong with my ability to upload (password changed under my feet), so I'm awaiting a reply from "customer services" before I can do anything more.
If Amos wants a copy urgently, I can always email it to him - it seems to cope adequately with "plain" reST text, although it's a bit behind the times on some subtleties that have been changing lately! Tibs -- Tony J Ibbs (Tibs) http://www.tibsnjoan.co.uk/ .. "equal" really means "in some sense the same, but maybe not .. the sense you were hoping for", or, more succinctly, "is .. confused with". (Gordon McMillan, Python list, Apr 1998) My views! Mine! Mine! (Unless Laser-Scan ask nicely to borrow them.) From goodger@users.sourceforge.net Thu Nov 22 23:37:14 2001 From: goodger@users.sourceforge.net (David Goodger) Date: Thu, 22 Nov 2001 18:37:14 -0500 Subject: [Doc-SIG] Getting started with restructured text In-Reply-To: Message-ID: David Goodger wrote: >> Perhaps it is a distutils thing? I can't say; I've been running code from my >> local CVS tree. I'll make a point of periodically making sure >> setup.py/install.py work correctly. >> >> Those two files are just empty package indicators though. Others are not. Juergen Hermann wrote: > Given this info, it's a WinZIP thing, which NEVER unpacks empty files. Just > add a docstring, as you should anyway ;), and the problem is solved. Actually, they have a few lines of comments each, so they're not *completely* empty. -- David Goodger goodger@users.sourceforge.net Open-source projects: - Python Docstring Processing System: http://docstring.sourceforge.net - reStructuredText: http://structuredtext.sourceforge.net - The Go Tools Project: http://gotools.sourceforge.net From amos@zope.com Mon Nov 26 18:20:19 2001 From: amos@zope.com (Amos Latteier) Date: Mon, 26 Nov 2001 10:20:19 -0800 Subject: [Doc-SIG] Getting started with restructured text References: Message-ID: <3C0287E3.8000604@zope.com> David Goodger wrote: >>* distutils doesn't add __init__.py files to the dps and parsers >>directories that it creates. This means that they aren't considered >>packages. 
I don't know if this is a distutils bug or a problem with how >>restx uses distutils. >> > > That's odd. There *are* __init__.py files in those directories in CVS. I > just checked the latest CVS snapshot, and the files are there too. I think the problem was that I was trying to use rstx without installing dps. >>* There is no obvious way to create HTML output. Should I write an HTML >>formatter along the lines shown in dps.formatters.model? Or is XSLT >>preferred? >> > > That's up to you! Tony Ibbs has produced a first stab at what I call a > "Reader", a Python source analyzer, and an HTML "Writer", still available at > http://www.tibsnjoan.co.uk/reST/pydps.tgz. It's no longer being maintained > in favour of a fabled new implementation (which I'd love to see one of these > days). Several contributors have produced XSL stylesheets; they're available > in the reStructuredText "sandbox". Unfortunately, they're not up to date. > > I plan to create a standalone Reader (i.e., input consists of standalone > reStructuredText source files without context), but haven't had time yet. > > This is definitely a work in progress. Contributions would be most > appreciated! I'm working on a HTML formatter which I'll contribute. Thanks! -Amos -- Amos Latteier amos@zope.com Zope Corporation http://www.zope.com/ From fdrake@acm.org Mon Nov 26 21:38:59 2001 From: fdrake@acm.org (Fred L. Drake) Date: Mon, 26 Nov 2001 16:38:59 -0500 (EST) Subject: [Doc-SIG] [development doc updates] Message-ID: <20011126213859.1A7AE28696@cj42289-a.reston1.va.home.com> The development version of the documentation has been updated: http://python.sourceforge.net/devel-docs/ Various small corrections and additions. From goodger@users.sourceforge.net Tue Nov 27 05:43:20 2001 From: goodger@users.sourceforge.net (David Goodger) Date: Tue, 27 Nov 2001 00:43:20 -0500 Subject: [Doc-SIG] yummy treats! Message-ID: It really was only meant as a joke... 
From the reStructuredText home page:

    If you can test, administrate, or contribute code, ideas, bug reports, yummy treats, computer equipment, and/or large sums of money, please `contact the project administrator`_.

    .. _contact the project administrator: me_

Got a call from my wife today. Seems a package arrived at our door, addressed to me but "'re: structured text', what does that mean?". I get home, and my 4-year-old son helps me to sift through the styrofoam peanuts to dig out a seemingly endless assortment of delectables, from caramel corn to ginger chews (*way* too spicy for my taste, but I'm sure we'll find them a good home), from light-as-feathers lemon meringues to a "Pound Plus" (500g) bar of Belgian bittersweet chocolate. About a cubic foot of yummy treats.

Thanks Alan! Whatever possessed you?

Now if only someone would send me a nice new (or even slightly used) laptop. Boy would the rate of checkins ever go up then! JUST KIDDING!!! (well...)

--
David Goodger goodger@users.sourceforge.net
Open-source projects:
- Python Docstring Processing System: http://docstring.sourceforge.net
- reStructuredText: http://structuredtext.sourceforge.net
- The Go Tools Project: http://gotools.sourceforge.net

From tony@lsl.co.uk Tue Nov 27 16:08:15 2001
From: tony@lsl.co.uk (Tony J Ibbs (Tibs))
Date: Tue, 27 Nov 2001 16:08:15 -0000
Subject: [Doc-SIG] yummy treats!
In-Reply-To:
Message-ID: <000201c1775d$b8d64ac0$545aa8c0@lslp862.int.lsl.co.uk>

> It really was only meant as a joke...
>
> Thanks Alan! Whatever possessed you?

Damn - he got in first (but congrats to Alan on actually *doing* it! and in such an impressive manner). I'm still thinking of investigating the depths of Glaswegian confectionery [1]_ after Christmas, though (we'll be visiting relatives).

> Now if only someone would send me a nice new (or even
> slightly used) laptop.

Hmm - I'm afraid I've only just bought my own (slightly used) laptop, and you shan't have it except over my cold, dead, etc.
But if someone else wants to chip in... Tibs .. [1] Glasgow, where they test sweets, 'cos if it doesn't sell in Glasgow it won't sell anywhere. ...where they sell half-pound bars of Caramac (a substance almost but not entirely unlike chocolate, except sweeter). ...where I first had tablet (basically, take condensed milk and render it down into a solid...) Glasgow, a good place to be a dentist. -- Tony J Ibbs (Tibs) http://www.tibsnjoan.co.uk/ "I'm a little monster, short and stout Here's my horns and here's my snout When you come a calling, hear me shout I will ROAR and chase you out" My views! Mine! Mine! (Unless Laser-Scan ask nicely to borrow them.) From fdrake@acm.org Wed Nov 28 07:57:33 2001 From: fdrake@acm.org (Fred L. Drake) Date: Wed, 28 Nov 2001 02:57:33 -0500 (EST) Subject: [Doc-SIG] [development doc updates] Message-ID: <20011128075733.9558928696@beowolf.digicool.com> The development version of the documentation has been updated: http://python.sourceforge.net/devel-docs/ Minor updates; mostly changed organization of the "Internet Data Handling" chapter. From fdrake@acm.org Thu Nov 29 08:48:05 2001 From: fdrake@acm.org (Fred L. Drake) Date: Thu, 29 Nov 2001 03:48:05 -0500 (EST) Subject: [Doc-SIG] [development doc updates] Message-ID: <20011129084805.0885328696@beowolf.digicool.com> The development version of the documentation has been updated: http://python.sourceforge.net/devel-docs/ Added section of example regular expressions to the "re" module docs. Various small clarifications. From fdrake@acm.org Fri Nov 30 06:10:49 2001 From: fdrake@acm.org (Fred L. Drake) Date: Fri, 30 Nov 2001 01:10:49 -0500 (EST) Subject: [Doc-SIG] [development doc updates] Message-ID: <20011130061049.B6ACB28696@beowolf.digicool.com> The development version of the documentation has been updated: http://python.sourceforge.net/devel-docs/ Updated the httplib docs to cover the API provided in recent versions of Python. Please review and comment on this and all other new material! 
(In case you have forgotten, other large chunks of new material include the sections covering modules in the email package, the compiler package, and the Tkinter/Tix chapter.)