A vector version of the BSD Daemon

Preview of the BSD Daemon

The BSD daemon was originally drawn by John Lasseter. The copyright holder for these images is Marshall Kirk McKusick <mckusick@mckusick.com>.

[some more simple paragraphs omitted]

Grab the tarball here.

Have fun!
Simon Budig <simon@budig.de>

I think it should be fairly trivial to create such an output with reST (just need to change some small things for the header/footer). And there is no CSS used yet.

Bye,
Simon
--
Simon.Budig@unix-ag.org http://www.home.unix-ag.org/simon/

From goodger@users.sourceforge.net Wed Jul 3 05:02:28 2002
From: goodger@users.sourceforge.net (David Goodger)
Date: Wed, 03 Jul 2002 00:02:28 -0400
Subject: [Doc-SIG] References in the same line as the target text
In-Reply-To: <20020703014603.A43242@vmax.unix-ag.uni-siegen.de>
Message-ID:

Simon,

I don't have time to answer all of your points right away, but I can quickly respond to some of them.

Simon Budig wrote:
> It is in my eyes a great pity that you seem to have a tendency to
> limit reST to the docstring-scope.

Nothing could be further from the truth! Although a major goal is to get docstring processing working, and it will be a major part of the project, take a look at the project right now: docstring processing is not yet part of the core. Tony Ibbs has a preliminary prototype in the sandbox, but that's it. The only working part of the project now is geared toward producing web pages! And there's plenty of room for improvement, which would be quite welcome.

Just because I'm not accepting the ``reference_(URL)`` syntax, doesn't mean reStructuredText is anti-HTML. Simply put, that syntax just doesn't fit with the rest of reStructuredText. Please don't confuse unrelated issues.

[Referring to http://docutils.sf.net/index.txt, source of http://docutils.sf.net/index.html:]
> Just to show the ugliness of anonymous and named links: This is from
> the webpage source (as in CVS):

Yes, the source to that web page is not pretty. It was written in a lazy manner, using many anonymous hyperlinks. But that's OK, because this is an example of a document where only the HTML is meant to be seen. The source is *not* intended to be read by anyone but me or other project developers.
That file is not distributed with the project code or documentation. It is not given as a model for good reStructuredText usage. Also, being a home page, it is complex and full of external links, much more so than a typical document.

> Please don't try to tell me that a newbie easily finds the URL for
> `Docutils CVS repository`__

Sorry to burst your bubble, but I think it would be *very easy* for a newbie to find the URL. They just click on the text "Docutils CVS repository" on the HTML page, and their browser takes them there. No newbie would ever be exposed to the source text. (This is a poor choice to single out as a poor example. ;-)

will (willg@bluesock.org) wrote:
>> Personally, I think you're crazy to use reST with SSI and whatever
>> else to build your web-site.

I don't think it's crazy; the Docutils web site is built with reStructuredText/Docutils and I intend for this functionality to become more sophisticated. I agree with Simon when he wrote:

[Simon]
> I think that there really is a need for a simple markup language
> that can be used by - for example - a secretary to maintain a simple
> Website. HTML fails the "understandable to non-geek-guys"-test,
> GUI-tools are known to produce crappy HTML code when used by
> non-experts.
>
> reST really could fill a gap here.

And that's one of the things it's already doing, and pretty successfully I think. Of course there's room to grow.

> Just to show you what is possible with SSIs and why I think using
> reST for webpages is not crazy:
>
> Have a look at http://www.home.unix-ag.org/simon/bsdaemon/

Apart from the extras pulled in through server-side includes, the page is very simple, nothing that reStructuredText couldn't handle.

> The raw source of it is::
>
>

A server-side include directive could be added to reStructuredText, no problem.
Although I'm very careful about new *syntax*, new directives are much easier to let in, because they're explicit and typically don't require new syntax (and if they do, it's localized and *explicit*).

...
> Preview of the BSD Daemon
> width="326" height="352">

I've thought of adding support for more image attributes, like "align". Care to try?

> I think it should be fairly trivial to create such an output with
> reST

I agree, as long as the fancy graphical layout elements don't have to be reStructuredText. That's what stylesheets and server-side includes are for.

--
David Goodger
Open-source projects:
- Python Docutils: http://docutils.sourceforge.net/
  (includes reStructuredText: http://docutils.sf.net/rst.html)
- The Go Tools Project: http://gotools.sourceforge.net/

From fdrake@acm.org Wed Jul 3 06:10:00 2002
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Wed, 3 Jul 2002 01:10:00 -0400
Subject: [Doc-SIG] Re: [Python-Dev] [development doc updates]
In-Reply-To: <20020702231715.GG25927@laranja.org>
References: <20020702222813.8990118EC22@grendel.zope.com> <20020702231715.GG25927@laranja.org>
Message-ID: <15650.34600.410233.510315@grendel.zope.com>

Lalo Martins writes:
> Re: textwrap.TextWrapper.fix_sentence_endings
...
> Well, actually the convention of separating sentences by two spaces is also
> specific to the English language, so I don't see that as a problem.

Insidious, isn't it? I've tried to clarify the matter further in the documentation; please let me know if you think more is needed.

-Fred

--
Fred L. Drake, Jr.
PythonLabs at Zope Corporation

From gherron@islandtraining.com Wed Jul 3 09:25:37 2002
From: gherron@islandtraining.com (Gary Herron)
Date: Wed, 3 Jul 2002 01:25:37 -0700
Subject: [Doc-SIG] Re: [Python-Dev] [development doc updates]
In-Reply-To: <20020702222813.8990118EC22@grendel.zope.com>
References: <20020702222813.8990118EC22@grendel.zope.com>
Message-ID: <200207030125.38106.gherron@islandtraining.com>

Fred,

Reading the "What's New in Python 2.3" section, I find the following sentence in "5 Extended Slices":

    Ever since Python 1.4 the slice syntax has supported a third ``Stride'' argument, but the builtin sequence types have not supported this feature (it was initially included at the behest of the developers of the Numerical Python package). This changes with Python 2.3.

This is ambiguous. Exactly *HOW* does it change with Python 2.3? Does the stride argument go away, or do builtin sequence types now support the stride argument? If I'd followed this newsgroup more carefully, I'd probably know the answer.

The paragraph about PendingDeprecationWarning, which follows the above quote, probably provides a clue, but it seems out of place, having nothing to do with slices.

Gary Herron
gherron@islandtraining.com

From fdrake@acm.org Wed Jul 3 13:04:30 2002
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Wed, 3 Jul 2002 08:04:30 -0400
Subject: [Doc-SIG] Re: [Python-Dev] [development doc updates]
In-Reply-To: <200207030125.38106.gherron@islandtraining.com>
References: <20020702222813.8990118EC22@grendel.zope.com> <200207030125.38106.gherron@islandtraining.com>
Message-ID: <15650.59470.76955.48172@grendel.zope.com>

Gary Herron writes:
> Reading the "What's New in Python 2.3" section, I find the following
> sentence in "5 Extended Slices":
...
> This is ambiguous. Exactly *HOW* does it change with Python 2.3?
> Does the stride argument go away, or do builtin sequence types now
> support the stride argument?
> If I'd followed this newsgroup more
> carefully, I'd probably know the answer.

The built-in types now support stride. Thanks for pointing this ambiguity out; I've changed the explanation in the document so that this is clear.

> The paragraph about PendingDeprecationWarning, which follows the above
> quote, probably provides a clue, but it seems out of place, having
> nothing to do with slices.

There was a section heading that was commented out in the document source; I've uncommented the heading. More material will be added to the new section as we have time to complete the material.

Thanks!

-Fred

--
Fred L. Drake, Jr.
PythonLabs at Zope Corporation

From Simon.Budig@unix-ag.uni-siegen.de Wed Jul 3 20:40:45 2002
From: Simon.Budig@unix-ag.uni-siegen.de (Simon Budig)
Date: Wed, 3 Jul 2002 21:40:45 +0200
Subject: [Doc-SIG] References in the same line as the target text
In-Reply-To: ; from goodger@users.sourceforge.net on Wed, Jul 03, 2002 at 12:02:28AM -0400
References: <20020703014603.A43242@vmax.unix-ag.uni-siegen.de>
Message-ID: <20020703214045.A43694@vmax.unix-ag.uni-siegen.de>

Hi all, hi David.

Sorry, I did not mean to stomp on somebody's toes with my reference to "docstring centric" design. I apologize for this.

David Goodger (goodger@users.sourceforge.net) wrote:
> Simply put, that syntax just doesn't fit with the rest of
> reStructuredText.

However, I'd like to point out a flaw in your argument. It seems to me that the design goals of reST could be clarified a bit. In another mail you wrote:

> It comes down to this: the top goal of reStructuredText is to be
> as readable in plaintext (source) form as in processed form.

In this mail, however, you wrote:

> [Referring to http://docutils.sf.net/index.txt, source of
> http://docutils.sf.net/index.html:]
> > Just to show the ugliness of anonymous and named links: This is from
> > the webpage source (as in CVS):
>
> Yes, the source to that web page is not pretty.
> It was written in a
> lazy manner, using many anonymous hyperlinks. But that's OK,

no, it is not, because this shows that the current link syntax easily leads to files where the topmost goal (readability in raw and processed form) is seriously hurt. In this case the badness of this file counts twice because...

> because
> this is an example of a document where only the HTML is meant to be
> seen. The source is *not* intended to be read by anyone but me or
> other project developers.

... this is definitely wrong. index.txt is (intended to be) linked from the bottom of the docutils website. As a newbie searching for information on how this looks in practice (not the example texts with all features in one file) I certainly would look at the source of the page. And frankly: If I weren't so stubborn, the usage of links in this sample would scare me away from reST.

> [...] Also, being a home page, it is complex and
> full of external links, much more so than a typical document.

What exactly is a typical document? I think that the usage of reST for a homepage with lots of links would be a good use for reST.

> > Please don't try to tell me that a newbie easily finds the URL for
> > `Docutils CVS repository`__
>
> Sorry to burst your bubble, but I think it would be *very easy* for a
> newbie to find the URL. They just click on the text "Docutils CVS
> repository" on the HTML page, and their browser takes them there. No
> newbie would ever be exposed to the source text. (This is a poor
> choice to single out as a poor example. ;-)

I think the above explains why I think it is a perfectly valid choice.

I am wondering a bit about your reasoning. The rejection of my proposal is based on its "readability" in the source. When I point out that the current syntax has its flaws too, your argument basically is "It doesn't matter, because nobody will ever read it". Uhm. What was the argument against my proposal again?

Let me rephrase the main goals of my proposal.
My focus is not mainly the reST-source-readability, the processed output is currently more important to me. I want to improve the maintainability of links in reST.

Anonymous Links IMO have the maintenance problem that inserting links or removing links in a paragraph always means monotonous and error-prone counting. You will always have to check on the processed output, if the links indeed point to the correct target.

Named Links IMO have the maintenance problem that the reference-text appears twice in the document. This is prone to errors, since you can easily introduce mismatches.

Both approaches have the problem that the reference and the target specification can be quite a bit apart. This makes it necessary that a change to a reference or a link might need editing in two different places in the source file. From my personal experience with myself this is always a problem, regardless if I am editing sourcecode, reST, HTML or whatever. I tend to forget the editing in the second place. Maybe it is just me, but I doubt this.

My reference_(target) proposal would solve these maintenance problems, since it keeps reference and target close together. In my eyes - but this is a controversial point in this discussion - it does not look too weird to the unsuspecting reader of the reST-source, since the usage of parentheses suggests that their content is a remark related to the topic mentioned above (which a URL indeed is), the added underscore is the same as in the current use of references and makes the distinction between this construct and "topic (remark)"-usage possible.

As a last remark: I promise that I will not occupy this mailing list forever with this stuff. I guess I will be tired of this discussion by the end of the week or so - but I still think that this is a good idea... ;-)

Since the rest of your mail had a focus on more technical stuff I will change the subject a bit and reply to it in a separate mail.
Bye,
Simon
--
Simon.Budig@unix-ag.org http://www.home.unix-ag.org/simon/

From Simon.Budig@unix-ag.uni-siegen.de Wed Jul 3 21:02:53 2002
From: Simon.Budig@unix-ag.uni-siegen.de (Simon Budig)
Date: Wed, 3 Jul 2002 22:02:53 +0200
Subject: [Doc-SIG] Making reST more useable with HTML templates
In-Reply-To: ; from goodger@users.sourceforge.net on Wed, Jul 03, 2002 at 12:02:28AM -0400
References: <20020703014603.A43242@vmax.unix-ag.uni-siegen.de>
Message-ID: <20020703220253.B43694@vmax.unix-ag.uni-siegen.de>

(as promised in a separate mail, focus on technical stuff)

David Goodger (goodger@users.sourceforge.net) wrote:
> Simon Budig (simon.budig@unix-ag.org) wrote:
> > Just to show you what is possible with SSIs and why I think using
> > reST for webpages is not crazy:
> >
> > Have a look at http://www.home.unix-ag.org/simon/bsdaemon/
>
> Apart from the extras pulled in through server-side includes, the page
> is very simple, nothing that reStructuredText couldn't handle.
>
> > The raw source of it is::
> >
> >
>
> A server-side include directive could be added to reStructuredText, no
> problem. Although I'm very careful about new *syntax*, new directives
> are much easier to let in, because they're explicit and typically
> don't require new syntax (and if they do, it's localized and
> *explicit*).

I am not sure if this would be necessary. I would not want to have these SSI-directives in my sourcecode, since they add unnecessary complexity to the raw page source ("The secretary would have to know about SSIs").

I would prefer if the stuff could be embedded easily in a template system. Either make it easy to write a tool for creating the pages where the site administrator has full control over the HTML output of the html4css1-writer. This means customizable headers with the option to discard the headers/footers.

I am not sure how the preferred use of the docutils would be for a random site administrator.
I think I would try to write a small propriate application where I would try to derive the correct writer class and expand it with my personal preferences. Other people might prefer to have a simple template system where you can specify a simple template.

Also the class names used in some <span>'s should be customizable, maybe a dictionary with a native <--> target mapping of the class names.

> > Preview of the BSD Daemon
> width="326" height="352">
>
> I've thought of adding support for more image attributes, like
> "align". Care to try?

Maybe on the weekend, when I manage to get the CVS to sourceforge to work...

Bye,
Simon
--
Simon.Budig@unix-ag.org http://www.home.unix-ag.org/simon/

From goodger@users.sourceforge.net Thu Jul 4 01:30:11 2002
From: goodger@users.sourceforge.net (David Goodger)
Date: Wed, 03 Jul 2002 20:30:11 -0400
Subject: [Doc-SIG] Making reST more useable with HTML templates
In-Reply-To: <20020703220253.B43694@vmax.unix-ag.uni-siegen.de>
Message-ID:

Simon Budig wrote:
> (as promised in a separate mail, focus on technical stuff)

Good idea. I'll merge in some technical answers from previous posts.

> I would not want to have these SSI-directives in my sourcecode,
> since they add unnecessary complexity to the raw page source ("The
> secretary would have to know about SSIs").

That's fair.

> I would prefer if the stuff could be embedded easily in a template
> system. Either make it easy to write a tool for creating the pages
> where the site administrator has full control over the HTML output
> of the html4css1-writer. This means customizable headers with the
> option to discard the headers/footers.

[Simon, from previous post]
> But for a start I'd like to have a bit more configurable html
> converter for reST. This means for example to be able to switch off
> the .... and ... parts.

The HTMLTranslator class of the docutils/writers/html4css1.py module keeps the parts separately: head_prefix ( ... ), head ( & <meta>), body_prefix (</head><body>), body (page contents), and body_suffix (</body></html>). These are all lists of strings. I'll expose these in the Writer class. Beyond that it's up to you.

Please don't feel that you have to use html4css1.py. It's just one way of producing HTML. You can write your own, or subclass it and add in your customizations. Patches are gratefully accepted.

> I am not sure how the preferred use of the docutils would be for a
> random site administrator.
> I think I would try to write a small
> propriate application where I would try to derive the correct writer
> class and expand it with my personal preferences. Other people might
> prefer to have a simple template system where you can specify a
> simple template.

Sounds good.

> Also the class names used in some <span>'s should be customizable,
> maybe a dictionary with a native <--> target mapping of the class
> names.

I don't follow. Examples?

[David, from previous post]
>> However, such functionality is certainly within the realm of
>> possibility, and I'd encourage anyone to tackle the challenge posed
>> in the To Do list:
>>
>>     Construct a _`templating system`, as in ht2html/yaptu, using
>>     directives and substitutions for dynamic stuff.

[Simon]
> I don't know yaptu.

It's a simple templating tool by Alex Martelli, a recipe in the Python Cookbook: http://aspn.activestate.com/ASPN/Python/Cookbook/Recipe/52305. I helped to extend it a bit, and used it for the old project pages. The extended version is here: http://structuredtext.sourceforge.net/yaptu.py.

--
David Goodger <goodger@users.sourceforge.net>
Open-source projects:
- Python Docutils: http://docutils.sourceforge.net/
  (includes reStructuredText: http://docutils.sf.net/rst.html)
- The Go Tools Project: http://gotools.sourceforge.net/

From goodger@users.sourceforge.net Thu Jul 4 01:33:36 2002
From: goodger@users.sourceforge.net (David Goodger)
Date: Wed, 03 Jul 2002 20:33:36 -0400
Subject: [Doc-SIG] References in the same line as the target text
In-Reply-To: <20020703214045.A43694@vmax.unix-ag.uni-siegen.de>
Message-ID: <B949101F.253C5%goodger@users.sourceforge.net>

Simon Budig wrote:
> This essentially boils down to "Ok, named links are hard to use, so
> we introduced anonymous links. They are hard to maintain too, but a
> little less."

I'd say that hyperlinks are hard to mark up, period.
You can't have a markup that is simultaneously readable, unobtrusive, easy, maintainable, and URLs-close-to-the-reference-text. You have to choose some aspect as most important; and you have to give up something. reStructuredText chose "readable" and "unobtrusive" as the most important aspects.

> So to use links in reST I have the choice between two evil things:
> 1) make sure that Link targets and Link text are absolutely the same
> 2) count "__"s.
>
> This is exactly why I proposed a third solution :-)
...
> I'd think this is a good thing...

IMO, the solution is worse than the problem. But I'm not interested in debating this point *ad nauseam* (and it is). I've already written a bunch of replies to yesterday's posts, but I think it's better to deep-six them. This is getting old, real fast. Our positions can be summed up with:

> Ok, we seem to disagree here fundamentally. I personally think that
> the proposed syntax is readable, you think it isn't. Not sure how to
> solve this...

You seem to feel inline URLs are important. Here's a chance to prove your position. I'll leave it up to you to implement them, on an experimental basis. There is a proposed mechanism for experimental syntax called "pragma directives":

    It may also be possible for directives to be used as pragmas, to modify the behavior of the parser, such as to experiment with alternate syntax. There is no parser support for this functionality at present; if a reasonable need for pragma directives is found, they may be supported.

    (http://docutils.sf.net/spec/rst/reStructuredText.html#directives)

I will help with the infrastructure (any changes that need to be made to the parser to accept pragma directives), but I won't implement the parsing itself. Here's an example of how such a directive might work::

    .. enable-inline-urls::

    Ordinary text ...

    A paragraph containing an `inline
    hyperlink`_(http://www.example.org/).

However, I really don't like that syntax; it doesn't make sense.
Let's examine it and see if we can come up with something better. I have two objections:

1. The "`ref`_(URL)" syntax forces the last word of the reference text to be joined to the URL, making a potentially very long word that can't be wrapped (URLs can be very long). The reference and the URL should be separate.

2. The "inline hyperlink" text is *not* a named reference (there's no lookup by name), so it shouldn't look like one. Instead, use the anonymous double-underscore syntax. Perhaps a matching double-underscore "anonymous inline target" syntax for the URL as well? A space in-between would separate the reference from the target and allow words to wrap. For example::

       A paragraph containing an `inline
       hyperlink`__ __`http://www.example.org/`.

Yes, that's much better. A bit more verbose, but it fits better with the rest of the syntax. If you insist on parentheses, then some compromise may do. Perhaps::

    A paragraph containing an `inline
    hyperlink`__ __(http://www.example.org/).

However, looking at the URI-recognition code (based on the IETF standards RFC 2396 and RFC 2732), parentheses are legal URI characters. This would introduce ambiguity (a legal URI containing parentheses wouldn't be recognized properly). Curly braces and backquotes are not legal URI characters, but they *are* legal email characters (see RFC 822). It's not easy to come up with a completely unambiguous syntax! The only useful characters that are neither URI characters nor email characters are angle brackets, "<>". So the syntax becomes::

    A paragraph containing an `inline
    hyperlink`__ __<http://www.example.org/>.

Which actually doesn't look too bad. There's precedent in using angle brackets for URIs. Coming full circle, perhaps we can now drop the leading "__"::

    A paragraph containing an `inline
    hyperlink`__ <http://www.example.org/>.

Ah, but then it would be difficult to write about HTML/XML/SGML tags ("img" in "the <img> tag" would be parsed as a relative URL).
We *could* recognize inline URLs only immediately after anonymous references, but that would require keeping track of state. So the leading "__" *is* required.

Once the pragma directive is implemented, we'll see how it fares in the real world. I may accept it into standard reStructuredText, leave it in as a pragma directive, or reject it outright.

Your mission, Mr. Budig, if you choose to accept it, is to create an "enable-inline-urls" pragma directive that implements some variation of the above syntax (recommended: "__<URI>"). Except for necessary infrastructure support for pragmas, there should be no changes to the parser itself. I will work with you to add support for pragmas.

--
David Goodger <goodger@users.sourceforge.net>
Open-source projects:
- Python Docutils: http://docutils.sourceforge.net/
  (includes reStructuredText: http://docutils.sf.net/rst.html)
- The Go Tools Project: http://gotools.sourceforge.net/

From Simon.Budig@unix-ag.uni-siegen.de Thu Jul 4 23:46:57 2002
From: Simon.Budig@unix-ag.uni-siegen.de (Simon Budig)
Date: Fri, 5 Jul 2002 00:46:57 +0200
Subject: [Doc-SIG] References in the same line as the target text
In-Reply-To: <B949101F.253C5%goodger@users.sourceforge.net>; from goodger@users.sourceforge.net on Wed, Jul 03, 2002 at 08:33:36PM -0400
References: <20020703214045.A43694@vmax.unix-ag.uni-siegen.de> <B949101F.253C5%goodger@users.sourceforge.net>
Message-ID: <20020705004656.A28508@vmax.unix-ag.uni-siegen.de>

David Goodger (goodger@users.sourceforge.net) wrote:
> You seem to feel inline URLs are important. Here's a chance to prove
> your position. I'll leave it up to you to implement them, on an
> experimental basis. There is a proposed mechanism for experimental
> syntax called "pragma directives":
>
>     It may also be possible for directives to be used as pragmas, to
>     modify the behavior of the parser, such as to experiment with
>     alternate syntax.
>     There is no parser support for this
>     functionality at present; if a reasonable need for pragma
>     directives is found, they may be supported.
>
>     (http://docutils.sf.net/spec/rst/reStructuredText.html#directives)
>
> I will help with the infrastructure (any changes that need to be made
> to the parser to accept pragma directives), but I won't implement the
> parsing itself. Here's an example of how such a directive might
> work::
>
>     .. enable-inline-urls::
[...]
>     A paragraph containing an `inline
>     hyperlink`__ __<http://www.example.org/>.

> Once the pragma directive is implemented, we'll see how it fares in
> the real world. I may accept it into standard reStructuredText, leave
> it in as a pragma directive, or reject it outright.
>
> Your mission, Mr. Budig, if you choose to accept it, is to create an
> "enable-inline-urls" pragma directive that implements some variation
> of the above syntax (recommended: "__<URI>"). Except for necessary
> infrastructure support for pragmas, there should be no changes to the
> parser itself. I will work with you to add support for pragmas.

I'd like to raise two points.

First I am not sure if the use of pragmas to change the behaviour is a good way to do this. There might be a need for lots of different local extensions to the syntax. You'd end up implementing lots of pragmas...

It might be better to have either a pragma that looks like::

    .. reST-options::
       :inline-urls: true
       :math-markup: true
       :whatever-id: "blah"

or something like this. So you could have a more generic framework for extensions to the parser.

reST could provide a mechanism to derive for example class names from the first field and try to import and plug them into the parser.

This would also make it easier to avoid having to type this pragma by creating customized document processors where you would do something like

    parser.add_plugin (InlineUrlPlugin (1))

The second point is closely connected to this.
When looking at Inline markup the parsing work is done by a class "Inliner". This is dominated by a huge regular expression that matches a lot of different constructs. In my eyes it would be better to break this apart in different regular expressions and test them in a sequence (it might be necessary to remember which match starts first). An extension could add a regular expression to that list instead of having to replace a complicated regular expression with an even more complicated regex.

Of course this would mean that there *would* be changes to the parser itself, but it might result in a more flexible parsing framework. Do you think this is worth it?

Bye,
Simon
--
Simon.Budig@unix-ag.org http://www.home.unix-ag.org/simon/

From Simon.Budig@unix-ag.uni-siegen.de Fri Jul 5 00:09:00 2002
From: Simon.Budig@unix-ag.uni-siegen.de (Simon Budig)
Date: Fri, 5 Jul 2002 01:09:00 +0200
Subject: [Doc-SIG] Making reST more useable with HTML templates
In-Reply-To: <B9490F52.253C4%goodger@users.sourceforge.net>; from goodger@users.sourceforge.net on Wed, Jul 03, 2002 at 08:30:11PM -0400
References: <20020703220253.B43694@vmax.unix-ag.uni-siegen.de> <B9490F52.253C4%goodger@users.sourceforge.net>
Message-ID: <20020705010859.B28508@vmax.unix-ag.uni-siegen.de>

David Goodger (goodger@users.sourceforge.net) wrote:
> Simon Budig wrote:
> <meta>), body_prefix (</head><body>), body (page contents), and
> body_suffix (</body></html>). These are all lists of strings. I'll
> expose these in the Writer class. Beyond that it's up to you.
>
> Please don't feel that you have to use html4css1.py. It's just one
> way of producing HTML. You can write your own, or subclass it and add
> in your customizations.

With my Python knowledge coming mainly from 1.5.x I am not sure if I got the idea behind the packages correctly.
Could somebody point me to a resource explaining why it is a good idea to have different classes with the same name (there are writers.Writer, html4css1.Writer and docutils_xml.Writer in docutils, why aren't they named after what they actually write?) I am most probably missing something here, since this technique is also used in the "encodings" package from the core Python.

> > Also the class names used in some <span>'s should be customizable,
> > maybe a dictionary with a native <--> target mapping of the class
> > names.
>
> I don't follow. Examples?

In processed reST output you find class names like '<a class="reference"' or '<p class="field-name">'. If you want to include reST output as part of a larger website these class names might clash with the names in the system-wide CSS file. It might be useful to be able to replace the class names used by reST with the class names used in the rest of the site, so that reST does not output the stuff above but '<a class="link"' instead.

I am not sure how important this is; alternatively you could also adjust the CSS file. But it seems nicer to me to be able to control the output of reST instead of having to adjust the rest of your framework to the needs of reST. Hmm.

Bye,
Simon
--
Simon.Budig@unix-ag.org http://www.home.unix-ag.org/simon/

From goodger@users.sourceforge.net Fri Jul 5 01:28:43 2002
From: goodger@users.sourceforge.net (David Goodger)
Date: Thu, 04 Jul 2002 20:28:43 -0400
Subject: [Doc-SIG] Making reST more useable with HTML templates
In-Reply-To: <20020705010859.B28508@vmax.unix-ag.uni-siegen.de>
Message-ID: <B94A607B.2553C%goodger@users.sourceforge.net>

Simon Budig wrote:
> With my python knowledge coming mainly from 1.5.x I am not sure if I
> got the idea behind the packages correctly.
> Could somebody point me
> to a resource, why it is a good idea to have different classes with
> the same name (there are writers.Writer, html4css1.Writer and
> docutils_xml.Writer in docutils, why aren't they named after what
> they actually write?) I most probably miss something here since this
> technique is also used in the "encodings" package from the core
> python.

I did it that way out of practicality. The front-end tells Docutils which format it wants. Docutils looks up that format name in a mapping, to determine the actual module name ({'html': 'html4css1'}). The docutils.writers module (docutils/writers/__init__.py) imports the module, and returns the Writer class. If each Writer class had a different name, there would be one more level of indirection, one more variable. Of course, each writer could be given the same name as its module, but I prefer lowercase module names and StudlyCaps class names. If done that way, it wouldn't be possible to pass around the module; the class itself would have to be passed.

It may be an arbitrary decision, but it works well for Docutils and hasn't presented any problems. Plus, I find the uniformity of API elegant.

>>> Also the class names used in some <span>'s should be
>>> customizeable, maybe a dictionary with a native <--> target
>>> mapping of the class names.
>>
>> I don't follow. Examples?
>
> In processed reST output you find class names like '<a
> class="reference"' or '<p class="field-name">'. If you want to
> include reST output as part of a larger website these class names
> might clash with the names in the system-wide css file.

I see. It's not our problem. If an application has this problem, it can deal with it; Docutils doesn't need to. If a real example of conflict ever appears, we can deal with it then. Until then, think XP: "always do the simplest thing that could possibly work" and "never add functionality before it's needed."
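
The lookup described in this message (format name, then module name, then a class that is always called ``Writer``) can be illustrated with a minimal Python sketch. This is not Docutils code: ``WRITER_ALIASES``, ``lookup_writer_class`` and the ``demo_writers`` stand-in package are hypothetical names made up for the example. Only the ``{'html': 'html4css1'}`` mapping and the uniform ``Writer`` class name are taken from the message.

```python
import importlib
import sys
import types

# Hypothetical alias table, modelled on the {'html': 'html4css1'} mapping
# mentioned in the message; it is not copied from Docutils.
WRITER_ALIASES = {"html": "html4css1"}


def lookup_writer_class(format_name, package):
    """Map a format name to a module and return that module's ``Writer``."""
    module_name = WRITER_ALIASES.get(format_name, format_name)
    module = importlib.import_module(f"{package}.{module_name}")
    # Every writer module exposes a class with the same name, so no
    # per-format table of class names is needed.
    return module.Writer


def _install_demo_package(package):
    """Register stand-in modules so the sketch runs without Docutils."""
    sys.modules[package] = types.ModuleType(package)
    for name in ("html4css1", "docutils_xml"):
        module = types.ModuleType(f"{package}.{name}")
        module.Writer = type("Writer", (), {"output_format": name})
        sys.modules[module.__name__] = module


_install_demo_package("demo_writers")
html_writer = lookup_writer_class("html", "demo_writers")
xml_writer = lookup_writer_class("docutils_xml", "demo_writers")
```

Because the class name is fixed, the caller needs only one piece of information per format (the module name); with per-format class names the alias table would have to carry a second value for each entry.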
-- David Goodger <goodger@users.sourceforge.net> Open-source projects: - Python Docutils: http://docutils.sourceforge.net/ (includes reStructuredText: http://docutils.sf.net/rst.html) - The Go Tools Project: http://gotools.sourceforge.net/ From goodger@users.sourceforge.net Fri Jul 5 03:39:16 2002 From: goodger@users.sourceforge.net (David Goodger) Date: Thu, 04 Jul 2002 22:39:16 -0400 Subject: [Doc-SIG] References in the same line as the target text In-Reply-To: <20020705004656.A28508@vmax.unix-ag.uni-siegen.de> Message-ID: <B94A7F13.25541%goodger@users.sourceforge.net> Simon Budig wrote: > First I am not sure if the use of pragmas to change the behaviour is > a good way to do this. There might be a need for lots of different > local extensions to the syntax. You'd end up implementing lots of > pragmas... > > It might be better to have either a pragma that looks like:: > > .. reST-options:: > :inline-urls: true > :math-markup: true > :whatever-id: "blah" Lots of individual directives, or one large pragma directive with subcommands. Either way would be fine. I'd drop the "true" though; just the presence of the field is enough. > reST could provide a mechanism to derive for example class names > from the first field and try to import and plug them into the parser. Too much magic; potentially dangerous. Better to have a registry. > This would also make it easier to avoid having to type this pragma > by creating customized document processors where you would do > something like > > parser.add_plugin (InlineUrlPlugin (1)) Yes, something along those lines. But please don't worry about the mechanics; it's too early. > The second point is closely connected to this. When looking at > Inline markup the parsing work is done by a class "Inliner". This is > dominated by a huge regular expression that matches to a lot of > different constructs. 
In my eyes it would be better to break this > apart in different regular expressions and test them in a sequence > (it might be necessary to remember which match starts first). An > extension could add a regular expression to that list instead of > having to replace a complicated regular expression with an even more > complicated regex. The "Inliner" class has to use one large regular expression. If we have some text like this:: Here is an ``inline **literal**``. If we check for "strong" (**) first, the result will be wrong. No ordering would get it right for all constructs. We have to check for each start-string simultaneously, because there are no precedence rules (almost); first occurrence from left to right in the text is the determinant. But that idea is close to the solution I'm thinking of. My idea is to break up the one huge regexp into several lists of individual regexps, one list per construct/regexp type (find start-string only, find the whole construct, etc.), and join them dynamically into compound OR-groups, building the large regexp from components at runtime. Dynamic syntax directives can install new regexps and rebuild the master regexp. > Of course this would mean that there *would* be changes to the > parser itself, but it might result in a more flexible parsing > framework. This is the infrastructure support I spoke of. For now, please just make a subclass of the "Inliner" class and pass it to the parser. See the PEP reader for an example. Don't try to be fancy, just brute-force copy & paste the code you need from docutils.parsers.rst.states.Inliner; we'll sort out what needs to be done afterward. Please put your code in the sandbox for now (see http://docutils.sf.net/spec/notes.html#additions-to-docutils). 
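[A rough sketch of the "build the large regexp from components at runtime" idea, under simplifying assumptions. The class name ``InlinePatterns``, its methods, and the three toy patterns are invented for illustration; the real Inliner patterns are considerably more involved.]

```python
import re


class InlinePatterns:
    """Keep per-construct patterns and rebuild one master regexp from them."""

    def __init__(self):
        self._parts = []    # list of (construct name, pattern source)
        self.master = None  # compiled OR-group of all installed patterns

    def install(self, name, pattern):
        """Add a construct (e.g. from a syntax directive) and rebuild."""
        self._parts.append((name, pattern))
        self._rebuild()

    def _rebuild(self):
        alternatives = ['(?P<%s>%s)' % (name, pattern)
                        for name, pattern in self._parts]
        self.master = re.compile('|'.join(alternatives))

    def first_construct(self, text):
        """Return (construct name, matched text) for the leftmost match."""
        match = self.master.search(text)
        if match is None:
            return None
        return match.lastgroup, match.group()


patterns = InlinePatterns()
patterns.install('strong', r'\*\*.+?\*\*')   # toy pattern, listed first
patterns.install('literal', r'``.+?``')      # toy pattern
```

Because the search scans the text from left to right, the order in which the alternatives were installed does not decide the result for ``Here is an ``inline **literal**``.``: the literal starts first, so it wins even though "strong" was installed earlier. A later ``patterns.install('inline_uri', r'__<[^>]+>')`` simply rebuilds the master regexp with one more alternative.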
-- David Goodger <goodger@users.sourceforge.net> Open-source projects: - Python Docutils: http://docutils.sourceforge.net/ (includes reStructuredText: http://docutils.sf.net/rst.html) - The Go Tools Project: http://gotools.sourceforge.net/ From Simon.Budig@unix-ag.uni-siegen.de Fri Jul 5 14:41:59 2002 From: Simon.Budig@unix-ag.uni-siegen.de (Simon Budig) Date: Fri, 5 Jul 2002 15:41:59 +0200 Subject: [Doc-SIG] References in the same line as the target text In-Reply-To: <B94A7F13.25541%goodger@users.sourceforge.net>; from goodger@users.sourceforge.net on Thu, Jul 04, 2002 at 10:39:16PM -0400 References: <20020705004656.A28508@vmax.unix-ag.uni-siegen.de> <B94A7F13.25541%goodger@users.sourceforge.net> Message-ID: <20020705154159.A30380@vmax.unix-ag.uni-siegen.de> David Goodger (goodger@users.sourceforge.net) wrote: > Simon Budig wrote: > > The second point is closely connected to this. When looking at > > Inline markup the parsing work is done by a class "Inliner". This is > > dominated by a huge regular expression that matches to a lot of > > different constructs. In my eyes it would be better to break this > > apart in different regular expressions and test them in a sequence > > (it might be necessary to remember which match starts first). An > > extension could add a regular expression to that list instead of > > having to replace a complicated regular expression with an even more > > complicated regex. > > The "Inliner" class has to use one large regular expression. If we > have some text like this:: > > Here is an ``inline **literal**``. > > If we check for "strong" (**) first, the result will be wrong. No > ordering would get it right for all constructs. We have to check for > each start-string simultaneously, because there are no precedence > rules (almost); first occurrence from left to right in the text is the > determinant. This is why I meant that it might be necessary to remember which match starts first. 
To emulate the behaviour of a big regex we have to match against all regexes, check which one starts closest to the beginning of the string and, if this is ambiguous, check which one is the longest match. Advantage: This would immediately give the matching construct. > But that idea is close to the solution I'm thinking of. My idea is to > break up the one huge regexp into several lists of individual regexps, > one list per construct/regexp type (find start-string only, find the > whole construct, etc.), and join them dynamically into compound > OR-groups, building the large regexp from components at runtime. > Dynamic syntax directives can install new regexps and rebuild the > master regexp. The advantage of this approach is that it might be a bit quicker since it is inside a single regular expression. It makes it a bit harder to detect which regex actually matched. Of course this is doable via ((?P<regex1>blablabla)|(?P<regex2>blu(?P<data>b*)lubb)) and then checking which of the named groups regex1 or regex2 matches. It might be a problem because you have to be careful with the naming of additional groups in the different regexes to avoid conflicts. Bye, Simon -- Simon.Budig@unix-ag.org http://www.home.unix-ag.org/simon/ From fantasai@escape.com Fri Jul 5 23:49:17 2002 From: fantasai@escape.com (fantasai) Date: Fri, 05 Jul 2002 18:49:17 -0400 Subject: [Doc-SIG] References in the same line as the target text Message-ID: <3D26226D.9EBDDCDA@escape.com> Simon Budig wrote: | | I think, that at least the intuitivity is given. I have seen lots | of texts where an URL is mentioned in braces inline with the text | and it seems natural to me. David Goodger wrote: | There are conflicting goals here: | | 1. Keep the plaintext as readable as possible. | 2. Keep the URLs as close to the references as possible. | 3. Keep the inter-paragraph space clear of targets. | | I find the suggested syntax, ``Python_(http://www.python.org)``, | conflicts with goal 1.
I agree with Simon. In many cases, though certainly not in all, I find parenthesizing the url in plain text flows better than relegating it to a footnote. You suggest that leaving the url in the final HTML text would serve this purpose, but as you say - | HTML has a third dimension, that of links "underneath" the | text (in <a href=...> tags), which we can only simulate in | reStructuredText. Placing a url in parentheses after the relevant text is a way to simulate this third dimension in plain text. Why not make the simulation real when transforming to HTML? | Put a relative URL (no "http://") in that syntax and I'd expect | it to be quite confusing in plaintext. It would be as confusing if you used reStructuredText's target syntax_ .. _syntax: ling-214 From goodger@users.sourceforge.net Sat Jul 6 00:10:24 2002 From: goodger@users.sourceforge.net (David Goodger) Date: Fri, 05 Jul 2002 19:10:24 -0400 Subject: [Doc-SIG] References in the same line as the target text In-Reply-To: <20020705154159.A30380@vmax.unix-ag.uni-siegen.de> Message-ID: <B94B9F9F.2556D%goodger@users.sourceforge.net> [David Goodger:] >> The "Inliner" class has to use one large regular expression. If we >> have some text like this:: >> >> Here is an ``inline **literal**``. >> >> If we check for "strong" (**) first, the result will be wrong. No >> ordering would get it right for all constructs. We have to check >> for each start-string simultaneously, because there are no >> precedence rules (almost); first occurrence from left to right in >> the text is the determinant. [Simon Budig:] > This is why I meant that it might be necessary to remember which > match starts first. To emulate the behaviour of a big regex we have > to match against all regexes, check which one starts closest to the > beginning of the string and if this is ambigous check, which one is > the longest match. > > Advantage: This would immediately give the matching construct. But at what cost? Sounds very complex. It ain't broke. 
Why fix it? Let's just use the big regexp, and not try to emulate it. >> But that idea is close to the solution I'm thinking of. My idea >> is to break up the one huge regexp into several lists of >> individual regexps, one list per construct/regexp type (find >> start-string only, find the whole construct, etc.), and join them >> dynamically into compound OR-groups, building the large regexp >> from components at runtime. Dynamic syntax directives can install >> new regexps and rebuild the master regexp. > > The advantage of this approach is that it might be a bit more quick > since it is inside a single regular expression. It makes it a bit > harder to detect what actually was the matching regex. Of course > this is doable via > ((?P<regex1>blablabla)|(?P<regex2>blu(?P<data>b*)lubb)) and then > check, which of the named groups regex1 or regex2 matches. It might > be a problem because you have to be careful with the naming of > additional groups in the different regexes to avoid conflicts. If it ever does become a problem, we'll deal with it. Until then, I don't see the point of redesigning something that works well. I don't think we'll be adding much more to the regexp, so I don't anticipate running into name clashes any time soon. If you think it's worth doing though, please try it and show us. 
-- David Goodger <goodger@users.sourceforge.net> Open-source projects: - Python Docutils: http://docutils.sourceforge.net/ (includes reStructuredText: http://docutils.sf.net/rst.html) - The Go Tools Project: http://gotools.sourceforge.net/ From Simon.Budig@unix-ag.uni-siegen.de Sat Jul 6 04:00:07 2002 From: Simon.Budig@unix-ag.uni-siegen.de (Simon Budig) Date: Sat, 6 Jul 2002 05:00:07 +0200 Subject: [Doc-SIG] References in the same line as the target text In-Reply-To: <B949101F.253C5%goodger@users.sourceforge.net>; from goodger@users.sourceforge.net on Wed, Jul 03, 2002 at 08:33:36PM -0400 References: <20020703214045.A43694@vmax.unix-ag.uni-siegen.de> <B949101F.253C5%goodger@users.sourceforge.net> Message-ID: <20020706050006.A45810@vmax.unix-ag.uni-siegen.de> David Goodger (goodger@users.sourceforge.net) wrote: > You seem to feel inline URLs are important. Here's a chance to prove > your position. I'll leave it up to you to implement them, on an > experimental basis. In the sandbox there is a drop-in replacement for the states.py file. I have not yet gotten around to implementing this as a subclass of Inliner() but it should not be too hard - all changes to the file are inside this class... It is nearly 5 am now and I don't want to think about that now... :-) [...] > However, I really don't like that syntax; it doesn't make sense. > Let's examine it and see if we can come up with something better. I > have two objections: > > 1. The "`ref`_(URL)" syntax forces the last word of the reference text > to be joined to the URL, making a potentially very long word that > can't be wrapped (URLs can be very long). The reference and the > URL should be separate. > > 2. The "inline hyperlink" text is *not* a named reference (there's no > lookup by name), so it shouldn't look like one. I have now implemented reference__ __<uri> and `refe rence`__ __<uri>. They are analogous to anonymous links.
I also implemented reference_ _<uri> and `refe rence`_ _<uri> analogoes to named links, this is some kind of closure of the syntax (mathematically speaking... :-) I am currently not sure if the possibility to wrap before long URLs is worth the added line noise by doubling the underscores. I think reference_<uri> resp. reference__<uri> might be acceptable too. Comments? Bye, Simon -- Simon.Budig@unix-ag.org http://www.home.unix-ag.org/simon/ From fantasai@escape.com Sat Jul 6 04:42:49 2002 From: fantasai@escape.com (fantasai) Date: Fri, 05 Jul 2002 23:42:49 -0400 Subject: [Doc-SIG] References in the same line as the target text Message-ID: <3D266739.638BF318@escape.com> David Goodger wrote: > > A paragraph containing an `inline > hyperlink`_(http://www.example.org/).> > > However, I really don't like that syntax; it doesn't make sense. > Let's examine it and see if we can come up with something better. ... > However, looking at the URI-recognition code (based on the IETF > standards RFC 2396 and RFC 2732), parentheses are legal URI > characters. This would introduce ambiguity (a legal URI containing > parentheses wouldn't be recognized properly). Curly braces and > backquotes are not legal URI characters, but they *are* legal email > characters (see RFC 822). Why would accepting curly braces in an email address preclude using them to delimit a URI? I think this: `inline hyperlink` { uri_with@weird,symbols } looks much better than this: `inline hyperlink`__ __<uri_with@weird,symbols> Whitespace cannot appear literally in either URIs or email addresses, so any leading/trailing spaces should be stripped. In cases where there is no ambiguity (e.g. no @ symbol), the uri can be put in without the leading/trailing whitespace. 
`docutils project` {http://docutils.sourceforge.net/} ~fantasai From goodger@users.sourceforge.net Sat Jul 6 06:03:54 2002 From: goodger@users.sourceforge.net (David Goodger) Date: Sat, 06 Jul 2002 01:03:54 -0400 Subject: [Doc-SIG] References in the same line as the target text In-Reply-To: <20020706050006.A45810@vmax.unix-ag.uni-siegen.de> Message-ID: <B94BF279.255F2%goodger@users.sourceforge.net> Simon Budig wrote: > In the sandbox there is a drop in replacement for the states.py > file. Great! I'll take a look. > I have not yet come around to implement this as a subclass of > Inliner() but it should not be too hard - all changes to the file > are insde this class... This way is fine; don't bother converting it into a subclass. We can use "diff". > It is nearly 5 am now and I don't want to think about that now... You're keeping hacker's hours. ;-) It's 1 am here; time for bed. > I have now implemented reference__ __<uri> and `refe rence`__ > __<uri>. they are analogous to anonymous links. Did you allow for long URIs split over lines? This would have to be allowed:: reference__ __<http://this.is.the.beginning .of.a.very.long.uri.com/index.html#and-here-is -even-more> > I also implemented reference_ _<uri> and `refe rence`_ _<uri> > analogoes to named links, this is some kind of closure of the syntax > (mathematically speaking... :-) Where the target name is implied. Yes, I suppose it follows. > I am currently not sure if the possibility to wrap before long URLs > is worth the added line noise by doubling the underscores. I think > reference_<uri> resp. reference__<uri> might be acceptable too. Without the spaces & matching underscores the syntax would be too subtle I think. And allowing for line-wrapping is important. 
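[A minimal sketch of how the experimental ``reference__ __<uri>`` and ``reference_ _<uri>`` forms could be recognized, including URIs wrapped over several lines. This is not the sandbox code; the regular expression, the function name and the returned tuple layout are assumptions, and the word pattern is deliberately much simpler than the real reference syntax.]

```python
import re

INLINE_TARGET = re.compile(r"""
    (?: `(?P<phrase>[^`]+)`                          # `phrase reference`
      | (?P<word>[A-Za-z0-9]+(?:[-.][A-Za-z0-9]+)*)  # or a single word
    )
    (?P<refend>__?)          # "_" = named, "__" = anonymous
    \s+                      # the space (or line break) before the target
    (?P=refend)              # the same underscores again
    <(?P<uri>[^<>]+)>        # the URI in angle brackets
""", re.VERBOSE)


def find_inline_targets(text):
    """Return a list of (reference text, kind, uri) tuples found in `text`."""
    results = []
    for match in INLINE_TARGET.finditer(text):
        label = match.group('phrase') or match.group('word')
        label = ' '.join(label.split())          # normalize wrapped phrases
        kind = 'anonymous' if match.group('refend') == '__' else 'named'
        uri = ''.join(match.group('uri').split())  # drop line-wrap whitespace
        results.append((label, kind, uri))
    return results
```

Whitespace cannot appear literally in a URI, so removing it before use is what makes the line-wrapped form above workable; a stricter version might warn when the whitespace is anything other than a line break plus indentation.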
-- David Goodger <goodger@users.sourceforge.net> Open-source projects: - Python Docutils: http://docutils.sourceforge.net/ (includes reStructuredText: http://docutils.sf.net/rst.html) - The Go Tools Project: http://gotools.sourceforge.net/ From goodger@users.sourceforge.net Sat Jul 6 06:04:55 2002 From: goodger@users.sourceforge.net (David Goodger) Date: Sat, 06 Jul 2002 01:04:55 -0400 Subject: [Doc-SIG] References in the same line as the target text In-Reply-To: <3D266739.638BF318@escape.com> Message-ID: <B94BF2B7.255F3%goodger@users.sourceforge.net> [David Goodger] >> However, looking at the URI-recognition code (based on the IETF >> standards RFC 2396 and RFC 2732), parentheses are legal URI >> characters. This would introduce ambiguity (a legal URI containing >> parentheses wouldn't be recognized properly). Curly braces and >> backquotes are not legal URI characters, but they *are* legal email >> characters (see RFC 822). [fantasai] > Why would accepting curly braces in an email address preclude using > them to delimit a URI? Because the parsing would be ambiguous. Even if we could work around the corner cases, the code/regexp would be nightmarish. > I think this: > `inline hyperlink` { uri_with@weird,symbols } > looks much better than this: > `inline hyperlink`__ __<uri_with@weird,symbols> I disagree, for several reasons: 1. Using whitespace like that is a kluge. Line-wrap the text and you could end up with:: `inline hyperlink`_ { uri_with@weird,symbols } Not pretty. 2. Curly braces are very similar to parentheses, which cannot be used because they're too common in text. 3. There's overwhelming precedent for angle brackets with URLs. From RFC 2396 (URI syntax): The angle-bracket "<" and ">" and double-quote (") characters are excluded [from URIs] because they are often used as the delimiters around URI in text documents and protocol fields. Using <> angle brackets around each URI is especially recommended as a delimiting style for URI that contain whitespace. 
From RFC 822 (email headers): Angle brackets ("<" and ">") are generally used to indicate the presence of a one machine-usable reference (e.g., delimiting mailboxes), possibly including source-routing to the machine. Also, see my signature block. Angle brackets are familiar, standard URI delimiters. Much better than curly braces. -- David Goodger <goodger@users.sourceforge.net> Open-source projects: - Python Docutils: http://docutils.sourceforge.net/ (includes reStructuredText: http://docutils.sf.net/rst.html) - The Go Tools Project: http://gotools.sourceforge.net/ From fantasai@escape.com Sat Jul 6 15:41:26 2002 From: fantasai@escape.com (fantasai) Date: Sat, 06 Jul 2002 10:41:26 -0400 Subject: [Doc-SIG] References in the same line as the target text References: <B94BF2B7.255F3%goodger@users.sourceforge.net> Message-ID: <3D270196.6938208F@escape.com> David Goodger wrote: > > [fantasai] > > Why would accepting curly braces in an email address preclude using > > them to delimit a URI? > > Because the parsing would be ambiguous. Even if we could work around > the corner cases, the code/regexp would be nightmarish. > > > I think this: > > `inline hyperlink` { uri_with@weird,symbols } > > looks much better than this: > > `inline hyperlink`__ __<uri_with@weird,symbols> > > I disagree, for several reasons: > > 1. Using whitespace like that is a kluge. Line-wrap the text and you > could end up with:: > > `inline hyperlink`_ { > uri_with@weird,symbols > } You would only have to do that in very rare cases: a relative url with an @ sign. Absolute urls must specify a scheme followed by a colon. Colons aren't allowed in mail addresses, so the email parsing regex wouldn't pick them up even if you left out the spaces. > 2. Curly braces are very similar to parentheses, which cannot be used > because they're too common in text. Can you give me an example? I don't remember the last time I saw curly braces in non-mathmematical text. > 3. 
There's overwhelming precedent for angle brackets with URLs. .. > Angle brackets are familiar, standard URI delimiters. Much better > than curly braces. Certainly, I agree, but as you said, they are already taken for SGML notation. And this workaround: `inline hyperlink`__ __<uri> Looks horrendous. If your goal is unobtrusive, that syntax fails miserably. Seven consecutive punctuation characters--really, it's not necessary. Alternatives: a. This sentence contains an `inline hyperlink` {uri} which will direct you to some other resource. b. This sentence contains an `inline hyperlink` __<uri> which will direct you to some other resource. c. This sentence contains an `inline hyperlink` <"uri"> which will direct you to some other resource. d. This sentence contains an `inline hyperlink` >uri> which will direct you to some other resource. e. This sentence contains an `inline hyperlink` ^<uri> which will direct you to some other resource. f. This sentence contains an `inline hyperlink` [>uri] which will direct you to some other resource. I personally like (c) and (f) the best. ~fantasai From fantasai@escape.com Sat Jul 6 16:25:05 2002 From: fantasai@escape.com (fantasai) Date: Sat, 06 Jul 2002 11:25:05 -0400 Subject: [Doc-SIG] Cleaning up HTML output (part 1 - 'name' and numerical ids) Message-ID: <3D270BD1.7911DC3F@escape.com> David Goodger wrote: > fantasai wrote: > > And numerical id's aren't very use-friendly. > > Docutils actually uses names wherever possible. I don't know that it > could be improved much, but if you do, please let me know. Well, let's take an example from your test.txt output: | <div class="section" id="structural-elements" name="structural-elements"> | <h1><a href="#id21">Structural Elements</a></h1> 'name' is not a valid attribute for <div>, <li>, or most of the other elements in HTML. It's used in forms, and it's used in the anchor tag. 
The above code should be written one of three ways: <div class="section" id="structural-elements"> <h1>Structural Elements</h1> <div class="section"> <h1><a id="structural-elements" name="structural-elements">Structural Elements</a></h1> <div class="section"> <h1><a name="structural-elements">Structural Elements</a></h1> The first only works in browsers that support the 'id' attribute for targets, but it is a cleaner syntax. The second is redundant. The third is not ideal, but it works in every HTML browser I have ever come across. You use the numerical ids to have headings refer back to their respective section entries in the table of contents. I don't see that this is a particularly necessary behavior to have--you can just link back to the table of contents as a whole, if you really want to. Ideally, you'd put in a <link> to the table of contents in the <head>: <link rel="toc" href="#table-of-contents" title="Table of Contents"> and leave it at that. Unfortunatly, most browsers don't support <link>. If linking back to the corresponding toc entry is important to you, then identify the entries with the section's id preceded by 'toc-'. For example: <li id="toc-structural-elements"><a href="#structural-elements">Structural Elements</a></li> or <li><a href="#structural-elements" name="toc-structural-elements">Structural Elements</a></li> IMO, having the headers as links is distracting. But that's just my opinion. ~fantasai From goodger@users.sourceforge.net Sat Jul 6 17:19:27 2002 From: goodger@users.sourceforge.net (David Goodger) Date: Sat, 06 Jul 2002 12:19:27 -0400 Subject: [Doc-SIG] References in the same line as the target text In-Reply-To: <3D270196.6938208F@escape.com> Message-ID: <B94C90CE.25628%goodger@users.sourceforge.net> [fantasai] >>> I think this: >>> `inline hyperlink` { uri_with@weird,symbols } >>> looks much better than this: >>> `inline hyperlink`__ __<uri_with@weird,symbols> [David] >> I disagree, for several reasons: >> >> 1. 
Using whitespace like that is a kluge. Line-wrap the text and >> you could end up with:: >> >> `inline hyperlink`_ { >> uri_with@weird,symbols >> } [fantasai] > You would only have to do that in very rare cases: a relative url > with an @ sign. Absolute urls must specify a scheme followed by a > colon. Colons aren't allowed in mail addresses, so the email parsing > regex wouldn't pick them up even if you left out the spaces. Exceptions like that make the rules very hard to remember. Curly braces, with or without spaces, are out. >> 2. Curly braces are very similar to parentheses, which cannot be >> used because they're too common in text. > > Can you give me an example? I don't remember the last time I saw > curly braces in non-mathmematical text. I'm saying that "(parenthesized text)" is very common in ordinary text, and "{curly braces}" look just like them. A weak argument, to be sure. In any case, curly braces will not be accepted. > > 3. There's overwhelming precedent for angle brackets with URLs. > .. > > Angle brackets are familiar, standard URI delimiters. Much better > > than curly braces. > > Certainly, I agree, but as you said, they are already taken for SGML > notation. And this workaround: > > `inline hyperlink`__ __<uri> > > Looks horrendous. If your goal is unobtrusive, that syntax fails > miserably. Seven consecutive punctuation characters--really, it's > not necessary. My goal for this syntax is not "unobtrusive", it's "unambiguous" and "unsurprising". This is one of the "Less common constructs, for which there is no natural or obvious markup," and therefore it "should be distinctive" (quotes from the "unobtrusive" goal). Please remember, I'm not convinced this syntax is a good idea. In fact, I still think it's probably a bad idea. I'm only allowing an experimental extension via a pragma directive, to see how it fares in the wild, and to try to keep an open mind. I'm reserving the right to rip it out if it proves unsuccessful. 
This may sound all high-and-mighty, but I've got the unenviable responsibility of being in the position of making the final decisions. I have to try to keep the syntax as consistent and elegant as possible. Although you and Simon might think it's a brilliant idea, I've got to look at the big picture. That means saying "no" sometimes; tough. > Alternatives: > > a. This sentence contains an `inline hyperlink` {uri} which will > direct you to some other resource. It's never going to happen. Please drop it. > b. This sentence contains an `inline hyperlink` __<uri> which will > direct you to some other resource. You've dropped the trailing underscores from the reference, turning it into interpreted text. No good. Same in all of these alternatives; I'll pretend they're all spelled "`inline hyperlink`__". > c. This sentence contains an `inline hyperlink` <"uri"> which will > direct you to some other resource. Why do we need quotation marks? To distinguish from HTML/SGML/XML tags? At least the underscores have precedent. And see (b). > d. This sentence contains an `inline hyperlink` >uri> which will > direct you to some other resource. Not in my lifetime. > e. This sentence contains an `inline hyperlink` ^<uri> which will > direct you to some other resource. Yuck. > f. This sentence contains an `inline hyperlink` [>uri] which will > direct you to some other resource. Double-yuck. -- David Goodger <goodger@users.sourceforge.net> Open-source projects: - Python Docutils: http://docutils.sourceforge.net/ (includes reStructuredText: http://docutils.sf.net/rst.html) - The Go Tools Project: http://gotools.sourceforge.net/ From goodger@users.sourceforge.net Sat Jul 6 17:21:13 2002 From: goodger@users.sourceforge.net (David Goodger) Date: Sat, 06 Jul 2002 12:21:13 -0400 Subject: [Doc-SIG] Call for opinions on "inline external targets" idea Message-ID: <B94C9138.25629%goodger@users.sourceforge.net> I'd like to hear what long-time reStructuredText users think of all this. 
Simon Budig has proposed an "inline anonymous external target" construct, where a URL immediately follows a reference in the text flow. I'm reluctant to add the construct (in whatever form) because it breaks the flow of text. It's similar to the syntax used by StructuredText, which was rejected because it broke up the flow of text. The advantage of the "inline" construct is that it keeps the URI close to the reference, making it easier to edit the text while keeping references and targets in sync. The best syntax so far looks like this:: This is a reference__ __<http://example.com> to an example. This is equivalent to:: This is a reference__ to an example. __ http://example.com A "named" version has also been proposed:: This is a reference_ _<http://example.com> to an example. And another reference_ to the same one. This is equivalent to:: This is a reference_ to an example. And another reference_ to the same one. .. _reference: http://example.com Here the name for the inline target is implied. So, what do you think? Is this construct useful? Is the syntax clear? Is it intrusive? Is it worth adding to reStructuredText? Add it outright, or as an explicitly invoked optional extension? Thanks! -- David Goodger <goodger@users.sourceforge.net> Open-source projects: - Python Docutils: http://docutils.sourceforge.net/ (includes reStructuredText: http://docutils.sf.net/rst.html) - The Go Tools Project: http://gotools.sourceforge.net/ From Simon.Budig@unix-ag.org Sun Jul 7 13:13:39 2002 From: Simon.Budig@unix-ag.org (Simon Budig) Date: Sun, 7 Jul 2002 14:13:39 +0200 Subject: [Doc-SIG] References in the same line as the target text In-Reply-To: <3D270196.6938208F@escape.com>; from fantasai@escape.com on Sat, Jul 06, 2002 at 10:41:26AM -0400 References: <B94BF2B7.255F3%goodger@users.sourceforge.net> <3D270196.6938208F@escape.com> Message-ID: <20020707141338.B42702@vmax.unix-ag.uni-siegen.de> fantasai (fantasai@escape.com) wrote: > Alternatives: > a. 
This sentence contains an `inline hyperlink` {uri} which will direct > you to some other resource. [...] > f. This sentence contains an `inline hyperlink` [>uri] which will direct > you to some other resource. > > I personally like (c) and (f) the best. I wonder how you came up with this selection - it seems a bit arbitrary to me. why not ~~uri~~ or +*uri*+? :-) Nah. I think that the alternative of David is not that bad. Of course a reference__ __<link> syntax has lots of line noise, but it has the advantage of directly referring to the other reference-syntax'es. Of course in my eyes reference__<link> and reference_<link> would be acceptable too, especially when you can wrap URLs with newlines within the URL. As mentioned in my other mail I think that if this seems too subtle to David reference__<<link>> and reference_<<link>> would be OK with me too. But I am also happy with the current syntax - as long as there *is* a syntax for inlining links... :-) Bye, Simon -- Simon.Budig@unix-ag.org http://www.home.unix-ag.org/simon/ From Simon.Budig@unix-ag.uni-siegen.de Sun Jul 7 13:25:09 2002 From: Simon.Budig@unix-ag.uni-siegen.de (Simon Budig) Date: Sun, 7 Jul 2002 14:25:09 +0200 Subject: [Doc-SIG] References in the same line as the target text Message-ID: <20020707142508.D42702@vmax.unix-ag.uni-siegen.de> [sorry, this was intended to go to doc-sig as well] David Goodger (goodger@users.sourceforge.net) wrote: > Simon Budig wrote: > > I have now implemented reference__ __<uri> and `refe rence`__ > > __<uri>. they are analogous to anonymous links. > > Did you allow for long URIs split over lines? This would have to be > allowed:: > > reference__ __<http://this.is.the.beginning > .of.a.very.long.uri.com/index.html#and-here-is > -even-more> I implemented this now (I was not aware that this is possible). It does not yet generate a warning when there are additional spaces in the URL though. 
> > I also implemented reference_ _<uri> and `refe rence`_ _<uri> > > analogoes to named links, this is some kind of closure of the syntax > > (mathematically speaking... :-) > > Where the target name is implied. Yes, I suppose it follows. > > > I am currently not sure if the possibility to wrap before long URLs > > is worth the added line noise by doubling the underscores. I think > > reference_<uri> resp. reference__<uri> might be acceptable too. > > Without the spaces & matching underscores the syntax would be too > subtle I think. And allowing for line-wrapping is important. Hmm. Wouldn't the possibility of breaking URLs with a newline be enough, and reduce the need for the additional space in the markup? You could do `long text followed by a long URL that might make it necessary`__<http:// to.break.the.URL.at/the/end/of/the#line> I think that the advantage of this definitely is that it would indicate the tight connection between the text and the argument. The separation with a space looks a bit as if these were two totally independent syntactic constructs (which they aren't, because __<this.url> on its own would not be recognized). If you think that reference_<uri> is too subtle, we could make it more explicit with_<<double angle brackets>>. I also thought about adding an additional underscore__<like/this> for named and three underscores___<like/this> for anonymous inline targets. However, this could lead to confusion because "__" would be used for both anonymous links and named inline links.
Bye, Simon -- Simon.Budig@unix-ag.org http://www.home.unix-ag.org/simon/ From garth@deadlybloodyserious.com Mon Jul 8 03:31:29 2002 From: garth@deadlybloodyserious.com (Garth Kidd) Date: Mon, 8 Jul 2002 12:31:29 +1000 Subject: [Doc-SIG] Call for opinions on "inline external targets" idea In-Reply-To: <B94C9138.25629%goodger@users.sourceforge.net> Message-ID: <006501c22627$92d94dd0$3d00800a@gkiddlap> > Simon Budig has proposed an "inline anonymous external > target" construct, where a URL immediately follows a > reference in the text flow. > The best syntax so far looks like this:: > > This is a reference__ __<http://example.com> to an example. -1. > This is equivalent to:: > > This is a reference__ to an example. > > __ http://example.com I strongly prefer the current way of doing it. Inline is spectacularly messy, IMHO. Regards, Garth. From tony@lsl.co.uk Mon Jul 8 10:07:05 2002 From: tony@lsl.co.uk (Tony J Ibbs (Tibs)) Date: Mon, 8 Jul 2002 10:07:05 +0100 Subject: [Doc-SIG] Call for opinions on "inline external targets" idea In-Reply-To: <006501c22627$92d94dd0$3d00800a@gkiddlap> Message-ID: <00b601c2265e$d4647590$545aa8c0@lslp862.int.lsl.co.uk> Garth Kidd wrote: > I strongly prefer the current way of doing it. Inline is > spectacularly messy, IMHO. I vehemently agree with Garth (gosh, that's nice) and David (that's nice too) that the inline alternatives being suggested look messy - there are/were good reasons they've been taken out (I may have had to be convinced of that in the past - don't remember - but if so, then that's an even better reason to stay with the status quo). Thinking about the sorts of HTML output I'm likely to generate (have generated), I don't believe I would gain from the new syntaxes. Hmm. I relatively recently produced an in-house document that did contain a fair number of URLs related to what was being discussed (a report from a conference I'd attended).
In actual fact, it turned out that the result I *wanted* in the HTML was of the form:: Talked to X, Y and Z from ABC (http://www.....) to make the URLs more obvious, and not less - so it (sometimes) goes. OK - time for daft ideas again. Simon wants an *inline* way of writing his URLs, 'cos that's how he likes to type (the "but I might spell it wrong" argument isn't very convincing to me, since docutils either will warn about mismatched/unused links and targets, or could easily be made to, and link names should be distinct anyway). We don't particularly want to give that inline form to him explicitly. But both sides think that text like my example above is sensible (at some times). So why doesn't Simon write a transformer/filter for docutils that takes a Docutils tree, finds all the URLs that are in parentheses, and shifts them out of line - i.e., backwards onto the preceding interpreted text or single "word".? Thus he would type my example above, but after filtering, the *result* would be as if he had typed:: Talked to X, Y and Z from ABC_ .. _ABC: http://www..... Now, assuming he never wants to mix the two styles (which I imagine he doesn't), then that does more-or-less exactly what he wants... Tibs -- Tony J Ibbs (Tibs) http://www.tibsnjoan.co.uk/ Give a pedant an inch and they'll take 25.4mm (once they've established you're talking a post-1959 inch, of course) My views! Mine! Mine! (Unless Laser-Scan ask nicely to borrow them.) From Paul.Moore@atosorigin.com Mon Jul 8 10:15:45 2002 From: Paul.Moore@atosorigin.com (Moore, Paul) Date: Mon, 8 Jul 2002 10:15:45 +0100 Subject: [Doc-SIG] Call for opinions on "inline external targets" idea Message-ID: <714DFA46B9BBD0119CD000805FC1F53B01B5B415@UKRUX002.rundc.uk.origin-it.com> From: Tony J Ibbs (Tibs) [mailto:tony@lsl.co.uk] > Garth Kidd wrote: > > I strongly prefer the current way of doing it. Inline is > > spectactularly messy, IMHO. 
> > I vehemently agree with Garth (gosh, that's nice) and David FWIW (not much, as I don't use reST much for this type of thing) I agree as well. The proposed syntax is far too punctuation-heavy, and any of the alternatives discussed are ambiguous or too subtle. > Talked to X, Y and Z from ABC (http://www.....) [...] > why doesn't Simon write a transformer/filter for docutils that takes a > Docutils tree, finds all the URLs that are in parentheses, and shifts > them out of line - i.e., backwards onto the preceding interpreted text > or single "word".? > > Thus he would type my example above, but after filtering, the *result* > would be as if he had typed:: > > Talked to X, Y and Z from ABC_ > > .. _ABC: http://www..... The problem, as I understand it, is that Simon wants to do this for *relative* URLs, so he's looking at Talked to X, Y and Z from ABC (localfile.htm) which reST won't pick up as a link. So the transformer won't see link nodes, which leaves us back at parsing raw text. Paul. From Simon.Budig@unix-ag.org Mon Jul 8 10:23:01 2002 From: Simon.Budig@unix-ag.org (Simon Budig) Date: Mon, 8 Jul 2002 11:23:01 +0200 Subject: [Doc-SIG] Call for opinions on "inline external targets" idea In-Reply-To: <714DFA46B9BBD0119CD000805FC1F53B01B5B415@UKRUX002.rundc.uk.origin-it.com>; from Paul.Moore@atosorigin.com on Mon, Jul 08, 2002 at 10:15:45AM +0100 References: <714DFA46B9BBD0119CD000805FC1F53B01B5B415@UKRUX002.rundc.uk.origin-it.com> Message-ID: <20020708112301.A47492@vmax.unix-ag.uni-siegen.de> Moore, Paul (Paul.Moore@atosorigin.com) wrote: > From: Tony J Ibbs (Tibs) [mailto:tony@lsl.co.uk] > > Garth Kidd wrote: > > > I strongly prefer the current way of doing it. Inline is > > > spectactularly messy, IMHO. > > > > I vehemently agree with Garth (gosh, that's nice) and David > > FWIW (not much, as I don't use reST much for this type of thing) I agree as > well. 
The proposed syntax is far too punctuation-heavy, and any of the > alternatives discussed are ambiguous or too subtle. Just for the records: The reference__ __<uri> syntax is punctuation heavy, but this is due to the fact that David *wanted* to have a whitespace between the reference and the uri. I originally proposed reference_(uri) (which has issues of parentheses being a valid URI-Component) and I currently would prefer reference_<uri> or reference_<<uri>>. Since it is allowed to break URIs with newlines the reason for embedding the whitespace in the syntax IMHO has gone. Bye, Simon -- Simon.Budig@unix-ag.org http://www.home.unix-ag.org/simon/ From tony@lsl.co.uk Mon Jul 8 10:26:59 2002 From: tony@lsl.co.uk (Tony J Ibbs (Tibs)) Date: Mon, 8 Jul 2002 10:26:59 +0100 Subject: [Doc-SIG] Call for opinions on "inline external targets" idea In-Reply-To: <714DFA46B9BBD0119CD000805FC1F53B01B5B415@UKRUX002.rundc.uk.origin-it.com> Message-ID: <00b701c22661$9c1e2660$545aa8c0@lslp862.int.lsl.co.uk> Paul Moore wrote: > The problem, as I understand it, is that Simon wants to do this for > *relative* URLs, so he's looking at > > Talked to X, Y and Z from ABC (localfile.htm) > > which reST won't pick up as a link. So the transformer won't > see link nodes, which leaves us back at parsing raw text. Ah - I keep forgetting that - mainly because I parse the statement:: I want relative URLs, without an ``http://`` at the front, to be recognised. as being in the same class as the statement:: I want the moon to be made of cheese. - i.e., it might be fun (for some sense of fun), but it doesn't make much sense. So what's the problem with the overhead of just making the URL be a legal URL - i.e., adding the ``http://`` in front of it? It's about the same amount of typing as all the other solutions being proposed (and then my "solution" can be applied...). 
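The "solution" mentioned above (shifting a parenthesised URL back onto the preceding word) might look roughly like the sketch below. This is only an illustration under loud assumptions: Tibs suggested a filter over a Docutils tree, whereas this version works on raw text with a hypothetical regular expression, handles only a single preceding word, and uses a made-up example.org address.

```python
import re

# Hypothetical pattern: one word, a space, then an absolute http(s) URL
# in parentheses. Relative URLs deliberately do not match.
PAREN_URL = re.compile(r"(\w+) \((https?://[^\s()]+)\)")


def move_urls_out_of_line(text):
    """Turn 'ABC (http://...)' into 'ABC_' plus a named target."""
    targets = []

    def replace(match):
        name, url = match.group(1), match.group(2)
        targets.append(".. _%s: %s" % (name, url))
        return name + "_"

    body = PAREN_URL.sub(replace, text)
    if targets:
        body = body + "\n\n" + "\n".join(targets)
    return body


absolute = move_urls_out_of_line(
    "Talked to X, Y and Z from ABC (http://www.example.org/)")
relative = move_urls_out_of_line(
    "Talked to X, Y and Z from ABC (localfile.htm)")
```

The first call yields ``ABC_`` followed by a ``.. _ABC:`` target line. The second text comes back unchanged, which is exactly Paul's objection: a bare relative name gives the filter nothing to recognise as a URL.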
Tibs -- Tony J Ibbs (Tibs) http://www.tibsnjoan.co.uk/ "I'm a little monster, short and stout Here's my horns and here's my snout When you come a calling, hear me shout I will ROAR and chase you out" My views! Mine! Mine! (Unless Laser-Scan ask nicely to borrow them.) From Simon.Budig@unix-ag.org Mon Jul 8 10:38:30 2002 From: Simon.Budig@unix-ag.org (Simon Budig) Date: Mon, 8 Jul 2002 11:38:30 +0200 Subject: [Doc-SIG] Call for opinions on "inline external targets" idea In-Reply-To: <00b701c22661$9c1e2660$545aa8c0@lslp862.int.lsl.co.uk>; from tony@lsl.co.uk on Mon, Jul 08, 2002 at 10:26:59AM +0100 References: <714DFA46B9BBD0119CD000805FC1F53B01B5B415@UKRUX002.rundc.uk.origin-it.com> <00b701c22661$9c1e2660$545aa8c0@lslp862.int.lsl.co.uk> Message-ID: <20020708113830.A47496@vmax.unix-ag.uni-siegen.de> Tony J Ibbs (Tibs) (tony@lsl.co.uk) wrote: > Paul Moore wrote: > > The problem, as I understand it, is that Simon wants to do this for > > *relative* URLs, so he's looking at > > > > Talked to X, Y and Z from ABC (localfile.htm) > > > > which reST won't pick up as a link. So the transformer won't > > see link nodes, which leaves us back at parsing raw text. > > Ah - I keep forgetting that - mainly because I parse the statement:: > > I want relative URLs, without an ``http://`` at the > front, to be recognised. > > as being in the same class as the statement:: > > I want the moon to be made of cheese. > > - i.e., it might be fun (for some sense of fun), but it doesn't make > much sense. > > So what's the problem with the overhead of just making the URL be a > legal URL - i.e., adding the ``http://`` in front of it? It's about the > same amount of typing as all the other solutions being proposed (and > then my "solution" can be applied...). Because this needs a complete specification of the server and the path and makes it necessary to edit a group of documents when you just want to move them to another location. 
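To make the relocatability point concrete, here is a small sketch using Python's standard ``urllib.parse.urljoin``. The two base addresses are invented for the example; the only point is that the relative reference itself never changes when the documents move.

```python
from urllib.parse import urljoin

# The link as written in the document source: no scheme, no server.
relative = "localfile.htm"

# Two hypothetical locations for the same group of documents.
old_base = "http://example.com/docs/index.html"
new_base = "http://mirror.example.org/archive/docs/index.html"

# The same relative reference resolves against whichever base applies.
old_target = urljoin(old_base, relative)
new_target = urljoin(new_base, relative)
```

With an absolute ``http://...`` link, moving the files from the first location to the second would mean editing every document that contains it.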
Bye, Simon -- Simon.Budig@unix-ag.org http://www.home.unix-ag.org/simon/ From tony@lsl.co.uk Mon Jul 8 11:26:28 2002 From: tony@lsl.co.uk (Tony J Ibbs (Tibs)) Date: Mon, 8 Jul 2002 11:26:28 +0100 Subject: [Doc-SIG] Call for opinions on "inline external targets" idea In-Reply-To: <20020708113830.A47496@vmax.unix-ag.uni-siegen.de> Message-ID: <00bd01c22669$eb6bb040$545aa8c0@lslp862.int.lsl.co.uk> I wondered out loud why one didn't just write ``http://....``. Simon Budig wrote: > Because this needs a complete specification of the server and the path > and makes it necessary to edit a group of documents when you just want > to move them to another location. and Paul Moore wrote: > Relocatability. I don't *think* there's a http:... syntax for a > relative link. If there is, it's pretty obscure... So I did a google search on "relative URL" and got to RFC 1808 - ah, I see. The RFC doesn't say how one is aware one has a relative URL - that is up to the embedding document (pretty obvious in HTML, of course). So if I put ``http://`` I have already overspecified (since the base scheme may not be, or may not only be, ``http``). It just so happens that all of the occasions I've used this sort of thing have been of the form ``http://../fred.html#thingy`` (and so on) - i.e., not addressing the problem at all. Not that that makes me any more eager to adopt the proposed solutions. Tibs -- Tony J Ibbs (Tibs) http://www.tibsnjoan.co.uk/ Well we're safe now....thank God we're in a bowling alley. - Big Bob (J.T. Walsh) in "Pleasantville" My views! Mine! Mine! (Unless Laser-Scan ask nicely to borrow them.) From fdrake@acm.org Mon Jul 8 17:44:47 2002 From: fdrake@acm.org (Fred L. Drake, Jr.) 
Date: Mon, 8 Jul 2002 12:44:47 -0400 Subject: [Doc-SIG] Bug in PYthon Doc 2.2.1 In-Reply-To: <GCEDKONBLEFPPADDJCOEIEFBDNAA.whisper@oz.net> References: <GCEDKONBLEFPPADDJCOEIEFBDNAA.whisper@oz.net> Message-ID: <15657.49535.88870.211243@grendel.zope.com> David LeBlanc writes: > Is this the right place to report one? This is one such place, yes. A better one would be the SourceForge bug tracker for Python: http://sourceforge.net/tracker/?func=browse&group_id=5470&atid=105470 > If so: > > Python Documentation > Release 2.2.1 > April 10, 2002 > > pythondoc/ref/if.html: > if_stmt ::= "if" expression ":" suite > > Clicking on the word expression above results in: > The requested URL /pythondoc/ref/node61.html was not found on this server This is a known bug: http://www.python.org/sf/484967 It looks like this will be fairly difficult to fix, mostly due to limitations of the tools we're currently using. Feel free to "Monitor" the issue to receive notifications of further action on this if you're so inclined. Thanks for reporting the problem. -Fred -- Fred L. Drake, Jr. <fdrake at acm.org> PythonLabs at Zope Corporation From goodger@users.sourceforge.net Tue Jul 9 02:48:26 2002 From: goodger@users.sourceforge.net (David Goodger) Date: Mon, 08 Jul 2002 21:48:26 -0400 Subject: [Doc-SIG] Cleaning up HTML output (part 1 - 'name' and numerical ids) In-Reply-To: <3D270BD1.7911DC3F@escape.com> Message-ID: <B94FB929.257F0%goodger@users.sourceforge.net> [fantasai] >>> And numerical id's aren't very use-friendly. [David Goodger] >> Docutils actually uses names wherever possible. I don't know that >> it could be improved much, but if you do, please let me know. [fantasai] > Well, let's take an example from your test.txt output: > > | <div class="section" id="structural-elements" > | name="structural-elements"> > | <h1><a href="#id21">Structural Elements</a></h1> > > 'name' is not a valid attribute for <div>, <li>, or most of the > other elements in HTML. 
It's used in forms, and it's used in the > anchor tag. I see from the HTML spec that that's true. I guess I implemented "name" and "id" everywhere as an over-liberal interpretation of the XHTML 1.0 spec (Appendix C, section 8 "Fragment Identifiers"), where it says: In XML, URIs [RFC2396] that end with fragment identifiers of the form "#foo" do not refer to elements with an attribute name="foo"; rather, they refer to elements with an attribute defined to be of type ID, e.g., the id attribute in HTML 4. Many existing HTML clients don't support the use of ID-type attributes in this way, so identical values may be supplied for both of these attributes to ensure maximum forward and backward compatibility (e.g., <a id="foo" name="foo">...</a>). (http://www.w3.org/TR/2000/REC-xhtml1-20000126#guidelines) Patches to rectify this and any other oversights/mistakes/bugs are always welcome. > The above code should be written one of three ways: > > <div class="section" id="structural-elements"> > <h1>Structural Elements</h1> ... > The first only works in browsers that support the 'id' attribute for > targets, but it is a cleaner syntax. Should we be supporting older browsers? Or can we write code to the latest & greatest specs exclusively? > <div class="section"> > <h1><a id="structural-elements" name="structural-elements"> > Structural Elements</a></h1> ... > The second is redundant. But better for older browsers. Also, it would be tricky to implement, since the ID skips from the container (<div>, a Docutils section) to the header/title. > <div class="section"> > <h1><a name="structural-elements">Structural Elements</a></h1> ... > The third is not ideal, but it works in every HTML browser I have > ever come across. It's also deprecated in the latest specs. Will future browsers ever stop supporting it? > You use the numerical ids to have headings refer back to their > respective section entries in the table of contents. 
I don't see > that this is a particularly necessary behavior to have--you can just > link back to the table of contents as a whole, if you really want > to. That could be optional behavior, specified either by a command-line option or as a "contents" directive attribute (or both; perhaps both would be best). I'll enter it in the "To Do" list; patches are welcome. I modeled the current TOC behavior on GNU HTML documents. > Ideally, you'd put in a <link> to the table of contents in the > <head>: > <link rel="toc" href="#table-of-contents" > title="Table of Contents"> > > and leave it at that. I don't understand how that is supposed to work. Could you supply an example or a reference? > Unfortunately, most browsers don't support <link>. So perhaps it's a non-issue? > If linking back to the corresponding toc entry is important to you, > then identify the entries with the section's id preceded by 'toc-'. > For example: > > <li id="toc-structural-elements"><a href="#structural-elements"> > Structural Elements</a></li> > > or > > <li><a href="#structural-elements" name="toc-structural-elements"> > Structural Elements</a></li> Since I normally don't read the raw HTML, and it's not *intended* to be read in raw form, I don't think the form of the IDs is that important. If you do... patches are welcome. -- David Goodger <goodger@users.sourceforge.net> Open-source projects: - Python Docutils: http://docutils.sourceforge.net/ (includes reStructuredText: http://docutils.sf.net/rst.html) - The Go Tools Project: http://gotools.sourceforge.net/ From goodger@users.sourceforge.net Tue Jul 9 03:29:33 2002 From: goodger@users.sourceforge.net (David Goodger) Date: Mon, 08 Jul 2002 22:29:33 -0400 Subject: [Doc-SIG] References in the same line as the target text In-Reply-To: <20020707142508.D42702@vmax.unix-ag.uni-siegen.de> Message-ID: <B94FC2CD.257F2%goodger@users.sourceforge.net> Simon Budig wrote: > Hmm.
Isn't the possibility to break URLs with a newline enough and > would reduce the need for the additional space in the markup? You > could do `long text followed by a long URL that might make it > necessary`__<http:// to.break.the.URL.at/the/end/of/the#line> Rethinking why I objected to the initial syntax because of the line wrapping issue, I realized that was just one superficial aspect. More importantly, that syntax produces a single compound construct made up of two equally important parts, *with syntax in the middle*, *between* the reference and the target. This is unprecedented in reStructuredText; every other inline construct consists of only one logical part (even in "interpreted text" with explicit roles, the role can be considered a qualifier and not equal to the text itself). The compromise syntax ("`reference text`__ __<http://example.com>") is executed in two parts: a hyperlink reference identical to all other references, and an "inline external target" which is new. This fits better with the rest of reStructuredText. I also notice from a quick perusal of the code that the drop-in replacement is implemented to recognize both parts at once; I don't know if I agree with that. Perhaps the inline external target should not be restricted to appearing immediately after the reference only? I think that allowing it to be anywhere would be a logical extension/generalization of the idea. Someone might choose a middle ground and want to put targets between sentences instead of interrupting the sentence flow. > I think that the advantage of this definitely is that it would > indicate the tight connection between the text and the argument. The > separation with a space looks a bit as if this were two totally > independant syntactic constructs (which it isn't because > __<this.url> would not be recognized. I think they *are* two totally independent syntactic constructs. The space (separation) is an essential part of that. 
> If you think that reference_<uri> is to subtle, we could make it > more explicit with_<<double angle brackets>>. Doubling the <>s doesn't improve it. > I also thought about adding an additional underscore__<like/this> > for named and three underscores___<like/this> for anonymous inline > targets. Increasing the number of underscores doesn't improve the syntax either. -- David Goodger <goodger@users.sourceforge.net> Open-source projects: - Python Docutils: http://docutils.sourceforge.net/ (includes reStructuredText: http://docutils.sf.net/rst.html) - The Go Tools Project: http://gotools.sourceforge.net/ From Simon.Budig@unix-ag.uni-siegen.de Tue Jul 9 10:22:32 2002 From: Simon.Budig@unix-ag.uni-siegen.de (Simon Budig) Date: Tue, 9 Jul 2002 11:22:32 +0200 Subject: [Doc-SIG] References in the same line as the target text In-Reply-To: <B94FC2CD.257F2%goodger@users.sourceforge.net>; from goodger@users.sourceforge.net on Mon, Jul 08, 2002 at 10:29:33PM -0400 References: <20020707142508.D42702@vmax.unix-ag.uni-siegen.de> <B94FC2CD.257F2%goodger@users.sourceforge.net> Message-ID: <20020709112232.A31784@vmax.unix-ag.uni-siegen.de> David Goodger (goodger@users.sourceforge.net) wrote: > The compromise syntax ("`reference text`__ __<http://example.com>") is > executed in two parts: a hyperlink reference identical to all other > references, and an "inline external target" which is new. This fits > better with the rest of reStructuredText. > > I also notice from a quick perusal of the code that the drop-in > replacement is implemented to recognize both parts at once; I don't > know if I agree with that. Perhaps the inline external target should > not be restricted to appearing immediately after the reference only? I > think that allowing it to be anywhere would be a logical > extension/generalization of the idea. Someone might choose a middle > ground and want to put targets between sentences instead of > interrupting the sentence flow. Ah. 
The very first version I wrote indeed treated the second thing as an independent construct. It simply added an anonymous target when stumbling across __<uri>. However, it made the behaviour in the following case different from what I expected:: Here is a reference__ and here is a `second one`__ __<target1> and here is a third__ __ target2 __ target3 target1 got assigned to reference__, target2 to `second one`__ and target3 to third__. This was different from, and a bit unexpected compared to, my proposal. I wanted to have target1 assigned to `second one`__ and the regular anonymous targets fill the gaps as usual. I had to add this as an optional item to the reference__ syntax so that I get this behaviour. When stumbling over a __<uri> thing it is too late to figure out which was the last reference to assign the target to. However, it would be OK with me to change the behaviour as you described it. This would also mean dropping the reference_<uri> syntax for inline named targets (because there is no easy way to figure out a matching name...), maybe this is a good thing. > > I think that the advantage of this definitely is that it would > > indicate the tight connection between the text and the argument. The > > separation with a space looks a bit as if this were two totally > > independent syntactic constructs (which it isn't because > > __<this.url> would not be recognized. > > I think they *are* two totally independent syntactic constructs. The > space (separation) is an essential part of that. Well, I originally looked at it as an optional "Ok, let's fill the needed uri at once" so making it two syntactic constructs made no sense. Hmm. Not sure. I think I would prefer to keep it the way it currently works. It reduces a source of confusion.
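The two behaviours contrasted above can be sketched as follows. This is an illustration only, not code from the patch: the event list is a hypothetical simplification of the parsed text, and reference names are assumed to be distinct.

```python
def bind_in_order(events, trailing_targets):
    """Independent construct: every anonymous target, inline or not,
    is consumed by the references in plain document order."""
    refs = [value for kind, value in events if kind == "ref"]
    targets = [value for kind, value in events if kind == "target"]
    targets += list(trailing_targets)
    return dict(zip(refs, targets))


def bind_to_nearest_prior(events, trailing_targets):
    """Coupled construct: an inline target binds to the reference just
    before it; the remaining references take the trailing targets."""
    bindings = {}
    unbound = []
    last_ref = None
    for kind, value in events:
        if kind == "ref":
            unbound.append(value)
            last_ref = value
        elif kind == "target" and last_ref in unbound:
            bindings[last_ref] = value
            unbound.remove(last_ref)
    bindings.update(zip(unbound, trailing_targets))
    return bindings


# The example from the message, reduced to references and inline targets.
events = [("ref", "reference"), ("ref", "second one"),
          ("target", "target1"), ("ref", "third")]
trailing = ["target2", "target3"]

in_order = bind_in_order(events, trailing)
nearest = bind_to_nearest_prior(events, trailing)
```

Under the first function ``reference`` receives ``target1``, which is the surprising result described above; under the second, ``second one`` receives ``target1`` and the trailing targets fill the gaps.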
However, I can change the code to the other behaviour if wanted (I am not sure how much work I should invest in this, since the reactions to your call for opinions were pretty clear up to now, so this patch is likely to end as an local extension...). Bye, Simon -- Simon.Budig@unix-ag.org http://www.home.unix-ag.org/simon/ From fantasai@escape.com Tue Jul 9 15:09:49 2002 From: fantasai@escape.com (fantasai) Date: Tue, 09 Jul 2002 10:09:49 -0400 Subject: [Doc-SIG] Cleaning up HTML output (part 1 - 'name' andnumerical ids) References: <B94FB929.257F0%goodger@users.sourceforge.net> Message-ID: <3D2AEEAD.BE4D7705@escape.com> David Goodger wrote: > > fantasai wrote: > > The above code should be written one of three ways: > > > > <div class="section" id="structural-elements"> > > <h1>Structural Elements</h1> > ... > > The first only works in browsers that support the 'id' attribute for > > targets, but it is a cleaner syntax. > > Should we be supporting older browsers? Or can we write code to the > latest & greatest specs exclusively? I think it's a good idea to support older browsers--as an option, at least. Not everybody has the latest and greatest software, and some older systems /can't/ support the latest and greatest software. (NS4 does not support ID targets, and it's still widely used. I'm not sure when IE began supporting IDs.) > > <div class="section"> > > <h1><a name="structural-elements">Structural Elements</a></h1> > ... > > The third is not ideal, but it works in every HTML browser I have > > ever come across. > > It's also deprecated in the latest specs. Will future browsers ever > stop supporting it? They probably won't for many years since most content on the web is written with the <a name="fragment-id"> construct. > > Ideally, you'd put in a <link> to the table of contents in the > > <head>: > > <link rel="toc" href="#table-of-contents" > > title="Table of Contents"> > > > > and leave it at that. > > I don't understand how that is supposed to work. 
Could you supply an > example or a reference? http://www.subotnik.net/html/link.html.en > > Unfortunatly, most browsers don't support <link>. > > So perhaps it's a non-issue? Practically, yes. > > If linking back to the corresponding toc entry is important to you, > > then identify the entries with the section's id preceded by 'toc-'. > > For example: ... > > Since I normally don't read the raw HTML, and it's not *intended* to > be read in raw form, I don't think the form of the IDs is that > important. Ah, I forgot to mention another problem with numerical IDs-- If I add or delete a chapter, all the IDs from there on change. Suppose I had a bookmark to that part of the document. It wouldn't point to the right element anymore. It's not critical in a table of contents, where there's only a few list items to scroll through. But in general, you do *not* want numerical identifiers, and it's a good practice to avoid them. ~fantasai From fdrake@acm.org Tue Jul 9 17:16:03 2002 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Tue, 9 Jul 2002 12:16:03 -0400 Subject: [Doc-SIG] Error in pyexpat docs In-Reply-To: <200010291916.MAA16420@localhost.localdomain> References: <200010291916.MAA16420@localhost.localdomain> Message-ID: <15659.3139.489289.158582@grendel.zope.com> [Cleaning out some old email...] On 29 Oct 2000, uche.ogbuji@fourthought.com wrote: > It turns out that ParseFile actually returns 0 on error, returning > 1 otherwise. > > The first matter is that the code and the docs need to be > reconciled. However, I would _much_ rather prefer that things were > as in the docs. I think ParseFile should raise an exception rather > than return an error flag. Interestingly enough, this is the same > argument I had with a colleague just last week. The return value for ParseFile has just recently been made to do the right thing in all cases, and now matches Parse. 
An exception is raised when there's an error in parsing, and the exception instance carries the "code", "offset", and "lineno" attributes. Just in case anyone thought this was still a problem. ;-) -Fred -- Fred L. Drake, Jr. <fdrake at acm.org> PythonLabs at Zope Corporation From klm@zope.com Tue Jul 9 17:19:25 2002 From: klm@zope.com (Ken Manheimer) Date: Tue, 9 Jul 2002 12:19:25 -0400 (EDT) Subject: [Doc-SIG] References in the same line as the target text In-Reply-To: <20020709112232.A31784@vmax.unix-ag.uni-siegen.de> Message-ID: <Pine.LNX.4.44.0207091205410.4080-100000@korak.zope.com> I'd like to weigh in requesting some kind of easy, direct inline reference link. I have particular preferences concerning the current discussion. Before i dive in though, i have to confess - as someone who hasn't tried the non-inline style of references, i may not have the practical experience which would enlighten me otherwise. That said, the idea of having to coordinate collections of references to distant collections of URLs, either by inventing intermediate names or by ordering anonymous correspondences, seems quite daunting. > David Goodger (goodger@users.sourceforge.net) wrote: > > The compromise syntax ("`reference text`__ __<http://example.com>") is > > executed in two parts: a hyperlink reference identical to all other > > references, and an "inline external target" which is new. This fits > > better with the rest of reStructuredText. > > > > I also notice from a quick perusal of the code that the drop-in > > replacement is implemented to recognize both parts at once; I don't > > know if I agree with that. Perhaps the inline external target should > > not be restricted to appearing immediately after the reference only? I > > think that allowing it to be anywhere would be a logical > > extension/generalization of the idea. Someone might choose a middle > > ground and want to put targets between sentences instead of > > interrupting the sentence flow. 
This sounds like it would satisfy my craving, as long as the link binds to the nearest prior reference. On Tue, 9 Jul 2002, Simon Budig wrote: > Ah. The very first version I wrote indeed treated the second thing as > an independant construct. It simply added an anonymous target when > stumbling across __<uri>. However, It made the behaviour in the > following case different than what I expected:: > > Here is a reference__ and here is a `second one`__ __<target1> > and here is a third__ > > __ target2 > __ target3 > > target1 got assigned to reference__, target2 to `second one`__ > and target3 to third__ > > This was a different and a bit unexpected compared to my proposal. > I wanted to have target1 assigned to `second one`__ and the > regular anonymous targets fill the gaps as usual. If i understand correctly, i would, um, *really* dislike having it work this way. As i said above, i would unequivocally want __<target1> to bind to the nearest prior reference, `second one`. The whole point of this construct, for me, would be to insulate the person creating the references from changes anywhere except the intervening space between the reference and the link. It sounds like someone is suggesting a scheme where adding new references at the beginning of the paragraph would change all of the anonymous bindings throughout the paragraph, including the "inline" one. I quote "inline" because it *wouldn't* be inline, in that case, it would be "interleaved", and in a loosely-coupled, surprising way. Lemme know if i'm flailing at non-issues here - my tracking of the discussion is a bit spotty, though i am interested... -- Ken Manheimer klm@zope.com From guido@python.org Tue Jul 9 18:21:32 2002 From: guido@python.org (Guido van Rossum) Date: Tue, 09 Jul 2002 13:21:32 -0400 Subject: [Doc-SIG] Older Python Documentation In-Reply-To: Your message of "Sun, 23 Jun 2002 20:34:42 PDT." 
<000001c21b30$149fc3b0$32cae40c@homer> References: <000001c21b30$149fc3b0$32cae40c@homer> Message-ID: <200207091721.g69HLWL02578@odiug.zope.com> > I'm working on a web site (http://www.webdocs.org) that hosts the > current and older releases of the official Python documentation, along > with some other Python-related docs. Why bother? Python.org has all previous doc releases back to 1.4: http://www.python.org/doc/versions.html > I currently have Python documentation from as far back as version 1.4 on > the site, but nothing older. If possible, I'd like to include as many > releases of the Python documentation as I can find (if not all of them). > However, I can't seem to find anything prior to Python 1.4. (I did > happen to find some tex files for a couple older versions, but no html > files.) Older releases didn't publish the HTML (you'd have to run latex2html yourself). Note that the most complete collection of Python downloadables is at http://www.python.org/ftp/python/src/ -- the 1.2 and 1.3 source releases are there and they contain the .tex files. > Also, I'm relatively new to Python (less than a year), so I don't even > know which older versions were officially released. For instance, I > know that there was a version 1.3 and a 1.2 released, but was there ever > a version 1.2.1 or 1.3.1? If it's not at the URL I mentioned, it wasn't released. (Except for releases up to 1.0, which were lost.) > My guess is that no one today would even care about documentation from > that far back. But personally I think it's cool to be able to see how > Python has evolved over the years. Me too.
:-) You can see the history at the website here (there's a link to it from the python.org home page): http://web.archive.org/web/*sa_/http://www.python.org --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Tue Jul 9 18:31:17 2002 From: guido@python.org (Guido van Rossum) Date: Tue, 09 Jul 2002 13:31:17 -0400 Subject: [Doc-SIG] Older Python Documentation In-Reply-To: Your message of "Tue, 09 Jul 2002 13:21:32 EDT." Message-ID: <200207091731.g69HVHf02624@odiug.zope.com> > Older releases didn't publish the HTML (you'd have to run latex2html > yourself). Note that the most complete collection of Python > downloadables is at http://www.python.org/ftp/python/src/ -- the 1.2 > and 1.3 source releases are there and they contain the .tex files. Actually, the 1.2 and 1.3 HTML at least have been preserved: ftp://www.python.org/pub/python/doc/1.2/ ftp://www.python.org/pub/python/doc/1.3/ --Guido van Rossum (home page: http://www.python.org/~guido/) From goodger@users.sourceforge.net Thu Jul 11 03:04:38 2002 From: goodger@users.sourceforge.net (David Goodger) Date: Wed, 10 Jul 2002 22:04:38 -0400 Subject: [Doc-SIG] Cleaning up HTML output (part 1 - 'name' and numerical ids) In-Reply-To: <3D2AEEAD.BE4D7705@escape.com> Message-ID: <B9525FF5.25A1E%goodger@users.sourceforge.net> [David] >> Should we be supporting older browsers? Or can we write code to the >> latest & greatest specs exclusively? [fantasai] > I think it's a good idea to support older browsers--as an option, at > least. Not everybody has the latest and greatest software, and some > older systems /can't/ support the latest and greatest software. I agree, to a point. I think Docutils should target the current "mainstream", the software that most people are currently using. Neither the bleeding edge (extreme early adopters) nor the trailing edge (extreme laggards) need be considered if it's a pain to do so.
In any case, if there are problems for users outside the mainstream, fixes will have to come from those users. > Ah, I forgot to mention another problem with numerical IDs-- > If I add or delete a chapter, all the IDs from there on > change. Suppose I had a bookmark to that part of the document. It > wouldn't point to the right element anymore. It's not critical in a > table of contents, where there's only a few list items to scroll > through. But in general, you do *not* want numerical identifiers, > and it's a good practice to avoid them. I would agree if we were using numerical IDs exclusively, but we're not. We use named IDs wherever possible; a section titled "Introduction" will have ID "introduction", and that won't change even if a "Preface" section is inserted earlier in the document. This breaks down if we have two identical names: with two "Introduction" sections, the first keeps the ID "introduction", but the second one gets a numbered "id1". I don't see any way around that. The place we *are* using numerical IDs exclusively, is in TOC back-references. I don't think anyone would ever want to link to a table of contents entry; to the TOC itself, perhaps, but not to an arbitrary *entry*. I've added automatic IDs to generated TOCs, taken from the supplied or default title ("Contents" in English). 
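The ID scheme described above can be illustrated with a minimal sketch. The helper name ``make_ids`` and the exact normalization rule (lowercase, non-alphanumeric runs collapsed to hyphens) are assumptions made for illustration only; this is not the Docutils implementation, just one way the described behavior could come about.

```python
import re


def make_ids(titles):
    """Assign an ID to each section title (hypothetical helper, not Docutils code).

    The first section with a given name gets an ID derived from that name;
    a later section with the same name falls back to a numbered ID
    ("id1", "id2", ...).
    """
    used = set()
    counter = 0
    ids = []
    for title in titles:
        # Assumed normalization: lowercase, runs of other characters -> "-".
        candidate = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
        if not candidate or candidate in used:
            # Name clash (or nothing usable): fall back to a numbered ID,
            # skipping any number that is already taken.
            counter += 1
            candidate = "id%d" % counter
            while candidate in used:
                counter += 1
                candidate = "id%d" % counter
        used.add(candidate)
        ids.append(candidate)
    return ids


if __name__ == "__main__":
    print(make_ids(["Preface", "Introduction", "Introduction"]))
```

With this scheme, inserting a "Preface" ahead of "Introduction" leaves the ID "introduction" untouched; only the duplicate title receives a number.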
-- David Goodger <goodger@users.sourceforge.net> Open-source projects: - Python Docutils: http://docutils.sourceforge.net/ (includes reStructuredText: http://docutils.sf.net/rst.html) - The Go Tools Project: http://gotools.sourceforge.net/ From goodger@users.sourceforge.net Thu Jul 11 03:09:50 2002 From: goodger@users.sourceforge.net (David Goodger) Date: Wed, 10 Jul 2002 22:09:50 -0400 Subject: [Doc-SIG] References in the same line as the target text In-Reply-To: <20020709112232.A31784@vmax.unix-ag.uni-siegen.de> Message-ID: <B952612D.25A23%goodger@users.sourceforge.net> Simon Budig wrote: > The very first version I wrote indeed treated the second thing as an > independent construct. It simply added an anonymous target when > stumbling across __<uri>. However, it made the behaviour in the > following case different than what I expected:: > > Here is a reference__ and here is a `second one`__ __<target1> > and here is a third__ > > __ target2 > __ target3 > > target1 got assigned to reference__, target2 to `second one`__ > and target3 to third__ > > This was different and a bit unexpected compared to my proposal. Yes, this behavior would be broken. As Ken mentioned, the inline external target would have to match with the nearest prior reference. > Hmm. Not sure. I think I would prefer to keep it the way it currently > works. It reduces a source for confusion. However, I can change the > code to the other behaviour if wanted I don't prefer *either* way, as you know ;-). I'm just exploring ramifications. As soon as a new construct is added, somebody will try to apply it in an orthogonal way. If we're going to require that the target immediately follow the reference, we might as well embed the target in the reference and use syntax like "`reference text <http://example.com/>`__". See my post "Summary of reference/target syntaxes".
> (I am not sure how much work I should invest in this, since the > reactions to your call for opinions were pretty clear up to now, so > this patch is likely to end as a local extension...). You've already done most of it. I have to add support for parser-changing pragma directives, and then I suppose I will implement one as a test case. It may not be this one though; *that* may depend on your investment! -- David Goodger <goodger@users.sourceforge.net> Open-source projects: - Python Docutils: http://docutils.sourceforge.net/ (includes reStructuredText: http://docutils.sf.net/rst.html) - The Go Tools Project: http://gotools.sourceforge.net/ From goodger@users.sourceforge.net Thu Jul 11 03:11:30 2002 From: goodger@users.sourceforge.net (David Goodger) Date: Wed, 10 Jul 2002 22:11:30 -0400 Subject: [Doc-SIG] References in the same line as the target text In-Reply-To: <Pine.LNX.4.44.0207091205410.4080-100000@korak.zope.com> Message-ID: <B9526191.25A24%goodger@users.sourceforge.net> Ken Manheimer wrote: > I'd like to weigh in requesting some kind of easy, direct inline > reference link. Your input is always welcome. > the idea of having to coordinate collections of references to > distant collections of URLs, either by inventing intermediate names > or by ordering anonymous correspondences, seems quite daunting. There's no "inventing" going on. reStructuredText simply uses the reference text (the part of the text which would be highlighted as a hyperlink in HTML) as a hyperlink name. The target (URL) is listed out-of-line, with the same name. I think that if you try it, you'll at least get used to it if not get to like it outright. It's different from the StructuredText syntax, which I found next to unreadable *because* of the embedded URLs (they break the flow of the plaintext). Simon's proposal is basically asking to bring back the StructuredText way. The syntax differs a bit, but the idea is the same.
[David] >>> Perhaps the inline external target should not be restricted to >>> appearing immediately after the reference only? I think that >>> allowing it to be anywhere would be a logical >>> extension/generalization of the idea. Someone might choose a >>> middle ground and want to put targets between sentences instead of >>> interrupting the sentence flow. [Ken] > This sounds like it would satisfy my craving, as long as the link > binds to the nearest prior reference. I had similar thoughts about the reference/target binding. Unmatched inline external targets would not be allowed; there would always have to be a reference waiting. And it would have to be the most recent reference, otherwise the targets and references would nest, which would be very confusing behavior. We wouldn't be able to have text like:: This is the `first reference`__ and here's the `second reference`__. __<first_url> __<second_url> > The whole point of this construct, for me, would be to insulate the > person creating the references from changes anywhere except the > intervening space between the reference and the link. That's a valid goal, but conflicts with reStructuredText's goal of readability. It's a tough call. > It sounds like someone is suggesting a scheme where adding new > references at the beginning of the paragraph would change all of the > anonymous bindings throughout the paragraph, including the "inline" > one. I quote "inline" because it *wouldn't* be inline, in that > case, it would be "interleaved", and in a loosely-coupled, > surprising way. Anonymous references and targets do have this potential problem: the author has to keep references and targets in sync. Docutils generates an error when there aren't the same number of references and targets, which helps, but only so far. Don't worry, I won't be adding anything that will make hyperlinks more complicated or surprising. Not without a *lot* of thought and discussion, anyhow. 
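The binding rule discussed above (an inline external target attaches to the most recent reference, which must still be waiting for a target, and the ordinary anonymous targets fill the remaining gaps in order) can be sketched roughly as follows. Everything here is hypothetical: the function ``bind_anonymous``, the ``TOKEN`` pattern and the error messages are illustrations of the proposal, not part of the Docutils parser.

```python
import re

# One token per match: a phrase reference (`text`__), an inline external
# target (__<uri>), or a single-word reference (word__).  The target
# alternative comes before the word alternative so "__<...>" is never
# mistaken for a word.
TOKEN = re.compile(
    r"`(?P<phrase>[^`]+)`__"
    r"|(?P<target>__<(?P<uri>[^>]+)>)"
    r"|(?P<word>\w+)__"
)


def bind_anonymous(text, out_of_line):
    """Pair anonymous references with targets (hypothetical sketch).

    An inline target binds to the nearest prior reference, which must not
    already have a target.  References left over are paired, in order,
    with the out-of-line "__ uri" targets given in ``out_of_line``.
    """
    refs = []  # [reference text, uri or None], in document order
    for match in TOKEN.finditer(text):
        if match.group("target"):
            if not refs or refs[-1][1] is not None:
                raise ValueError(
                    "inline target %r has no reference waiting"
                    % match.group("uri"))
            refs[-1][1] = match.group("uri")
        else:
            refs.append([match.group("phrase") or match.group("word"), None])
    pending = [ref for ref in refs if ref[1] is None]
    if len(pending) != len(out_of_line):
        raise ValueError(
            "%d anonymous references but %d anonymous targets"
            % (len(pending), len(out_of_line)))
    for ref, uri in zip(pending, out_of_line):
        ref[1] = uri
    return [tuple(ref) for ref in refs]


if __name__ == "__main__":
    sample = ("Here is a reference__ and here is a `second one`__ "
              "__<target1> and here is a third__")
    for pair in bind_anonymous(sample, ["target2", "target3"]):
        print(pair)
```

Under these assumptions, adding a new reference at the start of the paragraph shifts only the out-of-line pairings; the inline target stays attached to the reference it follows. The interleaved or nested form shown above is rejected because the second inline target finds its nearest prior reference already bound.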
-- David Goodger <goodger@users.sourceforge.net> Open-source projects: - Python Docutils: http://docutils.sourceforge.net/ (includes reStructuredText: http://docutils.sf.net/rst.html) - The Go Tools Project: http://gotools.sourceforge.net/ From goodger@users.sourceforge.net Thu Jul 11 03:19:36 2002 From: goodger@users.sourceforge.net (David Goodger) Date: Wed, 10 Jul 2002 22:19:36 -0400 Subject: [Doc-SIG] Summary of reference/target syntaxes Message-ID: <B9526378.25A27%goodger@users.sourceforge.net> Here is a summary of the current and proposed hyperlink syntaxes, with concrete examples, and my current thinking. 1. Named hyperlinks (in current reStructuredText):: This is a named reference_ of one word ("reference"). Here is a `phrase reference`_. Phrase references may even cross `line boundaries`_. .. _reference: http://www.example.org/reference/ .. _phrase reference: http://www.example.org/phrase_reference/ .. _line boundaries: http://www.example.org/line_boundaries/ Advantages: - The plaintext is readable. - Each target may be reused multiple times (e.g., just write "reference_" again). - No synchronized ordering of references and targets is necessary. Disadvantages: - The reference text must be repeated as target names; could lead to mistakes. - The target URLs may be located far from the references, and hard to find in the plaintext. 2. Anonymous hyperlinks (in current reStructuredText):: This is a named reference__ of one word ("reference"). Here is a `phrase reference`__. Phrase references may even cross `line boundaries`__. __ http://www.example.org/reference/ __ http://www.example.org/phrase_reference/ __ http://www.example.org/line_boundaries/ Advantages: - The plaintext is readable. - The reference text does not have to be repeated. Disadvantages: - References and targets must be kept in sync. - Targets cannot be reused. - The target URLs may be located far from the references. 3.
The proposed inline external target syntax:: This is a named reference__ __<http://www.example.org/ reference/> of one word ("reference"). Here is a `phrase reference`__ __<http://www.example.org/phrase_reference/>. Advantages: - The target is specified immediately adjacent to the reference, improving maintainability: - References and targets are easily kept in sync. - The reference text does not have to be repeated. Disadvantages: - Poor plaintext readability. - Targets cannot be reused (but see below). To alleviate the readability issue slightly, we could allow the target to appear later, such as after the end of the sentence:: This is a named reference__ of one word ("reference"). __<http://www.example.org/reference/> Here is a `phrase reference`__. __<http://www.example.org/phrase_reference/> This could only work for one reference at a time (reference/target pairs must be proximate [refA trgA refB trgB], not interleaved [refA refB trgA trgB] or nested [refA refB trgB trgA]). Perhaps this restriction is too onerous; then references and targets would have to be immediately adjacent. The above syntax is actually for "anonymous inline external targets", emphasized by the double underscores. It follows that single trailing & leading underscores would lead to implicitly named inline external targets. This would allow the reuse of targets by name. So after "reference_ _<target>", another "reference_" would point to the same target. 4. If it is best for references and inline external targets to be immediately adjacent, they might as well be integrated. Here's an alternative syntax embedding the target URL in the reference:: This is a named `reference <http://www.example.org/reference />`__ of one word ("reference"). Here is a `phrase reference <http://www.example.org/phrase_reference/>`__. Advantages and disadvantages are the same as in (3). Readability is still an issue, but the syntax is a bit less heavyweight.
There's a problem with this syntax: how to refer to a title like "HTML Anchors: <a>" (ending with an HTML/SGML/XML tag)? We could either require more syntax on the target (like "`reference text __<http://example.com/>`__"), or require the odd conflicting title to be escaped (like "`HTML Anchors: \<a>`__"). The latter seems preferable. Similarly to (3) above, a single trailing underscore would convert the reference & inline external target from anonymous to implicitly named, allowing reuse of targets by name. 5. For comparison and historical background, StructuredText has two syntaxes for hyperlinks. First, ``"reference text":URL``:: This is a named "reference":http://www.example.org/reference/ of one word ("reference"). Here is a "phrase reference":http://www.example.org/phrase_reference/. Second, ``"reference text", http://example.com/absolute_URL``:: This is a named "reference", http://www.example.org/reference/ of one word ("reference"). Here is a "phrase reference", http://www.example.org/phrase_reference/. Advantages: - The target is specified immediately adjacent to the reference. Disadvantages: - Poor plaintext readability. - Targets cannot be reused. - Both syntaxes use double quotes, common in ordinary text. - In the first syntax, the URL and the last word are stuck together, exacerbating the line wrap problem. - The second syntax is too magical; text could easily be written that way by accident (although only absolute URLs are recognized here, perhaps because of the potential for ambiguity). With any kind of inline external target syntax it comes down to the conflict between maintainability and plaintext readability. I don't see a major problem with reStructuredText's maintainability, and I don't want to sacrifice plaintext readability to "improve" it. The proponents of inline external targets want them for easily maintainable web pages. 
The arguments go something like this: - Named hyperlinks are difficult to maintain because the reference text is duplicated as the target name. To which I said, "So use anonymous hyperlinks." - Anonymous hyperlinks are difficult to maintain because the references and targets have to be kept in sync. "So keep the targets close to the references, grouped after each paragraph. Maintenance is trivial." - But targets grouped after paragraphs break the flow of text. "Surely less than URLs embedded in the text! And if the intent is to produce web pages, not readable plaintext, then who cares about the flow of text?" As is probably obvious, I'm ambivalent/against the proposed "inline external targets". I value reStructuredText's readability very highly, and although it may add some convenience, the "inline external target" syntax(es) compromise that readability IMO. Unless something changes (better syntax, new & better arguments and/or use cases), the best result this proposal can hope for is inclusion as "experimental syntax" via a pragma directive. -- David Goodger <goodger@users.sourceforge.net> Open-source projects: - Python Docutils: http://docutils.sourceforge.net/ (includes reStructuredText: http://docutils.sf.net/rst.html) - The Go Tools Project: http://gotools.sourceforge.net/ From Paul.Moore@atosorigin.com Thu Jul 11 09:39:45 2002 From: Paul.Moore@atosorigin.com (Moore, Paul) Date: Thu, 11 Jul 2002 09:39:45 +0100 Subject: [Doc-SIG] Summary of reference/target syntaxes Message-ID: <714DFA46B9BBD0119CD000805FC1F53B01B5B43E@UKRUX002.rundc.uk.origin-it.com> From: David Goodger [mailto:goodger@users.sourceforge.net] > Here is a summary of the current and proposed hyperlink syntaxes, > with concrete examples, and my current thinking. Thanks for the summary of the current state of play. I believe that it's fair. For what it's worth, I agree with your arguments.
Link-heavy [sections of] documents (which are basically what this proposal is addressing) are not, in general, common. Maybe there's a more specialised approach to take. As far as I can see, the type of link-heavy document sections which we are considering are generally some form of index (a HTML page of links, a table of references, or some such). I could be wrong, but *I* would find it hard reading such text if it wasn't somehow list-formatted. You can see where I'm headed... Why not, instead of inventing another (general) hyperlink format, use a directive. You could have something like (completely untested, off the top of my head, and probably irrational) .. link-list:: First link -- http://www.example.com/wherever Second link -- relative_link.html I can't think of an example where I would want to have frequent hypertext links in *running* text. And for the occasional one or two links per paragraph, relative links located between paragraphs would be fine. To be honest, the longer a paragraph is, the less I would want to see interleaved URLs (unless the URLs themselves were the link text, a case which is already covered by standalone hyperlinks). Maybe this is worth noting as a separate alternative - don't bother with a general construct, but implement a directive to handle the specific case you are trying to deal with. Paul From klm@zope.com Thu Jul 11 15:45:48 2002 From: klm@zope.com (Ken Manheimer) Date: Thu, 11 Jul 2002 10:45:48 -0400 (EDT) Subject: [Doc-SIG] References in the same line as the target text In-Reply-To: <B9526191.25A24%goodger@users.sourceforge.net> Message-ID: <Pine.LNX.4.44.0207111001530.31105-100000@korak.zope.com> On Wed, 10 Jul 2002, David Goodger wrote: > Ken Manheimer wrote: > > the idea of having to coordinate collections of references to > > distant collections of URLs, either by inventing intermediate names > > or by ordering anonymous correspondences, seems quite daunting. > > There's no "inventing" going on. 
reStructuredText simply uses the > reference text (the part of the text which would be highlighted as a > hyperlink in HTML) as a hyperlink name. The target (URL) is listed > out-of-line, with the same name. Ah - i thought there might be an option to "name" a reference, particularly to disambiguate references with identical text. Come to think of it, how does reST handle something like: You can find more info about life `here`_ and `here`_ ? I assume you're not forced to use different text for your references just so they unambiguously couple with their links. (That might not be so bad in my contrived example, but would be a nuisance for more distantly situated references that happen to have identical text.) > I think that if you try it, you'll at least get used to it if not get > to like it outright. It's different from the StructuredText syntax, > which I found next to unreadable *because* of the embedded URLs (they > break the flow of the plaintext). Simon's proposal is basically > asking to bring back the StructuredText way. The syntax differs a > bit, but the idea is the same. I really think the readability in this case is a judgement call, and the author should have the opportunity to choose the style they think is most appropriate for the situation. For instance, i personally often find footnote-style URL references to be a nuisance (i am actively annoyed by the Python PEP style of separating the reference from the URL, both in the source *and* the HTML rendering, for instance!) I think the issue is much like that for parenthetic versus footnote-style asides in regular text. Footnotes are *much* further removed than ("embedded") parenthesized notes, and a nuisance when the aside is fairly relevant to the flow. (Eg, the "embedded" aside in the previous sentence, or this sentence [#]_) URLs can be like that, with relevant information in their addresses, eg "whose site is the content on?", "is that the page i'm thinking of?", etc. 
> > The whole point of this construct, for me, would be to insulate the > > person creating the references from changes anywhere except the > > intervening space between the reference and the link. > > That's a valid goal, but conflicts with reStructuredText's goal of > readability. It's a tough call. I think it may be a judgement call, that could be left to the author. I know it would be for my own tastes. .. [#] Don't you love self-referentiality?-) -- Ken klm@zope.com From goodger@users.sourceforge.net Fri Jul 12 05:38:43 2002 From: goodger@users.sourceforge.net (David Goodger) Date: Fri, 12 Jul 2002 00:38:43 -0400 Subject: [Doc-SIG] Summary of reference/target syntaxes In-Reply-To: <714DFA46B9BBD0119CD000805FC1F53B01B5B43E@UKRUX002.rundc.uk.origin-it.com> Message-ID: <B953D592.25AE6%goodger@users.sourceforge.net> Paul Moore wrote: > Why not, instead of inventing another (general) hyperlink format, > use a directive. You could have something like (completely untested, > off the top of my head, and probably irrational) > > .. link-list:: > First link -- http://www.example.com/wherever > Second link -- relative_link.html I don't see how this is useful (maybe I just don't "get it"). Could you expand on your example, with context (show the references as well)? 
-- David Goodger <goodger@users.sourceforge.net> Open-source projects: - Python Docutils: http://docutils.sourceforge.net/ (includes reStructuredText: http://docutils.sf.net/rst.html) - The Go Tools Project: http://gotools.sourceforge.net/ From goodger@users.sourceforge.net Fri Jul 12 05:39:24 2002 From: goodger@users.sourceforge.net (David Goodger) Date: Fri, 12 Jul 2002 00:39:24 -0400 Subject: [Doc-SIG] References in the same line as the target text In-Reply-To: <Pine.LNX.4.44.0207111001530.31105-100000@korak.zope.com> Message-ID: <B953D5BB.25AE7%goodger@users.sourceforge.net> Ken Manheimer wrote: > Come to think of it, how does reST handle something like: > > You can find more info about life `here`_ and `here`_ > > ? I assume you're not forced to use different text for your > references just so they unambiguously couple with their links. > (That might not be so bad in my contrived example, but would be a > nuisance for more distantly situated references that happen to have > identical text.) Actually, named references *do* have to have different text if they're to refer to different targets. In a situation like the above, currently you'd have to use anonymous hyperlinks like this:: You can find more info about life here__ and here__ __ http://www.example.com/first __ http://www.example.com/second Using the embedded variation of inline external targets (#4 in the "summary" post), it would look like this:: You can find more info about life `here <http://www.example.com/first>`__ and `here <http://www.example.com/second>`__ BTW, you don't need to use backquotes for single-word references ("`here`_"), although they don't hurt. > I really think the readability in this case is a judgement call, and > the author should have the opportunity to choose the style they > think is most appropriate for the situation. Noted. But on the other hand, we can't cater to everyone's taste; the result would be an uncoordinated mess. I'm torn on this issue. 
The proposed syntax offers convenience, but I don't know if the convenience is worth the cost in ugliness. Perhaps if the syntax is *allowed* but its use strongly *discouraged*, for aesthetic reasons? Or would that just be hypocritical? Dilemma. -- David Goodger <goodger@users.sourceforge.net> Open-source projects: - Python Docutils: http://docutils.sourceforge.net/ (includes reStructuredText: http://docutils.sf.net/rst.html) - The Go Tools Project: http://gotools.sourceforge.net/ From klm@zope.com Fri Jul 12 14:40:04 2002 From: klm@zope.com (Ken Manheimer) Date: Fri, 12 Jul 2002 09:40:04 -0400 (EDT) Subject: [Doc-SIG] References in the same line as the target text In-Reply-To: <B953D5BB.25AE7%goodger@users.sourceforge.net> Message-ID: <Pine.LNX.4.44.0207120911200.31105-100000@korak.zope.com> On Fri, 12 Jul 2002, David Goodger wrote: > Ken Manheimer wrote: > > I really think the readability in this case is a judgement call, and > > the author should have the opportunity to choose the style they > > think is most appropriate for the situation. > Noted. But on the other hand, we can't cater to everyone's taste; the > result would be an uncoordinated mess. I'm torn on this issue. The > proposed syntax offers convenience, but I don't know if the > convenience is worth the cost in ugliness. Perhaps if the syntax is > *allowed* but its use strongly *discouraged*, for aesthetic reasons? > Or would that just be hypocritical? Dilemma. I'm also torn about this - we may have reached the same level of ambivalence, coming from opposite sides.-) One point i'd make about "discouraging for aesthetic reasons" - you could consider that people's choosing to use it or not _conveys_ their aesthetic judgement. The concern i see is having the extraneous syntax if people choose not to use it, getting in the way of people finding and learning the things they do want to use.
Earlier in the cited message: > Using the embedded variation of inline external targets (#4 in the > "summary" post), it would look like this:: > > You can find more info about life `here > <http://www.example.com/first>`__ and `here > <http://www.example.com/second>`__ I've lost track of the summary post, and i apologize if good reasons for dismissing this alternative have already been covered: here_(http://www.python.org) or `sometimes here`_(http://www.zope.org), I think it is pretty natural for people to put URLs in parens this way, in regular text. The act of deciding between making it appear in the HTML as a reference link or just some text followed by an explicit link in parens is practically just adding the '_' underscore. (At least for the single word case - there's the backticks for the multiple word case, but the spirit is the same.) I think it is exceptionally intuitively obvious, and actually like how it looks (which i can't say about #4 syntax above). Like i said, i'm also torn about this. The here_(link) syntax seems so unobjectionable to me that it makes it easy for me to lean to the dark side (possibly extraneous syntax), but i lost track of the objections, so may well be missing something crucial...
-- Ken klm@zope.com From Simon.Budig@unix-ag.org Fri Jul 12 14:56:02 2002 From: Simon.Budig@unix-ag.org (Simon Budig) Date: Fri, 12 Jul 2002 15:56:02 +0200 Subject: [Doc-SIG] References in the same line as the target text In-Reply-To: <Pine.LNX.4.44.0207120911200.31105-100000@korak.zope.com>; from klm@zope.com on Fri, Jul 12, 2002 at 09:40:04AM -0400 References: <B953D5BB.25AE7%goodger@users.sourceforge.net> <Pine.LNX.4.44.0207120911200.31105-100000@korak.zope.com> Message-ID: <20020712155602.A33474@vmax.unix-ag.uni-siegen.de> Ken Manheimer (klm@zope.com) wrote: > On Fri, 12 Jul 2002, David Goodger wrote: > I've lost track of the summary post, and i apologize if good reasons > for dismissing this alternative have already been covered: > > here_(http://www.python.org) or `sometimes here`_(http://www.zope.org), The reasons that were given were: "Too subtle" and "closing parentheses are valid URI characters": http://www.fo0.de/blah)foo might be a valid URL and would give problems with the syntax mentioned above. The latter is comprehensible to me. I might have forgotten other concerns. Bye, Simon -- Simon.Budig@unix-ag.org http://www.home.unix-ag.org/simon/ From goodger@users.sourceforge.net Sat Jul 13 04:16:18 2002 From: goodger@users.sourceforge.net (David Goodger) Date: Fri, 12 Jul 2002 23:16:18 -0400 Subject: [Doc-SIG] References in the same line as the target text In-Reply-To: <Pine.LNX.4.44.0207120911200.31105-100000@korak.zope.com> Message-ID: <B95513C2.25BC5%goodger@users.sourceforge.net> Ken Manheimer wrote: > One point i'd make about "discouraging for aesthetic reasons" - you > could consider that people's choosing to use it or not _conveys_ > their aesthetic judgement. It's not *their* aesthetic judgement I'm worried about, it's *mine*. I've got to look at the big picture. In designing this markup, I'm saying to the world, "This is my best effort; this is what I think strikes the best balance between readability, practicality, functionality, etc."
Of course I'm extremely grateful for input from everyone participating, but -- like Guido with Python -- the buck stops here. My "spider sense" is still tingling on this issue. > The concern i see is having the extraneous syntax if people choose > not to use it, getting in the way of people finding and learning the > things they do want to use. Thus my idea of hiding it as an explicitly activated extension. That doesn't help the poor sap who has to *read* the plaintext though. > I've lost track of the summary post, and i apologize if good reasons > for dismissing this alternative have already been covered: > > here_(http://www.python.org) or `sometimes > here`_(http://www.zope.org), I took Wednesday's "Summary of reference/target syntaxes" Doc-SIG post, added references and excerpts from discussions to date, and added it to "A Record of reStructuredText Syntax Alternatives". The main objection with ``"`reference`_(target)"`` was the unprecedented "infix" syntax (significant syntax *in the middle* of the construct). I won't repeat it all here; please take a look: http://docutils.sf.net/spec/rst/alternatives.html#inline-external-targets Simon's original syntax (``"reference_(target)"``) is alternative 1, the split compromise (``"reference__ __<target>"``) is alternative 2, and the embedded syntax (``"`reference <target>`__"``) is alternative 3. I currently consider #3 to be the least objectionable. > I think it is pretty natural for people to put URLs in parens this > way, in regular text. Agreed, if you're saying "See the Python homepage (http://www.python.org)". But if the URL is meant to be hidden (an active link), or if it's a two-line monstrosity, it makes the plaintext painfully hard to read.
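To make the embedded alternative (``"`reference <target>`__"``) more concrete, here is a rough sketch of how such a construct could be split into reference text and URI, including the backslash-escape suggested earlier in the thread for titles that end in an angle-bracketed tag. The names ``EMBEDDED``, ``TRAILING_URI`` and ``parse_embedded`` are hypothetical, and dropping whitespace from a URI that was wrapped across lines is an assumption of this sketch, not settled behavior.

```python
import re

# A backquoted phrase followed by "__"; backslash escapes are allowed inside.
EMBEDDED = re.compile(r"`(?P<body>(?:\\.|[^`\\])+)`__")

# Within the phrase: reference text, then "<uri>" at the very end, where the
# opening "<" is not preceded by a backslash.
TRAILING_URI = re.compile(
    r"^(?P<text>.*?)\s*(?<!\\)<(?P<uri>[^<>]+)>$", re.DOTALL)


def parse_embedded(source):
    """Return (reference text, uri or None) pairs (hypothetical sketch)."""
    results = []
    for match in EMBEDDED.finditer(source):
        body = match.group("body")
        found = TRAILING_URI.match(body)
        if found:
            text = found.group("text")
            # Assumption: whitespace inside the URI comes from line
            # wrapping in the plaintext and can be dropped.
            uri = "".join(found.group("uri").split())
        else:
            text, uri = body, None
        text = re.sub(r"\\(.)", r"\1", text)  # remove backslash escapes
        text = " ".join(text.split())         # normalize wrapped whitespace
        results.append((text, uri))
    return results


if __name__ == "__main__":
    sample = ("You can find more info about life `here\n"
              "<http://www.example.com/first>`__ and `here\n"
              "<http://www.example.com/second>`__")
    for pair in parse_embedded(sample):
        print(pair)
    print(parse_embedded(r"`HTML Anchors: \<a>`__"))
```

Because the target sits inside the backquotes, two references with identical text ("here" and "here") can point to different URIs without any out-of-line bookkeeping, which is the case raised earlier in the thread.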
-- David Goodger <goodger@users.sourceforge.net> Open-source projects: - Python Docutils: http://docutils.sourceforge.net/ (includes reStructuredText: http://docutils.sf.net/rst.html) - The Go Tools Project: http://gotools.sourceforge.net/ From goodger@users.sourceforge.net Sat Jul 20 04:33:53 2002 From: goodger@users.sourceforge.net (David Goodger) Date: Fri, 19 Jul 2002 23:33:53 -0400 Subject: [Doc-SIG] Updates to Docutils Message-ID: <B95E5260.26001%goodger@users.sourceforge.net> Download the latest snapshot from: http://docutils.sf.net/docutils-snapshot.tgz The sandbox snapshot is also available: http://docutils.sf.net/docutils-sandbox-snapshot.tgz Lots of changes in the last few weeks: * Added support for configuration files. Config files (/etc/docutils.conf, ./docutils.conf, ~/.docutils) override application defaults, and command-line options override all. Documentation is pending. * Added "simple tables" to reStructuredText (spec and parser): ===== ===== ====== Inputs Output ------------ ------ A B A or B ===== ===== ====== False False False True False True False True True True True True ===== ===== ====== * Improved HTML output with many small changes. * Added PEP processing support: - tools/pep.py: Front-end for processing reStructuredText PEPs. - tools/pep2html.py: Based on the existing script (in CVS at python/nondist/peps); processes old-style *and* reStructuredText PEPs. - docutils/writers/pep_html.py: HTML Writer for PEPs (subclass of ``html4css1.Writer``). * Updated the PEPs: - A "Roadmap to the Docstring PEPs" section was added to PEP 256: http://docutils.sf.net/spec/pep-0256.html - PEP 258 has been reorganized and extensively updated: http://docutils.sf.net/spec/pep-0258.html - PEP 287: updated, renamed to "reStructuredText Docstring Format" and converted to reStructuredText format (provisionally, pending PythonLabs OK). 
Check out the processed PEP 287: http://docutils.sf.net/spec/pep-0287.html I also converted PEP 0 as a proof of concept because of its special processing: http://docutils.sf.net/spec/pep-0000.html Late breaking news: I just got a reply from Guido, in which he critiques some style issues, then says, "I'm sure you can fix all these things with a simple style sheet change, and then I'm all for allowing Docutils for PEPs." I'd appreciate more critiques on PEP formatting issues, no matter how small. Especially, please help with stylesheet issues with the various browsers, by comparing the reStructuredText PEPs above to the old-style versions: http://www.python.org/peps/pep-0287.html http://www.python.org/peps/pep-0000.html * Added Docutils-native XML output: - tools/docutils-xml.py: A front-end. - docutils/writers/docutils_xml.py: A Writer. * Unicode & encodings are working. I hope to release Docutils version 0.2 in a week or two, incorporating Swedish and German language patches I've received and *maybe* the "inline external targets" syntax in some form. After that, I plan to focus on Python autodocumentation: analyze Tony Ibbs' PySource code and Doug Hellman's HappyDoc project, take the best ideas and integrate them into Docutils 0.3. -- David Goodger <goodger@users.sourceforge.net> Open-source projects: - Python Docutils: http://docutils.sourceforge.net/ (includes reStructuredText: http://docutils.sf.net/rst.html) - The Go Tools Project: http://gotools.sourceforge.net/ From Simon.Budig@unix-ag.org Sat Jul 20 19:18:04 2002 From: Simon.Budig@unix-ag.org (Simon Budig) Date: Sat, 20 Jul 2002 20:18:04 +0200 Subject: [Doc-SIG] Summary of reference/target syntaxes In-Reply-To: <B9526378.25A27%goodger@users.sourceforge.net>; from goodger@users.sourceforge.net on Wed, Jul 10, 2002 at 10:19:36PM -0400 References: <B9526378.25A27%goodger@users.sourceforge.net> Message-ID: <20020720201803.A49754@vmax.unix-ag.uni-siegen.de> Hi all. Sorry for not replying earlier. 
I was busy with semester ending stuff... David Goodger (goodger@users.sourceforge.net) wrote: [...] > To alleviate the readability issue slightly, we could allow the > target to appear later, such as after the end of the sentence:: > > This is a named reference__ of one word ("reference"). > __<http://www.example.org/reference/> Here is a `phrase > reference`__. __<http://www.example.org/phrase_reference/> > > This could only work for one reference at a time (reference/target > pairs must be proximate [refA trgA refB trgB], not interleaved > [refA refB trgA trgB] or nested [refA refB trgB trgA]). Perhaps > this restriction is too onerous; then references and targets would > have to be imediately adjacent. There is also a problem. __<link> has to be detected as a construct independantly from other syntactic constructs. So the parser needs to do some evil poking in the internal data structure, find the latest anonymous link, resolve it and remove it from the list of anonymous links. Also: What happens if there has not been a anonymous link earlier? I don't like the splitting up of the inline reference in two constructs. > 4. If it is best for references and inline external targets to be > immediately adjacent, they might as well be integrated. Here's an > alternative syntax embedding the target URL in the reference:: > > This is a named `reference <http://www.example.org/reference > />`__ of one word ("reference"). Here is a `phrase reference > <http://www.example.org/phrase_reference/>`__. > > Advantages and disadvantages are the same as in (3). Readability > is still an issue, but the syntax is a bit less heavyweight. > > There's a problem with this syntax: how to refer to a title like > "HTML Anchors: <a>" (ending with an HTML/SGML/XML tag)? We could > either require more syntax on the target (like "`reference text > __<http://example.com/>`__"), or require the odd conflicting title > to be escaped (like "`HTML Anchors: \<a>`__"). The latter seems > preferable. 
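The title conflict described in the quoted text (a reference such as "HTML Anchors: <a>" that ends with something that looks like a target) and the backslash-escape remedy can be sketched as follows. This is a hedged illustration, not code from states.py: the function name ``split_reference`` and its exact rules (a target must be the last thing in the reference and must be preceded by a space) are assumptions made for the example.

```python
def split_reference(content):
    """Split the text between backquotes into (reference text, target).

    The target is None when the text carries no embedded target.
    """
    content = " ".join(content.split())  # undo line wrapping
    if content.endswith(">"):
        start = content.rfind("<")
        # " <" opens a target; "\<" is an escaped, literal "<" in the title.
        if start > 0 and content[start - 1] == " ":
            # Whitespace left inside a wrapped URL is not part of the URL.
            target = content[start + 1:-1].replace(" ", "")
            return content[:start].rstrip(), target
    # No embedded target: remove the escaping backslash, keep the text.
    return content.replace("\\<", "<"), None


if __name__ == "__main__":
    print(split_reference("phrase reference <http://www.example.org/phrase_reference/>"))
    print(split_reference("HTML Anchors: \\<a>"))
```

Without the backslash, "HTML Anchors: <a>" would be read by this sketch as the text "HTML Anchors:" with target "a", which is exactly the ambiguity under discussion.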
Hmm. What about "`name`<link>__" ? Then you could easily do `HTML Anchors: <a>`<anchors.html>__ . I think it is a bit weird to include the target inside some quoted text. I have updated the states.py in the sandbox to keep up with the latest changes. Bye, Simon -- Simon.Budig@unix-ag.org http://www.home.unix-ag.org/simon/ From goodger@users.sourceforge.net Sun Jul 21 20:57:43 2002 From: goodger@users.sourceforge.net (David Goodger) Date: Sun, 21 Jul 2002 15:57:43 -0400 Subject: [Doc-SIG] Summary of reference/target syntaxes In-Reply-To: <20020720201803.A49754@vmax.unix-ag.uni-siegen.de> Message-ID: <B9608A75.260D1%goodger@users.sourceforge.net> >David Goodger (goodger@users.sourceforge.net) wrote: >> To alleviate the readability issue slightly, we could allow the >> target to appear later, such as after the end of the sentence:: >> >> This is a named reference__ of one word ("reference"). >> __<http://www.example.org/reference/> Here is a `phrase >> reference`__. __<http://www.example.org/phrase_reference/> >> >> This could only work for one reference at a time >> (reference/target pairs must be proximate [refA trgA refB trgB], >> not interleaved [refA refB trgA trgB] or nested [refA refB trgB >> trgA]). Perhaps this restriction is too onerous; then >> references and targets would have to be imediately adjacent. Simon Budig wrote: > There is also a problem. __<link> has to be detected as a construct > independantly from other syntactic constructs. So the parser needs > to do some evil poking in the internal data structure, find the > latest anonymous link, resolve it and remove it from the list of > anonymous links. If this construct were chosen, and the internal data structures couldn't support it easily, then that is a deficiency in the internal data structures. Nothing "evil" about it. It's better to judge syntax at the conceptual and usage level rather than at the implementation level. (Does the syntax make sense? Does it fit in with the existing constructs? 
Does it look natural? Is it easy to use?). Look at tables; compared to other constructs they're a big pain to parse, but the syntax is *right*, so I endured the pain and wrote the parsing code. > Also: What happens if there has not been a anonymous link earlier? That would be an error, like a `named reference`_ without a corresponding target. This is definitely a problem at the conceptual/usage level. > I don't like the splitting up of the inline reference in two > constructs. IMO, "`reference <target>`__" is better than "`reference`__ __<target>", which is better than "`reference`__<target>". So a single construct is currently on top. >> 4. If it is best for references and inline external targets to be >> immediately adjacent, they might as well be integrated. Here's >> an alternative syntax embedding the target URL in the reference:: >> >> This is a named `reference <http://www.example.org/reference >> />`__ of one word ("reference"). Here is a `phrase reference >> <http://www.example.org/phrase_reference/>`__. >> >> Advantages and disadvantages are the same as in (3). >> Readability is still an issue, but the syntax is a bit less >> heavyweight. >> >> There's a problem with this syntax: how to refer to a title like >> "HTML Anchors: <a>" (ending with an HTML/SGML/XML tag)? We >> could either require more syntax on the target (like "`reference >> text __<http://example.com/>`__"), or require the odd >> conflicting title to be escaped (like "`HTML Anchors: \<a>`__"). >> The latter seems preferable. > > Hmm. What about "`name`<link>__" ? Then you could easily do `HTML > Anchors: <a>`<anchors.html>__ . That's a minor issue. I think the likelihood of a reference with a <tag> at the end is sufficiently small that it's not worth worrying about. A "`name`<link>__" construct has the same line wrapping problem as "`name`__<link>". That *is* worth worrying about. 
I see no significant advantage moving from "`name <link>`__" to "`name`<link>__", but I see a significant disadvantage. > I think it is a bit weird to include the target inside some quoted > text. I don't think it's any worse than the inline external targets idea itself. The target is simply a sub-construct within (and in the context of) the reference construct. Having the target inside the text has the advantage of allowing line-wrapping for free (it's already there). We just have to interpret a substring. > I have updated the states.py in the sandbox to keep up with the > latest changes. As a proof of concept of pragma directives, I'll work on converting that soon. But I still have strong reservations. Simon, back on July 2nd you showed us a portion of the Docutils' home page source to illustrate "the uglyness of anonymous and named links". You're proposing inline external targets as a solution. To help convince me & others, please show us a before & after example: text marked up with the existing constructs (current syntax), and with the new proposed syntax. -- David Goodger <goodger@users.sourceforge.net> Open-source projects: - Python Docutils: http://docutils.sourceforge.net/ (includes reStructuredText: http://docutils.sf.net/rst.html) - The Go Tools Project: http://gotools.sourceforge.net/ From Simon.Budig@unix-ag.uni-siegen.de Mon Jul 22 01:30:54 2002 From: Simon.Budig@unix-ag.uni-siegen.de (Simon Budig) Date: Mon, 22 Jul 2002 02:30:54 +0200 Subject: [Doc-SIG] Summary of reference/target syntaxes In-Reply-To: <B9608A75.260D1%goodger@users.sourceforge.net>; from goodger@users.sourceforge.net on Sun, Jul 21, 2002 at 03:57:43PM -0400 References: <20020720201803.A49754@vmax.unix-ag.uni-siegen.de> <B9608A75.260D1%goodger@users.sourceforge.net> Message-ID: <20020722023054.A37280@vmax.unix-ag.uni-siegen.de> David Goodger (goodger@users.sourceforge.net) wrote: > Simon Budig wrote: > > I don't like the splitting up of the inline reference in two > > constructs. 
> > IMO, "`reference <target>`__" is better than "`reference`__ > __<target>", which is better than "`reference`__<target>". So a > single construct is currently on top. > [...] > > Hmm. What about "`name`<link>__" ? Then you could easily do `HTML > > Anchors: <a>`<anchors.html>__ . > > That's a minor issue. I think the likelihood of a reference with a > <tag> at the end is sufficiently small that it's not worth worrying > about. A "`name`<link>__" construct has the same line wrapping > problem as "`name`__<link>". That *is* worth worrying about. I see > no significant advantage moving from "`name <link>`__" to > "`name`<link>__", but I see a significant disadvantage. Hmm - how would this work with simple-word names? name <link>__ ? What about when only partial words are links? This is especially interesting for german where we have lots of compound words. `Hyper <http://www.hyper.org>`__link vs. `Hyper`<http://www.hyper.org>__link I can understand your concerns regarding linebreak issues. However, when preparing the examples below I often found the automatic linewrap done by my editor to be utterly ugly, since the individual line lengths differed wildly from line to line. I ended up in breaking the line inside the URLs, to make the paragraph look more homogenous. The additional space between reference and link was in about 50% of all cases not helpful at all. > > I think it is a bit weird to include the target inside some quoted > > text. > > I don't think it's any worse than the inline external targets idea > itself. The target is simply a sub-construct within (and in the > context of) the reference construct. Having the target inside the > text has the advantage of allowing line-wrapping for free (it's > already there). We just have to interpret a substring. > > > I have updated the states.py in the sandbox to keep up with the > > latest changes. > > As a proof of concept of pragma directives, I'll work on converting > that soon. 
But I still have strong reservations. Simon, back on July > 2nd you showed us a portion of the Docutils' home page source to > illustrate "the uglyness of anonymous and named links". You're > proposing inline external targets as a solution. To help convince me > & others, please show us a before & after example: text marked up with > the existing constructs (current syntax), and with the new proposed > syntax. Ok. Due to the size I don't attach these files. Please have a look at http://www.home.unix-ag.org/simon/files/rst/. "stuartlittle" is a text from yahoo. The use of anonymous links is questionable in this case, since the references usually are direct mappings to actor or movie names. But this is not always the case and the maintenance problem is obvious: You have lots of slightly different links with non narrative URLs. Reordering sentences will create headache. The "wochenschau" is a similiar example. It is cited from www.heise.de, but is unfortunately in german. Here you have the very same problem: You have huge paragraphs and links sparkled within. Since the links often just point to something that the author just crossed the mind while writing the article, the links do not necessarily have a direct connection to the reference name. Since the paragraphs are quite huge it is hard to find the matching URL. Named links in this case are not that helpful, since the names in this text are highly context sensitive and listing them prominently even might lead to wrong associations. The last example "gimplinks" is from my very own homepage and IMHO shows clearly, why inline links could add real value to rest. If you look around a bit you'll see that there are lots of pages out there, where this kind of link lists are used, just as an other example I'd like to point to the python homepage. Due to the length of the reference texts it is not practical to have them twice in the file (for named links) and anonymous links make reordering a list very hard. 
Inline links enable you to easily reorder list items and freely change the text of the reference. This kind of pages really would gain from this kind of use of inline references. I already pointed out, that I currently care more for maintainability than for readability and I am willing to accept a bit less readability for the first two samples to get an additional degree of freedom in moving text around. Of course the author of the text can freely select his preferred syntax. However, in the third sample the readability and maintainability is both increased. I think, inline references would be a useful addition to structured text. Bye, Simon -- Simon.Budig@unix-ag.org http://www.home.unix-ag.org/simon/ From goodger@users.sourceforge.net Mon Jul 22 03:22:48 2002 From: goodger@users.sourceforge.net (David Goodger) Date: Sun, 21 Jul 2002 22:22:48 -0400 Subject: [Doc-SIG] Summary of reference/target syntaxes In-Reply-To: <20020722023054.A37280@vmax.unix-ag.uni-siegen.de> Message-ID: <B960E4B7.260D4%goodger@users.sourceforge.net> Simon Budig wrote: > Hmm - how would this work with simple-word names? > name <link>__ ? I think not. The inline external target is part of the reference text, so there are no simple-word cases; a "phrase reference" is implied, and backquotes are required. Without backquotes the association between reference text and target URL would be too magical. So the syntax would be "`name <link>`__". > What about when only partial words are links? Partial words are not supported by reStructuredText. Inline markup is at the word or phrase level, not at the character level; adjacent whitespace or punctuation is crucial. See the introductory text of http://docutils.sf.net/spec/rst/reStructuredText.html#inline-markup, especially the inline markup recognition rules. > I can understand your concerns regarding linebreak issues. 
However, > when preparing the examples below I often found the automatic > linewrap done by my editor to be utterly ugly, since the individual > line lengths differed wildly from line to line. I ended up in > breaking the line inside the URLs, to make the paragraph look more > homogenous. The additional space between reference and link was in > about 50% of all cases not helpful at all. Which implies that in about 50% of all cases, the space *was* helpful. That's more than good enough for me. >> To help convince me & others, please show us a before & after >> example: text marked up with the existing constructs (current >> syntax), and with the new proposed syntax. > > Ok. Due to the size I don't attach these files. Please have a look > at http://www.home.unix-ag.org/simon/files/rst/. Thank you. I'll take a look at these soon, and I invite opinions from others. -- David Goodger <goodger@users.sourceforge.net> Open-source projects: - Python Docutils: http://docutils.sourceforge.net/ (includes reStructuredText: http://docutils.sf.net/rst.html) - The Go Tools Project: http://gotools.sourceforge.net/ From goodger@users.sourceforge.net Wed Jul 24 03:53:19 2002 From: goodger@users.sourceforge.net (David Goodger) Date: Tue, 23 Jul 2002 22:53:19 -0400 Subject: [Doc-SIG] compact HTML output from Docutils Message-ID: <B9638EDE.26271%goodger@users.sourceforge.net> I've tweaked Docutils' HTML writer to produce more visually compact HTML (less vertical whitespace). HTML's mixed content models allow list items to contain "<li><p>body elements</p></li>" or "<li>just text</li>" or even "<li>text<p>and body elements</p>combined</li>", each with different effects. I'd prefer to stick with strict body elements in list items, but they affect vertical spacing in browsers (although they really shouldn't). Does anybody know of a good discussion of vertical space issues for HTML and/or CSS? 
After trying various algorithms, I've settled on a hybrid:

- Check for and omit <p> tags in "simple" lists: list items contain
  either a single paragraph, a nested simple list, or a paragraph
  followed by a nested simple list.  This means that this list can be
  compact:

      - Item 1.
      - Item 2.

  But this list cannot be compact:

      - Item 1.

        This second paragraph forces space between list items.

      - Item 2.

- In non-list contexts, omit <p> tags on a paragraph if that paragraph
  is the only child of its parent (footnotes & citations are allowed a
  label first).

- Regardless of the above, in definitions, table cells, field bodies,
  option descriptions, and list items, mark the first child with
  'class="first"' if it is a paragraph.  The stylesheet sets the top
  margin to 0 for these paragraphs.

I'd appreciate feedback, especially from people using different
browsers (I've checked IE and Mozilla on both MacOS & Win2K, and
Netscape 4 & 6 on MacOS).  Please browse the following files:

- http://docutils.sf.net/tools/test.html
- http://docutils.sf.net/spec/pep-0287.html
- http://docutils.sf.net/spec/pep-0000.html

Is the above approach correct?  Does the generated HTML come out
right?  Should the compact output be optional (i.e. should there be an
option to turn it off)?  Any suggestions?

Get your snapshots here:
http://docutils.sf.net/docutils-snapshot.tgz

--
David Goodger  <goodger@users.sourceforge.net>  Open-source projects:
  - Python Docutils: http://docutils.sourceforge.net/
    (includes reStructuredText: http://docutils.sf.net/rst.html)
  - The Go Tools Project: http://gotools.sourceforge.net/

From tony@lsl.co.uk  Wed Jul 24 09:56:32 2002
From: tony@lsl.co.uk (Tony J Ibbs (Tibs))
Date: Wed, 24 Jul 2002 09:56:32 +0100
Subject: [Doc-SIG] compact HTML output from Docutils
In-Reply-To: <B9638EDE.26271%goodger@users.sourceforge.net>
Message-ID: <01af01c232f0$01fec990$545aa8c0@lslp862.int.lsl.co.uk>

Of course, this is a doomed exercise (doomed, I tell you), but a brave
one nonetheless...
David Goodger wrote: > I'd appreciate feedback, especially from people using different > browsers (I've checked IE and Mozilla on both MacOS & Win2K, and > Netscape 4 & 6 on MacOS). Please browse the following files: Hmm - well, I'll try to remember to look at the pages with Opera and Galeon on Debian some evening (at home). > - http://docutils.sf.net/tools/test.html > - http://docutils.sf.net/spec/pep-0287.html > - http://docutils.sf.net/spec/pep-0000.html > > Is the above approach correct? Does the generated HTML come out > right? Should the compact output be optional (i.e. should there be an > option to turn it off)? Any suggestions? The last two look very good to me (using IE 5.50 on Windows/NT). In particular, I think that the handling of list spacing in the second looks good. The first exposes some niggles (well, hardly surprising): * In "Bullet Lists", and the other examples with sublists, the gap around each internal list seems a bit big. This is a minor grumble as it is plainly difficult to decide what to do here. * In "Definition Lists", there is no extra space between the first item and the second - done in bad ASCII art, the effect is:: Term Definition Term: classifier Definition paragraph 1. Definition paragraph 2. Clearly some of the same work done for lists needs promulgating here. * In "Field Lists", the effect is not so bad, but the vertical gap between the first item and the second is about half the size of that between the two paragraphs of the second item. * In "Option Lists", there is again the problem of "internal paragraphs" - so the separation between the paragraphs in ``--very-long-option`` is much greater than the separation between each option. But, as I said, for "normal" documents, it's looking rather fine - I like the handling of literal blocks (with the grey background). Admonition boxes are a bit oversized for their content, but since we're not meant to be generating those (!) I'm much less fussed about that. 
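For readers following the list-spacing discussion, the "simple list" test from David's message (every item is a single paragraph, a nested simple list, or a paragraph followed by a nested simple list) can be sketched roughly like this. The data model is an assumption made for the example: each list item is a Python list of ``(kind, value)`` pairs, which is not how Docutils represents its document tree.

```python
def is_simple_list(items):
    """Return True if the list may be rendered compactly (without <p> tags).

    `items` is a list of list items; each item is a list of
    (kind, value) children, where kind is "paragraph" or "list" and the
    value of a "list" child is itself a list of items.
    """
    allowed = (["paragraph"], ["list"], ["paragraph", "list"])
    for children in items:
        kinds = [kind for kind, _ in children]
        if kinds not in allowed:
            return False
        for kind, value in children:
            # Nested lists must be simple too, or the outer list is not.
            if kind == "list" and not is_simple_list(value):
                return False
    return True


if __name__ == "__main__":
    compact = [[("paragraph", "Item 1.")], [("paragraph", "Item 2.")]]
    print(is_simple_list(compact))
```

The same recursive shape would presumably apply if the rule were extended to definition, field, and option lists, which is where the uneven gaps noted above appear.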
Tibs

--
Tony J Ibbs (Tibs)      http://www.tibsnjoan.co.uk/
Give a pedant an inch and they'll take 25.4mm
(once they've established you're talking a post-1959 inch, of course)
My views! Mine! Mine! (Unless Laser-Scan ask nicely to borrow them.)

From goodger@users.sourceforge.net  Wed Jul 24 13:37:45 2002
From: goodger@users.sourceforge.net (David Goodger)
Date: Wed, 24 Jul 2002 08:37:45 -0400
Subject: [Doc-SIG] Comments on "inline external targets" example texts
In-Reply-To: <20020722023054.A37280@vmax.unix-ag.uni-siegen.de>
Message-ID: <B96417D7.2627C%goodger@users.sourceforge.net>

In order to further the cause of inline external targets, Simon Budig
created some "before & after" example texts at
http://www.home.unix-ag.org/simon/files/rst/.  Files ending with
"1.txt" are marked up using inline external target syntax ("inline
versions"), and those ending with "2.txt" are marked up using current
syntax.

For reference, the three syntax alternatives for the proposed
extension are:

1. Simon's original proposal::

       `reference text`_(target/URL)

2. Compromise::

       `reference text`__ __<target/URL>

3. Integrated::

       `reference text <target/URL>`__

The alternatives and arguments to date are summarized here:
http://docutils.sf.net/spec/rst/alternatives.html#inline-external-targets

The first thing I noticed about the inline versions of the example
texts was that the syntax alternative 2 was used (probably because
that's the syntax that Simon's modified states.py recognizes).
Looking at lots of such syntax in context, it's obvious that the line
noise is excessive and seriously harms readability (the underscores,
"__ __", seem to attract the eye).  Alternative 3 is much better: less
line noise and a more natural feel.  I converted the inline versions
to use this syntax; along with Simon's originals, the new files
(ending with "3.txt") are online at
http://docutils.sf.net/sandbox/simonb/examples/.
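A conversion from alternative 2 to alternative 3 like the one described above could be automated along these lines. This is only a sketch of one possible approach and not the procedure actually used for the "3.txt" files: the pattern, the name ``to_integrated``, and the assumptions that the target immediately follows its reference and that one-word references consist of letters and digits only are all made up for the example.

```python
import re

# Alternative 2: a reference immediately followed by "__<target>".
# Hypothetical pattern; one-word references are assumed to be alphanumeric.
COMPROMISE = re.compile(
    r"(?:`(?P<phrase>[^`]+)`|(?P<word>\b[A-Za-z0-9]+))__"  # the reference
    r"\s+__<(?P<target>[^>]+)>"                            # its target
)


def to_integrated(source):
    """Rewrite `text`__ __<target> pairs as `text <target>`__."""
    def rewrite(match):
        text = match.group("phrase") or match.group("word")
        # Remove whitespace introduced by wrapping a long URL.
        target = "".join(match.group("target").split())
        return "`%s <%s>`__" % (text, target)
    return COMPROMISE.sub(rewrite, source)


if __name__ == "__main__":
    print(to_integrated("Here is a `phrase reference`__ "
                        "__<http://www.example.org/phrase_reference/>."))
```

Note that the sketch rejoins wrapped URLs onto one line, so the result would still need to be re-wrapped by hand or by an editor.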
The second thing I noticed was how difficult to read the inline versions are, especially the "Stuart Little" and "wochenschau" texts: practically unreadable. The long and complex URLs do a lot of damage, breaking the flow of text unacceptably. The integrated syntax files (``*3.txt``) are a bit more readable than the compromise syntax files (``*1.txt``). The "Gimp Links" texts have much shorter URLs that don't interfere nearly as much when reading the plaintext. Clearly, the "inline external targets" feature should only be used with short URLs, or with source files that are *not* intended to be read in plaintext form. Simon Budig wrote: > I already pointed out, that I currently care more for > maintainability than for readability Our goals are in conflict. The above statement and the "inline external targets" feature itself conflict so strongly with reStructuredText's "Readable" goal (number 1: "It should be as easily read in raw [plaintext] form as in processed [HTML etc.] form") that I'm tempted to just say "forget it." But read on... > and I am willing to accept a bit less readability for the first two > samples to get an additional degree of freedom in moving text > around. In the context of "readable plaintext", I find that unacceptable. The inline URLs in the English-language "Stuart Little" text (stuartlittle[13].txt), make it almost as illegible to me as any of the German "wochenschau" texts. And I can't read German! > Of course the author of the text can freely select his preferred > syntax. I'm more concerned with the *reader* of the text, who would be made to suffer if forced to read stuartlittle1.txt. > However, in the third sample the readability and maintainability is > both increased. Indeed, the "Gimp Links" example offers the only hope for redemption for the feature. That style of "list of links" document lends itself to inline external targets. > I think, inline references would be a useful addition to structured > text. Useful, yes. 
We're back to the dilemma: the feature is useful, but it easily turns
ugly, it reduces readability, and it's easy to abuse.  I can't bring
myself to include it in reStructuredText in any more prominent form
than as a pragma directive with strong cautions attached.

BTW, I've added cautions to:
http://docutils.sf.net/spec/rst/reStructuredText.txt#anonymous-hyperlinks

~~~~~~~~~~

A specific observation:

With syntax alternative 2, here's a good example of why it may be good
to allow the target to appear later, not immediately adjacent to the
reference.  From stuartlittle1.txt::

    ... while `Harrison Ford`__ __<http://movies.yahoo.com/shop
    ?d=hc&cf=gen&id=1800017113>'s submarine drama "K:19 -- The
    Widowmaker" ...

Allowing a later target would make it a bit better::

    ... while `Harrison Ford`__'s __<http://movies.yahoo.com/shop
    ?d=hc&cf=gen&id=1800017113> submarine drama "K:19 -- The
    Widowmaker" ...

But the best solution would probably be to move the possessive ("'s")
into the reference text::

    ... while `Harrison Ford's`__ __<http://movies.yahoo.com/shop
    ?d=hc&cf=gen&id=1800017113> submarine drama "K:19 -- The
    Widowmaker" ...

Syntax alternative 3 seems much more natural::

    ... while `Harrison Ford's <http://movies.yahoo.com/shop
    ?d=hc&cf=gen&id=1800017113>`__ submarine drama "K:19 -- The
    Widowmaker" ...
-- David Goodger <goodger@users.sourceforge.net> Open-source projects: - Python Docutils: http://docutils.sourceforge.net/ (includes reStructuredText: http://docutils.sf.net/rst.html) - The Go Tools Project: http://gotools.sourceforge.net/ From Paul.Moore@atosorigin.com Thu Jul 25 10:52:06 2002 From: Paul.Moore@atosorigin.com (Moore, Paul) Date: Thu, 25 Jul 2002 10:52:06 +0100 Subject: [Doc-SIG] Comments on "inline external targets" example texts Message-ID: <714DFA46B9BBD0119CD000805FC1F53B01B5B47A@UKRUX002.rundc.uk.origin-it.com> From: David Goodger [mailto:goodger@users.sourceforge.net] > The second thing I noticed was how difficult to read the > inline versions are, especially the "Stuart Little" and > "wochenschau" texts: practically unreadable. The long and > complex URLs do a lot of damage, breaking the flow of text > unacceptably. The integrated syntax files (``*3.txt``) > are a bit more readable than the compromise syntax files > (``*1.txt``). The "Gimp Links" texts have much shorter > URLs that don't interfere nearly as much when reading the > plaintext. > > Clearly, the "inline external targets" feature should only > be used with short URLs, or with source files that are > *not* intended to be read in plaintext form. I agree entirely with this. I tried to read the "Stuart Little" text with inline links, and frankly I gave up. The text was simply unreadable. OK, so Simon's point is that I shouldn't be reading the plain text. Can I also point out, however, that if I were given the inline link form to *maintain*, I would not be able to do so, as I would find that the unreadability made it impossible for me to reasonably maintain. The first thing I would do, as a new maintainer, would be to convert to non-inline links... > Simon Budig wrote: > > > I already pointed out, that I currently care more for > > maintainability than for readability > > Our goals are in conflict. 
> The above statement and the
> "inline external targets" feature itself conflict so
> strongly with reStructuredText's "Readable" goal (number
> 1: "It should be as easily read in raw [plaintext] form as
> in processed [HTML etc.] form") that I'm tempted to just
> say "forget it."  But read on...

The insight I had in looking at these was that the *maintainer* needs
to read the plaintext, even if no-one else does. And virtually nothing
which gets published ends up with only one maintainer...

> > However, in the third sample the readability and
> > maintainability is both increased.
>
> Indeed, the "Gimp Links" example offers the only hope for
> redemption for the feature.  That style of "list of links"
> document lends itself to inline external targets.

A suggestion I made before, but badly, may be relevant here. If you
can define a syntax specifically for "list of links" type of
documents, could you not use a custom directive? For the Gimp example,
suppose I write a link-list directive, which takes a series of
entries, each of which contains a link, followed by '--', followed by
free text, and formats it as a list of links, any way the directive
writer prefers. Then the first sections look like::

    Some things for the Gimp
    ========================

    Gimp__ is the best image manipulation program for
    Unix-Computers. It is very easy to make extensions for it.
    Here are my pages and extensions:

    __ http://www.gimp.org

    Plugins
    -------

    .. link-list:
       pagecurl.html -- A Plugin to create an pagecurl effect
       fsdither.html -- A Plugin to do a proper Floyd-Steinberg dithering.
       gimpbuttons.html -- A unuseable Plugin to provide a Buttonbar for
          the Gimp. Yes - I mean unuseable!
       quant.html -- A Plugin to reduce the number of colors

Does this example make my suggestion any clearer? I may have some of
the details wrong, as I've never really looked at custom directives.
But basically, the idea is to abstract out the concept of a "list of
links" and code it directly.
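The body of such a link-list block could be processed roughly as sketched below. This is not a Docutils directive and uses no Docutils API; ``parse_link_list`` and ``to_bullet_list`` are hypothetical helper names. The sketch assumes each entry starts with a target containing no spaces, followed by a spaced "--", and that any other non-blank line continues the previous description. Rendering the result with the integrated `text <target>`__ form is likewise just one choice for illustration.

```python
def parse_link_list(block):
    """Parse 'target -- description' lines into (target, description) pairs."""
    entries = []
    for line in block.splitlines():
        line = line.strip()
        if not line:
            continue
        target, separator, description = line.partition(" -- ")
        if separator and " " not in target:
            entries.append([target, description.strip()])
        elif entries:
            # No leading "target -- ": this line continues the last entry.
            entries[-1][1] += " " + line
    return [tuple(entry) for entry in entries]


def to_bullet_list(entries):
    """Render the entries as a bullet list, one link per item."""
    return "\n".join("- `%s <%s>`__" % (description, target)
                     for target, description in entries)


if __name__ == "__main__":
    example = """
    pagecurl.html -- A Plugin to create an pagecurl effect
    quant.html -- A Plugin to reduce the number of colors
    """
    print(to_bullet_list(parse_link_list(example)))
```

A continuation line that happens to begin with a single word followed by a spaced "--" would be misread as a new entry, so the separator is the weak point of the format.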
I'd love to try coding the link-list directive myself, but I don't have the time right now... Hope this helps, Paul. From goodger@users.sourceforge.net Sat Jul 27 03:54:14 2002 From: goodger@users.sourceforge.net (David Goodger) Date: Fri, 26 Jul 2002 22:54:14 -0400 Subject: [Doc-SIG] Comments on "inline external targets" example texts In-Reply-To: <714DFA46B9BBD0119CD000805FC1F53B01B5B47A@UKRUX002.rundc.uk.origin-it.com> Message-ID: <B9678395.26546%goodger@users.sourceforge.net> Paul Moore wrote: > The insight I had in looking at these was that the *maintainer* > needs to read the plaintext, even if no-one else does. And virtually > nothing which gets published ends up with only one maintainer... Good point. The authors/maintainers of a document are also among its readers. >> Indeed, the "Gimp Links" example offers the only hope for >> redemption for the feature. That style of "list of links" >> document lends itself to inline external targets. > > A suggestion I made before, but badly, may be relevant here. If you > can define a syntax specifically for "list of links" type of > documents, could you not use a custom directive? For the Gimp > example, suppose I write a link-list directive, which takes a series > of entries, each of which contains a link, followed by '--', > followed by free text, and formats it as a list of links, any way > the directive writer prefers. ... > Does this example make my suggestion any clearer? I may have some of > the details wrong, as I've never really looked at custom > directives. But basically, the idea is to abstract out the concept > of a "list of links" and code it directly. Thanks for the re-statement. In context, I get it now! Your directive would implement a very specific case (bullet lists of links), but we can extract the underlying idea (which is a basic idea of directives). 
By using a directive, we can force a local implicit syntax
interpretation different from the global one, and thus reduce the
explicit syntax required. I don't know if it's general enough to
satisfy Simon, but it may have potential.

As I was visiting Slashdot.org just now, I realized that inline
external targets would be very useful for blogs, which are
stream-of-consciousness, written once but read often (in processed
form only). I can see places where this would be useful, but I want
to prevent abuse, somehow, if possible.

--
David Goodger <goodger@users.sourceforge.net> Open-source projects:
  - Python Docutils: http://docutils.sourceforge.net/
    (includes reStructuredText: http://docutils.sf.net/rst.html)
  - The Go Tools Project: http://gotools.sourceforge.net/

From Simon.Budig@unix-ag.org Sat Jul 27 23:49:13 2002
From: Simon.Budig@unix-ag.org (Simon Budig)
Date: Sun, 28 Jul 2002 00:49:13 +0200
Subject: [Doc-SIG] Comments on "inline external targets" example texts
In-Reply-To: <714DFA46B9BBD0119CD000805FC1F53B01B5B47A@UKRUX002.rundc.uk.origin-it.com>; from Paul.Moore@atosorigin.com on Thu, Jul 25, 2002 at 10:52:06AM +0100
References: <714DFA46B9BBD0119CD000805FC1F53B01B5B47A@UKRUX002.rundc.uk.origin-it.com>
Message-ID: <20020728004912.A36244@vmax.unix-ag.uni-siegen.de>

Moore, Paul (Paul.Moore@atosorigin.com) wrote:
> A suggestion I made before, but badly, may be relevant here. If you can
> define a syntax specifically for "list of links" type of documents, could
> you not use a custom directive? For the Gimp example, suppose I write a
> link-list directive, which takes a series of entries, each of which contains
> a link, followed by '--', followed by free text, and formats it as a list of
> links, any way the directive writer prefers. Then the first sections look
> like::
>
>     Some things for the Gimp
>     ========================
>
>     Gimp__ is the best image manipulation program for
>     Unix-Computers. It is very easy to make extensions for it.
>     Here are my pages and extensions:
>
>     __ http://www.gimp.org
>
>     Plugins
>     -------
>
>     .. link-list:
>        pagecurl.html -- A Plugin to create an pagecurl effect
>        fsdither.html -- A Plugin to do a proper Floyd-Steinberg dithering.
>        gimpbuttons.html -- A unuseable Plugin to provide a Buttonbar for
>            the Gimp. Yes - I mean unuseable!
>        quant.html -- A Plugin to reduce the number of colors
>

David is indeed right in his other mail: I really do think that this
syntactic construct is not general enough, but I'll leave this discussion
to other people.

However, I want to point out that the proposed syntax has a problem
with the use of "--" as the separator. From my LaTeX background, "--" is
used as the shortcut for a – and "---" as the shortcut for a —. IIRC this
is already mentioned as a suggestion in the reST docs. I think it would
be useful for the typographically interested user.

BTW: There are two ways people use this separating dash: In Germany
you use the ndash with spaces around it -- like this. In English books
I often see the mdash without spaces around it---like this. I personally
prefer the "German" way... :)

This would interfere with your proposed syntax, especially when
you *want* to have the URLs in the processed output -- separated by an
ndash from some descriptive text.

Bye,
        Simon
--
Simon.Budig@unix-ag.org http://www.home.unix-ag.org/simon/
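To make the clash concrete, here is a small sketch, assuming a hypothetical
LaTeX-style rule that turns "---" into an em dash and "--" into an en dash.
Neither ``typographic_dashes`` nor ``split_entry`` is part of reST or
Docutils; they only illustrate that the order of the two steps matters.

```python
EN_DASH = "\u2013"
EM_DASH = "\u2014"


def typographic_dashes(text):
    """Hypothetical rule: '---' becomes an em dash, '--' an en dash."""
    # Replace the longer shortcut first so '---' is not split into '--' + '-'.
    return text.replace("---", EM_DASH).replace("--", EN_DASH)


def split_entry(line):
    """Split a 'target -- description' entry on the first ' -- ' only."""
    target, sep, text = line.partition(" -- ")
    if not sep:
        raise ValueError(f"entry without ' -- ' separator: {line!r}")
    return target.strip(), text.strip()


line = "quant.html -- reduce colors -- quickly"

# Splitting first keeps the separator usable; the second '--' survives
# in the description and can still be typeset as an en dash afterwards.
target, text = split_entry(line)
text = typographic_dashes(text)
```

If the dash rule runs over the whole line first, no " -- " is left for
``split_entry`` to find, so a directive using "--" as its separator would have
to see the raw text before any typographic conversion.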