[Doc-SIG] New document - pytext-fat

Guido van Rossum guido@digicool.com
Thu, 29 Mar 2001 17:09:40 -0500


> > 	<http://www.tibsnjoan.co.uk/docutils/fat.html>

Lots of good stuff here.  Comments (I wish I had the ST source, but I
don't, so I won't bother quoting from it):

- I think that the references to DOM trees are unnecessary
  implementation details -- even though I like the idea of a
  formalized tree representation.  (I happen to think DOM is
  overrated, but I won't object to its use -- I do object to
  mentioning it in the spec.)

- Call me old-fashioned, but I like having an extra space between
  sentences.  Emacs text mode has some very good heuristics for this.

- Why is it bad to insist on whitespace between list items?  That
  would make the block rules simpler and cause significantly fewer
  situations where a list item is mistakenly started.  At the very
  least I'd insist on a blank line before the first item and after the
  last item of a list.

- The indentation rules are essentially those used for Python source.
  I'm glad you say outright that you won't use them in the way ST uses
  them -- as you may know by now, I think ST's use of indentation
  level to derive heading levels is painful.  For nested lists, it's
  fine of course, and also for creating the occasional indented text
  paragraph a la <blockquote>.

- Using --- for descriptive lists is no better than --.  (Note that
  there's a typo in the example -- the second example uses '--'
  instead of '---'.)  If you *have* to have descriptive lists, try
  doing something creative with input that already looks the way you
  propose that descriptive lists be rendered.  I think that maybe, as
  long as you are measuring indentation anyway, you *should* consider
  the indentation of subsequent lines, requiring all lines of a text
  paragraph to be indented the same, and marking the start of a new
  block on a change in indent.  After all, if we want the source to be
  readable, we can't tolerate a ragged left margin (except in literal
  blocks).  Sure, there are people who indent the first line of their
  paragraphs.  There are also typesetting conventions that *dedent*
  the first line of each paragraph.  But for plain text, I've found
  both conventions ugly and distracting, and I wouldn't mind ruling
  these out so we can use indentation changes for other purposes.

- Requiring the spaces around the delimiter doesn't help (it's too
  subtle).

- An alternative could be to use -- or --- but require some kind of
  explicit markup to start and end a descriptive list.

- I'm glad that you don't auto-renumber ordered lists.

- I'm not sure that there's a point to allowing disjoint text for
  ordered (or any kind of) list.  Again, if we believe that the source
  should read as well as plain text, we should require that it is
  formatted neatly.  The disjoint text example you give seems to come
  straight from the LaTeX (or similar) manual where it explains that
  whitespace details in the input are ignored.  But we *shouldn't*
  ignore any whitespace details in the input, since that's our main
  clue!

- Having to work around auto-detection of numbered lists is my #1 ST
  pet peeve.  I know that part of that's an ST bug -- but I still
  believe ordered lists are not sufficiently important to warrant the
  pain they occasionally cause.  The Emacs text mode I use
  automatically detects numbered lists and it is *never* what I want.
  At the very least you should require that the rest of the input is
  neatly formatted the way one would format an ordered list in a plain
  text document.

- The paragraph about intermingling is ambiguous.  Is it natural to
  have a list with some ordered items, some items using *, and some
  items using -?  I think not.  If you meant nesting, of course you're
  right -- but please say so, and give an example.

- Do we really need more than two levels of headings?  I kind of doubt
  it.  Alternatively, we could allow numbered headings (of course the
  numbers have to be supplied by the author) and derive the level from
  the structure of the number.  (Q: are unnumbered headings at higher
  or lower levels than numbered headings?  I dunno!)

- About dedented paragraphs after indented sections: you can't really
  express in regular text that a plain paragraph is not part of the
  previous section unless you insert a heading.  Maybe a better
  alternative (again using the rule that we should never ignore the
  whitespace clues in the source!) would be to simply indent indented
  headings and paragraphs, a la <blockquote>.

- I like the idea of anchor blocks -- they seem to be like References
  in scientific papers.  But why do they have to start with two dots?
  And how much semantics (as opposed to formatting) do they need?

- Labels: I'm not sure I get the point.  What is this for?  The
  "explanation" doesn't explain it for me.  I think this is digressing
  too far from the "plain text as documentation" idea.

- The concept of children seems wrong for literal blocks.  I agree
  with the rule that a literal block starts after a paragraph ending
  in "::" and ends at the first line that's indented the same as or
  less than that "::" paragraph; but I would propose that conceptually,
  the entire literal block is a child of the previous paragraph.

- The example with a paragraph consisting of *just* "::" should render
  that as a single colon, to be consistent.  If you think this should
  be special-cased, you need to explain why -- the argument "(a) it's
  not worth preventing" doesn't really hold when you special-case it
  anyway!

- You can collapse most of the description of doctest blocks with that
  of literal blocks -- they are really just a different way of
  *recognizing* a literal block (the >>> start), they are not to be
  treated differently (except by doctest).  Note that we may not need
  to recognize doctest blocks separately -- doctest is perfectly happy
  with indented doctest blocks.

- In-line literals: I don't like the use of '...' for literals.  It's
  too unintuitive (unless you leave the quotes in the output!).

- The section on Python literals is missing something -- what is a
  Python literal?  From the example I have to guess that it's
  something between hash marks.  It's too ugly IMO.

- URL recognition: you know my position. :-)
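
Deriving a heading level from the structure of an author-supplied
number, as suggested in the headings comment above, could look roughly
like the sketch below.  The heading_level helper and the number syntax
it accepts (dot-separated integers with an optional trailing dot) are
assumptions made for illustration only.  Unnumbered headings return
None, which leaves the question of their level open.

```python
import re

# Hypothetical syntax: "2.3 Title" or "2.3. Title" (an assumption, not
# something taken from the proposal under review).
_NUMBERED = re.compile(r"^(\d+(?:\.\d+)*)\.?\s+\S")


def heading_level(heading):
    """Return the level implied by a leading section number, or None.

    "1 Intro" is level 1, "2.3 Lists" is level 2, and so on.  A heading
    without a leading number yields None.
    """
    match = _NUMBERED.match(heading)
    if match is None:
        return None
    # One dot-separated component per level.
    return match.group(1).count(".") + 1


if __name__ == "__main__":
    for title in ["1 Introduction", "2.3 Lists", "2.3.1. Nesting", "Overview"]:
        print(title, "->", heading_level(title))
```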

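The literal block rule from the comments above (a block starts after a
paragraph ending in "::" and ends at the first line indented the same
as or less than that paragraph) can be sketched as follows.  This is
only an illustration under stated assumptions: the names indent_of and
find_literal_blocks are hypothetical, indentation is spaces only, and
the "::" line is taken to be the last line of its paragraph.

```python
def indent_of(line):
    """Number of leading spaces on a line (tabs are not handled)."""
    return len(line) - len(line.lstrip(" "))


def find_literal_blocks(lines):
    """Return (start, end) index pairs, end exclusive, of literal blocks.

    Blank lines inside a block are kept; leading and trailing blank
    lines are not counted as part of the block.
    """
    blocks = []
    i, n = 0, len(lines)
    while i < n:
        line = lines[i]
        # The "::" line must close its paragraph: blank line or EOF next.
        ends_paragraph = i + 1 == n or not lines[i + 1].strip()
        if line.rstrip().endswith("::") and ends_paragraph:
            base = indent_of(line)
            j = i + 1
            last = None  # index of the last non-blank line in the block
            # Consume blank lines and lines indented deeper than the paragraph.
            while j < n and (not lines[j].strip() or indent_of(lines[j]) > base):
                if lines[j].strip():
                    last = j
                j += 1
            if last is not None:
                start = i + 1
                while not lines[start].strip():
                    start += 1
                blocks.append((start, last + 1))
            i = j  # resume at the line that ended the block
        else:
            i += 1
    return blocks


if __name__ == "__main__":
    sample = ["Example::", "", "    x = 1", "", "    y = 2", "", "Back to prose."]
    print(find_literal_blocks(sample))
```

Note that the sketch only locates the block; whether the block is then
treated as a child of the "::" paragraph is a separate modelling choice.
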
Hope this helps,

--Guido van Rossum (home page: http://www.python.org/~guido/)