[Doc-SIG] Re: docutils REs

Edward D. Loper edloper@gradient.cis.upenn.edu
Fri, 23 Mar 2001 09:28:14 EST


[question about big long re for descr list items]
> I can't offhand remember - the RE growed until it appeared to work, and
> some of it appeared to rely on the "fuzzy" handling that REs appear (to
> me) to do in balancing the greediness of different bits of the RE. It's
> possible it's skeletal remains which should be excised, I suppose.

Hm.. I bet that's how the STNG REs got where they are today! ;)  If
you get a chance, could you try taking those lines out, and see if
it still passes your test cases?

> Not according to the RE documentation in the Python 1.5.2 reference
> manual, they don't - that's quite clear in saying start and end of
> STRING, and recognition of newlines is only in MULTILINE mode.

Hm.  You're right.  I was confused.  I wonder why I was.  Oh well,
I still don't like the fact that '$' also matches just before a
trailing '\n'.

    >>> import re
    >>> re.match('$', '\n')
    <SRE_Match object at 0x80d2790>
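
To spell out the behaviour a bit more, here is a rough sketch, assuming
the semantics of the `re` module in current Python (the variable names
are only for illustration).  It compares '$' with '\Z', which matches
only at the true end of the string, and shows what re.MULTILINE adds:

```python
import re

# Without MULTILINE, '$' matches at the very end of the string and
# also just before a single trailing newline.
dollar_on_newline = re.match('$', '\n') is not None
dollar_search = re.search('abc$', 'abc\n') is not None

# '\Z' matches only at the true end of the string, so a trailing
# newline makes both of these fail.
z_on_newline = re.match(r'\Z', '\n') is not None
z_search = re.search(r'abc\Z', 'abc\n') is not None

# With MULTILINE, '$' additionally matches before every newline.
multiline_hits = re.findall(r'\w+$', 'one\ntwo\nthree', re.MULTILINE)
default_hits = re.findall(r'\w+$', 'one\ntwo\nthree')

print(dollar_on_newline, dollar_search)   # True True
print(z_on_newline, z_search)             # False False
print(multiline_hits)                     # ['one', 'two', 'three']
print(default_hits)                       # ['three']
```

So if the trailing-newline case matters in a pattern, '\Z' might be
the safer anchor.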

> > we should start making a list of proposed changes to STNG,
> > in order to make STpy and STNG more compatible..
> Well, no, I wouldn't say that.

Ho hum.  My list of things to do grows by one. :)

> > I haven't decided yet on whether I'm happy about having this
> > concept of "acceptable ending punctuation.."  It sort of seems
> > like *all* punctuation should be ok, or *none*..
> 
> I'm not *too* happy about it myself, and actually it's a string that's
> '%' included into the RE texts where it's needed - this means that (a)
> it's easy to change, but (b) it should be the same in all places - I
> thought consistency was a Good Idea.

I definitely agree that, if you do have it, using '%' to splice it in
is the Right Thing to do.  And that way we can one day try replacing
it with an RE for all punctuation, and see how that affects our
test cases. :)
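
Something like the following is what I have in mind.  This is only a
sketch: ACCEPTED_END, ANY_PUNCT, EMPH_TEMPLATE and emph_re are names
made up for illustration, and the emphasis pattern is a simplified
stand-in, not the RE actually used in the code under discussion.

```python
import re
import string

# Hypothetical "acceptable ending punctuation" class.
ACCEPTED_END = r"[.,;:!?)]"

# Hypothetical alternative: a class covering all ASCII punctuation.
ANY_PUNCT = "[%s]" % re.escape(string.punctuation)

# Simplified emphasis pattern.  The '%s' slot is where the punctuation
# class gets spliced in, so it is defined in exactly one place.
EMPH_TEMPLATE = r"\*([^*\s](?:[^*]*[^*\s])?)\*(?=\s|%s|$)"


def emph_re(punct):
    """Build the emphasis RE with the given punctuation class spliced in."""
    return re.compile(EMPH_TEMPLATE % punct)


strict = emph_re(ACCEPTED_END)
loose = emph_re(ANY_PUNCT)

print(strict.findall("this is *important*."))       # ['important']
print(strict.findall("an *emph region*-like this")) # []
print(loose.findall("an *emph region*-like this"))  # ['emph region']
print(strict.findall("*bad*-ass"))                  # []
print(loose.findall("*bad*-ass"))                   # ['bad']
```

Swapping the one string for the other is then a one-line change, and
the test cases can tell us what breaks.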

> > (e.g., should it be ok to have a dash after
> > an *emph region*-like this?)
> 
> That looks wrong to me - but then you can see how I use dashes in plain
> text!

Ok.  Bad example.  How about saying e-*mail* to put stress on the 
"mail" part, or *bad*-ass...

> There *are* some conventions on how one uses punctuation - for instance,
> 'this ,' looks wrong to almost everyone. ST<whatever> just enforces some
> of them (this is, of course, yet another class of things to consider
> warning people about).

Well, we're not in the business of enforcing punctuation use, so when
we can get away with it reasonably, we should let them do whatever
they want.. The problem is deciding how it interacts with markup..

-Edward