[Doc-SIG] formalizing StructuredText

Edward D. Loper edloper@gradient.cis.upenn.edu
Fri, 16 Mar 2001 12:00:38 EST


(concerning:)
> > > > >            This *is "too* confusing":http://some.url

> No, seriously, the only way (with RE technology) that you are going to
> detect an error there is by adding that pattern to your "long list of
> error patterns" RE. To an RE-using system that looks for things going
> '*..*' and things going '"..":<url>', there will simply be no
> ambiguity - the one that finds it first will win, leaving odd bits of
> "definitely not markup, guv" text lying strewn around it. Specifically,
> the above would *either* be::

Well, it depends on how you're detecting errors...

> 	plain: 'This '
> 	emph:  'is "too'
> 	plain: ' confusing":http://some.url'

Here, you could say that the string '":' without a matching opening
'"' is illegal, and raise an error.

> 	plain:   'This *is '
> 	urltext: 'too* confusing'
> 	         [url: http://some.url]

Here you could say that unmatched '*'s are illegal, and raise
an error.
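
To make the two cases concrete, here is a rough sketch in Python.
The two patterns and the helper names (EMPH, LINK, split_first,
check_leftovers) are assumptions made up for illustration; they are
not the REs that any StructuredText implementation actually uses.
The idea is just: let one RE win, then look at the leftover plain
text for half of a construct.

```python
import re

# Hypothetical, simplified patterns (assumptions, not the real ST ones).
EMPH = re.compile(r'\*([^*]+)\*')        # things going '*..*'
LINK = re.compile(r'"([^"]+)":(\S+)')    # things going '"..":<url>'

TEXT = 'This *is "too* confusing":http://some.url'

def split_first(pattern, text):
    """Return (before, inner, after) for the first match, or None."""
    m = pattern.search(text)
    if m is None:
        return None
    return text[:m.start()], m.group(1), text[m.end():]

def check_leftovers(plain_pieces):
    """Raise ValueError if plain text still holds half of a construct."""
    for piece in plain_pieces:
        if '":' in piece:
            raise ValueError("'\":' without a matching '\"' in %r" % piece)
        if '*' in piece:
            raise ValueError("unmatched '*' in %r" % piece)

# Emphasis wins: the leftover ' confusing":http://some.url' has a
# dangling '":', so check_leftovers raises.
before, inner, after = split_first(EMPH, TEXT)

# Link wins: the leftover 'This *is ' has a dangling '*', so
# check_leftovers raises here too.
before2, inner2, after2 = split_first(LINK, TEXT)
```

Either way the string is rejected, so with both checks in place the
order in which the REs run no longer changes the outcome for this
input.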

> In neither case is there any ambiguity - it just depends on the order in
> which things are done. It's because it's done with REs, you see - there
> isn't any *real* understanding of document structure going on.

But from the point of view of someone formalizing the language, saying
"there's an ambiguity" is no good.  I have to either explicitly say
"it's illegal" (=undefined) or "xyz is the correct answer."

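One way to turn "it depends on the order" into a definite answer is
sketched below: run the rules in every order, and call the input
illegal unless all orders agree. This is only an illustration under
stated assumptions. RULES, parse and strict_parse are hypothetical
names, and the patterns are simplified stand-ins, not a proposed
definition of the language.

```python
import itertools
import re

# Hypothetical, simplified rules (assumptions for illustration only).
RULES = {
    'emph': re.compile(r'\*([^*]+)\*'),
    'link': re.compile(r'"([^"]+)":\S+'),
}

def parse(text, order):
    """Apply each rule in turn to whatever is still plain text."""
    tokens = [('plain', text)]
    for name in order:
        next_tokens = []
        for kind, value in tokens:
            if kind != 'plain':
                # Already claimed by an earlier rule; leave it alone.
                next_tokens.append((kind, value))
                continue
            pos = 0
            for m in RULES[name].finditer(value):
                if m.start() > pos:
                    next_tokens.append(('plain', value[pos:m.start()]))
                next_tokens.append((name, m.group(1)))
                pos = m.end()
            if pos < len(value):
                next_tokens.append(('plain', value[pos:]))
        tokens = next_tokens
    return tokens

def strict_parse(text):
    """Return the parse if every rule order agrees, else raise."""
    results = [parse(text, order)
               for order in itertools.permutations(RULES)]
    if any(r != results[0] for r in results[1:]):
        raise ValueError('illegal (undefined) markup: %r' % text)
    return results[0]
```

With these rules, strict_parse('This *is* fine') returns one parse,
while the confusing example from the top of this message raises
ValueError, because the two orders produce different token lists.
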
-Edward

P.S. I'm not sure it's safe for us both to be writing email at the
same time.  We might overload other people's mailboxes. :)