Parsing syntax (was: Why don't people like lisp?)

Duane Rettig duane at franz.com
Thu Oct 23 04:10:36 EDT 2003


"Andrew Dalke" <adalke at mindspring.com> writes:

> Kaz Kylheku:

> > Moreover,
> > two or more domain-specific languages can be mixed together, nested in
> > the same lexical scope, even if they were developed in complete
> > isolation by different programmers.
> 
> We have decidedly different definitions of what a "domain-specific
> language" means.

Probably not; more likely it is just a different emphasis.

>  To you it means the semantics expressed as
> an s-exp.  To me it means the syntax is also domain specific.  Eg,
> Python is a domain specific language where the domain is
> "languages where people complain about scope defined by
> whitespace." ;)

Your whole article leans heavily toward raising the importance of
syntax.  Lispers tend to see it differently.  For Common Lispers,
and many other Lispers who tend to minimize syntax, if the
domain-specific language already has, or can be given, a syntax
similar to Lisp's, then parsing is an already-solved problem, and
one can just use the CL parser (i.e. read) and move on to other,
more important problems in the domain language.  But if the syntax
doesn't match, then it still isn't really a big deal; a parser
is needed just as it is in any other language, and one must
(and can) solve that problem as well as the rest of the
domain-specifics.  The real question is how quickly you can
finally leave the issue of syntax behind you in your new problem
domain and move on to the problem you actually want to solve.
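
As a rough analogy for Python readers (this is not code from the
experiment described below): when a small domain language can be
written in the host language's own literal syntax, the host's reader
does the parsing for you.  The sketch below assumes a hypothetical
rule notation; the names RULES_TEXT and load_rules are made up for
illustration, and ast.literal_eval plays the role that read plays in CL.

```python
import ast

# Hypothetical mini-DSL: grammar-like rules written in Python literal syntax.
RULES_TEXT = """
[
    ("sentence", ["subject", "verb", "object"]),
    ("subject",  ["noun"]),
    ("object",   ["noun"]),
]
"""


def load_rules(text):
    """Parse the rule text with Python's own literal reader.

    No tokenizer or grammar is written here; the host language's
    parser does all of the syntactic work.
    """
    rules = ast.literal_eval(text.strip())
    return {name: parts for name, parts in rules}


rules = load_rules(RULES_TEXT)
print(rules["sentence"])
```

Everything after load_rules is then domain work on ordinary data
structures, which is the part worth spending time on.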

In the early '80s, I did an experiment, using Franz Lisp and Naomi
Sager's (NYU) English Grammar parser from her early 1981 book
"Natural Language Information Processing".  I wrote a parser
for English out of her BNF and Franz Lisp's character macros and
the lisp reader.  When I joined Franz Inc, I was able to port the
parser to Common Lisp with a little cheating (CL doesn't define
infix-macros like Franz Lisp does, so I had to redefine read-list
in order to make the ' macro work well with the lexicon).

[Unfortunately, I can't release this parser, because in my
correspondence with her, Dr. Sager made it clear that although
the sentences and descriptions in the book are not copyrighted,
the BNF and restrictions are.  So although you can see the BNF
nodes represented in a tree below, I won't define the terms;
you'll have to get the book for that...]

The whole point of my experiment was not to write a parser or
to try to parse English, but to show how powerful Lisp's parser
already is.  With an input of

 "John's going to Mary's house."

for example, with neither John nor Mary being present in the
lexicon, the parser is able to provide the following analysis
(note that the parser was written as a stand-alone program used
in a fashion similar to a scripting language, with no actual
interactive input from the user except via pipe from stdin):

Franz Lisp, Opus 38.89+  plus English Grammar Parser version 0
-> nil
-> nil
-> t
-> t
-> sentence = 
(|John's| going to |Mary's| house |.|)
form = <sentence>
revised sentence = 
(|John's| going to Mary is house |.|)
form = <sentence>
revised sentence = 
(|John's| going to Mary has house |.|)
form = <sentence>
revised sentence = 
(John is going to |Mary's| house |.|)
form = <sentence>
found parse 1
1   1 <sentence>
2    2 <introducer>
4    2 <center>
5     3 <assertion>
6      4 <sa>
8      4 <subject>
9       5 <nstg>
10       6 <lnr>
11        7 <ln>
12         8 <tpos>
18         8 <qpos>
20         8 <apos>
22         8 <nspos>
24         8 <npos>
26        7 <nvar>
27         8 <namestg>
28          9 <lnamer>
29           10 <lname>
34           10 n --------------> John
35           10 <rname>
37        7 <rn>
39     4 <sa>
41     4 <tense>
42      5 <null>
43     4 <sa>
45     4 <verb>
47      5 <lv>
49      5 <vvar>
50       6 tv --------------> is
53     4 <sa>
55     4 <object>
56      5 <objectbe>
57       6 <vingo>
58        7 <lvsa>
60        7 <lvingr>
61         8 <lv>
63         8 ving --------------> going
64         8 <rv>
66          9 <rv1>
67           10 <pn>
69            11 <lp>
71            11 p --------------> to
72            11 <nstgo>
73             12 <nstg>
74              13 <lnr>
75               14 <ln>
76                15 <tpos>
77                 16 <lnamesr>
78                  17 <lname>
83                  17 ns --------------> |Mary's|
84                15 <qpos>
86                15 <apos>
88                15 <nspos>
90                15 <npos>
92               14 <nvar>
93                15 n --------------> house
94               14 <rn>
98        7 <sa>
100       7 <object>
101        8 <nullobj>
102       7 <rv>
104       7 <sa>
106    4 <rv>
108    4 <sa>
110  2 <endmark>
111   3 |-.-| --------------> |.|
revised sentence = 
(John is going to Mary is house |.|)
form = <sentence>
revised sentence = 
(John is going to Mary has house |.|)
form = <sentence>
revised sentence = 
(John has going to |Mary's| house |.|)
form = <sentence>
revised sentence = 
(John has going to Mary is house |.|)
form = <sentence>
revised sentence = 
(John has going to Mary has house |.|)
form = <sentence>
(no more parses count= 1)
-> 
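
The "revised sentence" lines in the transcript above look like every
combination of readings for the two 's tokens: possessive, "is", or
"has".  The following Python sketch reproduces that enumeration order
under that assumption.  It is not the original Franz Lisp code, and
expansions and revised_sentences are hypothetical names.

```python
from itertools import product


def expansions(token):
    """Return the candidate readings of one token as lists of words.

    In this sketch only a trailing 's is treated as ambiguous.
    """
    if token.endswith("'s"):
        stem = token[:-2]
        # Keep the possessive as-is, or expand the contraction.
        return [[token], [stem, "is"], [stem, "has"]]
    return [[token]]


def revised_sentences(tokens):
    """Yield every combination of readings, rightmost token varying fastest."""
    for combo in product(*(expansions(t) for t in tokens)):
        yield [word for reading in combo for word in reading]


sentence = ["John's", "going", "to", "Mary's", "house", "."]
candidates = list(revised_sentences(sentence))
for candidate in candidates:
    print(" ".join(candidate))
```

This yields nine candidates in the same order as the transcript, of
which the grammar accepted only "John is going to Mary's house."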


A more complexly structured sentence, which Sager gave in her book,
was "the force with which an isolated heart beats depends on
the concentration of calcium in the medium which surrounds it."
which also was parsed correctly, though I won't show the parse
tree for it here because it is long.  I did have trouble with
conjunctions, because they were not covered fully in her book
and involve splicing copies of parts of the grammar together,
and there are a number of "restrictions" (pruning and
well-formedness tests specific to some of the BNF nodes that help
to find the correct parse) which she did not describe in her book.

Again, the point is that syntax and parsing are already-solved
problems, and even problems that don't on the surface look like
problems naturally solved with the lisp reader can come fairly
close to being solved with very little effort.  Perhaps we can
thus move a little deeper into the problem space a little faster.

-- 
Duane Rettig    duane at franz.com    Franz Inc.  http://www.franz.com/
555 12th St., Suite 1450               http://www.555citycenter.com/
Oakland, Ca. 94607        Phone: (510) 452-2000; Fax: (510) 452-0182   



