[Python-Dev] Grammar help.
Greg Ward
gward@mems-exchange.org
Thu, 6 Jul 2000 18:22:56 -0400
On 07 July 2000, Thomas Wouters said:
> Argh, my head hurts ! :P
Yeah, but try writing your *own* parser-generator. >grin<
> Here's what I did. The current definition of 'atom' is:
>
> atom: '(' [testlist] ')' | '[' [testlist] ']' | << ... >>
> testlist: test (',' test)* [',']
>
> I tried adding it like this:
>
> atom: '(' [testlist] ')' | '[' [listmaker] ']' | << ... >>
> testlist: test (',' test)* [',']
> listmaker: testlist | rangemaker
> rangemaker: [test] ':' test
I think the problem is this:
* one possible candidate for atom is: '[' listmaker
* two candidates for listmaker: testlist or rangemaker
* but both of these start with a test; the fact that one of them
is optional is probably beside the point
* the parser needs to see ',' or ':' to know if it's got a
testlist or a rangemaker: but because a test could be any number
of tokens, it would need infinite lookahead. (I think.) Since
this is a recursive-descent parser, it is most likely LL(1) --
at most one token of lookahead. Not good enough.
Another, entirely hypothetical way of looking at it (based on experience
with a completely different recursive descent parser generator
(PCCTS/ANTLR) and dim recollections of the Dragon Book):
* parser sees a '[' token and enters the 'atom' state
(ie. calls a subroutine 'atom()')
* swallows the '[' and immediately enters 'listmaker' (no other option)
* now it wants to enter either 'testlist' or 'rangemaker': but the
FIRST(testlist) == FIRST(test) (testlist can only start with a
test), and FIRST(rangemaker) == FIRST(test) + ':' (rangemaker
can start with a test or the ':' token)
* I suspect this is the ambiguity: overlap in the FIRST sets of
testlist and rangemaker
The fix? Hmmm, I dunno. Never hacked on the Python grammar myself, and
it's been a while since I hacked seriously on any grammar. Maybe the
people (person?) who really understands this stuff will have an idea...
Greg
--
Greg Ward - software developer gward@mems-exchange.org
MEMS Exchange / CNRI voice: +1-703-262-5376
Reston, Virginia, USA fax: +1-703-262-5367