Ply(LALR) and Yacc behaving differently

Åsmund Grammeltvedt asmundg at stud.cs.uit.no
Thu Apr 7 14:51:21 EDT 2005


Hi.

I am trying to implement a small compiler in python and, trying to use 
something a bit more pythonic than lex/yacc, ended up with ply 
(http://systems.cs.uchicago.edu/ply/). The only problem is that whereas 
yacc accepts the grammar and appears to parse it correctly, ply does not.

Perhaps this belongs on some compiler list, but I couldn't decide if it 
was a compiler or a python problem, so bear with me.

This smaller grammar illustrates the problem:

Goal ::= (+|+-)*;


The yacc grammar goes:

%start Goal
%%

Goal
  : Block ';'
;

Block
  : Empty
  | Block T
  | Block S
;

Empty
  : /* empty */
;

S
  : '+'
;

T
  : '+' '-'
;


Translated to ply:

def p_Goal(p):
    """
    Goal : Block SEMI
    """

def p_Block(p):
  """
  Block : Empty
        | Block T
        | Block S
  """

def p_Empty(p):
    """
    Empty :
    """
    pass

def p_S(p):
    """
    S : PLUS
    """

def p_T(p):
    """
    T : PLUS MINUS
    """

Now, looking at the state machines that yacc and ply produces, the 
difference show up.

Yacc:

state 0
    0 $accept: . Goal $end

    $default  reduce using rule 5 (Empty)

    Goal   go to state 1
    Block  go to state 2
    Empty  go to state 3

state 2
    1 Goal: Block . ';'
    3 Block: Block . T
    4      | Block . S

    ';'  shift, and go to state 5
    '+'  shift, and go to state 6

    S  go to state 7
    T  go to state 8

state 3
    2 Block: Empty .

    $default  reduce using rule 2 (Block)


Ply:

state 0

    (0) S' -> . Goal
    (1) Goal -> . Block SEMI
    (2) Block -> . Empty
    (3) Block -> . Block T
    (4) Block -> . Block S
    (5) Empty -> .

    SEMI            reduce using rule 5 (Empty -> .)

    Empty                          shift and go to state 3
    Goal                           shift and go to state 2
    Block                          shift and go to state 1


OK, I believe those are the interesting parts. Appearently, ply doesn't
want to shift anything other than a ';' and thereby produces an error if
it encounters '+'. This only occurs in LALR mode, though. SLR handles
correctly, but my larger grammar isn't SLR, sigh.

So, any ideas? Have I completely messed up how to handle empty productions 
or should I switch to another lexer/parser?

-- 
Åsmund Grammeltvedt

All these moments will be lost in time, like tears in the rain.
http://electricsheep.org/




More information about the Python-list mailing list