Is pyparsing really a recursive descent parser?

Just Another Victim of the Ambient Morality ihatespam at hotmail.com
Sun Nov 4 21:05:10 EST 2007


"Kay Schluehr" <kay.schluehr at gmx.net> wrote in message 
news:1194223837.104424.242360 at o3g2000hsb.googlegroups.com...
> On Nov 4, 10:44 pm, "Just Another Victim of the Ambient Morality"
> <ihates... at hotmail.com>
>
>> I believe there is a cure and it's called recursive descent parsing.
>> It's slow, obviously, but it's correct and, sometimes (arguably, often),
>> that's more important the execution speed.
>
> Recursive decendent parsing is not necessarily slow but from your
> remarks above I infer you want a general RD parser with backtracking:
> when one rule doesn't match, try another one to derive the current
> symbol in the input stream.

    I think I've just discovered a major hurdle in my understand of the 
problem.
    You keep saying "with backtracking."  Why?  Isn't "backtracking" 
inherent in recursion?  So, why can't these alleged "recursive descent 
parsers" find valid parsings?  How are they not already backtracking?  What 
was the point of being recursive if not to take advantage of the inherent 
backtracking in it?
    Obviously, these parsers aren't recursing through what I think they 
should be recursing.  The question is "why not?"

    Correct me if I'm wrong but I'm beginning to think that pyparsing 
doesn't typically use recursion, at all.  It only employs it if you create 
one, using the Forward class.  Otherwise, it does everything iteratively, 
hence the lack of "backtracking."


> I'm not sure one needs to start again with a naive approach just to
> avoid any parser theory. For a user of a parser it is quite important
> whether she has to wait 50 seconds for a parse to run or 50
> milliseconds. I don't like to compromise speed for implementation
> simplicity here.

    This attitude is all too prevalent among computer professionals...  Of 
course it's a useful thing to shield users from the intricacies of parser 
theory!  Just as much as it is useful to shield drivers from needing 
automotive engineering or software users from programing.  How many people 
have come to this newsgroup asking about anomalous pyparsing behaviour, 
despite their grammars being mathematically correct.
    Think of it this way.  You can force all the clients of pyparsing to 
duplicate work on figuring out how to massage pyparsing to their grammars, 
or you can do the work of getting pyparsing to solve people's problems, 
once.  That's what a library is supposed to do...
    Finally, I can't believe you complain about potential speed problems. 
First, depending on the size of the string, it's likely to be the difference 
between 2ms and 200ms.  Secondly, if speed were an issue, you wouldn't go 
with a recursive descent parser.  You'd go with LALR or the many other 
parsing techniques available.  Recursive descent parsing is for those 
situations where you need correctness, regardless of execution time.  These 
situations happen...
    I've said this before, albeit for a different language, but it applies 
to Python just as well.  I don't use Python to write fast code, I use it to 
write code fast.
    If _you_ "don't like to compromise speed for implementation simplicity" 
then you have a plethora choices available to you.  What about the guy who 
needs to parse correctly and is unconcerned about speed?







More information about the Python-list mailing list