Is pyparsing really a recursive descent parser?

Wed Nov 7 16:15:50 EST 2007

"Neil Cerutti" <horpner at yahoo.com> wrote in message 
news:ILpYi.39972$G23.31487 at newsreading01.news.tds.net...
> On 2007-11-07, Just Another Victim of the Ambient Morality
> <ihatespam at hotmail.com> wrote:
>> "Neil Cerutti" <horpner at yahoo.com> wrote in message
>> news:slrnfj13e4.1hc.horpner at FIAD06.norwich.edu...
>>> You might be interested in the Early parsing algorithm. It is
>>> more efficient than the naive approach used in your prototype,
>>> and still handles ambiguous grammars.
>>
>
> I'll take this opportunity to correct my misspelling. It's
> "Earley".
>
>> I think I might be interested in this algorithm, thank you!
>>
>>> http://www.cpsc.ucalgary.ca/~aycock/spark/
>>
>>     You know, I tried this thing but, for the life of me, I
>> can't figure out how to use it and the few tutorials out there
>> are less than illuminating...
>
> I'll send you the code I composed.
>
> The tricky part of Spark is the Token and AST classes you have to
> use. A type used as a token class is required to provide a
> __cmp__ function that behaves in the following confounding
> manner:
>
>  class Token(object):
>    def __cmp__(self, o):
>      return cmp(self.type, o)
>
> If you attempt to use a list, string or a tuple as token, it just
> barfs. AST's are required to provide an even weirder interface.
>
> In effect, you have to write badly designed wrappers around
> tuples and lists, respectively to take advantage of the generic
> classes.
>
> Go to the examples directory of the distribution to find working
> versions of these stupid classes.
>
> Once you get over that hurdle it becomes easier. Be sure to
> provide your Parser and Scanner classes with an error method to
> prevent the library from raising SystemExit(!) on errors. Scanner
> classes are also required to override the t_default method to
> prevent this mishap.
>
> In short, it hasn't really evovled into a user-friendly package
> yet.

    Thank you.
    How is it that I seem to be the only one in the market for a correct 
parser?  Earley has a runtine of O(n^3) in the worst case and O(n^2) 
typically.  I have trouble believing that everyone else in the world has 
such intense run-time requirements that they're willing to forego 
correctness.  Why can't I find a pyparsing-esque library with this 
implementation?  I'm tempted to roll my own except that it's a fairly 
complicated algorithm and I don't really understand how it's any more 
efficient than the naive approach...