Pyparsing: Grammar Suggestion. 2nd thought

Heiko Wundram me+python at modelnine.org
Wed May 17 14:24:18 EDT 2006


On Wednesday 17 May 2006 20:05, Khoa Nguyen wrote:
> ========
> On 2nd thought, I don't think this will check for the correct order of
> the fields. For example, the following would be incorrectly accepted:
>
> f1,f5,f2 END_RECORD
>
> Thanks,
> Khoa

If I'm not completely mistaken, parsers written using PyParsing can accept a
small superset of the languages that an N/DFA can accept. As such, PyParsing
isn't a "general purpose" parsing toolkit: a general-purpose toolkit matches a
subset of the languages an N/DSA can accept (think SLR, LR(1), LALR(1), GLR or
the like), whereas PyParsing doesn't support the notion of
left-/right-recursion. At least I didn't find anything like it back in the
days when I had a look at PyParsing, but I might be wrong here. If I am,
someone enlighten me. ;-)

Anyway, the language you're trying to match here (along with more complex 
productions for the f's) is nothing that an NFA can ever match. So, either 
you use PyParsing to implement the "tokenization" for you, and postprocess 
using a handwritten parser (LL-parsers are easy to implement, and I'd guess a 
small LL-parser is sufficient for your needs), or you have a look at one of 
the available LR-parsing frameworks for Python, such as pyrr (which isn't the 
only one, by far).

By the way:

if you have a variable length argument list, such as:

f1,f2,,,f5,f9,f12,...

and there is no upper bound on the number of acceptable arguments, no parsing
framework that doesn't accept context-sensitive grammars (DSA) can ever
verify the order for you. You'll have to verify the order in a later step,
after the parsing has been done.
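
A minimal sketch of such a later verification step, in plain Python. It
assumes that fields are named f<number>, that "correct order" means strictly
increasing field numbers, and that a record ends with END_RECORD; the names
parse_record and in_order are made up for illustration and are not part of
PyParsing.

```python
import re

# Assumed field naming: the letter "f" followed by a number, e.g. "f12".
FIELD = re.compile(r"f(\d+)")

def parse_record(line):
    """Step 1: split a record into its field numbers, ignoring order."""
    line = line.strip()
    if not line.endswith("END_RECORD"):
        raise ValueError("missing END_RECORD")
    body = line[:-len("END_RECORD")].strip()
    numbers = []
    for item in body.split(","):
        item = item.strip()
        if not item:
            continue  # empty slot, as in "f2,,,f5"
        match = FIELD.fullmatch(item)
        if match is None:
            raise ValueError("not a field: %r" % item)
        numbers.append(int(match.group(1)))
    return numbers

def in_order(numbers):
    """Step 2: check the order after parsing (assumed: strictly increasing)."""
    return all(a < b for a, b in zip(numbers, numbers[1:]))
```

With these assumptions, in_order(parse_record("f1,f5,f2 END_RECORD")) is
False, so the record from the quoted example is rejected in the second step
even though the first step accepts it.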

--- Heiko.


