[Tutor] extracting phrases and their memberships from syntax trees

A.T.Hofkamp a.t.hofkamp at tue.nl
Fri Feb 13 08:43:31 CET 2009


Emad Nawfal (عماد نوفل) wrote:
> just want to be able to do this myself. My question is: what tools do I need
> for this? Could you please give me pointers to where to start? I'll then try
> to do it myself, and ask questions when I get stuck.

I'd start with parsing (reading) the tree to a generic abstract tree, and do 
the processing on the tree as a separate second step.
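
To make the two steps concrete, here is a minimal sketch of the second step
only. The tree is built by hand so the processing can be shown on its own.
The names Node, words and phrases are my own illustration, not from any
library, and I am assuming each phrase should be reported together with the
label of its parent (its "membership").

```python
# Hypothetical generic tree node; Node, label, children and word are
# illustrative names, not part of any library.
class Node:
    def __init__(self, label, children=None, word=None):
        self.label = label
        self.children = children or []
        self.word = word  # set only on leaves (a tag plus its word)

    def words(self):
        """Return the words covered by this node, left to right."""
        if self.word is not None:
            return [self.word]
        result = []
        for child in self.children:
            result.extend(child.words())
        return result


def phrases(node, parent_label=None):
    """Yield (label, phrase text, parent label) for every non-leaf node."""
    if node.word is not None:
        return
    yield node.label, " ".join(node.words()), parent_label
    for child in node.children:
        for item in phrases(child, node.label):
            yield item


# Step 1 would normally be done by a parser; here the tree is built by hand.
tree = Node("S", [
    Node("NP", [Node("DT", word="the"), Node("NN", word="dog")]),
    Node("VP", [Node("VBD", word="barked")]),
])

# Step 2: processing on the tree.
result = list(phrases(tree))
for label, text, parent in result:
    print("%s: %r (inside %s)" % (label, text, parent))
```

Keeping the processing separate like this means you can change how the tree
is read without touching the code that extracts the phrases.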

For your tree, it seems feasible to write a parser by hand (you'd need to read 
about recursive descent parsers then). Another approach is to use a parser 
generator tool. There are many: http://wiki.python.org/moin/LanguageParsing
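
As a rough idea of what a handwritten parser could look like, below is a
small recursive descent sketch. It assumes (I am guessing at your input
format here) that the trees are written in bracketed form, such as
"(S (NP (DT the) (NN dog)) (VP (VBD barked)))". The function names tokenize,
parse and parse_tree are made up for this example, and the result is a plain
nested (label, children) tuple rather than any particular tree class.

```python
import re


def tokenize(text):
    # Split into "(", ")" and runs of other non-space characters.
    return re.findall(r"\(|\)|[^\s()]+", text)


def parse(tokens, pos=0):
    """Parse one node starting at tokens[pos]; return (node, next position).

    Assumed grammar of the input:
        node := "(" LABEL (node | WORD)* ")"
    """
    if pos >= len(tokens) or tokens[pos] != "(":
        raise SyntaxError("expected '(' at token %d" % pos)
    if pos + 1 >= len(tokens) or tokens[pos + 1] in "()":
        raise SyntaxError("expected a label at token %d" % (pos + 1))
    label = tokens[pos + 1]
    pos += 2
    children = []
    while pos < len(tokens) and tokens[pos] != ")":
        if tokens[pos] == "(":
            child, pos = parse(tokens, pos)  # nested node: recurse
        else:
            child = tokens[pos]              # plain word
            pos += 1
        children.append(child)
    if pos >= len(tokens):
        raise SyntaxError("missing ')'")
    return (label, children), pos + 1        # skip the closing ")"


def parse_tree(text):
    tokens = tokenize(text)
    tree, pos = parse(tokens)
    if pos != len(tokens):
        raise SyntaxError("unexpected text after the tree")
    return tree


tree = parse_tree("(S (NP (DT the) (NN dog)) (VP (VBD barked)))")
print(tree)
```

Each call to parse handles exactly one pair of parentheses and calls itself
for nested ones, which is the whole idea of recursive descent.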

To give you a rough classification: LL(1) is the weakest parsing algorithm 
(but also the simplest to understand, comparable to recursive descent). 
LALR(1) is 'normal'; you can parse the source code of many real programming 
languages with it (e.g. C and Python). GLR is the heavy-duty equipment, but 
it is also much slower and more memory-greedy.

Many people are fond of PyParsing. My personal favorite is PLY.


Enjoy your journey!


Sincerely,
Albert

