[pyparsing] How to get arbitrary text surrounded by keywords?

Paul McGuire ptmcg at austin.rr._bogus_.com
Mon Nov 28 16:00:58 EST 2005


"Inyeol Lee" <inyeol.lee at siliconimage.com> wrote in message
news:mailman.1297.1133203971.18701.python-list at python.org...
> I'm trying to extract module contents from Verilog, which has the form
> of;
>
>     module foo (port1, port2, ... );
>
>         // module contents to extract here.
>         ...
>
>     endmodule
>
> To extract the module contents, I'm planning to do something like;
>
> from pyparsing import *
>
> ident = Word(alphas+"_", alphanums+"_")
> module_begin = Group("module" + ident + "(" + OneOrMore(ident) + ")" + ";")
> module_contents = ???
> module_end = Keyword("endmodule")
> module = Group(module_begin + module_contents + module_end)
>
> (Above code not tested.)
>
> How should I write the part of 'module_contents'? It's an arbitrary text
> which doesn't contain 'endmodule' keyword. I don't want to use full
> scale Verilog parser for this task.
>
> -Inyeol

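The simplest way is to use SkipTo, which consumes all text up to (but not
including) the next match of its target expression.  This only works if you
don't have to worry about nesting: with a nested module, SkipTo would stop at
the inner 'endmodule'.  I think Verilog supports nested modules, but if the
files you are parsing don't use this feature, then SkipTo will work just
fine.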

# Ports are comma-separated, so use delimitedList rather than OneOrMore.
module_begin = Group(Keyword("module") + ident + "(" + delimitedList(ident) +
                     ")" + ";")
module_end = Keyword("endmodule")
module_contents = SkipTo(module_end)
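
For reference, here is a minimal end-to-end sketch of the SkipTo approach.
The names port_list, sample and found are only illustrative, and the sample
text is made up rather than taken from a real design.  The port list is
spelled out as an ident followed by comma/ident pairs, and the punctuation is
suppressed so that only the names end up in the results.  It assumes no
nested modules and no stray 'endmodule' inside comments or strings.

```python
from pyparsing import (
    Group, Keyword, SkipTo, Suppress, Word, ZeroOrMore, alphanums, alphas,
)

ident = Word(alphas + "_", alphanums + "_")
# Comma-separated port names; the commas themselves are dropped.
port_list = ident + ZeroOrMore(Suppress(",") + ident)

module_begin = Group(
    Keyword("module") + ident
    + Suppress("(") + port_list + Suppress(")") + Suppress(";")
)
module_end = Keyword("endmodule")
# Everything up to, but not including, the next 'endmodule' keyword.
module_contents = SkipTo(module_end)
module = Group(module_begin + module_contents + module_end)

# Made-up input, for illustration only.
sample = """
module foo (a, b);
    wire w;
    assign w = a & b;
endmodule

module bar (x);
    reg r;
endmodule
"""

found = []
# scanString walks the input and yields every place the grammar matches,
# so text between modules is simply skipped.
for tokens, start, end in module.scanString(sample):
    header, body, _ = tokens[0]
    # header is ['module', <name>, <port>, ...]; body is the raw skipped text.
    found.append((header[1], body.strip()))

for name, body in found:
    print(name)
    print(body)
```

Using scanString instead of parseString means the rest of the file does not
have to match the grammar, which suits this kind of extraction job.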

If you *do* care about nested modules, then a parse action might help you
handle these cases.  But this starts to get trickier, and you may just want
to consider a more complete grammar.  If your application is non-commercial
(i.e., for academic or personal use), there *is* a full Verilog grammar
available (also available with commercial license, just not free).

-- Paul




