[pyparsing] How to get arbitrary text surrounded by keywords?
Paul McGuire
ptmcg at austin.rr._bogus_.com
Mon Nov 28 16:00:58 EST 2005
"Inyeol Lee" <inyeol.lee at siliconimage.com> wrote in message
news:mailman.1297.1133203971.18701.python-list at python.org...
> I'm trying to extract module contents from Verilog, which has the form
> of;
>
> module foo (port1, port2, ... );
>
> // module contents to extract here.
> ...
>
> endmodule
>
> To extract the module contents, I'm planning to do something like;
>
> from pyparsing import *
>
> ident = Word(alphas+"_", alphanums+"_")
> module_begin = Group("module" + ident + "(" + OneOrMore(ident) + ")" + ";")
> module_contents = ???
> module_end = Keyword("endmodule")
> module = Group(module_begin + module_contents + module_end)
>
> (above code not tested.)
>
> How should I write the part of 'module_contents'? It's an arbitrary text
> which doesn't contain 'endmodule' keyword. I don't want to use full
> scale Verilog parser for this task.
>
> -Inyeol
The simplest way is to use SkipTo. This only works if you don't have to
worry about nesting. I think Verilog supports nested modules, but if the
files you are parsing don't use this feature, then SkipTo will work just
fine.
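# Port lists are comma-separated, so use delimitedList rather than OneOrMore.
module_begin = Group("module" + ident + "(" + delimitedList(ident) + ")" + ";")
module_end = Keyword("endmodule")
# SkipTo consumes everything up to, but not including, the next "endmodule".
module_contents = SkipTo(module_end)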
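Put together, the SkipTo approach might look like the sketch below. It is untested against real Verilog sources and makes a few assumptions of its own: the port list is comma-separated (spelled out here with ZeroOrMore), the punctuation is suppressed, and the sample text and the helper name extract_modules are made up for illustration.

```python
from pyparsing import (Group, Keyword, SkipTo, Suppress, Word, ZeroOrMore,
                       alphanums, alphas)

ident = Word(alphas + "_", alphanums + "_")
# Assumption: ports are separated by commas; the commas are dropped from the results.
port_list = Group(ident + ZeroOrMore(Suppress(",") + ident))
module_begin = Group(Keyword("module") + ident
                     + Suppress("(") + port_list + Suppress(")") + Suppress(";"))
module_end = Keyword("endmodule")
# Everything up to (not including) the next "endmodule" becomes one string token.
module_contents = SkipTo(module_end)
module = Group(module_begin + module_contents + module_end)

# Hypothetical input, only for illustration.
sample = """
module foo (port1, port2);
  wire w;
  assign w = port1 & port2;
endmodule

module bar (clk);
  reg q;
endmodule
"""

def extract_modules(text):
    """Return a list of (module name, contents) pairs found in text."""
    found = []
    for tokens, start, end in module.scanString(text):
        header, contents, _ = tokens[0]
        # header is ["module", name, [ports...]]
        found.append((header[1], contents.strip()))
    return found

for name, contents in extract_modules(sample):
    print(name)
    print(contents)
```

scanString is used instead of parseString so that any text between modules is simply skipped. Note that SkipTo stops at the first "endmodule" keyword it sees, so an "endmodule" inside a comment would end the match early.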
If you *do* care about nested modules, then a parse action might help you
handle these cases. But this starts to get trickier, and you may just want
to consider a more complete grammar. If your application is non-commercial
(i.e., for academic or personal use), there *is* a full Verilog grammar
available (also available with commercial license, just not free).
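As a rough illustration of what handling nesting involves, here is a sketch that does not use a parse action at all. Instead it scans for the "module" and "endmodule" keywords and keeps a depth counter to find the outermost module spans. The function name outermost_module_spans and the sample text are hypothetical, and keywords inside comments or strings are not handled.

```python
from pyparsing import Keyword

# Match either keyword wherever it appears as a whole word.
marker = Keyword("module") | Keyword("endmodule")

def outermost_module_spans(text):
    """Return (start, end) offsets of each outermost module ... endmodule span."""
    spans = []
    depth = 0
    start = None
    for tokens, begin, end in marker.scanString(text):
        if tokens[0] == "module":
            if depth == 0:
                start = begin      # remember where the outermost module opens
            depth += 1
        elif depth > 0:            # an "endmodule" closing an open module
            depth -= 1
            if depth == 0:
                spans.append((start, end))
    return spans

# Hypothetical input, only for illustration.
nested = """
module outer (a);
  module inner (b);
  endmodule
endmodule
"""

for start, end in outermost_module_spans(nested):
    print(nested[start:end])
```

Each span can then be sliced out of the source text and, if needed, parsed further with the module_begin expression.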
-- Paul