extract c/cpp include file with regular expression

Philip Semanchuk philip at semanchuk.com
Thu Jul 23 13:01:45 EDT 2009


On Jul 23, 2009, at 12:36 PM, tiefeng wu wrote:

> 2009/7/24 Philip Semanchuk <philip at semanchuk.com>:
>>
>> I know this will sound like a sarcastic comment, but it is sincere: my
>> suggestion is that if you want to parse C/C++ (or Python, or Perl, or
>> Fortran, etc.), use a real parser, not regexes, unless you're willing to
>> sacrifice some accuracy. Sooner or later you'll come across some code that
>> your regexes won't handle, like this:
>>
>> #ifdef FOO_BAR
>> #include <this.h>
>> /* #else */
>> #include <that.h>
>> #endif
>>
>>
>> Parsing code is difficult...
>>
> I understand your point; thanks for your suggestion, Philip. I've already
> run into the problem shown in your example.
> I chose regexes because I barely know anything about "real parsers";
> for me they are still a "dark area" :)
> But I'll find something to learn from.

Yes! Learning is always good. And as I said, if you don't mind missing  
some unusual cases, regexes are fine. I don't know how accurate you  
want your results to be.
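
For what it's worth, here is a minimal sketch of the regex approach with its limits spelled out. The names find_includes, COMMENT_RE and INCLUDE_RE are made up for illustration. It strips comments first so that a commented-out #include is not reported, but it knows nothing about #ifdef/#else logic, macros, or string literals that happen to contain comment markers, so treat it as a rough approximation rather than a correct C preprocessor.

```python
import re

# Hypothetical helper names; this is an illustration, not a real preprocessor.
# Matches /* ... */ block comments (possibly multi-line) and // line comments.
COMMENT_RE = re.compile(r'/\*.*?\*/|//[^\n]*', re.DOTALL)

# Matches lines such as:  #include <foo.h>   or   # include "bar/baz.h"
INCLUDE_RE = re.compile(
    r'^[ \t]*#[ \t]*include[ \t]*([<"])([^>"\n]+)[>"]',
    re.MULTILINE,
)


def find_includes(source):
    """Return the header names from #include lines, ignoring comments."""
    # Strip comments first so a commented-out #include is not reported.
    # Caveat: this also mangles string literals that contain /* or //.
    stripped = COMMENT_RE.sub(' ', source)
    return [m.group(2) for m in INCLUDE_RE.finditer(stripped)]


sample = """\
#ifdef FOO_BAR
#include <this.h>
/* #else */
#include <that.h>
#endif
// #include "old.h"
"""

print(find_includes(sample))
```

Note that it reports both this.h and that.h without telling you that both sit in the same #ifdef branch (the #else is commented out). That kind of context is what a real parser gives you.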

As for real parsers, there are lots of them out there, although they may
be overkill for what you want to do. Here's one written entirely in
Python:
http://www.dabeaz.com/ply/

Whatever you choose, good luck with it.

Cheers
Philip



