replace c-style comments with newlines (regexp)

Neil Cerutti horpner at yahoo.com
Fri Dec 21 09:06:11 EST 2007


On 2007-12-21, lex __ <comp_lexx at hotmail.com> wrote:
> I'm trying to use a regexp to replace multi-line C-style comments
> (like /*  this /n */ ) with /n (newlines). I tried something
> like   re.sub('/\*(.*)/\*'  , '/n'   , file) but it doesn't
> work for multiple lines.
>
> Besides that, I want to keep all newlines as they were in the
> original file, so I can still use the original line numbers (I
> want to use line numbers as a reference for later use). I know
> that this will complicate things a bit more, so it is a bit
> less important.
>
> Background: I'm trying to create an 'intelligent' source-code
> security analysis tool for C/C++, Python and PHP files, but
> filtering the comments seems to be the biggest problem.  :(
>
> So, if you have an answer to this, please let me know how to
> do this!

There are free C lexers and parsers available (e.g., gcc). I
recommend them to you. Gluing a real C parser into your Python
code might be easier than writing one. Not that it's impossible
to discover C comments with your own special-purpose, simple
parser (see Exercise 1-23 in K&R _The C Programming Language 2nd
Edition_), but it's not remotely doable with a regex.
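
To give a feel for what such a special-purpose parser might look
like, here is a minimal sketch of a character-by-character state
machine in the spirit of that K&R exercise. The function name
strip_c_comments is hypothetical, and the sketch rests on several
assumptions: it handles only C and C++ comments (Python and PHP
would need their own rules), and it ignores trigraphs,
backslash-continued // comments, and C++ raw string literals. It
keeps every newline, so line numbers in the output still match the
original file.

```python
def strip_c_comments(source):
    """Remove C/C++ comments from source, keeping every newline in place."""
    out = []
    i = 0
    n = len(source)
    state = "code"  # one of: code, block, line, string, char
    while i < n:
        ch = source[i]
        nxt = source[i + 1] if i + 1 < n else ""
        if state == "code":
            if ch == "/" and nxt == "*":
                state = "block"
                out.append(" ")  # a comment separates tokens like one space
                i += 2
                continue
            if ch == "/" and nxt == "/":
                state = "line"
                i += 2
                continue
            if ch == '"':
                state = "string"
            elif ch == "'":
                state = "char"
            out.append(ch)
        elif state == "block":
            if ch == "*" and nxt == "/":
                state = "code"
                i += 2
                continue
            if ch == "\n":
                out.append(ch)  # preserve line numbering inside comments
        elif state == "line":
            if ch == "\n":
                state = "code"
                out.append(ch)
        else:  # inside a string or character literal: copy verbatim
            out.append(ch)
            if ch == "\\" and nxt:
                out.append(nxt)  # escaped character cannot end the literal
                i += 2
                continue
            if (state == "string" and ch == '"') or (state == "char" and ch == "'"):
                state = "code"
        i += 1
    return "".join(out)
```

The string and char states are there so that comment markers
inside literals, such as "/* not a comment */", are left alone.
Calling strip_c_comments(open(path).read()) on a C file (path
being whatever file you are scanning) should return text with the
same number of lines as the input.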

-- 
Neil Cerutti


