[Tutor] Finding C comments with regular expressions

Danny Yoo dyoo at hkn.eecs.berkeley.edu
Wed Apr 21 15:05:41 EDT 2004



On Wed, 21 Apr 2004, Magnus Lycka wrote:

> Tony  Cappellini wrote:
> > Now for the real problem
> > I don't want my macro lister program, to find macro definitions that are
> > inside of a multi-line C-style comment.
>
> I'm sure you can write a regular expression that works for this most of
> the time, but if you want to take into consideration things like...
>
> /* We comment out some code
>
> sprintf("We can use */ inside a string of course");
>
> Now we end the comment */
>
> ..it gets much harder.



Hi Magnus,


Actually, C would treat only this much as the comment:


> /* We comment out some code
>
> sprintf("We can use */


A C comment ends at the first "*/" after the opening "/*".  The compiler
doesn't care whether there are quotes in the comment content.
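
To make that rule concrete without any regular expressions, here is a
minimal sketch using plain string searching.  The helper name
first_comment() is made up for illustration, and it assumes the first "/*"
in the text really does open a comment (it ignores "/*" inside string
literals in live code, and "//" comments):

```python
def first_comment(text):
    """Return the first /* ... */ comment in text, or None if there is none."""
    start = text.find("/*")
    if start == -1:
        return None
    # Look for the terminator only after the opening "/*", so that "/*/"
    # is not mistaken for a complete comment.
    end = text.find("*/", start + 2)
    if end == -1:
        return None  # unterminated comment
    return text[start:end + 2]


code = (
    '/* We comment out some code\n'
    '\n'
    'sprintf("We can use */ inside a string of course");\n'
    '\n'
    'Now we end the comment */'
)

# The comment stops at the first "*/", quotes or not.
print(repr(first_comment(code)))
```

The quote characters play no part in the search, which is the point.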


The regular expression is a little tricky, because the beginning and the
ending of a comment each use two characters instead of one.  Here's a
regular expression that takes this into consideration:

###
import re

pattern = re.compile(r"""
           / \*                ##  Leading "/*"

           (                   ##  Followed by any number of
              ([^*])           ##  non star characters
              |                ##  or
              (\* [^/])        ##  star-nonslash
           )*
           \* /                ##  with a trailing "*/"
          """, re.VERBOSE)
###


Does this work?  Let's see this in action:

###
>>> match = pattern.search("""
...  str = ajStrNew();
...
...     /* seed the random number generator */
...     ajRandomSeed();
... """)
>>>
>>> match.group(0)
'/* seed the random number generator */'
>>>
>>>
>>>
>>>
>>> match = pattern.search("""
... /* We comment out some code
...
... sprintf("We can use */ inside a string of course");
...
... Now we end the comment */
... """)
>>>
>>> match.group(0)
'/* We comment out some code\n\nsprintf("We can use */'
###
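

One edge case is worth checking: a comment that ends with a run of stars,
like "/* a **/".  A pattern built from "non-star" or "star-nonslash" pieces
consumes the two stars as a pair and then runs past the real end of the
comment.  Below is a sketch of a variant that handles runs of stars, along
with one way it might be applied to the macro-listing problem.  The names
simple, robust, define and list_macros are made up for illustration, and
the approach assumes "/*" never appears inside a string literal in live
code:

```python
import re

# The "non-star or star-nonslash" form, repeated here so that this
# example runs on its own.
simple = re.compile(r"""
    / \*
    (
       ([^*])
       |
       (\* [^/])
    )*
    \* /
""", re.VERBOSE)

# Variant: treat a run of stars as a unit, and only continue the comment
# if the run is followed by something other than "/" or "*".
robust = re.compile(r"""
    / \*                       # leading "/*"
    [^*]* \*+                  # non-stars, then one or more stars
    (?: [^/*] [^*]* \*+ )*     # not the end yet: keep going to the next stars
    /                          # the "/" that closes the comment
""", re.VERBOSE)

text = "/* a **/ int x; /* b */"
print(repr(simple.search(text).group(0)))   # runs on into the second comment
print(robust.findall(text))                 # finds the two comments separately

# Hypothetical macro lister: blank out the comments first, then look for
# "#define" at the start of a line.
define = re.compile(r"^[ \t]*\#[ \t]*define[ \t]+(\w+)", re.MULTILINE)


def list_macros(source):
    """Return names of macros defined outside of /* ... */ comments."""
    without_comments = robust.sub(" ", source)
    return define.findall(without_comments)


source = """
#define KEEP 1
/* old settings
#define HIDDEN 2
**/
  #  define ALSO_KEEP 3
"""
print(list_macros(source))
```

Replacing each comment with a single space, rather than deleting it
outright, keeps the tokens on either side of the comment from being glued
together.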


I hope this helps!



