delete comments with re

Andrew M. Kuchling akuchlin at mems-exchange.org
Tue Mar 7 13:57:02 EST 2000


laurent8 at sxb.bsf.alcatel.fr writes:
> with this regular expression I've got
> >>> r="/\*.*\*/"
> >>> re.sub(r,'',t)
> 'a=b;  class foobar {}'

Instead of .* in the middle, try a non-greedy match with .*? :

import re

pat = re.compile('/[*] .*? [*]/', re.VERBOSE)  # VERBOSE ignores the spaces
t='a=b;  ...'
print(repr(pat.sub('', t)))

This outputs 'a=b; o=1; j=1; class foobar {}'.  
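
The pattern above relies on two details: re.VERBOSE makes the spaces in
the pattern insignificant, and [*] is just another way to write \*.
Here is a small sketch checking that it behaves the same as the compact
form; the sample string is made up for illustration and is not the
original poster's input.

```python
import re

# The spaced-out form needs re.VERBOSE so the blanks are ignored.
verbose_pat = re.compile('/[*] .*? [*]/', re.VERBOSE)
# The same pattern written compactly, with \* instead of [*].
compact_pat = re.compile(r'/\*.*?\*/')

# Made-up sample input (an assumption, not from the original post).
sample = 'a=b; /* first */ c=d; /* second */'

verbose_result = verbose_pat.sub('', sample)
compact_result = compact_pat.sub('', sample)
print(repr(verbose_result))
print(verbose_result == compact_result)
```

Which spelling to use is a matter of taste; the verbose form is easier
to read once a pattern grows.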

Luckily, comments in C don't nest, so a comment ends at the first */
encountered after it starts.  (That is, the comment in '/*a/*b*/c*/'
is '/*a/*b*/', and 'c*/' is left over.)  
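
To check the non-nesting behaviour on that example string, something
like the following should do.  strip_comments is a hypothetical helper
name chosen for this sketch, not anything provided by the re module.

```python
import re

# Non-greedy match: stop at the first */ after the opening /*.
pat = re.compile(r'/\*.*?\*/')

def strip_comments(text):
    # Hypothetical helper for illustration: remove every matched comment.
    return pat.sub('', text)

nested = '/*a/*b*/c*/'
matched = pat.search(nested).group(0)
leftover = strip_comments(nested)
print(repr(matched))    # the text treated as the comment
print(repr(leftover))   # what remains after removal
```

The stray 'c*/' that remains is what a C compiler would see as well,
since the comment really does end at the first */.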

	.* will run to the end of the line (. does not match a newline
	unless re.DOTALL is set) and then backtrack until it finds a
	*/; it matches as *much* as possible.

	.*? will repeatedly advance by one character and try to
	match */ at every point, so it matches as *little* as possible.
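
The difference is easiest to see side by side.  This is only a sketch:
the sample strings are invented for the purpose, and the last part
assumes you also want comments that span several lines removed, which
needs re.DOTALL because . does not match a newline by default.

```python
import re

greedy = re.compile(r'/\*.*\*/')    # .* grabs as much as possible
lazy = re.compile(r'/\*.*?\*/')     # .*? grabs as little as possible

# Made-up sample input with two comments on one line.
sample = 'x=1; /* one */ y=2; /* two */ z=3;'

greedy_result = greedy.sub('', sample)
lazy_result = lazy.sub('', sample)
print(repr(greedy_result))   # 'x=1;  z=3;'  (y=2; was swallowed too)
print(repr(lazy_result))     # 'x=1;  y=2;  z=3;'

# A comment spanning two lines: without re.DOTALL it is not matched.
multi = 'x=1; /* spans\ntwo lines */ y=2;'
lazy_dotall = re.compile(r'/\*.*?\*/', re.DOTALL)
multi_plain = lazy.sub('', multi)
multi_dotall = lazy_dotall.sub('', multi)
print(repr(multi_plain))     # unchanged
print(repr(multi_dotall))    # 'x=1;  y=2;'
```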

-- 
A.M. Kuchling			http://starship.python.net/crew/amk/
The criterion of simplicity is not necessarily based on the speed of the
algorithm or in its complexity in serial computers.
    -- Armand De Callatay, _Natural and Artificial Intelligence_



