Translating a Perl regex into Python
C. Laurence Gonsalves
clgonsal at kami.com
Fri Sep 7 18:31:59 EDT 2001
On Fri, 7 Sep 2001 13:13:56 +0200, Stephan Tolksdorf <andorxor at gmx.de> wrote:
>I'm having trouble translating a complex Perl regular expression which
>I've found in the Perl FAQ. To be honest, I haven't tried to fully
>understand it, but it seems to flawlessly strip all C++-like comments from
>source files.
>
># The Perl regex from the Perl faq
>content =~
>s#/\*[^*]*\*+([^/*][^*]*\*+)*/|//[^\n]*|("(\\.|[^"\\])*"|'(\\.|[^'\\])*'|.[^/"'\\]*)#$2#gs
>
...
>
>But I'm getting "sre_constants.error: unexpected end of regular
>expression"...
AFAIK, Python regexes (a la the re module) use nearly the same syntax as
Perl regexes, and nothing in this pattern falls outside the common subset.
I think your problem here has to do with escaping/quoting rather than
regex syntax.
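To illustrate the quoting point, here is a small sketch. It assumes the pattern was pasted into an ordinary (non-raw) string literal, which the elided code may or may not have done. In a non-raw literal Python collapses each `\\` into a single backslash before the re module ever sees it, so a class like `[^"\\]` arrives as `[^"\]` and never closes. The names `fragment_raw` and `fragment_plain` are just for this example.

```python
import re

# One alternative from the pattern (the string-literal part), written twice.
fragment_raw = r'"(\\.|[^"\\])*"'    # raw: backslashes reach re untouched
fragment_plain = '"(\\.|[^"\\])*"'   # non-raw: Python turns \\ into \ first

print(fragment_raw)    # what re receives from the raw literal
print(fragment_plain)  # what re receives from the plain literal

re.compile(fragment_raw)  # compiles without complaint

try:
    re.compile(fragment_plain)
    plain_compiles = True
except re.error as exc:
    # The escaped "]" leaves the character class open to the end of the pattern.
    plain_compiles = False
    print("re.error:", exc)
```

The exact wording of the error differs between Python versions, but the cause is the same: the pattern is cut short from the regex engine's point of view.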
Try taking the original Perl regex (the part between the first two #
delimiters) and wrapping r'''...''' around it, i.e.:
import re
rex = re.compile(
    r'''/\*[^*]*\*+([^/*][^*]*\*+)*/|//[^\n]*|("(\\.|[^"\\])*"'''
    r'''|'(\\.|[^'\\])*'|.[^/"'\\]*)''', re.M | re.S)
(I've split the string in two for posting purposes...)
Incidentally, it looks like that regex also matches string literals and
char literals.
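For completeness, here is a sketch of how the whole substitution might look in Python, using the Perl FAQ pattern quoted above. The names `CXX_COMMENT_RE` and `strip_comments` are made up for illustration. The replacement is a function rather than `\2` because group 2 is unset whenever a comment matched, and older Python versions raise an error on a reference to an unmatched group. It also shows why the pattern bothers to match string and char literals: they are captured in group 2 and put back unchanged, so comment markers inside them are left alone.

```python
import re

# Hypothetical names; the pattern itself is the Perl FAQ regex.
# It is split in two only to keep the lines short.
CXX_COMMENT_RE = re.compile(
    r'''/\*[^*]*\*+([^/*][^*]*\*+)*/|//[^\n]*|("(\\.|[^"\\])*"'''
    r'''|'(\\.|[^'\\])*'|.[^/"'\\]*)''',
    re.S)  # re.S plays the role of Perl's /s flag


def strip_comments(source):
    """Return source with /* ... */ and // comments removed."""
    # Group 2 holds literals and ordinary code; it is None when a comment
    # matched, in which case the match is replaced by the empty string.
    return CXX_COMMENT_RE.sub(lambda m: m.group(2) or '', source)


sample = 'int x = 1; // counter\nchar *s = "/* kept */"; /* gone */\n'
print(strip_comments(sample))
```

`sub` already replaces every match, so nothing extra is needed for Perl's /g flag.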
--
C. Laurence Gonsalves
clgonsal at kami.com
http://cryogen.com/clgonsal/

"Any sufficiently advanced technology is indistinguishable from magic."
 -- Arthur C. Clarke