Translating a Perl regex into Python

C. Laurence Gonsalves clgonsal at kami.com
Fri Sep 7 18:31:59 EDT 2001


On Fri, 7 Sep 2001 13:13:56 +0200, Stephan Tolksdorf <andorxor at gmx.de> wrote:
>I'm having trouble translating a complex Perl regular expression which
>I've found in the Perl FAQ. To be honest, I haven't tried to fully
>understand it, but it seems to flawlessly strip all C++-like comments from
>source files.
>
># The Perl regex from the Perl FAQ
>content =~
>s#/\*[^*]*\*+([^/*][^*]*\*+)*/|//[^\n]*|("(\\.|[^"\\])*"|'(\\.|[^'\\])*'|.[^
>/"'\\]*)#$2#gs
>
...
>
>But I'm getting "sre_constants.error: unexpected end of regular
>expression"...

AFAIK, Python regexes (a la the re module) use essentially the same
syntax as Perl regexes, at least for everything in this pattern. I think
your problem here has to do with escaping/quoting of the Python string
literal rather than regex syntax.
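
As a small illustration of the kind of quoting problem meant here (this is
an assumption about the mechanism, since the original Python attempt is
not shown), take just the [^"\\] piece of the pattern. In an ordinary
string literal Python collapses the two backslashes into one before the
re module ever sees them, which leaves the character class unterminated.
The helper name compiles() is made up for this sketch:

```python
import re

plain = '[^"\\]'   # ordinary literal: the regex engine receives [^"\]
raw = r'[^"\\]'    # raw literal: the regex engine receives [^"\\]


def compiles(pattern):
    """Return True if the re module accepts the pattern (hypothetical helper)."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


print(len(plain), compiles(plain))  # one backslash; \] escapes the bracket
print(len(raw), compiles(raw))      # two backslashes; the class closes normally
```

The exact error text depends on the Python version, so it is not shown
here.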

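Try taking the original Perl regex and wrapping r'''...''' around it.
The raw string leaves the backslashes alone, and the triple quotes let
the pattern contain both " and '. i.e.: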

import re

rex = re.compile(
    r'''/\*[^*]*\*+([^/*][^*]*\*+)*/|//[^\n]*|("(\\.|[^"\\])*"'''
    r'''|'(\\.|[^'\\])*'|.[^/"'\\]*)''', re.M | re.S)

(I've split the string in two for posting purposes...)

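Incidentally, it looks like that regex also matches string literals and
char literals, presumably so that comment markers inside them are left
alone (the $2 in the Perl substitution puts them back).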
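
For the substitution half of the Perl statement, a minimal sketch might
look like the following. It uses the pattern as quoted from the Perl FAQ
above; the function name strip_comments() and the sample input are
invented for illustration. A callable is used as the replacement because
group 2 does not participate in the match when a comment matched, and
older Python versions raise an error for a '\2' replacement in that case.

```python
import re

# Pattern as quoted from the Perl FAQ. re.S mirrors Perl's /s flag;
# re.M would have no effect because the pattern has no ^ or $ anchors.
rex = re.compile(
    r'''/\*[^*]*\*+([^/*][^*]*\*+)*/|//[^\n]*|("(\\.|[^"\\])*"'''
    r'''|'(\\.|[^'\\])*'|.[^/"'\\]*)''',
    re.S,
)


def strip_comments(source):
    """Drop C and C++ style comments, keeping string and char literals."""
    # Group 2 holds the non-comment text; it is None when a comment matched.
    return rex.sub(lambda m: m.group(2) or '', source)


sample = 'int x = 1; /* gone */ char *s = "// kept"; // gone\n'
print(strip_comments(sample))
```

re.sub() already replaces every match, so nothing extra is needed for
Perl's /g flag.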

-- 
  C. Laurence Gonsalves                "Any sufficiently advanced
  clgonsal at kami.com                     technology is indistinguishable
  http://cryogen.com/clgonsal/          from magic." -- Arthur C. Clarke


