Help needed: cryptic perl regular expression in python syntax, Ugly solution
Pekka Niiranen
pekka.niiranen at wlanmail.com
Tue Oct 19 15:47:51 EDT 2004
Thanks,
I managed to solve my problem with code like this:
>>> import re
>>> line = ' s^\\?AAA\\?01^BBB^g; #Comment '
>>> r1 = '(^\\s*)(s|tr)(.)(\\\\\\?\\\\??'
>>> key = "AAA\?01"
>>> r2 = '\\\\??)\\3(.*?)\\3(.*)'
>>> r = r1 + re.escape(key) + r2
>>> re.compile(r).findall(line)
[(' ', 's', '^', '\\?AAA\\?01', 'BBB', 'g; #Comment ')]
but what an ugly piece of code...
I was hoping to avoid the excess backslashes by using re.escape(),
but to no avail, since the backreference '\3' gets misquoted (among other things):
>>> r2 = "\??)\3(.*?)\3(.*)/)"
>>> re.escape(r2)
'\\\\\\?\\?\\)\\\x03\\(\\.\\*\\?\\)\\\x03\\(\\.\\*\\)\\/\\)'
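The misquoting has two causes: in a non-raw literal, "\3" is already a single control character before re.escape() ever sees it, and re.escape() is meant for literal text, not for pattern syntax. A minimal sketch of the alternative, assuming a current Python 3 (the sessions here are Python 2) and using a hypothetical helper name, build_pattern, with the fixed parts taken from the raw literals in the quoted question below:

```python
import re

# In a non-raw literal, "\3" is an octal escape: one character, chr(3).
assert "\3" == "\x03" and len("\3") == 1
# A raw literal keeps backslash + digit, which re reads as a backreference.
assert r"\3" == "\\3" and len(r"\3") == 2


def build_pattern(key):
    """Hypothetical helper: fixed parts are raw literals, only key is escaped."""
    head = r"(^\s*)(s|tr)(.)(\\?\??"
    tail = r"\??)\3(.*?)\3(.*)"
    # re.escape() is applied to the literal key only, never to pattern syntax.
    return re.compile(head + re.escape(key) + tail)


line = r" s^\?AAA\?01^BBB^g; #Comment "
key = r"AAA\?01"
result = build_pattern(key).findall(line)
print(result)
```

Because the delimiter is captured by (.) and reused through \3, the same pattern should also accept a line that uses a delimiter other than ^.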
-pekka-
Antoon Pardon wrote:
> On 2004-10-19, pekka niiranen wrote <pekka.niiranen at wlanmail.com>:
>
>>Hi there,
>>
>>I have a Perl script that uses a dynamically
>>constructed regular expression in this way:
>>
>>------perl code starts ----
>>$result = "";
>>$key = 'AAA\?01';
>>$key = quotemeta $key;
>>$line = ' s^\?AAA\?01^BBB^g; #Comment ';
>>if ($line =~ /(^\s*)(s|tr)(.)(\\?\??$key\??)\3(.*?)\3(.*)/) {
>> $result = $5;
>>}
>>
>># $result should be "BBB"
>># \3 gets the same value as returned by (.)
>># which is in this example ^. So we are searching for the
>># parameter delimited by the first two ^ signs
>># and returning the one delimited by the second
>># and third ^ signs. Note that using \3 in the regular
>># expression allows delimiters other than the ^ sign.
>>
>>------perl code stops ----
>>
>>How can I construct an equivalent Python regular expression?
>>
>>I have tested with constant regular expression like this:
>>
>>
>> >>> import re
>> >>> line = ' s^\\?AAA\\?01^BBB^g; #Comment '
>> >>> r1 = "(^\s*)(s|tr)(.)(\\\\\?\\\??AAA\\\\\?01)"
>> >>> re.compile(r1).findall(line)
>> [(' ', 's', '^', '\\?AAA\\?01')]
>>
>>Which is fine, but is there a way to join 3 raw strings
>>together into another raw string, like:
>>
>>r1 = r'''(^\s*)(s|tr)(.)(\\?\??'''
>>r2 = r'''\\?\??)\3(.*?)\3(.*)'''
>>p1 = r1 + key + r2 # p1 should remain raw string too
>>
>
>
> If I understand correctly, there are no raw strings, just raw string
> literals. re.compile just takes a normal string.
>
> Raw string literals just make it easier to write the strings that are
> typically used for regular expressions, but the strings themselves
> are just ordinary strings.
>
>
> >>> s1 = "\\b"
> >>> s2 = r"\b"
> >>> s1 == s2
> 1
> >>> s1
> '\\b'
> >>> s2
> '\\b'
> >>> print s1
> \b
> >>> print s2
> \b
>
>
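Antoon's point can be checked directly, and it also answers the question about whether p1 "remains raw" after concatenation. A small sketch, assuming a current Python 3 (the sessions above are Python 2); the variable names are only illustrative:

```python
import re

# A raw literal and a doubled-backslash literal denote the same string value.
raw = r"(^\s*)(s|tr)(.)"
doubled = "(^\\s*)(s|tr)(.)"
assert raw == doubled
assert type(raw) is str  # there is no separate "raw string" type

# Concatenation therefore needs no special care: the joined value equals
# what a single raw literal would have produced.
joined = r"(^\s*)" + r"(s|tr)" + r"(.)"
assert joined == raw

match = re.match(joined, " s^")
print(match.groups())
```

Rawness is a property of how the literal is written in source code, so there is nothing to preserve once the string exists.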