SOLUTION: Regexp problem for gurus + possible PEP

Pekka Niiranen krissepu at vip.fi
Mon Jan 7 15:54:04 EST 2002


I managed to find a limited solution in Python that handles only one
level of nesting and requires single-character delimiters:

>>> import re
>>> pattern = re.compile(r"(\?[^?!]+(\?[^?!]+\!)*[^?!]+\!)")
>>> Line = "?AA?BB!CC!?DD!ee?EE!ff?FF?GG!HH!"
>>> print(re.findall(pattern, Line))
[('?AA?BB!CC!', '?BB!'), ('?DD!', ''), ('?EE!', ''), ('?FF?GG!HH!', '?GG!')]

Now, if only there were a version of re.findall() that did not return
empty matches ("").
For example re.findall(pattern, text, flag), where flag tells whether empty
matches are returned.

Maybe I could use filter() or the new finditer() method in the sre module?
Any ideas are welcome.
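
A minimal sketch of both ideas, assuming the same pattern as above. The
helper names nonempty_findall and nonempty_finditer are made up for
illustration; they are not part of re or sre. Note that finditer() reports
an unmatched group as None, while findall() reports it as "".

```python
import re

# Same pattern as above, written as a raw string.
pattern = re.compile(r"(\?[^?!]+(\?[^?!]+!)*[^?!]+!)")
line = "?AA?BB!CC!?DD!ee?EE!ff?FF?GG!HH!"


def nonempty_findall(pattern, text):
    """Hypothetical helper: flatten findall() tuples, dropping '' entries."""
    found = []
    for groups in pattern.findall(text):
        found.extend(g for g in groups if g)
    return found


def nonempty_finditer(pattern, text):
    """Same result via finditer(); unmatched groups are None here."""
    found = []
    for m in pattern.finditer(text):
        found.extend(g for g in m.groups() if g)
    return found


matches = nonempty_findall(pattern, line)
print(matches)
# ['?AA?BB!CC!', '?BB!', '?DD!', '?EE!', '?FF?GG!HH!', '?GG!']
```

Both helpers return one flat list, so no flag on re.findall() is needed
for this case.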

-pekka-

Pekka Niiranen wrote:

> How can I make Python return a list of matches in the following case:
>
> The searched texts are between ? and ! signs.
> Different start (?) and end (!) signs are used in order to be able
> to detect nested matches.
>
> case 1:        Line = "aaa?HHH!bbb?JJJJ!ccc?KKKK!dddd"
>                    Returned list should be:
>                        ?HHH!, ?JJJJ!, ?KKKK!
>
>                    For this one I already know the answer:
>
>                    pattern = re.compile(r'\?[^?!]+!')
>                    matches = re.findall(pattern, Line)
>
> case 2:        Line = "aaa?HHHbbb?JJJJ!ccc!?KKKK!dddd"
>                    Returned list should be (the order of matches does
> not matter):
>                        ?HHHbbb?JJJJ!ccc!, ?JJJJ!, ?KKKK!
>
>                     In this case ?JJJJ! is nested inside
> ?HHHbbb?JJJJ!ccc!
>                     Is the number of nested levels limited by
> Python?
>
> Is it possible to compile a single regular expression that covers both
> cases?
> How about if the delimiters consist of more than one character
> (not ? and !, but <?> and <!>)? Then the negated character class [^]
> would not work?
>
> -pekka-
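
For arbitrary nesting depth and multi-character delimiters, one option is
to skip the single regular expression and scan the text with a stack. The
sketch below is only that, a sketch: nested_spans is a made-up helper name,
and unlike the regular expressions above it also accepts empty spans such
as "?!". For the one-level case, a negative lookahead can stand in for the
negated class [^?!] when the delimiters are longer than one character.

```python
import re


def nested_spans(text, start="?", end="!"):
    """Hypothetical helper: return every start...end span, nested ones too.

    Inner spans are reported before the spans that enclose them.
    """
    spans = []
    stack = []  # indices of start delimiters still waiting for an end
    i = 0
    while i < len(text):
        if text.startswith(start, i):
            stack.append(i)
            i += len(start)
        elif text.startswith(end, i):
            if stack:  # an end delimiter without a start is ignored
                begin = stack.pop()
                spans.append(text[begin:i + len(end)])
            i += len(end)
        else:
            i += 1
    return spans


print(nested_spans("aaa?HHH!bbb?JJJJ!ccc?KKKK!dddd"))
# ['?HHH!', '?JJJJ!', '?KKKK!']
print(nested_spans("aaa?HHHbbb?JJJJ!ccc!?KKKK!dddd"))
# ['?JJJJ!', '?HHHbbb?JJJJ!ccc!', '?KKKK!']
print(nested_spans("a<?>X<?>Y<!>Z<!>b", start="<?>", end="<!>"))
# ['<?>Y<!>', '<?>X<?>Y<!>Z<!>']

# One-level regex variant: "any character that does not begin a delimiter"
# replaces [^?!] for the delimiters <?> and <!>.
not_delim = r"(?:(?!<\?>|<!>).)+"
flat = re.compile(r"<\?>" + not_delim + r"<!>")
print(flat.findall("a<?>X<?>Y<!>Z<!>b"))
# ['<?>Y<!>']
```

The scanner's depth is bounded only by the length of the list used as a
stack, so it does not depend on how many nested groups a pattern spells out.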



