regular expression

Diez B. Roggisch deets at nospam.web.de
Sun Nov 18 16:54:21 EST 2007


gardsted wrote:
> I just can't seem to get it:
> I was having some trouble with finding the first <REAPER_PROJECT in the 
> following with this regex:
> 
> Should these two approaches behave similarly?
> I used hours before I found the second one,
> but then again, I'm not so smart...:
> 
> kind retards
> jorgen / de mente
> using python 2.5.1
> -------------------------------------------
> import re
> 
> TESTTXT="""<REAPER_PROJECT 0.1
>   <METRONOME 6 2.000000
>     SAMPLES "" ""
>   >
>   <TRACK
>     MAINSEND 1
>     <VOLENV2
>       ACT 1
>     >
>     <PANENV2
>       ACT 1
>     >
>   >
>  >
> """
> print "The First approach - flags in finditer"
> rex = re.compile(r'^<(?P<TAGNAME>[a-zA-Z0-9_]*)')
> for i in rex.finditer(TESTTXT,re.MULTILINE):
>     print i,i.groups()
> 
> print "The Second approach - flags in pattern "
> rex = re.compile(r'(?m)^<(?P<TAGNAME>[a-zA-Z0-9_]*)')
> for i in rex.finditer(TESTTXT):
>     print i,i.groups()

What the heck is that format? XML's retarded cousin living in the attic?

Ok, back to the problem then...

This works for me:

rex = re.compile(r'^<(?P<TAGNAME>[a-zA-Z0-9_]+)', re.MULTILINE)
for i in rex.finditer(TESTTXT):
    print i, i.groups()
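
As for why the first approach finds nothing: on a compiled pattern,
the second positional argument of finditer is the start position
(pos), not a flags value. re.MULTILINE happens to be the integer 8,
so the search starts at index 8, and a plain ^ cannot match there.
A minimal sketch of that, written for Python 3 rather than 2.5, with
SAMPLE as a shortened stand-in of my own for TESTTXT:

```python
import re

# Hypothetical, shortened stand-in for TESTTXT from the original post.
SAMPLE = """<REAPER_PROJECT 0.1
  <TRACK
    MAINSEND 1
  >
>
"""

rex = re.compile(r'^<(?P<TAGNAME>[a-zA-Z0-9_]+)')

# The second positional argument here is `pos`, not flags.
# re.MULTILINE == 8, so the search begins at index 8, and without
# MULTILINE the ^ anchor only matches at the real start of the string.
flags_as_pos = [m.group('TAGNAME')
                for m in rex.finditer(SAMPLE, re.MULTILINE)]

# Flags belong in re.compile (or inline in the pattern as (?m)).
rex_m = re.compile(r'^<(?P<TAGNAME>[a-zA-Z0-9_]+)', re.MULTILINE)
flags_in_compile = [m.group('TAGNAME') for m in rex_m.finditer(SAMPLE)]

print(flags_as_pos)      # []
print(flags_in_compile)  # ['REAPER_PROJECT']
```

So the flag has to go into re.compile or into the pattern itself.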

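However, you might consider dropping the ^, because otherwise you 
_only_ match tags that start at the very beginning of a line. In your 
sample that is just <REAPER_PROJECT, since the nested tags are indented. 
Changing the * to a + in TAGNAME is probably better too, so that a bare 
< does not match with an empty name.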
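
To illustrate both points, here is a rough sketch (Python 3 syntax,
not 2.5). SAMPLE is a made-up, shortened variant of the project text,
and the NAME line with a stray < is my own invention to show what the
* quantifier lets through:

```python
import re

# Hypothetical sample; the NAME line is invented for illustration only.
SAMPLE = """<REAPER_PROJECT 0.1
  <TRACK
    NAME "a < b"
    <VOLENV2
      ACT 1
    >
  >
>
"""

# Anchored: only tags that start in column 0 of a line.
anchored = re.compile(r'^<(?P<TAGNAME>[a-zA-Z0-9_]+)', re.MULTILINE)
# Unanchored with +: every tag, and the name must be non-empty.
with_plus = re.compile(r'<(?P<TAGNAME>[a-zA-Z0-9_]+)')
# Unanchored with *: a bare < also matches, with an empty name.
with_star = re.compile(r'<(?P<TAGNAME>[a-zA-Z0-9_]*)')

anchored_names = [m.group('TAGNAME') for m in anchored.finditer(SAMPLE)]
plus_names = [m.group('TAGNAME') for m in with_plus.finditer(SAMPLE)]
star_names = [m.group('TAGNAME') for m in with_star.finditer(SAMPLE)]

print(anchored_names)  # ['REAPER_PROJECT']
print(plus_names)      # ['REAPER_PROJECT', 'TRACK', 'VOLENV2']
print(star_names)      # ['REAPER_PROJECT', 'TRACK', '', 'VOLENV2']
```

If you really want only the top-level tag, keeping the ^ is of course
the right choice.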

Diez


