Regular expression problem
Tim Legant
tim-dated-1015476021.3d1cbb at catseye.net
Wed Feb 27 23:40:21 EST 2002
Asheesh Laroia <pan-news at asheeshenterprises.com> writes:
> This is great, thanks!
>
> Only one problem. I'm having trouble (I did give it a try) making the
> following work:
>
> <@Trap Body text:>Useful Text
>
> I need to still be able to extract "Useful Text", not delete it.
>
> Thanks again!
Try this:
>>> import re
>>> rc = re.compile(r'<@Trap\s+\w+\s+\w+=?(?:<.+?>)*>', re.MULTILINE|re.DOTALL)
>>> text = """<@Trap Body text=<FONT "Times"><CCOLOR\n "Black"><
11><HORIZONTAL 100><LETTERSPACE 0><CTRACK 127><CSSIZE 70><C+SIZE\n
58.3><C-POSITION 33.3><C+POSITION 33.3><P><CBASELINE 0><CNOBREAK
0><CLEADING -0.05\n ><GGRID 0><GLEFT 0><GRIGHT 0><GFIRST 19.2><G+BEFORE
0><G+AFTER 0><GALIGNMENT \n "justify\n "><GMETHOD "proportional"><G&
"ENGLISH"><GPAIRS 4><G% 120><GKNEXT 0><GKWIDOW \n 1><GKORPHAN\n
1><GTABS $><GHYPHENATION 2 36 0><GWORDSPACE 75 100 150><GSPACE -5 0
25>>Useful Text"""
>>> rc.sub('', text)
'Useful Text'
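
For reference, here is a minimal sketch of the same idea written with re.VERBOSE so that each piece of the pattern carries a comment. The names trap_rc and strip_trap are made up for illustration, and the [:=]? separator is an assumption added to cover the "text:" form from the quoted question (the pattern above accepts only an optional "=").

```python
import re

# Commented variant of the pattern above. The [:=]? separator is an
# assumption for the "text:" form in the quoted question; the pattern
# in the session above uses =? only.
trap_rc = re.compile(r"""
    <@Trap \s+ \w+ \s+ \w+   # "<@Trap", then two words such as "Body text"
    [:=]?                    # optional ":" or "=" after the second word
    (?: < .+? > )*           # zero or more nested <...> tags (may span lines)
    >                        # the ">" that closes the @Trap tag itself
""", re.VERBOSE | re.DOTALL)

def strip_trap(text):
    """Remove each <@Trap ...> tag and keep whatever follows it."""
    return trap_rc.sub('', text)

print(strip_trap('<@Trap Body text:>Useful Text'))
print(strip_trap('<@Trap Body text=<FONT "Times"><CCOLOR\n "Black">>Useful Text'))
```

Both calls should print the text that follows the tag. Because .+? is combined with re.DOTALL, a nested tag can span line breaks; if the nested tags are known never to contain "<" or ">", [^<>]+ in place of .+? would be a stricter choice.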
Tim