Regex - where do I make a mistake?

Johny python at hope.cz
Fri Feb 16 08:34:55 EST 2007


On Feb 16, 2:14 pm, Peter Otten <__pete... at web.de> wrote:
> Johny wrote:
> > I have
> > string="""<span class="test456">55</span>.
> > <td><span class="test123">128</span>
> > <span class="test789">170</span>
> > """
>
> > where I need to replace
> > <span class="test456">55</span>.
> > <span class="test789">170</span>
>
> > by space.
> > So I tried
>
> > #############
> > import re
> > string="""<td><span class="test456">55</span>.<span
> > class="test123">128</span><span class="test789">170</span>
> > """
> > Newstring=re.sub(r'<span class="test(?!123)">.*</span>'," ",string)
> > ###########
>
> > But it does NOT work.
> > Can anyone explain why?
>
> "(?!123)" is a negative "lookahead assertion", i. e. it ensures that "test"
> is not followed by "123", but /doesn't/ consume any characters. For your
> regex to match "test" must be /immediately/ followed by a '"'.
>
> Regular expressions are too lowlevel to use on HTML directly. Go with
> BeautifulSoup instead of trying to fix the above.
>
> Peter

Yes, I know "(?!123)" is a negative "lookahead assertion",
but I do not know exactly why it does not work. I thought that

(?!...)
Matches if ... doesn't match next.  For example, Isaac (?!Asimov) will
match 'Isaac ' only if it's not followed by 'Asimov'.
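
A small sketch of what seems to be going on here, using the sample string from the quoted post. The lookahead behaves as the documentation says, but because it consumes nothing, the original pattern still requires a '"' directly after "test". The names `text`, `original`, `fixed` and `result` are only illustrative, and the `fixed` pattern is one possible repair that assumes the class names always end in digits; it is not necessarily what Peter had in mind, and his advice to use an HTML parser still stands.

```python
import re

# Sample input from the post, joined into one line for clarity.
text = ('<td><span class="test456">55</span>.'
        '<span class="test123">128</span>'
        '<span class="test789">170</span>')

# The documentation example works as described: the lookahead only checks
# what follows, it does not become part of the match.
assert re.search(r'Isaac (?!Asimov)', 'Isaac Newton').group() == 'Isaac '
assert re.search(r'Isaac (?!Asimov)', 'Isaac Asimov') is None

# Original pattern: after the (zero-width) lookahead, the very next
# character must be '"'. In the input, "test" is always followed by
# digits, so nothing matches.
original = r'<span class="test(?!123)">.*</span>'
print(re.search(original, text))  # None

# One possible fix (an assumption, not from the thread): keep the
# lookahead, then actually consume the digits with \d+, and use the
# non-greedy .*? so each match stops at the first closing tag.
fixed = r'<span class="test(?!123")\d+">.*?</span>'
result = re.sub(fixed, " ", text)
print(result)  # <td> .<span class="test123">128</span> 
```

Without the non-greedy `.*?`, a single match would run from the first opening tag to the last `</span>` and swallow the "test123" span as well.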



