emacs lisp text processing example (html5 figure/figcaption)
Ian Kelly
ian.g.kelly at gmail.com
Tue Jul 5 18:09:46 EDT 2011
On Tue, Jul 5, 2011 at 2:37 PM, Xah Lee <xahlee at gmail.com> wrote:
> but in anycase, i can't see how this part would work
> <p class="cpt">((?:[^<]|<(?!/p>))+)</p>
It's not that different from the pattern 「alt="[^"]+"」 earlier in the
regex. The capture group accepts one or more characters that either
aren't '<', or that are '<' but are not immediately followed by '/p>'.
Thus it stops capturing when it sees exactly '</p>' without consuming
the '<'. Using my regex (pattern and replace as defined in my earlier
message) with the example that you posted earlier demonstrates that it
works:
>>> import re
>>> s = '''<div class="img">
... <img src="jamie_cat.jpg" alt="jamie's cat" width="167" height="106">
... <p class="cpt">jamie's cat! Her blog is <a href="http://example.com/
... jamie/">http://example.com/jamie/</a></p>
... </div>'''
>>> print re.sub(pattern, replace, s)
<figure>
<img src="jamie_cat.jpg" alt="jamie's cat" width="167" height="106">
<figcaption>jamie's cat! Her blog is <a href="http://example.com/
jamie/">http://example.com/jamie/</a></figcaption>
</figure>
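To see the caption part on its own, here is a minimal sketch that uses only the fragment quoted above, not the full pattern or replacement string from my earlier message. The names caption, s, m and inner, and the shortened sample string, are made up for illustration.

```python
import re

# Only the caption fragment of the regex; the rest of the full pattern
# is left out on purpose.
caption = re.compile(r'<p class="cpt">((?:[^<]|<(?!/p>))+)</p>')

# Hypothetical sample: a caption containing a nested <a> tag, followed
# by a second, unrelated paragraph.
s = ('<p class="cpt">Her blog is '
     '<a href="http://example.com/jamie/">here</a></p><p>next</p>')

m = caption.search(s)

# Each '<' inside the group is accepted only when '/p>' does not follow
# it, so '<a ...>' and '</a>' are captured, and the group ends just
# before the first '</p>'.
inner = m.group(1)
print(inner)
```

The negative lookahead (?!/p>) consumes nothing, so the literal </p> at the end of the pattern still has the whole closing tag left to match, and the second paragraph is never touched.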
Cheers,
Ian