[Tutor] Regex across multiple lines

Liam Clarke ml.cyresse at gmail.com
Wed Apr 26 14:25:42 CEST 2006


Hi Frank, just bear in mind that the pattern:

patObj = re.compile("<title>.*</title>", re.DOTALL)

will match

<title>
   This is my title
</title>

But, it'll also match

<title>
   This is my title
</title>
<p>Some content here</p>
<title>
    Another title (not going to happen with a title tag in HTML;
    this is just an illustration)
</title>

All of that.

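You've got to watch .* with re.DOTALL: .* is greedy, so it grabs as
much as it can and runs through to the last </title> in the text. Try
using .*? instead; it makes the match non-greedy, so it stops at the
first </title>. Functionality for your current use case won't change,
but when you have a different use case you won't spend ages trying to
figure out why half your data is matching. >_<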
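
Here's a quick sketch of the difference, in case it helps. The sample
text and the names html_text, greedy and lazy are made up for
illustration; only the two patterns come from the discussion above.

```python
import re

# Made-up sample: two <title> blocks with other content between them.
html_text = """<title>
   This is my title
</title>
<p>Some content here</p>
<title>
    Another title
</title>"""

# Greedy: .* runs on to the LAST </title> it can find.
greedy = re.compile("<title>.*</title>", re.DOTALL)
# Non-greedy: .*? stops at the FIRST </title> after each <title>.
lazy = re.compile("<title>.*?</title>", re.DOTALL)

greedy_matches = greedy.findall(html_text)
lazy_matches = lazy.findall(html_text)

print(len(greedy_matches))  # 1 (the whole text, <p> included)
print(len(lazy_matches))    # 2 (one per title block)
```

With the greedy pattern, a sub() call would replace everything from
the first <title> to the last </title> in one go, which is rarely
what you want.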

To the Tutor list - can't re.MULTILINE also be used? I've never really
used that flag.
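
For what it's worth, a quick experiment suggests re.MULTILINE isn't a
substitute here: as far as I can tell it only changes where ^ and $
match, not what . matches. A minimal sketch, with a made-up sample
string and variable names:

```python
import re

# Made-up sample: a title spread over three lines.
text = "<title>\n   This is my title\n</title>"

# With MULTILINE alone, . still refuses to match a newline.
multiline_match = re.search("<title>.*?</title>", text, re.MULTILINE)
# With DOTALL, . matches newlines too, so the pattern succeeds.
dotall_match = re.search("<title>.*?</title>", text, re.DOTALL)

print(multiline_match)           # None
print(dotall_match is not None)  # True

# What MULTILINE does change: ^ matches at the start of every line.
anchored_default = re.findall("^<", text)
anchored_multiline = re.findall("^<", text, re.MULTILINE)

print(len(anchored_default))    # 1
print(len(anchored_multiline))  # 2
```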

Regards,

Liam Clarke

On 4/26/06, Frank Moore <francis.moore at rawflow.com> wrote:
> Kent Johnson wrote:
>
> >Use your compiled regex for the sub(), so it will have the DOTALL flag set:
> >html_text = p.sub(replace_string, html_text)
> >
> >
> Kent,
>
> I was trying to work out how to use the DOTALL flag with the sub method,
> but couldn't figure it out.
> It's so obvious once someone points it out. ;-)
>
> Many thanks,
> Frank

