re troubles
Robert Brewer
fumanchu at amor.org
Thu Dec 18 18:50:38 EST 2003
If 'kittens' is ALWAYS in the first column, then there will never be a /
between it and its containing tr start tag. Then you could use:
import re
m = re.compile(r"<tr>[^/]*kittens.*?</tr>", re.DOTALL)
data = m.sub("", data)  # sub() returns the filtered string; data itself is not modified
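
As for why the original pattern misbehaves: the regex engine takes the leftmost position where a match can start, and the lazy ".*?" only keeps the match short from that starting point onward. It never moves the start forward to a later "<tr>". Here is a rough, self-contained sketch comparing the two patterns on the sample table (minus the "<<" annotations). The third pattern, which refuses to cross a "</tr>", is only my assumption about what you would want if 'kittens' could appear in a later column; it is not something I have tested beyond this sample.

```python
import re

# Sample input from the original question, without the "<<" annotations.
data = """<table>
<tr>
<td>Lasers</td><td>17</td> </tr>
<tr>
<td>kittens</td><td>8</td>
</tr>
<tr> <td>robots</td><td>8</td> </tr>
</table>
"""

# Original attempt: the match starts at the FIRST "<tr>", and the lazy
# ".*?" just grows until "kittens" is reachable, swallowing the Lasers row.
lazy = re.compile(r"<tr>.*?kittens.*?</tr>", re.DOTALL)

# Suggested fix: "[^/]*" cannot cross a "/", so it cannot run past the
# "</td>" of an earlier row. Relies on 'kittens' being in the first column.
fixed = re.compile(r"<tr>[^/]*kittens.*?</tr>", re.DOTALL)

# Assumed alternative (not from the thread): each character before
# "kittens" is accepted only if a "</tr>" does not start there, so the
# match stays inside one row even when 'kittens' is in a later column.
general = re.compile(r"<tr>(?:(?!</tr>).)*?kittens.*?</tr>", re.DOTALL)

lazy_result = lazy.sub("", data)        # loses the Lasers row as well
fixed_result = fixed.sub("", data)      # loses only the kittens row
general_result = general.sub("", data)  # same result on this sample

print(fixed_result)
```

For anything beyond simple, regular input like this, an HTML parser is likely a safer tool than a regular expression.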
Robert Brewer
MIS
Amor Ministries
fumanchu at amor.org
> -----Original Message-----
> From: Evanda Remington [mailto:evanda at remingtons.org]
> Sent: Thursday, December 18, 2003 3:23 PM
> To: python-list at python.org
> Subject: re troubles
>
>
> I'm trying to filter some rows of an html table out, based on their
> contents. For input like:
> """
> <table>
> <tr>
> <td>Lasers</td><td>17</td> </tr>
> <tr> << want to filter
> <td>kittens</td><td>8</td> << this out.
> </tr> <<
> <tr> <td>robots</td><td>8</td> </tr>
> </table>
> """
> I would like to completely remove the (3 line) table row that makes
> mention of kittens. The regexp I have tried to use is:
> r"<tr>.*?kittens.*?</tr>".
> When compiled and used with sub("",data), it strangely removes
> everything from the first "<tr>" to the first "</tr>" after kittens.
>
> That is, the ".*?" notation works in the second half, but not in the
> first half. It behaves the same as ".*" should.
>
> Any advice?
>
> -e
>
> --
> Evanda Remington
> evanda at wreck.org
>
> --
> http://mail.python.org/mailman/listinfo/python-list
>