re troubles

Robert Brewer fumanchu at amor.org
Thu Dec 18 18:50:38 EST 2003


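The lazy .*? is behaving correctly. A regex match always begins at the
leftmost position that can succeed, so the search starts at the first
<tr>, and .*? then expands only as far as it must to reach 'kittens',
which carries it across the Lasers row.

If 'kittens' is ALWAYS in the first column, then there will never be a /
between it and the start tag of its containing tr. Then you could use: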

import re

m = re.compile(r"<tr>[^/]*kittens.*?</tr>", re.DOTALL)
data = m.sub("", data)  # sub() returns a new string; it does not change data in place
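
To see the difference between the two patterns, here is a small sketch
run against the sample table from the question (with the "<<" margin
notes stripped out). The names lazy, fixed and tempered are only for
illustration. The third pattern is not part of the suggestion above: it
is a hypothetical variant using a negative lookahead, assuming you need
to handle 'kittens' appearing in a later column as well.

```python
import re

# Sample input from the original question, without the "<<" annotations.
data = """<table>
  <tr>
    <td>Lasers</td><td>17</td> </tr>
  <tr>
    <td>kittens</td><td>8</td>
  </tr>
  <tr> <td>robots</td><td>8</td> </tr>
</table>
"""

# Original attempt: the match starts at the FIRST <tr>, and the lazy .*?
# grows until 'kittens' is reachable, swallowing the Lasers row too.
lazy = re.compile(r"<tr>.*?kittens.*?</tr>", re.DOTALL)
lazy_result = lazy.sub("", data)

# [^/]* cannot cross the '/' in </td> or </tr>, so a match can only begin
# at the <tr> that directly precedes 'kittens'.
fixed = re.compile(r"<tr>[^/]*kittens.*?</tr>", re.DOTALL)
fixed_result = fixed.sub("", data)

# Hypothetical variant (an assumption, not from the original reply):
# refuse to step over another <tr> between the start tag and 'kittens'.
tempered = re.compile(r"<tr>(?:(?!<tr>).)*?kittens.*?</tr>", re.DOTALL)
tempered_result = tempered.sub("", data)

print(lazy_result)
print(fixed_result)
```

With the lazy pattern the Lasers row disappears along with the kittens
row. The other two patterns remove only the kittens row.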


Robert Brewer
MIS
Amor Ministries
fumanchu at amor.org

> -----Original Message-----
> From: Evanda Remington [mailto:evanda at remingtons.org] 
> Sent: Thursday, December 18, 2003 3:23 PM
> To: python-list at python.org
> Subject: re troubles
> 
> 
> I'm trying to filter some rows of an html table out, based on their
> contents.  For input like:
> """
> <table>
>   <tr>
>     <td>Lasers</td><td>17</td> </tr>
>   <tr>                                            <<  want to filter
>     <td>kittens</td><td>8</td>                    <<  this out.
>   </tr>                                           <<
>   <tr> <td>robots</td><td>8</td> </tr>
> </table>
> """
> I would like to completely remove the (3 line) table row that makes
> mention of kittens.  The regexp I have tried to use is:
> r"<tr>.*?kittens.*?</tr>".
> When compiled and used with sub("",data), it strangely removes
> everything from the first "<tr>" to the first "</tr>" after kittens.
> 
> That is, the ".*?" notation works in the second half, but not in the
> first half.  It behaves the same as ".*" should.
> 
> Any advice?
> 
> -e
> 
> -- 
> Evanda Remington
> evanda at wreck.org
> 




