how to get text between HTML tags with URLLIB??

Roy Katz katz at Glue.umd.edu
Fri Aug 18 22:45:34 EDT 2000


There should be a way through urllib, right?
If urllib can't do it, then I see it as a deficiency in urllib.
But thanks for the regexp!
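
For anyone finding this thread later: urllib only fetches the bytes of a page; pulling the text out from between tags is a parsing job. Below is a minimal sketch, assuming a modern Python 3 where the standard library provides html.parser.HTMLParser and urllib.request. The names AnchorTextParser, anchor_texts and fetch are made up for illustration and are not part of urllib.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class AnchorTextParser(HTMLParser):
    """Collect (href, text) pairs for every <a> element fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []          # finished (href, text) pairs
        self._in_anchor = False  # True while we are between <a ...> and </a>
        self._href = None        # href of the <a> we are currently inside
        self._chunks = []        # text pieces seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._in_anchor = True
            self._href = dict(attrs).get("href")
            self._chunks = []

    def handle_data(self, data):
        if self._in_anchor:
            self._chunks.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._in_anchor:
            text = "".join(self._chunks).strip()
            self.links.append((self._href, text))
            self._in_anchor = False


def anchor_texts(html):
    """Return a list of (href, link text) tuples found in an HTML string."""
    parser = AnchorTextParser()
    parser.feed(html)
    parser.close()
    return parser.links


def fetch(url):
    """Hypothetical helper: download a page with urllib and decode it."""
    with urlopen(url) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset, errors="replace")
```

Usage would be something like anchor_texts(fetch(some_url)), which splits the work the way the libraries do: urllib retrieves the page, the parser extracts the text.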


Roey


On Sat, 19 Aug 2000, sp00fD wrote:

> In article <Pine.GSO.4.21.0008181829180.905-100000 at y.glue.umd.edu>,
>   Roy Katz <katz at Glue.umd.edu> wrote:
> > Hello,
> >
> >
> >  <a href=http://wacky.roey.com > 'Roey's Wacky Server of Fun!' </a href>
> >
> > This is really frustrating.  Why isn't this mentioned in the urllib docs?
> > deranged pythoneer
> >
> >
> 
> I take it that you don't want to use a regex?
> 
> #completely untested...but may work ;)
> # Replace each opening <a href=...> tag with just its URL.
> import re
> p = re.compile(r'<a href=([^\s>]+)[^>]*>', re.IGNORECASE)
> stripped_url = p.sub(r'\1', url)
> 
> 
> 
> 
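
If the regex route is good enough, here is a rough sketch that captures the text between the tags as well as the URL. It assumes simple, non-nested anchors like the one quoted above (bare or quoted href, and a closing tag that may carry stray text such as "</a href>"); ANCHOR_RE and anchor_pairs are illustrative names only, and a regex will not cope with arbitrary HTML.

```python
import re

# Group 1: the href value (quoted or bare). Group 2: everything up to </a ...>.
ANCHOR_RE = re.compile(
    r'<a\s+href\s*=\s*["\']?([^"\'\s>]+)["\']?[^>]*>(.*?)</a\b[^>]*>',
    re.IGNORECASE | re.DOTALL,
)


def anchor_pairs(html):
    """Return (href, link text) tuples for each simple <a href=...> element."""
    return [(href, text.strip()) for href, text in ANCHOR_RE.findall(html)]
```

The lazy (.*?) stops at the first closing </a> tag, so two links on one line come back as two separate pairs instead of one long match.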



