Starting with Python, need some pointers with manipulating strings

Paul McGuire ptmcg at austin.rr._bogus_.com
Thu Jan 27 16:46:05 EST 2005


"Benji99" <bob at nospam.net> wrote in message
news:41f95bb2$0$24349$9a6e19ea at unlimited.newshosting.com...
>
> Basically, I'm getting the HTML source from a URL and need to
> a.) find specific URLs
> b.) find specific data
> c.) with specific URLs, load new html pages and repeat.
>
<snip>
>
> Basically, I want to search through the whole string
> (htmlSource) for a specific keyword. When it's found, I want to
> know which line it's on so that I can retrieve that line, and
> then I should be able to parse/extract what I need using regular
> expressions (which I'm getting quite comfortable with). So how
> can this be accomplished?
>
If you download pyparsing (at http://pyparsing.sourceforge.net), you'll find
among the examples something very close to this, called urlextractor.py (it
lists all the hrefs and their associated links on the page at www.yahoo.com).
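
If you would rather start with just the standard library, here is a minimal
sketch of the line-based search described in the question. It is not the
pyparsing example. The names find_keyword_lines and extract_hrefs, the sample
page, and the href pattern are all made up for illustration, and a regular
expression this simple will likely miss unquoted or unusually formatted
attributes.

```python
import re

# Hypothetical pattern: matches href="..." or href='...', case-insensitively.
HREF_RE = re.compile(r'href\s*=\s*["\']([^"\']+)["\']', re.IGNORECASE)


def find_keyword_lines(html_source, keyword):
    """Return (line_number, line) pairs for each line containing keyword."""
    return [
        (number, line)
        for number, line in enumerate(html_source.splitlines(), start=1)
        if keyword in line
    ]


def extract_hrefs(line):
    """Return every quoted href value found on a single line."""
    return HREF_RE.findall(line)


# Stand-in for the htmlSource string fetched from a URL.
sample = """<html>
<body>
<a href="http://example.com/a">First</a>
<p>no links here</p>
<A HREF='http://example.com/b'>Second</A>
</body>
</html>"""

hits = find_keyword_lines(sample, "example.com")
for number, line in hits:
    print(number, extract_hrefs(line))
```

Each hit carries its 1-based line number, so you can go back to that line
later or feed it to a more specific expression. For pages where one tag spans
several lines, a real parser is probably the safer choice.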

-- Paul
