Starting with Python, need some pointers with manipulating strings

Benji99 bob at nospam.net
Thu Jan 27 16:22:58 EST 2005


Hi guys, I'm starting to learn Python and so far am very 
impressed with its possibilities. I do however need some help 
with certain things I'm trying to do, which I haven't yet 
managed to work out by myself. Hopefully, someone will be 
able to give me some pointers :)

First my background, I haven't programmed seriously in over 5 
years, but recently have started programming again in 
Delphi/Pascal scripting, and that's what I'm most familiar with 
right now. I'm also much more comfortable with structured 
programming in contrast to OO (which isn't helping much with 
Python :))

Anyway, I have a very specific project in mind which I've mostly 
implemented in Pascal and I'd like to implement it in Python 
since the possibilities after that are much more interesting.

Basically, I'm getting the HTML source from a URL and need to:
a.) find specific URLs
b.) find specific data
c.) with specific URLs, load new html pages and repeat.

I've managed to load the html source I want into an object 
called htmlSource using:

>>> import urllib
>>> sock = urllib.urlopen("URL Link")
>>> htmlSource = sock.read()
>>> sock.close()

I'm assuming that htmlSource is a string with \n at the end of 
each line.
NOTE: I've become very accustomed to the TStringList class in 
Delphi so forgive me if I'm trying to work in that way with 
Python...

Basically, I want to search through the whole string 
(htmlSource) for a specific keyword. When it's found, I want to 
know which line it's on so that I can retrieve that line, and 
then I should be able to parse/extract what I need using regular 
expressions (which I'm getting quite comfortable with). So how 
can this be accomplished?
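
To make the goal concrete, here is a minimal sketch of the kind of 
line lookup described above. It assumes the page is already held in 
one string; find_lines and the sample text are hypothetical names 
made up for illustration, and this is only one possible approach.

```python
def find_lines(text, keyword):
    """Return (line_number, line) pairs for every line containing keyword."""
    matches = []
    # splitlines() breaks the string on its line endings, giving a list
    # of lines (roughly what a TStringList would hold in Delphi).
    # enumerate() supplies the zero-based index of each line.
    for number, line in enumerate(text.splitlines()):
        if keyword in line:
            matches.append((number, line))
    return matches


# Hypothetical stand-in for htmlSource, so no network access is needed.
sample = "<html>\n<a href='/a'>first</a>\n<p>price: 10</p>\n</html>"
hits = find_lines(sample, "price")
```

Each entry in hits pairs a zero-based line index with the full text 
of that line, which can then be handed to a regular expression.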

The second main thing I'd like to know has to do with urllister. 
I'm very intrigued by its ability to automatically grab URL links 
from the source, but I've only managed to get it to retrieve 
everything, which is a lot. What are my options in terms of 
getting it to be more specific? Can I tell it to retrieve a URL 
only if a keyword is found on the same line of the string?
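
As a rough illustration of that kind of filtering, the sketch below 
does not use urllister at all. It swaps in a plain regular expression 
applied line by line, and assumes links appear as quoted href 
attributes. The names HREF_PATTERN, urls_on_keyword_lines and page 
are hypothetical.

```python
import re

# Assumption: links look like href="..." or href='...'.
# Unquoted attribute values are not handled by this pattern.
HREF_PATTERN = re.compile(r'href\s*=\s*["\']([^"\']+)["\']', re.IGNORECASE)


def urls_on_keyword_lines(text, keyword):
    """Collect href values, but only from lines that contain keyword."""
    urls = []
    for line in text.splitlines():
        if keyword in line:
            # findall() returns the captured group for every match
            # on this line, so several links per line are all kept.
            urls.extend(HREF_PATTERN.findall(line))
    return urls


# Hypothetical sample page, so no network access is needed.
page = (
    '<a href="/news/1.html">news item</a>\n'
    '<a href="/about.html">about us</a>\n'
    '<a href="/news/2.html">another news item</a> <a href="/x.html">x</a>\n'
)
found = urls_on_keyword_lines(page, "news")
```

Note that every link on a matching line is returned, including ones 
unrelated to the keyword, so a stricter test may be needed when a 
line holds several links.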

Hopefully someone will be able/willing to give me a hand. I 
think with these roadblocks out of the way, I should be able to 
figure out the rest of what I need. Thanks in advance!

Benji99




