Parsing

Simon Bayling sfb at alysseum.com
Thu Jul 10 16:04:27 EDT 2003


whatsupg21 at hotmail.com (Michael) wrote in 
news:e5fb8973.0307100938.13fcea56 at posting.google.com:

> I have been assigned a project to parse a webpage for data using
> Python. I have finished only basic tutorials. Any suggestions as to
> where I should go from here? Thanks in advance.
> 

Parsing? What are you looking for?
Do you have to download the page as well?

If it's a fairly simple thing to find, you could use something like:

>>> import urllib
>>> source = urllib.urlopen("http://www.google.com").readlines()
>>> for line in source:
...     if line.find("logo.gif") > -1:
...         print "Found google logo"
...
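
The session above uses Python 2 spellings (urllib.urlopen and the print statement). As a rough sketch of the same idea in Python 3, the search can be split from the download so it can be tried without a network connection. The helper names lines_containing and fetch_lines and the sample lines are made up for illustration, and decoding the page as UTF-8 is an assumption.

```python
import urllib.request


def lines_containing(lines, needle):
    """Return the lines that contain the substring `needle`."""
    return [line for line in lines if needle in line]


def fetch_lines(url):
    """Download `url` and return its body as a list of text lines."""
    with urllib.request.urlopen(url) as response:
        # Assumption: the page is UTF-8; undecodable bytes are replaced.
        text = response.read().decode("utf-8", errors="replace")
    return text.splitlines()


# Hypothetical in-memory sample, so the search runs without the network.
sample = ["<html>", '<img src="/images/logo.gif">', "</html>"]
matches = lines_containing(sample, "logo.gif")
if matches:
    print("Found google logo")
```

To search a live page instead, pass fetch_lines("http://www.google.com") in place of sample.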

If the data you need is more complicated, or you need to parse the HTML 
itself, look at the other string methods, and maybe regular expressions 
(import re)...
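
As a minimal sketch of the regular-expression route, the following pulls the src value out of each img tag in a string of HTML. The pattern, the function name image_sources and the sample page are assumptions for illustration. A pattern like this is brittle on real-world HTML (unquoted attributes, comments, odd nesting), so treat it as a starting point rather than a full parser.

```python
import re

# Capture the quoted value of src inside an <img ...> tag, ignoring case.
IMG_SRC = re.compile(r"""<img\b[^>]*?\bsrc\s*=\s*["']([^"']+)["']""", re.IGNORECASE)


def image_sources(html):
    """Return the src attribute of every <img> tag found in `html`."""
    return IMG_SRC.findall(html)


# Hypothetical sample page; in practice this would be the downloaded source.
page = '<p><img src="/images/logo.gif" alt="logo"><IMG alt="x" SRC=\'a.png\'></p>'
sources = image_sources(page)
for src in sources:
    print(src)
```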

Cheers,
Simon.



