extract news article from web
Fuzzyman
fuzzyman at gmail.com
Thu Dec 23 10:06:47 EST 2004
If the page is reliably structured, you can write a custom parser.
As Steve points out, BeautifulSoup would be a very good place to
start.
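To give a rough idea of what a custom parser looks like, here is a minimal sketch. It uses the standard library's html.parser rather than BeautifulSoup (which offers a friendlier API and copes better with broken markup) so that it runs without extra installs. The page layout is an assumption: the headline is taken to sit in an <h1> and the body text in <p> tags inside <div class="article">. The names ArticleParser and extract_article are made up for the example.

```python
from html.parser import HTMLParser


class ArticleParser(HTMLParser):
    """Collect the <h1> headline and the <p> text inside <div class="article">.

    The tag and class names are hypothetical; adjust them to the real page.
    """

    def __init__(self):
        super().__init__()
        self.headline = ""
        self.paragraphs = []
        self._div_depth = 0   # > 0 while inside the article <div>
        self._in_h1 = False
        self._in_p = False

    def handle_starttag(self, tag, attrs):
        if tag == "div":
            # Enter the article <div>, or track a <div> nested inside it.
            if self._div_depth or ("class", "article") in attrs:
                self._div_depth += 1
        elif tag == "h1":
            self._in_h1 = True
        elif tag == "p" and self._div_depth:
            self._in_p = True
            self.paragraphs.append("")

    def handle_endtag(self, tag):
        if tag == "div" and self._div_depth:
            self._div_depth -= 1
        elif tag == "h1":
            self._in_h1 = False
        elif tag == "p":
            self._in_p = False

    def handle_data(self, data):
        if self._in_h1:
            self.headline += data
        elif self._in_p:
            # Text inside inline tags such as <b> or <a> ends up here too.
            self.paragraphs[-1] += data


def extract_article(html):
    """Return (headline, list_of_paragraphs) for one page of HTML text."""
    parser = ArticleParser()
    parser.feed(html)
    parser.close()
    return parser.headline.strip(), [p.strip() for p in parser.paragraphs]
```

Calling extract_article(html_text) returns the headline and a list of paragraph strings. Anything outside the article <div>, such as navigation or footer text, is ignored.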
This is the problem that RSS was designed to solve. Many news sites
supply exactly the information you want as an RSS feed. If one is
available, you can use Universal Feed Parser to process it.
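For illustration, here is roughly what pulling the items out of a feed involves. This sketch is not Universal Feed Parser: it uses the standard library's xml.etree.ElementTree and assumes a well-formed RSS 2.0 document with the usual <channel>/<item> layout. Universal Feed Parser handles many more feed formats and badly formed feeds, so treat this only as a picture of the idea. The function name parse_rss_items is hypothetical.

```python
import xml.etree.ElementTree as ET


def parse_rss_items(rss_text):
    """Return (title, link, description) tuples from an RSS 2.0 document.

    Assumes well-formed XML; missing elements come back as empty strings.
    """
    root = ET.fromstring(rss_text)
    items = []
    for item in root.iterfind("./channel/item"):
        items.append((
            item.findtext("title", default="").strip(),
            item.findtext("link", default="").strip(),
            item.findtext("description", default="").strip(),
        ))
    return items
```

Each tuple gives you the headline, the URL of the full article, and the summary the site chose to publish, with no page scraping required.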
The module you need for fetching the webpages (in case you didn't know)
is urllib2. There is a great article on fetching webpages in the
current issue of pyzine. See http://www.pyzine.com :-)
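A minimal sketch of the fetching step follows. Note the assumption: urllib2 is the Python 2 module, and in Python 3 its functionality lives in urllib.request, which is what this example uses. The function name fetch_page, the 10 second timeout, and the fallback to UTF-8 when the server declares no charset are all choices made for the example.

```python
import urllib.request


def fetch_page(url, timeout=10, default_charset="utf-8"):
    """Download url and return its body decoded to text.

    Uses the charset from the Content-Type header when the server sends
    one, otherwise falls back to default_charset (an assumption).
    """
    with urllib.request.urlopen(url, timeout=timeout) as response:
        raw = response.read()
        charset = response.headers.get_content_charset() or default_charset
    # Replace undecodable bytes instead of raising, since real pages
    # do not always match the charset they declare.
    return raw.decode(charset, errors="replace")
```

The returned text can then be handed to whatever parser you use for the page. Network errors surface as urllib.error.URLError, so real code would want to catch that around the call.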
Regards,
Fuzzy
http://www.voidspace.org.uk/python/index.shtml