Is Python good for web crawlers?

Paul Rubin http
Tue Feb 7 15:31:52 EST 2006


"Tempo" <bradfordh at gmail.com> writes:
> I was wondering if python is a good language to build a web crawler
> with? For example, to construct a program that will routinely search x
> amount of sites to check the availability of a product. Or to search
> for news articles containing the word 'XYZ'. These are just random
> ideas to try to explain my question a bit further.

I've written a few of these in Python.  The language itself is fine
for this.  The built-in libraries do most of what you'd hope, though
they have room for improvement.  Generally I use
urllib.urlopen(url).read() to get the whole HTML page as a string,
then process it from there.  I just look for the substrings I'm
interested in, making no attempt to actually parse the HTML into a
DOM or anything like that.
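
A rough sketch of that fetch-then-search approach, written for current
Python 3, where the call lives in urllib.request rather than in urllib
itself.  The names fetch_page and find_terms, the 10-second timeout, and
the UTF-8 fallback are assumptions made for illustration, not anything
from the original crawlers.

```python
import urllib.request


def fetch_page(url, timeout=10):
    """Download a page and return its whole body as one string."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        # Use the charset from the Content-Type header if the server sent one;
        # falling back to UTF-8 is an assumption of this sketch.
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def find_terms(html, terms):
    """Return the terms that occur in html (case-insensitive substring match).

    No HTML parsing is done: markup, scripts, and visible text are all
    searched alike, so a term that only appears inside a tag still matches.
    """
    lowered = html.lower()
    return [term for term in terms if term.lower() in lowered]
```

Something like find_terms(fetch_page(url), ["XYZ"]) would then cover the
news-article case from the question, with url being whichever page is
being checked.  Plain substring search is simple and tolerant of broken
markup, but it cannot tell visible text from markup, which is the
trade-off of skipping the DOM.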


