search an entire website given the homepage URL
Fredrik Lundh
fredrik at pythonware.com
Tue Apr 25 13:20:42 EDT 2006
"Bell, Kevin" wrote:
> I know I can use urllib2 to get at a website given urllib2.urlopen(url)
> but I'm unsure how to then go through all pages that are linked to it,
> but still in the domain. If I want to search through the entire python
> website given the homepage, how would I go about it?
use a search engine (try the search box in the upper right corner).
using a spider to download the entire site just so you can "search through
it" is bloody impolite.
if you have a valid reason to download portions of the site, use wget's mirror
option (--mirror), or some similar tool, and be nice. there's a tool called
"websucker" in the Tools/webchecker directory of the standard Python distribution
that can also be used to mirror portions of a site:
http://svn.python.org/view/python/trunk/Tools/webchecker/
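if you do have a valid reason to walk a few pages yourself, the part that keeps a crawler inside one domain might look roughly like the sketch below. this is not how wget or websucker do it; it's a minimal illustration that assumes Python 3's standard library (html.parser, urllib.parse, urllib.robotparser) rather than urllib2. the names LinkCollector and same_site_links are made up for the example, and fetching, rate limiting and error handling are left out on purpose.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag, urlparse
from urllib.robotparser import RobotFileParser


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in a page (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def same_site_links(page_url, html, robots=None, agent="*"):
    """Return sorted absolute links found in html that stay on page_url's host.

    robots is an optional RobotFileParser; links it disallows are skipped.
    """
    collector = LinkCollector()
    collector.feed(html)
    host = urlparse(page_url).netloc
    found = set()
    for href in collector.hrefs:
        # Resolve relative links and drop any "#fragment" part.
        absolute, _fragment = urldefrag(urljoin(page_url, href))
        parts = urlparse(absolute)
        # Skip mailto:, javascript:, and anything on another host.
        if parts.scheme not in ("http", "https") or parts.netloc != host:
            continue
        # Be nice: honour robots.txt rules when they are supplied.
        if robots is not None and not robots.can_fetch(agent, absolute):
            continue
        found.add(absolute)
    return sorted(found)
```

a real crawler would feed each returned URL back in, keep a set of pages already visited, and sleep between requests. the robots argument can be filled by pointing a RobotFileParser at the site's /robots.txt with set_url() and read() before the first fetch.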
</F>