web crawler help?

Sorin Gherman s_gherman at yahoo.com
Tue Sep 10 15:22:31 EDT 2002


"koko" <kokohh at hotmail.com> wrote in message news:<JARd9.6927$yt3.3340577 at newssrv26.news.prodigy.com>...
> is there any sample for basic web crawler, that ask for a starting url and
> log the url and extract the hyperlinks?
> thx

    There's a very simple one in Mark Pilgrim's "Dive into Python" book,
whose text is freely available at: http://diveintopython.org/
    Check the "HTML processing" chapter. It contains a 9-line program,
urllister.py, followed by a 7-line usage example which does just
that: given the URL of an HTML file, it lists the hyperlinks inside it.
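
    For a rough idea of the approach, here is a minimal sketch of the
same technique. It is not the book's code: the book's urllister.py is
built on sgmllib, which no longer exists in Python 3, so this version
assumes Python 3 and uses html.parser from the standard library instead.
The names LinkLister, list_links and fetch_links are made up for
illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class LinkLister(HTMLParser):
    """Collect the href value of every <a> tag fed to the parser."""

    def __init__(self, base_url=""):
        super().__init__()
        self.base_url = base_url  # used to resolve relative links
        self.urls = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                self.urls.append(urljoin(self.base_url, value))


def list_links(html, base_url=""):
    """Return the hyperlinks found in an HTML string, in document order."""
    parser = LinkLister(base_url)
    parser.feed(html)
    parser.close()
    return parser.urls


def fetch_links(url):
    """Download url and return the hyperlinks it contains (needs network)."""
    with urlopen(url) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        html = response.read().decode(charset, errors="replace")
    return list_links(html, base_url=url)
```

    Calling fetch_links() on a starting URL gives one page's worth of
links. A basic crawler would log each URL and repeat the call for every
link it has not visited yet, probably with a limit on depth or page count.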

/sorin gherman


