spidering script

Jonathan Curran jonc at icicled.net
Thu Jan 18 13:58:04 EST 2007


On Thursday 18 January 2007 11:57, David Waizer wrote:
> Hello..
>
> I'm looking for a script (perl, python, sh...) or program (such as wget)
> that will help me get a list of ALL the links on a website.
>
> For example, ./magicscript.pl www.yahoo.com would output the list to a file;
> it would be kind of like spidering software.
>
> Any suggestions would be appreciated.
>
> David

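David, this is a touchy topic, but whatever :P Look into Python's sgmllib 
module: subclass SGMLParser and collect the href attribute of every "a" tag 
as the page is parsed. The book 'Dive Into Python' covers this quite nicely 
in its HTML processing chapter: 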
http://www.diveintopython.org/html_processing/index.html
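
Here is a rough sketch of the "filter on the A tag" idea, in case it helps. One swap to be aware of: sgmllib no longer exists in Python 3, so this uses html.parser.HTMLParser from the standard library instead, which works in much the same way. The names LinkCollector and extract_links and the example.com URLs are made up for illustration. It only pulls links out of one page of HTML that you already have; fetching pages and following the links is not shown.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names,
        # so <A HREF=...> arrives here as tag "a" with attribute "href".
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Turn relative links into absolute ones.
                self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    """Return the absolute URLs of all links found in an HTML string."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    parser.close()
    return parser.links


if __name__ == "__main__":
    # Hypothetical sample page; a real script would download the HTML first.
    sample = '<p><A HREF="/news">News</A> <a href="http://example.org/x">X</a></p>'
    for link in extract_links(sample, "http://www.example.com/"):
        print(link)
```

To cover a whole site you would call something like this on each downloaded page, keep a set of URLs already seen, and only follow links that stay on the same host.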

Jonathan


