spidering script

Bernard bernard.chhun at gmail.com
Fri Jan 19 15:14:52 EST 2007


4 easy steps to get the links:

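1. Download BeautifulSoup and import it in your script file.
2. Use urllib2 to download the HTML of the URL.
3. Parse the HTML with BeautifulSoup.
4. Loop over every anchor tag in the parsed document:
[code]
# BeautifulSoupisedHTML is the parsed document from step 3
for tag in BeautifulSoupisedHTML.findAll('a'):
    print tag
[/code]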
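
For anyone who wants to try the steps above without installing anything, here is a rough sketch in Python 3 that uses only the standard library. Note the swaps: urllib.request stands in for urllib2, and html.parser stands in for BeautifulSoup, so this is not the BeautifulSoup approach itself. The names LinkCollector, extract_links and fetch_links are made up for this example. It collects the links of a single page only; covering a whole site would mean calling it again on each link it returns.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag seen while parsing (hypothetical helper)."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page URL.
                self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url=""):
    """Return the links found in an HTML string, in document order."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    parser.close()
    return parser.links


def fetch_links(url):
    """Download one page and return the links it contains."""
    with urlopen(url) as response:
        # Fall back to UTF-8 when the server does not declare a charset.
        charset = response.headers.get_content_charset() or "utf-8"
        html = response.read().decode(charset, errors="replace")
    return extract_links(html, url)
```

Calling fetch_links('http://www.yahoo.com') and writing each returned string to a file would give roughly the output asked for below. The URL needs its http:// prefix for urlopen to accept it.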

David Waizer wrote:
> Hello..
>
> I'm looking for a script (perl, python, sh...) or program (such as wget)
> that will help me get a list of ALL the links on a website.
>
> For example, ./magicscript.pl www.yahoo.com would output the list to a file.
> It would be kind of like spidering software.
> 
> Any suggestions would be appreciated.
> 
> David



