Fetching websites with Python

wes weston wweston at att.net
Wed Mar 31 14:38:34 EST 2004


Markus Franz wrote:
> Hi.
> 
> How can I grab websites with a command-line python script? I want to start
> the script like this:
> 
> ./script.py ---xxx--- http://www.address1.com http://www.address2.com
> http://www.address3.com
> 
> The script should load these 3 websites (or more if specified) in parallel
> (maybe processes? threads?) and show their contents separated by ---xxx---.
> The whole output should be printed on the command line. Each website should
> have at most 15 seconds to return its contents, to avoid a never-ending
> script.
> 
> How can I do this?
> 
> Thanks.
> 
> Yours sincerely
> 
> Markus Franz
> 
> 

Markus,
    I think there's a way to set a timeout that urllib will pick up; not sure.

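If you're on Python 2.3 or later, I believe socket.setdefaulttimeout() makes
the sockets urllib opens give up after a set number of seconds. Untested, but
something like this should cover the 15-second limit:

import socket
socket.setdefaulttimeout(15)   # sockets opened by urllib should now time out after 15 seconds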

import urllib
import sys
#--------------------------------------
if __name__ == "__main__":
    if len(sys.argv) < 3:
        print 'usage: script.py separator url [url ...]'
        sys.exit(1)
    sep = sys.argv[1]                  # first argument is the separator, e.g. ---xxx---
    for url in sys.argv[2:]:           # remaining arguments are URLs, fetched one at a time
        try:
            f = urllib.urlopen(url)
            lines = f.readlines()
            f.close()
            for line in lines:
                print line[:-1]        # strip the trailing newline; print adds its own
        except:
            print url, 'get error'     # any failure (bad URL, network error) is reported briefly
        print sep
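
For the parallel part, a thread per URL would probably do. Here's a rough,
untested sketch assuming Python 2.3+ (for socket.setdefaulttimeout and
enumerate); the fetch helper and the variable names are just mine, nothing
standard. Each worker stores its page (or an error note) in its own slot, and
the results are printed in order once all the threads have finished. Note the
timeout applies to individual socket operations, so a slow-but-steady server
could still take longer than 15 seconds overall.

import socket
import sys
import threading
import urllib

socket.setdefaulttimeout(15)            # each socket operation gives up after 15 seconds

def fetch(url, results, index):
    # worker thread: store the page text (or an error marker) in its slot
    try:
        f = urllib.urlopen(url)
        results[index] = f.read()
        f.close()
    except:
        results[index] = '%s get error' % url

if __name__ == "__main__":
    if len(sys.argv) < 3:
        print 'usage: script.py separator url [url ...]'
        sys.exit(1)
    sep = sys.argv[1]
    urls = sys.argv[2:]
    results = [None] * len(urls)        # one slot per URL, filled in by the workers
    threads = []
    for i, url in enumerate(urls):
        t = threading.Thread(target=fetch, args=(url, results, i))
        t.start()
        threads.append(t)
    for t in threads:
        t.join()                        # wait for every fetch to finish (or time out)
    for text in results:
        print text
        print sep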



