download all mib files from a web page

Jeff McNeil jeff at jmcneil.net
Wed May 27 13:50:31 EDT 2009


On May 27, 12:29 pm, powah <wong_po... at yahoo.ca> wrote:
> I want to download all mib files from the web page: http://www.juniper.net/techpubs/software/junos/junos94/swconfig-net-m...
>
> All mib filenames are of this format: www.juniper.net/techpubs... .txt
>
> I wrote this program, but it fails with the following error.
> Please help.
> Thanks.
>
> [code]
> #!/usr/bin/env python
> import urllib2,os,urlparse
> url="http://www.juniper.net/techpubs/software/junos/junos94/swconfig-net-mgmt/juniper-specific-mibs-junos-nm.html#jN18E19"
> page=urllib2.urlopen(url)
> f=0
> links=[]
> data=page.read().split("\n")
> for item in data:
>     if "www.juniper.net/techpubs" in item:
>         httpind=item.index("www.juniper.net/techpubs")
>         item=item[httpind:]
>         #print "item " + item
>         ind=item.index("<")
>         links.append(item[:ind]) #grab all links
> # download all links
> for link in links:
>     print "link " + link
>     filename=link.split("/")[-1]
>     print "downloading ... " + filename
>     u=urllib2.urlopen(link)
>     p=u.read()
>     open(filename,"w").write(p)
> [/code]
>
> $ ~/python/downloadjuniper.py
> link www.juniper.net/techpubs/software/junos/junos94/swconfig-net-mgmt/mib...
> downloading ... mib-jnx-user-aaa.txt
> Traceback (most recent call last):
>   File "/home/powah/python/downloadjuniper.py", line 20, in ?
>     u=urllib2.urlopen(link)
>   File "/usr/lib/python2.4/urllib2.py", line 130, in urlopen
>     return _opener.open(url, data)
>   File "/usr/lib/python2.4/urllib2.py", line 350, in open
>     protocol = req.get_type()
>   File "/usr/lib/python2.4/urllib2.py", line 233, in get_type
>     raise ValueError, "unknown url type: %s" % self.__original
> ValueError: unknown url type: www.juniper.net/techpubs/software/junos/junos94/swconfig-net-mgmt/mib...
>
> $ python
> Python 2.4.4 (#1, Oct 23 2006, 13:58:00)
> [GCC 4.1.1 20061011 (Red Hat 4.1.1-30)] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
>
>
>
> My computer is FC6 linux.

There are only a couple dozen of them, so right-click -> Save As would
do. I'm sure Juniper would appreciate that much more than an automated
crawler.

As far as your ValueError is concerned, consider that the link you
pass into urllib2.urlopen ('www.juniper.net/techpubs...') doesn't
start with a protocol specification such as 'http://', so urllib2
can't work out what type of URL it is.
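
A minimal sketch of one way to handle that, assuming every scraped link
is meant to be fetched over plain HTTP. The helper name ensure_scheme
and the example.com link are hypothetical, for illustration only; they
are not part of the original script. It uses only string operations, so
it should behave the same on old and new Python versions.

```python
def ensure_scheme(link, default_scheme="http"):
    """Return link unchanged if it already names a protocol,
    otherwise prepend default_scheme (an assumption: plain HTTP)."""
    if "://" in link:
        # Already looks like 'http://...' or 'ftp://...': leave it alone.
        return link
    # Drop any leading slashes (e.g. '//host/path') before prefixing.
    return default_scheme + "://" + link.lstrip("/")


# Hypothetical link of the same shape the scraping loop collects.
fixed = ensure_scheme("www.example.com/mibs/sample.txt")
```

In the download loop, the fixed value would be passed to
urllib2.urlopen in place of the bare link.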

-Jeff
mcjeff.blogspot.com



