downloading a link with javascript in it..

Jetus stevegill7 at gmail.com
Tue May 13 20:29:59 EDT 2008


On May 12, 6:59 pm, 7stud <bbxx789_0... at yahoo.com> wrote:
> On May 12, 1:54 pm, Jetus <stevegi... at gmail.com> wrote:
>
> > I am able to download this page (enclosed code), but I then want to
> > download a pdf file that I can view in a regular browser by clicking
> > on the "view" link. I don't know how to automate this next part of my
> > script. It seems like it uses Javascript.
> > The line in the page source says
>
> > href="javascript:openimagewin('JCCOGetImage.jsp?
> > refnum=DN2007036179');" tabindex=-1>
>
> 1) Use BeautifulSoup to extract the path:
>
> JCCOGetImage.jsp?refnum=DN2007036179
>
> from the html page.
>
> 2) The path is relative to the current url, so if the current url is:
>
> http://www.landrecords.jcc.ky.gov/records/S3DataLKUP.jsp
>
> Then the url to the page you want is:
>
> http://www.landrecords.jcc.ky.gov/records/JCCOGetImage.jsp?refnum=DN2...
>
> You can use urlparse.urljoin() to join a relative path to the current
> url:
>
> import urlparse
>
> base_url = 'http://www.landrecords.jcc.ky.gov/records/S3DataLKUP.jsp'
> relative_url = 'JCCOGetImage.jsp?refnum=DN2007036179'
>
> target_url = urlparse.urljoin(base_url, relative_url)
> print target_url
>
> --output:--
> http://www.landrecords.jcc.ky.gov/records/JCCOGetImage.jsp?refnum=DN2...
>
> 3) Python has a webbrowser module that allows you to open urls in a
> browser:
>
> import webbrowser
>
> webbrowser.open("http://www.google.com")
>
> You could also use os.system() or os.startfile() [Windows] to do the same
> thing:
>
> import os
>
> os.system(r'C:\"Program Files"\"Mozilla Firefox"\firefox.exe')
>
> #You don't have to worry about directory names
> #with spaces in them if you use startfile():
> os.startfile(r'C:\Program Files\Mozilla Firefox\firefox.exe')
>
> All the urls you posted give me errors when I try to open them in a
> browser, so you will have to sort out those problems first.
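
Steps 1 and 2 above can be sketched roughly as follows. This is only an illustration, assuming Python 3 (where urljoin lives in urllib.parse rather than urlparse) and using the standard library's html.parser in place of BeautifulSoup so the snippet has no third-party dependency. The HrefCollector class, the extract_image_path helper, and the sample markup are hypothetical, modelled on the single line quoted from the page source.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin  # Python 2: urlparse.urljoin


class HrefCollector(HTMLParser):
    """Collect the href attribute of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def extract_image_path(html):
    """Return the first openimagewin('...') argument found in a link, or None."""
    collector = HrefCollector()
    collector.feed(html)
    for href in collector.hrefs:
        match = re.search(r"openimagewin\('([^']+)'\)", href)
        if match:
            # The quoted source wraps the href across lines; drop that whitespace.
            return re.sub(r"\s+", "", match.group(1))
    return None


# Hypothetical sample modelled on the line quoted from the page source.
sample = """<a href="javascript:openimagewin('JCCOGetImage.jsp?
refnum=DN2007036179');" tabindex=-1>view</a>"""

base_url = "http://www.landrecords.jcc.ky.gov/records/S3DataLKUP.jsp"
relative_url = extract_image_path(sample)
target_url = urljoin(base_url, relative_url)
print(target_url)
```

Whether the resulting URL actually serves the pdf depends on the site, so treat the joined URL as a starting point rather than a guaranteed download link.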

7stud,
Thanks for sharing your knowledge!!

1) The proper URL for the website is http://www.landrecords.jcc.ky.gov/records/S0Search.html

2) The join won't work. I found that the request the browser actually sends is
http://206.196.0.195/cgi-bin/webview/SEND2.PGM?dispfmt=&itype=Q&authorization=&parm2=SDAAAA76070B
It looks like it generates a random code for parm2...
I have two ways of triggering this JavaScript:
I can click on the "view" link, or, in the form, if I put an "i" in the code
and click on the option link, it will send me the pdf file.

3) I was not sure why you suggested the webbrowser module,
but I am glad to find out about it.
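
To make point 2 concrete, the observed request can be pulled apart with the standard library. This is a minimal sketch, assuming Python 3 (urllib.parse; the Python 2 equivalent is the urlparse module). It only inspects the URL quoted above and does not contact the server. Whether parm2 is really random is a guess based on that one request.

```python
from urllib.parse import urlsplit, parse_qs

# The request URL exactly as observed above.
request_url = ("http://206.196.0.195/cgi-bin/webview/SEND2.PGM"
               "?dispfmt=&itype=Q&authorization=&parm2=SDAAAA76070B")

parts = urlsplit(request_url)
# keep_blank_values=True keeps the empty dispfmt and authorization fields.
params = parse_qs(parts.query, keep_blank_values=True)

print(parts.netloc)        # host the request goes to
print(parts.path)          # program that serves the image
print(params["parm2"][0])  # the code that seems to change per request
```

The host and path differ from the landrecords page, and parm2 does not appear in the link's href, which is why joining the relative path to the page URL does not reproduce this request.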


