downloading a link with javascript in it..

Diez B. Roggisch deets at nospam.web.de
Mon May 12 16:06:28 EDT 2008


Jetus wrote:
> I am able to download this page (enclosed code), but I then want to
> download a pdf file that I can view in a regular browser by clicking
> on the "view" link. I don't know how to automate this next part of my
> script. It seems like it uses Javascript.
> The line in the page source says
> href="javascript:openimagewin('JCCOGetImage.jsp?
> refnum=DN2007036179');" tabindex=-1>
> 
> So, in summary, when I download this page, for each record, I would
> like to initiate the "view" link.
> Can anyone point me in the right direction?
> 
> When the "view" link is clicked on in IE or Firefox, it returns a pdf
> file, so I should be able to download it with
> urllib.urlretrieve('pdffile', r'c:\temp\pdffile')
> 
> Here is the code I have been using
> ----------------------------------------------------------------
>     import urllib, urllib2
> 
>     params = [
>                 ('booktype', 'L'),
>                 ('book', '930'),
>                 ('page', ''),
>                 ('hidPageName', 'S3Search'),
>                 ('DoItButton', 'Search'),]
> 
>     data = urllib.urlencode(params)
> 
>     f = urllib2.urlopen("http://www.landrecords.jcc.ky.gov/records/S3DataLKUP.jsp", data)
> 
>     s = f.read()
>     f.close()
>     open('jcolib.html','w').write(s)

Use something like the Firebug extension to see what request the 
openimagewin function ultimately issues. Then issue that request 
yourself, parametrised with the information parsed out of the above href.

There is no way to interpret the JS in Python, let alone mimic possible 
browser DOM behavior.
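
Since the JS cannot be run, the href can be treated as plain text instead. 
A rough sketch of that idea, in Python 3 (so urllib.parse and 
urllib.request rather than the urllib/urllib2 of the quoted script). It 
ASSUMES that openimagewin does nothing more than open its argument as a 
URL relative to the search page, which is exactly what should be checked 
with Firebug first. The names HREF_RE, image_urls and download_all are 
made up for illustration.

```python
import os
import re
from urllib.parse import urljoin
from urllib.request import urlretrieve

# Page the search results came from (taken from the quoted script).
BASE_URL = "http://www.landrecords.jcc.ky.gov/records/S3DataLKUP.jsp"

# Captures the single-quoted argument of javascript:openimagewin('...').
HREF_RE = re.compile(r"javascript:openimagewin\(\s*'([^']+)'\s*\)")


def image_urls(html, base_url=BASE_URL):
    """Return an absolute URL for every openimagewin(...) link in html."""
    urls = []
    for arg in HREF_RE.findall(html):
        # The page source may wrap the argument across lines,
        # so strip all whitespace before building the URL.
        arg = "".join(arg.split())
        urls.append(urljoin(base_url, arg))
    return urls


def download_all(html, target_dir="."):
    """Fetch every linked document.

    ASSUMPTION: each URL returns the PDF directly, with no extra
    cookies, referer checks or redirects involved.
    """
    for url in image_urls(html):
        refnum = url.rsplit("=", 1)[-1]
        urlretrieve(url, os.path.join(target_dir, refnum + ".pdf"))
```

With s being the page downloaded by the quoted script, download_all(s) 
would then try to fetch one file per record. If the site needs a session 
cookie or a particular referer, the request has to be built to match 
whatever Firebug shows.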

Diez


