[Tutor] Fw: running a javascript script with python

eryksun eryksun at gmail.com
Fri Nov 2 10:51:52 CET 2012


> ----- Forwarded Message -----
> From: Benjamin Fishbein <bfishbein79 at gmail.com>
> To: Alan Gauld <alan.gauld at btinternet.com>
> Sent: Friday, 2 November 2012, 3:55
> Subject: Re: [Tutor] running a javascript script with python
>
>>>> cj=cookielib.CookieJar()
>>>> opener=urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
> ....
>>>> data={"bb_isbns":isbns}
>>>> encoded_data=urllib.urlencode(data)
>>>> url='http://www.textbooks.com/BuyBack-Search.php'

You asked about this a month or so ago. This time around you're using
a cookie jar to store the session state, but you're skipping the CSID
parameter. If you look at the HTML source, you'll see
BuyBack-Search.php?CSID=Some_Value_From_Your_Session.
If you first open http://www.textbooks.com to read the session
cookies, CSID appears to be the value of the 'tb_DSL' cookie.
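
As a rough sketch of what I mean, here is how you could pull a named
cookie out of a jar and append it to the URL as the CSID parameter. It
uses the Python 3 module names (http.cookiejar and urllib.parse in place
of cookielib and urllib), makes no network requests, and fills the jar
by hand with a placeholder value. The helpers cookie_value and with_csid
are names I made up for illustration, and the idea that CSID equals the
'tb_DSL' cookie is only my reading of the page source.

```python
from http.cookiejar import Cookie, CookieJar
from urllib.parse import urlencode


def cookie_value(jar, name):
    """Return the value of the first cookie called `name`, or None."""
    for cookie in jar:  # a CookieJar iterates over its Cookie objects
        if cookie.name == name:
            return cookie.value
    return None


def with_csid(base_url, csid):
    """Append the session id to the URL as a CSID query parameter."""
    return base_url + "?" + urlencode({"CSID": csid})


# Stand-in for the jar that the opener would fill after opening the
# site's front page; the cookie value here is a placeholder.
jar = CookieJar()
jar.set_cookie(Cookie(
    version=0, name="tb_DSL", value="EXAMPLE_SESSION_VALUE",
    port=None, port_specified=False,
    domain="www.textbooks.com", domain_specified=True,
    domain_initial_dot=False,
    path="/", path_specified=True,
    secure=False, expires=None, discard=True,
    comment=None, comment_url=None, rest={},
))

csid = cookie_value(jar, "tb_DSL")
url = with_csid("http://www.textbooks.com/BuyBack-Search.php", csid)
```

In a real session the jar would be the cj passed to
HTTPCookieProcessor, so the same lookup would work on it once the
opener has fetched a page that sets the cookie.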

That said, as I mentioned before, the site's terms of service forbid
scraping (see section II):

http://www.textbooks.com/CustServ-Terms.php

