Utility to screenscrape sites using JavaScript?

Nobody nobody at nowhere.com
Sat Jan 30 12:00:43 EST 2010


On Sat, 30 Jan 2010 06:21:01 -0800, KB wrote:

> I have a service I subscribe to that uses javascript to stream news.
> Ideally I would like to use python to parse the information for me. Note
> there is an option to take a static snapshot of the current stream but
> that is still done via Javascript. (I can reference the snapshot with a
> unique URL though, so I can pass that to a parser as long as it can
> "resolve" the javascript and get at the content)
> 
> I had a quick look at Windmill but it doesn't appear to be what I am
> looking for. Does anyone else have any experience in screenscraping sites
> that utilise javascript? Can you share how you did it and perhaps some
> sample code if possible?

There's a Python interface to SpiderMonkey (Mozilla's JavaScript
interpreter):

http://pypi.python.org/pypi/python-spidermonkey
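
As a rough, untested sketch of how that could fit together: pull the inline scripts out of the snapshot page with the standard library, then hand the ones you care about to the interpreter. The helper names (ScriptCollector, extract_inline_scripts, evaluate) and the sample page are made up for illustration. The spidermonkey calls (Runtime, new_context, execute) are how I recall the package's README, so treat them as an assumption and check them against the version you install. Bear in mind that SpiderMonkey is only a JavaScript engine, with no browser DOM, so scripts that touch "document" or "window" will not run unmodified.

```python
from html.parser import HTMLParser


class ScriptCollector(HTMLParser):
    """Collect the text of inline <script> elements (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.scripts = []
        self._in_script = False
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        # Only inline scripts carry code in the page; skip <script src=...>.
        if tag == "script" and "src" not in dict(attrs):
            self._in_script = True
            self._chunks = []

    def handle_data(self, data):
        if self._in_script:
            self._chunks.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_script:
            self.scripts.append("".join(self._chunks).strip())
            self._in_script = False


def extract_inline_scripts(html):
    """Return the source of every inline script in an HTML string."""
    collector = ScriptCollector()
    collector.feed(html)
    collector.close()
    return collector.scripts


def evaluate(js_source):
    """Run JavaScript through python-spidermonkey, if it is installed.

    Assumption: the Runtime/new_context/execute names are as recalled
    from the package README and have not been verified here.
    """
    import spidermonkey  # third-party; imported lazily so the rest runs without it

    runtime = spidermonkey.Runtime()
    context = runtime.new_context()
    return context.execute(js_source)


# Made-up stand-in for the snapshot page fetched from its unique URL.
SAMPLE = """<html><body>
<script src="lib.js"></script>
<script>var headline = "Markets open higher";</script>
</body></html>"""

scripts = extract_inline_scripts(SAMPLE)
print(scripts)
```

If the snapshot only assigns its data to variables, as in the sample, evaluating the script and reading the variables back may be enough. If it builds the page through DOM calls, you would need something that emulates a browser instead.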



