Python Screen Scraper

kyosohma at gmail.com
Tue Apr 24 13:32:34 EDT 2007


On Apr 24, 7:17 am, Michael Bentley <mich... at jedimindworks.com> wrote:
> On Apr 24, 2007, at 11:50 AM, James Stroud wrote:
>
>
>
> > Hello,
>
> > Does anyone know of an example, however modest, of a screen scraper
> > authored in Python? I am using Firefox.
>
> > Basically, I am answering problems via my browser and being scored for
> > each problem. I have a tendency to go past my peak for training
> > efficiency, so I would like to scrape the result page for each problem
> > I answer, compile statistics, and have a program alert me when I should
> > stop (based on score and accuracy--assuming training value is related
> > to changes in these metrics).
>
> > I have no idea how to go about writing such a beast and I am hoping that
> > I could get some pointers or an example that could get me going in the
> > right direction.
>
> > Parsing, etc., is not a problem, but I'm not exactly sure how I might
> > interface Python with Firefox, forwarding scraped pages to my browser
> > (or forwarding from the browser to the scraper).
>
> > Thanks in advance for any help or advice.
>
> Possibly the easiest thing will be to read from Firefox's cache.
> Otherwise I think your only real options are to either build a proxy
> or sniff the wire...

You should be able to parse the HTML too. There are quite a few tools
out there for that purpose, "Beautiful Soup" being a good example.
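
To give a rough idea of the parsing step, here is a minimal sketch. It
uses the standard library's html.parser rather than Beautiful Soup so
that it runs without installing anything. The sample page, the "score"
and "result" ids, and the names ResultParser and scrape_result are all
made up for illustration; your real result page will almost certainly
be marked up differently.

```python
from html.parser import HTMLParser

# Hypothetical result page; the real markup will differ.
SAMPLE_PAGE = """
<html><body>
  <h1>Problem 12</h1>
  <span id="score">85</span>
  <span id="result">correct</span>
</body></html>
"""


class ResultParser(HTMLParser):
    """Collect the text of every element that carries an id attribute."""

    def __init__(self):
        super().__init__()
        self._current_id = None  # id of the element we are inside, if any
        self.fields = {}         # maps element id -> stripped text content

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs
        self._current_id = dict(attrs).get("id")

    def handle_data(self, data):
        if self._current_id and data.strip():
            self.fields[self._current_id] = data.strip()

    def handle_endtag(self, tag):
        self._current_id = None


def scrape_result(page):
    """Return (score, answered_correctly) for one result page."""
    parser = ResultParser()
    parser.feed(page)
    return int(parser.fields["score"]), parser.fields["result"] == "correct"


if __name__ == "__main__":
    print(scrape_result(SAMPLE_PAGE))
```

With Beautiful Soup the lookup would be shorter (something along the
lines of soup.find("span", id="score")), but the idea is the same: pull
the score and the right/wrong flag out of each page, then append them
to whatever statistics you are keeping.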

Mike



