[Baypiggies] Adding testing to screen-scraping code

Charles Merriam charles.merriam at gmail.com
Mon Dec 17 06:56:12 CET 2007


Sorry if I fail to understand.  When I don't understand, I break
things down into small steps:

1.  You use a web-hosted service which processes transactions from
HTTP forms.  No AJAX or JavaScript.
2.  You have scripts that automate this service to give you a
better or more programmable interface.
3.  You want regression tests for your scripts to ensure that the same
page from the server will cause exactly the same output from your
scripts.

This means you probably want:
4.  A local HTTP server that checks for an exact set of inputs, and
then serves up a predetermined page if they match.  The page would
also include the right URL and the like.
5.  A tool that makes snapshotting all those pages and linking them
together easy.
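
For item 4, here is a rough sketch of what such a server might look
like using only the Python 3 standard library.  The names
CANNED_PAGES, ReplayHandler, start_replay_server, and fetch are made
up for illustration, and the /login pages are invented stand-ins for
real snapshots.  The real scripts would drive it through mechanize
rather than this fetch helper.

```python
import threading
import urllib.error
import urllib.parse
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical snapshot table:
# (method, path, sorted form fields) -> saved page bytes.
CANNED_PAGES = {
    ("GET", "/login", ()):
        b"<html><form action='/login' method='post'></form></html>",
    ("POST", "/login", (("password", "secret"), ("user", "alice"))):
        b"<html><p>Welcome, alice</p></html>",
}


class ReplayHandler(BaseHTTPRequestHandler):
    """Serve a saved page only when the request matches a snapshot exactly."""

    def _reply(self, method, body=b""):
        # Sort the form fields so field order does not affect the match.
        fields = tuple(sorted(urllib.parse.parse_qsl(body.decode("utf-8"))))
        page = CANNED_PAGES.get((method, self.path, fields))
        if page is None:
            self.send_error(404, "no snapshot for this request")
            return
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(page)))
        self.end_headers()
        self.wfile.write(page)

    def do_GET(self):
        self._reply("GET")

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        self._reply("POST", self.rfile.read(length))

    def log_message(self, format, *args):
        pass  # keep test output quiet


def start_replay_server():
    """Start the server on a free localhost port; return (server, base_url)."""
    server = HTTPServer(("127.0.0.1", 0), ReplayHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server, "http://127.0.0.1:%d" % server.server_address[1]


# Ignore any system proxy settings so localhost requests stay local.
_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def fetch(url, form=None):
    """Return (status, body); POST the form fields when a form is given."""
    data = urllib.parse.urlencode(form).encode("ascii") if form else None
    try:
        with _opener.open(url, data) as resp:
            return resp.status, resp.read()
    except urllib.error.HTTPError as err:
        return err.code, err.read()
```

A regression test would call start_replay_server(), point the scripts
at the returned base URL instead of the real site, and compare their
output against what was recorded.  Any request without a snapshot
gets a 404, so a script that drifts from the recorded inputs fails
loudly.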

I don't have a solution yet.  Is this the problem?

Charles



On Dec 16, 2007 4:30 PM, Asheesh Laroia <asheesh at asheesh.org> wrote:
> I am a (happy!) customer of a company that provides a service, but I
> prefer to automate my interactions with their website, so I wrote a Python
> module using mechanize that does that.  Now that I have a hunk of code
> that works, and it's been working for pretty much two years with minimal
> fixes, I decided I'd like to add some tests.  I'd like to be able to test
> my code without bothering the real web service.  First of all, does anyone
> have advice?
>
> I'll say what I'm thinking: It'd be nice to have a mock HTTP server,
> maybe, or a mock trivial version of their web app, bundled as part of the
> test suite.  Then I could easily set a flag in my code to ask it to use
> not realwebsite.com but localhost:8118 or some port, and then go through
> the usual methods and verify that they work.
>
> It'd be *nicest* if such a thing could be automatically generated from the
> calls to urllib2 or mechanize by watching the function calls and noticing
> their return values.
>
> Does something like this exist?  Is there another angle I should consider?
>
> The reason I want something like this is that the web interface changed in
> a tiny way, and then my code broke.  But I thought about it, and I wasn't
> even sure what used to work before.
>
> Thanks, all!
>
> -- Asheesh.
>
> --
> FORTUNE PROVIDES QUESTIONS FOR THE GREAT ANSWERS: #15
> A:      The Royal Canadian Mounted Police.
> Q:      What was the greatest achievement in taxidermy?

