Python simulate browser activity
Roy Smith
roy at panix.com
Fri Mar 16 08:57:30 EDT 2012
In article
<214c4c0c-f8ec-4030-946b-8becc8e1aa9c at ur9g2000pbc.googlegroups.com>,
choi2k <rex.0510 at gmail.com> wrote:
> Hi, everyone
>
> I am trying to write a small application using python but I am not
> sure whether it is possible to do so..
> The application aims to simulate user activity, including visiting a
> website and performing some interactive actions (clicking on a menu,
> submitting a form, redirecting to other pages...etc)
> I have found some libraries / plugins which aim to simulate browser
> activity but ... none of them supports AJAX requests and/or running
> javascript.
>
> Here is one of the simple tasks which I would like to do using the
> application:
> 1. simulate the user activity: visit the website
> ("https://www.abc.com")
> 2. find the target element by id ("submitBtn") and simulate a mouse
> click on it.
> 3. perform the javascript functions invoked by the click event from
> step 2. (the function is bound to the item by jquery)
> 4. post an ajax request using javascript and, if the page has been
> redirected, load the new page content
>
> Is it possible to do so?
>
> Thank you in advance.
Depending on exactly what you're trying to test, this may or may not be
possible.
#1 is easy. Just get the URL with urllib (or, even better, Kenneth
Reitz's very cool requests library (http://docs.python-requests.org/)).
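A minimal sketch of step #1, using only the stdlib urllib so nothing extra
needs to be installed (requests would be a one-liner too); the URL here is
the hypothetical one from the question:

```python
import urllib.request

def fetch(url: str) -> str:
    """GET a URL and return the decoded response body."""
    with urllib.request.urlopen(url) as resp:
        return resp.read().decode("utf-8", errors="replace")

# page = fetch("https://www.abc.com")  # hypothetical URL from the question
```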
#2 gets a little more interesting, but still not that big a deal. Parse
the HTML you get back with something like lxml (http://lxml.de/).
Navigate the DOM with a CSSSelector to find the element you're looking
for. Use urllib/requests again to get the href.
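A sketch of step #2 with lxml. The markup below is invented for
illustration; an XPath predicate does the same job as a CSSSelector
(which additionally needs the cssselect package installed):

```python
from lxml import html

# Invented stand-in for the HTML the server would return.
SAMPLE = """
<html><body>
  <form action="/login" method="post">
    <button id="submitBtn" formaction="/submit">Go</button>
  </form>
</body></html>
"""

doc = html.fromstring(SAMPLE)
# Find the element by id; equivalent to CSSSelector("#submitBtn").
btn = doc.xpath('//*[@id="submitBtn"]')[0]
# The URL a "click" would hit -- fetch it with urllib/requests next.
target = btn.get("formaction")
```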
#3 is where things get fun. What, exactly, are you trying to test? Are
you trying to test that the js works, or are you really trying to test
your server and you're just using the embedded js as a means to generate
the next HTTP request?
If the former, then you really want to be exploring something like
selenium (http://seleniumhq.org/). If the latter, then skip all the js
crap and just do a GET or POST (back to urllib/requests) on the
appropriate url.
#4 falls into the same bucket as #3.
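For the "skip the js" route in #3 and #4, the idea is to send by hand the
request the page's jQuery click handler would have sent. A sketch with
stdlib urllib; the URL, payload, and JSON content type are assumptions
about what the hypothetical site expects:

```python
import json
import urllib.request

def build_ajax_post(url: str, payload: dict) -> urllib.request.Request:
    """Build the POST the page's javascript would send, so the server
    can be exercised without running any js at all."""
    data = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=data,  # presence of a body makes this a POST
        headers={
            "Content-Type": "application/json",
            # Many frameworks use this header to detect AJAX requests.
            "X-Requested-With": "XMLHttpRequest",
        },
    )

# resp = urllib.request.urlopen(
#     build_ajax_post("https://www.abc.com/submit", {"user": "x"}))
```

If the server answers with a redirect, urlopen follows it by default, so
the "load the new page content" part of #4 comes for free.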
Once you have figured all this out, you will get a greater appreciation
for the need to cleanly separate your presentation (HTML, CSS,
Javascript) from your business logic and data layers. If there is a
clean interface between them, it becomes easy to test one in isolation
from the other. If not, then you end up playing games with screen
scraping via lxml and selenium just to test your database queries.
Which will quickly lead to you running screaming into the night.