Looking for browser emulator

Roy Smith roy at panix.com
Thu Oct 13 22:53:06 EDT 2011


In article 
<2323f3d7-42ff-4de5-9006-4741e865f09c at a9g2000yqo.googlegroups.com>,
 Jon Clements <joncle at googlemail.com> wrote:

> On Oct 14, 3:19 am, Roy Smith <r... at panix.com> wrote:
> > I've got to write some tests in Python which simulate getting a page of
> > HTML from an HTTP server, finding a link, clicking on it, and then
> > examining the HTML on the next page to make sure it has certain features.
> >
> > I can use urllib to do the basic fetching, and lxml gives me the tools
> > to find the link I want and extract its href attribute.  What's missing
> > is dealing with turning the href into an absolute URL that I can give to
> > urlopen().  Browsers implement all sorts of stateful logic such as "if
> > the URL has no hostname, use the same hostname as the current page".  
> > I'm talking about something where I can execute this sequence of calls:
> >
> > urlopen("http://foo.com:9999/bar")
> > urlopen("/baz")
> >
> > and have the second one know that it needs to get
> > "http://foo.com:9999/baz".  Does anything like that exist?
> >
> > I'm really trying to stay away from Selenium and go strictly with
> > something I can run under unittest.
> 
> lxml.html.make_links_absolute() ?

Interesting.  That might be exactly what I'm looking for.  Thanks!
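
For the "remember the current page" behaviour described in the quoted
question, here is a minimal sketch using only the standard library's
urljoin (Python 3 spelling assumed; in Python 2 it lives in the urlparse
module). The LinkResolver class and its visit() method are hypothetical
names made up for illustration, not part of urllib or lxml, and nothing
is actually fetched. lxml's make_links_absolute() applies the same kind
of resolution to every link in a parsed document, given a base URL.

```python
from urllib.parse import urljoin


class LinkResolver:
    """Hypothetical helper that tracks the current page URL, browser-style."""

    def __init__(self):
        # No page visited yet, so there is nothing to resolve against.
        self.current_url = None

    def visit(self, href):
        """Resolve href against the current page and make it the new current page."""
        if self.current_url is None:
            # First visit: the caller must supply an absolute URL.
            absolute = href
        else:
            # urljoin fills in scheme, host, and port from the base when
            # href lacks them, and leaves absolute URLs unchanged.
            absolute = urljoin(self.current_url, href)
        self.current_url = absolute
        return absolute


resolver = LinkResolver()
first = resolver.visit("http://foo.com:9999/bar")
second = resolver.visit("/baz")
```

The value returned by visit() is what would be handed to urlopen() in a
real test; here the second call yields "http://foo.com:9999/baz".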


