scripting browsers from Python

John J. Lee jjl at pobox.com
Wed Jun 1 18:27:44 EDT 2005


Olivier Favre-Simon <olivier.favre-simon at club-internet.fr> writes:

> On Tue, 31 May 2005 00:52:33 -0700, Michele Simionato wrote:
> 
> > I would like to know what is available for scripting browsers from
> > Python.
[...]
> ClientForm	http://wwwsearch.sourceforge.net/ClientForm/
> 
> I use it for automation of POSTs of entire image directories to
> imagevenue.com/imagehigh.com/etc hosts.

This doesn't actually address what the OP wanted: it's not a browser.


> The only drawbacks I've found are:
> - does not support nested forms (since forms are returned in a list)

Nested forms??  Good grief.  Can you point me at a real life example
of such HTML?  Can probably fix the parser to work around this.
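
To make the flat-list point concrete, here is a minimal sketch of one
possible workaround. It is not ClientForm's actual code: the names
FlatFormCollector and parse_forms are made up for illustration, and it
assumes the Python 3 standard-library html.parser module. The idea is to
ignore a <form> start tag seen while another form is still open, so the
inner form's controls are attributed to the outer form and the result
stays a flat list.

```python
from html.parser import HTMLParser


class FlatFormCollector(HTMLParser):
    """Collect forms into a flat list (hypothetical helper, not ClientForm's API)."""

    def __init__(self):
        super().__init__()
        self.forms = []       # flat list of {"action": ..., "controls": [...]}
        self._current = None  # outermost form currently open, if any
        self._depth = 0       # how many <form> tags are currently open

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self._depth += 1
            if self._depth == 1:
                # Only an outermost <form> starts a new entry in the list.
                self._current = {"action": attrs.get("action"), "controls": []}
                self.forms.append(self._current)
            # Assumption: a nested <form> start tag is simply ignored.
        elif tag in ("input", "select", "textarea") and self._current is not None:
            self._current["controls"].append(attrs.get("name"))

    def handle_endtag(self, tag):
        if tag == "form" and self._depth > 0:
            self._depth -= 1
            if self._depth == 0:
                self._current = None


def parse_forms(html):
    """Return the flat list of forms found in an HTML string."""
    collector = FlatFormCollector()
    collector.feed(html)
    collector.close()
    return collector.forms
```

Given an outer form containing an inner one, parse_forms returns a single
entry whose controls include those of the inner form. Whether that is the
right behaviour for real pages is exactly why a real-life example would
help.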


> - does not like ill-formed HTML (it uses HTMLParser as the underlying
> parser; you may pass a parser class as a parameter, say SGMLParser for
> greater acceptance of stupid HTML code, but it's tricky because there
> is no well-defined parser interface)

Titus Brown says he's trying to fix sgmllib (to some extent, at least).

Also, you can always feed stuff through mxTidy.

I'd like to have a reimplementation of ClientForm on top of something
like BeautifulSoup...


John


