scripting browsers from Python
John J. Lee
jjl at pobox.com
Wed Jun 1 18:27:44 EDT 2005
Olivier Favre-Simon <olivier.favre-simon at club-internet.fr> writes:
> On Tue, 31 May 2005 00:52:33 -0700, Michele Simionato wrote:
>
> > I would like to know what is available for scripting browsers from
> > Python.
[...]
> ClientForm http://wwwsearch.sourceforge.net/ClientForm/
>
> I use it for automation of POSTs of entire image directories to
> imagevenue.com/imagehigh.com/etc hosts.
This doesn't actually address what the OP wanted: it's not a browser.
> The only drawbacks I've found are:
> - does not support nested forms (since forms are returned in a list)
Nested forms?? Good grief. Can you point me at a real life example
of such HTML? Can probably fix the parser to work around this.
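
To make the "forms are returned in a list" point concrete, here is a rough sketch of what a flat-list form parser does with nested forms. This is not ClientForm's parser: it uses the standard library's html.parser, and FlatFormCollector, parse_forms and the NESTED sample are made-up names for illustration only. (HTML 4 doesn't allow FORM inside FORM, so the input is invalid markup to begin with.)

```python
from html.parser import HTMLParser


class FlatFormCollector(HTMLParser):
    """Hypothetical collector: gathers forms into a flat list."""

    def __init__(self):
        super().__init__()
        self.forms = []   # flat list of {"action": ..., "controls": [...]}
        self._open = []   # stack of forms whose end tag has not been seen yet

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            form = {"action": attrs.get("action"), "controls": []}
            self.forms.append(form)
            self._open.append(form)
        elif tag in ("input", "select", "textarea") and self._open:
            # Attach the control to the innermost form that is still open.
            self._open[-1]["controls"].append(attrs.get("name"))

    def handle_endtag(self, tag):
        if tag == "form" and self._open:
            self._open.pop()


def parse_forms(html):
    """Return the flat list of forms found in the given HTML string."""
    collector = FlatFormCollector()
    collector.feed(html)
    collector.close()
    return collector.forms


# Invalid but conceivable markup: one form nested inside another.
NESTED = """
<form action="/outer">
  <input name="a">
  <form action="/inner">
    <input name="b">
  </form>
  <input name="c">
</form>
"""

forms = parse_forms(NESTED)
for form in forms:
    print(form["action"], form["controls"])
```

Both forms come back as siblings in the list, so nothing records that the second one sat inside the first. A workaround in the parser would have to decide which form each control belongs to; the sketch picks the innermost open one, which is only one possible choice.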
> - does not like ill-formed HTML (uses HTMLParser as the underlying parser;
> you may pass a parser class as a parameter (say SGMLParser for greater
> acceptance of stupid HTML code), but it's tricky because there is no
> well-defined parser interface)
Titus Brown says he's trying to fix sgmllib (to some extent, at least).
Also, you can always feed stuff through mxTidy.
I'd like to have a reimplementation of ClientForm on top of something
like BeautifulSoup...
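
As a rough illustration of the idea (pulling form fields out of sloppy HTML with a forgiving parser), here is a minimal sketch. It is an assumption-laden stand-in: it uses the standard library's html.parser and not BeautifulSoup or ClientForm, and LenientFieldCollector, collect_fields and the SLOPPY sample are hypothetical names invented for the example.

```python
from html.parser import HTMLParser


class LenientFieldCollector(HTMLParser):
    """Hypothetical collector: records name/value pairs of <input> tags."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag and attribute names before calling us.
        if tag != "input":
            return
        attrs = dict(attrs)
        name = attrs.get("name")
        if name is not None:
            self.fields[name] = attrs.get("value", "")


def collect_fields(html):
    """Return a dict of input names to values, ignoring unclosed tags."""
    collector = LenientFieldCollector()
    collector.feed(html)
    collector.close()
    return collector.fields


# Upper-case tags, unquoted attributes, unclosed <p>/<b>, no </form> at all.
SLOPPY = (
    "<FORM action=/upload><p><b>Pick"
    "<INPUT type=text name=title value=holiday>"
    '<input type="hidden" name="token" value="abc123">'
    "<p>unclosed"
)

fields = collect_fields(SLOPPY)
print(fields)
```

The collector never looks at end tags, so missing or misnested ones don't matter to it. A real reimplementation would still need everything else a form library does (select/option handling, file uploads, building the request), none of which is shown here.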
John