Simulating a Web Browser in Python

Colin J. Williams cjw at connection.com
Thu Mar 23 13:04:10 EST 2000


Grant,

I may misunderstand what you aim to do, but I think websucker, in the
Tools directory, does what you want.

Colin W.

Grant Edwards wrote:
> 
> On Tue, 21 Mar 2000 15:10:12 -0600, David Fisher <python at rose164.wuh.wustl.edu> wrote:
> >
> >> Well to make my intent a bit more clear... What I'm writing is going to
> >> be a Web Robot.
> >>
> >> However, I'd heard of grail and it may be a good source of example and
> >> howto code.
> >
> >Well then, in the Tools directory of the Python source is webchecker.  It
> >will traverse a web tree looking for bad links.  In the same directory is
> >websucker, which will mirror a remote web site locally when pointed at the
> >root url.  I use it all the time when I want to pull content from the web
> >and maintain its structure.
> 
> As long as we're on a somewhat related topic...
> 
> Let's say you've got a reference manual for some software that's
> in HTML -- all chopped up into tiny chunks, one section per
> html file with links to next and previous sections.
> 
> I'd _really_ like to find a web-bot that given a pointer to the
> first section would suck all of the sections off the server and
> concatenate them into a single file.  Thus making it easier to
> search/print/whatever.
> 
> And no, I'm not talking about the Python documentation -- it's
> organized well enough that I don't find myself wishing it was
> just a big text file.
> 
> --
> Grant Edwards                   grante             Yow!  KARL MALDEN'S NOSE
>                                   at               just won an ACADEMY AWARD!!
>                                visi.com
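
The bot Grant asks for above (start at the first section, follow the
"next" links, glue the pages together) could be sketched roughly as
follows. This is only an illustration, not what websucker does. It
assumes each page marks its successor with rel="next" on an <a> or
<link> tag, which many chopped-up manuals do not; a real manual may need
a different rule, such as matching the link text. The names
NextLinkFinder, fetch_page and collect_sections are made up for this
sketch.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class NextLinkFinder(HTMLParser):
    """Record the href of the first <a> or <link> tag carrying rel="next"."""

    def __init__(self):
        super().__init__()
        self.next_href = None

    def handle_starttag(self, tag, attrs):
        if tag not in ("a", "link") or self.next_href is not None:
            return
        attrs = dict(attrs)
        # rel may hold several space-separated keywords, or be missing.
        rel = (attrs.get("rel") or "").lower().split()
        if "next" in rel and attrs.get("href"):
            self.next_href = attrs["href"]


def fetch_page(url):
    """Download one page; assumes UTF-8 and replaces undecodable bytes."""
    with urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")


def collect_sections(first_url, fetch=fetch_page, max_pages=500):
    """Follow rel="next" links from first_url; return the pages in order."""
    pages, seen, url = [], set(), first_url
    # Stop on a missing link, a repeated URL (loop), or the page limit.
    while url and url not in seen and len(pages) < max_pages:
        seen.add(url)
        html = fetch(url)
        pages.append(html)
        finder = NextLinkFinder()
        finder.feed(html)
        # Resolve relative hrefs against the page they appeared on.
        url = urljoin(url, finder.next_href) if finder.next_href else None
    return pages
```

Joining the result with "\n".join(pages) and writing it to a file gives
one big file to search or print. The pages are concatenated as-is, so
the output contains repeated <html> and <head> elements; stripping those
is left out to keep the sketch short.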


