Simulating a Web Browser in Python

Grant Edwards nobody at nowhere.nohow
Tue Mar 21 23:18:27 EST 2000


On Tue, 21 Mar 2000 15:10:12 -0600, David Fisher <python at rose164.wuh.wustl.edu> wrote:
>
>> Well to make my intent a bit more clear... What I'm writing is going to
>> be a Web Robot.
>>
>> However, I'd heard of grail and it may be a good source of example and
>> howto code.
>
>Well then, in the Tools directory of the Python source is webchecker.  It
>will traverse a web tree looking for bad links.  In the same directory is
>websucker, which will mirror a remote web site locally when pointed at the
>root url.  I use it all the time when I want to pull content from the web
>and maintain its structure.

As long as we're on a somewhat related topic...

Let's say you've got a reference manual for some software that's
in HTML -- all chopped up into tiny chunks, one section per
HTML file with links to the next and previous sections.

I'd _really_ like to find a web-bot that, given a pointer to the
first section, would suck all of the sections off the server and
concatenate them into a single file, making it easier to
search/print/whatever.
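
For what it's worth, here is a rough sketch of what such a bot might
look like, using only the standard library of a current Python 3. It is
not an existing tool. The names NextLinkFinder, fetch_url and
concatenate_sections are made up for illustration, and it assumes that
each section marks its successor either with rel="next" or with a link
whose text starts with "Next". Real manuals may use images or other
labels, so the matching rule would need adjusting.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class NextLinkFinder(HTMLParser):
    """Record the href of the first link that looks like a 'next' link.

    Assumption: a 'next' link either carries rel="next" or is an <a>
    whose visible text starts with "next" (case-insensitive).
    """

    def __init__(self):
        super().__init__()
        self.next_href = None
        self._pending_href = None  # href of the <a> currently being read
        self._text = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        href = attrs.get("href")
        if tag not in ("a", "link") or not href:
            return
        rel = (attrs.get("rel") or "").lower().split()
        if "next" in rel and self.next_href is None:
            self.next_href = href
        elif tag == "a":
            self._pending_href = href
            self._text = []

    def handle_data(self, data):
        if self._pending_href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._pending_href is not None:
            label = "".join(self._text).strip().lower()
            if self.next_href is None and label.startswith("next"):
                self.next_href = self._pending_href
            self._pending_href = None


def fetch_url(url):
    """Download a page; assumes the text decodes acceptably as UTF-8."""
    with urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")


def concatenate_sections(start_url, fetch=fetch_url, max_pages=500):
    """Follow 'next' links from start_url and join the pages in order."""
    pages, seen, url = [], set(), start_url
    # Stop on a missing link, a loop back to a seen page, or the page cap.
    while url and url not in seen and len(pages) < max_pages:
        seen.add(url)
        html = fetch(url)
        pages.append(html)
        finder = NextLinkFinder()
        finder.feed(html)
        # Resolve relative hrefs against the page they came from.
        url = urljoin(url, finder.next_href) if finder.next_href else None
    return "\n".join(pages)
```

Calling concatenate_sections() with the URL of the first section and
writing the result to a file gives one big file to search or print.
The output is just the raw pages glued end to end, not a single tidy
HTML document, and the fetch parameter is there only so the crawl can
be tried against canned pages without touching a server.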

And no, I'm not talking about the Python documentation -- it's
organized well enough that I don't find myself wishing it was
just a big text file.

-- 
Grant Edwards                   grante             Yow!  KARL MALDEN'S NOSE
                                  at               just won an ACADEMY AWARD!!
                               visi.com            


