HTML parsing/scraping & python

Mike Meyer mwm at mired.org
Thu Dec 1 15:25:55 EST 2005


"Fuzzyman" <fuzzyman at gmail.com> writes:
> The standard library module for fetching HTML is urllib2.

Does urllib2 replace everything in urllib? I thought there was some
urllib functionality that urllib2 didn't do.

> There is a project called mechanize, built by John Lee on top of
> urllib2 and other standard modules.
> It will emulate a browser's behaviour - including history, cookies,
> basic authentication, etc.

urllib2 handles cookies and authentication. I use those features
daily. I'm not sure history would apply, unless you're also handling
javascript. Is there some other way to ask the browser to go back in
history?
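
For what it's worth, here is a minimal sketch of the kind of cookie and
basic-authentication setup I mean. The helper name, URL prefix, and
credentials are made up for illustration, and the import fallback is an
assumption so the same code also runs on Python 3, where urllib2 and
cookielib became urllib.request and http.cookiejar.

```python
try:
    # Python 2 module names, as discussed in this thread.
    import urllib2 as request_mod
    import cookielib as cookie_mod
except ImportError:
    # Python 3 equivalents of the same modules.
    import urllib.request as request_mod
    import http.cookiejar as cookie_mod


def make_opener(url_prefix, user, password):
    """Build an opener that keeps cookies and answers basic-auth challenges.

    Hypothetical helper: the name and arguments are illustrative only.
    """
    # Cookies set by the server are stored here and sent back on later requests.
    jar = cookie_mod.CookieJar()

    # Passing None as the realm makes these credentials apply to any realm
    # under url_prefix.
    passwords = request_mod.HTTPPasswordMgrWithDefaultRealm()
    passwords.add_password(None, url_prefix, user, password)

    opener = request_mod.build_opener(
        request_mod.HTTPCookieProcessor(jar),
        request_mod.HTTPBasicAuthHandler(passwords),
    )
    return opener, jar


# Placeholder values; nothing is fetched until opener.open() is called.
opener, jar = make_opener("http://example.com/", "someuser", "secret")
```

Calling opener.open() on a URL under that prefix would then send any stored
cookies and retry with credentials on a 401 response. Nothing here keeps a
history of visited pages.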

        <mike
-- 
Mike Meyer <mwm at mired.org>			http://www.mired.org/home/mwm/
Independent WWW/Perforce/FreeBSD/Unix consultant, email for more information.


