How to save web pages for offline reading?

Anand Pillai pythonguy at Hotpop.com
Tue Aug 19 02:24:44 EDT 2003


I hope this thread is not dead.

I would like to know what you decided in the end :-)
HarvestMan has about 10 active subscribers right now,
and some companies in India and umpteen of my own friends
use it for their personal 'harvesting' needs :->

I hope you downloaded at least the (new) binaries
and gave it a go!

-Anand

hwlgw at hotmail.com (Will Stuyvesant) wrote in message news:<cb035744.0307220719.5b30a3e6 at posting.google.com>...
> > [Carsten Gehling]
> > Well since you ARE on Windows:
> > 
> > Open the page in Internet Explorer, choose "File" and "Save As...".
> > You've now saved all necessary files.
> > 
> 
> I know.  But I can't do File - Save As from Python :-)  I guess it can
> be done via COM?
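[It can, via pywin32's COM support. A rough, untested sketch: the
"InternetExplorer.Application" dispatch name and the OLECMDID_SAVEAS
command id (4) are standard IE automation, but the exact wait loop and
options here are illustrative only, and ExecWB pops up the interactive
Save As dialog rather than saving silently.]

```python
# Sketch: driving IE's File -> Save As from Python via COM (Windows +
# pywin32 required).  Constants are the standard OLE command ids.
OLECMDID_SAVEAS = 4          # IE's "Save As" command
OLECMDEXECOPT_DODEFAULT = 0  # let IE use its default behaviour


def save_page_with_ie(url):
    import time
    import win32com.client   # pywin32; Windows only

    ie = win32com.client.Dispatch("InternetExplorer.Application")
    ie.Visible = True
    ie.Navigate(url)
    while ie.Busy:           # crude wait for the page to finish loading
        time.sleep(0.5)
    # Opens the Save As dialog for the loaded page:
    ie.ExecWB(OLECMDID_SAVEAS, OLECMDEXECOPT_DODEFAULT)
```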
> 
> > > I thought this whole thing would be easy with all those Python
> > > internet modules in the standard distro: httplib, urllib, urllib2,
> > > FancyURLxxx etc.  Being able to download a "complete" page *from
> > > Python source* would be very nice for my particular application.
> > 
> > Well it's doable with those libraries, but you have to put your own meat 
> > on
> > the bones.
> > 
> > 1) Use httplib to get the page first.
> > 2) Parse it for all "src" attributes, and get the supporting files. The
> > parsing can be done with an HTML parser ...
> 
> That would be htmllib.
> 
> What you describe is what I am going to do actually, when I have time
> again.  I was about to do it when I thought "somebody must have been
> doing this before".  It seems like Mr. Pillai in another reply has
> done something similar, but I couldn't figure it out from his source code.
> 
> Thank you all for the help!
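[The two quoted steps - fetch the page, then parse it for "src"
attributes pointing at supporting files - can be sketched as below.
Note this uses html.parser and urllib.parse, the modern successors of
the htmllib module named in the thread; the sample page and URLs are
made up for illustration, and actually downloading the collected
resources is left as the next step.]

```python
# Collect the "src" attributes of a fetched page, resolved against the
# page's own URL, so the supporting files can then be downloaded too.
from html.parser import HTMLParser
from urllib.parse import urljoin


class SrcCollector(HTMLParser):
    """Record every src attribute, resolved to an absolute URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.resources = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "src" and value:
                self.resources.append(urljoin(self.base_url, value))


def find_resources(html, base_url):
    parser = SrcCollector(base_url)
    parser.feed(html)
    return parser.resources


if __name__ == "__main__":
    # In real use, html would come from urllib.request.urlopen(url).read().
    page = ('<html><body><img src="logo.png">'
            '<script src="/js/app.js"></script></body></html>')
    print(find_resources(page, "http://example.com/page/index.html"))
    # ['http://example.com/page/logo.png', 'http://example.com/js/app.js']
```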

More information about the Python-list mailing list