How to save web pages for offline reading?
Anand Pillai
pythonguy at Hotpop.com
Tue Aug 19 02:24:44 EDT 2003
I hope this thread is not dead.
I would like to know what you decided at the end :-)
Harvestman has about 10 active subscribers right now
and some corporates in India and umpteen of my own friends
use it for their personal 'harvesting' needs :->
I hope you downloaded at least the (new) binaries
and gave it a go!
-Anand
hwlgw at hotmail.com (Will Stuyvesant) wrote in message news:<cb035744.0307220719.5b30a3e6 at posting.google.com>...
> > [Carsten Gehling]
> > Well since you ARE on Windows:
> >
> > Open the page in Internet Explorer, choose "File" and "Save As...".
> > You've now saved all necessary files.
> >
>
> I know. But I can't do File - Save As from python :-) I guess it can
> be done via COM?
>
> > > I thought this whole thing would be easy with all those Python
> > > internet modules in the standard distro: httplib, urllib, urllib2,
> > > FancyURLxxx etc. Being able to download a "complete" page *from
> > > Python source* would be very nice for my particular application.
> >
> > Well it's doable with those libraries, but you have to put your own meat on
> > the bones.
> >
> > 1) Use httplib to get the page first.
> > 2) Parse it for all "src" attributes, and get the supporting files. The
> > parsing can be done with an HTML parser ...
>
> That would be htmllib.
>
> What you describe is what I am going to do actually, when I have time
> again. I was about to do it when I thought "somebody must have been
> doing this before". It seems like mr. Pillai in another reply has
> done something similar, but I couldn't figure it out from his source code.
>
> Thank you all for the help!
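
The two steps quoted above (fetch the page, then parse it for "src"
attributes and fetch the supporting files) could be sketched roughly as
follows. This is only an illustration, not Harvestman's code and not what
anyone in the thread actually wrote. It assumes a current Python 3, where
urllib.request and html.parser stand in for the httplib and htmllib
modules mentioned in the thread. The names SrcCollector,
collect_resources and save_page are made up for this example.

```python
import os
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse
from urllib.request import urlopen


class SrcCollector(HTMLParser):
    """Collect the value of every "src" attribute seen in a page."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "src" and value:
                self.sources.append(value)


def collect_resources(html, base_url):
    """Return absolute URLs for all "src" attributes, without duplicates."""
    parser = SrcCollector()
    parser.feed(html)
    parser.close()
    seen = []
    for src in parser.sources:
        absolute = urljoin(base_url, src)  # resolve relative references
        if absolute not in seen:
            seen.append(absolute)
    return seen


def save_page(url, dest_dir):
    """Download a page plus the files its "src" attributes point to."""
    os.makedirs(dest_dir, exist_ok=True)
    with urlopen(url) as response:
        body = response.read()
    with open(os.path.join(dest_dir, "index.html"), "wb") as f:
        f.write(body)
    # Assumption: the page is UTF-8; undecodable bytes are replaced.
    html = body.decode("utf-8", errors="replace")
    for resource in collect_resources(html, url):
        name = os.path.basename(urlparse(resource).path) or "resource"
        try:
            with urlopen(resource) as response:
                data = response.read()
        except OSError:
            continue  # skip supporting files that cannot be fetched
        with open(os.path.join(dest_dir, name), "wb") as f:
            f.write(data)
```

Calling save_page("http://example.com/", "saved") would write index.html
and the referenced files into the "saved" directory. The sketch does not
rewrite the links inside the saved page, does not follow stylesheet
"href" links, and lets files with the same base name overwrite each
other, so it is still short of what "Save As..." in a browser does.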