How to save web pages for offline reading?

Will Stuyvesant hwlgw at hotmail.com
Tue Jul 22 11:19:38 EDT 2003


> [Carsten Gehling]
> Well since you ARE on Windows:
> 
> Open the page in Internet Explorer, choose "File" and "Save As...".
> You've now saved all necessary files.
> 

I know.  But I can't do File - Save As from Python :-)  I guess it could
be done via COM?

> > I thought this whole thing would be easy with all those Python
> > internet modules in the standard distro: httplib, urllib, urllib2,
> > FancyURLxxx etc.  Being able to download a "complete" page *from
> > Python source* would be very nice for my particular application.
> 
> Well it's doable with those libraries, but you have to put your own
> meat on the bones.
> 
> 1) Use httplib to get the page first.
> 2) Parse it for all "src" attributes, and get the supporting files. The
> parsing can be done with an HTML parser ...

That would be htmllib.
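
The two steps quoted above can be sketched roughly as follows. This is
only an illustration, not code from anyone in this thread. It assumes a
current Python 3, where htmllib no longer exists and html.parser takes
its place, and where urllib.request stands in for httplib/urllib2. The
names SrcCollector, find_sources and save_page are made up for the
example.

```python
import os
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse
from urllib.request import urlopen


class SrcCollector(HTMLParser):
    """Collect the value of every "src" attribute seen in a page."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; names arrive lowercased.
        for name, value in attrs:
            if name == "src" and value:
                self.sources.append(value)


def find_sources(html, base_url):
    """Return absolute URLs for all "src" attributes found in html."""
    parser = SrcCollector()
    parser.feed(html)
    parser.close()
    # Relative references are resolved against the page's own URL.
    return [urljoin(base_url, src) for src in parser.sources]


def save_page(url, directory):
    """Fetch url plus its "src" files into one flat directory.

    Assumption: the page is decodable as UTF-8 (bad bytes are replaced).
    Links inside the saved HTML are not rewritten to point at the copies.
    """
    os.makedirs(directory, exist_ok=True)
    with urlopen(url) as response:
        body = response.read()
    with open(os.path.join(directory, "index.html"), "wb") as f:
        f.write(body)
    for src_url in find_sources(body.decode("utf-8", "replace"), url):
        name = os.path.basename(urlparse(src_url).path) or "unnamed"
        with urlopen(src_url) as response:
            data = response.read()
        with open(os.path.join(directory, name), "wb") as f:
            f.write(data)
```

Calling save_page("http://example.com/", "saved") would write index.html
and the referenced files into ./saved.  The sketch only follows "src"
attributes, as in the steps above, so stylesheets pulled in through
<link href=...> are missed, and two files with the same base name would
overwrite each other.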

What you describe is actually what I am going to do, when I have time
again.  I was about to do it when I thought "somebody must have done
this before".  It seems Mr. Pillai, in another reply, has done something
similar, but I couldn't figure out how from his source code.

Thank you all for the help!

-- 
Agree with them now, it will save so much time.



