sitemap and ftp

Martin Fräulin martin.fraeulin at t-online.de
Tue Mar 27 06:47:15 EST 2001


Hi Oliver,

try the htmllib and urllib modules. You can use the first to parse
your HTML files; the second retrieves files from URLs. For working
with URLs, the urlparse module is also helpful. All of these modules
and their methods are described in the Python documentation.
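
As a rough illustration of the title extraction, here is a minimal sketch.
It assumes a current Python 3, where htmllib no longer exists and its role
is taken by html.parser, and where urlparse has become urllib.parse. The
names TitleParser, extract_title and sitemap_entry, as well as the
example.com URL, are made up for this example and are not part of any
library.

```python
# Sketch for Python 3: htmllib and urlparse from the message above are
# replaced here by html.parser and urllib.parse (an assumption about the
# reader's Python version, not what was available in 2001).
from html.parser import HTMLParser
from urllib.parse import urljoin


class TitleParser(HTMLParser):
    """Collects the text found between <title> and </title>."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        # Tag names arrive lower-cased from the parser.
        if tag == "title":
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        # Only keep text that appears inside the <title> element.
        if self.in_title:
            self.title += data


def extract_title(html_text):
    """Return the page title, or an empty string if there is none."""
    parser = TitleParser()
    parser.feed(html_text)
    parser.close()
    return parser.title.strip()


def sitemap_entry(base_url, relative_path, html_text):
    """Build one (absolute URL, title) pair for a sitemap (hypothetical helper)."""
    return urljoin(base_url, relative_path), extract_title(html_text)
```

Calling sitemap_entry("http://example.com/site/", "docs/index.html", text)
for every HTML file in the tree would give the raw material for a sitemap;
how the entries are rendered is left open here.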

I once tried to write a program that downloads a whole site from a URL
to the local disk, but I never finished it.

Erich Seifert


Oliver Vecernik wrote:
> 
> Hi,
> 
> I'm relatively new to Python, but I immediately fell in love with it
> because of its clear structure and easy-to-read code. Unfortunately I
> still lack a lot of knowledge. Maybe someone can help me or point me to
> the right spot in the docs.
> 
> I'd like to parse through my website tree and generate an automated
> sitemap from the <title> tags. Is there a module that helps me achieve
> this task?
> 
> Furthermore, I'd like to copy all *.html, *.jpg and *.gif files to my
> ISP, preserving the directory structure. Before copying, the *.html
> files have to be filtered (adding a header and a footer). Has anybody
> already written such routines?
> 
> Regards,
> Oliver


