scraping a tumblr.com archive page

Jabba Laci jabba.laci at gmail.com
Sun Nov 20 13:06:44 EST 2011


Hi,

I want to extract the URLs of all the posts on a tumblr blog. Take,
for instance, this blog: http://loveyourchaos.tumblr.com/archive
If I download this page with a script, the HTML contains only 50
posts. As you scroll down the archive in a browser, the page
dynamically loads more and more posts.
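
To make the limitation concrete, here is a minimal sketch of the static
approach described above. It only sees whatever links are present in the
HTML it is given, so it cannot reach posts that the browser loads later.
The names PostLinkParser, extract_post_urls and fetch are hypothetical,
and the rule that a post permalink contains "/post/" is an assumption
about the page markup, not something I have verified.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class PostLinkParser(HTMLParser):
    """Collect <a href> values that look like post permalinks.

    Assumption: a post permalink contains "/post/" in its path.
    """

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href and "/post/" in href:
            # Resolve relative links against the archive URL.
            url = urljoin(self.base_url, href)
            if url not in self.links:  # keep first occurrence only
                self.links.append(url)


def extract_post_urls(html, base_url):
    """Return the post URLs found in one chunk of static HTML."""
    parser = PostLinkParser(base_url)
    parser.feed(html)
    return parser.links


def fetch(url):
    """Download a page as text (no JavaScript is executed)."""
    with urlopen(url) as resp:
        return resp.read().decode("utf-8", errors="replace")
```

It could be used as extract_post_urls(fetch(archive_url), archive_url),
which would return only the links in the initially served HTML.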

How can I scrape such a dynamic page?

Thanks,

Laszlo


