Scraping a web page

Iain King iainking at gmail.com
Wed Apr 8 11:14:46 EDT 2009


On Apr 7, 1:44 pm, Tim Chase <python.l... at tim.thechases.com> wrote:
> > f = urllib.urlopen("http://www.google.com")
> > s = f.read()
>
> > It is working, but it's returning the source of the page. Is there any way I
> > can get something like a screen capture of the page?
>
> This is the job of a browser -- to render the source HTML.  As
> such, you'd want to look into any of the browser-automation
> libraries to hook into IE, Firefox, Opera, or maybe use the
> WebKit/KHTML control.  You may then be able to direct it to
> render the HTML into a canvas you can then treat as an image.
>
> Another alternative might be provided by some web-services that
> will render a page as HTML with various browsers and then send
> you the result.  However, these are usually either (1)
> asynchronous or (2) paid services (or both).
>
> -tkc

wxPython (wx) can render HTML, e.g. with its wx.html.HtmlWindow widget.
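
A minimal sketch of that idea, assuming Python 3 and an installed wxPython: fetch the page source, then hand it to a wx.html.HtmlWindow for display. The helper names fetch_source and show_html are made up for illustration and are not from the thread. The widget only handles a subset of HTML (no JavaScript, very little styling), so complex pages will look different from a real browser, and this shows the page in a window rather than saving an image.

```python
import urllib.request


def fetch_source(url, default_encoding="utf-8"):
    """Return the raw HTML source of `url` as text (nothing is rendered here)."""
    with urllib.request.urlopen(url) as response:
        raw = response.read()
        # Use the charset declared in the Content-Type header, if there is one.
        encoding = response.headers.get_content_charset() or default_encoding
    return raw.decode(encoding, errors="replace")


def show_html(source, title="Rendered page", size=(800, 600)):
    """Display `source` in a wx.html.HtmlWindow; blocks until the window closes."""
    # Imported here so that fetch_source() still works without wxPython installed.
    import wx
    import wx.html

    app = wx.App(False)
    frame = wx.Frame(None, title=title, size=size)
    viewer = wx.html.HtmlWindow(frame)
    viewer.SetPage(source)  # parse the HTML and lay it out inside the widget
    frame.Show()
    app.MainLoop()
```

Something like show_html(fetch_source("http://www.google.com")) should then open a window with the rendered page, within the limits of what the widget supports.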


