export sites/pages to PDF

Nick Craig-Wood nick at craig-wood.com
Tue Aug 12 19:33:42 EDT 2008


jvdb <streamservenl at gmail.com> wrote:
>  My employer is asking for a solution that outputs the content of urls
>  to pdf. It must be the content as seen within the browser.
>  Can someone help me on this? It must be able to export several kind of
>  pages with all kind of content (javascript, etc.)

Sounds like you'd be best off scripting a browser.

E.g. under KDE you can print to PDF from Konqueror, using dcop to
control it remotely.

Here is a demo. Start Konqueror and select the PDF printer manually
before you begin. (I expect you can automate this too!)

Run

  dcop konq*

to find the id of the running konqueror (in my case
"konqueror-18286"), then open a URL

  dcop konqueror-18286 konqueror-mainwindow#1 openURL http://www.google.com
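
If you would rather not read the id off the terminal by hand, here is a
rough sketch of finding it from Python. It assumes that a bare "dcop"
prints one registered application per line and that Konqueror ids start
with "konqueror-" as in my case above; the function names are made up
for illustration.

```python
import subprocess


def konqueror_ids(dcop_listing):
    """Pick Konqueror application ids out of the text printed by `dcop`."""
    ids = []
    for line in dcop_listing.splitlines():
        name = line.strip()
        # Assumption: ids look like "konqueror-<pid>", as in the demo.
        if name.startswith("konqueror-"):
            ids.append(name)
    return ids


def running_konquerors():
    """Ask dcop for its registered applications and keep the Konquerors."""
    result = subprocess.run(
        ["dcop"], capture_output=True, text=True, check=True
    )
    return konqueror_ids(result.stdout)
```

The parsing is kept separate from the dcop call so it can be tried on a
pasted listing without KDE running.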

To print to a PDF file

  dcop konqueror-18286 html-widget2 print 1

Web site converted to PDF in ~/print.pdf ;-)

Easy enough to script that with Python.
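
Something along these lines might do as a starting point. It just wraps
the two dcop commands from the demo with subprocess. The object names
("konqueror-mainwindow#1", "html-widget2") are the ones from my session
and may well differ on yours, and the fixed delay before printing is
only a guess at giving the page time to load. The function names are
made up for illustration.

```python
import subprocess
import time


def open_url_cmd(app_id, url, window="konqueror-mainwindow#1"):
    """Build the dcop command that opens a URL in a running Konqueror."""
    return ["dcop", app_id, window, "openURL", url]


def print_cmd(app_id, widget="html-widget2"):
    """Build the dcop command that prints the current page."""
    # Assumption: the widget name matches the demo; check with dcop first.
    return ["dcop", app_id, widget, "print", "1"]


def url_to_pdf(app_id, url, run=subprocess.check_call,
               wait=time.sleep, delay=10):
    """Open the URL, wait a while, then print it to the selected printer."""
    run(open_url_cmd(app_id, url))
    # Crude assumption: a fixed pause is enough for the page to finish
    # loading before it is printed.
    wait(delay)
    run(print_cmd(app_id))
```

Usage would be something like url_to_pdf("konqueror-18286",
"http://www.google.com"), with the PDF printer already selected as
described above.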

See here for some more info on dcop:

  http://www.ibm.com/developerworks/linux/library/l-dcop/

-- 
Nick Craig-Wood <nick at craig-wood.com> -- http://www.craig-wood.com/nick


