How to generate pdf file from an html page??

Ramsey Nasser aladameh at gmail.com
Sun Dec 16 13:46:58 EST 2007


On Dec 16, 2007 7:26 PM, Zentrader <zentraders at gmail.com> wrote:
> Sorry, I read that backwards.  I do it the opposite of you.  Anyway a
> google for "html to pdf python" turns up a lot of hits.  Again, no
> reason to reinvent the wheel.
>
> --
> http://mail.python.org/mailman/listinfo/python-list
>

Like Zentrader said, there's no reason to reinvent the wheel. Writing an
HTML to PDF converter is no trivial task. You would essentially have to
implement an HTML layout engine that outputs PDF files. Not only does
that mean you would have to programmatically produce a PDF file, but it
also means you would have to parse and correctly render HTML and CSS
according to the W3C's specifications. This has proved difficult to get
right in practice, as is evident from the browser compatibility issues
that continue to plague the web.
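
To give a feel for where the difficulty lies, here is a minimal sketch of
only the parsing step, using the standard library's html.parser. The
class name TextBlocks, the helper extract_blocks, and the choice of
block-level tags are my own assumptions for illustration. It pulls the
text out of block elements and does nothing about CSS, layout, fonts, or
pagination, which is where the real work would be.

```python
from html.parser import HTMLParser


class TextBlocks(HTMLParser):
    """Collect the text of block-level elements, ignoring all styling."""

    # Hypothetical, incomplete set of tags treated as block boundaries.
    BLOCK_TAGS = {"p", "h1", "h2", "h3", "li", "div"}

    def __init__(self):
        super().__init__()
        self.blocks = []      # finished text blocks, in document order
        self._current = []    # text fragments of the block being read

    def handle_starttag(self, tag, attrs):
        if tag in self.BLOCK_TAGS:
            self._flush()

    def handle_endtag(self, tag):
        if tag in self.BLOCK_TAGS:
            self._flush()

    def handle_data(self, data):
        self._current.append(data)

    def _flush(self):
        # Join the fragments and collapse runs of whitespace.
        text = " ".join("".join(self._current).split())
        if text:
            self.blocks.append(text)
        self._current = []


def extract_blocks(html):
    """Return the list of text blocks found in an HTML string."""
    parser = TextBlocks()
    parser.feed(html)
    parser.close()
    parser._flush()  # pick up any trailing text outside a block tag
    return parser.blocks


if __name__ == "__main__":
    print(extract_blocks("<h1>Title</h1><p>Hello <b>world</b></p>"))
```

Everything after this point (deciding where each block goes on a page
and how it looks) is the layout engine, and that is the part you do not
want to write yourself.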

There's a package called Prince that's supposed to do an excellent job.
Check it out:

http://www.princexml.com/

Its layout engine surpasses some browsers in terms of compatibility
with web standards. I don't think it's free for commercial use, though,
so whether it suits you depends on what exactly you're trying to do.
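
If you go that route, calling it from Python could look roughly like the
sketch below. It assumes Prince is installed and that its command-line
program is on your PATH under the name "prince"; the function names
prince_command and html_to_pdf are made up for this example, so check
the Prince documentation for the options your version supports.

```python
import shutil
import subprocess


def prince_command(html_path, pdf_path, executable="prince"):
    """Build the argument list for: prince INPUT.html -o OUTPUT.pdf"""
    return [executable, html_path, "-o", pdf_path]


def html_to_pdf(html_path, pdf_path):
    """Convert one HTML file to PDF; return True if Prince succeeded.

    Returns False when no "prince" executable can be found on PATH
    (assumption: that is the name the installer uses on your system).
    """
    if shutil.which("prince") is None:
        return False
    result = subprocess.run(prince_command(html_path, pdf_path))
    return result.returncode == 0


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    print(html_to_pdf("page.html", "page.pdf"))
```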

An alternative idea is to wait for Firefox 3 to come out. If I'm not
mistaken, it will feature a new version of the Gecko layout engine
which will use Cairo for all its rendering. Coincidentally, Cairo can be
made to output PDF files. So, you may be able to hack something
together.
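
To show just the Cairo half of that idea, here is a small sketch that
draws a line of text onto a PDF surface. It assumes the pycairo bindings
are installed (imported as "cairo"); the function name
render_text_to_pdf and the page size are my own choices. It does not
involve Gecko at all, so it says nothing about how you would get
rendered HTML onto the surface.

```python
def render_text_to_pdf(pdf_path, text, width=595, height=842):
    """Write a one-page PDF containing a single line of text.

    Width and height are in points (1/72 inch); the defaults are
    roughly A4. Requires the third-party pycairo package.
    """
    import cairo  # imported here so the module loads without pycairo

    surface = cairo.PDFSurface(pdf_path, width, height)
    ctx = cairo.Context(surface)

    ctx.select_font_face("Sans", cairo.FONT_SLANT_NORMAL,
                         cairo.FONT_WEIGHT_NORMAL)
    ctx.set_font_size(14)
    ctx.move_to(72, 72)      # one inch from the left and top edges
    ctx.show_text(text)

    ctx.show_page()          # emit the page
    surface.finish()         # flush and close the PDF file


if __name__ == "__main__":
    render_text_to_pdf("hello.pdf", "Hello from Cairo")
```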

-- 
nasser


