PDF Parser?

Adam Twardoch list.adam at twardoch.com
Tue Jul 15 04:09:06 EDT 2003


"John Hunter" <jdhunter at ace.bsd.uchicago.edu>

> A little more info would be helpful: do you need access to all the pdf
> structures or just the text?  AFAIK, there is no full pdf parser in
> python.

If you need access to the graphical elements, you can use pstoedit to
convert the PDF into SVG (Scalable Vector Graphics). Since SVG is XML, you
can then parse the data with any Python-based XML toolkit.
http://www.pstoedit.net/pstoedit
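
To illustrate the parsing half of that idea, here is a minimal sketch that
assumes the standard library's xml.etree.ElementTree as the XML toolkit. The
SAMPLE_SVG string and the list_elements helper are hypothetical stand-ins I
made up for illustration; real converter output will be larger and may be
structured differently, so treat this as a starting point only.

```python
import xml.etree.ElementTree as ET

# Namespace used by SVG documents; element tags are qualified with it.
SVG_NS = "http://www.w3.org/2000/svg"

# Hypothetical sample standing in for a converted file (not real pstoedit output).
SAMPLE_SVG = """<svg xmlns="http://www.w3.org/2000/svg" width="100" height="100">
  <g>
    <path d="M 10 10 L 90 10 L 90 90 Z" fill="none" stroke="black"/>
    <rect x="20" y="20" width="30" height="40"/>
  </g>
  <path d="M 0 0 L 5 5"/>
</svg>"""


def list_elements(svg_text, tag):
    """Return the attribute dicts of every SVG element named `tag`."""
    root = ET.fromstring(svg_text)
    qualified = "{%s}%s" % (SVG_NS, tag)
    # iter() walks the whole tree, so nested groups are included.
    return [dict(el.attrib) for el in root.iter(qualified)]


paths = list_elements(SAMPLE_SVG, "path")
for attrs in paths:
    # The "d" attribute holds the path's drawing commands.
    print(attrs.get("d"))
```

For a file on disk, ET.parse(filename).getroot() would take the place of
ET.fromstring.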

Adam





