PDF library?
Andreas Lobinger
andreas.lobinger at netsurf.de
Wed Apr 21 04:28:51 EDT 2004
Aloha,
Paul Rubin wrote:
> Simon Burton <simonb at NOTTHISBIT.webone.com.au> writes:
> > http://www.reportlab.org/
> > handles pdf files.
> Reportlab generates reports in pdf format, but I want to do the
> opposite, namely read in pdf files that have already been generated by
> a different program, and crunch on them. Any more ideas? Thanks.
The commercial version (reportlab.com) mentions a tool named
PageCatcher, which seems to be able to extract pages and page descriptions
from .pdf documents. There is not much information about it on the web page.
If you read comp.text.tex you will find various solutions for composing
.pdf documents and a few for extracting data/content from them. Afaik
there is at the moment (read as: I'm working on it) no free,
self-contained Python solution. But as Python is very interface-friendly,
you can easily drive general tools like gs (Ghostscript) from it.
For your problem I would suggest using gs as a .pdf to .ps filter
first, working on the .ps, and then distilling back to .pdf with gs.
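A minimal sketch of that round trip, driving gs from Python. The helper
names (gs_command, pdf_to_ps, ps_to_pdf) and the file names are made up
for illustration, and it assumes a gs executable on the PATH. The
ps2write device is what current Ghostscript releases provide; older
releases may need a different device name such as pswrite.

```python
import subprocess


def gs_command(src, dst, device):
    """Build a Ghostscript command line converting src to dst.

    Hypothetical helper: only assembles the argument list, runs nothing.
    """
    return [
        "gs",
        "-q",          # suppress startup messages
        "-dNOPAUSE",   # do not pause after each page
        "-dBATCH",     # exit when the input has been processed
        "-dSAFER",     # restrict file operations from within the input
        "-sDEVICE=" + device,
        "-sOutputFile=" + dst,
        src,
    ]


def pdf_to_ps(pdf_path, ps_path, run=subprocess.check_call):
    """Convert a .pdf to .ps (assumes the ps2write device is available)."""
    return run(gs_command(pdf_path, ps_path, "ps2write"))


def ps_to_pdf(ps_path, pdf_path, run=subprocess.check_call):
    """Distill a .ps back to .pdf with the pdfwrite device."""
    return run(gs_command(ps_path, pdf_path, "pdfwrite"))
```

Usage would be something like pdf_to_ps("in.pdf", "work.ps"), then
whatever processing you need on work.ps, then ps_to_pdf("work.ps",
"out.pdf"). The run parameter is only there so the command can be
inspected without actually calling gs.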
Wishing a happy day
LOBI