PDF->Text converter/extractor

Paul Rubin phr-n2001d at nightsong.com
Mon Nov 5 16:15:16 EST 2001


"Igor Stroh" <igor.stroh at wohnheim.uni-ulm.de> writes:
> though I didn't find anything yet, perhaps there is someone who already
> had the same problem and solved it by writing their own PDF parser? :) I'm
> too lazy to start reading the specs of PDF and try to write the thingy by
> myself :)

I think there's a PostScript-to-text converter somewhere based on
Ghostscript.  PDF isn't literally compressed PostScript, but it shares
the same imaging model and Ghostscript can read both, so it should be
straightforward to modify the PostScript-to-text converter to handle
PDFs, if it hasn't been done already.
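
For what it's worth, here is a rough sketch of driving Ghostscript from
Python to pull the text out of a PDF.  It assumes a `gs` executable is on
the PATH and that the build includes the `txtwrite` output device (present
in recent Ghostscript releases, not necessarily older ones).  The function
names `build_gs_command` and `pdf_to_text` are made up for this example.

```python
import shutil
import subprocess


def build_gs_command(pdf_path, txt_path, gs_binary="gs"):
    """Return the argument list for a Ghostscript text-extraction run.

    Assumption: the installed Ghostscript provides the txtwrite device.
    """
    return [
        gs_binary,
        "-q",                        # suppress the startup banner
        "-dNOPAUSE",                 # don't prompt between pages
        "-dBATCH",                   # exit once the input is processed
        "-dSAFER",                   # restrict file access from the document
        "-sDEVICE=txtwrite",         # emit extracted text, not a rendering
        "-sOutputFile=" + txt_path,  # where the text goes
        pdf_path,
    ]


def pdf_to_text(pdf_path, txt_path, gs_binary="gs"):
    """Run Ghostscript on pdf_path and write the extracted text to txt_path."""
    if shutil.which(gs_binary) is None:
        raise RuntimeError("Ghostscript executable not found: " + gs_binary)
    # check=True raises CalledProcessError if Ghostscript exits non-zero.
    subprocess.run(build_gs_command(pdf_path, txt_path, gs_binary), check=True)
```

Calling `pdf_to_text("paper.pdf", "paper.txt")` would then leave the text in
paper.txt.  How good the result is depends on the PDF: scanned pages with no
text layer will come out empty, and multi-column layouts may be reordered.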


