New newbie question.
John Hunter
jdhunter at nitace.bsd.uchicago.edu
Tue Jul 9 15:55:24 EDT 2002
>>>>> "SA" == SA <sarmstrong13 at mac.com> writes:
SA> Can you read a pdf with Python? I know you can read a text
SA> file with:
SA> Inp = open("textfile", "r")
SA> Will the same thing work on pdf files:
SA> Inp = open("pdffile", "rb")
You can do this, but you'll get the raw binary contents of the file
rather than readable text.
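To see what that binary looks like, here is a small sketch (written for Python 3, not part of the original reply) that opens a file in "rb" mode and looks at the first few bytes. The function names read_raw_bytes and looks_like_pdf are made up for illustration; the only fact relied on is that PDF files normally begin with the signature "%PDF-".

```python
def read_raw_bytes(path, count=8):
    """Return the first `count` raw bytes of the file at `path`."""
    # "rb" opens the file in binary mode, so we get bytes, not decoded text
    with open(path, "rb") as inp:
        return inp.read(count)


def looks_like_pdf(path):
    """Return True if the file starts with the usual '%PDF-' signature."""
    return read_raw_bytes(path, 5) == b"%PDF-"
```

Calling read_raw_bytes("pdffile") on a real PDF should return something that starts with b"%PDF-" followed by a version number; the rest of the file is mostly compressed streams, which is why reading it directly is not much use for getting at the text.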
If you are on a Linux system, you may have pdftotext already
installed, and can call that command from Python with:
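# Example usage:
#   python ~/python/examples/pdf_demo.py HunterEtal2000.pdf
import os, sys

filename = sys.argv[1]
# The trailing '-' tells pdftotext to write the text to stdout;
# the quotes keep filenames with spaces in one piece
command = os.popen('pdftotext "%s" -' % filename)
for line in command.readlines():
    # each line already ends with a newline, so write it unchanged
    sys.stdout.write(line)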
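On a Python 3 interpreter the same idea can be sketched with the subprocess module, which passes the filename as a separate argument instead of building a shell string. This is only a sketch under some assumptions: the helper name pdf_to_text and its tool parameter are made up for illustration, pdftotext is assumed to be on your PATH, and the output is assumed to be UTF-8 (undecodable bytes are replaced rather than raising an error).

```python
import shutil
import subprocess


def pdf_to_text(filename, tool="pdftotext"):
    """Run `tool filename -` and return whatever it writes to stdout as a string."""
    # Fail early with a clear message if the converter is not installed
    if shutil.which(tool) is None:
        raise FileNotFoundError("%s was not found on PATH" % tool)
    # The trailing "-" asks pdftotext to write the text to stdout
    result = subprocess.run(
        [tool, filename, "-"],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        check=True,  # raise CalledProcessError if the tool exits with an error
    )
    # Assumption: the tool emits UTF-8; replace anything that does not decode
    return result.stdout.decode("utf-8", errors="replace")


# Example usage (hypothetical file name):
#   text = pdf_to_text("HunterEtal2000.pdf")
#   for line in text.splitlines():
#       print(line)
```

Passing the arguments as a list means filenames with spaces or shell metacharacters are handed to pdftotext untouched.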
You may want to have a look at these two Python apps for working
with PDFs:
http://www.reportlab.com/index.html - emphasis on pdf generation
http://pdfsearch.sourceforge.net - search pdfs
Cheers,
John Hunter