[Tutor] PDF to text conversion

Robert Berman bermanrl at cfl.rr.com
Tue Apr 21 18:48:55 CEST 2009


Hi,

I must convert a history file in PDF format that runs from May 1988 to 
the present. Readings are taken twice weekly; each consists of the date 
taken (mm/dd/yy) and a result that appears as a 10-character sequence of 
digits and special characters. This is obviously an easy setup for a 
very small database application, with the date as the key and the result 
string as the data.
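
To make the layout concrete, here is a minimal sketch of the processing 
step I have in mind once the text exists. It assumes one reading per 
line, with the date and the result separated by whitespace; the real 
layout of the converted text is unknown until the PDF has been 
converted. The names READING, parse_readings and the sample values are 
made up for illustration.

```python
import re
from datetime import datetime

# Assumed layout: "mm/dd/yy", whitespace, then a 10-character result.
# The actual layout depends on how the PDF converts to text.
READING = re.compile(r"^\s*(\d{2}/\d{2}/\d{2})\s+(\S{10})\s*$")


def parse_readings(lines):
    """Return a dict mapping each reading date to its result string."""
    readings = {}
    for line in lines:
        match = READING.match(line)
        if match is None:
            continue  # skip headers, blank lines, anything not a reading
        date_text, result = match.groups()
        # %y maps 69-99 to 1969-1999 and 00-68 to 2000-2068, which
        # covers a history running from 1988 onward.
        taken = datetime.strptime(date_text, "%m/%d/%y").date()
        readings[taken] = result
    return readings


# Invented sample lines, only to exercise the parser.
sample = [
    "Date     Result",
    "05/03/88 12-34+56*7",
    "",
    "04/21/09 98*76-54+3",
]
history = parse_readings(sample)
```

A plain dict keyed by date is enough for a sketch; the same pairs could 
later go into a small database table with the date as the primary key. 
A line that matches the pattern but holds an impossible date (month 13, 
for example) would raise ValueError here rather than be skipped.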

My problem is converting the PDF file into a text file which I can then 
read and process. I have not found any free Python libraries that can do 
this. I did see a PDFPILOT program for Windows, but this application is 
being developed on Linux and should also run on Windows, so I do not 
want to incorporate a Windows-only application.

I do not think I am breaking any new frontiers with this application. 
Have any of you worked with such a library, or do you know of one or two 
I can download and work with? Ideally, they would have reasonable 
documentation.

My development environment is:

Python
Linux
Ubuntu version 8.10


Thanks for any help you might be able to offer.


Robert Berman

