[Tutor] PDF to text conversion

Robert Berman bermanrl at cfl.rr.com
Tue Apr 21 19:44:16 CEST 2009


Hello Emad,

I have looked closely at the documentation for pyPdf. It seems to have
the page as its smallest unit of work, and what I need is a line-by-line
process to go from PDF format to text. I don't think pyPdf will meet my
needs, but thank you for bringing it to my attention.
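
For reference, page-level text can still be processed line by line once
it has been extracted. The following is only a minimal sketch, not a
tested solution for this file: the names iter_lines, parse_readings and
RECORD are made up for illustration, the sample readings are invented,
and the record layout (an mm/dd/yy date, whitespace, then a 10-character
result) is assumed from the description quoted below.

```python
import re
from datetime import datetime

# Assumed layout of one reading: mm/dd/yy, whitespace, then a
# 10-character result made of digits and special characters.
RECORD = re.compile(r"^(\d{2}/\d{2}/\d{2})\s+(\S{10})$")


def iter_lines(pages):
    """Yield non-empty, stripped lines from an iterable of page texts."""
    for page_text in pages:
        for line in page_text.splitlines():
            line = line.strip()
            if line:
                yield line


def parse_readings(pages):
    """Map each reading date to its result string.

    Lines that do not match the assumed layout (titles, page
    headers, and so on) are skipped.
    """
    readings = {}
    for line in iter_lines(pages):
        match = RECORD.match(line)
        if match is None:
            continue
        date_text, result = match.groups()
        taken = datetime.strptime(date_text, "%m/%d/%y").date()
        readings[taken] = result
    return readings


if __name__ == "__main__":
    # Invented sample data standing in for text extracted from two pages.
    sample_pages = [
        "History of readings\n05/03/88  120/80-072\n05/06/88  118/76-068\n",
        "04/21/09  122/79-070\n",
    ]
    for taken, result in sorted(parse_readings(sample_pages).items()):
        print(taken, result)
```

With pyPdf, the page strings would presumably come from calling
extractText() on each page object. Whether the extracted text keeps the
original line breaks depends on how the PDF was produced, so that part
would need checking against the real file.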

Thanks,


Robert Berman

Emad Nawfal (عماد نوفل) wrote:
>
>
> On Tue, Apr 21, 2009 at 12:54 PM, bob gailer <bgailer at gmail.com>
> wrote:
>
>     Robert Berman wrote:
>
>         Hi,
>
>         I must convert a history file in PDF format that goes from May
>         of 1988 to the current date. Readings are taken twice weekly
>         and consist of the date taken (mm/dd/yy) and the result, which
>         appears as a 10-character sequence of digits and special
>         characters. This is obviously an easy setup for a very small
>         database application with the date as the key and the result
>         string as the data.
>
>         My problem is converting the PDF file into a text file which I
>         can then read and process. I do not see any free Python
>         libraries with this capability. I did see a PDFPILOT program
>         for Windows, but this application is being developed on Linux
>         and should also run on Windows, so I do not want to
>         incorporate a Windows-only application.
>
>         I do not think I am breaking any new frontiers with this
>         application. Have any of you worked with such a library, or do
>         you know of one or two I can download and work with?
>         Hopefully, they have reasonable documentation.
>
>
>     If this is a one-time conversion, just use the "save as text"
>     feature of Adobe Reader.
>
>
>
>         My development environment is:
>
>         Python
>         Linux
>         Ubuntu version 8.10
>
>
>         Thanks for any help you might be able to offer.
>
>
>         Robert Berman
>         _______________________________________________
>         Tutor maillist  -  Tutor at python.org
>         http://mail.python.org/mailman/listinfo/tutor
>
>
>
>     -- 
>     Bob Gailer
>     Chapel Hill NC
>     919-636-4239
>
>
>
>
> I tried pyPdf once, just for fun, and it was nice:
> http://pybrary.net/pyPdf/
> -- 
> لا أعرف مظلوما تواطأ الناس علي هضمه ولا زهدوا في إنصافه 
> كالحقيقة.....محمد الغزالي
> "No victim has ever been more repressed and alienated than the truth"
>
> Emad Soliman Nawfal
> Indiana University, Bloomington
> --------------------------------------------------------

