Read PDF content

James Matthews nytrokiss at gmail.com
Thu Aug 21 13:14:37 EDT 2008


You can also use PDFlib:
http://www.pdflib.com/download/pdflib-family/pdflib-7/

On Thu, Aug 21, 2008 at 6:47 AM, William Purcell
<williamhpurcell at gmail.com> wrote:

> Sorry, this last email was meant to be to the list.
>
> On Thu, Aug 21, 2008 at 8:41 AM, William Purcell <
> williamhpurcell at gmail.com> wrote:
>
>> I have been trying to do the same thing. Here is something I came up with,
>> although it's not pure Python: it requires pdftotext to be installed. If
>> you're on a Linux box, I think it comes in xpdf-utils, but I'm not
>> completely sure. Anyway, install pdftotext and then you could use this
>> function:
>>
>> ----------------------------------------------------------------------------
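>> import subprocess
>>
>> def readpdf(filepath):
>>     # Pass the arguments as a list (no shell), so paths containing
>>     # spaces or shell metacharacters are handled safely.
>>     # The trailing '-' tells pdftotext to write the text to stdout.
>>     cmd = ['pdftotext', '-layout', filepath, '-']
>>     # Requires Python 3.7+; check=True raises if pdftotext fails.
>>     out = subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
>>     # Return a list of lines with newlines kept, as readlines() would.
>>     return out.splitlines(keepends=True)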
>>
>> ----------------------------------------------------------------------------
>> I would like to find something totally Python, but this has worked for me
>> in a pinch.
>> -Bill
>>
>>
>>    On Thu, Aug 21, 2008 at 5:00 AM, AON LAZIO <aonlazio at gmail.com> wrote:
>>
>>>    Hi, guys.
>>>       I am trying to extract the content of a PDF file (to get some
>>> specific information) using Python. I already tried pyPdf with no success.
>>>       Does anyone have suggestions?
>>>       Thanks in advance.
>>>
>>> Aonlazio
>>>
>>> --
>>> http://mail.python.org/mailman/listinfo/python-list
>>>
>>
>>
>
>
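
Once you have the text as a list of lines (for example from the readpdf
function quoted above), pulling out "the specific information" is mostly a
matter of pattern matching. Here is a rough sketch, not a definitive recipe:
the helper name find_fields, the sample lines, and the regular expressions
are all made up for illustration, so adapt them to whatever your PDF
actually contains.

```python
import re

def find_fields(lines, pattern):
    """Return the first capture group of every line that matches pattern."""
    rx = re.compile(pattern)
    hits = []
    for line in lines:
        m = rx.search(line)
        if m:
            # Strip stray whitespace left over from the -layout spacing.
            hits.append(m.group(1).strip())
    return hits

# Hypothetical sample standing in for the output of readpdf().
sample = [
    "Invoice No:   12345\n",
    "Date: 2008-08-21\n",
    "Total:  $99.50\n",
]

print(find_fields(sample, r"Invoice No:\s*(\S+)"))  # prints ['12345']
```

With a real file you would pass readpdf(path) in place of sample. Since
pdftotext -layout pads columns with spaces, patterns that tolerate variable
whitespace (\s*) tend to hold up better than exact string matches.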



-- 
http://www.goldwatches.com/