[Tutor] PDF to text conversion

David david at abbottdavid.com
Wed Apr 22 14:11:27 CEST 2009


Robert Berman wrote:
> Dinesh,
> 
> I have pdftotext version 3.0.0.  I have decided to use this to go from 
> PDF to text. It is not the ideal solution, but it is certainly a doable 
> solution.
> 
> Thank you,
> 
> Robert
> 
> Dinesh B Vadhia wrote:
>> The best converter so far is pdftotext from 
>> http://www.glyphandcog.com/ who maintain an open source project at 
>> http://www.foolabs.com/xpdf/.
>>  
>> It's not a Python library but you can call pdftotext from within Python 
>> using os.system().  I used the pdftotext -layout option and that gave 
>> the best result.  hth.
>>  
>> dinesh
>>  
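You can use subprocess instead of os.system():
#!/usr/bin/python

from subprocess import call

# Run pdftotext on test.pdf; with no output name given, the text
# is written next to the input as test.txt.
call(['pdftotext', 'test.pdf'])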
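
If you want the text back in Python rather than in a file, something along
these lines might work. This is only a sketch, assuming a Python 3.7+
interpreter and a pdftotext build that accepts '-' as the output name to
mean stdout; the function names build_pdftotext_command and pdf_to_text are
made up for illustration. It also passes the -layout option Dinesh mentioned.

```python
import subprocess


def build_pdftotext_command(pdf_path, layout=True):
    """Return the argument list for a pdftotext call that writes to stdout.

    Hypothetical helper: '-' as the output name is assumed to mean stdout.
    """
    command = ['pdftotext']
    if layout:
        # -layout asks pdftotext to keep the original physical layout
        command.append('-layout')
    command.extend([pdf_path, '-'])
    return command


def pdf_to_text(pdf_path, layout=True):
    """Run pdftotext and return the extracted text as a string."""
    result = subprocess.run(
        build_pdftotext_command(pdf_path, layout),
        capture_output=True,  # collect stdout and stderr
        text=True,            # decode output to str
        check=True,           # raise CalledProcessError on a non-zero exit
    )
    return result.stdout
```

Calling pdf_to_text('test.pdf') should then give you the text as a string,
provided pdftotext is on your PATH. Passing the arguments as a list avoids
going through the shell, so file names with spaces need no quoting.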

-david
-- 
Powered by Gentoo GNU/Linux
http://linuxcrazy.com

