highlight words by regex in pdf files using python

David Boddie david at boddie.org.uk
Wed Mar 17 18:40:43 EDT 2010


On Wednesday 17 March 2010 00:47, Aahz wrote:

> In article
> <af0830ae-1d24-4db9-b721-d6602fedd540 at 15g2000yqi.googlegroups.com>,
> Peng Yu  <pengyu.ut at gmail.com> wrote:
>>
>>I don't find a general pdf library in python that can do any
>>operations on pdfs.
>>
>>I want to automatically highlight certain words (using regex) in a
>>pdf. Could somebody let me know if there is a tool to do so in python?
> 
> Did you Google at all?  "python pdf" finds this as the first link, though
> I have no clue whether it does what you want:
> 
> http://pybrary.net/pyPdf/

The original poster might also be interested in displaying the highlighted
words without modifying the original file, in which case the Poppler
library is worth investigating:

  http://poppler.freedesktop.org/
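
As a rough illustration of the "highlight without modifying the file" idea,
here is a minimal sketch of the matching step only. It assumes that some PDF
library (Poppler or another) has already given you the words on a page along
with their bounding boxes; the Word type and the boxes_to_highlight function
are hypothetical names made up for this example and are not part of Poppler
or pyPdf. A viewer could then draw rectangles over the returned boxes.

```python
import re
from typing import List, NamedTuple, Tuple

Box = Tuple[float, float, float, float]  # (x0, y0, x1, y1) in page units


class Word(NamedTuple):
    """Hypothetical container: one word and its box, as extracted elsewhere."""
    text: str
    box: Box


def boxes_to_highlight(words: List[Word], pattern: str) -> List[Box]:
    """Return the boxes of every word touched by a regex match.

    The words are joined with single spaces to form the page text, so a
    pattern may span several words (e.g. r"pdf\\s+library").
    """
    regex = re.compile(pattern)

    # Record where each word starts and ends inside the joined text.
    spans = []
    pos = 0
    for word in words:
        spans.append((pos, pos + len(word.text)))
        pos += len(word.text) + 1  # +1 for the joining space
    text = " ".join(word.text for word in words)

    hit_indices = []
    for match in regex.finditer(text):
        if match.start() == match.end():
            continue  # ignore empty matches
        for index, (start, end) in enumerate(spans):
            overlaps = start < match.end() and match.start() < end
            if overlaps and index not in hit_indices:
                hit_indices.append(index)

    # Report boxes in reading order.
    return [words[index].box for index in sorted(hit_indices)]


if __name__ == "__main__":
    page = [
        Word("general", (10.0, 10.0, 50.0, 20.0)),
        Word("pdf", (55.0, 10.0, 70.0, 20.0)),
        Word("library", (75.0, 10.0, 110.0, 20.0)),
    ]
    for box in boxes_to_highlight(page, r"pdf\s+lib"):
        print(box)
```

Joining the words with single spaces is a simplification: real page text has
line breaks and hyphenation, so patterns that depend on exact whitespace may
need adjusting.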

David


