Extracting images from a PDF file

writeson doug.farrell at gmail.com
Thu Dec 27 10:13:39 EST 2007


On Dec 27, 1:12 am, Carl K <c... at personnelware.com> wrote:
> Doug Farrell wrote:
> > Hi all,
>
> > Does anyone know how to extract images from a PDF file? What I'm looking
> > to do is use pdflib_py to open large PDF files on our Linux servers,
> > then use PIL to verify image data. I want to do this in order
> > to find corrupt images in the PDF files. If anyone could help
> > me out, or point me in the right direction, it would be most
> > appreciated!
>
> If you are ok shelling out to a binary:
>
> pdfimages - Portable Document Format (PDF) image extractor (version 3.00)
> http://packages.ubuntu.com/gutsy/text/xpdf-utils
>
> I am trying to convert the PDF to a PNG, but without having to run external
> commands, so I will understand if you aren't happy with pdfimages.
>
> Carl K

Carl,

Thanks for the feedback, and I don't mind shelling out to an external
command if it gets the job done. Thanks for the link to xpdf-utils,
I'm going to look into it this morning.
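
A rough sketch of how that could fit together in Python: shell out to
pdfimages to dump the images into a scratch directory, then let PIL try
to open each one. The function names are made up for illustration, the
`from PIL import Image` form assumes a Pillow-style install (older PIL
uses a plain `import Image`), and verify() only catches some kinds of
damage, so treat this as a starting point rather than a tested tool.

```python
import subprocess
import tempfile
from pathlib import Path


def pdfimages_command(pdf_path, image_root):
    """Build the pdfimages argv; -j saves JPEG (DCT) streams as .jpg files."""
    return ["pdfimages", "-j", str(pdf_path), str(image_root)]


def is_image_ok(path):
    """Return True if PIL can open and verify the file without an error."""
    from PIL import Image  # imported lazily so the helpers work without PIL

    try:
        with Image.open(path) as img:
            img.verify()  # integrity check that does not decode the pixels
        return True
    except Exception:
        return False


def corrupt_files(paths, checker=is_image_ok):
    """Return the entries of paths that the checker rejects, in order."""
    return [p for p in paths if not checker(p)]


def find_corrupt_images(pdf_path, checker=is_image_ok):
    """Extract all images from pdf_path and return the names that fail."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp) / "img"  # pdfimages writes img-000.ppm, img-001.jpg, ...
        subprocess.run(pdfimages_command(pdf_path, root), check=True)
        extracted = sorted(Path(tmp).iterdir())
        return [p.name for p in corrupt_files(extracted, checker)]
```

Calling find_corrupt_images("big.pdf") (a placeholder file name) would
return the names of the extracted files PIL could not verify. For a
stricter check, the checker could reopen each file and call load() to
force a full decode, at the cost of more time on large PDFs.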

Doug


