Extracting images from a PDF file

Fri Dec 28 08:24:31 EST 2007

On Dec 27, 10:13 am, writeson <doug.farr... at gmail.com> wrote:
> On Dec 27, 1:12 am, Carl K <c... at personnelware.com> wrote:
>
> > Doug Farrell wrote:
> > > Hi all,
>
> > > Does anyone know how to extract images from a PDF file? What I'm looking
> > > to do is use pdflib_py to open large PDF files on our Linux servers,
> > > then use PIL to verify image data. I want to do this in order
> > > to find corrupt images in the PDF files. If anyone could help
> > > me out, or point me in the right direction, it would be most
> > > appreciated!
>
> > If you are ok shelling out to a binary:
>
> > pdfimages - Portable Document Format (PDF) image extractor (version 3.00)
> > http://packages.ubuntu.com/gutsy/text/xpdf-utils
>
> > I am trying to convert the PDF to a PNG, but without having to run external
> > commands, so I will understand if you aren't happy with pdfimages.
>
> > Carl K
>
> Carl,
>
> Thanks for the feedback, and I don't mind shelling out to an external
> command if it gets the job done. Thanks for the link to xpdf-utils,
> I'm going to look into it this morning.
>
> Doug

Hi,

Our Linux servers run CentOS (4.x), I believe, and the repositories for
this version don't have xpdf-utils available. I'm going to look into
editing the yum repository configuration (the files under
/etc/yum.repos.d/, rather than apt's sources.list) so that yum can
install the package and its dependencies for me, as xpdf-utils looks
very useful!
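
Once pdfimages is installed, the approach discussed in this thread could
look roughly like the sketch below. It is only an illustration under some
assumptions: pdfimages is on the PATH, Pillow (the maintained PIL fork,
imported as "from PIL import Image") is available, and the function names
and the "img" output prefix are made up for this example. Note that
verify() only checks the file structure without decoding the pixel data,
so it may miss some kinds of damage.

```python
import subprocess
import tempfile
from pathlib import Path


def pdfimages_command(pdf_path, image_root):
    """Build the pdfimages command line (hypothetical helper).

    -j asks pdfimages to write DCT (JPEG) images as .jpg files;
    other images are written as PPM/PBM files.
    """
    return ["pdfimages", "-j", str(pdf_path), str(image_root)]


def extract_images(pdf_path, out_dir):
    """Run pdfimages and return the extracted files, sorted by name."""
    out_dir = Path(out_dir)
    # check=True raises CalledProcessError if pdfimages exits with an error.
    subprocess.run(pdfimages_command(pdf_path, out_dir / "img"), check=True)
    # pdfimages names its output <image-root>-NNN.<ext>.
    return sorted(out_dir.glob("img-*"))


def is_corrupt(image_path):
    """Return True if Pillow cannot open or verify the image file."""
    # Assumption: Pillow is installed. Imported here so the helpers
    # above remain usable without it.
    from PIL import Image

    try:
        with Image.open(image_path) as img:
            img.verify()  # structural check only, no full decode
        return False
    except Exception:
        return True


def corrupt_images(pdf_path):
    """Return the names of extracted images that fail verification."""
    with tempfile.TemporaryDirectory() as tmp:
        return [p.name for p in extract_images(pdf_path, tmp) if is_corrupt(p)]
```

Calling corrupt_images("report.pdf") (a placeholder file name) would then
return the names of any extracted images that Pillow rejects. If the PDF
itself is badly damaged, pdfimages may fail before any image is written,
which shows up here as a subprocess.CalledProcessError.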

Doug