Errors with PyPdf

flebber flebber.crue at gmail.com
Mon Sep 27 10:19:34 EDT 2010


On Sep 27, 2:46 pm, Dave Angel <da... at ieee.org> wrote:
> On 2:59 PM, flebber wrote:
>
> > <snip>
> > Traceback (most recent call last):
> >    File "C:/Python26/Pdfread", line 16, in <module>
> >      open('x.txt', 'w').write(content)
> > NameError: name 'content' is not defined
> > When I use:
>
> > import pyPdf
>
> > def getPDFContent(path):
> >      content = "C:\Components-of-Dot-NET.txt"
> >      # Load PDF into pyPDF
> >      pdf = pyPdf.PdfFileReader(file(path, "rb"))
> >      # Iterate pages
> >      for i in range(0, pdf.getNumPages()):
> >          # Extract text from page and add to content
> >          content += pdf.getPage(i).extractText() + "\n"
> >      # Collapse whitespace
> >      content = " ".join(content.replace(u"\xa0", " ").strip().split())
> >      return content
>
> > print getPDFContent(r"C:\Components-of-Dot-NET.pdf").encode("ascii",
> > "ignore")
> > open('x.txt', 'w').write(content)
>
> There's no global variable named content; that name was local to the
> function, so it's lost when the function exits.  It does return the
> value, but you give it to print, and don't save it anywhere.
>
> data = getPDFContent(r"C:\Components-of-Dot-NET.pdf").encode("ascii",
> "ignore")
>
> outfile = open('x.txt', 'w')
> outfile.write(data)
>
> outfile.close()
>
> I used a different name to emphasize that this is *not* the same
> variable as content inside the function.  In this case, it happens to
> have the same value.  And if you used the same name, you could be
> confused about which is which.
>
> DaveA

Thank you, everyone.
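
The scoping point above can be reproduced without pyPdf at all. The following is only a minimal sketch: get_content() and its placeholder string are hypothetical stand-ins for getPDFContent() and the extracted text, and the output path in the temp directory is an assumption made so the example does not write into the current directory. It shows that the name bound inside the function is not visible at module level, and that the returned value has to be saved under some name before it can be written out.

```python
import os
import tempfile


def get_content():
    # 'content' is local to this function; the name is gone once it returns.
    # The string is a hypothetical stand-in for the text pyPdf would extract.
    content = "text extracted from a PDF"
    return content


# Save the return value under a module-level name of its own.
data = get_content()

# 'content' was never bound at module level, so looking it up raises
# NameError, which is the error shown in the traceback.
try:
    content
except NameError:
    content_is_global = False
else:
    content_is_global = True

# Write the saved value. The with block closes the file on exit,
# so no explicit close() call is needed.
out_path = os.path.join(tempfile.gettempdir(), "x.txt")
with open(out_path, "w") as outfile:
    outfile.write(data)
```

Using the with statement (available from Python 2.6 without a __future__ import) is one way to make sure the file is closed; calling outfile.close() explicitly, as in the reply, works as well.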


