io module and pdf question

rusi rustompmody at gmail.com
Tue Jun 25 02:33:57 EDT 2013


On Tuesday, June 25, 2013 9:48:44 AM UTC+5:30, jyou... at kc.rr.com wrote:
> 1. Is there another way to get metadata out of a pdf without having to 
> install another module?
> 2. Is it safe to assume pdf files should always be encoded as latin-1 (when 
> trying to read it this way)?  Is there a chance they could be something else?

If your data is binary, open the file in binary mode (mode="rb") rather than choosing a bogus encoding. You then have to make the strings you search for binary as well (b prefix).
Also, I am surprised that it works at all. I thought most PDFs were compressed?
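
A rough sketch of what I mean, assuming Python 3. The helper names and the list of keys are made up for illustration, and this only finds metadata stored as plain, uncompressed literal strings. It will miss anything inside a compressed stream, hex strings, or values containing escaped parentheses.

```python
import re

# Hypothetical helper names; only literal, uncompressed (..) strings are found.
INFO_PATTERN = re.compile(rb"/(Title|Author|Creator|Producer)\s*\(([^)]*)\)")


def read_pdf_bytes(path):
    """Read the whole file as bytes; no text encoding is involved."""
    with open(path, "rb") as f:
        return f.read()


def find_pdf_info_strings(data):
    """Return {key: raw value bytes} for each /Key (value) pair found."""
    found = {}
    for match in INFO_PATTERN.finditer(data):
        key = match.group(1).decode("ascii")
        found[key] = match.group(2)  # left as bytes; the encoding is not known
    return found
```

Usage would be something like find_pdf_info_strings(read_pdf_bytes("some.pdf")). Note the rb prefix on the pattern: a str pattern cannot be used to search bytes.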

> 3. Is the io module a good way to pursue this?

The docs say:
> The io module provides the Python interfaces to stream handling. Under Python 
> 2.x, this is proposed as an alternative to the built-in file object, but in 
> Python 3.x it is the default interface to access files and streams.

So I guess there is no point importing io explicitly in Python 3: the built-in open() already is io.open().
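
A quick check of that, assuming CPython 3 (the variable names are just for illustration). Where io is still handy is for in-memory streams such as io.BytesIO, which behave like a file opened in binary mode.

```python
import io

# On Python 3 the built-in open and io.open are the same function object.
same_function = open is io.open

# io.BytesIO gives a file-like object over bytes held in memory,
# useful for trying out binary reads without touching the disk.
buffer = io.BytesIO(b"%PDF-1.4\nrest of file")
header = buffer.read(5)

print(same_function, header)
```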



