reading PDF using Python [Q]

Mark Nottingham mnot at pobox.com
Wed May 12 18:38:02 EDT 1999


> > the format seem to use rather different styles of data structures.
>
> In fact, I don't think it's unreadable at all! I've seen much
> more boring standards specifications already, like those of W3C.
> The PDF specification explains quite nicely the general architec-
> ture of a PDF document, the file format, etc. Give it a try!

I agree (although I like the more established W3C stuff as well ;-)

I wrote a PDF parser in Perl for a contract a while back; manipulating the
basic structure of the PDF is remarkably easy, and I plan to play with it in
Python some day. I never got down to the level of working with the streams
themselves, which is what it sounds like is needed here.
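
As a rough illustration of how approachable the outer structure is, here is a
minimal Python sketch (not the Perl parser mentioned above). It only reads the
version from the %PDF- header and the byte offset that follows the final
startxref keyword, and it does not touch streams at all. The names
pdf_outline and make_sample are hypothetical, and the 1024-byte tail window is
an assumption rather than anything the format requires.

```python
import re


def pdf_outline(data: bytes) -> dict:
    """Return the header version and the startxref offset of a PDF byte string."""
    header = re.match(rb"%PDF-(\d+\.\d+)", data)
    if header is None:
        raise ValueError("not a PDF: missing %PDF- header")
    # The cross-reference offset sits near the end of the file, between the
    # 'startxref' keyword and the closing '%%EOF' marker.
    tail = data[-1024:]  # assumed window size, chosen for illustration
    found = re.findall(rb"startxref\s+(\d+)\s+%%EOF", tail)
    if not found:
        raise ValueError("no startxref marker near end of file")
    return {
        "version": header.group(1).decode("ascii"),
        "xref_offset": int(found[-1]),
    }


def make_sample() -> bytes:
    """Build a tiny, hand-made PDF skeleton for trying the function out."""
    body = b"%PDF-1.3\n1 0 obj\n<< /Type /Catalog >>\nendobj\n"
    xref_offset = len(body)  # the xref table starts right after the body
    xref = b"xref\n0 2\n0000000000 65535 f \n0000000009 00000 n \n"
    trailer = b"trailer\n<< /Size 2 /Root 1 0 R >>\n"
    startxref = b"startxref\n" + str(xref_offset).encode("ascii") + b"\n%%EOF\n"
    return body + xref + trailer + startxref


if __name__ == "__main__":
    sample = make_sample()
    info = pdf_outline(sample)
    print(info["version"])
    # The reported offset should point at the 'xref' keyword itself.
    print(sample[info["xref_offset"]:info["xref_offset"] + 4])
```

From the offset it returns, a fuller parser would go on to read the
cross-reference table and the trailer dictionary to locate individual objects.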

Good luck,




