pdf library.

Shriphani shriphanip at gmail.com
Tue Jan 1 07:21:29 EST 2008


On Jan 1, 4:28 pm, Piet van Oostrum <p... at cs.uu.nl> wrote:
> >>>>> Shriphani <shripha... at gmail.com> (S) wrote:
> >S> I tried pyPdf for this and decided to get the pagelinks. The trouble
> >S> is that I don't know how to determine whether a particular page is the
> >S> first page of a chapter. Can someone tell me how to do this?
>
> AFAIK PDF doesn't have the concept of "Chapter". If the document has an
> outline, you could try to use the first level of that hierarchy as the
> chapter starting points. But you don't have a guarantee that they really
> are chapters.
> --
> Piet van Oostrum <p... at cs.uu.nl>
> URL: http://pietvanoostrum.com [PGP 8DAE142BE17999C4]
> Private email: p... at vanoostrum.org
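
For what it's worth, the outline idea above could be sketched roughly as
follows. This is only a sketch under assumptions: it uses the current pypdf
package (not the original pyPdf), where the outline is a list of entries
with each sub-level stored as a nested list. The helper names
top_level_entries and chapter_start_pages are made up for illustration,
and, as Piet says, there is no guarantee the first-level entries really
are chapters.

```python
def top_level_entries(outline):
    """Keep only first-level outline entries.

    Assumes the outline is a list in which each nested list holds the
    children of the entry that precedes it, so dropping the nested lists
    leaves just the first level of the hierarchy.
    """
    return [item for item in outline if not isinstance(item, list)]


def chapter_start_pages(pdf_path):
    """Return (title, page_index) pairs for first-level outline entries.

    Assumption: the third-party pypdf package is installed. It is imported
    here, inside the function, so the helper above works without it.
    """
    from pypdf import PdfReader

    reader = PdfReader(pdf_path)
    starts = []
    for entry in top_level_entries(reader.outline):
        # Page index is 0-based; these are candidate chapter starts only.
        starts.append((entry.title, reader.get_destination_page_number(entry)))
    return starts


# Plain-data illustration of the filtering step (no PDF needed):
sample = ["Chapter 1", ["1.1", "1.2"], "Chapter 2", ["2.1", ["2.1.1"]]]
print(top_level_entries(sample))  # ['Chapter 1', 'Chapter 2']
```

If the document has no outline at all, the list comes back empty and this
approach gives you nothing to work with.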

How would a PDF-to-HTML conversion work? I've seen Google's search
engine do it plenty of times. It's just that running a 500-odd-page
ebook through one of those scripts might not be such a good idea.


