PDF parser

Christian Tismer tismer at stackless.com
Fri Jul 30 05:55:10 EDT 2004


Andreas Lobinger wrote:
> Aloha,
> 
> Radovan Garabik wrote:
> 
>> Christian Tismer <tismer at stackless.com> wrote:
>>
>>> need the bytecodehacks. I am writing a sophisticated package
>>> which involves parsing of PDF files, and I want to do it all in
>>> Python. In order to get this PDF processor to almost C speed,
> 
> 
> When you need hacks to get reasonable speed for full parsing PDF,
> your algorithms are not very efficiently designed...

I'm not sure what you are talking about.
By speed, I'm thinking of reaching almost the
speed of a pure C implementation.
For that reason, I use Psyco, and the primitive
routines are implemented in a way that Psyco
optimizes best.
But from a design standpoint, the code looks nicer if things like
data streams and token sequences are implemented using
generators.
Generators are not supported by Psyco.
With Bytecodehacks, I can make Psyco support generators
by applying code transformations which turn the yield
statement into something different but semantically
identical.
The goal is to have things both fast and nicely readable.
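
To illustrate what "different but semantically identical" can mean, here is
a minimal hand-written sketch. It is not the actual bytecode transformation
and does not use Bytecodehacks or Psyco; the names tokens_gen and TokensIter
are hypothetical, and the whitespace tokenizer is only a stand-in for a real
PDF tokenizer. It shows a generator next to a plain iterator class that keeps
the same state explicitly and produces the same sequence without yield.

```python
def tokens_gen(data):
    """Generator version: yields whitespace-separated tokens one by one."""
    for tok in data.split():
        yield tok


class TokensIter:
    """Yield-free version: the loop state lives in instance attributes."""

    def __init__(self, data):
        self._parts = data.split()
        self._pos = 0

    def __iter__(self):
        return self

    def __next__(self):
        # Where the generator would fall off the end of its loop,
        # the explicit iterator raises StopIteration itself.
        if self._pos >= len(self._parts):
            raise StopIteration
        tok = self._parts[self._pos]
        self._pos += 1
        return tok


sample = "1 0 obj << /Type /Catalog >> endobj"
same = list(tokens_gen(sample)) == list(TokensIter(sample))
```

Both versions can be used in a for loop interchangeably; only the second one
avoids the yield statement, which is the kind of form a tool that cannot
handle generators could still work with.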

-- 
Christian Tismer             :^)   <mailto:tismer at stackless.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Johannes-Niemeyer-Weg 9a     :    *Starship* http://starship.python.net/
14109 Berlin                 :     PGP key -> http://wwwkeys.pgp.net/
work +49 30 89 09 53 34  home +49 30 802 86 56  mobile +49 173 24 18 776
PGP 0x57F3BF04       9064 F4E1 D754 C2FF 1619  305B C09C 5A3B 57F3 BF04
      whom do you want to sponsor today?   http://www.stackless.com/



