Pure Python JPEG parser

David Fraser davidf at sjsoft.com
Tue Nov 23 13:52:29 EST 2004


David Fraser wrote:
> Fredrik Lundh wrote:
> 
>> David Fraser wrote:
>>
>>> It *did* make me think "I wish there was some Pure Python 
>>> image-handling code". It seems like the C linkage is mainly required 
>>> for image format handling - I couldn't find any JPEG 
>>> reading/writing code in Pure Python ... would be nice :-)
>>
>>
>> and incredibly slow.  PIL uses C for a reason.
>>
> 
> I've recently discovered you can use the EXIF module to read thumbnails 
> that are embedded in a JPEG or TIFF file without having to parse all the 
> JPEG stuff. All I'm doing for my particular task is creating thumbnails 
> - I can imagine that this may be reasonably fast within Python.
> Even if it was slow, it wouldn't necessarily have to be *incredibly* 
> slow :-)

I couldn't resist it... I found a simple (and imperfect) C++ JPEG parser 
(http://www.codeproject.com/bitmap/TonyJpegLib.asp) and converted it by 
hand to pure Python.
It can decode most JPEG files and output the result as a BMP.
Not surprisingly, it's fairly slow, but running it under Psyco speeds it 
up considerably.
Here are the results of a brief test, showing execution time for two 
images under standard Python and under Psyco, running on my Athlon:

Image (size)        |  Standard Python  |  Psyco
photo, 2048 x 1536  |  32 minutes       |  46 seconds
cartoon, 604 x 446  |  16 seconds       |  4 seconds

I would guess that a major factor is that the algorithms were designed 
for C, and that some simple recoding would speed it up a fair bit.
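
As a rough illustration of the kind of recoding meant here (a sketch only, 
not taken from the pyjpeg code): the YCbCr to RGB colour conversion that a 
JPEG decoder runs for every pixel can use tables built once at import time 
instead of repeating the multiplications. The names below (clamp, 
ycbcr_to_rgb and the table names) are made up for this example, the 
constants are the usual JFIF ones, and whether this is actually a hot spot 
in the converted code is an assumption that would need profiling.

```python
# Hypothetical sketch: table-driven YCbCr -> RGB conversion (JFIF constants).
# Each table is indexed by an 8-bit chroma value (0..255) and built once.
CR_TO_R = [int(round(1.402 * (c - 128))) for c in range(256)]
CB_TO_B = [int(round(1.772 * (c - 128))) for c in range(256)]
CB_TO_G = [-0.344136 * (c - 128) for c in range(256)]  # kept as floats
CR_TO_G = [-0.714136 * (c - 128) for c in range(256)]  # kept as floats


def clamp(value):
    """Limit a value to the 0..255 range of an 8-bit colour channel."""
    return 0 if value < 0 else 255 if value > 255 else value


def ycbcr_to_rgb(y, cb, cr):
    """Convert one 8-bit YCbCr pixel to an (r, g, b) tuple via the tables."""
    r = clamp(y + CR_TO_R[cr])
    g = clamp(y + int(round(CB_TO_G[cb] + CR_TO_G[cr])))
    b = clamp(y + CB_TO_B[cb])
    return r, g, b
```

With neutral chroma (cb = cr = 128) the pixel comes out grey, so 
ycbcr_to_rgb(128, 128, 128) gives (128, 128, 128).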

I've put the code at http://davidf.sjsoft.com/files/pyjpeg/
I also wrote a basic BMP format handler, which the test handler 
requires, so that's there too.
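
For anyone who hasn't looked at BMP before, here is a minimal sketch of 
what writing an uncompressed 24-bit BMP involves. This is not the handler 
from the URL above; the function name write_bmp and its arguments are 
invented for this example, and it only covers the simplest case 
(bottom-up rows, no palette, no compression).

```python
import struct


def write_bmp(width, height, pixels):
    """Return the bytes of an uncompressed 24-bit BMP file.

    pixels is a list of rows, top to bottom; each row is a list of
    (r, g, b) tuples with values in 0..255.
    """
    row_size = (width * 3 + 3) & ~3          # rows are padded to 4 bytes
    padding = b"\x00" * (row_size - width * 3)
    image_size = row_size * height
    data_offset = 14 + 40                    # file header + info header

    # 14-byte BITMAPFILEHEADER: signature, file size, reserved, data offset
    file_header = struct.pack("<2sIHHI", b"BM",
                              data_offset + image_size, 0, 0, data_offset)
    # 40-byte BITMAPINFOHEADER: 1 plane, 24 bits per pixel, no compression,
    # 2835 pixels per metre (about 72 dpi), no palette
    info_header = struct.pack("<IiiHHIIiiII", 40, width, height, 1, 24,
                              0, image_size, 2835, 2835, 0, 0)

    body = bytearray()
    for row in reversed(pixels):             # BMP stores the bottom row first
        for r, g, b in row:
            body += bytes((b, g, r))         # channels are stored as B, G, R
        body += padding
    return file_header + info_header + bytes(body)
```

The result can be written straight to disk, e.g. 
open("out.bmp", "wb").write(write_bmp(w, h, rows)).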

Note 1: I am only beginning to understand JPEG from converting the code 
:-) Also, the original C++ code doesn't convert the image perfectly: it 
produces plenty of smudges, which my Python code faithfully reproduces.

Note 2: The code is horribly ugly for Python code.

Anyone is welcome to clean it up and speed it up...

David