[SciPy-user] The IO library and image file formats -- compare with PIL

Zachary Pincus zachary.pincus at yale.edu
Fri Apr 18 10:31:34 EDT 2008


> Ultimately we have to consider a fork of PIL.
> Do you guys know if this is allowed -- per the PIL license!?
>
> Of course this would be super suboptimal,
> but then, it's effectively what I have right now -- and you have your
> own version .....
>
> To summarize: numpy can probably already do many of the things PIL does,
> and do them better -- I'm talking about all the transformation stuff of course.
> So only the file IO would have to get forked out -- to scipy for
> example ;-)

I have my own "internal fork" of PIL that I've been calling
"PIL-lite". I tore out everything except the file IO, and I fixed that to
handle 16-bit files correctly on both big- and little-endian machines,
and to have a more robust array interface.

IIRC, PIL is BSD-licensed (or BSD-compatible), so the fork should be  
OK to re-distribute.

Now, part of the reason that we may have heard nothing about the various
PIL patches we've submitted is that, as I understand it, they're
doing a big rewrite of PIL, and in particular of its memory handling,
that should address these sorts of issues. However, we all know how
well "big rewrites" go...

If people wanted to make a proper "fork" of PIL into a numpy-compatible
image IO layer, I would be all for that. I'd be happy to
donate "PIL-lite" as a starting point. Now, the file IO in PIL is a  
bit circuitous -- files are initially read by pure-Python code that  
determines the file type, etc. This information is then passed to  
(brittle and ugly) C code to unpack and swizzle the bits as necessary,  
and pack them into the PIL structs in memory.
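
The first, pure-Python stage described above (work out the file type
from the header before any pixel data is touched) can be sketched
roughly as follows. This is only an illustration, not PIL's actual
code: the MAGIC table and the sniff_format name are made up for the
example, and real readers check far more than the first few bytes.

```python
# Hypothetical table of well-known magic-byte prefixes.
# This is NOT PIL's plugin registry, just an illustration.
MAGIC = {
    b"\x89PNG\r\n\x1a\n": "PNG",
    b"II*\x00": "TIFF (little-endian)",   # "II" followed by 42 as a LE 16-bit int
    b"MM\x00*": "TIFF (big-endian)",      # "MM" followed by 42 as a BE 16-bit int
    b"BM": "BMP",
}


def sniff_format(header):
    """Return a format name for the given leading bytes, or None if unknown."""
    for magic, name in MAGIC.items():
        if header.startswith(magic):
            return name
    return None


# Usage: pass the first few bytes of a file.
print(sniff_format(b"II*\x00\x08\x00\x00\x00"))
```

Note that for TIFF the header already says which byte order the pixel
data uses, which is exactly the information a later unpacking step
needs to handle 16-bit files correctly.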

I think that basically all of what PIL does, bit-twiddling-wise, could  
be done with numpy. So really, what's needed is to take the pure-Python
"file format reading" functionality from PIL (with my
modifications thereof to handle 16-bit files better, and Stéfan and  
Sebastian's modifications for other functionality, etc), and then  
attach it to a layer that uses Python and numpy to actually read the  
bits out of the files and directly into numpy arrays.
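
As a rough sketch of what such a numpy layer might look like for the
simplest case (uncompressed grayscale data), the byte order can be put
into the dtype and numpy left to do the swizzling. The unpack_pixels
helper and its parameters are hypothetical, invented for this example;
they are not part of PIL or "PIL-lite", and compressed or multi-channel
data would need more work.

```python
import numpy as np


def unpack_pixels(raw, width, height, bits=16, big_endian=False):
    """Hypothetical helper: view raw uncompressed grayscale bytes as a 2-D array.

    `raw` is the pixel payload as bytes; the width, height, bit depth and
    byte order are assumed to have been read from the file header already.
    """
    if bits == 8:
        dtype = np.dtype("u1")
    elif bits == 16:
        # Encode the file's byte order in the dtype so numpy does the swap.
        dtype = np.dtype(">u2" if big_endian else "<u2")
    else:
        raise ValueError("only 8- and 16-bit data handled in this sketch")

    arr = np.frombuffer(raw, dtype=dtype, count=width * height)
    # Copy into the machine's native byte order so the result is writable
    # and behaves the same on big- and little-endian hosts.
    return arr.reshape(height, width).astype(dtype.newbyteorder("="))


# Usage: the same four bytes decoded under each byte order.
raw = bytes([0x01, 0x02, 0x03, 0x04])
print(unpack_pixels(raw, width=2, height=1, big_endian=True))
print(unpack_pixels(raw, width=2, height=1, big_endian=False))
```

Because the byte order lives in the dtype rather than in the host
machine, the same values come out on any platform.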

I've been meaning to do this for a while, but just haven't gotten  
around to it. I think surprisingly little code will be
needed around PIL's Python file format readers.

Zach




