[SciPy-user] The IO library and image file formats -- compare with PIL
Zachary Pincus
zachary.pincus at yale.edu
Mon Apr 21 11:31:25 EDT 2008
> On 21/04/2008, Zachary Pincus <zachary.pincus at yale.edu> wrote:
>> To answer Stéfan's earlier question of how I see things fitting
>> together, I *think* that the pure-python file format interpretation
>> code could be used (either by importing from PIL or using patched
>> copies as needed) to figure out what a given image file type is, and
>> where in the file the pixels are stored. Then the relevant region of
>> the file would be passed through python.zipfile/deflate/etc. if needed
>> to decompress the pixels, and sent to numpy for unpacking the bits
>> from the string.
>
> So this is the bit that I don't understand. Those pixel values are
> encoded, so which component do you use to take the data chunk and
> convert it to actual pixel values?
numpy.fromstring takes a byte sequence and unpacks it into a flat array
of a specified data type, which can then be reshaped to the image
dimensions. Most image file formats are just
different ways of putting byte sequences on disk and specifying how
they were compressed, if at all. Most formats have either no
compression, or LZW/Deflate/zlib-style compression, for which there
are already python libraries.
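
As a rough sketch of that decompress-then-unpack step (not a real file
reader): the 2x3 image, the 16-bit big-endian pixel type, and the variable
names below are all made up for illustration. It uses numpy.frombuffer
rather than numpy.fromstring, since current NumPy releases deprecate
fromstring for binary data; the idea is the same.

```python
import zlib
import numpy as np

# Hypothetical 2x3 image of 16-bit big-endian pixels (values are made up).
height, width = 2, 3
pixels = np.arange(height * width, dtype=">u2").reshape(height, width)

# Stand-in for the compressed pixel region found inside an image file.
stored = zlib.compress(pixels.tobytes())

raw = zlib.decompress(stored)            # undo the Deflate/zlib compression
flat = np.frombuffer(raw, dtype=">u2")   # raw bytes -> flat array of pixels
image = flat.reshape(height, width)      # apply the size read from the header
```

The array returned by frombuffer is a read-only view of the bytes; calling
.copy() on it gives a writable array if one is needed.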
So for example, reading a TIFF file would consist of looking at the
header to determine the pixel format, image size, and compression,
then rooting around in the file to assemble the relevant bytes, then
running that through deflate (most often), and passing the resulting
string to numpy.fromstring. Same for PNG, or most anything that's not
JPEG. Writing is similar.
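
To make those steps concrete, here is a minimal sketch for a deliberately
restricted PNG subset: 8-bit grayscale, non-interlaced, with every scanline
using filter type 0. The function names (write_gray_png, read_gray_png) are
hypothetical, CRCs are written but not verified on reading, and real PNG
files usually use other filter types and colour types that this does not
handle, so a usable reader would need an extra un-filtering step.

```python
import struct
import zlib
import numpy as np

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

def _chunk(kind, payload):
    # A PNG chunk is: payload length, 4-byte type, payload, CRC of type + payload.
    crc = zlib.crc32(kind + payload) & 0xFFFFFFFF
    return struct.pack(">I", len(payload)) + kind + payload + struct.pack(">I", crc)

def write_gray_png(image):
    """Encode a 2-D uint8 array as an 8-bit grayscale PNG (filter type 0 only)."""
    image = np.ascontiguousarray(image, dtype=np.uint8)
    height, width = image.shape
    # IHDR: width, height, bit depth 8, colour type 0 (gray),
    # compression 0 (deflate), filter method 0, no interlacing.
    ihdr = struct.pack(">IIBBBBB", width, height, 8, 0, 0, 0, 0)
    # Each scanline is prefixed with a filter-type byte; 0 means "no filter".
    rows = np.zeros((height, width + 1), dtype=np.uint8)
    rows[:, 1:] = image
    idat = zlib.compress(rows.tobytes())
    return (PNG_SIGNATURE + _chunk(b"IHDR", ihdr)
            + _chunk(b"IDAT", idat) + _chunk(b"IEND", b""))

def read_gray_png(data):
    """Decode the restricted PNG subset produced by write_gray_png."""
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    pos, header, compressed = 8, None, []
    # Walk the chunk list: look at the header, collect the pixel bytes.
    while pos < len(data):
        length, kind = struct.unpack(">I4s", data[pos:pos + 8])
        payload = data[pos + 8:pos + 8 + length]
        pos += 12 + length  # 8 bytes of length/type + payload + 4-byte CRC
        if kind == b"IHDR":
            header = struct.unpack(">IIBBBBB", payload)
        elif kind == b"IDAT":
            compressed.append(payload)
        elif kind == b"IEND":
            break
    if header is None:
        raise ValueError("missing IHDR chunk")
    width, height, depth, colour, _, _, interlace = header
    if (depth, colour, interlace) != (8, 0, 0):
        raise NotImplementedError("only 8-bit non-interlaced grayscale")
    raw = zlib.decompress(b"".join(compressed))
    rows = np.frombuffer(raw, dtype=np.uint8).reshape(height, width + 1)
    if rows[:, 0].any():
        raise NotImplementedError("only filter type 0 is handled")
    return rows[:, 1:].copy()  # drop the filter bytes, return a writable array
```

Usage would look like read_gray_png(write_gray_png(arr)) for a 2-D uint8
array arr, which should give back an equal array. Only struct, zlib, and
numpy are involved, which is the "no extra dependencies" point above.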
Again, what I'm imagining wouldn't be a full-featured image IO
library, but something lightweight with no dependencies outside of
numpy, and potentially (if JPEG decoding isn't desired), no
C-extensions. (One could conceivably use numpy to do JPEG encoding and
decoding, but I've no interest in doing that...)
This is all just an idea, and I'm not convinced it's a great one.
But I just wanted to put the suggestion out there...
Zach