[Image-SIG] BUG: PNG files with zTXt chunk can't be read incrementally

Gareth Rees gdr at garethrees.org
Sat Mar 31 18:22:59 CEST 2012


SUMMARY

A PNG file containing a compressed text (zTXt) chunk prior to the first IDAT chunk cannot reliably be read by feeding it in chunks to an ImageFile.Parser object.


STEPS TO REPRODUCE

Create a PNG image with a zTXt chunk prior to the first IDAT chunk:

    from PngImagePlugin import _MAGIC, putchunk
    import struct, zlib

    with open('bug2.png', 'wb') as f:
        f.write(_MAGIC)
        putchunk(f, 'IHDR', struct.pack('>IIBBBBB', 1, 1, 1, 0, 0, 0, 0))
        putchunk(f, 'zTXt', 'key\0\0' + zlib.compress('value'))
        putchunk(f, 'IDAT', zlib.compress(struct.pack('>BB', 0, 0)))
        putchunk(f, 'IEND', '')

The image (and its zTXt chunk) loads correctly via Image.open:

    >>> import Image, ImageFile
    >>> Image.open('bug2.png')
    <PngImagePlugin.PngImageFile image mode=1 size=1x1 at 0x10CDA0D88>
    >>> _.info
    {'key': 'value'}

But if you feed a chunk of this image to an ImageFile.Parser then you can get a ValueError:

    >>> ImageFile.Parser().feed(open('bug2.png', 'rb').read(41))
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "ImageFile.py", line 402, in feed
        im = Image.open(fp)
      File "Image.py", line 1965, in open
        return factory(fp, filename)
      File "ImageFile.py", line 91, in __init__
        self._open()
      File "PngImagePlugin.py", line 331, in _open
        s = self.png.call(cid, pos, len)
      File "PngImagePlugin.py", line 115, in call
        return getattr(self, "chunk_" + cid)(pos, len)
      File "PngImagePlugin.py", line 291, in chunk_zTXt
        k, v = string.split(s, "\0", 1)
    ValueError: need more than 1 value to unpack

or a zlib.error:

    >>> ImageFile.Parser().feed(open('bug2.png', 'rb').read(50))
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "ImageFile.py", line 402, in feed
        im = Image.open(fp)
      File "Image.py", line 1965, in open
        return factory(fp, filename)
      File "ImageFile.py", line 91, in __init__
        self._open()
      File "PngImagePlugin.py", line 331, in _open
        s = self.png.call(cid, pos, len)
      File "PngImagePlugin.py", line 115, in call
        return getattr(self, "chunk_" + cid)(pos, len)
      File "PngImagePlugin.py", line 296, in chunk_zTXt
        self.im_info[k] = self.im_text[k] = zlib.decompress(v[1:])
    zlib.error: Error -5 while decompressing data: incomplete or truncated stream
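
The two read lengths used above are not arbitrary. As a rough check, the chunk offsets in the test file can be worked out by hand. This is only a sketch, written in Python 3 syntax, and it assumes that putchunk writes the standard PNG chunk layout (4-byte length, 4-byte type, data, 4-byte CRC). The helper name ztxt_bytes_in_prefix is made up for illustration and is not part of PIL.

```python
import struct
import zlib

PNG_SIGNATURE_LEN = 8          # the 8-byte PNG signature
CHUNK_OVERHEAD = 4 + 4 + 4     # length field + chunk type + CRC (assumed layout)

# Same payloads as in the script that builds the test file.
ihdr_data = struct.pack('>IIBBBBB', 1, 1, 1, 0, 0, 0, 0)   # 13 bytes
ztxt_data = b'key\0\0' + zlib.compress(b'value')

# File offset at which the zTXt chunk starts, and at which its data starts.
ztxt_start = PNG_SIGNATURE_LEN + CHUNK_OVERHEAD + len(ihdr_data)   # 33
ztxt_data_start = ztxt_start + 4 + 4                               # 41

def ztxt_bytes_in_prefix(n):
    """Return the part of the zTXt data present in the first n bytes of the file."""
    return ztxt_data[:max(0, n - ztxt_data_start)]

print(repr(ztxt_bytes_in_prefix(41)))        # no zTXt data at all
print(repr(ztxt_bytes_in_prefix(50)[:5]))    # key, NUL, compression-method byte
print(len(ztxt_bytes_in_prefix(50)) - 5)     # bytes of compressed stream present
```

Under those assumptions, a 41-byte prefix ends exactly where the zTXt data begins, so chunk_zTXt is left with an empty string to split. A 50-byte prefix contains the whole key, the NUL separator, the compression-method byte, and only the first four bytes of the compressed stream.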


HISTORY

This bug was previously reported by Eddie Bishop in June 2011:
<http://mail.python.org/pipermail/image-sig/2011-June/006782.html>


ANALYSIS

The immediate cause of the error is that PngStream.chunk_zTXt in PngImagePlugin.py gets called with an incomplete chunk. The method looks like this:

    def chunk_zTXt(self, pos, len):
        s = ImageFile._safe_read(self.fp, len)
        k, v = string.split(s, "\0", 1)
        comp_method = ord(v[0])
        if comp_method != 0:
            raise SyntaxError("Unknown compression method %s in zTXt chunk" % comp_method)
        import zlib
        self.im_info[k] = self.im_text[k] = zlib.decompress(v[1:])
        return s

When the data is fed to the parser incrementally, _safe_read may return a short chunk (fewer than len bytes). In that case either the call to string.split fails (because the data ends before the NUL that terminates the key), or else the call to zlib.decompress fails (because the compressed stream is truncated).
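
These failure modes can be reproduced without PIL. The following is a minimal sketch in Python 3 syntax: parse_ztxt mirrors the body of chunk_zTXt but takes the chunk data directly, and both parse_ztxt and failure_for are hypothetical names introduced here for illustration.

```python
import zlib

# The zTXt payload from the test file: key, NUL, compression method, stream.
chunk = b'key\0\0' + zlib.compress(b'value')

def parse_ztxt(s):
    """Parse zTXt chunk data the same way chunk_zTXt does (illustrative copy)."""
    k, v = s.split(b'\0', 1)           # ValueError if there is no NUL yet
    comp_method = v[0]                 # IndexError if nothing follows the NUL
    if comp_method != 0:
        raise SyntaxError("Unknown compression method %s in zTXt chunk" % comp_method)
    return k, zlib.decompress(v[1:])   # zlib.error if the stream is truncated

def failure_for(prefix_len):
    """Return the exception class raised when only a prefix is available, or None."""
    try:
        parse_ztxt(chunk[:prefix_len])
    except Exception as e:
        return type(e)
    return None

for n in (0, 2, 4, 9, len(chunk)):
    print(n, failure_for(n))
```

In this sketch, a prefix that ends inside the key gives a ValueError and a prefix that ends inside the compressed stream gives a zlib.error, matching the two tracebacks. A prefix that ends immediately after the key's NUL gives an IndexError from the compression-method lookup, which is a further variant of the same problem.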


WORKAROUND

I think the simplest workaround is for chunk_zTXt to abort if the read is short. The IOError will be caught by ImageFile.Parser.feed.

    def chunk_zTXt(self, pos, length):
        s = ImageFile._safe_read(self.fp, length)
        if len(s) != length:
            raise IOError
        # rest of function unchanged

(Note that I've had to change the argument name "len" to "length" to avoid shadowing the built-in.)


RELATED BUGS

I have not attempted to make test cases for any of the other chunk types, but from reading the code it looks as if tRNS, gAMA, and pHYs chunks could also fail in similar ways if there is a short read. (tEXt chunks appear to be safe, but I think that's by luck rather than good planning.) These related bugs could all be worked around in the same way (that is, by raising an IOError in the event of a short read). If you need me to make test cases, let me know.
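
If the same workaround were applied to several chunk handlers, the length check could be factored into one place. Below is a rough sketch of that idea in Python 3 syntax. The name read_exactly is hypothetical and is not part of PIL, and a plain fp.read stands in for ImageFile._safe_read so that the example is self-contained.

```python
import io

def read_exactly(fp, length):
    """Read exactly `length` bytes from fp, or raise IOError on a short read.

    Hypothetical helper: fp.read is used here in place of
    ImageFile._safe_read purely to keep the sketch self-contained.
    """
    data = fp.read(length)
    if len(data) != length:
        raise IOError("short read: wanted %d bytes, got %d" % (length, len(data)))
    return data

# A complete read succeeds.
print(read_exactly(io.BytesIO(b'key\0\0abcdef'), 5))

# A short read raises IOError instead of handing back a partial chunk.
try:
    read_exactly(io.BytesIO(b'ke'), 5)
except IOError as e:
    print("IOError:", e)
```

Each chunk method would then call such a helper instead of checking the length itself, so that a truncated chunk always surfaces as an IOError for ImageFile.Parser.feed to catch.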

-- 
Gareth Rees

