[New-bugs-announce] [issue21560] gzip.write changes trailer ISIZE field before type checking - corrupted gz file after trying to write string

Wolfgang Maier report at bugs.python.org
Fri May 23 13:22:57 CEST 2014


New submission from Wolfgang Maier:

I ran into this:

>>> gzout = gzip.open('test.gz','wb')
>>> gzout.write('abcdefgh') # write expects bytes not str
Traceback (most recent call last):
  File "<pyshell#2>", line 1, in <module>
    gzout.write('abcdefgh')
  File "/usr/lib/python3.4/gzip.py", line 343, in write
    self.crc = zlib.crc32(data, self.crc) & 0xffffffff
TypeError: 'str' does not support the buffer interface

>>> gzout.write(b'abcdefgh') # ok, use bytes instead
8
>>> gzout.close()

But now the file is not recognized as valid gzip format anymore (neither by the gzip module nor by external software):

>>> gzin = gzip.open('test.gz','rb')
>>> next(gzin)
Traceback (most recent call last):
  File "<pyshell#32>", line 1, in <module>
    next(gzin)
  File "/usr/lib/python3.4/gzip.py", line 594, in readline
    c = self.read(readsize)
  File "/usr/lib/python3.4/gzip.py", line 365, in read
    if not self._read(readsize):
  File "/usr/lib/python3.4/gzip.py", line 465, in _read
    self._read_eof()
  File "/usr/lib/python3.4/gzip.py", line 487, in _read_eof
    raise OSError("Incorrect length of data produced")
OSError: Incorrect length of data produced

It turns out that gzip.GzipFile.write() had already increased the ISIZE value by 8 during the failed call with the str object, so the trailer now records 16 instead of 8:
>>> raw = open('test.gz','rb')
>>> [n for n in raw.read()] # ISIZE is the last four bytes (little-endian); its low byte is the fourth-last element
[31, 139, 8, 8, 51, 46, 127, 83, 2, 255, 116, 101, 115, 116, 0, 75, 76, 74, 78, 73, 77, 75, 207, 0, 0, 80, 42, 239, 174, 16, 0, 0, 0]
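
For readers who would rather not count list elements, here is a minimal sketch that decodes the trailer of the byte dump above with struct and compares it against the decompressed payload. The offset 15 is an assumption that holds only for this particular file (10 fixed header bytes plus the NUL-terminated name "test"); the variable names are illustrative.

```python
import struct
import zlib

# Bytes of test.gz, copied from the listing above.
raw = bytes([31, 139, 8, 8, 51, 46, 127, 83, 2, 255, 116, 101, 115, 116, 0,
             75, 76, 74, 78, 73, 77, 75, 207, 0, 0, 80, 42, 239, 174,
             16, 0, 0, 0])

# The gzip trailer is the last 8 bytes: CRC32, then ISIZE,
# each stored as a little-endian unsigned 32-bit integer.
crc, isize = struct.unpack("<II", raw[-8:])

# Assumption for this file only: the header is 10 fixed bytes plus "test\0",
# so the raw deflate stream runs from offset 15 up to the trailer.
payload = zlib.decompress(raw[15:-8], -zlib.MAX_WBITS)

print("ISIZE in trailer:", isize)
print("actual payload length:", len(payload))
print("CRC32 matches payload:", crc == zlib.crc32(payload))
```

The mismatch between the recorded ISIZE and the real payload length is what the reader complains about with "Incorrect length of data produced".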

In other words: gzip.GzipFile.write() leaps (and modifies its internal state) before it checks its input argument.
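
To illustrate the ordering I would expect, here is a rough sketch of a "check first, then commit" write method. CountingWriter is a hypothetical toy class of my own, not gzip.GzipFile and not a proposed patch; it only tracks a size and a running CRC the way a gzip writer has to.

```python
import zlib


class CountingWriter:
    """Hypothetical stand-in that tracks size and CRC like a gzip writer."""

    def __init__(self):
        self.size = 0
        self.crc = zlib.crc32(b"")
        self.chunks = []

    def write(self, data):
        # memoryview() raises TypeError for str, before any state is touched.
        view = memoryview(data)
        length = view.nbytes
        # Compute everything that can fail into local variables first.
        new_crc = zlib.crc32(view, self.crc)
        # Only now commit the changes to the object.
        self.chunks.append(bytes(view))
        self.size += length
        self.crc = new_crc
        return length


w = CountingWriter()
try:
    w.write('abcdefgh')        # str instead of bytes, as in the report
except TypeError:
    pass
size_after_failed_write = w.size
written = w.write(b'abcdefgh')
print(size_after_failed_write, written, w.size)
```

With this ordering a rejected argument leaves the size counter untouched, so a later valid write would produce a consistent trailer.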

----------
components: Library (Lib)
messages: 218961
nosy: wolma
priority: normal
severity: normal
status: open
title: gzip.write changes trailer ISIZE field before type checking - corrupted gz file after trying to write string
type: behavior
versions: Python 3.4

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue21560>
_______________________________________

