[New-bugs-announce] [issue28436] GzipFile doesn't properly handle short reads and writes on the underlying stream
Evgeny Kapun
report at bugs.python.org
Thu Oct 13 16:29:24 EDT 2016
New submission from Evgeny Kapun:
GzipFile's underlying stream can be a raw stream (such as FileIO), and such streams can return short reads and writes at any time (e.g. due to signals). The correct behavior after a short read or write is to retry the call to read or write the remaining data.
GzipFile doesn't do this. This program demonstrates the problem with reading:
import io, gzip

class MyFileIO(io.FileIO):
    def read(self, n):
        # Emulate short read
        return super().read(1)

raw = MyFileIO('test.gz', 'rb')
gzf = gzip.open(raw, 'rb')
gzf.read()
Output:
$ gzip -c /dev/null > test.gz
$ python3 test.py
Traceback (most recent call last):
  File "test.py", line 10, in <module>
    gzf.read()
  File "/usr/lib/python3.5/gzip.py", line 274, in read
    return self._buffer.read(size)
  File "/usr/lib/python3.5/gzip.py", line 461, in read
    if not self._read_gzip_header():
  File "/usr/lib/python3.5/gzip.py", line 409, in _read_gzip_header
    raise OSError('Not a gzipped file (%r)' % magic)
OSError: Not a gzipped file (b'\x1f')
And this shows the problem with writing:
import io, gzip

class MyIO(io.RawIOBase):
    def write(self, data):
        print(data)
        # Emulate short write
        return 1

raw = MyIO()
gzf = gzip.open(raw, 'wb')
gzf.close()
Output:
$ python3 test.py
b'\x1f\x8b'
b'\x08'
b'\x00'
b'\xb9\xea\xffW'
b'\x02'
b'\xff'
b'\x03\x00'
b'\x00\x00\x00\x00'
b'\x00\x00\x00\x00'
The output shows that no attempt is made to write the remaining data: each buffer is passed to write() exactly once, and the return value of the write() method is completely ignored.
I think that either the gzip module should be changed to handle short reads and writes properly, or its documentation should state that it cannot be used with raw streams.
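For illustration, the retry behavior described above could be sketched roughly as follows. This is not a proposed patch to gzip.py: the helper names read_fully and write_fully and the OneByteRaw test stream are hypothetical, and the sketch assumes a blocking stream and bytes-like data.

```python
import io


class OneByteRaw(io.RawIOBase):
    """Hypothetical raw stream that transfers at most one byte per call."""

    def __init__(self, data=b""):
        super().__init__()
        self._src = io.BytesIO(data)
        self.written = bytearray()

    def readable(self):
        return True

    def writable(self):
        return True

    def readinto(self, buf):
        # Emulate short read: fill in at most one byte.
        chunk = self._src.read(1)
        buf[:len(chunk)] = chunk
        return len(chunk)

    def write(self, data):
        # Emulate short write: accept at most one byte.
        accepted = bytes(data)[:1]
        self.written += accepted
        return len(accepted)


def read_fully(raw, n):
    """Read n bytes from raw, retrying after short reads; stop early only at EOF."""
    chunks = []
    remaining = n
    while remaining > 0:
        chunk = raw.read(remaining)
        if not chunk:
            # b'' signals EOF (None would mean a non-blocking stream has no data yet).
            break
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def write_fully(raw, data):
    """Write all of data to raw, retrying with the unwritten tail after short writes."""
    view = memoryview(data)
    while len(view) > 0:
        written = raw.write(view)
        if written is None:
            # Only expected from non-blocking streams, which this sketch does not handle.
            raise BlockingIOError("stream would block")
        view = view[written:]


reader = OneByteRaw(b"\x1f\x8b\x08\x00")
magic = read_fully(reader, 2)      # both magic bytes, despite one-byte reads

writer = OneByteRaw()
write_fully(writer, b"\x1f\x8b\x08\x00")
```

With helpers like these, a single raw.read(2) on the test stream still returns only one byte, while read_fully(raw, 2) returns both, and write_fully delivers every byte even though each write() call accepts only one.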
----------
components: Library (Lib)
messages: 278606
nosy: abacabadabacaba
priority: normal
severity: normal
status: open
title: GzipFile doesn't properly handle short reads and writes on the underlying stream
type: behavior
versions: Python 3.5
_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue28436>
_______________________________________