[issue15955] gzip, bz2, lzma: add option to limit output size

Serhiy Storchaka report at bugs.python.org
Sun Dec 9 18:20:48 CET 2012


Serhiy Storchaka added the comment:

> you can just stick "if not output: continue" before it.

And then it hangs, because d.unconsumed_tail is not empty and no new data will
be read.

> Why is this necessary? If unconsumed_tail is b'', then there's no need to
> prepend it (and the concatenation would be a no-op anyway). If
> unconsumed_tail does contain data, then we don't need to read additional
> compressed data from the file until we've finished decompressing the data
> we already have.

What if unconsumed_tail is not empty but holds less than is needed to
decompress at least one byte? We need to read more data until unconsumed_tail
grows large enough to be decompressed.
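
A minimal sketch of the loop I have in mind, using zlib because its
decompressobj already has max_length and unconsumed_tail. The names
decompress_limited, chunk_size and stalled are made up for illustration and
are not part of any patch. New data is read not only when the tail is empty,
but also when the previous call produced no output:

```python
import io
import zlib

def decompress_limited(fileobj, chunk_size=16, max_length=64):
    """Yield decompressed pieces of at most max_length bytes each."""
    d = zlib.decompressobj()
    pending = b""      # compressed bytes not yet consumed by the decompressor
    stalled = False    # True if the last call produced no output
    while not d.eof:
        # Read more input when nothing is pending, or when the pending
        # bytes alone were not enough to produce any output.
        if not pending or stalled:
            chunk = fileobj.read(chunk_size)
            if not chunk:
                raise EOFError("compressed stream ended prematurely")
            pending += chunk
        output = d.decompress(pending, max_length)
        pending = d.unconsumed_tail
        stalled = not output
        if output:
            yield output

data = b"hello world " * 1000
stream = io.BytesIO(zlib.compress(data))
assert b"".join(decompress_limited(stream)) == data
```

Without the "or stalled" part, a loop that skips empty output would keep
calling decompress() with the same tail and never read anything new.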

> Are you proposing that the decompressor object maintain its own buffer, and
> copy the input data into it before passing it to the decompression library?
> Doesn't that just duplicate work that the library is already doing for us?

unconsumed_tail is such a buffer, and when we call the decompressor with a new
chunk of data we have to allocate a buffer of size
(len(unconsumed_tail)+len(compressed)) and copy len(unconsumed_tail) bytes
from unconsumed_tail and len(compressed) bytes from the newly read data. But
when you use an internal buffer, you only have to copy the new data.
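
A rough Python-level analogy of the two approaches (the real question is about
the C implementation; feed_by_concatenation and InputBuffer are hypothetical
names used only for this illustration):

```python
def feed_by_concatenation(unconsumed_tail, compressed):
    # Allocates a new object of len(unconsumed_tail) + len(compressed) bytes
    # and copies both operands into it on every call.
    return unconsumed_tail + compressed

class InputBuffer:
    """Hypothetical internal buffer: leftover bytes stay in place."""

    def __init__(self):
        self._buf = bytearray()

    def feed(self, compressed):
        # Only the new chunk is copied (apart from occasional reallocation).
        self._buf += compressed

    def consume(self, n):
        # Drop the first n bytes once the decompressor has used them.
        del self._buf[:n]

    def peek(self):
        # Copy of the pending bytes, for inspection only.
        return bytes(self._buf)

buf = InputBuffer()
buf.feed(b"abc")
buf.consume(1)
buf.feed(b"de")
assert buf.peek() == feed_by_concatenation(b"bc", b"de")
```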

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue15955>
_______________________________________

