[issue20132] Many incremental codecs don’t handle fragmented data

Sun Jan 25 05:47:50 CET 2015

Martin Panter added the comment:

Here is a new patch which fixes the bytes-to-bytes incremental codecs. It depends on my patches for these other issues being applied first:

* Issue 23231: Bytes-to-bytes support for iterencode()/iterdecode()
* Issue 13881: Generic StreamWriter from IncrementalEncoder
* Issue 16473: Clarify and test quopri-codec implementation

In addition, without the fix for Issue 20121 (Quoted printable soft line breaking), the patch will apply, but there will be test failures.

Summary of the changes in this patch:

* Fix simple bz2-codec bug uncovered by new tests
* Implement stateful hex-codec IncrementalDecoder
* Add helpers for getstate() and setstate() to bijectively convert between arbitrary-length byte strings and integers
* Implement stateful base64-codec IncrementalEncoder; use it for the StreamWriter
* base64-codec IncrementalDecoder
* quopri-codec IncrementalEncoder, StreamWriter, IncrementalDecoder
* uu-codec IncrementalEncoder, StreamWriter, IncrementalDecoder
* Document that bytes-to-bytes StreamReader is not supported
* Document that stateful raw-/unicode-escape decoding is not supported
* Warn that stateful UTF-7 encoding may not be optimal, and that stateful UTF-7 decoding may buffer unlimited input
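
To illustrate the getstate()/setstate() helpers and the stateful base64 encoder mentioned in the list, here is a rough sketch; it is not the code from the patch. The names bytes_to_int(), int_to_bytes() and Base64IncrementalEncoder are hypothetical, and the sketch assumes unwrapped base64 output (no line breaks), which is simpler than what the real base64-codec emits. The integer conversion is needed because IncrementalEncoder.getstate() has to return an integer.

```python
import binascii
import codecs


def bytes_to_int(data):
    """Map each byte string to a distinct non-negative integer."""
    # Number of byte strings shorter than `data`:
    # 256**0 + 256**1 + ... + 256**(len(data) - 1)
    offset = (256 ** len(data) - 1) // 255
    return offset + int.from_bytes(data, "big")


def int_to_bytes(value):
    """Inverse of bytes_to_int()."""
    if value < 0:
        raise ValueError("state must be non-negative")
    length = 0
    offset = 0
    # Skip past the blocks of integers used by shorter strings.
    while value >= offset + 256 ** length:
        offset += 256 ** length
        length += 1
    return (value - offset).to_bytes(length, "big")


class Base64IncrementalEncoder(codecs.IncrementalEncoder):
    """Hypothetical sketch: emits unwrapped base64, unlike the real codec."""

    def __init__(self, errors="strict"):
        super().__init__(errors)
        self._pending = b""  # 0-2 bytes that do not yet fill a 3-byte group

    def encode(self, input, final=False):
        data = self._pending + bytes(input)
        # Only whole 3-byte groups can be encoded without "=" padding,
        # so hold back the remainder until more data (or final=True) arrives.
        cut = len(data) if final else len(data) - len(data) % 3
        self._pending = data[cut:]
        return binascii.b2a_base64(data[:cut], newline=False)

    def reset(self):
        self._pending = b""

    def getstate(self):
        # The codecs API requires an integer here.
        return bytes_to_int(self._pending)

    def setstate(self, state):
        self._pending = int_to_bytes(state)


# Fragmented input produces the same output as encoding in one go.
enc = Base64IncrementalEncoder()
chunks = [b"ab", b"c", b"defg"]
out = b"".join(enc.encode(c) for c in chunks) + enc.encode(b"", final=True)
assert out == binascii.b2a_base64(b"abcdefg", newline=False)
```

A plain int.from_bytes() would not be enough for the state, because it maps b"", b"\x00" and b"\x00\x00" all to zero; offsetting by the number of shorter strings keeps the mapping one-to-one in both directions.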

This patch is actually generated from a series of patches folded together. If anyone is interested, I could try pushing the original series to a Bitbucket repository or something.

----------
Added file: http://bugs.python.org/file37846/inc-codecs.diff

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue20132>
_______________________________________