[issue8271] str.decode('utf8', 'replace') -- conformance with Unicode 5.2.0

Thu Apr 1 17:07:08 CEST 2010

John Machin <sjmachin at users.sourceforge.net> added the comment:

Chapter 3, page 94: """As a consequence of the well-formedness conditions specified in Table 3-7, the following byte values are disallowed in UTF-8: C0–C1, F5–FF"""

Of course, they should be handled by the simple expedient of setting their length entries to zero. Why write code when there is an existing mechanism?
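
To illustrate the idea, here is a minimal sketch of a first-byte length table in which the disallowed bytes simply get a length of zero. It is not the decoder's actual table: the names `build_utf8_length_table`, `UTF8_LENGTH` and `is_valid_lead_byte` are hypothetical, and the ranges are taken from the byte values quoted above plus the standard UTF-8 lead-byte layout.

```python
def build_utf8_length_table():
    """Map each possible first byte to the sequence length it announces.

    A length of 0 marks a byte that can never start a well-formed sequence.
    """
    table = [0] * 256
    for b in range(0x00, 0x80):   # ASCII: single-byte sequences
        table[b] = 1
    # 0x80-0xBF are continuation bytes, never lead bytes: stay 0
    # 0xC0-0xC1 are disallowed (they could only start overlong forms): stay 0
    for b in range(0xC2, 0xE0):   # two-byte sequences
        table[b] = 2
    for b in range(0xE0, 0xF0):   # three-byte sequences
        table[b] = 3
    for b in range(0xF0, 0xF5):   # four-byte sequences
        table[b] = 4
    # 0xF5-0xFF are disallowed (beyond U+10FFFF or never valid): stay 0
    return table


# Hypothetical module-level table, built once.
UTF8_LENGTH = build_utf8_length_table()


def is_valid_lead_byte(byte):
    """Return True if `byte` may start a well-formed UTF-8 sequence."""
    return UTF8_LENGTH[byte] != 0
```

With a table like this, a decoder that already rejects zero-length lead bytes would reject C0, C1 and F5 through FF with no extra branch. Checking the trailing bytes (for example, the restricted second byte after E0, ED, F0 and F4) still has to happen separately.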

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue8271>
_______________________________________