[issue8271] str.decode('utf8', 'replace') -- conformance with Unicode 5.2.0
John Machin
report at bugs.python.org
Thu Apr 1 17:07:08 CEST 2010
John Machin <sjmachin at users.sourceforge.net> added the comment:
Chapter 3, page 94: """As a consequence of the well-formedness conditions specified in Table 3-7, the following byte values are disallowed in UTF-8: C0–C1, F5–FF"""
Of course, they should be handled by the simple expedient of setting their entries in the decoder's length table to zero. Why write code when there is an existing mechanism?
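
To illustrate the idea, here is a minimal sketch in Python of a lead-byte length table in which the disallowed byte values get a zero entry. The names build_utf8_length_table, UTF8_LENGTH and is_valid_lead_byte are hypothetical and chosen for illustration only; this is not the decoder's actual table, just the ranges from the quoted passage written out.

```python
def build_utf8_length_table():
    """Map each possible first byte to the length of the sequence it starts.

    A zero entry means "this byte can never start a well-formed sequence".
    """
    table = [0] * 256
    for b in range(0x00, 0x80):   # ASCII: one-byte sequences
        table[b] = 1
    # 0x80-0xBF are continuation bytes, never lead bytes: stay 0
    # 0xC0-0xC1 could only start overlong encodings: stay 0
    for b in range(0xC2, 0xE0):   # two-byte sequences
        table[b] = 2
    for b in range(0xE0, 0xF0):   # three-byte sequences
        table[b] = 3
    for b in range(0xF0, 0xF5):   # four-byte sequences, up to U+10FFFF
        table[b] = 4
    # 0xF5-0xFF would encode values above U+10FFFF: stay 0
    return table


UTF8_LENGTH = build_utf8_length_table()


def is_valid_lead_byte(b):
    """Return True if byte value b may start a UTF-8 sequence."""
    return UTF8_LENGTH[b] != 0
```

With a table like this, a decoder that already rejects zero-length lead bytes needs no extra branch for C0, C1 or F5 through FF; they fall into the same error path as a stray continuation byte.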
----------
_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue8271>
_______________________________________