[issue7615] unicode_escape codec does not escape quotes

Richard Hansen report at bugs.python.org
Tue Jan 5 00:56:33 CET 2010


Richard Hansen <rhansen at bbn.com> added the comment:

I thought more about raw_unicode_escape, and there is a way to escape quotes: use Unicode escape sequences (e.g., ur'\u0027').  I've attached a new patch that does the following:

 * backslash-escapes single quotes when encoding with the unicode_escape codec (the original subject of this bug)
 * replaces single quotes with \u0027 when encoding with the raw_unicode_escape codec (a separate bug not related to the original report, but brought up in comments)
 * replaces backslashes with \u005c when encoding with the raw_unicode_escape codec (a separate bug not related to the original report)
 * fixes a corner-case bug where the UTF-16 surrogate pair decoding logic could read past the end of the provided Py_UNICODE character array (a separate bug not related to the original report)
 * eliminates redundant code in PyUnicode_EncodeRawUnicodeEscape() and unicodeescape_string()
 * general cleanup in unicodeescape_string()
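
For illustration only, here is a rough Python 3 sketch of the three escaping rules listed above, emulated by pre- and post-processing around the existing codecs. It is not the attached patch (which changes the C implementation), and the helper names escape_unicode() and escape_raw_unicode() are hypothetical.

```python
def escape_unicode(text):
    """Encode with unicode_escape, then backslash-escape single quotes."""
    encoded = text.encode("unicode_escape")
    # The codec already doubles backslashes, so any quote byte left in the
    # output is a literal quote and can safely be prefixed with a backslash.
    return encoded.replace(b"'", b"\\'")


def escape_raw_unicode(text):
    """Encode with raw_unicode_escape, writing \\ and ' as \\uXXXX escapes."""
    # Rewrite backslashes first, so the backslashes introduced for quotes
    # in the second step are not themselves rewritten.
    prepared = text.replace("\\", "\\u005c").replace("'", "\\u0027")
    return prepared.encode("raw_unicode_escape")


if __name__ == "__main__":
    for sample in ["it's", "a\\b", "\\u0041"]:
        print(repr(sample), escape_unicode(sample), escape_raw_unicode(sample))
```

With these rules, decoding the result with the matching codec should give back the original string, including strings that contain a literal backslash followed by "u".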

The changes in the patch are non-trivial and have only been lightly tested.

----------
Added file: http://bugs.python.org/file15742/unicode_escape_reorg.patch

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue7615>
_______________________________________

