[issue18814] Add utilities to "clean" surrogate code points from strings

Martin Panter report at bugs.python.org
Sun Sep 27 07:00:34 CEST 2015


Martin Panter added the comment:

My suggested colours for the bikeshed would be handle_surrogates() and handle_surrogateescape(). “Rehandle” seems awkward and too presumptuous to me. I also agree with Serhiy that surrogates are a Unicode thing, not just related to the “surrogatepass” handler.

Adding them to “codecs” makes sense to me. The most important one, handle_surrogateescape() or equivalent, is closely related to the “surrogateescape” error handler documented in that module.

Having handle_surrogateescape() or an equivalent would probably be useful for Issue 25184 (displaying an arbitrary file path in a UTF-8 HTML file).
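
To illustrate the kind of thing I mean, here is a rough sketch only. The name handle_surrogateescape(), its signature and the default replacement character are all assumptions for the sake of the example, not an agreed API. It replaces the lone surrogates that the “surrogateescape” error handler produces (U+DC80 to U+DCFF) so that the result can be encoded as strict UTF-8:

```python
import re

# Lone surrogates U+DC80..U+DCFF are what the "surrogateescape" error
# handler produces for undecodable bytes 0x80..0xFF (see PEP 383).
_ESCAPED_BYTE = re.compile('[\udc80-\udcff]')


def handle_surrogateescape(text, replacement='\ufffd'):
    """Hypothetical helper: replace surrogate-escaped bytes in a str.

    The name, signature and default are assumptions for illustration.
    """
    # A function is used as the replacement so that backslashes in
    # "replacement" are not interpreted as regex template escapes.
    return _ESCAPED_BYTE.sub(lambda match: replacement, text)


# A Latin-1 encoded file name, decoded as UTF-8 with "surrogateescape".
raw_name = b'caf\xe9.txt'
name = raw_name.decode('utf-8', 'surrogateescape')

# After cleaning, the string can be encoded as strict UTF-8.
clean = handle_surrogateescape(name)
encoded = clean.encode('utf-8')
```

Without the cleaning step, name.encode('utf-8') raises UnicodeEncodeError, which is the problem when writing such a path into a UTF-8 HTML file. Other surrogates (outside U+DC80 to U+DCFF) are deliberately left alone in this sketch; those would be the job of the more general handle_surrogates().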

----------
nosy: +martin.panter

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue18814>
_______________________________________

