[issue18814] Add utilities to "clean" surrogate code points from strings

STINNER Victor report at bugs.python.org
Sun Sep 27 10:55:47 CEST 2015


STINNER Victor added the comment:

Hum, I suggest putting these functions in a package on PyPI, or as recipes on
a website like Stack Overflow, and closing the issue.

I'm still not convinced that these functions are useful. Usually we take a
function from an existing project used in applications and put it in the
stdlib. Here the use case still looks artificial. For example, which
application needs to escape non-BMP characters? How does it handle them
currently?

There are too many ways to handle surrogate characters. The common ways to
show undecodable bytes are not supported by the functions proposed by Serhiy.
Example: %80 on Mac OS X. Gnome uses something else.
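
To make the %80 style concrete, here is a minimal sketch of such a recipe. It
assumes the string was decoded with the "surrogateescape" error handler, so
each undecodable byte 0x80..0xFF appears as a lone surrogate U+DC80..U+DCFF.
The name percent_escape_surrogates is hypothetical; it is not one of the
functions proposed in this issue, and it does not claim to reproduce exactly
what Mac OS X does.

```python
import re

# Lone surrogates U+DC80..U+DCFF are what the "surrogateescape" error
# handler produces for undecodable bytes 0x80..0xFF.
_ESCAPED_BYTE = re.compile('[\udc80-\udcff]')


def percent_escape_surrogates(text):
    """Hypothetical helper: show each escaped byte as %XX."""
    return _ESCAPED_BYTE.sub(
        # Subtracting 0xDC00 recovers the original byte value.
        lambda m: '%%%02X' % (ord(m.group()) - 0xDC00),
        text,
    )


raw = b'caf\x80.txt'                              # 0x80 is not valid UTF-8
decoded = raw.decode('utf-8', 'surrogateescape')  # 'caf\udc80.txt'
print(percent_escape_surrogates(decoded))         # caf%80.txt
```

A recipe like this is only a few lines long, which is one reason it may fit
better on PyPI or a recipe site than in the stdlib. Other display conventions
would need a different replacement function.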

It was said that one reason to add new functions is performance. I'm not
convinced either that such a function is the bottleneck in any application.

I prefer to wait until users experiment with their own implementations and
see if a common function can be extracted from them to put in the stdlib.

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue18814>
_______________________________________

