[issue21051] incorrect utf-8 conversion with c api

Mark Dickinson report at bugs.python.org
Mon Mar 24 20:16:25 CET 2014


Mark Dickinson added the comment:

Indeed: the \u010d is being interpreted by your *C compiler* as a universal character name, and the individual bytes of the resulting multibyte character end up in the string that you actually pass to Python.  I suspect that the actual bytes you get depend on your locale.  Here I get (signed) bytes -60 and -115, i.e. 0xC4 0x8D, which is the UTF-8 encoding of U+010D.  (See e.g. "translation phase 7" in C99 6.4.5.)

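As Victor says, you need to escape the backslash in the C code: write \\u010d in the C source, so that Python rather than the C compiler gets to interpret the \u010d escape.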
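
To make the difference concrete, here is a minimal sketch in plain C. It does not reproduce the reporter's code: the names compiler_expanded, passed_through and dump_bytes are made up for illustration, and it assumes a C99 compiler. The bytes in the first literal depend on the execution character set (0xC4 0x8D when that is UTF-8), while the second literal is always six ASCII characters.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Single backslash: the C compiler treats \u010d as a universal character
   name, so the literal holds the already-encoded character.  Which bytes
   you get depends on the execution character set. */
const char compiler_expanded[] = "\u010d";

/* Escaped backslash: the literal holds the six ASCII characters
   backslash, 'u', '0', '1', '0', 'd', left for Python to interpret. */
const char passed_through[] = "\\u010d";

/* Print the length of s and each of its bytes as a signed value. */
void dump_bytes(const char *label, const char *s)
{
    size_t i, n;

    assert(s != NULL);
    n = strlen(s);
    printf("%s (%zu bytes):", label, n);
    for (i = 0; i < n; i++)
        printf(" %d", (int)(signed char)s[i]);
    printf("\n");
}
```

Calling dump_bytes("single", compiler_expanded) and dump_bytes("escaped", passed_through) from main shows which bytes would actually reach Python in each case.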

----------
nosy: +mark.dickinson
resolution:  -> invalid
status: open -> closed

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue21051>
_______________________________________

