[issue9769] PyUnicode_FromFormatV() doesn't handle non-ascii text correctly

Wed Sep 8 01:21:48 CEST 2010

STINNER Victor <victor.stinner at haypocalc.com> added the comment:

> PyUnicode_FromFormat("%s", text) expects a utf-8 buffer.

Really? I don't see how "*s++ = *f;" (where s is Py_UNICODE* and f is char*) can decode utf-8. It looks more like ISO-8859-1.
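
To illustrate the point, here is a minimal sketch (not the actual CPython code) of what a byte-by-byte copy like "*s++ = *f;" amounts to. The helper name widen_bytes and the use of uint32_t in place of Py_UNICODE are assumptions made for the example. The cast through unsigned char is also an addition in this sketch: without it, bytes >= 0x80 would sign-extend on platforms where char is signed.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: copy each byte of f into one code unit of s,
 * mirroring the "*s++ = *f;" pattern. uint32_t stands in for Py_UNICODE.
 * Returns the number of code units written. */
static size_t widen_bytes(const char *f, uint32_t *s)
{
    size_t n = 0;
    while (*f != '\0') {
        /* One input byte becomes exactly one code point with the same
         * numeric value: that is ISO-8859-1 decoding, not UTF-8. */
        s[n++] = (unsigned char)*f++;
    }
    return n;
}
```

With such a copy, the UTF-8 bytes 0xC3 0xA9 (U+00E9) come out as two code points, U+00C3 and U+00A9, whereas the single ISO-8859-1 byte 0xE9 comes out as U+00E9, as expected.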

> Very recently (r84472, r84485), some C files of CPython source code
> were converted to utf-8

Python source code (C and Python) is written in ASCII, except perhaps for some headers and some tests written in Python with a #coding:xxx header (or without the header but in utf-8, for Python 3). I don't think any C file calls PyErr_Format() or PyUnicode_FromFormat(V)() with a non-ASCII format string.

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue9769>
_______________________________________