[pypy-issue] Issue #3095: PyUnicode_AsUCS4Copy fails with extended Unicode characters on Windows (pypy/pypy)

Ondrej B issues-reply at bitbucket.org
Wed Oct 16 06:49:26 EDT 2019


New issue 3095: PyUnicode_AsUCS4Copy fails with extended Unicode characters on Windows
https://bitbucket.org/pypy/pypy/issues/3095/pyunicode_asucs4copy-fails-with-extended

Ondrej B:

The following test case works on CPython 3.7 but fails on PyPy3.6-7.2.0 when running on Windows:

`ucs4_test.c`:

```c
#include "Python.h"

static PyObject*
test(PyObject* self, PyObject* args)
{
    PyObject* py_string;
    if (!PyArg_ParseTuple(args, "O", &py_string))
        return NULL;

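    /* PyUnicode_AsUCS4Copy returns NULL with an exception set on failure. */
    Py_UCS4 *text = PyUnicode_AsUCS4Copy(py_string);
    if (text == NULL)
        return NULL;

    /* Return the first code point, then release the PyMem_Malloc'ed copy. */
    PyObject *result = PyLong_FromUnsignedLong(text[0]);
    PyMem_Free(text);
    return result;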
}

// define module called ucs4_test ...
```

`test.py`:

```python
import ucs4_test

def test(ch):
    assert ucs4_test.test(ch) == ord(ch)

test("\U0001F12B")
```

I believe the issue is caused here: [https://bitbucket.org/pypy/pypy/commits/3ec1002a818c#Lpypy/module/cpyext/unicodeobject.pyT1088](https://bitbucket.org/pypy/pypy/commits/3ec1002a818c#Lpypy/module/cpyext/unicodeobject.pyT1088)

The type returned by `PyUnicode_AsUnicode` seems to be `wchar_t`, which is only 16 bits wide (UCS2/UTF-16) on Windows (and a few other platforms), so a single element cannot hold a code point such as U+1F12B.
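
To illustrate why a 16-bit buffer would break the test, here is a minimal sketch in plain Python. It assumes the buffer holds UTF-16 code units, as a 16-bit `wchar_t` would. The helper names `utf16_units` and `units_to_ucs4` are hypothetical and not part of PyPy or the C API; this is not necessarily how PyPy should fix it.

```python
def utf16_units(s):
    """Return the 16-bit code units of s, as a 16-bit wchar_t buffer would hold them."""
    data = s.encode("utf-16-le")
    return [int.from_bytes(data[i:i + 2], "little") for i in range(0, len(data), 2)]


def units_to_ucs4(units):
    """Combine UTF-16 surrogate pairs into full code points (hypothetical helper)."""
    out = []
    i = 0
    while i < len(units):
        u = units[i]
        is_high = 0xD800 <= u <= 0xDBFF
        has_low = i + 1 < len(units) and 0xDC00 <= units[i + 1] <= 0xDFFF
        if is_high and has_low:
            # Standard UTF-16 decoding: 10 bits from each surrogate, offset by 0x10000.
            out.append(0x10000 + ((u - 0xD800) << 10) + (units[i + 1] - 0xDC00))
            i += 2
        else:
            out.append(u)
            i += 1
    return out


ch = "\U0001F12B"
units = utf16_units(ch)
print([hex(u) for u in units])                 # ['0xd83c', '0xdd2b']
print([hex(c) for c in units_to_ucs4(units)])  # ['0x1f12b']
```

If the 16-bit units were copied one by one into the `Py_UCS4` buffer, `*text` would be `0xD83C` rather than `ord(ch) == 0x1F12B`, which would match the failing assertion in `test.py`.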



