[issue36778] test_site.StartupImportTests.test_startup_imports fails if default code page is not cp1252

Eryk Sun report at bugs.python.org
Sat May 4 03:35:12 EDT 2019


Eryk Sun <eryksun at gmail.com> added the comment:

> cp65001 is *not* utf-8: Microsoft decided to handle surrogates 
> differently for some reasons.

Do you mean valid UTF-16 surrogate pairs? For example:

    >>> import codecs
    >>> codecs.code_page_encode(65001, '\ud800\udc00')
    (b'\xf0\x90\x80\x80', 2)

PyUnicode_AsUnicodeAndSize does not treat surrogate codes specially when it stores them in a 16-bit wchar_t string. In this case the Python string contains two separate surrogate codes, but WideCharToMultiByte receives them as the UTF-16 surrogate pair for the single character U+10000.
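
As a rough cross-platform illustration of the same effect (this does not call the Windows API; the 'surrogatepass' round trip through utf-16-le is only my stand-in for the wchar_t conversion), the two surrogate codes from the example above become indistinguishable from a real surrogate pair once they are written out as 16-bit units:

```python
# Two separate surrogate code points, as in the example above.
s = '\ud800\udc00'

# Python's strict utf-8 codec refuses to encode lone surrogates.
try:
    s.encode('utf-8')
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False

# Stand-in for the wchar_t conversion: write the code points out as
# 16-bit units without any special handling of surrogates.
utf16 = s.encode('utf-16-le', 'surrogatepass')

# Read back as UTF-16, the units form the surrogate pair for U+10000.
paired = utf16.decode('utf-16-le')

# Encoding that single character as UTF-8 gives the same four bytes
# that code page 65001 returned above.
encoded = paired.encode('utf-8')
```

Here `paired` is the single character U+10000 and `encoded` matches the bytes returned by code_page_encode above.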

Anyway, it seems to me this issue will be resolved if cp65001.py is rewritten without functools.partial.
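
Something along these lines is what I have in mind, presumably sufficient because the codec module would then no longer import functools at startup. This is only a sketch, not the actual cp65001.py; the function names and signatures below are my assumptions, and the calls only work on Windows, where codecs.code_page_encode and codecs.code_page_decode exist:

```python
import codecs

# Hypothetical sketch: plain wrapper functions in place of
# functools.partial, so that functools is never imported here.
# The codecs.code_page_* functions are looked up at call time and
# are only available on Windows.

def encode(input, errors='strict'):
    # Returns a (bytes, number_of_characters_consumed) tuple.
    return codecs.code_page_encode(65001, input, errors)

def decode(input, errors='strict'):
    # final=True: treat the input as a complete byte sequence.
    return codecs.code_page_decode(65001, input, errors, True)
```

A real codec module would also need the usual incremental and stream classes; they are left out here because they do not depend on functools either way.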

----------
nosy: +eryksun

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue36778>
_______________________________________

