[issue36778] test_site.StartupImportTests.test_startup_imports fails if default code page is not cp1252
Eryk Sun
report at bugs.python.org
Sat May 4 03:35:12 EDT 2019
Eryk Sun <eryksun at gmail.com> added the comment:
> cp65001 is *not* utf-8: Microsoft decided to handle surrogates
> differently for some reasons.
Do you mean valid UTF-16 surrogate pairs? For example:
>>> codecs.code_page_encode(65001, '\ud800\udc00')
(b'\xf0\x90\x80\x80', 2)
PyUnicode_AsUnicodeAndSize is neutral about storing surrogate codes in a 16-bit wchar_t string. In particular, the Python string in this case contains two surrogate codes, but they're passed to WideCharToMultiByte as a UTF-16 surrogate pair for the single character U+10000.
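The same effect can be reproduced without the Windows-only code page functions. This is only a rough sketch: it uses the 'surrogatepass' error handler to stand in for the 16-bit wchar_t buffer, which is an assumption made for illustration and not what the C code literally does.

```python
# The str holds two separate surrogate code points, not one character.
s = '\ud800\udc00'
length = len(s)

# Strict UTF-8 refuses to encode surrogate code points directly.
try:
    s.encode('utf-8')
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False

# Written out as 16-bit units, the two codes form a valid UTF-16 pair...
utf16_units = s.encode('utf-16-le', 'surrogatepass')

# ...which a UTF-16 decoder reads back as the single character U+10000.
combined = utf16_units.decode('utf-16-le')

# Encoding that one character yields the same four bytes shown above.
utf8_bytes = combined.encode('utf-8')
```

Here utf8_bytes matches the bytes returned by code_page_encode above, and the consumed length of 2 in that result corresponds to the two surrogate codes in the original str.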
Anyway, it seems to me this issue will be resolved if cp65001.py is rewritten without functools.partial.
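Something along these lines might do, assuming the module currently binds the code page with functools.partial(codecs.code_page_encode, 65001). The non-Windows fallback is a hypothetical stand-in added only so the sketch runs anywhere; it would not be part of cp65001.py itself.

```python
import codecs

# codecs.code_page_encode exists only on Windows. Elsewhere, use a
# hypothetical stand-in (plain UTF-8) so the sketch can still be run.
_code_page_encode = getattr(codecs, 'code_page_encode', None)
if _code_page_encode is None:
    def _code_page_encode(code_page, text, errors='strict'):
        # Stand-in only: ignores code_page and encodes as UTF-8.
        return text.encode('utf-8', errors), len(text)

def encode(input, errors='strict'):
    # A plain function binds the code page, so functools is never imported.
    return _code_page_encode(65001, input, errors)

result = encode('abc')
```

A plain def keeps the same call signature as the partial object, so callers of encode would not need to change.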
----------
nosy: +eryksun
_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue36778>
_______________________________________