[issue41330] Inefficient error-handle for CJK encodings
Ma Lin
report at bugs.python.org
Sat Jul 18 04:31:13 EDT 2020
Ma Lin <malincns at 163.com> added the comment:
> But how many new Python web application use CJK codec instead of UTF-8?
A CJK character usually takes 2 bytes in CJK encodings, but 3 bytes in UTF-8.
I tested a Chinese book:
in GBK: 853,025 bytes
in UTF-8: 1,267,523 bytes
For CJK content, UTF-8 is wasteful, so CJK encodings may not be eliminated.
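A minimal sketch of how such a size comparison could be reproduced. The helper name `encoded_sizes` and the sample string are illustrative assumptions, not the book or script used for the figures above:

```python
def encoded_sizes(text, encodings=("gbk", "utf-8")):
    """Return a dict mapping each encoding name to the encoded byte length."""
    return {enc: len(text.encode(enc)) for enc in encodings}


# Hypothetical sample: six common Chinese characters.
sample = "中文编码测试"
sizes = encoded_sizes(sample)
print(sizes)  # {'gbk': 12, 'utf-8': 18}
```

For text made up only of such characters the ratio is exactly 1.5. The book figures above give about 1.49, which would be consistent with a small share of ASCII characters that take 1 byte in both encodings.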
----------
_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue41330>
_______________________________________