[pypy-dev] Making the most of internal UTF8

Jerry Spicklemire jspicklemire at gmail.com
Thu Feb 27 12:54:45 EST 2020


Thanks for all the replies.

Anto, re:

"- some_unicode.encode('utf-8') is essentially for free
(because it is already UTF-8 internally)

- some_bytes.decode('utf-8') is very cheap (it just needs
to check that some_bytes is valid utf-8)"


I guess you mean the processing load for such operations
will be low. So that's good then.

Just wish they would both go away ...
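
For reference, a minimal sketch of the two operations Anto describes,
in plain Python. It only shows the behaviour (a round trip, plus the
validity check on decode) and says nothing about cost. The variable
names and the try_decode helper are made up for illustration and are
not part of PyPy.

```python
# -*- coding: utf-8 -*-
# Hypothetical sample values, chosen only for illustration.
text = u"caf\u00e9"            # text containing one non-ASCII character
data = text.encode("utf-8")    # text -> UTF-8 bytes
round_tripped = data.decode("utf-8")  # bytes -> text, validated as UTF-8


def try_decode(raw):
    """Return the decoded text, or None if raw is not valid UTF-8.

    Illustrative helper: decode() has to check validity, so invalid
    input raises UnicodeDecodeError rather than passing through.
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return None


valid = try_decode(b"caf\xc3\xa9")   # well-formed UTF-8
invalid = try_decode(b"\xff\xfe")    # 0xff never occurs in valid UTF-8
```

Here round_tripped equals text, valid holds the decoded text, and
invalid is None because the input fails the validity check.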


Matt, re:

"The defaults are generally better for the programming
most people do imo."

Probably correct, just got spoiled, that's all.

Had a glimmer of hope that the need for either would
vanish, and wishing someone knew how.


Dan, re:

"I think you mostly don't want u'foo' in 3.x or b'foo' in 2.x"


Actually, I don't want either, anywhere.

If UTF8 is used internally, and ASCII is already UTF8,
then it is all UTF8, so ...

Sigh ...
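
A small sketch of the ASCII point, assuming Python 3 semantics:
ASCII-only bytes decode to the same text under either codec, yet the
bytes object and the text object are still distinct types, so the
explicit conversion step remains. All names below are illustrative.

```python
# Hypothetical sample value: bytes that are pure ASCII.
ascii_bytes = b"hello"

# Every ASCII byte sequence is also valid UTF-8, so both codecs agree.
as_ascii = ascii_bytes.decode("ascii")
as_utf8 = ascii_bytes.decode("utf-8")
same_text = as_ascii == as_utf8

# Encoding the text back yields the original bytes unchanged.
round_trip = as_utf8.encode("utf-8") == ascii_bytes

# The byte-level content matches, but the two objects have different types.
types_differ = type(ascii_bytes) is not type(as_utf8)
```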

Thanks anyhow,

Jerry S.