RE Module Performance

Sat Jul 27 00:12:36 EDT 2013

On Fri, Jul 26, 2013 at 9:37 PM, Steven D'Aprano
<steve+comp.lang.python at pearwood.info> wrote:
> See the similarity now? Both flexibly change the width used by code-
> points, UTF-8 based on the code-point itself regardless of the rest of
> the string, Python based on the largest code-point in the string.

No, I think we're just using the word "flexible" differently.  In my
view, simply being variable-width does not make an encoding "flexible"
in the sense of the FSR.  But I'm not going to keep repeating myself
in order to argue about it.
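
For anyone who wants to see the two behaviours side by side, here is
a rough sketch, assuming the 1/2/4-byte thresholds described in
PEP 393.  The helper names utf8_widths and fsr_width are made up for
illustration and are not part of any library; fsr_width only models
the width choice and does not inspect the interpreter's internals.

```python
def utf8_widths(s):
    """Bytes each code point occupies in UTF-8, decided per code point."""
    return [len(ch.encode("utf-8")) for ch in s]


def fsr_width(s):
    """Bytes per code point under PEP 393, decided once per string.

    Illustrative model only: the width is chosen from the largest
    code point in the string (1, 2 or 4 bytes).
    """
    if not s:
        return 1
    largest = max(map(ord, s))
    if largest < 0x100:
        return 1
    if largest < 0x10000:
        return 2
    return 4


# Same three ASCII letters, followed by progressively larger code points.
samples = ["abc", "abc\u00e9", "abc\u20ac", "abc\U0001F600"]
for s in samples:
    print(ascii(s), utf8_widths(s), fsr_width(s))
```

In UTF-8 the three ASCII letters stay at one byte each whatever
follows them; in the PEP 393 model every code point in the string
widens once a single larger one is present.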

> Having watched this issue from Day One when JMF first complained about
> it, I believe this is entirely about denying any benefit to ASCII users.
> Had Python implemented a system identical to the current FSR except that
> it added a fourth category, "all ASCII", which used an eight-byte
> encoding scheme (thus making ASCII strings twice as expensive as strings
> including code points from the Supplementary Multilingual Planes), JMF
> would be the scheme's number one champion.

I agree.  In fact I made a similar observation back in December:

http://mail.python.org/pipermail/python-list/2012-December/636942.html