Cult-like behaviour [was Re: Kindness]

Mon Jul 16 17:04:05 EDT 2018

On Tue, Jul 17, 2018 at 6:27 AM, Marko Rauhamaa <marko at pacujo.net> wrote:
> Rhodri James <rhodri at kynesim.co.uk>:
>
>> On 16/07/18 20:40, Marko Rauhamaa wrote:
>>> You mean each code point is one code point wide. But that's rather an
>>> irrelevant thing to state. The main point is that UTF-32 (aka Unicode)
>>> uses one or more code points to represent what people would consider an
>>> individual character.
>>
>> UTF-32 != Unicode, but that's a separate esoteric argument.
>>
>> The problem everyone
>
> "Everyone"!!!
>
>> is having with you, Marko, is that you are using the terminology
>> incorrectly. When you say that more than one codepoint can be used to
>> represent what people would consider an individual character, you are
>> correct (and would be more correct if you called "what people would
>> consider an individual character" a "glyph"). When you call UTF-32 a
>> variable-width encoding, you are incorrect.
>
> Unicode is one of the primary selling points of Python3

Here, have a look at the original plans for Python 3.0:

https://www.python.org/dev/peps/pep-3100/

The default string type becoming Unicode was just one bullet point
among many. Remember, Python 2 had Unicode strings for a long time;
the change is not "now we use Unicode" but "now the simple and obvious
string type is the text string rather than the byte sequence". Both
types had previously been available. Both types remained available.
This was not a "primary selling point". The main selling points were
cleanups and simplifications.
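
To make the text-versus-bytes distinction concrete, here is a minimal
sketch, assuming Python 3. The variable names are mine, purely for
illustration; nothing here comes from the PEP.

```python
# Python 3: a plain string literal is text (str); bytes is a separate type.
text = "na\u00efve caf\u00e9"      # "naive cafe" with two accented letters
data = text.encode("utf-8")        # explicit step from text to bytes

assert isinstance(text, str)
assert isinstance(data, bytes)

# str counts code points; bytes counts encoded bytes.
n_code_points = len(text)          # 10
n_bytes = len(data)                # 12: each accented letter takes 2 bytes in UTF-8

# Decoding with the same encoding recovers the original text.
round_trip = data.decode("utf-8")

# The two types do not mix implicitly; concatenating them raises TypeError.
try:
    text + data
except TypeError:
    mixing_fails = True
else:
    mixing_fails = False
```

In Python 2 the same two concepts existed as unicode and str, but the
bare literal gave you the byte string; Python 3 swaps which one is the
default.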

ChrisA