Is Unicode support so hard...

Sat Apr 20 14:02:54 EDT 2013

On Sat, Apr 20, 2013 at 10:22 AM, Ned Batchelder <ned at nedbatchelder.com> wrote:
> On 4/20/2013 1:12 PM, jmfauth wrote:
>>
>> In a previous post,
>>
>>
>> http://groups.google.com/group/comp.lang.python/browse_thread/thread/6aec70817705c226#
>> ,
>>
>> Chris “Kwpolska” Warrick wrote:
>>
>> “Is Unicode support so hard, especially in the 21st century?”
>>
>> --
>>
>> Unicode is not really complicated and it works very well (more
>> than two decades of development if you take into account
>> iso-14****).
>>
>> But, - I can say, "as usual" - people prefer to spend their
>> time to make a "better Unicode than Unicode" and it usually
>> fails. Python does not escape this rule.
>>
>> -----
>>
>> I'm "busy" with TeX (Unicode engine variant), fonts and typography.
>> This gives me plenty of ideas to test the "flexible string
>> representation" (FSR). I should recognize this FSR is failing
>> particularly very well...
>>
>> I can almost say, a delight.
>>
>> jmf
>> Unicode lover
>
> I'm totally confused about what you are saying.  What does "make a better
> Unicode than Unicode" mean?  Are you saying that Python is guilty of this?
> In what way?  Can you provide specifics?  Or are you saying that you like
> how Python has implemented it?  "FSR is failing ... a delight"?  I don't
> know what you mean.
>
> --Ned.

Don't bother trying to figure this out. jmfauth has been hijacking
every thread that mentions Unicode to complain about the flexible
string representation introduced in Python 3.3. Apparently, having
proper Unicode semantics (indexing is based on code points, not code
units) at the expense of performance when calling .replace on the
only non-ASCII or non-BMP character in the string is a horrible bug.
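
For anyone who wants to see the trade-off for themselves, here is a
minimal sketch, assuming CPython 3.3 or later. The variable names are
mine, and the sizes reported by sys.getsizeof are implementation
details that vary between versions, so only the comparison matters,
not the exact numbers.

```python
import sys

# One non-BMP character (U+1F600) between two ASCII letters.
s = "a\U0001F600b"

# With the flexible string representation, len() and indexing count
# code points, so the astral character is one element of the string
# rather than a surrogate pair.
length = len(s)
middle = s[1]

# Replacing the only wide character yields a string that can be stored
# in the narrow one-byte-per-character form again. That conversion is
# the extra work .replace has to do in this case.
narrow = s.replace("\U0001F600", "c")

# Storage is chosen per string from its widest character, so a string
# of astral characters takes more memory than an ASCII string of the
# same length.
wide_size = sys.getsizeof("\U0001F600" * 1000)
narrow_size = sys.getsizeof("c" * 1000)

print(length, narrow, wide_size > narrow_size)
```

On a narrow build of Python 3.2 or earlier, the same string would
report a length of 4, because the astral character was stored as two
UTF-16 code units.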