Grapheme clusters, a.k.a. real characters

Rhodri James rhodri at kynesim.co.uk
Thu Jul 20 12:46:46 EDT 2017


On 20/07/17 16:18, Rustom Mody wrote:
> So coming to the point:
> It's not whether Einstein or Mencken¹ is right but rather that Mencken applies to
> 1 whereas Einstein applies to 3
> 
> And (IMHO) text should be squarely classed in 3 not 1
> 
> The gmas of this world have made shopping lists, written (and taught to write)
> letters [my gpa wrote books] long before CS and before any of us existed.
> 
> And if suddenly text has moved from being obvious to anyone to something arcane
> involving
> - codepoints (which are abstract and platonic)
> - (≠) glyphs
> - (that fit into) octets (whatever that may be except they are not bytes)
> - And all other manner of Unicode-gobbledygook
> Something somewhere is wrong

The something that is wrong is a failure to consider the necessary 
_depth_ of knowledge.  The shallow (read: obvious and intuitive) 
definition of text works just fine in the context of grandma's shopping 
list or granddad's book, localised environments with heavily 
circumscribed usage patterns.  It breaks down in the global environments 
we've been talking about in much the same way that the obvious and 
intuitive definition of numbers breaks down when you start considering 
infinities, or Newtonian mechanics breaks down near the speed of light, 
or pretty much everything intuitive breaks down at quantum scales.
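
As a small illustration of where the intuitive definition stops working, 
here is a minimal sketch using only Python's standard library.  The 
sample strings and variable names are my own assumptions, chosen for the 
example; they are not taken from the earlier discussion.  It shows one 
visible character, "é", for which the code point count and the octet 
count both depend on how the text happens to be spelled internally.

```python
import unicodedata

# "é" written two ways: one precomposed code point,
# or a plain "e" followed by a combining acute accent.
composed = "\u00e9"
decomposed = "e\u0301"

# Both render as a single character, yet they are different strings.
print(composed == decomposed)            # False

# len() counts code points, not what a reader would call characters.
print(len(composed), len(decomposed))    # 1 2

# The octet counts in UTF-8 differ again.
print(len(composed.encode("utf-8")), len(decomposed.encode("utf-8")))  # 2 3

# Normalising to NFC makes the two spellings compare equal.
print(unicodedata.normalize("NFC", decomposed) == composed)  # True
```

Normalisation only helps where a precomposed form exists; counting 
grapheme clusters in general needs the segmentation rules from the 
Unicode standard, which the standard library does not provide.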

-- 
Rhodri James *-* Kynesim Ltd


