Grapheme clusters, a.k.a. real characters

Rick Johnson rantingrickjohnson at gmail.com
Sat Jul 15 21:20:30 EDT 2017


On Saturday, July 15, 2017 at 7:29:14 PM UTC-5, Chris Angelico wrote:
> [...] Also, that doesn't deal with
> U+200B or U+180E, which have well-defined widths *smaller* than
> typical Latin letters. (200B is a zero-width space. Is it a
> character?)

Of *COURSE* it's a character.

Would you also consider 0 not to be a number?

Sheesh! 

When I call the `len()` function on a string containing only
three "zero-width unicode chars", I want `len` to return the
integer 3, not 0! In what upside-down/inside-out universe
would you prefer that `len` lie to you and return 0? You
can't be serious...
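
A minimal sketch of the point, assuming Python 3, where a `str` is a
sequence of code points. The variable names are only for illustration:

```python
# Three ZERO WIDTH SPACE characters (U+200B); they occupy no visible width.
s = "\u200b" * 3

# len() counts the code points stored in the str, not what is drawn on screen.
length = len(s)
print(length)  # 3
```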

Doth not a string containing three characters have a
length of 3? And if not, what other length could it have?

Doth not a knapsack containing 3 items have a quantity of 3?
And if not, what other quantity could it have?

You seem to want this fine group to believe that if the 3
items in the knapsack are _visible_ to the naked eye (say,
three apples), then they are relevant to the quantity. But
what if the three objects in the knapsack are, say,
radiowaves -- yep, three radiowaves bouncing around inside a
knapsack -- are we to believe that the knapsack is empty?
And if we are, then every scientist and mathematician since
antiquity shall be rolling over in their graves.

Furthermore, why should the storage API and the display API
give a monkey's toss about each other, when they are
obviously "two sides of a mountain"?
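
To make the storage/display split concrete, here is a rough sketch,
assuming Python 3. `display_width` is a hypothetical helper, not a
standard library function, and it is deliberately crude: it ignores
wide East Asian characters, tabs, and whatever the terminal or font
actually does.

```python
import unicodedata

def display_width(text):
    """Hypothetical, crude column estimate for display purposes.

    Skips format characters (Cf) and combining/enclosing marks (Mn, Me),
    and counts every other code point as one column.
    """
    invisible = {"Cf", "Mn", "Me"}
    return sum(1 for ch in text if unicodedata.category(ch) not in invisible)

s = "a\u200bb"              # 'a', ZERO WIDTH SPACE, 'b'
stored = len(s)             # storage side: three code points
shown = display_width(s)    # display side: two columns under this estimate
print(stored, shown)
```

The two functions answer different questions, so neither one needs to
change its answer to suit the other.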
