Instagram: 40% Py3 to 99% Py3 in 10 months (Posting On Python-List Prohibited)

Steve D'Aprano steve+python at pearwood.info
Thu Jun 22 09:33:26 EDT 2017


On Wed, 21 Jun 2017 09:23 am, Lawrence D’Oliveiro wrote:

> Though the Perl 6 folks claim their approach (encoding “characters” rather
> than “code points”) is superior.

Can you explain what you are referring to precisely?

According to the Perl 6 docs, they do encode code points, not "characters"
(which is an ill-defined concept, and besides some Unicode code points are not
characters at all).

http://www.unicode.org/faq/private_use.html#noncharacters

For example:

https://docs.perl6.org/language/unicode

talks about code points. The very first section is titled "Entering Unicode
Codepoints and Codepoint Sequences".

Likewise there is a method "codes" which returns the number of code points in a
string:

https://docs.perl6.org/routine/codes

On the other hand there is also a method "chars" which returns the number
of "characters" (graphemes? grapheme clusters? it doesn't specify) in the
string. 

https://docs.perl6.org/routine/chars
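
To illustrate the distinction in Python rather than Perl 6: len() counts code
points, so a base letter followed by a combining accent counts as two. The
sketch below approximates a "character" count by skipping combining marks.
That shortcut is my own simplifying assumption, not the full grapheme cluster
rules, and not necessarily what Perl 6's "chars" does.

```python
import unicodedata

s = "e\u0301"  # LATIN SMALL LETTER E + COMBINING ACUTE ACCENT

# Number of code points, comparable in spirit to Perl 6's "codes".
code_points = len(s)

# Crude approximation of user-perceived characters: count only the
# code points whose canonical combining class is 0 (not combining marks).
approx_chars = sum(1 for c in s if unicodedata.combining(c) == 0)

print(code_points, approx_chars)  # 2 1
```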

Anyone here got Perl 6 installed and can try it out? How many "characters" does
it think the string "a\uFDD5\uFDD6z" contains?

- if it says 4, that's the number of code points;

- if it says 2, that's the number of code points less the number of
noncharacters.
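
For comparison, here is a rough sketch of the same count in Python 3, where
len() counts code points. The helper is_noncharacter is a name I made up for
illustration. It encodes the 66 noncharacters Unicode defines: U+FDD0..U+FDEF,
plus the last two code points of every plane (U+xxFFFE and U+xxFFFF).

```python
def is_noncharacter(cp):
    """Return True if code point cp is one of the 66 Unicode noncharacters."""
    # U+FDD0..U+FDEF, or any code point whose low 16 bits are FFFE or FFFF.
    return 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE

s = "a\uFDD5\uFDD6z"

code_points = len(s)                                    # all code points
nonchars = sum(is_noncharacter(ord(c)) for c in s)      # U+FDD5 and U+FDD6

print(code_points, nonchars, code_points - nonchars)    # 4 2 2
```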




-- 
Steve
“Cheer up,” they said, “things could be worse.” So I cheered up, and sure
enough, things got worse.



