[Python-Dev] Internal representation of strings and Micropython

Steven D'Aprano steve at pearwood.info
Wed Jun 4 16:40:53 CEST 2014


On Wed, Jun 04, 2014 at 01:38:57PM +0300, Paul Sokolovsky wrote:

> That's another reason why people don't like Unicode enforced upon them

Enforcing design and language decisions is the job of the programming 
language. You might as well complain that Python forces C doubles as the 
floating point type, or that it forces Bignums as the integer type, or 
that it forces significant indentation, or "class" as a keyword. Or that 
C forces you to use braces and manage your own memory. That's the 
purpose of the language, to make those decisions as to what features to 
provide and what not to provide.


> - all the talk about supporting all languages and scripts is demagogy
> and hypocrisy, given a choice, Unicode zealots would rather limit
> people to Latin script 

I have no words to describe how ridiculous this accusation is.


> then give up on their arbitrarily chosen, one-among-thousands,
> soon-to-be-replaced-by-apples'-and-microsofts'-"exciting-new" encoding.

 
> Once again, my claim is what MicroPython implements now is more correct
> - in a sense wider than technical - handling. We don't provide Unicode
> encoding support, because it's highly bloated, but let people use any
> encoding they like. That comes at some price, like length of strings in
> characters are not known to runtime, only in bytes

What does uPy return for the length of '∞'? If the answer is anything 
but 1, that's a bug.
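
To make the distinction concrete, here is a minimal sketch of what I'd 
expect, assuming CPython 3 semantics and UTF-8 as the encoding (I have 
not run this on uPy, and the variable names are only for illustration):

```python
# U+221E INFINITY, written as an escape so the source encoding doesn't matter.
s = '\u221e'

# Length of the text string, counted in code points.
char_count = len(s)

# Length of the same string once encoded to UTF-8 bytes.
byte_count = len(s.encode('utf-8'))

print(char_count, byte_count)  # 1 3
```

The first number is the length of the string; the second is the length 
of one particular byte encoding of it. Reporting the second as len() of 
a text string is the behaviour I'm calling a bug.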


-- 
Steven

