[Python-Dev] Internal representation of strings and Micropython

Paul Sokolovsky pmiscml at gmail.com
Fri Jun 6 13:34:01 CEST 2014


Hello,

On Fri, 06 Jun 2014 20:11:27 +0900
"Stephen J. Turnbull" <stephen at xemacs.org> wrote:

> Paul Sokolovsky writes:
> 
>  > That kinda means "string is atomic", instead of your "characters
>  > are atomic".
> 
> I would be very surprised if a language that behaved that way was
> called a "Python subset".  No indexing, no slicing, no regexps, no
> .split(), no .startswith(), no sorted() or .sort(), ...!?
> 
> If that's not what you mean by "string is atomic", I think you're
> using very confusing terminology.

I'm sorry if I didn't mention it, or didn't make it clear enough - it's
all about layering.

On level 0, you treat strings verbatim, and can write some subset of
apps (my point is that even this level allows writing quite a lot of
apps). Let's call this set A0.

On level 1, you accept that there are some universal enough conventions
for some chars, like space or newline. And you can write a set of
apps A1 > A0.
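
As a rough illustration of levels 0 and 1 (a sketch only, assuming the
strings are held as UTF-8 bytes; split_words is a made-up name, not
anything from MicroPython): splitting on space or newline needs no
decoding, because every byte of a multi-byte UTF-8 sequence is >= 0x80
and so can never be mistaken for an ASCII space or newline.

```python
# Sketch only: assumes text is held as UTF-8 bytes. split_words is a
# hypothetical helper used for illustration.
def split_words(data: bytes) -> list:
    # bytes.split() with no argument splits on runs of ASCII whitespace.
    # Bytes inside multi-byte UTF-8 sequences are all >= 0x80, so they
    # never collide with space or newline.
    return data.split()

text = "привет мир\nhello world".encode("utf-8")
words = split_words(text)

# Level 0: the pieces are passed around verbatim and never decoded.
rejoined = b" ".join(words)
```

Nothing here knows what a "character" is; the non-ASCII words survive
simply because they are never looked into.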

On level 2, you add len(), and - oh magic - you can now center a string
within a fixed-size field, something you probably do as often as once a
month, so hopefully that will keep you busy for a while.
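
A minimal sketch of what level 2 might cost, again assuming UTF-8
bytes as the storage (codepoint_len and center are hypothetical helper
names): the code point count falls out of skipping continuation bytes,
with no tables involved.

```python
# Sketch only: assumes UTF-8 bytes; both helper names are hypothetical.
def codepoint_len(data: bytes) -> int:
    # Continuation bytes have the form 0b10xxxxxx; count every byte
    # that is not one of those.
    return sum(1 for b in data if (b & 0xC0) != 0x80)

def center(data: bytes, width: int) -> bytes:
    # Pad with ASCII spaces on both sides; never truncates.
    pad = max(width - codepoint_len(data), 0)
    left = pad // 2
    return b" " * left + data + b" " * (pad - left)

s = "naïve".encode("utf-8")
byte_len = len(s)              # length in bytes
char_len = codepoint_len(s)    # length in code points
centered = center(s, 9)
```

This counts code points, not display columns, so combining marks and
wide characters would still be off; that belongs to the later levels.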

On level 3, it indeed starts to smell of Unicode: we get isdigit(),
isalpha(), which require long boring tables, which hopefully can be
compressed enough to fit in your pocket.

On level 4, it's pumping up, with tolower() and friends, the tables for
which you carry around in a suitcase.

On level 5, everything is Unicode, what bliss! You can even start
pretending that no other levels exist (God created Unicode on the second
day).

On level 6, there are mind-boggling, ugly manual-use utilities for
dealing with the internals of the "magic", "works on its own for
everyone" encoding: stuff like code point vs character vs surrogate
pair vs grapheme separation, etc.
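
To show how far apart those level-6 notions can be, here is a small
sketch using only the standard library (Python 3.3 or later assumed).
The grapheme count is a crude approximation that merely skips combining
marks; real segmentation follows Unicode Standard Annex #29 and is more
involved.

```python
import unicodedata

# 'e' + COMBINING ACUTE ACCENT + one emoji outside the BMP
s = "e\u0301\U0001F600"

code_points = len(s)                            # code points in the str
utf16_units = len(s.encode("utf-16-le")) // 2   # emoji needs a surrogate pair
utf8_bytes = len(s.encode("utf-8"))             # 1 + 2 + 4 bytes

# Crude stand-in for "user-perceived characters": skip combining marks.
approx_graphemes = sum(1 for ch in s if unicodedata.combining(ch) == 0)
```

The same string is 3 code points, 4 UTF-16 code units, 7 UTF-8 bytes,
and roughly 2 things a user would call characters.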



So, once again, for me and some other people, it's not that bright an
idea to shoot for level 5 if levels 0-4 exist and are a well-proven
pragmatic model. And level 6 is still there anyway.


-- 
Best regards,
 Paul                          mailto:pmiscml at gmail.com

