[Python-Dev] Internal representation of strings and Micropython
Stephen J. Turnbull
stephen at xemacs.org
Thu Jun 5 09:54:11 CEST 2014
Paul Sokolovsky writes:
> Please put that in perspective when alarming over O(1) indexing of
> inherently problematic niche datatype. (Again, it's not my or
> MicroPython's fault that it was forced as standard string type. Maybe
> if CPython seriously considered now-standard UTF-8 encoding, results
> of what is "str" type might be different. But CPython has gigabytes of
> heap to spare, and for MicroPython, every half-bit is precious).
Would you please stop trolling? The reasons for adopting Unicode as a
separate data type were good and sufficient in 2000, and they remain
so today, even if you have been fortunate enough not to burn yourself
on character-byte conflation yet.
What matters to you is that str (unicode) is an opaque type -- there
is no specification of the internal representation in the language
reference, and in fact several different ones coexist happily across
existing Python implementations -- and you're free to use a UTF-8
implementation if that suits the applications you expect for
MicroPython.
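
To make the point about opacity concrete, here is a minimal sketch of a str-like type whose storage is UTF-8. The class name Utf8Str and its layout are hypothetical, invented for illustration, and are not MicroPython's or CPython's actual implementation. Callers still see code points; only the cost of indexing changes, from O(1) to a linear scan.

```python
class Utf8Str:
    """Hypothetical str-like type that stores its text as UTF-8 bytes."""

    def __init__(self, text):
        # The internal buffer is an implementation detail callers never see.
        self._buf = text.encode("utf-8")

    def _starts(self):
        # Byte offsets where a code point begins: every byte that is not
        # a UTF-8 continuation byte (bit pattern 10xxxxxx).
        return [i for i, b in enumerate(self._buf) if b & 0xC0 != 0x80]

    def __len__(self):
        # Length is measured in code points, not bytes.
        return len(self._starts())

    def __getitem__(self, index):
        starts = self._starts()  # O(n) scan instead of O(1) arithmetic
        n = len(starts)
        if index < 0:
            index += n
        if not 0 <= index < n:
            raise IndexError("string index out of range")
        begin = starts[index]
        end = starts[index + 1] if index + 1 < n else len(self._buf)
        return self._buf[begin:end].decode("utf-8")


s = Utf8Str("na\u00efve \u2603")
print(len(s), s[2], s[-1])
```

The interface matches what the language reference asks of str for these two operations, which is why the choice of buffer is left to the implementation.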
PEP 393 exists, of course, and specifies the current internal
representation for CPython 3 (3.3 and later). But I don't see anything
in it that suggests it's mandated for any other implementation.
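
For reference, the PEP 393 scheme picks a fixed width of 1, 2, or 4 bytes per code point from the largest code point in the string. The helper below is a rough sketch of that selection rule only; the function name pep393_char_width is made up for illustration, and it does not inspect CPython's actual object layout.

```python
def pep393_char_width(text):
    """Bytes per code point that a PEP 393 style layout would choose."""
    # Assumption: an empty string is treated as the narrowest kind.
    if not text:
        return 1
    highest = max(map(ord, text))
    if highest < 0x100:      # fits in Latin-1
        return 1
    if highest < 0x10000:    # fits in the Basic Multilingual Plane
        return 2
    return 4                 # needs the full code point range


for sample in ("abc", "\u00e9", "\u2603", "\U0001F600"):
    print(repr(sample), pep393_char_width(sample))
```

This is a CPython design decision that keeps indexing O(1) at the cost of memory for strings containing wide characters, a trade-off another implementation is free to make differently.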