[Python-Dev] Internal representation of strings and Micropython

Paul Sokolovsky pmiscml at gmail.com
Wed Jun 4 13:49:33 CEST 2014


Hello,

On Wed, 4 Jun 2014 20:53:46 +1000
Chris Angelico <rosuav at gmail.com> wrote:

> On Wed, Jun 4, 2014 at 8:38 PM, Paul Sokolovsky <pmiscml at gmail.com>
> wrote:
> > And I'm saying that not to discourage Unicode addition to
> > MicroPython, but to hint that "force-force" approach implemented by
> > CPython3 and causing rage and split in the community is not
> > appreciated.
> 
> FWIW, it's Python 3 (the language) and not CPython 3.x (the
> implementation) that specifies Unicode strings in this way. 

Yeah, but it's CPython that dictates how the language evolves (some
people even think that it dictates how the language should be
implemented!), so all the good parts belong to Python3, and all the bad
parts to CPython3, right? ;-)

> I don't
> know why it has to cause a split in the community; this is the one way
> to make sure *everyone's* strings work perfectly, rather than having
> ASCII strings work fine and others start tripping over problems in
> various APIs.

It did cause a split in the community, that's a fact, and that's why
Python2 and Python3 are where they are today. Anyway, I'm not
interested in participating in that split. I had not yet voiced my
opinion on it publicly enough, so I seized the chance to drop some
witty remarks, but I don't want to start yet another Unicode flame.



So, let's please get back to the Unicode storage representation in
MicroPython. https://github.com/micropython/micropython/issues/657
discussed the technical aspects, and in a recent mail on this list I
explained why I think following the CPython way is not productive (for
development satisfaction and the evolution of the Python community, to
be explicit).

The final argument I would make is that you certainly can implement
Unicode support the PEP393 way - it would be an enormous help and would
be gladly accepted. The question is how useful it will be for
MicroPython. It will certainly be useful for reporting that testsuites
pass. But will it be *really* used?

For a microcontroller board, it might be too heavy (put simply, with it
people will be able to do less than without it, because the heap runs
out sooner), so one may expect it to be disabled by default. The POSIX
port, in turn, is surely there not to let people replace the "python"
command with "micropython" and run Django, but to let people develop
and debug their apps with more comfort than on an embedded board. So it
should behave close to the MCU version, and would follow the MCU choice
re: Unicode.
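
To put rough numbers on the heap concern, here is a small sketch that
compares payload sizes only (object headers ignored). It assumes the
alternative to PEP393 is plain UTF-8 storage; the function names are
made up for illustration and are not MicroPython or CPython APIs.

```python
def pep393_payload_bytes(s: str) -> int:
    """Payload size under a PEP393-style layout (hypothetical helper).

    Every character takes 1, 2 or 4 bytes, chosen by the largest
    code point in the string.
    """
    if not s:
        return 0
    widest = max(map(ord, s))
    if widest < 0x100:
        width = 1
    elif widest < 0x10000:
        width = 2
    else:
        width = 4
    return width * len(s)


def utf8_payload_bytes(s: str) -> int:
    """Payload size when the string is stored as UTF-8 (hypothetical helper)."""
    return len(s.encode("utf-8"))


for sample in ("hello", "hello \u20ac", "hello \U0001F600", "\u65e5\u672c\u8a9e"):
    print(repr(sample), pep393_payload_bytes(sample), utf8_payload_bytes(sample))
```

A single non-Latin-1 character widens the whole PEP393 string, which
hurts mostly-ASCII text. On the other hand, pure CJK text comes out
smaller under PEP393 than under UTF-8, so the saving depends on what
the strings actually contain.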

That's actually the reason why I keep up this discussion - not for the
sake of argument or to bash Python3's Unicode choices. With the recent
MicroPython announcement, we surely hoped for more people to
contribute to its development. But then we (or at least I, speaking for
myself) would like to make sure that these contributions are actually
the most useful ones (for both MicroPython and the Python community in
general, which gets more choices rather than just an N% smaller
CPython rewrite).

So, you're not sure how O(N) string indexing will work? But MicroPython
offers a great opportunity to try! And it's something new and exciting,
which surely will be useful (== will save people memory), not just
something old and boring ;-).
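
For the curious, a minimal sketch of what O(N) indexing means, assuming
the string is kept as UTF-8 bytes. It is written in Python for
readability; a real implementation would be C inside the interpreter,
and the function name here is hypothetical.

```python
def utf8_char_at(buf: bytes, index: int) -> str:
    """Return the character at position `index` of UTF-8 encoded `buf`.

    Walks the buffer from the start, so the cost grows with `index`.
    Assumes `buf` is valid UTF-8; negative indices are not handled.
    """
    if index < 0:
        raise IndexError("negative index not supported in this sketch")
    pos = 0
    size = len(buf)
    # Skip `index` whole characters: one lead byte plus any
    # continuation bytes (bit pattern 10xxxxxx).
    for _ in range(index):
        if pos >= size:
            raise IndexError("string index out of range")
        pos += 1
        while pos < size and (buf[pos] & 0xC0) == 0x80:
            pos += 1
    if pos >= size:
        raise IndexError("string index out of range")
    # Find where the requested character ends.
    end = pos + 1
    while end < size and (buf[end] & 0xC0) == 0x80:
        end += 1
    return buf[pos:end].decode("utf-8")


text = "h\u00e9llo, \u043c\u0438\u0440"
data = text.encode("utf-8")
print([utf8_char_at(data, i) for i in range(len(text))])
```

For pure-ASCII data every character is one byte, so an implementation
could special-case that and index directly; whether that is worth it is
exactly the kind of thing to measure.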


> 
> ChrisA


-- 
Best regards,
 Paul                          mailto:pmiscml at gmail.com

