evaluation question

Chris Angelico rosuav at gmail.com
Wed Feb 1 12:57:23 EST 2023


On Thu, 2 Feb 2023 at 04:29, <Muttley at dastardlyhq.com> wrote:
>
> On Wed, 1 Feb 2023 11:59:25 +1300
> Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> >On 31/01/23 10:24 pm, Muttley at dastardlyhq.com wrote:
> >> All languages have their ugly corners due to initial design mistakes and/or
> >> constraints. E.g. Java with the special behaviour of its String class, C++
> >> with "=0" pure virtual declaration. But they don't dump them and make all old
> >> code suddenly cease to execute.
> >
> >No, but it was decided that Python 3 would have to be backwards
> >incompatible, mainly to sort out the Unicode mess. Given that,
> >the opportunity was taken to clean up some other mistakes as well.
>
> Unicode is just a string of bytes. C supports it with a few extra library
> functions to get unicode length vs byte length and similar. It's really
> not that hard. Rewriting an entire language just to support that sounds a
> bit absurd to me but hey ho...
>

No, Unicode is NOT a string of bytes. UTF-8 is a string of bytes, but
Unicode is not.
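
A minimal sketch of the difference, assuming Python 3 and an arbitrary
sample string: a str is a sequence of code points, and bytes only
appear once you pick an encoding.

```python
# A str holds Unicode code points; it has no byte representation until encoded.
text = "na\u00efve caf\u00e9"        # "naïve café", written with escapes for clarity

utf8 = text.encode("utf-8")          # one possible byte representation
utf16 = text.encode("utf-16-le")     # a different one for the same text

print(len(text))    # number of code points: 10
print(len(utf8))    # number of UTF-8 bytes: 12
print(len(utf16))   # number of UTF-16 bytes: 20

# Both byte strings decode back to the same text, yet differ from each other.
print(utf8.decode("utf-8") == utf16.decode("utf-16-le"))   # True
print(utf8 == utf16)                                       # False
```

Same text, two different byte strings, and neither of them is "the"
Unicode; they are just encodings of it.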

If you disagree with the way Python has been developed, you're welcome
to fork Python 2.7 and make your own language (but not called Python).
Meanwhile, the rest of us really appreciate the fact that Python
supports Unicode properly, not just as "a string of bytes". Also, be
sure to deal with the technical debt of refusing to ever remove any
feature. I'm curious how many dev hours that costs you.

Incidentally, the bytes->unicode transformation wasn't Python 3's
biggest reason for being. See https://peps.python.org/pep-3100/ for
details.

ChrisA

