Python 3 is killing Python

Wed Jul 16 11:18:59 EDT 2014

On Thu, Jul 17, 2014 at 12:27 AM, Frank Millman <frank at chagford.com> wrote:
> FWIW, here are my thoughts -
>
> 1. There were many backward-incompatible changes made in Python3, but the
> only one that seems to cause problems is the change to the bytes/str types.
> I agree that it is a big change, but the others seem to have been accepted
> without argument, so it seems to me that the python devs got an awful lot
> right.

There are quite a few changes that are almost completely
insignificant, like renaming (e.g. Tkinter to tkinter), where there's
a tiny difference at the top of your program and absolutely no
difference elsewhere. Then there are a few additions, such as
FileNotFoundError, a new subclass of OSError. I have a program that
needs to catch that exception, and I just have a trap at the top
that, if there's no FNFE, sets it equal to OSError and then proceeds
as normal. (This does mean that, under Python 2, the mini HTTP server
returns 404s for other types of OSError when attempting to read from
certain files; under Python 3, those will result in 500s and logged
errors. I'm not overly concerned about that difference, but I prefer
the Py3 behaviour.) These sorts of changes, while technically
backward-incompatible, aren't going to cause argument - you just zip
through your code and change stuff (probably with a script like 2to3),
or else add a bit of header to ensure compatibility with both
versions. Pretty easy.
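
A minimal sketch of that kind of compatibility header, not the actual
program's code. The helper name status_for and the 200/404/500 mapping
are assumptions made up for illustration. It uses os.open() rather
than the built-in open(), because on Python 2 the built-in raises
IOError, which is not an OSError there.

```python
import os

try:
    FileNotFoundError
except NameError:
    # Python 2 has no FileNotFoundError; fall back to the broader OSError.
    FileNotFoundError = OSError


def status_for(path):
    """Hypothetical helper: map the result of opening a file to an HTTP-style status."""
    try:
        fd = os.open(path, os.O_RDONLY)
    except FileNotFoundError:
        # Python 3: only a missing file lands here.
        # Python 2: every OSError lands here, because of the fallback above.
        return 404
    except OSError:
        # Python 3 only: other failures, such as a permissions error.
        return 500
    os.close(fd)
    return 200
```

Under Python 2 the second except clause is never reached, which is
the 404-versus-500 difference described above.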

Then there are the changes that, while again technically
backward-incompatible, are practically identical *in normal usage*.
For instance, range no longer returns a list, but most range usage is
with iteration anyway. Dict views rather than lists might cause some
problems (if you iterate over d.keys() while adding or removing keys,
Py3 raises RuntimeError, whereas in Py2 it's fine because keys()
returns a list), but again, any place where you have issues, you just
tweak it to the new recommended style. Several
of these changes are actually less significant than one change that
happened within the 2.x line - the change from string exceptions to
subclasses of (Base)Exception. There have been a few complaints, but
they're not the stuff about which people say "Python 3 is killing
Python".
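
To make the range and dict-view points concrete, here is a small
sketch, assuming Python 3. The function names are invented for the
example.

```python
def drop_odd_values(d):
    """Delete entries with odd values; works the same on Python 2 and 3."""
    # list(d) takes a snapshot of the keys, so deleting inside the loop is safe.
    for key in list(d):
        if d[key] % 2:
            del d[key]
    return d


def drop_odd_values_unsafe(d):
    """Same loop over d.keys() directly; this is the Python 2 habit."""
    # On Python 3, keys() is a live view, and changing the dict's size
    # while iterating over it raises RuntimeError. On Python 2, keys()
    # returns a list, so the same loop completes.
    for key in d.keys():
        if d[key] % 2:
            del d[key]
    return d


# range() no longer builds a list on Python 3, but iterating over it is
# unchanged, and list() gives the old list back where one is needed.
total = sum(n for n in range(5))
as_list = list(range(5))
```

Wrapping the iterable in list() is the usual "new recommended style"
tweak in both cases.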

> 2. Those adversely affected by the change are very vocal, but we hear very
> little from those who have benefited from it. This is to be expected - they
> are just getting on with developing in Python3 and have no need to get
> involved in controversies.

That's very true. Sometimes silence is the best measure of success;
for example, my house has been progressively migrated almost
exclusively to Linux, and the days that go by without anyone asking
me for help are proof that Linux is a perfectly acceptable desktop
OS. (Actually, even when people _do_ ask
me for help, it's usually either something to do with git, or
something advanced like "How can I find out which files in this whole
directory tree have been changed recently?", which your average user
wouldn't know off-hand how to do on Windows or OS/2 either.) Python 3
has served many people just fine, and those people aren't writing blog
posts about how unexciting their lives have become now that they don't
have to deal with bug reports about stuff the language just does for
them.

> I just tried an experiment in my own project. Ned Batchelder, in his
> Pragmatic Unicode presentation, http://nedbatchelder.com/text/unipain.html,
> suggests that you always have some unicode characters in your data, just to
> ensure that they are handled correctly. He has a tongue-in-cheek example
> which spells the word PYTHON using various exotic unicode characters. I used
> this to populate a field in my database, to see if it would display in my
> browser-based client.
>
> The hardest part was getting it in. There are 6 characters, but utf-8
> requires 16 bytes to store it -
>
>     b'\xe2\x84\x99\xc6\xb4\xe2\x98\x82\xe2\x84\x8c\xc3\xb8\xe1\xbc\xa4'.decode('utf-8')
>

Ideally, you would have a browser-based input system as well, which
would allow you to do the whole thing directly. Also, I would strongly
recommend using a database back-end that stores Unicode; and if that
back-end is MySQL, be aware that its "utf8" is actually a messed-up
encoding that's like UTF-8 but restricted to three bytes per
character (and therefore to the BMP), so you have to use "utf8mb4" to
store all of Unicode. With a decent back-end like PostgreSQL, you can
do this sort
of thing directly:

rosuav=> create table test (id serial primary key,txt text);
CREATE TABLE
rosuav=> insert into test (txt) values ('U+1234 is ሴ');
INSERT 0 1
rosuav=> insert into test (txt) values ('U+12345 is 𒍅');
INSERT 0 1
rosuav=> select id,txt,length(txt) from test;
 id |     txt      | length
----+--------------+--------
  1 | U+1234 is ሴ  |     11
  2 | U+12345 is 𒍅 |     12
(2 rows)

Looks fine to me. You should be able to read and write Unicode from Python, too.
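
The same two strings can be checked from Python without a database.
This is only a sketch, assuming Python 3.3 or later; utf8_summary is a
name made up for the example. It reports the length in characters
(matching the length column above), the size in UTF-8 bytes, and
whether every character sits in the BMP, which is what decides whether
MySQL's three-byte "utf8" could hold it.

```python
def utf8_summary(text):
    """Return (characters, UTF-8 bytes, True if all characters are in the BMP)."""
    encoded = text.encode("utf-8")
    fits_bmp = all(ord(ch) <= 0xFFFF for ch in text)
    return len(text), len(encoded), fits_bmp


print(utf8_summary("U+1234 is \u1234"))       # (11, 13, True)
print(utf8_summary("U+12345 is \U00012345"))  # (12, 15, False)
```

The second string needs a four-byte UTF-8 sequence for U+12345, so it
is the one that would require "utf8mb4".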

> However, that was it. Without any changes to my program, it read it from the
> database and displayed it on the screen. IE8 could only display 2 out of the
> 6 characters correctly, and Chrome could display 5 out of 6, but that is a
> separate issue. Python3 handled it perfectly.

That's more of a font issue than anything else. I played around with
U+12345 in the above example, and it didn't display usefully in either
my console or Chrome here, but it's still obviously there as a single
character.

> Would this have been so easy using Python2 - I don't think so.

If all you ever do is read stuff from one place and write it to
another, it doesn't make a lot of difference whether you're working
with Unicode text or UTF-8 bytes. The trouble comes when you want to
take the length of it, trim it, or anything like that; for instance,
suppose you want to have a preview of the text, ellipsizing if the
full text is longer than (say) 30 characters, with the full text
available by clicking or hovering the mouse or something. At that
point, UTF-8 becomes a dashed nuisance, and true Unicode support makes
it a breeze.
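
A short sketch of that preview case, assuming Python 3 and reusing the
six-character string quoted earlier. The preview function and its
"..." suffix are illustrative choices, not anything from an existing
library.

```python
def preview(text, limit=30):
    """Return text unchanged if it fits, else its first `limit` characters plus "..."."""
    if len(text) <= limit:
        return text
    return text[:limit] + "..."


word = b'\xe2\x84\x99\xc6\xb4\xe2\x98\x82\xe2\x84\x8c\xc3\xb8\xe1\xbc\xa4'.decode('utf-8')

# Slicing text counts characters, so the cut always lands between them.
short = preview(word * 10, limit=4)

# Slicing the UTF-8 bytes counts bytes, and a cut after 4 bytes lands in
# the middle of the second character, leaving bytes that will not decode.
raw = word.encode("utf-8")[:4]
try:
    raw.decode("utf-8")
    clean_cut = True
except UnicodeDecodeError:
    clean_cut = False
```

Note that this counts code points; text with combining characters
would need more care, but that is a much smaller problem than counting
bytes.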

> What follows
> is blatant speculation, but it is quite possible that there are many
> non-English speakers out there that have had their lives made much easier by
> the changes to Python3  - a 'silent majority'? I don't mean an absolute
> majority, as I believe there are still more Python2 users than Python3. But
> of those who have made the switch from 2 to 3, maybe most of them are quite
> happy. If so, then the python devs got that right as well.

It's impossible to say how many Py2 users there are and how many Py3.
But I would say that there are a HUGE number of people who've either
written Py3 code brand new, or ported something from Py2, and had no
significant trouble.

> Unfortunately, human nature being what it is, the possibility of this split
> in the community continuing, to the detriment of Python itself, is all too
> real. I don't know what more the python devs can do, but there are no
> guarantees of success :-(

What split, exactly? There's always this talk of a split... but I
don't see one happening. I don't see, for instance,
python2-list at python.org or comp.lang.python2 being separated out. I
don't see Linux distributions choosing to support only one branch and
not the other (only one can be the default and the system Python, but
the other is usually just an apt-get/yum/pacman away). I don't see
anyone taking the Python 2 source code and backporting a bunch of
Python 3 features (and/or adding a bunch of their own features) and
creating the Python 2.8 that http://blog.startifact.com/guido_no.jpg
rejects. What split is actually occurring, or going to occur? I think
anyone who talks of splitting has an unrealistically low idea of the
costs of such a split; moving away from what the core Python devs are
doing means setting up everything fresh, and it's just way too much
work to do that.

I don't know what's going to happen in 2020, though. There might be a
split between three communities: the Python 3 community, the Red Hat
supported Python 2 community, and the ActiveState supported Python 2
community. Or maybe there'll be some other commercial support. Or
maybe there'll still be some measure of community support for Python 2
here on python-list and other such places, and there won't be a split
even then. (People have come here talking about Python 1.5, although
there isn't a huge amount of support for that anywhere!) Frankly, the
Python devs need do nothing and can do nothing; the mass of users will
go where it goes, 800lb gorilla style, and it's up to those users to either
find their own bananas or join the python.org banana plantation. One
way or another, it'll work out.

ChrisA