Why Python 3?

Chris Angelico rosuav at gmail.com
Mon Apr 21 18:43:09 EDT 2014


On Tue, Apr 22, 2014 at 8:28 AM, Gregory Ewing
<greg.ewing at canterbury.ac.nz> wrote:
> The reason it doesn't work well is because of the
> automatic promotion of ints to floats when they meet
> other floats. This leads to a style where people often
> use ints to stand for int-valued floats and expect
> them to be promoted where necessary.
>
> Things would be different if ints and floats were
> completely separate types, like str and bytes, but
> that would be extremely inconvenient. I used a language
> like that once, and it wasn't a pleasant experience.

I do see that there are two sides to this. The question "Is 1.0 equal
to 1?" has a very obvious answer... whichever answer you go with, it's
absolutely obvious! (Yes! They're the same number, of course they're
equal! vs No! They're completely different representations, just as 1
and "1" and "\1" are all different!) Separating the types makes very
good sense, and unifying them makes very good sense, for different
reasons. Unifying them in as many ways as possible means you don't
need the syntactic salt of ".0" on every number; you should be able to
add 2.5 + 1 and get 3.5, just as if you'd added 2.5 + 1.0. And that's
fine.
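
To make that concrete, here's what the unified behaviour looks like
at the interactive prompt (any recent CPython):

>>> 1 == 1.0              # same number, so equal
True
>>> 1 == "1"              # a different representation is not
False
>>> 2.5 + 1               # the int is quietly promoted to float
3.5
>>> hash(1) == hash(1.0)  # equal numbers even hash alike
True
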
Separating them also makes sense, though: it means that an operation
on Type X and Type Y behaves equally sanely regardless of the values
of those objects. As it is, most lowish integers have equivalent
floats (all integers in the range most people actually use), and
beyond that, you have problems. This is, in my opinion, analogous to a
UTF-16 string type: work with strings of nothing but BMP characters
and everything is perfect, but put in an astral character and things
may or may not work. A lot of operations will work fine, but just a
few will break.
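
Here's the int/float version of that breakage at the prompt. IEEE
doubles carry 53 significant bits, so odd integers above 2**53 simply
have no float equivalent:

>>> float(2**53) == 2**53
True
>>> float(2**53 + 1) == 2**53 + 1   # rounded to the nearest double
False
>>> int(float(2**53 + 1))
9007199254740992
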
Python 3 fixed the string version of this by giving us the pain of
transition *right at the beginning*: you look at Text and at Bytes as
completely separate things. People who like their ASCII like the idea
that the letter "A" is equivalent to the byte 0x41. It's convenient,
it's easy. But it leads to problems later.
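
In Python 3 that boundary is explicit everywhere; crossing it takes a
deliberate encode or decode:

>>> "A" == b"A"           # text and bytes never compare equal
False
>>> b"A"[0]               # indexing bytes gives the integer 0x41
65
>>> "A".encode("ascii")   # crossing over is always explicit
b'A'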

Now, the analogy does break down a bit, in that a program is probably
more likely to meet non-ASCII characters than integers that can't be
represented in IEEE double precision. But would you rather deal with
the problem straight away, or some day later, when your program is
handed a much bigger number to swallow and quietly rounds it off to
the nearest multiple of 8?
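
And in case that "multiple of 8" sounds made up: consecutive doubles
between 2**55 and 2**56 are exactly 8 apart, so one stray float
operand is all it takes:

>>> n = 2**55 + 3
>>> n
36028797018963971
>>> int(n + 0.0)   # promoted to float and silently rounded
36028797018963968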

ChrisA


