Python is DOOMED! Again!

Sat Jan 31 22:46:15 EST 2015

On Sun, Feb 1, 2015 at 2:16 PM, Steven D'Aprano
<steve+comp.lang.python at pearwood.info> wrote:
> Chris Angelico wrote:
>
>> On Sat, Jan 31, 2015 at 10:56 PM, Steven D'Aprano
>> <steve+comp.lang.python at pearwood.info> wrote:
>>> Both ints and floats are models of the same abstract thing, namely
>>> "number". Ideally, from a mathematical standpoint, there should be no
>>> difference between 23 and 23.0. Automatic coercions allow us to get a
>>> little closer to that ideal.
>>
>> So far, I'm mostly with you. (Though if your float type is not a
>> perfect superset of your integer type - as in Python - then the
>> default "up-cast" from int to float, while disallowing a corresponding
>> implicit "down-cast", seems flawed. But in concept, yes, automatic
>> coercion allows us to treat 23 and 23.0 as the same.)
>
> In principle, we might have a real-number type that is a perfect superset of
> ints, and we might even have int versions of NAN and INF. But down-casting
> real-to-integer is ambiguous, due to the need to handle any fractional
> parts:
>
> - raise an exception if the fractional part is non-zero
> - truncate (round towards zero)
> - round down towards -infinity
> - round up toward +infinity
> - round to nearest, ties to odd numbers
> - round to nearest, ties to even numbers
> - round to nearest, ties split randomly
> - something else
>
> One might semi-arbitrarily pick one (say, truncation) as the default when
> you cast using int(x) but you need to support at least the most common
> rounding modes, perhaps as separate functions.

Agreed; but the trap here is that there are equivalent problems when
converting integers to floating point - just more subtle, because they
don't happen in the low ranges of values. In the same way that a
UTF-16 string representation has more subtle problems than an ASCII
string representation (because it's easy to test your code on "foreign
text" and still not realize that it has issues with astral
characters), casting int to float is subtle because you'll probably do
all your testing on numbers less than 2**53. It might take a lot of
tracking-down work before you finally discover that there's one place
in your code where you do division with / instead of //, and you get
back a float, and then only when your integers are really huge (maybe
you encode an eight-digit date, four-digit time, then a three-digit
country code, and finally a two-digit incrementing number) do you
actually start losing precision. The bulk of Python programs will
never run into this; yet we do have an arbitrary-precision integer
type, unlike ECMAScript with its single "Number" type.
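
To make that concrete, here is a minimal sketch, assuming CPython's
float is an IEEE 754 double with a 53-bit significand. The helper name
and the sample value are made up for illustration; the value just
follows the date/time/country/counter layout described above.

```python
def roundtrips_through_float(n):
    """Return True if the int n survives a trip through float unchanged."""
    return int(float(n)) == n

# Small integers convert exactly; the first failure is just past 2**53.
exact_limit = 2**53
first_loss = 2**53 + 1

# Hypothetical 17-digit identifier: 8-digit date, 4-digit time,
# 3-digit country code, 2-digit counter.
stamp = 20150131224603607

# Floor division stays in int and is exact; true division returns a
# float, which cannot represent every integer of this size.
via_floor_div = stamp // 1
via_true_div = int(stamp / 1)

print(roundtrips_through_float(exact_limit))   # True
print(roundtrips_through_float(first_loss))    # False
print(via_floor_div == stamp, via_true_div == stamp)
```

Nothing here raises or warns, which is what makes the bug hard to
track down: the result is silently off by a small amount.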

>>> Arbitrary objects, on the other hand, are rarely related to strings.
>>> Given that we need to be able to display arbitrary objects to the human
>>> programmer, if only for debugging, we need to be able to *explicitly*
>>> convert into a string:
>>>
>>>
>>> py> import nntplib
>>> py> SERVER = "news.gmane.org"
>>> py> server = nntplib.NNTP(SERVER)
>>> py> str(server)
>>> '<nntplib.NNTP instance at 0xb7bc76ec>'
>>
>> Here, though, I'm not so sure. Why should you be able to *type-cast*
>> anything to string? Python has another, and perfectly serviceable,
>> function for converting arbitrary objects into strings, and that's
>> repr().
>
> Which *also* converts to a string. (Note I didn't say *cast* to a string. I
> cannot imagine any meaningful definition of what casting a NNTP server
> object to a str might be.)

Sure, but Python doesn't really have a way to spell "convert this to a
string if it's already basically stringy, otherwise raise TypeError".
You can do that for other types like int, but not for string, because
you can always call str() on something.
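
For comparison, a rough sketch of the asymmetry. On the int side, one
spelling I know of is operator.index(), which only accepts objects
that define __index__. The as_str_strict() helper is purely
hypothetical; nothing like it exists in the standard library, and an
isinstance check is just one possible definition of "stringy".

```python
import operator

def as_int_strict(obj):
    """Accept only objects that are already integer-like (have __index__)."""
    return operator.index(obj)

def as_str_strict(obj):
    """Hypothetical strict counterpart for str: refuse non-strings."""
    if isinstance(obj, str):
        return obj
    raise TypeError("expected str, got %s" % type(obj).__name__)

def raises_type_error(func, arg):
    """Report whether func(arg) raises TypeError."""
    try:
        func(arg)
    except TypeError:
        return True
    return False

print(as_int_strict(5))                        # an int passes through
print(raises_type_error(as_int_strict, 5.0))   # a float is refused
print(raises_type_error(str, object()))        # str() takes anything
print(raises_type_error(as_str_strict, 5))     # the strict helper refuses
```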

> I agree with all of that. And for what it is worth, a class can refuse to
> convert to str while still supporting repr:
>
> py> class K(object):
> ...     def __str__(self): raise TypeError
> ...     def __repr__(self): return "Stuff and things. Mostly stuff."
> ...
> py> k = K()
> py> str(k)
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "<stdin>", line 2, in __str__
> TypeError
> py> repr(k)
> 'Stuff and things. Mostly stuff.'

Hmm. Sure you can do that, but is that just part of the freedom you
have to shoot yourself in the foot? Would that be considered an
ill-behaved class?

>> Complete and automatic casting to string, I would agree. However, I
>> would suggest that there are a few *binary operations* which could be
>> more convenient if they allowed some non-strings. For instance, Pike
>> allows addition of strings and integers: "1" + 2 == "12", where Python
>> requires "1" + str(2) for the same operation. (But Pike does *not*
>> allow just any object there. Only a few, like numbers, can be quietly
>> cast on being added to strings.)
>
> I'm surprised you're not into Perl, with an attitude like that. A sick,
> disgusting, despicably perverted attitude. *wink*
>
> But seriously, I can see some uses there, but frankly why bother to make an
> exception for ints when you require all other types to have an explicit
> coercion?

Ints, floats, and any user-defined type that chooses to ask for it;
but not arrays, mappings, files, or other types that don't make sense.
It's the same feature that allows you to add a float and an int; the +
operator is defined as accepting certain pairs of dissimilar types,
and has well-defined behaviour around those types.
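
A sketch of how that could look in Python terms, using a hypothetical
str subclass (PikeStr is an invented name, and this is not how Pike
implements it). The point is only that + can be defined for a short
whitelist of type pairs and keep raising TypeError for everything
else.

```python
class PikeStr(str):
    """Hypothetical str subclass that lets + accept ints and floats."""

    _ALLOWED = (int, float)  # assumption: the whitelist of addable types

    def __add__(self, other):
        if isinstance(other, str):
            return PikeStr(str.__add__(self, other))
        if isinstance(other, self._ALLOWED):
            # Always convert the number to a string, never the reverse.
            return PikeStr(str.__add__(self, str(other)))
        return NotImplemented

    def __radd__(self, other):
        if isinstance(other, self._ALLOWED):
            return PikeStr(str(other) + str(self))
        return NotImplemented


left = PikeStr("1") + 2        # number on the right
right = 2 + PikeStr("1")       # number on the left, handled by __radd__
try:
    PikeStr("1") + [2]         # lists are not on the whitelist
    list_rejected = False
except TypeError:
    list_rejected = True
print(left, right, list_rejected)
```

Returning NotImplemented rather than raising directly leaves Python to
produce the usual TypeError when neither operand knows what to do.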

> The problem with string/int automatic coercions is that there are lots of
> answers but none of them are obviously the right answer:
>
> "1" + 1 --> "11" or 2?

If that's 2, then you have a classic "sloppy typing" system that has
to spell addition and concatenation differently (cf PHP and REXX). The
one obvious answer here is "11".

> "1a" + 1 --> 2 like Perl does, or "1a1" like JavaScript does?
>
> Do you strip out all non-numeric characters, or truncate at the first
> non-numeric character?

You don't strip out anything from the string when you add an integer
to it, because you convert the integer to a string.

(Now, what the rules are for converting a string to an integer, well,
that's a completely different question. Fresh new argument as to
whether casting "1z" to integer should be 1 or TypeError.)

> Should you perhaps be a little more flexible and allow common mistypings
> like O for 0 and l for 1? How about whitespace?

Definitely not O/0 or l/1, but ignoring whitespace when converting
string to integer is often helpful, and seldom a problem. But none of
this has anything to do with adding strings and non-strings; you
*always* convert to string (or throw an error), never the other way.
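
For what it's worth, Python's own int() already behaves roughly this
way. A small sketch (parse_int is just an illustrative wrapper name);
note that Python signals a bad string with ValueError rather than
TypeError.

```python
def parse_int(text):
    """Thin wrapper over int(), named only for illustration."""
    return int(text)

# Surrounding whitespace is ignored when parsing.
accepted = {s: parse_int(s) for s in (" 42 ", "\t-7\n", "0042")}

# Lookalike letters, trailing junk and embedded spaces are refused.
rejected = []
for s in ("1z", "O", "l", "4 2", ""):
    try:
        parse_int(s)
    except ValueError:
        rejected.append(s)

# Going the other way is always an explicit int-to-str conversion.
concatenated = "1" + str(1)

print(accepted)
print(rejected)
print(concatenated)
```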

ChrisA