What's the best way to minimize the need of run time checks?

Sun Aug 28 05:28:02 EDT 2016

On Sun, Aug 28, 2016 at 6:33 PM, Steven D'Aprano
<steve+comp.lang.python at pearwood.info> wrote:
> On Sunday 28 August 2016 15:29, Chris Angelico wrote:
>> It might be a good way of thinking about points on a Cartesian plane,
>> though. Rectangular and polar coordinates truly are just different
>> ways of expressing the same information.
>
> That's exactly my point, and that's why you shouldn't implement them as
> different classes. If you do, that's a limitation of your code and/or the
> language.
>
> (There may be *implementation specific* reasons why you are forced to, for
> example to avoid rounding errors due to finite precision: say, my polar number
> (1, 45°) may not evaluate as *exactly* (sqrt(2)/2, sqrt(2)/2) in Cartesian
> coordinates due to rounding. But that's a case of a leaky abstraction.)

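import cmath

class Complex(complex):
    # Polar view of the same value: magnitude
    @property
    def r(self):
        return abs(self)
    # Polar view of the same value: angle in radians
    @property
    def phi(self):
        return cmath.phase(self)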

One value, two ways of looking at it. One type. One class.
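
A quick usage sketch of that idea, assuming nothing beyond the stdlib (the class is repeated so the snippet runs on its own; the names z and w are just for illustration). It also shows the rounding caveat from the quoted paragraph: building the value from polar coordinates only gets you approximately sqrt(2)/2, so compare with a tolerance rather than ==.

```python
import cmath
import math

class Complex(complex):
    @property
    def r(self):
        return abs(self)
    @property
    def phi(self):
        return cmath.phase(self)

# Built from rectangular coordinates, read back as polar.
z = Complex(1, 1)
print(z.r, z.phi)

# Built from polar coordinates (1, 45 degrees), read back as rectangular.
w = Complex(cmath.rect(1, math.radians(45)))
print(w.real, w.imag)

# Finite precision means these are only approximately sqrt(2)/2.
print(math.isclose(w.real, math.sqrt(2) / 2))
```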

> I might be able to tell the compiler that x is Union[int, str] (a number, or a
> string) but that limits the ability of the compiler to tell what is and what
> isn't safe. If I declare that x is either an int or a str, what can we say
> about x.upper()? Is it safe? If x happens to be an int at the time we call
> x.upper(), will the language raise a runtime exception or will it blindly try
> to execute some arbitrary chunk of memory as the upper() method?

That's fine if you *tell* the compiler this. My question came from
your statement that a type *inference* system can detect errors of
assignment - not method/operator usage. A dynamic type inference
system could easily cope with this:

x = 5
y = x**2
x = "five"
z = x.upper()

and correctly deduce that, at the time of y's assignment,
exponentiation of x was legal, and at the time of z's, uppercasing
was. But you said that the type system could flag the third line as an
error, saying "hey, I'm expecting this to be integers only". Here's
what you said:

> x = 1
> x = "hello"  # a type error, at compile time

If I were doing type inference, with my limited knowledge of the
field, I would do one of two things:

1) Infer that x holds only integers (or maybe broaden it to
"numbers"), and then raise an error on the second line; this basically
restricts the type system to be union-free
2) Infer that x holds Union[int, str] in PEP 484 notation, or
int|string in Pike notation, and permit it to carry either type.
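
Those two options can be sketched in PEP 484 terms. This is only an illustration, assuming Python 3.6+ variable annotations and an external checker such as mypy; the names count, value and describe are made up for the example, and Python itself ignores the annotations at run time.

```python
from typing import Union

# Option 1: x is int-only. A static checker would be expected to reject
# the commented-out reassignment; the interpreter would not care.
count: int = 1
# count = "hello"   # flagged by a checker, fine at run time

# Option 2: x is declared as a union and may carry either type.
value: Union[int, str] = 5
value = "five"

def describe(x: Union[int, str]) -> str:
    # With a union, each operation has to be guarded so that it is
    # only applied to the member type that supports it.
    if isinstance(x, int):
        return str(x ** 2)
    return x.upper()

print(describe(5))
print(describe("five"))
```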

Which way is it? Do you get errors, as per your example, and thus are
never allowed to have union types? And if so, what happens with
compatible types (notably, int and float)? Can user-defined types be
deemed "compatible"? Are the same types always compatible? Or are no
types ever compatible, and you just have a single Number type, like in
ECMAScript?

> Okay. What happens when you say:
>
> if random() < 0.5:
>     x = 1234
> else:
>     x = "surprise!"
>
> y = 3*(x + 1)
> z = x.find("p")  # or however Pike does string functions/methods
>
>
> What is y? What is z?

In Pike, variables get declared. So we have a few possibilities:

1) Declaration was "int x;" and the else clause is a compile-time error
2) Declaration was "string x;" and the converse: the if clause is the error
3) Declaration was "int|string x;" or "mixed x;" or some other broad
form, and they are both accepted.

> There are solutions to this conundrum. One is weak typing: 3*("surprise!" + 1)
> evaluates as 3*(0 + 1) or just 3, while (1234).find("p") coerces 1234 to the
> string "1234".
>
> Another is runtime exceptions.

In this example, y would be "surprise!1surprise!1surprise!1", because
Pike allows strings and integers to be added (representing the int
with ASCII decimal digits), but if the expression were 3*(x-1)
instead, then the string case would be a run-time exception. Pike,
like Python, strongly types its values, so if the variable
declaration doesn't prevent something illogical from being compiled,
it'll throw a nice tidy exception at you. The find method, being
applied to an integer,
would definitely be an exception, more-or-less "integers don't have
such a method, moron".
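
For comparison, here is a rough sketch of how the same two expressions behave in Python rather than Pike. The helper name try_ops is hypothetical. Note one difference from the Pike behaviour described above: Python refuses to add a str and an int, so the string case of y raises instead of producing a repeated string.

```python
def try_ops(x):
    """Apply both expressions to x, recording the exception name on failure."""
    results = {}
    try:
        results["y"] = 3 * (x + 1)
    except TypeError:
        # str + int is not allowed in Python
        results["y"] = "TypeError"
    try:
        results["z"] = x.find("p")
    except AttributeError:
        # int has no find method
        results["z"] = "AttributeError"
    return results

print(try_ops(1234))
print(try_ops("surprise!"))
```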

> A third is "don't do that, if you do, you can deal with the segmentation
> fault".

Only in C, where segfaults are considered a normal part of life. In
high level languages, no thank you.

> A fourth would be that the type-checker is smart enough to recognise that only
> one of those two assignments is valid, the second must be illegal, and flag the
> whole thing. That's what a human would do -- I don't know if any type systems
> are that sophisticated.

Ooh, that would be VERY sophisticated. I don't know of anything that
does that, but it could be done on the same basis as the C "undefined
behaviour" thing you so rail against - basically, the compiler says
"well, if the programmer knows what he's doing, x MUST be an integer
at this point, ergo I can assume that it really will be an integer".
I've seen levels of sophistication like that in tools like Coverity
and how it detects buffer size problems or null pointers, so it
wouldn't surprise me (though it would impress me!) if something could
deduce this kind of incompatibility.

ChrisA