What's the best way to minimize the need of run time checks?

Sun Aug 28 23:09:28 EDT 2016

On Mon, Aug 29, 2016 at 12:43 PM, Steve D'Aprano
<steve+python at pearwood.info> wrote:
> On Sun, 28 Aug 2016 07:28 pm, Chris Angelico wrote:
>
>> On Sun, Aug 28, 2016 at 6:33 PM, Steven D'Aprano
>> <steve+comp.lang.python at pearwood.info> wrote:
>>> On Sunday 28 August 2016 15:29, Chris Angelico wrote:
>>>> It might be a good way of thinking about points on a Cartesian plane,
>>>> though. Rectangular and polar coordinates truly are just different
>>>> ways of expressing the same information.
>>>
>>> That's exactly my point, and that's why you shouldn't implement them as
>>> different classes. If you do, that's a limitation of your code and/or the
>>> language.
> [... snip digression over possible leaky abstractions due to floating point
> rounding ...]
>
>> import cmath
>>
>> class Complex(complex):
>>     @property
>>     def r(self):
>>         return abs(self)
>>     @property
>>     def phi(self):
>>         return cmath.phase(self)
>>
>> One value, two ways of looking at it. One type. One class.
>
> I can't tell if you're saying this to agree with me or to disagree with me.

Agreeing, and positing that you don't even need two ways of storing it
(modulo FP rounding) - just two ways of *looking* at it.
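
A minimal sketch of that idea, runnable as-is. The from_polar constructor is a hypothetical helper added purely for illustration (it is not part of the class quoted above); it shows that the polar view can also be used to build the same single type.

```python
import cmath

class Complex(complex):
    """One stored value (rectangular), with a polar *view* on top."""

    @property
    def r(self):
        # Magnitude of the complex number.
        return abs(self)

    @property
    def phi(self):
        # Phase angle in radians.
        return cmath.phase(self)

    @classmethod
    def from_polar(cls, r, phi):
        # Hypothetical helper: build the same type from polar coordinates.
        return cls(cmath.rect(r, phi))

z = Complex(3, 4)
w = Complex.from_polar(2, cmath.pi / 2)
print(z.r, z.phi)
print(w.real, w.imag)   # subject to FP rounding, as noted above
```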

>>> I might be able to tell the compiler that x is Union[int, str] (a number,
>>> or a string) but that limits the ability of the compiler to tell what is
>>> and what isn't safe.
>>
>> That's fine if you *tell* the compiler this.
>
> I trust you don't actually mean that it is fine for a high-level
> (non-assembly) language to blindly execute some arbitrary memory address.

Quote trimmed to clarify my point. Of course I don't want a high level
language to blindly execute random memory.

> Your question seems to be, what happens if you follow that with an
> assignment to a different type?
>
> x = 5
> some_code(x)
> x = "hello world"
>
>
> Will the type-checker consider that an error ("you're assigning a str to an
> int") or will it infer that x is the Union[int, str]?
>
> Surely that depends on the type-checker! I don't think there's any hard and
> fast rule about that, but as far as I know, all statically typed languages
> consider that an error. Remember, that's the definition of *static typing*:
> the variable carries the type, not the value. So x is an int, and "hello
> world" isn't an int, so this MUST be an error.

So in statically-typed inferred-type languages, unions are impossible.
Got it. That's perfectly acceptable with strings and integers, but
ints and floats are more problematic. More on that below.

> In dynamically typed languages... I don't know. I think that (again) I would
> expect that *by default* the checker should treat this as an invalid
> assignment. The whole point of a static type-checker is to bring some
> simulacrum of static typing to a dynamic language, so if you're going to
> enthusiastically infer union types every time you see an unexpected type
> assignment, it sort of defeats the purpose...
>
> "x was declared int, and you then call function foo() which is declared to
> return an int, but sometimes returns a str... oh, they must have meant a
> union, so that's okay..."

Not really; you forfeit any kind of assignment checking (since
assignment will simply expand the type union), but you still get
static type checking of operators, methods, attributes, etc, and of
function calls (passing a string|int to something that expects a list?
Error!). You still get a lot of the benefit.
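
As a rough sketch of what I mean (a toy model, not how any real checker works): treat a variable's inferred type as a set that each assignment expands, and accept an operation only if every member of the set supports it. The supports() helper and the x_types name are made up for this illustration.

```python
def supports(types, attr):
    """True only if every type in the union has the given attribute."""
    return all(hasattr(t, attr) for t in types)

x_types = {int}        # x = 5
x_types |= {str}       # x = "hello world"  (assignment expands the union)

print(supports(x_types, "__add__"))   # both int and str define +
print(supports(x_types, "upper"))     # only str has upper()
print(supports(x_types, "append"))    # neither is a list
```

So assignment checking is gone, but anything that isn't valid for both int and str still gets flagged.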

>> A dynamic type inference
>> system could easily cope with this:
>>
>> x = 5
>> y = x**2
>> x = "five"
>> z = x.upper()
>>
>> and correctly deduce that, at the time of y's assignment,
>> exponentiation of x was legal, and at the time of z's, uppercasing
>> was.
>
> Could it? Do you have an example of a language or type-checker that can do
> that? This isn't a rhetorical question.

No, I don't, because the type-checking languages I use have
declarations. But since C compilers are capable of detecting that
"this variable isn't used after this point, so I can reuse its
register", the equivalent in type checking should be possible.
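
To make the idea concrete, here is a toy flow-sensitive checker. It is only a sketch under heavy assumptions: it handles straight-line code with assignments of literals, "name ** something", and "name.method()" calls, and nothing else. The check() function is hypothetical and does not correspond to any existing tool.

```python
import ast

def check(source):
    """Track each variable's type statement by statement; report misuse."""
    env = {}       # variable name -> type known at the current point
    errors = []
    for stmt in ast.parse(source).body:
        if not isinstance(stmt, ast.Assign):
            continue
        value = stmt.value
        result = None                      # None means "type unknown"
        if isinstance(value, ast.Constant):
            result = type(value.value)
        elif (isinstance(value, ast.BinOp) and isinstance(value.op, ast.Pow)
              and isinstance(value.left, ast.Name)):
            left = env.get(value.left.id)
            if left is not None and not hasattr(left, "__pow__"):
                errors.append(f"line {stmt.lineno}: {left.__name__} does not support **")
            result = left                  # crude approximation of the result type
        elif (isinstance(value, ast.Call)
              and isinstance(value.func, ast.Attribute)
              and isinstance(value.func.value, ast.Name)):
            owner = env.get(value.func.value.id)
            if owner is not None and not hasattr(owner, value.func.attr):
                errors.append(f"line {stmt.lineno}: {owner.__name__} has no attribute {value.func.attr}")
        # Rebinding replaces the old type instead of conflicting with it.
        for target in stmt.targets:
            if isinstance(target, ast.Name):
                env[target.id] = result
    return errors

ok = check("x = 5\ny = x**2\nx = 'five'\nz = x.upper()\n")
bad = check("x = 5\nz = x.upper()\n")
print(ok)
print(bad)
```

The first snippet passes because each use of x is checked against the type x has at that point; the second is rejected because x is still an int when upper() is called.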

> My understanding is that all the standard algorithms for checking types are
> based on the principle that a variable only has a single type in any one
> scope. Global x and local x may be different, but once you explicitly or
> implicitly set x to an int, then you can't set it to a str. So I'm not sure
> that existing compile-time type-checkers could deal with that, even in a
> dynamic language. At best they might say "oh well, x is obviously
> duck-typed, so don't bother trying to check it".
>
> At least that's my understanding -- perhaps I'm wrong.
>
> I daresay that what you want is *possible*. If the human reader can do it,
> then an automated checker should be able to. The question is not "is this
> possible?" but "has anyone done it yet?".

Understood. That's what I wanted to clear up - that type inference
systems aim for a single type for any given variable.

> No, it only means that the system won't *infer* type unions just from
> assignment. And that's probably what we want: the whole point is to catch
> type errors.
>
> There are other ways that you can infer a union type. The most obvious is to
> declare it, but MyPy can infer that x is Union[int, str] following:
>
> if isinstance(x, int) or isinstance(x, str):
>     ...
>

Yep, that answers that question. Basically, an inferred type MUST come
from a single logical unit, e.g. a single statement. That makes fine
sense, as long as there is a way to declare union types.
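
For instance, a declared union might look like the sketch below. The describe() function is a made-up example; the annotation uses typing.Union, and the isinstance() test is what lets a checker such as MyPy narrow x to int in the first branch and to str afterwards.

```python
from typing import Union

def describe(x: Union[int, str]) -> str:
    if isinstance(x, int):
        # In this branch x is known to be an int, so bit_length() is safe.
        return f"int with {x.bit_length()} bits"
    # Otherwise x must be a str, so upper() is safe.
    return x.upper()

print(describe(23))
print(describe("five"))
```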

>> Which way is it? Do you get errors, as per your example, and thus are
>> never allowed to have union types? And if so, what happens with
>> compatible types (notably, int and float)?
>
> Who says they're compatible?
>
>
> py> (23).bit_length()
> 5
> py> (23.0).bit_length()
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> AttributeError: 'float' object has no attribute 'bit_length'
>
> Some languages (Erlang?) are so strict that they forbid mixed arithmetic and
> require you to explicitly coerce one type to another. That's not a bad
> idea... consider adding an int n to a float x, can that ever raise?

It can, and that's the thing. You could simply declare that floats and
ints are completely and utterly different beasts:

1.0 != 1
1.0 + 1 # TypeError
x = 0;
x = 1.5; # TypeError

But Python has deemed, and I think a lot of people here would agree,
that there's an abstract concept of a "number", and that float and int
(and complex) are different representations of them - and that a float
is equal to an int if they represent the same number, they can be
summed, etc, etc, etc. I'm pretty sure most people on this list (not
all, but most) agree with the Python 3.0 change that means that the
quotient of the integers 7 and 2 is the float 3.5, not the integer 3.
In terms of a type checker, we end up with something like this:

value = 0
for flag in collection:
    if flag.condition1: value += 0.9
    if flag.condition2: value *= 2

If ints and floats are type-incompatible, the third line would throw
an error, because you'd clearly be assigning a float to an integer
variable. So you have to say "value = 0.0" at the top? Okay. What
about the multiplication? "value *= 2.0"? Do we have to mark every
single value with a decimal to make them compatible with floats? Seems
a lot of unnecessary bother.
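
For comparison, here is how current Python treats that loop, as a runnable sketch. The Flag namedtuple and the sample collection are placeholders invented for the example. The last part shows the case raised above where adding an int to a float really can raise.

```python
from collections import namedtuple

# Hypothetical stand-in for whatever objects the collection holds.
Flag = namedtuple("Flag", "condition1 condition2")
collection = [Flag(True, False), Flag(False, True), Flag(True, True)]

value = 0                      # starts life as an int
for flag in collection:
    if flag.condition1:
        value += 0.9           # silently becomes a float here
    if flag.condition2:
        value *= 2             # int literal mixes with float without complaint

print(type(value).__name__, value)
print(1.0 == 1, 7 / 2)

# Mixed arithmetic is not always safe: a huge int cannot be converted to float.
try:
    10**400 + 1.0
except OverflowError as exc:
    overflow_message = str(exc)
    print("OverflowError:", overflow_message)
```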

The one thing I really REALLY do not want to see is some kind of
internal magic that makes ints and floats compatible, without
extending that offer to user-defined types. Complete incompatibility
would work, but probably not in Python.

ChrisA