[Types-sig] Type Inference I

Sat, 18 Dec 1999 10:59:24 +1100

Greg Stein wrote:

> On Sat, 18 Dec 1999, skaller wrote:
> > ... long post about exceptions and semantic definition ...
> 
> Sorry John, call me dense, but I really don't see what you're talking
> about. :-(

	It takes a while to understand the impact of conformance
and specifications on semantics .. and that this is not just a
matter of language lawyering, but a real, pragmatic, issue.

> I don't see a problem with exceptions. That is part of Python. I don't see
> that it causes any problems with type inference, either (it just
> introduces interesting items into the control/data flow graph).

	The problem arises roughly as follows:
type inference works by examining an expression like:

	x + 1

and _deducing_ that x MUST (**) be an integer. It cannot
be a file, because it isn't allowed to add a file to an integer.
But in Python you CAN add a file to an integer. It is perfectly
legal, it just throws an exception.

Do you see? This means we cannot deduce ANYTHING about 'x'
in the example snippet given above.

Of course, the _expression_ x+1 can only be an integer,
we _can_ deduce that. But that isn't enough. Python
is too dynamic. We need more constraints to be able
to do effective inference.

(**) This example ignores class instances with __add__ methods,
to make the argument easier to follow.

> This whole tangent about feeding an email to Python and claiming it is a
> valid Python program with defined semantics (raise SyntaxError). I
> understand your explanation, but I totally miss the point. So what?

	See above. We cannot infer anything, unless there are rules.
That is, there MUST be set of permitted signatures for
functions/operators,
in order to do inference at all.

	It is possible to do synthetic (bottom up) type analysis, such as:

	x = 1 + 1

Here, we know that Int + Int -> Int, and so x (at least at this
point in the program) must be an Int. But that is only
the 'deductive' part of inference, the 'inferential' part
infers the types of _arguments_ from the set of allowable
signatures of functions. That is, we must do  the inference
top down, not just bottom up.

> Type inferencing for the "1 + file" case is easy. You know the two types,
> and you know they can't be added. Bam. Error.

	but you're wrong, the result of applying the addition
operation is, in fact, well defined: it is NOT an error in the program,
it just throws an exception, rather than returning a value.
If you throw an exception deliberately, that is hardly an error, is it?

> It was a long email, but what exactly were you trying to say? "Define the
> semantics" isn't very clear. I feel Python has very clear semantics. What
> exactly is wrong with them?

	There is no distinction made between 'incorrect' code,
and 'correct' code for which an exception is thrown.

	In compiled code, we need the distinction, because there
is a lot of overhead in doing the dynamic type checking required
to throw the exception. The whole point of compilation is to
eliminate the overhead of run time type checking.

	This can be done if we know what the type of an
expression MUST be, but the current semantics don't
allow this in enough cases. Let me invent a temporary syntax, in which
there is no exception throwing, instead, the value
returned by an expression is either 'something nice',
or 'exceptional'. This IS what happens now in C code,
where NULL represents 'exceptional', right?

	Now, the type of 

	x + 1

can be 'nice or exceptional', which tells us nothing
about the type of x. But if the type is 'nice',
we know that x must be an integer. And then,
the compiler can generate code that doesn't bother
to check the result of calling PyAdd(x, &One),
because it cannot be NULL. We know PyAdd will 
return an actual object.  I'm sure YOU have done
that yourself, writing C extensions.

[In fact, we can do better, we can peek into the Int 
data structure and add 1 to the result, and rewrap it as a PyObject;
that is, we can INLINE the PyAdd function, and throw
out the cases that cannot occur -- with any luck, within
a larger code fragment, we can get rid of PyAdd altogether,
and just use 'x++' -- a single machine instruction.]

Hope this makes sense: to compile python code effectively,
we need to add some reasonable 'static-y' restrictions.
Where, 'reasonable' means 'suitably pythonic', but not
quite as dynamic as the current CPython 1.5.2 implementation
allows.

-- 
John Skaller, mailto:skaller@maxtal.com.au
10/1 Toxteth Rd Glebe NSW 2037 Australia
homepage: http://www.maxtal.com.au/~skaller
voice: 61-2-9660-0850