What is a type error?

Wed Jul 12 13:52:43 EDT 2006

Marshall schrieb:
> I can see the lack of a formal model being an issue, but is the
> imperative bit really all that much of an obstacle? How hard
> is it really to deal with assignment? Or does the issue have
> more to do with pointers, aliasing, etc.?

Actually aliasing is *the* hard issue.
Just one data point: C compiler designers spend substantial effort to 
determine which data structures cannot have aliasing to be able to apply 
optimizations.

This can bite even with runtime assertion checking.
For example, ordinarily one would think that if there's only a fixed set 
of routines that may modify some data structure A, checking the 
invariants of that structure need only be done at the exit of these 
routines.
Now assume that A has a reference to structure B, and that the 
invariants involve B in some way. (E.g. A.count = A->B.count.)
There's still no problem with that - until you have a routine that 
exports a reference to B, which gets accessed from elsewhere, say via C.
Now if I do
   C->B.count = 27
I will most likely break A's invariant. If that assignment is in some 
code that's far away from the code for A, debugging can become 
exceedingly hard: the inconsistency (i.e. A violating its invariant) can 
live for the entire runtime of the program.

So that innocent optimization "don't check all the program's invariants 
after every assignment" immediately breaks with assignment (unless you 
do some aliasing analysis).

In an OO language, it's even more difficult to establish that some data 
structure cannot be aliased: even if the code for A doesn't hand out a 
reference to B, there's no guarantee that some subclass of A doesn't. 
I.e. the aliasing analysis has to be repeated whenever there's a new 
subclass, which may happen at link time when you'd ordinarily don't want 
to do any serious semantic analysis anymore.

There's another way around getting destroyed invariants reported at the 
time the breakage occurs: lock any data field that goes into an 
invariant (or a postcondition, too). The rationale behind this is that 
from the perspective of assertions, an alias walks, smells and quacks 
just like concurrent access, so locking would be the appropriate answer. 
The problem here is that this means that updates can induce *very* 
expensive machinery - imagine an invariant that says "A->balance is the 
sum of all the transaction->amount fields in the transaction list of A", 
it would cause all these ->amount fields to be locked as soon as a 
transaction list is hooked up with the amount. OTOH one could argue that 
such a locking simply makes cost in terms of debugging time visible as 
runtime overhead, which isn't entirely a Bad Thing.

There are all other kinds of things that break in the presence of 
aliasing; this is just the one that I happen to know best :-)

Without mutation, such breakage cannot occur. Invariants are just the 
common postconditions of all the routines that may construct a structure 
of a given type.

Regards,
Jo