What is a type error?
Joachim Durchholz
jo at durchholz.org
Wed Jul 12 13:52:43 EDT 2006
Marshall schrieb:
> I can see the lack of a formal model being an issue, but is the
> imperative bit really all that much of an obstacle? How hard
> is it really to deal with assignment? Or does the issue have
> more to do with pointers, aliasing, etc.?
Actually aliasing is *the* hard issue.
Just one data point: C compiler designers spend substantial effort to
determine which data structures cannot have aliasing to be able to apply
optimizations.
This can bite even with runtime assertion checking.
For example, ordinarily one would think that if there's only a fixed set
of routines that may modify some data structure A, checking the
invariants of that structure need only be done at the exit of these
routines.
Now assume that A has a reference to structure B, and that the
invariants involve B in some way. (E.g. A.count = A->B.count.)
There's still no problem with that - until you have a routine that
exports a reference to B, which gets accessed from elsewhere, say via C.
Now if I do
C->B.count = 27
I will most likely break A's invariant. If that assignment is in some
code that's far away from the code for A, debugging can become
exceedingly hard: the inconsistency (i.e. A violating its invariant) can
live for the entire runtime of the program.
So that innocent optimization "don't check all the program's invariants
after every assignment" immediately breaks with assignment (unless you
do some aliasing analysis).
In an OO language, it's even more difficult to establish that some data
structure cannot be aliased: even if the code for A doesn't hand out a
reference to B, there's no guarantee that some subclass of A doesn't.
I.e. the aliasing analysis has to be repeated whenever there's a new
subclass, which may happen at link time when you'd ordinarily don't want
to do any serious semantic analysis anymore.
There's another way around getting destroyed invariants reported at the
time the breakage occurs: lock any data field that goes into an
invariant (or a postcondition, too). The rationale behind this is that
from the perspective of assertions, an alias walks, smells and quacks
just like concurrent access, so locking would be the appropriate answer.
The problem here is that this means that updates can induce *very*
expensive machinery - imagine an invariant that says "A->balance is the
sum of all the transaction->amount fields in the transaction list of A",
it would cause all these ->amount fields to be locked as soon as a
transaction list is hooked up with the amount. OTOH one could argue that
such a locking simply makes cost in terms of debugging time visible as
runtime overhead, which isn't entirely a Bad Thing.
There are all other kinds of things that break in the presence of
aliasing; this is just the one that I happen to know best :-)
Without mutation, such breakage cannot occur. Invariants are just the
common postconditions of all the routines that may construct a structure
of a given type.
Regards,
Jo
More information about the Python-list
mailing list