What is a type error?

Fri Jul 14 17:43:25 EDT 2006

Marshall wrote:
> Joachim Durchholz wrote:
>> Marshall wrote:
>>> What about my example of SQL? Mutation, no pointers, no aliasing.
>>> Yet: useful.
>> Sorry, but SQL does have aliasing.
> 
> Well. I suppose we do not have an agreed upon definition
> of aliasing, so it is hard to evaluate either way. I would
> propose using the same one you used for identity:
> 
> if there are two variables and modifying one also modifies
> the other, then there is aliasing between them.

I think that's an adequate example.
For a definition, I'd say it's a bit too narrow unless we use a fairly 
broad definition for "variable". I.e. in a C context, I'd say that a->b 
is a variable, too, as would be foo(blah)->x.

> I avoided mentioning equality to include, for example,
> having an array i that is an alias to a subset of array j.

This would mean that there's aliasing between, say, a list of 
transaction records and the balance of the account (since modifying the 
list of transactions will change the balance, unless the object isn't 
properly encapsulated).
For purposes of this discussion, it's probably fair to say "that's a 
form of aliasing, too", even though it's quite indirect.
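
A minimal sketch of that indirect case, in Python. The Account class and
its fields are hypothetical names chosen for illustration; the balance is
assumed to be derived from the transaction list rather than stored
separately.

```python
class Account:
    """Hypothetical account whose balance is derived from its transactions."""

    def __init__(self):
        self.transactions = []  # exposed directly, i.e. weakly encapsulated

    @property
    def balance(self):
        # The balance has no storage of its own; it depends on the list.
        return sum(self.transactions)


account = Account()
account.transactions.append(100)

ledger = account.transactions  # a second name for the very same list
ledger.append(-30)             # modifying "ledger" ...

print(account.balance)         # ... also changes what "balance" reports
```

Nothing here mentions the balance when the list is modified, which is
what makes this kind of aliasing indirect.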

>> E.g. if you have records that have name="John", surname="Doe", the
>> statements
>>    SELECT * FROM persons WHERE name = "John"
>> and
>>    SELECT * FROM persons WHERE name = "Doe"
>> are aliases of each other.

Arrrrrgh... I made a most braindamaged, stupid mistake here. The second 
SELECT should have used the *surname* field, so it would be

   SELECT * FROM persons WHERE surname = "Doe"

Then, if there's a record that has name = "John", surname = "Doe", the 
two WHERE clauses have aliasing in the form of overlapping result sets.
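
The overlap can be made visible with a small sketch using Python's
sqlite3 module and an in-memory database. The id column and the sample
rows are assumptions added for illustration; only the persons table and
its name and surname fields come from the example above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE persons (id INTEGER PRIMARY KEY, name TEXT, surname TEXT)")
conn.executemany(
    "INSERT INTO persons (id, name, surname) VALUES (?, ?, ?)",
    [(1, "John", "Doe"), (2, "John", "Smith"), (3, "Jane", "Doe")],
)

# Two different WHERE clauses, each selecting its own set of records.
johns = {row[0] for row in conn.execute("SELECT id FROM persons WHERE name = 'John'")}
does = {row[0] for row in conn.execute("SELECT id FROM persons WHERE surname = 'Doe'")}

# Records reachable through both clauses are where the aliasing lives.
overlap = johns & does
print(overlap)
```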

>> The alias is actually in the WHERE clause.
> 
> Not by my definition, because there is only one variable here.

Sorry, my wording was sloppy.

I meant to say that 'in SQL, you identify records via clauses like WHERE 
name = "John", i.e. WHERE clauses are a kind of identity'.

This is still not precise - the identity of an SQL record isn't 
explicitly accessible (except through the nonstandard OID facility that 
some SQL engines offer). Nevertheless, records do have an identity, and 
there's a possibility of aliasing - if you change all Johns, you may 
also change a Doe.

>> And this *can* get you into
>> trouble if you have something that does
>>    UPDATE ... WHERE name = "John"
>> and
>>    UPDATE ... WHERE surname = "Doe"
>> e.g. doing something with the Johns, then updating the names of all
>> Does, and finally restoring the Johns (but not recognizing that changing
>> the names of all Does might have changed your set of Johns).
> 
> The fact that some person might get confused about the
> semantics of what they are doing does not indicate aliasing.
> It is easy enough to do an analysis of your updates and
> understand what will happen; this same analysis is impossible
> with two arbitrary pointers, unless one has a whole-program
> trace. That strikes me as a significant difference.

Sure. I said that aliases in SQL aren't as bad as in other programs.
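
For reference, the John/Doe update scenario can be sketched as follows,
again with Python's sqlite3 module. The id and flag columns, the sample
rows, and the new name 'Jim' are all assumptions made up for this
illustration; "doing something with the Johns" is modelled as setting a
flag that is supposed to be cleared again afterwards.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE persons (id INTEGER PRIMARY KEY, name TEXT, surname TEXT, "
    "flag INTEGER DEFAULT 0)"
)
conn.executemany(
    "INSERT INTO persons (id, name, surname) VALUES (?, ?, ?)",
    [(1, "John", "Doe"), (2, "John", "Smith"), (3, "Jane", "Doe")],
)

# Step 1: do something with the Johns (here: mark them).
conn.execute("UPDATE persons SET flag = 1 WHERE name = 'John'")

# Step 2: rename all Does. This silently changes the set of Johns.
conn.execute("UPDATE persons SET name = 'Jim' WHERE surname = 'Doe'")

# Step 3: "restore" the Johns, using the same WHERE clause as in step 1.
conn.execute("UPDATE persons SET flag = 0 WHERE name = 'John'")

# The former John Doe is no longer matched by step 3 and stays marked.
stale = conn.execute(
    "SELECT id, name, surname FROM persons WHERE flag = 1"
).fetchall()
print(stale)
```

Each statement is easy to analyse on its own; the leftover flag only
shows up when steps 1 to 3 are read together.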

Once you get abstraction mixed in, the analysis becomes less 
straightforward. In the case of SQL, views are such an abstraction 
facility, and they indeed can obscure what you're doing and make the 
analysis more difficult. If it's just SQL we're talking about, you do 
have to look at all of the SQL to check whether there's a view that may 
be involved in the queries you're analysing, so the situation isn't 
*that* different from pointers - it's just not a problem because the 
amount of code is so tiny!
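
A small sketch of how a view can hide the overlap, using Python's sqlite3
module. The view name johns, the id column, the sample rows, and the new
name 'Jim' are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE persons (id INTEGER PRIMARY KEY, name TEXT, surname TEXT)")
conn.executemany(
    "INSERT INTO persons (id, name, surname) VALUES (?, ?, ?)",
    [(1, "John", "Doe"), (2, "John", "Smith"), (3, "Jane", "Doe")],
)

# Hypothetical view defined elsewhere, possibly behind an abstraction barrier.
conn.execute("CREATE VIEW johns AS SELECT id, name, surname FROM persons WHERE name = 'John'")

before = conn.execute("SELECT id FROM johns ORDER BY id").fetchall()

# This statement mentions neither the view nor the value 'John' ...
conn.execute("UPDATE persons SET name = 'Jim' WHERE surname = 'Doe'")

# ... yet the contents of the view have changed.
after = conn.execute("SELECT id FROM johns ORDER BY id").fetchall()
print(before, after)
```

Someone who only sees the query against johns and the UPDATE on the Does
cannot tell from the names alone that the two interact; that requires
reading the view definition.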

>> Conceptually, this is just the same as having two different access paths
>> to the same memory cell. Or accessing the same global variable through a
>> call-by-reference parameter and via its global name.
> 
> There are similarities, but they are not the same.

What are the relevant differences? How does the semantics of a WHERE 
clause differ from that of a pointer, in terms of potential aliasing?

My line of argument would be this:
Pointers can be simulated using arrays, so it's no surprise that arrays 
can emulate all the aliasing of pointers.
Arrays can be simulated using WHERE (with SELECT and UPDATE), so I'd say 
that SQL can emulate all the aliasing of arrays, and hence that of pointers.
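
The second step of that argument can be sketched in Python with the
sqlite3 module: a table keyed by index plays the role of an array, and
index values play the role of pointers. The cells table and the load and
store helpers are hypothetical names invented for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# An "array" of four cells: idx is the index, val the stored value.
conn.execute("CREATE TABLE cells (idx INTEGER PRIMARY KEY, val INTEGER)")
conn.executemany("INSERT INTO cells (idx, val) VALUES (?, ?)", [(i, 0) for i in range(4)])


def load(i):
    """Array read, expressed as SELECT ... WHERE."""
    return conn.execute("SELECT val FROM cells WHERE idx = ?", (i,)).fetchone()[0]


def store(i, v):
    """Array write, expressed as UPDATE ... WHERE."""
    conn.execute("UPDATE cells SET val = ? WHERE idx = ?", (v, i))


# Indices act as pointers; p and q happen to "point" to the same cell.
p = 2
q = 2
r = 3

store(p, 42)
print(load(q), load(r))  # the write through p is visible through q, not r
```

Whether p and q alias cannot be seen from the statements inside load and
store; it depends on the run-time values of the indices, just as with
pointers.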

>> BTW with views, you get not just aliasing but what makes aliasing really
>> dangerous. Without views, you can simply survey all the queries that you
>> are working with and lexically compare table and field names to see
>> whether there's aliasing. With views, the names that you see in a
>> lexical scope are not enough to determine aliasing.
>> E.g. if you use a view that builds upon the set of Johns but aren't
>> aware of that (possibly due to abstraction barriers), and you change the
>> name field of all Does, then you're altering the view without a chance
>> to locally catch the bug. That's just as bad as with any pointer
>> aliasing problem.
> 
> It is certainly aliasing, but I would not call it "just as bad." There
> are elaborate declarative constraint mechanisms in place, for example.

Yes, but they can give you unexpected results.
It smells a bit like closing the gates after the horses are out.

On the other hand, *if* there is identity and updates, no matter how we 
handle them, there must be *some* way to deal with the problems that 
ensue. Having declarative constraints doesn't sound like the worst way 
to do that.

> And the definition of the view itself is a declarative, semantic
> entity. Whereas a pointer is an opaque, semantics-free address.

Oh, that's a very BCPL view. It was still somewhat true for K&R C, but 
it's not really applicable to ANSI C, not at all to C++, and even less 
to languages that associate well-encapsulated types with pointers (such 
as Ada, Eiffel, or Java).
Unless, of course, you mean "address" when you say "pointer" ;-)

Regards,
Jo