What is a type error?

Mon Jul 17 04:17:13 EDT 2006

Marshall wrote:
> Joachim Durchholz wrote:
>> Marshall wrote:
>>> Good point. Perhaps I should have said "relational algebra +
>>> variables with assignment." It is interesting to consider
>>> assignment vs. the more restricted update operators: insert,
>>> update, delete.
>> Actually I see it the other way round: assignment is strictly less
>> powerful than DML since it doesn't allow creating or destroying
>> variables, while UPDATE does cover assignment to fields.
> 
> Oh, my.
> 
> Well, for all table variables T, there exists some pair of
> values v and v', such that we can transition the value of
> T from v to v' via assignment, but not by any single
> insert, update or delete.

I fail to see an example that would support such a claim.

On the other hand, UPDATE can assign any value to any field of any
record, so it's doing exactly what an assignment does. INSERT and DELETE
create and destroy records, respectively, which is what new and delete
operators would do.

I must really be missing the point.

> Further, it is my understanding
> that your claim of row identity *depends* on the restricted
> nature of DML; if the only mutator operation is assignment,
> then there is definitely no record identity.

Again, I fail to connect.

I and others have given aliasing examples that use just SELECT and UPDATE.
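
A minimal sketch of what such an aliasing case can look like, assuming
Python's sqlite3 module and a made-up table "emp" (it is not one of the
examples posted earlier in the thread). Two different WHERE clauses
reach the same record, and an UPDATE through one is visible through the
other:

```python
import sqlite3

def aliasing_demo():
    # Hypothetical table and data, for illustration only.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE emp (id INTEGER PRIMARY KEY, name TEXT, salary INTEGER)"
    )
    conn.execute("INSERT INTO emp VALUES (1, 'Smith', 1000)")
    conn.execute("INSERT INTO emp VALUES (2, 'Jones', 2000)")

    # Two different "path expressions" that happen to reach the same record.
    by_id = "SELECT salary FROM emp WHERE id = 1"
    by_name = "SELECT salary FROM emp WHERE name = 'Smith'"

    before = (conn.execute(by_id).fetchone()[0],
              conn.execute(by_name).fetchone()[0])

    # Update through one path ...
    conn.execute("UPDATE emp SET salary = salary + 500 WHERE id = 1")

    # ... and read through both: the change shows up through the other path too.
    after = (conn.execute(by_id).fetchone()[0],
             conn.execute(by_name).fetchone()[0])
    conn.close()
    return before, after

before, after = aliasing_demo()
print(before, after)  # (1000, 1000) (1500, 1500)
```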

>> (However, it's usually new+assignment+delete vs. INSERT+UPDATE+DELETE,
>> at which point there is not much of a difference.)
> 
> I am not sure what this means. Delete can be expressed in
> terms of assignment. (As can insert and update.)

INSERT cannot be expressed in terms of assignment. INSERT creates a new 
record; there's no way that assignment in a language like C can create a 
new data structure!
The same goes for DELETE.

> (Assignment can also be expressed in terms of insert and delete.)

Agreed.

I also realize that this makes it a bit more difficult to nail down the 
nature of identity in a database. It's certainly not storage location: 
if you DELETE a record and then INSERT it with the same values, it may 
be allocated somewhere else entirely, and our intuition would say it's
not "the same" (i.e. not identical). (In a system with OID, it would 
even be impossible to recreate such a record, since it would have a 
different OID. I'm not sure whether this makes OID systems better or 
worse at preserving identity, but that's just a side track.)

Since intuition gives me ambivalent results here, I'll go back to my 
semiformal definition (and take the opportunity to make it a bit more 
precise):
Two path expressions (lvalues, ...) are aliases if and only if the 
referred-to values compare equal, and if they stay equal after applying 
any operation to the referred-to value through either of the path 
expressions.
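
To make the definition concrete outside SQL, here is a rough sketch in
Python; the names "original", "alias" and "clone" are made up for the
illustration. Note that it applies only one operation, so it can show
that two paths are not aliases, but it cannot prove the "any operation"
part of the definition:

```python
import copy

original = [1, 2, 3]
alias = original              # a second path to the same list
clone = copy.copy(original)   # an equal value stored in a separate list

# Both pairs compare equal to begin with.
equal_before = (original == alias, original == clone)

# Apply an operation through one of the paths.
original.append(4)

# Only the true alias stays equal; the copy has diverged.
equal_after = (original == alias, original == clone)

print(equal_before, equal_after)  # (True, True) (True, False)
```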

In the context of SQL, this means that identity isn't the location where 
the data is stored. It's also not the values stored in the record - 
these may change, including key data. SQL record identity is local: it
can be defined from one operation to the next, but there is no such
thing as a global identity that one can memorize and look up years 
later, without looking at the intermediate states of the store.
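
A small sketch of the "key data may change" case, again assuming
Python's sqlite3 module and a hypothetical table "person". After the
UPDATE, the old key no longer finds the record, although across that
single operation we would still call it the same record:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: the key column itself is ordinary, updatable data.
conn.execute("CREATE TABLE person (ssn TEXT PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO person VALUES ('111', 'Ann')")

# Change the key of the existing record.
conn.execute("UPDATE person SET ssn = '222' WHERE ssn = '111'")

# A key memorized before the UPDATE is useless afterwards ...
found_by_old_key = conn.execute(
    "SELECT name FROM person WHERE ssn = '111'").fetchone()
# ... while the new key reaches the record that the UPDATE carried over.
found_by_new_key = conn.execute(
    "SELECT name FROM person WHERE ssn = '222'").fetchone()
conn.close()

print(found_by_old_key, found_by_new_key)  # None ('Ann',)
```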

It's a gross concept, now that I think about it. Well, or at least 
rather alien for us programmers, who are used to taking the address of a 
variable to get a tangible identity that will stay stable over time.

On the other hand, variable addresses as tangible identities don't hold 
much water anyway.
Imagine data that's written out to disk at program end, and read back 
in. Further imagine that while the data is read into main memory, 
there's a mechanism that redirects all further reads and writes to the 
file into the read-in copy in memory, i.e. whenever any program changes 
the data, all other programs see the change, too.
Alternatively, think about software agents that move from machine to 
machine, carrying their data with them. They might be communicating with 
each other, so they need some means of identifying ("addressing") the
memory buffers that they use for communication.

> I don't know what "new" would be in a value-semantics, relational
> world.

It would be INSERT.

Um, my idea of "value semantics" is associated with immutable values. 
SQL with INSERT/DELETE/UPDATE certainly doesn't match that definition.

So by my definition, SQL doesn't have value semantics; by your
definition, it would have value semantics, but with updates, which are
enough to create aliasing problems. So I'm not sure what point you're
making here...

>> Filters are just like array indexing: both select a subset of variables
>> from a collection.
> 
> I can't agree with this wording. A filter produces a collection
> value from a collection value. I don't see how variables
> enter in to it.

A collection can consist of values or variables.

And yes, I do think that WHERE is a selection over a bunch of variables 
- you can update records after all, so they are variables! They don't 
have a name, at least none which is guaranteed to be constant over their 
lifetime, but they can be mutated!

> One can filter either a collection constant or
> a collection variable; if one speaks of filtering a collection
> variable, one is really speaking of filtering the collection value
> that the variable currently contains; filtering is not an operation
> on the variable as such, the way the "address of" operator is.
> Note you can't update the result of a filter.

If that's your definition of a filter, then WHERE is not a filter, 
simple as that.

>> In SQL, you select a subset of a table, in a
>> programming language, you select a subset of an array.
>>
>> (The SQL selection mechanism is far more flexible in the kinds of
>> filtering you can apply, while array indexing allows filtering just by
>> ordinal position. However, the relevant point is that both select things
>> that can be updated.)
> 
> When you have been saying "select things that can be updated"
> I have been assuming you meant that one can derive values
> from variables, and that some other operation can update that
> variable, causing the expression, if re-evaluated, to produce
> a different value.

That's what I meant.

> However the phrase also suggests that
> you mean that the *result* of the select can *itself* be
> updated.

The "that" in "things that can be updated" refers to the selected 
things. I'm not sure how this "that" could be interpreted to refer to 
the selection as a whole (is my understanding of English really that bad?)

> Which one do you mean? (Or is there a third
> possibility?)

I couldn't tell - I wouldn't have thought that there are even two 
possibilities.

Regards,
Jo