What is a type error?
Darren New
dnew at san.rr.com
Fri Jul 14 13:21:43 EDT 2006
Andreas Rossberg wrote:
> OK, this is interesting. I don't know Hermes, is this sort of like a
> dynamically checked equivalent of linear or uniqueness typing?
I'm not sure what linear or uniqueness typing is. It's typestate, and if
I remember correctly the papers I read 10 years ago, the folks at
TJWatson that invented Hermes also invented the concept of typestate.
They at least claim to have coined the term.
It's essentially a dataflow analysis that lets the compiler enforce the
same sort of rule as "don't read variables that may not yet have been
assigned to", except that you can also declare that a variable goes back
to the "uninitialized" state after it has already been initialized.
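To make that concrete, here is a rough sketch of that kind of analysis in
Python. It is not Hermes syntax or the Hermes algorithm: the tuple-based
mini-language, the statement names ("init", "use", "consume", "if") and the
names check and TypestateError are all made up for illustration. The only
idea it is meant to show is that a variable's state can go from
initialized back to uninitialized, and that branches are merged
conservatively.

```python
class TypestateError(Exception):
    """Raised when a variable is used where it may be uninitialized."""


def check(block, initialized=frozenset()):
    """Return the variables definitely initialized after `block`.

    Hypothetical statement forms (not Hermes syntax):
      ("init", x)     x becomes initialized
      ("use", x)      x must be initialized here
      ("consume", x)  x must be initialized; afterwards it is uninitialized
      ("if", a, b)    analyze both branches, keep only what both initialize
    """
    state = set(initialized)
    for stmt in block:
        op = stmt[0]
        if op == "init":
            state.add(stmt[1])
        elif op in ("use", "consume"):
            var = stmt[1]
            if var not in state:
                raise TypestateError(f"{var!r} may be uninitialized at {op!r}")
            if op == "consume":
                # The variable returns to the uninitialized state.
                state.discard(var)
        elif op == "if":
            _, then_block, else_block = stmt
            then_state = check(then_block, state)
            else_state = check(else_block, state)
            # A variable counts as initialized only if both paths agree.
            state = set(then_state & else_state)
        else:
            raise ValueError(f"unknown statement: {stmt!r}")
    return frozenset(state)
```

With this sketch, a program that initializes "msg", consumes it and then
uses it again is rejected, and so is one that initializes a variable in
only one branch of an "if" and uses it afterwards.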
> Mh, but if I understand correctly, this seems to require performing a
> deep copy - which is well-known to be problematic, and particularly
> breaks all kinds of abstractions.
Um, no, because there are no aliases. There's only one name for any
given value, so there are no "deep copy" problems. A deep copy and a
shallow copy are the same thing. And there are types of values you can
assign but not copy, such as callmessages (which are problematic to copy
for the same reason a stack frame would be problematic to copy).
I believe, internally, that there were cases where the copy was "deep"
and cases where it was "shallow", depending on the surrounding code.
Making a copy of a table and passing it to another process had to be a
"deep" copy (given that a column could contain another table, for
example). Making a copy of a table and using it for read-only purposes
in the same process would likely make a shallow copy of the table.
Iterating over a table and making changes during the iteration made a
copy-on-write subtable, then merged it back into the original table when
the loop was done, since the high-level semantic definition of
looping over a table is that you iterate over a copy of the table.
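A small Python sketch of that loop semantics, under the assumption that a
table can be stood in for by a dict: the loop body sees a snapshot, records
its changes on the side, and the changes are merged back when the loop is
done. The function name update_during_iteration and the "pending" dict are
made up here; a real implementation could use copy-on-write instead of an
eager snapshot.

```python
def update_during_iteration(table, visit):
    """Iterate over a snapshot of `table`, then merge recorded changes back.

    `visit(key, row, pending)` is called once per row of the snapshot and
    records inserts or updates in `pending` instead of touching `table`.
    """
    snapshot = dict(table)  # the loop sees the table as it was at loop entry
    pending = {}            # changes made by the loop body
    for key, row in snapshot.items():
        visit(key, row, pending)
    table.update(pending)   # merge back once the loop is done
    return table


def double_and_add(key, row, pending):
    # Update the existing row and add a new one; the new rows are not
    # visited, because the iteration runs over the snapshot.
    pending[key] = row * 2
    pending[key + "_new"] = row


table = {"a": 1, "b": 2}
update_during_iteration(table, double_and_add)
```

Rows added inside the loop never show up in the iteration itself, which
is the point of defining the loop over a copy.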
The only thing close to aliases are references to some other process's
input ports (i.e., multiple client-side sockets connected to a
server-side socket). If you want to share data (such as a file system or
program library), you put the data in a table in a process, and you hand
out client-side connections to the process. Mostly, you'd define such
connections as accepting a data value (for the file contents) with the
parameter being undefined upon return from the call, and the file name
as being read-only, for example. If you wanted to store the file, you
could just pass a pointer to its data (in the implementation). If you
wanted a copy of it, you would either copy it and pass the pointer, or
you'd pass the pointer with a flag indicating it's copy-on-write, or you
could pass the pointer and have the callee copy it at some point before
returning, depending on what the callee did with it. The semantics were
high-level with the intent to allow the compiler lots of leeway in
implementation, not unlike SQL.
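As a rough illustration of "the parameter being undefined upon return from
the call", here is a Python sketch. Everything in it is hypothetical (the
names Var, FileServer, store, fetch and Uninitialized are invented for the
example), and it checks at run time what Hermes checks at compile time. It
only shows the ownership hand-off: storing moves the value out of the
caller's variable, fetching hands back an independent copy.

```python
import copy


class Uninitialized(Exception):
    """Raised when an uninitialized variable is read (a run-time stand-in
    for what a typestate checker would reject at compile time)."""


class Var:
    """A variable that is either initialized (holds a value) or not."""

    _EMPTY = object()

    def __init__(self, value=_EMPTY):
        self._value = value

    def is_initialized(self):
        return self._value is not Var._EMPTY

    def get(self):
        if not self.is_initialized():
            raise Uninitialized("variable is uninitialized")
        return self._value

    def move_out(self):
        """Return the value and leave this variable uninitialized."""
        value = self.get()
        self._value = Var._EMPTY
        return value


class FileServer:
    """Stands in for a process that owns a table of files."""

    def __init__(self):
        self._files = {}

    def store(self, name, contents):
        # `contents` is consumed: the caller's Var is uninitialized on return,
        # so no second name for the data exists and no copy is needed.
        self._files[name] = contents.move_out()

    def fetch(self, name):
        # The caller gets its own value; the stored one stays untouched.
        return copy.deepcopy(self._files[name])


server = FileServer()
data = Var(bytearray(b"hello"))
server.store("greeting.txt", data)
```

After the call to store, data.is_initialized() is false and data.get()
raises, while anything returned by fetch can be modified without affecting
the server's copy.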
> Not to mention the issue with
> uninitialized variables that I would expect occurring all over the place.
The typestate tracks this, and prevents you from using uninitialized
variables. If you do a read (say, from a socket) and it throws an "end
of data" exception, the compiler prevents you from using the buffer you
just tried but failed to read.
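The failed-read case can be sketched the same way. This is an invented
mini-checker, not how the Hermes compiler works: the statement form
("read", buf, on_ok, on_eof) and the names check_reads and TypestateError
are assumptions for the example. The buffer counts as initialized only on
the path where the read succeeded.

```python
class TypestateError(Exception):
    """Raised when a variable is used where it may be uninitialized."""


def check_reads(block, initialized=frozenset()):
    """Return the variables definitely initialized after `block`.

    Hypothetical statement forms:
      ("use", x)                    x must be initialized here
      ("read", buf, on_ok, on_eof)  `on_ok` runs with buf initialized,
                                    `on_eof` (the exception handler) runs
                                    with buf uninitialized
    """
    state = set(initialized)
    for stmt in block:
        op = stmt[0]
        if op == "use":
            if stmt[1] not in state:
                raise TypestateError(f"{stmt[1]!r} may be uninitialized")
        elif op == "read":
            _, buf, on_ok, on_eof = stmt
            state.discard(buf)  # assumption: a failed read leaves buf unusable
            ok_state = check_reads(on_ok, state | {buf})
            eof_state = check_reads(on_eof, state)
            # After the statement, keep only what both paths guarantee.
            state = set(ok_state & eof_state)
        else:
            raise ValueError(f"unknown statement: {stmt!r}")
    return frozenset(state)
```

Using the buffer inside the success branch passes the check; using it in
the "end of data" handler, or after the read without knowing which path
was taken, is rejected.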
Indeed, that's the very point of it all. By doing this, you can run
"untrusted" code in the same address space as trusted code, and be
assured that the compiler will prevent the untrusted code from messing
up the trusted code. The predecessor of Hermes (NIL) was designed to let
IBM's customers write efficient networking code and emulations and such
that ran in IBM's routers, without the need for expensive (in
performance or money) hardware yet with the safety that they couldn't
screw up IBM's code and hence cause customer service problems.
> So unless I'm misunderstanding something, this feels like trading one
> evil for an even greater one.
In truth, it was pretty annoying. But more because you wound up having
to write extensive declarations and compile the declarations before
compiling the code that implements them and such. That you didn't get to
use uninitialized variables was a relatively minor thing, especially
given that many languages nowadays complain about uninitialized
variables, dead code, etc. But for lots of types of programs, it let you
do all kinds of things with a good assurance that they'd work safely and
efficiently. It was really a language for writing operating systems in,
when you get right down to it.
--
Darren New / San Diego, CA, USA (PST)
This octopus isn't tasty. Too many
tentacles, not enough chops.