What is Expressiveness in a Computer Language
Anton van Straaten
anton at appsolutions.com
Thu Jun 22 18:08:14 EDT 2006
Marshall wrote:
> Can you be more explicit about what "latent types" means?
> I'm sorry to say it's not at all natural or intuitive to me.
> Are you referring to the types in the programmers head,
> or the ones at runtime, or what?
Sorry, that was a huge omission. (What I get for posting at 3:30am.)
The short answer is that I'm most directly referring to "the types in
the programmer's head". A more complete informal summary is as follows:
Languages with latent type systems typically don't include type
declarations in the source code of programs. The "static" type scheme
of a given program in such a language is thus latent, in the English
dictionary sense of the word, of something that is present but
undeveloped. Terms in the program may be considered as having static
types, and it is possible to infer those types, but it isn't necessarily
easy to do so automatically, and there are usually many possible static
type schemes that can be assigned to a given program.
Programmers infer and reason about these latent types while they're
writing or reading programs. Latent types become manifest when a
programmer reasons about them, or documents them e.g. in comments.
(As has already been noted, this definition may seem at odds with the
definition given in the Scheme report, R5RS, but I'll address that in a
separate post.)
There's a close connection between latent types in the sense I've
described, and the "tagged values" present at runtime. However, as type
theorists will tell you, the tags used to tag values at runtime, as e.g.
a number or a string or a FooBar object, are not the same thing as the
sort of types which statically-typed languages have.
A simple example of the distinction can be seen in the type of a
function. Using Javascript as a lingua franca:
function timestwo(x) { return x * 2 }
In a statically-typed language, the type of a function like this might
be something like "number -> number", which tells us three things: that
timestwo is a function; that it accepts a number argument; and that it
returns a number result.
But if we ask Javascript what it thinks the type of timestwo is, by
evaluating "typeof timestwo", it returns "function". That's because the
value bound to timestwo has a tag associated with it which says, in
effect, "this value is a function".
But "function" is not a useful type. Why not? Because if all you know
is that timestwo is a function, then you have no idea what an expression
like "timestwo(foo)" means. You couldn't write working programs, or
read them, if all you knew about functions was that they were functions.
As a type, "function" is incomplete.
By my definition, though, the latent type of timestwo is "number ->
number". Any programmer looking at the function can figure out that
this is its type, and programmers do exactly that when reasoning about a
program.
(Aside: technically, you can pass timestwo something other than a
number, but then you get NaN back, which is usually not much use except
to generate errors. I'll ignore that here; latent typing requires being
less rigourous about some of these issues.)
So, where do tagged values fit into this? Tags help to check types at
runtime, but that doesn't mean that there's a 1:1 correspondence between
tags and types. For example, when an expression such as "timestwo(5) *
3" is evaluated, three checks occur that are relevant to the type of
timestwo:
1. Before the function call takes place, a check ensures that timestwo
is a function.
2. Before the multiplication in "x * 2", a check ensures that x is a number.
3. When timestwo returns, before the subsequent multiplication by 3, a
check ensures that the return type of timestwo is a number.
These three checks correspond to the three pieces of information
contained in the function type signature "number -> number".
However, these dynamic checks still don't actually tell us the type of a
function. All they do is check that in a particular case, the values
involved are compatible with the type of the function. In many cases,
the checks may infer a signature that's either more or less specific
than the function's type, or they may infer an incomplete signature --
e.g., the return type doesn't need to be checked when evaluating "arr[i]
= timestwo(5)".
I used a function just as an example. There are many other cases where
a value's tag doesn't match the static (or latent) type of the terms
through which it flows. A simple example is an expression such as:
(flag ? 5 : "foo")
Here, the latent type of this expression could be described as "number |
string". There won't be a runtime tag anywhere which represents that
type, though, since the language implementation never deals with the
actual type of expressions, except in those degenerate cases where the
type is so simple that it happens to be a 1:1 match to the corresponding
tag. It's only the programmer that "knows" that this expression has
that type.
Anton
More information about the Python-list
mailing list