What is Expressiveness in a Computer Language

Thu Jun 22 18:08:14 EDT 2006

Marshall wrote:
> Can you be more explicit about what "latent types" means?
> I'm sorry to say it's not at all natural or intuitive to me.
> Are you referring to the types in the programmers head,
> or the ones at runtime, or what?

Sorry, that was a huge omission.  (What I get for posting at 3:30am.)

The short answer is that I'm most directly referring to "the types in 
the programmer's head".  A more complete informal summary is as follows:

Languages with latent type systems typically don't include type 
declarations in the source code of programs.  The "static" type scheme 
of a given program in such a language is thus latent, in the English 
dictionary sense of the word, of something that is present but 
undeveloped.  Terms in the program may be considered as having static 
types, and it is possible to infer those types, but it isn't necessarily 
easy to do so automatically, and there are usually many possible static 
type schemes that can be assigned to a given program.

Programmers infer and reason about these latent types while they're 
writing or reading programs.  Latent types become manifest when a 
programmer reasons about them, or documents them e.g. in comments.

(As has already been noted, this definition may seem at odds with the 
definition given in the Scheme report, R5RS, but I'll address that in a 
separate post.)

There's a close connection between latent types in the sense I've 
described, and the "tagged values" present at runtime.  However, as type 
theorists will tell you, the tags used to tag values at runtime, as e.g. 
a number or a string or a FooBar object, are not the same thing as the 
sort of types which statically-typed languages have.

A simple example of the distinction can be seen in the type of a 
function.  Using Javascript as a lingua franca:

   function timestwo(x) { return x * 2 }

In a statically-typed language, the type of a function like this might 
be something like "number -> number", which tells us three things: that 
timestwo is a function; that it accepts a number argument; and that it 
returns a number result.

But if we ask Javascript what it thinks the type of timestwo is, by 
evaluating "typeof timestwo", it returns "function".  That's because the 
value bound to timestwo has a tag associated with it which says, in 
effect, "this value is a function".

But "function" is not a useful type.  Why not?  Because if all you know 
is that timestwo is a function, then you have no idea what an expression 
like "timestwo(foo)" means.  You couldn't write working programs, or 
read them, if all you knew about functions was that they were functions. 
  As a type, "function" is incomplete.

By my definition, though, the latent type of timestwo is "number -> 
number".  Any programmer looking at the function can figure out that 
this is its type, and programmers do exactly that when reasoning about a 
program.

(Aside: technically, you can pass timestwo something other than a 
number, but then you get NaN back, which is usually not much use except 
to generate errors.  I'll ignore that here; latent typing requires being 
less rigourous about some of these issues.)

So, where do tagged values fit into this?  Tags help to check types at 
runtime, but that doesn't mean that there's a 1:1 correspondence between 
tags and types.  For example, when an expression such as "timestwo(5) * 
3" is evaluated, three checks occur that are relevant to the type of 
timestwo:

1. Before the function call takes place, a check ensures that timestwo 
is a function.

2. Before the multiplication in "x * 2", a check ensures that x is a number.

3. When timestwo returns, before the subsequent multiplication by 3, a 
check ensures that the return type of timestwo is a number.

These three checks correspond to the three pieces of information 
contained in the function type signature "number -> number".

However, these dynamic checks still don't actually tell us the type of a 
function.  All they do is check that in a particular case, the values 
involved are compatible with the type of the function.  In many cases, 
the checks may infer a signature that's either more or less specific 
than the function's type, or they may infer an incomplete signature -- 
e.g., the return type doesn't need to be checked when evaluating "arr[i] 
= timestwo(5)".

I used a function just as an example.  There are many other cases where 
a value's tag doesn't match the static (or latent) type of the terms 
through which it flows.  A simple example is an expression such as:

     (flag ? 5 : "foo")

Here, the latent type of this expression could be described as "number | 
string".  There won't be a runtime tag anywhere which represents that 
type, though, since the language implementation never deals with the 
actual type of expressions, except in those degenerate cases where the 
type is so simple that it happens to be a 1:1 match to the corresponding 
tag.  It's only the programmer that "knows" that this expression has 
that type.

Anton