What is Expressiveness in a Computer Language

Thu Jun 22 22:35:40 EDT 2006

Anton van Straaten wrote:
> Marshall wrote:
> > Can you be more explicit about what "latent types" means?
> > I'm sorry to say it's not at all natural or intuitive to me.
> > Are you referring to the types in the programmers head,
> > or the ones at runtime, or what?
>
> Sorry, that was a huge omission.  (What I get for posting at 3:30am.)

Thanks for the excellent followup.

> The short answer is that I'm most directly referring to "the types in
> the programmer's head".

In the database theory world, we speak of three levels: conceptual,
logical, physical. In a dbms, these might roughly be compared to
business entities described in a requirements doc, (or in the
business owners' heads), the (logical) schema encoded in the
dbms, and the b-tree indicies the dbms uses for performance.

So when you say "latent types", I think "conceptual types."

The thing about this is, both the Lisp and the Haskell programmers
are using conceptual types (as best I can tell.) And also, the
conceptual/latent types are not actually a property of the
program; they are a property of the programmer's mental
model of the program.

It seems we have languages:
with or without static analysis
with or without runtime type information (RTTI or "tags")
with or without (runtime) safety
with or without explicit type annotations
with or without type inference

Wow. And I don't think that's a complete list, either.

I would be happy to abandon "strong/weak" as terminology
because I can't pin those terms down. (It's not clear what
they would add anyway.)

> A more complete informal summary is as follows:
>
> Languages with latent type systems typically don't include type
> declarations in the source code of programs.  The "static" type scheme
> of a given program in such a language is thus latent, in the English
> dictionary sense of the word, of something that is present but
> undeveloped.  Terms in the program may be considered as having static
> types, and it is possible to infer those types, but it isn't necessarily
> easy to do so automatically, and there are usually many possible static
> type schemes that can be assigned to a given program.
>
> Programmers infer and reason about these latent types while they're
> writing or reading programs.  Latent types become manifest when a
> programmer reasons about them, or documents them e.g. in comments.

Uh, oh, a new term, "manifest." Should I worry about that?

> (As has already been noted, this definition may seem at odds with the
> definition given in the Scheme report, R5RS, but I'll address that in a
> separate post.)
>
> There's a close connection between latent types in the sense I've
> described, and the "tagged values" present at runtime.  However, as type
> theorists will tell you, the tags used to tag values at runtime, as e.g.
> a number or a string or a FooBar object, are not the same thing as the
> sort of types which statically-typed languages have.
>
> A simple example of the distinction can be seen in the type of a
> function.  Using Javascript as a lingua franca:
>
>    function timestwo(x) { return x * 2 }
>
> In a statically-typed language, the type of a function like this might
> be something like "number -> number", which tells us three things: that
> timestwo is a function; that it accepts a number argument; and that it
> returns a number result.
>
> But if we ask Javascript what it thinks the type of timestwo is, by
> evaluating "typeof timestwo", it returns "function".  That's because the
> value bound to timestwo has a tag associated with it which says, in
> effect, "this value is a function".

Well, darn. It strikes me that that's just a decision the language
designers
made, *not* to record complete RTTI. (Is it going to be claimed that
there is an *advantage* to having only incomplete RTTI? It is a
serious question.)

> But "function" is not a useful type.  Why not?  Because if all you know
> is that timestwo is a function, then you have no idea what an expression
> like "timestwo(foo)" means.  You couldn't write working programs, or
> read them, if all you knew about functions was that they were functions.
>   As a type, "function" is incomplete.

Yes, function is a parameterized type, and they've left out the
parameter values.

> By my definition, though, the latent type of timestwo is "number ->
> number".  Any programmer looking at the function can figure out that
> this is its type, and programmers do exactly that when reasoning about a
> program.

Gotcha.

> (Aside: technically, you can pass timestwo something other than a
> number, but then you get NaN back, which is usually not much use except
> to generate errors.  I'll ignore that here; latent typing requires being
> less rigourous about some of these issues.)
>
> So, where do tagged values fit into this?  Tags help to check types at
> runtime, but that doesn't mean that there's a 1:1 correspondence between
> tags and types.  For example, when an expression such as "timestwo(5) *
> 3" is evaluated, three checks occur that are relevant to the type of
> timestwo:
>
> 1. Before the function call takes place, a check ensures that timestwo
> is a function.
>
> 2. Before the multiplication in "x * 2", a check ensures that x is a number.
>
> 3. When timestwo returns, before the subsequent multiplication by 3, a
> check ensures that the return type of timestwo is a number.
>
> These three checks correspond to the three pieces of information
> contained in the function type signature "number -> number".
>
> However, these dynamic checks still don't actually tell us the type of a
> function.  All they do is check that in a particular case, the values
> involved are compatible with the type of the function.  In many cases,
> the checks may infer a signature that's either more or less specific
> than the function's type, or they may infer an incomplete signature --
> e.g., the return type doesn't need to be checked when evaluating "arr[i]
> = timestwo(5)".
>
> I used a function just as an example.  There are many other cases where
> a value's tag doesn't match the static (or latent) type of the terms
> through which it flows.  A simple example is an expression such as:
>
>      (flag ? 5 : "foo")
>
> Here, the latent type of this expression could be described as "number |
> string".  There won't be a runtime tag anywhere which represents that
> type, though, since the language implementation never deals with the
> actual type of expressions, except in those degenerate cases where the
> type is so simple that it happens to be a 1:1 match to the corresponding
> tag.  It's only the programmer that "knows" that this expression has
> that type.

Hmmm. Another place where the static type isn't the same thing as
the runtime type occurs in languages with subtyping.

Question: if a language *does* record complete RTTI, and the
languages does *not* have subtyping, could we then say that
the runtime type information *is* the same as the static type?

Marshall