What is Expressiveness in a Computer Language

Fri Jun 23 02:32:49 EDT 2006

Chris Smith wrote:
> I don't recall who said what at this 
> point, but earlier today someone else posted -- in this same thread -- 
> the idea that static type "advocates" want to classify some languages as 
> untyped in order to put them in the same category as assembly language 
> programming.  That's something I've never seen, and I think it's far 
> from the goal of pretty much anyone; but clearly, *someone* was 
> concerned about it.  I don't know if much can be done to clarify this 
> rhetorical problem, but it does exist.

For the record, I'm not concerned about that problem as such.  However, 
I do think that characterizations of dynamically typed languages from a 
static type perspective tend to oversimplify, usually because they 
ignore the informal aspects which static type systems don't capture.

Terminology is a big part of this, so there are valid reasons to be 
careful about how static type terminology and concepts are applied to 
languages which lack formally defined static type systems.

> The *other* bit that's been brought up in this thread is that the word 
> "type" is just familiar and comfortable for programmers working in 
> dynamically typed languages, and that they don't want to change their 
> vocabulary.

What I'm suggesting is actually a kind of bridge between the two 
positions.  The dynamically typed programmer tends to think in terms of 
values having types, rather than variables.  What I'm pointing out is 
that even those programmers reason about something much more like static 
types than they might realize; and that there's a connection between 
that reasoning and static types, and also a connection to the tags 
associated with values.

If you wanted to take the word "type" and have it mean something 
reasonably consistent between the static and dynamic camps, what I'm 
suggesting at least points in that direction.  Obviously, nothing in the 
dynamic camp is perfectly isomorphic to a real static type, which is why 
I'm qualifying the term as "latent type", and attempting to connect it 
to both static types and to tags.

> The *third* thing that's brought up is that there is a (to me, somewhat 
> vague) conception going around that the two really ARE varieties of the 
> same thing.  I'd like to pin this down more, and I hope we get there, 
> but for the time being I believe that this impression is incorrect.  At 
> the very least, I haven't seen a good way to state any kind of common 
> definition that withstands scrutiny.  There is always an intuitive word 
> involved somewhere which serves as an escape hatch for the author to 
> retain some ability to make a judgement call, and that of course 
> sabotages the definition.  So far, that word has varied through all of 
> "type", "type error", "verify", and perhaps others... but I've never 
> seen anything that allows someone to identify some universal concept of 
> typing (or even the phrase "dynamic typing" in the first place) in a way 
> that doesn't appeal to intuition.

It's obviously going to be difficult to formally pin down something that 
is fundamentally informal.  It's fundamentally informal because if 
reasoning about the static properties of terms in DT languages were 
formalized, it would essentially be something like a formal type system.

However, there are some pretty concrete things we can look at.  One of 
them, which as I've mentioned elsewhere is part of what led me to my 
position, is to look at what a soft type inferencer does.  It takes a 
program in a dynamically typed language, and infers a static type scheme 
for it (obviously, it also defines an appropriate type system for the 
language.)  This has been done in both Scheme and Erlang, for example.

How do you account for such a feat in the absence of something like 
latent types?  If there's no static type-like information already 
present in the program, how is it possible to assign a static type 
scheme to a program without dramatically modifying its source?

I think it's reasonable to look at a situation like that and conclude 
that even DT programs contain information that corresponds to types. 
Sure, it's informal, and sure, it's usually messy compared to an 
explicitly defined equivalent.  But the point is that there is 
"something" there that looks so much like static types that it can be 
identified and formalized.

> Undoubtedly, some programmers sometimes perform reasoning about their 
> programs which could also be performed by a static type system.  

I think that's a severe understatement.  Programmers always reason about 
things like the types of arguments, the types of variables, the return 
types of functions, and the types of expressions.  They may not do 
whole-program inference and proof in the way that a static type system 
does, but they do it locally, all the time, every time.

BTW, I notice you didn't answer any of the significant questions which I 
posed to Vesa.  So let me pose one directly to you: how should I rewrite 
the first sentence in the preceding paragraph to avoid appealing to an 
admittedly informal notion of type?  Note, also, that I'm using the word 
in the sense of static properties, not in the sense of tagged values.

>>There are reasons to connect 
>>it to type theory, and if you can't see those reasons, you need to be 
>>more explicit about why.
> 
> 
> Let me pipe up, then, as saying that I can't see those reasons; or at 
> least, if I am indeed seeing the same reasons that everyone else is, 
> then I am unconvinced by them that there's any kind of rigorous 
> connection at all.

For now, I'll stand on what I've written above.  When I see if or how 
that doesn't convince you, I can go further.

> It is, nevertheless, quite appropriate to call the language "untyped" if 
> I am considering static type systems.  

I agree that in the narrow context of considering to what extent a 
dynamically typed language has a formal static type system, you can call 
it untyped.  However, what that essentially says is that formal type 
theory doesn't have the tools to deal with that language, and you can't 
go much further than that.  As long as that's what you mean by untyped, 
I'm OK with it.

> I seriously doubt that this usage 
> in any way misleads anyone into assuming the absence of any mental 
> processes on the part of the programmer.  I hope you agree.  

I didn't suggest otherwise (or didn't mean to).  However, the term 
"untyped" does tend to lead to confusion, to a lack of recognition of 
the significance of all the static information in a DT program that is 
outside the bounds of a formal type system, and the way that runtime tag 
checks relate to that static information.

One misconception that occurs is the assumption that all or most of the 
static type information in a statically-typed program is essentially 
nonexistent in a dynamically-typed program, or at least is no longer 
statically present.  That can easily be demonstrated to be false, of 
course, and I'm not arguing that experts usually make this mistake.

> If not, 
> then I think you significantly underestimate a large category of people.

If you think there's no issue here, I think you significantly 
overestimate a large category of people.  Let's declare that line of 
argument a draw.

>>The first point I was making is that *automated* checking has very 
>>little to do with anything, and conflating static types with automated 
>>checking tends to lead to a lot of confusion on both sides of the 
>>static/dynamic fence.
> 
> 
> I couldn't disagree more.  Rather, when you're talking about static 
> types (or just "types" in most research literature that I've seen), then 
> the realm of discussion is specifically defined to be the very set of 
> errors that are automatically caught and flagged by the language 
> translator.  I suppose that it is possible to have an unimplemented type 
> system, but it would be unimplemented only because someone hasn't felt 
> the need nor gotten around to it.  Being implementABLE is a crucial part 
> of the definition of a static type system.

I agree with the latter sentence.  However, it's nevertheless the case 
that it's common to confuse "type system" with "compile-time checking". 
  This doesn't help reasoning in debates like this, where the existence 
of type systems in languages that don't have automated static checking 
is being examined.

> I am beginning to suspect that you're make the converse of the error I 
> made earlier in the thread.  That is, you may be saying things regarding 
> the psychological processes of programmers and such that make sense when 
> discussing dynamic types, and in any case I haven't seen any kind of 
> definition of dynamic types that is more acceptable yet; but it's 
> completely irrelevant to static types.  Static types are not fuzzy -- if 
> they were fuzzy, they would cease to be static types -- and they are not 
> a phenomenon of psychology.  To try to redefine static types in this way 
> not only ignores the very widely accepted basis of entire field of 
> existing literature, but also leads to false ideas such as that there is 
> some specific definable set of problems that type systems are meant to 
> solve.

I'm not trying to redefine static types.  I'm observing that there's a 
connection between the static properties of untyped programs, and static 
types; and attempting to characterize that connection.

You need to be careful about being overly formalist, considering that in 
real programming languages, the type system does have a purpose which 
has a connection to informal, fuzzy things in the real world.  If you 
were a pure mathematician, you might get away with claiming that type 
systems are just a self-contained symbolic game which doesn't need any 
connections beyond its formal ruleset.

Think of it like this: the more ambitious a language's type system is, 
the fewer uncaptured static properties remain in the code of programs in 
that language.  However, there are plenty of languages with rather weak 
static type systems.  In those cases, code has more static properties 
that aren't captured by the type system.  I'm pointing out that in many 
of these cases, those properties resemble types, to the point that it 
can make sense to think of them and reason about them as such, applying 
the same sort of reasoning that an automated type inferencer applies.

If you disagree, then I'd be interested to hear your answers to the two 
questions I posed to Vesa, and the related one I posed to you above, 
about what else to call these things.

>>I agree, to make the comparison perfect, you'd need to define a type 
>>system.  But that's been done in various cases.
> 
> 
> I don't think that has been done, in the case of dynamic types.  

I was thinking of the type systems designed for soft type inferencers; 
as well as those cases where e.g. a statically-typed subset of an 
untyped language is defined, as in the case of PreScheme.

But in such cases, you end up where a program in these systems, while in 
some sense statically typed, is also a valid untyped program.  There's 
also nothing to stop someone familiar with such things programming in a 
type-aware style - in fact, books like Felleisen et al's "How to Design 
Programs" encourage that, recommending that functions be annotated with 
comments expressing their type. Examples:

;; product : (listof number) -> number

;; copy : N X -> (listof X)

You also see something similar in e.g. many Erlang programs.  In these 
cases, reasoning about types is done explicitly by the programmer, and 
documented.

What would you call the descriptions in those comments?  Once you tell 
me what I should call them other than "type" (or some qualified variant 
such as "latent type"), then we can compare terminology and decide which 
is more appropriate.

> It has 
> been done for static types, but much of what you're saying here is in 
> contradiction to the definition of a type system in that sense of the 
> word.

Which is why I'm using a qualified version of the term.

>>The problem we're dealing with in this case is that anything that's not 
>>formally defined is essentially claimed to not exist.
> 
> 
> I see it as quite reasonable when there's an effort by several 
> participants in this thread to either imply or say outright that static 
> type systems and dynamic type systems are variations of something 
> generally called a "type system", and given that static type systems are 
> quite formally defined, that we'd want to see a formal definition for a 
> dynamic type system before accepting the proposition that they are of a 
> kind with each other.  

A complete formal definition of what I'm talking about may be impossible 
in principle, because if you could completely formally define it, you'd 
have a static type system.

If that makes you throw up your hands, then all you're saying is that 
you're unwilling to deal with a very real phenomenon that has obvious 
connections to type theory, examples of which I've given above.  That's 
your choice, but at the same time, you have to give up exclusive claim 
to any variation of the word "type".

Terms are used in a context, and it's perfectly reasonable to call 
something a "latent type" or even a "dynamic type" in a certain context 
and point out connections between those terms and their cousins (or if 
you insist, their completely unrelated namesakes) static types.

> So far, all the attempts I've seen to define a 
> dynamic type system seem to reduce to just saying that there is a well-
> defined semantics for the language.

That's a pretty strong claim, considering you have so far ducked the 
most important questions I raised in the post you replied to.

> I believe that's unacceptable for several reasons, but the most 
> significant of them is this.  It's not reasonable to ask anyone to 
> accept that static type systems gain their essential "type system-ness" 
> from the idea of having well-defined semantics.  

The definition of static type system is not in question.  However, 
realistically, as I pointed out above, you have to acknowledge that type 
systems exist in, and are inextricably connected to, a larger, less 
formal context.  (At least, you have to acknowledge that if you're 
interested in programs that do anything related to the real world.)  And 
outside the formally defined borders of static type systems, there are 
static properties that bear a pretty clear relationship to types.

Closing your eyes to this and refusing to acknowledge any connection 
doesn't achieve anything.  In the absence of some other account of the 
phenomena in question, (an informal version of) types turn out to be a 
pretty convenient way to deal with the situation.

> From the perspective of 
> a statically typed language, this looks like a group of people getting 
> together and deciding that the real "essence" of what it means to be a 
> type system is... 

There's a sense in which one can say that yes, the informal types I'm 
referring to have an interesting degree of overlap with static types; 
and also that static types do, loosely speaking, have the "purpose" of 
formalizing the informal properties I'm talking about.

But I hardly see why such an informal characterization should bother 
you.  It doesn't affect the definition of static type.  It's not being 
held up as a foundation for type theory.  It's simply a way of dealing 
with the realities of programming in a dynamically-checked language.

There are some interesting philophical issues there, to be sure 
(although only if you're willing to stray outside the formal), but you 
don't have to worry about those unless you want to.

> and then naming something that's so completely non-
> essential that we don't generally even mention it in lists of the 
> benefits of static types, because we have already assumed that it's true 
> of all languages except C, C++, and assembly language.

This is based on the assumption that all we're talking about is 
"well-defined semantics".  However, there's much more to it than that. 
I need to hear your characterization of the properties I've described 
before I can respond.

Anton