[Edu-sig] Properties use case

Sat Mar 18 19:03:10 CET 2006

Arthur wrote:
> ... But if we trace back the thread we will see that the bottom line question
> that I was struggling with at the beginning was precisely the question of
> what *makes* a primitive type such. Obviously something much deeper than the
> fact that it is coded in C.

Well, if it hadn't been buried in a hail of words, it would have been
more obvious to the readers as well as the writer.  I'll try to talk
from the point of view of a language designer.

Typically, a computer language is developed to solve some kind of
problem (or family of problems).  The primitives must be enough to
express, succinctly, problems in that space and parts of solutions
to those problems.  You'd be surprised at the variety of things
people want to express.  It used to be commonly accepted that
"little languages" were the way to attack this problem.  A new
problem space meant building a "little language" to express its
problems and solutions.  However, it turns out that these "little
languages" have this "feeping creaturism" problem; the users
often want "just a bit more" added until the language (which was
once nice and small and tidy) seems to be all bolts and knobs
and corners where unanticipated extensions got thrown on to
solve "just one problem."  Also, the number of people capable of
building, supporting, and extending a language is not huge in
comparison to the number of problem spaces.

Experience in building languages has shown us that the more
"distinct" things in the language, the harder it is to learn, use,
and remember.  Here I am not talking about the complexity of the
compiler/interpreter/language translation system, but the amount of
"stuff" a _user_ of the language needs to know about the language.
The fewer things a user has to track mentally, the "simpler" the
language.  So, here is a tension: we want a simple language that can
express succinctly problems and (effective) solutions to problems.
The simpler the language, the more quickly you can become expert in
the language, and move on to the problems you really want to solve.
The more expressive the language, the simpler your programs in your
field of discourse.  We do know that program complexity grows
_significantly_ worse than linearly with the size of a program, so
we want the programs the user writes to be small to help him out.
Often adding more to the language shrinks the user's code, so there
is a balancing act here.

Here are some criteria for primitives.  We want as few primitives as
possible.  We should add primitives for anything almost every user
will need.  In addition to those, we may add some primitives for those
things that the user cannot build himself nearly as efficiently (or
perhaps is very likely to get wrong).  We probably need primitives for
every "manifest constant" in a program (the 5 in "i *= 5").

Almost always, programs need a way to perform I/O (Knuth, in one class,
defined a computer program as something that took zero or more inputs
and produced one or more outputs, on the grounds that a rock implements
anything that produces zero outputs).  Programs that simply produce
output but consume no input are rare; ab initio chemists may write some
of these.  Usually the programs are to be operated by people other than
those who wrote the programs, so they must consume input in some form.

Input, output, and calculation all need to be expressed.  Even if you
have no intention of doing text processing, input and output almost
force you to have a text type.  Most calculation will want to do
arithmetic (there are languages w/o arithmetic primitives).  There
must be a way to build data structures (combine our primitives), since
the design and use of data structures is core CS (and the key to not
having to build everything into the language).

We need to be able to define functions (a way to decompose problems),
and how functions and their results are combined is, finally, the
probable answer to the mutability question.  If every time you call
a function it might mutate its arguments, safe practice is to copy
any arguments to the function.  Early FORTRAN said a procedure could
change its arguments, and the change would be reflected in the calling
program.  So, after:
     pi = 3.1415927
     call somesub(pi, 3)
the value of pi might be changed.  Even worse, the program might change
the value of 3 (the constant used in the compiled chunk of code for 3).
The FORTRAN solution was to tell the programmer "don't do that."  We
now solve these problems either by using "pass-by-value" semantics
(the subroutine gets a copy of its argument), or by passing immutable 
objects.  So, it makes sense that manifest constants must be immutable
in a language like Python that does call-by-sharing.  You also avoid
another nasty problem called "aliasing".  If you have a function like:
     def same_range(a, b):
         while a < b:
             a *= 2
         while a >= b:
             a /= 2
         return a
Calling "same_range(lumberjacks, lumberjacks)" with a mutable object
(one whose "*=" and "/=" change it in place) will make the second
loop infinite, because both a and b are aliases for the same object.
That's no problem for immutables (one version is as good as another),
but with mutables it can cause surprisingly hard-to-find bugs.  Note
that aliases are usually _much_ harder to detect than in this simple
case, and the results can be nasty.
Some people have proposed that programs have no side-effects to solve
this, but others among us consider that absurd: a program models
reality, and reality is mutable.  There is no way to write to a printer
and then abandon that calculation and move to another where the printer
has not yet been written upon.
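
To make the aliasing point above concrete, here is a minimal sketch.
The Box class is hypothetical (it is not from any library); it stands
in for a mutable number whose "*=" and "/=" change the object in place.
The max_steps cap is also my addition, there only so the aliased call
halts instead of looping forever.

```python
class Box:
    """Hypothetical mutable number: in-place operators modify the object."""

    def __init__(self, value):
        self.value = value

    def __lt__(self, other):
        return self.value < other.value

    def __ge__(self, other):
        return self.value >= other.value

    def __imul__(self, factor):
        self.value *= factor   # mutate in place ...
        return self            # ... and hand back the same object

    def __itruediv__(self, divisor):
        self.value /= divisor
        return self


def same_range(a, b, max_steps=50):
    """same_range as in the post, plus a step cap so the demo terminates.

    Returns None if the cap is hit (the loop would otherwise not end).
    """
    steps = 0
    while a < b:
        a *= 2
        steps += 1
        if steps >= max_steps:
            return None
    while a >= b:
        a /= 2
        steps += 1
        if steps >= max_steps:
            return None
    return a


# Immutable numbers: "a *= 2" rebinds the local name a, so b never moves.
plain = same_range(3, 20)

# Two distinct mutable objects: the loops still terminate.
left, right = Box(5), Box(5)
distinct = same_range(left, right)

# One mutable object under two names: "a /= 2" also halves b,
# so "a >= b" stays true and only the cap stops the loop.
shared = Box(5)
aliased = same_range(shared, shared)
```

Note that in the two-object case the caller's "left" is changed by the
call as well, which is the FORTRAN-style surprise in another guise.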

So I've gone on at absurd length here to explain why immutable
primitives are preferred.  The reasons are subtle, and come from
shared experience on building languages.  This is why some of us
study computer science; we are interested in such questions.  It
is frustrating sometimes to be asked a simple-to-ask question that
seems combined with a "nasty CS elitists" attitude.  There _is_ a
body of work in CS, what it talks about is not trivial, and it
builds on itself in a way that makes it hard to answer some
questions succinctly.

Some of the questions feel a lot like, "why so many planar surfaces in
architecture?"  To answer them requires work, not simply in the saying,
but in looking back for the whys.  Often the first answer "it makes my
skin crawl" is the real answer; some rule has become so internalized
you don't know why you feel it.  It doesn't necessarily mean Kirby was
saying you had done an incredibly stupid thing; it might simply mean
that something about that design felt dangerous to him in some way.

--Scott David Daniels
Scott.Daniels at Acm.Org