tough-to-explain Python

Calroc forman.simon at gmail.com
Sun Jul 19 11:21:01 EDT 2009


On Jul 9, 1:20 pm, Steven D'Aprano <st... at REMOVE-THIS-
cybersource.com.au> wrote:
[...]
> You'll excuse my skepticism about all these claims about how anyone can
> program, how easy it is to teach the fundamentals of Turing Machines and
> functional programming to anybody at all. Prove it. Where are your peer-
> reviewed studies demonstrating such successes on randomly chosen people,
> not self-selected motivated, higher-than-average intelligence students?

I'm engaged presently in starting a school to teach programming from
the ground up, based roughly on the curriculum outlined in the article
I mentioned.  I can't randomly select students but I do intend to
gather data about who does how well from what starting conditions.

I'm excited about formal methods because, one, I just heard about them,
and two, they provide me with a much better fleshed-out way to bridge
rigorously from "raw" Turing machine models to higher-order
abstractions.

I'm confident that anyone who can play Sudoku can learn programming
and formal methods.

> In the real world, human beings are prone to serious cognitive biases and
> errors. Our brains are buggy. Take any list of reasoning fallacies, and
> you'll find the majority of people have difficulty avoiding them. The first
> struggle is to get them to even accept that they *are* fallacies, but
> even once they have intellectually accepted it, getting them to live up
> to that knowledge in practice is much, much harder.
>
> In fact I'd go further and say that *every single person*, no matter how
> intelligent and well-educated, will fall for cognitive biases and
> fallacious reasoning on a regular basis.

I agree completely.  One of my chief goals is to "debug" human
reasoning to whatever extent possible.  This is precisely where I
expect training in computer programming to come in handy.  The machine
symbol manipulator is not going to be fooled by reasoning fallacies.
Fluency in rigorous symbol manipulation can only help us cope with the
vagaries of our too-often sub-rational wetware.

I think the real win starts when computers can be used as a
communication medium, not for blogs and tweets and SMS txt and IM and
VoIP, but for reasoned symbolically coded discussion.


On Jul 9, 2:10 pm, Steven D'Aprano <st... at REMOVE-THIS-
cybersource.com.au> wrote:
> On Wed, 08 Jul 2009 22:05:57 -0700, Simon Forman wrote:
> >> persistent idea "out there" that programming is a very accessible
> >> skill, like cooking or gardening, anyone can do it, and even profit
> >> from it, monetarily or otherwise, etc., and to some extent I am
>
> > Programming is not like any other human activity.
>
> In practice? In principle? Programming in principle is not the same as it
> is performed in practice.
>
> But in either case, programming requires both the logical reasoning of
> mathematics and the creativity of the arts. Funnily enough,
> mathematicians will tell you that mathematics requires the same, and so
> will the best artists. I think mathematicians, engineers, artists, even
> great chefs, will pour scorn on your claim that programming is not like
> any other human activity.

Well, it's actually Dijkstra's claim, and I find it reasonable, that
programming has two novel or unique aspects not present in other areas
of human endeavor: vast scale and digital mechanism.

http://www.cs.utexas.edu/users/EWD/transcriptions/EWD10xx/EWD1036.html


> [...]
>
> > He talks about how "when all is said and done, the only thing computers
> > can do for us is to manipulate symbols and produce results of such
> > manipulations" and he emphasises the "uninterpreted" nature of
> > mechanical symbol manipulation, i.e. that the machine is doing it
> > mindlessly.
>
> "Manipulate symbols" is so abstract as to be pointless. By that

That's the "koan form", longhand he means the Boolean domain {true,
false}, the logical operations and their algebraic relationships, and
the means of using same to deduce new formulae.
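
For instance (a toy illustration of my own, nothing more), you can
check an algebraic identity of the connectives by brute force over the
whole two-valued domain:

from itertools import product

def equivalent(f, g, arity):
    """Do formulae f and g agree on every Boolean assignment?"""
    return all(f(*values) == g(*values)
               for values in product((True, False), repeat=arity))

# De Morgan: not (p and q)  is equivalent to  (not p) or (not q)
print(equivalent(lambda p, q: not (p and q),
                 lambda p, q: (not p) or (not q),
                 arity=2))          # -> True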


> reasoning, I can build a "computer" consisting of a box open at the top.
> I represent a symbol by an object (say, a helium-filled balloon, or a
> stone), instead of a pattern of bits. I manipulate the symbol by holding
> the object over the box and letting go. If it flies up into the sky, that
> represents the symbol "Love is War", if it falls into the box, it
> represents the symbol "Strength is Blue", and if it just floats there, it
> represents "Cheddar Cheese". This is a deterministic, analog computer
> which manipulates symbols. Great.
>
> And utterly, utterly useless. So what is my computer lacking that real
> computers have? When you have answered that question, you'll see why
> Dijkstra's claim is under-specified.

Well, you said it yourself: your computer is lacking utility.


[...]
>
> What is an uninterpreted formula? If you don't interpret it, how can you
> distinguish it from random noise?

An uninterpreted formula is one you manipulate without thinking about
what the symbols mean: you just apply the rules to the formulae as
such.  To be useful, once you're "done" with them you can consider
their meaning.  You can interpret a formula at any time, but Dijkstra's
saying that we should pay attention to the [mathematical, logical]
reasoning process as its own thing, without reference to the meaning
of the symbols, at least temporarily.
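
A rough sketch of what I mean, in Python (my own toy representation of
formulae as nested tuples, not anything Dijkstra proposed):

def rewrite(formula):
    """Push a negation inward by one De Morgan step, purely by shape."""
    if isinstance(formula, tuple) and formula[0] == 'not':
        inner = formula[1]
        if isinstance(inner, tuple) and inner[0] == 'and':
            return ('or', ('not', inner[1]), ('not', inner[2]))
    return formula

print(rewrite(('not', ('and', 'p', 'q'))))
# -> ('or', ('not', 'p'), ('not', 'q')), and at no point did we care
#    what 'p' and 'q' stand for.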


[...]
>
> > If you're never exposed to that constellation of concepts that underpins
> > "mechanical symbol manipulation" you are adrift in a sea ("c", ha ha) of
> > abstractions.
>
> > However, if you /are/ exposed to the "so few and simple" rules of
> > manipulation the gates (no pun intended) to the kingdom are thrown wide.
>
> What nonsense. The keys to the kingdom are the abstractions.
>
> Here's an exercise for you, if you dare. It's not that difficult to
> remove the abstraction from integer addition, to explain it in terms of
> bit manipulation. If you prefer, you can write a Turing Machine instead.
>
> Now try doing the same thing for Quicksort. Don't worry, we'll wait.
>
> Once you've done that, let's see you implement a practical, scalable, non-
> toy webserver as a Turing Machine.
>
> No, it's not the fundamental operations that open the doors to the
> kingdom. It's the abstractions. The history of computing is to gain more
> and more power by leveraging better and better abstractions.

I agree that "better and better abstractions" are the key(s), but I
strongly disagree that the fundamental operations are not, uh, key to
those keys.

There are two intertwined phenomena:

1) formulae, written in symbols

2) deriving new formulae by applying [in]formal methods to existing
formulae

The rules, and the application of those rules, are two separate but
compatible phenomena. (Compatible in that you can encode the latter
in terms of the former. This is the essential "trick" of mechanized
thought.)
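
Here's a toy illustration of that trick (my own hypothetical encoding,
for flavour only): a rule is itself just a pair of formula trees, and
"applying" it is ordinary manipulation of that data.

def match(pattern, formula, bindings):
    """Bind pattern variables (strings starting with '?') to sub-formulae."""
    if isinstance(pattern, str) and pattern.startswith('?'):
        if pattern in bindings:
            return bindings if bindings[pattern] == formula else None
        return {**bindings, pattern: formula}
    if (isinstance(pattern, tuple) and isinstance(formula, tuple)
            and len(pattern) == len(formula)):
        for p, f in zip(pattern, formula):
            bindings = match(p, f, bindings)
            if bindings is None:
                return None
        return bindings
    return bindings if pattern == formula else None

def substitute(template, bindings):
    """Build a new formula from a template by filling in the bindings."""
    if isinstance(template, str) and template.startswith('?'):
        return bindings[template]
    if isinstance(template, tuple):
        return tuple(substitute(t, bindings) for t in template)
    return template

def apply_rule(rule, formula):
    pattern, replacement = rule
    bindings = match(pattern, formula, {})
    return formula if bindings is None else substitute(replacement, bindings)

# De Morgan's law, encoded as data rather than as code:
de_morgan = (('not', ('and', '?a', '?b')),
             ('or', ('not', '?a'), ('not', '?b')))

print(apply_rule(de_morgan, ('not', ('and', 'p', 'q'))))
# -> ('or', ('not', 'p'), ('not', 'q'))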

Now, (1) is constantly being elaborated; these elaborations are the
"higher" abstractions we're talking about.

But (2) stays the same(-ish) no matter how high you go, and (2) is the
process from which all the higher (1)'s come, so if we get it right
(and by "right" I mean, of course, by using rigorous proofs, formal
methods, et al.) from the get-go and keep it right, we're golden.

I do literally intend to teach e.g. "integer addition ... in terms of
bit manipulation" and a bit of parsing too. But I intend to rapidly
bring students to a Forth or Forth-like stage, and then immediately
create simple parsers that implement, for example, C for loops, and
other "higher" abstractions.

Highly abridged, this is the curriculum: Enough about bits to bring
them to bytes, enough bytes to bring them to RAM, enough RAM to
implement a crude CPU, enough CPU to implement a bit of Forth, enough
Forth to implement parsers, and enough parsing (and compilation and
interpreting, etc.) to let them grok languages in general.
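
The Forth-like stage might look something like this toy sketch
(hypothetical, with Python standing in for the real thing): a data
stack, a dictionary of words, and whitespace-separated tokens.

def forth(source):
    stack = []
    words = {
        '+':    lambda: stack.append(stack.pop() + stack.pop()),
        '*':    lambda: stack.append(stack.pop() * stack.pop()),
        'dup':  lambda: stack.append(stack[-1]),
        'swap': lambda: stack.extend((stack.pop(), stack.pop())),
        '.':    lambda: print(stack.pop()),
    }
    for token in source.split():
        if token in words:
            words[token]()            # execute a known word
        else:
            stack.append(int(token))  # anything else is a number
    return stack

forth('2 3 + dup * .')    # prints 25

From there a parser is just another function from tokens to words,
which is the bridge to the "higher" abstractions mentioned above.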

Formal methods gives me a wide straight road from each stage to the
next.

[...]
> > Quoting Dijkstra again [1]: "Before we part, I would like to invite you
> > to consider the following way of doing justice to computing's radical
> > novelty in an introductory programming course.
>
> > "On the one hand, we teach what looks like the predicate calculus, but
> > we do it very differently from the philosophers. In order to train the
> > novice programmer in the manipulation of uninterpreted formulae, we
> > teach it more as boolean algebra, familiarizing the student with all
> > algebraic properties of the logical connectives. To further sever the
> > links to intuition, we rename the values {true, false} of the boolean
> > domain as {black, white}.
>
> This is supposed to be a good thing?
>
> Oh my. It must be nice in that ivory tower, never having to deal with
> anyone who hasn't already been winnowed by years of study and self-
> selection before taking on a comp sci course.
>
> I don't think Dijkstra would have taken that suggestion seriously if he
> had to teach school kids instead of college adults.

I'm not going to teach it exactly like he outlines there, but I really
am going to try the curriculum I outlined.  I'll post videos. :]



> ...
>
> > (Did you read that last paragraph and think, "Well how the heck else are
> > you supposed to program a computer if not in a computer language?"?  If
> > so, well, that is kind of my point.)
>
> No. What I thought was, how the hell are you supposed to manipulate
> symbols without a language to describe what sort of manipulations you can
> do?

Direct-manipulation graphical Abstract Syntax Trees.
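
Not graphical, but the same idea already exists textually in Python
(this uses the standard library ast module; my example, not a proposal
for how the graphical version would work): the program is a tree you
manipulate directly, and the source text is just one view of it.

import ast

class SwapAddForMul(ast.NodeTransformer):
    def visit_BinOp(self, node):
        self.generic_visit(node)        # rewrite children first
        if isinstance(node.op, ast.Add):
            node.op = ast.Mult()        # edit the tree, not the text
        return node

tree = ast.parse('result = 6 + 7')
SwapAddForMul().visit(tree)
ast.fix_missing_locations(tree)
namespace = {}
exec(compile(tree, '<ast>', 'exec'), namespace)
print(namespace['result'])              # 42: the + became a * in the tree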



[...]
> > tables and arrays. He didn't even know that they were different
> > "things".
>
> I shall manfully resist the urge to make sarcastic comments about
> ignorant PHP developers, and instead refer you to my earlier reply to you
> about cognitive biases. For extra points, can you identify all the
> fallacious reasoning in that paragraph of yours?

I'll pass.  I should have resisted the urge to whinge about him in the
first place.



>
> But those abstractions just get in the way, according to you. He should
> be programming in the bare metal, doing pure symbol manipulation without
> language.

No, the abstractions are useful, but their utility is tied to their
reliability, which in turn derives from the formal rigour of the
proofs of the abstractions' correctness.

My issues with the modern state of the art are that too many of these
abstractions are not /proven/, and that they are both frozen and
fractured by the existing milieu of "static", mutually incompatible
programming languages.

I see a possibility for something better.


> If programming is symbol manipulation, then you should remember that the
> user interface is also symbol manipulation, and it is a MUCH harder
> problem than databases, sorting, searching, and all the other problems
> you learn about in academia. The user interface has to communicate over a
> rich but noisy channel using multiple under-specified protocols to a
> couple of pounds of meat which processes information using buggy
> heuristics evolved over millions of years to find the ripe fruit, avoid
> being eaten, and have sex. If you think getting XML was hard, that's
> *nothing* compared to user interfaces.
>
> The fact that even bad UIs work at all is a credit to the heuristics,
> bugs and all, in the meat.

Ok, yes, but have you read Jef Raskin's "The Humane Interface"?

> > As for "a minimum of bugs"... sigh. The "minimum of bugs" is zero, if
> > you derive your "uninterpreted formulae" /correctly/.

Sorry.  I can't believe that I was so melodramatic.

> And the secret of getting rich on the stock market is to buy low, sell
> high. Very true, and very pointless. How do you derive the formula
> correctly?
>
> Do you have an algorithm to do so? If so, how do you know that algorithm
> is correct?

Well, we know logic works; we can trust that.  The algorithm for
proving the formula is logic itself.

> > Deriving provably
> > correct "programs" should be what computer science and computer
> > education are all about
>
> That's impossible. Not just impractical, or hard, or subject to hardware
> failures, but impossible.
>
> I really shouldn't need to raise this to anyone with pretensions of being
> a Computer Scientist and not just a programmer, but... try proving an
> arbitrary program will *halt*.
>
> If you can't do that, then what makes you think you can prove all useful,
> necessary programs are correct?

It may well be that there are useful, necessary programs that are
impossible to prove correct. I'm not trying to claim there aren't
such.

I think that wherever that's not the case, wherever useful, necessary
programs can be proven correct, they should be.

It may be that flawless software is an unreachable asymptote, like the
speed of light for matter, but I'm (recently!) convinced that it's a
goal worthy of exploration and effort.

Just because it's unattainable doesn't make it undesirable.  And who
knows? Maybe the horse will learn to sing.

> > Again with Dijkstra[3]: "The prime paradigma of the pragmatic designer
> > is known as "poor man's induction", i.e. he believes in his design as
> > long as "it works", i.e. until faced with evidence to the contrary. (He
> > will then "fix the design".)
>
> Yes, that is the only way it can be.
>
> > The scientific designer, however, believes
> > in his design because he understands why it will work under all
> > circumstances.
>
> Really? Under *all* circumstances?
>
> Memory failures? Stray cosmic radiation flipping bits? Noise in the power
> supply?
>
> What's that you say? "That's cheating, the software designer can't be
> expected to deal with the actual reality of computers!!! We work in an
> abstract realm where computers never run out of memory, swapping never
> occurs, and the floating point unit is always correct!"

First, yes, that could be considered cheating.  We don't expect
automotive engineers to design for resistance to meteor strikes.
Second, those issues plague all programs, formally derived or not.
Formal methods certainly can't hurt.  Third, in cases where you do
want reliability in the face of things like cosmic rays you're going
to want to use formal methods to achieve it.

And last but not least, it may be possible to "parameterize" the
abstractions we derive with information about how sensitive they are
to the stability of their underlying abstractions.

> Fair enough. I don't want to make it too hard for you.
>
> Okay, so let's compare the mere "pragmatic designer" with his provisional
> expectation that his code is correct, and Dijkstra's "scientific
> designer". He understands his code is correct. Wonderful!
>
> Er, *how exactly* does he reach that understanding?
>
> He reasons about the logic of the program. Great! I can do that. See,
> here's a proof that binary search is correct. How easy this is.
>
> Now try it for some non-trivial piece of code. Say, the Linux kernel.
> This may take a while...

Well, I'd start with Minix 3...  But really the method would be to
build up provably correct compounds out of proven sub-assemblies.
Even if it takes a long time, is it time wasted?  A provably correct
OS is kind of a neat idea, IMO.

>
> What guarantee do we have that the scientific designer's reasoning was
> correct? I mean, sure he is convinced he has reasoned correctly, but he
> would, wouldn't he? If he was unsure, he'd keep reasoning (or ask for
> help), until he was sure. But being sure doesn't make him correct. It
> just makes him sure.
>
> So how do you verify the verification?

Peer review, and running it through a mechanical verifier.  There is a
self-reflexive loop there, but it's not infinite; if it were, we
couldn't implement any computer.

> And speaking of binary search:
[Bentley's binary search bug]

This is a perfect example of a leaky abstraction.  The C "integer"
doesn't behave like a "real" integer, but part of the algorithm's
implementation relied on the tacit overlap between C ints and true
integers up to a certain value.  This is the sort of thing that could
be annotated onto the symbolic representation of the algorithm and
used to compile provably correct machine code (for various platforms).

If the same proven algorithm had been implemented on top of a correct
integer abstraction (like Python's long int) it wouldn't have had that
bug.
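
To be concrete (a sketch of my own, not Bentley's code): in Python the
classic midpoint expression cannot overflow, because ints grow as
needed, whereas in C the sum (low + high) can exceed the range of a
fixed-width int.

def binary_search(seq, target):
    """Return an index of target in sorted seq, or -1 if absent."""
    low, high = 0, len(seq) - 1
    while low <= high:
        mid = (low + high) // 2   # safe here; the C fix is low + (high - low) / 2
        if seq[mid] == target:
            return mid
        elif seq[mid] < target:
            low = mid + 1
        else:
            high = mid - 1
    return -1

assert binary_search([1, 3, 5, 7, 11], 7) == 3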

(And yes, there would still be limits; truly vast arrays could
conceivably result in memory exhaustion during execution of the
algorithm.  I think the answer there lies in being able to represent
or calculate the limits of an algorithm explicitly, the better to use
it correctly in other code.)

> Perhaps the "scientific designer" should be a little less arrogant, and
> take note of the lesson of real science: all knowledge is provisional and
> subject to correction and falsification.

(I apologize for my arrogance on behalf of formal methods, I just
found out about them and I'm all giddy and irritating.)

> Which really makes the "scientific designer" just a slightly more clever
> and effective "pragmatic designer", doesn't it?

Um, er, well, scientists are the ultimate pragmatists, and Dijkstra's
description of the "pragmatic designer" above could well be applied to
"scientific designers".  The difference turns on /why/ each has
confidence in his design: one can prove mathematically that his design
is correct, while the other is ... sure.

Whether or not the delivered software is actually flawless, the
difference seems to me more than just "slightly more clever and
effective".

>
> > The transition from pragmatic to scientific design would
> > indeed be a drastic change within the computer industry."
>
> Given Dijkstra's definition of "scientific design", it certainly would.
> It would be as drastic a change as the discovery of Immovable Objects and
> Irresistible Forces would be to engineering, or the invention of a
> Universal Solvent for chemistry, or any other impossibility.
>
> > "Obviously no errors" is the goal to strive for, and I am comfortable
> > calling anyone an amateur who prefers "no obvious errors."
>
> It's not a matter of preferring no obvious errors, as understanding the
> limitations of reality. You simply can't do better than no obvious errors
> (although of course you can do worse) -- the only question is what you
> call "obvious".

It may be expensive, it may be difficult, and it may certainly be
impossible, but I feel the shift from ad hoc software creation to
rigorous scientific software production is obvious.  To me it's like
the shift from astrology to astronomy, or from alchemy to chemistry.


~Simon


