[Types-sig] Non-conservative inferencing considered harmful

Greg Stein gstein@lyra.org
Wed, 22 Dec 1999 17:16:00 -0800 (PST)


On Wed, 22 Dec 1999, Paul Prescod wrote:
> Greg Stein wrote:
> > 
> > BUT!! -- you never said what the error in the program was, and what the
> > type-checker was supposed to find:
> > 
> > 1) was "a" filled in with inappropriate types of values?
> > 2) was "j" assigned a type it wasn't supposed to hold?
> > 3) was "k" declared wrong?
> >
> > In the absence of knowing which of the three cases is wrong, I *strongly*
> > maintain that the error at the "k = j" assignment is absolutely correct.
> > How is the compiler to know that "a" or "j" is wrong? You didn't tell it
> > that their types were restricted (and violated).
> 
> That's right. This is all valid Python code and should *not* give an
> error message.

Woah... if you presume enforcement of assignments, then a type-check error
exists somewhere. We're *trying* to raise errors. Did you typo here?

> That's why I've been trying to make a distinction between
> a type safety declaration and a type declaration. If you didn't ask for
> this code to be type-safe then it won't cause a problem.

All right. This is a different problem than I was responding to. You asked
me whether an error would be raised by a specified piece of code. I said
yes. Now you're saying the code was actually correct and an error
shouldn't be raised? Or that there was some type-safety vs type-decl
difference being discussed?

You're confusing me.

Regardless: in your world of assignment-enforcement, the code you
specified *would* raise an error. Either at the assignment to "a", "j", or
"k", depending on which ones you had declared ahead of time.

> Here's where
> the error might arise:
>
> ... 10,000 lines of code ...
> 
> for el in a:
>    j = el.doSomething()
> 
> ... 10,000 lines of code ...
> 
> type-safe
> def foo( k: Int ):
> 	k = j

Sure. You'd get an error at the "k = j" line.

What's the point? We agree on this one. [presuming assignment enforcement]
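To make the scenario concrete, here is a minimal, self-contained sketch
of it. The class Thing and its doSomething() are hypothetical stand-ins
for whatever the elided 10,000 lines actually do, and I'm spelling the
declaration as "k: int" to match your example's syntax:

    # Stand-ins for the elided code; assume a checker that enforces
    # declared types on assignment.

    class Thing:
        def doSomething(self):
            return "some string"    # in fact returns a string

    a = [Thing(), Thing()]          # candidate site of error #1

    for el in a:
        j = el.doSomething()        # candidate site of error #2:
                                    # "j" resolves to a string here

    def foo(k: int):                # candidate site of error #3
        k = j                       # <== the checker complains HERE:
                                    #     "j" is a string, "k" wants an int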

> > In other words, the error message is NOT "miles away". It is exactly where
> > it should be. When that error hit, the programmer could have gone "oops! I
> > declared k wrong. I see that j is supposed to be an Int or a String.
> > okay... lemme fix that..."
> 
> No, here's what really happens (based on my experience with ML):
> 
> "Oooops. Something weird has happened why would it expect an int or
> string? Where does this variable get its value? Humm. What are the
> possible types of el? Humm, what are the possible contents of a? Hummm.
> Why does this language make it so hard for me to find my errors?"

All right. This is because you are specifying the error was #1 or #2. If
the error was #3, then it would be immediately obvious what happened (you
declared "k" wrong).

Given error #1 or #2, this implies that at the "k = j" statement, you had
the misconception that "j" was of type Int. The compiler tells you that
your impression is wrong. No harm in that, and exactly what you're seeking
from the type checker. As a thrifty Python programmer, knowing how to
declare variables, you simply insert "j: Int" somewhere. Now you get the
error up there at the doSomething() call.

If your error was back at #1, when you constructed "a", then the error
at the doSomething() call will make you pause. Again, you'll realize
that your assumptions were incorrect, so you insert another declaration
to test them. Now you get an error at the assignment to "a".
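Continuing the sketch from above, those two debugging steps look roughly
like this (the exact declaration syntax is illustrative, not a proposal):

    # Step 1: assert your belief about "j" -- the error moves up.
    j: int

    for el in a:
        j = el.doSomething()        # <== checker now complains HERE:
                                    #     doSomething() returns a string

    # Step 2 (if the real bug was in building "a"): assert that too.
    a: list[Thing]
    a = [Thing(), "oops"]           # <== checker now complains HERE instead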

I see no head-scratching in here, other than that caused by the
programmer's failure to understand their code. That is what we are trying
to solve -- find cases where their misunderstanding is going to cause
problems at run time.

> > Find another example -- this one doesn't support your position that
> > inferencing is harmful.
> 
> It absolutely does. If I can't jump immediately from the error message
> to the line that causes the problem then something is wrong with the
> type system. I shouldn't have to "debug" compiler errors by inserting
> declarations here and there.

But I'm telling you: in this case, there is ONE of THREE problems. If the
problem was #3, then you *can* immediately jump to the spot to fix the
declaration of "k". Otherwise, your assumptions are incorrect and the
compiler just told you that. That is exactly what it is supposed to do.

When I get a segfault in my C code, it is rarely caused by a local
problem. If you believe that an error message must always appear at the
same place as its cause, then you are truly mistaken.

On one hand, you say that you don't want to "debug compiler errors" by
inserting declarations. But that is *exactly* what you're proposing to do!
You just want to be pre-emptive and force people to declare them up front.

What if "a" and "j" were *supposed* to be of type "any"? There is no
compiler error in your example and my proposed response. You specifically
did not want to enforce any types on "a" or "j". The compiler did exactly
what you told it to: no type checks on those. Later, you made an assertion
that "j" was an Int by virtue of believing that you could assign it to
"k". Well, the compiler just *helped* you out by telling you that
assumption was incorrect.

What is the problem with that?

> > Sure. And in the absence of saying so, the inferencer above did exactly
> > what it was supposed to do. You have one of three problems in your code,
> > yet no way for any compiler to know which one you meant.
> 
> It shouldn't have let me get so far without saying *exactly what I
> mean*. It shouldn't have tried to read my mind. I'm the human asking for
> the computer's help in keeping things straight. If it tries to read my
> mind it's going to be making the same mistakes I make! Rather it should
> tell me when I've first done something fishy.

But you didn't do anything fishy! Not until you assumed that "j" was an
Int, when it really wasn't. The compiler isn't trying to read your mind.
It just told you that you messed up.

The assignments to "a" and "j" were perfectly valid.

If, in your mind, they were not, then you should have made that clear. But
I'm trying to say that we should not require that. We can get along quite
fine with knowing the type information as it gets resolved, rather than
having to declare it all up front.
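For instance, a checker can resolve every type in the following with no
declarations at all (a toy illustration; the wording of the complaint is
up to the checker):

    x = 3               # resolved: integer
    y = x * 2           # resolved: integer
    z = "id-" + y       # <== flagged before the program ever runs: you
                        #     can't add a string and an integer. No
                        #     declaration was needed anywhere.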

>...
> Therefore our real decision is: do we want to force programmers who want
> static type checking to sprinkle their code with declarations
> *explicitly* or do we want to wait until they get frustrated with the
> inferencer?

You are assuming that it will cause frustration. It isn't making a single
guess. It is tracking exactly what you are doing with your types -- it is
entirely logical and deterministic. When it tells you that you goofed, that is
because you really did. That is valid, helpful information.

I do not want to require explicit declaration. There is no need for it.

> It seems to me that "simple and explicit" is more Pythonic
> than "we'll guess what you mean and you can being explicit if we guess
> wrong." If you don't want to put in declarations, don't use static type
> checking!

Stop that. There are no guesses occurring!

> > We will lose one of the best things of
> > Python -- the ability to toss out and ignore all that declaration crap
> > from Algol-like languages. If you give a means to people to declare their
> > variables, then they'll start using it. "But it's optional!" you'll say.
> > Well, that won't be heard. People will just blindly follow their C and
> > Java knowledge and we will lose the cleanliness of syntax that Python has
> > enjoyed for so long.
> 
> I don't see the absence of type declarations as having much to do with
> Python's cleanliness.

I do. Python is remarkably devoid of syntactic sugar. The lack of
declarations is a huge factor in that.

> As Tim said, declarations improve readability by
> serving as documentation.

I agree and submit:

# foo is an Integer

Perfectly valid documentation if that is what you need. No reason to
introduce syntax for that.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/