[Types-sig] minimal or major change? (was: RFC 0.1)

Greg Stein gstein@lyra.org
Wed, 15 Dec 1999 06:40:03 -0800 (PST)


On Wed, 15 Dec 1999, Martijn Faassen wrote:

... me: stating the "GFS proposal" isn't that major of a change ...

> The programmer needs to deal with the following new things and their
> consequences:
> 
> * New grammar with function definitions.

Right. And this is optional. I don't see this extension of the grammar or
semantic as difficult to deal with.

> * A whole new operator (which you can't overload..or can you?), which
> does something quite unusual (most programmers associate types with
> names, not with expressions). The operation also doesn't actually return
> much that's useful to the program, so the semantics are weird too.

No, you cannot overload the operator. That would be a Bad Thing, I think.
That would throw the whole type system into the garbage :-).

The operator is not unusual: it is an inline type assertion. It is not a
"new-fangled way to declare the type of something." It is simply a new
operation. The compiler happens to be able to create associations from it,
but that does *not* alter the basic semantic of the operation.

Given:

   x = y or z

In the above statement, "or" returns "y" if it is "true". In the statement:

   x = y ! z

It returns "y" if it has "z" type; otherwise, throws an exception. The
semantics aren't all that difficult or unusual.
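
A minimal sketch of that semantic in today's Python, since "y ! z" is
only proposed syntax. The helper name type_assert is hypothetical, and
both the use of isinstance() and the choice of TypeError are
assumptions: the proposal only says that an exception is thrown.

```python
def type_assert(value, expected_type):
    """Emulate the proposed `value ! expected_type` (hypothetical helper).

    Returns the value unchanged when it has the expected type,
    otherwise raises. TypeError is an assumed choice of exception.
    """
    if not isinstance(value, expected_type):
        raise TypeError(
            "expected %s, got %s"
            % (expected_type.__name__, type(value).__name__)
        )
    return value


# Roughly `x = y ! IntType`, with the builtin int standing in for IntType.
y = 42
x = type_assert(y, int)
```

Like "or", the operation hands back one of its operands; the only new
part is the exception on a mismatch.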

Programmers are confronted with "new stuff" all the time. How about:

   values = cgi.parse()

Just because the above happens to be a method invocation rather than a
syntactical construction does not reduce the amount of new semantics that
a programmer must learn.

In summary: a new operator isn't that much of a burden.

> * Interfaces with a new 'decl' statement. [If you punt on this you'll
> have to tell the innocent Python programmer he can't use the static type
> system with instances? or will this be inferenced?]

Yes, I'd prefer to punt this for a while, as it is a much larger can of
worms. It is another huge discussion piece. In the current discussion, I
believe that we can factor out the interface issue quite easily -- we
can do a lot of work now, and when interfaces arrive, they will slide
right in without interfering with the V1 work. In other words, I believe
there is very little coupling between the proposal as I've outlined, and
the next set of type system extensions (via interfaces).

Without interfaces (or the "decl" statement, or whatever), I *do* posit
that the type system will not be applicable to attributes. And no: we
cannot infer their type -- that would require global type inferencing.
Thankfully, I believe the inferencing required by the "GFS proposal" is
local to a single function at a time.

> * Unspecified syntax to actually *specify* types, I mean, a ! operator
> with something syntactically wholly new behind it may not be that simple for
> the Python programmer either. It's not that hard with IntType and so on,
> but it gets complex if you have function types, class types, etc.

True. I've been suggesting the use of dotted names, but also allowing for
the fact that new syntax can be designed to generate typedecl objects.

Specifying a typedecl is necessary to introduce any typing. That is a hit
that we take no matter what. I don't see it as a "major" change, though,
since we can keep the syntax simple and limit where/how they are used.

> * And then there's the type inferencer which will interact with the
> Python programmer's code as well, right? And the interpreter will spew
> out errors if compile time checks fail on types?

This is behind the scenes. The Python programmer is usually not impacted,
so yes... again a minimal impact.

IMO, the compile-time checks are not enabled by default. If you want them,
then you can deal with the errors and warnings.

> And you call this: '*very* little change' ?

Yes. From the standpoint of the Python programmer, there is not much more
to learn or to deal with. [unless we introduce interfaces, IMO]

> I'll call adding a list with
> names of static type associations to the module an 'an even *smaller*
> change' then, as you don't need any new operator or statement, at least
> to start with. :)

I never said yours was more complex :-). I just said that we aren't
necessarily creating a "major change". I'd like to see variable decls
punted and interfaces deferred. Add a new semantic (typedecls), a new
operator, and an extension to the "def" statement. Done.

(hehe... if only the code backing that were so easy...)

> Adding anything like static type checking to Python entails fairly major
> changes to the language, I'd think. Not that we shouldn't aim at keeping
> those transparent and mostly compatible with Python as it is now, but
> what we'll add will still be major.

Sure. I think we're just viewing it a bit differently. To me, something
like the metaclass stuff was a big change: it is capable of altering the
very semantics of class construction. Adding package support was the same
-- Python moved from a flat import space to an entirely new semantic for
importing and application packaging.

> > > The 'simplicity' part comes in because you don't need *any* type
> > > inferencing. Conceptually it's quite simple; all names need a type.
> > 
> > 1) There is *no* way that I'm going to give every name a type. I may as
> >    well switch to Java, C, or C++ (per Guido's advice in another email :-)
> 
> Sure, but we're looking at *starting* the process. Perhaps we can do
> away with specifying the type of each local variable very quickly by
> using type inferencing, but at least we'll have a working
> implementation!

I don't want to start there. I don't believe we need to start there. And
my point (2) below blows away your premise of simplicity. Since you still
need inferencing, the requirement to declare every name is not going to
help, so you may as well relax that requirement.

> > 2) You *still* need inferencing. "a = foo() + bar()" implies that some
> >    inferencing occurs.
> >    (for a compile-time check; the compiler can insert a runtime check to
> >     assert the type being assigned to "a" (but you know my opinion
> >     there...))
> 
> Sure, that's true.
> 
> [me]
> > > > > Later on you can work on blurring the interface between the two. First
> > > > > *fully* type annotated functions (classes, modules, what you want),
> > > > > which can only refer to other things that are fully annotated. By 'fully
> > > > > annotated' I mean all names have a type.
> 
> [Paul]
> > > > I think that's a non-starter because it will take forever to become
> > > > useful because the standard library is not type-safe. Anyhow I feel like
> > > > I've *already solved* the problem of integration so why would I undo
> > > > that?
> > 
> > Agreed. Also, if I grab some module Foo from Joe, and he didn't add
> > typedecls, then why shouldn't I be able to use it?
> > (and I'd just add some type-asserts if that even mattered to me)
> 
> I'm not saying this is a good situation, it's just a way to get off the
> ground without having to deal with quite a few complexities such as
> inferencing (outside expressions), interaction with modules that don't
> have type annotations, and so on. I'm *not* advocating this as the end
> point, but I am advocating this as an intermediate point where it's
> actually functional.

IMO, it is better to assume "PyObject" when you don't have type
information, rather than throw an error. Detecting the lack of type info
is the same in both cases, and the resolution of the lack is easy in both
methods: throw an error, or substitute "PyObject". I prefer the latter so
that I don't have to update every module I even get close to.
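
To illustrate how close the two resolutions are, here is a sketch under
the assumption that type information lives in a simple name-to-type
mapping. PyObject, MissingTypeInfo and both lookup functions are
hypothetical names, not part of any proposal text.

```python
class PyObject(object):
    """Hypothetical placeholder meaning 'any object at all'."""


class MissingTypeInfo(Exception):
    """Hypothetical error for the strict policy."""


def lookup_strict(typedecls, name):
    # Policy 1: refuse to continue without type information.
    if name not in typedecls:
        raise MissingTypeInfo(name)
    return typedecls[name]


def lookup_lenient(typedecls, name):
    # Policy 2: treat anything undeclared as a plain PyObject.
    return typedecls.get(name, PyObject)


# A module from Joe with no typedecls at all is still usable:
joes_module = {}
result_type = lookup_lenient(joes_module, "foo")
```

The detection step (the missing key) is identical; only the reaction
differs.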

> [me]
> > > > > Our static type checker/compiler can use the Python type constructions
> > > > > directly. We can put limitations on them to forbid any type
> > > > > constructions that the compiler cannot fully evaluate before the
> > > > > compilation of the actual code, of course, just like we can put
> > > > > limitations on statically typed functions (they shouldn't be able to
> > > > > call any non-static functions in the first iteration of our design, I'm
> > > > > still maintaining)
> > 
> > The compiler can issue a warning and insert a type assertion for a runtime
> > check. IMO, it should not forbid you from doing anything simply because it
> > can't figure out some type. Python syntax's "type agnosticism" is one of
> > its major strengths.
> 
> Yes, but now you're building a static type checker *and* a Python
> compiler inserting run time checks into bytecodes. This is two things.
> This is more work, and more interacting systems, before you get *any*
> payoff. My sequence would be:

Who says *both* must be implemented in V0.1? If the compiler can't figure
it out, then it just issues a warning and continues. Some intrepid
programmer comes along and tweaks the AST to insert a runtime check. Done.
The project is easily phased to give you a working system very quickly.

Heck, it may even be easier for the compiler to insert runtime checks in
V0.1. Static checking might come later. Or maybe an external tool does the
checking at first; later to be built into the compiler.
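
As a rough sketch of the "tweak the AST" step, the following uses the
ast module of present-day Python, which is an assumption on my part and
not what a V0.1 would have been built on. A "-> int" return annotation
stands in for the proposed "def f() ! IntType" form, and type_assert
and InsertReturnChecks are hypothetical names. Nested functions are
deliberately not handled.

```python
import ast
import copy


def type_assert(value, expected_type):
    # Hypothetical runtime check; TypeError is an assumed choice.
    if not isinstance(value, expected_type):
        raise TypeError(
            "expected %s, got %s"
            % (expected_type.__name__, type(value).__name__)
        )
    return value


class InsertReturnChecks(ast.NodeTransformer):
    """Wrap each `return expr` in type_assert(expr, <declared type>)."""

    def visit_FunctionDef(self, node):
        if node.returns is not None:
            # Snapshot the nodes first, since we edit the tree below.
            for child in list(ast.walk(node)):
                if isinstance(child, ast.Return) and child.value is not None:
                    child.value = ast.Call(
                        func=ast.Name(id="type_assert", ctx=ast.Load()),
                        args=[child.value, copy.deepcopy(node.returns)],
                        keywords=[],
                    )
        return node


source = """
def first(items) -> int:
    return items[0]
"""

tree = InsertReturnChecks().visit(ast.parse(source))
ast.fix_missing_locations(tree)  # give the new nodes line numbers

namespace = {"type_assert": type_assert}
exec(compile(tree, "<sketch>", "exec"), namespace)
checked_first = namespace["first"]
```

With this in place, checked_first([1, 2]) returns normally, while
checked_first(["foo"]) raises at the return statement.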

... proposed implementation order ...
> If you don't separate out your development path like this you end up
> having to do it all at once, which is harder and less easy to test.

Of course. Nobody is suggesting a "do it all at once" course of
implementation.

> [Paul]
> > > > I see no reason for that limitation. The result of a call to a
> > > > non-static function is a PyObject. You cast it in your client code to
> > > > get type safety. Just like the shift from K&R C to ANSI C. Functions
> > 
> > Bunk! It is *not* a cast. You cannot cast in Python. It is a type
> > assertion. An object is an object -- you cannot cast it to something else.
> > Forget function call syntax and casting syntax -- they don't work
> > grammatically, and that is the wrong semantic (if you're using that format
> > to create some semantic equivalent to a cast).
> 
> This'd be only implementable with run-time assertions, I think, unless
> you do inferencing and know what the type the object is after all. So
> that's why I put the limitation there. Don't allow unknown objects
> entering a statically typed function before you have the basic static
> type system going. After that you can work on type inference or cleaner
> interfaces with regular Python.

Why not allow unknown objects? Just call it a PyObject and be done with
it.

Note that the type-assert operator has several purposes:

* a run-time assertion (and possibly: unless -O is used)
* signal to the compiler that the expression value will have that type
  (because otherwise, an exception would have been raised)
* provides a mechanism to type-check: if the compiler discovers (thru
  inferencing) that the value has a different type than the right-hand
  side, then it can flag an error.
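
The compile-time half of that list might look something like the sketch
below. It assumes the compiler represents what it knows about an
expression as a set of exact types, with None meaning "unknown, treat
as PyObject". StaticTypeError and check_type_assert are hypothetical
names, and subclass relationships are ignored for brevity.

```python
class StaticTypeError(Exception):
    """Hypothetical compile-time diagnostic."""


def check_type_assert(inferred, asserted):
    """Model the compiler's view of `expr ! asserted`.

    inferred: set of possible types for expr, or None if unknown.
    Returns the type set the compiler may assume after the assertion.
    """
    if inferred is not None and asserted not in inferred:
        # The value can never have the asserted type: flag it now.
        names = ", ".join(sorted(t.__name__ for t in inferred))
        raise StaticTypeError(
            "value is one of (%s), never %s" % (names, asserted.__name__)
        )
    # Otherwise leave the check to run time; past this point the value
    # must have the asserted type, or an exception was raised.
    return {asserted}


after_unknown = check_type_assert(None, int)       # unknown object
after_mixed = check_type_assert({int, str}, int)   # narrowed to int
```

Note that an unknown object is accepted here, which is the point:
nothing has to be annotated before the operator becomes useful.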

The limitation you propose would actually slow things down. People would
not be able to use the type system until a lot of modules were
type-annotated.

> But perhaps I'm mistaken and local variables don't need type
> descriptions, as it's easy to do type inferencing from the types of the
> function arguments and what the function returns,

That is my (alas: unproven) belief.

> as well as the types
> of any instance attributes involved.

These would always be "PyObject" (or "Any" if you prefer) until we
introduce some kind of "decl" or interface mechanism. Needless to say, I
do agree that this would be very difficult.

> I'd like to see some actual
> examples of how this'd work first, though. For instance:
> 
> def brilliant() ! IntType:
>     a = []
>     a.append(1)
>     a.append("foo")
>     return a[0]
> 
> What's the inferred type of 'a' now? A list with heterogenous contents,
> that's about all you can say, and how hard is it for a type inferencer
> to deduce even that?

It would be very difficult for an inferencer. It would have to understand
the semantics of ListType.append(). Specifically, that the type of the
argument is added to the set of possible types for the List elements.

Certainly: a good inferencer would understand all the builtin types and
their methods' semantics.

> But for optimization purposes, at least, but it
> could also help with error checking, if 'a' was a list of IntType, or
> StringType, or something like that?

It would still need to understand the semantics to do this kind of
checking. In my no-variable-declaration world, the type error would be
raised at the return statement. a[0] would have the type set: (IntType,
StringType). The compiler would flag an error stating "return value may be
a StringType or an IntType, but it must only be an IntType".
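
A toy version of that check, to make the type-set idea concrete. This
is only a sketch: it uses the modern ast module, understands nothing
beyond "name = []", "name.append(<literal>)" and "return name[...]",
and takes the declared return type as a separate argument because the
"! IntType" syntax does not parse. The builtins int and str stand in
for IntType and StringType, and all function names are hypothetical.

```python
import ast

SOURCE = """
def brilliant():
    a = []
    a.append(1)
    a.append("foo")
    return a[0]
"""


def infer_return_types(source):
    """Return the set of possible return types of the one function in source."""
    func = ast.parse(source).body[0]
    element_types = {}  # list variable name -> set of element types
    returned = set()
    for stmt in func.body:
        if (isinstance(stmt, ast.Assign)
                and isinstance(stmt.value, ast.List)
                and not stmt.value.elts):
            # name = []  starts with an empty element type set
            element_types[stmt.targets[0].id] = set()
        elif (isinstance(stmt, ast.Expr)
                and isinstance(stmt.value, ast.Call)
                and isinstance(stmt.value.func, ast.Attribute)
                and stmt.value.func.attr == "append"):
            # The built-in knowledge of append(): the argument's type
            # joins the set of possible element types.
            name = stmt.value.func.value.id
            literal = stmt.value.args[0]
            element_types[name].add(type(literal.value))
        elif (isinstance(stmt, ast.Return)
                and isinstance(stmt.value, ast.Subscript)):
            # Indexing may yield any of the element types.
            returned |= element_types[stmt.value.value.id]
    return returned


def check_return(possible, declared):
    """Return an error message, or None if the return type is safe."""
    if possible - {declared}:
        names = " or ".join(sorted(t.__name__ for t in possible))
        return ("return value may be %s, but it must only be %s"
                % (names, declared.__name__))
    return None


possible = infer_return_types(SOURCE)
message = check_return(possible, int)
```

Everything here is local to the one function body, but only because
the behavior of append() is hard-coded into the inferencer.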

> It seems tough for the type
> inferencer to be able to figure out that this is so, but perhaps I'm
> overestimating the difficulty.

Yes it would be tough -- you aren't overestimating :-)

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/