[Types-sig] feedback, round 2 (LONG) (Re: PyDL RFC 0.4)

Greg Stein gstein@lyra.org
Tue, 4 Jan 2000 04:50:08 -0800 (PST)


On Mon, 3 Jan 2000, Paul Prescod wrote:
> Thanks for your feedback. I will need a lot more before we are done with
> this thing!

And thanx for the detailed response :-) ... let's see if we can get
another round in place here...

> Greg Stein wrote:
> > Wouldn't these be called "abstract interfaces" or "parameterized
> > interfaces"? That seems to be a more standard terminology.
> 
> I don't like "parameterized interface" because Sequence(Int) *is*
> parameterized. I need to distinguish Sequence(_X) from Sequence(Int).
> 
> Abstract is a little better, but we aren't dealing with abstract classes
> as they are known in C++ or Java.

Yah. I didn't really like either of my suggestions, for similar reasons.
However, I think we want to reserve the name "incomplete interface" for
the recursive-type situation.

Maybe you could use the terms "parameterized interface" for Sequence(_X)
and "bound interface) for Sequence(Int). Hrm. Maybe "unbound" and "bound".
What do you think?

> > > Typedefs allow us to give names to complete or incomplete interfaces
> > > described by interface expressions. Typedefs are an interface
> > > expression re-use mechanism.
> > 
> > typedefs are also used to assign names to things like "Int or String".
> > 
> > I don't see "Int" as an interface (even though it probably is in *theory*,
> > it doesn't seem that way in layman usage).
> 
> It makes the spec much easier to read and write if we think of them
> uniformly as interfaces. Else we must constantly refer to "interfaces
> and thingees like Int and String."

Well... a name in a type declarator could refer to several objects: a
class, an interface, a type object, or a typedecl object. Each of these
certainly has an associated interface, but they are not (all) interfaces.

You're right on the Int thing. I withdraw, but will still point out that a
typedef can refer to more things than interfaces. For example:

a = typedef types.IntType or types.StringType
b = typedef a or types.ListType

In the former, "a" refers to a typedecl object which is the union of two
type objects. In the latter, "b" refers to a union between a typedecl
object and another type object.
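
To make that concrete, here is a rough sketch of what such a typedecl
object might look like at runtime (the UnionDecl name and its check()
method are purely illustrative, not part of the proposal):

  import types

  class UnionDecl:
    # illustrative typedecl object: a union of typedecls and/or type objects
    def __init__(self, *alternatives):
      self.alternatives = alternatives

    def check(self, obj):
      # an object passes if it matches any of the alternatives
      for alt in self.alternatives:
        if isinstance(alt, UnionDecl):
          if alt.check(obj):
            return 1
        elif type(obj) is alt:          # alt is a plain type object
          return 1
      return 0

  a = UnionDecl(types.IntType, types.StringType)
  b = UnionDecl(a, types.ListType)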

I would amend your statement to read:

  "typedefs allow us to give names to interface expressions."

[ note that I will still argue they are type declarators ]

Your language also implicitly/subtly seems to state that we will
*generate* an (unnamed) interface as the result of an interface
expression. For sanity's sake, I would ask that we avoid that notion... it
will then add Yet Another Type of Interface to our already burgeoning set
of complete, incomplete, bound, unbound, concrete, abstract,
parameterized, or whatever you want to call them :-). At the basic level,
we have interface objects and a type declarator will not automatically
generate composite interface objects. You can argue mathematically that
this happens, but I've seen too many specs that fall into that trap and
become hard to understand. Please just have one notion of interface
objects and avoid generated/composite/synthesized interfaces.

> > And I still don't understand the need to specify that *two* files exist.
> > Why are we going to look for two files? Isn't one sufficient?
> 
> One is where you put your hand-written declarations. The other is where
> the interpreter dumps the declarations that it extracts from the Python
> file. That way you can use all inline declarations, a separate file or
> *both* with no danger of having your hard work overwritten.

But this is part of my point (also covered in other parts of that
email, but I'll summarize here): I don't believe that we will necessarily
create that second file. I think it is a spurious requirement to state
that we will extract/replicate inline definitions out into another file.

You rightfully point out somewhere that an implementation can take the
inline stuff and cache it somewhere, but let's have the spec deal with
only *two* inputs: the inline data and an interface file. If the
implementation has cached the inline data somewhere... fine... but let's
not spec that as another file to deal with.

This also helps to reduce our problem set to: given an interface
declaration inline and an interface declaration in an interface file, how
do we merge them?

[ note that a module may have multiple interfaces declared inline and each
  interface may occur in a different file, but on a per-interface basis,
  we only need to worry about two locations. ]

Adding a second file means we must deal with (discuss) three inputs. We
can point out the third is just a copy of the inline data, but then why
should we bring that up in the spec? Just talk about inline and an
interface file.

And what will the rules be for an interface showing up in multiple places?
For example, let's say that I define an interface three times in my module
and twice in interface file A and another twice in interface file B. When
I type check the whole bugger, I feed it the module and two interface
files. Do we get an error, or do they resolve somehow?

Are the interfaces in the interface file(s) always referred to by
module.name, or are they jammed into the module namespace somehow? (this
probably depends on the definition of the external tool's process)

> > In the above example, we have three interface objects. One is available
> > via the name "foo1" in the module-level namespace. One is available as
> > "Bar.foo2" (via the class' namespace, and the class is in the module
> > namespace). The third, foo2, is only available within the function Baz().
> 
> We're making a static type checking system. I don't see what runtime
> definition of interfaces in a function scope buys other than confusion.
> If we need to have interface declarations in random contexts then we
> should differentiate compile-time available ones with a "decl" keyword
> prefix.

The different interfaces are to deal with scoping of their usage. We scope
classes and functions all the time in Python. Nothing says that we must
always place them at the global level.

Maybe I want a private interface to be used within my class. For example:

  class MyClass:
    interface Item:
      decl member a: Int
      decl member b: Int

    decl member _items: List{Item}
    ...

I don't want to be forced to move that interface out to the global
namespace. Nor do I want to worry about whether "Item" interferes with an
Item defined elsewhere -- that name occurs within a "class" definition, so
I expect it to be scoped that way.

You mention runtime definition. Your words :-). I'm talking about
compile-time definition and scoping of the interfaces.

Sure, there is also a runtime component that assigns an interface object
to MyClass.Item, *but* that does not negate the scoping.
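
Plain old Python already works this way for nested classes, so the
interface statement would simply follow suit (today's syntax, nothing new
here):

  class MyClass:
    class Item:                   # scoped inside MyClass, just like the
      pass                        # interface above would be

    def __init__(self):
      self._items = []

    def add(self, a, b):
      item = MyClass.Item()       # the nested name lives in the class namespace
      item.a = a
      item.b = b
      self._items.append(item)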

[ and yah: if you track down my original comment, it started out with the
  basis of runtime objects being assigned to names in different
  namespaces. So sue me :-) ... the scoping is the basic requirement for
  wanting those names in the different namespaces. ]

> > I do not believe there is a need to place the interfaces into a distinct
> > namespace. I'd be happy to hear one (besides forward-refs, which can be
> > handled by an incomplete interface definition).
> 
> A static type checking system exists to precede and constrain dynamism,
> not to expand it.

Huh? I don't see how that answers the need to place them into a different
namespace. It just sounds like some mumbo-jumbo.

AFAIK, a static type checking system exists to check whether you've
written your program correctly.

[ A good compiler can use that to impute certain constraints and thereby
  optimize the program, but that is a side effect rather than the main,
  original purpose. ]

Back to my supposition: I do not believe we need to place names and
interface objects into a distinct namespace. I believe these can go into
the namespace of the context where the interface is defined.

And my question: what is the requirement that establishes the need for a
new namespace?

[ namespace in terms of resolving names; a dict of name:interface is just
  a dict. ]

>...
> > > Certain interfaces may have only one implementation. These "primitive"
> > > types
> > > are Int, Long, Float, String, UnboundMethods, BoundMethods, Module,
> > > Function
> > > and Null. Over time this list may get shorter as the Python
> > > implementation is generalized to work mostly by interfaces.
> > 
> > I don't understand what you're saying here. This paragraph doesn't seem to
> > be relevant.
> 
> It is crucial to the distinction between implementations and interfaces.
> Certain types do not have such a distinction so you cannot just
> implement the right attributes and expect it to "work". You cannot make
> a new class that Python treats as an Integer. You cannot make a new
> class that MFC treats as a window handle.

Ah. Understood. Your original statement didn't make this clear :-)

> Here's what I say in my
> current working version:
> 
> > Sometimes there exists code that is only compatible with a single
> > implementation of an interface. This is the case when the object's
> > actual bit-pattern is more important than its interface. Examples
> > include integers, window handles, C pointers and so forth. For this
> > reason, every class is considered also an interface. Only instances of
> > the class and its subclasses (if any) conform to the interface. These
> > are called "implementation specific interfaces."

The line about "bit-pattern" is superfluous. I think that should be
omitted as it is distracting.

I would also change the wording to:

"Every class implies a specific interface. Only instances of this class
and its subclasses (if any) conform to this implicit interface."

The "for this reason" doesn't make sense. I don't see the connection
between the preceding sentences and the next sentence.

Also, the "its subclasses" is a bit shaky. Consider:

  class Foo:
    def f1(self, x):
      ...
  class Bar(Foo):
    def f1(self):
      Foo.f1(self, 5)

The subclass has a different signature for f1(). While it is pretty
uncommon to use this pattern on random methods, it is *very* common for
the __init__ method. The subclass is used to fill in specific parameters
for the superclass constructor. Since the __init__ method forms part of
the implicit interface, the Bar subclass does not conform to that
interface.
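
For instance, this everyday pattern (plain Python, nothing hypothetical)
gives the subclass a narrower __init__ signature than its base:

  class Connection:
    def __init__(self, host, port):
      self.host = host
      self.port = port

  class LocalConnection(Connection):
    def __init__(self):
      # fill in the specific parameters for the superclass constructor
      Connection.__init__(self, 'localhost', 8000)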

I understand where you're going with the paragraph now, and it is
definitely needed. However, I think the topic needs a bit more fleshing
out.

> > > Note: The Python interface graph may not always be a tree. For
> > > instance there might someday be a type that is both a mapping and a
> > > sequence.
> > 
> > In the above statement, you're mixing up implementations (which can use
> > disjoint interfaces) with the interface hierarchy. Or by "type" are you
> > referring to a new interface which combines a couple interfaces?
> > 
> > Note that I think it is quite valid to state that interfaces must always
> > be a tree, although I don't see any reason to avoid multiple-inheritance.
> 
> So if we allow multiple inheritance then they will not always be a tree,
> right?

Correct. Multiple inheritance creates a DAG rather than a tree. If you
aren't careful, you might actually get cycles:

  decl incomplete class Foo

  class Bar(Foo):
    ...
  class Foo(Bar):
    ...

Oops! :-)

Actually, a simple rule can toss this out: don't allow an incomplete class
as a base class.

And to my comment: I was pointing out that we could specify a *rule* that
it must be a tree. But I don't see any reason to do that... multiple
inheritance on interfaces seems fine to me.

[ the Java designers say otherwise, though ]
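
If we did allow incomplete classes as bases, the checker would need a
little loop detector over the declared base-class graph, something like
this sketch (the dict-of-base-names representation is just made up for
illustration):

  def has_cycle(bases):
    # bases: dict mapping an interface/class name to the list of its base names
    for start in bases.keys():
      stack = [(start, ())]            # (name, names on the path so far)
      while stack:
        name, path = stack.pop()
        if name in path:
          return 1                     # a name reached itself: cycle
        for base in bases.get(name, []):
          stack.append((base, path + (name,)))
    return 0

  # the Foo/Bar example above:
  #   has_cycle({'Foo': ['Bar'], 'Bar': ['Foo']})  ==>  1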

> In my working draft, "Class" is a sub-interface of both "Interface" and
> "Callable".

Then you have a DAG rather than a tree :-)

I'd simply ask that you clear up that text a bit. When I saw "type", I
thought of the builtin Python types. It looks like you meant "interface".

> > >...
> > > Interface expression language:
> > > ==============================
> > 
> > These are normally called "type declarators". I would suggest using
> > standard terminology here.
> 
> We aren't dealing with types. We are dealing with interfaces. And we
> aren't dealing with declarators, but with expressions. These expressions
> can be used in contexts other than type declarations.

In this context, "type" and "interface" are the same. We are not referring
to the builtin types. The "expressions" you refer to are exactly analogous
to what C calls a "type declarator" -- a thing which declares a specific
type.

   int []

That is a type declarator. It refers to an array of ints. We are using the
same semantic concept, but with Pythonic syntax and base types.

We are also constructing a composite [type declarator] from basic pieces.
This composite specifies a type that an object may have. Yes, you could
say that the composite specifies the interface that the object may conform
to.

But: a type declarator can be more than an interface, as I mentioned
above:

  types.IntType or types.StringType

In this case, the declarator refers to actual PyType objects, not
interface specifications. The compile- and run-time checks would look at
the ob->ob_type field ... not the __interfaces__ attribute (which might
not even exist!).

If we go back to your "implementation specific interfaces" concept... you're
really referring to a PyType object rather than an interface spec.
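
In other words, the eventual runtime check has to branch on what kind of
thing the declarator names (a sketch only; the satisfies() helper is
invented, and conforms() is just my suggested spelling from later in this
note):

  import types

  def satisfies(obj, declarator):
    # illustrative runtime check for a single declarator
    if type(declarator) is types.TypeType:
      # a PyType object: compare the object's actual type (ob->ob_type)
      return type(obj) is declarator
    # otherwise treat it as an interface object with a conforms()-style method
    return declarator.conforms(obj)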

>...
> > Just use the "dotted_name" construct here -- that is well-defined by the
> > Python grammar already. It also provides for things like
> > "os.path.SomeInterface".
> 
> The construct is fine for the grammar but it doesn't describe the
> semantics.

Then add some text... geez. The point is that the text you provided does
not seem to allow deeply nested interface references. It also recreates
some of the dotted_name grammar construct, but isn't as clear about it.

> > Note that interfaces do *not* have to occur in a PyDL module. Leave the
> > spec open for a combined syntax -- we shouldn't be required to declare all
> > interfaces in separate files.
> 
> Interfaces in a Python file are automatically extracted and are thus
> available in a PyDL module.

I disagree, as mentioned above. If an implementation happens to cache
inline data, then that is its business. Let's leave it out of the spec.

>...
> > > 3. parameterize a interface:
> > >
> > > Array( Int, 50 )
> > > Array( length=50, elements=Int )
> > >
> > > Note that the arguments can be either interface expressions or simple
> > > Python expressions. A "simple" Python expression is an expression that
> > > does not involve a function call or variable reference.
> > 
> > I disagree with the notion of expressions for the parameter values. I
> > think our only form of parameterization is with typedecl objects. The type
> > checker is only going to be dealing with type information -- expression
> > values as part of an interface don't make sense at compile time.
> 
> Parameters would be made available to implementing classes at runtime. I
> see a lot of virtue in numeric bounds, string prefixes and so forth:
> 
> typedecl colors as Enum(elements=["Red","Green","Blue"])

Woah, Nelly!

Hrm... let me find a reference here...

  http://www.python.org/pipermail/types-sig/1999-December/000776.html

Basically, the message points out that runtime-parameterization is a huge
problem. Python is just not (yet) set up to handle runtime
parameterization.

Constructors? Yes. If your Enum() example referred *only* to a
construction of an Enum instance which is then used in some way to perform
type checks... then yah. I could see something like this.

But let's be very clear. Using runtime parameterization with something
like this is out of the question:

  class (_X) Foo:
    ...

Also, I'd like to note that your example could be rewritten as:

  # construct enum instance
  RGB_Enum = Enum(elements=["Red","Green","Blue"])

  # create a typedef (I'm assuming you meant "typedef ...")
  typedef colors as RGB_Enum

But the typedef line isn't really necessary since it is a simple alias for
the class instance:

  RGB_Enum = Enum(...)
  colors = RGB_Enum

or:

  colors = Enum(...)

Presuming that later you will be doing something like:

  def foo(rgb: colors)->Int:
    ...

or:

  rgb = getColors(image) ! colors
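
And the Enum behind it needn't be anything magic -- just a class whose
instances can check a value at runtime (a rough sketch; this Enum class,
its check() method, and set_background() are mine, not from the proposal):

  class Enum:
    def __init__(self, elements):
      self.elements = elements

    def check(self, value):
      # runtime conformance: the value must be one of the declared elements
      return value in self.elements

  colors = Enum(elements=["Red", "Green", "Blue"])

  def set_background(color):
    if not colors.check(color):
      raise TypeError("expected one of %s" % (colors.elements,))
    # ... proceed with a value we know is good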


It is *very* interesting to note that your example here has effectively
duplicated some of the compile-time vs. run-time distinctions from one of
my previous emails:

  http://www.python.org/pipermail/types-sig/2000-January/001095.html

In that email, I stated that the type-checker could not do much with a
construct such as:

  a = doSomething()
  def foo(x: a)->None:
    ...

And that it would defer to a runtime check. (and if the option flags were
set properly, it would issue a "suspicious construct" warning or
something).

If you see the equivalence between value-expression-based parameterization
and my two referenced postings, then I believe we may be getting
somewhere! :-)

In summary: torch value-expression parameterization -- only allow
type-expression parameterization. Your value-expression-based example is
actually a "type-checker-punted-to-runtime" example, which I believe we
should be able to handle as such.

> > I agree. The return type should also be optional. Note that we can't allow
> > just a name (and no type), as that would be ambiguous with just a type
> > name.
> 
> I like the explicitness of requiring a return type and I harbor hopes
> that Python will one day distinguish between NO return type and
> something that happens to be able to return None.

No problem. Please mark this as an issue in the RFC. There is at least one
vote for requiring the return type, and one for making it optional. Let's
see some other discussion/input.

Personally, I want to be able to mark one or two function parameters as
having types, but not worry about some other params or the return types.
i.e. when I'm retrofitting some code.

> > > Note that at this point in time, every Python callable returns
> > > something, even if it is None. The return value can be named,
> > > merely as documentation:
> > >
> > > def( Arg1 as Int , ** as {String: Int}) -> ReturnCode as Int
> > 
> > Ack! ... no, I do not think we should allow names in there. Return values
> > are never named and would never be used. Parameters actually have names,
> > which the values are bound to. A return value name also introduces a minor
> > problem in the grammar (is the name a name for the return value or a type
> > name?).
> 
> How is the issue different in the return code versus in parameters?

Parameters use those names in binding values to names in the local
namespace.

The return code name is never bound anywhere. It is totally superfluous.
Barely even handy as documentation. "ReturnCode" doesn't say much :-) If
you want doc, then use a docstring (rather than introducing even more
syntax).

And: as I mentioned above: there are definite grammar issues with allowing
a NAME right there. The parser can't tell whether the NAME is a return
value name or the first part of a type declarator. Tony has already
pointed out this problem in some of my current syntax change proposals.
I've gotta go jigger up some ugly grammar stuff to solve that. I would
hope that we can keep the ugly grammar stuff minimized.

> I think that this is a very useful feature for IDEs and other
> documentation and has zero cost.

If you feel strongly, then okay... mark down another issue in the RFC to
collect input on this :-)

> > >...
> > >  2. Basic attribute interface declarations:
> > >
> > > decl myint as Int                   # basic
> > > decl intarr as Array( Int, 50 )     # parameterized
> > > decl intarr2 as Array( size = 40, elements = Int ) # using keyword
> > > syntax
> > 
> > "as" does make sense in this context, but I'd use colons for consistency.
> 
> The inconsistency is very minor and I am somewhat uncomfortable with
> appearing to begin a suite. I doubt that programmers would even notice
> the inconsistency.

I noticed it :-)

If colons are used in function parameters, then we should use colons in
the declarations.

> > > So this is allowed:
> > >
> > > class (_X,_Y) spam( A, B ):
> > >     decl someInstanceMember as _X
> > >     decl someOtherMember as Array( _X, 50 )
> > >
> > >     ....
> > 
> > You haven't introduced this syntax before. Is this a class definition? 
> 
> Er, yes, but I don't have that syntax in the language anymore. Just
> change "class" to "interface"

We should. I want to parameterize classes. Don't force me to extract an
interface from my class definition -- I want an implicit, parameterized
interface derived from my class definition and its inline declarations.

There are issues with the syntax for parameterizing classes and instances
(as mentioned in another email tonite), but I *do* want to parameterize
both.

The basic premise is that I will use classes as *both* an interface
definition and an implementation. To that end, I want all the features of
a standard interface definition to be available through my class
definition.

> > > These are NOT allowed:
> > >
> > > decl someModuleMember(_X) as Array( _X, 50 )
> > 
> > Reason: modules are not parameterizable.
> 
> No, the reason was stated before. Because *attributes* like
> someModuleMember cannot be declared to need incomplete interfaces. Only
> interfaces can be incomplete.

All right. I see your distinction. Not sure what I was thinking :-)  I
might have glossed over the first (_X) and just seen the second. Module
attributes should be able to have that second _X within their interface
definitions.

>...
> > > class (_Y) spam( A, B ):
> > >     decl someInstanceMember(_X) as Array( _X, 50 )
> > >
> > > Because that would allow you to create a "spam" without getting around
> > > to saying what _X is for that spam's someInstanceMember. That would
> > > disallow static type checking.
> > 
> > Agreed. The _X must occur in the class declaration statement.
> 
> No, that's another typo. Here's another example and it comes back to the
> fact that attributes cannot be incomplete:
> 
> interface (_Y) spam( A, B ):
>     decl someInstanceMember(_Y) as Array( _Y, 50 ) 

Gotcha. I completely agree.

If you rewrite to:

  interface (_Y) spam( A, B ):
      decl someInstanceMember as Array( _Y, 50 )

then it is legal.

[ with the caveats that a class is also parameterizable, I think we need
  to use different syntax for parameterization, and that value expressions
  should not occur in a parameter binding. ]

>...
> > Note that you will then have to define a
> > rule for whether "decl x as Int" is the "same" as "decl x as Number". For
> > conformance, is the first too specific, or is it just a more concrete form
> > of the latter? (but still allowed)
> 
> Well, in general there is no problem specifying a base versus derived
> interface. Your choice of specificity. The "Int" is a special case
> because it is also an implementation specific interface derived from

Ignore the Int thing. Let's say I have the following:

  interface Foo:
    ...
  interface Bar(Foo):
    ...

The question is now: how are these viewed from a conformance standpoint?

-- Are List{Foo} and List{Bar} the same?  Probably not.
-- If I have a List{Bar}, can I pass it to something that asks for a
   List{Foo}?  Should be able to. But what if that function inserts a Foo
   into my List{Bar}? Oops!
-- If I have a List{Foo}, can I pass it to something that asks for a
   List{Bar}?  Probably not. The target wants Bar objects, which Foo
   objects definitely are not.

These rules will need to be defined at some point [for function parameter 
passing/binding].
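
The second bullet's trap, spelled out in plain Python (the List{...}
declarations exist only in the comments):

  class Foo:
    pass

  class Bar(Foo):
    pass

  def add_default(items):        # would be declared as taking List{Foo}
    items.append(Foo())          # perfectly legal for a List{Foo}

  bars = [Bar(), Bar()]          # would be declared as List{Bar}
  add_default(bars)              # allowed if List{Bar} conforms to List{Foo}...
  # ...but now 'bars' holds a plain Foo, violating its List{Bar} declaration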

This sub-email-thread started with a parameterization of the form:

  decl Add(_X as Number) as def(...

The binding of a type to _X would follow similar rules to those defined
above for function parameters.


>...
> I don't want to introduce a new kind of interface declaration for
> classes. You should use ordinary interface declarations. There is no
> need for a new kind of "class-y" interface declaration and it will
> likely be abused so that more code is implementation specific than it
> needs to be.

Yes, but I do. I feel very strongly against having to separate an
interface definition out of my implementation.

If I have to say:

   interface TreeNodeInterface{_X}:
      def __init__(self, a: _X,
                   Right: TreeNodeInterface{_X} or None,
                   Left: TreeNodeInterface{_X} or None)

   class TreeNode{_X}:
      __interfaces__ = TreeNodeInterface
      def __init__(self, a: _X,
                   Right: TreeNodeInterface{_X} or None,
                   Left: TreeNodeInterface{_X} or None):
        my_code()

then I'll scream. I'll scream even louder if that interface declaration
has to go into a separate file. The maintenance would be painful.

And note that I have no idea how to remove the _X parameterization from
the class definition. I need/want to include the declarations in the
__init__ function and need the specific parameterization of
TreeNodeInterface. I think... I'm not even so sure how this thing would
work in this case.

Basically: I need to be able to parameterize a class definition. And don't
dare to tell me that the implied interface of that class is "abused so
that more code is implementation specific than it needs to be." As the
application developer, this is my choice. I do not want dual-path
maintenance, and the binding of interface to implementation is entirely
appropriate in my application.

> > If you're just trying to create the notion of a factory, then "def" is
> > appropriate:
> > 
> >   decl TreeNode(_X): def(a: _X,
> >                          Right: TreeNode(_X) or None,
> >                          Left: TreeNode(_X) or None)    \
> >                        -> (ParentClasses or Interfaces)
> 
> No, we need ot differentiate functions from classes because classes can
> be subclassed. Otherwise there is no difference. That's why all we do is
> change the keyword.

You shouldn't be able to subclass a class without an actual definition
being present. Otherwise, you could end up with loops (as I mentioned
above).

Therefore, you have factories, or you have classes. But you don't declare
a class as a constructor function -- you define the whole thing.

[ or declare it "incomplete" per my suggestion above... not even the
  constructor is known in that case... not until definition time. ]

> > These should be assignments and use a unary operator. The operator is much
> > more flexible:
> > 
> >   print_typedecl_object(typedef Int or String)
> > 
> > Can't do that with a typedef or decl *statement*.
> 
> You can't do it in one line, but you can do it. It is of debatable
> utility anyhow. The vast majority of the time you want to introspect
> interface objects that are *in use* not hard-coded ones. Introspection
> is really a secondary consideration anyhow.

My example was extremely simple. You cannot discount the concept based on
that. Here, I'll give you a better example:

  def some_kind_of_apply_thing(func, params):
    arg_tds = func.func_signature.values()
    td = arg_tds[0]
    for arg_td in arg_tds[1:]:
      td = typedef td or arg_td
    for param in params:
      if not td.check(param):
        raise ArgTypeError
    return apply(func, params)

Is that better? The typedef operator is a *big* win in terms of clarity
and utility.

And as I've argued before, the type-checker can easily understand all the
forms of:

   x = typedef some.typedecl.expression

that you were proposing to allow in the "typedef" statement (e.g. only
dotted names, parameterizations, or other simple constructions).

> > Also note that your BoundedInt example is a *runtime* parameterization.
> > The type checker can't do anything about:
> > 
> >   decl x: PositiveInt
> >   x = -1
> 
> That's true. I don't see that as a problem.

It is a problem if you are shooting for *only* compile-time checking. I've
been recommending the ability to punt to some runtime checks for a while,
and it seems that you've been against them. It sounds like you're starting
to allow for them now.

The above construct is fine, then. I simply was trying to point out that
you wouldn't be happy with it because we couldn't do anything with it at
compile time.

> > But we *can* check something like this:
> > 
> >   def foo(x: NegativeInt):
> >     ...
> >   decl y: PositiveInt
> >   y = 5
> >   foo(y)
> 
> I'm curious how you would see your type inferencer knowing whether to
> inference 
> 
> j=6
> 
> as "int", "PositiveInt" or "NegativeInt"

"Int". All integer constants are "Int". The compile-time checker has no
way to determine the semantics of value-expression-based parameterized
types such as your PositiveInt or NegativeInt. Since it can't know the
semantics, then it can't classify the integer constant appropriately.

> Anyhow, you still don't prevent:
> 
> decl y: NegativeInt
> y=5
> foo(y)

Correct. (and your point is...?)

I never figured we could prevent this because of the value-expression
binding for NegativeInt causing the whole thing to be punted off to
runtime.

[ and for other reasons, the value-expression binding can't be done ]

> > But this latter case is more along the lines of naming a particular type
> > of Int. The syntax could very well be something like:
> > 
> >   decl PositiveInt: subtype Int
> >   decl NegativeInt: subtype Int
> 
> No need for the keyword "subtype". Two different int typedefs should
> both be usable as ints (I think), but not as each other.

Agreed. I was trying to shoot for a possible syntax that would provide the
compile-time checker with that semantic.

In the "subtype" example, the checker can know this now.

  decl a: PositiveInt
  decl b: NegativeInt
  a = 5 ! PositiveInt
  b = -1 ! NegativeInt
  func_taking_ints(a, b)   # success
  a = b                    # failure

The checker knows that PositiveInt and NegativeInt are subtypes of Int and
can therefore be bound to the params of func_taking_ints(). But it knows
they are different types, so it raises an error on the assignment.
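
At runtime, all a value-constrained subtype can really do is carry a
predicate around for the punted checks (a sketch; ValueSubtype and its
check() are invented names, not part of the proposal):

  import types

  class ValueSubtype:
    # illustrative: a named subtype = a base type plus a runtime predicate
    def __init__(self, name, basetype, predicate):
      self.name = name
      self.basetype = basetype
      self.predicate = predicate

    def check(self, value):
      return type(value) is self.basetype and self.predicate(value)

  PositiveInt = ValueSubtype('PositiveInt', types.IntType, lambda v: v > 0)
  NegativeInt = ValueSubtype('NegativeInt', types.IntType, lambda v: v < 0)

  # the '!' assertions above would then boil down to something like:
  #   a = 5
  #   assert PositiveInt.check(a)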

> > The type-checker would know that PositiveInt is related somehow to Int
> > (and it would have to issue warnings when mixed). 
> 
> Argh. More warnings. I do not view it within our purview to require
> implementations to issue warnings. We define something as legal or as
> illegal. Anything else is between the implementor and the user.

I have *never* recommended to issue runtime warnings. I was referring to a
compile-time warning about mixing the two types. e.g. between the checker
and the implementor.

And the checker will most certainly issue warnings.

> > It would also view
> > PositiveInt and NegativeInt as different (thereby creating the capability
> > for the warning in the foo(y) example above).
> > 
> > Anyhow... as I mentioned above, we should only be allowing typedecl
> > parameters. We can't type-check value-based parameters.
> 
> Not at compile time, but we can provide them to the implementation
> object which can check them at runtime.

While true, we cannot truly parameterize classes at runtime. We can only
create instances [for use with runtime type-checking] with different
constructor values.

In that sense, we *still* do not have value-based parameterization.

> > If you want to introduce a type name for a runtime type-enforcement (a
> > valid concept! such as your PositiveInt thing), then we should allow full
> > expressions and other kinds of fun in the parameter expressions (since the
> > runtime type should be createable with anything it wants; we've already
> > given up all hope on it). But then we get into problems trying to
> > distinguish between a type declarator and an expression. For example:
> > 
> >   MyType = typedef ParamType(0, Int or String)
> > 
> > In this example, the first is an expression, but the second should be a
> > type declarator. Figuring out which is which is tricky for the parser.
> 
> Well maybe we need to just make the expression syntax unifiable with
> Python syntax. I am not comfortable to say that a type expression will
> never occur unadorned in Python code or vice versa.

I posit that type declarator syntax can never be mixed directly with
standard expression syntax. [Int] has very different meanings based on
whether you're talking about a type declarator or an expression.

The typedef operator is *intended* to allow this mixing. The RHS of the
operator is a type declarator, and it produces a value [for use in a Python
expression].
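
A two-line illustration of why the two syntaxes can't simply be
interleaved (using the [Int] spelling from the previous paragraph):

  x = typedef [Int]      # RHS is a type declarator: "a list whose items
                         # are Int"
  y = [Int]              # ordinary expression: a one-element list that
                         # happens to contain the Int object itself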

>...
> > >     "typesafe":
> > >     ===========
> > > In addition to decl and typedecl the keyword "typesafe" can be used to
> > > indicate that a function or method uses types in such a way that each
> > > operation can be checked at compile time and demonstrated not to call
> > > any function or operation with the wrong types.
> > 
> > What about the problem of non-existence? How "safe" is "typesafe"? And how
> > is this different from regular type checking?
> 
> It is how you *turn on* regular type checking at compile time.

Isn't it also enabled by the presence of declarations?

I also believe that the checker won't be integrated directly into the
compiler. But that doesn't negate the question: how safe is "typesafe" and
how does that differ from regular type checking (e.g. when I put a
declaration on a function parameter)?

>...
> > Class definitions also have the parameterization syntax change:
> > 
> >   class (_X) Foo(Super):
> >     decl node: _X
> >     ...
> 
> No, that was a bug in the spec. Classes are declared just like functions
> except for the "class" keyword. They behave just like functions except
> that they can be subclassed.

I disagree. Explained above.

> > Class and modules should also have a syntax for specifying the
> > interface(s) they conform to. 
> 
> I think that that is extracted from the class declaration's return type
> automatically. We will have to invent something for modules. "moddecl"
> or something.

Classes are not always declared. They may simply be defined. (and
certainly, the module is plain-old-defined) I have proposed that we define
the interfaces conformed-to by assigning them to the __interfaces__
attribute of the class and module.

>...
> > > __conforms__ : def (obj: Any ) -> boolean
> > 
> > Just call it "conforms". There is no need to "hide" this method since the
> > interface does not expose interface members as its *own* members.
> 
> It is __conforms__ for the same reason that __repr__, __cmp__ and
> __init__ are hidden: it is actually invoked through the magic "isa"
> syntax, not directly. I would be amenable to getting rid of the
> underscores Python-wide, but in any case I want to be consistent.

Different cases.

  class Foo:
    def __init__(self):
      self.a = 5
    def __str__(self):
      return str(self.a)

The __str__ needs to have the underscores to avoid conflicts with "a".

An interface object does not expose the defined attributes as its own
attributes. Hence, there is no need for conflict avoidance.

conforms() will be quite sufficient. Jim Fulton has some other methods
defined for interface objects that may be interesting.
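
To see why there is no conflict to avoid, consider a toy interface object
(my own sketch, not Jim's code; the attr_names list just stands in for
whatever the real implementation stores):

  class Interface:
    def __init__(self, name, attr_names):
      self.name = name
      self.attr_names = attr_names    # the declared attributes live *here*

    def conforms(self, obj):
      # the declared names are looked up on obj, never on the interface itself
      for attr in self.attr_names:
        if not hasattr(obj, attr):
          return 0
      return 1

  Stack = Interface('Stack', ['push', 'pop'])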

>...
> > > There is a backwards compatible syntax for embedding declarations in a
> > > Python 1.5x file:
> > >
> > > "decl","myint as Integer"
> > 
> > Just use a single string. The parse tree actually gets even uglier if you
> > put that comma in there :-). We can pull the "decl" out just as easily if
> > it is the first part of a "decl myint: Integer".
> 
> I don't want to get confused with docstrings.

Well... the checker certainly won't be confused. Humans? Nah.

Really: go generate a parse tree for that comma-separated bugger sometime.
Then try to fit that in with a mechanism for extracting the declaration.
It will be a bitch times three.

If/when I add the string-based declaration stuff to my check.py prototype,
I'll be doing it as a single string. I truly don't want to deal with
commas between a couple strings. I doubt any other person who goes to
implement a prototype (or the real thing!) will want to either.

> > Why pull them out? Leave them in the file and use them there. No need to
> > replicate the stuff to somewhere else (and then try to deal with the
> > resulting synchronization issues).
> 
> Because we always pull them out to a distinguishable file name, there
> are no more synchronization issues than with Python .pyc files.

When I say "why pull them out?" ... "Because we always pull them out" is
not a helpful answer :-)

*why* ?

In any case, I've covered this ground above. I maintain that we leave them
as inline data. If the implementation wants to do something under the
covers, then fine... it has no effect whatsoever on the user.

>...
> > > The runtime should not allow an assignment or function call to violate
> > > the declarations in the PyDL file. In an "optimized speed mode" those
> > > checks would be disabled. In non-optimized mode, these assignments
> > > would generate an IncompatibleAssignmentError.
> > 
> > This is a difficult requirement for the runtime. I would suggest moving
> > this to a V2 requirement.
> 
> I don't see that our version numbers and Python's version numbers have
> to coincide. If it takes two years for Python to live up to all of the
> rules of our spec, so be it. 

But we also don't want to bite off more than we can chew. If we put in
something about "the runtime disallows assignments of the wrong type" and
create an *expectation* that this will be followed, then we will quickly
find ourselves in a trap.

This won't be implemented any time soon, so why discuss it or even pretend
that we'll be getting around to it in the near future? Punt it to V2. Set
people's expectations properly.

> > > The runtime should not allow a read from an unassigned attribute. It
> > > should raise NotAssignedError if it detects this at runtime instead of
> > > at compile time.
> > 
> > Huh? We already have a definition for this. It raises NameError or
> > AttributeError. Please don't redefine this behavior.
> 
> It is a different error. NameError's and AttributeError's should be
> eliminated through static type checking (where it is used religiously).
> NotAssignedError is not always statically detectable. 

It sounds like you're changing the error for change's sake. Why not leave
it as NameError or AttributeError? Why introduce another name for an
existing semantic?

  try:
    x = foo.bar
  except AttributeError:
    ...

Are you saying that I have to go and change all these to use
NotAssignedError? Why? If it is a direct replacement, then there is no new
semantic and therefore no need to change the name.

> Implementing this is also easy since objects have access to their
> interface objects.

I don't understand this. Implementation for "not assigned" is already
done. It exists today in Python without the need for interface objects.
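
For reference, today's behavior (plain Python, no interface objects
involved):

  class Foo:
    def __init__(self):
      pass                    # note: never assigns self.bar

  f = Foo()
  try:
    x = f.bar                 # attribute assigned nowhere: AttributeError today
  except AttributeError:
    pass

  try:
    print undefined_name      # name bound nowhere: NameError today
  except NameError:
    pass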

Hrm. All right... let's clarify your language some:

  "If a name or attribute is not defined, then the checker may flag read
   accesses to these non-existent entities. If the name/attribute *is*
   defined, but a runtime access finds that it has not yet been assigned,
   then NotAssignedError is raised instead."

Now your section makes sense.

But I still don't buy it. Distinguishing between Name/AttributeError and
NotAssignedError is probably of little value (IMO).

At least: please mark the thing as an open issue.

> > >...
> > >     Idea: The Undefined Object:
> > >     ===========================
> > 
> > You haven't addressed any of my concerns with this object. Even though
> > you've listed it under the "future" section, I think you're still going to
> > have some serious [implementation] problems with this concept.
> 
> It is about as difficult to handle as pervasive assignment checks and
> both will probably be part of a written-from-scratch Python.

It's worse than the assignments, I think. But hey: as long as that thing
stays down in the "futures" section, then I'm not worried. And when the
time comes, I won't be the volunteer to code it up :-)

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/