[Types-sig] RFC Comments

skaller skaller@maxtal.com.au
Wed, 29 Dec 1999 04:55:36 +1100


Paul Prescod wrote:
 
> >         Yeah, but you would do well to get out of the habit
> > of saying 'can' and 'may'. Use the word 'shall'. Meaning,
> > that the damn thing is REQUIRED to do something :-)
> > Don't give permission. Specify requirements.
> 
> I expect to rewrite the specification from scratch (with grammar) before
> I am done. Consider this version a prototype. Once we have the design
> down I will generate the normative spec.

	I do consider your RFCs prototypes -- but they're already
quite good specifications, so it is already time to try to tighten them
up, IMHO. Doing this will also help uncover ambiguities and problems
that 'loose' wording will cover up.
 
> >         Point 0: Paul, list the predefined names like Integer,
> >         or whatever. Say if they are keywords or plain identifiers.
> 
> I've been putting this off because there are some tricky issues around
> file objects.

	Then leave them out. Temporarily. If your proposal
is coherent and well principled, but doesn't quite cover all the
territory, it should be possible to extend it. If you try to
make it cover too much, it may be harder to get something
concrete enough to extend.
 
> > > 5. Declare un-modifiability:
> > >
> > > const [const Array( Integer )]
> > >
> > > (the semantics of un-modifiability need to be worked out)
> >
> >         Again, forget it, for the moment.
> 
> Isn't that what I did? :)

	No, you mentioned it in point 5. :-)
 
> I don't understand your model of namespaces and inclusions. I don't
> understand mine either so don't feel bad.

	I agree. I'll try again; perhaps an example:

	# file m.py
	import n
	include p

Here, in the module m, we import n. This has to actually
import the module n at run time. In pass 1, we read the
interface file n.pyi. In pass 2, we generate code to
actually load module n. Agree?

But, for p, we ONLY read in the interface file p.pyi.
We do not generate code to import p.

Why would we do this? The answer is, we may gain access
to classes and functions of the module p, even though
we have not imported it. For example, consider either of
these functions:

	def f(x):
		import p # local import
		return p.someclass()

	def f(x): # alternatively, get at module p via module n
		return n.p.someclass()

We cannot state the interface of f, in particular
the return type, without the name of the interface
of the class 'someclass' which is defined in 
the interface p. But p isn't imported into module m.

So: we have to be able to load an interface, without
that necessarily implying the module be imported.

On the other hand, in an _interface_ file, we cannot
import anything: importing implies run time code
generation, to bind a name to a module object.

So the correct way to load an interface, but not
import anything, requires a separate keyword like 'include'.
The semantics are distinct:

	import implies include
	the converse is not the case

> We can have an API like:
> 
> load_interface("foo")

	Yes, that would be possible but ugly. :-)
 
> I don't think that the needs of a very specific tool like a static type
> checker should drive syntax to that extent. The other 99% of code will
> never do an "include" and the keyword will be wasted.

	But you cannot write a load_interface call in an implementation
file, because it would be interpreted as a function call
to be done at run time, whereas loading the interface
must be done at compile time.
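
	To illustrate the distinction, here is a small sketch using the ast
module of present-day Python (which did not exist when this was written).
load_interface is the hypothetical API name quoted above and is only parsed
here, never called. To the compiler, the call is an ordinary expression
evaluated at run time, whereas import is a distinct statement form that it
can recognise while compiling.

```python
import ast

# load_interface is the hypothetical API name from the quoted text;
# the source is only parsed, never executed.
call_tree = ast.parse('load_interface("foo")')
import_tree = ast.parse("import n")

call_stmt = call_tree.body[0]
import_stmt = import_tree.body[0]

# The call is an expression statement wrapping a Call node ...
print(type(call_stmt).__name__, type(call_stmt.value).__name__)  # Expr Call
# ... whereas import has its own statement node.
print(type(import_stmt).__name__)  # Import
```
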
 
> >         I am already using ! not : here, following Greg Stein.
> 
> I'm going to presume that that isn't a backwards-compatibility argument.
> :)

	Sure it is. It is only a minor one though.
The reason I chose "!" for argument declarations was that it
was already being used in a similar way for the _expression_:

	x ! t

as in:

	y = x ! t

and in this context, ":" cannot be used.
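
	One plausible reading of why ":" is unavailable inside expressions
(an assumption, since the reason is not spelled out here) is that the colon
already carries meaning in several expression forms. The sketch below shows
three of them in ordinary Python; the variable names are arbitrary.

```python
a = [10, 20, 30, 40]
x, t = 1, 3

sliced = a[x:t]        # ':' already separates slice bounds
mapping = {x: t}       # ':' already separates a key from its value
chooser = lambda x: t  # ':' already ends a lambda's parameter list

print(sliced)      # [20, 30]
print(mapping)     # {1: 3}
print(chooser(0))  # 3
```
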
 
> > There are enough ":"'s in python already :-)
> 
> Debatable. I would also be amenable to "as", "is" or "isa". "!" means
> not to me.

	OK. You should proceed with _some_ fixed syntax. Perhaps
it makes sense to seek feedback from users on c.l.p?
I'll implement whatever you decide [provided it fits with the grammar
of course :-]
 
> > > What we are really defining is the constructor. The signature of the
> > > created object can be described in an interface declaration.
> >
> >         Not good enough. The semantics of class instance
> > attributes would be 'when you assign to this attribute,
> > it had better have this type'. This doesn't mean that
> > you can be sure an access gives that type,
> > the attribute might not exist. This defeats optimisation.
> 
> The attribute will either have the type or something like "undefined".
> Since undefined is not a "useful" value, you can optimize away.

	I understand that this is your intent, but I am questioning it.
My argument is something like this: a requirement that an attribute
have type X IF it exists, is weaker than one that doesn't require
anything at all, since the typing requirement is contingent
on the existence requirement. What I mean is that, the purpose
of the typing requirement can be stated as 'you can be sure
when you access this name that the object it is bound to has
the specified type', but that purpose is not met, if the name
isn't bound to an object. You cannot safely optimise an
access, because you don't know if the name is bound.

	Uggg. I'm not explaining this very well.
What I'm saying is that type safe access isn't type safe
at all unless the access is also safe, irrespective
of whether it is typesafe: it has to be safe, before
being typesafe is any use.
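
	A small sketch of the problem in plain Python. The class Point and
its setter are hypothetical; the setter stands in for a rule of the form
'when you assign to this attribute, it had better have this type'. The rule
is never violated, yet the access still fails because the name was never
bound.

```python
class Point:
    # Assumed convention: 'x' is declared to hold an int, but nothing
    # forces it to be bound before it is read.
    def set_x(self, value):
        if not isinstance(value, int):
            raise TypeError("x must be an int")
        self.x = value

p = Point()
try:
    p.x + 1               # no assignment ever broke the type rule ...
except AttributeError:
    outcome = "unbound"   # ... yet the access itself is not safe
else:
    outcome = "bound"
print(outcome)  # unbound
```
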
 
> > Your spec would break this code. You can argue that your
> > spec is a better spec -- but it isn't Python compatible.
> 
> Agreed. I will clarify that the behavior of "dropped off" functions is
> just a suggestion of how Python 2 might be improved using the features
> of the new object.

	The new Undefined object is an implementation detail in this respect:
it is not needed, at all, in order to specify that Python
functions are required to explicitly return a value and may
not drop off the end, or, weaker, that IF a function
drops off the end, the return value may not be used.

	[Yes, I know you added some extra semantics allowing the
dropped-off-the-end returns to be tested -- more debatable, I think]
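
	For reference, this is what 'dropping off the end' looks like in
existing Python; the function sign is just an illustration. One branch has
no return statement, and the caller silently receives None.

```python
def sign(n):
    if n > 0:
        return 1
    if n < 0:
        return -1
    # n == 0 falls off the end: there is no explicit return here

result = sign(0)
print(result is None)  # True
```
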

> >         FYI: In Viper, uninitialised, statically
> > declared variables are initialised with the special object PyInitial.

> It sounds like None re-invented. 

	It is, except that clients may refer to None explicitly,
but NOT to PyInitial:

	x = None # valid Python
	x = PyInitial # NameError, no such thing

> My only reason for wanting a new object
> (not None) is because None is way too flexible. You could pass a None
> through ten thousand lines of code accidentally. So I wouldn't want
> Undefined to be useful to "max" or anything else other than "is", "str"
> and "repr".

	Perhaps you misunderstood: PyInitial is used in the IMPLEMENTATION
of 'max', which is written in ocaml. It is not available to the client
python programmer. 
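
	A rough Python analogue of the idea, not the Viper implementation
(which is in ocaml). The names _INITIAL and my_max are hypothetical. A
private sentinel marks 'nothing seen yet' inside the function and never
appears in its results. Unlike PyInitial, a Python module-level name is only
private by convention.

```python
_INITIAL = object()  # hypothetical stand-in for PyInitial; private by convention

def my_max(values):
    """Return the largest item; raise ValueError on an empty iterable."""
    best = _INITIAL
    for value in values:
        # The sentinel is only ever compared by identity, never by value.
        if best is _INITIAL or value > best:
            best = value
    if best is _INITIAL:
        raise ValueError("my_max() of an empty sequence")
    return best

print(my_max([3, 7, 5]))  # 7
```
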

> I will consider this. An alternate technique is to list allowed recovery
> strategies:
> 
> "It is an error if this leaves more than one match. An XSLT processor
> may signal the error; if it does not signal the error, it must recover
> by choosing, from amongst the matches that are left, the one that occurs
> last in the stylesheet."

	Style sheets have different requirements: there is some kind
of need for robustness. Compilers, by contrast, should be fragile.
[If it is at all possible to break the user's program, do it!]
 
	It is, of course, possible to specify _anything_.
But it is not a good idea, IMHO. For example, Greg Stein might
argue that two options be allowed: a compile time diagnostic
OR a run time diagnostic. This is dangerous: it limits the
kind of processors to what Greg thinks is important today.

	The general rule of standards bodies is that
if there is no consensus, leave it out -- don't define
anything. This gives implementors maximum freedom,
and restricts the programmer most. It also gives
the standardisers the option of adding more constraints
on implementors _later_: it is much harder to undo a rule,
than to add a new one.

	Note that NO ONE likes 'undefined behaviour'.
On the other hand, most of us prefer 'deterministic behaviour',
that is, exactly one option is given to the implementor,
and the programmer can rely on it. But the next best thing
is 'don't do it -- it is not defined'. Two or more choices
is a very weak compromise (usually), because the programmer
cannot rely on a particular behaviour, and will usually
have to avoid it for this reason: meaning the implementor
is constrained needlessly, providing a feature the programmer
cannot use.

-- 
John Skaller, mailto:skaller@maxtal.com.au
10/1 Toxteth Rd Glebe NSW 2037 Australia
homepage: http://www.maxtal.com.au/~skaller
voice: 61-2-9660-0850