[Types-sig] RFC 0.1

Paul Prescod paul@prescod.net
Mon, 13 Dec 1999 10:49:15 -0500


I don't think we would get anywhere if I just opened up the floor and
had everyone yell their opinions about type safety. Here is a very rough
starting point. Let's talk freely about it for a few days and then I'll
try to direct the conversation based upon addressing the feedback.

Version 0.1 Draft of a Pythonic Type Checking System
====================================================

    Guiding Principles in the System's Development 
    ----------------------------------------------

#1. The system exists to serve the dual goals of catching errors
earlier in the development process and improving the performance of
Python compilers and the Python runtime. Neither goal should be
pursued exclusively.

#2. The system must allow authors to make assertions about the sorts 
of values that may be bound to names. These are called binding
assertions. They should behave as if an assertion statement was
inserted after every assignment to that name program-wide.

Note: this does in fact put more power in the hands of module
developers. For the first time we will be able to say that
sys.exit may not be overridden in user code and that sys.maxint cannot
be changed to contain a string.

Note: the term "sorts of values" is meant to be ambiguous: the
definition of "type" in Python may undergo change in the future.

#3. Binding assertions must always be optional.

#4. There must be declarations that instruct static type checking
software to verify that a function cannot violate binding assertions.
These are called safety declarations.

#5. The introduction of binding assertions to a module should not
change the perceived interface of functions and classes in the module.
In other words, code that uses functions and classes from the module
should not need to know whether it uses binding assertions or old
fashioned assert statements. 

#6. In the absence of local safety declarations, a static type checker
should not by default report errors in otherwise legal Python code. In
other words, a coder must ask (through function or module level
declarations, command line switches or environment variables) for his
or her code to be checked. In particular, a module cannot force client
modules to be statically type checked (see #5, above).

#7. The attachment of safety declarations to a function should not
change the perceived interface of the function. In other words, code
that uses the functions should not need to know that the function
happens to be statically checkable.

#8. It is not a goal that a statically checkable function should only
be able to call other statically checkable functions. Those other
functions should be presumed to return a "PyObject" object.

#9. There should be a mechanism to assert that an object has a
particular type for purposes of informing the static and dynamic type
checkers about something that the programmer knows about the flow of
the program.

#10. In general, the mechanism should try to be "pythonic" which
includes but is not limited to:

    * maximize simplicity
    * maximize power
    * minimize syntax
    * be explicit
    * be readable
    * interoperate nicely with other features

    Temporary Goals and Non-Goals:
    ------------------------------

#1. The first version of the system will be as neutral as possible on
the issue of what defines a "type". Fulton's capability-based
interfaces should be legal as types but so should type objects and
classes.

Note: a purely interface based system cannot be feasible for testing
until interfaces are embedded deeply into the existing Python library.
It might be more philisophically pure to test for an abstract
CharacterString interface but if the Python expression "abc" does not
return an object that conforms to the interface then there is not much
we can do. Some future version of the system may be restricted to only
allow declared interfaces as types. Or it may be expanded to allow
parameterized types.

#2. The first version of the system will not allow the use of types
that cannot be referred to as simple Python objects. In particular it
will not allow users to refer to things like "List of Integers" and
"Functions taking Integers as arguments and returning strings."

#3. The first version of the system will not define the operation of a
type inferencing system. For now, all type declarations would need to
be explicit.

#4. The first version of the system will be syntactically compatible
with Python 1.5.x in order to allow experimentation in the lead-up to
an integrated system in Python 2.

    Definitions:
    ------------
Namespace creating suite:
    The suite contained directly within a module, class or function
definition.  

Statically available namespace creating suite:
    The namespace creating suite defined by a module or class
definition.  We do not consider the suite contained with a function as
Statically available because the namespace only becomes available when
the function is executed, not when it is declared.

Name binding statement, target:
    An assignment statement (target), "def" statement ("funcname"),
"class" ("classname") statement or "import" statement (module).  ***
more thought about "from" version ***

Name declaration:
    A name bound at the most out-dented context of a statically
available namespace creating suite.

Classification:
    Due to a shortage of synonyms for "type" that do not already have a
meaning, we use the word "classification." 

    Given a value v and a value t, v conforms to classification t if 
        t is returned by type( v )
        t is returned by v.__class__
        t is in v.__implements__ (the fulton convention)
        t is the "object" classification
        v is the value "None"

Classification Declaration:
    A statement that precedes a name binding statement and declares
the classifications that the name must conform to. The type
declaration must textually precede any use of the name.

Classification Constraints:
    A pair of statements declaring the classifications that values
bound to a name must support. There are a few syntactic variations:

    1. A name binding statement preceded by a statement referencing a
classification. 

<example>
types.StringType
a

class foo:
    types.IntType
    j=5
</example>

This assertion is maintained by a combination of the static and
dynamic type checkers. In order for the dynamic checker to work, we
will need to modify the module_setattr and class_setattr functions for
Python 1.6.

    2. A simple expression containing only a tuple where all but the
last item reference a classification. The last item should be a
locally declared name. The statement must occur in the most out-dented
context of a namespace creating statement suite:

def foo(bar, baz):
    types.IntType, bar
    interfaces.NumericType, interfaces.SignedType, baz

    3. The classification of a function is always "function" but its
return classification can be specified with a declaration:

<example>
types.StringType
def foo(): return "abc"
</example>

This can be checked through the introduction of "virtual" assertion
statements into byte-code:

<example>
types.StringType
def foo(): 
    __tmp = "abc"
    assert has_type( __tmp, types.StringType )
    return "abc"
</example>

    4. The classification of class instance variables comes from the
classification of the corresponding class variable. 

<example>
class foo:
    types.IntType
    a=5

    types.ListType
    b=None
</example>

Classification-testing expression:

The function has_type takes a value and a reference to a
classification or list of classifications. The return type of the
function is the union of the classifications.

Classification-safe Function:

    a function that can be checked at compile time not to violate any
classification constraints by assigning invalid values to any
constrained names:

Every reference to a name in a module or class (not instance!) must be
to a declared (but perhaps not classification constrained) name.

<note>
Remember that variables without classification constraints can be
presumed to conform to the "Object" type.
</note>

Every expression must be type-checked based on the operators,
constants and global and local name references.

Attribute assignments and references are checked based upon the
asserted classifications of the owning object.

The classification of every assignment must be checked based on the
types of constants, variables and function return types in the
right-hand side.

The classification of every function parameter must be checked based
on the classifications of the argument expression.

All return statements must be checked based on the classifications of
the expressions.

-- 
 Paul Prescod  - ISOGEN Consulting Engineer speaking for himself
"A writer is also a citizen, a political animal, whether he likes it or 
not. But I do not accept that a writer has a greater obligation 
to society than a musician or a mason or a teacher. Everyone has
a citizen's commitment."  - Wole Soyinka, Africa's first Nobel Laureate