From guido@CNRI.Reston.VA.US Thu Dec 2 21:17:19 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Thu, 02 Dec 1999 16:17:19 -0500 Subject: [Types-sig] The Types-SIG is comatose. Let's retire it. Message-ID: <199912022117.QAA15195@eric.cnri.reston.va.us> It's time for the twice yearly ritual of looking for comatose SIGs. From the archives, it looks like the types-sig is the only dud amongst the crowd: all other SIGs are doing well (some are doing *extremely* well, like the doc-sig and the matrix-sig). The types-sig hasn't had traffic since August (4 messages) and in all of 1999 it has had only 12 messages. Type-sig, what do you have to say for yourself? --Guido van Rossum (home page: http://www.python.org/~guido/) From mengx@nielsenmedia.com Thu Dec 2 21:52:33 1999 From: mengx@nielsenmedia.com (mengx@nielsenmedia.com) Date: Thu, 2 Dec 1999 16:52:33 -0500 (EST) Subject: [Types-sig] The Types-SIG is comatose. Let's retire it. Message-ID: <199912022152.QAA29677@p5mts.nielsenmedia.com> Perhaps this proved that trying to (optionally) add TYPEs to the python language itself is unpopular. At the start of this list, I suggested embedding type hints inside doc strings, or some other non-breaking method which only requires python engine implementation changes instead of adding new keywords or symbols to the code. Or, instead of diving into uncertain language research, accept and enhance CXX to ease extension writing, which may solve many issues related to the need for TYPED python. Thanks -Ted Meng > From POP3 Thu Dec 2 16:18:02 1999 > Delivered-To: types-sig@dinsdale.python.org > To: types-sig@python.org > Cc: meta-sig@python.org > Date: Thu, 02 Dec 1999 16:17:19 -0500 > From: Guido van Rossum > Subject: [Types-sig] The Types-SIG is comatose. Let's retire it. > X-BeenThere: types-sig@python.org > X-Mailman-Version: 1.2 (experimental) > List-Id: Special Interest Group on the Python type system > > It's time for the twice yearly ritual of looking for comatose SIGs. > From fdrake@acm.org Thu Dec 2 22:18:07 1999 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Thu, 2 Dec 1999 17:18:07 -0500 (EST) Subject: [meta-sig] Re: [Types-sig] The Types-SIG is comatose. Let's retire it. In-Reply-To: <199912022152.QAA29677@p5mts.nielsenmedia.com> References: <199912022152.QAA29677@p5mts.nielsenmedia.com> Message-ID: <14406.61471.268274.986137@weyr.cnri.reston.va.us> mengx@nielsenmedia.com writes: > Perhaps this proved that trying to (optionally) add TYPEs to the python language > itself is unpopular. At the start of this list, I suggested embedding > type hints inside doc strings, or some other non-breaking method which > only requires python engine implementation changes instead of adding new > keywords or symbols to the code. Or, instead of diving into uncertain > language research, accept and enhance CXX to ease extension writing, > which may solve many issues related to the need for TYPED python. Actually, someone suggested encoding type information in docstrings just recently in the Doc-SIG. See: http://dinsdale.python.org/pipermail/doc-sig/1999-December/001607.html http://dinsdale.python.org/pipermail/doc-sig/1999-December/001610.html http://dinsdale.python.org/pipermail/doc-sig/1999-December/001623.html http://dinsdale.python.org/pipermail/doc-sig/1999-December/001627.html Since the decision was to table it for now, I don't think it warrants keeping alive a dead SIG. -Fred -- Fred L. Drake, Jr. 
Corporation for National Research Initiatives From paul@prescod.net Thu Dec 2 22:47:25 1999 From: paul@prescod.net (Paul Prescod) Date: Thu, 02 Dec 1999 16:47:25 -0600 Subject: [Types-sig] The Types-SIG is comatose. Let's retire it. References: <199912022152.QAA29677@p5mts.nielsenmedia.com> Message-ID: <3846F6FD.4FCCCD1@prescod.net> I'm not speaking on behalf of or in favor of the types-sig. mengx@nielsenmedia.com wrote: > > Perhaps this proved that trying to (optionally) add TYPEs to the python language > itself is unpopular. I don't think so. I think that there were just too many ideas of how it should work. I think that's why revolutionary programming language features cannot be designed by committee. > Or, instead of diving into uncertain > language research, accept and enhance CXX to ease extension writing, > which may solve many issues related to the need for TYPED python I don't see how CXX can help. Python programmers choose not to program in C++ for a reason. Here's an approach that we didn't try because it is likely to be wildly unpopular: There exists a popular programming language that uses optional type checking and is nearly as dynamic as Python: Visual Basic. The overall type system is weak, (e.g. no concept of common interface) but the optional type checking part seems to work pretty well. We wouldn't have to do "uncertain language research" to rip its behavior (and even some of its syntax) off. It strikes me as a pretty common sense approach. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself "I always wanted to be somebody, but I should have been more specific." --Lily Tomlin From guido@CNRI.Reston.VA.US Thu Dec 2 22:51:03 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Thu, 02 Dec 1999 17:51:03 -0500 Subject: [Types-sig] The Types-SIG is comatose. Let's retire it. In-Reply-To: Your message of "Thu, 02 Dec 1999 16:47:25 CST." <3846F6FD.4FCCCD1@prescod.net> References: <199912022152.QAA29677@p5mts.nielsenmedia.com> <3846F6FD.4FCCCD1@prescod.net> Message-ID: <199912022251.RAA15968@eric.cnri.reston.va.us> > Here's an approach that we didn't try because it is likely to be wildly > unpopular: Why would it be unpopular? > There exists a popular programming language that uses optional type > checking and is nearly as dynamic as Python: Visual Basic. The overall > type system is weak, (e.g. no concept of common interface) but the > optional type checking part seems to work pretty well. We wouldn't have > to do "uncertain language research" to rip its behavior (and even some > of its syntax) off. It strikes me as a pretty common sense approach. I don't know the details, never having studied VB manuals, although I once saw the source of a file that described the linkage to a C module (pretty ugly but effective and no need for wrappers). Do you have the time to describe this in somewhat more detail for us lucky folks who haven't had the pleasure to learn VB? --Guido van Rossum (home page: http://www.python.org/~guido/) From gstein@lyra.org Fri Dec 3 00:05:14 1999 From: gstein@lyra.org (Greg Stein) Date: Thu, 2 Dec 1999 16:05:14 -0800 (PST) Subject: [Types-sig] Re: The Types-SIG is comatose. Let's retire it. In-Reply-To: <199912022251.RAA15968@eric.cnri.reston.va.us> Message-ID: Meta issue: I'm not sure that I agree the types-sig should stay alive simply because some traffic is inserted when a threat-of-execution has arisen. The impetus for the SIG is (IMO) obviously gone, despite some people's unstated desires to see work done along this path. 
I'd recommend closing the SIG and letting this discussion move elsewhere. Cheers, -g On Thu, 2 Dec 1999, Guido van Rossum wrote: > > Here's an approach that we didn't try because it is likely to be wildly > > unpopular: > > Why would it be unpopular? > > > There exists a popular programming language that uses optional type > > checking and is nearly as dynamic as Python: Visual Basic. The overall > > type system is weak, (e.g. no concept of common interface) but the > > optional type checking part seems to work pretty well. We wouldn't have > > to do "uncertain language research" to rip its behavior (and even some > > of its syntax) off. It strikes me as a pretty common sense approach. > > I don't know the details, never having studied VB manuals, although I > once saw the source of a file that described the linkage to a C > module (pretty ugly but effective and no need for wrappers). > > Do you have the time to describe this in somewhat more detail for us > lucky folks who haven't had the pleasure to learn VB? > > --Guido van Rossum (home page: http://www.python.org/~guido/) > > _______________________________________________ > Types-SIG mailing list > Types-SIG@python.org > http://www.python.org/mailman/listinfo/types-sig > -- Greg Stein, http://www.lyra.org/ From jeremy@cnri.reston.va.us Fri Dec 3 00:08:54 1999 From: jeremy@cnri.reston.va.us (Jeremy Hylton) Date: Thu, 2 Dec 1999 19:08:54 -0500 (EST) Subject: [Types-sig] Re: [meta-sig] Re: The Types-SIG is comatose. Let's retire it. In-Reply-To: References: <199912022251.RAA15968@eric.cnri.reston.va.us> Message-ID: <14407.2582.897430.477756@goon.cnri.reston.va.us> >>>>> "GS" == Greg Stein writes: GS> Meta issue: I'm not sure that I agree the types-sig should stay GS> alive simply because some traffic is inserted when a GS> threat-of-execution has arisen. The impetus for the SIG is (IMO) GS> obviously gone, despite some people's unstated desires to see GS> work done along this path. I don't remember anymore what the impetus was. The problem I see is that a lot of work is going to be required to make much progress on extending the type system. In the absence of anyone willing and able to do the work (whatever it is), there's not much point to a SIG. GS> I'd recommend closing the SIG and letting this discussion move GS> elsewhere. Yes. Jeremy From paul@prescod.net Fri Dec 3 02:53:24 1999 From: paul@prescod.net (Paul Prescod) Date: Thu, 02 Dec 1999 20:53:24 -0600 Subject: [Types-sig] VB Types References: <199912022152.QAA29677@p5mts.nielsenmedia.com> <3846F6FD.4FCCCD1@prescod.net> <199912022251.RAA15968@eric.cnri.reston.va.us> Message-ID: <384730A4.7BFC4933@prescod.net> Guido van Rossum wrote: > > > Here's an approach that we didn't try because it is likely to be wildly > > unpopular: > > Why would it be unpopular? Stealing ideas from Visual Basic? Shudder! > Do you have the time to describe this in somewhat more detail for us > lucky folks who haven't had the pleasure to learn VB? The declarations are totally optional. If you don't declare something then it is a "Variant" which is a grab-bag like void * or PyObject. So the semantics of an untyped program are similar to Pythons: Private Function Foo() b = "foo" MsgBox (b) b = 5 MsgBox (b) End Function b is a Variant. So is the return value of the function. I could change that: Public Function Foo() As Slide Set Foo = ActivePresentation.Slides(0) End Function Ignore the word "set". It's a hack and I think that even in VB there isn't a good reason it is necessary. 
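(Purely for comparison, and not part of any proposal: the untyped Variant version above behaves much like ordinary Python as it already is, since every Python name is effectively a Variant. The names here just mirror the VB example.)

def foo():
    b = "foo"     # b holds a string
    print b
    b = 5         # now it holds an integer; nothing is declared or checked
    print b
    return b      # the return value is just as dynamically typed
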
Their word for "declare" is "dim":

Dim i as Integer

As soon as you Dim something, the IDE tries to help you with its method signatures. That's useful enough to encourage type declarations for things of known type...which in turn can help you catch type errors more quickly. I've prototyped some COM apps in VB and then ported them to Python because the method signatures stuff is important when the COM object isn't well documented (usually!). For some reason, it isn't compile time type safe. This would cause a runtime error:

Dim b As Integer
MsgBox (b)
b = "foo"
b = 5

I don't think I've ever got a compile time type error message. Perhaps they don't want to give a false sense of "type safety" because it is still very possible to make type errors (because of the variants). Still, in a Python implementation I would expect IDEs to have a "check all types" menu item and the Python interpreter would also need a check all types command line option. The default value of integers is "0". Parameters can be typed or implicitly variant:

Public Function Foo(a As Integer, b, c as String) As Collection
    Set Foo = New Collection
End Function

Classes are types so you can create new types easily. There is no concept of predefined interfaces (other than interfaces in a typelib) but that could be added easily. There is no union type: you would have to use variants. As you point out, these same definitions can be used to interface to statically typed languages without good introspection and libraries but that also depends on built-in language features. I can't think of anything else that is relevant. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself "I always wanted to be somebody, but I should have been more specific." --Lily Tomlin From da@ski.org Fri Dec 3 04:53:00 1999 From: da@ski.org (David Ascher) Date: Thu, 2 Dec 1999 20:53:00 -0800 (Pacific Standard Time) Subject: [Types-sig] VB Types In-Reply-To: <384730A4.7BFC4933@prescod.net> Message-ID: On Thu, 2 Dec 1999, Paul Prescod wrote: > Guido van Rossum wrote: > > > > > Here's an approach that we didn't try because it is likely to be wildly > > > unpopular: > > > > Why would it be unpopular? > > Stealing ideas from Visual Basic? Shudder! Nobody's got to know outside of these four walls. =) > The declarations are totally optional. If you don't declare something > then it is a "Variant" which is a grab-bag like void * or PyObject. So > the semantics of an untyped program are similar to Pythons: [...] Sounds quite a bit like JimH's proposal @ IPC7 (or was it 6?). He got booed, IIRC, but that was just an emotional reaction, methinks. =) How does VB handle specifying types which are not one of the atomic types? (e.g. list of (tuple or dictionaries) of length 5 or fewer?) --david From tim_one@email.msn.com Fri Dec 3 05:58:48 1999 From: tim_one@email.msn.com (Tim Peters) Date: Fri, 3 Dec 1999 00:58:48 -0500 Subject: [Types-sig] RE: [meta-sig] The Types-SIG is comatose. Let's retire it. In-Reply-To: <199912022117.QAA15195@eric.cnri.reston.va.us> Message-ID: <000801bf3d53$77a44f20$3a2d153f@tim> [Guido] > It's time for the twice yearly ritual of looking for comatose SIGs. > ... > The types-sig hasn't had traffic since August (4 messages) and in all > of 1999 it has only has 12 messages. > > Type-sig, what do you have to say for yourself? The Types-SIG was very active at its inception; indeed, I still have 142 old Types-SIG msgs in my inbox I haven't yet read! 
Note that the traffic dropped to essentially nothing several weeks after you (Guido) posted your own last msg to it. I don't think that's coincidence. You were an active initial participant, and when you dropped out most of us likely figured you had some other ideas in mind for Python2 and there was little point to proceeding without you. So, like everything else that goes wrong in the Python world, it was entirely Gordon McMillan's fault. I'd kill the SIG due to lack of activity. I'm sure interest in the topics remains high among many, though ("Types SIG"-related debates have continued non-stop on c.l.py). taking-no-more-from-this-than-that-a-successful-sig-needs-a- focused-charter-ly y'rs - tim From paul@prescod.net Fri Dec 3 12:52:22 1999 From: paul@prescod.net (Paul Prescod) Date: Fri, 03 Dec 1999 06:52:22 -0600 Subject: [Types-sig] VB Types References: Message-ID: <3847BD06.C2FA6743@prescod.net> David Ascher wrote: > > How does VB handle specifying types which are not one of the atomic types? > (e.g. list of (tuple or dictionaries) of length 5 or fewer?) Good question. It seems to handle fixed (and variable?) length arrays.

Dim Washington(1 To 100) As StateData

It also has a concept of a "struct" which they call a "user defined type".

Type StateData
    CityCode (1 To 100) As Integer   ' Declare a static array.
    County As String
End Type

Of course for Python, we would use square brackets for array bounds and classes for structs. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself "I always wanted to be somebody, but I should have been more specific." --Lily Tomlin From jim@digicool.com Fri Dec 3 14:27:38 1999 From: jim@digicool.com (Jim Fulton) Date: Fri, 03 Dec 1999 09:27:38 -0500 Subject: [Types-sig] The Types-SIG is comatose. Let's retire it. References: <199912022117.QAA15195@eric.cnri.reston.va.us> Message-ID: <3847D35A.59EB5770@digicool.com> Guido van Rossum wrote: > > It's time for the twice yearly ritual of looking for comatose SIGs. > From the archives, it looks like the types-sig is the only dud amongst > the crowd: all other SIGs are doing well (some are doing *extremely* > well, like the doc-sig and the matrix-sig). > > The types-sig hasn't had traffic since August (4 messages) and in all > of 1999 it has only has 12 messages. > > Type-sig, what do you have to say for yourself? As others have pointed out, there is clear evidence that the SIG is inactive and should be deactivated. I was a bit frustrated that the SIG tried to address three topics that I consider independent:

- Interfaces
- Classes vs types
- Static typing

This hurt the focus of the sig and emotion from some topics tended to bleed over to others. For example, I think the interfaces work was hurt by association with the typing work. I'll find some time over the next few days to try to summarize and report on work in the sig on the first two topics. Perhaps someone else will do the same for static typing. Even if the SIG goes away, I think some report on the SIG's activity should be made at IPC8 (assuming there is a SIG status discussion). Jim -- Jim Fulton mailto:jim@digicool.com Python Powered! Technical Director (888) 344-4332 http://www.python.org Digital Creations http://www.digicool.com http://www.zope.org Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list with out my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats. 
From paul@prescod.net Fri Dec 3 14:32:13 1999 From: paul@prescod.net (Paul Prescod) Date: Fri, 03 Dec 1999 08:32:13 -0600 Subject: [Types-sig] RE: [meta-sig] The Types-SIG is comatose. Let's retire it. References: <000801bf3d53$77a44f20$3a2d153f@tim> Message-ID: <3847D46D.17C79972@prescod.net> David Ascher wrote: > Sounds quite a bit like JimH's proposal @ IPC7 (or was it 6?). He > got booed, IIRC, but that was just an emotional reaction, methinks. =) There is no non-trivial Python extension that will not get booed. I like the Visual Basic approach because it is simple, seems intuitive to me, does not depend on any new ideas at all and thus does not require a lot of debate. To me, Python's brilliance is in eschewing innovation. People come to it and say: "this is the language I have been looking for". Other than whitespace there is no "weird stuff." It just takes the best ideas from every other language and simplifies the hell out of them. While I'm ranting, the other problem new people have is the whole reference/copy issue. Is there any language that has more understandable (perhaps more explicit) semantics for that stuff that we could steal for Py2? P.S. I brainwashed another one today. Literal quote: "This is the language I've been looking for." -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself "I always wanted to be somebody, but I should have been more specific." --Lily Tomlin From paul@prescod.net Fri Dec 3 14:32:32 1999 From: paul@prescod.net (Paul Prescod) Date: Fri, 03 Dec 1999 08:32:32 -0600 Subject: [Types-sig] RE: [meta-sig] The Types-SIG is comatose. Let's retire it. References: <000801bf3d53$77a44f20$3a2d153f@tim> Message-ID: <3847D480.12A31666@prescod.net> > taking-no-more-from-this-than-that-a-successful-sig-needs-a- > focused-charter-ly y'rs - tim I propose that the types sig be re-commissioned with a much tighter commission. Let's focus on ONE of the three problems listed in our old charter: http://www.python.org/sigs/types-sig/ And let's start with a clear direction from the Powers that Be. I propose:

* the goal is an optional static type system for version 2.
* presume that the type/class dichotomy has been removed in V2
* backwards compatibility with current code is relatively important
* compatibility with the Python 1.x interpreter is NOT important
* interfaces are not an issue
* parameterized (template) types are not available
* names are type checked, not expressions
* for now, only named types (types and classes) can be declared, not lists and tuples of types

(many of these restrictions are easy to work around in Python: for instance, making a list-of-strings subclass of UserList) Start from these (very similar!) proposals:

http://www.python.org/~rmasse/papers/python-types/
The current Visual Basic type system
Something somewhere from JimH
The type declaration part of Strongtalk
The first half of this: http://www.foretec.com/python/workshops/1998-11/greg-type-ideas.html

We should appoint an "editor" as they do in standards bodies. If there are issues that just cannot be worked out by consensus, Guido rules. Ideally, it should work much like the docstring discussion going on in the doc-sig. If we had a particularly ambitious editor (unlikely) then we could have an RFC by the Python conference. Later, we could do the same thing for the class/type dichotomy. ...then interfaces ...then parameterized types. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself "I always wanted to be somebody, but I should have been more specific." 
--Lily Tomlin From guido@CNRI.Reston.VA.US Fri Dec 3 14:47:07 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Fri, 03 Dec 1999 09:47:07 -0500 Subject: [Types-sig] RE: [meta-sig] The Types-SIG is comatose. Let's retire it. In-Reply-To: Your message of "Fri, 03 Dec 1999 08:32:32 CST." <3847D480.12A31666@prescod.net> References: <000801bf3d53$77a44f20$3a2d153f@tim> <3847D480.12A31666@prescod.net> Message-ID: <199912031447.JAA16565@eric.cnri.reston.va.us> Paul, do you want to be the head honcho for the reborn types SIG? You seem to have the right ideas, and you're the only one so far who has spoken up to keep it alive. I doubt that anyone else will volunteer, so if you don't, we will retire the SIG. I'll give you till June 2000 (the same expiration date as for other SIGs) to show that there's life in the subject. --Guido van Rossum (home page: http://www.python.org/~guido/) From bwarsaw@cnri.reston.va.us (Barry A. Warsaw) Fri Dec 3 15:29:47 1999 From: bwarsaw@cnri.reston.va.us (Barry A. Warsaw) (Barry A. Warsaw) Date: Fri, 3 Dec 1999 10:29:47 -0500 (EST) Subject: [meta-sig] Re: [Types-sig] The Types-SIG is comatose. Let's retire it. References: <199912022117.QAA15195@eric.cnri.reston.va.us> <3847D35A.59EB5770@digicool.com> Message-ID: <14407.57835.318063.51763@anthem.cnri.reston.va.us> >>>>> "JF" == Jim Fulton writes: JF> Even if the SIG goes away, I think some report on the SIGs JF> activity should be made at IPC8 (assuming there is a SIG JF> status discussion). Let's see how many other topics get championed. If there's time, I say this is a great idea. -Barry From m.faassen@vet.uu.nl Fri Dec 3 17:08:38 1999 From: m.faassen@vet.uu.nl (Martijn Faassen) Date: Fri, 03 Dec 1999 18:08:38 +0100 Subject: [Types-sig] The Types-SIG is comatose. Let's retire it. References: <199912022117.QAA15195@eric.cnri.reston.va.us> Message-ID: <3847F916.35743ADB@vet.uu.nl> Guido van Rossum wrote: > > It's time for the twice yearly ritual of looking for comatose SIGs. > >From the archives, it looks like the types-sig is the only dud amongst > the crowd: all other SIGs are doing well (some are doing *extremely* > well, like the doc-sig and the matrix-sig). > > The types-sig hasn't had traffic since August (4 messages) and in all > of 1999 it has only has 12 messages. > > Type-sig, what do you have to say for yourself? Oddly enough, there has been quite some discussion on types on comp.lang.python since then. John Skaller's viper discussions and the discussions on Ruby are an example. I agree with others that the problem of the types-SIG is a lack of focus of discussion (too many different topics all having to do somewhat with types), and nobody doing the brunt of the work. John Skaller does appear to be doing lots of work on types in Python, but he seems to prefer working alone with his source. It's not as if there's no interest for type issues in the Python community; far from that. It just seems that there's nobody who has enough time/knowledge to work on them. Having studied the Zope sources I'm becoming painfully aware for the need of something like interfaces. Zope's source would really be far more understandable if it were rewritten with interfaces, I think. I understand Jim Fulton's motivation concerning interfaces far better since my foray into those sources. I'm still interested in static types as well, mostly in the interests of compilation. It's ridiculous to split a SIG that doesn't talk, of course, but perhaps better would be to have a 'compiler-SIG' and an 'interfaces-SIG'. 
I'd expect the interface-SIG to come with results far more quickly than the compiler-SIG. Regards, Martijn From m.faassen@vet.uu.nl Fri Dec 3 17:15:26 1999 From: m.faassen@vet.uu.nl (Martijn Faassen) Date: Fri, 03 Dec 1999 18:15:26 +0100 Subject: [Types-sig] RE: [meta-sig] The Types-SIG is comatose. Let's retire it. References: <000801bf3d53$77a44f20$3a2d153f@tim> <3847D480.12A31666@prescod.net> <199912031447.JAA16565@eric.cnri.reston.va.us> Message-ID: <3847FAAE.B8D74FDD@vet.uu.nl> Guido van Rossum wrote: > > Paul, do you want to be the head honcho for the reborn types SIG? You > seem to have the right ideas, and you're the only one so far who has > spoken up to keep it alive. I doubt that anyone else will volunteer, > so if you don't, we will retire the SIG. I'll give you till June 2000 > (the same expiration date as for other SIGs) to show that there's life > in the subject. Okay, I'd like to keep the place alive as well. I'll endeavor contribute by replying to Paul's messages, or anybody else who posts here. (I was actually quite excited to suddenly discover my types-SIG mailbox had lots of new messages in it :). Perhaps I'll get more time to actually work on these issues next year. Then my only thing lacking is actual knowledge and experience, but I can work on that. :) Regards, Martijn From jeremy@cnri.reston.va.us Fri Dec 3 17:33:55 1999 From: jeremy@cnri.reston.va.us (Jeremy Hylton) Date: Fri, 3 Dec 1999 12:33:55 -0500 (EST) Subject: [Types-sig] RE: [meta-sig] The Types-SIG is comatose. Let's retire it. In-Reply-To: <3847D46D.17C79972@prescod.net> References: <000801bf3d53$77a44f20$3a2d153f@tim> <3847D46D.17C79972@prescod.net> Message-ID: <14407.65283.532788.640647@goon.cnri.reston.va.us> >>>>> "PP" == Paul Prescod writes: PP> David Ascher wrote: >> Sounds quite a bit like JimH's proposal @ IPC7 (or was it 6?). >> He got booed, IIRC, but that was just an emotional reaction, >> methinks. =) Jim's proposal was to extend Python with Java-style syntax and semantics. The Modula-3 fans cried foul. PP> There is no non-trivial Python extension that will not PP> get booed. :-) PP> While I'm ranting, the other problem new people have is the PP> whole reference/copy issue. Is there any language that has more PP> understandable (perhaps more explicit) semantics for that stuff PP> that we could steal for Py2? I think Python's rules are pretty simple already! I think newbies get confused by the general design issue, rather than Python's semantics. I read The Practice of Programming a few months ago and much appreciated the discussion of resource (e.g. memory) management. The authors said: "One of the most difficult problems in designing the interface for a library (or a class or a package) is to manage resources that are owned by the library and shared by the library and those who call it." (p. 103) Memory management issues, in particular, don't simply disappear in garbage-collected languages. The designer still has to determine when to use copies and when to use shared objects. I don't think the language can do a lot more to help with this issue except have clear semantics. Jeremy From m.faassen@vet.uu.nl Fri Dec 3 17:40:16 1999 From: m.faassen@vet.uu.nl (Martijn Faassen) Date: Fri, 03 Dec 1999 18:40:16 +0100 Subject: [Types-sig] RE: [meta-sig] The Types-SIG is comatose. Let's retire it. 
References: <000801bf3d53$77a44f20$3a2d153f@tim> <3847D480.12A31666@prescod.net> Message-ID: <38480080.3403BDDF@vet.uu.nl> Paul Prescod wrote: > > > taking-no-more-from-this-than-that-a-successful-sig-needs-a- > > focused-charter-ly y'rs - tim > > I propose that the types sig be re-commissioned with a much tighter > commission. Let's focus on ONE of the three problems listed in our old > charter: > > http://www.python.org/sigs/types-sig/ > > And let's start with a clear direction from the Powers that Be. > > I propose: > > * the goal is a optional static type system for version 2. Okay, I'll assume this goal for now. I'd like to see something happen with interfaces too, but I'll just assume/hope that an interface proposal will arise 'naturally' from any static type system we come up with. > * presume that the type/class dichotomy has been removed in V2 Gladly. So, what does this mean in practice? A particular class is another type? I don't want to accidentally start the discussion on the dichotomy itself here, I just want to know what Python 2 is like in practice. For now I'll assume that if I declare a class in Python 2, that class becomes a type. > * backwards compatibility with current code is relatively important All right, though you'll run into trouble with any current code that messes too much with types, so we can just forget about that trouble, as it'd be caused by the solving of the class/type dichotomy in any case. > * compatibility with the Python 1.x interpreter is NOT important So we don't care if we can add static typing to the Python 1.x interpreter line? > * interfaces are not an issue Presumably they'll arise naturally, as I said before. :) > * parameterized (template) types are not available Darn! I like these, if I understand what you mean. Don't we need things like 'a list of integers'? Or 'a list of objects that have class-type Foo' (objects of class Bar may be of class-type Foo too if Bar derives from Foo). If you want to actually use static types for compilation 'Swallow style' (only compile those functions/classes that are *fully* static type described) you'd need something like parameterized types. Also, if you don't have parameterized types, you'll effectively lose track (statically) of the type of any object once you put it in a list. > * names are type checked, not expressions What does this mean?

a = 4 @ IntegerType
b = a @ IntegerType      # checked if a is indeed IntegerType
b = a + a @ IntegerType  # not checked, as a + a is an expression

class Foo:
    def doFoo():
        print "Foo!"

a = Foo() @ FooType
a.doFoo()                # does this do a typecheck for a?

> * got now, only named types (types and classes) can be declared, not > lists and tuples of types That fits in with the no template types idea, right? > (many of these restrictions are easy to work around in Python: for > instance making a list of string subclass of userlist) Hm. But tuples, lists and dictionaries are very basic in Python. If the type system does not support them that would seem to be a bit incongruous (and inconvenient). > Start from these (very similar!) proposals: > > http://www.python.org/~rmasse/papers/python-types/ This does talk about interfaces (protocols) though. What part of this proposal do you mean? > The current Visual Basic type system I'll reread your posts on that. > Something somewhere from JimH > The type declaration part of strongtalk Any references? 
> The first half of this: > http://www.foretec.com/python/workshops/1998-11/greg-type-ideas.html > > We should appoint an "editor" as they do in standards bodies. If there > are issues that just cannot be worked out by consensus, Guido rules. Guido would rule in any case if Guido disagrees with consensus, right? :) Regards, Martijn From jeremy@cnri.reston.va.us Fri Dec 3 17:52:42 1999 From: jeremy@cnri.reston.va.us (Jeremy Hylton) Date: Fri, 3 Dec 1999 12:52:42 -0500 (EST) Subject: [Types-sig] RE: [meta-sig] The Types-SIG is comatose. Let's retire it. In-Reply-To: <3847D480.12A31666@prescod.net> References: <000801bf3d53$77a44f20$3a2d153f@tim> <3847D480.12A31666@prescod.net> Message-ID: <14408.874.505464.996655@goon.cnri.reston.va.us> Paul Prescod proposes a new charter for the types-sig: > * the goal is a optional static type system for version 2. > * presume that the type/class dichotomy has been removed in V2 > * backwards compatibility with current code is relatively important > * compatibility with the Python 1.x interpreter is NOT important > * interfaces are not an issue > * parameterized (template) types are not available > * names are type checked, not expressions > * got now, only named types (types and classes) can be declared, not >lists and tuples of types If you're going to develop a static type system to describe Python programs (optional or otherwise), then I think you can't punt on all the things you want to punt on. > * interfaces are not an issue Yes, they are :-). > * parameterized (template) types are not available They need to be. > * names are type checked, not expressions Expressions need type checking, too! I'm thinking of the "the" special form in Common Lisp. (I don't have much experience with CL, so I'd appreciate input from someone who is.) Regardless of these minor quibbles, my largest complaint is: > * the goal is a optional static type system for version 2. What exactly is the deliverable. Saying an "optional static type system" is a bit vague. What is it specifically? A formal specification of the type system? A stand-alone utility that reports type errors? A new compiler? If this is a type system for Python 2, it seems that the best a SIG can hope for right now is a specification of the type system. Since Py2 design hasn't even started. Jeremy From jim@digicool.com Fri Dec 3 17:55:12 1999 From: jim@digicool.com (Jim Fulton) Date: Fri, 03 Dec 1999 12:55:12 -0500 Subject: [Types-sig] RE: [meta-sig] The Types-SIG is comatose. Let's retire it. References: <000801bf3d53$77a44f20$3a2d153f@tim> <3847D480.12A31666@prescod.net> <38480080.3403BDDF@vet.uu.nl> Message-ID: <38480400.D3EE8A6@digicool.com> Martijn Faassen wrote: > > Paul Prescod wrote: > > > > > taking-no-more-from-this-than-that-a-successful-sig-needs-a- > > > focused-charter-ly y'rs - tim > > > > I propose that the types sig be re-commissioned with a much tighter > > commission. Let's focus on ONE of the three problems listed in our old > > charter: > > > > http://www.python.org/sigs/types-sig/ I really agree with this. > > And let's start with a clear direction from the Powers that Be. > > > > I propose: > > > > * the goal is a optional static type system for version 2. > > Okay, I'll assume this goal for now. I'd like to see something happen > with interfaces too, but I'll just assume/hope that an interface > proposal will arise 'naturally' from any static type system we come up > with. I intend to summarize the interfaces discussion and report back. 
I also intend to go ahead and release the interface implementation based on requirements that we agreed to at Spam7 and mostly agreed to in the SIG. We'll also start folding it into Zope. Based on actual experience using it, we'll have a basis for future discussions. I desperately hope these future discussions happen somewhere other than the reinvented types sig. > > * presume that the type/class dichotomy has been removed in V2 > > Gladly. So, what does this mean in practice? A particular class is > another > type? I don't want to accidentally start the discussion on the dichotomy > itself > here, I just want to know what Python 2 is like in practice. For now > I'll > assume that if I declare a class in Python 2, that class becomes a type. I vaguely remember agreement on a number of issues. As I said in a previous post, I'll try to summarise the progress made and report back. We can decide what to do based on that. (Alternatively, if someone else wants to summarize that's OK with me.) (snip, I don't really care that much about static typing, except that I'm generally wary of it. ;) Jim -- Jim Fulton mailto:jim@digicool.com Python Powered! Technical Director (888) 344-4332 http://www.python.org Digital Creations http://www.digicool.com http://www.zope.org Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list with out my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats. From GoldenH@littoncorp.com Fri Dec 3 18:03:04 1999 From: GoldenH@littoncorp.com (Golden, Howard) Date: Fri, 3 Dec 1999 10:03:04 -0800 Subject: [Types-SIG] Python vs. Smalltalk/Strongtalk, etc. Was: The Types- SIG is comatose. Message-ID: Paul Prescod wrote: > And let's start with a clear direction from the Powers that Be. > > I propose: > > * the goal is a optional static type system for version 2. > * presume that the type/class dichotomy has been removed in V2 > * backwards compatibility with current code is relatively important > * compatibility with the Python 1.x interpreter is NOT important > * interfaces are not an issue > * parameterized (template) types are not available > * names are type checked, not expressions > * got now, only named types (types and classes) can be declared, not > lists and tuples of types There are a lot of different proposals. Do we all agree on all of these points? (Unlikely!) > Start from these (very similar!) proposals: > > http://www.python.org/~rmasse/papers/python-types/ > The current Visual Basic type system > Something somewhere from JimH > The type declaration part of strongtalk > The first half of this: > http://www.foretec.com/python/workshops/1998-11/greg-type-ideas.html It must be serendipity, but I was just thinking about this subject yesterday, and I went so far as to look up Strongtalk and download Squeak. Syntax differences aside, what I think we would benefit from is a comparison of the capabilities of Python1.x and proposed Python2 to Smalltalk/Strongtalk/Squeak, Visual Basic, etc. For me, I am looking at Python as a general purpose language, rather than a scripting language, so programming-in-the-large features are important. -- Specific questions: -- What if the C definition of functions and methods were extended by adding a signature object? (If so, how can signatures be specified?) Could the signatures then be used to generate more efficient code? Should there be function/method choice by signature? 
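To make the first question concrete, here is a rough sketch of the kind of information a signature object might record, written as plain Python data that a checker or code generator could inspect. (The names and layout are entirely made up for illustration; this is not a proposal for actual syntax or a real API.)

def distance(x, y):
    return abs(x - y)

# Hypothetical: a table of signatures, keyed by function name, that a
# type checker or compiler could consult before generating code.
signatures = {
    'distance': {'args': [('x', 'Integer'), ('y', 'Integer')],
                 'returns': 'Integer'},
}
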
Maybe I'm trying to make Python into something it wasn't intended to be, but I have this wish that I wouldn't have to use different languages for different tasks. Howard B. Golden Software developer Litton Industries, Inc. Woodland Hills, California From m.faassen@vet.uu.nl Fri Dec 3 18:08:29 1999 From: m.faassen@vet.uu.nl (Martijn Faassen) Date: Fri, 03 Dec 1999 19:08:29 +0100 Subject: [Types-sig] RE: [meta-sig] The Types-SIG is comatose. Let's retire it. References: <000801bf3d53$77a44f20$3a2d153f@tim> <3847D480.12A31666@prescod.net> <38480080.3403BDDF@vet.uu.nl> <38480400.D3EE8A6@digicool.com> Message-ID: <3848071D.5A994688@vet.uu.nl> Jim Fulton wrote: > > Martijn Faassen wrote: > > > > Paul Prescod wrote: > > > > > > > taking-no-more-from-this-than-that-a-successful-sig-needs-a- > > > > focused-charter-ly y'rs - tim > > > > > > I propose that the types sig be re-commissioned with a much tighter > > > commission. Let's focus on ONE of the three problems listed in our old > > > charter: > > > > > > http://www.python.org/sigs/types-sig/ > > I really agree with this. But I suppose you disagree with Paul on what this focus problem should be? You'd prefer interfaces, right? Or seeing what you said later on in your post, perhaps an interface-SIG? I expect I'd contribute to any discussion of interfaces *or* static types. I'd probably be able to contribute more of practical value to any interface development right now. I don't have so much to contribute about the class/type dichotomy. > > > And let's start with a clear direction from the Powers that Be. > > > > > > I propose: > > > > > > * the goal is a optional static type system for version 2. > > > > Okay, I'll assume this goal for now. I'd like to see something happen > > with interfaces too, but I'll just assume/hope that an interface > > proposal will arise 'naturally' from any static type system we come up > > with. > > I intend to summarize the interfaces discussion and report back. That'd be really helpful. > I also intend to go ahead and release the interface implementation > based on requirements that we agreed to at Spam7 and mostly agreed to > in the SIG. That'd be even more helpful. > We'll also start folding it into Zope. And that'd be wonderful! I am starting to feel that need after getting lost in the Zope sources too often. I'd like to contribute; perhaps by documenting something for starters. Any ideas? > Based on actual > experience using it, we'll have a basis for future discussions. So practical.! :) I'd like to get in on this early on. I assume I'll catch your announcement on the release of the interface implementation, but I'd also be very interested to follow the process of rolling it into Zope from the start. Not that I'm likely to be able to contribute much at the start, but it just sounds really interesting to me. Any idea on how this could be accomplished? > I desperately hope these future discussions happen somewhere other than > the reinvented types sig. The interfaces-SIG? :) [snip class/type discussion] [lots on static typing] > (snip, I don't really care that much about static typing, except that > I'm generally wary of it. ;) *grin* Okay, I suggest another design goal for the revived types-SIG: 'Pass the Fulton Test'. We must strive for a static type system so wonderful that even Jim Fulton will like it. 
:) Regards, Martijn From m.faassen@vet.uu.nl Fri Dec 3 18:17:26 1999 From: m.faassen@vet.uu.nl (Martijn Faassen) Date: Fri, 03 Dec 1999 19:17:26 +0100 Subject: [Types-sig] RE: [meta-sig] The Types-SIG is comatose. Let's retire it. References: <000801bf3d53$77a44f20$3a2d153f@tim> <3847D480.12A31666@prescod.net> <14408.874.505464.996655@goon.cnri.reston.va.us> Message-ID: <38480936.270617B9@vet.uu.nl> Jeremy Hylton wrote: > > Paul Prescod proposes a new charter for the types-sig: > > * the goal is a optional static type system for version 2. > > * presume that the type/class dichotomy has been removed in V2 > > * backwards compatibility with current code is relatively important > > * compatibility with the Python 1.x interpreter is NOT important > > * interfaces are not an issue > > * parameterized (template) types are not available > > * names are type checked, not expressions > > * got now, only named types (types and classes) can be declared, not > >lists and tuples of types > > If you're going to develop a static type system to describe Python > programs (optional or otherwise), then I think you can't punt on all > the things you want to punt on. I probably agree with you (at least partially). See my previous post. > > * interfaces are not an issue > Yes, they are :-). Why, exactly? > > * parameterized (template) types are not available > They need to be. Why, exactly? :) > > * names are type checked, not expressions > Expressions need type checking, too! I'm thinking of the "the" > special form in Common Lisp. (I don't have much experience with CL, > so I'd appreciate input from someone who is.) I'm even less familiar with CL than you are, so I don't know... > Regardless of these minor quibbles, my largest complaint is: > > * the goal is a optional static type system for version 2. > > What exactly is the deliverable. Saying an "optional static type > system" is a bit vague. What is it specifically? A formal > specification of the type system? A stand-alone utility that reports > type errors? A new compiler? Very good question. We need to agree on a deliverable. > If this is a type system for Python 2, it seems that the best a SIG > can hope for right now is a specification of the type system Unfortunately this kind of goal may be too vague to actually involve people. Not being able to try things out in some kind of implementation may disconnect the discussion from reality. > Since > Py2 design hasn't even started. When will this start, by the way? Anybody know or is this still pure speculation? The conference? I started wondering when I saw this in the 'A Date with Tim Peters...' post by Guido on comp.lang.python: - a developers' day where the feature set of Python 2.0 is worked out. Regards, Martijn From Paul@digicool.com Fri Dec 3 18:33:35 1999 From: Paul@digicool.com (Paul Everitt) Date: Fri, 3 Dec 1999 13:33:35 -0500 Subject: [Types-sig] RE: [meta-sig] The Types-SIG is comatose. Let's retire it. Message-ID: <613145F79272D211914B0020AFF64019262F5D@gandalf.digicool.com> Hey folks, isn't this technical discussion better handled on the types-sig? :^) --Paul > -----Original Message----- > From: Martijn Faassen [mailto:m.faassen@vet.uu.nl] > Sent: Friday, December 03, 1999 1:17 PM > Cc: types-sig@python.org; meta-sig@python.org > Subject: Re: [Types-sig] RE: [meta-sig] The Types-SIG is > comatose. Let's > retire it. > > > Jeremy Hylton wrote: > > > > Paul Prescod proposes a new charter for the types-sig: > > > * the goal is a optional static type system for version 2. 
> > > * presume that the type/class dichotomy has been removed in V2 > > > * backwards compatibility with current code is relatively > important > > > * compatibility with the Python 1.x interpreter is NOT important > > > * interfaces are not an issue > > > * parameterized (template) types are not available > > > * names are type checked, not expressions > > > * got now, only named types (types and classes) can be > declared, not > > >lists and tuples of types > > > > If you're going to develop a static type system to describe Python > > programs (optional or otherwise), then I think you can't punt on all > > the things you want to punt on. > > I probably agree with you (at least partially). See my previous post. > > > > * interfaces are not an issue > > Yes, they are :-). > > Why, exactly? > > > > * parameterized (template) types are not available > > They need to be. > > Why, exactly? :) > > > > * names are type checked, not expressions > > Expressions need type checking, too! I'm thinking of the "the" > > special form in Common Lisp. (I don't have much experience with CL, > > so I'd appreciate input from someone who is.) > > I'm even less familiar with CL than you are, so I don't know... > > > Regardless of these minor quibbles, my largest complaint is: > > > * the goal is a optional static type system for version 2. > > > > What exactly is the deliverable. Saying an "optional static type > > system" is a bit vague. What is it specifically? A formal > > specification of the type system? A stand-alone utility > that reports > > type errors? A new compiler? > > Very good question. We need to agree on a deliverable. > > > If this is a type system for Python 2, it seems that the best a SIG > > can hope for right now is a specification of the type system > > Unfortunately this kind of goal may be too vague to actually involve > people. Not being able to try things out in some kind of > implementation > may disconnect the discussion from reality. > > > Since > > Py2 design hasn't even started. > > When will this start, by the way? Anybody know or is this still pure > speculation? The conference? I started wondering when I saw > this in the > 'A Date with Tim Peters...' post by Guido on comp.lang.python: > > - a developers' day where the feature set of Python 2.0 is > worked out. > > Regards, > > Martijn > > _______________________________________________ > Meta-sig maillist - Meta-sig@python.org > http://www.python.org/mailman/listinfo/meta-sig > From janssen@parc.xerox.com Fri Dec 3 19:21:53 1999 From: janssen@parc.xerox.com (Bill Janssen) Date: Fri, 3 Dec 1999 11:21:53 PST Subject: [Types-sig] RE: [meta-sig] The Types-SIG is comatose. Let's retire it. In-Reply-To: Your message of "Fri, 03 Dec 1999 09:52:42 PST." <14408.874.505464.996655@goon.cnri.reston.va.us> Message-ID: <99Dec3.112202pst."3586"@watson.parc.xerox.com> > Regardless of these minor quibbles, my largest complaint is: > > * the goal is a optional static type system for version 2. > > What exactly is the deliverable. Saying an "optional static type > system" is a bit vague. What is it specifically? A formal > specification of the type system? A stand-alone utility that reports > type errors? A new compiler? I share some of Jeremy's concerns about the single goal. My favorite tack on these things is to focus on what the problem is. In my view, the largest single technical problem with Python is that it doesn't afford the static type checking that Java has. 
This, in my experience when I ask people about it, always turns out to mean that there's no way to type-check the use of an imported module. So I'd make the priority be the ability to optionally declare types in both callable signatures and in the code itself, and to have types checked at least across use of imported modules. Note that, contrary to Jeremy's assertion, this doesn't explicitly mention interfaces, and doesn't necessarily involve them. Of course, defining a module always implicitly defines an interface, so one could argue that interfaces are always a factor. Bill From prescod@prescod.net Fri Dec 3 19:38:05 1999 From: prescod@prescod.net (Paul) Date: Fri, 3 Dec 1999 13:38:05 -0600 (CST) Subject: [Types-sig] RE: [meta-sig] The Types-SIG is comatose. Let's retire it. In-Reply-To: <14408.874.505464.996655@goon.cnri.reston.va.us> Message-ID: On Fri, 3 Dec 1999, Jeremy Hylton wrote: > > If you're going to develop a static type system to describe Python > programs (optional or otherwise), then I think you can't punt on all > the things you want to punt on. Forever, no? For a first draft? Yes. Type systems can be extensible. C didn't foresee objects but C++ added them, and C++ didn't support parameterized types (at first) but added those too. I'm always torn on these design issues between trying to get it all right the first time and doing it incrementally. There are big risks either way but insofar as we never get anywhere when we try to do it all at once...that seems like the bigger risk. > > * interfaces are not an issue > Yes, they are :-). Not in Visual Basic. :) > > * parameterized (template) types are not available > They need to be. At some point, yes. For us to be able to say that foo is an integer and bar is a string, no. A lot of people would LOVE to have that level of type safety. > > * names are type checked, not expressions > Expressions need type checking, too! Maybe someday... or let me say that I'm all for expressions being type *checked* but not for a syntax for declaring the type of an expression. I'm not in favor of a "cast" or "assert-type" statement in version 1 of our type system. > > * the goal is a optional static type system for version 2. > > What exactly is the deliverable. Saying an "optional static type > system" is a bit vague. What is it specifically? A formal > specification of the type system? A stand-alone utility that reports > type errors? A new compiler? A formal specification of the type system that Guido likes enough to say: "yes, this will be the basis of Python 2's static type checking. Now go improve it and build on it." > If this is a type system for Python 2, it seems that the best a SIG > can hope for right now is a specification of the type system. Since > Py2 design hasn't even started. Agreed. I was only talking about a document that could serve first as an RFC and then later as a specification. Paul Prescod From jeremy@cnri.reston.va.us Fri Dec 3 20:15:39 1999 From: jeremy@cnri.reston.va.us (Jeremy Hylton) Date: Fri, 3 Dec 1999 15:15:39 -0500 (EST) Subject: [Types-sig] RE: [meta-sig] The Types-SIG is comatose. Let's retire it. 
In-Reply-To: <99Dec3.112202pst."3586"@watson.parc.xerox.com> References: <14408.874.505464.996655@goon.cnri.reston.va.us> Message-ID: <14408.9451.432414.245360@goon.cnri.reston.va.us> >>>>> "PP" == Paul writes: PP> On Fri, 3 Dec 1999, Jeremy Hylton wrote: >> If you're going to develop a static type system to describe >> Python programs (optional or otherwise), then I think you can't >> punt on all the things you want to punt on. PP> Forever, no? For a first draft? Yes. Type systems can be PP> extensible. C didn't forsee objects but C++ added them and C++ PP> doesn't support parameterized types (at first) but added those PP> two. And Java didn't support them at first, but lots of people gripe about it and several people have proposed solutions. If we learn a lesson from C++ and Java here, it is that parameterized types are an important part of the type system. PP> I'm always torn on these design issues between trying to get it PP> all right the first time and doing it incrementally. There are PP> big risks either way but insofar as we never get anywhere when PP> we try to do it all at once...that seems like the bigger risk. I think I see where you're coming from now. I might agree that some of the issues (e.g. parameterized types) aren't important for the first draft. They will need to be added at some point before the work is complete, so that SIG charter shouldn't specifically exclude them. Bill Janssen made a different and good suggestion about what the product of the SIG would be: a specification and a mechanism to type check the use of a module. A potentially interesting variant of that is to type-check the use of Java object by JPython programs. Which is one reason why I think interfaces, for example, need to be part of the type system. BJ> Note that, contrary to Jeremy's assertion, this doesn't BJ> explicitly mention interfaces, and doesn't necessarily involve BJ> them. Of course, defining a module always implicitly defines an BJ> interface, so one could argue that interfaces are always a BJ> factor. We want to be able to say something like: "Method expects a file-like object as its second argument." Specifying "file-like object" requires something like an interface. [tangent?] I've looked very briefly at MzScheme, a Scheme implementation done by the PLT group at Rice. It supports objects and interfaces, and units (modules) and signatures. At first glance, it appears to be a carefully thought-out way to add type checking to an object-oriented, dynamically-typed language. Jeremy From jim@digicool.com Sat Dec 4 00:25:15 1999 From: jim@digicool.com (Jim Fulton) Date: Fri, 03 Dec 1999 19:25:15 -0500 Subject: [Types-sig] RE: [meta-sig] The Types-SIG is comatose. Let's retire it. References: <000801bf3d53$77a44f20$3a2d153f@tim> <3847D480.12A31666@prescod.net> <38480080.3403BDDF@vet.uu.nl> <38480400.D3EE8A6@digicool.com> <3848071D.5A994688@vet.uu.nl> Message-ID: <38485F6B.E4D18FE1@digicool.com> Martijn Faassen wrote: > > Jim Fulton wrote: > > > > Martijn Faassen wrote: > > > > > > Paul Prescod wrote: > > > > > > > > > taking-no-more-from-this-than-that-a-successful-sig-needs-a- > > > > > focused-charter-ly y'rs - tim > > > > > > > > I propose that the types sig be re-commissioned with a much tighter > > > > commission. Let's focus on ONE of the three problems listed in our old > > > > charter: > > > > > > > > http://www.python.org/sigs/types-sig/ > > > > I really agree with this. > > But I suppose you disagree with Paul on what this focus problem should > be? 
I don't care what "this" problem is. I see three problems and, while there may be some interdependency, I think we would make better use of our time thinking of them and working on them separately. I endorse having the type system work on "static typing" (uh, whatever that is...) as long as it doesn't work on interfaces and removing the class/type dicotomy. > You'd prefer interfaces, right? Or seeing what you said later on in > your post, perhaps an interface-SIG? Yes, although at this point, I don't care if it's a SIG. In fact, I think a better course of action would be to release my interface module and let people use it and develop opinions based on it. (And, BTW, address some Zope issues. :) > I expect I'd contribute to any > discussion of interfaces *or* static types. I'd probably be able to > contribute more of practical value to any interface development right > now. Maybe you can pitch in to applying interfaces in Zope. Have you read my proposal from waaaaay back? > I don't have so much to contribute about the class/type dichotomy. Note that, as a Zope user, you enjoy the benefits of removing it. (Most Zope classes, including ZClasses are also types via ExtensionClass. An additional related issue is to make classes first-class in the sense that they have their own methods/attributes. This would have made ZClasses easier.) (snip) > > We'll also start folding it into Zope. > > And that'd be wonderful! I am starting to feel that need after getting > lost in the Zope sources too often. Yee ha! > I'd like to contribute; perhaps by > documenting something for starters. Any ideas? Maybe you should take the lead on folding them into Zope? Any way you want to contribute would be welcome. :) > > Based on actual > > experience using it, we'll have a basis for future discussions. > > So practical.! :) I'd like to get in on this early on. I assume I'll > catch your announcement on the release of the interface implementation, > but I'd also be very interested to follow the process of rolling it into > Zope from the start. Yee ha! > Not that I'm likely to be able to contribute much > at the start, but it just sounds really interesting to me. Any idea on > how this could be accomplished? Frankly, I haven't thought about it in a while. I'm sure I'll have some thoughts and some specific suggestions when I review the types sig material. In any case, that discussion should happen elsewhere, either in private email or on Zope-dev. (snip) > > (snip, I don't really care that much about static typing, except that > > I'm generally wary of it. ;) > > *grin* Okay, I suggest another design goal for the revived types-SIG: > 'Pass the Fulton Test'. We must strive for a static type system so > wonderful that even Jim Fulton will like it. :) Not necessary. I'm confident that there are plenty of other skeptics out there. :) Jim -- Jim Fulton mailto:jim@digicool.com Python Powered! Technical Director (888) 344-4332 http://www.python.org Digital Creations http://www.digicool.com http://www.zope.org Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list with out my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats. From janssen@parc.xerox.com Sat Dec 4 03:06:15 1999 From: janssen@parc.xerox.com (Bill Janssen) Date: Fri, 3 Dec 1999 19:06:15 PST Subject: [Types-sig] RE: [meta-sig] The Types-SIG is comatose. Let's retire it. 
In-Reply-To: Your message of "Fri, 03 Dec 1999 16:25:15 PST." <38485F6B.E4D18FE1@digicool.com> Message-ID: <99Dec3.190623pst."3586"@watson.parc.xerox.com> > Maybe you can pitch in to applying interfaces in Zope. Have you read > my proposal from waaaaay back? You know, it would be great if the types-sig had a page pointing to various documents, like "Jim's proposal from waaaaaay back". Bill From jim@digicool.com Sat Dec 4 15:50:54 1999 From: jim@digicool.com (Jim Fulton) Date: Sat, 04 Dec 1999 15:50:54 +0000 Subject: [Types-sig] RE: [meta-sig] The Types-SIG is comatose. Let's retire it. References: <99Dec3.190623pst."3586"@watson.parc.xerox.com> Message-ID: <3849385E.3B381984@digicool.com> Bill Janssen wrote: > > > Maybe you can pitch in to applying interfaces in Zope. Have you read > > my proposal from waaaaay back? > > You know, it would be great if the types-sig had a page pointing to > various documents, like "Jim's proposal from waaaaaay back". I'll make this available. Stay tuned. :) Jim -- Jim Fulton mailto:jim@digicool.com Technical Director (888) 344-4332 Python Powered! Digital Creations http://www.digicool.com http://www.python.org Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list with out my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats. From paul@prescod.net Sat Dec 4 16:32:50 1999 From: paul@prescod.net (Paul Prescod) Date: Sat, 04 Dec 1999 10:32:50 -0600 Subject: [Types-sig] Static typing considered HARD Message-ID: <38494232.C1381ED9@prescod.net> I'm still not sure what to do about the static typing and the types sig. The more I thought about types the less I became convinced that a quick "low hanging fruit" approach would work. I no longer propose a quick RFC on static typing. Here's the problem: in Visual Basic, Java, ML and other languages I am most familiar with, compilation is conceptually three pass: * parse * "resolve names to type/code/variable references" * execute In other words, the entire universe of types is figured out before a single line of code executes. In that context the words "static typing" have an obvious meaning. While you are resolve name references you do a bunch of checks to make sure that they are used consistently. But in Python, type objects only come about *through* the execution of code. This makes Python incredibly dynamic but it also means that the question of what exactly static type checking means is confused. Simple example: import sys if sys.argv[0]=="weirdness": from foo_mod import foo_class else: from foo_mod2 import foo_class One could imagine that in some Python 2, import statements and class definitions could be limited to being at the top, before "code". There might be some special syntax (e.g. __import__, __define_class__ ) for doing module-loading and type definition at runtime. Still, I don't consider that something for the types-sig to work out. My personal opinion is that it would be a Good Thing for Python to become a tad less dynamic in the "core syntax" in exchange for compile-time checking of names. Note that in a lot of ways, Java is "as dynamic" as Python. You can introduce new functions and classes "at runtime." The difference is that Java's syntax for doing so is brutally complex and verbose so you are disinclined to do it. I think that there must be a middle ground where our "default semantics" are static but it is easy enough to do dynamic things (e.g. 
foo_mod = __import__( "foo.py")) that we don't feel burdened. Our innovation beyond Java would not just be syntax. We could recognize that modules and types introduced "at runtime" are pyobjects and just allow them to be used with no casting or special syntax. Only the *introduction syntax* would be special. So where Java would say something like: this.that.Module mod = this.that.LoadModule( "foo" ) this.that.Class cls = mod.loadClass( "myclass" ) this.that.Method meth = cls.loadMethod( "doit" ) this.that.Arglist args = new ArgList() args.addArg( "arg1" ) args.addArg( "arg2" ) Object rc = meth.Invoke( args ) Python would say something like: foo = __import__( "foo" ) foo.myclass.doit( "arg1", "arg2" ) Once again, Visual Basic (shudder) is a good guide here. Although I am not consciously cloning Visual Basic, my ideas seem to be naturally tending towards it. Once again it seems to have a pretty common sense (to me!) approach to static type checking. Even if we ignore static type checking Python 2 really has to do something about the "misspelling problem." One extra character on a method name can crash a server that has been running for weeks. Once this problem is fixed, the term "static type checking" will become meaningful. In the current environment, it is probably not and thus should not be the first focus of a new types-sig. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself "I always wanted to be somebody, but I should have been more specific." --Lily Tomlin From uche.ogbuji@fourthought.com Sat Dec 4 17:18:38 1999 From: uche.ogbuji@fourthought.com (Uche Ogbuji) Date: Sat, 04 Dec 1999 10:18:38 -0700 Subject: [Types-sig] Static typing considered HARD References: <38494232.C1381ED9@prescod.net> Message-ID: <38494CEE.ED11604A@fourthought.com> Paul Prescod wrote: > Here's the problem: in Visual Basic, Java, ML and other languages I am > most familiar with, compilation is conceptually three pass: > > * parse > * "resolve names to type/code/variable references" > * execute This seems all out of whack to me. First of all, symbol-table management may or may not belong to the "parse" step, depending on your preferences. The Dragon book ducusses this matter in good detail. I don't know about VB, but Java and C/C++ certainly merge your steps 1 & 2. C/C++ also does not have "execute" as any recognizable part of compilation, unless you mean cpp and template instantiation. I don't think Java has "execute" as part of compilation either. ML, at least the version I used a few years ago, is something of its own breed of fish. > But in Python, type objects only come about *through* the execution of > code. This makes Python incredibly dynamic but it also means that the > question of what exactly static type checking means is confused. Simple > example: > > import sys > > if sys.argv[0]=="weirdness": > from foo_mod import foo_class > else: > from foo_mod2 import foo_class This is the sort of thing that gives Python its power, and it is the sort of thing without which I'm not sure I wouldn't be considering another language. > One could imagine that in some Python 2, import statements and class > definitions could be limited to being at the top, before "code". There > might be some special syntax (e.g. __import__, __define_class__ ) for > doing module-loading and type definition at runtime. Still, I don't > consider that something for the types-sig to work out. 
My personal > opinion is that it would be a Good Thing for Python to become a tad less > dynamic in the "core syntax" in exchange for compile-time checking of > names. This is exactly the sort of idea that terrifies me about Python 2, as I've done a poor job of expressing before. My hope is that Python 2 remains Python, and such artificial constraints as "imports only at the top" and all that in order to satisfy IMHO mis-placed notions of type safety are dropped in the nearest dustbin. > Note that in a lot of ways, Java is "as dynamic" as Python. You can > introduce new functions and classes "at runtime." The difference is that > Java's syntax for doing so is brutally complex and verbose so you are > disinclined to do it. No! No! No! If you are talking about Java reflections and introspection, I have no inkling how these features lend it even a modicum of Python's dynamicism. Note that Python's true introspection and dynamic typing is one of my most powerful tools in converting Java programmers to the language. I have heard Java described as "programming in a straight jacket". That is a very accurate observation, and the precise reason I don't want Python to even start in that direction. > I think that there must be a middle ground where > our "default semantics" are static but it is easy enough to do dynamic > things (e.g. foo_mod = __import__( "foo.py")) that we don't feel > burdened. I'll look out warily for the sort of middle ground in question. If it's something such as "imports only at the top", I guess I'll just have to scream blood and bile. > Even if we ignore static type checking Python 2 really has to do > something about the "misspelling problem." One extra character on a > method name can crash a server that has been running for weeks. Once > this problem is fixed, the term "static type checking" will become > meaningful. In the current environment, it is probably not and thus > should not be the first focus of a new types-sig. I keep hearing this sort of thing, and I keep saying that it's a red herring. Lack of static typing does _not_ prevent Python from being scalable to large-scale and production environments. Our experience at FourThought, where many of our projects are small-enterprise systems built with Python and sometimes CORBA, will make it very hard for anyone to convince me so. I think the experience of users such as eGroups supports my feeling. If anything, it is Java that I think is tremendously over-rated for large-scale projects and I predict its failure in that space will soon be an industry scandal. I also don't see this "misspelling" problem. Proper configuration-management procedures and testing, along with intelligent error-recovery, prevent such problems, which can also occur in the most strongly-typed systems. -- Uche Ogbuji FourThought LLC, IT Consultants uche.ogbuji@fourthought.com (970)481-0805 Software engineering, project management, Intranets and Extranets http://FourThought.com http://OpenTechnology.org From uche.ogbuji@fourthought.com Sat Dec 4 17:25:19 1999 From: uche.ogbuji@fourthought.com (Uche Ogbuji) Date: Sat, 04 Dec 1999 10:25:19 -0700 Subject: [Types-sig] So What is Python Anyway? References: <38494232.C1381ED9@prescod.net> Message-ID: <38494E7F.375BD82F@fourthought.com> All these radical suggestions for the transmogrification of Python 2 leads me to the overwhelming question. What is Python? What makes us use this language? What are the particular use-cases that we think impede our use of this language? 
I think that maybe a comprehensive and convincing description of the problem that the types-sig is trying to solve is essential before we go down the road of more proposals to cripple Python's dynamicism and all that. -- Uche Ogbuji FourThought LLC, IT Consultants uche.ogbuji@fourthought.com (970)481-0805 Software engineering, project management, Intranets and Extranets http://FourThought.com http://OpenTechnology.org From faassen@vet.uu.nl Sat Dec 4 18:21:43 1999 From: faassen@vet.uu.nl (Martijn Faassen) Date: Sat, 4 Dec 1999 19:21:43 +0100 Subject: [Types-sig] Static typing considered HARD In-Reply-To: <38494CEE.ED11604A@fourthought.com> References: <38494232.C1381ED9@prescod.net> <38494CEE.ED11604A@fourthought.com> Message-ID: <19991204192143.A25667@vet.uu.nl> Uche Ogbuji wrote: > Paul Prescod wrote: [snip] > > But in Python, type objects only come about *through* the execution of > > code. This makes Python incredibly dynamic but it also means that the > > question of what exactly static type checking means is confused. Simple > > example: > > > > import sys > > > > if sys.argv[0]=="weirdness": > > from foo_mod import foo_class > > else: > > from foo_mod2 import foo_class > > This is the sort of thing that gives Python its power, and it is the > sort of thing without which I'm not sure I wouldn't be considering > another language. > > > One could imagine that in some Python 2, import statements and class > > definitions could be limited to being at the top, before "code". There > > might be some special syntax (e.g. __import__, __define_class__ ) for > > doing module-loading and type definition at runtime. Still, I don't > > consider that something for the types-sig to work out. My personal > > opinion is that it would be a Good Thing for Python to become a tad less > > dynamic in the "core syntax" in exchange for compile-time checking of > > names. > > This is exactly the sort of idea that terrifies me about Python 2, as > I've done a poor job of expressing before. My hope is that Python 2 > remains Python, and such artificial constraints as "imports only at the > top" and all that in order to satisfy IMHO mis-placed notions of type > safety are dropped in the nearest dustbin. It's good that someone expressed this. While I myself would argue for some form of static typing being added to (part of) Python, I do think Python's dynamicism should be kept in mind very strongly. [snip more arguments against any curtailing of Python's dynamicism] > > Even if we ignore static type checking Python 2 really has to do > > something about the "misspelling problem." One extra character on a > > method name can crash a server that has been running for weeks. Once > > this problem is fixed, the term "static type checking" will become > > meaningful. In the current environment, it is probably not and thus > > should not be the first focus of a new types-sig. > > I keep hearing this sort of thing, and I keep saying that it's a red > herring. Lack of static typing does _not_ prevent Python from being > scalable to large-scale and production environments. Our experience at > FourThought, where many of our projects are small-enterprise systems > built with Python and sometimes CORBA, will make it very hard for anyone > to convince me so. I think the experience of users such as eGroups > supports my feeling. Likewise the experiences of the Zope user base. I've been debugging my own Zope products, which had syntax errors and misspellings all over the place. 
Zope itself, however, keeps running happily, as it'll catch the exceptions. As you say in a part of your post that I snipped, good exception handling facilities and testing procedures alleviate a lot of the problems with misspellings and the like.

That said, I am interested in attempts that make Python even more robust. I do occasionally worry about code that may contain bugs but that is not exercised enough during debugging. Of course this happens with statically typed languages as well, but at least the compiler catches some problems.

> If anything, it is Java that I think is
> tremendously over-rated for large-scale projects and I predict its
> failure in that space will soon be an industry scandal.

Interesting. Anyway, it's good that your view is present on the types-SIG.

My take on static types in Python has been the Swallow proposal. The idea is that we want some early-result points in the project to add static types to Python. With quite a few others I deem the possible speed payoff of adding static types at least as important as the possible code-quality payoff.

Adding static types to Python proper is hard, and undesirable if it entails giving up too much of Python's dynamicism, as has been observed.

The assumption of Swallow is that many parts of a typical Python program do not profit a lot from Python's dynamic typing, though of course other parts do. Traditionally the only way to gain speed with Python programs has been to move parts that can be static anyway to C. This is however a rather big step. It would be nicer if our extension modules could be more like Python itself. This way there is a gradual transition from dynamic Python to static Python code.

The Swallow proposal is to find a subset of Python (Swallow) that is horrible in all the ways Uche so emphatically dislikes. :) Get rid of whatever is necessary in Python to make Swallow amenable to static types; restrict imports to the top, restrict what magic one can do with classes, etc. The important point is that Swallow is a strict subset of Python, not adding any facilities or different semantics of its own, as much as possible. Then, provide a facility to describe the type signature of any class, function or variable in the Swallow code. No fancy type inference, just the programmer describing everything. After that the Swallow code could be compiled (or translated to C).

Of course I'm skimming over lots and lots of problems here; Swallow code can't for instance use any non-Swallow module. Writing a C translator is hard. Writing a static type checker for Swallow is hard. Identifying the Swallow subset is hard, and preserving Python semantics in it is hard. Still, it seems to me it's less hard than adding optional static types to Python itself, while still keeping lots of the payoff. It'd be great if Python 2 had a Swallow subsystem.

A possible activity for the types-SIG could in fact be to identify the proper Swallow subset of Python; that subset of Python amenable to static types and fairly straightforward translation to C code.

Of course I keep pushing Swallow without writing any actual code, so you all may be bored of it by now.
:) Regards, Martijn From uche.ogbuji@fourthought.com Sat Dec 4 21:37:30 1999 From: uche.ogbuji@fourthought.com (Uche Ogbuji) Date: Sat, 04 Dec 1999 14:37:30 -0700 Subject: [Types-sig] Static typing considered HARD References: <38494232.C1381ED9@prescod.net> <38494CEE.ED11604A@fourthought.com> <19991204192143.A25667@vet.uu.nl> Message-ID: <3849899A.DC694EB2@fourthought.com> Martijn Faassen wrote: > My take on static types in Python has been the Swallow proposal. The idea > is that we want some early-result points in the project to add static > types to Python. With quite a few others I deem the possible speed payoff > of adding static types as least as important as the possible code-quality > payoff. > > Adding static types to Python proper is hard, and undesirable if it entails > giving up too much of Python's dynamicism, as has been observed. > > The assumption of Swallow is that many parts of a typical Python program > do not profit a lot from Python's dynamic typing, though of course other > parts do. Traditionally the only way to gain speed with Python programs > has been to move parts that can be static anyway to C. This is however a > rather big step. It would be nicer if our extension modules could be more > like Python itself. This way there is a gradual transition between > dynamic Python to static Python code. > > The Swallow proposal is to find a subset of Python (Swallow) that is > horrible in all the ways Uche so empathically dislikes. :) Get rid of > whatever is necessary in Python to make Swallow amenable to static types; > restrict imports to the top, restrict what magic one can do with classes, > etc. The important point is that Swallow is a strict subset of Python, > not adding any facilities or different semantics of its own, as much as > possible. I actually don't have too much problem with this approach. I don't like to entirely shun the voices that clamor for dynamic typing: my main concern is that such mechanisms are entirely optional and transparent to those who don't want them. Your discussion of optimization exactly meets my experience. When we run into speed problems, we find a part of the 20% of the code that is really doing all the work, and we re-write it in C. An open-source example is in 4XSLT, which at first did all the Path expression parsing in Python. We found that this was having far too heavy an effect on performance and re-wrote it mostly in C. If there were a way to take _only those sections to be optimized_ and instead re-write them in a Pythonic syntax that could then be compiled to bare-metal speeds, I would appreciate it and use it as much as anyone else. I wouldn't expect or desire such a facility in the language core, however. The type-safety issue is entirely different, and IMHO this is where the real fantasy comes in: people thinking that statically-typed languages are really less susceptible to semantic errors than Python. Nevertheless some _very_ smart people here say that static typing will solve their code-quality problems, so I say, why can't we deal with this using a separate static-type-checker, maybe with some interface-definition language embedded in DocStrings or a separate spec file? Of course this doesn't address the problem of dynamic type-modification, but if that's so scary, why use Python? 
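For what it's worth, a minimal sketch of the docstring route, written against today's Python for brevity. The "types:" line format and the argument-count check are invented here purely for illustration, not a proposed convention:

    import inspect

    def declared_signature(func):
        # Pull a "types: a, b -> c" line out of the docstring, if there is one.
        doc = inspect.getdoc(func) or ""
        for line in doc.splitlines():
            line = line.strip()
            if line.startswith("types:"):
                args_part, _, result = line[len("types:"):].partition("->")
                args = [a.strip() for a in args_part.split(",") if a.strip()]
                return args, result.strip()
        return None

    def check_function(func):
        # Complain when the declared argument list disagrees with the code.
        decl = declared_signature(func)
        if decl is None:
            return
        declared_args, _ = decl
        actual = func.__code__.co_varnames[:func.__code__.co_argcount]
        if len(declared_args) != len(actual):
            print("%s: docstring declares %d argument(s), code takes %d"
                  % (func.__name__, len(declared_args), len(actual)))

    def replace_first(s, old):
        """Remove the first occurrence of old from s.

        types: string, string -> string
        """
        return s.replace(old, "", 1)

    check_function(replace_first)   # silent: declaration and code agree

A real checker would of course parse the declared types and follow calls across modules; the point is only that the declarations can live in docstrings and be checked by a tool that never calls the functions it inspects.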
-- Uche Ogbuji FourThought LLC, IT Consultants uche.ogbuji@fourthought.com (970)481-0805 Software engineering, project management, Intranets and Extranets http://FourThought.com http://OpenTechnology.org From da@ski.org Sat Dec 4 23:18:56 1999 From: da@ski.org (David Ascher) Date: Sat, 4 Dec 1999 15:18:56 -0800 (Pacific Standard Time) Subject: [Types-sig] Static typing considered HARD In-Reply-To: <38494CEE.ED11604A@fourthought.com> Message-ID: On Sat, 4 Dec 1999, Uche Ogbuji wrote: > > Even if we ignore static type checking Python 2 really has to do > > something about the "misspelling problem." One extra character on a > > method name can crash a server that has been running for weeks. Once > > this problem is fixed, the term "static type checking" will become > > meaningful. In the current environment, it is probably not and thus > > should not be the first focus of a new types-sig. > > I keep hearing this sort of thing, and I keep saying that it's a red > herring. Lack of static typing does _not_ prevent Python from being > scalable to large-scale and production environments. Our experience at > FourThought, where many of our projects are small-enterprise systems > built with Python and sometimes CORBA, will make it very hard for anyone > to convince me so. I think the experience of users such as eGroups > supports my feeling. Actually, I think you've picked the wrong example here. The engineering manager at eGroups is frustrated at his inability to check their Python code at compile-time, and it's not an accident that Scott Hassan (CTO of egroups) coauthored with another eGrouper the pylint type-checking tool they announced a few weeks ago. Typechecking at compile time is a huge issue for them. (Interestingly, as of a few months ago, Python wasn't their bottleneck -- their DB system was). I see two very distinct problems, though -- one is the use of 'statically typed variables', which requires fundamental changes to Python's typesystem. The other is 'compile-time type/signature/interface checking', which could probably be done coarsely with add-on tools without changing the syntax or type system one iota (ok, maybe one or two iotas). > see this "misspelling" problem. Proper configuration-management > procedures and testing, along with intelligent error-recovery, prevent > such problems, which can also occur in the most strongly-typed systems. Wouldn't you agree that enforcing these 'proper procedures' is much harder in a language which doesn't do half the job for you? --david [Please folks, let's keep this off of meta-sig. Fix the reply-to headers!] From uche.ogbuji@fourthought.com Sun Dec 5 00:06:33 1999 From: uche.ogbuji@fourthought.com (Uche Ogbuji) Date: Sat, 04 Dec 1999 17:06:33 -0700 Subject: [Types-sig] Static typing considered HARD References: Message-ID: <3849AC89.1173B163@fourthought.com> David Ascher wrote: > > > Even if we ignore static type checking Python 2 really has to do > > > something about the "misspelling problem." One extra character on a > > > method name can crash a server that has been running for weeks. Once > > > this problem is fixed, the term "static type checking" will become > > > meaningful. In the current environment, it is probably not and thus > > > should not be the first focus of a new types-sig. > > > > I keep hearing this sort of thing, and I keep saying that it's a red > > herring. Lack of static typing does _not_ prevent Python from being > > scalable to large-scale and production environments. 
Our experience at > > FourThought, where many of our projects are small-enterprise systems > > built with Python and sometimes CORBA, will make it very hard for anyone > > to convince me so. I think the experience of users such as eGroups > > supports my feeling. > > Actually, I think you've picked the wrong example here. The engineering > manager at eGroups is frustrated at his inability to check their Python > code at compile-time, and it's not an accident that Scott Hassan (CTO of > egroups) coauthored with another eGrouper the pylint type-checking tool > they announced a few weeks ago. Typechecking at compile time is a huge > issue for them. (Interestingly, as of a few months ago, Python wasn't > their bottleneck -- their DB system was). Is their problem performance or defect-management? Again, there is an important difference. I agree that typing can help the former: I am doubtful that it is a panacea for the latter. > I see two very distinct problems, though -- one is the use of 'statically > typed variables', which requires fundamental changes to Python's > typesystem. The other is 'compile-time type/signature/interface checking', > which could probably be done coarsely with add-on tools without changing > the syntax or type system one iota (ok, maybe one or two iotas). > > > see this "misspelling" problem. Proper configuration-management > > procedures and testing, along with intelligent error-recovery, prevent > > such problems, which can also occur in the most strongly-typed systems. > > Wouldn't you agree that enforcing these 'proper procedures' is much harder > in a language which doesn't do half the job for you? No language that I know of does even a tenth of the job of configuration management, error-handling or testing for anybody. They are not matters for a programming language to address. -- Uche Ogbuji FourThought LLC, IT Consultants uche.ogbuji@fourthought.com (970)481-0805 Software engineering, project management, Intranets and Extranets http://FourThought.com http://OpenTechnology.org From uche.ogbuji@fourthought.com Sun Dec 5 09:04:31 1999 From: uche.ogbuji@fourthought.com (Uche Ogbuji) Date: Sun, 05 Dec 1999 02:04:31 -0700 Subject: [Types-sig] Static typing considered HARD References: Message-ID: <384A2A9F.39DCAA84@fourthought.com> David Ascher wrote: > > I program in Python perhaps 40 hours a week, and have done so for a long > > time. Most of what I work on are large-scale systems. Very strange > > that my typos (and they are legion) are much less catastrophic than your > > own. > > Ah, well, probably you're just better at it than I am. =) > > My programs are typically small and run for a long time. They also change > ten times daily due to the changing nature of the requirements. There is > no 'finished' program in my current line of work. Just a different way of > doing business. Note that developing a test suite for this sort of code > is unrealistic. I'm paid to do science, not to do regression tests, and > the regression suite is likely to be longer and buggier than the actual > code. > > Perhaps it's best if we took this off-line though -- I think we're > straying from the types-sig charter. I'll just quickly round things up by saying that many of the hard lessons I've learned about software defects pre-date my use of Python. 
Lessons such as "the open/closed principle", "dependencies between modules should be as much as possible in the form of a DAG", "testing should bubble up from low-level object interfaces and coverage to high-level object-collaboration and sequence". These ideas are neither helped nor hurt by Python's dynamicism. All the latter is is a tool to improve the expressiveness of programming. This expressiveness, in my experience, lowers the cost of Python programming independently of the other factors, and it is what attracts me to the language. As an aside, re: expressiveness: ideas of type and all that are not "natural", which is why I wonder that your students clamor so much for static typing. I've programmed C++ for 6 years or more and Java for at least a couple of years, and in my experience, developers of similar skill will inject many more defects into an application using C++ and Java than they will using Python. That's why I am resisting radical change of the status quo. I worry that we might upset the formula that works so well for Python. But I guess there's no point continuing to gripe until I know the nature of the poison. Until I see some concrete proposal, I guess, I'll end the thread. -- Uche Ogbuji FourThought LLC, IT Consultants uche.ogbuji@fourthought.com (970)481-0805 Software engineering, project management, Intranets and Extranets http://FourThought.com http://OpenTechnology.org From uche.ogbuji@fourthought.com Sun Dec 5 06:09:18 1999 From: uche.ogbuji@fourthought.com (Uche Ogbuji) Date: Sat, 04 Dec 1999 23:09:18 -0700 Subject: [Types-sig] Static typing considered HARD References: Message-ID: <384A018E.9F7F811C@fourthought.com> David Ascher wrote: > > Is their problem performance or defect-management? Again, there is an > > important difference. I agree that typing can help the former: I am > > doubtful that it is a panacea for the latter. > > The latter. The quote (paraphrased from memory) is "When someone changes > a function interface, there's no way to know if we've caught all of the > calls to that function in the tens of thousands of line of code that we > have except to run the code'. Have they heard of Bertrand Meyer's open/closed principle? As I suspected, the root problem is poor software engineering, and has little to do with Python. -- Uche Ogbuji FourThought LLC, IT Consultants uche.ogbuji@fourthought.com (970)481-0805 Software engineering, project management, Intranets and Extranets http://FourThought.com http://OpenTechnology.org From uche.ogbuji@fourthought.com Sun Dec 5 06:06:14 1999 From: uche.ogbuji@fourthought.com (Uche Ogbuji) Date: Sat, 04 Dec 1999 23:06:14 -0700 Subject: [Types-sig] Static typing considered HARD References: Message-ID: <384A00D6.C3015C9D@fourthought.com> David Ascher wrote: > > No language that I know of does even a tenth of the job of configuration > > management, error-handling or testing for anybody. They are not matters > > for a programming language to address. > > I guess we'll have to agree to disagree. > > I've been doing some playing with Swing using JPython. Because it's > wicked slow to start, (due to Java mostly) the > edit-run-traceback-edit-run-traceback cycle is significantly longer than > with with CPython. That's when I curse the fact that the compile-time > analysis didn't catch simple typos, trivial mistakes in signatures, etc. I > *love* Python's dynamicity. 
But mostly I use its 'wicked cool' dynamic > features, like modifying the type of a variable in a function call or > changing the __class__ of an object once in a very blue moon. I can agree to disagree as well as anyone, but I'll confess I'm still baffled at how you claim that any language automates configuration management, error-handling or testing to any significant extent. I guess we'll also have to agree to not understand each other. Also, I don't think I've _ever_ done anything as off-the-wall as "modifying the type of a variable in a function call or changing the __class__ of an object". I hope this isn't anyone's benchmark of Python's dynamicism. > In other words, I'm just suggesting that given that (I'd guess) 95% of the > code out there is such that variable maintain their type throughout the > life of the program and that the builtins don't typically get overriden, > it seems a shame not to play the numbers. And we don't have to cover all > the cases. Just the 80% which give the largest payoff. > > Another trivial example: I can never remember whether it's > pickle.dump(object, file) or pickle.dump(file, object). I tend to > remember that I don't remember after the simulation has run for two hours > (if I'm lucky) and the saving of state fails... I program in Python perhaps 40 hours a week, and have done so for a long time. Most of what I work on are large-scale systems. Very strange that my typos (and they are legion) are much less catastrophic than your own. -- Uche Ogbuji FourThought LLC, IT Consultants uche.ogbuji@fourthought.com (970)481-0805 Software engineering, project management, Intranets and Extranets http://FourThought.com http://OpenTechnology.org From skip@mojam.com (Skip Montanaro) Sun Dec 5 14:33:08 1999 From: skip@mojam.com (Skip Montanaro) (Skip Montanaro) Date: Sun, 5 Dec 1999 08:33:08 -0600 (CST) Subject: [Types-sig] Static typing considered HARD In-Reply-To: <38494232.C1381ED9@prescod.net> References: <38494232.C1381ED9@prescod.net> Message-ID: <14410.30628.368117.966134@dolphin.mojam.com> Paul> The more I thought about types the less I became convinced that a Paul> quick "low hanging fruit" approach would work. I no longer propose Paul> a quick RFC on static typing. Static typing/type inference/do nothing trichotomy has been around for so long that had any low hanging fruit been available to pluck, it would have already been done. If there is still some low hanging fruit that we'd have missed it would be spoiling on the ground by now... Welcome to the type zoo. ;-) I noticed that nobody has yet complained about the continued presence of meta-sig on the distribution list. Perhaps it's time to remove it, since the death of the types-sig seems to have been averted and we are now actually discussing types (or are we just leaving it there to make sure it gets enough traffic that it doesn't die?). Skip Montanaro | http://www.mojam.com/ skip@mojam.com | http://www.musi-cal.com/ 847-971-7098 | Python: Programming the way Guido indented... From gmcm@hypernet.com Mon Dec 6 20:21:09 1999 From: gmcm@hypernet.com (Gordon McMillan) Date: Mon, 6 Dec 1999 15:21:09 -0500 Subject: [Types-sig] Static typing considered HARD In-Reply-To: <384A018E.9F7F811C@fourthought.com> Message-ID: <1267611206-22768816@hypernet.com> Uche Ogbuji wrote: [David Ascher on eGroups] > > The latter. 
The quote (paraphrased from memory) is "When > > someone changes a function interface, there's no way to know if > > we've caught all of the calls to that function in the tens of > > thousands of line of code that we have except to run the code'. > > Have they heard of Bertrand Meyer's open/closed principle? As I > suspected, the root problem is poor software engineering, and has > little to do with Python. More practically, have they heard of grep? While I will certainly agree that it's very irritating to bomb on a typo after you've been processing for half an hour, I'm skeptical that there's a "cure" worth the price, (I favor the Lint approach to safety, because Lint is free to warn of questionable practices without outlawing them). I'm at the moment optimizing / debugging someone's Java applet that contains 90 (yes, ninety) classes. Vast amounts of this code exists purely to satisfy the Java compiler on questions of type-safety. Despite all this work, it's still not safe code. The equivalent Python would probably take no more than a dozen classes and be enormously easier to understand. Safer off the bat? No. Easier to make truly safe? Yes. My interest in "optional static typing" has always been in the possbility of optimizations. - Gordon From jim@digicool.com Mon Dec 6 22:28:10 1999 From: jim@digicool.com (Jim Fulton) Date: Mon, 06 Dec 1999 17:28:10 -0500 Subject: [Types-sig] Interfaces: Scarecrow implementation v 0.1 isavailable References: Message-ID: <384C387A.1B0F5B8A@digicool.com> John, Skaller, skaller@maxtal.com.au wrote: > > [Scarecrow v 0.1] Wow, talk about a slow response (from me). :) I'm trying to wrap this phase of the "interface" project up and need to response to this, er, one comment on the .1 version of the interface implementation. Note that the .1 release is not currently available but the .2 release soon will be. > > I'll try to add this to interscript, and integrate it with my protocols > module. :-) Cool. Any progress? > Sigh. It's a special case of a protocol. > > > Special-case handling of classes > > > > Special handling is required for Python classes to make assertions > > about the interfaces a class implements, as opposed to the > > interfaces that the instances of the class implement. You cannot > > simply define an '__implements__' attribute for the class because > > class "attributes" apply to instances. > > Yes you can. And you must. See below. > > > By default, classes are assumed to implement the Interface.Standard.Class > > interface. A class may override the default by providing a > > '__class_implements__' attribute which will be treated as if it were > > the '__implements__' attribute of the class. > > This cannot work. Uh, but it does. > What you need to do is fix the lookup routines, > that is, the routines that test if an object provides an interface, etc, > so that they look in the dictionary of an object directly! > > Don't use 'getattr', use > > object.__dict__.has_key('__implements__') > > and > > object.__class__.__dict__.has_key('__class_implements__') I don't see how this can work for the following two reasons: 1. if you evaluate:: someInterface.implementedBy(someClass) you'll get the answer for the class' instances, not the class. Then again, maybe I'm missunderstanding you. Perhaps you could give a complete alternative implementation for 'implementedBy'. 2. A compromise made by the scarecrow proposal is to allow "implements" assertions to be interited. For this, getattr is needed. 
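To make the class/instance wrinkle above concrete, here is a small sketch of the lookup behaviour in question; plain strings stand in for real interface objects, so this only illustrates why getattr and a separate __class_implements__ attribute are both involved:

    class Base:
        # An assertion about *instances* of Base (and, by inheritance, of Derived).
        __implements__ = "ISequence"

    class Derived(Base):
        # An assertion about the *class object itself*, per the Scarecrow draft.
        __class_implements__ = "IClass"

    # getattr on the class finds the instance-level assertion, which is why a
    # class cannot simply be given its own __implements__ attribute:
    print(getattr(Derived, "__implements__"))         # ISequence -- about instances
    print(getattr(Derived, "__class_implements__"))   # IClass -- about the class

    # Looking only in the class __dict__, as suggested, avoids that clash but
    # also misses assertions inherited from base classes:
    print("__implements__" in Derived.__dict__)        # False

    obj = Derived()
    print(getattr(obj, "__implements__"))              # ISequence, inherited as intended

The direct __dict__ lookup removes the special case, but, as point 2 notes, it also loses inherited assertions, which is what getattr buys.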
> This works, it is what I do in my protocols module, > and it gets rid of the special case, which is a sure sign of a design fault. > > > Trial baloon: abstract implementations > > > > Tim Peter's has expressed the desire to provide abstract > > implementations in an interface definitions, where, presumably, an > > abstract implementation uses only features defined by the > > interface. For example: > > > > class ListInterface(Interface.Standard.MutableSequence): > > > > def append(self, v): > > "add a value to the end of the object" > > > > def push(self, v): > > "add a value to the end of the object" > > self.append(v) > > Yes. This is useful. It is the basis for mixins in C++. > But one has to ask the question: why not just use a class, > and add a 'defer' keyword to Python. Because you will still be creating a separate interface. You'll use the 'defered' method on the interface to compute an base class with the implementation. I'll try to clarify this in the documentation. > Then again, you could just say 'pass'. > > >Issues > > > > o What should the objects that define attributes look like? > > They shouldn't *be* the attributes, but should describe the > > the attributes. > > Obviously, they should themselves be interfaces. > Since attributes are just objects. :-) I think that an attributes description could include it's interface, but it might include other information as well, such as it's documentation. Jim -- Jim Fulton mailto:jim@digicool.com Python Powered! Technical Director (888) 344-4332 http://www.python.org Digital Creations http://www.digicool.com http://www.zope.org Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list with out my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats. From gstein@lyra.org Tue Dec 7 00:50:11 1999 From: gstein@lyra.org (Greg Stein) Date: Mon, 6 Dec 1999 16:50:11 -0800 (PST) Subject: [Types-sig] changing variable types (was: Static typing considered HARD) In-Reply-To: <384A00D6.C3015C9D@fourthought.com> Message-ID: On Sat, 4 Dec 1999, Uche Ogbuji wrote: >... > Also, I don't think I've _ever_ done anything as off-the-wall as > "modifying the type of a variable in a function call or changing the > __class__ of an object". I hope this isn't anyone's benchmark of > Python's dynamicism. I think he means something like: names = { } for elem in whatever: names[extract_foo(elem)] = 1 names = names.keys() I've done this a number of times. It can be argued that using a single name ("names") for a single semantic/concept is a good thing, despite the fact that its type changes within the function. Introducing two names is certainly clearer from a type standpoint, but I'd argue that a reader doesn't care about *types*, but about what is happening (the semantics). At least, that's how I rationalize the pattern :-) Cheers, -g -- Greg Stein, http://www.lyra.org/ From paul@prescod.net Sun Dec 5 18:28:15 1999 From: paul@prescod.net (Paul Prescod) Date: Sun, 05 Dec 1999 13:28:15 -0500 Subject: [Types-sig] Static name checking Message-ID: <384AAEBF.BE3C989C@prescod.net> I stand by my position that static type checking is not possibile without static name checking. Therefore I have begun to think what static name checking would require. It isn't as draconian as what I suggested before. Functions, modules and variables can be declared "static". 
In current Python this would be done like this: import frozen frozen def foo( a ): return string.replace(a,"b") Frozen names can only refer to names in frozen namespace. Frozen namespaces cannot be changed at runtime. They may not refer to names in regular ("dynamic") namespaces. The namespaces may be in the same or other modules. Therefore, they can be checked without actually loading the module or instantiating the class. Aliases for frozen namespaces should also be frozen automatically. A frozen name checker would work by loading a document and parsing it looking for every reference to the name "frozen". Then it would look at the next line and verify that all referenced objects really are frozen. Then it would check that frozen namespaces are not modified. Of course a frozen name checker isn't trivial but it also isn't brain surgery. Anyone bored and underworked out there? From there, we could move to a first-class frozen keyword in Python 1.6: frozen def foo(a): return string.replace(a, "b" ) The definition of frozen objects cannot depend on runtime state like this: if a: frozen def foo(a): ... else: frozen def foo(b): ... So frozen functions and classes should be top-level. Methods in a frozen class are frozen. The word "freeze" already has baggage in the Python world but my second choice "static" does also. I am not voting for or against the continuation of the types-sig. At this point we probably need code more than talk. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Math -- that most logical of sciences -- teaches us that the truth can be highly counterintuitive and that sense is hardly common. K.C.Cole, "The Universe and the Teacup" From paul@prescod.net Sun Dec 5 18:29:39 1999 From: paul@prescod.net (Paul Prescod) Date: Sun, 05 Dec 1999 13:29:39 -0500 Subject: [Types-sig] Static typing considered HARD References: <38494232.C1381ED9@prescod.net> <38494CEE.ED11604A@fourthought.com> Message-ID: <384AAF13.2EBFD41C@prescod.net> Uche Ogbuji wrote: > > This seems all out of whack to me. First of all, symbol-table > management may or may not belong to the "parse" step, depending on your > preferences. The Dragon book ducusses this matter in good detail. I > don't know about VB, but Java and C/C++ certainly merge your steps 1 & > 2. Yes, in terms of implementation, but no, matching names to objects is not the responsibility of the parser. It is conceptually another layer that works on the output of the parser. Whether it works on a complete parse tree or incrementally is another issue. > C/C++ also does not have "execute" Sorry, I didn't mean to talk just about compilation. I was talking about the whole path from raw text to executable code. I need to talk about the whole path because Python does name recognition at runtime. > This is the sort of thing that gives Python its power, and it is the > sort of thing without which I'm not sure I wouldn't be considering > another language. Nobody is suggesting that we take those features out. > > Note that in a lot of ways, Java is "as dynamic" as Python. You can > > introduce new functions and classes "at runtime." The difference is that > > Java's syntax for doing so is brutally complex and verbose so you are > > disinclined to do it. > > No! No! No! If you are talking about Java reflections and > introspection, I have no inkling how these features lend it even a > modicum of Python's dynamicism. What can you do dynamically in Python that you cannot do with reflections and introspection? 
I've written "map", "apply" and the Y combinator in Java so I'm pretty confident that the issue is really just syntax and ease of use, not capabilities. You could prove me wrong by showing a Python programming pattern that could not be straightforwardly duplicated using Java reflection. > I keep hearing this sort of thing, and I keep saying that it's a red > herring. Lack of static typing does _not_ prevent Python from being > scalable to large-scale and production environments. You can build large-scale and production environments in TCL or Basic if you are dedicated enough. The question is whether the language is working with your or working against you. It seems obvious to me that it is not too much to ask for a language compiler to help you avoid mistakes at least the same degree that PowerPoint does. > I also don't > see this "misspelling" problem. Proper configuration-management > procedures and testing, along with intelligent error-recovery, prevent > such problems, which can also occur in the most strongly-typed systems. So in Java I find spelling mistakes by typing: "javac foo.java" and in Perl I find them by typing: use strict perl foo.pl and in Python I find them by hiring a team of testers to test every code path (perhaps through a GUI), find bugs, report them through a bugtracking system and have developers work on the reports. Does this sound competitive? I am teaching Python today to XML people. When I got to the attributes part the first thing they caught onto (without me hinting) was that spelling mistakes could go undetected for weeks. One student said that if I knew someone "on the inside" I need to talk to them about it because it is a major problem for him. Another student said that in another dynamic language they used, misspellings were 60% of the bug reports from users in the field. Yes, testing is important, but if you elevate it to the status of "any bugs not caught are the fault of testers" then you get to the point where the language takes no responsibility at all. That way lies Perl: "oh, didn't you WANT me to convert that boolean to a socket object for you? You should have tested better." If that's our mentality, Python throws way too many exceptions for problems it could silently leave to testers. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Math -- that most logical of sciences -- teaches us that the truth can be highly counterintuitive and that sense is hardly common. K.C.Cole, "The Universe and the Teacup" From m.faassen@vet.uu.nl Tue Dec 7 12:01:02 1999 From: m.faassen@vet.uu.nl (Martijn Faassen) Date: Tue, 07 Dec 1999 13:01:02 +0100 Subject: [Types-sig] Static name checking References: <384AAEBF.BE3C989C@prescod.net> Message-ID: <384CF6FE.BC2B7C2C@vet.uu.nl> Paul Prescod wrote: [freezing system] > A frozen name checker would work by loading a document and parsing it > looking for every reference to the name "frozen". Then it would look at > the next line and verify that all referenced objects really are frozen. > Then it would check that frozen namespaces are not modified. Of course a > frozen name checker isn't trivial but it also isn't brain surgery. > Anyone bored and underworked out there? But what do you do with lists (for instance)? You can't check at compile-time if an object that comes from a list is a string, and integer, or an object. If you then try to refer to a name in it (object.foo()) then you run into trouble with this concept. Or am I missing something? 
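A concrete version of that question, with made-up names: even if every module-level name below were frozen, the element type is only discovered at run time.

    class Plain:
        pass

    items = ["a string", 42, Plain()]

    for thing in items:
        # Whether this attribute reference is legal depends on what 'thing'
        # turns out to be, which a frozen-namespace checker cannot know.
        try:
            print(thing.upper())
        except AttributeError:
            print("%r has no upper() method -- only discovered at run time" % (thing,))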
[snip] > I am not voting for or against the continuation of the types-sig. At > this point we probably need code more than talk. I think at least I need a little bit more talk before I could come even close to designing the code for your proposal (not that I'm offering right now :). Currently it's not clear to me how you'd do name checking without some form of static type checking or type inference... Regards, Martijn From jim@digicool.com Tue Dec 7 14:08:20 1999 From: jim@digicool.com (Jim Fulton) Date: Tue, 07 Dec 1999 14:08:20 +0000 Subject: [Types-sig] Intefaces work summary and Python code available Message-ID: <384D14D4.C6959613@digicool.com> I've written up a summary of the interface work at: http://www.zope.org/Members/jim/PythonInterfaces/Summary In addition to the summary, there are a number of reference links, including a link to the Python implementation. Jim -- Jim Fulton mailto:jim@digicool.com Technical Director (888) 344-4332 Python Powered! Digital Creations http://www.digicool.com http://www.python.org Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list with out my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats. From paul@prescod.net Tue Dec 7 14:28:44 1999 From: paul@prescod.net (Paul Prescod) Date: Tue, 07 Dec 1999 09:28:44 -0500 Subject: [Types-sig] Types sig dead or alive Message-ID: <384D199C.C6771285@prescod.net> Okay, I am willing to try and lead the types-sig only until the conference and see if we can try to come up with something concrete as a proposal. We can circulate that at the conference and get comments. Jim can at the same time circulate an interfaces proposal. I am only interested at this point in the static type checking problem. The first step, I think, is for me to write up my static name checking proposal and get consensus on that. It would be a syntax for stating that module namespaces are immutable and that classes and functions only refer to immutable module namespaces. Another deliverable (probably not by the conference) would be code that checked that code conforms to those rules. At that point we would have a concept of "statically resolvable names." The next step would be to attach type signatures to statically resolvable names. I've reconsidered my opinion that Python 2 is our only concern. We should probably test out our ideas in Python 1.x so that we can be confident of them for Python 2. For purposes of checking, a "static type" is a statically resolvable name of a class. In Python 2, every "type" will also be a class (and vice versa) so we don't want to spend a lot of energy working around the class/type dichotomy. When we (later!) reach concensus on the structure of interfaces, those will also be usable as static types. There won't be anything (at first) like "list of integers" unless you create a ListOfIntegers class (which is certainly possible!). The static type checking system will not declare any existing code non-conformant. I am happy to have interfaces discussions in the sig but I don't want them to be the *same discussion* because I don't want to recurse into a meta-discussion about "what is a type system" or "why do we want static type checking" or "is static type checking as important as interfaces" etc. 
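As an aside on the "ListOfIntegers class" remark, one minimal sketch of such a class follows. UserList is imported from its present-day location, and only append and insert are guarded, so this is an illustration rather than a complete container:

    from collections import UserList

    class ListOfIntegers(UserList):
        def _check(self, value):
            if not isinstance(value, int):
                raise TypeError("ListOfIntegers takes integers, got %r" % (value,))

        def append(self, value):
            self._check(value)
            UserList.append(self, value)

        def insert(self, index, value):
            self._check(value)
            UserList.insert(self, index, value)

    nums = ListOfIntegers()
    nums.append(3)
    nums.append(4)
    # nums.append("five") would raise TypeError here, at the point of insertion.

The check still happens at run time, at the moment of insertion; the goal of the proposal is to establish the same guarantee before the program runs.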
-- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Math -- that most logical of sciences -- teaches us that the truth can be highly counterintuitive and that sense is hardly common. K.C.Cole, "The Universe and the Teacup" From paul@prescod.net Tue Dec 7 14:43:39 1999 From: paul@prescod.net (Paul Prescod) Date: Tue, 07 Dec 1999 09:43:39 -0500 Subject: [Types-sig] Static name checking References: <384AAEBF.BE3C989C@prescod.net> <384CF6FE.BC2B7C2C@vet.uu.nl> Message-ID: <384D1D1A.A321EC07@prescod.net> Martijn Faassen wrote: > > [freezing system] > > A frozen name checker would work by loading a document and parsing it > > looking for every reference to the name "frozen". Then it would look at > > the next line and verify that all referenced objects really are frozen. > > Then it would check that frozen namespaces are not modified. Of course a > > frozen name checker isn't trivial but it also isn't brain surgery. > > Anyone bored and underworked out there? > > But what do you do with lists (for instance)? You can't check at > compile-time if an object that comes from a list is a string, and > integer, or an object. If you then try to refer to a name in it > (object.foo()) then you run into trouble with this concept. Or am I > missing something? Yes, but that's my fault, not yours. My static name checker is not intended to work on attributes (including methods). Checking attributes is inextricably tied to real *type checking*. In fact it is type checking. My assertion is that the first step is to statically check coherence among Python's three (?) (function, module, builtin) runtime namespaces. Until that nut is cracked, static *type* checking (and thus attribute name checking) won't be possible. Once we have name checking then we can design a syntax to statically associate types with names. THEN we can do static type checking. I could be wrong but it seems to me that once this is done, the definition of swallow will be trivial: "A statically compilable Python module is a file where every name is frozen and every name has a type declaration." If you restrict yourself to that subset then you've essentially re-invented Java. But of course the whole point (hi Gordon and Uche!) is that you can choose WHEN to restrict yourself to that subset whereas Java gives you no option...and neither does Perl. If Python is stuck[1] between a rock (Perl) and a hard place (Java) then optional static type checking is the dynamite that frees us up. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Math -- that most logical of sciences -- teaches us that the truth can be highly counterintuitive and that sense is hardly common. K.C.Cole, "The Universe and the Teacup" From jim@digicool.com Tue Dec 7 15:15:58 1999 From: jim@digicool.com (Jim Fulton) Date: Tue, 07 Dec 1999 15:15:58 +0000 Subject: [Types-sig] Types sig dead or alive References: <384D199C.C6771285@prescod.net> Message-ID: <384D24AE.AF76BCA7@digicool.com> Paul Prescod wrote: > (snip) > Jim > can at the same time circulate an interfaces proposal. nah nah nah ... Jim is out of the interface proposal business, at least for now. There was alot of discussion last year that smelled reasonbly much like consensus. I put out a v0.1 release and waited the usual 1 year for comments. I've updated the release to reflect the 1 comment I got and have re-released the software. 
See: http://www.zope.org/Members/jim/PythonInterfaces/Summary I'll be releasing this more widely (comp.lang.python) and I imagine that someone will learn alot more about it while incorporating it into Zope. After we get some experience using it, we should devide what, if anything more to do with it, especially in standard Python releases. > I am only interested at this point in the static type checking problem. Cool. (snip) > I am happy to have interfaces discussions in the sig Please, lets refocuss the SIG and let interfaces escape. If anyone really cares about interfaces, lets form a separate SIG or mailing list. BTW, let's let the "Classes vs. types dichotomy" escape too. I promise to attempt a summary of the earlier discussions. Then we can decide what, if anything, to do next based on that. Jim -- Jim Fulton mailto:jim@digicool.com Technical Director (888) 344-4332 Python Powered! Digital Creations http://www.digicool.com http://www.python.org Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list with out my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats. From m.faassen@vet.uu.nl Tue Dec 7 17:19:22 1999 From: m.faassen@vet.uu.nl (Martijn Faassen) Date: Tue, 07 Dec 1999 18:19:22 +0100 Subject: [Types-sig] Static name checking References: <384AAEBF.BE3C989C@prescod.net> <384CF6FE.BC2B7C2C@vet.uu.nl> <384D1D1A.A321EC07@prescod.net> Message-ID: <384D419A.36E7E09D@vet.uu.nl> Paul Prescod wrote: > > Martijn Faassen wrote: > > > > [freezing system] > > > A frozen name checker would work by loading a document and parsing it > > > looking for every reference to the name "frozen". Then it would look at > > > the next line and verify that all referenced objects really are frozen. > > > Then it would check that frozen namespaces are not modified. Of course a > > > frozen name checker isn't trivial but it also isn't brain surgery. > > > Anyone bored and underworked out there? > > > > But what do you do with lists (for instance)? You can't check at > > compile-time if an object that comes from a list is a string, and > > integer, or an object. If you then try to refer to a name in it > > (object.foo()) then you run into trouble with this concept. Or am I > > missing something? > > Yes, but that's my fault, not yours. My static name checker is not > intended to work on attributes (including methods). Checking attributes > is inextricably tied to real *type checking*. In fact it is type > checking. > > My assertion is that the first step is to statically check coherence > among Python's three (?) (function, module, builtin) runtime namespaces. What about classes referring to attributes of 'self', for instance, though? I'm still not entirely clear on what you're trying to accomplish, I'm afraid. > Until that nut is cracked, static *type* checking (and thus attribute > name checking) won't be possible. > > Once we have name checking then we can design a syntax to statically > associate types with names. THEN we can do static type checking. I could > be wrong but it seems to me that once this is done, the definition of > swallow will be trivial: "A statically compilable Python module is a > file where every name is frozen and every name has a type declaration." And every imported module has the same properties, including this one. Though of course you may mean this with 'every name'. 
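To make the quoted definition a bit more concrete, here is a minimal run-time sketch of "every name has a type declaration", using an ordinary dictionary as a stand-in for the yet-to-be-designed syntax; the __declarations__ name and the name-to-type mapping are invented purely for illustration:

    import types

    __declarations__ = {
        "greeting": str,
        "repeat":   types.FunctionType,
    }

    greeting = "hello"

    def repeat(text, count):
        return text * count

    def check_declarations(namespace, declarations):
        # Report declared names that are unbound or bound to an unexpected type.
        for name, expected in declarations.items():
            if name not in namespace:
                print("declared but never bound: %s" % name)
            elif not isinstance(namespace[name], expected):
                print("%s is %s, declared as %s"
                      % (name, type(namespace[name]).__name__, expected.__name__))

    check_declarations(globals(), __declarations__)   # silent when everything matches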
I don't think it'll be that trivial, as you haven't defined 'type declaration'; you run into complexities here, especially if you involve classes and objects. Can't you go about it the other way around? First, you make a type declaration for all names in a module. Then you check (somehow) if there isn't code that contradicts this type definition; that is, there should be no assignments of one name to another that violates the static type definitions, no attribute accesses to undefined attributes, and so on. Of course you instantly produce errors if any name doesn't have a type definition. I don't see how your frozen idea helps a lot in this. A possible intermediate drop-off point resembling frozen may simply be a checker that determines if all names in a module are known to the static type system, without actually defining these types, though you run into trouble here with attribute accesses. The real hard part is the construction of the type checker. Somewhat easier is the definition of a generic type system. I'm still proposing to use standard Python objects such as dictionaries and tuples to define these types in, initially. Later on we can look at syntax, but you can get the type checker going without any syntax extensions. Another prerequisite for a type checker is the determination of the Swallow subset. For instance one can imagine that in Swallow it's illegal to import modules except at the top. I imagine these limitations will become more obvious after a type system has been developed. I have two possible approaches for the type checker in mind currently that leverage current Python; one is an AST based type checker, and another is a bytecode based typechecker. I'm not sure which one would be easier as I don't know enough about either Python's ASTs or bytecodes, but in the happy abstract space of insufficient information I can wrap my mind better around a bytecode based checker than around an AST based checker. There are bytecodes for assignment and attribute access and the various other operations that need the scrutiny of a type checker. For each such bytecode you'd need to write a type check. Checking a module is then going through all bytecodes of the module to see if they do legal things. As an aside, another task would be the writing of an interface layer between swallowed modules and non-swallowed ones. Any name that enters a swallowed module should have a static type description associated with it. A run-time layer can check whether each python object that is sent into Swallow conforms to the type definitions; a function expecting an int must indeed be sent an int object. If not, some kind of exception should be raised. Calling non-swallow modules from a Swallow module is more tricky, but again the Swallowed module provides type definitions for any name used in it, so it should provide interface definitions for any non-Swallow function used in a Swallowed module as well. So you can make run-time type checks on that interface as well. But I'm sure I'm missing a lot of subtleties here. :) Regards, Martijn From paul@prescod.net Wed Dec 8 13:10:16 1999 From: paul@prescod.net (Paul Prescod) Date: Wed, 08 Dec 1999 08:10:16 -0500 Subject: [Types-sig] Types sig dead or alive References: <384D199C.C6771285@prescod.net> <384D24AE.AF76BCA7@digicool.com> Message-ID: <384E58B8.4CE80095@prescod.net> Jim, I am happy to temporarily banish interface discussions from the sig. But...is it likely that you guys will have some Zope-interface experience by the conference?
I'm sure you guys are as up-to-the-wazoo as the rest of us but it would be cool if we could get a mini-report on whether the interfaces proposal *works*. Let me stress that I understand that you are probably more focused on having new features in Zope for the conference. Paul Prescod Jim Fulton wrote: > > Paul Prescod wrote: > > > > (snip) > > > Jim > > can at the same time circulate an interfaces proposal. > > nah nah nah ... Jim is out of the interface proposal business, > at least for now. There was alot of discussion last year that > smelled reasonbly much like consensus. I put out a v0.1 release > and waited the usual 1 year for comments. I've updated the release > to reflect the 1 comment I got and have re-released the software. > > See: http://www.zope.org/Members/jim/PythonInterfaces/Summary > > I'll be releasing this more widely (comp.lang.python) > and I imagine that someone will learn alot more about it > while incorporating it into Zope. After we get some > experience using it, we should devide what, if anything more > to do with it, especially in standard Python releases. > > > I am only interested at this point in the static type checking problem. > > Cool. > > (snip) > > > I am happy to have interfaces discussions in the sig > > Please, lets refocuss the SIG and let interfaces escape. > If anyone really cares about interfaces, lets form a separate > SIG or mailing list. > > BTW, let's let the "Classes vs. types dichotomy" escape too. > I promise to attempt a summary of the earlier discussions. > Then we can decide what, if anything, to do next based on that. > > Jim > > -- > Jim Fulton mailto:jim@digicool.com > Technical Director (888) 344-4332 Python Powered! > Digital Creations http://www.digicool.com http://www.python.org > > Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email > address may not be added to any commercial mail list with out my > permission. Violation of my privacy with advertising or SPAM will > result in a suit for a MINIMUM of $500 damages/incident, $1500 for > repeats. > > _______________________________________________ > Types-SIG mailing list > Types-SIG@python.org > http://www.python.org/mailman/listinfo/types-sig -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Floggings will increase until morale improves. From paul@prescod.net Wed Dec 8 14:41:01 1999 From: paul@prescod.net (Paul Prescod) Date: Wed, 08 Dec 1999 09:41:01 -0500 Subject: [Types-sig] Plea for help. References: <384AAEBF.BE3C989C@prescod.net> <384CF6FE.BC2B7C2C@vet.uu.nl> <384D1D1A.A321EC07@prescod.net> <384D419A.36E7E09D@vet.uu.nl> Message-ID: <384E6DFC.58AE0201@prescod.net> > I have two possible approaches for the type checker in mind currently > that leverage current Python; one is an AST based type checker, and > another is a bytecode based typechecker. I'm not sure which one would be > easier as I don't know enough about either Python's ASTs or bytecodes, > but in the happy abstract space of insufficient information I can wrap > my mind better around a bytecode based checker than around an AST based > checker. My feeling is the opposite. The AST follows the structure of the Python syntax more closely. Plus it has a superset of the bytecode information. Plus the Python grammar is the same for JPython but the bytecodes are not. The one virtue I can see in doing the checks on the bytecode is for Java-style opaque bytecode security. 
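[A minimal illustration, not part of the original thread: the standard `parser' and `token' modules already expose the parse tree that an AST-based checker like the one discussed above would walk, as nested tuples. The toy walker below just collects NAME tokens; a real checker would record where each name is bound.]

    import parser, token

    def collect_names(node, found=None):
        # productions look like (symbol, child, ...); leaves look like (token, string)
        if found is None:
            found = []
        if token.ISTERMINAL(node[0]):
            if node[0] == token.NAME:
                found.append(node[1])
        else:
            for child in node[1:]:
                collect_names(child, found)
        return found

    tree = parser.suite("x = 1\ndef f(y): return x + y\n").totuple()
    print(collect_names(tree))   # every NAME token in the module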
Here's my plea for help: among the many "Python compiler" projects out there, there must be some good Python code for walking around ASTs building type (or at least module) representation objects. I think that JPython is written in Java. What else should I be looking at? -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Floggings will increase until morale improves. From paul@prescod.net Wed Dec 8 14:41:30 1999 From: paul@prescod.net (Paul Prescod) Date: Wed, 08 Dec 1999 09:41:30 -0500 Subject: [Types-sig] Static name checking References: <384AAEBF.BE3C989C@prescod.net> <384CF6FE.BC2B7C2C@vet.uu.nl> <384D1D1A.A321EC07@prescod.net> <384D419A.36E7E09D@vet.uu.nl> Message-ID: <384E6E1A.3BA9DABF@prescod.net> Martijn Faassen wrote: > > > My assertion is that the first step is to statically check coherence > > among Python's three (?) (function, module, builtin) runtime namespaces. > > What about classes referring to attributes of 'self', for instance, > though? I'm still not entirely clear on what you're trying to > accomplish, I'm afraid. Let me see if I can do this with some invented notation. You'll have to cut me some slack for typos and omitted (hopefully irrelevant) details like about a dozen levels of the parse tree. Let's pretend we are talking about Java. Let's pretend that we are implementing a Java interpreter (including compiler) in the most straight-forward (not efficient) way. Consider the code:

    class J{
        String a;
        void foo(){
            a.whatever();
        }
    }

1. Parse it into tokens: (roughly)

    (classdef "J"
        (attributedef name: "a" type: (name-ref "String" ))
        (functiondef "foo"
            (function-body
                (method-call object: (name-ref a) method: "whatever"))))

That's very rough because you guys know about parse trees already.

2. Build "compile time objects" and replace variable references with pointers to "compile time objects":

    [class java.lang.String ....]
    [class J
        attributes: {"a": <the java.lang.String class object>}
        functions: {"foo": ... (method-call object: <the "a" attribute object> method: "whatever"))))

Step 2 is the step that Python doesn't have right now. Note that at the end of this step, the references to names in "static" namespaces have all been resolved but names in methods have NOT been resolved. One obvious reason for this is that Java and Python both allow forward references so maybe I don't even know what the methods of Strings are yet.

3. Conceptually, once all of the type and variable objects are built, THEN I can go through and check that the operations applied to types are legal. ONE SUCH OPERATION is ".whatever". It becomes possible to check that ".whatever" is legal at the same time that it becomes possible to check whether "a+b" is legal.

4. Generate bytecode.

5. Run it.

Python has steps 1, 4 and 5 but skips steps 2 and 3. I am trying to get us to the point where we can do step 2 so that we can get to step 3 eventually. I once wrote a compiler and I beat my head against a wall until I realized that foo.bar resolution is a massively different problem if foo is a module (doesn't rely on type system) or a class (does rely on type system). The point of the "static" keyword is to allow a Python author to say: "Some of my modules are static like Java modules. Please resolve references to these at compile time, not runtime." > > Once we have name checking then we can design a syntax to statically > > associate types with names. THEN we can do static type checking.
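[A toy rendering, mine rather than Paul's, of the "step 2" tables he describes above: module-level names get collected into compile-time tables, and dotted references into modules declared static are resolved before anything runs. The table contents are made up for illustration.]

    compile_time_tables = {
        "string": {"upper": "function"},
        "mymod":  {"VERSION": "string", "main": "function"},
    }

    def resolve(dotted):
        modname, attr = dotted.split(".")
        table = compile_time_tables.get(modname)
        if table is None:
            raise NameError("module %s is not static" % modname)
        if attr not in table:
            raise NameError("%s has no name %s" % (modname, attr))
        return table[attr]

    print(resolve("mymod.VERSION"))    # 'string'
    # resolve("mymod.typo") would be reported before run time

[Attribute resolution on class instances is exactly the part such a table cannot handle without a type system, which is the point Paul makes above.]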
I could > > be wrong but it seems to me that once this is done, the definition of > > swallow will be trivial: "A statically compilable Python module is a > > file where every name is frozen and every name has a type declaration." > > And every imported module has the same properties, including this one. > Though of course you may mean this with 'every name'. I don't think > it'll be that trivial, as you haven't defined 'type declaration'; you > run into complexities here, > especially if you involve classes and objects. That's true. That's why I'm not trying to solve that part of the problem yet. > Can't you go about it the other way around? First, you make a type > declaration for > all names in a module. There are four declarations we could imagine. Each is a little bit stronger than the previous. 1. "I believe that every name in this module/class that is not an attribute name can be statically resolved." 2. "I believe that this module can be used in other modules where every non-attribute name is supposed to be statically resolved." 3. "I believe that every name in this module/class can be statically type checked (including attribute name checking)". 4. "I believe that this module/class can be used in other modules where every name can be statically type checked." "freeze" could be 1. There is probably not much virtue in separating 1 and 2 so we could rather say that "freeze" actually means 2 which implies 1. A new, "type-safe" keyword might be used for 4 which again would imply 3 (and 2, and 1). If this is to be optional, off-by-default type and name checking then we need a way to turn it ON. "freeze" might be enough to allow some type inferencing and early binding (for performance, not safety) . "type-safe" would be used for performance and safety at a price that it would require you to stick to the "Java subset". > Then you check (somehow) if there isn't code that > contradicts this type definition; that is, there should be no > assignments of one name to another that violates the static type > definitions, no attribute accesses to undefined attributes, and so on. > Of course you instantly produce errors if any name doesn't have a type > definition. Here's the rub. In my mind, this should be legal code: def doubleString( String b ): return b*2 def doit(): doubleString( eval( raw_input() ) ) doit() That's what Python users will expect and that's also what VB does. This should be checked at *runtime*. On the other hand, THIS would cause a static type error: type-safe def doit(): doubleString( eval( raw_input() ) ) That's illegal because it claims to be type-safe but isn't really. The same goes for static: static def foo(): return this.that.b() is only valid if this.that is statically resolvable. (e.g. a module, not a class, and a module that is itself static) > Another > prerequisite for a type checker is the determination of the Swallow > subset. For instance one can imagine that in Swallow it's illegal to > import modules except on the top. I imagine these limitations will > become more obvious after a type system has been developed. I think we disagree on the granularity of the project. It should be possible to declare individual functions, classes or methods statically type checkable, not just whole modules. > A run-time layer can check whether each python object that is sent > into Swallow conforms to the type definitions; a function expecting an > int must indeed be sent an int object. If not, somekind of exception > should be raised. I agree. 
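[A bare-bones sketch, not from the thread, of the run-time layer Martijn describes and Paul agrees with above: values crossing into declared code (like Paul's doubleString) are checked against the declared types, and a mismatch surfaces as an ordinary exception. The names and the conformance rule are mine.]

    def conforms(value, classification):
        # stand-in conformance test; a real system would also honour interfaces
        return isinstance(value, classification)

    def call_checked(func, arg_types, args):
        for value, wanted in zip(args, arg_types):
            if not conforms(value, wanted):
                raise TypeError("%r does not conform to %r" % (value, wanted))
        return func(*args)

    def double_string(b):
        return b * 2

    print(call_checked(double_string, (str,), ("abc",)))    # 'abcabc'
    # call_checked(double_string, (str,), (5,)) raises TypeError at the boundary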
But I think we also need a way to say: "I want you to check this code at compile time, not runtime." -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Floggings will increase until morale improves. From gmcm@hypernet.com Wed Dec 8 16:28:52 1999 From: gmcm@hypernet.com (Gordon McMillan) Date: Wed, 8 Dec 1999 11:28:52 -0500 Subject: [Types-sig] Plea for help. In-Reply-To: <384E6DFC.58AE0201@prescod.net> Message-ID: <1267452506-32324318@hypernet.com> Paul Prescod wrote: > Here's my plea for help: among the many "Python compiler" > projects out there there must be some good Python code for > walking around ASTs building type (or at least module) > representation objects. I think that JPython is wirtten in Java. > What else should I be looking at? Probably the Python2C stuff that reformats a standard Python parse tree into something saner. Another possibility might be John Aycock's stuff; but his Python grammar doesn't produce an AST (it only verifies), and the grammar has some errors. I think Aaron Watters also did a Python grammar (for 1.4?), but I never looked at that. - Gordon From guido@CNRI.Reston.VA.US Wed Dec 8 16:34:44 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Wed, 08 Dec 1999 11:34:44 -0500 Subject: [Types-sig] Plea for help. In-Reply-To: Your message of "Wed, 08 Dec 1999 11:28:52 EST." <1267452506-32324318@hypernet.com> References: <1267452506-32324318@hypernet.com> Message-ID: <199912081634.LAA04169@eric.cnri.reston.va.us> > Paul Prescod wrote: > > > Here's my plea for help: among the many "Python compiler" > > projects out there there must be some good Python code for > > walking around ASTs building type (or at least module) > > representation objects. I think that JPython is wirtten in Java. > > What else should I be looking at? GMcM replied: > Probably the Python2C stuff that reformats a standard Python > parse tree into something saner. Another possibility might be > John Aycock's stuff; but his Python grammar doesn't produce > an AST (it only verifies), and the grammar has some errors. I > think Aaron Watters also did a Python grammar (for 1.4?), but > I never looked at that. Aaron's kjpylint contains a Python parser: http://www.chordate.com/kwParsing/ David Jeske's pylink also contains one: http://www.chat.net/~jeske/Projects/PyLint/download/pylint-19991121.py I seem to be having problems with pylint, which is much newer; the current kjpylint's parser is pretty robust as far as I can tell. --Guido van Rossum (home page: http://www.python.org/~guido/) From jeremy@cnri.reston.va.us Wed Dec 8 17:45:37 1999 From: jeremy@cnri.reston.va.us (Jeremy Hylton) Date: Wed, 8 Dec 1999 12:45:37 -0500 (EST) Subject: [Types-sig] Plea for help. In-Reply-To: <199912081634.LAA04169@eric.cnri.reston.va.us> References: <1267452506-32324318@hypernet.com> <199912081634.LAA04169@eric.cnri.reston.va.us> Message-ID: <14414.39233.255818.554490@goon.cnri.reston.va.us> >> Paul Prescod wrote: >> >> > Here's my plea for help: among the many "Python compiler" > >> projects out there there must be some good Python code for > >> walking around ASTs building type (or at least module) > >> representation objects. I think that JPython is wirtten in Java. >> > What else should I be looking at? Gordon and Guido offered some suggestions. I have done some noodling with the Py2C AST, and I think it is an excellent candidate. I was going to suggest that a good near-term goal for the type sig would be to write a Python compiler in Python, but I see that Paul has beaten me to it. 
I believe this project was also discussed on python-dev a few months ago (as part of the warnings discussion). I think it's a good project to tackle because it has usefulness beyond the specific approaches to static typing, which remain controversial. When I was using the Py2C transformer class, I made some modifications to the AST generated to make it a little easier to use interactively. The original definition for the AST was:

    class Node:
        def __init__(self, *args):
            self.__children = args
            self.lineno = None
        def __getitem__(self, index):
            return self.__children[index]
        def __str__(self):
            return str(self.__children)
        def __repr__(self):
            return "<Node %s>" % self.__children[0]
        def __len__(self):
            return len(self.__children)
        def __getslice__(self, low, high):
            return self.__children[low:high]
        def getChildren(self):
            return self.__children
        def getType(self):
            return self.__children[0]
        def asList(self):
            return tuple(asList(self.__children))

A tree of these nodes is created by the Transformer class, which walks the parse trees created by the parser module. I modified Node to be BaseNode and created specific classes for each type of node:

    class Function(BaseNode):
        def __init__(self, name, argnames, defaults, flags, doc, code):
            self.name = name
            self.argnames = argnames
            self.defaults = defaults
            self.flags = flags
            self.doc = doc
            self.code = code
            self._children = ('function', name, argnames, defaults, flags, doc, code)
        def __repr__(self):
            return "Function(%s,%s,%s,%s,%s,%s)" % self._children[1:]
        def __str__(self):
            return "func:%s" % self.name

Jeremy From eddy@chaos.org.uk Wed Dec 8 18:39:26 1999 From: eddy@chaos.org.uk (Edward Welbourne) Date: Wed, 08 Dec 1999 18:39:26 +0000 Subject: [Types-sig] Static typing considered ... UGLY Message-ID: <384EA5DE.3ADD8801@lsl.co.uk> Might I humbly suggest that:
    to incorporate static typing into python would change it beyond recognition
    it would probably be better to start from Algol and pythonify it, if that's where you want to go (hint: I don't)
    the right name for the relevant language would be typhoon because it's almost an anagram
    the real reason for doing it is speed
    when it breaks things it won't half tear them into little pieces
? (albeit viper is already out there and doubtless good ;^) A more pythonic approach would be to deploy some byte-code hacks which notice assertions of form assert isinstance(x, IntType) and optimise ensuing code around the presumption that the value in x when that assertion was executed is an int, allowing that all will go horribly wrong if it isn't (which won't be checked unless __debug__), but then we all know that speed kills. But only do this if the user has asked for type-asserted enhancements, and use a different .pyc extension for it. Might need a TypeException for throwing when it all goes horribly wrong. In a similar vein: could the interpreter and compiler exploit knowledge of an assertion a function makes (about its return value) just before returning ? i.e. calls to the function could presume the truth of what the function asserted ... not that I'm convinced that this is worth it, just that if you *insist* on static type notions, these are pythonic ways to approach it. But this is all `speed enhancement' (I refuse to call it optimisation: I have no evidence it gets anywhere near the optimum). There is a better way (I'll tell you about it late in January). Note: I believe the function type() should be removed totally, and isinstance should be replaced by (hint: think `type(x) in ...'
instead of `type(x) == ...') def isinstance(x, *what): """True if x is an instance of any of the given types or classes.""" for mode in what: if oldisinstance(x, mode): return 1 return 0 Then `try: ... except (tuple, of, exceptions):' would, of course, be using the given tuple as *types when checking the exception raised. I'd vote to keep this sig open for unification (all objects are objects and support the same protocols - a module with __call__ in its namespace is callable, for instance) but if all that's to be discussed is static typing, I'd vote for closure (prematurely and *with* prejudice). I intend to follow up this bull-headedness in January. See y'all at IPC8. Eddy. -- was it Sam Johnson who said something about knowledge of impending death concentrating the mind ? Hence The Grim Guido re-woke the types-sig. From jim@digicool.com Wed Dec 8 18:48:44 1999 From: jim@digicool.com (Jim Fulton) Date: Wed, 08 Dec 1999 13:48:44 -0500 Subject: [Types-sig] Types sig dead or alive References: <384D199C.C6771285@prescod.net> <384D24AE.AF76BCA7@digicool.com> <384E58B8.4CE80095@prescod.net> Message-ID: <384EA80C.42B533F6@digicool.com> Paul Prescod wrote: > > Jim, I am happy to temporarily banish interface discussions from the > sig. I'd prefer that it be permanent. I'll also reiterate that I'd like the Class-Type unification to be taken out too. > But...is it likely that you guys will have some Zope-interface > experience by the conference? Don't know. It probably depends on Martijn Faassen, who sort of volunteered to do Zope integration. :) > I'm sure you guys are as up-to-the-wazoo > as the rest of us but it would be cool if we could get a mini-report on > whether the interfaces proposal *works*. Yes it would, however I can't make any promises to do this myself (or commit DC), but I *am* willing to work with Martijn or anyone else who wants to take the lead for now. Note that, even if the implementation isn't exercised in the next month, there would be progress since Spam7 to report on. Jim -- Jim Fulton mailto:jim@digicool.com Python Powered! Technical Director (888) 344-4332 http://www.python.org Digital Creations http://www.digicool.com http://www.zope.org Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list with out my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats. From bwarsaw@cnri.reston.va.us (Barry A. Warsaw) Wed Dec 8 18:51:44 1999 From: bwarsaw@cnri.reston.va.us (Barry A. Warsaw) (Barry A. Warsaw) Date: Wed, 8 Dec 1999 13:51:44 -0500 (EST) Subject: [Types-sig] Types sig dead or alive References: <384D199C.C6771285@prescod.net> <384D24AE.AF76BCA7@digicool.com> <384E58B8.4CE80095@prescod.net> <384EA80C.42B533F6@digicool.com> Message-ID: <14414.43200.610908.557195@anthem.cnri.reston.va.us> >>>>> "JF" == Jim Fulton writes: JF> Note that, even if the implementation isn't exercised in the JF> next month, there would be progress since Spam7 to report on. Do you want another devday session, Jim? -Barry From jim@digicool.com Wed Dec 8 19:05:32 1999 From: jim@digicool.com (Jim Fulton) Date: Wed, 08 Dec 1999 14:05:32 -0500 Subject: [Types-sig] Types sig dead or alive References: <384D199C.C6771285@prescod.net> <384D24AE.AF76BCA7@digicool.com> <384E58B8.4CE80095@prescod.net> <384EA80C.42B533F6@digicool.com> <14414.43200.610908.557195@anthem.cnri.reston.va.us> Message-ID: <384EABFC.EF38BA24@digicool.com> "Barry A. 
Warsaw" wrote: > > >>>>> "JF" == Jim Fulton writes: > > JF> Note that, even if the implementation isn't exercised in the > JF> next month, there would be progress since Spam7 to report on. > > Do you want another devday session, Jim? No, but if recommendations are made on a devday, there should be some time spent on the next devday to report on progress. So, I think there should be some time (session, whatever) spent giving progress on projects launched by devday'98. Jim -- Jim Fulton mailto:jim@digicool.com Python Powered! Technical Director (888) 344-4332 http://www.python.org Digital Creations http://www.digicool.com http://www.zope.org Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list with out my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats. From jim@digicool.com Wed Dec 8 19:15:56 1999 From: jim@digicool.com (Jim Fulton) Date: Wed, 08 Dec 1999 14:15:56 -0500 Subject: [Types-sig] I like the new look of the types-sig page! Message-ID: <384EAE6C.1CF299A6@digicool.com> I like the new look at: http://www.python.org/sigs/types-sig/. Thanks! Jim -- Jim Fulton mailto:jim@digicool.com Python Powered! Technical Director (888) 344-4332 http://www.python.org Digital Creations http://www.digicool.com http://www.zope.org Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list with out my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats. From m.faassen@vet.uu.nl Wed Dec 8 20:11:50 1999 From: m.faassen@vet.uu.nl (Martijn Faassen) Date: Wed, 08 Dec 1999 21:11:50 +0100 Subject: [Types-sig] Types sig dead or alive References: <384D199C.C6771285@prescod.net> <384D24AE.AF76BCA7@digicool.com> <384E58B8.4CE80095@prescod.net> <384EA80C.42B533F6@digicool.com> Message-ID: <384EBB85.E402AD2D@vet.uu.nl> Jim Fulton wrote: > > Paul Prescod wrote: > > > > Jim, I am happy to temporarily banish interface discussions from the > > sig. > > I'd prefer that it be permanent. I'll also reiterate that > I'd like the Class-Type unification to be taken out too. > > > But...is it likely that you guys will have some Zope-interface > > experience by the conference? > > Don't know. It probably depends on Martijn Faassen, who sort > of volunteered to do Zope integration. :) Currently I'm way too busy, and I volunteered to be involved in it, not to do it all myself. :) That said, I *hope* I'll get more time in january and actually explore the issue better then. Don't know how far I'll get as there's more I need to do. > > I'm sure you guys are as up-to-the-wazoo > > as the rest of us but it would be cool if we could get a mini-report on > > whether the interfaces proposal *works*. > > Yes it would, however I can't make any promises to do this myself > (or commit DC), but I *am* willing to work with Martijn or anyone > else who wants to take the lead for now. I hope to get some time by the end of this month, but I can't say how much right now. ZFormulator, XMLWidgets, and a whole lot of other stuff still needs to be worked on. > Note that, even if the implementation isn't exercised in the next > month, there would be progress since Spam7 to report on. That's true. I did read through your documentation on it yesterday. It looks good. 
I recall having read it before way back when, too, but I think I understand it better now. Regards, Martijn From guido@CNRI.Reston.VA.US Wed Dec 8 22:01:31 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Wed, 08 Dec 1999 17:01:31 -0500 Subject: [Types-sig] I like the new look of the types-sig page! In-Reply-To: Your message of "Wed, 08 Dec 1999 14:15:56 EST." <384EAE6C.1CF299A6@digicool.com> References: <384EAE6C.1CF299A6@digicool.com> Message-ID: <199912082201.RAA04898@eric.cnri.reston.va.us> > I like the new look at: http://www.python.org/sigs/types-sig/. > > Thanks! > > Jim You're welcome. I basically ripped out three pages saying "coming soon" that were last edited a year ago, plus everything that referred to them. There wasn't much left after that. ;-) --Guido van Rossum (home page: http://www.python.org/~guido/) From paul@prescod.net Wed Dec 8 22:47:29 1999 From: paul@prescod.net (Paul Prescod) Date: Wed, 08 Dec 1999 17:47:29 -0500 Subject: [Types-sig] Re: Static typing considered ... UGLY References: <384EA5DE.3ADD8801@lsl.co.uk> Message-ID: <384EE001.7D040EFE@prescod.net> Edward Welbourne wrote: > > Might I humbly suggest that: > > to incorporate static typing into python would change it > beyond recognition Our intention is that all existing Python code would continue to be valid modulo the possible introduction of a couple of keywords. If that still doesn't sound like it meets your needs then I'll just have to apologize in advance. A few other points: * if Python is never allowed to make major changes it will die prematurely. The other languages (except Scheme and other arguably dead languages) grow and evolve. * I would be more interested in your technical concerns. "It's ugly" is too subjective....especially when nobody considers it "ugly" in every other programming language that has type declarations. Rather, I think that type declarations improve code readability. * the assertion syntax strikes me as doubly ugly. Imagine a function that takes 10 arguments with 10 of those assertions. * for me, the goal is not performance. Performance considerations are secondary. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Floggings will increase until morale improves. From gstein@lyra.org Thu Dec 9 23:04:17 1999 From: gstein@lyra.org (Greg Stein) Date: Thu, 9 Dec 1999 15:04:17 -0800 (PST) Subject: [Types-sig] Plea for help. In-Reply-To: <14414.39233.255818.554490@goon.cnri.reston.va.us> Message-ID: On Wed, 8 Dec 1999, Jeremy Hylton wrote: >... > Gordon and Guido offered some suggestions. > > I have done some noodling with the Py2C AST, and I think it is an > excellent candidate. http://www.mudlib.org/~rassilon/p2c/ Specifically, the file transformer.py in that distribution. I've threatened before to break it out and make it available on my Python page... ought to do that sometime. >... > When I was using the Py2C transformer class, I made some modifications > to the AST generated to make it a little easier to use interactively. > The original defintion for the AST was: >... > A tree of these nodes is created by the Transformer class, which walks > the parse trees created by the parser module. > > I modified Node to be BaseNode and created specific classes for each > of type of node: The Node subclasses were on Bill's to-do list. That's cool that you've already done it! 
Cheers, -g -- Greg Stein, http://www.lyra.org/ From GoldenH@littoncorp.com Fri Dec 10 01:48:45 1999 From: GoldenH@littoncorp.com (Golden, Howard) Date: Thu, 9 Dec 1999 17:48:45 -0800 Subject: [Types-sig] "Open-World" design using generic Java: Lesson for Py thon? Message-ID: I recommend you look at the paper "Safe 'Open-World' Designs in Java and GJ," by Marco Nissen and Karsten Weihe, ftp://ftp.fmi.uni-konstanz.de/pub/preprints/1998/preprint-066-02.ps.Z . In the paper (Section 4), they distinguish two use scenarios for static type safety. In one use, static safety is of no benefit. However, in the other case, it will lead to a more reliable system. I hope those of you who question the value of typing will read the paper and consider their argument. (Note: In the paper the authors conclude that Java is deficient in meeting the second use case. They find that GJ, the generic version of Java developed by Bracha, et al., has the necessary feature of parametric polymorphism to be used this way.) Howard B. Golden Software developer Litton Industries, Inc. Woodland Hills, California From paul@prescod.net Fri Dec 10 15:05:05 1999 From: paul@prescod.net (Paul Prescod) Date: Fri, 10 Dec 1999 10:05:05 -0500 Subject: [Types-sig] Plea for help. References: <1267452506-32324318@hypernet.com> <199912081634.LAA04169@eric.cnri.reston.va.us> <14414.39233.255818.554490@goon.cnri.reston.va.us> Message-ID: <385116A1.69E1BD05@prescod.net> Jeremy Hylton wrote: > > I was going to suggest that a good near-term goal for the type sig > would be to write a Python compiler in Python, but I see that Paul has > beaten me to it. Not quite. I'm not going to do anything about generating bytecodes. But it seems to me like that would be another cool project. Someone should do py2pyc and add it to the py2c distribution. But I'm not going to... Yes, we will eventually want such a beast in order to allow for some runtime checks (since changing and distributing a Python-coded compiler is probably easier than changing the C-coded interpreter). Here's what I *would* like to do. I would like to subclass your node objects and build "statically resolved" subtypes. This will be a natural base class for a new version of py2c (which as far as I know does no static resolution) and for an optimizing bytecode compiler. Hopefully a big chunk of the Python 2 compiler can be written in Python 2. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself "A writer is also a citizen, a political animal, whether he likes it or not. But I do not accept that a writer has a greater obligation to society than a musician or a mason or a teacher. Everyone has a citizen's commitment." - Wole Soyinka, Africa's first Nobel Laureate From paul@prescod.net Fri Dec 10 15:04:39 1999 From: paul@prescod.net (Paul Prescod) Date: Fri, 10 Dec 1999 10:04:39 -0500 Subject: [Types-sig] Plea for help. References: Message-ID: <38511687.73E793C6@prescod.net> Greg Stein wrote: > > The Node subclasses were on Bill's to-do list. That's cool that you've > already done it! Would it be possible to update the standard py2c distribution with these changes so that we don't have a code fork? -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself "A writer is also a citizen, a political animal, whether he likes it or not. But I do not accept that a writer has a greater obligation to society than a musician or a mason or a teacher. Everyone has a citizen's commitment." 
- Wole Soyinka, Africa's first Nobel Laureate From jeremy@cnri.reston.va.us Fri Dec 10 16:44:44 1999 From: jeremy@cnri.reston.va.us (Jeremy Hylton) Date: Fri, 10 Dec 1999 11:44:44 -0500 (EST) Subject: [Types-sig] Plea for help. In-Reply-To: <38511687.73E793C6@prescod.net> References: <38511687.73E793C6@prescod.net> Message-ID: <14417.11772.747280.226120@goon.cnri.reston.va.us> >>>>> "PP" == Paul Prescod writes: PP> Greg Stein wrote: >> The Node subclasses were on Bill's to-do list. That's cool that >> you've already done it! PP> Would it be possible to update the standard py2c distribution PP> with these changes so that we don't have a code fork? I'm going to send patches to Greg. I'm swamped today, but will get to it Monday at the latest. Jeremy From gstein@lyra.org Sat Dec 11 01:00:41 1999 From: gstein@lyra.org (Greg Stein) Date: Fri, 10 Dec 1999 17:00:41 -0800 (PST) Subject: [Types-sig] transformer.py (was: Plea for help.) In-Reply-To: <38511687.73E793C6@prescod.net> Message-ID: On Fri, 10 Dec 1999, Paul Prescod wrote: > Greg Stein wrote: > > > > The Node subclasses were on Bill's to-do list. That's cool that you've > > already done it! > > Would it be possible to update the standard py2c distribution with these > changes so that we don't have a code fork? Oh. Sure. I'll get right on it. Bill and I have already exchanged mail with Jeremy. The stuff will get folded in at some point. When? Dunno. When we have free time. Bill and I don't spend much time with that code -- it comes in bursts. Code fork? I don't see that occurring at all; it isn't like Jeremy is purposefully going to start producing new releases of transformer.py. If he sends one out, it would simply be to expedite matters. Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Sat Dec 11 01:13:59 1999 From: gstein@lyra.org (Greg Stein) Date: Fri, 10 Dec 1999 17:13:59 -0800 (PST) Subject: [Types-sig] Plea for help. In-Reply-To: <385116A1.69E1BD05@prescod.net> Message-ID: On Fri, 10 Dec 1999, Paul Prescod wrote: > Jeremy Hylton wrote: > > I was going to suggest that a good near-term goal for the type sig > > would be to write a Python compiler in Python, but I see that Paul has > > beaten me to it. > > Not quite. I'm not going to do anything about generating bytecodes. But > it seems to me like that would be another cool project. Someone should > do py2pyc and add it to the py2c distribution. But I'm not going to... P2C is in CVS (see http://www.pythonpros.com/cvs.html). If people really want to get some work done, then we can arrange for access. The P2C framework has been used for a couple output targets, so generating a pyc is definitely workable. > Yes, we will eventually want such a beast in order to allow for some > runtime checks (since changing and distributing a Python-coded compiler > is probably easier than changing the C-coded interpreter). yup. > Here's what I *would* like to do. I would like to subclass your node > objects and build "statically resolved" subtypes. This will be a natural > base class for a new version of py2c (which as far as I know does no > static resolution) and for an optimizing bytecode compiler. We have very limited type handling (and certainly no inference). > Hopefully a big chunk of the Python 2 compiler can be written in Python > 2. I'm hoping to see a replaceable compiler in 1.6. Shouldn't be hard to move the compilation step behind some hooks. Should be able to hook-ify the parser, too. 
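[A rough sketch, mine and not Greg's plan, of what "the compilation step behind some hooks" could look like: a replaceable front end parses, runs whatever check pass it likes over the parse tree, and only then defers to the builtin compiler.]

    import parser

    def check_pass(tree_tuple):
        pass    # a real checker would walk the nested tuples here

    def checking_compile(source, filename="<string>"):
        check_pass(parser.suite(source).totuple())   # pluggable front-end hook
        return compile(source, filename, "exec")     # hand off to the builtin compiler

    code = checking_compile("x = 40 + 2\n")
    exec(code)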
Cheers, -g -- Greg Stein, http://www.lyra.org/ From steve@websentric.com Mon Dec 13 10:00:53 1999 From: steve@websentric.com (Stephen Purcell) Date: Mon, 13 Dec 1999 11:00:53 +0100 Subject: [Types-sig] Re: Static typing considered HARD Message-ID: <3854C3D5.D7F765A@websentric.com> Paul Prescod wrote: > > What can you do dynamically in Python that you cannot do with > reflections and introspection? I've written "map", "apply" and the Y > combinator in Java so I'm pretty confident that the issue is really just > syntax and ease of use, not capabilities. > > You could prove me wrong by showing a Python programming pattern that > could not be straightforwardly duplicated using Java reflection. I have one: dynamic configuration of exception catching --

    class Test:
        FAILURE_ERROR_TYPE = AssertionError

        def run(self, result):
            try:
                self.runTest()
            except self.FAILURE_ERROR_TYPE:
                result.failed(self)
            except:
                result.error(self)
            else:
                result.success(self)

This is a cut-down version of a real and justifiable example. Try doing that in Java with reflection and without resorting to 'instanceof' or 'Class.isInstance()'. -Steve _______________________________________________ Steve Purcell Squadron Leader WebSentric AG, http://www.websentric.com/ ____"Would you like to look at my Python?"_____ From paul@prescod.net Mon Dec 13 15:49:15 1999 From: paul@prescod.net (Paul Prescod) Date: Mon, 13 Dec 1999 10:49:15 -0500 Subject: [Types-sig] RFC 0.1 Message-ID: <3854EB4B.37EA2888@prescod.net> I don't think we would get anywhere if I just opened up the floor and had everyone yell their opinions about type safety. Here is a very rough starting point. Let's talk freely about it for a few days and then I'll try to direct the conversation based upon addressing the feedback.

Version 0.1 Draft of a Pythonic Type Checking System
====================================================

Guiding Principles in the System's Development
----------------------------------------------

#1. The system exists to serve the dual goals of catching errors earlier in the development process and improving the performance of Python compilers and the Python runtime. Neither goal should be pursued exclusively.

#2. The system must allow authors to make assertions about the sorts of values that may be bound to names. These are called binding assertions. They should behave as if an assertion statement was inserted after every assignment to that name program-wide.

Note: this does in fact put more power in the hands of module developers. For the first time we will be able to say that sys.exit may not be overridden in user code and that sys.maxint cannot be changed to contain a string.

Note: the term "sorts of values" is meant to be ambiguous: the definition of "type" in Python may undergo change in the future.

#3. Binding assertions must always be optional.

#4. There must be declarations that instruct static type checking software to verify that a function cannot violate binding assertions. These are called safety declarations.

#5. The introduction of binding assertions to a module should not change the perceived interface of functions and classes in the module. In other words, code that uses functions and classes from the module should not need to know whether it uses binding assertions or old fashioned assert statements.

#6. In the absence of local safety declarations, a static type checker should not by default report errors in otherwise legal Python code.
In other words, a coder must ask (through function or module level declarations, command line switches or environment variables) for his or her code to be checked. In particular, a module cannot force client modules to be statically type checked (see #5, above).

#7. The attachment of safety declarations to a function should not change the perceived interface of the function. In other words, code that uses the functions should not need to know that the function happens to be statically checkable.

#8. It is not a goal that a statically checkable function should only be able to call other statically checkable functions. Those other functions should be presumed to return a "PyObject" object.

#9. There should be a mechanism to assert that an object has a particular type for purposes of informing the static and dynamic type checkers about something that the programmer knows about the flow of the program.

#10. In general, the mechanism should try to be "pythonic" which includes but is not limited to:
    * maximize simplicity
    * maximize power
    * minimize syntax
    * be explicit
    * be readable
    * interoperate nicely with other features

Temporary Goals and Non-Goals:
------------------------------

#1. The first version of the system will be as neutral as possible on the issue of what defines a "type". Fulton's capability-based interfaces should be legal as types but so should type objects and classes.

Note: a purely interface based system cannot be feasible for testing until interfaces are embedded deeply into the existing Python library. It might be more philosophically pure to test for an abstract CharacterString interface but if the Python expression "abc" does not return an object that conforms to the interface then there is not much we can do. Some future version of the system may be restricted to only allow declared interfaces as types. Or it may be expanded to allow parameterized types.

#2. The first version of the system will not allow the use of types that cannot be referred to as simple Python objects. In particular it will not allow users to refer to things like "List of Integers" and "Functions taking Integers as arguments and returning strings."

#3. The first version of the system will not define the operation of a type inferencing system. For now, all type declarations would need to be explicit.

#4. The first version of the system will be syntactically compatible with Python 1.5.x in order to allow experimentation in the lead-up to an integrated system in Python 2.

Definitions:
------------

Namespace creating suite: The suite contained directly within a module, class or function definition.

Statically available namespace creating suite: The namespace creating suite defined by a module or class definition. We do not consider the suite contained within a function as Statically available because the namespace only becomes available when the function is executed, not when it is declared.

Name binding statement, target: An assignment statement (target), "def" statement ("funcname"), "class" ("classname") statement or "import" statement (module). *** more thought about "from" version ***

Name declaration: A name bound at the most out-dented context of a statically available namespace creating suite.

Classification: Due to a shortage of synonyms for "type" that do not already have a meaning, we use the word "classification."
Given a value v and a value t, v conforms to classification t if
    t is returned by type( v )
    t is returned by v.__class__
    t is in v.__implements__ (the fulton convention)
    t is the "object" classification
    v is the value "None"

Classification Declaration: A statement that precedes a name binding statement and declares the classifications that the name must conform to. The type declaration must textually precede any use of the name.

Classification Constraints: A pair of statements declaring the classifications that values bound to a name must support. There are a few syntactic variations:

1. A name binding statement preceded by a statement referencing a classification.

    types.StringType
    a

    class foo:
        types.IntType
        j=5

This assertion is maintained by a combination of the static and dynamic type checkers. In order for the dynamic checker to work, we will need to modify the module_setattr and class_setattr functions for Python 1.6.

2. A simple expression containing only a tuple where all but the last item reference a classification. The last item should be a locally declared name. The statement must occur in the most out-dented context of a namespace creating statement suite:

    def foo(bar, baz):
        types.IntType, bar
        interfaces.NumericType, interfaces.SignedType, baz

3. The classification of a function is always "function" but its return classification can be specified with a declaration:

    types.StringType
    def foo(): return "abc"

This can be checked through the introduction of "virtual" assertion statements into byte-code:

    types.StringType
    def foo():
        __tmp = "abc"
        assert has_type( __tmp, types.StringType )
        return "abc"

4. The classification of class instance variables comes from the classification of the corresponding class variable.

    class foo:
        types.IntType
        a=5

        types.ListType
        b=None

Classification-testing expression: The function has_type takes a value and a reference to a classification or list of classifications. The return type of the function is the union of the classifications.

Classification-safe Function: a function that can be checked at compile time not to violate any classification constraints by assigning invalid values to any constrained names:

Every reference to a name in a module or class (not instance!) must be to a declared (but perhaps not classification constrained) name. Remember that variables without classification constraints can be presumed to conform to the "Object" type.

Every expression must be type-checked based on the operators, constants and global and local name references.

Attribute assignments and references are checked based upon the asserted classifications of the owning object.

The classification of every assignment must be checked based on the types of constants, variables and function return types in the right-hand side.

The classification of every function parameter must be checked based on the classifications of the argument expression.

All return statements must be checked based on the classifications of the expressions.

-- Paul Prescod - ISOGEN Consulting Engineer speaking for himself "A writer is also a citizen, a political animal, whether he likes it or not. But I do not accept that a writer has a greater obligation to society than a musician or a mason or a teacher. Everyone has a citizen's commitment."
- Wole Soyinka, Africa's first Nobel Laureate From paul@prescod.net Mon Dec 13 15:56:52 1999 From: paul@prescod.net (Paul Prescod) Date: Mon, 13 Dec 1999 10:56:52 -0500 Subject: [Types-sig] Re: Static typing considered HARD References: <3854C3D5.D7F765A@websentric.com> Message-ID: <3854ED13.8093FCE7@prescod.net> Stephen Purcell wrote: > > This is a cut-down version of a real and justifiable example. Try doing > that in Java with reflection and without resorting to 'instanceof' or > 'Class.isInstance()'. What you're saying is that I can't emulate Python's dynamic features without using Java's dynamic features. I would agree with that assertion -- but I'm not convinced it is relevant. instanceof is part of the language core and isInstance is a reflective feature. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself There are only two countries in the world that have not ratified the United Nations convention on the rights of children: Somalia and the United States of America. See: http://www.boes.org/ From steve@websentric.com Mon Dec 13 17:46:39 1999 From: steve@websentric.com (Stephen Purcell) Date: Mon, 13 Dec 1999 18:46:39 +0100 Subject: [Types-sig] Re: Static typing considered HARD References: <3854C3D5.D7F765A@websentric.com> <3854ED13.8093FCE7@prescod.net> Message-ID: <385530FF.70AABC9E@websentric.com> Paul Prescod wrote: > > Stephen Purcell wrote: > > > > This is a cut-down version of a real and justifiable example. Try doing > > that in Java with reflection and without resorting to 'instanceof' or > > 'Class.isInstance()'. > > What you're saying is that I can't emulate Python's dynamic features > without using Java's dynamic features. I would agree with that assertion > -- but I'm not convinced it is relevant. instanceof is part of the > language core and isInstance is a reflective feature. > Thanks, Paul, for noting the lack of clarity in my comment, which I shall endeavour to remedy: The dynamic nature of Python's exception handling is an intrinsic language property that cannot be exactly mirrored in Java's exception handling, using reflection or otherwise. No 'catch' clause in Java will ever work the same way as Python's 'except', and by "resorting to instanceof or isInstance" I meant a subversion of the 'catch' clause's fundamental semantics:

    abstract class Test {
        private Class FAILURE_ERROR_TYPE = AssertionException.class;

        void run(Result r) {
            try {
                runTest();
                r.success(this);
            } catch ( Exception e ) {
                if ( e.getClass() == FAILURE_ERROR_TYPE ) {
                    r.fail(this);
                } else {
                    r.error(this);
                }
            }
        }
    }

This is not the same language construct as the Python version. Your argument is that any functionality implemented in a dynamically-typed language can be mirrored in a statically-typed language. Of course that is true, given enough code. It does *not* imply that the features of the statically typed language are compatible with those of the dynamically typed language, nor that their introduction is desirable and technically possible. It seems to me that the whole static-blah thing clashes with fundamental choices that Guido made when designing Python, and those choices are presumably a large part of Python's appeal and success. I would never presume to second-guess the needs of Python's users. Static typing works very well in Java and suchlike, but those are different languages, and the people who cannot live without static typing use them instead of Python (and Smalltalk).
The rest of us, who do not expect perfection to consist of the union of all possibilities, use Python when appropriate and keep in mind its characteristics. There's something special about Python's elegance, and losing that elegance by a Perl-5-like process of cluttering would be enough for me to abandon the language, and move on to the next 'clean' thing. I care enough to have posted this opinion, but not enough to try to influence Python's development. The booing to which David Ascher alluded in his posting may indeed have been 'just' an emotional reaction to the proposal, but I challenge any avid Python user to fully describe his or her enthusiasm for the language in purely technical terms. I use the language because it somehow makes me feel good. When it no longer gives me that feeling, I'll stop using it. Static typing would have that effect. I suspect that other avid users such as Uche would also stop. Rational? Not entirely. I don't expect anyone to care what I think or if I abandon Python in the future, and I certainly don't imagine that any rational argument I might provide for my opinion would change anybody else's mind. I'll avoid the fray, and vote with my feet when the time comes. -Steve _______________________________________________ Steve Purcell Squadron Leader WebSentric AG, http://www.websentric.com/ ____"Would you like to look at my Python?"_____ From guido@CNRI.Reston.VA.US Mon Dec 13 18:09:15 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Mon, 13 Dec 1999 13:09:15 -0500 Subject: [Types-sig] RFC 0.1 In-Reply-To: Your message of "Mon, 13 Dec 1999 10:49:15 EST." <3854EB4B.37EA2888@prescod.net> References: <3854EB4B.37EA2888@prescod.net> Message-ID: <199912131809.NAA19402@eric.cnri.reston.va.us> > I don't think we would get anywhere if I just opened up the floor and > had everyone yell their opinions about type safety. Here is a very rough > starting point. Let's talk freely about it for a few days and then I'll > try to direct the conversation based upon addressing the feedback. Thanks for starting this, Paul! > Version 0.1 Draft of a Pythonic Type Checking System > ==================================================== > > Guiding Principles in the System's Development > ---------------------------------------------- > > #1. The system exists to serve the dual goals of catching errors > earlier in the development process and improving the performance of > Python compilers and the Python runtime. Neither goal should be > pursued exclusively. Hm, these may at times be very different goals. I had a recent private discussion about types where the two goals were referred to as (OPT), for optimization, and (ERR), for error-detection. One observation is that while for (OPT) you may be able to get away with aggressive whole-program type inferencing only, but for (ERR) you're likely to *want* to declare types in certain cases; e.g. to prepare for possible evolution of a module you may want to fix its API to a subset of what is actually implemented. > #2. The system must allow authors to make assertions about the sorts > of values that may be bound to names. These are called binding > assertions. They should behave as if an assertion statement was > inserted after every assignment to that name program-wide. Technically, Python assert statements are only executed in non-optimizing mode -- "assert 0" has no effect when you happen to use "python -O" to execute your program. But I presume that here you mean assertions in the abstract conceptual sense. 
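[To make Guido's aside concrete -- this is standard Python behaviour, not anything proposed in the thread -- an assert-based type check evaporates under "python -O", which is why assertions alone cannot carry the guarantees being discussed.]

    def double_string(b):
        assert isinstance(b, str), "b must be a string"
        return b * 2

    print(double_string("abc"))    # 'abcabc'
    # double_string(5) raises AssertionError in a normal run,
    # but quietly returns 10 under "python -O".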
> Note: this does in fact put more power in the hands of module > developers. For the first time we will be able to say that > sys.exit may not be overridden in user code and that sys.maxint cannot > be changed to contain a string. I think JPython secretly already imposes some of these restrictions (in particular for the sys module!). > Note: the term "sorts of values" is meant to be ambiguous: the > definition of "type" in Python may undergo change in the future. > > #3. Binding assertions must always be optional. > > #4. There must be declarations that instruct static type checking > software to verify that a function cannot violate binding assertions. > These are called safety declarations. I'm not sure what you mean here and how such declarations differ from type assertions. And I'm worried about the "must" part. Please explain better? > #5. The introduction of binding assertions to a module should not > change the perceived interface of functions and classes in the module. > In other words, code that uses functions and classes from the module > should not need to know whether it uses binding assertions or old > fashioned assert statements. Except that some unintended uses may become illegal while before you might just have gotten away with them. > #6. In the absence of local safety declarations, a static type checker > should not by default report errors in otherwise legal Python code. In > other words, a coder must ask (through function or module level > declarations, command line switches or environment variables) for his > or her code to be checked. In particular, a module cannot force client > modules to be statically type checked (see #5, above). However, there are some examples of dynamic code usage that are fishy. Examples include adding or changing globals in other modules (except for the rare global that is intended to be a settable option), or messing with the __builtin__ module. > #7. The attachment of safety declarations to a function should not > change the perceived interface of the function. In other words, code > that uses the functions should not need to know that the function > happens to be statically checkable. But I'd still like to be able to be diagnosed at compile time instead of at runtime when my code makes a statically illegal call to a function with a safety declaration. > #8. It is not a goal that a statically checkable function should only > be able to call other statically checkable functions. Those other > functions should be presumed to return a "PyObject" object. > > #9. There should be a mechanism to assert that an object has a > particular type for purposes of informing the static and dynamic type > checkers about something that the programmer knows about the flow of > the program. Beyond "assert isinstance(object, type_or_class)" ? > #10. In general, the mechanism should try to be "pythonic" which > includes but is not limited to: > > * maximize simplicity > * maximize power > * minimize syntax > * be explicit > * be readable > * interoperate nicely with other features > > Temporary Goals and Non-Goals: > ------------------------------ > > #1. The first version of the system will be as neutral as possible on > the issue of what defines a "type". Fulton's capability-based > interfaces should be legal as types but so should type objects and > classes. > > Note: a purely interface based system cannot be feasible for testing > until interfaces are embedded deeply into the existing Python library. 
> It might be more philisophically pure to test for an abstract > CharacterString interface but if the Python expression "abc" does not > return an object that conforms to the interface then there is not much > we can do. Some future version of the system may be restricted to only > allow declared interfaces as types. Or it may be expanded to allow > parameterized types. > > #2. The first version of the system will not allow the use of types > that cannot be referred to as simple Python objects. In particular it > will not allow users to refer to things like "List of Integers" and > "Functions taking Integers as arguments and returning strings." It's been said before: that's a shame. Type inference is seriously hindered if it doesn't have such information. (Consider a loop over sys.argv; I want the checker to be able to assume that the items are strings.) > #3. The first version of the system will not define the operation of a > type inferencing system. For now, all type declarations would need to > be explicit. I expect that this will make the system relatively heavy-weight and hence unpythonic. You'd be sprinkling way more type decls over your source code than would be necessary with a somewhat more sophisticated type checker. > #4. The first version of the system will be syntactically compatible > with Python 1.5.x in order to allow experimentation in the lead-up to > an integrated system in Python 2. I think that this is too much of a constraint, and may be informing your preliminary design too much. As long as an easy mechanical transformation to valid Python 1.5.x is available, I'd be happy. > Definitions: > ------------ > Namespace creating suite: > The suite contained directly within a module, class or function > definition. > > Statically available namespace creating suite: > The namespace creating suite defined by a module or class > definition. We do not consider the suite contained with a function as > Statically available because the namespace only becomes available when > the function is executed, not when it is declared. > > Name binding statement, target: > An assignment statement (target), "def" statement ("funcname"), > "class" ("classname") statement or "import" statement (module). *** > more thought about "from" version *** > > Name declaration: > A name bound at the most out-dented context of a statically > available namespace creating suite. The indentation don't enter into it. Consider if win32: def func(): ... # win32 specific version else: def func(): ... # generic version > Classification: > Due to a shortage of synonyms for "type" that do not already have a > meaning, we use the word "classification." Oh, dear. Keep looking for a better synonym! > Given a value v and a value t, v conforms to classification t if > t is returned by type( v ) > t is returned by v.__class__ > t is in v.__implements__ (the fulton convention) > t is the "object" classification > v is the value "None" > > Classification Declaration: > A statement that precedes a name binding statement and declares > the classifications that the name must conform to. The type > declaration must textually precede any use of the name. > > Classification Constraints: > A pair of statements declaring the classifications that values > bound to a name must support. There are a few syntactic variations: > > 1. A name binding statement preceded by a statement referencing a > classification. 
> > > types.StringType > a > > class foo: > types.IntType > j=5 > > > This assertion is maintained by a combination of the static and > dynamic type checkers. In order for the dynamic checker to work, we > will need to modify the module_setattr and class_setattr functions for > Python 1.6. > > 2. A simple expression containing only a tuple where all but the > last item reference a classification. The last item should be a > locally declared name. The statement must occur in the most out-dented > context of a namespace creating statement suite: > > def foo(bar, baz): > types.IntType, bar > interfaces.NumericType, interfaces.SignedType, baz > > 3. The classification of a function is always "function" but its > return classification can be specified with a declaration: > > > types.StringType > def foo(): return "abc" > > > This can be checked through the introduction of "virtual" assertion > statements into byte-code: > > > types.StringType > def foo(): > __tmp = "abc" > assert has_type( __tmp, types.StringType ) > return "abc" > Of course, in certain cases (as in this example) the type checker may be able to prove that the assertion can never fail, and omit it. > 4. The classification of class instance variables comes from the > classification of the corresponding class variable. > > > class foo: > types.IntType > a=5 > > types.ListType > b=None > The initialization for b denies its type declaration. Do you really want to do this? This doesn't look like it should be part of the final (Python 2.0) version -- it's just too ugly. How am I going to explain this to a newbie with no programming *nor* Python experience? > Classification-testing expression: > > The function has_type takes a value and a reference to a > classification or list of classifications. The return type of the > function is the union of the classifications. Perhaps this could be an extension of isinstance()? (That already takes both class and type objects.) > Classification-safe Function: > > a function that can be checked at compile time not to violate any > classification constraints by assigning invalid values to any > constrained names: > > Every reference to a name in a module or class (not instance!) must be > to a declared (but perhaps not classification constrained) name. Explain the reason for excluding instances? Maybe I'm not very clear on what you're proposing here. > > Remember that variables without classification constraints can be > presumed to conform to the "Object" type. > > > Every expression must be type-checked based on the operators, > constants and global and local name references. Ah, good. This implies the "no messing with builtins or other modules' globals" rule that I'm proposing. > Attribute assignments and references are checked based upon the > asserted classifications of the owning object. > > The classification of every assignment must be checked based on the > types of constants, variables and function return types in the > right-hand side. > > The classification of every function parameter must be checked based > on the classifications of the argument expression. > > All return statements must be checked based on the classifications of > the expressions. OK. I'm not sure everywhere whether you want compile-time or run-time checking. Perhaps you can clarify this? 
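(For what it's worth, the "virtual assertion" in example 3 can already be
written out by hand, which also shows the two readings side by side: run
the code and it is a dynamic check; a static checker can see that "abc"
is a string and drop the check entirely. isinstance() stands in for the
has_type() function, which does not exist yet.)

    StringType = type("")

    def foo():
        __tmp = "abc"
        # the assertion the compiler would insert for the declared
        # StringType return classification
        assert isinstance(__tmp, StringType), "foo() must return a string"
        return __tmp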
--Guido van Rossum (home page: http://www.python.org/~guido/) From GoldenH@littoncorp.com Mon Dec 13 18:11:15 1999 From: GoldenH@littoncorp.com (Golden, Howard) Date: Mon, 13 Dec 1999 10:11:15 -0800 Subject: [Types-sig] Re: RFC 0.1 Message-ID: Paul Prescod wrote: > #2. The system must allow authors to make assertions about the sorts > of values that may be bound to names. These are called binding > assertions. They should behave as if an assertion statement was > inserted after every assignment to that name program-wide. I think the system should also allow the author to require declarations of all variables (e.g., via a command-line switch or pragma). > #3. Binding assertions must always be optional. Unless the author requires them using the above mechanism. > #10. In general, the mechanism should try to be "pythonic" which > includes but is not limited to: > * maximize simplicity > * maximize power > * minimize syntax > * be explicit > * be readable > * interoperate nicely with other features This is vague. I'm not sure what it means. > 1. The first version of the system will be as neutral as possible on > the issue of what defines a "type". Fulton's capability-based > interfaces should be legal as types but so should type objects and > classes. I don't understand the ramifications of this. Might it not gut the RFC? > Note: a purely interface based system cannot be feasible for testing > until interfaces are embedded deeply into the existing Python library. > It might be more philisophically pure to test for an abstract > CharacterString interface but if the Python expression "abc" does not > return an object that conforms to the interface then there is not much > we can do. Some future version of the system may be restricted to only > allow declared interfaces as types. Or it may be expanded to allow > parameterized types. Shouldn't it be straightforward to add declarations to the existing library? > #2. The first version of the system will not allow the use of types > that cannot be referred to as simple Python objects. In particular it > will not allow users to refer to things like "List of Integers" and > "Functions taking Integers as arguments and returning strings." Why? I don't think this should be prohibited, only not guaranteed. > #4. The first version of the system will be syntactically compatible > with Python 1.5.x in order to allow experimentation in the lead-up to > an integrated system in Python 2. Does this mean no new syntax? (That's what it appears from your examples.) How about a declaration syntax, e.g., var x : type1, y : type2 Is this prohibited by the RFC? > Definitions: > ------------ I'm confused about this section. Are these requirements or merely terminology? In general, I don't understand the definitions. It would help me if there were some additional explanation of how the defined terms fit together and what benefits are being obtained by making these distinctions. --- Howard B. Golden Software developer Litton Industries, Inc. Woodland Hills, California From jpe@arachne.org Mon Dec 13 18:54:30 1999 From: jpe@arachne.org (John Ehresman) Date: Mon, 13 Dec 1999 13:54:30 -0500 (EST) Subject: [Types-sig] RFC 0.1 In-Reply-To: <199912131809.NAA19402@eric.cnri.reston.va.us> Message-ID: On Mon, 13 Dec 1999, Guido van Rossum wrote: > ... > OK. I'm not sure everywhere whether you want compile-time or run-time > checking. 
I think it might be possible to do both run-time and compile-time checking by defining the system in terms of what happens at run time, but allowing compile time optimizations to be made. For example, we might say the declaration (using C-like syntax) "def IntType atoi(StringType s):" to mean that if a value is passed to atoi that is not a string, a TypeError exception is raised. This declaration might be enough for a lint like program to analyze code before it is run and to flag cases where TypeError would be thrown. I think there's value in having run-time checking to support delayed checking in some cases -- it would allow strongly typed functions to be bound to symbols without any typing info. Otherwise, it unclear how to handle the following: def IntType atoi(StringType s): ... if something: conv = atoi else: conv = function_of_unknown_type print conv('1') Then, if a compiler was able to determine that the value bound to StringType never changed and that a value was a valid StringType, it could optimize away the code to check the type of that value. This could be implemented for functions by separating the code to check argument types from the function body and setting up the calling conventions so that the type checking code was only executed when needed. I don't see anything here that prevents type inferencing to work in either a lint like program or a compiler. For example, from the code: def IntType atoi(StringType s): ... def wrapper(s): return atoi(s) it's relatively easy to determine that wrapper must take a string and returns an integer (assuming the binding for atoi is constant). John From gstein@lyra.org Mon Dec 13 20:15:13 1999 From: gstein@lyra.org (Greg Stein) Date: Mon, 13 Dec 1999 12:15:13 -0800 (PST) Subject: [Types-sig] RFC 0.1 In-Reply-To: <199912131809.NAA19402@eric.cnri.reston.va.us> Message-ID: My comments below come from a writeup that is posted at: http://www.foretec.com/python/workshops/1998-11/greg-type-ideas.html The writeup is from a discussion last year, between Fred, Sjoerd, and myself. I'm not going to replicate the details of that writeup here, but will simply highlight some points. Hit the link to see the background. On Mon, 13 Dec 1999, Guido van Rossum wrote: > Paul Prescode wrote: >... > > #2. The system must allow authors to make assertions about the sorts > > of values that may be bound to names. These are called binding > > assertions. They should behave as if an assertion statement was > > inserted after every assignment to that name program-wide. In our writeup, we posit that it is better (and more Pythonic) to bind the assertions to expressions, rather than names. This came about when we looked at how to supply assertions for things like: x.y = value x[i] = value x[i:j] = value Certainly, function objects would have type information associated with them, but I believe that is different than associating a type with the function's name. > Technically, Python assert statements are only executed in > non-optimizing mode -- "assert 0" has no effect when you happen to use > "python -O" to execute your program. But I presume that here you mean > assertions in the abstract conceptual sense. We proposed a new type-assertion operator. Whether it did anything or not (based on the -O switch) is a different discussion :-) >... > > #9. There should be a mechanism to assert that an object has a > > particular type for purposes of informing the static and dynamic type > > checkers about something that the programmer knows about the flow of > > the program. 
> > Beyond "assert isinstance(object, type_or_class)" ? We also proposed extending isinstance() to allow a callable for the third argument. This allows for arbitrarily complex type checking (e.g. the "list of integers" problem). >... > > #2. The first version of the system will not allow the use of types > > that cannot be referred to as simple Python objects. In particular it > > will not allow users to refer to things like "List of Integers" and > > "Functions taking Integers as arguments and returning strings." > > It's been said before: that's a shame. Type inference is seriously > hindered if it doesn't have such information. (Consider a loop over > sys.argv; I want the checker to be able to assume that the items are > strings.) The mechanism we outlined would allow any dotted-name for specifying a type, and the "isinstance(ob, callable)" mechanism would allow for complex type checking. >... > > #4. The first version of the system will be syntactically compatible > > with Python 1.5.x in order to allow experimentation in the lead-up to > > an integrated system in Python 2. > > I think that this is too much of a constraint, and may be informing > your preliminary design too much. As long as an easy mechanical > transformation to valid Python 1.5.x is available, I'd be happy. I believe we came up with an unambiguous grammer which should easily allow for mechanical translation. [ side note: if we get replaceable parser/compiler functionality in 1.6, then we can start to test these alternative grammers and can compile assertions and things based on them! ] >... > > 4. The classification of class instance variables comes from the > > classification of the corresponding class variable. > > > > > > class foo: > > types.IntType > > a=5 > > > > types.ListType > > b=None > > > > The initialization for b denies its type declaration. Do you really > want to do this? This doesn't look like it should be part of the > final (Python 2.0) version -- it's just too ugly. How am I going to > explain this to a newbie with no programming *nor* Python experience? If type assertions are bound to expressions, rather than names, a data flow analysis will show the types at any point. This could (theoretically) avoid many "declarations". > > Classification-testing expression: > > > > The function has_type takes a value and a reference to a > > classification or list of classifications. The return type of the > > function is the union of the classifications. > > Perhaps this could be an extension of isinstance()? (That already > takes both class and type objects.) See my proposed extension to isinstance(). I believe it is a very clear extension and offers all the functionality you may need. Cheers, -g -- Greg Stein, http://www.lyra.org/ From tim.hochberg@ieee.org Mon Dec 13 21:17:00 1999 From: tim.hochberg@ieee.org (Tim Hochberg) Date: Mon, 13 Dec 1999 14:17:00 -0700 Subject: [Types-sig] Greg Stein's writeup (was RFC 0.1) References: Message-ID: <00a301bf45af$67323440$87740918@phnx3.az.home.com> > My comments below come from a writeup that is posted at: > http://www.foretec.com/python/workshops/1998-11/greg-type-ideas.html > > The writeup is from a discussion last year, between Fred, Sjoerd, and > myself. I'm not going to replicate the details of that writeup here, but > will simply highlight some points. Hit the link to see the background. I just read Greg's writeup and I like it quite a bit. With the exception of those nasty !s, it seems very Pythonic. 
My question is: is there a reason that a digraph couldn't be used instead of the !. In particular "foo->Int" can be read as "foo evaluates_to Int" which seems to have all of the correct associations. Or does this result in ambiguous syntax? All of Greg's examples seem to be OK: x = value->Int (x,y) = value->Coord (x,y) = value->(Int, String) while foo()->Int: ... def foo(x->String)->Int: ... Of course there is the problem that -> has a very different meaning in C/C++, but then so does !. Just my two cents, -tim From paul@prescod.net Tue Dec 14 04:39:36 1999 From: paul@prescod.net (Paul Prescod) Date: Mon, 13 Dec 1999 20:39:36 -0800 Subject: [Types-sig] Plea for help. References: Message-ID: <3855CA08.4EA1BF11@prescod.net> Greg Stein wrote: > > I'm hoping to see a replaceable compiler in 1.6. Shouldn't be hard to move > the compilation step behind some hooks. Should be able to hook-ify the > parser, too. Is there currently any path from high level parse trees to bytecodes? E.g. is there a way to get sane parse trees to "render" themselves as, er, insane parse trees? I don't think so but I'm just checking to avoid extra work. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself "Unwisely, Santa offered a teddy bear to James, unaware that he had been mauled by a grizzly earlier that year." - Timothy Burton, "James" From paul@prescod.net Tue Dec 14 04:39:43 1999 From: paul@prescod.net (Paul Prescod) Date: Mon, 13 Dec 1999 20:39:43 -0800 Subject: [Types-sig] Re: transformer.py (was: Plea for help.) References: Message-ID: <3855CA0F.911CB41A@prescod.net> Greg Stein wrote: > > Bill and I have already exchanged mail with Jeremy. The stuff will get > folded in at some point. When? Dunno. Is it a case of "folding in" or of merging a file? I thought it was the latter because I thought that Jeremy's changes were backwards compatible with py2c. > Code fork? I > don't see that occurring at all; it isn't like Jeremy is purposefully > going to start producing new releases of transformer.py. If he sends one > out, it would simply be to expedite matters. My point is, if I build an interesting application on top of Jeremy's version and you continue to build on the older version, we will have a defacto code fork because one of us will have to update our code in order to re-sync. If Jeremy's code is just a drop-in replacement then that won't be a problem and I'll just use it until you get around to "dropping it in" to py2c. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself "Unwisely, Santa offered a teddy bear to James, unaware that he had been mauled by a grizzly earlier that year." - Timothy Burton, "James" From paul@prescod.net Tue Dec 14 07:34:16 1999 From: paul@prescod.net (Paul Prescod) Date: Mon, 13 Dec 1999 23:34:16 -0800 Subject: [Types-sig] RFC 0.1 References: Message-ID: <3855F2F8.DE1FED31@prescod.net> I did evaluate your proposal but it seemed to me that it was solving a slightly different problem. I think as we compare them we'll find that your ideas were more oriented toward runtime safety checking. Greg Stein wrote: > > > > #2. The system must allow authors to make assertions about the sorts > > > of values that may be bound to names. These are called binding > > > assertions. They should behave as if an assertion statement was > > > inserted after every assignment to that name program-wide. > > In our writeup, we posit that it is better (and more Pythonic) to bind the > assertions to expressions, rather than names. 
This came about when we > looked at how to supply assertions for things like: > > x.y = value > x[i] = value > x[i:j] = value I wouldn't supply assertions for assignments at all. You supply assertions for the names x, y, i, and j. > Certainly, function objects would have type information associated with > them, but I believe that is different than associating a type with the > function's name. But if a function takes as its first argument an int, in what sense is that type associated with an "expression"? It is associated with a name, whatever the name of the first argument is. Plus consider this: type-safe String def foo(): return abc() How can I, at compile time, statically know the type of the value currently contained in the name abc if I don't restrict it in advance like this: String def abc(): return "abc" Rebinding is fine, as long as it doesn't invalidate the type declaration: abc = lambda: "def" > We also proposed extending isinstance() to allow a callable for the third > argument. This allows for arbitrarily complex type checking (e.g. the > "list of integers" problem). I liked that idea but really didn't see how to port it to a compile time static type checker. I'm going out of my way to avoid running arbitrary Python code. Static type checking shouldn't be a security hazard. > [ side note: if we get replaceable parser/compiler functionality in 1.6, > then we can start to test these alternative grammers and can compile > assertions and things based on them! ] That would be way cool! > If type assertions are bound to expressions, rather than names, a data > flow analysis will show the types at any point. This could (theoretically) > avoid many "declarations". Names get their values from expressions so the data flow analysis is the same. If you have to type-check the statement "return a" then you need to be able to know the type of both the variable and the expression. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things to be wary of: A new kid in his prime A man who knows the answers, and code that runs first time http://www.geezjan.org/humor/computers/threes.html From paul@prescod.net Tue Dec 14 04:56:32 1999 From: paul@prescod.net (Paul Prescod) Date: Mon, 13 Dec 1999 20:56:32 -0800 Subject: [Types-sig] Re: Static typing considered HARD References: <3854C3D5.D7F765A@websentric.com> <3854ED13.8093FCE7@prescod.net> <385530FF.70AABC9E@websentric.com> Message-ID: <3855CE00.805CF934@prescod.net> Stephen Purcell wrote: > >... > > Static typing works very well in Java and suchlike, but those are > different languages, and the people who cannot live without static > typing use them instead of Python (and Smalltalk). If Python had not had object orientation 8 years ago, we would now be arguing against the introduction of the "class" operator as being "un-pythonic." Anything elegant, clean and in line with the rest of the language is, in my mind, Pythonic. Since Guido encouraged us to go down this path, at least as a mind experiment, I personally will not be dissuaded based on arguments that we are going against his original intentions. As afraid as you are that we will kill Python by changing it, I fear that we will kill it by stultifying it. We are, after all, in the software industry. > I use the language because it > somehow makes me feel good. When it no longer gives me that feeling, > I'll stop using it. Static typing would have that effect. Had you read the static type checking proposal when you wrote that? 
Have you used languages with optional static typing? I want everybody's
opinions, emotional or otherwise, but I want people's informed opinions.

My proposal is basically about giving people a special,
computer-recognizable syntax for assertions. Are you against assertions?
Does changing the syntax of assertions bother you? Would it bother you to
find that your Python compiler might someday have a declaration that
would allow some assertions to be checked at compile time? Why wouldn't
you just choose not to use that declaration?

-- 
 Paul Prescod - ISOGEN Consulting Engineer
"Unwisely, Santa offered a teddy bear to James, unaware that he had been 
mauled by a grizzly earlier that year." - Timothy Burton, "James"

From paul@prescod.net Tue Dec 14 05:54:41 1999
From: paul@prescod.net (Paul Prescod)
Date: Mon, 13 Dec 1999 21:54:41 -0800
Subject: [Types-sig] Type inferencing
Message-ID: <3855DBA1.9384B6AE@prescod.net>

> > #3. The first version of the system will not define the operation of a
> > type inferencing system. For now, all type declarations would need to
> > be explicit.
> 
> I expect that this will make the system relatively heavy-weight and
> hence unpythonic. You'd be sprinkling way more type decls over your
> source code than would be necessary with a somewhat more sophisticated
> type checker.

Point taken. I am only willing to do type inferencing up to a function
level. After my "ML Experience" I am not willing to do it globally. A
method with no type declaration should be presumed to return Object, even
if it is like this:

def foo(): return "abc"

Otherwise you get the problem where changing a line of code in the middle
of a function somewhere breaks code somewhere far away:

type-check
StringType
def a(): return b()

def b(): return c()

def c():
    if something():
        return "abc"
    else:
        return 1

Under my plan, the very first function would never have been statically
checkable. So the code far away couldn't have broken it.

But I am willing to take type inference this far:

type-check
StringType
def a():
    a="abc"
    return a

This seems no harder than the type inferencing you need to do to check
the type of expressions. Probably the only reason that Java and C don't
do this is because they want to know how much space to allocate on the
stack. Of course we won't worry about that.

-- 
 Paul Prescod - ISOGEN Consulting Engineer speaking for himself
"Unwisely, Santa offered a teddy bear to James, unaware that he had been 
mauled by a grizzly earlier that year." - Timothy Burton, "James"

From paul@prescod.net Tue Dec 14 05:55:07 1999
From: paul@prescod.net (Paul Prescod)
Date: Mon, 13 Dec 1999 21:55:07 -0800
Subject: [Types-sig] List of FOO
Message-ID: <3855DBBB.6D1B462A@prescod.net>

> > #2. The first version of the system will not allow the use of types
> > that cannot be referred to as simple Python objects. In particular it
> > will not allow users to refer to things like "List of Integers" and
> > "Functions taking Integers as arguments and returning strings."
> 
> It's been said before: that's a shame. Type inference is seriously
> hindered if it doesn't have such information. (Consider a loop over
> sys.argv; I want the checker to be able to assume that the items are
> strings.)

It took two years to get the parameterized version of the Java type
system up and running.
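(Before the question below, a rough illustration of how quickly these
spellings compound. ListOf and Callable are invented names, used here
only to write the composite types down as ordinary Python objects:)

    class ListOf:
        def __init__(self, item):
            self.item = item

    class Callable:
        def __init__(self, args, result):
            self.args = args          # list of argument classifications
            self.result = result

    StringType, IntType = type(""), type(0)

    # "list of strings" is still perfectly readable:
    t1 = ListOf(StringType)

    # "callable taking a list of callables from string to int, and
    # returning a string" is already a mouthful however it is spelled:
    t2 = Callable([ListOf(Callable([StringType], IntType))], StringType)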
Let me ask this your opinion on this question (seriously, not sarcastically), should we include a spelling for "list of string" and not "callable taking list of callables taking strings returning integers returning string" and what about "callable taking list of callables taking and R returning list of callables taking and returning ." You see my problem? I could special case "list of" as Java and C did if we agreed to take our chances that my syntax would be extensible. We could even steal that weird "[]" thing that C and Java do: StringType [] foo -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself "Unwisely, Santa offered a teddy bear to James, unaware that he had been mauled by a grizzly earlier that year." - Timothy Burton, "James" From paul@prescod.net Tue Dec 14 06:22:31 1999 From: paul@prescod.net (Paul Prescod) Date: Mon, 13 Dec 1999 22:22:31 -0800 Subject: [Types-sig] IsInstance References: <3854EB4B.37EA2888@prescod.net> <199912131809.NAA19402@eric.cnri.reston.va.us> Message-ID: <3855E227.AE33907@prescod.net> Guido van Rossum wrote: > Perhaps this could be an extension of isinstance()? (That already > takes both class and type objects.) I wanted the function to return an object: myList=isinstance( foo, types.ListType ) if not myList: myDict=isinstance( foo, types.DictionaryType ) Then we can do the inferencing by looking at a single statement. Compare it to this: if isinstance( foo, types.ListType ): myList=foo elif isinstance( foo, types.DictionaryType ): myDict=foo That inferencing is just too hard. It isn't a proper cast operator anymore. If you are willing to change isinstance to return the object if it matches then I would like to use it. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things to be wary of: A new kid in his prime A man who knows the answers, and code that runs first time http://www.geezjan.org/humor/computers/threes.html From paul@prescod.net Tue Dec 14 06:41:25 1999 From: paul@prescod.net (Paul Prescod) Date: Mon, 13 Dec 1999 22:41:25 -0800 Subject: [Types-sig] Module protection Message-ID: <3855E695.B1180A86@prescod.net> > However, there are some examples of dynamic code usage that are > fishy. Examples include adding or changing globals in other modules > (except for the rare global that is intended to be a settable option), > or messing with the __builtin__ module. I am glad you agree. Actually I took out a feature that allowed you to say that a module namespace (or particular name) was constant. I'll leave that to you since it is not directly required for static typing. All I require for static typing is that you don't replace sys.exit with a function that returns a string or replace sys.version with a file object. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things to be wary of: A new kid in his prime A man who knows the answers, and code that runs first time http://www.geezjan.org/humor/computers/threes.html From paul@prescod.net Tue Dec 14 06:44:38 1999 From: paul@prescod.net (Paul Prescod) Date: Mon, 13 Dec 1999 22:44:38 -0800 Subject: [Types-sig] type-safe declaration Message-ID: <3855E756.174CE40D@prescod.net> > > #4. There must be declarations that instruct static type checking > > software to verify that a function cannot violate binding assertions. > > These are called safety declarations. > > I'm not sure what you mean here and how such declarations differ from > type assertions. And I'm worried about the "must" part. Please explain > better? 
"must" is an instruction to the specification writers (us) not to Python programmers. It means that we must provide a mechanism that would allow a programmer to say that a function is type-safe: type-safe StringType def double(a): StringType, a; return a*a Unlike Java, if you don't ask for a function to be statically type checked then it just isn't. Newbies can work without type checking until they feel it would be useful for them. My feeling that declaring a return type is just declaring a return type. It doesn't mean that you are willing to PROVE (statically) that the return type declaration will be accurate. > But I'd still like to be able to be diagnosed at compile time instead > of at runtime when my code makes a statically illegal call to a > function with a safety declaration. Under my plan, you would need a static declaration on YOUR code. I mean if your code can NEVER be right (e.g. range( "abc" ) ) then maybe a smart checker could report that. Java actually requires this of implementors. But if your code COULD be right (which is much more often the case in Python) then it should wait until runtime to check: a=callSomeUnTypedFunction() range( a ) > OK. I'm not sure everywhere whether you want compile-time or run-time > checking. Perhaps you can clarify this? Static type checking if you ask for it (with a type-check declaration) or dynamic type checking otherwise (unless you turn it off with an optimization option). -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things to be wary of: A new kid in his prime A man who knows the answers, and code that runs first time http://www.geezjan.org/humor/computers/threes.html From paul@prescod.net Tue Dec 14 06:53:04 1999 From: paul@prescod.net (Paul Prescod) Date: Mon, 13 Dec 1999 22:53:04 -0800 Subject: [Types-sig] RFC 0.1 References: <3854EB4B.37EA2888@prescod.net> <199912131809.NAA19402@eric.cnri.reston.va.us> Message-ID: <3855E950.AE0E3E19@prescod.net> Thanks for all of your feedback! It's good stuff. Guido van Rossum wrote: > > > #1. The system exists to serve the dual goals of catching errors > > earlier in the development process and improving the performance of > > Python compilers and the Python runtime. Neither goal should be > > pursued exclusively. > > Hm, these may at times be very different goals. I had a recent > private discussion about types where the two goals were referred to as > (OPT), for optimization, and (ERR), for error-detection. One > observation is that while for (OPT) you may be able to get away with > aggressive whole-program type inferencing only, In theory, but in practice "whole-program X" seems to never get implemented (in Python or elsewhere!), as in "whole program type checks" and "whole program optimization" and "whole program flow analysis." "Whole program analysis" tends to be an excuse to put off work (roughly like "type inference"). > Technically, Python assert statements are only executed in > non-optimizing mode -- "assert 0" has no effect when you happen to use > "python -O" to execute your program. But I presume that here you mean > assertions in the abstract conceptual sense. No, I was thinking of actually compiling to the same byte-codes. It isn't really "safe" to turn off type-checks at runtime but it also isn't safe to turn off assertions. They are both there to guarantee program correctness at the price of performance. But maybe we would make a different command line option to control type checking. 
> I think JPython secretly already imposes some of these restrictions > (in particular for the sys module!). Good, then programmers are warmed up. :) > > In other words, code that uses functions and classes from the module > > should not need to know whether it uses binding assertions or old > > fashioned assert statements. > > Except that some unintended uses may become illegal while before you > might just have gotten away with them. Yes and no. In the past, we didn't do many type checks because many of us were philosophically against "type" and "class" checks. We wanted capability checks. Jim Fulton (et. al.) is working on that with interfaces. So with or without static type checking we should start seeing interface assertions. We're just giving them a nicer syntax (which may, admittedly, lead to more of them). Still, I want to put the blame squarely in Jim's corner (even if I was also in that corner). > > #9. There should be a mechanism to assert that an object has a > > particular type for purposes of informing the static and dynamic type > > checkers about something that the programmer knows about the flow of > > the program. > > Beyond "assert isinstance(object, type_or_class)" ? There are two issues here. First, I avoided using existing Python "spellings" for things that are going to take on magical meanings because people will expect other logical variations to work: typeobj = callSomeRandomFunction() assert isinstance(object, typeobj) If we invent new, syntactically distinct spellings then we can syntactically recognize them and complain if they aren't spelled "exactly right" (i.e. in a statically analyzable way). > I think that this is too much of a constraint, and may be informing > your preliminary design too much. As long as an easy mechanical > transformation to valid Python 1.5.x is available, I'd be happy. Okay. I'll keep this in mind. > > Name declaration: > > A name bound at the most out-dented context of a statically > > available namespace creating suite. > > The indentation don't enter into it. Consider > > if win32: > def func(): ... # win32 specific version > else: > def func(): ... # generic version That's precisely what I'm trying to disallow. I don't know the value of win32 until runtime! The pyc could be moved from Unix to win32. And more to the point, the value win32 might be computed based on arbitrarily complex code. So that's why I said out-dented. An out-dented name binding statement cannot depend (much) on a computed value. Computed base classes are going to have to be explicitly disallowed for statically checkable classes: class foo( dosomething() ): ... > > Classification: > > Due to a shortage of synonyms for "type" that do not already have a > > meaning, we use the word "classification." > > Oh, dear. Keep looking for a better synonym! You just had to put "type" and "class" in the same language! I could redefine the term type in this context and refer to the old concept of type as I did below: > > Given a value v and a value t, v conforms to classification t if > > t is returned by type( v ) > > 4. The classification of class instance variables comes from the > > classification of the corresponding class variable. > > > > > > class foo: > > types.IntType > > a=5 > > > > types.ListType > > b=None > > > > The initialization for b denies its type declaration. Do you really > want to do this? None is a valid value for any type as with NULL in C or SQL. > This doesn't look like it should be part of the > final (Python 2.0) version -- it's just too ugly. 
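(Spelled out as code, the conformance rules from the draft -- including
the None rule just invoked -- come to something like the following.
__implements__ is the Fulton interface convention, and "object" stands in
for the draft's catch-all classification:)

    def conforms(v, t):
        # rough transcription of the draft's rules, not a proposal
        if v is None:
            return 1                      # None conforms to anything
        if t is object:
            return 1                      # the "object" classification
        if type(v) is t:
            return 1
        if getattr(v, '__class__', None) is t:
            return 1
        if t in getattr(v, '__implements__', ()):
            return 1
        return 0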
How am I going to > explain this to a newbie with no programming *nor* Python experience? With all due respect my problem is that you took the obvious (or at least traditional) instance variable declaration syntax and used it as a class variable declaring syntax. Okay, let's try this: class foo: types.IntType, a=5 def __init__( self ): types.ListType, self.b That looks equally ugly to me. Got any other ideas? On a separate track: I don't think that the whole static type system is for newbies, just as all of Python is not for newbies (think __getattr__). You shouldn't even start thinking about static typing until you are trying to "tighten up" your code for performance or safety. I don't want to use that as an excuse to make things difficult but if we are ever going to get to full polymorphic parametric static type checking we will have to acknowledge that the type system will have hard parts just as the language has hard parts. > > Classification-safe Function: > > > > a function that can be checked at compile time not to violate any > > classification constraints by assigning invalid values to any > > constrained names: > > > > Every reference to a name in a module or class (not instance!) must be > > to a declared (but perhaps not classification constrained) name. > > Explain the reason for excluding instances? Maybe I'm not very clear > on what you're proposing here. I think that that was from an earlier draft. Obviously we can't check instance variables in the same way that you check class and module namespaces but we do want to check them. The thought gives me a headache. It's my fourth year compiler class all over again. Make it stop! Maybe if I just specify it, some fourth year student will implement it as a project. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself "Unwisely, Santa offered a teddy bear to James, unaware that he had been mauled by a grizzly earlier that year." - Timothy Burton, "James" From paul@prescod.net Tue Dec 14 07:05:24 1999 From: paul@prescod.net (Paul Prescod) Date: Mon, 13 Dec 1999 23:05:24 -0800 Subject: [Types-sig] Re: RFC 0.1 References: Message-ID: <3855EC34.F78D9A99@prescod.net> "Golden, Howard" wrote: > > I think the system should also allow the author to require declarations of > all variables (e.g., via a command-line switch or pragma). I think that's a good idea for a particular implementation but I'm not going to put it in the type system specification. If I were Guido I would be unwilling to instruct every standard library package maintainer to supply all type declarations in order to please the minority who want to use Python in a manner that is as restrictive as Java. > > 1. The first version of the system will be as neutral as possible on > > the issue of what defines a "type". Fulton's capability-based > > interfaces should be legal as types but so should type objects and > > classes. > > I don't understand the ramifications of this. Might it not gut the RFC? I don't think so (yet). The main point is that we need to support "types", "classes" and the new "interfaces" > Shouldn't it be straightforward to add declarations to the existing library? Not just declarations: someone needs to actually define the set of "standard interfaces." There are probably a few weeks worth of work there and even a few weeks of work are hard to find since we all have other jobs. > > #2. The first version of the system will not allow the use of types > > that cannot be referred to as simple Python objects. 
In particular it > > will not allow users to refer to things like "List of Integers" and > > "Functions taking Integers as arguments and returning strings." > > Why? I don't think this should be prohibited, only not guaranteed. How can we allow it without defining the syntax? > > #4. The first version of the system will be syntactically compatible > > with Python 1.5.x in order to allow experimentation in the lead-up to > > an integrated system in Python 2. > > Does this mean no new syntax? (That's what it appears from your examples.) > > How about a declaration syntax, e.g., > > var x : type1, y : type2 > > Is this prohibited by the RFC? Yes, but I may change my mind on this issue based on Guido's feedback. > I'm confused about this section. Are these requirements or merely > terminology? The definitions turned into the spec. The long and short of it is that you can declare the types of variables: StringType a = "abc" and functions: StringType def a(): return "abc" and you can state that you want a function to be statically type checked: StringType def a(): return "abc" The spec is complex because I have to restrict the set of circumstances where this "works" to things that can be detected statically. I explicitly do not support stuff like this: import somefunction import a import b if somefunction.doit(): mod=a else: mod=b a.SomeType foo1 = None b.SomeType foo2 = None mod func( arg ): return a #valid or not?? -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things to be wary of: A new kid in his prime A man who knows the answers, and code that runs first time http://www.geezjan.org/humor/computers/threes.html From paul@prescod.net Tue Dec 14 07:13:35 1999 From: paul@prescod.net (Paul Prescod) Date: Mon, 13 Dec 1999 23:13:35 -0800 Subject: [Types-sig] RFC 0.1 References: Message-ID: <3855EE1F.9E8B4C1B@prescod.net> John Ehresman wrote: > > I think it might be possible to do both run-time and compile-time > checking by defining the system in terms of what happens at run time, but > allowing compile time optimizations to be made. We are almost on the same track, but are not completely in sync. Static type checking isn't just an optimization. It's also a way of making more robust code. We use the "type-safe" declaration to say that the function/class/module should never throw a TypeError (thought it might propagate one from un-typesafe code). Note that even in C++ and Java it is possible for type-safe code to be required to propagate those language's equivalent of a type error. I'm not happy with a "lint-like-tool". I want static type checking to be formally defined in the language definition as it is in other languages. If you want it, you should be able to get it, reliably and at compile time. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things to be wary of: A new kid in his prime A man who knows the answers, and code that runs first time http://www.geezjan.org/humor/computers/threes.html From gstein@lyra.org Tue Dec 14 11:08:59 1999 From: gstein@lyra.org (Greg Stein) Date: Tue, 14 Dec 1999 03:08:59 -0800 (PST) Subject: [Types-sig] RFC 0.1 In-Reply-To: <3855F2F8.DE1FED31@prescod.net> Message-ID: On Mon, 13 Dec 1999, Paul Prescod wrote: > I did evaluate your proposal but it seemed to me that it was solving a > slightly different problem. I think as we compare them we'll find that > your ideas were more oriented toward runtime safety checking. 
True, but I might posit that (due to Python's dynamic nature) you really aren't going to come up with a good compile-time system. That leaves runtime. > Greg Stein wrote: > > > > > > #2. The system must allow authors to make assertions about the sorts > > > > of values that may be bound to names. These are called binding > > > > assertions. They should behave as if an assertion statement was > > > > inserted after every assignment to that name program-wide. > > > > In our writeup, we posit that it is better (and more Pythonic) to bind the > > assertions to expressions, rather than names. This came about when we > > looked at how to supply assertions for things like: > > > > x.y = value > > x[i] = value > > x[i:j] = value > > I wouldn't supply assertions for assignments at all. You supply > assertions for the names x, y, i, and j. I know you wouldn't. I was offering a different tack (and one that seemed to work better). In Python, names have no semantics other than an identifier, a scope, and that they are a reference. We thought it would be nice to retain the notion that names are just names -- it is the objects and what you're doing with them that is important. > > Certainly, function objects would have type information associated with > > them, but I believe that is different than associating a type with the > > function's name. > > But if a function takes as its first argument an int, in what sense is > that type associated with an "expression"? It is associated with a name, > whatever the name of the first argument is. Plus consider this: I think I wasn't clear enough here. In the following statement: a = b We suggested that type checking is defined and applied to the value (b), rather than associating a type with "a" and performing an assertion at assignment time. The concept of "this variable name can only contain values of type" is a standard, classical approach. We didn't think it applied as well to Python (for a number of reasons). If you're doing type inferencing, then you are actually tracking values -- the types associated with a name are very artificial/unnecessary during type inferencing. For example: a = [1, 2] foo(a) a = {1: 2} bar(a) a = 1.2 baz(a) The above code is quite legal in Python (and no, I don't want to hear arguments that it shouldn't be :-). With a type system that is associated with expressions/values rather than names, then we can do proper type inferencing, checking, etc on the above code. The only thing in our outline that has associated type information is a function object (note: not a function name). Reflection on the function can get the information for you (obviously, only useful for runtime tools; compilers would be using syntactic markers only). > type-safe > String > def foo(): > return abc() > > How can I, at compile time, statically know the type of the value > currently contained in the name abc if I don't restrict it in advance > like this: > > String > def abc(): return "abc" You would. I never said otherwise :-) But I see it as data (type) flow: "abc happens to refer to an object, which is typed as a function returning a string", rather than saying "abc is a function returning a string." Just as objects have types ("is-a"), I think a function object should expand a bit and record the types of its params and return value(s). > Rebinding is fine, as long as it doesn't invalidate the type > declaration: > > abc = lambda: "def" So you say :-) I say rebind all you want. 
Base your assertions and type-checks on what it has at whatever lexical
point in your program. For the case of:

if condition:
    x = 1
else:
    x = "abc"

I would say that the type of "x" is a set, rather than a particular type.
If you're going to do type-checking/assertions, then any uses of "x"
better be able to accept all types in the set.

> > We also proposed extending isinstance() to allow a callable for the third
> > argument. This allows for arbitrarily complex type checking (e.g. the
> > "list of integers" problem).
> 
> I liked that idea but really didn't see how to port it to a compile time
> static type checker. I'm going out of my way to avoid running arbitrary
> Python code. Static type checking shouldn't be a security hazard.

I believe that Python is too rich in data types and composition of types
to be able to add *syntax* for all type declarations. I think you better
stop and realize that before you get in too deep :-)

In your RFC 0.1, you punted on the complex/composited data types issue to
keep the solution tractable. I posit that you will *never* solve the
problem of coming up with sufficient syntactical expression; therefore,
you will always have to resort to a procedural component in your type
system *if* you want full coverage.

>...
> > If type assertions are bound to expressions, rather than names, a data
> > flow analysis will show the types at any point. This could (theoretically)
> > avoid many "declarations".
> 
> Names get their values from expressions so the data flow analysis is the
> same.

Partially true, but as I mentioned above: names are just points in your
data flow. They are a side-effect. A name "receives" a type from the data
-- it does not "drive" the data flow. I think it is clearer to just avoid
the attaching of a type to a name and to just look at the data.

Note that one benefit of associating types with names is that you can
shortcut the data flow analysis (so the analysis is not necessarily the
same). But: you cannot have a name refer to different types of objects
(which I don't like; it reduces some of Python's polymorphic and dynamic
behavior (interfaces solve the polymorphism stuff in a typed world)).

> If you have to type-check the statement "return a" then you need to be
> able to know the type of both the variable and the expression.

[ by expression, I presume you mean the object referenced by "a". if you
mean the expression "a", well yah... but that's a degenerate case which
doesn't serve as a good example of what you're trying to say. ]

In the above case, you need to know the type of the variable *OR* the
expression (the referenced object). If you have the type of the variable,
then you simply assert that type rather than using data flow to know what
is referenced.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/

From paul@prescod.net Tue Dec 14 14:13:09 1999
From: paul@prescod.net (Paul Prescod)
Date: Tue, 14 Dec 1999 06:13:09 -0800
Subject: [Types-sig] RFC 0.1
References: 
Message-ID: <38565074.9E51515F@prescod.net>

Greg Stein wrote:
> 
> True, but I might posit that (due to Python's dynamic nature) you really
> aren't going to come up with a good compile-time system. That leaves
> runtime.

Okay, but my mandate is to come up with a static (compile-time) system. I
already have a variety of runtime tools for doing type checking.

http://www.python.org/sigs/types-sig/

> In Python, names have no semantics other than an identifier, a scope, and
> that they are a reference.
We thought it would be nice to retain the > notion that names are just names -- it is the objects and what you're > doing with them that is important. But we WANT some names to have semantics. sys.version should be an integer. sys.path should be a list of strings. __builtins__.dir should be object->list and so forth. > The above code is quite legal in Python (and no, I don't want to hear > arguments that it shouldn't be :-). I am perfectly happy to have it be legal Python code. I just don't intend for it to be *statically type checkable* Python code. No, you cannot use all of the flexibility of Python and expect to get all of the static type checking of Java. For each function you choose one or the other. > With a type system that is associated > with expressions/values rather than names, then we can do proper type > inferencing, checking, etc on the above code. That code could not be legally inferenced in any inferencing system I am familiar with. (ML and inferenced Ada) If you write a formal specification for "data flow analysis" that can be implemented by two independent compilers based on the spec then I will take this approach seriously. But my impression from my time in the scheme world is that "data flow analysis" is an unconscious code-word for "let's put this problem off and hope that someone else figures out some magic that I haven't figured out yet." If a static type checking system is hard for Python, a static type inferencing one is going to be doubly hard! > > Rebinding is fine, as long as it doesn't invalidate the type > > declaration: > > > > abc = lambda: "def" > > So you say :-) I say rebind all you want. Base your assertions and > type-checks on what it has at whatever lexical point in your program. What if the rebinding happened in some other function, class or module? > I believe that Python is too rich in data types and composition of types > to be able to add *syntax* for all type declarations. I think you better > stop and realize that before you get in too deep :-) I have a few different answers here: 1. I don't have to be able to describe every possible type. If you can't statically check that "foo is a callable from T,T to callable from T" tough bloody luck, at least for the time being. Java can't do that. Neither could mid-90's C++. And forget about it for ANSI C. Python is not the world's most OO programming language. It is just a good one. It may not have the world's most static type checker. It will just have a good one. No type system makes type errors impossible so that is not my goal. My goal is that if a module uses type checks as religiously as Java module would, that module would be roughly as type-safe. 2. Python is no richer in types than any other language with an extensible type system. This includes ML, Haskell and even Java. There is no language today without a list type or mapping type. Yes, some Python complexity comes from the fact that there are dozens of non-reflective types "built-in" but we can and should fix that. 3. Compositions of types are complex, but not infinitely complex. We have about two decades in parameterized type research to rely on. Within a year and a half, two of the world's most popular languages (C++ and Java) will have parameterized types. > In your RFC 0.1, you punted on the complex/composited data types issue too > keep the solution tractable. 
I posit that you will *never* solve the > problem of coming up with sufficient syntactical expression; therefore, > you will always have to resort to a procedural component in your type > system *if* you want full coverage. I am happy to have a runtime component. I just don't see that we need any new syntax for this runtime component. And I don't think that we should give up on a formally defined static system. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things to be wary of: A new kid in his prime A man who knows the answers, and code that runs first time http://www.geezjan.org/humor/computers/threes.html From guido@CNRI.Reston.VA.US Tue Dec 14 15:19:52 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Tue, 14 Dec 1999 10:19:52 -0500 Subject: [Types-sig] A case study Message-ID: <199912141519.KAA23476@eric.cnri.reston.va.us> Here's a long and rambling example of what I think a type inferencer could do -- without type declarations of any sort. I wrote this down while thinking about the type checker that I would like to see in IDLE. --Guido van Rossum (home page: http://www.python.org/~guido/) """Let's analyze a simple program. Here's an example script -- let's call it pyfind.py -- that prints the names of all Python files in a given directory tree.""" #---------------------------------------------------------------------- import sys, find def main(): dir = "." if sys.argv[1:]: dir = sys.argv[1] list = find.find("*.py", dir) list.sort() for name in list: print name if __name__ == "__main__": main() #---------------------------------------------------------------------- """Our task is to check whether this is a correct program. I won't define correctness rigidly here, but it has something to do with under what circumstances the program will execute to completion. At the top level, we see an import statement, a function definition, and an if statement. Analyzing the import statement, we notice that sys is a well-known standard module. The find module is in the standard library. (In Python 1.6 it will be obsolete, but it's a convenient example.) Let's have a look at find.py just to see if there's any weirdness there:""" #---------------------------------------------------------------------- import fnmatch import os _debug = 0 _prune = ['(*)'] def find(pattern, dir = os.curdir): list = [] names = os.listdir(dir) names.sort() for name in names: if name in (os.curdir, os.pardir): continue fullname = os.path.join(dir, name) if fnmatch.fnmatch(name, pattern): list.append(fullname) if os.path.isdir(fullname) and not os.path.islink(fullname): for p in _prune: if fnmatch.fnmatch(name, p): if _debug: print "skip", `fullname` break else: if _debug: print "descend into", `fullname` list = list + find(pattern, fullname) return list #---------------------------------------------------------------------- """This imports two more modules, and then defines two variables and a function. Module os is a standard library module with special status. It's written in Python, but its source code is actually pretty hairy and dynamic; we can assume that its effective behavior can be hardcoded in the analyzer somehow, or perhaps we show the analyzer an idealized version of its source code. (This is an example of one trick our analyzer can use to make its life easier. It's equivalent to the concept of a "lint library" for the Unix/C lint tool.) 
Let's look at the fnmatch source code (I've left out some doc strings):""" #---------------------------------------------------------------------- import re _cache = {} def fnmatch(name, pat): import os name = os.path.normcase(name) pat = os.path.normcase(pat) return fnmatchcase(name, pat) def fnmatchcase(name, pat): if not _cache.has_key(pat): res = translate(pat) _cache[pat] = re.compile(res) return _cache[pat].match(name) is not None def translate(pat): i, n = 0, len(pat) res = '' while i < n: c = pat[i] i = i+1 if c == '*': res = res + '.*' elif c == '?': res = res + '.' elif c == '[': j = i if j < n and pat[j] == '!': j = j+1 if j < n and pat[j] == ']': j = j+1 while j < n and pat[j] != ']': j = j+1 if j >= n: res = res + '\\[' else: stuff = pat[i:j] i = j+1 if stuff[0] == '!': stuff = '[^' + stuff[1:] + ']' elif stuff == '^'*len(stuff): stuff = '\\^' else: while stuff[0] == '^': stuff = stuff[1:] + stuff[0] stuff = '[' + stuff + ']' res = res + stuff else: res = res + re.escape(c) return res + "$" #---------------------------------------------------------------------- """This in turn imports the re module. This one is a bit too long to include here; let's assume that, like the os module, it's known as a special case to the analyzer. Just a variable initialization and three function definitions here, no other executable code. Let's go back to the top-level script. I think the analyzer can easily recognize the idiom ``if __name__ == "__main__": ...'': it can know that since this is the root of the program, __name__ is indeed equal to "__main__", so it knows that main() gets called. Now we need to analyze main() further. Here it is again, with line numbers:""" #---------------------------------------------------------------------- def main(): # 1 dir = "." # 2 if sys.argv[1:]: # 3 dir = sys.argv[1] # 4 list = find.find("*.py", dir) # 5 list.sort() # 6 for name in list: # 7 print name # 8 #---------------------------------------------------------------------- """In line 1, we see that there are no arguments. Line 2 initializes the variable dir with the constant value ".", so we know its type is a string at this point. There's one other assignment to dir, on line 4. How do we know that this is also assigning a string to dir? My reasoning as the human reader of the program is that sys.argv is initially a list of strings, so sys.argv[x] for any x either raises an exception or yields a string value. The initial type of sys.argv can be known to the analyzer. How does the analyzer know that no other code has assigned anything different to sys.argv? Exhaustive analysis can probably show that there are no assignments to sys.argv or its items (or slices) anywhere in all the modules used by the program, nor are there calls to any of the list-modifying methods. We may be able to restrict ourselves to the code that may already have run by the time we reach this statement -- but then we have to prove that there are no other calls to main(). Maybe the analyzer can be primed with special knowledge about sys.argv, e.g. that its type cannot change. Then statements that cannot be proved to keep its type the same can be flagged as errors. Of course this gets muddy in the light of aliasing -- we'd need to keep around the information that some variable might point to sys.argv. Fortunately that kind of information seems useful in general. OK, so we know that dir is a string. 
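As a contrived illustration (none of these statements occur in the program under analysis), these are the kinds of assignments such an exhaustive check would have to rule out or flag, including mutation through an alias:
#----------------------------------------------------------------------
import sys

args = sys.argv        # an alias the analyzer would have to remember
sys.argv[0] = 3.14     # item assignment: the items are no longer all strings
args.append(None)      # mutation through the alias has the same effect
sys.argv = 42          # outright rebinding: sys.argv is not even a list now
#----------------------------------------------------------------------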
Let's go back to line 3: the checker notes that sys is an imported module (the sys module) which has indeed an argv argument that is sliceable. It will also note that the expression ``1'' has the type integer which is a valid slice index. (There are no out-of-bounds exceptions for slice indices.) In line 4, we need to have another look at the expression ``sys.argv[1]''. (Note: I'm not saying that the analyzer jumps around haphazardly like this, I'm just making a case for what kinds of processes typically go on in the analyzer. In reality it probably goes at it in a much more orderly fashion.) Again, sys.argv is recognized as a list, so it can be indexed, and the expression ``1'' has the correct type. Now, indexing may cause an IndexError exception. Can the checker prove that we won't (ever!) get an IndexError at this particular line because of the test in the if statement on the previous line? I think that may be asking a bit much. But it knows that if the index is valid, the result will be a string (see above). Next, line 5. Here the analyzer knows that find is a module we imported, and that find.find is a function defined in that module. We call it with two arguments. The first is a string literal; the second is our local variable dir, which is also a string. Let's have a look at the function definition again:""" #---------------------------------------------------------------------- def find(pattern, dir = os.curdir): # 1 list = [] # 2 names = os.listdir(dir) # 3 names.sort() # 4 for name in names: # 5 if name in (os.curdir, os.pardir): # 6 continue # 7 fullname = os.path.join(dir, name) # 8 if fnmatch.fnmatch(name, pattern): # 9 list.append(fullname) # 10 if os.path.isdir(fullname) and not os.path.islink(fullname): # 11 for p in _prune: # 12 if fnmatch.fnmatch(name, p): # 13 if _debug: print "skip", `fullname` # 14 break # 15 else: # 16 if _debug: print "descend into", `fullname` # 17 list = list + find(pattern, fullname) # 18 return list # 19 #---------------------------------------------------------------------- """Indeed the function takes two arguments. It's also reassuring that the second argument has a default argument of type string (os.curdir is easily recognized as a string, using similar reasoning as for sys.argv). Now let's analyze it further. Line 2 defines a local variable list initialized to an empty list. Will it always have the type List, throughout this function? There's an assignment further down to this variable from the expression ``list + find(...)''. How can we prove that the type of that variable is List? I can show it in any of two ways, neither is very satisfactory: 1. List objects support a + operator only with a List right operand, and the result is a List. The problem with this is that the right operand might be a class instance that defines __radd__ and returns something else, so it's not a valid proof. 2. Using induction: if recursion level N returns a List, the assignment ``list = list + find(...)'' shows that recursion level N+1 has type List; recursion level 0 (no recursive calls) has return type List; so all recursion levels have type List. This is a valid proof (though I should write it down more carefully) but I'm not sure if I can assume that the analyzer is smart enough to deduce it! I'm not sure how to rescue myself out of this conundrum; perhaps there's value in John Aycock's assumption that variables typically don't change their type unless shown otherwise; then we could assume list was a List throughout. 
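To spell out the counterexample behind point 1 (the class and its name are invented purely for illustration):
#----------------------------------------------------------------------
class Joint:
    def __radd__(self, other):
        # adding a Joint to a list hands back a string, not a list
        return "surprise"

list = []                 # same spelling as in find()
list = list + Joint()     # no exception is raised, but list is now a string
print type(list)
#----------------------------------------------------------------------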
Still thin ice, but this is a common pattern. The rest is a bit simpler. Line 3 calls a known system function taking a string and returning a list of strings, or raising os.error. We remember that the dir argument is a string so this is valid. We also note that this might raise os.error. As indeed it will when we pass it a non-directory as a command line argument. The return type shows us that the local variable names is set to a list of strings. Line 4 sorts that list. The analyzer should know that this calls the built-in function cmp() pairwise on items of the list; comparing strings is fine so there's no chance of an error here. In line 5 we iterate over names, which is a list of strings, so we know name is a string. Plodding along: ``name in (os.curdir, os.pardir)'' is a valid test; the rhs of the in operator is a tuple of strings and we know that the in operator calls cmp(name, x) for each x in the tuple; again, this is fine. Line 8, ``fullname = os.path.join(dir, name)'': we can know that os.path.join is a function of 1 or more string arguments returning a string; the arguments are both strings and we now know that the local variable fullname is a string, too. Line 9 calls fnmatch.fnmatch(). I postulate that it's obvious that this takes two arguments and returns a Boolean:""" #---------------------------------------------------------------------- def fnmatch(name, pat): import os name = os.path.normcase(name) pat = os.path.normcase(pat) return fnmatchcase(name, pat) def fnmatchcase(name, pat): if not _cache.has_key(pat): res = translate(pat) _cache[pat] = re.compile(res) return _cache[pat].match(name) is not None #---------------------------------------------------------------------- """I leave it as an exercise that the argument types (strings again) are correct and that no other errors can occur. Line 10 calls list.append(fullname). We know that list is a List object and that fullname is a string. We should also know the effect of a list's append method; the call is correct (it takes one argument of arbitrary type). What do we now know about the type of the list variable? It was initialized to an empty List. It's still a List, and we know that at least some of its items are strings. Are all its items strings? This gets us into similar issues as the recursive call to find() before, and just as there, I'm not sure that we really know, so maybe we need to continue the single type hypothesis. (One way out would be to assume the single type hypothesis until we see positive proof to the contrary, and if so, redo the analysis with a less restricted type.) I'll leave the rest of this function as an exercise; no new principles are employed. Note that _prune is a global variable initialized to a list of strings, and, with John, we'll assume that that is its final type; this makes everything work. This function is recursive. Could we prove that the recursion will terminate? Probably not; it would require knowing filesystem properties. Note that if it weren't for the os.path.islink() test, it would be possible to create a structure in the filesystem that would cause infinite recursion here! So our analyzer might flag this function as questionably recursive. We can now finish the analysis of our original main function:""" #---------------------------------------------------------------------- def main(): # 1 dir = "."
# 2 if sys.argv[1:]: # 3 dir = sys.argv[1] # 4 list = find.find("*.py", dir) # 5 list.sort() # 6 for name in list: # 7 print name # 8 #---------------------------------------------------------------------- """We know that find.find() returns a list of strings, so this is the type of the list variable. We already talked about sorting a list of strings. I can probably prove that sorting here is redundant, given the way find() sorts its list of names, but that will be hard for the analyzer, so I doubt that it will find this subtle optimization tip. The final for loop and print statement have no further problems; we know that list is a List, which is a sequence, so a for loop can iterate over it; the print statement calls str() on each of its arguments and this function is always safe on strings (as it does for most types, except instances or extension types that raise exception in their __str__ implementation).""" From paul@prescod.net Tue Dec 14 15:20:54 1999 From: paul@prescod.net (Paul Prescod) Date: Tue, 14 Dec 1999 07:20:54 -0800 Subject: [Types-sig] Avoiding innovation Message-ID: <38566056.70679872@prescod.net> In response to Greg's message I want to add a design goal: #11. Wherever possible the system should try to build upon existing implemented type systems and research rather than being designed from scratch for Python. It will build much more closely on dynamic language type annotation systems such as those in Smalltalk, Common Lisp, Dylan and Visual Basic. Java and C++ are of secondary interest as models. --- Python is just another syntax and virtual machine for the lambda calculus. It obeys the same mathematical laws as other programming languages. I think it would be a mistake to throw out everything that we know about type systems and implement something idiosyncratic. Python IS a classical dynamic object/procedural programming language. It is not a research language and I dislike attempts to put in untested new ideas, especially in the area of type checks. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things to be wary of: A new kid in his prime A man who knows the answers, and code that runs first time http://www.geezjan.org/humor/computers/threes.html From guido@CNRI.Reston.VA.US Tue Dec 14 15:37:11 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Tue, 14 Dec 1999 10:37:11 -0500 Subject: [Types-sig] RFC 0.1 In-Reply-To: Your message of "Tue, 14 Dec 1999 03:08:59 PST." References: Message-ID: <199912141537.KAA23487@eric.cnri.reston.va.us> [Greg Stein] > Note that one benefit of associating types with names, is that you can > shortcut the data flow analysis (so the analysis is not necessarily the > same). But: you cannot have a name refer to different types of objects > (which I don't like; it reduces some of Python's polymorphic and dynamic > behavior (interfaces solve the polymorphism stuff in a typed world)). This is a bogus argument. From the point of view of human readability, I find this: s = "the quick brown fox" s = string.split(s) del s[1] # the fox is getting old s = string.join(s) less readable and more confusing than this: s = "the quick brown fox" w = string.split(s) del w[1] # the fox is getting old s = string.join(w) The first version gives polymorphism a bad name; it's like a sloppy physicist using the same symbol for velocity and accelleration. The polymorphism that is worth having deals with function arguments and containers and the like. 
For example: def sum(l, zero): s = zero for x in l: s = s + x return s Here the type of l is sequence of <T> and the type of zero is <T>; the only implied requirement for <T> is that <T> + <T> returns another <T>. The fact that this works just as well for lists of ints, floats, strings, or even matrixes, given the appropriate zero, is valuable polymorphism. Other languages can only do this using parametrized types; they get more type checking, but at a terrible cost. Note that a type inferencer may not be able to deduce the rules I stated above, since you could construct an example where there is no single type and yet the whole thing works. E.g. I could create a list [1, 2, 3, joint, "a", "b", "c"] where joint is an instance of a class that when added to an int returns a string. However if we had a typesystem and notation that couldn't express this easily but that could express the stricter rules, I bet that no-one would mind adding the stricter type declarations to the code, since those rules most likely express the programmer's intent better. --Guido van Rossum (home page: http://www.python.org/~guido/) From paul@prescod.net Tue Dec 14 15:44:30 1999 From: paul@prescod.net (Paul Prescod) Date: Tue, 14 Dec 1999 07:44:30 -0800 Subject: [Types-sig] Sorry! Message-ID: <385665DE.9963174B@prescod.net> ...for spamming you guys all night. I want to make sure that everybody's concerns get addressed. I'll slow down tonight. One issue that Greg raised was the difficulty of checking builtin types (in addition to the hairy parameterized types stuff). I've been thinking about this and I think that the doc-sig and the types-sig have the same problem. How do we sniff out the parameters and docstrings for methods without running dangerous binary code? I think that Java (and many other languages) has the right plan with "shadow libraries." The CORBA guys already use IDL as a static library syntax. I think that we should support both IDL and a strongly-typed Pythonic syntax. It might work something like this: def Int foo(a, b): "Foo, defined in module" pass def String foo(c, d, *args ): "Foo, defined in module" pass import _foo locals().update( _foo.__dict__ ) Maybe we would have some kind of a keyword instead of the locals() hack. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things to be wary of: A new kid in his prime A man who knows the answers, and code that runs first time http://www.geezjan.org/humor/computers/threes.html From guido@CNRI.Reston.VA.US Tue Dec 14 15:58:44 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Tue, 14 Dec 1999 10:58:44 -0500 Subject: [Types-sig] Re: RFC 0.1 In-Reply-To: Your message of "Mon, 13 Dec 1999 23:05:24 PST." <3855EC34.F78D9A99@prescod.net> References: <3855EC34.F78D9A99@prescod.net> Message-ID: <199912141558.KAA23531@eric.cnri.reston.va.us> [Paul Prescod] > If I were Guido I would be unwilling to instruct every standard > library package maintainer to supply all type declarations in order > to please the minority who want to use Python in a manner that is as > restrictive as Java. Don't assume that! I think that for standard library modules (either in Python or in C), having the types can be a great boon -- it acts as documentation, guidelines for future API evolution, etc. Well worth having. Probably will catch some bugs in contributed code too! :-) > Not just declarations: someone needs to actually define the set of > "standard interfaces."
There are probably a few weeks worth of work > there and even a few weeks of work are hard to find since we all have > other jobs. This can be done incrementally, like the documentation got done. > > How about a declaration syntax, e.g., > > > > var x : type1, y : type2 > > > > Is this prohibited by the RFC? > > Yes, but I may change my mind on this issue based on Guido's feedback. Feedback: I think adding type declarations is too important to be crippled by a "no new keywords, no new syntax" rule. --Guido van Rossum (home page: http://www.python.org/~guido/) From tismer@appliedbiometrics.com Tue Dec 14 15:57:40 1999 From: tismer@appliedbiometrics.com (Christian Tismer) Date: Tue, 14 Dec 1999 16:57:40 +0100 Subject: [Types-sig] RFC 0.1 References: Message-ID: <385668F4.2340C4B2@appliedbiometrics.com> Greg Stein wrote: > > On Mon, 13 Dec 1999, Paul Prescod wrote: > > I did evaluate your proposal but it seemed to me that it was solving a > > slightly different problem. I think as we compare them we'll find that > > your ideas were more oriented toward runtime safety checking. > > True, but I might posit that (due to Python's dynamic nature) you really > aren't going to come up with a good compile-time system. That leaves > runtime. This is very true. Are you completely moving away from type declaration, or do you still propose an expr ! typeid notation? ... > a = b > > We suggested that type checking is defined and applied to the value (b), > rather than associating a type with "a" and performing an assertion at > assignment time. The concept of "this variable name can only contain > values of type" is a standard, classical approach. We didn't think > it applied as well to Python (for a number of reasons). If you're doing > type inferencing, then you are actually tracking values -- the types > associated with a name are very artificial/unnecessary during type > inferencing. For example: > > a = [1, 2] > foo(a) > a = {1: 2} > bar(a) > a = 1.2 > baz(a) One could have both behaviors at the same time, I think. Type restriction would be a property of the involved namespace. The namespace responsible for the assignment could be an extended dictionary object with the desired rules defined. ... > For the case of: > > if condition: > x = 1 > else: > x = "abc" > > I would say that the type of "x" is a set, rather than a particular type. > If you're going to do type-checking/assertions, then any uses of "x" > better be able to accept all types in the set. Allow me a question about types: Where are the limits between types, values, and properties of values? Assume a function which returns either [1, 2, 3] or the number 42. We now know that we either get a list or an integer. But in this case, we also know that we get a list of three integer elements which are known constants, or we get the integer 42 which is even, for instance. So what is 'type', how abstract or concrete should it be, where is the cut? > I believe that Python is too rich in data types and composition of types > to be able to add *syntax* for all type declarations. At the same time, Python is so rich from self-inspection that writing a dynamic type inference machine seems practicable, so how about not declaring types, but asking your code about its type? I could imagine two concepts working together: Having optional interfaces, which is a different issue and looks fine (Jim's 0.1.1 implementation). Having dynamic type inference, which is implemented by cached type info at runtime. 
(I hope this idea isn't too simple minded) Assume for instance the string module, implemented in Python. It would have an interface which defines what goes in and out of its functions. At "compile" time of string.py, type inference can partially take place already when the code objects are created. The interface part creates restrictions on argument values, which can be used for further inference. It can also be deduced whether the return values already obey the interface or if deduction for imported functions is necessary. This info is saved in some cache with the compilation. Changes to the module object simply break the cache. When I dynamically redefine a piece of the module where it depends of (say I assign something new to "_lower"), then the analysis must be carried out again, recursively invalidating other cached info as necessary. Well, this is an example where I think the restriction to type checking of expressions still applies, but something more is needed to trigger this check early. The involved namespace object is the string module's __dict__, which should know that it is referenced by this expression: def lower(s): res = '' for c in s: res = res + _lower[ord(c)] return res And by assignment to the name "_lower" in this case, it could invalidate code object lower's type cache. lower can no more assure that it will return a string result and will trigger its interface object to re-check consistency. The latter will raise an interface_error if the rule doesn't match. It remains an open question for me how deeply possible values should be checkable, i.e. "this arg has to be a list which is not empty". Minor point, maybe. Did I make some sense, or am I off the track? - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From paul@prescod.net Tue Dec 14 16:07:12 1999 From: paul@prescod.net (Paul Prescod) Date: Tue, 14 Dec 1999 08:07:12 -0800 Subject: [Types-sig] Re: Inferencing: A case study References: <199912141519.KAA23476@eric.cnri.reston.va.us> Message-ID: <38566B30.36608D4E@prescod.net> Guido van Rossum wrote: > > Here's a long and rambling example of what I think a type inferencer > could do -- without type declarations of any sort. I wrote this down > while thinking about the type checker that I would like to see in > IDLE. Okay, but let me ask this: if TOTAL Java-level type safety ONLY required type declarations for all "non-local" variables (including functions and instance variables) would that be acceptable to you? Your inferencer heuristics are fine for an interactive GUI environment where failure is merely an inconvenience but if we are going to have a formally checkable notion of "this is statically type-safe" and "this is not" then I worry about the "non-local breakage" problem. Oops, did changing that variable to an "int" break your module way over there? I spoke to the Journal of Functional Programmers at a conference recently. I asked him about why ML's type inferencer made the language so hard to use. He said: "oh, you should always put the type declarations in. The type inferencer is mostly just an educational tool." Of course that's not what the type inferencer was SUPPOSED to be, but I think that that's what it has become. 
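To make the non-local breakage worry concrete, here is an invented example (neither the module nor the function exists anywhere; the names mean nothing):

# settings.py (hypothetical), version 1
def get_port():
    return 8080                # inferred: returns an integer

# ...far away, perhaps in another module, with no annotations anywhere:
port = get_port() + 1          # fine as long as get_port() yields an int
print port

# If the author later changes get_port() to ``return "8080"'', a global
# inferencer reports the failure here, in the caller, far from the line
# that actually changed.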
"Global" type inferencing scares me and I think that it has the unintended consequence of making the static type checker (and thus the language) harder to understand. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things to be wary of: A new kid in his prime A man who knows the answers, and code that runs first time http://www.geezjan.org/humor/computers/threes.html From Edward Welbourne Tue Dec 14 16:18:19 1999 From: Edward Welbourne (Edward Welbourne) Date: Tue, 14 Dec 1999 16:18:19 +0000 Subject: [Types-sig] Re: Static typing considered ... UGLY Message-ID: > too subjective OK, fair enough: why *do* I find it ugly ? Substantially, prejudice and paranoia. However ... If I implement a datatype (probably as a class) whose objects (instances) behave *just the same as* integers in all pythonic respects, I demand to be able to use it everywhere that I am allowed to use an integer. If static typing breaks that, it's right out. If the way you're doing static typing is based on `what interfaces does this object support' questions instead of `type' (and I realise your deliberate vaguery on what you mean by `kind' of value may allow this), then I'm much less concerned, though I have deep reservations about changing the syntax of python in order to provide syntactic sugar for stuff that can, at present, be done using assertions. Furthermore, boring though it may be to begin a function with as many assertions as arguments, the assert mechanism leaves ample scope for the programmer to identify just exactly what it is the programmer wanted to say as the constraint on the integer (not only is it an integer, and non-negative, but *it's even*, say): and this without having to invent new and fascinating syntactic forms to express it. It's all very well to say that `existing python code will be unaffected' but if existing python programmers come across > import frozen > > frozen > def foo( a ): > return string.replace(a,"b") we're not going to be happy with being expected to understand that foo is now a name we can't modify. The existing semantics of evaluating an expression (which is how I'm reading `frozen' the second time it appears) are that the expression is evaluated and thrown away and doing so hasn't changed the semantics of how the interpreter modifies namespaces thereafter. The fact that the last-executed expression yielded (and discarded) a type object should *not* have any impact on the meaning of the code following. And existing python programmers might sensibly write something like: try: types.MagicMethodType # Check we're using python 2.0 version = '2' except AttributeError: # Cope if we're not version = '1' and be unhappy about the typerror because '2' isn't a magic method. Indeed, if any of the 1.x chain have added values to types, the above code may appear awful close to verbatim in reality ... On the other hand, if you want an object whose attributes are of pre-decided kinds, or a namespace in which certain names are reserved for certain values, use a setattr hack (or, if you're feeling very brave, some variant on the wrapper defined by URL: http://www.chaos.org.uk/~eddy/dev/toy/class.py). Likewise, if you want a namespace (the module in which your code above appeared) which can be initialised `in the usual way' but which (except with severe hassle which should alert folk to the folly of doing so) can't be modified after initialisation, use an initspace ... 
see .../~eddy/dev/toy/object.html and, in the same directory, python.py > if Python is never allowed to make major changes ... I'm not suggesting `no change' - only `not in that direction'. And even type-checking can get past my prejudices if it's approached gently ... ... I've now read the Greg/Fred/Sjoerd attack and I like that: let ! be a new binary operator with grammar anyvalue ! typechecker the value of the expression being that of the given value, but evaluating it'll raise an exception if the typechecker didn't like the value. Now that's a much nicer way to go. Of course, this effectively just amounts to implementing ! as an in-expression assert mechanism ... and I'm not entirely sure how it helps the compiler-writer - is that why you insist on the typechecker being a dotted name, not an arbitrary expression ? Type-checking applies to values ;^) Of course, obstreperous as I am, I immediately want to meddle with the scheme: specifically, though the *default* behaviour might be (in effect) if not isinstance(value, typechecker): raise TypeError else: yield value I'd argue for the semantics to say: evaluate the expressions `value' and `typechecker', look for a __check__ method on the latter: if present invoke it on the value, else use isinstance as above; on false return (no problem) the !-expression yields the given `value', otherwise TypeError with parameter the true value returned. Then we can implement weird and devious __check__ methods for fiddly type-checks (instead of needing to change isinstance in the way proposed - which would conflict with my pet tweak to that, which allows isinstance(value, thistype, thattype, othertype) for when I've got several types I'll accept). Please Greg/Fred/Sjoerd, can you write a proposal which starts where http://www.foretec.com/python/workshops/1998-11/greg-type-ideas.html ends (reading the first 9/10 of that was ... illuminating in hindsight, but off-putting on the way there). It looks pretty promising ... Eddy. From guido@CNRI.Reston.VA.US Tue Dec 14 16:33:17 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Tue, 14 Dec 1999 11:33:17 -0500 Subject: [Types-sig] RFC 0.1 In-Reply-To: Your message of "Mon, 13 Dec 1999 22:53:04 PST." <3855E950.AE0E3E19@prescod.net> References: <3854EB4B.37EA2888@prescod.net> <199912131809.NAA19402@eric.cnri.reston.va.us> <3855E950.AE0E3E19@prescod.net> Message-ID: <199912141633.LAA23558@eric.cnri.reston.va.us> [Paul Prescod again] > In theory, but in practice "whole-program X" seems to never get > implemented (in Python or elsewhere!), as in "whole program type checks" > and "whole program optimization" and "whole program flow analysis." > "Whole program analysis" tends to be an excuse to put off work (roughly > like "type inference"). I actually hope that I can use some of the small change I got from DARPA for CP4E to do this rather than putting it off, but I hear your warning -- and I agree it's a major project. > No, I was thinking of actually compiling to the same byte-codes. It > isn't really "safe" to turn off type-checks at runtime but it also isn't > safe to turn off assertions. They are both there to guarantee program > correctness at the price of performance. But maybe we would make a > different command line option to control type checking. Hm, this is strange -- most of the time you seem to be firmly in the compile-time-checks camp, but here you seem to want run-time checks. I say we already have run-time checks, they just come a little later. 
(If we didn't have runtime checks, an expression like 1+"" would dump core rather than raising a TypeError exception.) If it's (OPT) we're after, adding run-time checks can never obtain your goal. If it's (ERR) we're after, well, *maybe* adding some run-time checks can produce clearer error messages than some of the existing ones, but this doesn't really do anything for my confidence that my program is correct -- if there's a type error in my except clause, what good does it do me to get a type-check error at run time? > > The indentation don't enter into it. Consider > > > > if win32: > > def func(): ... # win32 specific version > > else: > > def func(): ... # generic version > > That's precisely what I'm trying to disallow. I don't know the value of > win32 until runtime! The pyc could be moved from Unix to win32. Most people interested in (OPT) would gladly trade in platform independence for speed. > And more > to the point, the value win32 might be computed based on arbitrarily > complex code. But typically, it isn't. > So that's why I said out-dented. An out-dented name > binding statement cannot depend (much) on a computed value. Computed > base classes are going to have to be explicitly disallowed for > statically checkable classes: > > class foo( dosomething() ): > ... There's an alternative. You could do some analysis on both variants of func() and derive a union for its interface (arguments & return type). If that union is really weird, a static checker might even warn the user that the two versions of func() don't behave the same way! (E.g. if on win32, func() takes more or different arguments or returns a different type, it's hard to write the code that *uses* func() portably, so something is probably wrong in the design.) > > > Classification: > > > Due to a shortage of synonyms for "type" that do not already have a > > > meaning, we use the word "classification." > > > > Oh, dear. Keep looking for a better synonym! > > You just had to put "type" and "class" in the same language! Blame C++ or Java, both of which have separate concepts of type and class. I'll admit that the type() function is pretty bogus -- perhaps it should be matched to isinstance(), which takes either a type object or a class as its second argument. Perhaps it's not too late to use the word type for the concept you need? (We can distinguish by using "type object" to refer to the old concept where we need it.) > I could > redefine the term type in this context and refer to the old concept of > type as I did below: Aha. Proof that I didn't read ahead when I wrote that previous paragraph. :-) > > The initialization for b denies its type declaration. Do you really > > want to do this? > > None is a valid value for any type as with NULL in C or SQL. No. In C, NULL is not a valid integer (at least not conceptually -- it's a pointer). I hate the fact that in Java, NULL is always a valid string, because strings happen to be objects, and so I always run into run-time errors dereferencing NULL. I'd like to be able to declare the possibility that a particular value is None separate from its type -- this feels much more natural and powerful to me. > > This doesn't look like it should be part of the > > final (Python 2.0) version -- it's just too ugly. How am I going to > > explain this to a newbie with no programming *nor* Python experience? 
> > With all due respect my problem is that you took the obvious (or at > least traditional) instance variable declaration syntax and used it as a > class variable declaring syntax. Okay, let's try this: > > class foo: > types.IntType, a=5 > > def __init__( self ): > types.ListType, self.b > > That looks equally ugly to me. Got any other ideas? There have been plenty of suggestions, from int a=5 via a:int = 5 to a!int = 5 and even a = 5!int... > On a separate track: I don't think that the whole static type system is > for newbies, just as all of Python is not for newbies (think > __getattr__). You shouldn't even start thinking about static typing > until you are trying to "tighten up" your code for performance or > safety. I don't want to use that as an excuse to make things difficult > but if we are ever going to get to full polymorphic parametric static > type checking we will have to acknowledge that the type system will have > hard parts just as the language has hard parts. Yes, fair enough. > > Explain the reason for excluding instances? Maybe I'm not very clear > > on what you're proposing here. > > I think that that was from an earlier draft. Obviously we can't check > instance variables in the same way that you check class and module > namespaces but we do want to check them. The thought gives me a > headache. It's my fourth year compiler class all over again. Make it > stop! The hard part is keeping which variables (and arguments, etc.) can contain instances of a given class; if we have that we can track instance variable assignments. A simple rule (which I may just implement in a "stricter Python" for use in early CP4E classes, just like the TeachScheme project starts teaching with a Scheme variant that has only 6 constructs) would be that class instance variables can only be assigned to via self. We can then statically analyze the methods comprising the class body, ignoring all dynamicism allowed in more advanced versions of the language, and deduce a set of instance variable names. The implementation can then be told about this set and disallow setting others (except by derived classes, which are dealt with separately). I'm hoping that this idea can somehow be extended to full Python -- maybe I'm naive? > Maybe if I just specify it, some fourth year student will implement it > as a project. Is John Aycock listening? :-) --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@CNRI.Reston.VA.US Tue Dec 14 16:42:08 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Tue, 14 Dec 1999 11:42:08 -0500 Subject: [Types-sig] type-safe declaration In-Reply-To: Your message of "Mon, 13 Dec 1999 22:44:38 PST." <3855E756.174CE40D@prescod.net> References: <3855E756.174CE40D@prescod.net> Message-ID: <199912141642.LAA23570@eric.cnri.reston.va.us> [still Paul Prescod] > Under my plan, you would need a static declaration on YOUR code. I mean > if your code can NEVER be right (e.g. range( "abc" ) ) then maybe a > smart checker could report that. Java actually requires this of > implementors. But if your code COULD be right (which is much more often > the case in Python) then it should wait until runtime to check: > > a=callSomeUnTypedFunction() > range( a ) If the type checker can prove that callSomeUnTypedFunction() can return non-integer types as well as integers I think I'd be happy to get a warning here (as long as we're in lint mode). 
It's much more likely that the programmer didn't realize this possibility, than that she somehow had tweaked the environment or the arguments so that callSomeUnTypedFunction() would never return a non-int at this particular call site, or that she would be catching the resulting TypeError later. Aside: I also believe that a static typechecker can easily know 99% of all try-except statements that are currently on the call stack. Try-except statements with a variable (that isn't a simple alias) in the exception name slot are extremely rare, in my experience. Of course a lint-style checker should also warn about (1) all unqualified except clauses, and (2) "wide" try clauses -- that is, try clauses around lots of code that could raise the exception that is being caught. Bot of these are caused by sloppy coding much more frequently than they are a necessity in the program. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@CNRI.Reston.VA.US Tue Dec 14 16:45:38 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Tue, 14 Dec 1999 11:45:38 -0500 Subject: [Types-sig] IsInstance In-Reply-To: Your message of "Mon, 13 Dec 1999 22:22:31 PST." <3855E227.AE33907@prescod.net> References: <3854EB4B.37EA2888@prescod.net> <199912131809.NAA19402@eric.cnri.reston.va.us> <3855E227.AE33907@prescod.net> Message-ID: <199912141645.LAA23582@eric.cnri.reston.va.us> > From: Paul Prescod > I wanted the function to return an object: > > myList=isinstance( foo, types.ListType ) > if not myList: > myDict=isinstance( foo, types.DictionaryType ) Good feature idea, but abusing isinstance() is a bad name. In C++ I believe this is called a dynamic cast. Long ago I learned to define virtual functions that would return either an X, if the object was an X, or a null pointer. Besides, the "if not myList" test could fail if foo happened to be an empty list. > Then we can do the inferencing by looking at a single statement. Compare > it to this: > > if isinstance( foo, types.ListType ): > myList=foo > elif isinstance( foo, types.DictionaryType ): > myDict=foo > > That inferencing is just too hard. Are you sure? > It isn't a proper cast operator > anymore. If you are willing to change isinstance to return the object if > it matches then I would like to use it. No, call it something else. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@CNRI.Reston.VA.US Tue Dec 14 16:49:50 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Tue, 14 Dec 1999 11:49:50 -0500 Subject: [Types-sig] List of FOO In-Reply-To: Your message of "Mon, 13 Dec 1999 21:55:07 PST." <3855DBBB.6D1B462A@prescod.net> References: <3855DBBB.6D1B462A@prescod.net> Message-ID: <199912141649.LAA23593@eric.cnri.reston.va.us> > From: Paul Prescod > > > #2. The first version of the system will not allow the use of types > > > that cannot be referred to as simple Python objects. In particular it > > > will not allow users to refer to things like "List of Integers" and > > > "Functions taking Integers as arguments and returning strings." > > > > It's been said before: that's a shame. Type inference is seriously > > hindered if it doesn't have such information. (Consider a loop over > > sys.argv; I want the checker to be able to assume that the items are > > strings.) > > It took two years to get the parameterized version of the Java type > system up and running. Probably because Java was initially conceived as a language with a "classic" type system (like C or Pascal). Python on the other hand already has all this. 
> Let me ask your opinion on this question > (seriously, not sarcastically), should we include a spelling for "list > of string" and not "callable taking list of callables taking strings > returning integers returning string" and what about "callable taking > list of callables taking <T> and R returning list of callables taking > <T> and returning <R>." You see my problem? I could special case "list > of" as Java and C did if we agreed to take our chances that my syntax > would be extensible. We could even steal that weird "[]" thing that C > and Java do: > > StringType [] foo If we could express all those the type checker could do a much better job. If we could at least do the ones without the <T> notation, we'd still be doing a good job. Stopping at "list" is useless. (I'm guessing your use of "R" instead of "<R>" once is a typo and not something deep I've missed?) --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@CNRI.Reston.VA.US Tue Dec 14 16:54:16 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Tue, 14 Dec 1999 11:54:16 -0500 Subject: [Types-sig] Type inferencing In-Reply-To: Your message of "Mon, 13 Dec 1999 21:54:41 PST." <3855DBA1.9384B6AE@prescod.net> References: <3855DBA1.9384B6AE@prescod.net> Message-ID: <199912141654.LAA23612@eric.cnri.reston.va.us> > From: Paul Prescod > > Point taken. I am only willing to do type inferencing up to a function > level. After my "ML Experience" I am not willing to do it globally. [example snipped] I'm disappointed. Jim Hugunin did global analysis on the pystone.py module -- 250 lines containing 14 functions and one class with two methods. (He may actually have left out the class, but I'm pretty sure he did everything else.) He got a 1000x speedup, which I think should be a pretty good motivator for those interested in (OPT). --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@CNRI.Reston.VA.US Tue Dec 14 16:59:27 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Tue, 14 Dec 1999 11:59:27 -0500 Subject: [Types-sig] Plea for help. In-Reply-To: Your message of "Mon, 13 Dec 1999 20:39:36 PST." <3855CA08.4EA1BF11@prescod.net> References: <3855CA08.4EA1BF11@prescod.net> Message-ID: <199912141659.LAA23638@eric.cnri.reston.va.us> > Is there currently any path from high level parse trees to bytecodes? > E.g. is there a way to get sane parse trees to "render" themselves as, > er, insane parse trees? I don't think so but I'm just checking to avoid > extra work. The parser module lets you construct a parse tree and then compile it. The parse tree must be correct before this is allowed. Check out the compileast() function on http://www.python.org/doc/current/lib/Converting_ASTs.html --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@CNRI.Reston.VA.US Tue Dec 14 17:36:44 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Tue, 14 Dec 1999 12:36:44 -0500 Subject: [Types-sig] Avoiding innovation In-Reply-To: Your message of "Tue, 14 Dec 1999 07:20:54 PST." <38566056.70679872@prescod.net> References: <38566056.70679872@prescod.net> Message-ID: <199912141736.MAA23833@eric.cnri.reston.va.us> > In response to Greg's message I want to add a design goal: > > #11. Wherever possible the system should try to build upon existing > implemented type systems and research rather than being designed from > scratch for Python. It will build much more closely on dynamic > language type annotation systems such as those in Smalltalk, Common > Lisp, Dylan and Visual Basic.
Java and C++ are of secondary interest > as models. > > --- > > Python is just another syntax and virtual machine for the lambda > calculus. It obeys the same mathematical laws as other programming > languages. I think it would be a mistake to throw out everything that we > know about type systems and implement something idiosyncratic. > > Python IS a classical dynamic object/procedural programming language. It > is not a research language and I dislike attempts to put in untested new > ideas, especially in the area of type checks. I like this. I have almost always tried to avoid invention for the rest of Python, and some of the few bits of invention are some of my least favorite Python features. I also often think of Python as a particularly dynamic *implementation* of a fairly conventional type system. --Guido van Rossum (home page: http://www.python.org/~guido/) From m.faassen@vet.uu.nl Tue Dec 14 18:07:47 1999 From: m.faassen@vet.uu.nl (Martijn Faassen) Date: Tue, 14 Dec 1999 19:07:47 +0100 Subject: [Types-sig] RFC 0.1 References: <3854EB4B.37EA2888@prescod.net> <199912131809.NAA19402@eric.cnri.reston.va.us> <3855E950.AE0E3E19@prescod.net> Message-ID: <38568773.218B3176@vet.uu.nl> Paul Prescod wrote: [vast snip] > With all due respect my problem is that you took the obvious (or at > least traditional) instance variable declaration syntax and used it as a > class variable declaring syntax. Okay, let's try this: > > class foo: > types.IntType, a=5 > > def __init__( self ): > types.ListType, self.b > > That looks equally ugly to me. Got any other ideas? Let's ignore the syntax issue for now, please? Let's just put the type info in Python lists/dictionaries/etc. Those may look horribly ugly, but they're *there* for use, you can do fancy generic type construction in them if you want to, you can easily whip up a structure for that, and Python can already use them right away! Later on once we've got the horribly ugly system going we can think about syntax. Syntax will be clearer once we've got the semantics going, anyway. Regards, Martijn From GoldenH@littoncorp.com Tue Dec 14 18:23:13 1999 From: GoldenH@littoncorp.com (Golden, Howard) Date: Tue, 14 Dec 1999 10:23:13 -0800 Subject: [Types-sig] Pascal style declarations Message-ID: Since Guido hasn't had a coronary in response to my earlier suggestion, I will be more specific: 1. I propose _optional_ typing, using the Pascal syntax (since this seems to me to be the most "Pythonic" (Isn't that like giving a snake an enema? Sorry.). Actually, I don't care about the specific syntax, just as long as there is one. 2. Specifically, you can declare a variable using the syntax: var x : int, y : string, ... 3. In functions and methods, you can _optionally_ specify the argument type: def funx(x : int, y : string): ... 4. If you use these, then you are making binding assertions about the types of the names, and these assertions can be checked at compile or run time. 5. The parser could be made to strip out these declarations, and ignore them, in which case they would have no effect. 6. The parser should be modified so you can tell it (using a compile-time switch or pragma) to require declarations. 7. It appears to me that this would not change existing code, except if it uses the name "var". 8. I think there should be a parameterized type mechanism. I don't much like the angle bracket notation of C++, but I guess it's well established, so it'll do. In my opinion, this doesn't "muck up" the language (since you don't have to use it). --- Howard B. 
Golden Software developer Litton Industries, Inc. Woodland Hills, California From m.faassen@vet.uu.nl Tue Dec 14 18:27:38 1999 From: m.faassen@vet.uu.nl (Martijn Faassen) Date: Tue, 14 Dec 1999 19:27:38 +0100 Subject: [Types-sig] RFC 0.1 References: <38565074.9E51515F@prescod.net> Message-ID: <38568C1A.D1BCF530@vet.uu.nl> Paul Prescod wrote: > Greg Stein wrote: [assigns objects of various types to the same name and wants this to remain legal Python code] > I am perfectly happy to have it be legal Python code. I just don't > intend for it to be *statically type checkable* Python code. No, you > cannot use all of the flexibility of Python and expect to get all of the > static type checking of Java. For each function you choose one or the > other. I agree with this, which is I am advocating a strong split (for simplicity) of fully-statically checked code and normal python code. Later on you can work on blurring the interface between the two. First *fully* type annotated functions (classes, modules, what you want), which can only refer to other things that are fully annotated. By 'fully annotated' I mean all names have a type. I keep disagreeing with Paul's simplification of initially throwing out constructed types such as list of integer, as that would break my own approach at simplicity. :) [snip] > > I believe that Python is too rich in data types and composition of types > > to be able to add *syntax* for all type declarations. I think you better > > stop and realize that before you get in too deep :-) > > I have a few different answers here: > > 1. I don't have to be able to describe every possible type. If you > can't statically check that "foo is a callable from T,T to callable from > T" tough bloody luck, at least for the time being. Java can't do that. > Neither could mid-90's C++. And forget about it for ANSI C. > > Python is not the world's most OO programming language. It is just a > good one. It may not have the world's most static type checker. It will > just have a good one. No type system makes type errors impossible so > that is not my goal. My goal is that if a module uses type checks as > religiously as Java module would, that module would be roughly as > type-safe. If we throw out the syntax issue and use Python constructs for types until we know more, we'll all be happier, right? :) The syntax will be clear when the semantics is. Guido is good at syntax, let him figure out a good syntax for it, let's just focus on the semantics. Our static type checker/compiler can use the Python type constructions directly. We can put limitations on them to forbid any type constructions that the compiler cannot fully evaluate before the compilation of the actual code, of course, just like we can put limitations on statically typed functions (they shouldn't be able to call any non-static functions in the first iteration of our design, I'm still maintaining) [snip] > 3. Compositions of types are complex, but not infinitely complex. We > have about two decades in parameterized type research to rely on. Within > a year and a half, two of the world's most popular languages (C++ and > Java) will have parameterized types. Doesn't C++ already have parameterized types? (template classes and such?). > > In your RFC 0.1, you punted on the complex/composited data types issue too > > keep the solution tractable. 
I posit that you will *never* solve the > > problem of coming up with sufficient syntactical expression; therefore, > > you will always have to resort to a procedural component in your type > > system *if* you want full coverage. > > I am happy to have a runtime component. I just don't see that we need > any new syntax for this runtime component. And I don't think that we > should give up on a formally defined static system. I agree we should focus on a static system. Regards, Martijn From guido@CNRI.Reston.VA.US Tue Dec 14 18:51:01 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Tue, 14 Dec 1999 13:51:01 -0500 Subject: [Types-sig] Re: Inferencing: A case study In-Reply-To: Your message of "Tue, 14 Dec 1999 08:07:12 PST." <38566B30.36608D4E@prescod.net> References: <199912141519.KAA23476@eric.cnri.reston.va.us> <38566B30.36608D4E@prescod.net> Message-ID: <199912141851.NAA24093@eric.cnri.reston.va.us> > Guido van Rossum wrote: > > > > Here's a long and rambling example of what I think a type inferencer > > could do -- without type declarations of any sort. I wrote this down > > while thinking about the type checker that I would like to see in > > IDLE. > > Okay, but let me ask this: if TOTAL Java-level type safety ONLY required > type declarations for all "non-local" variables (including functions and > instance variables) would that be acceptable to you? > > Your inferencer heuristics are fine for an interactive GUI environment > where failure is merely an inconvenience but if we are going to have a > formally checkable notion of "this is statically type-safe" and "this is > not" then I worry about the "non-local breakage" problem. Oops, did > changing that variable to an "int" break your module way over there? > > I spoke to the Journal of Functional Programmers at a conference > recently. Is Journal some kind of military term, maybe between General and Sergeant? :-) > I asked him about why ML's type inferencer made the language > so hard to use. He said: "oh, you should always put the type > declarations in. The type inferencer is mostly just an educational > tool." Of course that's not what the type inferencer was SUPPOSED to be, > but I think that that's what it has become. "Global" type inferencing > scares me and I think that it has the unintended consequence of making > the static type checker (and thus the language) harder to understand. I agree. Typically, especially for libraries, there should be type decls at the module boundaries to avoid endless "exercises for the reader" as in my case study. (Note that the case study actually stipulates that the re module has a module declaration, and explains why.) --Guido van Rossum (home page: http://www.python.org/~guido/) From m.faassen@vet.uu.nl Tue Dec 14 18:52:03 1999 From: m.faassen@vet.uu.nl (Martijn Faassen) Date: Tue, 14 Dec 1999 19:52:03 +0100 Subject: [Types-sig] Re: RFC 0.1 References: Message-ID: <385691D3.6DC4A36E@vet.uu.nl> "Golden, Howard" wrote: [snip snip] > > #4. The first version of the system will be syntactically compatible > > with Python 1.5.x in order to allow experimentation in the lead-up to > > an integrated system in Python 2. > > Does this mean no new syntax? (That's what it appears from your examples.) > > How about a declaration syntax, e.g., > > var x : type1, y : type2 > > Is this prohibited by the RFC? While my agenda is to kill the syntax discussions for the moment, I'd propose a seperate declaration syntax before all others, because this is the most syntactically compatible with Python. 
And easier on the programmer. Imagine you have a module. Now you want to make it fully statically typed. With most syntax proposals I've seen you'd have to go through the code and add type declarations here and there, mix it with the current code. With either a Python based system as I'm proposing (ugly but powerful and fairly simple), or a seperate type declaration system, you have your type declarations separated from the code itself. This means you easily add and remove type information and switch between a statically typed module and a dynamically typed module easily. On a slightly seperate issue, I propose a classification of modules according to type annotation (or functions or classes, whatever level you prefer thinking about): fully unannotated module: Names have no type annotations. Full type dynamicism. Only run-time type checks by hand are possible. Can use any other kind of module. I.e. this is the good old Python module as we know it now. fully annotated module: All names (local and global, function definitions, classes, class members, class data, etc) in the module have a type annotation. Restricts lots. Can only use other fully annotated modules. object attributes are fixed at compile-time according to type annotations. code that tries to add a new member to an object at run-time will give a run-time error. 'a = "foo"; a = 1' will give a compile time error. I.e. this is like a static language and this can be compiled to fast native code. partially annotated module: Some names, but not all names, have type information. Possibly all names do in fact, but imported modules aren't fully annotated which also breaks things. Restricts some. Will raise a run-time error if it is detected that type annotations are violate, automatically. *may* do limited compile-time checking. *may* try to do type inference to turn this module into a fully annotated one. *may* even do fancy analysis and come up with one or more fully annotated modules (which can be compiled for speed reasons), but keeps a dynamic module around in case the fully annotated modules cannot be used. Regards, Martijn From m.faassen@vet.uu.nl Tue Dec 14 19:03:20 1999 From: m.faassen@vet.uu.nl (Martijn Faassen) Date: Tue, 14 Dec 1999 20:03:20 +0100 Subject: [Types-sig] List of FOO References: <3855DBBB.6D1B462A@prescod.net> <199912141649.LAA23593@eric.cnri.reston.va.us> Message-ID: <38569478.40E29421@vet.uu.nl> Guido van Rossum wrote: > > > From: Paul Prescod > > > > > #2. The first version of the system will not allow the use of types > > > > that cannot be referred to as simple Python objects. In particular it > > > > will not allow users to refer to things like "List of Integers" and > > > > "Functions taking Integers as arguments and returning strings." > > > > > > It's been said before: that's a shame. Type inference is seriously > > > hindered if it doesn't have such information. (Consider a loop over > > > sys.argv; I want the checker to be able to assume that the items are > > > strings.) > > > > It took two years to get the parameterized version of the Java type > > system up and running. > > Probably because Java was initially conceived as a language with a > "classic" type system (like C or Pascal). Python on the other hand > already has all this. 
> > > Let me ask this your opinion on this question > > (seriously, not sarcastically), should we include a spelling for "list > > of string" and not "callable taking list of callables taking strings > > returning integers returning string" and what about "callable taking > > list of callables taking and R returning list of callables taking > > and returning ." You see my problem? I could special case "list > > of" as Java and C did if we agreed to take our chances that my syntax > > would be extensible. We could even steal that weird "[]" thing that C > > and Java do: > > > > StringType [] foo > > If we could express all those the type checker could do a much better > job. If we could at least do the ones without the notation, we'd > still be doing a good job. Stopping at "list" is useless. [snip] I agree completely, and one *can* express most of this pretty easily in current Python, i.e.: types = { "bar": IntType, "baz": ListType(IntType), "hey": IntType, "foo3": FunctionType(args=(IntType,), result=IntType), "crazy" : ListType(FunctionType(args=(ListType(IntType), StringType), result=DictType(StringType, FunctionType(args=None, result=StringType))) } It looks very very ugly, but that's beside the point. It's usable for type reasoning from within Python, directly (I actually have a buggy module which this a little, and features typedefs to boot). One can come up with a more Pythonic syntax (indentation, anyone?) later, once one has the semantics working. Regards, Martijn From guido@CNRI.Reston.VA.US Tue Dec 14 19:09:59 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Tue, 14 Dec 1999 14:09:59 -0500 Subject: [Types-sig] RFC 0.1 In-Reply-To: Your message of "Tue, 14 Dec 1999 19:27:38 +0100." <38568C1A.D1BCF530@vet.uu.nl> References: <38565074.9E51515F@prescod.net> <38568C1A.D1BCF530@vet.uu.nl> Message-ID: <199912141909.OAA24221@eric.cnri.reston.va.us> [Martijn Faassen] > I agree with this, which is I am advocating a strong split (for > simplicity) of fully-statically checked code and normal python code. You can already do this -- write in Java or C. > Later on you can work on blurring the interface between the two. First > *fully* type annotated functions (classes, modules, what you want), > which can only refer to other things that are fully annotated. By 'fully > annotated' I mean all names have a type. I keep disagreeing with Paul's > simplification of initially throwing out constructed types such as list > of integer, as that would break my own approach at simplicity. :) Agreed. List of integer and its friends are important. Also correspondences (see my example of a sum() function taking a list of and an additional single . > If we throw out the syntax issue and use Python constructs for types > until we know more, we'll all be happier, right? :) The syntax will be > clear when the semantics is. Guido is good at syntax, let him figure out > a good syntax for it, let's just focus on the semantics. Thank you. This of course leaves Paul with the question of how to prototype all this -- he'll have to make *something* up. :-) --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@CNRI.Reston.VA.US Tue Dec 14 19:17:23 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Tue, 14 Dec 1999 14:17:23 -0500 Subject: [Types-sig] RFC 0.1 In-Reply-To: Your message of "Tue, 14 Dec 1999 16:57:40 +0100." 
<385668F4.2340C4B2@appliedbiometrics.com> References: <385668F4.2340C4B2@appliedbiometrics.com> Message-ID: <199912141917.OAA24231@eric.cnri.reston.va.us> [Christian Tismer] > Allow me a question about types: > > Where are the limits between types, values, and properties of values? > > Assume a function which returns either > [1, 2, 3] or the number 42. > > We now know that we either get a list or an integer. > But in this case, we also know that we get a list of three > integer elements which are known constants, or we get > the integer 42 which is even, for instance. > > So what is 'type', how abstract or concrete should it be, > where is the cut? Good questions. I'd like to remember all of this information. It can help with optimization (through constant folding). It can help detect unreachable code (e.g. your example function always returns a true value). Etc., etc. Note that this can all be folded into a sufficiently rich type system; a type is nothing more than a (possibly infinite) set of values. > At the same time, Python is so rich from self-inspection that > writing a dynamic type inference machine seems practicable, > so how about not declaring types, but asking your code about its > type? I suppose you could do symbolic execution on the bytecode, but I don't think this is a very fruitful path. (Of course if anyone can prove I'm wrong, it's you. :-) > I could imagine two concepts working together: > > Having optional interfaces, which is a different issue > and looks fine (Jim's 0.1.1 implementation). > > Having dynamic type inference, which is implemented by cached > type info at runtime. Eh? Type inference is supposed to be a compile-time thing. You present your whole Python program to the typechecker and ask it "where could this crash if I sent it on rocket to Mars?" > (I hope this idea isn't too simple minded) > Assume for instance the string module, implemented in Python. > It would have an interface which defines what goes in and > out of its functions. > > At "compile" time of string.py, type inference can partially > take place already when the code objects are created. The interface > part creates restrictions on argument values, which can be used > for further inference. It can also be deduced whether the return > values already obey the interface or if deduction for imported > functions is necessary. > This info is saved in some cache with the compilation. > Changes to the module object simply break the cache. And that's exactly the problem. I want to be able to be told whether the cache might be broken *before* I launch my rocket to Mars. > When I dynamically redefine a piece of the module where it > depends of (say I assign something new to "_lower"), then > the analysis must be carried out again, recursively invalidating > other cached info as necessary. In my scenario, the assignment to _lower is either detected and taken into account by the type checker, or forbidden. But this decision is taken at compile time and if forbidden, it is flagged as a compile time error. If you exec code that could make this assignment that would be a run-time error (it's also forbidden at run-time) but typically, the Mars lander isn't going to accept input for exec from the Martians -- we could probably flag all uses of exec (and eval() and a few others) as errors unless there's a try/except around them. > Well, this is an example where I think the restriction to > type checking of expressions still applies, but something more is > needed to trigger this check early. 
> The involved namespace object is the string module's __dict__, > which should know that it is referenced by this expression: > > def lower(s): > res = '' > for c in s: > res = res + _lower[ord(c)] > return res > > And by assignment to the name "_lower" in this case, it could > invalidate code object lower's type cache. lower can no more > assure that it will return a string result and will trigger > its interface object to re-check consistency. The latter > will raise an interface_error if the rule doesn't match. > > It remains an open question for me how deeply possible > values should be checkable, i.e. "this arg has to be a list > which is not empty". Minor point, maybe. > > Did I make some sense, or am I off the track? - chris Read my case study. --Guido van Rossum (home page: http://www.python.org/~guido/) From m.faassen@vet.uu.nl Tue Dec 14 19:19:34 1999 From: m.faassen@vet.uu.nl (Martijn Faassen) Date: Tue, 14 Dec 1999 20:19:34 +0100 Subject: [Types-sig] RFC 0.1 References: <38565074.9E51515F@prescod.net> <38568C1A.D1BCF530@vet.uu.nl> <199912141909.OAA24221@eric.cnri.reston.va.us> Message-ID: <38569846.853294E7@vet.uu.nl> Guido van Rossum wrote: > > [Martijn Faassen] > > I agree with this, which is I am advocating a strong split (for > > simplicity) of fully-statically checked code and normal python code. > > You can already do this -- write in Java or C. Good answer, but I'd prefer to write more Pythonic code. If I want to translate my Python module to C, I have to work hard. If I want to translate my Python module to a static Python module, I 'just' need to add type annotations and change some parts that are 'too dynamic'. Most Python code is fairly static. And I didn't intend to *stop* at this, I just think it's valuable 'early' payoff. > > Later on you can work on blurring the interface between the two. First > > *fully* type annotated functions (classes, modules, what you want), > > which can only refer to other things that are fully annotated. By 'fully > > annotated' I mean all names have a type. I keep disagreeing with Paul's > > simplification of initially throwing out constructed types such as list > > of integer, as that would break my own approach at simplicity. :) > > Agreed. List of integer and its friends are important. Also > correspondences (see my example of a sum() function taking a list of > and an additional single . > > > If we throw out the syntax issue and use Python constructs for types > > until we know more, we'll all be happier, right? :) The syntax will be > > clear when the semantics is. Guido is good at syntax, let him figure out > > a good syntax for it, let's just focus on the semantics. > > Thank you. This of course leaves Paul with the question of how to > prototype all this -- he'll have to make *something* up. :-) You're welcome. As to the prototype, you can easily make up something in Python. I have posted an example of this to the list in another post. Regards, Martijn From gstein@lyra.org Tue Dec 14 19:56:50 1999 From: gstein@lyra.org (Greg Stein) Date: Tue, 14 Dec 1999 11:56:50 -0800 (PST) Subject: [Types-sig] Plea for help. In-Reply-To: <199912141659.LAA23638@eric.cnri.reston.va.us> Message-ID: On Tue, 14 Dec 1999, Guido van Rossum wrote: > > Is there currently any path from high level parse trees to bytecodes? > > E.g. is there a way to get sane parse trees to "render" themselves as, > > er, insane parse trees? I don't think so but I'm just checking to avoid > > extra work. 
> > The parser module lets you construct a parse tree and then compile > it. The parse tree must be correct before this is allowed. Check out > the compileast() function on > http://www.python.org/doc/current/lib/Converting_ASTs.html While it is certainly possible to go from a transformer-tree back to an ast-tree and then to compile -- if that's what you want, then why use the transformer at all? :-) As Bill said: you can definitely generate a pyc from a transformer tree. I believe it is bit easier than doing it from AST, too. But it isn't a cake-walk... there are a lot of constructs in there. Hrm. Well... the Python bytecodes certainly map better. It was difficult for us to go to C, but maybe generating bytecodes won't be too hard. If anybody is thinking about doing this, then please talk with Bill and I first. genc.py is not the best model. In a proprietary compiler (e.g. I can't release it yet), we built a *much* better model. There are some things that are similar, but others that really need to change. Cheers, -g -- Greg Stein, http://www.lyra.org/ From paul@prescod.net Tue Dec 14 20:24:40 1999 From: paul@prescod.net (Paul Prescod) Date: Tue, 14 Dec 1999 12:24:40 -0800 Subject: [Types-sig] Re: Static typing considered ... UGLY References: Message-ID: <3856A788.A7435117@prescod.net> Edward Welbourne wrote: > > If I implement a datatype (probably as a class) whose objects > (instances) behave *just the same as* integers in all pythonic respects, > I demand to be able to use it everywhere that I am allowed to use an > integer. If static typing breaks that, it's right out. It won't break it. Number will be an interface with operations like "add", "radd", "sub", "mult" and so forth. If you check against the interface instead of against the type, things just work. Anyhow, the decision of whether to do this in an interface-y way or a hard-coded type way is ALREADY up to the author of a module. There are many places in the standard library where module owners check the types of objects and return TypeError if they don't get the data they expect. It is even more common in the built-in modules. How would changing the syntax from def prepend(self, cmd, kind): if type(cmd) <> type(''): raise TypeError, \ 'Template.prepend: cmd must be a string' To: def prepend( self, cmd: String, kind ): ... make anything worse? And is the latter really "uglier" than the former? Or do you propose to outlaw the former? Does the mere fact that the verbose version is essentially useless to the compiler make it more virtuous? > the value of the expression being that of the given value, but > evaluating it'll raise an exception if the typechecker didn't like the > value. Now that's a much nicer way to go. Of course, this effectively > just amounts to implementing ! as an in-expression assert mechanism ... > and I'm not entirely sure how it helps the compiler-writer - is that why > you insist on the typechecker being a dotted name, not an arbitrary > expression ? Exactly. Dotted names help the compiler writer and the compiler writer helps the programmer by finding mistakes and optimizing code. You scratch my back and I'll scratch yours. > Please Greg/Fred/Sjoerd, can you write a proposal which starts where > http://www.foretec.com/python/workshops/1998-11/greg-type-ideas.html > ends (reading the first 9/10 of that was ... illuminating in hindsight, > but off-putting on the way there). It looks pretty promising ... 
Let me point out again that while that approach is interesting, it doesn't solve the problem I was recruited to solve: a *static* *compile-time* checker. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things to be wary of: A new kid in his prime A man who knows the answers, and code that runs first time http://www.geezjan.org/humor/computers/threes.html From paul@prescod.net Tue Dec 14 20:00:12 1999 From: paul@prescod.net (Paul Prescod) Date: Tue, 14 Dec 1999 12:00:12 -0800 Subject: [Types-sig] RFC 0.1 References: <38565074.9E51515F@prescod.net> <38568C1A.D1BCF530@vet.uu.nl> Message-ID: <3856A1CB.B5470782@prescod.net> Martijn Faassen wrote: > > I agree with this, which is I am advocating a strong split (for > simplicity) of fully-statically checked code and normal python code. I don't see this as buying much simplicity. And I do see it as requiring more work later. I also see it as scaring the bejeesus out of many static type system fence sitters. Can you demonstrate that it makes our life easier to figure out integration issues later? > Later on you can work on blurring the interface between the two. First > *fully* type annotated functions (classes, modules, what you want), > which can only refer to other things that are fully annotated. By 'fully > annotated' I mean all names have a type. I think that's a non-starter because it will take forever to become useful because the standard library is not type-safe. Anyhow I fell like I've *already solved* the problem of integration so why would I undo that? > I keep disagreeing with Paul's > simplification of initially throwing out constructed types such as list > of integer, as that would break my own approach at simplicity. :) If I'm making this problem harder than it needs to be then I'm happy to accept your simple solution for parameterized types as soon as I understand it. > If we throw out the syntax issue and use Python constructs for types > until we know more, we'll all be happier, right? :) The syntax will be > clear when the semantics is. Guido is good at syntax, let him figure out > a good syntax for it, let's just focus on the semantics. Well, we need SOME syntax in order to communicate. Anyhow... > Our static type checker/compiler can use the Python type constructions > directly. We can put limitations on them to forbid any type > constructions that the compiler cannot fully evaluate before the > compilation of the actual code, of course, just like we can put > limitations on statically typed functions (they shouldn't be able to > call any non-static functions in the first iteration of our design, I'm > still maintaining) I see no reason for that limitation. The result of a call to a non-static function is a Pyobject. You cast it in your client code to get type safety. Just like the shift from K&R C to ANSI C. Functions always (okay, often) returned "ints" but you could cast them to foo *'s. > Doesn't C++ already have parameterized types? (template classes and > such?). Yes. I was just pointing out that in a year and a half Java will have them too which will put a lot of pressure on us. 
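As a rough picture of what a "parameterized type you can cast against" might look like if prototyped in ordinary Python, here is a sketch; the ListOf class, the cast() helper and untyped_producer() are all made up for the example and are not anyone's proposal:

    class ListOf:
        # A parameterized "list of T" description, usable at run time.
        def __init__(self, item_type):
            self.item_type = item_type

        def check(self, value):
            if not isinstance(value, list):
                return False
            for item in value:
                if not isinstance(item, self.item_type):
                    return False
            return True

    def cast(value, declared):
        # Client-side "cast": verify once at the boundary, then use the
        # value as if it had been statically typed all along.
        if not declared.check(value):
            raise TypeError("%r does not match the declared type" % (value,))
        return value

    def untyped_producer():
        # stands in for a call into dynamic, unchecked code
        return [1, 2, 3]

    nums = cast(untyped_producer(), ListOf(int))   # ok
    # cast(["a", "b"], ListOf(int))                # would raise TypeError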
-- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things to be wary of: A new kid in his prime A man who knows the answers, and code that runs first time http://www.geezjan.org/humor/computers/threes.html From paul@prescod.net Tue Dec 14 20:24:32 1999 From: paul@prescod.net (Paul Prescod) Date: Tue, 14 Dec 1999 12:24:32 -0800 Subject: [Types-sig] Type inferencing References: <3855DBA1.9384B6AE@prescod.net> <199912141654.LAA23612@eric.cnri.reston.va.us> Message-ID: <3856A780.36B6788D@prescod.net> Guido van Rossum wrote: > > > From: Paul Prescod > > > > Point taken. I am only willing to do type inferencing up to a function > > level. After my "ML Experience" I am not willing to do it globally. > [example snipped] > > I'm disappointed. Jim Hugunin did global analysis on the pystone.py > module -- 250 lines containing 14 functions and one class with two > methods. (He may actually have left out the class, but I'm pretty > sure he did everything else.) He got a 1000x speedup, which I think > should be a pretty good motivator for those interested in (OPT). I think that we may be talking at cross purposes. I am trying to define a formal, independently implementable specification for a type system that Python users will understand and like. Some languages use global type inferencing as a formally specified part of the type checker but my impression is that users do not like the resulting languages. Jim created an implementation of an excellent, intelligent optimizing compiler. His work is as, or more, interesting than mine, but it is a different problem he is trying to solve. (OPT) comes into the picture because my work makes his much, much easier and more effective in many cases. I am totally in favor of particular global type inferencing implementations, but am not in favor of requiring global type inference of every static type checker implementation nor of requiring safety-conscious Python users to think in terms of global type inferencing. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things to be wary of: A new kid in his prime A man who knows the answers, and code that runs first time http://www.geezjan.org/humor/computers/threes.html From paul@prescod.net Tue Dec 14 20:12:55 1999 From: paul@prescod.net (Paul Prescod) Date: Tue, 14 Dec 1999 12:12:55 -0800 Subject: [Types-sig] Re: Inferencing: A case study References: <199912141519.KAA23476@eric.cnri.reston.va.us> <38566B30.36608D4E@prescod.net> <199912141851.NAA24093@eric.cnri.reston.va.us> Message-ID: <3856A4C7.1B6132C8@prescod.net> Guido van Rossum wrote: > > > I spoke to the Journal of Functional Programmers at a conference > > recently. > > Is Journal some kind of military term, maybe between General and > Sergeant? :-) I spoke with an editor of said Journal. :) > I agree. Typically, especially for libraries, there should be type > decls at the module boundaries to avoid endless "exercises for the > reader" as in my case study. (Note that the case study actually > stipulates that the re module has a module declaration, and explains > why.) Good, we are in agreement. I've been thinking: I can allow statically checked references to type-inferenced module variables if we make the module namespace write-only outside of the module. The "trick" is that I need to put a boundary around where I expect writes to take place so I can check that I can figure the complete list of possible values the variable can take. If writes can come from outer space then I need to check every write at runtime. 
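A concrete example of the kind of write "from outer space" that forces the runtime checks (the string module is abused here purely as a demo target):

    # Nothing in today's Python stops one module from rebinding another
    # module's globals, so whatever an inferencer learned from reading
    # that module's own source can be invalidated from outside:
    import string
    string.digits = list(range(10))     # rebinds a module-level name
    setattr(string, "brand_new", 1)     # or invents one that never existed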
So, I can do static type checking on a module variable if: * it is declared only "privately writeable" * or the whole module namespace is "privately writeable" * or it has a type declaration. We can provide access to any combination of these options that we decide. Privately writable is more pythonic than "const" which was my first reaction. Of course the vast, vast majority of module variables are privately writable. And one could argue that ALL of them should be. Module namespace writability is a security nightmare and it is SO easy to move writeable variables to an object: sys.path => sys.runtime.path sys.version => sys.impl.version -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things to be wary of: A new kid in his prime A man who knows the answers, and code that runs first time http://www.geezjan.org/humor/computers/threes.html From tismer@appliedbiometrics.com Tue Dec 14 20:23:03 1999 From: tismer@appliedbiometrics.com (Christian Tismer) Date: Tue, 14 Dec 1999 21:23:03 +0100 Subject: [Types-sig] RFC 0.1 References: <385668F4.2340C4B2@appliedbiometrics.com> <199912141917.OAA24231@eric.cnri.reston.va.us> Message-ID: <3856A727.490600C4@appliedbiometrics.com> Guido van Rossum wrote: > > [Christian Tismer] [about is [42]'s type "list", "list with one element", "list with one even int" ] > Good questions. I'd like to remember all of this information. It can > help with optimization (through constant folding). It can help detect > unreachable code (e.g. your example function always returns a true > value). Etc., etc. Fine, in general. > Note that this can all be folded into a sufficiently rich type system; > a type is nothing more than a (possibly infinite) set of values. Yup, it's just open where to cut. I'd like to do compile time checking, but to refine this at any time during the program execution (sometimes maybe), and this needs some abstraction to keep data limited. > > At the same time, Python is so rich from self-inspection that > > writing a dynamic type inference machine seems practicable, > > so how about not declaring types, but asking your code about its > > type? Wrong wording of mine. I don't want to analyse bytecode, but perhaps use AST info at some time. The initial compile time AST is general but currently doesn't try deduction. It could do so. But it could build derived AST's at runtime which know much more. Well I'm still after the JIT idea, so just drop it, I think this thread is for static types, which are a good thing! > I suppose you could do symbolic execution on the bytecode, but I don't > think this is a very fruitful path. (Of course if anyone can prove > I'm wrong, it's you. :-) Will not try again soon, I'm tired. Proving you slightly not right (wrong is too much) costs me half a year of work, finally a little adjustment to truth helped. Changing truth is the easier way :-) [interfaces and "dynamic" type inference] > > Eh? Type inference is supposed to be a compile-time thing. You > present your whole Python program to the typechecker and ask it "where > could this crash if I sent it on rocket to Mars?" I understand. I always think of importing which is already execution of something, and then I miss the need to do it before. Hmm, isn't it AST inspection, and after code is run, you get a new AST instance which is richer? ... > And that's exactly the problem. I want to be able to be told whether > the cache might be broken *before* I launch my rocket to Mars. I see. The Houston traceback. 
You need to close the cache, and also foresee that some module might want to break it and report a syntax error *before*, which sounds hard. A frozen module is a module which has proven its interface and is protected against changes of necessary conditions. That's indeed more than mine. [more numb stuff of mine] > Read my case study. Did that. Great. I think I would use Greg's modified AST and do the analysis there. An interpreter which runs these is also not that hard and keeps more info than bytecodes. AFAIK this is Skaller's approach for Viper. You often said that you want to analyse code form the source, instead of importing/executing stuff and use inspection. I understand that. But analysis needs some simulation as well, to see the effects of running some code. This simulation needs an environment which can track effects of assignments, imports and so on. Instead of re-inventing much stuff, why not use Python inside of a restricted environment? A "virtual Python", run in a real one, could execute steps, undo them, use other control flow paths, record types(==sets of possible values), all as long as there are no permanent side effects to the outside. But the latter are to be avoided in either case, whatever which approach you use. Finally I think this is just another view of the same thing. cheers - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From gstein@lyra.org Tue Dec 14 20:28:34 1999 From: gstein@lyra.org (Greg Stein) Date: Tue, 14 Dec 1999 12:28:34 -0800 (PST) Subject: [Types-sig] Re: Static typing considered ... UGLY In-Reply-To: Message-ID: On Tue, 14 Dec 1999, Edward Welbourne wrote: >... > On the other hand, if you want an object whose attributes are of > pre-decided kinds, or a namespace in which certain names are reserved > for certain values, use a setattr hack (or, if you're feeling very > brave, some variant on the wrapper defined by > URL: http://www.chaos.org.uk/~eddy/dev/toy/class.py). setattr hacks work great for class instances. Try to apply them to module namespaces... :-) This is a pretty old request to Guido: provide a setattr hook for modules. >... > ... I've now read the Greg/Fred/Sjoerd attack and I like that: let ! > be a new binary operator with grammar > > anyvalue ! typechecker > > the value of the expression being that of the given value, but Ah. Right. I didn't make that explicit, but yes. > evaluating it'll raise an exception if the typechecker didn't like the > value. Now that's a much nicer way to go. Of course, this effectively Yes, and I think so, too :-) > just amounts to implementing ! as an in-expression assert mechanism ... Yup. The proposal doesn't even introduce new bytecodes... it could use the same pattern as the assert statement, allowing the Python VM to optimize it during a -O invocation. > and I'm not entirely sure how it helps the compiler-writer - is that why > you insist on the typechecker being a dotted name, not an arbitrary > expression ? Correct. The compiler is going to have a hard enough time with dotted names, let alone arbitrary expressions. The type assertions help the compiler because the compiler can then make assumptions on how to *use* that value. For example: if x!Int: ... 
If you compile this, then you know that you can do a simple integer test, rather than check for an instance and possibly calling __nonzero__. While no biggy for compiling to the Python VM, this is a *huge* win if you're compiling to something like C or the JVM. In the statement: a = 5!String The compiler now knows that will contain a string and can optimize the uses of as appropriate. > Type-checking applies to values ;^) :-) I believe one of the differences is how a person views "type-safety". I don't regard " must only contain integers" as an interesting requirement. "the second param of foo(a,b) must be an integer" is interesting, and asserting specific return types is interesting. Problems with types almost *always* occur at boundaries (function arguments and return values). Type problems just don't occur within a single function (Guido's CP4E system might disagree, tho :-). As a result, I think restricting (variable) names is not nearly as interesting as asserting that your func args/returns are "correct." > Of course, obstreperous as I am, I immediately want to meddle with the > scheme: specifically, though the *default* behaviour might be (in > effect) > > if not isinstance(value, typechecker): raise TypeError > else: yield value > > I'd argue for the semantics to say: evaluate the expressions `value' and > `typechecker', look for a __check__ method on the latter: if present > invoke it on the value, else use isinstance as above; on false return > (no problem) the !-expression yields the given `value', otherwise > TypeError with parameter the true value returned. Then we can implement > weird and devious __check__ methods for fiddly type-checks (instead of > needing to change isinstance in the way proposed - which would conflict > with my pet tweak to that, which allows isinstance(value, thistype, > thattype, othertype) for when I've got several types I'll accept). I think altering isinstance() to accept a callable is preferable to introducing a __check__ method. A callable implies that you can use builtin types to implement type-checkers. You could still use the __check__ concept with builtins, but you would need to add new slots to the type structures (which is, IMO, to be avoided). There is no problem with saying that isinstance() can take more than two parameters, where 2..n can be a type, a class, or a callable. [ and note: it should be apparent that you check for a class before a callable :-) ] > Please Greg/Fred/Sjoerd, can you write a proposal which starts where > http://www.foretec.com/python/workshops/1998-11/greg-type-ideas.html > ends (reading the first 9/10 of that was ... illuminating in hindsight, > but off-putting on the way there). It looks pretty promising ... I'm peripherally interested here. Not enough to go writing :-). I've got about three other projects on my "over the next couple months" plate. An email here or there... sure, I'll do. An "emphatic discussion"... sure. That page was definitely just a series of notes. I think somebody could easily distill a one-page proposal from it. Please feel free! Cheers, -g -- Greg Stein, http://www.lyra.org/ From guido@CNRI.Reston.VA.US Tue Dec 14 20:35:50 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Tue, 14 Dec 1999 15:35:50 -0500 Subject: [Types-sig] Type inferencing In-Reply-To: Your message of "Tue, 14 Dec 1999 12:24:32 PST." 
<3856A780.36B6788D@prescod.net> References: <3855DBA1.9384B6AE@prescod.net> <199912141654.LAA23612@eric.cnri.reston.va.us> <3856A780.36B6788D@prescod.net> Message-ID: <199912142035.PAA24440@eric.cnri.reston.va.us> > I think that we may be talking at cross purposes. I am trying to define > a formal, independently implementable specification for a type system > that Python users will understand and like. Some languages use global > type inferencing as a formally specified part of the type checker but my > impression is that users do not like the resulting languages. OK, you may be right. Although I think that with Python as a starting point we'd end up with something sufficiently different from ML that the jury is still out on whether users will like it or not. > Jim created an implementation of an excellent, intelligent optimizing > compiler. His work is as, or more, interesting than mine, but it is a > different problem he is trying to solve. (OPT) comes into the picture > because my work makes his much, much easier and more effective in many > cases. I am totally in favor of particular global type inferencing > implementations, but am not in favor of requiring global type inference > of every static type checker implementation nor of requiring > safety-conscious Python users to think in terms of global type > inferencing. OK, I see and agree. I think that I would like to make *some* form of type inference (maybe only within the function body) part of the formal specs. Note that in a limited way, inference is already part of Python (and sometimes deplored -- because the diagnostics stink): if you write "a = 1" anywhere in a function body, then a is a local variable everywhere in that function (unless there's a "global a" as well). Now, please make some progress with a design... --Guido van Rossum (home page: http://www.python.org/~guido/) From gstein@lyra.org Tue Dec 14 20:42:25 1999 From: gstein@lyra.org (Greg Stein) Date: Tue, 14 Dec 1999 12:42:25 -0800 (PST) Subject: [Types-sig] expression-based type assertions (was: Static typing considered ...UGLY) In-Reply-To: <3856A788.A7435117@prescod.net> Message-ID: On Tue, 14 Dec 1999, Paul Prescod wrote: >... > > Please Greg/Fred/Sjoerd, can you write a proposal which starts where > > http://www.foretec.com/python/workshops/1998-11/greg-type-ideas.html > > ends (reading the first 9/10 of that was ... illuminating in hindsight, > > but off-putting on the way there). It looks pretty promising ... > > Let me point out again that while that approach is interesting, it > doesn't solve the problem I was recruited to solve: a *static* > *compile-time* checker. Yes, it does :-) As I mentioned in my note to Eddy just now, the compiler can use the assertions to determine an expression's type (assuming it isn't already available through inference). The type can then be used in the checks. Specifically, the "GFS proposal" would lead to the following types of compile-time checks: * is the type correct for each parameter of a function call? * is the type correct for the function return value(s)? * will a type assertion (the '!' operator) possibly fail? And to reiterate a point from my last note: I believe checks associated with shoving a value into a name are not as interesting, as 99% of the errors will occur at code boundaries (function calls), which are handled by the above mechanism. In fact, I would even say that the only type declarations used would be associated with function params and returns (and not variable). 
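None of the proposed syntax exists yet, but the run-time half of it is easy to play with today. Here is a throwaway emulation in which an expect() function stands in for the proposed '!' operator; the name and behaviour are invented for this sketch:

    def expect(value, typ):
        # Stand-in for the proposed "value ! Type" assertion: return the
        # value unchanged if it has the expected type, else complain.
        if not isinstance(value, typ):
            raise TypeError("expected %s, got %r" % (typ.__name__, value))
        return value

    def parse_port(text):
        # roughly "text ! Str" and "int(text) ! Int" in the proposed notation
        text = expect(text, str)
        return expect(int(text), int)

    port = parse_port("8080")    # fine
    # parse_port(8080)           # would fail at the function boundary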
If you are implementing a function and want to ensure that a result has a proper type, then the '!' operator can be used (shoving it into a typed variable isn't going to help you!). In both cases, expression- and name-based type assertions, I think you require type inferencing. So I don't think the problem is simplified by virtue of using name-based assertions. All you really get is a compile-time assertion at assignment time, which is also provided by expression-based typing. In other words: Int a a = foo() vs. a = foo() ! Int In both cases, the compiler will throw a fit if it knows foo() always returns a String. Cheers, -g -- Greg Stein, http://www.lyra.org/ From guido@CNRI.Reston.VA.US Tue Dec 14 20:42:37 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Tue, 14 Dec 1999 15:42:37 -0500 Subject: [Types-sig] Re: Inferencing: A case study In-Reply-To: Your message of "Tue, 14 Dec 1999 12:12:55 PST." <3856A4C7.1B6132C8@prescod.net> References: <199912141519.KAA23476@eric.cnri.reston.va.us> <38566B30.36608D4E@prescod.net> <199912141851.NAA24093@eric.cnri.reston.va.us> <3856A4C7.1B6132C8@prescod.net> Message-ID: <199912142042.PAA24452@eric.cnri.reston.va.us> [Paul] > I've been thinking: I can allow statically checked references to > type-inferenced module variables if we make the module namespace > write-only outside of the module. The "trick" is that I need to put a > boundary around where I expect writes to take place so I can check that > I can figure the complete list of possible values the variable can take. > If writes can come from outer space then I need to check every write at > runtime. Good -- I've been thinking the same thing. Here's what I think would be needed: 1. <module>.<attr> = <value> is simply forbidden (this is setattr for module objects) 2. Somehow we restrict use of <module>.__dict__, globals(), locals(), and vars(). 3. Somehow we restrict exec, eval(), and execfile() when these can touch a module's globals. So the only way a module-level variable can be set will be through assignments in its body (this includes classes and functions contained in its body); such assignments are easily traceable for the typechecker. > So, I can do static type checking on a module variable if: > > * it is declared only "privately writeable" > * or the whole module namespace is "privately writeable" > * or it has a type declaration. > > We can provide access to any combination of these options that we > decide. > > Privately writable is more pythonic than "const" which was my first > reaction. Of course the vast, vast majority of module variables are > privately writable. And one could argue that ALL of them should be. > Module namespace writability is a security nightmare and it is SO easy > to move writeable variables to an object: > > sys.path => sys.runtime.path > sys.version => sys.impl.version Actually, there's never a need to assign to sys.version, and as for sys.path, maybe you can't assign a different object to it, but you can still change its value because a list is mutable. --Guido van Rossum (home page: http://www.python.org/~guido/) From gstein@lyra.org Tue Dec 14 20:55:31 1999 From: gstein@lyra.org (Greg Stein) Date: Tue, 14 Dec 1999 12:55:31 -0800 (PST) Subject: [Types-sig] Type inferencing In-Reply-To: <199912142035.PAA24440@eric.cnri.reston.va.us> Message-ID: On Tue, 14 Dec 1999, Guido van Rossum wrote: > Paul Prescod wrote: >... > > cases.
I am totally in favor of particular global type inferencing > > implementations, but am not in favor of requiring global type inference > > of every static type checker implementation nor of requiring > > safety-conscious Python users to think in terms of global type > > inferencing. > > OK, I see and agree. > > I think that I would like to make *some* form of type inference (maybe > only within the function body) part of the formal specs. Note that in > a limited way, inference is already part of Python (and sometimes I believe that you will always have type inferencing occurring. Maybe I'm just referring to a degenerate case, but you do need inferencing just to deal with: Int a a = foo() + bar() i.e. inference says "foo-result-type + bar-result-type => Int", so the assignment is safe. > Now, please make some progress with a design... I've got a partial one for you :-) * add declarations to "def" statements * add a type-assertion operator (for discussion, this has been '!') * use type inference to check func args and returns, and to (pre)check type-assertion operators Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Tue Dec 14 20:59:39 1999 From: gstein@lyra.org (Greg Stein) Date: Tue, 14 Dec 1999 12:59:39 -0800 (PST) Subject: [Types-sig] Pascal style declarations In-Reply-To: Message-ID: You don't provide a way to declare function return value(s) types. When you do, then I think you're going to run into a problem using the ':' syntactical marker... This was one reason that Fred/Sjoerd/myself moved away from ':'-based declarations, and eventually fell into expression-based type checking. Cheers, -g On Tue, 14 Dec 1999, Golden, Howard wrote: > Since Guido hasn't had a coronary in response to my earlier suggestion, I > will be more specific: > > 1. I propose _optional_ typing, using the Pascal syntax (since this seems > to me to be the most "Pythonic" (Isn't that like giving a snake an enema? > Sorry.). Actually, I don't care about the specific syntax, just as long as > there is one. > > 2. Specifically, you can declare a variable using the syntax: > > var x : int, y : string, ... > > 3. In functions and methods, you can _optionally_ specify the argument > type: > > def funx(x : int, y : string): ... > > 4. If you use these, then you are making binding assertions about the types > of the names, and these assertions can be checked at compile or run time. > > 5. The parser could be made to strip out these declarations, and ignore > them, in which case they would have no effect. > > 6. The parser should be modified so you can tell it (using a compile-time > switch or pragma) to require declarations. > > 7. It appears to me that this would not change existing code, except if it > uses the name "var". > > 8. I think there should be a parameterized type mechanism. I don't much > like the angle bracket notation of C++, but I guess it's well established, > so it'll do. > > In my opinion, this doesn't "muck up" the language (since you don't have to > use it). > > --- > > Howard B. Golden > Software developer > Litton Industries, Inc. 
> Woodland Hills, California > > > _______________________________________________ > Types-SIG mailing list > Types-SIG@python.org > http://www.python.org/mailman/listinfo/types-sig > -- Greg Stein, http://www.lyra.org/ From jeremy@cnri.reston.va.us Tue Dec 14 21:02:38 1999 From: jeremy@cnri.reston.va.us (Jeremy Hylton) Date: Tue, 14 Dec 1999 16:02:38 -0500 (EST) Subject: [Types-sig] expression-based type assertions (was: Static typing considered ...UGLY) In-Reply-To: References: <3856A788.A7435117@prescod.net> Message-ID: <14422.45166.820239.289239@goon.cnri.reston.va.us> >>>>> "GS" == Greg Stein writes: GS> On Tue, 14 Dec 1999, Paul Prescod wrote: >> ... > Please Greg/Fred/Sjoerd, can you write a proposal which >> starts where > >> http://www.foretec.com/python/workshops/1998-11/greg-type-ideas.html >> > ends (reading the first 9/10 of that was ... illuminating in >> hindsight, > but off-putting on the way there). It looks pretty >> promising ... >> >> Let me point out again that while that approach is interesting, >> it doesn't solve the problem I was recruited to solve: a *static* >> *compile-time* checker. [I was starting a response to Paul when Greg's mail arrived, so I merged the responses. Hoping to maximize confusion.] Perhaps we need a charter revision. We need to formally define a type system for Python. It may or may not be statically checkable -- that's just the way type systems work, e.g. Java does array bounds checks at runtime because it can't at compile time. The fact that array bounds are checked at runtime doesn't mean that Java's type system forbids referencing past the end of an array; it just can't be statically checked (or at least no one has figured out a practical way to check it). The point of this digression is to argue that saying you only do "compile-time" checks is a bit of a cop out. GS> Yes, it does :-) I agree. GS> As I mentioned in my note to Eddy just now, the compiler can use GS> the assertions to determine an expression's type (assuming it GS> isn't already available through inference). The type can then be GS> used in the checks. GS> Specifically, the "GFS proposal" would lead to the following GS> types of compile-time checks: GS> * is the type correct for each parameter of a function call? GS> * is the type correct for the function return value(s)? GS> * will a type assertion (the '!' operator) possibly fail? These sound like exactly the right place to start! GS> And to reiterate a point from my last note: I believe checks GS> associated with shoving a value into a name are not as GS> interesting, as 99% of the errors will occur at code boundaries GS> (function calls), which are handled by the above mechanism. GS> In fact, I would even say that the only type declarations used GS> would be associated with function params and returns (and not GS> variable). If you are implementing a function and want to ensure GS> that a result has a proper type, then the '!' operator can be GS> used (shoving it into a typed variable isn't going to help GS> you!). I think I agree with you as far as local variables. It becomes quite interesting when you're talking about attributes of objects, e.g. what is the type of the closed attribute of a builtin file object. (For that matter, what is the type of the builtin open function and how does it differ from a function that returns a StringIO object?)
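For instances of ordinary classes you can at least experiment with attribute typing today; a small, purely illustrative sketch (the __attr_types__ spelling is made up, and nothing like it exists for builtin objects):

    class TypedAttrs:
        # Mix-in that checks attribute assignments against a per-class table.
        __attr_types__ = {}

        def __setattr__(self, name, value):
            expected = self.__attr_types__.get(name)
            if expected is not None and not isinstance(value, expected):
                raise TypeError("%s.%s must be %s, not %r"
                                % (self.__class__.__name__, name,
                                   expected.__name__, value))
            self.__dict__[name] = value

    class FileLike(TypedAttrs):
        __attr_types__ = {"closed": int, "name": str}

    f = FileLike()
    f.closed = 0          # fine
    # f.closed = "yes"    # would raise TypeError

The builtin file object is exactly the hard case: there is nowhere to hang such a table on it, which is where interfaces would have to come in.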
Jeremy From gstein@lyra.org Tue Dec 14 21:33:44 1999 From: gstein@lyra.org (Greg Stein) Date: Tue, 14 Dec 1999 13:33:44 -0800 (PST) Subject: [Types-sig] expression-based type assertions In-Reply-To: <14422.45166.820239.289239@goon.cnri.reston.va.us> Message-ID: On Tue, 14 Dec 1999, Jeremy Hylton wrote: >... > Perhaps we need a charter revision. We need to formally define a type > system for Python. It may or may not be statically checkable -- > that's just the way type systems work, e.g. Java does array bounds In deference to Paul, I think it must be statically checkable. However, your point about "completely checkable" is quite valid! We can have "as much as possible" but not necessarily "completely." Due to complex type issues, it may not ever be possible to be complete. The question really becomes "how close?" (note that expression-based type assertions allow a person to make assertions on sub-components of a complex/composite type while it is being used; this gives expr-based the capability to fill in where name-based falls down because of a lack of syntactic expressability) >... > to check it). The point of this digression is to argue that saying > you only do "compile-time" checks is a bit of a cop out. Damn... :-) Personally, I in the (OPT) camp, rather than (ERR) camp, so I don't care about static checks. But: the GFS proposal still supports it. >... > GS> And to reiterate a point from my last note: I believe checks > GS> associated with shoving a value into a name are not as > GS> interesting, as 99% of the errors will occur at code boundaries > GS> (function calls), which are handled by the above mechanism. > > GS> In fact, I would even say that the only type declarations used > GS> would be associated with function params and returns (and not > GS> variable). If you are implementing a function and want to ensure > GS> that a result has a proper type, then the '!' operator can be > GS> used (shoving it into a typed variable isn't going to help > GS> you!). > > I think I agree with you as far as local variables. It becomes quite > interesting when you're talking about attributes of objects, e.g. what > is the type of the closed attribute of a builtin file object. (For > that matter, what is the type of the builtin open function and how > does it differ from a function that returns a StringIO object?) Ah! Good point. I think this is where interfaces come in. Otherwise, it becomes very difficult to syntactically specify the types of attributes. Note that many of the problems with type decls for builtin types would probably be solved with interfaces, too. Until interfaces arrive, Martijn's proposal of using structures to specify an interface is probably best. A class or type can have an associated structure to specify attribute type information (functions still use syntactical declarators). Cheers, -g -- Greg Stein, http://www.lyra.org/ From GoldenH@littoncorp.com Tue Dec 14 21:45:53 1999 From: GoldenH@littoncorp.com (Golden, Howard) Date: Tue, 14 Dec 1999 13:45:53 -0800 Subject: [Types-sig] Re: Pascal style declarations Message-ID: Greg Stein [mailto:gstein@lyra.org] wrote: > You don't provide a way to declare function return value(s) > types. When > you do, then I think you're going to run into a problem using the ':' > syntactical marker... [refers to:] > > 3. In functions and methods, you can _optionally_ specify > the argument > > type: > > > > def funx(x : int, y : string): ... > > I'll admit that Python already uses the ":" character where Pascal does, but so what? 
You can still specify the return type in other ways. The most obvious (to me) is to use the ":" character twice, e.g., def funx(x : int, y : string): int : ... While I'm not a parsing expert, I believe this would still be parsable. Of course, any other available character could be used instead of the ":", if this would be preferable. (Again, I'm not trying to dictate the final syntax, just suggest a starting point.) > This was one reason that Fred/Sjoerd/myself moved away from ':'-based > declarations, and eventually fell into expression-based type checking. I am suggesting using declarations, rather than expression-based type checking, since that is familiar in other languages. As a declaration, it is clear that I am talking about an invariant assertion, not a dynamic one. Expression-based type checking should also be available, since it is needed when static checking is impossible. I don't think it has to be either/or. From Tony Lownds Tue Dec 14 21:46:03 1999 From: Tony Lownds (Tony Lownds) Date: Tue, 14 Dec 1999 13:46:03 -0800 (PST) Subject: [Types-sig] Pascal style declarations In-Reply-To: Message-ID: Hi, Visual Basic uses "as" to declare types of parameters, and Object Pascal uses "as" as a dynamic cast operator, so consider "as" instead of ! I'll use that below just to try it on for size. My main point is, I think there should be a seperate operator for declaring return types. If I read your proposal right, then def logfn(s as String, *args) as String: ... declares that log is a reference to a function taking a sting and a bunch of unspecified types, returning a string. How would you check that an object is a function with the same signature? # programmer would have to think associativity here log = logfn as (String, *Object) as String That syntax doesnt seem to be easily grokkable. Now if you had another operator that declared return values, say ->, then the statement above is clearer and you could also make a typedef for a function and apply it in the def statement. def logfn(s as String, *args) -> String: ... log = logfn as (String, *Object) -> String -or- log_function = (String, *Object) -> String def logfn(s, *args) as log_function: ... log = logfn as log_function Tim H. also mentioned using -> but he suggested replacing ! with ->, I am suggesting that we'd want a seperate operator for declaring return types. -Tony Lownds On Tue, 14 Dec 1999, Greg Stein wrote: > You don't provide a way to declare function return value(s) types. When > you do, then I think you're going to run into a problem using the ':' > syntactical marker... > > This was one reason that Fred/Sjoerd/myself moved away from ':'-based > declarations, and eventually fell into expression-based type checking. > > Cheers, > -g > > > On Tue, 14 Dec 1999, Golden, Howard wrote: > > > Since Guido hasn't had a coronary in response to my earlier suggestion, I > > will be more specific: > > > > 1. I propose _optional_ typing, using the Pascal syntax (since this seems > > to me to be the most "Pythonic" (Isn't that like giving a snake an enema? > > Sorry.). Actually, I don't care about the specific syntax, just as long as > > there is one. > > > > 2. Specifically, you can declare a variable using the syntax: > > > > var x : int, y : string, ... > > > > 3. In functions and methods, you can _optionally_ specify the argument > > type: > > > > def funx(x : int, y : string): ... > > > > 4. 
If you use these, then you are making binding assertions about the types > > of the names, and these assertions can be checked at compile or run time. > > > > 5. The parser could be made to strip out these declarations, and ignore > > them, in which case they would have no effect. > > > > 6. The parser should be modified so you can tell it (using a compile-time > > switch or pragma) to require declarations. > > > > 7. It appears to me that this would not change existing code, except if it > > uses the name "var". > > > > 8. I think there should be a parameterized type mechanism. I don't much > > like the angle bracket notation of C++, but I guess it's well established, > > so it'll do. > > > > In my opinion, this doesn't "muck up" the language (since you don't have to > > use it). > > > > --- > > > > Howard B. Golden > > Software developer > > Litton Industries, Inc. > > Woodland Hills, California > > > > > > _______________________________________________ > > Types-SIG mailing list > > Types-SIG@python.org > > http://www.python.org/mailman/listinfo/types-sig > > > > -- > Greg Stein, http://www.lyra.org/ > > > _______________________________________________ > Types-SIG mailing list > Types-SIG@python.org > http://www.python.org/mailman/listinfo/types-sig > From gstein@lyra.org Tue Dec 14 22:17:29 1999 From: gstein@lyra.org (Greg Stein) Date: Tue, 14 Dec 1999 14:17:29 -0800 (PST) Subject: [Types-sig] Re: Pascal style declarations In-Reply-To: Message-ID: On Tue, 14 Dec 1999, Golden, Howard wrote: >... > I'll admit that Python already uses the ":" character where Pascal does, but > so what? You can still specify the return type in other ways. The most > obvious (to me) is to use the ":" character twice, e.g., > > def funx(x : int, y : string): int : ... > > While I'm not a parsing expert, I believe this would still be parsable. Of It isn't "easily" parsable :-) "int" is a valid expression, which is valid on the same line after a function definition. For example: def funx(x, y): foo() ; return 5 The parser wouldn't know whether the expression is part of the function body or a return type declaration until hitting the ':'. That would require an arbitrary look-ahead or some funkiness in the grammar. > course, any other available character could be used instead of the ":", if > this would be preferable. (Again, I'm not trying to dictate the final > syntax, just suggest a starting point.) Yes, another character would be used. But which? What construct looks Pythonic? I don't disagree with the basic notion here... just that it is tough to retain Python's clean feel. While we didn't necessarily like the '!' choice for the operator, we felt that the basic concept imposed very little change on Python's clean feel. > > This was one reason that Fred/Sjoerd/myself moved away from ':'-based > > declarations, and eventually fell into expression-based type checking. > > I am suggesting using declarations, rather than expression-based type > checking, since that is familiar in other languages. As a declaration, it > is clear that I am talking about an invariant assertion, not a dynamic one. As I've mentioned in my other email, expression-based checkin also defines an invariant. The compiler can make assumptions based on type declarators in a function or when it sees a type-assert operator. > Expression-based type checking should also be available, since it is needed > when static checking is impossible. I don't think it has to be either/or. 
We already have expr-based (the "assert" statement) -- we can assert types on expressions anywhere. It is just a little less convenient since we must place the expression value into a temporary variable, assert the type of that, then continue with the expression. The "type-assert operator" simplifies this process dramatically. Cheers, -g -- Greg Stein, http://www.lyra.org/ From tismer@appliedbiometrics.com Tue Dec 14 22:53:33 1999 From: tismer@appliedbiometrics.com (Christian Tismer) Date: Tue, 14 Dec 1999 23:53:33 +0100 Subject: [Types-sig] Re: Pascal style declarations References: Message-ID: <3856CA6D.C84785ED@appliedbiometrics.com> Greg Stein wrote: > > On Tue, 14 Dec 1999, Golden, Howard wrote: [snap] > We already have expr-based (the "assert" statement) -- we can assert types > on expressions anywhere. It is just a little less convenient since we must > place the expression value into a temporary variable, assert the type of > that, then continue with the expression. The "type-assert operator" > simplifies this process dramatically. Why not use "assert" instead of "as" as an operator? def f(x): a = x assert int #... stuff return str(x) + g(x) assert string # assert binding low precedence -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From gstein@lyra.org Tue Dec 14 22:59:49 1999 From: gstein@lyra.org (Greg Stein) Date: Tue, 14 Dec 1999 14:59:49 -0800 (PST) Subject: [Types-sig] Re: Pascal style declarations In-Reply-To: <3856CA6D.C84785ED@appliedbiometrics.com> Message-ID: A bit wordy, but that might work! On Tue, 14 Dec 1999, Christian Tismer wrote: > Greg Stein wrote: >... > > We already have expr-based (the "assert" statement) -- we can assert types > > on expressions anywhere. It is just a little less convenient since we must > > place the expression value into a temporary variable, assert the type of > > that, then continue with the expression. The "type-assert operator" > > simplifies this process dramatically. > > Why not use "assert" instead of "as" as an operator? > > def f(x): > a = x assert int > #... stuff > return str(x) + g(x) assert string # assert binding low precedence -- Greg Stein, http://www.lyra.org/ From GoldenH@littoncorp.com Tue Dec 14 23:05:06 1999 From: GoldenH@littoncorp.com (Golden, Howard) Date: Tue, 14 Dec 1999 15:05:06 -0800 Subject: [Types-sig] Re: Pascal style declarations Message-ID: Christian Tismer [mailto:tismer@appliedbiometrics.com] wrote: > def f(x): > a = x assert int > #... stuff > return str(x) + g(x) assert string # assert binding low precedence I'm still trying to get a _declaration_ into the signature, e.g., using your assert: def f(x assert int) assert string : a = x #... stuff return str(x) + g(x) In other words, "assert" is a synonym for Pascal's ":"! :-) From tismer@appliedbiometrics.com Tue Dec 14 23:30:27 1999 From: tismer@appliedbiometrics.com (Christian Tismer) Date: Wed, 15 Dec 1999 00:30:27 +0100 Subject: [Types-sig] Re: Pascal style declarations References: Message-ID: <3856D313.CFE21380@appliedbiometrics.com> > I'm still trying to get a _declaration_ into the signature, e.g., using your > assert: > > def f(x assert int) assert string : > a = x > #... stuff > return str(x) + g(x) > > In other words, "assert" is a synonym for Pascal's ":"! 
:-) Sure, while not mentioning, it was obvious to do this since I proposed a textual replacement for "as" :-) assert is also a synonym for VB's "as" but don't tell 'em :-= -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From paul@prescod.net Tue Dec 14 20:33:56 1999 From: paul@prescod.net (Paul Prescod) Date: Tue, 14 Dec 1999 12:33:56 -0800 Subject: [Types-sig] List of FOO References: <3855DBBB.6D1B462A@prescod.net> <199912141649.LAA23593@eric.cnri.reston.va.us> <38569478.40E29421@vet.uu.nl> Message-ID: <3856A9B4.7636C0C9@prescod.net> Martijn Faassen wrote: > > I agree completely, and one *can* express most of this pretty easily in > current Python, i.e.: > > types = { > "bar": IntType, > "baz": ListType(IntType), > "hey": IntType, > "foo3": FunctionType(args=(IntType,), result=IntType), > > "crazy" : ListType(FunctionType(args=(ListType(IntType), > StringType), result=DictType(StringType, > > FunctionType(args=None, > result=StringType))) > } Questions: 1. This system is supposed to be extensible, right? So I could, for instance, define a binary tree module and have "binary trees of ints" and "binary trees of strings." How do I define the binary tree class and state that it is parameterizable? 2. How does this work with interfaces? "ListType" is cheating. We need SequenceType because that's not implementation specific. And SequenceType needs to be defined by an interface, not a class. 3. What does "tuple of int, string" look like? And should we have list length parameters? -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things to be wary of: A new kid in his prime A man who knows the answers, and code that runs first time http://www.geezjan.org/humor/computers/threes.html From paul@prescod.net Tue Dec 14 20:45:28 1999 From: paul@prescod.net (Paul Prescod) Date: Tue, 14 Dec 1999 12:45:28 -0800 Subject: [Types-sig] Shadow File Opinions? References: <385691D3.6DC4A36E@vet.uu.nl> Message-ID: <3856AC68.2D5FCF37@prescod.net> Martijn Faassen wrote: > > ... > While my agenda is to kill the syntax discussions for the moment, I'd > propose a seperate declaration syntax before all others, because this is > the most syntactically compatible with Python. And easier on the > programmer. I'm considering your argument carefully. If we make separate interface files then we get Python 1.5 (hell, Python 1.0) compatibility "for free" and we can experiment with different syntaxes without breaking Python code. Plus we could use IDL and type libraries for type analysis *already*. I think the final product must allow inline declarations but I am starting to think that in the short term, "interface definition" files are the way to go not just for builtin modules but for all modules. Do others agree? > Imagine you have a module. Now you want to make it fully statically > typed. With most syntax proposals I've seen you'd have to go through the > code and add type declarations here and there, mix it with the current > code. I think that any proposal that requires you to keep two separate files "in sync" is bound to fail in the long term. I left that crap behind in C++. But in the short term...okay. 
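For the sake of discussion, here is roughly what such a sibling "interface definition" file could look like if it were itself plain Python. Everything below is made up for illustration -- the file name, the dictionary layout, and the use of bare type objects are assumptions, not an agreed format:

    # foo_types.py -- a hand-written shadow interface for a hypothetical foo.py.
    IntType, StringType, ListType = type(0), type(""), type([])

    interface = {
        "parse":   {"args": (StringType,), "returns": ListType},
        "count":   {"args": (ListType, IntType), "returns": IntType},
        "VERSION": StringType,
    }

A checker, or even a throwaway unit test, can then import both files and complain when they drift apart, which is exactly the keeping-in-sync cost being weighed here.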
> With either a Python based system as I'm proposing (ugly but powerful > and fairly simple), or a seperate type declaration system, you have your > type declarations separated from the code itself. This means you easily > add and remove type information and switch between a statically typed > module and a dynamically typed module easily. But there is not going to be alot of "switching". You add declarations and you leave them there. You update them when they get out of sync with the code. Why would you want to take a nice, safe, optimized module that you have gone to the effort of type annotating and hide the annotations? -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things to be wary of: A new kid in his prime A man who knows the answers, and code that runs first time http://www.geezjan.org/humor/computers/threes.html From paul@prescod.net Tue Dec 14 20:47:48 1999 From: paul@prescod.net (Paul Prescod) Date: Tue, 14 Dec 1999 12:47:48 -0800 Subject: [Types-sig] Compile-time or runtime checks? References: <385691D3.6DC4A36E@vet.uu.nl> Message-ID: <3856ACF4.E19023D8@prescod.net> Martijn Faassen wrote: > > On a slightly seperate issue, I propose a classification of modules > according to type annotation (or functions or classes, whatever level > you prefer thinking about): I'm trying hard to separate the axes of: "I have some type declarations" and "I want a static type checker to gurantee that this code is totally type safe." This should be legal: StringType def foo(): a=eval( sys.argv[1] ) return a That means I want a runtime check. This should be illegal: type-safe StringType def foo(): a=eval( sys.argv[1] ) return a Here I've specifically asked for a compile time check and my code is not up to snuff. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things to be wary of: A new kid in his prime A man who knows the answers, and code that runs first time http://www.geezjan.org/humor/computers/threes.html From paul@prescod.net Wed Dec 15 01:16:19 1999 From: paul@prescod.net (Paul Prescod) Date: Tue, 14 Dec 1999 17:16:19 -0800 Subject: [Types-sig] Type inferencing References: Message-ID: <3856EBE3.74D8FD2F@prescod.net> Greg Stein wrote: > > I believe that you will always have type inferencing occurring. Maybe I'm > just referring to a degenerate case, but you do need inferencing just to > deal with: > > Int a > a = foo() + bar() That's absolutely true. I agree with everyone else that argument values must be type checked and that most of the rest can be inferred. > * add declarations to "def" statements Agreed. > * add a type-assertion operator (for discussion, this has been '!') Prefer function call syntax. Or maybe Java/C++ (cast) syntax. > * use type inference to check func args and returns, and to (pre)check > type-assertion operators Agreed. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things to be wary of: A new kid in his prime A man who knows the answers, and code that runs first time http://www.geezjan.org/humor/computers/threes.html From guido@CNRI.Reston.VA.US Wed Dec 15 03:03:17 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Tue, 14 Dec 1999 22:03:17 -0500 Subject: [Types-sig] Compile-time or runtime checks? In-Reply-To: Your message of "Tue, 14 Dec 1999 12:47:48 PST." 
<3856ACF4.E19023D8@prescod.net> References: <385691D3.6DC4A36E@vet.uu.nl> <3856ACF4.E19023D8@prescod.net> Message-ID: <199912150303.WAA00737@eric.cnri.reston.va.us> > I'm trying hard to separate the axes of: "I have some type declarations" > and "I want a static type checker to gurantee that this code is totally > type safe." This should be legal: > > StringType > def foo(): > a=eval( sys.argv[1] ) > return a > > That means I want a runtime check. This should be illegal: > > type-safe > StringType > def foo(): > a=eval( sys.argv[1] ) > return a > > Here I've specifically asked for a compile time check and my code is not > up to snuff. I would strongly advise to focus on the type-safe axis. Run-time checks can already be implemented using various assert statements. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@CNRI.Reston.VA.US Wed Dec 15 03:05:47 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Tue, 14 Dec 1999 22:05:47 -0500 Subject: [Types-sig] Shadow File Opinions? In-Reply-To: Your message of "Tue, 14 Dec 1999 12:45:28 PST." <3856AC68.2D5FCF37@prescod.net> References: <385691D3.6DC4A36E@vet.uu.nl> <3856AC68.2D5FCF37@prescod.net> Message-ID: <199912150305.WAA00748@eric.cnri.reston.va.us> > I think the final product must allow inline declarations but I am > starting to think that in the short term, "interface definition" files > are the way to go not just for builtin modules but for all modules. > > Do others agree? Yes on both counts. I think this has been suggested long ago (maybe by Jack Jansen?). It never went anywhere, probably because the whole idea never went anywhere. Note that there's one case where separate interface files may be the end solution: when the source itself is in another language. This has been discussed already. Note that the doc-sig is also considering that for documenting C extensions. And of course Java does this for native methods (both for docs and for typedecls!). --Guido van Rossum (home page: http://www.python.org/~guido/) From janssen@parc.xerox.com Wed Dec 15 03:27:14 1999 From: janssen@parc.xerox.com (Bill Janssen) Date: Tue, 14 Dec 1999 19:27:14 PST Subject: [Types-sig] Shadow File Opinions? In-Reply-To: Your message of "Tue, 14 Dec 1999 12:45:28 PST." <3856AC68.2D5FCF37@prescod.net> Message-ID: <99Dec14.192724pst."3587"@watson.parc.xerox.com> > I'm considering your argument carefully. If we make separate interface > files then we get Python 1.5 (hell, Python 1.0) compatibility "for free" > and we can experiment with different syntaxes without breaking Python > code. Plus we could use IDL and type libraries for type analysis > *already*. > > I think the final product must allow inline declarations but I am > starting to think that in the short term, "interface definition" files > are the way to go not just for builtin modules but for all modules. > > Do others agree? Hey, I agreed with this five years ago! The tricky part is type-checking your use of that module without type declarations in the usage-side code. But yes, the standard process is: 1) Add separate interface files, containing declarations of the interface exported from a module file. This is documentation even if used for no other purpose. 2) Add a type inferencer that checks code using a module against the interface for that module. Provided you don't kill yourself writing the type inferencer (which almost happened here attempting the type inferencing system for SchemeXerox :-), you can now make some limited type checking available. 
3) Move the type declaration syntax you developed for step 1 into the language proper. The parser is initially rigged to ignore it (and maybe it always will). 4) Now the type inferencer/checker is re-written to take advantage of the real type annotations in the usage-side code. 5) (Optional) Do away with the separate interfaces developed in step 1 and move the type declarations into the implementation of the module. Bill From paul@prescod.net Wed Dec 15 02:03:24 1999 From: paul@prescod.net (Paul Prescod) Date: Tue, 14 Dec 1999 18:03:24 -0800 Subject: [Types-sig] Re: Pascal style declarations References: Message-ID: <3856F6EC.4FD5860E@prescod.net> Greg Stein wrote: > > "int" is a valid expression, which is valid on the same line after a > function definition. For example: > > def funx(x, y): foo() ; return 5 Well, first, I don't think that we are going to allow functions as return type specifications. Use assert for runtime assertions. Second, Python needs to use look-ahead to tell the difference between parentheses used for parsing a tuple and used for bracketing, doesn't it? > We already have expr-based (the "assert" statement) -- we can assert types > on expressions anywhere. It is just a little less convenient since we must > place the expression value into a temporary variable, assert the type of > that, then continue with the expression. The "type-assert operator" > simplifies this process dramatically. Sure, but why not just use function call syntax? Or maybe Java/C++ (cast) syntax? > Jeremy wrote: > > I think I agree with you as far as local variables. It becomes quite > > interesting when you're talking about attributes of objects, e.g. what > > is the type of the closed attribute of a builtin file object. (For > > that matter, what is the type of the builtin open function and how > > does it differ from a function that returns a StringIO object?) > Greg Stein wrote: > Ah! Good point. I think this is where interfaces come in. Otherwise, it > becomes very difficult to syntactically specify the types of attributes. When we specify the types of attributes, we will be talking about those attributes by name, not by expression or value. So we need a syntax for specifying types of names *and* expressions. If we use function syntax for expressions casts then we can reduced the syntactic overload. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things to be wary of: A new kid in his prime A man who knows the answers, and code that runs first time http://www.geezjan.org/humor/computers/threes.html From paul@prescod.net Wed Dec 15 02:39:55 1999 From: paul@prescod.net (Paul Prescod) Date: Tue, 14 Dec 1999 18:39:55 -0800 Subject: [Types-sig] Re: expression-based type assertions (was: Static typing considered...UGLY) References: Message-ID: <3856FF7B.3AA32F18@prescod.net> Greg Stein wrote: > > ... > In fact, I would even say that the only type declarations used would be > associated with function params and returns (and not variable). How do we handle attribute values? We can't just say "interfaces" unless we agree that interfaces allow type declarations to be associated with instance variables. And if we start associating type declarations with attribute names as we do parameter names, why wouldn't we also allow that for local and global variables? 
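One low-tech way to picture "types of attributes by name", without any new syntax at all, is a per-class table consulted at assignment time. This is only an illustration of the idea; the __attr_types__ name and the runtime check are inventions of this sketch, not a proposal:

    class Point:
        __attr_types__ = {"x": int, "y": int, "label": str}

        def __setattr__(self, name, value):
            expected = self.__attr_types__.get(name)
            if expected is not None and not isinstance(value, expected):
                raise TypeError("%s must be %s" % (name, expected.__name__))
            self.__dict__[name] = value

    p = Point()
    p.x = 3                       # fine, x is declared int
    try:
        p.label = 42              # rejected, label is declared str
    except TypeError:
        print("caught the bad assignment")

A compile-time checker would read the same table statically instead of paying for the check on every assignment.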
-- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things to be wary of: A new kid in his prime A man who knows the answers, and code that runs first time http://www.geezjan.org/humor/computers/threes.html From paul@prescod.net Wed Dec 15 02:41:56 1999 From: paul@prescod.net (Paul Prescod) Date: Tue, 14 Dec 1999 18:41:56 -0800 Subject: [Types-sig] Shadow File Opinions? References: <99Dec14.192724pst."3587"@watson.parc.xerox.com> Message-ID: <3856FFF4.1C1A4AD@prescod.net> Okay, shadow files seem to be a hit. Bill, while you're here, could you help me out with the CORBA IDL POV on generic types? Does IDL support parameterization? > 2) Add a type inferencer that checks code using a module against the > interface for that module. Provided you don't kill yourself writing > the type inferencer (which almost happened here attempting the type > inferencing system for SchemeXerox :-), you can now make some limited > type checking available. This is the part that scares the hell out of me! -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things to be wary of: A new kid in his prime A man who knows the answers, and code that runs first time http://www.geezjan.org/humor/computers/threes.html From peter.sommerfeld@gmx.de Wed Dec 15 06:36:10 1999 From: peter.sommerfeld@gmx.de (Peter Sommerfeld) Date: Wed, 15 Dec 1999 06:36:10 +0000 Subject: [Types-sig] Re: Pascal style declarations Message-ID: <199912150536.AAA25023@python.org> Paul Prescod wrote: > Well, first, I don't think that we are going to allow functions as > return type specifications. Use assert for runtime assertions. I don't see a reason for this limitation. It would seriously restrict future introduction of closures into python. def format(string s) -> def(string); -- Peter From tim_one@email.msn.com Wed Dec 15 09:08:44 1999 From: tim_one@email.msn.com (Tim Peters) Date: Wed, 15 Dec 1999 04:08:44 -0500 Subject: [Types-sig] List of FOO In-Reply-To: <3855DBBB.6D1B462A@prescod.net> Message-ID: <000d01bf46db$fdeccd00$05a0143f@tim> [Paul Prescod] > It took two years to get the parameterized version of the Java type > system up and running. Ya, but they took this stuff seriously . > Let me ask this your opinion on this question (seriously, not > sarcastically), should we include a spelling for "list of > string" [""] > and not "callable taking list of callables taking strings returning > integers returning string" ["" -> 0] -> "" > and what about "callable taking list of callables taking > and R returning list of callables taking and returning ." The last "returning " is ambiguous. You may mean: [(T, R) -> [R -> None]] -> T or [(T, R) -> [R -> T]] -> None > You see my problem? I don't. The convolution comes not from the concepts but from the attempt to express them in English. If the formalism introduced above is too concise, there are a gazillion other ways to spell it; e.g., List of String Func(List of Func(String)->Int)->String Func(List of Func(T, R)->List of Func(R))->T Func(List of Func(T, R)->List of Func(R)->T) > I could special case "list of" as Java and C did if we agreed to > take our chances that my syntax would be extensible. Ack, no -- start with a general scheme, so special cases aren't necessary. Although it's *pleasant* if Python's builtin types get especially nice syntax. BTW, the concise form above is much like what Haskell uses. 
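Purely to show that the concise notation maps onto plain data, here is one made-up encoding of the third signature above; the Func and ListOf helpers are inventions of this sketch, not anybody's proposed spelling:

    class Func:
        def __init__(self, args, result):
            self.args, self.result = args, result

    class ListOf:
        def __init__(self, item):
            self.item = item

    # Func(List of Func(String)->Int)->String
    sig = Func(args=(ListOf(Func(args=(str,), result=int)),), result=str)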
panic-is-always-premature-ly y'rs - tim From tim_one@email.msn.com Wed Dec 15 09:08:50 1999 From: tim_one@email.msn.com (Tim Peters) Date: Wed, 15 Dec 1999 04:08:50 -0500 Subject: [Types-sig] RFC 0.1 In-Reply-To: <3855E950.AE0E3E19@prescod.net> Message-ID: <000e01bf46dc$0050ada0$05a0143f@tim> [Paul Prescod] > ... > In theory, but in practice "whole-program X" seems to never get > implemented (in Python or elsewhere!), as in "whole program type checks" > and "whole program optimization" and "whole program flow analysis." > "Whole program analysis" tends to be an excuse to put off work (roughly > like "type inference"). Whole-program type inference is the *norm* in the functional language world -- although they design the languages to make this provably possible in all cases(possible != easy -- it's not). Experienced f.p. programmers nevertheless explicitly name all their types and explictly declare all their vrbls of non-trivial types; else the unification algorithms that deduce most-general types yield incomprehensible error msgs; e.g., if you have a function that you *think* of as taking a list of ints, you don't know what the compiler is talking about if you forget to declare it as such and the type inferencer bitches about being unable to unify two type expressions that take five lines each to spell <0.5 wink>. The more general the language, the more benefit there is for *people* to be able to declare types, in their role as code readers. So in addition to Guido's OPT and ERR, add COM -- for "make this mess COMprehensible" . > ... > If we invent new, syntactically distinct spellings then we can > syntactically recognize them and complain if they aren't spelled > "exactly right" (i.e. in a statically analyzable way). [Guido] >> As long as an easy mechanical transformation to valid Python >> 1.5.x is available, I'd be happy. [PP] > ... > With all due respect my problem is that you took the obvious (or at > least traditional) instance variable declaration syntax and used it > as a class variable declaring syntax. Okay, let's try this: > > class foo: > types.IntType, a=5 > > def __init__( self ): > types.ListType, self.b > > That looks equally ugly to me. Got any other ideas? Don't try to overload existing syntax, either asserts or (as above) tuple syntax. That confuses both the overloader and the overloadee. Guido just *begged* us to suck up a new keyword! For lack of a better word, say type declarations are in "decl" stmts. It doesn't matter to me, but what does matter is that once you get your own statement, you can also define the syntax of that statement; e.g., class foo: decl a: int # slop in a const too, if you like a = 5 def __init__(self): decl member b: List of Any # or put that at class level -- where it belongs Resist the dubious temptation to conflate declaration with initialization, and "an easy mechanical transformation to valid Python 1.5.x" consists of commenting out the decl stmts! Heck, call the keyword "#\s+decl\s+" and it's a nop. > ... > if we are ever going to get to full polymorphic parametric static > type checking we will have to acknowledge that the type system will > have hard parts just as the language has hard parts. Indeed it will. 
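The "mechanical transformation" really is mechanical; a throwaway filter along these lines would do it, assuming (an assumption of this sketch, not part of any proposal) that decl only ever starts a logical line:

    import re

    _DECL = re.compile(r"^(\s*)decl\b")

    def strip_decls(source):
        # Comment out any line whose first token is "decl"; leave the rest alone.
        out = []
        for line in source.split("\n"):
            m = _DECL.match(line)
            if m:
                out.append(m.group(1) + "# " + line[len(m.group(1)):])
            else:
                out.append(line)
        return "\n".join(out)

    print(strip_decls("class foo:\n    decl a: int\n    a = 5"))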
but-in-a-pythonically-soft-way-ly y'rs - tim From m.faassen@vet.uu.nl Wed Dec 15 09:32:07 1999 From: m.faassen@vet.uu.nl (Martijn Faassen) Date: Wed, 15 Dec 1999 10:32:07 +0100 Subject: [Types-sig] RFC 0.1 References: <38565074.9E51515F@prescod.net> <38568C1A.D1BCF530@vet.uu.nl> <3856A1CB.B5470782@prescod.net> Message-ID: <38576017.75252FF7@vet.uu.nl> Paul Prescod wrote: > > Martijn Faassen wrote: > > > > I agree with this, which is I am advocating a strong split (for > > simplicity) of fully-statically checked code and normal python code. > > I don't see this as buying much simplicity. And I do see it as requiring > more work later. I also see it as scaring the bejeesus out of many > static type system fence sitters. Can you demonstrate that it makes our > life easier to figure out integration issues later? Sure, but we're bound to scare the bejeesus out of everyone anyway; we're proposing a major change to Python. The 'simplicity' part comes in because you don't need *any* type inferencing. Conceptually it's quite simple; all names need a type. > > Later on you can work on blurring the interface between the two. First > > *fully* type annotated functions (classes, modules, what you want), > > which can only refer to other things that are fully annotated. By 'fully > > annotated' I mean all names have a type. > > I think that's a non-starter because it will take forever to become > useful because the standard library is not type-safe. Anyhow I fell like > I've *already solved* the problem of integration so why would I undo > that? Okay, I will need to figure out your solution then. :) > > I keep disagreeing with Paul's > > simplification of initially throwing out constructed types such as list > > of integer, as that would break my own approach at simplicity. :) > > If I'm making this problem harder than it needs to be then I'm happy to > accept your simple solution for parameterized types as soon as I > understand it. I'll try to clean up my swallow.py demo module. It doesn't demonstrate much, just a way a type system could work using Python dicts and such to construct complicated types. > > If we throw out the syntax issue and use Python constructs for types > > until we know more, we'll all be happier, right? :) The syntax will be > > clear when the semantics is. Guido is good at syntax, let him figure out > > a good syntax for it, let's just focus on the semantics. > > Well, we need SOME syntax in order to communicate. Anyhow... Right, but just Python code will do for communication. It's clear as we all understand it already. It looks horrible, but we can work on that later. > > Our static type checker/compiler can use the Python type constructions > > directly. We can put limitations on them to forbid any type > > constructions that the compiler cannot fully evaluate before the > > compilation of the actual code, of course, just like we can put > > limitations on statically typed functions (they shouldn't be able to > > call any non-static functions in the first iteration of our design, I'm > > still maintaining) > > I see no reason for that limitation. The result of a call to a > non-static function is a Pyobject. You cast it in your client code to > get type safety. Just like the shift from K&R C to ANSI C. Functions > always (okay, often) returned "ints" but you could cast them to foo *'s. Sure, that's why I say it's easy to start blurring things later. 
This would require runtime manipulation of bytecodes or something to insert a type cast or assertion, while a fully annotated module can be fully checked statically and thus this type of runtime manipulation can be delayed until later. > > Doesn't C++ already have parameterized types? (template classes and > > such?). > > Yes. I was just pointing out that in a year and a half Java will have > them too which will put a lot of pressure on us. We already have parameterized types that are fully dynamic in Python now, don't we, really? :) Regards, Martijn From m.faassen@vet.uu.nl Wed Dec 15 09:39:34 1999 From: m.faassen@vet.uu.nl (Martijn Faassen) Date: Wed, 15 Dec 1999 10:39:34 +0100 Subject: [Types-sig] Shadow File Opinions? References: <385691D3.6DC4A36E@vet.uu.nl> <3856AC68.2D5FCF37@prescod.net> Message-ID: <385761D6.BD8B8E5F@vet.uu.nl> Paul Prescod wrote: > > Martijn Faassen wrote: > > > > ... > > While my agenda is to kill the syntax discussions for the moment, I'd > > propose a seperate declaration syntax before all others, because this is > > the most syntactically compatible with Python. And easier on the > > programmer. > > I'm considering your argument carefully. If we make separate interface > files then we get Python 1.5 (hell, Python 1.0) compatibility "for free" > and we can experiment with different syntaxes without breaking Python > code. Plus we could use IDL and type libraries for type analysis > *already*. > > I think the final product must allow inline declarations but I am > starting to think that in the short term, "interface definition" files > are the way to go not just for builtin modules but for all modules. > > Do others agree? I agree. I'm not sure I'm others, though. :) > > Imagine you have a module. Now you want to make it fully statically > > typed. With most syntax proposals I've seen you'd have to go through the > > code and add type declarations here and there, mix it with the current > > code. > > I think that any proposal that requires you to keep two separate files > "in sync" is bound to fail in the long term. I left that crap behind in > C++. But in the short term...okay. Right - in the longer term we'll have a nice syntax, but it's too soon for syntax right now. > > With either a Python based system as I'm proposing (ugly but powerful > > and fairly simple), or a seperate type declaration system, you have your > > type declarations separated from the code itself. This means you easily > > add and remove type information and switch between a statically typed > > module and a dynamically typed module easily. > > But there is not going to be alot of "switching". You add declarations > and you leave them there. You update them when they get out of sync with > the code. Why would you want to take a nice, safe, optimized module that > you have gone to the effort of type annotating and hide the annotations? Hiding the annotations may be useful (on the short term, at least). You can use existing the Python interpreter to test your module even if you have added type annotations. That's nice for development/debugging, including the development and debugging of the type annotation system. You can say 'hey, Python does this to my code when I only pass strings in, but our static type checker/compiler/asserter barfs at it'. If you have Python code already sprinkled with annotations you need two source files for the same module, one with annotations, one without. You can automate this but it's not as nice. 
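Since the annotations in this scheme are ordinary Python data, today's interpreter really does ignore them, and a first "checker" can be nothing more than a loop over the table. A toy version, covering only the degenerate case of simple value types (the __types__ name and the isinstance test are just illustrative):

    def check_module(mod):
        # Compare each annotated name against the object actually bound to it.
        problems = []
        for name, expected in getattr(mod, "__types__", {}).items():
            if not hasattr(mod, name):
                problems.append("%s is declared but not defined" % name)
            elif not isinstance(getattr(mod, name), expected):
                problems.append("%s is not a %s" % (name, expected.__name__))
        return problems

    import string                       # any module will do for the demo
    string.__types__ = {"digits": str}  # pretend the module carried this table
    print(check_module(string))         # -> []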
Regards, Martijn From m.faassen@vet.uu.nl Wed Dec 15 09:53:42 1999 From: m.faassen@vet.uu.nl (Martijn Faassen) Date: Wed, 15 Dec 1999 10:53:42 +0100 Subject: [Types-sig] List of FOO References: <3855DBBB.6D1B462A@prescod.net> <199912141649.LAA23593@eric.cnri.reston.va.us> <38569478.40E29421@vet.uu.nl> <3856A9B4.7636C0C9@prescod.net> Message-ID: <38576526.78AD224C@vet.uu.nl> Paul Prescod wrote: > > Martijn Faassen wrote: > > > > I agree completely, and one *can* express most of this pretty easily in > > current Python, i.e.: > > > > types = { > > "bar": IntType, > > "baz": ListType(IntType), > > "hey": IntType, > > "foo3": FunctionType(args=(IntType,), result=IntType), > > > > "crazy" : ListType(FunctionType(args=(ListType(IntType), > > StringType), result=DictType(StringType, > > > > FunctionType(args=None, > > result=StringType))) > > } > > Questions: > > 1. This system is supposed to be extensible, right? So I could, for > instance, define a binary tree module and have "binary trees of ints" > and "binary trees of strings." How do I define the binary tree class and > state that it is parameterizable? Good question; so far I only thought about making built in types (such as list) parameterizable. One could however do something similar with classes, though: __typedefs__ = { "parameterized_class" : ParameterizedClassTypeDef(parameters=('foo',), members = { "alpha" : 'foo', "beta" : IntType } ) } __types__ = { "integer_class" : ParameterizedClassType('parameterized_class', parameters = { "foo" : IntegerType }) } Something like that, at least. I know it looks absolutely horrible, but it's workable. :) > 2. How does this work with interfaces? "ListType" is cheating. We need > SequenceType because that's not implementation specific. And > SequenceType needs to be defined by an interface, not a class. I just basically took the standard module types and replaced them with parameterizable classes, but you could come up with SequenceType if you like. I'm often in quite an OPT frame of mind. But even outside that, ListType does say something about the interface. A TupleType parameter cannot be changed inside the function, but a ListType parameter can. That's a huge difference for the interface. > 3. What does "tuple of int, string" look like? And should we have list > length parameters? I haven't fully worked this out yet, but you can fill in details yourself. :) HeterogenousTupleType(elementtypes = (IntType, StringType)) I don't know if we should have list length parameters. Regards, Martijn From tim_one@email.msn.com Wed Dec 15 10:08:54 1999 From: tim_one@email.msn.com (Tim Peters) Date: Wed, 15 Dec 1999 05:08:54 -0500 Subject: [Types-sig] A case study In-Reply-To: <199912141519.KAA23476@eric.cnri.reston.va.us> Message-ID: <000f01bf46e4$650557c0$05a0143f@tim> [Guido] > ... > What do we now know about the type of the list variable? It was > initialized to an empty List. It's still a List, and we know > that at least some of its items are lists. Are all its items > lists? This gets us into similar issues as the recursive call > to find() before, and just as there, I'm not sure that we really > do, so maybe we need to continue the single type hypothesis. (One > way out would be to assume the single type hypothesis until we > see positive proof to the contrary, and if so, redo the analysis > with a less restricted type.) These are all std problems in dataflow analysis. 
Conceptually: you have a program graph (rooted and directed), where nodes are basic blocks (single entry, single exit), and arcs represent control flow. Associate (still conceptually) with every node a table mapping every name to the set of types it may have upon entry to the block, and another table doing likewise for block exit. Initialize all these to empty sets (that is, replace your "single type" hypothesis with the "no type" hypothesis!). Traverse the graph. Each block has certain effects on its exit type mappings. These need to propogate to the block's successors. At each block entry, the set of types a name maps to is just the union of the set of types the name maps to at the exits of all predecessor blocks. The root of a function's graph is a slightly special case, in that the arglist acts like a predecessor block for this purpose. You continue propagating changes until you reach a steady state. Meaning that, for each node, the entry map equals the union of the predecessors' exit maps, and the exit map is consistent with the entry map as modified by the bindings in the block. The hard parts are changing this from intuitive conception to efficient implementation (global dataflow analysis can consume enormous amounts of memory -- all blocks * all names * all functions * all modules == a whole lot), and in crafting the type system so that you know a priori that you *must* reach a steady state (e.g., it's probably not a good idea to say that lists whose length is a prime number constitute "a type" <0.31 wink>). Freebie: if, at the end of this, there exists a block and a local name such that the first occurrence of the name within the block is a reference, and the name is still associated with the empty set in the block's entry map, you've got an UnboundLocalError waiting to happen (provided the block is reachable). Semi-freebie: If the first occurrence of a local name within a block is a reference, and at least one of the block's predecessors associates this name with the empty set in its exit map, you've got something very close to a violation of Java's "definite assignment" rules. That is, there is *a* path in which this name may not be bound before reference; although you cannot, in general, prove that it's *possible* for that path to occur at runtime. Java gives a fatal error anyway, and after the first hour I came to like that. So your intuition is on the right track here. What I can add as a former Professional Compiler Writer is my Professional Assurance that making this all run efficiently (in either time or space) is a Professional Pain in the Professional Ass. Because of this, global analysis never works out in practice unless you invent an efficient database format to cache the results of analysis, keeping that in synch with the source base under mutation. It's all too easy to come up with a toy system that absolutely will not scale to real life! Python has an advantage, though, in that most people write very small functions and methods most of the time. If you can, in addition, avoiding needing to deduce the types of most globals, it could actually fly before we're all dead . 
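As a toy illustration of that propagation, and nothing more -- the graph shape, the per-block effects and the type names below are all invented for the example -- the union-until-steady-state loop fits in a few lines:

    def propagate(blocks, preds, effects):
        # entries/exits map each name to the set of type names it may have there.
        entries = {b: {} for b in blocks}
        exits = {b: {} for b in blocks}
        changed = True
        while changed:                          # iterate until a steady state
            changed = False
            for b in blocks:
                new_entry = {}
                for p in preds[b]:              # union of the predecessors' exits
                    for name, ts in exits[p].items():
                        new_entry.setdefault(name, set()).update(ts)
                new_exit = {n: set(ts) for n, ts in new_entry.items()}
                new_exit.update(effects[b])     # names re-bound in the block win
                if new_entry != entries[b] or new_exit != exits[b]:
                    entries[b], exits[b] = new_entry, new_exit
                    changed = True
        return entries, exits

    # The result-dict-then-list shape, squeezed into three invented blocks:
    blocks = ["init", "loop", "tail"]
    preds = {"init": [], "loop": ["init", "loop"], "tail": ["loop"]}
    effects = {"init": {"result": {"dict"}},
               "loop": {"candidate": {"int"}},
               "tail": {"result": {"list"}}}
    entries, exits = propagate(blocks, preds, effects)
    print(exits["tail"])    # result ends up a list here; candidate may be an int

A real implementation has all the efficiency headaches described above; the point of the sketch is only that the fixed-point idea itself is small.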
but-civilization-ends-in-a-few-weeks-anyway-ly y'rs - tim From tim_one@email.msn.com Wed Dec 15 10:32:20 1999 From: tim_one@email.msn.com (Tim Peters) Date: Wed, 15 Dec 1999 05:32:20 -0500 Subject: [Types-sig] RFC 0.1 In-Reply-To: <199912141537.KAA23487@eric.cnri.reston.va.us> Message-ID: <001001bf46e7$aafc14a0$05a0143f@tim> [Greg Stein] > Note that one benefit of associating types with names, is that > you can shortcut the data flow analysis (so the analysis is not > necessarily the same). But: you cannot have a name refer to > different types of objects (which I don't like; it reduces some > of Python's polymorphic and dynamic behavior (interfaces solve > the polymorphism stuff in a typed world)). [Guido] > This is a bogus argument. From the point of view of human > readability, I find this: > > s = "the quick brown fox" > s = string.split(s) > del s[1] # the fox is getting old > s = string.join(s) > > less readable and more confusing than this: > > s = "the quick brown fox" > w = string.split(s) > del w[1] # the fox is getting old > s = string.join(w) > > The first version gives polymorphism a bad name; it's like a sloppy > physicist using the same symbol for velocity and accelleration. It's an excellent example, but to me the *first* is easier to follow! In the 2nd I'm left wondering what further use will be made of w, so have to try to keep w *and* s alive in my short-term memory. In the 1st, I can scrub my brain cleaner harder oftener. Heck, I wrote this just last week -- and deliberately: result = {} for i in xrange(k): # The expected # of times thru the next loop is n/(n-i). # Since i < k <= n/2, n-i > n/2, so n/(n-i) < 2 and is # usually closer to 1: on average, this succeeds very # quickly! while 1: candidate = int(random() * n) if not result.has_key(candidate): result[candidate] = 1 break result = result.keys() result.sort() return result At the start of its life, the result is a (conceptual) set, and at the end it's a list with the same stuff. That's not confusing -- it's helpful! It wouldn't confuse a decent type-inference engine, either ("result" is a dict until the block starting with the .keys() call, and is a list thereafter; it's not even a "union type" -- at any given point, it's always one or the other). > ... > Note that a type inferencer may not be able to deduce the rules I > stated above, since you could construct an example where there is no > single type and yet the whole thing works. E.g. I could > create a list [1, 2, 3, joint, "a", "b", "c"] where joint is an > instance of a class that when added to an int returns a string. Now *that's* what gives polymorphism a bad name <0.9 wink>. > However if we had a typesystem and notation that couldn't express > this easily but that could express the stricter rules, I bet that > no-one would mind adding the stricter type declarations to the > code, since those rules most likely express the programmer's intent > better. I agree there's little payback in making a type system that can represent everything possible, simply because 99% of the benefit is in capturing vanilla types (which certainly includes lists of X and dicts mapping X to Y and functions taking lists of X returning dicts mapping Y to lists of Z ...). 
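For what it's worth, the "joint" object is easy to conjure up, which is exactly why it gives an inferencer fits. A throwaway version, with the class and its string results made up to match the description:

    class Joint:
        def __radd__(self, other):          # handles 3 + joint
            return "%d and counting" % other
        def __add__(self, other):           # handles joint + "a"
            return "joint" + other

    joint = Joint()
    seq = [1, 2, 3, joint, "a", "b", "c"]
    total = seq[0]
    for item in seq[1:]:
        total = total + item                # int+int, int+Joint, then str+str ...
    print(total)                            # one string at the end, no errors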
shocked-at-what-some-people-find-unreadable-ly y'rs - tim From gstein@lyra.org Wed Dec 15 10:42:25 1999 From: gstein@lyra.org (Greg Stein) Date: Wed, 15 Dec 1999 02:42:25 -0800 (PST) Subject: [Types-sig] Re: Pascal style declarations In-Reply-To: <3856F6EC.4FD5860E@prescod.net> Message-ID: On Tue, 14 Dec 1999, Paul Prescod wrote: > Greg Stein wrote: > > > > "int" is a valid expression, which is valid on the same line after a > > function definition. For example: > > > > def funx(x, y): foo() ; return 5 > > Well, first, I don't think that we are going to allow functions as > return type specifications. Use assert for runtime assertions. I was pointing out that the above code is currently-valid, "real" code. It is a counter-example to the notion that you can use the "def foo(): type:" syntax. I've already stated in the past that type declarators should only be dotted names. > Second, Python needs to use look-ahead to tell the difference between > parentheses used for parsing a tuple and used for bracketing, doesn't > it? A very different problem. In both cases, you can have an expression following that open parentheses. When you see a comma or a right-parent, then you know what to do with the whole thing. In the function definition, you don't know whether the thing following the colon is a type-declarator, or an expression (where an expression is a valid type of statement, which is part of a suite). In other words, when the parser starts consuming stuff, it doesn't know whether it is consuming a "suite" or a "typedecl". Therefore, you must create a pseudo grammar element which means "it is one of these two, but I don't know YET." At some point you figure it out and transition to the right part of the grammar. This is what I meant when I said it would be possible, but not pretty. I might even venture to say that Guido just plain wouldn't allow this kind of thing in the Python grammar! :-) > > We already have expr-based (the "assert" statement) -- we can assert types > > on expressions anywhere. It is just a little less convenient since we must > > place the expression value into a temporary variable, assert the type of > > that, then continue with the expression. The "type-assert operator" > > simplifies this process dramatically. > > Sure, but why not just use function call syntax? Or maybe Java/C++ > (cast) syntax? Grammar construction issues. The cast would be difficult -- again the issue of determining "(" typedecl ")" vs. "(" expr ")" (presuming that a typedecl cannot be an arbitrary expression. Until you know what the parse element is (typedecl vs expr), you cannot apply the appropriate restrictions (e.g. typedecl is only a dotted name or some other new typedecl syntax that gets invented). C/C++/Java can tell because a name has an associated name-type (e.g. typedef, variable); once the first symbol inside that "(" is seen, it can figure out which parsing form is occurring. A function call syntax could be possible, but again: is the function part (before the open paren) an expression or a typedecl? If it looks just like a function call, then how do you know it is a type assertion? For example: class Foo: ... x = Foo(y) Is that an assertion that y is of type Foo, or is it a constructor? > > Jeremy wrote: > > > I think I agree with you as far as local variables. It becomes quite > > > interesting when you're talking about attributes of objects, e.g. what > > > is the type of the closed attribute of a builtin file object. 
(For > > > that matter, what is the type of the builtin open function and how > > > does it differ from a function that returns a StringIO object?) > > > Greg Stein wrote: > > Ah! Good point. I think this is where interfaces come in. Otherwise, it > > becomes very difficult to syntactically specify the types of attributes. > > When we specify the types of attributes, we will be talking about those > attributes by name, not by expression or value. So we need a syntax for > specifying types of names *and* expressions. If we use function syntax > for expressions casts then we can reduced the syntactic overload. As mentioned above: you cannot function syntax (somebody educate me if you believe otherwise). All right. I'll modify my statement: * new syntax to specify param and return types * new syntax to specify attribute types [as part of an interface defn?] * type assert operator Note the specific lack of syntax for specifying *variable* types. We aren't typing names, just interfaces (and yes, they happen to have names, but it is truly a different concept right there). In other words, I don't agree with your statement about typing names and expressions. I say we provide: * types for function params, return values * types for attributes [via interfaces rather than syntax?] * a type assertion operator * compile-time (and runtime) checks for the above usages [ and in a case where all your (called) functions have type decls (e.g. os.listdir()), then your code probably doesn't need any assertions since inferencing is enough; Guido's case study shows that you can infer *everything*, but it would be a lot easier once you have inference boundaries established at the function/method calls. ] Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Wed Dec 15 10:47:24 1999 From: gstein@lyra.org (Greg Stein) Date: Wed, 15 Dec 1999 02:47:24 -0800 (PST) Subject: [Types-sig] Re: expression-based type assertions In-Reply-To: <3856FF7B.3AA32F18@prescod.net> Message-ID: On Tue, 14 Dec 1999, Paul Prescod wrote: > Greg Stein wrote: > > > > ... > > In fact, I would even say that the only type declarations used would be > > associated with function params and returns (and not variable). > > How do we handle attribute values? We can't just say "interfaces" unless > we agree that interfaces allow type declarations to be associated with > instance variables. And if we start associating type declarations with > attribute names as we do parameter names, why wouldn't we also allow > that for local and global variables? This was covered elsewhere, but for completeness... We handle attribute values thru interfaces, which associate typedecls with attributes. (and yes, an instance variable is an attribute) I do not see a logical extension of that framework that states you should also provide typedecls for variables (local/global). Specifying the type of an attribute is a very different matter from specifying the type of a global. As I've stated: I think specifying the type of a local/global is needless syntactic sugar, which Python (thankfully) has a minimum of. Note that modules and classes each have an interface (to establish type info). 
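A back-of-the-envelope picture of "interfaces associate typedecls with attributes", just to make the shape of the idea visible; the Interface class, its attrs table and the conforms() check are inventions of this sketch, not the design under discussion:

    class Interface:
        def __init__(self, **attrs):        # attribute name -> expected type
            self.attrs = attrs
        def conforms(self, obj):
            for name, expected in self.attrs.items():
                if not isinstance(getattr(obj, name, None), expected):
                    return False
            return True

    IPoint = Interface(x=int, y=int)

    class Point:
        def __init__(self):
            self.x, self.y = 0, 0

    print(IPoint.conforms(Point()))         # true: the attribute types line up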
Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Wed Dec 15 10:50:44 1999 From: gstein@lyra.org (Greg Stein) Date: Wed, 15 Dec 1999 02:50:44 -0800 (PST) Subject: [Types-sig] Re: Pascal style declarations In-Reply-To: <199912150536.AAA25023@python.org> Message-ID: On Wed, 15 Dec 1999, Peter Sommerfeld wrote: > Paul Prescod wrote: > > Well, first, I don't think that we are going to allow functions as > > return type specifications. Use assert for runtime assertions. > > I don't see a reason for this limitation. It would seriously > restrict future introduction of closures into python. > > def format(string s) -> def(string); I think he was saying that you can't use a runtime-computed type declaration. That is different than saying you can't define functional types. In other words: nobody is suggesting that you cannot declare a function type as a return value. Regardless, this thread is bogus. Nobody even said that runtime-computed types should be allowed. Paul mistook my counter-example as a typedecl. See my response to his email. Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Wed Dec 15 11:09:46 1999 From: gstein@lyra.org (Greg Stein) Date: Wed, 15 Dec 1999 03:09:46 -0800 (PST) Subject: [Types-sig] RFC 0.1 In-Reply-To: <001001bf46e7$aafc14a0$05a0143f@tim> Message-ID: Woo hoo! Tim to the rescue! :-) On Wed, 15 Dec 1999, Tim Peters wrote: > [Greg Stein] ... me saying that it is nice for names to have different types... > > [Guido] ... Guido saying that "feature" is less readable ... > > It's an excellent example, but to me the *first* is easier to follow! In > the 2nd I'm left wondering what further use will be made of w, so have to > try to keep w *and* s alive in my short-term memory. In the 1st, I can > scrub my brain cleaner harder oftener. Yup. I might use two variables myself in that example, but using a single name can definitely be easier in some cases... > Heck, I wrote this just last week -- and deliberately: ... Tim's example code ... > At the start of its life, the result is a (conceptual) set, and at the end > it's a list with the same stuff. That's not confusing -- it's helpful! It > wouldn't confuse a decent type-inference engine, either ("result" is a dict > until the block starting with the .keys() call, and is a list thereafter; > it's not even a "union type" -- at any given point, it's always one or the > other). Ha! I posted something just like this just the other day: http://www.python.org/pipermail/types-sig/1999-December/000518.html Basically: I *totally* agree, and this is primarily the time when I use a single variable name for two different types. This is also a reason why I'd like to avoid the notion of associating a type with a [variable] name. Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Wed Dec 15 11:33:46 1999 From: gstein@lyra.org (Greg Stein) Date: Wed, 15 Dec 1999 03:33:46 -0800 (PST) Subject: [Types-sig] List of FOO In-Reply-To: <3856A9B4.7636C0C9@prescod.net> Message-ID: On Tue, 14 Dec 1999, Paul Prescod wrote: > Martijn Faassen wrote: ... example ... > 1. This system is supposed to be extensible, right? So I could, for > instance, define a binary tree module and have "binary trees of ints" > and "binary trees of strings." How do I define the binary tree class and > state that it is parameterizable? Dunno. I'll leave that for some other brainiac. :-) As Tim pointed out: you'll get 99% of your benefit from handling a half-dozen builtin types and their composites. Lessee... 
int, long, float, complex, list, dict, tuple, func, class 2nd order: numeric, sequence, mapping I think a big question is whether you provide syntax, like what Tim just posted recently (e.g. ["" -> 0] -> None), and/or whether you use/allow names (which refer to Types) (e.g. [StringType -> IntType] -> None). If you allow names, rather than pure syntax, then the compiler will need to infer what type the name refers to. Note that the presence of classes means that names are probably required in some way. > 2. How does this work with interfaces? "ListType" is cheating. We need > SequenceType because that's not implementation specific. And > SequenceType needs to be defined by an interface, not a class. We need both. It is perfectly acceptable to state that a List is required. > 3. What does "tuple of int, string" look like? And should we have list > length parameters? I think Martijn answered this one. Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Wed Dec 15 11:43:53 1999 From: gstein@lyra.org (Greg Stein) Date: Wed, 15 Dec 1999 03:43:53 -0800 (PST) Subject: [Types-sig] Shadow File Opinions? In-Reply-To: <3856AC68.2D5FCF37@prescod.net> Message-ID: On Tue, 14 Dec 1999, Paul Prescod wrote: > Martijn Faassen wrote: > > > > ... > > While my agenda is to kill the syntax discussions for the moment, I'd > > propose a seperate declaration syntax before all others, because this is > > the most syntactically compatible with Python. And easier on the > > programmer. > > I'm considering your argument carefully. If we make separate interface > files then we get Python 1.5 (hell, Python 1.0) compatibility "for free" > and we can experiment with different syntaxes without breaking Python > code. Plus we could use IDL and type libraries for type analysis > *already*. > > I think the final product must allow inline declarations but I am > starting to think that in the short term, "interface definition" files > are the way to go not just for builtin modules but for all modules. > > Do others agree? Interface files and/or Martijn's approach. Personally, I like Martijn's a bit better because you don't have to juggle two files. But yes: it solves a short-term problem of "what is the syntax for defining a module/class interface (its func and attr signatures)". Although I think func signatures are an easy syntactic extension which several people have provided samples for, so the interface can use that. The attributes of a module/class are the hard part. And no... I haven't read JimF's proposal yet to see his suggestion for how this might be done... it does apply to this problem. And here we tried to separate interfaces from the discussion :-) Suggestion: defer consideration of interfaces (whether via Martijn's approach or a separate file) for V2 of the type system design. For V1, let's concentrate on applying type signatures to functions (and variables if people insist :-), and any type inferencing that may be needed. I believe there are a lot of associated problems to handle before needing to throw the interface problem into the mix. Seriously, I only see interfaces as providing a way to define type info for attributes (within the context of this discussion; they have other uses). We have issues dealing with the existing modules, backwards/forwards compatibility, what constitutes type-safety, what checks are available, what runtime switches are used, etc. [ many of these types of details goes into the RFC Paul is putting together ] >... 
> > With either a Python based system as I'm proposing (ugly but powerful > > and fairly simple), or a seperate type declaration system, you have your > > type declarations separated from the code itself. This means you easily > > add and remove type information and switch between a statically typed > > module and a dynamically typed module easily. > > But there is not going to be alot of "switching". You add declarations > and you leave them there. You update them when they get out of sync with > the code. Why would you want to take a nice, safe, optimized module that > you have gone to the effort of type annotating and hide the annotations? Agreed. We don't need to support switching. Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Wed Dec 15 11:51:11 1999 From: gstein@lyra.org (Greg Stein) Date: Wed, 15 Dec 1999 03:51:11 -0800 (PST) Subject: [Types-sig] Compile-time or runtime checks? In-Reply-To: <3856ACF4.E19023D8@prescod.net> Message-ID: On Tue, 14 Dec 1999, Paul Prescod wrote: >... > I'm trying hard to separate the axes of: "I have some type declarations" > and "I want a static type checker to gurantee that this code is totally > type safe." This should be legal: Agreed. Good separation. > StringType > def foo(): > a=eval( sys.argv[1] ) > return a > > That means I want a runtime check. This should be illegal: > > type-safe > StringType > def foo(): > a=eval( sys.argv[1] ) > return a > > Here I've specifically asked for a compile time check and my code is not > up to snuff. Add the following in: type-safe StringType def foo(): a = eval(sys.argv[1]) ! StringType return a Now you have a type-safe function. :-) Of course, it might raise an exception, but your types are clean. (heck, the eval could raise an exception... type safety does not imply "no exceptions") (yah yah.. I recognize the same could be done with an "assert" statement, but I think the inferencer would not be as pleased trying to deal with that, as with the type-assert operator) Oh. That just made me think of something. "Exceptions which might be raised" is technically part of a signature. I say punt that to V2 :-) [ we could make some accomodation in the FuncObject to record a tuple of possible exceptions, and it would always contain (Exception,) in it for now... ] Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Wed Dec 15 12:12:37 1999 From: gstein@lyra.org (Greg Stein) Date: Wed, 15 Dec 1999 04:12:37 -0800 (PST) Subject: [Types-sig] RFC 0.1 In-Reply-To: <38576017.75252FF7@vet.uu.nl> Message-ID: On Wed, 15 Dec 1999, Martijn Faassen wrote: > Paul Prescod wrote: > > Martijn Faassen wrote: > > > I agree with this, which is I am advocating a strong split (for > > > simplicity) of fully-statically checked code and normal python code. > > > > I don't see this as buying much simplicity. And I do see it as requiring > > more work later. I also see it as scaring the bejeesus out of many > > static type system fence sitters. Can you demonstrate that it makes our > > life easier to figure out integration issues later? > > Sure, but we're bound to scare the bejeesus out of everyone anyway; > we're proposing a major change to Python. "We" ? I'm advocating a minimal change. Add a bit of grammar to function definitions. Add a new type-assert operator. Add Tim's "decl" statement for interfaces (caveat/todo: rationalize against JimF's proposal). Leave out the complexity of variable declarations. Note that I'd be okay with punting the "decl" / interfaces for now. 
That leaves a bit of "def" grammar changing and a new operator. To the Python programmer: *very* little change. > The 'simplicity' part comes in because you don't need *any* type > inferencing. Conceptually it's quite simple; all names need a type. 1) There is *no* way that I'm going to give every name a type. I may as well switch to Java, C, or C++ (per Guido's advice in another email :-) 2) You *still* need inferencing. "a = foo() + bar()" implies that some inferencing occurs. (for a compile-time check; the compiler can insert a runtime check to assert the type being assigned to "a" (but you know my opinion there...)) >... > > > Later on you can work on blurring the interface between the two. First > > > *fully* type annotated functions (classes, modules, what you want), > > > which can only refer to other things that are fully annotated. By 'fully > > > annotated' I mean all names have a type. > > > > I think that's a non-starter because it will take forever to become > > useful because the standard library is not type-safe. Anyhow I fell like > > I've *already solved* the problem of integration so why would I undo > > that? Agreed. Also, if I grab some module Foo from Joe, and he didn't add typedecls, then why shouldn't I be able to use it? (and I'd just add some type-asserts if that even mattered to me) >... > > > Our static type checker/compiler can use the Python type constructions > > > directly. We can put limitations on them to forbid any type > > > constructions that the compiler cannot fully evaluate before the > > > compilation of the actual code, of course, just like we can put > > > limitations on statically typed functions (they shouldn't be able to > > > call any non-static functions in the first iteration of our design, I'm > > > still maintaining) The compiler can issue a warning and insert a type assertion for a runtime check. IMO, it should not forbid you from doing anything simply because it can't figure out some type. Python syntax's "type agnosticism" is one of its major strengths. > > I see no reason for that limitation. The result of a call to a > > non-static function is a Pyobject. You cast it in your client code to > > get type safety. Just like the shift from K&R C to ANSI C. Functions Bunk! It is *not* a cast. You cannot cast in Python. It is a type assertion. An object is an object -- you cannot cast it to something else. Forget function call syntax and casting syntax -- they don't work grammatically, and that is the wrong semantic (if you're using that format to create some semantic equivalent to a cast). Cheers, -g -- Greg Stein, http://www.lyra.org/ From m.faassen@vet.uu.nl Wed Dec 15 12:54:25 1999 From: m.faassen@vet.uu.nl (Martijn Faassen) Date: Wed, 15 Dec 1999 13:54:25 +0100 Subject: [Types-sig] RFC 0.1 References: Message-ID: <38578F81.3F262F89@vet.uu.nl> Greg Stein wrote: > > On Wed, 15 Dec 1999, Martijn Faassen wrote: > > Paul Prescod wrote: > > > Martijn Faassen wrote: > > > > I agree with this, which is I am advocating a strong split (for > > > > simplicity) of fully-statically checked code and normal python code. > > > > > > I don't see this as buying much simplicity. And I do see it as requiring > > > more work later. I also see it as scaring the bejeesus out of many > > > static type system fence sitters. Can you demonstrate that it makes our > > > life easier to figure out integration issues later? > > > > Sure, but we're bound to scare the bejeesus out of everyone anyway; > > we're proposing a major change to Python. > > "We" ? 
> > I'm advocating a minimal change. Ahum: > Add a bit of grammar to function > definitions. Add a new type-assert operator. Add Tim's "decl" statement > for interfaces (caveat/todo: rationalize against JimF's proposal). Leave > out the complexity of variable declarations. > > Note that I'd be okay with punting the "decl" / interfaces for now. That > leaves a bit of "def" grammar changing and a new operator. > > To the Python programmer: *very* little change. The programmer needs to deal with the following new things and their consequences: * New grammar for function definitions. * A whole new operator (which you can't overload..or can you?), which does something quite unusual (most programmers associate types with names, not with expressions). The operation also doesn't actually return much that's useful to the program, so the semantics are weird too. * Interfaces with a new 'decl' statement. [If you punt on this, you'll have to tell the innocent Python programmer that he can't use the static type system with instances? Or will this be inferenced?] * Unspecified syntax to actually *specify* types; a ! operator with something syntactically wholly new behind it may not be that simple for the Python programmer either. It's not that hard with IntType and so on, but it gets complex if you have function types, class types, etc. * And then there's the type inferencer which will interact with the Python programmer's code as well, right? And the interpreter will spew out errors if compile time checks fail on types? And you call this '*very* little change'? I'll call adding a list with names of static type associations to the module 'an even *smaller* change' then, as you don't need any new operator or statement, at least to start with. :) Adding anything like static type checking to Python entails fairly major changes to the language, I'd think. Not that we shouldn't aim at keeping those transparent and mostly compatible with Python as it is now, but what we'll add will still be major. > > The 'simplicity' part comes in because you don't need *any* type > inferencing. Conceptually it's quite simple; all names need a type. > > 1) There is *no* way that I'm going to give every name a type. I may as > well switch to Java, C, or C++ (per Guido's advice in another email :-) Sure, but we're looking at *starting* the process. Perhaps we can do away with specifying the type of each local variable very quickly by using type inferencing, but at least we'll have a working implementation! > 2) You *still* need inferencing. "a = foo() + bar()" implies that some > inferencing occurs. > (for a compile-time check; the compiler can insert a runtime check to > assert the type being assigned to "a" (but you know my opinion > there...)) Sure, that's true. [me] > > > > Later on you can work on blurring the interface between the two. First > > > > *fully* type annotated functions (classes, modules, what you want), > > > > which can only refer to other things that are fully annotated. By 'fully > > > > annotated' I mean all names have a type. [Paul] > > > I think that's a non-starter because it will take forever to become > > > useful because the standard library is not type-safe. Anyhow I fell like > > > I've *already solved* the problem of integration so why would I undo > > > that? > > Agreed. Also, if I grab some module Foo from Joe, and he didn't add > typedecls, then why shouldn't I be able to use it?
> (and I'd just add some type-asserts if that even mattered to me) I'm not saying this is a good situation, it's just a way to get off the ground without having to deal with quite a few complexities such as inferencing (outside expressions), interaction with modules that don't have type annotations, and so on. I'm *not* advocating this as the end point, but I am advocating this as an intermediate point where it's actually functional. [me] > > > > Our static type checker/compiler can use the Python type constructions > > > > directly. We can put limitations on them to forbid any type > > > > constructions that the compiler cannot fully evaluate before the > > > > compilation of the actual code, of course, just like we can put > > > > limitations on statically typed functions (they shouldn't be able to > > > > call any non-static functions in the first iteration of our design, I'm > > > > still maintaining) > > The compiler can issue a warning and insert a type assertion for a runtime > check. IMO, it should not forbid you from doing anything simply because it > can't figure out some type. Python syntax's "type agnosticism" is one of > its major strengths. Yes, but now you're building a static type checker *and* a Python compiler inserting run time checks into bytecodes. This is two things. This is more work, and more interacting systems, before you get *any* payoff. My sequence would be: * build system that can do compile-time checking of fully annotated code * now you can work on interfacing this with non-fully annotated code. You can also looking at including run-time assertions. * in parallel, now you can work on type inferencing the local variable annotations out of function type signatures, interface declarations, and so on. If you don't separate out your development path like this you end up having to do it all at once, which is harder and less easy to test. [Paul] > > > I see no reason for that limitation. The result of a call to a > > > non-static function is a Pyobject. You cast it in your client code to > > > get type safety. Just like the shift from K&R C to ANSI C. Functions > > Bunk! It is *not* a cast. You cannot cast in Python. It is a type > assertion. An object is an object -- you cannot cast it to something else. > Forget function call syntax and casting syntax -- they don't work > grammatically, and that is the wrong semantic (if you're using that format > to create some semantic equivalent to a cast). This'd be only implementable with run-time assertions, I think, unless you do inferencing and know what the type the object is after all. So that's why I put the limitation there. Don't allow unknown objects entering a statically typed function before you have the basic static type system going. After that you can work on type inference or cleaner interfaces with regular Python. But perhaps I'm mistaken and local variables don't need type descriptions, as it's easy to do type inferencing from the types of the function arguments and what the function returns, as well as the types of any instance attributes involved. I'd like to see some actual examples of how this'd work first, though. For instance: def brilliant() ! IntType: a = [] a.append(1) a.append("foo") return a[0] What's the inferred type of 'a' now? A list with heterogenous contents, that's about all you can say, and how hard is it for a type inferencer to deduce even that? 
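(To make the bookkeeping concrete, here is a toy sketch, not a real inferencer: it only understands literal arguments to list.append(), and it parses a plain def because today's parser doesn't know the proposed "! IntType" annotation.)

import ast

def element_types(func_source, list_name):
    # Collect the element types a local list could hold, by special-casing
    # <list_name>.append(<literal>) calls -- the kind of knowledge about
    # builtin methods a real inferencer would need.
    types_seen = set()
    for node in ast.walk(ast.parse(func_source)):
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Attribute)
                and node.func.attr == "append"
                and isinstance(node.func.value, ast.Name)
                and node.func.value.id == list_name
                and len(node.args) == 1
                and isinstance(node.args[0], ast.Constant)):
            types_seen.add(type(node.args[0].value).__name__)
    return types_seen

src = '''
def brilliant():
    a = []
    a.append(1)
    a.append("foo")
    return a[0]
'''
print(element_types(src, "a"))    # {'int', 'str'}: a heterogeneous list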
But for optimization purposes, at least, but it could also help with error checking, if 'a' was a list of IntType, or StringType, or something like that? It seems tough for the type inferencer to be able to figure out that this is so, but perhaps I'm overestimating the difficulty. Regards, Martijn From tismer@appliedbiometrics.com Wed Dec 15 14:06:09 1999 From: tismer@appliedbiometrics.com (Christian Tismer) Date: Wed, 15 Dec 1999 15:06:09 +0100 Subject: [Types-sig] Shadow File Opinions? References: Message-ID: <3857A051.406A2FC@appliedbiometrics.com> Greg Stein wrote: > > On Tue, 14 Dec 1999, Paul Prescod wrote: > > Martijn Faassen wrote: [- seperate IF files, forget about syntax -] [Paul] [- compatibility for free, plus IDL option -] [- IF files for all modules possible -] > > Do others agree? [Greg] > Interface files and/or Martijn's approach. Personally, I like Martijn's a > bit better because you don't have to juggle two files. It doesn't matter if there is an extra file, or you insert a function call into your module, like system.interface("""triple quoted string defining interface""") without changes to the language but experimental syntaxes for these IF files/strings. > But yes: it solves a short-term problem of "what is the syntax for > defining a module/class interface (its func and attr signatures)". I think JimF has the best answer yet. Just look into his code. > Although I think func signatures are an easy syntactic extension which > several people have provided samples for, so the interface can use that. > The attributes of a module/class are the hard part. And no... I haven't > read JimF's proposal yet to see his suggestion for how this might be > done... it does apply to this problem. And here we tried to separate > interfaces from the discussion :-) > > Suggestion: defer consideration of interfaces (whether via Martijn's > approach or a separate file) for V2 of the type system design. For V1, > let's concentrate on applying type signatures to functions (and variables > if people insist :-), and any type inferencing that may be needed. Hmm, I hink the opposite is the way to go. Forget about type signatures for functions et al at all, just use interface info, and prove the interface by type inference. The interface is correct if and only if it can be proven. Given that, I see no reason to spoil Python with extra type annotation syntax. It's the other way round: If there is a correct interface, then type inference can be run in parallel as you are typing, like code colorizing, and python can tell you the set of types which any expression might have. I'm telling types by using stuff with known type. That is either literals, or functions which come from other modules which already have an interface. At any time, my IDE can tell me what type the object at the cursor has, nad worst case this is just PyObject. The empty interface which just says "every visible is exported" and "everything is a PyObject" is always fulfilled. An interface which specifies restrictions on input values (as parameters to functions and arguments of setattr calls of objects) provides the information which is used to calculate types in your code. An interface which restricts output values (function return values and results of getattr calls of objects) provides the constraints which have to be proven. What am I missing when I say: We need interfaces only and an inference machine to prove it, and that's all! Forget about extra info in the Python code. What would it help? 
I believe this is the whole story, and building upon JimF's startup, we would just need to write the inferencer now. not-trying-to-say-this-were-easy - ly chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From guido@CNRI.Reston.VA.US Wed Dec 15 14:21:59 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Wed, 15 Dec 1999 09:21:59 -0500 Subject: [Types-sig] A challenge Message-ID: <199912151421.JAA01106@eric.cnri.reston.va.us> There seem to be several proposals for type declaration syntaxes out there, with (mostly implied) suggestions on how to spell various types etc. I personally am losing track of all the various proposals. I would encourage the proponents of each approach to sit down with some sample code and mark it up using your proposed syntax. Or write the corresponding interface file, if that's your fancy. I recommend using the sample code that I posted as a case study, including some of the imported modules -- this should be a reasonable but not extreme challenge. --Guido van Rossum (home page: http://www.python.org/~guido/) From gstein@lyra.org Wed Dec 15 14:40:03 1999 From: gstein@lyra.org (Greg Stein) Date: Wed, 15 Dec 1999 06:40:03 -0800 (PST) Subject: [Types-sig] minimal or major change? (was: RFC 0.1) In-Reply-To: <38578F81.3F262F89@vet.uu.nl> Message-ID: On Wed, 15 Dec 1999, Martijn Faassen wrote: ... me: stating the "GFS proposal" isn't that major of a change ... > The programmer needs to deal with the following new things and their > consequences: > > * New grammar with function definitions. Right. And this is optional. I don't see this extension of the grammar or semantic as difficult to deal with. > * A whole new operator (which you can't overload..or can you?), which > does something quite unusual (most programmers associate types with > names, not with expressions). The operation also doesn't actually return > much that's useful to the program, so the semantics are weird too. No, you cannot overload the operator. That would be a Bad Thing, I think. That would throw the whole type system into the garbage :-). The operator is not unusual: it is an inline type assertion. It is not a "new-fangled way to declare the type of something." It is simply a new operation. The compiler happens to be able to create associations from it, but that does *not* alter the basic semantic of the operation. Given: x = y or z In the above statement, it returns "y" if it is "true". In the statement: x = y ! z It returns "y" if it has "z" type; otherwise, throws an exception. The semantics aren't all the difficult or unusual. Programmers are confronted with "new stuff" all the time. How about: values = cgi.parse() Just because the above happens to be a method invocation rather than a syntactical construction does not reduce the amount of new semantics that a programmer must learn. In summary: a new operator isn't that much of a burden. > * Interfaces with a new 'decl' statement. [If you punt on this you'll > have to the innocent Python programmer he can't use the static type > system with instances? or will we this be inferenced?] Yes, I'd prefer to punt this for a while, as it is a much larger can of worms. It is another huge discussion piece. 
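To pin the semantics down, "y ! z" would behave at runtime roughly like the ordinary function below. This is a sketch only: type_assert is a hypothetical helper, str stands in for StringType, and whether subclasses count as "having the type" is left open.

def type_assert(value, expected_type):
    # semantic model of "value ! expected_type"
    if not isinstance(value, expected_type):
        raise TypeError("expected %s, got %s"
                        % (expected_type.__name__, type(value).__name__))
    return value

x = type_assert("abc", str)      # evaluates to "abc"
# type_assert("abc", int)        # would raise TypeError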
In the current discussion, I believe that we can factor out the interface issue quite easily -- we can do a lot of work now, and when interfaces arrive, they will slide right in without interfering with the V1 work. In other words, I believe there is very little coupling between the proposal as I've outline, and the next set of type system extensions (via interfaces). Without interfaces (or the "decl" statement, or whatever), I *do* posit that the type system will not be applicable to attributes. And no: we cannot infer their type -- that would require global type inferencing. Thankfully, I believe the inferencing required by the "GFS proposal" is local to a single function at a time. > * Unspecified syntax to actually *specify* types, I mean, a ! operator > with > something syntactically wholly new behind it may not be that simple for > the Python programmer either. It's not that hard with IntType and so on, > but it gets complex if you have function types, class types, etc. True. I've been suggesting the use of dotted names, but also allowing for the fact that new syntax can be designed to generate typedecl objects. Specifying a typedecl is necessary to introduce any typing. That is a hit that we take no matter what. I don't see it as a "major" change, though, since we can keep the syntax simple and limit where/how they are used. > * And then there's the type inferencer which will interact with the > Python programmer's code as well, right? And the interpreter will spew > out errors if compile time checks fail on types? This is behind the scenes. The Python programmer is usually not impacted, so yes... again a minimal impact. IMO, the compile-time checks are not enabled by default. If you want them, then you can deal with the errors and warnings. > And you call this: '*very* little change' ? Yes. From the standpoint of the Python programmer, there is not much more to learn or to deal with. [unless we introduce interfaces, IMO] > I'll call adding a list with > names of static type associations to the module an 'an even *smaller* > change' then, as you don't need any new operator or statement, at least > to start with. :) I never said yours was more complex :-). I just said that we aren't necessarily creating a "major change". I'd like to see variable decls punted and interfaces deferred. Add a new semantic (typedecls), a new operator, and an extension to the "def" statement. Done. (hehe... if only the code backing that were so easy...) > Adding anything like static type checking to Python entails fairly major > changes to the language, I'd think. Not that we shouldn't aim at keeping > those transparant and mostly compatible with Python as it is now, but > what we'll add will still be major. Sure. I think we're just viewing it a bit differently. To me, something like the metaclass stuff was a big change: it is capable of altering the very semantics of class construction. Adding package support was the same -- Python moved from a flat import space to an entirely new semantic for importing and application packaging. > > > The 'simplicity' part comes in because you don't need *any* type > > > inferencing. Conceptually it's quite simple; all names need a type. > > > > 1) There is *no* way that I'm going to give every name a type. I may as > > well switch to Java, C, or C++ (per Guido's advice in another email :-) > > Sure, but we're looking at *starting* the process. 
Perhaps we can do > away with specifying the type of each local variable very quickly by > using type inferencing, but at least we'll have a working > implementation! I don't want to start there. I don't believe we need to start there. And my point (2) below blows away your premise of simplicity. Since you still need inferencing, the requirement to declare every name is not going to help, so you may as well relax that requirement. > > 2) You *still* need inferencing. "a = foo() + bar()" implies that some > > inferencing occurs. > > (for a compile-time check; the compiler can insert a runtime check to > > assert the type being assigned to "a" (but you know my opinion > > there...)) > > Sure, that's true. > > [me] > > > > > Later on you can work on blurring the interface between the two. First > > > > > *fully* type annotated functions (classes, modules, what you want), > > > > > which can only refer to other things that are fully annotated. By 'fully > > > > > annotated' I mean all names have a type. > > [Paul] > > > > I think that's a non-starter because it will take forever to become > > > > useful because the standard library is not type-safe. Anyhow I fell like > > > > I've *already solved* the problem of integration so why would I undo > > > > that? > > > > Agreed. Also, if I grab some module Foo from Joe, and he didn't add > > typedecls, then why shouldn't I be able to use it? > > (and I'd just add some type-asserts if that even mattered to me) > > I'm not saying this is a good situation, it's just a way to get off the > ground without having to deal with quite a few complexities such as > inferencing (outside expressions), interaction with modules that don't > have type annotations, and so on. I'm *not* advocating this as the end > point, but I am advocating this as an intermediate point where it's > actually functional. IMO, it is better to assume "PyObject" when you don't have type information, rather than throw an error. Detecting the lack of type info is the same in both cases, and the resolution of the lack is easy in both mehtods: throw an error, or substitute "PyObject". I prefer the latter so that I don't have to update every module I even get close to. > [me] > > > > > Our static type checker/compiler can use the Python type constructions > > > > > directly. We can put limitations on them to forbid any type > > > > > constructions that the compiler cannot fully evaluate before the > > > > > compilation of the actual code, of course, just like we can put > > > > > limitations on statically typed functions (they shouldn't be able to > > > > > call any non-static functions in the first iteration of our design, I'm > > > > > still maintaining) > > > > The compiler can issue a warning and insert a type assertion for a runtime > > check. IMO, it should not forbid you from doing anything simply because it > > can't figure out some type. Python syntax's "type agnosticism" is one of > > its major strengths. > > Yes, but now you're building a static type checker *and* a Python > compiler inserting run time checks into bytecodes. This is two things. > This is more work, and more interacting systems, before you get *any* > payoff. My sequence would be: Who says *both* must be implemented in V0.1? If the compiler can't figure it out, then it just issues a warning and continues. Some intrepid programmer comes along and tweaks the AST to insert a runtime check. Done. The project is easily phased to give you a working system very quickly. 
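To give a feel for what "tweak the AST to insert a runtime check" could mean, here is a toy pass over the earlier brilliant() example. Illustrative only: __typecheck__ is a hypothetical runtime helper, the declared return type is hard-coded, and ast.unparse needs a modern Python.

import ast

class InsertReturnChecks(ast.NodeTransformer):
    # Wrap every "return <expr>" in a call to a runtime type check -- the
    # kind of rewrite the compiler could fall back to when it cannot prove
    # the return type statically.
    def visit_Return(self, node):
        self.generic_visit(node)
        if node.value is not None:
            node.value = ast.Call(
                func=ast.Name(id="__typecheck__", ctx=ast.Load()),
                args=[node.value, ast.Name(id="IntType", ctx=ast.Load())],
                keywords=[])
        return node

src = '''
def brilliant():
    a = []
    a.append(1)
    a.append("foo")
    return a[0]
'''
tree = ast.fix_missing_locations(InsertReturnChecks().visit(ast.parse(src)))
print(ast.unparse(tree))   # return becomes: return __typecheck__(a[0], IntType)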
Heck, it may even be easier for the compiler to insert runtime checks in V0.1. Static checking might come later. Or maybe an external tool does the checking at first; later to be built into the compiler. ... proposed implementation order ... > If you don't separate out your development path like this you end up > having to do it all at once, which is harder and less easy to test. Of course. Nobody is suggesting a "do it all at once" course of implementation. > [Paul] > > > > I see no reason for that limitation. The result of a call to a > > > > non-static function is a Pyobject. You cast it in your client code to > > > > get type safety. Just like the shift from K&R C to ANSI C. Functions > > > > Bunk! It is *not* a cast. You cannot cast in Python. It is a type > > assertion. An object is an object -- you cannot cast it to something else. > > Forget function call syntax and casting syntax -- they don't work > > grammatically, and that is the wrong semantic (if you're using that format > > to create some semantic equivalent to a cast). > > This'd be only implementable with run-time assertions, I think, unless > you do inferencing and know what the type the object is after all. So > that's why I put the limitation there. Don't allow unknown objects > entering a statically typed function before you have the basic static > type system going. After that you can work on type inference or cleaner > interfaces with regular Python. Why not allow unknown objects? Just call it a PyObject and be done with it. Note that the type-assert operator has several purposes: * a run-time assertion (and possibly: unless -O is used) * signal to the compiler that the expression value will have that type (because otherwise, an exception would hav been raised) * provides a mechanism to type-check: if the compiler discovers (thru inferencing) that the value has a different type than the right-hand side, then it can flag an error. The limitation you propose would actually slow things down. People would not be able to use the type system until a lot of modules were type-annotated. > But perhaps I'm mistaken and local variables don't need type > descriptions, as it's easy to do type inferencing from the types of the > function arguments and what the function returns, That is my (alas: unproven) belief. > as well as the types > of any instance attributes involved. These would always be "PyObject" (or "Any" if you prefer) until we introduce some kind of "decl" or interface mechanism. Needless to say, I do agree that this would be very difficult. > I'd like to see some actual > examples of how this'd work first, though. For instance: > > def brilliant() ! IntType: > a = [] > a.append(1) > a.append("foo") > return a[0] > > What's the inferred type of 'a' now? A list with heterogenous contents, > that's about all you can say, and how hard is it for a type inferencer > to deduce even that? It would be very difficult for an inferencer. It would have to understand the semantics of ListType.append(). Specifically, that the type of the argument is added to the set of possible types for the List elements. Certainly: a good inferencer would understand all the builtin types and their methods' semantics. > But for optimization purposes, at least, but it > could also help with error checking, if 'a' was a list of IntType, or > StringType, or something like that? It would still need to understand the semantics to do this kind of checking. In my no-variable-declaration world, the type error would be raised at the return statement. 
a[0] would have the type set: (IntType, StringType). The compiler would flag an error stating "return value may be a StringType or an IntType, but it must only be an IntType". > It seems tough for the type > inferencer to be able to figure out that this is so, but perhaps I'm > overestimating the difficulty. Yes it would be tough -- you aren't overestimating :-) Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Wed Dec 15 14:47:29 1999 From: gstein@lyra.org (Greg Stein) Date: Wed, 15 Dec 1999 06:47:29 -0800 (PST) Subject: [Types-sig] hehe... sorry, too... Message-ID: Well, I count nine messages tonite, plus the others earlier today. Sorry about that... Maybe Paul can pull all the threads together and toss out a new RFC with references (not necessarily details) to the different options/threads. We can start again from that point! And Guido's challenge wouldn't be a bad idea... :-) Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Wed Dec 15 15:37:52 1999 From: gstein@lyra.org (Greg Stein) Date: Wed, 15 Dec 1999 07:37:52 -0800 (PST) Subject: [Types-sig] challenge response (was: A challenge) In-Reply-To: <199912151421.JAA01106@eric.cnri.reston.va.us> Message-ID: On Wed, 15 Dec 1999, Guido van Rossum wrote: > There seem to be several proposals for type declaration syntaxes out > there, with (mostly implied) suggestions on how to spell various types > etc. > > I personally am losing track of all the various proposals. > > I would encourage the proponents of each approach to sit down with > some sample code and mark it up using your proposed syntax. Or write > the corresponding interface file, if that's your fancy. > > I recommend using the sample code that I posted as a case study, > including some of the imported modules -- this should be a > reasonable but not extreme challenge. #---------------------------------------------------------------------- import sys, find # 1 # 2 def main(): # 3 dir = "." # 4 if sys.argv[1:]: # 5 dir = sys.argv[1] # 6 list = find.find("*.py", dir) # 7 list.sort() # 8 for name in list: # 9 print name # 10 # 11 if __name__ == "__main__": # 12 main() # 13 #---------------------------------------------------------------------- Presume that find.find is declared as: def find(pattern!StringType, dir = os.curdir!StringType)!ListType: At the moment, I'm going to use '!' for the type declarators and the type-assert operator. A placeholder. Also, the ListType in the above declaration could be updated at some point, if we presume that more complex type declarator syntax is designed (allowing List). In my view, the type system does not prove correctness, but only type safety. In that sense, we don't need to worry about things like termination or whether main() actually gets called, or whatever. [ Guido's case study was concerned with algorithmic correctness ] At the beginning of the compilation/inference process, the compiler knows that __name__ is a string (technically: it knows the type for each name in the module's namespace (the others are '__builtins__' and '__doc__')). The import states that "sys" and "find" are ModuleType (which means sys.argv and find.find will be okay from an operational point; it would still need to check if they exist). Line 3: the compiler defines main() to take no arguments and to have a PyObject return value. Line 4: "dir" now has a String value. Line 5: The compiler knows sys is a module so the attribute access is fine. It now must verify that argv exists. 
Problem #1: I'm not sure how it would do this without loading the module. Caveat #2: I do not have a proposal for stating sys.argv's type. This would be part of the interface stuff (which I would defer). Caveat #3: alternative to #2: we could hardcode knowledge of "sys" For this line, the compiler cannot ensure that the [1:] would not raise an error. This can be solved by introducing type info (interfaces), or we can alter the line to: if (sys.argv!ListType)[1:]: # 5 At this point, the compiler will assume it is a List for the rest of the function. Caveat #4: we would need to determine how strict we want to be about the possibility of external changes to objects. Another thread could change the type before the next usage (or the find.find() could, but we don't use sys.argv after that) If the compiler knows it is a list, then it also knows the [1:] would succeed. Line 6: in the absence of List type information, the "dir" would now become a (PyObject, String) which is simplified to (PyObject,). Caveat #5: adding List concepts would keep "dir" as a String, as the inferencer would understand the indexing operation. Line 7: per caveat #1, assume the compiler can access the find.find() function. From that, it knows the signature. The first parameter has a matching type, but the second (PyObject) does not match the required type (String), so an error is raised. If caveat #5 is resolved, then the second parameter matches. It is also possible to avoid the error by rewriting: list = find.find("*.py", dir!StringType) # 7 "list" is now a ListType, based on the find.find() return value. (see caveat #5 -- it could be possible to refine this knowledge). Line 8: this is fine -- the inferencer knows List has a sort() method and what the sort method's signature is. Line 9: again, this is okay, based on the the inferencer's knowledge of "for" statements and Lists. "name" is assigned a PyObject type (unless we resolve caveat #5). Line 10: the print succeeds, as any object can be printed. Line 12: any comparison is valid, so this is fine. The compiler does happen to know that __name__ is a string, though. Line 13: the invocation matches the definition. No problem. ----------------------------- IMO, the best thing to improve the system here is to introduce parameterized lists (and dicts, tuples, etc). In fact, this would be necessary to avoid the error at line 7 (without rewriting). The following problem needs to be resolved: 1) how to fetch type information from modules without necessarily loading them [ if this is unsolved then all attribute accesses become PyObject values. The type system would be pretty useless since you wouldn't even get function information. ] The caveats listed are desired to be resolved: 2) how to specify types for module/class attributes 3) hardcode type information in the absence of a solution for #2 4) what sort of notions of "const" do we provide -- can the types of things change? (this may be moot with an interface present) 5) provide a syntax for composite/complex types Summary of changes to the case study code: 1) find.find() definition altered. [ it is certainly possible that fnmatch.* could be altered, but that is not necessary from the standpoint of the example code. The inferencer goes no further than the find.find() definition. ] Optional changes, if the caveats are not resolved: 1) add !ListType to line 5 2) add !StringType to line 7 Miscellaneous notes: * note the absence of declarations for "dir", "list", and "name". 
* only one change was made in the "find" library to support type safety for the example code. The example code itself had no alterations (subject to the noted caveats) Underlying proposal: * add type declarator syntax * add declarators to function args and return value (example provided for discussion purposes; I do believe a ':' is not possible and that a "name syntactic-marker typedecl" form is the proper form) * add type-assert operator ('!' for discussion purposes) * add type inferencing for associating types with names in the global and local scope. all other type information is imported rather than globally computed. inferencing does not occur over multiple, local scopes (in other words, we can process one function at a time, independent of the other functions) TODO: * I just realized the presence of the "global" statement throws off a lot of stuff. Type inferencing and/or checking will be harder and/or require a second pass if a global is used and new type is added to the possible set of types for the global. Cheers, -g -- Greg Stein, http://www.lyra.org/ From skip@mojam.com (Skip Montanaro) Wed Dec 15 15:45:08 1999 From: skip@mojam.com (Skip Montanaro) (Skip Montanaro) Date: Wed, 15 Dec 1999 09:45:08 -0600 Subject: [Types-sig] Global analysis - anything available? Message-ID: <199912151545.JAA07256@dolphin.mojam.com> Guido wrote: > Jim Hugunin did global analysis on the pystone.py module -- 250 lines > containing 14 functions and one class with two methods. (He may actually > have left out the class, but I'm pretty sure he did everything else.) He > got a 1000x speedup, which I think should be a pretty good motivator for > those interested in (OPT). Did anything concrete fall out of this exercise? Did Jim write code or do it manually? Skip Montanaro | http://www.mojam.com/ skip@mojam.com | http://www.musi-cal.com/ 847-971-7098 | Python: Programming the way Guido indented... From guido@CNRI.Reston.VA.US Wed Dec 15 15:48:42 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Wed, 15 Dec 1999 10:48:42 -0500 Subject: [Types-sig] Global analysis - anything available? In-Reply-To: Your message of "Wed, 15 Dec 1999 09:45:08 CST." <199912151545.JAA07256@dolphin.mojam.com> References: <199912151545.JAA07256@dolphin.mojam.com> Message-ID: <199912151548.KAA01393@eric.cnri.reston.va.us> > Guido wrote: > > > Jim Hugunin did global analysis on the pystone.py module -- 250 lines > > containing 14 functions and one class with two methods. (He may actually > > have left out the class, but I'm pretty sure he did everything else.) He > > got a 1000x speedup, which I think should be a pretty good motivator for > > those interested in (OPT). > > Did anything concrete fall out of this exercise? Did Jim write code or do > it manually? He write Python code that would do this in general, with a limited subset of Python as input. Since Jim left the project it has not been taken to the next step, but I'm sure Barry has Jim's code somewhere. --Guido van Rossum (home page: http://www.python.org/~guido/) From paul@prescod.net Wed Dec 15 13:53:52 1999 From: paul@prescod.net (Paul Prescod) Date: Wed, 15 Dec 1999 05:53:52 -0800 Subject: [Types-sig] Progress Message-ID: <38579D70.75E477BA@prescod.net> We are actually making progress among all of the sound and fury here. You guys have a lot of good ideas and I think that we are converging more than it seems. 1. Most people seem to agree with the idea that shadow files allow us a nice way to separate type assertions out so that their syntax can vary. 
I think Greg disagreed but perhaps not violently enough to argue about it. Interface files are in. Inline syntax is temporarily out. Syntactic "details" to be worked out. 2. Everybody but me is comfortable with defining genericity/templating/parameterization only for built-in types for now. But now that we are separating interfaces from implementations I am thinking that I may be able to think more clearly about parameterizability. It may be possible to define parameterizable interfaces by IPC8. Parameterization is in. Syntactic "details" to be worked out. 3. We agree that we need a syntax for asserting the types of expressions at runtime. Greg proposes ! but says he is flexible on the issue. The original RFC spelled this as: has_type( foo, types.StringType ) which returns (in this case) a string or NULL. This strikes me as more flexible than ! because you can use it in an assertion but you don't have to. You can also use it like this: j=has_type( foo, types.StringType ) or has_type( foo, types.ListType ): 4. The Python misfeature that modules are externally writable by default is gone. Only Guido has expressed an opinion on whether they should be writeable at all. His opinion is no. Unless I hear otherwise, externally writable modules are gone. (I have this vague feeling that maybe we should think of modules as classes with methods and properties that happen to be a subtype of a new base class "module", in that case the rules for modules and classes should be identical) 5. It isn't clear WHAT we can specify in "PyDL" interface files. Clearly we can define function, class/interface and method interfaces. a. do we allow declarations for the type of non-method instance variables? b. do we check assignments to class and module attributes from other modules at runtime? We need to expect that some cross-module assignments will come from modules that are not statically type checked. c. should we perhaps just disallow writing to "declared" attributes from other modules? d. is it possible to write to UN-declared attributes from other modules? And what are the type safety implications of doing so? -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things to be wary of: A new kid in his prime A man who knows the answers, and code that runs first time http://www.geezjan.org/humor/computers/threes.html From rmasse@cnri.reston.va.us Wed Dec 15 16:36:34 1999 From: rmasse@cnri.reston.va.us (Roger Masse) Date: Wed, 15 Dec 1999 11:36:34 -0500 (EST) Subject: [Types-sig] Re: Pascal style declarations In-Reply-To: References: Message-ID: <14423.50066.233550.951115@nobot.cnri.reston.va.us> Golden, Howard writes: > Greg Stein [mailto:gstein@lyra.org] wrote: > > > You don't provide a way to declare function return value(s) > > types. When > > you do, then I think you're going to run into a problem using the ':' > > syntactical marker... > > [refers to:] > > > > 3. In functions and methods, you can _optionally_ specify > > the argument > > > type: > > > > > > def funx(x : int, y : string): ... > > > > > I'll admit that Python already uses the ":" character where Pascal does, but > so what? You can still specify the return type in other ways. The most > obvious (to me) is to use the ":" character twice, e.g., > > def funx(x : int, y : string): int : ... > > While I'm not a parsing expert, I believe this would still be parsable. Of > course, any other available character could be used instead of the ":", if > this would be preferable. 
(Again, I'm not trying to dictate the final > syntax, just suggest a starting point.) > I don't want to further muddy the waters because I think Paul has some really good ideas that he should start to run with... (I hope he has the time and resources to write some code) The optional static typing syntax has been talked about for quite some time... IMHO specifying the return type as outlined in my static types paper from last year's conference is more readable (i.e. return type of the function *before* the parameter list). See dev-day review from last year http://www.foretec.com/python/workshops/1998-11/dd-rmasse-sum.html For example: def myCallable : Int( i : Int, f : Float, m : myType): ...And parsable. (Jon Reihl developed the grammar for the proposal) Syntax is easy (contentious but easy), semantics that are "Pythonic" and don't get in the way and *actually* improve safety is much harder. -Roger From gstein@lyra.org Wed Dec 15 16:46:17 1999 From: gstein@lyra.org (Greg Stein) Date: Wed, 15 Dec 1999 08:46:17 -0800 (PST) Subject: [Types-sig] Progress In-Reply-To: <38579D70.75E477BA@prescod.net> Message-ID: On Wed, 15 Dec 1999, Paul Prescod wrote: >... > 1. Most people seem to agree with the idea that shadow files allow us a > nice way to separate type assertions out so that their syntax can vary. > I think Greg disagreed but perhaps not violently enough to argue about > it. Interface files are in. Inline syntax is temporarily out. Syntactic > "details" to be worked out. I stated a preference for allowing this information to reside in the same file as the implementation. i.e. I don't want to maintain two files. I'll go further and state that we should not use a new language for this. It should just be Python. (and this is where Martijn's __types__ thing comes in, although I'm not advocating that format) This should be equivalent to JimF's document (with extensions: I read it and he does not define typedecl mechanims). Where we disagree, change, or reinvent, then we provide feedback. Where we extend, we fold that back in. >... > 3. We agree that we need a syntax for asserting the types of expressions > at runtime. Greg proposes ! but says he is flexible on the issue. The Flexible on the character(s) used for the operator. That's a bit different than flexibility on the issue :-) > original RFC spelled this as: has_type( foo, types.StringType ) which > returns (in this case) a string or NULL. This strikes me as more > flexible than ! because you can use it in an assertion but you don't > have to. You can also use it like this: > > j=has_type( foo, types.StringType ) or has_type( foo, types.ListType ): You'll have issues with empty strings and empty lists, as Guido pointed out. has_type() does not create a *definitive* type assertion. The compiler cannot extract any information from the presence of has_type(). Using an operator which raises an exception allows the compiler to make the assertion (and thereby assist with type inferencing and type checking). >... > writable modules are gone. (I have this vague feeling that maybe we > should think of modules as classes with methods and properties that > happen to be a subtype of a new base class "module", in that case the > rules for modules and classes should be identical) This is an interesting way to view the application of an interface to either a module or a class. i.e. restate it as "apply interfaces to classes only; modules become classes so they can have interfaces applied." Note that this will also solve the setattr "problem" with modules. 
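One way to picture that idea, as a toy and with no syntax implied: a namespace object whose __setattr__ checks writes against a table of declared attribute types, which is roughly what a module-that-acts-like-a-class-instance would buy. The declaration table here is hypothetical.

class TypedNamespace:
    # Toy module-like namespace: __setattr__ checks writes against a table
    # of declared attribute types; undeclared attributes are left alone.
    def __init__(self, declarations):
        object.__setattr__(self, "_decl", dict(declarations))

    def __setattr__(self, name, value):
        expected = self._decl.get(name)
        if expected is not None and not isinstance(value, expected):
            raise TypeError("%s must be %s, got %s"
                            % (name, expected.__name__, type(value).__name__))
        object.__setattr__(self, name, value)

m = TypedNamespace({"version": str, "max_retries": int})
m.version = "1.0"            # fine
# m.max_retries = "three"    # would raise TypeError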
> 5. It isn't clear WHAT we can specify in "PyDL" interface files. Clearly > we can define function, class/interface and method interfaces. > > a. do we allow declarations for the type of non-method instance > variables? Yes. My reluctance to specify types for instance variables is caused by problems with designing a nice, inline syntax for it. If you're not worrying about an inline syntax, then you can definitely add typedecls for instance and class attributes. Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Wed Dec 15 17:01:56 1999 From: gstein@lyra.org (Greg Stein) Date: Wed, 15 Dec 1999 09:01:56 -0800 (PST) Subject: [Types-sig] Shadow File Opinions? In-Reply-To: <3857A051.406A2FC@appliedbiometrics.com> Message-ID: On Wed, 15 Dec 1999, Christian Tismer wrote: >... > It doesn't matter if there is an extra file, or you insert a > function call into your module, like > > system.interface("""triple quoted string defining interface""") > > without changes to the language but experimental syntaxes for > these IF files/strings. The compiler needs the information. This implies that you can't add the information procedurally. The mechanism must be "transparent" to the compiler. > > But yes: it solves a short-term problem of "what is the syntax for > > defining a module/class interface (its func and attr signatures)". > > I think JimF has the best answer yet. Just look into his code. I don't like the implementation at all (too many modules, too many "from foo import ...), but the ideas seem to be sound. I believe there should be a proposal for syntactical representations. The compiler can't pull its information from JimF's current interface mechanism. >... > > Although I think func signatures are an easy syntactic extension which > > several people have provided samples for, so the interface can use that. > > The attributes of a module/class are the hard part. And no... I haven't > > read JimF's proposal yet to see his suggestion for how this might be > > done... it does apply to this problem. And here we tried to separate > > interfaces from the discussion :-) > > > > Suggestion: defer consideration of interfaces (whether via Martijn's > > approach or a separate file) for V2 of the type system design. For V1, > > let's concentrate on applying type signatures to functions (and variables > > if people insist :-), and any type inferencing that may be needed. > > Hmm, I hink the opposite is the way to go. Forget about type signatures > for functions et al at all, just use interface info, and prove the > interface by type inference. The interface must be defined syntactically (or at least very transparently to the compiler). Using a procedural mechanism only helps with runtime issues. Given the presumption of syntactical interface definitions, this leads to type signatures for functions. > The interface is correct if and only if it can be proven. This is a different problem, IMO. I would like to see interfaces used to tell callers about the type information. I don't care whether the interface is truly representative of the code or not. > Given that, I see no reason to spoil Python with extra type > annotation syntax. It's the other way round: > If there is a correct interface, then type inference can be run > in parallel as you are typing, like code colorizing, and python > can tell you the set of types which any expression might have. For runtime applications: yes. For compile-time static checks, you most likely require new syntax. >... 
> What am I missing when I say: > We need interfaces only and an inference machine to prove it, > and that's all! Forget about extra info in the Python code. > What would it help? I believe this is the whole story, and > building upon JimF's startup, we would just need to write > the inferencer now. You're missing the requirement that a compiler must be able to extract useful information. I'm not sure that a compiler cannot do this with the current JimF proposal: 1) there is no mechanism for signatures (or a defined way for the compiler to extract/parse them) 2) procedural definition is allowed (e.g. instantiating Method()), which prevents the compiler from extracting the info. New syntax, or well-known mechanism such as __types__ is needed. I believe JimF's proposal could be used (pull the info from __implements__ and the definition of the interface), but some of its dynamicism must be torched. (and maybe where it is required, we only allow it in one or two specific cases (internal to the Interfaces implementation) and then hard-code those details into the compiler/inferencer). Cheers, -g -- Greg Stein, http://www.lyra.org/ From guido@CNRI.Reston.VA.US Wed Dec 15 17:07:30 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Wed, 15 Dec 1999 12:07:30 -0500 Subject: [Types-sig] Low-hanging fruit: recognizing builtins Message-ID: <199912151707.MAA02639@eric.cnri.reston.va.us> It's always bothered me from a performance point of view that using a built-in costs at least two dict lookups, one failing (in the modules' globals), one succeeding (in the builtins). This is done so that you can define globals or locals that override the occasional builtin; which is good since new Python versions can define new builtins, and if you weren't allowed to override builtins this would break old code. Here's a way that per-module analysis plus a conservative assumption plus an addition to the PVM (Python Virtual Machine) bytecode can remove *both* dict lookups for most uses of builtins. Per-module analysis can easily detect that there are no global variables named "len", say. In this case, any expression calling len() on some object can be transformed into a new bytecode that calls PyObjectt_Length() on the object at the top of the stack. Thus, a sequence like LOAD_GLOBAL 0 (len) LOAD_FAST 0 (a) CALL_FUNCTION 1 can be replaced by LOAD_FAST 0 (a) UNARY_LEN which saves one PVM roundtrip and two dictionary lookups, plus the argument checking code inside the len() function. There are plenty of bytecodes available. In addition, we can now optimize common idioms involving builtins, the most important one probably for i in range(n): ... We lose a tiny bit of dynamic semantics: if some bozo replaces __builtin__.len with something that always returns 0, this won't affect our module. Do we care? I don't. We don't have to do this for *every* builtin; for example __import__() has as explicit semantics that you can replace it with something else; for open() I would guess that there must be plenty of programs that play tricks with it. But range()? Or len()? Or type()? I don't think anybody would care if these were less dynamic. Note that you can still override these easily as globals, it just has to be visible to the global analyzer. The per-module analysis required is major compared to what's currently happening in compile.c (which only checks one function at a time looking for assignments to locals) but minor compared to any serious type inferencing. 
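A rough sketch of that per-module scan, using the modern dis module purely for illustration (the real thing would live in compile.c): walk every code object in the module looking for stores to builtin names; any builtin never stored to is a candidate for a dedicated opcode.

import dis, builtins

def overridden_builtins(module_code):
    # Builtin names the module ever assigns or deletes at global scope.
    # Anything *not* in this set (e.g. len, range, type) could safely be
    # compiled to a specialized opcode for this module.
    found = set()
    def walk(code):
        for ins in dis.get_instructions(code):
            if (ins.opname in ("STORE_GLOBAL", "STORE_NAME", "DELETE_GLOBAL")
                    and ins.argval in vars(builtins)):
                found.add(ins.argval)
        for const in code.co_consts:          # nested functions, classes, ...
            if hasattr(const, "co_code"):
                walk(const)
    walk(module_code)
    return found

src = "def main():\n    return len('abc')\n"
print(overridden_builtins(compile(src, "<demo>", "exec")))   # set(): len is safe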
Clearly this does nothing for (ERR), but I bet it could speed up the typical Python program with a substantial factor... Any takers? --Guido van Rossum (home page: http://www.python.org/~guido/) From GoldenH@littoncorp.com Wed Dec 15 17:52:54 1999 From: GoldenH@littoncorp.com (Golden, Howard) Date: Wed, 15 Dec 1999 09:52:54 -0800 Subject: [Types-sig] What is the Essence of Python? (Was: Low-hanging frui t: recognizing builtins) Message-ID: Guido van Rossum wrote: > We lose a tiny bit of dynamic semantics: if some bozo replaces > __builtin__.len with something that always returns 0, this won't > affect our module. Do we care? I don't. We don't have to do this > for *every* builtin; for example __import__() has as explicit > semantics that you can replace it with something else; for open() I > would guess that there must be plenty of programs that play tricks > with it. But range()? Or len()? Or type()? I don't think anybody > would care if these were less dynamic. I reiterate that we should define what is the essence of Python, so we know what sort of dynamicism and flexibility we are trying to preserve, and what is superfluous. Until we do this, we are dealing with a squishy set of requirements. In the various comments in the last few days, I have the sense that many of you are using Python in very innovative ways, far beyond my pedestrian style. Therefore, I find some of the arguments very esoteric. Even if no one is willing to attempt a definition, I would certainly benefit if someone would point me in the direction of examples of dynamic Python that you want to preserve. - Howard From paul@prescod.net Wed Dec 15 18:36:44 1999 From: paul@prescod.net (Paul Prescod) Date: Wed, 15 Dec 1999 10:36:44 -0800 Subject: [Types-sig] Handling attributes Message-ID: <3857DFBC.A2B7ADAC@prescod.net> > Yes. My reluctance to specify types for instance variables is caused by > problems with designing a nice, inline syntax for it. If you're not > worrying about an inline syntax, then you can definitely add typedecls for > instance and class attributes. Okay, but what about all of the other questions (updated slightly): b. do we check assignments to class and module attributes from other modules at runtime? We need to expect that some cross-module assignments will come from modules that are not statically type checked. c. should we perhaps just disallow writing to "declared" attributes from other classes and modules? d. is it possible to write to UN-declared attributes from other people's classes and modules? And what are the type safety implications of doing so? -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things to be wary of: A new kid in his prime A man who knows the answers, and code that runs first time http://www.geezjan.org/humor/computers/threes.html From paul@prescod.net Wed Dec 15 18:37:40 1999 From: paul@prescod.net (Paul Prescod) Date: Wed, 15 Dec 1999 10:37:40 -0800 Subject: [Types-sig] Implementability References: <000f01bf46e4$650557c0$05a0143f@tim> Message-ID: <3857DFF4.F02097CB@prescod.net> I was wondering when our Professional Compiler Writer and resident skeptic would jump in and tell us what we were doing wrong! Thanks. Tim Peters wrote: > > ... > > So your intuition is on the right track here. What I can add as a former > Professional Compiler Writer is my Professional Assurance that making this > all run efficiently (in either time or space) is a Professional Pain in the > Professional Ass. 
According to the principle of "from each according to their talents" you should be writing this optimizing, static type checker. > Because of this, global analysis never works out in > practice unless you invent an efficient database format to cache the results > of analysis, keeping that in synch with the source base under mutation. Bah. The scope of compilation is the module. The scope of inference is a namespace defining suite (e.g. a module, class body or method, but not an "if" or "try"). > It's all too easy to come up with a toy system that absolutely will not > scale to real life! Python has an advantage, though, in that most people > write very small functions and methods most of the time. If you can, in > addition, avoiding needing to deduce the types of most globals, it could > actually fly before we're all dead . The types of globals from other modules should be explicitly declared. If they aren't, they are presumed to have type PyObject or to return PyObject. Or they just aren't available if you are in strict static type check mode. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things to be wary of: A new kid in his prime A man who knows the answers, and code that runs first time http://www.geezjan.org/humor/computers/threes.html From paul@prescod.net Wed Dec 15 18:38:21 1999 From: paul@prescod.net (Paul Prescod) Date: Wed, 15 Dec 1999 10:38:21 -0800 Subject: [Types-sig] check_type() Message-ID: <3857E01D.6C699075@prescod.net> > > original RFC spelled this as: has_type( foo, types.StringType ) which > > returns (in this case) a string or NULL. This strikes me as more > > flexible than ! because you can use it in an assertion but you don't > > have to. You can also use it like this: > > > > j=has_type( foo, types.StringType ) or has_type( foo, types.ListType ): > > You'll have issues with empty strings and empty lists, as Guido pointed > out. Yes, you have to use it in ways that follow Python's boolean rules. A better name would be check_type. j=check_type( foo, types.StringType ) > has_type() does not create a *definitive* type assertion. The compiler > cannot extract any information from the presence of has_type(). Using an > operator which raises an exception allows the compiler to make the > assertion (and thereby assist with type inferencing and type checking). j=check_type( foo, types.StringType) j is *guaranteed* to be either a string or None. Note that check_type is actually an operator in that it cannot be overwritten or shadowed. It just happens to be an operator that looks like a function and that returns a useful value instead of immediately causing an exception. It also happens to be compatible with the current Python grammar. I have big aesthetic problems with adding a special character to a language that uses the word "or" to mean, well "or" and "not" to mean "not". I might be able to live with "k = eval('1') as int" if it isn't too horribly ambiguous. Sorry, I've seen so many posts back and forth that I can't remember what the consensus on this was. That's my fault as moderator. We'll have to start focusing on individual issues soon because the fireman's hose approach is reaching its limits (but it was great at first!) 
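For reference, the check_type() spelling above written out as a plain function, so the contrast with the '!' operator is visible: it filters rather than raises. A sketch only; str stands in for types.StringType.

def check_type(obj, expected_type):
    # returns the object if it has the expected type, else None
    if isinstance(obj, expected_type):
        return obj
    return None

foo = "abc"
j = check_type(foo, str) or check_type(foo, list)    # j == "abc"
# The caveat noted above still applies: check_type("", str) returns "", which
# is false in a boolean context, so the "or" idiom misfires on empty strings
# and lists.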
-- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things to be wary of: A new kid in his prime A man who knows the answers, and code that runs first time http://www.geezjan.org/humor/computers/threes.html From paul@prescod.net Wed Dec 15 18:38:35 1999 From: paul@prescod.net (Paul Prescod) Date: Wed, 15 Dec 1999 10:38:35 -0800 Subject: [Types-sig] Interface files References: Message-ID: <3857E02B.53CF27AC@prescod.net> Greg Stein wrote: > > I stated a preference for allowing this information to reside in the same > file as the implementation. i.e. I don't want to maintain two files. The nice thing about having separate files is that it becomes instantly clear what is "interesting" to the compiler. We have no backwards compatibility constraints. We have no questions about what variable are "in scope" and "available". It's just plain simpler. There is also something deeply elegant and useful about a separation of interface from implementation. Sure, you don't always want to be REQUIRED to separate them. I acknowledge that we will one day have to support inline declarations but I'm going to put it off unless I hear some screaming. > I'll go further and state that we should not use a new language for this. > It should just be Python. (and this is where Martijn's __types__ thing > comes in, although I'm not advocating that format) I think that that's an unreasonable (and unreadable) constraint. The language should probably be pythonic, but not necessarily Python. Python doesn't have a type declaration syntax and none of Python's existing syntax was meant to be used AS a type declaration syntax. It just gets too unreadable for quasi-complicated declarations. We need to support polymorphic and parameteric higher order functions! -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things to be wary of: A new kid in his prime A man who knows the answers, and code that runs first time http://www.geezjan.org/humor/computers/threes.html From paul@prescod.net Wed Dec 15 18:56:17 1999 From: paul@prescod.net (Paul Prescod) Date: Wed, 15 Dec 1999 10:56:17 -0800 Subject: [Types-sig] Low-hanging fruit: recognizing builtins References: <199912151707.MAA02639@eric.cnri.reston.va.us> Message-ID: <3857E451.F7E4B6EC@prescod.net> What is the proposed framework for these sorts of experiments? Perhaps the first project should be an interpreter that can be extended with new bytecodes perhaps through a registration mechanism...and a hook to call Python code between parsing and generating bytecodes. You have specifically commissioned this experiment so it has a good chance of being "rolled in" but in the more general case... -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things to be wary of: A new kid in his prime A man who knows the answers, and code that runs first time http://www.geezjan.org/humor/computers/threes.html From guido@CNRI.Reston.VA.US Wed Dec 15 19:21:40 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Wed, 15 Dec 1999 14:21:40 -0500 Subject: [Types-sig] Low-hanging fruit: recognizing builtins In-Reply-To: Your message of "Wed, 15 Dec 1999 10:56:17 PST." <3857E451.F7E4B6EC@prescod.net> References: <199912151707.MAA02639@eric.cnri.reston.va.us> <3857E451.F7E4B6EC@prescod.net> Message-ID: <199912151921.OAA07698@eric.cnri.reston.va.us> > What is the proposed framework for these sorts of experiments? 
Perhaps > the first project should be an interpreter that can be extended with new > bytecodes perhaps through a registration mechanism...and a hook to call > Python code between parsing and generating bytecodes. You have > specifically commissioned this experiment so it has a good chance of > being "rolled in" but in the more general case... Dynamic bytecode registration would slow the PVM too much. Just hack a few special cases into ceval.c and then go hack on the bytecode. Note that the bytecode hacking could conceivably be done entirely in Python. I think Skip Montanaro may have tools to do this already. A first approximation would be to go hunt through all existing code objects in a module for LOAD_GLOBAL and STORE_GLOBAL opcodes with built-in names; for all such built-in names that have no STORE_GLOBAL anywhere, it's "safe enough" to use the special opcode. Then of course you will have to hunt through the bytecode for sequences of LOAD_GLOBAL(), followed by arbitrary code to load an object, followed by CALL_FUNCTION(1). --Guido van Rossum (home page: http://www.python.org/~guido/) From Edward Welbourne Wed Dec 15 19:59:10 1999 From: Edward Welbourne (Edward Welbourne) Date: Wed, 15 Dec 1999 19:59:10 +0000 Subject: [Types-sig] expression-based type assertions Message-ID: [Due to confusion on my part, this begins by echoing other stuff] [First: something I sent to Greg while forgetting it wasn't to all] A reason why I want not-just-dotted-names as the type checking object ... a generator for type-checking tuples (say), which takes some parameters (checkers for the individual members of a tuple) and returns a checker for that flavour of tuple. Of course, a compiler can only *exploit* this if it `knows' what the relevant type checker `really will be'. However, I see no reason against *allowing* the checker construct to be applied using checkers that no compiler could hope to recognise - these will buy you robustness but no speed - while the spec for checking can say that compilers are at liberty to exploit any checkers they *do* recognise. Even dotted names, if the compiler doesn't `know' the relevant checker, won't prevent this: and parameterised checkers, like the tuple example above, can be known (by some compilers) despite involving more than just dotted names. Note: one common type for functions is the whatever-or-None used for default arguments, notably in the case where the default for the argument is an empty dict (or list), but the standard gotcha about using {} or [] as default obliges one to use None and begin the function with if arg is None: arg = { } # or [ ] as may be This is going to provide irritation for the syntax of checking unless something sensible is dreamed up (or it's not done in the argument list). A possibility would be `if a default value is given, this value is also tolerated for the argument, even if it fails the type check' subject to a presumption that the function must begin with something which copes with the default, replacing it with something that matches the type spec ... [To which Greg replied (I've shrunk my excerpts here):] EW> A reason why I want ... returns a checker for that flavour of tuple. Sure. Syntactical type declarators are fine. But arbitrary expressions would prevent the compiler from understanding what was going on. In fact, I proposed this exact kind of thin in the "GFS proposal". Did you read that? For example: x = foo() ! Int x, y = foo() ! (Int, String) In the first case, you have a dotted name. 
In the second, the parser and compiler understand that the parens mean tuple-of-these-types. EW> Of course, a compiler ... exploit any checkers they *do* recognise. A dotted name allows the following construct: MyChecker = SomeCheckerGenerator(...) x = foo() ! MyChecker Again, this was in the GFS proposal. Since you can always do an assignment such as above, I felt it was quite fine to say "only dotted names." EW> Even dotted names, ... involving more than just dotted names. Well... it gets pretty tough for the compiler, the further you move from simple dotted names. Worst case, the compiler can issue a warning saying it doesn't understand how to do the compile-time check, and then insert a runtime check. EW> Note: ... whatever-or-None ... something that matches the type spec ... True. Syntactic markers can created and used to state " or None". [OK, sorry about that, back to the present] Two issues are developing in the list: one is name-checking, the other is value-checking. The two are mostly seperable - however, the parameters of a function provide a clash: value-checking says that the parameter is tagged with the type of value the *caller* may supply, name-checking tags it with a constraint that stops the function subsequently using that name to hold any other type (while incidentally doing the value-checker's check when the function is invoked). This is an area of difficulties. I wish to state explicitly that: A quite natural consolidation of the existing python object model will leave us in a position where attribute modification is *always* done by thing.__setattr__(key, val) or thing.__delattr__(key) for any thing you care to mention (albeit with some subtleties). The resulting framework *will* make it easy to: * set up a name-space such that the suite executed to initialise it performs sensible attribute modification (without any fancy syntax of fuss within the suite), yet: once it is initialised it has no attribute modification methods - if you really want, access to its __dict__ can be unavailable. * police any restrictions you can specify on what values may be stored against which names in a given namespace: you do this with a __setattr__ hack. * have your locals and globals just be namespaces, so there's nothing special about them - i.e. you can do the above with them. It is unnecessary to add syntax to the language for the purpose of specifying what you can arrange to have __setattr__ do. In particular, use of setattr hacks is the right way to implement any fascist policy a name-space wants to use in controlling modifications to itself - not new grammar. However, that only affects the robustness side of matters - it doesn't provide for the compiler to know it can presume the relevant hacks are in place: but value-checking can be used here too, with care. (Have a checker which corresponds to the assertion: this object has a namespace containing the name foo, with value matching bah-spec; optionally specifying that this can't be changed - which the checker can check by trying to, or by looking for the attribute modification methods.) So I'm falling into the purely value-checked camp. Type-specifications on values (of which function argument/return type-specs are examples) are valuable: my one reservation about them is that they are syntactic sugar for assertions - but I accept they are a good scheme which will genuinely increase the extent to which folk will make assertions: which improves robustness and maintainability. I will suppose that ! 
is the operator to be used for this (since : is so widely used already that another use would be irksome), but I dont' really care what the symbol is (though: the type assertion makes sense as an exclamation, e.g. `7 + (x ! IntType)' reads quite neatly as 7 plus x, which *is* an int ! as it were). I have some sympathy with someone's suggestion that the return-type of a function gets a different spelling for !, e.g. -> I *really* like TimP's Haskellish spelling of type-specifiers. The function of inline assertions is then to ensure that the compiler can perform (rudimentary) type inference based on that which is asserted. There are two ways a compiler can respond to !-assertions (indeed, it has the same choice with assert): * exploit - generate faster code, presuming the asserted truth * check - verify the assertion The latter is obvious - perform the test, raise the exception on failure. One of the easiest ways to exploit an assertion is to infer that some subsequent assertion is guaranteed true and elide its check. The inferencer has other sources of data than the !-assertions - for instance, immediately after the expression `x+7' has been evaluated, we know that x's value supports addition, at least with integers, at least this way round - either that, or we've raised an exception and aren't executing the code which followed. I believe it is philosophically valid for the inferencer to presume the truth of anything that is asserted (by existing assert statements) even if __debug__ is false. How much umbrage does that raise ? The inferencer takes all these sources of information and infers what it will, the compiler exploits the inferred truths in hunting for efficient ways to implement the given code. Any part of the code that the inferencer doesn't know how to exploit, it simply (silently) doesn't exploit. Whether it bothers to check will depend on optimise/debug flags its been given and how anal the compiler-writer is. On the dotted-names issue: so long as it's valid to say > MyChecker = SomeCheckerGenerator(...) > x = foo() ! MyChecker there is no mileage in forbidding x = foo() ! SomeCheckerGenerator(...) it just obliges me to polute my namespace (unless, of course, I intend to re-use the checker). The objection that the compiler has a hard time coping with this is without substance: * the difficulties involved in recognising SomeCheckerGenerator(...) are present in both of the above and in no way reduced by storing the result in a variable in the mean time: the compiler's knowledge of what `! MyChecker' buys it is entirely dependent on making sense of the code it's just seen which gives MyChecker a value. * each compiler is only going to recognise a sub-set of the type-specifiers deployed, even with the dotted-name constraint. * when the compiler recognises a checker, it can generate code which exploits the truth asserted. * the right thing for a compiler to do about unrecognised checks is to not exploit them. After all, it has nothing to exploit. and, in any case, dotted names (may) involve function calls: __getattr__. > I think altering isinstance() to accept a callable is preferable to > introducing a __check__ method. Ah, have I mis-understood you ... I thought you said that isinstance would take a third argument which is callable ... were you saying that it accepts a third kind of thing as its second argument ? In which case I see where you're going and that sounds great. 
def isinstance(thing, *modes):
    for mode in modes:
        if type(mode) is TypeType:
            if type(thing) is mode: return 1
        elif type(mode) is ClassType:
            try:
                if issubclass(thing.__class__, mode): return 1
            except AttributeError: pass
        else: # mode is presumed to be a callable
            if mode(thing): return 1
    return 0

Associativity: 7 ! IntType ! (Prime ! TypeChecker)
7 is an integer, indeed it's a prime (oh, and Prime is a type-checker). So ! groups to the left (I think that's what + et al. do).

Paul said: > There is also something deeply elegant and useful about a separation > of interface from implementation.

Conceptual separation - yes. Physical separation - quite the opposite. When programs and documentation live apart, they drift apart. The only way the interface spec can live apart from the implementation is when the implementation can be checked for compatibility with the interface (so that the implementation change which changes the interface gets recognised as such the first time the changed code is used). If we can check the implementation for compatibility with its interface spec, we're already specifying the interface in the implementation, so why bother having a second copy in a separate file ?

This list is too busy. Time I went home, before I make it any worse. Eddy. -- Yes, I did read the GFS proposal.

From Edward Welbourne Wed Dec 15 20:26:25 1999 From: Edward Welbourne (Edward Welbourne) Date: Wed, 15 Dec 1999 20:26:25 +0000 Subject: [Types-sig] Re: [Doc-SIG] Sorry! In-Reply-To: <3857E6F9.52FC29F2@prescod.net> References: <385665DE.9963174B@prescod.net> <38569FFF.78EC8DA@prescod.net> <3857CDD1.D77331AD@prescod.net> <3857E6F9.52FC29F2@prescod.net> Message-ID:

Damn. Another message arrived before I could escape ;^/ Paul said (we'd gone off line due to another of my confusions): > Greg admits that his proposal does nothing about attributes. Thus far I've only seen him saying he doesn't consider them worth attention. > a whole interface definition mechanism. Which brings us back to the > idea of interfaces separate from implementations, which brings us back > to shadow files, even for Python code. whoa. I don't follow the inferences here.

The only interface definition mechanism I can see needed is the one that lets us specify the analogue of C structures and function types - that is, the equivalent of a typedef. One interface thus defined can be deployed for several objects that support it - this does not mean that we have to have a separate *file* in which to say it, let alone a separate file in which to re-iterate the specification of the interface for each of the files which defines an export which matches that interface.

The !-assertion mechanism will, indeed, depend (when taken to its logical extreme) on some way of saying `an object which has attribute foo, which is an integer and which *you* cannot modify though the object might', mutatis mutandis. For that we'll need an IDL, in some guise, which can produce an object which encodes that spec. Such an object can be used in many places. While that object gets constructed someplace else than the objects it gets used to describe, it needn't be in a separate file; nor need this place know anything about the objects that will be described by the interface description object, least of all their names.
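For illustration only, such an interface description object could be as plain as the following sketch; the class name and its protocol of simply being callable are invented here, and the `cannot modify' half of the spec is left out:

    class HasTypedAttr:
        # encodes "an object with attribute <name> whose value passes <checker>"
        def __init__(self, name, checker):
            self.name = name
            self.checker = checker

        def __call__(self, thing):
            try:
                value = getattr(thing, self.name)
            except AttributeError:
                return False
            return self.checker(value)

    def is_int(value):
        return isinstance(value, int)

    has_int_foo = HasTypedAttr("foo", is_int)   # "has attribute foo, an integer"

Such a checker is an ordinary value: it can sit on the right of a !-assertion, be handed to a setattr hook, or be published by an object as an interface it claims to support.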
As long as we can construct objects which encode interface descriptions, we can * use !-assertions on values in-line, where those values are computed, * use those interface objects when filtering using a setattr hack * provide some mechanism for an object to `publish' the fact that it supports some given interface (as I understand it, someone's done this). and if we can't define such objects then there is no amount of fun and games with shadow files can possibly help us. (If I keep this up much longer, I shalln't have time to write about my object-unification schemes in time for IPC8 ... which I care about much more than type-checking, and which may simplify all this anyway.) Dinner time, Eddy. From janssen@parc.xerox.com Wed Dec 15 20:26:14 1999 From: janssen@parc.xerox.com (Bill Janssen) Date: Wed, 15 Dec 1999 12:26:14 PST Subject: [Types-sig] Shadow File Opinions? In-Reply-To: Your message of "Tue, 14 Dec 1999 18:41:56 PST." <3856FFF4.1C1A4AD@prescod.net> Message-ID: <99Dec15.122615pst."3601"@watson.parc.xerox.com> > Bill, while you're here, could you help me out with the CORBA IDL POV on > generic types? Does IDL support parameterization? Nope. It believes in inheritance and mixins, which give you a different set of capabilities. Actually, with valuetypes, that's become a much more reasonable position. Bill From mal@lemburg.com Wed Dec 15 18:14:20 1999 From: mal@lemburg.com (M.-A. Lemburg) Date: Wed, 15 Dec 1999 19:14:20 +0100 Subject: [Types-sig] Low-hanging fruit: recognizing builtins References: <199912151707.MAA02639@eric.cnri.reston.va.us> Message-ID: <3857DA7C.75433A13@lemburg.com> Guido van Rossum wrote: > > It's always bothered me from a performance point of view that using a > built-in costs at least two dict lookups, one failing (in the modules' > globals), one succeeding (in the builtins). This is done so that you > can define globals or locals that override the occasional builtin; > which is good since new Python versions can define new builtins, and > if you weren't allowed to override builtins this would break old code. > > Here's a way that per-module analysis plus a conservative assumption > plus an addition to the PVM (Python Virtual Machine) bytecode can > remove *both* dict lookups for most uses of builtins. > > Per-module analysis can easily detect that there are no global > variables named "len", say. In this case, any expression calling > len() on some object can be transformed into a new bytecode that calls > PyObjectt_Length() on the object at the top of the stack. Thus, a > sequence like > > LOAD_GLOBAL 0 (len) > LOAD_FAST 0 (a) > CALL_FUNCTION 1 > > can be replaced by > > LOAD_FAST 0 (a) > UNARY_LEN > > which saves one PVM roundtrip and two dictionary lookups, plus the > argument checking code inside the len() function. > > There are plenty of bytecodes available. > > In addition, we can now optimize common idioms involving builtins, the > most important one probably > > for i in range(n): ... > > We lose a tiny bit of dynamic semantics: if some bozo replaces > __builtin__.len with something that always returns 0, this won't > affect our module. Do we care? I don't. We don't have to do this > for *every* builtin; for example __import__() has as explicit > semantics that you can replace it with something else; for open() I > would guess that there must be plenty of programs that play tricks > with it. But range()? Or len()? Or type()? I don't think anybody > would care if these were less dynamic. 
Note that you can still > override these easily as globals, it just has to be visible to the > global analyzer. > > The per-module analysis required is major compared to what's currently > happening in compile.c (which only checks one function at a time > looking for assignments to locals) but minor compared to any serious > type inferencing. Clearly this does nothing for (ERR), but I bet it > could speed up the typical Python program with a substantial factor... I like this :-) How about also adding caching of globals which are not modified within the module in locals ? This would save another cylce or two. The caching would have to take place during function creation time. I'm currently doing this by hand which results in ugly code... :-( but faster execution :-) Note that interning the builtins as byte codes could be a security risk when executing in a restricted environment, though. Some builtin operations might not be allowed and but would still be available via bytecode. -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 16 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From guido@CNRI.Reston.VA.US Wed Dec 15 22:20:03 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Wed, 15 Dec 1999 17:20:03 -0500 Subject: [Types-sig] Low-hanging fruit: recognizing builtins In-Reply-To: Your message of "Wed, 15 Dec 1999 19:14:20 +0100." <3857DA7C.75433A13@lemburg.com> References: <199912151707.MAA02639@eric.cnri.reston.va.us> <3857DA7C.75433A13@lemburg.com> Message-ID: <199912152220.RAA07942@eric.cnri.reston.va.us> > How about also adding caching of globals > which are not modified within the module in locals ? This > would save another cylce or two. The caching would have > to take place during function creation time. I'm currently doing > this by hand which results in ugly code... :-( but faster execution > :-) Indeed -- the same analysis I was proposing would also support this. However there's a common pattern that can be a problem here (and isn't a problem for the built-in functions analysis): modules often have a few global variables that are initialized only once in the module, but are clearly (e.g. through comments) intended to be modified by using modules. Examples: default files, debug levels, and the like. I'm not sure how to detect this pattern reliably, unless you decide to cache only functions, classes, and imported modules. > Note that interning the builtins as byte codes could be > a security risk when executing in a restricted environment, > though. Some builtin operations might not be allowed and but would > still be available via bytecode. Of course a restricted environment should not accept arbitrary bytecode! Also you could simply not define bytecodes for security-sensitive built-ins; the only ones I cna think of right now are __import__() and open(), which I already mentioned as exceptions. Note that a bunch of built-in constants can also be optimized using this same mechanism: None and perhaps exception names. I'm not sure that exception names are worth it though; they don't tend to be touched in inner loops where performance gains are made. But None is definitely worth its own 1-byte opcode. 
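A rough sketch of the per-module analysis described above: walk a module's code objects and report built-in names that are only ever loaded, never stored, so a specialised opcode (or cached lookup) is "safe enough". It is written against the later dis.get_instructions API purely for brevity (the 1.5-era version would grovel over co_code by hand, or use Skip's tools), and it ignores complications such as import *:

    import dis
    import types
    import builtins

    def safe_builtins(module):
        loaded, stored = set(), set()

        def walk(code):
            for instr in dis.get_instructions(code):
                if instr.opname in ("LOAD_GLOBAL", "LOAD_NAME"):
                    loaded.add(instr.argval)
                elif instr.opname in ("STORE_GLOBAL", "STORE_NAME"):
                    stored.add(instr.argval)
            for const in code.co_consts:        # recurse into nested code objects
                if isinstance(const, types.CodeType):
                    walk(const)

        for obj in vars(module).values():
            if isinstance(obj, types.FunctionType):
                walk(obj.__code__)

        # builtins that are read but never rebound anywhere in this module
        return {name for name in loaded - stored if hasattr(builtins, name)}

    # safe_builtins(some_module) might return e.g. {"len", "range", "type"}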
--Guido van Rossum (home page: http://www.python.org/~guido/) From paul@prescod.net Wed Dec 15 23:55:05 1999 From: paul@prescod.net (Paul Prescod) Date: Wed, 15 Dec 1999 15:55:05 -0800 Subject: [Types-sig] Low-hanging fruit: recognizing builtins References: <199912151707.MAA02639@eric.cnri.reston.va.us> <3857E451.F7E4B6EC@prescod.net> <199912151921.OAA07698@eric.cnri.reston.va.us> Message-ID: <38582A59.74BEA4CC@prescod.net> Guido van Rossum wrote: > > > Dynamic bytecode registration would slow the PVM too much. I was thinking of just changing this: default: handler = handlers[opcode]; if( handler ){ handler( f ); }else{ fprintf(stderr, "XXX lineno: %d, opcode: %d\n", f->f_lineno, opcode); PyErr_SetString(PyExc_SystemError, "unknown opcode"); why = WHY_EXCEPTION; break; } -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things to be wary of: A new kid in his prime A man who knows the answers, and code that runs first time http://www.geezjan.org/humor/computers/threes.html From paul@prescod.net Thu Dec 16 02:51:54 1999 From: paul@prescod.net (Paul Prescod) Date: Wed, 15 Dec 1999 18:51:54 -0800 Subject: [Types-sig] What is the Essence of Python? (Was: Low-hanging fruit: recognizing builtins) References: Message-ID: <385853CA.62850B22@prescod.net> "Golden, Howard" wrote: > > I reiterate that we should define what is the essence of Python, so we know > what sort of dynamicism and flexibility we are trying to preserve, and what > is superfluous. Until we do this, we are dealing with a squishy set of > requirements. I think that that is always the case in language design. What one person hates is what another loves: even in Python! I don't know how to answer your question. I think that we can only argue about particular features "when we get to them." Most people probably do not use dynamicity to the same extent as the real power users but on the other hand they are the ones who are most fanatical about the language. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things to be wary of: A new kid in his prime A man who knows the answers, and code that runs first time http://www.geezjan.org/humor/computers/threes.html From gstein@lyra.org Thu Dec 16 03:05:01 1999 From: gstein@lyra.org (Greg Stein) Date: Wed, 15 Dec 1999 19:05:01 -0800 (PST) Subject: [Types-sig] What is the Essence of Python? In-Reply-To: <385853CA.62850B22@prescod.net> Message-ID: On Wed, 15 Dec 1999, Paul Prescod wrote: > "Golden, Howard" wrote: > > > > I reiterate that we should define what is the essence of Python, so we know > > what sort of dynamicism and flexibility we are trying to preserve, and what > > is superfluous. Until we do this, we are dealing with a squishy set of > > requirements. > > I think that that is always the case in language design. What one person > hates is what another loves: even in Python! I don't know how to answer > your question. I think that we can only argue about particular features > "when we get to them." I agree. It is like asking somebody to describe the color "blue" :-) I think there is a yardstick in there somewhere, that you can hold up to a feature or design and say "that's Pythonic" or "that's not". But it is very subjective and incapable of being described... 
Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Thu Dec 16 03:49:18 1999 From: gstein@lyra.org (Greg Stein) Date: Wed, 15 Dec 1999 19:49:18 -0800 (PST) Subject: [Types-sig] Handling attributes In-Reply-To: <3857DFBC.A2B7ADAC@prescod.net> Message-ID: On Wed, 15 Dec 1999, Paul Prescod wrote: > > Yes. My reluctance to specify types for instance variables is caused by > > problems with designing a nice, inline syntax for it. If you're not > > worrying about an inline syntax, then you can definitely add typedecls for > > instance and class attributes. > > Okay, but what about all of the other questions (updated slightly): I didn't reply to them because I didn't really have much of an opinion :-) In general, I might say: punt. Don't worry about that stuff right now. Worry about phase 1. Refining assignment behavior can come later, as that "should" be independent of what occurs in the first phase. Cheers, -g -- Greg Stein, http://www.lyra.org/ From tim_one@email.msn.com Thu Dec 16 03:55:18 1999 From: tim_one@email.msn.com (Tim Peters) Date: Wed, 15 Dec 1999 22:55:18 -0500 Subject: [Types-sig] A challenge In-Reply-To: <199912151421.JAA01106@eric.cnri.reston.va.us> Message-ID: <000501bf4779$5e566b40$58a2143f@tim> [Guido] > There seem to be several proposals for type declaration syntaxes > out there, with (mostly implied) suggestions on how to spell > various types etc. > > I personally am losing track of all the various proposals. You're not alone . > I would encourage the proponents of each approach to sit down with > some sample code and mark it up using your proposed syntax. Or write > the corresponding interface file, if that's your fancy. I like interface files fine, but will stick to inline "decl"s below. Apparently unlike anyone else here, I think explicit declarations can make code easier for *human readers* to understand -- so I'm not interested in hiding them from view. > I recommend using the sample code that I posted as a case study, > including some of the imported modules -- this should be a > reasonable but not extreme challenge. Sorry, but if we avoid excessive novelty, it's just a finger exercise <0.5 wink>. Note that you convert this back to Python 1.5.x code simply by commenting out the decl stmts. if-it-looks-a-lot-like-every-other-reasonable-declaration- syntax-you've-ever-seen-it-met-its-goal-ly y'rs - tim import sys, find decl main: def() -> None def main(): decl dir: String, list: [String], name: String dir = "." 
if sys.argv[1:]: dir = sys.argv[1] list = find.find("*.py", dir) list.sort() for name in list: print name if __name__ == "__main__": main() ---------------------------------------------------- import fnmatch import os decl _debug: Int # but Boolean makes more sense; see below _debug = 0 decl _prune: [String] _prune = ['(*)'] decl find: def(String, optional dir: String) -> [String] def find(pattern, dir = os.curdir): decl list, names: [String], name: String list = [] names = os.listdir(dir) names.sort() for name in names: decl name, fullname: String if name in (os.curdir, os.pardir): continue fullname = os.path.join(dir, name) if fnmatch.fnmatch(name, pattern): list.append(fullname) if os.path.isdir(fullname) and not os.path.islink(fullname): decl p: String for p in _prune: if fnmatch.fnmatch(name, p): if _debug: print "skip", `fullname` break else: if _debug: print "descend into", `fullname` list = list + find(pattern, fullname) return list #---------------------------------------------------------------------- import re # Declaring the type of _cache is irritating, because so far # as current Python is concerned a compiled regexp is of # type Instance, and that's too inclusive to be interesting. # I'm giving its class name instead. decl _cache: {String: RegexObject} _cache = {} # Assuming a Boolean "type" exists, if for no other reason # than to support meaningful (to humans!) type declarations. # Declaring all the function signatures in a block here, for # the heck of it. BTW, this is an example of how decls can # aid human comprehension -- e.g., I had to reverse-engineer # the code to figure out whether the "pat" arguments were # supposed to be strings or compiled regexps. They don't # both work, and the name "pat" doesn't answer it. decl fnmatch: def(String, String) -> Boolean, \ fnmatchcase: def(String, String) -> Boolean, \ translate: def(String) -> String def fnmatch(name, pat): import os name = os.path.normcase(name) pat = os.path.normcase(pat) return fnmatchcase(name, pat) def fnmatchcase(name, pat): if not _cache.has_key(pat): decl res: String res = translate(pat) _cache[pat] = re.compile(res) return _cache[pat].match(name) is not None def translate(pat): decl i, n: Int, res: String i, n = 0, len(pat) res = '' while i < n: decl c: String c = pat[i] i = i+1 if c == '*': res = res + '.*' elif c == '?': res = res + '.' elif c == '[': decl j: Int j = i if j < n and pat[j] == '!': j = j+1 if j < n and pat[j] == ']': j = j+1 while j < n and pat[j] != ']': j = j+1 if j >= n: res = res + '\\[' else: decl stuff: String stuff = pat[i:j] i = j+1 if stuff[0] == '!': stuff = '[^' + stuff[1:] + ']' elif stuff == '^'*len(stuff): stuff = '\\^' else: while stuff[0] == '^': stuff = stuff[1:] + stuff[0] stuff = '[' + stuff + ']' res = res + stuff else: res = res + re.escape(c) return res + "$" From tim_one@email.msn.com Thu Dec 16 03:55:23 1999 From: tim_one@email.msn.com (Tim Peters) Date: Wed, 15 Dec 1999 22:55:23 -0500 Subject: [Types-sig] Implementability In-Reply-To: <3857DFF4.F02097CB@prescod.net> Message-ID: <000601bf4779$614092e0$58a2143f@tim> [Tim] > making this [global type inference] all run efficiently (in either > time or space) is a Professional Pain in the Professional Ass. [Paul Prescod] > According to the principle of "from each according to their > talents" I'm afraid you've mistaken Benevolent Dictatorship for some variant of Communism . > you should be writing this optimizing, static type checker. Guido is the one interested in magical type inference; I'm not. 
I'm happy to explicitly declare the snot out of everything when I want something from static typing. Merely checking that my declared types match my usage is much easier (doesn't require any flow analysis). The good news is that I couldn't make time to write an inferencer even if I wanted to; if I had time, I'd be much more likely to write something that *used* explicit declarations to generate faster code. >> Because of this, global analysis never works out in practice >> unless you invent an efficient database format to cache the >> results of analysis, keeping that in synch with the source >> base under mutation. > Bah. The scope of compilation is the module. The scope of inference is > a namespace defining suite (e.g. a module, class body or method, but > not an "if" or "try"). "Bah"? I didn't say anything about the granularity of the cached analysis. For general Python use, module level sounds good to me too. But note that in the msg I was responding to, Guido was blue-skying a type checker for IDLE: for interactive use, he'll probably want quicker feedback than that (if a newbie breaks the type correctness of an "if" test with an edit, they should probably be told about it as soon as they move the cursor off the line!). >> ... >> If you can, in addition, avoid needing to deduce the types of >> most globals, it could actually fly before we're all dead . > The types of globals from other modules should be explicitly declared. A global type inferencer can usually figure that out on its own. There's more than one issue being discussed here, alas -- blame Guido <0.9 wink>. > If they aren't, they are presumed to have type PyObject or to return > PyObject. Or they just aren't available if you are in strict static > type check mode. In the language of the msg to which you're replying, they're associated with the universal set (the set of all types) -- same thing. Then e.g. declared_int = unknown is an error, but unknown1 = unknown2 is not. Whether unknown = declared_int should be an error is a policy issue. Many will claim it should be an error, but the correct answer is that it should not. Types form a lattice, in which "unknown" is the top element, and the basic rule of type checking is that the binding lhs = rhs is OK iff type(lhs) >= type(rhs) where ">=" is wrt the partial ordering defined by the type lattice (or, in English , only "widening" bindings are OK; like assigning an int to a real, or a subclass to a base class etc, but not their converses). phrase-it-that-way-or-not-you-end-up-with-the-same-rules-ly y'rs - tim From paul@prescod.net Thu Dec 16 03:02:00 1999 From: paul@prescod.net (Paul Prescod) Date: Wed, 15 Dec 1999 19:02:00 -0800 Subject: [Types-sig] Re: [Doc-SIG] Sorry! References: <385665DE.9963174B@prescod.net> <38569FFF.78EC8DA@prescod.net> <3857CDD1.D77331AD@prescod.net> <3857E6F9.52FC29F2@prescod.net> Message-ID: <38585628.D4B4934F@prescod.net> Edward Welbourne wrote: > > The only interface definition mechanism I can see needed is the one that > lets us specify the analogue of C structures and function types - that > is, the equivalent of a typedef. One interface thus defined can be > deployed for several objects that support it - this does not mean that > we have to have a separate *file* in which to say it, let alone a > separate file in which to re-iterate the specification of the interface > for each of the files which defines an export which matches that > interface. Who said anything about a separate file for every interface? 
The benefits of the shadow files have been documented in other messages including those in the thread "Shadow File Opinions" and "Progress" and "Interface Files". You said that static type checking was ugly to start with so I would have thought that you would prefer a proposal that separated the type declarations from your code. This is one of the reasons I like this strategy: to comfort those that didn't want static types in Python code. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things to be wary of: A new kid in his prime A man who knows the answers, and code that runs first time http://www.geezjan.org/humor/computers/threes.html From skip@mojam.com (Skip Montanaro) Thu Dec 16 05:38:18 1999 From: skip@mojam.com (Skip Montanaro) (Skip Montanaro) Date: Wed, 15 Dec 1999 23:38:18 -0600 (CST) Subject: [Types-sig] challenge response (was: A challenge) In-Reply-To: References: <199912151421.JAA01106@eric.cnri.reston.va.us> Message-ID: <14424.31434.689571.714592@dolphin.mojam.com> Greg> Line 7: per caveat #1, assume the compiler can access the Greg> find.find() function. From that, it knows the signature. The first Greg> parameter has a matching type, but the second (PyObject) does not Greg> match the required type (String), so an error is raised. If caveat Greg> #5 is resolved, then the second parameter matches. It is also Greg> possible to avoid the error by rewriting: Greg> list = find.find("*.py", dir!StringType) # 7 Greg> "list" is now a ListType, based on the find.find() return Greg> value. (see caveat #5 -- it could be possible to refine this Greg> knowledge). I humbly assert this train of thought rates a *bzzzt*. I thought one core requirement was that all type declaration stuff be optional. The worst that the type checker/inferencer should do in the face of incomplete type info is display a warning. I don't think you can flag an error unless the programmer sets some sort of PY_ANAL_TYPE_CHECKING_AND_I_REALLY_MEAN_IT environment variable. Skip Montanaro | http://www.mojam.com/ skip@mojam.com | http://www.musi-cal.com/ 847-971-7098 | Python: Programming the way Guido indented... From skip@mojam.com (Skip Montanaro) Thu Dec 16 05:54:20 1999 From: skip@mojam.com (Skip Montanaro) (Skip Montanaro) Date: Wed, 15 Dec 1999 23:54:20 -0600 (CST) Subject: [Types-sig] Interface files In-Reply-To: <3857E02B.53CF27AC@prescod.net> References: <3857E02B.53CF27AC@prescod.net> Message-ID: <14424.32396.684602.505977@dolphin.mojam.com> Greg> I stated a preference for allowing this information to reside in Greg> the same file as the implementation. i.e. I don't want to maintain Greg> two files. Paul> The nice thing about having separate files is that it becomes Paul> instantly clear what is "interesting" to the compiler. We have no Paul> backwards compatibility constraints. We have no questions about Paul> what variable are "in scope" and "available". It's just plain Paul> simpler. If you're determined to have some sort of syntax to support declarations, why not separate files for 1.6 and modified syntax for 2.0? 
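To make the separate-file idea concrete, a shadow interface file for the find module of Guido's challenge might be nothing more than ordinary Python in the spirit of Martijn's __types__ suggestion. The file name, the __types__ name, and the signature encoding below are all invented for illustration; none of it is settled syntax:

    # find_interface.py -- hypothetical shadow module for find.py
    __types__ = {
        "_debug": int,                      # module global
        "_prune": [str],                    # list-of-String
        "find":   ([str, str], [str]),      # (argument types, return type)
    }

A checker could import this shadow module and compare it against find.py, without the implementation file carrying any annotations at all.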
Skip From skip@mojam.com (Skip Montanaro) Thu Dec 16 05:57:36 1999 From: skip@mojam.com (Skip Montanaro) (Skip Montanaro) Date: Wed, 15 Dec 1999 23:57:36 -0600 (CST) Subject: [Types-sig] Low-hanging fruit: recognizing builtins In-Reply-To: <199912151921.OAA07698@eric.cnri.reston.va.us> References: <199912151707.MAA02639@eric.cnri.reston.va.us> <3857E451.F7E4B6EC@prescod.net> <199912151921.OAA07698@eric.cnri.reston.va.us> Message-ID: <14424.32592.50142.921358@dolphin.mojam.com> Guido> A first approximation would be to go hunt through all existing Guido> code objects in a module for LOAD_GLOBAL and STORE_GLOBAL opcodes Guido> with built-in names; for all such built-in names that have no Guido> STORE_GLOBAL anywhere, it's "safe enough" to use the special Guido> opcode. Then of course you will have to hunt through the Guido> bytecode for sequences of LOAD_GLOBAL(), followed by Guido> arbitrary code to load an object, followed by CALL_FUNCTION(1). Don't you also have to watch out for the dreaded from my_rewritten_builtins import * ? Skip From mal@lemburg.com Thu Dec 16 10:02:49 1999 From: mal@lemburg.com (M.-A. Lemburg) Date: Thu, 16 Dec 1999 11:02:49 +0100 Subject: [Types-sig] Low-hanging fruit: recognizing builtins References: <199912151707.MAA02639@eric.cnri.reston.va.us> <3857DA7C.75433A13@lemburg.com> <199912152220.RAA07942@eric.cnri.reston.va.us> Message-ID: <3858B8C9.24962AAD@lemburg.com> Guido van Rossum wrote: > > > How about also adding caching of globals > > which are not modified within the module in locals ? This > > would save another cylce or two. The caching would have > > to take place during function creation time. I'm currently doing > > this by hand which results in ugly code... :-( but faster execution > > :-) > > Indeed -- the same analysis I was proposing would also support this. > However there's a common pattern that can be a problem here (and isn't > a problem for the built-in functions analysis): modules often have a > few global variables that are initialized only once in the module, but > are clearly (e.g. through comments) intended to be modified by using > modules. Examples: default files, debug levels, and the like. I'm > not sure how to detect this pattern reliably, unless you decide to > cache only functions, classes, and imported modules. In the long run it would be better to wrap those module globals with write access functions (the write action would then be recognized by the optimizer). I haven't followed the thread too closely, but isn't there some way to tell the optimizer which modules to treat at what optimization level ? Old modules should only use the "safe" caching strategy then while modules compiled with full optimization would be caching all read-only globals. BTW, instead of adding oodles of new byte code, how about grouping them... e.g. instead of UNARY_LEN, BUILD_RANGE, etc. why not have a CALL_BUILTIN which takes an index into a predefined set of builtin functions. The same could be done with some often used constants such as None, '', 1, 0: LOAD_SYSTEM_CONST with an index into a constants array. The advantage is that you can easily extend both sets of prefetched constants while not adding too many new new byte codes to the inner loop. Note that the loop as it is built now is already too large for common Intel+compatible based CPUs. Adding even more byte codes to the huge single loop would probably result in a decrease of CPU cache hits. 
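The grouped-opcode idea is easy to mock up: a single CALL_BUILTIN opcode whose operand indexes a fixed table, rather than one new opcode per builtin. The toy below is plain Python for illustration only, not a sketch of the actual ceval.c dispatch loop:

    BUILTIN_TABLE = (len, range, type, abs, min, max)   # frozen at compile time

    def call_builtin(index, *args):
        # what a CALL_BUILTIN <index> opcode would do: one table lookup,
        # no dict lookups in either the module globals or the builtins
        return BUILTIN_TABLE[index](*args)

    call_builtin(0, [1, 2, 3])     # same as len([1, 2, 3]) -> 3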
(I split the Great Switch in two switch statements and got some good results out of this: the first switch handles often used byte codes while the second takes care of the more exotic ones.) A note on range() and for: the common usage of for i in range(const): ... could be compiled into a completely different set of opcodes not creating any list or tuple at all. Since the FOR_LOOP opcode generates loop integers on each iteration the creation of a range tuple or list is not needed. The loop opcode would only have to check for the upper bound "const". I've added a new counter type (basically a mutable integer type that allows for fast increment and decrement) to simplify this even more. For the curious, it's in the old patch: http://starship.skyport.net/~lemburg/mxPython-1.5.patch.gz > > Note that interning the builtins as byte codes could be > > a security risk when executing in a restricted environment, > > though. Some builtin operations might not be allowed and but would > > still be available via bytecode. > > Of course a restricted environment should not accept arbitrary > bytecode! Also you could simply not define bytecodes for > security-sensitive built-ins; the only ones I cna think of right now > are __import__() and open(), which I already mentioned as exceptions. > > Note that a bunch of built-in constants can also be optimized using > this same mechanism: None and perhaps exception names. I'm not sure > that exception names are worth it though; they don't tend to be > touched in inner loops where performance gains are made. But None is > definitely worth its own 1-byte opcode. See above: I'd rather like see the addition of more generic opcodes than many different new ones for each common constant. -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 15 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From m.faassen@vet.uu.nl Thu Dec 16 13:31:46 1999 From: m.faassen@vet.uu.nl (Martijn Faassen) Date: Thu, 16 Dec 1999 14:31:46 +0100 Subject: [Types-sig] minimal or major change? (was: RFC 0.1) References: Message-ID: <3858E9C2.E2722B88@vet.uu.nl> Greg Stein wrote: > > On Wed, 15 Dec 1999, Martijn Faassen wrote: > > ... me: stating the "GFS proposal" isn't that major of a change ... [I'm disagreeing with the 'isn't that big of a change' thesis, Greg defends fairly well that it is, but I still disagree with him. I don't think our disagreeing will matter much in the future, though, so let's forget about it.. I'll answer some points he raised in the following, but not to defend my point of view :)] [snip] > > * A whole new operator (which you can't overload..or can you?), which > > does something quite unusual (most programmers associate types with > > names, not with expressions). The operation also doesn't actually return > > much that's useful to the program, so the semantics are weird too. > > No, you cannot overload the operator. That would be a Bad Thing, I think. > That would throw the whole type system into the garbage :-). Okay, in that sense the operator would be special, as generally operators in Python can be overloaded (directly or indirectly). I'd agree you shouldn't be able to overload this one, though. > The operator is not unusual: it is an inline type assertion. It is not a > "new-fangled way to declare the type of something." But it's quite unusual to the programmer coming from most other languages, still. 
That doesn't mean it's bad, but Python isn't an experimental language, so this could be an objection to the operator approach. > It is simply a new > operation. The compiler happens to be able to create associations from it, > but that does *not* alter the basic semantic of the operation. > > Given: > > x = y or z > > In the above statement, it returns "y" if it is "true". In the statement: > > x = y ! z > > It returns "y" if it has "z" type; otherwise, throws an exception. The > semantics aren't all the difficult or unusual. Okay, that isn't that unusual as other operator operations can throw exceptions under some circumstances as well. Well defended. :) [snip] > > * Interfaces with a new 'decl' statement. [If you punt on this you'll > > have to the innocent Python programmer he can't use the static type > > system with instances? or will we this be inferenced?] > > Yes, I'd prefer to punt this for a while, as it is a much larger can of > worms. It is another huge discussion piece. In the current discussion, I > believe that we can factor out the interface issue quite easily -- we > can do a lot of work now, and when interfaces arrive, they will slide > right in without interfering with the V1 work. In other words, I believe > there is very little coupling between the proposal as I've outline, and > the next set of type system extensions (via interfaces). Hm, I'm still having some difficulty with this; as I understand it your proposal would initially only work with functions (not methods) which only use built-in types (not class instances). Am I right, or perhaps I'm missing something.. [snip] > > Adding anything like static type checking to Python entails fairly major > > changes to the language, I'd think. Not that we shouldn't aim at keeping > > those transparant and mostly compatible with Python as it is now, but > > what we'll add will still be major. > > Sure. You say 'sure' to me saying it'll still be major? :) Oh, wait, I wasn't arguing about that anymore! > I think we're just viewing it a bit differently. To me, something > like the metaclass stuff was a big change: it is capable of altering the > very semantics of class construction. Adding package support was the same > -- Python moved from a flat import space to an entirely new semantic for > importing and application packaging. Both happened before I was involved with Python, and I still don't know much about metaclasses, so I can't comment on this one. > > > > The 'simplicity' part comes in because you don't need *any* type > > > > inferencing. Conceptually it's quite simple; all names need a type. > > > > > > 1) There is *no* way that I'm going to give every name a type. I may as > > > well switch to Java, C, or C++ (per Guido's advice in another email :-) > > > > Sure, but we're looking at *starting* the process. Perhaps we can do > > away with specifying the type of each local variable very quickly by > > using type inferencing, but at least we'll have a working > > implementation! > > I don't want to start there. I don't believe we need to start there. And > my point (2) below blows away your premise of simplicity. Since you still > need inferencing, the requirement to declare every name is not going to > help, so you may as well relax that requirement. But you'd only need expression inferencing, which I was ('intuitively' :) assuming is easier than the larger scale thing. 
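For reference, the run-time half of the ! operator Greg describes can be mocked up as an ordinary function. "!" is not legal Python, and type_assert below is a hypothetical stand-in used only to show the semantics (return the value if it has the asserted type, otherwise raise), in contrast with check_type, which returns None:

    def type_assert(value, expected_type):
        # x = y ! z   would behave roughly like   x = type_assert(y, z)
        if isinstance(value, expected_type):
            return value
        raise TypeError("expected %s, got %s" % (expected_type, type(value)))

    x = type_assert(42, int)       # passes; a checker may now treat x as an int
    # type_assert("42", int)       # would raise TypeError at run time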
[snip] > > I'm not saying this is a good situation, it's just a way to get off the > > ground without having to deal with quite a few complexities such as > > inferencing (outside expressions), interaction with modules that don't > > have type annotations, and so on. I'm *not* advocating this as the end > > point, but I am advocating this as an intermediate point where it's > > actually functional. > > IMO, it is better to assume "PyObject" when you don't have type > information, rather than throw an error. Detecting the lack of type info > is the same in both cases, and the resolution of the lack is easy in both > mehtods: throw an error, or substitute "PyObject". I prefer the latter so > that I don't have to update every module I even get close to. I still don't understand how making it a PyObject will help here. Would this mean a run-time check would need to be inserted whenever PyObject occurs in a function with type annotations? In my approach this would be part of the Python/Static Python interface work. How does it fit in for you? [snip] > > Yes, but now you're building a static type checker *and* a Python > > compiler inserting run time checks into bytecodes. This is two things. > > This is more work, and more interacting systems, before you get *any* > > payoff. My sequence would be: > > Who says *both* must be implemented in V0.1? If the compiler can't figure > it out, then it just issues a warning and continues. Some intrepid > programmer comes along and tweaks the AST to insert a runtime check. Done. > The project is easily phased to give you a working system very quickly. > > Heck, it may even be easier for the compiler to insert runtime checks in > V0.1. Static checking might come later. Or maybe an external tool does the > checking at first; later to be built into the compiler. That's true; the other approach would start with adding run-time checks and proceed to a static checker later. > ... proposed implementation order ... > > If you don't separate out your development path like this you end up > > having to do it all at once, which is harder and less easy to test. > > Of course. Nobody is suggesting a "do it all at once" course of > implementation. So that's where I'm coming from. It's important for our proposal to actually come up with a workable development plan, because adding type checking to Python is rather involved. So I've been pushing one course of implementation towards a testable/hackable system that seems to give us the minimal amount of development complexities. I haven't seen clear development paths from others yet; most proposals seem to involve both run-time and compile-time developments at the same time. So I'm interested to see other development proposals; possibly there's a simpler approach or equally complex approach with more payoff, that I'm missing. > > [Paul] > > > > > I see no reason for that limitation. The result of a call to a > > > > > non-static function is a Pyobject. You cast it in your client code to > > > > > get type safety. Just like the shift from K&R C to ANSI C. Functions > > > > > > Bunk! It is *not* a cast. You cannot cast in Python. It is a type > > > assertion. An object is an object -- you cannot cast it to something else. > > > Forget function call syntax and casting syntax -- they don't work > > > grammatically, and that is the wrong semantic (if you're using that format > > > to create some semantic equivalent to a cast). 
> > > > This'd be only implementable with run-time assertions, I think, unless > > you do inferencing and know what the type the object is after all. So > > that's why I put the limitation there. Don't allow unknown objects > > entering a statically typed function before you have the basic static > > type system going. After that you can work on type inference or cleaner > > interfaces with regular Python. > > Why not allow unknown objects? Just call it a PyObject and be done with > it. Hm, I suppose I'm looking at it from the OPT point of view; I'd like to see a compiler that exploits the type information. If you have PyObjects this seems to get more difficult; could be solved if you had an interpreter waiting in the sidelines that would handle stuff like this that can't be compiled. > Note that the type-assert operator has several purposes: > > * a run-time assertion (and possibly: unless -O is used) > * signal to the compiler that the expression value will have that type > (because otherwise, an exception would hav been raised) > * provides a mechanism to type-check: if the compiler discovers (thru > inferencing) that the value has a different type than the right-hand > side, then it can flag an error. > > The limitation you propose would actually slow things down. People would > not be able to use the type system until a lot of modules were > type-annotated. I think I'm starting to see where you're coming from now, with the ! operator. It allows you to say 'from this point on, this value is an int, otherwise the operator would've raised an exception'. The inferencer and checker can exploit this. The point where I am coming from is however that you lose compile-time checkability as soon as you use any function that inserts PyObjects into the mix. I'm afraid that even with the operator, you wouldn't be able to check most of the code, if PyObjects are freely allowed. Perhaps I'm wrong, but I'd like to see some more debate about this. > > But perhaps I'm mistaken and local variables don't need type > > descriptions, as it's easy to do type inferencing from the types of the > > function arguments and what the function returns, > > That is my (alas: unproven) belief. How do we set about to prove it? Here I'll come with my approach again; if you have a type checker that can handle a fully annotated function (all names used in the function have type annotations), then you have a platform you can build on to develop a type checker. Then you can figure out what does need type annotations and what doesn't. You simply try to build code that adds type annotations itself, based on inferences. You can spew out warnings: "full type inferencing not possible, cannot figure out type of 'foo'". The programmer can then go add type info for 'foo'. If all types are known one way (specified) or the other (inferred), a compiler can start to do heavy duty optimization on that code. [snip] > > I'd like to see some actual > > examples of how this'd work first, though. For instance: > > > > def brilliant() ! IntType: > > a = [] > > a.append(1) > > a.append("foo") > > return a[0] > > > > What's the inferred type of 'a' now? A list with heterogenous contents, > > that's about all you can say, and how hard is it for a type inferencer > > to deduce even that? > > It would be very difficult for an inferencer. It would have to understand > the semantics of ListType.append(). Specifically, that the type of the > argument is added to the set of possible types for the List elements. 
> > Certainly: a good inferencer would understand all the builtin types and > their methods' semantics. > > > But for optimization purposes, at least, but it > > could also help with error checking, if 'a' was a list of IntType, or > > StringType, or something like that? > > It would still need to understand the semantics to do this kind of > checking. In my no-variable-declaration world, the type error would be > raised at the return statement. a[0] would have the type set: (IntType, > StringType). The compiler would flag an error stating "return value may be > a StringType or an IntType, but it must only be an IntType". Right, I think this would be the right behavior. But it becomes a lot easier to get a working implementation if you get to specify the type of 'a'. If you say a is a list of StringType, it's then relatively easy for a compile time checker to notice that you can't add an integer to it. And possibly it also becomes clearer for the programmer; I had to think to figure out why your compiler would complain about a[0]. I had to play type inferencer myself. I don't have to think as much if I get to specify what list 'a' may contain; obviously if something else it put into it, there should be an error. > > It seems tough for the type > > inferencer to be able to figure out that this is so, but perhaps I'm > > overestimating the difficulty. > > Yes it would be tough -- you aren't overestimating :-) What would your path towards successful implementation be, then? Regards, Martijn From Edward Welbourne Thu Dec 16 14:09:02 1999 From: Edward Welbourne (Edward Welbourne) Date: Thu, 16 Dec 1999 14:09:02 +0000 Subject: [Types-sig] Re: [Doc-SIG] Sorry! In-Reply-To: <38585628.D4B4934F@prescod.net> References: <385665DE.9963174B@prescod.net> <38569FFF.78EC8DA@prescod.net> <3857CDD1.D77331AD@prescod.net> <3857E6F9.52FC29F2@prescod.net> <38585628.D4B4934F@prescod.net> Message-ID: [Ooops - sent it to Paul but not the group, again] > Who said anything about a separate file for every interface? Dunno - I wasn't supposing anyone had. You are asking for a separate file for each module, for the purpose of saying which interfaces the things in the module support. My (admittedly poorly expressed) point was that any mechanism for doing this depends on ways of saying what an interface is; which could as readilly be stated in the source file as in a separate one. If it's so big and unweildy that it needs to go in a separate source file, then: * anything else I write that has the same interface has to include a copy of this big and unweildy thing, instead of just referencing the same interface-description object * it's big and unweildy, so its right out. > The benefits of the shadow files have been documented ... Hmm, I'm not sure I read Guido's > I think that any proposal that requires you to keep two separate files > "in sync" is bound to fail in the long term. I left that crap behind > in C++. But in the short term...okay. as anything but a `yes, we could use this as a temporary measure to let us experiment with static typing within python 1'. And it appears to be typical of the comments to date. I guess this means I should ask: do you consider the changes you're proposing to be * temporary measures or * how python 2 will do this ? 
If the former, then we have nothing to argue about - my sole concern is how python 2 can do this (and, for context, I'm not particularly keen on bothering - but if it's going to be done, I want to be *very* sure it isn't going to foul some of the truly lovely things python 2 could be ...) > You said that static type checking was ugly to start with so I would > have thought that you would prefer a proposal that separated the type > declarations from your code. I also consider many kinds of large industrial plant to be ugly: and whether I can see them or not doesn't enter into it (except that the ones no-one can see can get away with uglier visual appearance than the others): and the `ugliness' I'm sensing isn't a surface-thing. What my prejudices are pinging off is (something I can't properly express, or I wouldn't call it a prejudice, but it's) about the fact that we're *saying things about* what interfaces an object supports, rather than just leaving exceptions to get raised when those interfaces of it get exercised. As with large industrial plant, I can see how it may serve a useful purpose ... but I'd far sooner see that purpose served some other way, or find some way of dodging the need to serve that purpose. Besides, various of the proposals I've seen since being so horribly judgemental have swayed me towards a more ... restrained ... view of type-checking. I'm not keen on it, but maybe it's not as bad as my first impressions. However, trying to hide it so that I'll forget it's there is *much* less welcome than confronting me with something I find ugly. > This is one of the reasons I like this strategy: to comfort those that > didn't want static types in Python code. Don't bother. I don't want comforted. (Although, I confess, I sometimes wish I had my teddy-bear, but don't tell the shrinks ...) If there's going to be a mechanism for saying, in the source, which variables have (and/or which expressions hold) which kinds of value: let it be * straightforward * general (or, in the first instance, straightforwardly extensible to full generality) * part of the source code I'm *much* happier with Tim Peters' scheme, using typedecls, than with any scheme involving my source living in a separate place from something that will make a difference to how it gets compiled. (Tim: the type Boolean is a (useful) synonym for PyObject. It probably includes some added semantics about how you should be trying to use it.) (I'd even be happy with the typedecl incorporating the docstring, which is part of the interface spec after all: and would make the run-time thing actually called be lighter-weight in some probably-irrelevant sense.) Actually, there are two doc-strings in two places: one is the doc-string of (say) the function object - it says what the function does - the other is where some object carrying that function documents the role of the attribute as which it stores that object. This is directly analogous to the two forms of type declaration: one describes an object, regardless of any names we may be calling it, the other describes what some namespace holds under some name. 
OK, now for a *technical* reason in favour of same-sourcefile typedecls: specifically, for the typedecl of a name to appear in the namespace to which it is local: A typedecl's execution (c.f.: del) can lead to the namespace setting up, within its infrastructure (stuff like __dict__), the magic setattr hookery that can implement (possibly under the bonnet) * such enforcing of type-checking (on the names stipulated) as the namespace is prepared to bother with (and the typedecl requests). * whatever machinery distant code is meant to use to ask about the types of the attributes of the objects whose namespace this is. and, incidentally, this `each namespace is responsible for managing itself' mentality says that one can stipulate stuff *within* a function which isn't ever going to be visible to the outside world (e.g. the shadow file) - for instance, if a function contains code which defines and returns a class which defines some method under a name controlled by the arguments recieved by the function, how on earth is the shadow file going to say anything at all about the return type of the function that doesn't simply ignore the argument-dependent stuff ? (OK, I know that was hard to read, so a wilfully perverse example follows.) def crazyIknow(methname): class lumpy (previously, defined, bases): pass def doit(self, kinky, key): return some(expression, involving, self, kinky, key, andmaybe, methname) setattr(lumpy, methname, doit) return lumpy A declspec (or, indeed, type-assertion-for-values) scheme can say something useful about all this, specifically what the type of doit is. How would a shadow file cope ? Work calls, Eddy. From m.faassen@vet.uu.nl Thu Dec 16 14:52:12 1999 From: m.faassen@vet.uu.nl (Martijn Faassen) Date: Thu, 16 Dec 1999 15:52:12 +0100 Subject: [Types-sig] A challenge References: <199912151421.JAA01106@eric.cnri.reston.va.us> Message-ID: <3858FC9C.8472CB62@vet.uu.nl> Hi there, Here's my approach towards type annotations. Note that this syntax is not be very readable, but it is very powerful as it's Python; readable syntax can be developed later. I was lucky because no classes are defined in any of the modules in Guido's example, this means it's fairly readable. :) Type annotations for function locals generally follow under the function definition, type annotations for the entire module follow at the bottom of the module. This order is taken mostly because __types__ uses the other type declarations in itself; the type checker can simply look at __types__ and find all information in there; the __types_foo__ notation is just for convenience. All the type annotations could of course also reside in external interface files. I'm also hinting at a typedef system for complicated composite types. Regards, Martijn #---------------------------------------------------------------------- import sys, find # assume static type classes and such are builtin, for this example def main(): dir = "." 
if sys.argv[1:]: dir = sys.argv[1] list = find.find("*.py", dir) list.sort() for name in list: print name __types_main__ = { 'dir' : StringType, 'list' : ListType(StringType), 'sys.argv' : ListType(StringType) # supply extra types by hand # the type checker should look the 'find' module for more type information # automatically } if __name__ == "__main__": main() __types__ = { 'main' : FunctionType(args=None, result=None, local=__types_main__) 'name' : StringType, '__name__' : StringType # might already be defined somewhere else } #---------------------------------------------------------------------- #---------------------------------------------------------------------- import fnmatch import os _debug = 0 _prune = ['(*)'] def find(pattern, dir = os.curdir): list = [] names = os.listdir(dir) names.sort() for name in names: if name in (os.curdir, os.pardir): continue fullname = os.path.join(dir, name) if fnmatch.fnmatch(name, pattern): list.append(fullname) if os.path.isdir(fullname) and not os.path.islink(fullname): for p in _prune: if fnmatch.fnmatch(name, p): if _debug: print "skip", `fullname` break else: if _debug: print "descend into", `fullname` list = list + find(pattern, fullname) return list __types_find__ = { 'list' : ListType(StringType), 'names' : ListType(StringType), 'name' : StringType, 'os.curdir' : StringType, 'os.pardir' : StringType, 'fullname' : StringType, 'os.path.isdir' : ImpFunctionType(args=(StringType,), result=IntegerType), 'os.path.islink' : ImpFunctionType(args=(StringType,), result=IntegerType), 'p' : StringType, } __types__ = { '_debug' : IntegerType, '_prune' : ListType(StringType), 'find' : FunctionType(args=(StringType, StringType), result=ListType(StringType), local=__types_find__) } #---------------------------------------------------------------------- #---------------------------------------------------------------------- import re _cache = {} def fnmatch(name, pat): import os name = os.path.normcase(name) pat = os.path.normcase(pat) return fnmatchcase(name, pat) __types_fnmatch__ = { 'os.path.normcase' : ImpFunctionType(args=(StringType,), result=StringType), } def fnmatchcase(name, pat): if not _cache.has_key(pat): res = translate(pat) _cache[pat] = re.compile(res) return _cache[pat].match(name) is not None __types_fnmatchcase__ = { 'res' : StringType, } def translate(pat): i, n = 0, len(pat) res = '' while i < n: c = pat[i] i = i+1 if c == '*': res = res + '.*' elif c == '?': res = res + '.' elif c == '[': j = i if j < n and pat[j] == '!': j = j+1 if j < n and pat[j] == ']': j = j+1 while j < n and pat[j] != ']': j = j+1 if j >= n: res = res + '\\[' else: stuff = pat[i:j] i = j+1 if stuff[0] == '!': stuff = '[^' + stuff[1:] + ']' elif stuff == '^'*len(stuff): stuff = '\\^' else: while stuff[0] == '^': stuff = stuff[1:] + stuff[0] stuff = '[' + stuff + ']' res = res + stuff else: res = res + re.escape(c) return res + "$" __types_translate__ = { 'i' : IntegerType, 'n' : IntegerType, 'res' : StringType, 'c' : StringType, # or CharType ? 
'j' : IntegerType, 'stuff' : StringType, } __types__ = { # cheating; I'm assuming re has a ReObjectType defined somewhere # this is probably a very complicated construction # we're also assuming re functions are defined in re '_cache' : DictType(key=StringType, value=re.__typedefs__['ReObjectType']), 'fnmatch' : FunctionType(args=(StringType, StringType), result=IntegerType, local=__types_fnmatch__), 'fnmatchcase' : FunctionType(args=(StringType, StringType), result=IntegerType, local=__types_fnmatchcase__), 'translate' : FunctionType(args=(StringType,), result=StringType, local=__types_translate__), } #---------------------------------------------------------------------- From m.faassen@vet.uu.nl Thu Dec 16 15:03:29 1999 From: m.faassen@vet.uu.nl (Martijn Faassen) Date: Thu, 16 Dec 1999 16:03:29 +0100 Subject: [Types-sig] Progress References: <38579D70.75E477BA@prescod.net> Message-ID: <3858FF41.501B12F1@vet.uu.nl> Paul Prescod wrote: > > We are actually making progress among all of the sound and fury here. > You guys have a lot of good ideas and I think that we are converging > more than it seems. > > 1. Most people seem to agree with the idea that shadow files allow us a > nice way to separate type assertions out so that their syntax can vary. > I think Greg disagreed but perhaps not violently enough to argue about > it. Interface files are in. Inline syntax is temporarily out. Syntactic > "details" to be worked out. Actually I can see the arguments for including type annotations in the module files themselves as there's something to say for keeping it together. As long as the syntax isn't inline in our first design I'm fine. See the syntax example I just posted to the list. > 2. Everybody but me is comfortable with defining > genericity/templating/parameterization only for built-in types for now. What do you mean by 'built-in types'? Does this include classes? > But now that we are separating interfaces from implementations I am > thinking that I may be able to think more clearly about > parameterizability. It may be possible to define parameterizable > interfaces by IPC8. Parameterization is in. Syntactic "details" to be > worked out. See my syntax response to Guido's challenge for my take on things. > 3. We agree that we need a syntax for asserting the types of expressions > at runtime. I'm not sure I do agree with this. It's an intruiging concept but I'm still not convinced we shouldn't go with annotating names instead. This may be easier to think about for the programmer, see an earlier response of mine to the list for an example. [snip] > 5. It isn't clear WHAT we can specify in "PyDL" interface files. Clearly > we can define function, class/interface and method interfaces. > > a. do we allow declarations for the type of non-method instance > variables? Yes, eventually at least. We could focus on functions first, but I think supporting classes will become necessary very quickly. > b. do we check assignments to class and module attributes from other > modules at runtime? We need to expect that some cross-module assignments > will come from modules that are not statically type checked. You can manually add extra annotations for the names you use from other modules that those other modules don't annotate; see my syntax proposal. > c. should we perhaps just disallow writing to "declared" attributes > from other modules? Hm, yes, this could become complicated, even for run-time checks. We should come up with somekind of rule. 
run-time checks can help, but we need to figure out when they're necessary, and when they aren't; i.e if you write to a declared attribute from a module with something that doesn't have a compile-time type associated, a run time check should occur. But otherwise, it shouldn't. > d. is it possible to write to UN-declared attributes from other > modules? And what are the type safety implications of doing so? This would generally be fine; undeclared attributes can contain objects of any type, right? What will be tricky (and which is why I'm clamoring for full type annotations and other strictness, at least initially) is *reading* from these.. Regards, Martijn From m.faassen@vet.uu.nl Thu Dec 16 15:08:42 1999 From: m.faassen@vet.uu.nl (Martijn Faassen) Date: Thu, 16 Dec 1999 16:08:42 +0100 Subject: [Types-sig] Interface files References: <3857E02B.53CF27AC@prescod.net> Message-ID: <3859007A.FD369FB2@vet.uu.nl> Paul Prescod wrote: > > Greg Stein wrote: > > > > I stated a preference for allowing this information to reside in the same > > file as the implementation. i.e. I don't want to maintain two files. > > The nice thing about having separate files is that it becomes instantly > clear what is "interesting" to the compiler. We have no backwards > compatibility constraints. We have no questions about what variable are > "in scope" and "available". It's just plain simpler. Look at my proposal (response to Guido's challenge). It's in the same file, and backwardly compatible, and it's instantly clear what the compiler looks at. > There is also something deeply elegant and useful about a separation of > interface from implementation. It can be helpful, but that doesn't mean it needs to be in a separate file. :) > Sure, you don't always want to be REQUIRED to separate them. I > acknowledge that we will one day have to support inline declarations but > I'm going to put it off unless I hear some screaming. Right, but Greg can't put it off, as he is advocating his operators, which have to be inline. > > I'll go further and state that we should not use a new language for this. > > It should just be Python. (and this is where Martijn's __types__ thing > > comes in, although I'm not advocating that format) > > I think that that's an unreasonable (and unreadable) constraint. The > language should probably be pythonic, but not necessarily Python. Python > doesn't have a type declaration syntax and none of Python's existing > syntax was meant to be used AS a type declaration syntax. It just gets > too unreadable for quasi-complicated declarations. We need to support > polymorphic and parameteric higher order functions! It may become fairly readable if you support typedefs (which can be used in type anontations). But I agree that this isn't the final solution; the final solution should probably be some nice Pythonic syntax. But for now: * it's quickly implementable * it's instantly usable by tools written in Python * it's understandable by anyone who can read Python * it's backwards compatible * we don't have to debate about syntax anymore and can actually think about semantics without syntax confusion. These are major advantages during the development. Regards, Martijn From m.faassen@vet.uu.nl Thu Dec 16 15:14:21 1999 From: m.faassen@vet.uu.nl (Martijn Faassen) Date: Thu, 16 Dec 1999 16:14:21 +0100 Subject: [Types-sig] Handling attributes References: Message-ID: <385901CD.258DD7DB@vet.uu.nl> Greg Stein wrote: > > On Wed, 15 Dec 1999, Paul Prescod wrote: > > > Yes. 
My reluctance to specify types for instance variables is caused by > > > problems with designing a nice, inline syntax for it. If you're not > > > worrying about an inline syntax, then you can definitely add typedecls for > > > instance and class attributes. > > > > Okay, but what about all of the other questions (updated slightly): > > I didn't reply to them because I didn't really have much of an opinion :-) > > In general, I might say: punt. Don't worry about that stuff right now. > Worry about phase 1. Refining assignment behavior can come later, as that > "should" be independent of what occurs in the first phase. I'll note that this fits in with my agenda; I was thinking about worrying about a single module for now, that has full static type annotations. You can then ignore a lot of the problems you get when interfacing with non-annotated modules. If you want to use things (functions, classes) from other modules, you can put in temporary annotations for them, but not in those modules; you put them in the module that's using them (see again my response to Guido's challenge for examples). Basically you define not only the module's interface seen from the outside, but also how the module interfaces with the other modules on the inside. You define *all* interfacing in any direction that the module is involved in. Regards, Martijn From m.faassen@vet.uu.nl Thu Dec 16 15:22:49 1999 From: m.faassen@vet.uu.nl (Martijn Faassen) Date: Thu, 16 Dec 1999 16:22:49 +0100 Subject: [Types-sig] Implementability References: <000601bf4779$614092e0$58a2143f@tim> Message-ID: <385903C9.98B69A30@vet.uu.nl> Tim Peters wrote: > > [Tim] > > making this [global type inference] all run efficiently (in either > > time or space) is a Professional Pain in the Professional Ass. > > [Paul Prescod] > > According to the principle of "from each according to their > > talents" > > I'm afraid you've mistaken Benevolent Dictatorship for some variant of > Communism . > > > you should be writing this optimizing, static type checker. > > Guido is the one interested in magical type inference; I'm not. I'm happy > to explicitly declare the snot out of everything when I want something from > static typing. Yay, someone on my side, and it's Tim too! (now I get to watch Tim drag himself quickly out of this and into a position completely incompatible with mine :) > Merely checking that my declared types match my usage is > much easier (doesn't require any flow analysis). The good news is that I > couldn't make time to write an inferencer even if I wanted to; if I had > time, I'd be much more likely to write something that *used* explicit > declarations to generate faster code. Right -- this would be the first step towards a magic inferencer anyway; you simply let it come up with 'explicit' declarations by itself, which you then fit into your optimizer. [snip] > > The types of globals from other modules should be explicitly declared. > > A global type inferencer can usually figure that out on its own. There's > more than one issue being discussed here, alas -- blame Guido <0.9 wink>. > > > If they aren't, they are presumed to have type PyObject or to return > > PyObject. Or they just aren't available if you are in strict static > > type check mode. > > In the language of the msg to which you're replying, they're associated with > the universal set (the set of all types) -- same thing. Then e.g. 
> > declared_int = unknown > > is an error, but Or, if you're interfacing with untyped python, this could raise a run-time exception if unknown doesn't turn out to be an integer. Or do you disagree with this? > unknown1 = unknown2 > > is not. Whether > > unknown = declared_int > > should be an error is a policy issue. Many will claim it should be an > error, but the correct answer is that it should not. This would seem to be the natural way to do it; I'm not sure why many would claim it should be an error. Could you explain? > Types form a > lattice, in which "unknown" is the top element, and the basic rule of type > checking is that the binding > > lhs = rhs > > is OK iff > > type(lhs) >= type(rhs) > > where ">=" is wrt the partial ordering defined by the type lattice (or, in > English , only "widening" bindings are OK; like assigning an int to a > real, or a subclass to a base class etc, but not their converses). I agree. Regards, Martijn From guido@CNRI.Reston.VA.US Thu Dec 16 15:38:24 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Thu, 16 Dec 1999 10:38:24 -0500 Subject: [Types-sig] Low-hanging fruit: recognizing builtins In-Reply-To: Your message of "Wed, 15 Dec 1999 23:57:36 CST." <14424.32592.50142.921358@dolphin.mojam.com> References: <199912151707.MAA02639@eric.cnri.reston.va.us> <3857E451.F7E4B6EC@prescod.net> <199912151921.OAA07698@eric.cnri.reston.va.us> <14424.32592.50142.921358@dolphin.mojam.com> Message-ID: <199912161538.KAA08333@eric.cnri.reston.va.us> > Guido> A first approximation would be to go hunt through all existing > Guido> code objects in a module for LOAD_GLOBAL and STORE_GLOBAL opcodes > Guido> with built-in names; for all such built-in names that have no > Guido> STORE_GLOBAL anywhere, it's "safe enough" to use the special > Guido> opcode. Then of course you will have to hunt through the > Guido> bytecode for sequences of LOAD_GLOBAL(), followed by > Guido> arbitrary code to load an object, followed by CALL_FUNCTION(1). > > Don't you also have to watch out for the dreaded > > from my_rewritten_builtins import * A module using "from whatever import *" loses the benefits of this optimization. Serves them right. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@CNRI.Reston.VA.US Thu Dec 16 15:44:04 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Thu, 16 Dec 1999 10:44:04 -0500 Subject: [Types-sig] Low-hanging fruit: recognizing builtins In-Reply-To: Your message of "Thu, 16 Dec 1999 11:02:49 +0100." <3858B8C9.24962AAD@lemburg.com> References: <199912151707.MAA02639@eric.cnri.reston.va.us> <3857DA7C.75433A13@lemburg.com> <199912152220.RAA07942@eric.cnri.reston.va.us> <3858B8C9.24962AAD@lemburg.com> Message-ID: <199912161544.KAA08348@eric.cnri.reston.va.us> [MAL] > In the long run it would be better to wrap those module > globals with write access functions (the write action would > then be recognized by the optimizer). Yes, but we need to deal with the current idiom or we'd break too much code. (When I have to break some valid code, I'd rather do it in an explicit way, e.g. by adding a keyword, rather than silently changing working code into non-working code for an obscure reason.) > I haven't followed the thread too closely, but isn't there > some way to tell the optimizer which modules to treat at > what optimization level ? Old modules should only use the > "safe" caching strategy then while modules compiled with > full optimization would be caching all read-only globals. That hasn't been discussed this time around. 
I think you have proposed more optimization control in the past; that's still a good idea. > BTW, instead of adding oodles of new byte code, how about > grouping them... e.g. instead of UNARY_LEN, BUILD_RANGE, etc. > why not have a CALL_BUILTIN which takes an index into > a predefined set of builtin functions. > > The same could be done with some often used constants > such as None, '', 1, 0: LOAD_SYSTEM_CONST with an index > into a constants array. > > The advantage is that you can easily extend both sets > of prefetched constants while not adding too many > new new byte codes to the inner loop. Good ideas. > Note that the loop as it is built now is already too large > for common Intel+compatible based CPUs. Adding even more byte > codes to the huge single loop would probably result in a > decrease of CPU cache hits. (I split the Great Switch > in two switch statements and got some good results out of > this: the first switch handles often used byte codes while the > second takes care of the more exotic ones.) Sigh -- I wish C compilers took care of this. I like a single switch because it's so simple. > A note on range() and for: the common usage of > > for i in range(const): > ... > > could be compiled into a completely different set of opcodes > not creating any list or tuple at all. Since the FOR_LOOP > opcode generates loop integers on each iteration the creation > of a range tuple or list is not needed. The loop opcode would only > have to check for the upper bound "const". Yes, this is what I had in mind. > I've added a new > counter type (basically a mutable integer type that allows > for fast increment and decrement) to simplify this even more. > For the curious, it's in the old patch: > > http://starship.skyport.net/~lemburg/mxPython-1.5.patch.gz Or there could be something even more ad-hoc (and faster). --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@CNRI.Reston.VA.US Thu Dec 16 15:47:55 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Thu, 16 Dec 1999 10:47:55 -0500 Subject: [Types-sig] Re: [Doc-SIG] Sorry! In-Reply-To: Your message of "Thu, 16 Dec 1999 14:09:02 GMT." References: <385665DE.9963174B@prescod.net> <38569FFF.78EC8DA@prescod.net> <3857CDD1.D77331AD@prescod.net> <3857E6F9.52FC29F2@prescod.net> <38585628.D4B4934F@prescod.net> Message-ID: <199912161547.KAA08382@eric.cnri.reston.va.us> > Hmm, I'm not sure I read Guido's > > I think that any proposal that requires you to keep two separate files > > "in sync" is bound to fail in the long term. I left that crap behind > > in C++. But in the short term...okay. > as anything but a `yes, we could use this as a temporary measure to let > us experiment with static typing within python 1'. And it appears to be > typical of the comments to date. I didn't write the inner quote. --Guido van Rossum (home page: http://www.python.org/~guido/) From GoldenH@littoncorp.com Thu Dec 16 17:15:37 1999 From: GoldenH@littoncorp.com (Golden, Howard) Date: Thu, 16 Dec 1999 09:15:37 -0800 Subject: [Types-sig] A challenge Message-ID: Tim Peters wrote: > I like interface files fine, but will stick to inline "decl"s below. > Apparently unlike anyone else here, I think explicit > declarations can make > code easier for *human readers* to understand -- so I'm not > interested in > hiding them from view. I _don't_ like interface files, precisely for this reason! Making the code easier to understand is my highest goal. 
> if-it-looks-a-lot-like-every-other-reasonable-declaration- > syntax-you've-ever-seen-it-met-its-goal-ly y'rs - tim I completely support this style! I won't quibble about 'decl' vs. 'var', though I suggest the latter, all else being equal, since it has a proud heritage. - Howard From GoldenH@littoncorp.com Thu Dec 16 17:21:49 1999 From: GoldenH@littoncorp.com (Golden, Howard) Date: Thu, 16 Dec 1999 09:21:49 -0800 Subject: [Types-sig] What is the Essence of Python? (Was: Low-hanging fruit: recognizing builtins) Message-ID: Paul Prescod wrote: > "Golden, Howard" wrote: > > I reiterate that we should define what is the essence of Python, so we know > > what sort of dynamicism and flexibility we are trying to preserve, and what > > is superfluous. Until we do this, we are dealing with a squishy set of > > requirements. > I think that that is always the case in language design. What one person > hates is what another loves: even in Python! I don't know how to answer > your question. I think that we can only argue about particular features > "when we get to them." Then I am suggesting an "Annotated Python Reference Manual" to act as a taxonomy of the features. This could become the basis for our arguments! > Most people probably do not use dynamicity to the same extent as the > real power users but on the other hand they are the ones who are most > fanatical about the language. Again, I hope someone will suggest some good examples for me to study. - Howard From GoldenH@littoncorp.com Thu Dec 16 17:31:21 1999 From: GoldenH@littoncorp.com (Golden, Howard) Date: Thu, 16 Dec 1999 09:31:21 -0800 Subject: [Types-sig] What is the Essence of Python? Message-ID: Greg Stein wrote: > On Wed, 15 Dec 1999, Paul Prescod wrote: > > "Golden, Howard" wrote: > > > I reiterate that we should define what is the essence of Python, so we know > > > what sort of dynamicism and flexibility we are trying to preserve, and what > > > is superfluous. Until we do this, we are dealing with a squishy set of > > > requirements. > > I think that that is always the case in language design. What one person > > hates is what another loves: even in Python! I don't know how to answer > > your question. I think that we can only argue about particular features > > "when we get to them." > I agree. It is like asking somebody to describe the color "blue" :-) > I think there is a yardstick in there somewhere, that you can hold up to a > feature or design and say "that's Pythonic" or "that's not". But it is > very subjective and incapable of being described... This reminds me of the judge's comment that he couldn't define obscenity, but he knew it when he saw it. Unfortunately, it's also not very useful in communicating between people. I hope some of you will make an extra effort to help newbies like me! - Howard From Edward Welbourne Thu Dec 16 17:50:43 1999 From: Edward Welbourne (Edward Welbourne) Date: Thu, 16 Dec 1999 17:50:43 +0000 Subject: [Types-sig] Re: [Doc-SIG] Sorry! In-Reply-To: <199912161547.KAA08382@eric.cnri.reston.va.us> References: <385665DE.9963174B@prescod.net> <38569FFF.78EC8DA@prescod.net> <3857CDD1.D77331AD@prescod.net> <3857E6F9.52FC29F2@prescod.net> <38585628.D4B4934F@prescod.net> <199912161547.KAA08382@eric.cnri.reston.va.us> Message-ID: >> Hmm, I'm not sure I read Guido's >>> I think that any proposal that requires you to keep two separate files >>> "in sync" is bound to fail in the long term. I left that crap behind >>> in C++. But in the short term...okay. > I didn't write the inner quote. 
Oops - sorry: in fact, that was Paul (drawback of snatching a look at the threaded list during compiles and such) on http://www.python.org/pipermail/types-sig/1999-December/000617.html From that being Paul, I guess I should infer that the answer to the question I posed later would be that the two-file scheme is a `for the present' idea, which greatly reduces my twitchiness about it. Eddy. -- I have almost enough time either to read all the relevant information or to respond to it. From paul@prescod.net Thu Dec 16 13:29:50 1999 From: paul@prescod.net (Paul Prescod) Date: Thu, 16 Dec 1999 05:29:50 -0800 Subject: [Types-sig] Type annotations Message-ID: <3858E94E.B7D86846@prescod.net> Okay, I see four different approaches to the syntax short-term type annotation. Personally, I do not think that it is too early to talk about syntax because we need to communicate these ideas. Here are my metrics: Python 1.5 compatibility: Sill the Python 1.5 compiler accept it? Will it have reasonable semantics? Logical separation: Will users be able to understand that runtime objects are not available? Convenience: How easy is it to edit? Syntactic Cleanliness: How "obvious" is it what the declaration means? 1. separate file: Python 1.5 compatibility: high Logical separation: high Convenience: low Syntactic Cleanliness: high 2. labelled string expressions: (like 3, but in strings) Python 1.5 compatibility: high Logical separation: high Convenience: medium Syntactic Cleanliness: medium 3. in separate decl statements: (Incompatible with Python 1.5, but easily converted) Python 1.5 compatibility: low Logical separation: high Convenience: medium Syntactic Cleanliness: high 4. in-line in "other" declarations Python 1.5 compatibility: low Logical separation: low Convenience: high Syntactic Cleanliness: high 5. in dictionaries, lists, and other basic Python objects, "overloaded" with special meaning Python 1.5 compatibility: high Logical separation: medium Convenience: high Syntactic Cleanliness: low Of course if we use a backwards-incompatible expression syntax then backwards-incompatibility is not an issue. That is one reason I propose check_type( expr, type ) which can be interpreted in old Python as a function call. My preference is to allow three different syntaxes according to a schedule: January: separate files (we need this anyhow to define types for the builtin modules) February: in-module string declarations and build separate files Python 2.0: either 3 or 4 I admit that I don't find the compatibility benefits of 5 to be worth the obfuscation. Parsing is not THAT hard. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things never trust in: That's the vendor's final bill The promises your boss makes, and the customer's good will http://www.geezjan.org/humor/computers/threes.html From paul@prescod.net Thu Dec 16 13:29:59 1999 From: paul@prescod.net (Paul Prescod) Date: Thu, 16 Dec 1999 05:29:59 -0800 Subject: [Types-sig] Type annotations Message-ID: <3858E957.CBF8F4DF@prescod.net> Okay, I see four different approaches to the syntax short-term type annotation. Personally, I do not think that it is too early to talk about syntax because we need to communicate these ideas. Here are my metrics: Python 1.5 compatibility: Sill the Python 1.5 compiler accept it? Will it have reasonable semantics? Logical separation: Will users be able to understand that runtime objects are not available? Convenience: How easy is it to edit? Syntactic Cleanliness: How "obvious" is it what the declaration means? 
1. separate file: Python 1.5 compatibility: high Logical separation: high Convenience: low Syntactic Cleanliness: high 2. labelled string expressions: (like 3, but in strings) Python 1.5 compatibility: high Logical separation: high Convenience: medium Syntactic Cleanliness: medium 3. in separate decl statements: (Incompatible with Python 1.5, but easily converted) Python 1.5 compatibility: low Logical separation: high Convenience: medium Syntactic Cleanliness: high 4. in-line in "other" declarations Python 1.5 compatibility: low Logical separation: low Convenience: high Syntactic Cleanliness: high 5. in dictionaries, lists, and other basic Python objects, "overloaded" with special meaning Python 1.5 compatibility: high Logical separation: medium Convenience: high Syntactic Cleanliness: low Of course if we use a backwards-incompatible expression syntax then backwards-incompatibility is not an issue. That is one reason I propose check_type( expr, type ) which can be interpreted in old Python as a function call. My preference is to allow three different syntaxes according to a schedule: January: separate files (we need this anyhow to define types for the builtin modules) February: in-module string declarations and build separate files Python 2.0: either 3 or 4 I no longer find the compatibility benefits of 5 to be worth the obfuscation if we are really going to move into supporting parametric polymorphism and the rest. It just gets too hairy. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things never trust in: That's the vendor's final bill The promises your boss makes, and the customer's good will http://www.geezjan.org/humor/computers/threes.html From Edward Welbourne Thu Dec 16 18:14:17 1999 From: Edward Welbourne (Edward Welbourne) Date: Thu, 16 Dec 1999 18:14:17 +0000 Subject: [Types-sig] What is the Essence of Python? Message-ID: >>>> I reiterate that we should define what is the essence of Python, >> ... like asking somebody to describe the color "blue" :-) > I hope some of you will make an extra effort to help newbies like me! The nearest you could hope to get to `what is the essence of Python' would be if each of the folk in the present discussion ignored one another's answers (and opinions) and told you our own individual answers, but you mustn't go expecting our answers to agree ... So ... what is the essence of Python ? Eddy's answer: A bunch of protocols for manipulating namespaces and functions. An object is a namespace if getattr knows how to ask it for attributes. Anything you want to do with a namespace, you do by: * finding the protocol that describes what you wanted to do * looking up the attributes the protocol specifies * calling the function (it usually is a function) you just got back, with the arguments the protocol specifies, and * trusting that this has either: - achieved the effect you had in mind, or - raised an exception (probably stipulated by the protocol) There are a few handy built-in types and functions which suffice to boot-strap the protocols python defines, and to let you do `most' of the things you will ever want to do. These suffice for implementation of everything else you might want to do. The base protocols are specified in terms of various names, typically beginning and ending `__'. Now, with any luck, other answers will be so different you'll doubt we were talking about the same language as one another ... then you'll begin to understand why, though your question is sensible, we can't give you a sensible answer ;^> Eddy. 
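For anyone wondering what that protocol machinery looks like in concrete terms, ordinary Python already demonstrates it (the class below is invented purely for illustration): an object joins the sequence protocol simply by supplying the attributes that len() and the for-statement go looking for.

    class Countdown:
        "Supports the sequence protocol by defining __len__ and __getitem__."
        def __init__(self, n):
            self.n = n
        def __len__(self):              # looked up by len()
            return self.n
        def __getitem__(self, i):       # looked up by indexing and by for-loops
            if i < 0 or i >= self.n:
                raise IndexError, i
            return self.n - i

    c = Countdown(3)
    print len(c)    # -> 3
    for x in c:     # the loop simply asks for c[0], c[1], ... until IndexError
        print x     # -> 3, 2, 1

No table of "what an object must be" is consulted anywhere; whatever answers the protocol's attribute lookups is acceptable, which is the point Eddy is making.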
From m.faassen@vet.uu.nl Thu Dec 16 18:20:15 1999 From: m.faassen@vet.uu.nl (Martijn Faassen) Date: Thu, 16 Dec 1999 19:20:15 +0100 Subject: [Types-sig] Type annotations References: <3858E94E.B7D86846@prescod.net> Message-ID: <38592D5F.FBF12AF0@vet.uu.nl> Paul Prescod wrote: [snip snip] > My preference is to allow three different syntaxes according to a > schedule: > > January: separate files (we need this anyhow to define types for the > builtin modules) > February: in-module string declarations and build separate files > Python 2.0: either 3 or 4 > > I admit that I don't find the compatibility benefits of 5 to be worth > the obfuscation. Parsing is not THAT hard. But it doesn't completely obfuscate; it's _in Python_. Python programmers already grok Python syntax. It looks fairly horrible, but it's also readable by Python programmers, and so it's easier to communicate. And you don't have to deal with parsing only; that isn't the main problem. The main thing is that we need a way to express complex, composite types. Python is very expressive. You can make a Pythonic language to express types in later, but we can't yet as we don't fully know yet what we want to express. Yet another advantange of going the 'in Python' route is that you already have the backend for your parser. And if you have an implementation (that we'll undoubtedly will change several times, another advantage of using Python), you can actually start thinking about a good syntax with *knowledge*. You know what kind of data structures are actually involved, you have working experience. Something we don't really have right now. Regards, Martijn From m.faassen@vet.uu.nl Thu Dec 16 18:23:30 1999 From: m.faassen@vet.uu.nl (Martijn Faassen) Date: Thu, 16 Dec 1999 19:23:30 +0100 Subject: [Types-sig] Type annotations References: <3858E957.CBF8F4DF@prescod.net> Message-ID: <38592E22.7EC44C45@vet.uu.nl> Paul Prescod wrote: > > I no longer find the compatibility benefits of 5 to be worth the > obfuscation if we are really going to move into supporting parametric > polymorphism and the rest. It just gets too hairy. Oh, cool, yet another, slightly different objection. :) I disagree that it gets too hairy. I'm advocating using Python *precisely* because of the complex types. Python expressions can deal with that kind of complexity right now. What's all this obsession with syntax early on about anyway? It only distracts us from the real topic, in my opinion.. Regards, Martijn From paul@prescod.net Thu Dec 16 18:17:56 1999 From: paul@prescod.net (Paul Prescod) Date: Thu, 16 Dec 1999 10:17:56 -0800 Subject: [Types-sig] Type annotations References: <3858E94E.B7D86846@prescod.net> <38592D5F.FBF12AF0@vet.uu.nl> Message-ID: <38592CD4.963694B0@prescod.net> Martijn Faassen wrote: > > And you don't have to deal with parsing only; that isn't the main > problem. The main thing is that we need a way to express complex, > composite types. Python is very expressive. You can make a Pythonic > language to express types in later, but we can't yet as we don't fully > know yet what we want to express. Actually, we do. The Python type system is well understood. We just don't have a way of talking about it statically. I'll attach some of my half-formed ideas and then shut up for a little while. > Yet another advantange of going the 'in Python' route is that you > already have the backend for your parser. 
> And if you have an
> implementation (that we'll undoubtedly will change several times,
> another advantage of using Python), you can actually start thinking
> about a good syntax with *knowledge*. You know what kind of data
> structures are actually involved, you have working experience. Something
> we don't really have right now.

I don't see how dictionaries are a decent back-end. The real back-end
will be type objects with direct references to other type objects.

----

Here are some Haskell-ish syntax ideas for type declarations:

First we need to be able to talk about types. We need a "type
expression" which evaluates to a type.

Rough Grammar:

Type : Type ['|' Type] # allow unions
Unit : dotted_name | Parameterized | Function | Tuple | List | Dict
Parameterized : dotted_name '(' Basic (',' Basic)* ')'
Basic : dotted_name | PythonLiteral | "*" # * means anything.
PythonLiteral : atom
Function : Type '->' Type
Tuple : "(" Type ("," Type )* )
List: "[" Type "]"
Dict: "{" Type ":" Type "}"

Examples:

String
[(Int, Int)]
{(String,Int), String}
BTree( String )
BTree( somepackage.somemod.someclass )

There is another syntax for declaring instance interface types, it
follows Python's class declaration syntax. More on that later.

Now we probably want to be able to invent names for types. This is like
C's typedef. We'll use simple Python assignment syntax.

Typedef = NAME["(" args ")"] '=' Type

Examples:

StringOrList = String | List( String )
ElementNode = XMLNode( "Element" )
MyTuple = ( Integer, String, List( String ) )
Str50 = BoundedString( 0, 50 )
PositiveInteger = BoundedInteger( 0, sys.maxint )
PositiveInteger = BoundedInteger( -sys.maxint-1, 0 )
len = sequence(*) -> int
maptype(intype, outtype) = (( intype -> outtype ), List( intype )) -> List( outtype )
intmap = maptype( int, int )
lenmap = maptype( sequence(*), int )

Interfaces look like Python classes but they use an "interface" keyword.

interfacedef: 'interface' NAME ['(' testlist ')'] asdecl ':' interfacebody
interfacebody: funcdef | classdef | instancevar | interfacevar
asdecl: "as" type
funcdef: 'def' NAME parameters ':' docstring?
parameters: '(' [varargslist] ')'
varargslist: (like Python's, but with an added "as" operator)

"Interface" and instance variables may also be declared.

interface (a,b) foo_interface(base_interface):
    static:
        k as String
    instance:
        j as Integer

    def bar( self, arg1 as a ) as b:
        "This takes an argument of one parameterized type and returns the other."

    def baz( self, arg1 as a ) as b:
        "This takes an argument of one parameterized type and returns the other."
We can also export individual instances and other objects: a as String b as foo_interface c as foo_class const path as ["String"] const version as Integer -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things never trust in: That's the vendor's final bill The promises your boss makes, and the customer's good will http://www.geezjan.org/humor/computers/threes.html From paul@prescod.net Thu Dec 16 18:18:36 1999 From: paul@prescod.net (Paul Prescod) Date: Thu, 16 Dec 1999 10:18:36 -0800 Subject: [Types-sig] Interface files References: <3857E02B.53CF27AC@prescod.net> <3859007A.FD369FB2@vet.uu.nl> Message-ID: <38592CFC.DD175CAB@prescod.net> Martijn Faassen wrote: > > * we don't have to debate about syntax anymore and can actually think > about > semantics without syntax confusion. Clean syntax helps comprehension. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things never trust in: That's the vendor's final bill The promises your boss makes, and the customer's good will http://www.geezjan.org/humor/computers/threes.html From paul@prescod.net Thu Dec 16 18:18:38 1999 From: paul@prescod.net (Paul Prescod) Date: Thu, 16 Dec 1999 10:18:38 -0800 Subject: [Types-sig] Re: The role of PyObjects References: <3858E9C2.E2722B88@vet.uu.nl> Message-ID: <38592CFE.695545D9@prescod.net> Martijn Faassen wrote: > > I'm afraid that > even with the operator, you wouldn't be able to check most of the code, > if PyObjects are freely allowed. Perhaps I'm wrong, but I'd like to see > some more debate about this. PyObjects are just another type. In Python or any OO language it is ABSOLUTELY impossible to know the type of every object at compile time because of polymophism: a = CGIHTTPServer() b = BaseHTTPServer() startServer( a ) startServer( b ) startServer does not know the exact types at compile time. The basic nature of the problem does not change if we have a function that expects just a "PyObject" (the base class of all base classes). Of course if the function is to be statically type checked then you cannot use operations on the object other than those allowed by PyObjects, but the basic principle is the same. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things never trust in: That's the vendor's final bill The promises your boss makes, and the customer's good will http://www.geezjan.org/humor/computers/threes.html From paul@prescod.net Thu Dec 16 18:19:36 1999 From: paul@prescod.net (Paul Prescod) Date: Thu, 16 Dec 1999 10:19:36 -0800 Subject: [Types-sig] Attributes proposal Message-ID: <38592D38.63057A1A@prescod.net> My proposal for handling attributes is this: An attribute's type can be declared. Writes to the attribute from the same module can be statically type checked (if requested). Writes to the attribute from other modules are checked at runtime. That way we can always know the type of the attribute value and can therefore make reasonable use of the attribute in statically type checked functions. Opinions? -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things never trust in: That's the vendor's final bill The promises your boss makes, and the customer's good will http://www.geezjan.org/humor/computers/threes.html From GoldenH@littoncorp.com Thu Dec 16 18:54:17 1999 From: GoldenH@littoncorp.com (Golden, Howard) Date: Thu, 16 Dec 1999 10:54:17 -0800 Subject: [Types-sig] What is the Essence of Python? Message-ID: Edward Welbourne wrote: > So ... what is the essence of Python ? 
Eddy's answer: > > A bunch of protocols for manipulating namespaces and functions. > > An object is a namespace if getattr knows how to ask it for > attributes. > Anything you want to do with a namespace, you do by: > * finding the protocol that describes what you wanted to do > * looking up the attributes the protocol specifies > * calling the function (it usually is a function) you just got back, > with the arguments the protocol specifies, and > * trusting that this has either: > - achieved the effect you had in mind, or > - raised an exception (probably stipulated by the protocol) > > There are a few handy built-in types and functions which suffice to > boot-strap the protocols python defines, and to let you do > `most' of the > things you will ever want to do. These suffice for implementation of > everything else you might want to do. The base protocols are > specified > in terms of various names, typically beginning and ending `__'. This is the _mechanism_ of Python, but is it the _essence_? It's just like my talk yesterday with my 10 year-old son about the essence of the movie "Field of Dreams." He said it was about a man building a baseball field in an Iowa cornfield. I said it was about a man coming to terms with the conflict with his dead father. My son is very literal in his thinking, so I understand his analysis. I was trying to encourage him to think below the surface. Your answer about Python, and its appeal to you reminds me of how I felt about Forth, when I first learned it around 1980. Again, you have a very simple mechanism which is easily extensible to do whatever you want. It is interactive, too. I suspect that many people are still using Forth, but you seldom hear about it any more. Probably many of those using Forth have added all sorts of object-oriented, generic programming, parametric polymorphism extensions. My question is: Is that still Forth? I think you could argue either side, but the important point is that it wouldn't _look_ like 1980's Forth. "What is Python?" is really Guido's decision. (If I agree, I'll use it, and if not, I'll vote with my feet.) But I am arguing that it is more than just a clear syntax wrapped around a flexible namespace. - Howard From tismer@appliedbiometrics.com Thu Dec 16 18:53:48 1999 From: tismer@appliedbiometrics.com (Christian Tismer) Date: Thu, 16 Dec 1999 19:53:48 +0100 Subject: [Types-sig] A challenge References: <000501bf4779$5e566b40$58a2143f@tim> Message-ID: <3859353C.4106B165@appliedbiometrics.com> Tim Peters wrote: > > [Guido] > > I personally am losing track of all the various proposals. > > You're not alone . Trackless Python. I'm loosing track every day now, when there are between 2 to six new posts each of about 7 people, and over 70% of cited text. Hard to follow since I'm a learner still. > if-it-looks-a-lot-like-every-other-reasonable-declaration- > syntax-you've-ever-seen-it-met-its-goal-ly y'rs - tim ... This is what I can read. What a delight :-) Just a question, please: > import fnmatch > import os > > decl _debug: Int # but Boolean makes more sense; see below Is this meant to be lexically true in the globals scope from here on? > _debug = 0 > > decl _prune: [String] > _prune = ['(*)'] > > decl find: def(String, optional dir: String) -> [String] > > def find(pattern, dir = os.curdir): > decl list, names: [String], name: String > list = [] > names = os.listdir(dir) > names.sort() > for name in names: > decl name, fullname: String Same question: "name" is redefined from here on? 
Would this behave (or be as behaviorless) like the "global" declaration, or lexical, or do you open a new type scope with "for"? (New "variable, with C's {} in mind). The latter cannot be since "for" declared it already. make code, not words :)- ly 'y'rs - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From lannert@uni-duesseldorf.de Thu Dec 16 19:05:24 1999 From: lannert@uni-duesseldorf.de (lannert@uni-duesseldorf.de) Date: Thu, 16 Dec 1999 20:05:24 +0100 (MET) Subject: [Types-sig] A lurker's comment In-Reply-To: <19991216170010.65B751CEEF@dinsdale.python.org> from "types-sig-admin@python.org" at "Dec 16, 99 12:00:10 pm" Message-ID: <19991216190524.16032.qmail@lannert.rz.uni-duesseldorf.de> "types-sig-admin@python.org" wrote: [Apologies first. Although being subscribed to the digest only, I hardly manage to follow the current volume of this list. Is there a life beyond work, Types-SIG and a minimum of sleep?? The discussion may well be past the points I'm addressing at the time of this writing ...] > Paul Prescod wrote: > > > > Greg Stein wrote: > > > > > > I stated a preference for allowing this information to reside in the same > > > file as the implementation. i.e. I don't want to maintain two files. > > > > The nice thing about having separate files is that it becomes instantly > > clear what is "interesting" to the compiler. We have no backwards > > compatibility constraints. We have no questions about what variable are > > "in scope" and "available". It's just plain simpler. Please, don't introduce separate spec files. It's OK for a quick hack while doing a proof of concept, but not for actual use. A C[+-]* program that consists of .c, .h, .cpp and some other files usually resides in a directory of its own, but when it's compiled, it usually collapses into just one file that you can freely move around. I'm already not too happy with a Python program that needs a few special-purpose modules to accompany it wherever it goes. > > There is also something deeply elegant and useful about a separation of > > interface from implementation. > > It can be helpful, but that doesn't mean it needs to be in a separate > file. :) Seconded! Wouldn't it be a Pythonic solution to regard a restricted namespace as a "restricted dictionary" which can (a) refuse to accept new items once it is declared closed (or frozen or fixated), and (b) refuse to accept values for certain keys unless these values are compatible with a (list of) type/class/interface spec(s)? (I guess Chris T. had something similar in mind; hadn't you?) d = RestrictedDict() d.declare_type("i", IntType) d.declare_type("j", (IntType, NoneType)) d["i"] = 5 d["j"] = None d["i"] = None # raises TypeError d.fixate() d["spam"] = "foo" # raises KeyError Modules, classes, and instances can offer this sort of __dict__, providing type and name safety; for a function's locals() it has to be simulated. If there is an unambiguous syntax for these restrictions, a compiler can use them for (OPT): def count: IntType # == __dict__.declare_type("count", IntType) def finally # == __dict__.fixate() (Or whatever syntax there will be.) 
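A minimal sketch of such a restricted dictionary is easy to write; the method names below follow Detlef's example, while the storage layout and the tuple-of-types convention are merely one illustrative way to fill in the details (checking against classes or interfaces would need isinstance-style tests and is left out).

    from types import IntType, NoneType

    class RestrictedDict:
        "A mapping that can type-restrict values and refuse new keys once fixated."
        def __init__(self):
            self.data = {}       # the actual bindings
            self.types = {}      # key -> tuple of acceptable types
            self.fixated = 0
        def declare_type(self, key, types):
            if type(types) is not type(()):
                types = (types,)
            self.types[key] = types
        def fixate(self):
            self.fixated = 1
        def __setitem__(self, key, value):
            if self.fixated and not self.data.has_key(key):
                raise KeyError, "namespace is fixated; cannot add %s" % `key`
            allowed = self.types.get(key)
            if allowed and type(value) not in allowed:
                raise TypeError, "%s must be one of %s" % (`key`, `allowed`)
            self.data[key] = value
        def __getitem__(self, key):
            return self.data[key]

    d = RestrictedDict()
    d.declare_type("i", IntType)
    d.declare_type("j", (IntType, NoneType))
    d["i"] = 5
    d["j"] = None
    d.fixate()
    # d["i"] = None      would raise TypeError
    # d["spam"] = "foo"  would raise KeyError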
Of course the variable declarations should be performed at definition/compile time; as for "global"s, the variable must not be used before the declaration. Anyway, I'd like to have something as open to inspection and manipulation as Python's __dict__s etc. to achieve type and name safety. (Even a d.unfixate() would be nice for testing a program with the interpreter.) And if I knew how to do the declarations right, I'd help the compiler to implement my count=count+1 as a simple machine code integer increment. [Tim Peters, iirc:] > Types form a > lattice, in which "unknown" is the top element, and the basic rule of type > checking is that the binding > > lhs = rhs > > is OK iff > > type(lhs) >= type(rhs) > > where ">=" is wrt the partial ordering defined by the type lattice (or, in > English , only "widening" bindings are OK; like assigning an int to a > real, or a subclass to a base class etc, but not their converses). This would also be valid for alternative types: (NoneType, IntType) >= (IntType,). An assignment with lhs: (IntType,), rhs: (NoneType, IntType) should not be rejected by the interpreter if rhs happens to be an Int, but by the compiler. Finally, while I'm just bothering you anyway, an irrelevant opinion on the difficulty of declaring lists of tuples of (int, string, someclass, ...): Wouldn't a simplistic approach, which leaves the ultimate responsibility to the user, suffice for the time being? We can't _prove_ the correctness of a program (yet; did I miss something?), but we can help a human to avoid the most frequent errors. class MyListType(ListType): pass # suppose we have types as classes ... def ml: MyListType def al: ListType ml = MyListType([1, 2, "many"]) # OK al = ml # OK ml = al # rejected by the compiler ml = MyListType(al) # OK (it's up to myself to do it right, # I proved my awareness) Detlef From Edward Welbourne Thu Dec 16 19:18:41 1999 From: Edward Welbourne (Edward Welbourne) Date: Thu, 16 Dec 1999 19:18:41 +0000 Subject: [Types-sig] What is the Essence of Python? In-Reply-To: References: Message-ID: > This is the _mechanism_ of Python, but is it the _essence_? well, it's *part of* the mechanism ... the part that grabs me. The mechanism/essence distinction is one it's a lot easier to make in the case of a story ... the closer one gets to the concrete world, the harder it gets to make ... what is the essence of stone ? ... be prepared to cope with the essence and mechanism of a real thing overlapping rather more severely than arises for stories - especially stories written by someone explicitly trying to put across a message. In such cases, the essence may well *be* part of the mechanism. And folk can't be relied on to agree about which part. > ... those using Forth have added all sorts of ... extensions. > My question is: Is that still Forth? or, to put your original question (what is the essence) another way: what is it about python that you can't change because if you did it wouldn't be python any more ? To me, the answer to that is >> A bunch of protocols for manipulating namespaces and functions. (albeit words like `sufficient', `good' and `well' need added in several places there). There are some important `pythonic theses' I saw (by Tim Peters, I think) but I've lost the bookmark ... ask Tim Peters, they were good. They might come closer to satisfying your criteria of essentiality. Eddy. 
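Returning briefly to the lattice rule Tim is quoted with above, and to Detlef's reading of a type as a tuple of alternatives: the acceptance test for a binding is only a few lines. Everything below (the tuple representation, the ANY marker) is an illustrative toy; subclass widening and real inference are left out.

    from types import IntType, NoneType

    ANY = "unknown"      # the top of the lattice: the set of all types

    def binding_ok(lhs, rhs):
        "lhs = rhs is acceptable iff type(lhs) >= type(rhs), i.e. widening only."
        if lhs is ANY:
            return 1         # binding anything to an undeclared name is fine
        if rhs is ANY:
            return 0         # narrowing: reject, or fall back to a run-time check
        for t in rhs:
            if t not in lhs:
                return 0
        return 1

    print binding_ok((NoneType, IntType), (IntType,))    # 1: widening, accepted
    print binding_ok((IntType,), (NoneType, IntType))    # 0: the compiler flags it,
                                                         #    or a run-time check is needed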
From gstein@lyra.org Thu Dec 16 19:24:55 1999 From: gstein@lyra.org (Greg Stein) Date: Thu, 16 Dec 1999 11:24:55 -0800 (PST) Subject: [Types-sig] Interface files In-Reply-To: <3857E02B.53CF27AC@prescod.net> Message-ID: On Wed, 15 Dec 1999, Paul Prescod wrote: > Greg Stein wrote: > > > > I stated a preference for allowing this information to reside in the same > > file as the implementation. i.e. I don't want to maintain two files. > > The nice thing about having separate files is that it becomes instantly > clear what is "interesting" to the compiler. We have no backwards > compatibility constraints. We have no questions about what variable are > "in scope" and "available". It's just plain simpler. > > There is also something deeply elegant and useful about a separation of > interface from implementation. In your opinion, sure. I just got done telling you my opinion :-). And that is that separate files are Not Nice. Elegant? Bah. It's extra files to deal with and coordinate. > Sure, you don't always want to be REQUIRED to separate them. I > acknowledge that we will one day have to support inline declarations but > I'm going to put it off unless I hear some screaming. *SCREAM* How's that? > > I'll go further and state that we should not use a new language for this. > > It should just be Python. (and this is where Martijn's __types__ thing > > comes in, although I'm not advocating that format) > > I think that that's an unreasonable (and unreadable) constraint. The > language should probably be pythonic, but not necessarily Python. Python > doesn't have a type declaration syntax and none of Python's existing > syntax was meant to be used AS a type declaration syntax. It just gets > too unreadable for quasi-complicated declarations. We need to support > polymorphic and parameteric higher order functions! Why in the heck should I have to go and code up a separate file? In a separate language? That is nonsense. Really. And no, I'd rather not be diplomatic here. Saying that we are going to use Yet Another Goddamned Language is the wrong move. I'm going to stop now. I could go on, but it probably would not be productive. Cheers, -g -- Greg Stein, http://www.lyra.org/ From tismer@appliedbiometrics.com Thu Dec 16 19:25:29 1999 From: tismer@appliedbiometrics.com (Christian Tismer) Date: Thu, 16 Dec 1999 20:25:29 +0100 Subject: [Types-sig] A lurker's comment References: <19991216190524.16032.qmail@lannert.rz.uni-duesseldorf.de> Message-ID: <38593CA9.37E36AB3@appliedbiometrics.com> lannert@lannert.rz.uni-duesseldorf.de wrote: ... > > It can be helpful, but that doesn't mean it needs to be in a separate > > file. :) > > Seconded! > > Wouldn't it be a Pythonic solution to regard a restricted namespace as a > "restricted dictionary" which can (a) refuse to accept new items once > it is declared closed (or frozen or fixated), and (b) refuse to accept > values for certain keys unless these values are compatible with a (list > of) type/class/interface spec(s)? (I guess Chris T. had something similar > in mind; hadn't you?) Yes of course. When I can get an effect by adding some sugar to semantics, and I can avoid any syntactic changes, then I try since I hate syntax. > d = RestrictedDict() ...and so on, easy to implement between supper and X chapters... What I was missing was the fact that you cannot get out of this is a static check that your ship will make it to the mars before you travel. This example from Guido really struck me. 
Still I'm not convinced that compile time and run time are different things, since Python itself is at the moment the best counterexample. There must be a third concept between runtime checks and compiletime syntactic distortion which we are misssing. Python's simplicity together with cleverness is one of its most attractive things for me. While I shouted "yeah" when it came to the type discussion, I quickly recognized that I don't want it to happen. Something inside me cries veto, wrong track. But I can't publish this without providing a better one. ciao - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From paul@prescod.net Thu Dec 16 18:28:13 1999 From: paul@prescod.net (Paul Prescod) Date: Thu, 16 Dec 1999 10:28:13 -0800 Subject: [Types-sig] Module Attribute visibility References: <000e01bf46dc$0050ada0$05a0143f@tim> Message-ID: <38592F3D.10A7AFA4@prescod.net> Tim Peters wrote: > > Resist the dubious temptation to conflate declaration with initialization, > and "an easy mechanical transformation to valid Python 1.5.x" consists of > commenting out the decl stmts! Heck, call the keyword "#\s+decl\s+" and > it's a nop. Okay, but doesn't Python already conflate declaration with initialization? When I refer to mymod.foo I am referring to an object that was assigned, somewhere to the name foo in the module mymod. Are we going to say that statically type checked code can only refer to declared (not merely assigned) variables in other modules? Would it be safe to say that undeclared variables are simply not available for type checking? Would you suggest that this is even the case for functions? I.e. def foo( str ): return str*2 is invisible to the type checker until we add: decl foo: str -> str Or would foo have an implicit declaration: decl foo: PyObject -> PyObject And if that foo has an implicit declaration, shouldn't this foo also: foo = lambda x: x*2 -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things never trust in: That's the vendor's final bill The promises your boss makes, and the customer's good will http://www.geezjan.org/humor/computers/threes.html From paul@prescod.net Thu Dec 16 18:28:19 1999 From: paul@prescod.net (Paul Prescod) Date: Thu, 16 Dec 1999 10:28:19 -0800 Subject: [Types-sig] Type annotations References: <3858E957.CBF8F4DF@prescod.net> <38592E22.7EC44C45@vet.uu.nl> Message-ID: <38592F43.11042753@prescod.net> Martijn Faassen wrote: > > I disagree that it gets too hairy. I'm advocating using Python > *precisely* because of the complex types. Python expressions can deal > with that kind of complexity right now. What's all this obsession with > syntax early on about anyway? It only distracts us from the real topic, > in my opinion.. I see the situation this way: Python has a type system. It works. (though there are some subtle improvements coming) Our job is to define a syntax and semantics for type assertions and also a) the operation of a software processor called a "type checker" b) changes to the runtime behavior of the PVM to support the accuracy of a) I don't see us as being at that "early on" of a stage. In my head, at least, the pieces are coming together nicely. 
At this point, I seem to be in agreement on most non-syntactic issues with Tim, Greg and Guido so I think that we are converging. I admit that I have not yet integrated all of the ideas of you, Edward and a few other people. I don't have time to read everyone else's work carefully and nobody has time to read everyone else's either! Maybe this email will help with that. It outlines what I see as the consensus so that we can debate these things one last time and put them behind us. I don't have time to write up all of the semantics of the system yet but the major parts are: * local variables types are usually inferred * module variables and instance variables may have type declarations * non-local writes are checked at runtime (by default) * for optimization, the checks may be stripped based on type inferenced information * function return types are NEVER inferred * ...they must be declared or assumed to be PyObject * "types" can be Python primitive types, or declared classes or interfaces * built-in types are declared through "shadow files" * but a function return statement could be verified based on inferencing to conform to its declaration * expression assertions support within-function assertions * function parameters can have declarations * function calls and assignments are checked at runtime if they cannot be verified at compile time * but you can ask for an explicit verification at compile time * which enables faster code to be generated * ...and verifies your understanding of what you are doing * types can be parameterized * which means that the compile/runtime checks need to be more sophisticated * we do not yet handle the "exception interface" of a function -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things never trust in: That's the vendor's final bill The promises your boss makes, and the customer's good will http://www.geezjan.org/humor/computers/threes.html From Edward Welbourne Thu Dec 16 19:35:55 1999 From: Edward Welbourne (Edward Welbourne) Date: Thu, 16 Dec 1999 19:35:55 +0000 Subject: [Types-sig] Attributes proposal Message-ID: > My proposal for handling attributes is this: > An attribute's type can be declared. Writes to the attribute from the > same module can be statically type checked (if requested). Writes to > the attribute from other modules are checked at runtime. That way we > can always know the type of the attribute value and can therefore make > reasonable use of the attribute in statically type checked functions. > Opinions? Sounds like a good way to cut that pie. At least for modules, also for classes within a module (at outer scope). For a class defined in the body of a function, ... hmm ... the right scope in which to static-check is the function, anything else in the module is outside (i.e. runtime-check). For (attributes of) an instance of a class, we seem to have a messier situation (its class may have lots of bases in lots of files ... so which module is playing as host ?). Did you intend this to apply to instances ? If so how ? Or did you intend to apply this only to attributes of modules ? Sometimes a package might want to modify its submodules and have such activity included in the static checking ... but punting on that sounds reasonable at this stage. Eddy. From paul@prescod.net Thu Dec 16 19:45:55 1999 From: paul@prescod.net (Paul Prescod) Date: Thu, 16 Dec 1999 11:45:55 -0800 Subject: [Types-sig] New syntax? References: Message-ID: <38594173.8762D205@prescod.net> There are two separate issues here. 
Separate files and separate syntaxes. And there are two different time periods here: today and in Python 2. Separate files are a necessity to handle C-coded types. Ergo anything on top of that is more work and given that we are still talking about something useful in a month (though that is looking less and less likely) I am not inclined to take on the extra work of new operators and an inline syntax. As far as separate syntaxes go, we are designing a new syntax regardless. There is no way to define the type of "map" in Python today. The question is whether the new syntax is built by overloading the meaning of Python basic types or whether it is just new and different. I mean we could outlaw new syntaxes in Python: from re import * compile( union( repeat( character_class( ["abc"] ), optional( negate( character_class ( ["def"]) ) ) That makes no sense to me. If you or someone proposes a completely Pythonic syntax that can handle type unions, parameterized types, lists and tuples gracefully then we can compare some declaration examples to a designed-from-scratch syntax and let Guido decide. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things never trust in: That's the vendor's final bill The promises your boss makes, and the customer's good will http://www.geezjan.org/humor/computers/threes.html From m.faassen@vet.uu.nl Thu Dec 16 19:54:23 1999 From: m.faassen@vet.uu.nl (Martijn Faassen) Date: Thu, 16 Dec 1999 20:54:23 +0100 Subject: [Types-sig] Interface files References: <3857E02B.53CF27AC@prescod.net> <3859007A.FD369FB2@vet.uu.nl> <38592CFC.DD175CAB@prescod.net> Message-ID: <3859436F.90846D66@vet.uu.nl> Paul Prescod wrote: > > Martijn Faassen wrote: > > > > * we don't have to debate about syntax anymore and can actually think > > about > > semantics without syntax confusion. > > Clean syntax helps comprehension. 5 syntaxes with uncertain semantics destroy comprehension. We won't know semantics until implementation. It's tough to design a nice syntax before you have tested your semantics. Then again, perhaps one of the syntaxes will blow me away and I'll relent. :) Regards, Martijn From gstein@lyra.org Thu Dec 16 20:28:13 1999 From: gstein@lyra.org (Greg Stein) Date: Thu, 16 Dec 1999 12:28:13 -0800 (PST) Subject: [Types-sig] separate files (was: Sorry!) In-Reply-To: Message-ID: On Thu, 16 Dec 1999, Edward Welbourne wrote: > >> Hmm, I'm not sure I read Guido's > >>> I think that any proposal that requires you to keep two separate files > >>> "in sync" is bound to fail in the long term. I left that crap behind > >>> in C++. But in the short term...okay. > > I didn't write the inner quote. > > Oops - sorry: in fact, that was Paul (drawback of snatching a look at > the threaded list during compiles and such) on > http://www.python.org/pipermail/types-sig/1999-December/000617.html > > >From that being Paul, I guess I should infer that the answer to the > question I posed later would be that the two-file scheme is a `for the > present' idea, which greatly reduces my twitchiness about it. But I don't think anybody should be planning on 2.0 to resolve things. That is at least two or three years away, if I'm not mistaken. I think we need an inline syntax in the 1.x series. I like Tim's approach so far; it seems like it should work although I might suggest some tweaks. 
Cheers, -g -- Greg Stein, http://www.lyra.org/ From scott@chronis.pobox.com Thu Dec 16 20:27:41 1999 From: scott@chronis.pobox.com (scott) Date: Thu, 16 Dec 1999 15:27:41 -0500 Subject: [Types-sig] separate files (was: Sorry!) In-Reply-To: References: Message-ID: <19991216152741.A6338@chronis.pobox.com> On Thu, Dec 16, 1999 at 12:28:13PM -0800, Greg Stein wrote: [...] > But I don't think anybody should be planning on 2.0 to resolve things. > That is at least two or three years away, if I'm not mistaken. > > I think we need an inline syntax in the 1.x series. I like Tim's approach > so far; it seems like it should work although I might suggest some tweaks. For what it's worth, I'd like to second this. scott From m.faassen@vet.uu.nl Thu Dec 16 20:26:57 1999 From: m.faassen@vet.uu.nl (Martijn Faassen) Date: Thu, 16 Dec 1999 21:26:57 +0100 Subject: [Types-sig] New syntax? References: <38594173.8762D205@prescod.net> Message-ID: <38594B11.766AE8A3@vet.uu.nl> Paul Prescod wrote: > > There are two separate issues here. Separate files and separate > syntaxes. And there are two different time periods here: today and in > Python 2. > > Separate files are a necessity to handle C-coded types. Um? Why? > Ergo anything on > top of that is more work and given that we are still talking about > something useful in a month (though that is looking less and less > likely) I am not inclined to take on the extra work of new operators and > an inline syntax. What about putting this extra information inside the module file itself? You need a separate file because you want to come up with your own syntax, but even then you can do: __types__ = """ def foo(int, int): hey: int hoi: [int] result: string bar: string grok: [string] class Mine: hm : float def __init__(self, int, string): self.yahoo: [int] self.dict: {string : int} temp: int def getYahoo(self): result: [int] def more(Mine, Mine): temp: int result: Mine class Parametric: firstparam: param secondparam: param def __init__(self): self.a: firstparam self.b: secondparam def hullo(self): result: firstparam whoops: Parametric(string, float) def optional(int, *(int, string, int)): pass def anotheroptional(int, *[Mine]): pass union: int or string """ which incidentally would be a neat Pythonic syntax. :) Regards, Martijn From guido@CNRI.Reston.VA.US Thu Dec 16 20:35:01 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Thu, 16 Dec 1999 15:35:01 -0500 Subject: [Types-sig] Interface files In-Reply-To: Your message of "Thu, 16 Dec 1999 11:24:55 PST." References: Message-ID: <199912162035.PAA11433@eric.cnri.reston.va.us> > Why in the heck should I have to go and code up a separate file? In a > separate language? That is nonsense. Really. And no, I'd rather not be > diplomatic here. Saying that we are going to use Yet Another Goddamned > Language is the wrong move. I'm not taking sides here, but I want to note that none of the takers on my latest challenge have shown separate interface files. All the ones I've seen used inline syntax. So perhaps it's not even necessary to get all bent out of shape over this one. Or perhaps one of the proponents could post an example instead of responding directly to Greg's screaming. (Too much sugar again, Greg? 
:-) --Guido van Rossum (home page: http://www.python.org/~guido/) From m.faassen@vet.uu.nl Thu Dec 16 20:36:31 1999 From: m.faassen@vet.uu.nl (Martijn Faassen) Date: Thu, 16 Dec 1999 21:36:31 +0100 Subject: [Types-sig] Interface files References: <199912162035.PAA11433@eric.cnri.reston.va.us> Message-ID: <38594D4F.E81989C4@vet.uu.nl> Guido van Rossum wrote: > > > Why in the heck should I have to go and code up a separate file? In a > > separate language? That is nonsense. Really. And no, I'd rather not be > > diplomatic here. Saying that we are going to use Yet Another Goddamned > > Language is the wrong move. > > I'm not taking sides here, but I want to note that none of the takers > on my latest challenge have shown separate interface files. All the > ones I've seen used inline syntax. So perhaps it's not even necessary > to get all bent out of shape over this one. Or perhaps one of the > proponents could post an example instead of responding directly to > Greg's screaming. (Too much sugar again, Greg? :-) I'm not a real proponent of interface files (I used an inline syntax before) but I just posted an example of a non inline syntax in this very thread. I'm hereby maliciously choosing both sides at the same time. :) Basically I came up with this non inline syntax to prove my point that even that can be turned into an inline one easily, but it may have merits of its own. Regards, Martijn From gstein@lyra.org Thu Dec 16 20:57:54 1999 From: gstein@lyra.org (Greg Stein) Date: Thu, 16 Dec 1999 12:57:54 -0800 (PST) Subject: [Types-sig] Module Attribute visibility In-Reply-To: <38592F3D.10A7AFA4@prescod.net> Message-ID: IMO, let's solve static type checking. Leave visibility and modification rules to another phase. They are orthogonal problems, and we would do well to reduce our problem set (and the amount of discussion thereby engendered (my 25 cent word for the day :-)). Really: please, can we table discussions on visibility and modification? Cheers, -g On Thu, 16 Dec 1999, Paul Prescod wrote: > Tim Peters wrote: > > > > Resist the dubious temptation to conflate declaration with initialization, > > and "an easy mechanical transformation to valid Python 1.5.x" consists of > > commenting out the decl stmts! Heck, call the keyword "#\s+decl\s+" and > > it's a nop. > > Okay, but doesn't Python already conflate declaration with > initialization? When I refer to mymod.foo I am referring to an object > that was assigned, somewhere to the name foo in the module mymod. > > Are we going to say that statically type checked code can only refer to > declared (not merely assigned) variables in other modules? Would it be > safe to say that undeclared variables are simply not available for type > checking? > > Would you suggest that this is even the case for functions? I.e. > > def foo( str ): return str*2 > > is invisible to the type checker until we add: > > decl foo: str -> str > > Or would foo have an implicit declaration: > > decl foo: PyObject -> PyObject > > And if that foo has an implicit declaration, shouldn't this foo also: > > foo = lambda x: x*2 > > -- Greg Stein, http://www.lyra.org/ From tismer@appliedbiometrics.com Thu Dec 16 20:55:10 1999 From: tismer@appliedbiometrics.com (Christian Tismer) Date: Thu, 16 Dec 1999 21:55:10 +0100 Subject: [Types-sig] New syntax? References: <38594173.8762D205@prescod.net> <38594B11.766AE8A3@vet.uu.nl> Message-ID: <385951AE.3D3142FD@appliedbiometrics.com> Martijn Faassen wrote: > > Paul Prescod wrote: ... 
> What about putting this extra information inside the module file itself? > You need a separate file because you want to come up with your own > syntax, but even then you can do: > > __types__ = """ > > def foo(int, int): > hey: int > hoi: [int] > result: string ... and so on ... Yes I like that, but tried this earlier in some thread, see here: [Greg, vaporizing runtime-looking type checks:-] ''' On Wed, 15 Dec 1999, Christian Tismer wrote: >... > It doesn't matter if there is an extra file, or you insert a > function call into your module, like > > system.interface("""triple quoted string defining interface""") > > without changes to the language but experimental syntaxes for > these IF files/strings. The compiler needs the information. This implies that you can't add the information procedurally. The mechanism must be "transparent" to the compiler. ''' So where is the difference. The compiler would have to treat __types__ as a special keyword and not an assignment target. In my example, it would need to know what "system.interface" kind of animal is. I used this explicitly to make my point clear, but this doesn't seem to help. I think people want new syntax since this assures a new meaning to some characters. I don't share this. If a new construct just happens to fit into the existing language, why must we forcibly invent a new escape? Yes this is all about escapes. We escape into syntax, or escape into different files. But nobody cares about None, which yet *can* be overwritten, and which has a special role although not being a special object. Also nobody cares that we use namespaces to "escape" semantics. Those __init__ constructs are escaping animals which are still in the language but have different meaning. Nobody would refuse to parse a class definition and see if it has an __init__, but for types we need a real new language? We escape to justify a wrong idea. > which incidentally would be a neat Pythonic syntax. :) Very nice, IMO. The string looks like a module in the module. But it needn't be a string. I'm not against a new concept if it fits other ideas. Opening a new context with new rules, why not? We have classes, functions etc, which all impose different semantics with nearly the same language. Now if we define an interface object which has the exceptional rule that it can *not* be generated dynamically by some tricks, but can only be written statically down (which is nonsense since I can write source code by program), then there would be just one keyword necessary to tell the type checker that there is something immutable in this module. This interface object can btw. of course contain code which is executed at compile time and be part of the type checking system. Well, I see an idea coming... sory, talking at length, I should go to sleep now - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From gstein@lyra.org Thu Dec 16 21:16:20 1999 From: gstein@lyra.org (Greg Stein) Date: Thu, 16 Dec 1999 13:16:20 -0800 (PST) Subject: [Types-sig] consensus(?) summary (was: Type annotations) In-Reply-To: <38592F43.11042753@prescod.net> Message-ID: On Thu, 16 Dec 1999, Paul Prescod wrote: >... 
> I don't have time to write up all of the semantics of the system yet but > the major parts are: > > * local variables types are usually inferred Woo! :-) > * module variables and instance variables may have type declarations Yes. I believe this is because these variables fall under the same rubric of "interface declarations." One interface for the module, another for a class. > * non-local writes are checked at runtime (by default) Hrm. Is there an easy rule to determine this? I might suggest deferring this unless/until we have a clear set of rules. Shades of C++'s "friend" modifier are forming in my head when we talk about this... > * for optimization, the checks may be stripped based on type inferenced > information Which checks? I think runtime checks are *ignored* if you run with -O. Python doesn't (yet) have different forms of compilation (or did I miss something?). Certainly, in 1.6 we can have different compilations by virtue of substituting a new compiler, but I think it would be nice to retain a single form of compilation. In reference to type-inferred information: I don't think runtime checks would ever be added if the type has been inferred. Issue: what are the rules for inserting runtime checks? When are they added and when are they not? Strawman: 1) they are added for function arguments which have type declarators (i.e. added as a function prologue). 2) they are added when the type-assert operator is used. > * function return types are NEVER inferred > * ...they must be declared or assumed to be PyObject > * "types" can be Python primitive types, or declared classes or > interfaces Agreed. > * built-in types are declared through "shadow files" This is somewhat problematic. How do we map from a builtin type to this shadow file? Do they reside in a well-known location? Second issue: keeping them in sync, version mismatches, distribution and install problems, etc. My recommendation would be to enable a mechanism by which modules can internally declare their interface. I recognize this is complex and would therefore defer any discussion regarding interfaces for builtin types. Note: and when I say "builtin type", I'm referring to things like "socket" rather than the "core types" such as List or Dict. > * but a function return statement could be verified based on > inferencing to conform to its declaration Yes. This would be a compile-time static check. > * expression assertions support within-function assertions > * function parameters can have declarations Agreed. > * function calls and assignments are checked at runtime if they cannot > be verified at compile time Function calls: yes. I'm not sure we would ever check assignments. See my response above, regarding knowing when the proper time is. Instead, I think that an interface is a statement (to users) about the types, but we don't necessarily have to enforce it. Hrm. This kind of falls under the concept of "verifying an implementation conforms to an interface." I would prefer to avoid that. > * but you can ask for an explicit verification at compile time > * which enables faster code to be generated > * ...and verifies your understanding of what you are doing I'm not clear on your points here. > * types can be parameterized > * which means that the compile/runtime checks need to be more > sophisticated Yes, although I might modify it somewhat and say "only core types can be parameterized." 
> * we do not yet handle the "exception interface" of a function Thank you :-) Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Thu Dec 16 21:28:28 1999 From: gstein@lyra.org (Greg Stein) Date: Thu, 16 Dec 1999 13:28:28 -0800 (PST) Subject: [Types-sig] New syntax? In-Reply-To: <38594B11.766AE8A3@vet.uu.nl> Message-ID: On Thu, 16 Dec 1999, Martijn Faassen wrote: > Paul Prescod wrote: > > There are two separate issues here. Separate files and separate > > syntaxes. And there are two different time periods here: today and in > > Python 2. > > > > Separate files are a necessity to handle C-coded types. > > Um? Why? > > > Ergo anything on > > top of that is more work and given that we are still talking about > > something useful in a month (though that is looking less and less > > likely) I am not inclined to take on the extra work of new operators and > > an inline syntax. > > What about putting this extra information inside the module file itself? > You need a separate file because you want to come up with your own > syntax, but even then you can do: > > __types__ = """ ... > """ > > which incidentally would be a neat Pythonic syntax. :) Really. We don't want a separate syntax. Think about the parsing. Who is going to parse it? Are you suggesting that we have the Python parser doing some code parsing, then we invoke another parser to parse interface information, then we pass those blobs off to the compiler (and type inferencer/checker/optimizer/etc) ? No way. Use one parser for code and interface information. Inline vs. external is a different question (and I vote for former). But different syntaxes is a big problem that is easily avoided. Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Thu Dec 16 21:29:37 1999 From: gstein@lyra.org (Greg Stein) Date: Thu, 16 Dec 1999 13:29:37 -0800 (PST) Subject: [Types-sig] New syntax? In-Reply-To: <38594173.8762D205@prescod.net> Message-ID: On Thu, 16 Dec 1999, Paul Prescod wrote: >... > If you or someone proposes a completely Pythonic syntax that can handle > type unions, parameterized types, lists and tuples gracefully then we > can compare some declaration examples to a designed-from-scratch syntax > and let Guido decide. Tim has come up with a good first pass. If we can formalize that (and I'd like to tweak it), then we have the basis we need to move forward. Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Thu Dec 16 21:36:55 1999 From: gstein@lyra.org (Greg Stein) Date: Thu, 16 Dec 1999 13:36:55 -0800 (PST) Subject: [Types-sig] challenge response (was: A challenge) In-Reply-To: <14424.31434.689571.714592@dolphin.mojam.com> Message-ID: On Wed, 15 Dec 1999, Skip Montanaro wrote: > Greg> Line 7: per caveat #1, assume the compiler can access the > Greg> find.find() function. From that, it knows the signature. The first > Greg> parameter has a matching type, but the second (PyObject) does not > Greg> match the required type (String), so an error is raised. If caveat > Greg> #5 is resolved, then the second parameter matches. It is also > Greg> possible to avoid the error by rewriting: > > Greg> list = find.find("*.py", dir!StringType) # 7 > > Greg> "list" is now a ListType, based on the find.find() return > Greg> value. (see caveat #5 -- it could be possible to refine this > Greg> knowledge). > > I humbly assert this train of thought rates a *bzzzt*. I thought one core > requirement was that all type declaration stuff be optional. 
The worst that > the type checker/inferencer should do in the face of incomplete type info is > display a warning. I don't think you can flag an error unless the > programmer sets some sort of PY_ANAL_TYPE_CHECKING_AND_I_REALLY_MEAN_IT > environment variable. My entire post was pre-conditioned on the assumption that type-checking has been enabled. IMO, type checking is NOT enabled by default. I believe it will impose a noticable performance penalty and I'm not willing to pay that in the general case. Periodically, I'll turn it on and run it over my code (and in that sense, type-checking as a lint-like tool is probably okay with me; I'm more interested in typing for its (OPT) features). As a side note: pulling in my Strawman from another thread, re: when to insert runtime checks -- the determination is entirely based on syntax, rather than a type analysis (or failure thereof). In other words, even if we disable compile-time checking, we still end up with the same output (which includes runtime checks where applicable). Cheers, -g -- Greg Stein, http://www.lyra.org/ From skip@mojam.com (Skip Montanaro) Thu Dec 16 21:39:45 1999 From: skip@mojam.com (Skip Montanaro) (Skip Montanaro) Date: Thu, 16 Dec 1999 15:39:45 -0600 (CST) Subject: [Types-sig] New syntax? In-Reply-To: <38594B11.766AE8A3@vet.uu.nl> References: <38594173.8762D205@prescod.net> <38594B11.766AE8A3@vet.uu.nl> Message-ID: <14425.23585.74942.844600@dolphin.mojam.com> Paul> Separate files are a necessity to handle C-coded types. Martijn> Um? Why? My guess is so that declaration and definition are separated. If "definition" roughly means "import", you'd like to get at an object's interface without actually importing (or perhaps even parsing it?) the module it is defined in and risking the side effects of import. Skip Montanaro | http://www.mojam.com/ skip@mojam.com | http://www.musi-cal.com/ 847-971-7098 | Python: Programming the way Guido indented... From skip@mojam.com (Skip Montanaro) Thu Dec 16 21:43:17 1999 From: skip@mojam.com (Skip Montanaro) (Skip Montanaro) Date: Thu, 16 Dec 1999 15:43:17 -0600 (CST) Subject: [Types-sig] consensus(?) summary (was: Type annotations) In-Reply-To: References: <38592F43.11042753@prescod.net> Message-ID: <14425.23797.624232.17777@dolphin.mojam.com> >> * non-local writes are checked at runtime (by default) Greg> Hrm. Is there an easy rule to determine this? In particular, is the following a non-local write? import sys p = sys.path p.append("/usr/local/lib/other") Skip Montanaro | http://www.mojam.com/ skip@mojam.com | http://www.musi-cal.com/ 847-971-7098 | Python: Programming the way Guido indented... From m.faassen@vet.uu.nl Thu Dec 16 21:47:43 1999 From: m.faassen@vet.uu.nl (Martijn Faassen) Date: Thu, 16 Dec 1999 22:47:43 +0100 Subject: [Types-sig] New syntax? References: Message-ID: <38595DFF.B0B1751D@vet.uu.nl> Greg Stein wrote: [snip Pythonic syntax stuff I perversely proposed] > Really. We don't want a separate syntax. > > Think about the parsing. Who is going to parse it? Are you suggesting that > we have the Python parser doing some code parsing, then we invoke another > parser to parse interface information, then we pass those blobs off to the > compiler (and type inferencer/checker/optimizer/etc) ? That's not really what I'm proposing; I was proposing using Python at least for the first shot at things. But, this does appear to be what Paul's proposing. Paul doesn't consider writing a new parser a problem, I do think it'll hold us back when we could better be discussing semantics. 
But since Paul thinks syntax is important I'm obliging with something that seems Pythonic. Because I'm Dutch I get bonus points anyway. ;) > No way. Use one parser for code and interface information. All right: foo = 1: def bar(i, j): return i + j vardef bar(int, int): return int class Foo: alpha = 1 def __init__(self, beta): self.beta = beta def getbeta(self): return self.beta varclass Foo: alpha: int def __init__(self, int): self.beta: int def getbeta(self): result: int :) # not part of syntax > Inline vs. external is a different question (and I vote for former). But > different syntaxes is a big problem that is easily avoided. So what are you suggesting if you would be voting for external, then? A Python based system such as the one I proposed earlier? Or is this why you're voting for internal? Regards, Martijn From gstein@lyra.org Thu Dec 16 21:55:54 1999 From: gstein@lyra.org (Greg Stein) Date: Thu, 16 Dec 1999 13:55:54 -0800 (PST) Subject: [Types-sig] doc strings (was: Sorry!) In-Reply-To: Message-ID: On Thu, 16 Dec 1999, Edward Welbourne wrote: >... > (I'd even be happy with the typedecl incorporating the docstring, which > is part of the interface spec after all: and would make the run-time > thing actually called be lighter-weight in some probably-irrelevant > sense.) Actually, there are two doc-strings in two places: one is the > doc-string of (say) the function object - it says what the function does > - the other is where some object carrying that function documents the > role of the attribute as which it stores that object. This is directly I agree with the doc string thing. Specifically, imagine something like this: ---------- # module foo decl some_global: String "some_global is used for ..." decl some_func: def(Int) -> None "this function does ..." ---------- Note that JimF's interface proposal allows attaching doc strings to the elements of an interface. I believe the "decl" statement and associated doc strings would be the (syntactical) subtitution for his runtime solution. Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Thu Dec 16 22:13:11 1999 From: gstein@lyra.org (Greg Stein) Date: Thu, 16 Dec 1999 14:13:11 -0800 (PST) Subject: [Types-sig] New syntax? In-Reply-To: <38595DFF.B0B1751D@vet.uu.nl> Message-ID: On Thu, 16 Dec 1999, Martijn Faassen wrote: >... example syntax for a Python syntax to declare interfaces ... Ah. Good. That is better than your/Paul's previous suggestions. > > Inline vs. external is a different question (and I vote for former). But > > different syntaxes is a big problem that is easily avoided. > > So what are you suggesting if you would be voting for external, then? A > Python based system such as the one I proposed earlier? Or is this why > you're voting for internal? By "for former", I meant that I want an internal syntax. Something like Tim's suggestion. It keeps the declaration closest to the implementation, which (IMO) is best. It is kind of like comments and code: they can easily drift apart, especially if the two are distant from each other. In your example above, I think it would be a bit painful to flip back and forth between the "class" and "varclass" every time you wanted to add a method. In fact, I don't even like Tim's notion of declaring a function since a "def" is more than adequate for doing that. I would like to see something like: #--------------------------------------------------------- class Foo: decl class a: Int "The class variable is for ..." a = 1 decl b: String "Member variable. 
Alternative location for a doc string" def bar(x: Int, y: String) -> List: "Doc string goes here" ... return some_list #--------------------------------------------------------- Note that an interface definition would look exactly the same, except that "interface" would be used instead of "class", variable assignments are not allowed, and functions cannot have a body (only a doc string). Note the use of "decl class ..." to define class variables, while "decl ..." is for member variables. I'm not sure if we should instead use Tim's suggestion of "decl member ...", though. Given the position of the declaration, I think "decl member" might actually be better because it makes it clear that is a *member* variable, despite being in a location that is normally used for class variables. An alternative would be a different "decl" keyword just for members. [ I *really* don't like member declarations in the __init__() method as some people have shown. Those could be confused with declarations of local vars, which I hope we aren't going to have. ] Consider the above example, my latest proposal for syntax changes in support of declarations. Obviously, a bit more detail is needed for things like parameterized types, but I think the above is representative of where I'd like to see things go. And I won't suggest anything for an external syntax, since I don't support that :-) Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Thu Dec 16 22:16:18 1999 From: gstein@lyra.org (Greg Stein) Date: Thu, 16 Dec 1999 14:16:18 -0800 (PST) Subject: [Types-sig] New syntax? In-Reply-To: <14425.23585.74942.844600@dolphin.mojam.com> Message-ID: On Thu, 16 Dec 1999, Skip Montanaro wrote: > Paul> Separate files are a necessity to handle C-coded types. > > Martijn> Um? Why? > > My guess is so that declaration and definition are separated. If > "definition" roughly means "import", you'd like to get at an object's > interface without actually importing (or perhaps even parsing it?) the > module it is defined in and risking the side effects of import. I don't think so... Paul was referring to C-coded extension types. Therefore, Python syntax (or any other syntax) is not available. The interface would need to be programmatically defined, or it would occur in a separate file. As I mentioned in another thread, I think we should defer this problem. There are too many issues, none of which move us (or hinder us) from getting to a first-version type system. I think we can revisit this and add some new concepts, code, whatever, to solve the problem (without tearing down or interfering with what we did in Rev 1). Cheers, -g -- Greg Stein, http://www.lyra.org/ From m.faassen@vet.uu.nl Thu Dec 16 22:19:50 1999 From: m.faassen@vet.uu.nl (Martijn Faassen) Date: Thu, 16 Dec 1999 23:19:50 +0100 Subject: [Types-sig] New syntax? References: Message-ID: <38596586.63D9E29F@vet.uu.nl> Greg Stein wrote: > > On Thu, 16 Dec 1999, Martijn Faassen wrote: > > >... example syntax for a Python syntax to declare interfaces ... > > Ah. Good. That is better than your/Paul's previous suggestions. It was almost exactly the same as before, though. > > > Inline vs. external is a different question (and I vote for former). But > > > different syntaxes is a big problem that is easily avoided. > > > > So what are you suggesting if you would be voting for external, then? A > > Python based system such as the one I proposed earlier? Or is this why > > you're voting for internal? > > By "for former", I meant that I want an internal syntax. 
Yes, I understood that. I just was curious what you meant when you stated you'd be something parsed with Python anyway, even if it were external. It's hard to come up with external type annotation syntax that doesn't include a new language. > Something like > Tim's suggestion. It keeps the declaration closest to the implementation, > which (IMO) is best. It is kind of like comments and code: they can easily > drift apart, especially if the two are distant from each other. That's true. It is a disadvantage. > In your example above, I think it would be a bit painful to flip back and > forth between the "class" and "varclass" every time you wanted to add a > method. Yes, but I think my proposal is rather easy to understand for Python programmers, as it looks almost exactly like Python in structure. The flipping back and forth is a bit painful, though, I agree. The advantage of separation though is that it can actually be made to look exactly like Python structures, which is rather neat. [snip] > [ I *really* don't like member declarations in the __init__() method as > some people have shown. Those could be confused with declarations of > local vars, which I hope we aren't going to have. ] Well, my syntax proposal avoids this confusion by following Python's lead: varclass Foo: alpha: int def __init__(self): self.member: int local: int > Consider the above example, my latest proposal for syntax changes in > support of declarations. Obviously, a bit more detail is needed for things > like parameterized types, but I think the above is representative of where > I'd like to see things go. Didn't you think parameterized types looked fairly straightforward in my syntax proposal? > And I won't suggest anything for an external syntax, since I don't support > that :-) Right, that was what I was curious about. :) Regards, Martijn From evan@4-am.com Thu Dec 16 22:25:26 1999 From: evan@4-am.com (Evan Simpson) Date: Thu, 16 Dec 1999 16:25:26 -0600 Subject: [Types-sig] Return of the Docstring: The Typening Message-ID: <385966D5.BAF592C4@4-am.com> I'm sure this was bandied around in the (distant) past, but since backward-compatible inline syntaxes are being proposed, I thought I'd resurrect it: Put the type constraints (of whatever syntax) in docstrings (and ignored strings). Shadow files for extensions can simply be .pyt's with dummy objects definitions, or could be more compact. Example: def foo(x1, x2): '''A foo function ::(int, int) -> int''' return x1+x2 'bar:: string'; bar = "I'm a string!" class Mine: '''My class! Mine! hm, uh:: float''' hm = 1.0 uh = 3.14 This could also serve as a shadow file, or perhaps a more compact notation: def foo::(int, int) ->int bar:: string class Mine: hm, uh:: float That is, if we're going to have name-type declarations at all. I'm rather partial to expression-type constraints with 'as' instead of '!'. Perform single-module analysis at compile time (if requested) to produce a type-inference graph such as Tim(?) described. Save the graph in a *.pyt file, then have a tool which uses them to do full-program type-checking, and possibly rewrite the *.pyc if optimization is possible and requested. I still like the Sparrow/SPython concept, too . Cheers, Evan @ 4-am From gstein@lyra.org Thu Dec 16 23:05:58 1999 From: gstein@lyra.org (Greg Stein) Date: Thu, 16 Dec 1999 15:05:58 -0800 (PST) Subject: [Types-sig] bunch o' stuff (was: minimal or major change?) In-Reply-To: <3858E9C2.E2722B88@vet.uu.nl> Message-ID: On Thu, 16 Dec 1999, Martijn Faassen wrote: >... 
> [I'm disagreeing with the 'isn't that big of a change' thesis, Greg > defends fairly > well that it is, but I still disagree with him. I don't think our > disagreeing will matter much in the future, though, so let's forget > about it.. Not a problem. "Agree to disagree" is quite civilized and proper :-) >... > > > * A whole new operator (which you can't overload..or can you?), which > > > does something quite unusual (most programmers associate types with > > > names, not with expressions). The operation also doesn't actually return > > > much that's useful to the program, so the semantics are weird too. > > > > No, you cannot overload the operator. That would be a Bad Thing, I think. > > That would throw the whole type system into the garbage :-). > > Okay, in that sense the operator would be special, as generally > operators > in Python can be overloaded (directly or indirectly). I'd agree you > shouldn't be able to overload this one, though. Well, I hope to not consider it "special". In my mind, it is just another operator. It has some semantics the compiler can take advantage of, sure, but it isn't like a pragma or some other meta-level thing. However, its semantics don't lend well to overloading. *shrug* Assuming we end up proposing this operator to Guido for inclusion as part of the new type system, then he can certainly make a call on whether it should be possible to overload it. > > The operator is not unusual: it is an inline type assertion. It is not a > > "new-fangled way to declare the type of something." > > But it's quite unusual to the programmer coming from most other > languages, still. That doesn't mean it's bad, but Python isn't an > experimental language, so this could be an objection to the operator > approach. Perl has the ~= operator, which has unusual semantics for a programmer coming from Python. Python's slice operator is not available to a C or C++ programmer, but people don't complain about that. Point is: each language has its own set of operators to solve problems within that language's domain. I see this operator as a pretty neat and clean way to resolve Python's (current) lack of type declarations. And I disagree with the notion "Python isn't an experimental language." It is one of the few to natively support complex types, sophisticated slicing, builtin dictionary types, and keyword arguments. >... > > > * Interfaces with a new 'decl' statement. [If you punt on this you'll > > > have to the innocent Python programmer he can't use the static type > > > system with instances? or will we this be inferenced?] > > > > Yes, I'd prefer to punt this for a while, as it is a much larger can of > > worms. It is another huge discussion piece. In the current discussion, I > > believe that we can factor out the interface issue quite easily -- we > > can do a lot of work now, and when interfaces arrive, they will slide > > right in without interfering with the V1 work. In other words, I believe > > there is very little coupling between the proposal as I've outline, and > > the next set of type system extensions (via interfaces). > > Hm, I'm still having some difficulty with this; as I understand it your > proposal would initially only work with functions (not methods) which > only use built-in types (not class instances). Am I right, or perhaps > I'm missing something.. Methods are actually function objects. When I've referred to functions, I'm talking about functions and methods. 
In other words: class Foo: def bar(x: String, y:String) -> String: pass In the above code, the bar() method has a type signature, which can be type-checked. Since writing the quoted text, I've read the interface proposal and thought more on the "decl" statement. I am now in favor of including "decl" in V1, thus providing types for all portions of an interface (attributes and method). >... > > > Adding anything like static type checking to Python entails fairly major > > > changes to the language, I'd think. Not that we shouldn't aim at keeping > > > those transparant and mostly compatible with Python as it is now, but > > > what we'll add will still be major. > > > > Sure. > > You say 'sure' to me saying it'll still be major? :) Oh, wait, I wasn't > arguing about that anymore! I'm not sure what I was referring to. Sorry about that. I think I meant, "yes, we should aim at keeping things transparent and compatible." At least, that's what I mean now when I re-read and re-comment on your text :-) >... > > > > > The 'simplicity' part comes in because you don't need *any* type > > > > > inferencing. Conceptually it's quite simple; all names need a type. > > > > > > > > 1) There is *no* way that I'm going to give every name a type. I may as > > > > well switch to Java, C, or C++ (per Guido's advice in another email :-) > > > > > > Sure, but we're looking at *starting* the process. Perhaps we can do > > > away with specifying the type of each local variable very quickly by > > > using type inferencing, but at least we'll have a working > > > implementation! > > > > I don't want to start there. I don't believe we need to start there. And > > my point (2) below blows away your premise of simplicity. Since you still > > need inferencing, the requirement to declare every name is not going to > > help, so you may as well relax that requirement. > > But you'd only need expression inferencing, which I was ('intuitively' > :) assuming is easier than the larger scale thing. Yes, expression-level inferencing is easier, as you don't have to worry about code like this: a = 1 while 1: func_which_takes_int(a) a = "foo" The above code should raise a type-check error. Tim referred to the above problem when he talked about "reaching a stable state," although it probably wasn't obvious to most readers :-) If names have types, then the a="foo" line would raise an error. In a purely inferenced world, the inferencer (eventually) figures out that can have one of two types at the time of the function call. It then raises an error saying "func_which_takes_int expect an Int, but a may be a String." >... > > > I'm not saying this is a good situation, it's just a way to get off the > > > ground without having to deal with quite a few complexities such as > > > inferencing (outside expressions), interaction with modules that don't > > > have type annotations, and so on. I'm *not* advocating this as the end > > > point, but I am advocating this as an intermediate point where it's > > > actually functional. > > > > IMO, it is better to assume "PyObject" when you don't have type > > information, rather than throw an error. Detecting the lack of type info > > is the same in both cases, and the resolution of the lack is easy in both > > mehtods: throw an error, or substitute "PyObject". I prefer the latter so > > that I don't have to update every module I even get close to. > > I still don't understand how making it a PyObject will help here. 
Would > this mean a run-time check would need to be inserted whenever PyObject > occurs in a function with type annotations? In my approach this would be > part of the Python/Static Python interface work. How does it fit in for > you? The PyObject approach means that you don't throw an error. There are no runtime checks or compile time checks. They are simply unavailable since you have no type information. Using PyObject will help because it means you aren't raising errors simply because some module has not added type declarations. Instead, the compiler just uses the "unknown" (PyObject) type and keeps going. Of course, that may cause type errors later, but that is resolvable with the type-assert operator (which inserts a run-time check, and tells the compiler what type you're expecting it to be). >... > > > Yes, but now you're building a static type checker *and* a Python > > > compiler inserting run time checks into bytecodes. This is two things. > > > This is more work, and more interacting systems, before you get *any* > > > payoff. My sequence would be: > > > > Who says *both* must be implemented in V0.1? If the compiler can't figure > > it out, then it just issues a warning and continues. Some intrepid > > programmer comes along and tweaks the AST to insert a runtime check. Done. > > The project is easily phased to give you a working system very quickly. > > > > Heck, it may even be easier for the compiler to insert runtime checks in > > V0.1. Static checking might come later. Or maybe an external tool does the > > checking at first; later to be built into the compiler. > > That's true; the other approach would start with adding run-time checks > and proceed to a static checker later. Yes, that's what I said :-) First, we add the new typedecl syntax. Then, if the compiler inserts runtime checks for function arguments and as a result of the type-assert operator, then we have a good first pass. Next comes an external tool to consume type information and perform type inferencing and checking. Finally, we decide on integrating the external tool into the compiler proper. >... > So that's where I'm coming from. It's important for our proposal to > actually come up with a workable development plan, because adding type > checking to Python is rather involved. So I've been pushing one course > of implementation towards a testable/hackable system that seems to give > us the minimal amount of development complexities. I haven't seen clear > development paths from others yet; most proposals seem to involve both > run-time and compile-time developments at the same time. I haven't seen any, let alone clear, discussions from others about development paths :-) But I don't think anybody is going to advocate a system that will take a while to bring up, so I think we're all in agreement here. >... > > > This'd be only implementable with run-time assertions, I think, unless > > > you do inferencing and know what the type the object is after all. So > > > that's why I put the limitation there. Don't allow unknown objects > > > entering a statically typed function before you have the basic static > > > type system going. After that you can work on type inference or cleaner > > > interfaces with regular Python. > > > > Why not allow unknown objects? Just call it a PyObject and be done with > > it. > > Hm, I suppose I'm looking at it from the OPT point of view; I'd like to > see a compiler that exploits the type information. 
If you have PyObjects > this seems to get more difficult; could be solved if you had an > interpreter waiting in the sidelines that would handle stuff like this > that can't be compiled. The compiler can exploit type information, sure. But we're talking about the case where type information is not available. Rather than just failing, the compiler just doesn't optimize. Using the type-assert operator, you can get the compiler cranking up again (of course, you could also go and add type annotations to the code being called). > > Note that the type-assert operator has several purposes: > > > > * a run-time assertion (and possibly: unless -O is used) > > * signal to the compiler that the expression value will have that type > > (because otherwise, an exception would hav been raised) > > * provides a mechanism to type-check: if the compiler discovers (thru > > inferencing) that the value has a different type than the right-hand > > side, then it can flag an error. > > > > The limitation you propose would actually slow things down. People would > > not be able to use the type system until a lot of modules were > > type-annotated. > > I think I'm starting to see where you're coming from now, with the ! > operator. It allows you to say 'from this point on, this value is an > int, otherwise the operator would've raised an exception'. The > inferencer and checker can exploit this. Exactly! The compiler can also use it to perform various optimizations, since it now knows the (guaranteed) type. > The point where I am coming > from is however that you lose compile-time checkability as soon as you > use any function that inserts PyObjects into the mix. I'm afraid that > even with the operator, you wouldn't be able to check most of the code, > if PyObjects are freely allowed. Perhaps I'm wrong, but I'd like to see > some more debate about this. Yes, you lose it, but that doesn't mean you throw the baby out with the bath water. The compiler just degrades gracefully in the presence of a PyObject. With the type-assert operator, you effectively convert that PyObject into a known type which the compiler can then use in later checks and optimizations. > > > But perhaps I'm mistaken and local variables don't need type > > > descriptions, as it's easy to do type inferencing from the types of the > > > function arguments and what the function returns, > > > > That is my (alas: unproven) belief. > > How do we set about to prove it? I don't need a proof :-). I think we *can* use inferencing to avoid decls for local variables. In fact, I am positive of it, and instead would like to hear a counter-proof. > Here I'll come with my approach again; > if you have a type checker that can handle a fully annotated function > (all names used in the function have type annotations), then you have a > platform you can build on to develop a type checker. Then you can figure > out what does need type annotations and what doesn't. You simply try to > build code that adds type annotations itself, based on inferences. You > can spew out warnings: "full type inferencing not possible, cannot > figure out type of 'foo'". The programmer can then go add type info for > 'foo'. If all types are known one way (specified) or the other > (inferred), a compiler can start to do heavy duty optimization on that > code. I do not believe that developing a type checker for fully-annotated functions is going to help in any way towards building an inferencer. In other words, we just build the inferencer. 
However, I do see that a compiler that knows all types is a good first step. Using those types, it can do various things (e.g. type checks on func args, various optimizations). Where it gets the type information is the point of discussion here :-) I'd rather just start with inferencing rather than modifying the syntax to support typing of locals, only to pull that syntax change out later. Note: your proposal of __types__ would be useful during development of the compiler (presuming that occurs before the inferencer is available). __types__ requires no syntax changes, so it can give the compiler the info right away. Later, we just stop looking for __types__ and use the inferencer. >... > [snip] > > > I'd like to see some actual > > > examples of how this'd work first, though. For instance: > > > > > > def brilliant() ! IntType: > > > a = [] > > > a.append(1) > > > a.append("foo") > > > return a[0] > > > > > > What's the inferred type of 'a' now? A list with heterogenous contents, > > > that's about all you can say, and how hard is it for a type inferencer > > > to deduce even that? > > > > It would be very difficult for an inferencer. It would have to understand > > the semantics of ListType.append(). Specifically, that the type of the > > argument is added to the set of possible types for the List elements. > > > > Certainly: a good inferencer would understand all the builtin types and > > their methods' semantics. > > > > > But for optimization purposes, at least, but it > > > could also help with error checking, if 'a' was a list of IntType, or > > > StringType, or something like that? > > > > It would still need to understand the semantics to do this kind of > > checking. In my no-variable-declaration world, the type error would be > > raised at the return statement. a[0] would have the type set: (IntType, > > StringType). The compiler would flag an error stating "return value may be > > a StringType or an IntType, but it must only be an IntType". > > Right, I think this would be the right behavior. But it becomes a lot > easier to get a working implementation if you get to specify the type of > 'a'. If you say a is a list of StringType, it's then relatively easy for > a compile time checker to notice that you can't add an integer to it. Well, kind of. The checker would sitll have to understand that a.append() is going to insert that value into the list, so that appending an Int would generate a type conflict. Re: working implementation faster: this presumes that the compiler will use the type declarations before the inferencer is available. > And possibly it also becomes clearer for the programmer; I had to think > to figure out why your compiler would complain about a[0]. I had to play > type inferencer myself. I don't have to think as much if I get to > specify what list 'a' may contain; obviously if something else it put > into it, there should be an error. The programmer never has to think about type inferencing. That only exists to create type-check warnings/errors. The programmer believes that he has a list of integers and codes that way. The inferencer then comes along and tells him that he goofed up. Declaring up front simply moves the error from the return statement to the point where the wrong type was inserted into . It is arguable that this is preferable. > > > It seems tough for the type > > > inferencer to be able to figure out that this is so, but perhaps I'm > > > overestimating the difficulty. 
> > > > Yes it would be tough -- you aren't overestimating :-) > > What would your path towards successful implementation be, then? * add the syntax changes (decl, def changes, and !) * change the compiler to use the new syntax to insert runtime checks * develop external tool to do type checking * possibly integrate the tool into the compiler Note that the external tool will start with rudimentary type inference and analysis. It will then grow in complexity as more capability is added. For example, initially, it might only know "a" + 1 is a type error. Later, it would be able to do some simple inference based on data flow. Later still, it would recognize problems like the "while" example I listed above. Also note that I'm not sure we ever put type-checking into the core interpreter. If it isn't going to alter the compilation output, then why put it in? In other words: somebody ought to come up with a list of things they expect the compiler to alter in the *bytecodes* based on the type information (Python doesn't really have type-specific bytecodes (yet)). Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Thu Dec 16 23:11:27 1999 From: gstein@lyra.org (Greg Stein) Date: Thu, 16 Dec 1999 15:11:27 -0800 (PST) Subject: [Types-sig] New syntax? In-Reply-To: <38596586.63D9E29F@vet.uu.nl> Message-ID: On Thu, 16 Dec 1999, Martijn Faassen wrote: >... > > [ I *really* don't like member declarations in the __init__() method as > > some people have shown. Those could be confused with declarations of > > local vars, which I hope we aren't going to have. ] > > Well, my syntax proposal avoids this confusion by following Python's > lead: > > varclass Foo: > alpha: int > > def __init__(self): > self.member: int > local: int Quite true. This is much clearer. But I still want "decl" rather than "varclass" :-) > > Consider the above example, my latest proposal for syntax changes in > > support of declarations. Obviously, a bit more detail is needed for things > > like parameterized types, but I think the above is representative of where > > I'd like to see things go. > > Didn't you think parameterized types looked fairly straightforward in my > syntax proposal? Yes. I would expect something like that for a new typedecl syntax. Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Thu Dec 16 23:20:21 1999 From: gstein@lyra.org (Greg Stein) Date: Thu, 16 Dec 1999 15:20:21 -0800 (PST) Subject: [Types-sig] Attributes proposal In-Reply-To: <38592D38.63057A1A@prescod.net> Message-ID: On Thu, 16 Dec 1999, Paul Prescod wrote: > My proposal for handling attributes is this: > > An attribute's type can be declared. Writes to the attribute from the > same module can be statically type checked (if requested). Writes to the > attribute from other modules are checked at runtime. That way we can > always know the type of the attribute value and can therefore make > reasonable use of the attribute in statically type checked functions. > > Opinions? Punt issues of writeability to a later revision. Concentrate on type checking instead. Assume that an attribute's declared type is correct. Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Thu Dec 16 23:29:21 1999 From: gstein@lyra.org (Greg Stein) Date: Thu, 16 Dec 1999 15:29:21 -0800 (PST) Subject: [Types-sig] check_type() In-Reply-To: <3857E01D.6C699075@prescod.net> Message-ID: On Wed, 15 Dec 1999, Paul Prescod wrote: > Greg Stein wrote: ... 
> > > j=has_type( foo, types.StringType ) or has_type( foo, types.ListType ): > > > > You'll have issues with empty strings and empty lists, as Guido pointed > > out. > > Yes, you have to use it in ways that follow Python's boolean rules. A > better name would be check_type. > > j=check_type( foo, types.StringType ) > > > has_type() does not create a *definitive* type assertion. The compiler > > cannot extract any information from the presence of has_type(). Using an > > operator which raises an exception allows the compiler to make the > > assertion (and thereby assist with type inferencing and type checking). > > j=check_type( foo, types.StringType) > > j is *guaranteed* to be either a string or None. But that is a problem right there: you've introduced the possibility that might be None. While the compiler / type-checker can certainly do something useful with that concept, this does not provide a way to guarantee a *single* type. j = check_type(foo, String) func_taking_string(j) The above will fail because the compiler will flag the possibility of being None. j = foo ! String func_taking_string(j) Now that works :-) > Note that check_type is actually an operator in that it cannot be > overwritten or shadowed. It just happens to be an operator that looks > like a function and that returns a useful value instead of immediately > causing an exception. It also happens to be compatible with the current > Python grammar. Icky. Either the compiler now has to understand that NAME(...) could possibly require special processing, or the parser recognizes it and constructs a new AST node. The former is badness, and the latter says you're changing the parser, so why not use a "real" operator? People might also be tempted to do: ct = check_type ct(x, String) ct(y, Int) ct(z, List) Of course, this will fail because check_type isn't really a valid name. > I have big aesthetic problems with adding a special character to a > language that uses the word "or" to mean, well "or" and "not" to mean > "not". I might be able to live with > > "k = eval('1') as int" > > if it isn't too horribly ambiguous. Using "as" instead of "!" would be non-ambiguous. However, the word "as" seems to imply "use it as an int" rather than an assertion that it *is* an integer. Of course, we can't use the word "is" because it can already be used inside an expression. "isa" might even be more appropriate, and is available for usage. For now, I'll keeping using '!', but I'm on record as being open to alternate representations for the operator. Cheers, -g -- Greg Stein, http://www.lyra.org/ From tim_one@email.msn.com Fri Dec 17 01:17:10 1999 From: tim_one@email.msn.com (Tim Peters) Date: Thu, 16 Dec 1999 20:17:10 -0500 Subject: [Types-sig] Re: RFC 0.1 In-Reply-To: <385691D3.6DC4A36E@vet.uu.nl> Message-ID: <000801bf482c$71670100$63a2143f@tim> [Martijn Faassen] > While my agenda is to kill the syntax discussions for the moment, > ... Martijn, in that case you should stop feeding the syntax meta-discussion and just view all the other notations as virtual spellings for masses of obscure nested dicts . notation-is-an-aid-to-comprehension-ly y'rs - tim From tim_one@email.msn.com Fri Dec 17 01:17:05 1999 From: tim_one@email.msn.com (Tim Peters) Date: Thu, 16 Dec 1999 20:17:05 -0500 Subject: [Types-sig] RFC 0.1 In-Reply-To: <199912141633.LAA23558@eric.cnri.reston.va.us> Message-ID: <000601bf482c$6e075b40$63a2143f@tim> [Guido] > ... > If it's (OPT) we're after, adding run-time checks can never obtain > your goal. 
That's unclear: declared types don't have to be known correct to be useful! They at least tell you what the user *expects* to be true. Code can then be generated *assuming* everything the user said is true, with a block of code preceding to *verify* it's true. Optimizing compilers do this routinely under the covers, where the (misnamed in this case) "verification" code simply branches to a slower all-purpose translation of the code if the assumptions turn out to be false at runtime. Trivial example: for i in range(n): x[i] = i In the presence of decl x: [Int] the generated (pseduo)code if type(x) is not ListType: raise TypeError("lying bastard!") else: setter = ListType.__setitem__ for i in range(n): setter(x, i, i) is a good bet and already saves n lookups of the proper __setitem__ method. It's a comparatively small step from there for a compiler to say "ah, but I know all about Lists! I can generate list "setter" code inline". And again from there to "ah, now that all the code is exposed, I know the net effect on each i's refcount, so can skip useless inc+dec pairs". I'm not saying you need to do this; just saying that all information *can* be valuable to a gung-ho optimizer -- even wrong information! Optimization is a probability game, and while certainty is helpful it isn't essential. > ... if there's a type error in my except clause, what good does it > do me to get a type-check error at run time? Frankly, I think the "safety" arguments are the weakest -- if someone has untested code paths in their program, they should *assume* all such paths are broken! What good does it do you to have a statically type-correct except clause if it raises an OverflowError at runtime <0.5 wink>? (Speaking of which, I routinely see error paths in C++ apps blow up with memory errors due to null pointers.) >>> The initialization for b denies its type declaration. Do you really >>> want to do this? >> None is a valid value for any type as with NULL in C or SQL. > No. In C, NULL is not a valid integer (at least not conceptually -- > it's a pointer). I hate the fact that in Java, NULL is always a valid > string, because strings happen to be objects, and so I always run into > run-time errors dereferencing NULL. I'd like to be able to declare > the possibility that a particular value is None separate from its type > -- this feels much more natural and powerful to me. Paul later semi-suggested borrowing Haskell's notation for union types. This looks good to me (despite that "my" syntax looks concrete, it's abstract ): decl i: Int # "i = None" not allowed decl j: Int | None # "j = None" is OK As you said earlier, "a type" is a set of values. So if None is a legit value for a name, then None is in the set of values that name can take on, so None is certainly a part of its type. We don't need tricks or compromises here: we can say what's intended directly. > The hard part is keeping which variables (and arguments, etc.) can > contain instances of a given class; if we have that we can track > instance variable assignments. I don't see the problems here, at least not for explicit declaration schemes. The inference schemes are harder -- because they're, well, *harder* . 
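Returning to the union for a moment: at run time, enforcing a declaration like Int | None needs nothing more than a check of this shape (the function name is invented; only the test matters):

    def expects_int_or_none(j):
        # what enforcing "decl j: Int | None" amounts to at run time
        if j is not None and not isinstance(j, type(0)):
            raise TypeError("j declared Int | None, got %s" % type(j))
        return j

Whether such a check survives under -O is a separate question.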
loathe-to-see-hard-problems-prevent-solving-easy-ones-ly y'rs - tim From tim_one@email.msn.com Fri Dec 17 01:17:12 1999 From: tim_one@email.msn.com (Tim Peters) Date: Thu, 16 Dec 1999 20:17:12 -0500 Subject: [Types-sig] RFC 0.1 In-Reply-To: <199912141909.OAA24221@eric.cnri.reston.va.us> Message-ID: <000901bf482c$72738f00$63a2143f@tim> [Guido] > Agreed. List of integer and its friends are important. Also > correspondences (see my example of a sum() function taking a > list of and an additional single . Assuming an object of type C is declared decl x: C and an object of type "list of C" is declared decl y: [C] then for a function taking a list of some type and a scalar of that type, returning a binary tree of objects of that type , I'd suggest: decl sum: def([_T], _T) -> BinaryTree(_T) I'm just warping Haskell's system to Python conventions. As I've noted before, Haskell is the most Pythonic of all the languages that are entirely unlike Python <0.9 wink>. Correspondences require a formal type *variable*. C++ templates use an ugly angle-bracket notation to surround the formal type variables. Haskell uses identifiers that begin with a lowercase letter, conventionally a one-letter name from the end of the alphabet. I suggest a leading underscore in Python, to suggest that there's something special about the name, and to suggest that it's "local" to the type expression in which it appears. it's-easy-if-you-don't-think-ly y'rs - tim From paul@prescod.net Fri Dec 17 01:28:38 1999 From: paul@prescod.net (Paul Prescod) Date: Thu, 16 Dec 1999 17:28:38 -0800 Subject: [Types-sig] Module Attribute visibility References: Message-ID: <385991C6.5B2917B4@prescod.net> Greg Stein wrote: > > IMO, let's solve static type checking. Leave visibility and modification > rules to another phase. They are orthogonal problems, and we would do well > to reduce our problem set (and the amount of discussion thereby > engendered (my 25 cent word for the day :-)). They are not orthogonal at all. I can't statically check a file that uses sys.version unless I know that sys.version has not been overwritten with a string. We can't allow the runtime system to violate the expectations of the static type engine. We also don't want every user of sys.version to need to assert its type. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things never trust in: That's the vendor's final bill The promises your boss makes, and the customer's good will http://www.geezjan.org/humor/computers/threes.html From tim_one@email.msn.com Fri Dec 17 02:04:05 1999 From: tim_one@email.msn.com (Tim Peters) Date: Thu, 16 Dec 1999 21:04:05 -0500 Subject: [Types-sig] RFC 0.1 In-Reply-To: Message-ID: <000c01bf4832$ff321460$63a2143f@tim> [Greg Stein] > Woo hoo! Tim to the rescue! :-) Na -- I just like to stir up trouble . [Tim code that uses one name for both a dict and that dict's list of keys] > Ha! I posted something just like this just the other day: > http://www.python.org/pipermail/types-sig/1999-December/000518.html Yes, I noticed that at the time but forgot -- it's such a common idiom in my code that it didn't "stick". > Basically: I *totally* agree, and this is primarily the time when > I use a single variable name for two different types. This is also > a reason why I'd like to avoid the notion of associating a type > with a [variable] name. 
Associating types with names is thoroughly conventional, and thoroughly appropriate if a given name is in fact intended to have a fixed type -- and I expect that's most names (e.g., likely every name in __builtin__ and sys!). If I have a class with a dozen methods and they all treat e.g. self.x as a list of floats, I certainly don't want to have to decorate every reference to and binding of self.x[i] to say that over and over again. I'd rather use a distinct name in the few places I "cheat" now. Heck, given a suitable set of predefined interfaces, I could declare my dict/list name as being of type (or implementing the interface) Subscriptable . Or of the universal type (Paul's PyObject). Although the less specific I am, the less help I can expect to get from typing -- that's my tradeoff to make. Given that you *have* to associate types at least with formal argument names, "avoiding the notion of associating a type with *a* name" is a lost cause. A further distinction between "[variable] name"s and "all names" isn't compelling -- although I hope Guido doesn't listen to me and presses on with his type inference schemes anyway, because given the types of globals and arguments, the types of almost all local variables are indeed easy to infer . doesn't-mean-i-want-to-prevent-people-from-declaring-'em- though-they're-*used*-to-it-from-other-languages-and- there's-no-good-reason-to-outlaw-it-ly y'rs - tim From tim_one@email.msn.com Fri Dec 17 02:04:07 1999 From: tim_one@email.msn.com (Tim Peters) Date: Thu, 16 Dec 1999 21:04:07 -0500 Subject: [Types-sig] List of FOO In-Reply-To: Message-ID: <000d01bf4833$0070fd00$63a2143f@tim> > So I could, for instance, define a binary tree module and > have "binary trees of ints" and "binary trees of strings." > How do I define the binary tree class and state that it > is parameterizable? Via: decl type BinaryTree(_T) class BinaryTree: # exactly as today exploiting type variables (see earlier msg), and that I named "decl" "decl" instead of "var" precisely because "var"iable declarations aren't the only kinds of "decl"arations that will be needed before the blood stops flowing. decl-will-encompass-a-sublanguage-bigger-than-python-ly y'rs - tim From tim_one@email.msn.com Fri Dec 17 02:19:21 1999 From: tim_one@email.msn.com (Tim Peters) Date: Thu, 16 Dec 1999 21:19:21 -0500 Subject: [Types-sig] RFC 0.1 In-Reply-To: Message-ID: <000e01bf4835$20e04b20$63a2143f@tim> [GregS] > ... > 2) You *still* need inferencing. "a = foo() + bar()" implies that some > inferencing occurs. The type of the RHS expression is the union of the types returned by foo.__add__ and bar.__radd__. I wouldn't call that inferencing, any more than I'd say it required inferencing to determine the return type of math.sin(3.0) Now you *can* call that inferencing, but doing so wouldn't be helpful . > ... > The compiler can issue a warning and insert a type assertion for > a runtime check. IMO, it should not forbid you from doing anything > simply because it can't figure out some type. Python syntax's "type > agnosticism" is one of its major strengths. It sure is! OTOH, many people write code that doesn't exploit that, and would rather not see runtime surprises when it's *possible* to catch them at compile-time. And some of those would rather not be allowed to write any code that *could* yield a runtime surprise. Different strokes, and I say don't worry about it! 
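For the record, the kind of code that leans hardest on that agnosticism -- the two-types-for-one-name idiom pointed at earlier -- looks roughly like this (reconstructed for illustration, not quoted from the archived post):

    def unique_sorted(words):
        seen = {}
        for w in words:
            seen[w] = 1                 # use a dict as a cheap set
        seen = list(seen.keys())        # same name, now a list of the keys
        seen.sort()
        return seen

A strict compile-time mode would refuse the rebinding of "seen"; a permissive one would warn or wave it through. Using a second name for the value's second life is the easy way out.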
What's important is that the type system be defined sufficiently well that the compiler can either *know* that it knows the type of a given expression at compile-time, or know that it *doesn't* know it. What it does in the latter case can be determined by a compile option. There's really no other realistic choice, since Python's dynamicism is too useful to allow defining a type system that's guaranteed always resolvable at compile time. Some people are going to want to die when it's not, others are going to want to press on. and-both-positions-are-ridiculous-ly y'rs - tim From paul@prescod.net Fri Dec 17 03:25:40 1999 From: paul@prescod.net (Paul Prescod) Date: Thu, 16 Dec 1999 19:25:40 -0800 Subject: [Types-sig] consensus(?) summary (was: Type annotations) References: Message-ID: <3859AD34.17FABEC5@prescod.net> Greg Stein wrote: > > > * non-local writes are checked at runtime (by default) > > Hrm. Is there an easy rule to determine this? Yes: if the code's module object is not the module/class whose namespace we are writing to, the write fails. This is a fast pointer comparison in CPython. > I might suggest deferring > this unless/until we have a clear set of rules. Shades of C++'s "friend" > modifier are forming in my head when we talk about this... Simple languages should have simple protection rules. > > * for optimization, the checks may be stripped based on type inferenced > > information > > Which checks? I think runtime checks are *ignored* if you run with -O. > ... > In reference to type-inferred information: I don't think runtime checks > would ever be added if the type has been inferred. That's what I meant. > > * built-in types are declared through "shadow files" > > This is somewhat problematic. How do we map from a builtin type to this > shadow file? Do they reside in a well-known location? They reside on the PYTHONPATH. > Second issue: keeping them in sync, version mismatches, distribution and > install problems, etc. Keeping them in sync: coder's responsibility. Version mismatch: coder's responsibility. Distribution and install programs: I'll toss this to distutils. > > * types can be parameterized > > * which means that the compile/runtime checks need to be more > > sophisticated > > Yes, although I might modify it somewhat and say "only core types can be > parameterized." I don't any longer see a reason to restrict it that way. Parameterizing types is actually not so bad. I don't know how C++'s got so complex as to be a full programming language -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things never trust in: That's the vendor's final bill The promises your boss makes, and the customer's good will http://www.geezjan.org/humor/computers/threes.html From paul@prescod.net Fri Dec 17 03:25:47 1999 From: paul@prescod.net (Paul Prescod) Date: Thu, 16 Dec 1999 19:25:47 -0800 Subject: [Types-sig] New syntax? References: Message-ID: <3859AD3B.E841643A@prescod.net> Greg Stein wrote: > > On Thu, 16 Dec 1999, Martijn Faassen wrote: > > >... example syntax for a Python syntax to declare interfaces ... > > Ah. Good. That is better than your/Paul's previous suggestions. and > Tim has come up with a good first pass. If we can formalize that (and I'd > like to tweak it), then we have the basis we need to move forward. There are two problems. One is defining interfaces. The other is referring to compound types. My syntax for the former was almost the same as Martijn's (and thus non-Python). My syntax for the other was directly based on Tim's. 
Neither of my syntaxes are "Python" but neither are Tim and Martijn's. Perhaps you can quote the syntax that you object to and propose an alternative. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things never trust in: That's the vendor's final bill The promises your boss makes, and the customer's good will http://www.geezjan.org/humor/computers/threes.html From paul@prescod.net Fri Dec 17 03:25:51 1999 From: paul@prescod.net (Paul Prescod) Date: Thu, 16 Dec 1999 19:25:51 -0800 Subject: [Types-sig] New syntax? References: <38596586.63D9E29F@vet.uu.nl> Message-ID: <3859AD3F.CFB0ED0A@prescod.net> Martijn Faassen wrote: > > ... > Didn't you think parameterized types looked fairly straightforward in my > syntax proposal? I must have missed something. Could you show me how to do Btree of X and then make concrete types Btree of Int and Btree of Functions From String to Int? Paul Prescod -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things never trust in: That's the vendor's final bill The promises your boss makes, and the customer's good will http://www.geezjan.org/humor/computers/threes.html From tim_one@email.msn.com Fri Dec 17 04:05:49 1999 From: tim_one@email.msn.com (Tim Peters) Date: Thu, 16 Dec 1999 23:05:49 -0500 Subject: [Types-sig] challenge response (was: A challenge) In-Reply-To: Message-ID: <001101bf4844$0085adc0$63a2143f@tim> [GregS] > ... > IMO, type checking is NOT enabled by default. IMO2. oo?-ly y'rs - tim From tim_one@email.msn.com Fri Dec 17 04:05:51 1999 From: tim_one@email.msn.com (Tim Peters) Date: Thu, 16 Dec 1999 23:05:51 -0500 Subject: [Types-sig] New syntax? In-Reply-To: Message-ID: <001201bf4844$01f56a60$63a2143f@tim> [GregS] > ... > In fact, I don't even like Tim's notion of declaring a function since a > "def" is more than adequate for doing that. I thought it would be easier to get one new stmt than to modify existing stmts, and *much* easier to write a dirt-simple tool to strip them out again (vis a vis Guido's requirement). In real life I would certainly prefer annotating "def" stmts directly. I think a declaration statement needs the *ability* to specify full function signatures, though; e.g., decl handlerMap: {String: def(Int, Int)->Int} handlerMap = {"+": lambda x, y: x+y, "*": lambda x, y: x*y, ... } In either case, I'm not sure what to do about varargs (the "*rest" form of argument). > ... > Note the use of "decl class ..." to define class variables, while > "decl ..." is for member variables. I'm not sure if we should > instead use Tim's suggestion of "decl member ...", though. I am: I didn't think about this at all. Member vrbls are far more common than class vrbls, so practicality beats purity . > Given the position of the declaration, I think "decl member" might > actually be better because it makes it clear that is a *member* > variable, despite being in a location that is normally used for class > variables. That's the purity argument <2.0 wink>. > An alternative would be a different "decl" keyword just > for members. And that's the bozo argument <3.0 wink>. "decl" doesn't mean "here's a variable", it means "here's a declaration of 'something'"; e.g., on some days I would have killed to be able to say: decl builtin int, ord # stop looking these up in the inner loop! In fact, I think of "decl" as a devious way of writing "pragma" -- and all that that implies. 
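The lookup cost that "decl builtin" is aimed at can already be dodged by hand today, which hints at what such a pragma could hand to the compiler; the classic (if slightly shabby) idiom is to freeze the builtin into a default argument:

    def ordinals(s, ord=ord):
        # the builtin is captured once, when the def executes; inside
        # the loop 'ord' is a local, so no global-then-builtin lookup
        result = []
        for ch in s:
            result.append(ord(ch))
        return result

A "decl builtin" declaration would let the compiler do the same thing without the wart in the signature.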
although-less-than-that-demands-ly y'rs - tim From tim_one@email.msn.com Fri Dec 17 04:05:44 1999 From: tim_one@email.msn.com (Tim Peters) Date: Thu, 16 Dec 1999 23:05:44 -0500 Subject: [Types-sig] Interface files In-Reply-To: <199912162035.PAA11433@eric.cnri.reston.va.us> Message-ID: <001001bf4843$fe435ee0$63a2143f@tim> [Guido] > I'm not taking sides here, but I want to note that none of the takers > on my latest challenge have shown separate interface files. All the > ones I've seen used inline syntax. No, you're confusing my concrete syntax with my abstract syntax : in my submission, you move all the module-level "decl" statements that aren't declaring module-private names into a separate file. Bingo: interface file. I just happened to sprinkle them around inline to aid readability . heck-you-could-even-write-a-perl-script-to-do-it-ly y'rs - tim From tim_one@email.msn.com Fri Dec 17 07:25:55 1999 From: tim_one@email.msn.com (Tim Peters) Date: Fri, 17 Dec 1999 02:25:55 -0500 Subject: [Types-sig] Handling attributes In-Reply-To: <3857DFBC.A2B7ADAC@prescod.net> Message-ID: <001401bf485f$f47d2b40$63a2143f@tim> [PP] > ... > b. do we check assignments to class and module > attributes from other modules at runtime? As an eventual end user, if I declare the type of a name (*any* name), and I've enabled type checking, I expect that there is no possibility of that name getting bound to an object not of the declared type. I expect to get an error at compile-time if that's feasible, but I understand it may not be. In the latter case I expect a runtime error pointing at the offending binding. I also accept that the program may run slower because of this! > ... > c. should we perhaps just disallow writing to "declared" > attributes from other classes and modules? OK by me at the start. It's one way to satisfy my "no possibility", about which I'm serious because users will be serious. Unfortunately, I'm also a typical user in demanding the impossible -- that is, "no" is a very strong word, covering things like "disguised" rebindings via direct __dict__ access too. So as a reasonable user, I settle for "no possibility, with this peculiar but precise meaning of 'no': ...". > d. is it possible to write to UN-declared attributes from > other people's classes and modules? And what are the type safety > implications of doing so? Sure and none, for some peculiar but precise meanings of "sure" and "none" . For example, it may or may not totally screw up the conclusions reached by a type inferencer -- we'd have to see the type inferencer first to know for sure. Or if Guido's "optimize builtin" (yay!) idea is implemented, my doing yourmodule.len = lambda any: 42 will likely have no visible effect (for some peculiar but precise ...). but-i'd-call-that-a-feature!-ly y'rs - tim From tim_one@email.msn.com Fri Dec 17 07:25:50 1999 From: tim_one@email.msn.com (Tim Peters) Date: Fri, 17 Dec 1999 02:25:50 -0500 Subject: [Types-sig] Low-hanging fruit: recognizing builtins In-Reply-To: <199912151707.MAA02639@eric.cnri.reston.va.us> Message-ID: <001301bf485f$f2c12360$63a2143f@tim> [Guido] > ... > LOAD_GLOBAL 0 (len) > LOAD_FAST 0 (a) > CALL_FUNCTION 1 vs > can be replaced by > > LOAD_FAST 0 (a) > UNARY_LEN > > which saves one PVM roundtrip and two dictionary lookups, plus the > argument checking code inside the len() function. To get the latter, I believe we'd need to write an additional len() function. 
That is, the current checking len has to stick around to deal with stuff like apply(f, args) when f happens to be bound to __builtin__.len. > There are plenty of bytecodes available. Likely many more than there are compilers that can tolerate another case in eval_code2's switch . > ... > The per-module analysis required is major compared to what's > currently happening in compile.c (which only checks one function > at a time looking for assignments to locals) but minor compared > to any serious type inferencing. Note too that it's a length-changing transformation, which is also brand new. That is, the only optimizations in compile.c now replace a bytecode with another of the same size. So (at least) jump offsets and the line-number table would need to be recomputed too. Not hard, there's simply no machinery there to build on now. OTOH, compile.c needn't be involved; the analysis & transformations *could* be done via a bytecode-fiddling Python program. i-knew-michael-hudson-would-be-useful-for-*something*-ly y'rs - tim From tim_one@email.msn.com Fri Dec 17 07:32:59 1999 From: tim_one@email.msn.com (Tim Peters) Date: Fri, 17 Dec 1999 02:32:59 -0500 Subject: [Types-sig] expression-based type assertions In-Reply-To: Message-ID: <001b01bf4860$f19690a0$63a2143f@tim> [Edward Welbourne] > ... [and 14Kb later] ... > This list is too busy. Actually, I heard it was being killed for lack of activity . being-overwhelmed-is-a-way-of-life-ly y'rs - tim From tim_one@email.msn.com Fri Dec 17 08:08:22 1999 From: tim_one@email.msn.com (Tim Peters) Date: Fri, 17 Dec 1999 03:08:22 -0500 Subject: [Types-sig] Low-hanging fruit: recognizing builtins In-Reply-To: <3858B8C9.24962AAD@lemburg.com> Message-ID: <001d01bf4865$e2da0920$63a2143f@tim> [M.-A. Lemburg] > ... > I haven't followed the thread too closely, but isn't there > some way to tell the optimizer which modules to treat at > what optimization level ? No. I'm trying to introduce a "decl" stmt, though, that can in principle express any thought capable of human expression . > BTW, instead of adding oodles of new byte code, how about > grouping them... e.g. instead of UNARY_LEN, BUILD_RANGE, etc. > why not have a CALL_BUILTIN which takes an index into > a predefined set of builtin functions. It's another tradeoff. UNARY_LEN is simple enough that the code for builtin_len could be put in the case stmt inline, but skipping the argument check. Read it out of a table instead, and you're back to Yet Another Function call, or an embedded switch stmt in CALL_BUILTIN's implementation. > ... > Note that the loop as it is built now is already too large > for common Intel+compatible based CPUs. I assume this is Flowery Language for your particularly lame AMD K6 . > Adding even more byte codes to the huge single loop would > probably result in a decrease of CPU cache hits. (I split the > Great Switch in two switch statements and got some good results > out of this: the first switch handles often used byte codes > while the second takes care of the more exotic ones.) Good strategy! silly-cpus-ly y'rs - tim From gstein@lyra.org Fri Dec 17 08:37:26 1999 From: gstein@lyra.org (Greg Stein) Date: Fri, 17 Dec 1999 00:37:26 -0800 (PST) Subject: [Types-sig] Module Attribute visibility In-Reply-To: <385991C6.5B2917B4@prescod.net> Message-ID: On Thu, 16 Dec 1999, Paul Prescod wrote: > Greg Stein wrote: > > IMO, let's solve static type checking. Leave visibility and modification > > rules to another phase. 
They are orthogonal problems, and we would do well > > to reduce our problem set (and the amount of discussion thereby > > engendered (my 25 cent word for the day :-)). > > They are not orthogonal at all. I can't statically check a file that > uses sys.version unless I know that sys.version has not been overwritten > with a string. We can't allow the runtime system to violate the > expectations of the static type engine. We also don't want every user of > sys.version to need to assert its type. You certainly can statically check a file. Assume that sys.version is a string and remains a string. Done. Why can't the runtime system violate the expectations? Seriously: I doubt you can prevent it. Python is simply too dynamic. I'd be surprised if you could completely stop me from changing sys.version if I want really trying to do so. This falls back to what Tim was stating: you can output code that assumes a particular type and runs better, but falls back to a slower version if the type is wrong (or maybe raises an error). I certainly would hope that the compiler/PVM will not bomb because somebody managed to change the type of something where the type check system didn't think it would be changed. I think the type check system will work to signal errors at compile time, but I don't think it needs to go very far past that (e.g. hard-line restrictions on modification). This is basically a corollary of "we're all adults here." i.e. don't be a child and put a list into sys.version :-) Cheers, -g -- Greg Stein, http://www.lyra.org/ From tim_one@email.msn.com Fri Dec 17 08:59:33 1999 From: tim_one@email.msn.com (Tim Peters) Date: Fri, 17 Dec 1999 03:59:33 -0500 Subject: [Types-sig] minimal or major change? (was: RFC 0.1) In-Reply-To: <3858E9C2.E2722B88@vet.uu.nl> Message-ID: <001f01bf486d$09feec80$63a2143f@tim> [Martijn Faassen, reasonably demands ...] > So that's where I'm coming from. It's important for our proposal > to actually come up with a workable development plan, because > adding type checking to Python is rather involved. So I've been > pushing one course of implementation towards a testable/hackable > system that seems to give us the minimal amount of development > complexities. I haven't seen clear development paths from others > yet; most proposals seem to involve both run-time and compile- > time developments at the same time. > > So I'm interested to see other development proposals; possibly > there's a simpler approach or equally complex approach with more > payoff, that I'm missing. I haven't given a lick of thought to development, beyond sketching "the usual" approach to type inference for Guido, and having a hard-won intuition about what is and isn't reasonably parseable. This SIG has been "alive again" for on the order of just one week: design precedes implementation, and I won't bemoan the lack of implementation details even if they're delayed for *another* whole week . At that point, it's fine by me if the first cut is *spelled* using plain dicts and docstrings etc to ease development. But before that point, we don't even know what we want it to *do*. "we"-being-the-consensus-"we"-ly y'rs - tim From tim_one@email.msn.com Fri Dec 17 08:59:36 1999 From: tim_one@email.msn.com (Tim Peters) Date: Fri, 17 Dec 1999 03:59:36 -0500 Subject: [Types-sig] Re: [Doc-SIG] Sorry! In-Reply-To: Message-ID: <002001bf486d$0b457640$63a2143f@tim> [Edward Welbourne] > ... > (Tim: the type Boolean is a (useful) synonym for PyObject. I agree. 
I'm trying to avoid the mess C and then C++ got into by refusing to define a bool type for so many years (and so 10,000 development groups typedef'ed it in 20,000 different ways, many pairwise incompatable -- the runtime doesn't have much use for a distinct bool type, but it says something vital in signatures for *people*). > It probably includes some added semantics about how you should be > trying to use it.) Python's runtime rules have no restrictions on what a true/false object may be, or how one may be manipulated, and I don't want to impose any that Guido didn't see fit to impose from the start (I think we should be trying to provide notation for Python's actual types, not invent brand new types -- so "synonym" is what I want!). I just want that specific type name *there* so people don't run off defining their own in mutually incompatible ways. someday-i-*might*-want-to-run-somebody-else's-code-ly y'rs - tim From mal@lemburg.com Fri Dec 17 09:07:50 1999 From: mal@lemburg.com (M.-A. Lemburg) Date: Fri, 17 Dec 1999 10:07:50 +0100 Subject: [Types-sig] Low-hanging fruit: recognizing builtins References: <001d01bf4865$e2da0920$63a2143f@tim> Message-ID: <3859FD66.E47352E@lemburg.com> Tim Peters wrote: > > [M.-A. Lemburg] > > BTW, instead of adding oodles of new byte code, how about > > grouping them... e.g. instead of UNARY_LEN, BUILD_RANGE, etc. > > why not have a CALL_BUILTIN which takes an index into > > a predefined set of builtin functions. > > It's another tradeoff. UNARY_LEN is simple enough that the code for > builtin_len could be put in the case stmt inline, but skipping the argument > check. Read it out of a table instead, and you're back to Yet Another > Function call, or an embedded switch stmt in CALL_BUILTIN's implementation. Sure its a tradeoff, but it has the advantage of allowing to extend it later without adding too much cruft to the inner loop. What it basically does is avoid the global *and* local lookups by replacing them with a C array index lookup. Of course, for very common things such as len and range some other strategy might be worth persuing. > > ... > > Note that the loop as it is built now is already too large > > for common Intel+compatible based CPUs. > > I assume this is Flowery Language for your particularly lame AMD K6 ah, the satsifaction of being a Pure Wintel Guy!>. The performance improvement mentioned below is not really noticable on machines with different architectures, e.g. Sun SPARC. That's where I drew my conclusion from. But then, I tested a few years ago, so perhaps the new Pentiums and Athlon don't gripe about the size of the inner loop anymore. BTW, just to make buying one of those new microwave ovens more attractive: what is the pystone rating for the new Athlon and Pentium III chips ? > > Adding even more byte codes to the huge single loop would > > probably result in a decrease of CPU cache hits. (I split the > > Great Switch in two switch statements and got some good results > > out of this: the first switch handles often used byte codes > > while the second takes care of the more exotic ones.) > > Good strategy! Thanks :-) We are getting a little off-topic here, I'm afraid, but it was fun looking at old optimization strategies again... I haven't touched that code in years. 
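At the Python level the grouping idea can be mocked up like so -- purely illustrative, nothing like the real eval loop: the "compiler" resolves a builtin to a small integer once, and the "inner loop" indexes a fixed table instead of doing dictionary lookups.

    BUILTIN_TABLE = (len, range, ord)
    LEN, RANGE, ORD = 0, 1, 2

    def call_builtin(index, arg):
        # one C-array-style index instead of a global + builtin lookup
        return BUILTIN_TABLE[index](arg)

    # e.g. call_builtin(LEN, "abc") yields 3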
-- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 14 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From gstein@lyra.org Fri Dec 17 09:16:09 1999 From: gstein@lyra.org (Greg Stein) Date: Fri, 17 Dec 1999 01:16:09 -0800 (PST) Subject: [Types-sig] New syntax? In-Reply-To: <001201bf4844$01f56a60$63a2143f@tim> Message-ID: On Thu, 16 Dec 1999, Tim Peters wrote: > [GregS] > > ... > > In fact, I don't even like Tim's notion of declaring a function since a > > "def" is more than adequate for doing that. > > I thought it would be easier to get one new stmt than to modify existing > stmts, and *much* easier to write a dirt-simple tool to strip them out again > (vis a vis Guido's requirement). > > In real life I would certainly prefer annotating "def" stmts directly. I > think a declaration statement needs the *ability* to specify full function > signatures, though; e.g., > > decl handlerMap: {String: def(Int, Int)->Int} Ah. Right. Good point. I guess that does mean that something like: decl a: def(Int)->None would be possible. e.g. is a member holding a ref to a function object. Of course, the type of in this case is no different than: def a(Int x)->None: It is just that one declares a member and the other declares a method. There is a subtle difference there :-) In fact, these two are probably equivalent: decl class a: def(Int)->None def a(Int x)->None: > handlerMap = {"+": lambda x, y: x+y, > "*": lambda x, y: x*y, > ... > } > > In either case, I'm not sure what to do about varargs (the "*rest" form of > argument). What's wrong with: decl a: def(Int, *)->Int decl b: def(Int, **)->Int decl c: def(Int, *, **)->Int I don't see any ambiguity in the grammar there, unless you use "*" to mean unknown (as Paul once mentioned). I think the unknown type should be "Any" (or "any"), since it really means "take any type of value." > > ... > > Note the use of "decl class ..." to define class variables, while > > "decl ..." is for member variables. I'm not sure if we should > > instead use Tim's suggestion of "decl member ...", though. > > I am: I didn't think about this at all. Member vrbls are far more common > than class vrbls, so practicality beats purity . That was my first thought. Then I started thinking too hard about the problem... :-) I'm not sure whether to go for practical or pure. Cheers, -g -- Greg Stein, http://www.lyra.org/ From tim_one@email.msn.com Fri Dec 17 09:28:12 1999 From: tim_one@email.msn.com (Tim Peters) Date: Fri, 17 Dec 1999 04:28:12 -0500 Subject: [Types-sig] Implementability In-Reply-To: <385903C9.98B69A30@vet.uu.nl> Message-ID: <002101bf4871$09fad100$63a2143f@tim> [Tim] > declared_int = unknown > > is an error, but [Martijn Faassen] > Or, if you're interfacing with untyped python, this could raise a > run-time exception if unknown doesn't turn out to be an integer. Or do > you disagree with this? Yes and no . I think "type-checked Python" needs at least three different compile modes, because nobody is claiming we *can* check the entire language in a bulletproof way -- and when we can't, different people will want different behaviors at different times for legitimate reasons. 1. Anything that can't be proven safe at compile-time is a compile-time error (that's where I disagree with you above). 2. Anything that can't be proven safe at compile-time is checked at runtime (that's where I agree with you above). 3. 
Anything that can't be proven safe at compile-time emits a compile-time warning, and there's no guarantees one way or the other about what happens at runtime (where I don't even agree with myself). If someone doesn't want any of those behaviors, fine, don't enable type-checking. >> unknown1 = unknown2 >> >> is not. Whether >> >> unknown = declared_int >> >> should be an error is a policy issue. Many will claim it should >> be an error, but the correct answer is that it should not. > This would seem to be the natural way to do it; I'm not sure why many > would claim it should be an error. Could you explain? This is what I call a "self-denying prophecy". That is, by implicitly ridiculing the position before anyone took it, I graciously saved everyone from suggesting it . > I agree. I'm going to save that sentence and paste it into other replies as needed. thanks!-ly y'rs - tim From tim_one@email.msn.com Fri Dec 17 09:28:15 1999 From: tim_one@email.msn.com (Tim Peters) Date: Fri, 17 Dec 1999 04:28:15 -0500 Subject: [Types-sig] A challenge In-Reply-To: Message-ID: <002201bf4871$0baf37c0$63a2143f@tim> [Golden, Howard] > I completely support this style! I won't quibble about 'decl' vs. > 'var', though I suggest the latter, all else being equal, since > it has a proud heritage. Despite that I think of "my syntax" as an abtract one, I've tried to give one that *could* serve as a concrete syntax. Toward that end, I deliberately avoided "var" because we're going to need more kinds of declarations in the future, and Guido's willingness to add a keyword is a miracle I don't want to risk abusing . So it was "decl" for "declaration -- of anything whatsoever". For example, just about everywhere I wrote decl x ... so far I'd be just as happy with decl var x ... Other things that are going to need declaration include type synonyms and parameterized type declarations; e.g., decl type BinaryTree(_T) # a parameterized type decl typedef IntTree = BinaryTree(Int) # a type synonym Stuff like Java's final methods will eventually attract rabid enthusiasts too, and decl can do *anything*. the-miracle-that-is-delayed-definition-ly y'rs - tim From tim_one@email.msn.com Fri Dec 17 10:05:06 1999 From: tim_one@email.msn.com (Tim Peters) Date: Fri, 17 Dec 1999 05:05:06 -0500 Subject: [Types-sig] Type annotations In-Reply-To: <3858E94E.B7D86846@prescod.net> Message-ID: <002301bf4876$318e1880$63a2143f@tim> [Paul] > 3. in separate decl statements: (Incompatible with Python 1.5, but > easily converted) > > Python 1.5 compatibility: low Noting that decl x: Int could just as well be spelled # decl x: Int That would knock it down on the "syntactic cleanliness" scale, though! From tim_one@email.msn.com Fri Dec 17 10:05:07 1999 From: tim_one@email.msn.com (Tim Peters) Date: Fri, 17 Dec 1999 05:05:07 -0500 Subject: [Types-sig] Type annotations In-Reply-To: <38592CD4.963694B0@prescod.net> Message-ID: <002401bf4876$327173a0$63a2143f@tim> I'm burned out on SIG msgs for tonight, so one quickie: [Paul] > ... > Here are some Haskell-ish syntax ideas for type declarations: > > First we need to be able to talk about types. We need a "type > expression" which evalutates to a type. > > Rough Grammar: > > Type : Type ['|' Type] # allow unions > Unit : dotted_name | Parameterized | Function | Tuple | List | Dict > Parameterized : dotted_name '(' Basic (',' Basic)* ')' > Basic : dotted_name | PythonLiteral | "*" # * means anything. 
> PythonLiteral : atom > Function : Type '->' Type > Tuple : "(" Type ("," Type )* ) > List: "[" Type "]" > Dict: "{" Type ":" Type "}" The Function defn above is appropriate for Haskell because all functions there are curried (exactly one argument). The LHS should be different in Python, because we have multiple-argument functions, and some arglist gimmicks Haskell doesn't have. In my examples I've been using Function : 'def' '(' arglist ')' '->' Type I think that's what's required. With the "def" it's obvious. Without the "def" it's too easy to mistake the parenthesized arglist for a tuple of some sort (yes, the '->' later disambiguates it, but unbounded lookahead isn't machine- or human-friendly). An explict def also allows the variant Function : 'def' '(' arglist ')' for functions that don't return results. The "*" and "**" elements of arglists need also to be addressed. BTW, there appear to be two holes in Tuple: the empty tuple, and a tuple with unknown length. Shivery as it is, I expect we have to follow Python in this regard, and say (T) is a tuple-of-T of arbitrary length, while (T,) is a tuple containing one T. > ... > maptype(intype, outtype) = > (( intype -> outtype ), List( intype )) -> List( outtype ) Don't the parens around (intype->outtype) say that it's a tuple containing a function? BTW, Python's actual "map" function is quite a puzzle to describe! I can't do it: map: def(def(_A)->_B, Seq(_A) -> [_B] | def(def(_A, _B)->_C, Seq(_A), Seq(_B)) -> [_C] | ... is just the start of an unbounded sequence of legit map signatures. Haskell avoids the difficulty here thanks to currying. > ... > Interfaces look like Python classes but they use an "interface" > keyword. Weren't we leaving interfaces to JimF ? too-late-now-ly y'rs - tim From tim_one@email.msn.com Fri Dec 17 10:23:56 1999 From: tim_one@email.msn.com (Tim Peters) Date: Fri, 17 Dec 1999 05:23:56 -0500 Subject: [Types-sig] A challenge In-Reply-To: <3859353C.4106B165@appliedbiometrics.com> Message-ID: <002501bf4878$d2faf240$63a2143f@tim> [Christian Tismer] > ... > Just a question, please: > >> import fnmatch >> import os >> >> decl _debug: Int # but Boolean makes more sense; see below > > Is this meant to be lexically true in the globals scope from > here on? I'm not sure I grasp the question. The implicit model is the "global" stmt: a type declaration applies to the entire compilation unit in which it appears (module, class or def), and referencing a declared name before the declaration should be verboten. "global" currently allows redundant declarations, and you get a Fabulous Prize for later noticing the instance of redundant type declarations that I snuck into one of the examples . Conflicting declarations should be a compile-time error, and of the two traditional approaches to that I vote for name equivalence (as opposed to structural (content) equivalence). That is: class C1: decl member real, imag: Float real = imag = 0 and class C2: decl member real, imag: Float real = imag = 0 are incompatible classes, despite that their guts are identical. >> _debug = 0 >> >> decl _prune: [String] >> _prune = ['(*)'] >> >> decl find: def(String, optional dir: String) -> [String] >> >> def find(pattern, dir = os.curdir): >> decl list, names: [String], name: String # LINE1 >> list = [] >> names = os.listdir(dir) >> names.sort() >> for name in names: >> decl name, fullname: String # LINE2 > Same question: "name" is redefined from here on? My intent was to illustrate redundant declaration. 
I'm definitely not trying to invent new scoping rules! The semantics would be exactly the same if "name" were absent from either LINE1 or LINE2; it would be an error if e.g. "decl name: Int" appeared in the "for" loop instead; and e.g. it would be illegal to reference fullname before LINE2. > Would this behave (or be as behaviorless) like > the "global" declaration, or lexical, or do > you open a new type scope with "for"? (New > "variable, with C's {} in mind). > The latter cannot be since "for" declared it already. That's right. No new scopes are implied here; just saying something about the names in the current scopes. how-do-we-declare-the-type-of-a-continuation?-ly y'rs - tim From tim_one@email.msn.com Fri Dec 17 11:00:06 1999 From: tim_one@email.msn.com (Tim Peters) Date: Fri, 17 Dec 1999 06:00:06 -0500 Subject: [Types-sig] Module Attribute visibility In-Reply-To: <38592F3D.10A7AFA4@prescod.net> Message-ID: <002601bf487d$e0b16fe0$63a2143f@tim> [Paul] > Okay, but doesn't Python already conflate declaration with > initialization? I don't think so. Its only declaration today is "global"! > When I refer to mymod.foo I am referring to an object that was > assigned, somewhere to the name foo in the module mymod. I don't follow this, unless you think of referring to mymod.foo as "a declaration" -- I don't. It's just a reference. > Are we going to say that statically type checked code can only > refer to declared (not merely assigned) variables in other > modules? I wouldn't say that, although I bet some people will want that as an option. > Would it be safe to say that undeclared variables are simply not > available for type checking? Don't know what "not available" means. Yell at me if it means more than what you're talking about below. > Would you suggest that this is even the case for functions? I.e. > > def foo( str ): return str*2 > > is invisible to the type checker Not invisible, but that its argument and its return type are (in the absence of inference) both associated with the universal set (the set of all types), with rules as sketched earlier (widening bindings only; "int = universal" bad (& whether that's a compile-time error, or *potential* run-time error, or compile-time warning, is an option); "universal = int" good). > until we add: > > decl foo: str -> str > > Or would foo have an implicit declaration: > > decl foo: PyObject -> PyObject Yes, your PyObject appears to be a spelling of what I called the "universal set" above. Although, since this *is* Python, you could probably drop the "Py" prefix without risk of confusion . > And if that foo has an implicit declaration, shouldn't this foo also: > > foo = lambda x: x*2 I can't imagine any reason why it shouldn't! If we're disagreeing here, I don't see how -- unless it's that you believe I mean something by "declaration" and/or "initialization" that I don't. All I was getting at is that decl x: Int = 5 (combining declaration with initialization) is more dubious than decl x: Int x = 5 (leaving declaration (1st line) separate from initialization (2nd line)) in a language where some people clearly don't want to look at type declarations *at all*. Keeping the binding out of the declaration makes it trivial to "comment it out", set an editor mode to suppress displaying "decl" lines, copy decl lines into interface files, and so on. 
different-purpose-different-statement-ly y'rs - tim From guido@CNRI.Reston.VA.US Fri Dec 17 14:32:25 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Fri, 17 Dec 1999 09:32:25 -0500 Subject: [Types-sig] Module Attribute visibility In-Reply-To: Your message of "Fri, 17 Dec 1999 00:37:26 PST." References: Message-ID: <199912171432.JAA12387@eric.cnri.reston.va.us> > Why can't the runtime system violate the expectations? Seriously: I doubt > you can prevent it. Python is simply too dynamic. I'd be surprised if you > could completely stop me from changing sys.version if I want really trying > to do so. Nonsense. You are confusing one particular implementation with what's possible. In JPython, things like this *are* being enforced in an absolute way. If that's what we want, we can do it. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@CNRI.Reston.VA.US Fri Dec 17 14:42:35 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Fri, 17 Dec 1999 09:42:35 -0500 Subject: [Types-sig] Keyword arg declarations In-Reply-To: Your message of "Fri, 17 Dec 1999 01:16:09 PST." References: Message-ID: <199912171442.JAA12404@eric.cnri.reston.va.us> I just realized that Tim's decl syntax that's currently being bandied around doesn't declare the names of arguments. That's fine for a language like C, but in Python, any argument with a name (*args excluded) can be used as a keyword argument. I think it will be useful for the decl syntax to allow leaving out or supplying argument names -- that tells whether keyword arguments are allowed for this particular function. And that is part of a function's signature. (Note that not all builtins support keyword arguments; in fact most don't.) (Un)related: I think it makes sense to be able to restrict the types of *varargs arguments. E.g. eons ago (last week in the types-sig) someone proposed an extension to isinstance() allowing one to write isinstance(x, type1, type2, type3, ...). Clearly the varargs are all type objects here. Not so sure about **kwargs, but these should probably be treated the same way. --Guido van Rossum (home page: http://www.python.org/~guido/) From gstein@lyra.org Fri Dec 17 14:54:32 1999 From: gstein@lyra.org (Greg Stein) Date: Fri, 17 Dec 1999 06:54:32 -0800 (PST) Subject: [Types-sig] Keyword arg declarations In-Reply-To: <199912171442.JAA12404@eric.cnri.reston.va.us> Message-ID: On Fri, 17 Dec 1999, Guido van Rossum wrote: > I just realized that Tim's decl syntax that's currently being bandied > around doesn't declare the names of arguments. That's fine for a > language like C, but in Python, any argument with a name (*args > excluded) can be used as a keyword argument. > > I think it will be useful for the decl syntax to allow leaving out or > supplying argument names -- that tells whether keyword arguments are > allowed for this particular function. And that is part of a > function's signature. Shouldn't be hard to add these names. IMO, the syntax for functions in a typedecl should look just like the "def" syntax (which should be updated to allow typedecls). >... > (Un)related: I think it makes sense to be able to restrict the types > of *varargs arguments. E.g. eons ago (last week in the types-sig) > someone proposed an extension to isinstance() allowing one to write > isinstance(x, type1, type2, type3, ...). Clearly the varargs are all > type objects here. > > Not so sure about **kwargs, but these should probably be treated the > same way. 
Shouldn't be a problem: def foo(bar, *args: [Int], **kw: {String: Float}) -> None: ... Cheers, -g -- Greg Stein, http://www.lyra.org/ From skip@mojam.com (Skip Montanaro) Fri Dec 17 17:26:46 1999 From: skip@mojam.com (Skip Montanaro) (Skip Montanaro) Date: Fri, 17 Dec 1999 11:26:46 -0600 (CST) Subject: [Types-sig] doc-sig/types-sig clash? Message-ID: <14426.29270.837380.905106@dolphin.mojam.com> One of the proposals regarding typing seems to be inserting stuff into doc strings. It is perhaps worth noting for those who don't subscribe to the doc-sig that that bunch of ne'er-do-wells (I mean esteemed colleagues) also has their eyes on the doc string (imagine that!). Just raising a small flag to make sure people don't assume the doc string is their private sandbox. Apologies if this has already been addressed. I'm still wading through all the recent Python activity. Skip From skip@mojam.com (Skip Montanaro) Fri Dec 17 17:32:45 1999 From: skip@mojam.com (Skip Montanaro) (Skip Montanaro) Date: Fri, 17 Dec 1999 11:32:45 -0600 (CST) Subject: [Types-sig] RFC 0.1 In-Reply-To: <000601bf482c$6e075b40$63a2143f@tim> References: <199912141633.LAA23558@eric.cnri.reston.va.us> <000601bf482c$6e075b40$63a2143f@tim> Message-ID: <14426.29629.400734.628570@dolphin.mojam.com> Tim> Optimizing compilers do this routinely under the covers, where the Tim> (misnamed in this case) "verification" code simply branches to a Tim> slower all-purpose translation of the code if the assumptions turn Tim> out to be false at runtime. As does/did Self (without explicit type declarations)... ;-) Skip From faassen@vet.uu.nl Fri Dec 17 17:32:45 1999 From: faassen@vet.uu.nl (Martijn Faassen) Date: Fri, 17 Dec 1999 18:32:45 +0100 Subject: [Types-sig] New syntax? In-Reply-To: <3859AD3F.CFB0ED0A@prescod.net> References: <38596586.63D9E29F@vet.uu.nl> <3859AD3F.CFB0ED0A@prescod.net> Message-ID: <19991217183245.A12905@vet.uu.nl> Paul Prescod wrote: > Martijn Faassen wrote: > > > > ... > > Didn't you think parameterized types looked fairly straightforward in my > > syntax proposal? > > I must have missed something. Could you show me how to do Btree of X and > then make concrete types Btree of Int and Btree of Functions From String > to Int? I don't know enough about Btrees to give a good example of that, but this would be the idea: # declclass, decldef to define classes and functions of same name declclass Test: whatevertype: param def __init__(self, whatevertype): self.data: whatevertype def getvalue(self): result: whatevertype # need to come up with new typedefinition syntax here # for classes this isn't necessary as a class definition should be a type # definition, but for functions it may be necesary. This needs to be # thought out. I think typedeffing functions when necessary is better than # devising some syntax to declare a function inline in a parametric # type instantiation. typedef Func(string): result: int foo: Test(int) bar: Test(Func) baz: Test(Test(int)) Something like that. The keywords aren't ideal yet, but the syntax is fairly Pythonic. By the way, I thought of an alternative syntax that might be more Pythonic as it cuts down on the use of ':' declclass Foo: def __init__(self, value=[int]): self.data = [int] self.moredata = string def dosomething(self, one=int, two=string): three = [int] return int This one might be more Pythonic and I think I'll advocate this syntax from now on. :) Transform to external interface by removing all type assignments to local variables. 
Transform to interface without exposing member data by removing all type assignments in method bodies. Did you all notice how the terms 'type assignment' and 'type instantiation' nicely map to Pythonic syntax? Regards, Martijn From tismer@appliedbiometrics.com Fri Dec 17 17:36:21 1999 From: tismer@appliedbiometrics.com (Christian Tismer) Date: Fri, 17 Dec 1999 18:36:21 +0100 Subject: [Types-sig] A challenge References: <002501bf4878$d2faf240$63a2143f@tim> Message-ID: <385A7495.D7A25EC4@appliedbiometrics.com> Tim Peters wrote: > > [Christian Tismer] > > ... > > Just a question, please: > > > >> import fnmatch > >> import os > >> > >> decl _debug: Int # but Boolean makes more sense; see below > > > > Is this meant to be lexically true in the globals scope from > > here on? > > I'm not sure I grasp the question. The implicit model is the "global" stmt: > a type declaration applies to the entire compilation unit in which it > appears (module, class or def), and referencing a declared name before the > declaration should be verboten. "global" currently allows redundant > declarations, and you get a Fabulous Prize for later noticing the instance > of redundant type declarations that I snuck into one of the examples . Understood. There is always just one possible type in a scope, and if defined again, it has to match. [name equivalence] That sounds very right, since it allows to create different things even if they look the same from structure. You get more strength in error checking, since using the parameter in the wrong context can be detected even if a foo's components look like a bar's. ... > >> decl list, names: [String], name: String # LINE1 > >> decl name, fullname: String # LINE2 > > > Same question: "name" is redefined from here on? > > My intent was to illustrate redundant declaration. I'm definitely not > trying to invent new scoping rules! The semantics would be exactly the same > if "name" were absent from either LINE1 or LINE2; it would be an error if > e.g. "decl name: Int" appeared in the "for" loop instead; and e.g. it would > be illegal to reference fullname before LINE2. Isn't this in conflict with one of your earlier posts where you wanted the same variable to take different types in sequence? I found that example very clean. You assigned a dict's keys() tot he variable which held the dict. Is this idea gone? > how-do-we-declare-the-type-of-a-continuation?-ly y'rs - tim Let's see :-) PythonWin 1.5.42c1 (#0, Dec 15 1999, 01:48:37) [MSC 32 bit (Intel)] on win32 Copyright 1991-1995 Stichting Mathematisch Centrum, Amsterdam Portions Copyright 1994-1999 Mark Hammond (MHammond@skippinet.com.au) >>> import continuation >>> co = continuation.caller() >>> co >>> type(co) >>> co.__doc__ "I am a continuation object, Deleting 'link' kills me." >>> callable(co) 1 >>> I think the type of a continuation is Continuation. Thanks for the good question - ciao - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! 
Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From skip@mojam.com (Skip Montanaro) Fri Dec 17 17:40:29 1999 From: skip@mojam.com (Skip Montanaro) (Skip Montanaro) Date: Fri, 17 Dec 1999 11:40:29 -0600 (CST) Subject: [Types-sig] Module Attribute visibility In-Reply-To: <385991C6.5B2917B4@prescod.net> References: <385991C6.5B2917B4@prescod.net> Message-ID: <14426.30093.817662.654658@dolphin.mojam.com> Paul> Greg Stein wrote: >> IMO, let's solve static type checking. Leave visibility and >> modification rules to another phase. They are orthogonal problems, >> and we would do well to reduce our problem set (and the amount of >> discussion thereby engendered (my 25 cent word for the day :-)). Paul> They are not orthogonal at all. I can't statically check a file Paul> that uses sys.version unless I know that sys.version has not been Paul> overwritten with a string. We can't allow the runtime system to Paul> violate the expectations of the static type engine. We also don't Paul> want every user of sys.version to need to assert its type. Depending what version of Python we are proposing this for, I think you can punt on the issues of visibility and modification if you allow the programmer to state (perhaps with a command line argument) that the elements of all core modules (sys, os, posix, math, ...) have stable type representations. This allows you (us?) to write a set of declarations for these modules akin to function prototypes in C or class declarations in C++. If that doesn't leave enough wiggle room for some objects, perhaps you need an "Object" declaration that tells the type checker the object is of an indeterminate (my 25-cent word for the day) type. Skip From faassen@vet.uu.nl Fri Dec 17 17:41:40 1999 From: faassen@vet.uu.nl (Martijn Faassen) Date: Fri, 17 Dec 1999 18:41:40 +0100 Subject: [Types-sig] New syntax? In-Reply-To: <001201bf4844$01f56a60$63a2143f@tim> References: <001201bf4844$01f56a60$63a2143f@tim> Message-ID: <19991217184140.B12905@vet.uu.nl> Tim Peters wrote: > [GregS] > > ... > > In fact, I don't even like Tim's notion of declaring a function since a > > "def" is more than adequate for doing that. > > I thought it would be easier to get one new stmt than to modify existing > stmts, and *much* easier to write a dirt-simple tool to strip them out again > (vis a vis Guido's requirement). > > In real life I would certainly prefer annotating "def" stmts directly. I > think a declaration statement needs the *ability* to specify full function > signatures, though; e.g., > > decl handlerMap: {String: def(Int, Int)->Int} > > handlerMap = {"+": lambda x, y: x+y, > "*": lambda x, y: x*y, > ... > } I think inline type declarations like def(Int, Int)->Int may not be necessary if you allow typedefs. People often give the advice to avoid Lambdas in Python anyway; why not avoid a lambda like construct in our type definition language as well? typedef Footype(int, int): return int var handlermap = {string: Footype} > In either case, I'm not sure what to do about varargs (the "*rest" form of > argument). Me neither. Perhaps something like: decldef foo(first=int, second=string, *[int]): return int i.e. all the extra arguments must be ints. Note that I'm currently in the out-of-line camp with Paul. 
:) Regards, Martijn From faassen@vet.uu.nl Fri Dec 17 18:10:05 1999 From: faassen@vet.uu.nl (Martijn Faassen) Date: Fri, 17 Dec 1999 19:10:05 +0100 Subject: [Types-sig] Type annotations In-Reply-To: <002401bf4876$327173a0$63a2143f@tim> References: <38592CD4.963694B0@prescod.net> <002401bf4876$327173a0$63a2143f@tim> Message-ID: <19991217191005.C12905@vet.uu.nl> Tim Peters wrote: > BTW, Python's actual "map" function is quite a puzzle to describe! I can't > do it: > > map: def(def(_A)->_B, Seq(_A) -> [_B] | > def(def(_A, _B)->_C, Seq(_A), Seq(_B)) -> [_C] | ... > > is just the start of an unbounded sequence of legit map signatures. Haskell > avoids the difficulty here thanks to currying. I can't do it either, but for the simple version of the function, isn't this more readable (I already notice a typo involving a closeing ) in yours!): # another variety on my proposal presented here :) decl: typedef Func(_A): return _B decl: typedef map(Func(_A, _B), [_A]): return [_B] actualmap = map(_A=int, _B=string) But perhaps it's just a matter of getting used to things. Regards, Martijn From tony@metanet.com Fri Dec 17 18:37:06 1999 From: tony@metanet.com (Tony Lownds) Date: Fri, 17 Dec 1999 10:37:06 -0800 (PST) Subject: [Types-sig] A challenge In-Reply-To: <000501bf4779$5e566b40$58a2143f@tim> Message-ID: Here is another syntactical variant. ------------------------------------ import sys, find def main() -> None: #Note: I have to rename the "list" variable here, because list is # used as a type a_list: list of str name: str dir:= "." #Note: type implied from the literal. if sys.argv[1:]: dir = sys.argv[1] a_list = find.find("*.py", dir) a_list.sort() for name in a_list: print name if __name__ == "__main__": main() ---------------------------------------------- import fnmatch import os _debug: = 0 _prune: list of string = ['(*)'] def find(pattern: str, dir: str = os.curdir) -> list of str: #Note: again, renaming the var named "list" a_list: list of str = [] names:= os.listdir(dir) #Note: asking for type to be implied here names.sort() for name:str in names: if name in (os.curdir, os.pardir): continue #Note: not asking for fullname to be typed; its usage should be # easy to type check fullname = os.path.join(dir, name) if fnmatch.fnmatch(name, pattern): a_list.append(fullname) if os.path.isdir(fullname) and not os.path.islink(fullname): for p:str in _prune: if fnmatch.fnmatch(name, p): if _debug: print "skip", `fullname` break else: if _debug: print "descend into", `fullname` a_list = a_list + find(pattern, fullname) return a_list #---------------------------------------------------------------------- import re #Note: I'm showing dictionaries' types are declared using a literal (ie # {str: int}) rather than a parameterized type name (list of int) because # a) the consistent choice for the name of a dictionary ("dict") doesnt # exist in python right now # b) actually tuples and lists can be declared using a literal, but that # would declare a tuple/list of exactly that size. # # also note that RegexObject is in a module and is accessed as such. _cache:{str:re.RegexObject} = {} # Declaring all the function signatures in a block here, to follow Tim's # format. # # The reason that "declare" is being used is that if I simply declared the # type of the variable just like the other local variable then when the # type checker gets to the actual def statement, which is really just an # assignment, it should raise an error if it cant determine that the new # value assigned does not match the definition. 
# # Also I am assuming a "bool" type. The corresponding builtin # function "bool" (str, int, float, etc. all have corresponding builtin # functions and I think that is a Good Thing) could in essence be: # # def bool(value): # if any: # return 1 # else: # return 0 # declare: fnmatch: (str, str) -> bool fnmatchcase: (str, str) -> bool translate: str -> str # showing this one again with parameter names; without them you # restrict the users of this function from using a keyword calling syntax. declare: fnmatchcase: (name:str, pat:str) -> bool def fnmatch(name, pat): import os name = os.path.normcase(name) pat = os.path.normcase(pat) return fnmatchcase(name, pat) def fnmatchcase(name, pat): if not _cache.has_key(pat): res:str = translate(pat) _cache[pat] = re.compile(res) return _cache[pat].match(name) is not None def translate(pat): #Note: the next line was originally: i, n = 0, len(pat) # I had to add parens to use the implied type sugar and its not as # easy to read. Which makes me wonder if that bit of sugar is a wart. (i, n) := 0, len(pat) res := '' while i < n: # Note: introducing chr as a type c:chr = pat[i] i = i+1 if c == '*': res = res + '.*' elif c == '?': res = res + '.' elif c == '[': j:int = i if j < n and pat[j] == '!': j = j+1 if j < n and pat[j] == ']': j = j+1 while j < n and pat[j] != ']': j = j+1 if j >= n: res = res + '\\[' else: stuff:str = pat[i:j] i = j+1 if stuff[0] == '!': stuff = '[^' + stuff[1:] + ']' elif stuff == '^'*len(stuff): stuff = '\\^' else: while stuff[0] == '^': stuff = stuff[1:] + stuff[0] stuff = '[' + stuff + ']' res = res + stuff else: res = res + re.escape(c) return res + "$" Thats it. Here is a point-by-point summary, so you can quickly point out what you dislike (and if a few people do so off-line then I'll post a summary and maybe save a bit of traffic): - types are either classes or builtin type names or type expressions. - the builtin functions involving types are overloaded when used in type expressions. - "bool" is a new type and a new builtin - parameterized types are instantiated (er, whats the real term for this?) with the "of" operator. - dictionaries of any size are defined by a literal dictionary syntax - lists and tuples of exact size are defined by a literal tuple syntax - callables are shown with the -> operator - the arguments to a signature is not a tuple; you can have names and * and ** keywords. - you can omit the type expression altogether if it is an assignment from a well-known function (e.g. builtin) or a simple literal (e.g. 0, '') - variables created by for loops are typed in-line - if you are specifying a signature for a function that occurs later it should be in a declare block - using "list", "int", etc. for variable names potentially shadows the builtin function and the type name. -Tony From da@ski.org Fri Dec 17 18:53:18 1999 From: da@ski.org (David Ascher) Date: Fri, 17 Dec 1999 10:53:18 -0800 Subject: [Types-sig] Keyword arg declarations References: <199912171442.JAA12404@eric.cnri.reston.va.us> Message-ID: <00c101bf48bf$fb9850c0$c355cfc0@ski.org> From: "Guido van Rossum" > I just realized that Tim's decl syntax that's currently being bandied > around doesn't declare the names of arguments. That's fine for a > language like C, but in Python, any argument with a name (*args > excluded) can be used as a keyword argument. This brings to mind a point which may or may not be relevant. 
Sometimes Python users use some tricks to do 'deferred' type checking and other argument manipulation, because the syntax doesn't allow one to specify the interface which one needs. An example of such a signature is familiar to all is the signature for range(). The docstring for range reads: range([start,] stop[, step]) -> list of integers which is not expressible with the current syntax. A Python version of range would have to do, much like NumPy's arange does, def range(start, stop=None, step=1): if (stop == None): stop = start start = 0 Now, the builtin typechecker can of course be told about __builtin__.range's signature peculiarities, but is there any way we can address the more general problem? Or is it, as I suspect, rare enough that one can ignore it? > (Note that not all builtins support keyword arguments; in fact most > don't.) And a shame it is, IMO. Would it make sense to consider for 2.0 a mechanism which allows keyword arguments almost by default? That way I could do pickle.dump(object=foo, file=myfile) and never have to worry about which came first... --david From gstein@lyra.org Fri Dec 17 19:02:04 1999 From: gstein@lyra.org (Greg Stein) Date: Fri, 17 Dec 1999 11:02:04 -0800 (PST) Subject: [Types-sig] typedefs (was: New syntax?) In-Reply-To: <19991217184140.B12905@vet.uu.nl> Message-ID: On Fri, 17 Dec 1999, Martijn Faassen wrote: > Tim Peters wrote: > > [GregS] > > > ... > > > In fact, I don't even like Tim's notion of declaring a function since a > > > "def" is more than adequate for doing that. > > > > I thought it would be easier to get one new stmt than to modify existing > > stmts, and *much* easier to write a dirt-simple tool to strip them out again > > (vis a vis Guido's requirement). > > > > In real life I would certainly prefer annotating "def" stmts directly. I > > think a declaration statement needs the *ability* to specify full function > > signatures, though; e.g., > > > > decl handlerMap: {String: def(Int, Int)->Int} > > > > handlerMap = {"+": lambda x, y: x+y, > > "*": lambda x, y: x*y, > > ... > > } > > I think inline type declarations like def(Int, Int)->Int may not be necessary > if you allow typedefs. People often give the advice to avoid Lambdas in Python > anyway; why not avoid a lambda like construct in our type definition language > as well? > > typedef Footype(int, int): > return int > > var handlermap = {string: Footype} I see typedefs as a way to associate a typedecl with a name. In your example here, I'm not sure how to do a typedef of something like List. You seem to have pegged typedef to only do function typedefs. Per the GFS proposal, I would recommend that "typedef" is a unary operator keyword. The operand is a typedecl, in the form that we see to the right of a "decl" statement. The result of the operator is a typedecl object. This typedecl object can, of course, be used in further typedecl constructions. For example: HandlerMapType = typedef {String: def(Int, Int)->Int} decl std_handlers: [HandlerMapType] def foo(m: HandlerMapType)->Int: ... In any case, I think using "def" inline to define a function typedecl is fine. A typedef is merely used to create an alias, to clarify a later declaration. Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Fri Dec 17 19:52:07 1999 From: gstein@lyra.org (Greg Stein) Date: Fri, 17 Dec 1999 11:52:07 -0800 (PST) Subject: [Types-sig] parameterized typing (was: New syntax?) In-Reply-To: <19991217183245.A12905@vet.uu.nl> Message-ID: Hrm. Looks like I deleted Paul's original note. 
I've got some ideas now on how to do parameterized types, so I'll just piggyback on Martijn's :-) On Fri, 17 Dec 1999, Martijn Faassen wrote: > Paul Prescod wrote: > > Martijn Faassen wrote: > > > ... > > > Didn't you think parameterized types looked fairly straightforward in my > > > syntax proposal? > > > > I must have missed something. Could you show me how to do Btree of X and > > then make concrete types Btree of Int and Btree of Functions From String > > to Int? The first issue to handle with parameterized classes is how to specify the parameter(s) to be used in your class definition. Tim's "decl" keyword comes into play here. So lets jump in with definition-by-example: class Btree: decl param _T: any # the parameter can have any type def insert(self, value: _T) -> None: ... def asList(self) -> [_T]: ... Now that we have defined and declared our Btree class, we can parameterize it in later declarations: decl tree1: Btree(Int) # Paul's first request decl tree2: Btree(def(String)->Int) # Paul's second request tree1.insert("foo") # causes type-check error; we're passing wrong type l = tree1.asList() # we know that is [Int] Other examples of param declarations might be: decl param _T: Int or String or Tuple decl param _S: [Int] or [Float] It is interesting to note that the above lines are almost exactly the same as doing: _T = typedef Int or String or Tuple _S = typedef [Int] or [Float] The only difference is that (in the decl case) Python understands that _T and _S are type-substitution parameters. If a true typedef was used in the Btree class definition, then we would simply end up with a non-parameterizable class that had "any" in some of its declarations. Note that the compiler *does* treat the declarations using the typedef notion: it can perform optimizations, type checks, and other stuff as if "any" was used. To be clear: class Foo: decl param _T: Int or Float def bar(self, value: _T) -> _T: return 2 * value In the above code, the compiler uses _T as a typedef and optimizes the "2 * value" line, knowing that "value" is either an Int or a Float. Runtime checks are present, as usual. The "decl param" is only important to *users* of class Foo -- the _T becomes part of Foo's interface and type substitutions can be made. ---- Point (1) (this will be important later) One issue is that Btree has no (runtime) argument checks in the above example. "any" is allowed for the insert() parameter, which effectively means "no type check." In other words, Btree is still an abstract implementation; the concreteness is only present in compile-time type checks. Recall my previous note regarding type declarators. Note that a type declarator object can be a type, a class, an interface, or a composite. For example: t1 = typedef Int # typedecl of a type t2 = typedef Btree # typedecl of a class t3 = typedef Sequence # typedecl of an interface t4 = typedef Btree(Int) # typedecl of a composite I might suggest that, in the same way instances are created, we can create concrete classes through the use of type declarators: BtreeInt = typedef Btree(Int) tree = BtreeInt() In other words, typedecl objects are callable. A typedecl that is a class or a parameterized class will instantiate an object. So what does "concrete class" actually give you? (note that I mean something other than the type checks mentioned above) I think a concrete class would be an on-the-fly constructed subclass of the abstract class. 
Each method would be overridden: a method arg runtime check is done, then a call to the abstract method if performed. For example: class __compiler_built_Btree_Int(Btree): def insert(self, value: Int)->None: return Btree.insert(self, value) def asList(self)->[Int]: return Btree.asList(self) The notion of "concrete class" (to me) simply means the addition of runtime checks to enforce the type constraint. Theoretically, the system could recompile the Btree class and perform various optimizations. But: I don't think that is really possible, given Python's model (the source may not be readily available, and there isn't a way for the compiler to reach into the middle of a source file to grab the Btree class source and rebuild a concrete version). Given that we have compile-time checks with the simplest notion of parameterized types, I don't see the runtime checks offering a whole lot more. Especially for the complexity involved. I think I'll take Tim's tack here: the notion of on-the-fly building concrete classes won't work; the example above is a before-somebody- suggests-it counterproof. So, I would say everything above Point (1) is valid. Everything below should not be dealt with. Paul: does this sufficiently address your desire for parameterized types? Others: how does this look? It seems quite Pythonic to me, and is a basic extension of previous discussions (and to my thoughts of the design). Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Fri Dec 17 20:04:37 1999 From: gstein@lyra.org (Greg Stein) Date: Fri, 17 Dec 1999 12:04:37 -0800 (PST) Subject: [Types-sig] Keyword arg declarations In-Reply-To: <00c101bf48bf$fb9850c0$c355cfc0@ski.org> Message-ID: On Fri, 17 Dec 1999, David Ascher wrote: > From: "Guido van Rossum" > > I just realized that Tim's decl syntax that's currently being bandied > > around doesn't declare the names of arguments. That's fine for a > > language like C, but in Python, any argument with a name (*args > > excluded) can be used as a keyword argument. I responded to this elsewhere; I believe we can easily declare varargs and keywords with an unambigous syntax. >... > An example of such a signature is familiar to all is the signature for > range(). The docstring for range reads: > > range([start,] stop[, step]) -> list of integers > > which is not expressible with the current syntax. A Python version of range > would have to do, much like NumPy's arange does, I believe it is expressible: def range(start: Int, stop=None: Int or None, step=1: Int) -> [Int]: ... The only caveat is that somebody could do: range(3, None, 1) Which (technically) is not supposed to be allowed. >... > Now, the builtin typechecker can of course be told about __builtin__.range's > signature peculiarities, but is there any way we can address the more > general problem? Or is it, as I suspect, rare enough that one can ignore > it? Well, the above isn't necessarily the prettiest, but it *is* possible with at least one proposal for syntax extensions. I believe this kind of argument funkiness is pretty rare and we don't need to provide any special handling or consideration for the problem. >... > > (Note that not all builtins support keyword arguments; in fact most > > don't.) > > And a shame it is, IMO. Would it make sense to consider for 2.0 a mechanism > which allows keyword arguments almost by default? That way I could do I think the builtins need to start using PyArg_ParseTupleAndKeywords(). 
I seem to recall that there have been problems with that function in the past, but I think they've been cleared up in the 1.5 series. That should keyword-enable those functions... Cheers, -g -- Greg Stein, http://www.lyra.org/ From skaller@maxtal.com.au Fri Dec 17 22:06:54 1999 From: skaller@maxtal.com.au (skaller) Date: Sat, 18 Dec 1999 09:06:54 +1100 Subject: [Types-sig] Type Inference I References: Message-ID: <385AB3FE.AE6C2630@maxtal.com.au> I've just been reading the December Types-Sig archive. Wow, from an almost dead SIG ... I've got a lot of comments, and I'm not sure if to roll them all into one post, or lots of small ones. So I'll try a small one, to see if I get suitably flammed :-) Proposition: it is possible to do whole program type analysis of existing Python programs, and generate optimised code. Evidence: JimH has in fact done this, and claimed some astounding (almost unbelievable) results. I have done some 'micky mouse' testing, and found that in some cases, it can indeed work. So I think the proposition is proven that it can be _done_, but how effective is it? What obstacles stand in the way? Here are some thoughts. OBSTACLES TO TYPE INFERENCE IN PYTHON, I: POOR SPECIFICATIONS ---------------------------------------------------------- First, the biggest obstacle to doing this is the state of the current language definition! The exclamation mark is there because I know you didn't expect this. [You thought that optional type declarations would be the biggest help] Of course, Guido (and ONLY guido) could go ahead and do some type inferencing, and then 'explain' some extra restrictions on Python. But no other third party could do this, and claim to faithfully compile Python. I've presented this argument before, but I'll give it again anyhow. At present, there is no such thing as an 'erroneous' or 'ill formed' python program. Every single file on your computer is a correct python program. If you run python on some arbitrary file, and it throws a SyntaxError, then the program is correct: it has well defined semantics, namely to raise a SyntaxError. Perhaps you think this example is extreme (it is, deliberately), but when we come down to compiling files which are more 'obviously' python, this issue almost completely prevents any optimisation by type inference -- and that is why the most important change for Python is to very carefully define what stuff is NOT correct python. It is very important that in these cases, the language specification does NOT say an exception is raised because that is _precisely_ what the problem is. Raising an exception is well specified behaviour, and when it happens according to specification, the client code which causes it to happen is CORRECT python -- precisely because the behaviour is specified. For example, consider: x = 1 y = open("something") try: x + y except: print "OK" This code is CORRECT python at the moment (AFAIK). It is NOT 'illegal' to add a file and an integer, it is perfectly correct to do it, and then handle the resulting exception. There is no hope for any kind of type inference until this is fixed. What must be said is that this case is an error, and that Python can do anything if the user does this: the result of executing the code MUST be undefined. The fact that a particular implementation throws an exception, is good behavour on the part of that particular implementation, but it must NOT be required -- because that would prevent a compiler rejecting the program. 
FIXING THE PROBLEM -- WITHOUT DOING TOO MUCH WORK ------------------------------------------------- It is relatively easy to fix SyntaxError: it is easy to say that a 'python' program is one that (at least) conforms to the grammar. It is not so easy to fix all the other cases, because in _some_ of these cases, we would want 'undefined' behaviour, and in other cases, we would actually _want_ to throw an exception. For example, it is common to say this: try: import X except ImportError: import Y and many people think this is reasonable, and changing the existing semantics would break a lot of existing code. It is possible to manually look at EVERY possible place in the language specification, and decide in which cases an exception must be raised, and in which cases the _program_ is plain wrong, and the result is undefined -- and perhaps go on to specify implementation details such as "The current CPython implementation raises an XXX exception here". But that is a lot of work. The example of "SyntaxError" suggests an alternative: we examine exceptions directly, and specify which ones must be thrown, and which ones are the result of an invalid program. This _might_ require some reworking of the exception tree (I can't spell heirarchy :-) The way to do this is to pretend you are a compiler implementor for Python, and want to know which things you MUST generate code for, and which things you will either assume (and let the clients program go haywire or coredump if the assumption is violated), or which things you can reject at compile time as WRONG. For example, os errors clearly require runtime support: they're not (necessarily) program errors. On the other hand , TypeError and ValueError are difficult. Here's why: when the top level catches a type error, it is almost certainly a program error. But in the case: def f(arg): try: return "<" + arg + ">" except TypeError: return repr(arg) the client is doing something sensibly pythonic . Here, a much more complex rule may be required: [Badly worded WRAPPING rule} ---------------------------- "If an operator has 'invalid' argument combinations, then, unless it is wrapped in a lexically enclosing try block which catches a TypeError, the program is ill formed." In other words, if you really want to catch a TypeError when evaluating some expression, you are REQUIRED to make sure it is lexically enclosed in a try block which explicitly catches the TypeError. The reason is clear -- a type inference algorithm can note the try/except block here, and generate code that does type checking. But in the case of plain old: "<" + arg + ">" _without_ lexical enclosure, the type inference is entitles to assume that 'arg' is a string. And then, when the client calls the function containing this fragment, a compiler may reject the program because it can see that 'arg' is not a string, as required by the semantics. The key word in that sentence is 'required'. At present, the semantics don't really _require_ anything, and so optimising python by local type inference is impossible. [This does not prevent whole program analysis, but it does make the results much less effective] Finally, I note that contrary to my simplified assumptions, the current reference DOES in fact say, in places, that something is 'illegal' (or whatever). The contention of this particular article is that this is the first, and most important, work that can be done by people wanting to compile python to efficient code: clean up the semantics. 
In many cases, I feel Guido will already have an opinion on what is 'intended' to be an error in the program, and what is 'intended' to throw an exception: that is, I think Guido could sensibly resolve some of the cases that other SIG members disagreed on, or found difficult. Here is the 1.5.2 exception tree, with comments: Exception(*) | +-- SystemExit +-- StandardError(*) | +-- KeyboardInterrupt NO! This should be TOP LEVEL +-- ImportError RAISED ONLY IF WRAPPED, ELSE ERROR **** +-- EnvironmentError(*) MUST BE RAISED IN COMPILED CODE | | | +-- IOError | +-- OSError(*) | +-- EOFError MUST BE RAISED IN COMPILED CODE +-- RuntimeError | | | +-- NotImplementedError(*) UNDEFINED BEHAVIOUR | +-- NameError UNDEFINED BEHAVIOUR (except in 'exec, eval') +-- AttributeError UNDEFINED (use getattr to get defined behaviour) +-- SyntaxError UNDEFINED +-- TypeError UNDEFINED UNLESS WRAPPED LOCALLY +-- AssertionError UNDEFINED +-- LookupError(*) UNDEFINED UNLESS WRAPPED LOCALLY | | | +-- IndexError | +-- KeyError | +-- ArithmeticError(*) UNDEFINED | | | +-- OverflowError | +-- ZeroDivisionError | +-- FloatingPointError | +-- ValueError UNDEFINED UNLESS WRAPPED +-- SystemError UNDEFINED +-- MemoryError UNDEFINED The outcome of this is that really, the only times python guarrantees to raise an exception is for environment errors, or for typing/indexing/lookup errors which are locally wrapped. The biggest problem here is clearly ImportError, since it cannot be raised _unless_ the importer catches an exception -- and the only possible exception it could catch would be an environment error, which is unlikely, except if the module cannot be found on the file system. I'm NOT sure I like the wrapping rule. An alternative is just to say that the client is required to test, instead of relying on exceptions. But we need _something_. Comments needed here: this is the hard part (a suitably 'pythonic' rule) ------------------------------------------------------------ ** Keyboard Interrupt: this is wrong wrong wrong!! Ocaml does this too. The reason is, that Ctrl-C (or whatever) can occur _anywhere_ in a program, and catching it therefore clashes with: try: something except: handler Here, 'except' is probably intended to catch SYNCHRONOUS errors caused by code in 'something', but it inadvertantly catches a KeyboardInterrupt as well. SOLUTION: At least, KeyboardInterrupt should be a top level exception, or, better, divide the exception tree into two kinds: synchronous and asynchronous. At least then the programmer can write: try: something except SynchronousException: handler which will still let the Ctrl-C escape through to the top level and kill the program (or, get explicitly handled). A better solution, perhaps, is to get rid of Keyboard Interrupt altogether, and handle signals by a different mechanism (perhaps in the signal module, perhaps with core language support) -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From gstein@lyra.org Fri Dec 17 22:58:25 1999 From: gstein@lyra.org (Greg Stein) Date: Fri, 17 Dec 1999 14:58:25 -0800 (PST) Subject: [Types-sig] Type Inference I In-Reply-To: <385AB3FE.AE6C2630@maxtal.com.au> Message-ID: On Sat, 18 Dec 1999, skaller wrote: > ... long post about exceptions and semantic definition ... Sorry John, call me dense, but I really don't see what you're talking about. :-( I don't see a problem with exceptions. That is part of Python. 
I don't see that it causes any problems with type inference, either (it just introduces interesting items into the control/data flow graph). This whole tangent about feeding an email to Python and claiming it is a valid Python program with defined semantics (raise SyntaxError). I understand your explanation, but I totally miss the point. So what? Type inferencing for the "1 + file" case is easy. You know the two types, and you know they can't be added. Bam. Error. And this whole thing about wrapping ImportError or TypeError or whatever... I just don't see your point. It was a long email, but what exactly were you trying to say? "Define the semantics" isn't very clear. I feel Python has very clear semantics. What exactly is wrong with them? Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Fri Dec 17 23:31:40 1999 From: gstein@lyra.org (Greg Stein) Date: Fri, 17 Dec 1999 15:31:40 -0800 (PST) Subject: [Types-sig] I've collected my thoughts... Message-ID: I've grabbed up the various snippets that I've blathered on about and dumped them all into a web page: http://www.lyra.org/greg/python/type-proposal.html The page is weak on semantics discussion, but strict on detail. I've got syntax changes defined, and a (starting?) list of runtime and compile-time semantic (changes). I'll keep adding to it as I think of things and hear back from people. I'm going to try to slow down on this type stuff, though, as I'm going to be starting work on the new import system. Cheers, -g -- Greg Stein, http://www.lyra.org/ From tim_one@email.msn.com Fri Dec 17 23:29:04 1999 From: tim_one@email.msn.com (Tim Peters) Date: Fri, 17 Dec 1999 18:29:04 -0500 Subject: [Types-sig] A lurker's comment In-Reply-To: <19991216190524.16032.qmail@lannert.rz.uni-duesseldorf.de> Message-ID: <000501bf48e6$81a33ce0$32a2143f@tim> [lannert@lannert.rz.uni-duesseldorf.de] > .... > An assignment with lhs: (IntType,), rhs: (NoneType, IntType) > should not be rejected by the interpreter if rhs happens to > be an Int, but by the compiler. ??? If it's rejected by the compiler, the interpreter will never get to see it. See earlier msgs for disussion of "modes". Some people will want a compile-time error on that; others will want a runtime error iff rhs==None obtains; others will want a compile-time warning but not a compile-time error or runtime expense. Does one of those cover your view of the world, or do you have a 4th idea in mind? Note that GregS's "!" operator gives another approach to cases like this (explicit runtime check, spelled in a convenient way but on an expression-by-expression bais). From tim_one@email.msn.com Fri Dec 17 23:39:47 1999 From: tim_one@email.msn.com (Tim Peters) Date: Fri, 17 Dec 1999 18:39:47 -0500 Subject: [Types-sig] What is the Essence of Python? In-Reply-To: Message-ID: <000701bf48e8$00ce5d00$32a2143f@tim> [Eddie, cogently attempting to tell Howard this isn't a movie ] > ... > There are some important `pythonic theses' I saw (by Tim Peters, I > think) but I've lost the bookmark ... ask Tim Peters, they were good. > They might come closer to satisfying your criteria of essentiality. They've propagated to surprising places, from Andrew Kuchling's "Python Quotes" page (on the Starship) to Linux News(!). Guido still needs to fill in the 20th, though. 
the-essence-of-stone-is-the-quality-of-stoniness-ly y'rs - tim From skaller@maxtal.com.au Fri Dec 17 23:59:24 1999 From: skaller@maxtal.com.au (skaller) Date: Sat, 18 Dec 1999 10:59:24 +1100 Subject: [Types-sig] Type Inference I References: Message-ID: <385ACE5C.17CD5684@maxtal.com.au> Greg Stein wrote: > On Sat, 18 Dec 1999, skaller wrote: > > ... long post about exceptions and semantic definition ... > > Sorry John, call me dense, but I really don't see what you're talking > about. :-( It takes a while to understand the impact of conformance and specifications on semantics .. and that this is not just a matter of language lawyering, but a real, pragmatic, issue. > I don't see a problem with exceptions. That is part of Python. I don't see > that it causes any problems with type inference, either (it just > introduces interesting items into the control/data flow graph). The problem arises roughly as follows: type inference works by examining an expression like: x + 1 and _deducing_ that x MUST (**) be an integer. It cannot be a file, because it isn't allowed to add a file to an integer. But in Python you CAN add a file to an integer. It is perfectly legal, it just throws an exception. Do you see? This means we cannot deduce ANYTHING about 'x' in the example snippet given above. Of course, the _expression_ x+1 can only be an integer, we _can_ deduce that. But that isn't enough. Python is too dynamic. We need more constraints to be able to do effective inference. (**) This example ignores class instances with __add__ methods, to make the argument easier to follow. > This whole tangent about feeding an email to Python and claiming it is a > valid Python program with defined semantics (raise SyntaxError). I > understand your explanation, but I totally miss the point. So what? See above. We cannot infer anything, unless there are rules. That is, there MUST be set of permitted signatures for functions/operators, in order to do inference at all. It is possible to do synthetic (bottom up) type analysis, such as: x = 1 + 1 Here, we know that Int + Int -> Int, and so x (at least at this point in the program) must be an Int. But that is only the 'deductive' part of inference, the 'inferential' part infers the types of _arguments_ from the set of allowable signatures of functions. That is, we must do the inference top down, not just bottom up. > Type inferencing for the "1 + file" case is easy. You know the two types, > and you know they can't be added. Bam. Error. but you're wrong, the result of applying the addition operation is, in fact, well defined: it is NOT an error in the program, it just throws an exception, rather than returning a value. If you throw an exception deliberately, that is hardly an error, is it? > It was a long email, but what exactly were you trying to say? "Define the > semantics" isn't very clear. I feel Python has very clear semantics. What > exactly is wrong with them? There is no distinction made between 'incorrect' code, and 'correct' code for which an exception is thrown. In compiled code, we need the distinction, because there is a lot of overhead in doing the dynamic type checking required to throw the exception. The whole point of compilation is to eliminate the overhead of run time type checking. This can be done if we know what the type of an expression MUST be, but the current semantics don't allow this in enough cases. 
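To make the bottom-up half of that concrete, here is a toy synthesizer
(a sketch only: it works over a made-up tuple encoding of expressions,
not real Python parse trees, and it ignores classes with __add__):

    # An "expression" here is an Int constant, a String constant, or a
    # tuple ('add', left, right).  Entirely made up, for illustration.
    def synth(expr):
        if type(expr) is type(0):
            return "Int"
        if type(expr) is type(""):
            return "String"
        if type(expr) is type(()) and expr[0] == 'add':
            lt, rt = synth(expr[1]), synth(expr[2])
            if lt == "Int" and rt == "Int":
                return "Int"
            if lt == "String" and rt == "String":
                return "String"
            # The current semantics do NOT let us call this an error:
            # the program may be relying on the TypeError being raised.
            return "Any"
        return "Any"

    # synth(('add', 1, 1))  -> 'Int'
    # synth(('add', 1, "")) -> 'Any', not 'error'

The fall-through is the whole problem: for the mixed case the honest
answer is "could be anything, and might raise", which helps the code
generator not at all.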
Let me invent a temporary syntax, in which there is no exception throwing, instead, the value returned by an expression is either 'something nice', or 'exceptional'. This IS what happens now in C code, where NULL represents 'exceptional', right? Now, the type of x + 1 can be 'nice or exceptional', which tells us nothing about the type of x. But if the type is 'nice', we know that x must be an integer. And then, the compiler can generate code that doesn't bother to check the result of calling PyAdd(x, &One), because it cannot be NULL. We know PyAdd will return an actual object. I'm sure YOU have done that yourself, writing C extensions. [In fact, we can do better, we can peek into the Int data structure and add 1 to the result, and rewrap it as a PyObject; that is, we can INLINE the PyAdd function, and throw out the cases that cannot occur -- with any luck, within a larger code fragment, we can get rid of PyAdd altogether, and just use 'x++' -- a single machine instruction.] Hope this makes sense: to compile python code effectively, we need to add some reasonable 'static-y' restrictions. Where, 'reasonable' means 'suitably pythonic', but not quite as dynamic as the current CPython 1.5.2 implementation allows. -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From skaller@maxtal.com.au Sat Dec 18 00:32:29 1999 From: skaller@maxtal.com.au (skaller) Date: Sat, 18 Dec 1999 11:32:29 +1100 Subject: [Types-sig] Easier? References: Message-ID: <385AD61D.548EB5F8@maxtal.com.au> Paul Prescod wrote: >> Jim created an implementation of an excellent, intelligent optimizing compiler. His work is as, or more, interesting than mine, but it is a different problem he is trying to solve. (OPT) comes into the picture because my work makes his much, much easier and more effective in many cases. << I think you are wrong, and right: you are wrong that declarations makes inferencing easier, it makes it harder, because there is more syntax to analyse, and so more code to add to the inference engine. But of course, you are right it makes the algorithm more effective, in particular, if it is only applied locally (say, to a module, rather than a whole program). While I'm here, I'd also like to correct a misconception that type inference is 'hard'. Irrespective of the case for static languages like C++ or ML, a Python type inferencer is not harder, but MUCH easier to write. Here's the explanation: for static systems, the inferencer needs to be as complete as possible: the client is going to be pretty pissed off it it crashes, or if it fails to deduce a type (both these thing can happen in the ocaml inferencer). That's because in these languages, it's necessary to deduce the type for compilation to proceed. This is not true in Python. It is not necessary for inferencing to be 'complete' in the sense it is strongly desired for static languages. In Python, the following inferencer is acceptable, if not very good: def infer_type(expr): return "PyObject" This inferencer will not help optimise compiled code, but it is CORRECT. I will call such an inferencer a _conservative_ inferencer: it will always work and always give the correct results, even if they do not help much with optimisation. The inferencer above is probably the first one I will use for Viperc: it will be a brain dead code generator that does nothing more than wrap the equivalent of bytecode instructions into the equivalent C function calls. 
This has to work ANYHOW, as a fallback if the inferencer cannot infer enough. Now, you can try for a better inferencer, and, provided it is conservative, you can probably only get better performance. The point is that you can build a compiler with a lousy inference engine, and improve the inferencer -- and the code generator -- later. All that is required of the inferencer is that it be correct (conservative). The issue then arises, that even a good inferencer will yield poor performing code. That is where 'aggressive' inferencers come into play -- these are inferencers that make extra assumptions, in order to get better deductions. And my 'Type Inference I' post is all about adding extra constraints to Python, so that more aggressive inferencers can be considered conservative -- that is, guarranteed correct according to language specifications. Apart from the exception handling problem, another 'assumption' that inferencers can benefit from is 'freezing': assuming that, after loading, the bindings in a module are immutable. Indeed, Viperi actually allows freezing modules now (i.e. even the interpreter can beneift) but you have to explicitly call a function to do it -- because otherwise the interpreter would be breaking the Python language. I could dispense with that requirement (that the user call the freezing function) if Guido were to mandate "thou shalt not change the binding of a module after it is imported". I'm not asking for that, just trying to explain how important conformance issues and specifications are in optimisation, and in particular, how important it is that certain operations NOT be defined (even by a requirement that an exception be thrown). It is, in fact, fairly true to say that it is the things which are NOT legal, which are the very things which permit optimisation. Perhaps Tim or Guido can explain this better. -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From tim_one@email.msn.com Sat Dec 18 00:46:35 1999 From: tim_one@email.msn.com (Tim Peters) Date: Fri, 17 Dec 1999 19:46:35 -0500 Subject: [Types-sig] Low-hanging fruit: recognizing builtins In-Reply-To: <3859FD66.E47352E@lemburg.com> Message-ID: <000a01bf48f1$56213bc0$32a2143f@tim> [M.-A. Lemburg] > ... > BTW, just to make buying one of those new microwave > ovens more attractive: what is the pystone rating for > the new Athlon and Pentium III chips ? No idea. AMD and Intel both put in new instructions to speed speech recognition, so that's a clear direction for Python's implementation to follow . AMD's solution to lousy cache performance was to add prefetch instructions, allowing the assembly-language programmer to tell the memory system which addresses they *expect* to be reading from "pretty soon". Helps some SR inner loops a lot. That's data, though, not instruction space. eval_code2's problem is that it contains more code than the English language has words . chances-are-it-scales-with-the-clock-rate-ly y'rs - tim From tim_one@email.msn.com Sat Dec 18 00:46:37 1999 From: tim_one@email.msn.com (Tim Peters) Date: Fri, 17 Dec 1999 19:46:37 -0500 Subject: [Types-sig] New syntax? In-Reply-To: Message-ID: <000b01bf48f1$572dc9c0$32a2143f@tim> [Greg Stein] > ... > I guess that does mean that something like: > > decl a: def(Int)->None > > would be possible. e.g. is a member holding a ref to a function > object. If it weren't possible, it would be quite a hole in the type description mechanism! 
> Of course, the type of in this case is no different than: > > def a(Int x)->None: > > It is just that one declares a member and the other declares a method. > There is a subtle difference there :-) I'd say it's exactly as subtle as the difference between Class.f = somefunc and Class().f = somefunc today. BTW, I wouldn't object to requiring that the class/member distinction be explicit. decl class a: ... decl member a: ... If "decl" gets used for more stuff down the road, it could be a real help to make the syntax explicit from the start: ofwhat : 'class' | 'member' | 'var' | 'type' | 'frozen' | ... decl-stmt : 'decl' ofwhat > In fact, these two are probably equivalent: > > decl class a: def(Int)->None > def a(Int x)->None: WRT type, yes, but (of course!) the former is merely a declaration while the latter is the initial stmt of a definition. >> In either case, I'm not sure what to do about varargs (the >> "*rest" form of argument). > What's wrong with: > > decl a: def(Int, *)->Int > decl b: def(Int, **)->Int > decl c: def(Int, *, **)->Int > > I don't see any ambiguity in the grammar there, unless you use "*" > to mean unknown (as Paul once mentioned). I think the unknown type > should be "Any" (or "any"), since it really means "take any type > of value." Yes, Any is good. The problem with * and ** is that people are going to want to express restrictions, like "only strings from here on in" or "all the keyword args must be of int type". Under the theory that things work well if you just don't think about them , decl c: def(Int, *: (String), **: {String: Int})->Int > ... > I'm not sure whether to go for practical or pure. I'm leaning toward the "always explicit" above. Restrictions can always be loosened later if they prove too confining, but tightening a permissive spec is usually impossible. despite-guido's-charming-belief-that-we-could-actually-ban- intractably-magical-namespace-mutation-ly y'rs - tim From tim_one@email.msn.com Sat Dec 18 01:17:27 1999 From: tim_one@email.msn.com (Tim Peters) Date: Fri, 17 Dec 1999 20:17:27 -0500 Subject: [Types-sig] Keyword arg declarations In-Reply-To: <199912171442.JAA12404@eric.cnri.reston.va.us> Message-ID: <000c01bf48f5$a60fbd60$32a2143f@tim> [Guido van Rossum] > I just realized that Tim's decl syntax that's currently being > bandied around doesn't declare the names of arguments. That's > fine for a language like C, but in Python, any argument with a > name (*args excluded) can be used as a keyword argument. I never specified the full syntax, and partly because regurgitating the full arglist syntax at this point would lose the idea to the details! Arglists in Python are complex beasts. > I think it will be useful for the decl syntax to allow leaving out or > supplying argument names -- that tells whether keyword arguments are > allowed for this particular function. And that is part of a > function's signature. Definitely part of the signature. Optional arguments too! Are default *values* also part of the signature? def increment(x, bump=1): ... If this got declared via e.g. decl increment: def(Int, Int=1) -> Int then *call* sites could generate code to build the full argument list appropriately, and invoke a leaner entry point to the eval loop that didn't have to deduce the correct arglist at runtime via an all-purpose algorithm. This could be a valuable time-saving (albeit code-bloating) optimization. 
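Spelled out (a sketch of the intent only -- nothing generates this today):

    n = increment(x)     # what the programmer writes
    n = increment(x, 1)  # what the call site could be compiled as: the
                         # declared default filled in, no runtime arglist
                         # deduction, no keyword machinery

The leaner entry point then just pops exactly two positional arguments.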
OTOH, I have too much abusive code that looks like: def whatever(arg1, arg2, _int=int, _ord=ord): and I've been secretly hoping I could abuse the declaration mechanism to "export" this as decl whatever: def(Any, Any) -> Any This addresses the rare but real errors wherein a caller inadvertently passes "too many" arguments, overwriting one of the speed-hack default args. Then there's also decl yadda: def(Int, =Int) -> None for the case where the 2nd argument is optional but the user doesn't want it treated as a keyword argument. My idea on that one was to say "tough luck -- you need to give the name here". What do (the generic) you say? > (Note that not all builtins support keyword arguments; in fact most > don't.) So fix that . > (Un)related: I think it makes sense to be able to restrict the types > of *varargs arguments. E.g. eons ago (last week in the types-sig) > someone proposed an extension to isinstance() allowing one to write > isinstance(x, type1, type2, type3, ...). Clearly the varargs are all > type objects here. > > Not so sure about **kwargs, but these should probably be treated the > same way. Coincidentally addressed that in an earlier msg. Don't think it's a problem. decl isinstance: def(Any, Type) | def(Any, Type, *: (Type)) although the use of "(thing)" to mean "tuple of things of arbitrary length" does look like a syntax awaiting regret <0.9 wink>, and more than one of us has stuck "extra" parens in declaration examples for clarity. unicode-will-supply-many-more-matching-brackets-ly y'rs - tim From tim_one@email.msn.com Sat Dec 18 01:23:17 1999 From: tim_one@email.msn.com (Tim Peters) Date: Fri, 17 Dec 1999 20:23:17 -0500 Subject: [Types-sig] Keyword arg declarations In-Reply-To: Message-ID: <000d01bf48f6$764ee5a0$32a2143f@tim> [Greg Stein] > ... > Shouldn't be a problem: > > def foo(bar, *args: [Int], **kw: {String: Float}) -> None: > ... Except that *args is a tuple of ints (not a list of 'em). (Int) is really unattractive for this. I know: (!Int). tee-hee-ing-ly y'rs - tim From skaller@maxtal.com.au Sat Dec 18 01:49:27 1999 From: skaller@maxtal.com.au (skaller) Date: Sat, 18 Dec 1999 12:49:27 +1100 Subject: [Types-sig] Type Inference II References: Message-ID: <385AE827.42ECB891@maxtal.com.au> In part I of this, I discussed the issues related to the Python language specification and exception handling and the impact on type inference. Summary: certain operations which can throw exception limit the effectiveness of type inference, but it may be relatively easy to clean up the semantics by specifying that many of the exceptions must be caught locally or the program is in error. In this part, I continue by looking at other parts of the language which may restrict the ability to generate optimised code. TYPE INFERENCE II: SOME SPECIAL CASES ------------------------------------- Freezing. --------- Type inference can be severely inhibited by the fact that Python permits programmers to modify modules after they have been imported. Indeed, it is not only _type_ inference that is affected, but also inability to cache values, or inline functions, that is affected. For example, if a module m contains a variable 'x' and a function f, and we know 'x' is not modified after the module is loaded, then we can replace m.x with the actual value of x. There are a lot of such constants in Python library modules, for example, in socket and errno. A similar argument applies to functions: we cannot inline a function call, if we do not know the body of the function. 
In the case of a call like: m.f(1,2,3) we cannot easily inline f, because we don't know the user hasn't replaced the original f with something else. It may seem at first this issue is independent of 'type inference', but that is not the case. First, consider C and C++, in which 'const' is part of the type system. Second, consider inference applied to an expression y + m.x Here, if we know m.x is an integer -- we don't care about the value -- we can deduce that y is also an integer (ignoring __add__ methods for clarity). But if m.x could be rebound after module loading, we don't know what the type is. Now, I want to explain a bit about how Viperc is going to work. [At least, my current ideas .. which may change :-] Viperc begins by loading modules: it _executes_ the modules using the interpreter Viperi. There's no compilation here! But AFTER the modules are imported dynamically -- with the full power of all the dynamism of the interpreter -- the modules are 'frozen'. The compiler assumes that the modules cannot be tampered with. As a result, Viperc can do type inference on the mainline, since the type of _every_ module attribute is known, and indeed, the values are known too. Indeed, it can inline the functions, and other nice 'global analysis' -- but it is NOT working with the 'source code' or 'AST' for a module: it is working directly with built, run time, objects. Of course, none of this can work, if the mainline is allowed to modify the module contents after the modules are imported: the importing is done at compile time using the interpreter, the modules don't even _exist_ in the generated executable. [There is a more difficult case: freezing classes, and even instances. Perhaps more, later.] Ok, now I'm going to backstep. Assuming all imported modules are frozen after importing is overkill. I think -- not sure -- that we need bit more flexibility than that: the optimisation is massive, but it is useless if it kills sensible semantics. To see what 'sensible semantics' means, I will first consider where freezing is more or less mandatory: the answer is, when you have a threaded implementation or other re-entering code, then modifying global variables is a bad idea. But not everyone is doing that. For a start, it isn't always clear when importing is finished: interscript, for example imports modules 'on the fly' in response to user requests -- this is necessary, because some of the modules represent typesetters (LaTex, HTML etc), and the set of these is 'user installable'. Secondly, it is common to 'patch' functions in modules, just once, in other modules: for example MA-Lemburg's Tools do that. There is an obvious 'semantic equivalence' requirement here so code that doesn't need the extra functionality will still work, but there is still a use for dynamically changing module variables after modules are loaded. What can help here? I don't know. Some ideas include const MYCONSTANT = 1 with a specification that changing a 'const' value after initialisation is an error. For imports, import const MODULE says that changing any attribute of MODULE after importing IN THE CALLING MODULE is an error (this does not stop other importers changing it though .. :-( I'd sure like to see some 'pythonic' ideas here. LOOP VARIABLES -------------- Another case which inhibits optimisation is loop variables. In the loop: for x in y: .... is it allowed to assign to x? What about mutating y? What about mutating x? [Also, an aside: the code for x[1] in y: ... for x.attr in y: ... 
is allowed but I can't see a real use for it. Is there one? Could we simplify the syntax, and require the loop control to be a whole variable, or a tuple of whole variables (recursively), so that the names involved are always bound directly?

The idea with loop optimisation is that we can:

(1) keep the loop index in a register
(2) cache the sequence
(3) generate sequence values for 'range(..)' lazily

Tightening up for loops will break code that does things like:

(1) do extra increments on a loop variable to skip cases
(2) mutate a list while scanning it

but these seem to lead to newbie posts on c.l.p anyhow.

RESTRICTED SCOPE EXTENSION/RENAMING?
------------------------------------

At present it is sometimes necessary to destroy temporary variables like this:

    x = value
    ...
    del x

There are some examples in the standard library; this occurs when a temporary is created doing calculations in a module, but is not meant to be there to use after the module is imported. Another case occurs like this:

    _f = f                  # protect our f
    from MODULE import *    # might destroy f
    f = _f                  # set f back
    del _f                  # get rid of temporary _f

This is ugly. It also makes inference harder, because there is now a control flow issue. One idea I had was this:

    import X as Y       # import X, but name it Y in this module
    from M import x as v

A related idea is:

    import private X
    private x = 1
    private def f(qqq): ...

which makes these names visible in the module, so you can use unqualified lookup within the module, but so that

    MODULE.X   # fails, X is not visible via M
    MODULE.x   # fails
    MODULE.f   # fails

that is, so external clients cannot see X as an attribute of MODULE. This is actually easy to implement in Viper, since it uses a concept of 'environments' for unqualified lookup: a private name would be put into a special private dictionary which is looked up inside the module, but which is not part of the module dictionary. [The 'environment' looks up both dictionaries; qualified access via operator dot does not.]

Yet another idea is genuinely temporary variables:

    temporary x = 1

which are automatically destroyed at the end of module loading.

I note Python currently supports privacy by name mangling, but really, this is a hack: for Python 2, a more sophisticated architecture would be better.

------------

A final comment: it is useful to _implement_ ideas to test them out. It is damn hard to do that with CPython, because it is written in C, unless you have extensive familiarity with the source. Viper, on the other hand, is written in ocaml, and it is MUCH easier to play with extensions (and write compilers) in ocaml than in C. Viper is available for evaluation, if anyone wants to play with it, but you will need to know some ML. OTOH, I will be playing with some extensions anyhow, so if people here have ideas to try out, I might be able to implement them relatively easily. [For example: I extended the grammar to include list comprehensions in 15 minutes; I expect that implementing the semantics will take less than an hour. This will be available in the second alpha.]

One thing that is required, though, is a _concrete_ syntax. I can't implement an idea, even if the semantics are specified, if there is no syntax! I'm very keen to try 'optional type declarations', but I need a definite syntax, not just for the declaration (easy, but also easy to argue about for a long time), but also for the type (I think this is quite hard). [Remember Viper uses an extended type system: a type can be any object!]
-- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From tim_one@email.msn.com Sat Dec 18 01:51:31 1999 From: tim_one@email.msn.com (Tim Peters) Date: Fri, 17 Dec 1999 20:51:31 -0500 Subject: [Types-sig] doc-sig/types-sig clash? In-Reply-To: <14426.29270.837380.905106@dolphin.mojam.com> Message-ID: <001001bf48fa$68c618a0$32a2143f@tim> [Skip Montanaro] > ... > Just raising a small flag to make sure people don't assume the > doc string is their private sandbox. Don't worry, Skip: I'm acutely aware that docstrings are *my* private sandbox, and I won't let these fellows break doctest . determinedly y'rs - tim From tim_one@email.msn.com Sat Dec 18 02:11:00 1999 From: tim_one@email.msn.com (Tim Peters) Date: Fri, 17 Dec 1999 21:11:00 -0500 Subject: [Types-sig] A challenge In-Reply-To: <385A7495.D7A25EC4@appliedbiometrics.com> Message-ID: <001201bf48fd$210ba400$32a2143f@tim> [Christian Tismer] > ... > Isn't this in conflict with one of your earlier posts where you > wanted the same variable to take different types in sequence? > I found that example very clean. You assigned a dict's keys() > to the variable which held the dict. Is this idea gone? Not in *my* code it isn't, but I don't think a type system has to cater to every abuse I can come up with . Greg can handle this fine with expression-based type operators, but when I look at my own code I think name-based type declaration is overwhelmingly the less bothersome approach. This leaves me with several choices; at least: + Don't ask for static typing on code that "cheats" this way. + Declare "result" as a union type; e.g., decl typedef Set(_T) = {_T: Int} ... decl var result: Set(Int) | [Int] + Harass Guido to add a core dlict type . + Use Greg's form of dynamic cast (which appears to me to have real merit whether or not declaration stmts are introduced). + Kill any chance of adding declaration stmts by introducing a maze of bizarre new rules just to cater to line-by-line redeclaration. Note that the last is trickier than it may appear, because the crucial line: result = result.keys() uses result as an [Int] on the LHS but as a Set(Int) on the RHS. So it's wholly unnatural for name-based typing -- and that doesn't bother me a bit. >> how-do-we-declare-the-type-of-a-continuation?-ly y'rs - tim > > Let's see :-) > > PythonWin 1.5.42c1 (#0, Dec 15 1999, 01:48:37) [MSC 32 bit (Intel)] on > win32 > Copyright 1991-1995 Stichting Mathematisch Centrum, Amsterdam > Portions Copyright 1994-1999 Mark Hammond (MHammond@skippinet.com.au) > >>> import continuation > >>> co = continuation.caller() > >>> co > > >>> type(co) > > >>> co.__doc__ > "I am a continuation object, Deleting 'link' kills me." > >>> callable(co) > 1 > >>> > > I think the type of a continuation is Continuation. Hey -- makes *my* life easy . will-spend-the-rest-of-the-night-wondering-where-"1"- came-from-ly y'rs - tim From tim_one@email.msn.com Sat Dec 18 02:45:31 1999 From: tim_one@email.msn.com (Tim Peters) Date: Fri, 17 Dec 1999 21:45:31 -0500 Subject: [Types-sig] New syntax? In-Reply-To: <19991217184140.B12905@vet.uu.nl> Message-ID: <001301bf4901$f3a2f040$32a2143f@tim> [Martijn Faassen] > I think inline type declarations like def(Int, Int)->Int may not > be necessary if you allow typedefs. I like typedefs fine too, but couldn't make sense of a system in which a typedef was *essential* for spelling a concept. 
typedefs are traditionally shorthands for things that may be (at worst) clumsy to spell without them. > People often give the advice to avoid Lambdas in Python anyway; > why not avoid a lambda like construct in our type definition > language as well? Hmm. These have nothing in common with lambdas apart from having an argument list -- as do all functions and methods. Indeed, from the type expression def(Int) -> Int we have no clue whether it's *defined* via a lambda or def. And the declaration should not expose that, so all is well (strictly, the word "lambda" makes marginally more sense than "def" in the above, but I don't want to encourage a lambda mindset ). > typedef Footype(int, int): > return int > > var handlermap = {string: Footype} If I had a lot of binary integer functions to declare, I would probably use a typedef, a la decl typedef BinaryFunc(_T) = def(_T, _T) -> _T decl typedef BinaryIntFunc = BinaryFunc(Int) ... decl var intHandlerMap: {string: BinaryIntFunc} decl var floatHandlerMap: {string: BinaryFunc(Float)} etc. The "deep" problem I have with your "Pythonic" notations is that while Python excels at expressing imperative algorithms, type specification is a purely declarative task. Type *expressions* allow for a convenient, precise and concise calculus of type-specification "equations". As in the above example, the common parts of common patterns can be factored out and resued with ease. This is useful! You're not going to get the same level of expressiveness in an imperative-style Python syntax: it's the right tool for the wrong job. A type-expression sublanguage with one operator ("|") should suffice. [on varargs] > Me neither. Perhaps something like: > > decldef foo(first=int, second=string, *[int]): > return int > > i.e. all the extra arguments must be ints. Hmm! You and Greg both seem to think varargs get implemented as lists . > Note that I'm currently in the out-of-line camp with Paul. :) Aren't we all? This has been an intense week for the Types-SIG! The good news is that Paul must be taking most of it out on his family & not us . i-asked-sinterklaas-and-we're-*all*-getting-nice-presents-ly y'rs - tim From skaller@maxtal.com.au Sat Dec 18 02:56:09 1999 From: skaller@maxtal.com.au (skaller) Date: Sat, 18 Dec 1999 13:56:09 +1100 Subject: [Types-sig] List of FOO References: <000d01bf48f6$764ee5a0$32a2143f@tim> Message-ID: <385AF7C9.7CEE78E5@maxtal.com.au> Martin wrote, in reponse to Paul: >>1. This system is supposed to be extensible, right? So I could, for >> instance, define a binary tree module and have "binary trees of ints" >> and "binary trees of strings." How do I define the binary tree class and >> state that it is parameterizable? > >Good question; so far I only thought about making built in types (such >as list) parameterizable. One could however do something similar with >classes, though: Let me turn that around. In Viper, there are NO fixed names for any types: types are just any objects. This leads to the question Martin asked immediately. Before I proceed, another observation: in Viper, there is a class class PyListType: .... for lists, but it is planned to allow class PyListOf: ... The idea is that this is a parameterized type, and the instance is specified by constructing an object: PyListOfInt = PyListOf(PyIntType) The class PyListOf contains methods like 'append' which take THREE arguments, instead of two: the first argument is the type. 
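In plain Python (not Viper), the mechanism can be sketched roughly like this; all names here are invented for illustration:

    class PyListOf:
        def __init__(self, elemtype):
            self.elemtype = elemtype
        def append(self, lst, item):
            # 'self' carries the element type: effectively the extra first argument
            if not isinstance(item, self.elemtype):
                raise TypeError("expected %s, got %s" % (self.elemtype, type(item)))
            lst.append(item)

    PyListOfInt = PyListOf(int)        # the concrete type is an *instance*

    data = []
    PyListOfInt.append(data, 1)        # ok
    try:
        PyListOfInt.append(data, "x")  # rejected: wrong element type
    except TypeError as err:
        print(err)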
When an object of kind 'PyListOfInt' is the type object of some object x, then a call like x.append(1) ends up calling PyListOf.append(PyIntType, x, 1) which means it can check that 1 is of type PyIntType. In other words, PyListOf is a meta-type, which happens to be a class, and the instance type, PyListOfInt is an _instance_ of it. In this system, there is no provision for types like [Int] meaning a list of integers, instead, any object which has the type 'list of integers' actually has a type object 'ListOfInt' physically available. One of the points of this type system is that builtin types and user defined types are all the same: they all have type objects which provide methods. [That is, the type system is unified, although it is not the case classes are types, but instead that any object can be a type, and all types are objects] In this system, the client must CONSTRUCT a type as that type. Therefore [1,2,3] has type list, NOT type list of integer. I'm not saying this is good, but I am saying that two problems raised in this list disappear automatically: 1) complicated syntax, aluded to by Paul, cannot occur. Indeed, NO new syntax is required at all (to name types) There is ONLY one way to name a type: by refering to an object. Hence 'function taking list of X of Y returning .....' simply cannot occur: no new syntax for typing is used. Example: decl X: ListOfListOfTupleOfInt is possible, if the client defined that horrible name. You can't say: decl X: ListOf(IntType) because that is a different object to the type object of decl Y:ListOfInt(IntType) since class instance objects are compared by address. 2) Extension objects are not different to builtin ones. BOTH use the same idea, of having type objects associated with them, [In fact, there is a hack in Viper: builtin types' type object is found by an extra indirection, since the objects don't exist when the interpreter is first started] If I can summarise: there is considerable advantage using arbitrary objects as type objects: they can be specified using EXISTING python syntax, using the power of the EXISTING python interpreter, without needing a special, second class language, to complicate python, and pose an additional implementation overhead. In particular, the idea is that the type inference mechanism can _compile_ the type objects like any other, and therefore NO special handling is required for extension types. The ONLY types the inference engine needs to 'know' about are the builtin ones. I'm not sure if this will work :-) BTW: In Viper, extension 'modules' do NOT build any vtables or objects, they're just a table of named functions. No types! The 'types' are constructed in Python, usually with a class: class XType: mymethod = Xmymethod [at present, all 'extensions' have their functions loaded directly into _builtins_, which is something I will soon have to fix] -- One thing I think we DO need though, is a categorical sum. In ML notation: IntOrNone = Int | None Python does not support sum objects, so it isn't obvious how to represent type sums using objects. I'm working on this. In ML, this is done by: type Sum = X | Y of int | Z of float let sum = Y 1 in ... Python, like other languages, needs this construction, it is as fundamental as tuples (in fact, it is precisely the categorical dual of a tuple or struct) Dictionaries can be used for this, as can pairs (kind, value), so it can be represented, just not nicely. 
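A sketch of the (kind, value) representation just mentioned -- workable in today's Python, though as noted, not pretty; the constructor and variant names are invented for the example:

    def make_int(n):                  # variant constructors for Int | None
        return ('Int', n)
    def make_none():
        return ('None', None)

    def describe(sumval):
        kind, value = sumval          # "pattern match" by unpacking and testing the tag
        if kind == 'Int':
            return "an integer: %d" % value
        if kind == 'None':
            return "nothing"
        raise ValueError("unknown variant: %r" % (kind,))

    print(describe(make_int(3)))      # an integer: 3
    print(describe(make_none()))      # nothing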
[I'm still thinking on how to do this 'pythonically' ] -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From skaller@maxtal.com.au Sat Dec 18 03:05:21 1999 From: skaller@maxtal.com.au (skaller) Date: Sat, 18 Dec 1999 14:05:21 +1100 Subject: [Types-sig] Viper Type specification References: <000c01bf48f5$a60fbd60$32a2143f@tim> Message-ID: <385AF9F1.1108A094@maxtal.com.au> FYI: here is the Viper file py_types.vy which defines many Viper types. [But there is no reason ALL the types need be here: you can't see the regexp object here, nor sockets -- these are defined elsewhere] # This module is exclusive to Viper # It is a builtin module, defining classes for all # the Python types. # # Methods and attributes for objects can be found in the # class dictionary of the typing class, even when the object # is not a PyInstance of the class. import string import sys # datum types class PyNoneType: pass class PyIntType : def succ(x): return x + 1 def pred(x): return x - 1 class PyFloatType : pass class PyComplexType : pass class PyLongType : pass class PyRationalType : pass class PyStringType: pass # Sequence types class PyTupleType : pass class PyListType : append = list_append extend = list_extend count= list_count index = list_index insert = list_insert sort = list_sort class PyXrangeType: pass # Others class PyClassType: pass PyTypeType = PyClassType # an alias in Viper class PyInstanceType: pass # Viper doesn't currently support execution frames, so tb_frame is set to None class PyTracebackType: tb_frame = None class PyFunctionType: pass # Python function type class PyModuleType: pass # python module type class PyNativeFunctionType: pass # builtin function type class PyNativeMacroType: pass # a macro is an environment sensitive function (eg globals()) class PyBoundMethodType: pass # a bound method class PyDictionaryType: # dictionary items = dictionary_items clear = dictionary_clear copy = dictionary_copy has_key = dictionary_has_key keys = dictionary_keys update = dictionary_update values = dictionary_values get = dictionary_get class PyExpressionType: pass # partially evaluated expression (general expression type) class PyStatementType: pass # type of a code object class PyEnvironmentType: pass # an environment for unqualified name lookup class PyClosureType: pass # a pair consisting of an expression and an environment class PyThreadType: pass # type of a thread class PyInterpreterType: pass # type of a Viper interpreter object class PyLockType: # type of a mutual exclusion lock for threads def __init__(self): self.mutex = lock_create() def acquire(self, waitflag=None): return lock_acquire(self.mutex, waitflag) def release(self): lock_release(self.mutex) def locked(self): return lock_test(self.mutex) # ---------- GUI ----------------------------------------------------- class PyWidgetType: pass # widget class PyColorType: pass # color class PyFontType: pass # font class PyGraphicsContextType: pass # a graphics context class PyDrawableType: pass # something that can be drawn class PyCanvasType: pass # something we can draw on class PyImageType: pass # an external representation of a picture # ---------- FILES ----------------------------------------------------- # this is type of _native_ files class PyFileType: def close(f): file_close(f) def write(f,s): file_write(f,s) def flush(f): file_flush(f) # note: never raises an exception, does nothing at EOF def read(f,amt=None): try: if amt is None: 
s = "" while 1: b = file_read(f,8096) s = s + b if len(b) < 8096: break return s else: s= file_read(f,amt) return s except: print "EXCEPTION: IOERROR" # this is the class used for _client_ files # we use a class, to support easy subtyping class PyFileClass: def __init__(self): self.buffer = "" def read(self,amt=None): return self.native_file.read(amt) def write(self,s): return self.native_file.write(s) def close(self): self.closed = 1 self.native_file.close() # note: returns '' on end of file (no exception raised) def readline(self): eolpos = string.find(self.buffer, "\n") while eolpos == -1: n = len(self.buffer) data = self.read(1024) self.buffer = self.buffer + data eolpos = string.find(self.buffer, "\n", n) if len(data) == 0: break if eolpos == -1: eolpos = len(self.buffer)-1 line = self.buffer[0:eolpos+1] # include the eol self.buffer = self.buffer[eolpos+1:] return line def readlines(self, hint=None): data = self.native_file.read() return string.split(data,'\n') # this function opens a file def open(filename, mode="r"): try: native_file = file_open(filename, mode) python_file = PyFileClass() python_file.native_file = native_file python_file.name = filename python_file.mode = mode python_file.closed = 0 return python_file except OSError, object: exc = IOError(object.errno, object.strerror, filename) raise exc def make_file_object(native_file, filename, mode): python_file = PyFileClass() python_file.native_file = native_file python_file.filename = filename python_file.mode = mode python_file.closed = 0 return python_file # these functions return _native_ files, not client ones! def get_native_stdin(): return file_get_std_files()[0] def get_native_stdout(): return file_get_std_files()[1] def get_native_stderr(): return file_get_std_files()[2] # these functions return _client_ files! def get_client_stdin(): return make_file_object(get_native_stdin(),"stdin","r") def get_client_stdout(): return make_file_object(get_native_stdout(),"stdout","w") def get_client_stderr(): return make_file_object(get_native_stderr(),"stderr","w") # this is a hack! def set_std_files(): sys.stdin = sys.__stdin__ = get_client_stdin() sys.stdout = sys.__stdout__ = get_client_stdout() sys.stderr = sys.__stderr__ = get_client_stderr() # this is a sucky hack! def type(x): typename = getattr(x,"__typename__") return eval (typename) -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From skaller@maxtal.com.au Sat Dec 18 04:08:14 1999 From: skaller@maxtal.com.au (skaller) Date: Sat, 18 Dec 1999 15:08:14 +1100 Subject: [Types-sig] type declaration syntax Message-ID: <385B08AE.A491CD36@maxtal.com.au> On syntax: for functions, the use of ":" for parameters seems 'natural' to me: def f(a:Int, y:Float=0.0) as it is used in pascal and ML. But the return type is a problem. Suggestions include def f() -> Float and def f: Float () and I'll add using -> in the list: def f(a->Int, y->Float)-> Float: .. but here is another idea: don't bother. The reason is: local type inference needs to know the parameter types, and these are needed for call checking. But the _return_ type doesn't need to be annotated as much. Why? Because the inferencer can usually deduce it: it's an output, the argument types are inputs. If the inferencer _cannot_ deduce the return type, it _also_ cannot check that the function is returning the correct type. 
It is true that knowing the return type can help inferencing, and it is true it is needed for inferencing at the point of call, although in this case the deduced type is (may be) still available. Only an idea .. but "when in doubt, don't", is a good rule for language design :-) One problem with :, that is probably a killer: it cannot work with lambdas: lambda x:Int, y: woops [I'm not saying if this will kill ":" or lambda though :-] -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From gstein@lyra.org Sat Dec 18 04:27:58 1999 From: gstein@lyra.org (Greg Stein) Date: Fri, 17 Dec 1999 20:27:58 -0800 (PST) Subject: [Types-sig] type declaration syntax In-Reply-To: <385B08AE.A491CD36@maxtal.com.au> Message-ID: On Sat, 18 Dec 1999, skaller wrote: >... > But the _return_ type doesn't need to be annotated as much. > Why? Because the inferencer can usually deduce it: > it's an output, the argument types are inputs. Users of the function need the return type. The inferencer won't be global -- it isn't going to look at the function to determine the return type. In order to skip that requirement, we annotate the return type. The caller then simply assumes the return type is correct. When compiling the function in question, the compiler can verify that the declared return type is truly what the function will return. >... > One problem with :, that is probably a killer: it cannot > work with lambdas: > > lambda x:Int, y: woops Good point. I'll need to update my page with this issue. > [I'm not saying if this will kill ":" or lambda though :-] Heh. I would simply state that lambda cannot be annotated. If people want the annotation, then they should use "real" functions. I know that would please Guido's desire to deprecate lambda :-) Cheers, -g -- Greg Stein, http://www.lyra.org/ From skaller@maxtal.com.au Sat Dec 18 04:30:21 1999 From: skaller@maxtal.com.au (skaller) Date: Sat, 18 Dec 1999 15:30:21 +1100 Subject: [Types-sig] Re: [Types Sig] Progress References: <001201bf48fd$210ba400$32a2143f@tim> Message-ID: <385B0DDD.2C5473A2@maxtal.com.au> Paul wrote: >>>> 1. Most people seem to agree with the idea that shadow files allow us a nice way to separate type assertions out so that their syntax can vary. I think Greg disagreed but perhaps not violently enough to argue about it. Interface files are in. Inline syntax is temporarily out. Syntactic "details" to be worked out. <<<< I'm more interested in the inline syntax. Reason: it is easy to modify the Viper grammar to allow it. It is much harder to build a completely new translater for a new 'type' language, and, this will not sit well with Viper's "any object can be a type". I also dislike maintaining separate interface files. [i'm not against this, just stating something I dislike about it] However, here is an idea for interface files: an interface file is an ORDINARY python file. No special stuff. Instead, a new keyword: 'defered'. For example: def f(a:int, b:long): defered 'defered' has the same semantics as 'pass', but it means 'we'll define this function later'. The important thing, then, is that the interface file has a different extension, so that a compiler can get the type information, without building the actual module, and it can match the interface against the actual module. But 'defered' can be used anywhere. >>>>> 2. Everybody but me is comfortable with defining genericity/templating/parameterization only for built-in types for now. 
But now that we are separating interfaces from implementations I am thinking that I may be able to think more clearly about parameterizability. It may be possible to define parameterizable interfaces by IPC8. Parameterization is in. Syntactic "details" to be worked out. <<<<<<<< I agree: parameterisation is important. But I don't think the usual notions used by static languages will work so well in Python. Before proceeding, please consider how Viper is supposed to do this. It's real easy to implement, and it obviates the need for any special new syntax. >>>>>>>>>>> 3. We agree that we need a syntax for asserting the types of expressions at runtime. Greg proposes ! but says he is flexible on the issue. The original RFC spelled this as: has_type( foo, types.StringType ) which returns (in this case) a string or NULL. This strikes me as more flexible than ! because you can use it in an assertion but you don't have to. <<<<<<<<<<< I don't think we agree on this: Guido says that assertions are good enough. I wouldn't argue. >>>>>>>>>>> 4. The Python misfeature that modules are externally writable by default is gone. Only Guido has expressed an opinion on whether they should be writeable at all. His opinion is no. <<<<<<<<<<< I would like this. however a point: we can always write to a class instance attribute instead. And this is just deferring the real problem. Another point: if 'defered' is accepted, it could be OK to write ONCE to a defered variable (and an error to use one that had not been written). >>>>>>>>>>>>>> 5. It isn't clear WHAT we can specify in "PyDL" interface files. Clearly we can define function, class/interface and method interfaces. <<<<<<<<<< Yes it is: if we have them, we have to be able to specify EVERYTHING. :-) -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From skaller@maxtal.com.au Sat Dec 18 04:34:43 1999 From: skaller@maxtal.com.au (skaller) Date: Sat, 18 Dec 1999 15:34:43 +1100 Subject: [Types-sig] type declaration syntax References: Message-ID: <385B0EE3.D05A4945@maxtal.com.au> Greg Stein wrote: > > On Sat, 18 Dec 1999, skaller wrote: > >... > > But the _return_ type doesn't need to be annotated as much. > > Why? Because the inferencer can usually deduce it: > > it's an output, the argument types are inputs. > > Users of the function need the return type. The inferencer won't be > global -- it isn't going to look at the function to determine the return > type. Viperc _will_ use a global inferencer. Please don't assume "python" means CPython. There are two other full scale implementations now. There may be more in the future. And there may be other programs -- not full interpreters or compilers, like PyLint -- which will _use_ the information. > > [I'm not saying if this will kill ":" or lambda though :-] > > Heh. I would simply state that lambda cannot be annotated. OK, agreed. -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From gstein@lyra.org Sat Dec 18 04:40:59 1999 From: gstein@lyra.org (Greg Stein) Date: Fri, 17 Dec 1999 20:40:59 -0800 (PST) Subject: [Types-sig] type declaration syntax In-Reply-To: <385B0EE3.D05A4945@maxtal.com.au> Message-ID: On Sat, 18 Dec 1999, skaller wrote: > Greg Stein wrote: > > On Sat, 18 Dec 1999, skaller wrote: > > >... > > > But the _return_ type doesn't need to be annotated as much. > > > Why? 
Because the inferencer can usually deduce it: > > > it's an output, the argument types are inputs. > > > > Users of the function need the return type. The inferencer won't be > > global -- it isn't going to look at the function to determine the return > > type. > > Viperc _will_ use a global inferencer. > Please don't assume "python" means CPython. There are two other > full scale implementations now. There may be more in the future. > And there may be other programs -- not full interpreters or > compilers, like PyLint -- which will _use_ the information. But I am talking about CPython. Do what you want with Viper, but I'm concerned with the core/authoritative distribution. I do not believe that will have a global inferencer. Sure, maybe it will one day, but my proposal assumes "no". Cheers, -g -- Greg Stein, http://www.lyra.org/ From skaller@maxtal.com.au Sat Dec 18 04:44:09 1999 From: skaller@maxtal.com.au (skaller) Date: Sat, 18 Dec 1999 15:44:09 +1100 Subject: [Types-sig] type declaration syntax References: Message-ID: <385B1119.D036BB9F@maxtal.com.au> Paul wrote: >Martijn Faassen wrote: >> >> * we don't have to debate about syntax anymore and can actually think >> about semantics without syntax confusion. > >Clean syntax helps comprehension. I don't agree, but this time it is because I think you have _understated_ the issue. I think that the syntax is just about the ONLY issue here: what 'semantics' is there to debate? The way I see it, we need a way to declare something is type T, which is a syntax issue. And then we can argue about waht "T" can be, which is, more or less, also a syntax issue. Now, if we debate this, we will find we're getting into the details of the type model, which is not a syntactic issue, but it, well, is 'rendered' in syntax all the same. For example, Viper is using a particular type model which is minor extension of CPython 1.5's own model, which leads to a particular syntax: a python expression denoting an object is what "T" is, rather than some new, invented, syntax (like Tim Peters ML/Haskell like one). In other words, I think we SHOULD focus on the syntax, because it is the representation of the ideas we have, and the one programmers will be using. -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From gstein@lyra.org Sat Dec 18 04:44:53 1999 From: gstein@lyra.org (Greg Stein) Date: Fri, 17 Dec 1999 20:44:53 -0800 (PST) Subject: [Types-sig] Re: [Types Sig] Progress In-Reply-To: <385B0DDD.2C5473A2@maxtal.com.au> Message-ID: On Sat, 18 Dec 1999, skaller wrote: >... > >>>>>>>>>>> > 3. We agree that we need a syntax for asserting the types of expressions > at runtime. Greg proposes ! but says he is flexible on the issue. The > original RFC spelled this as: has_type( foo, types.StringType ) which > returns (in this case) a string or NULL. This strikes me as more > flexible than ! because you can use it in an assertion but you don't > have to. > <<<<<<<<<<< > > I don't think we agree on this: Guido says that assertions > are good enough. I wouldn't argue. The '!' operator is much more than just a new name for "assert". It can assist the compiler in determining the type of an expression value, which leads to the ability to type check and/or optimize. In other words, I believe Guido is wrong (heresy!) -- assertions are not good enough. 
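The '!' operator is proposed syntax and does not run today; a rough runtime stand-in is a check-and-return helper (hypothetical, invented here), which shows why it is more than assert: it yields the value, so it can sit inside an expression.

    def typecheck(value, T):
        # check-and-return: usable wherever an expression is expected
        if not isinstance(value, T):
            raise TypeError("expected %s, got %s" % (T, type(value)))
        return value

    def lookup(table, key):
        # roughly what "return table[key] ! Int" would express
        return typecheck(table[key], int)

    print(lookup({'a': 1}, 'a'))      # 1

A bare assert statement cannot appear in the middle of an expression, and it gives the compiler no hint about the expression's type; the checked form does both.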
Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Sat Dec 18 04:55:12 1999 From: gstein@lyra.org (Greg Stein) Date: Fri, 17 Dec 1999 20:55:12 -0800 (PST) Subject: [Types-sig] New syntax? In-Reply-To: <000b01bf48f1$572dc9c0$32a2143f@tim> Message-ID: On Fri, 17 Dec 1999, Tim Peters wrote: > [Greg Stein] > > ... > > I guess that does mean that something like: > > > > decl a: def(Int)->None > > > > would be possible. e.g. is a member holding a ref to a function > > object. > > If it weren't possible, it would be quite a hole in the type description > mechanism! hehe :-) This syntax is part of my current proposal. I definitely agree it is a requirement to be able to specify functional types. >... > today. BTW, I wouldn't object to requiring that the class/member > distinction be explicit. > > decl class a: ... > decl member a: ... > > If "decl" gets used for more stuff down the road, it could be a real help to > make the syntax explicit from the start: > > ofwhat : 'class' | 'member' | 'var' | 'type' | 'frozen' | ... > decl-stmt : 'decl' ofwhat This seems entirely reasonable to me. Let's see what Mr. Consensus says. > > In fact, these two are probably equivalent: > > > > decl class a: def(Int)->None > > def a(Int x)->None: > > WRT type, yes, but (of course!) the former is merely a declaration while the > latter is the initial stmt of a definition. Correct. I forgot to mention that and noticed the lack later when I read that email. No worries... you won't let me get away with being a slacker... :-) >... > Yes, Any is good. I've listed this in my proposal as an open question. I'm leaning to "formally endorsing" it. My only real opposition is whether it must be a new keyword, or we can find some other way to deal with it. For example: import types Int = types.IntType String = types.StringType Any = None decl foo: Any decl bar: String The compiler isn't going to have recognized names for the types. I think it will be using data flow to figure that out (and maybe some builtin knowledge of the type() builtin and the types module). If the compiler determines that a particular dotted_name leads to the value None (whereas it typically refers to a PyTypeObject, a class object, or a typedecl object), then it says "oh. that is the 'any' construct". This also leads quite naturally to the following: def foo(bar): ... In this case, all the type annotations are not specified -- they are None. Implicitly, that means "any". Damn, I'm smooth. ;-) > The problem with * and ** is that people are going to want to express > restrictions, like "only strings from here on in" or "all the keyword args > must be of int type". Under the theory that things work well if you just > don't think about them , > > decl c: def(Int, *: (String), **: {String: Int})->Int Yah... this has been covered. No problem. Funny note: looking at the grammar, I've found the following is legal: def foo(bar, *args, * *kw): ... In my typedecl syntax, I punted the ability to use "* *" ... you must use "**". So there :-) > > ... > > I'm not sure whether to go for practical or pure. > > I'm leaning toward the "always explicit" above. Restrictions can always be > loosened later if they prove too confining, but tightening a permissive spec > is usually impossible. Yup. Quite a reasonable argument. 
Cheers, -g -- Greg Stein, http://www.lyra.org/ From skaller@maxtal.com.au Sat Dec 18 04:58:14 1999 From: skaller@maxtal.com.au (skaller) Date: Sat, 18 Dec 1999 15:58:14 +1100 Subject: [Types-sig] type declaration syntax References: Message-ID: <385B1466.EA5E1F6B@maxtal.com.au> Greg Stein wrote: > > Viperc _will_ use a global inferencer. > > Please don't assume "python" means CPython. There are two other > > full scale implementations now. There may be more in the future. > > And there may be other programs -- not full interpreters or > > compilers, like PyLint -- which will _use_ the information. > > But I am talking about CPython. Do what you want with Viper, but I'm > concerned with the core/authoritative distribution. I do not believe that > will have a global inferencer. Sure, maybe it will one day, but my > proposal assumes "no". It is possible that Viperc will generate C code for CPython. In fact, it seems likely. It may be a third part tool, written in ocaml rather than C, and so not part of the 'core' distribution, but is a LOT more likely to work than anything that will ever make it into the core distribution for the simple reasons that it is written in a language suitable for the task, unlike C, and it is already under development. In fact, IMHO, even Java is a LOT more suitable for doing this than C will ever be. Perhaps a C version can be written AFTER a proof of principle version is got working in a high level language. Now, I'd love to be proven wrong, and find a real Python compiler in the next major distribution, so my Interscript program actually becomes useful. But I'm not going to hold my breath, and I guess that the 'small change left over from DARPA funding' Guido mentions will not fund a compiler -- indeed, I doubt the WHOLE of the funding provided would be enough, if it is going to be written in C. -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From gstein@lyra.org Sat Dec 18 05:03:17 1999 From: gstein@lyra.org (Greg Stein) Date: Fri, 17 Dec 1999 21:03:17 -0800 (PST) Subject: [Types-sig] New syntax? In-Reply-To: <001301bf4901$f3a2f040$32a2143f@tim> Message-ID: On Fri, 17 Dec 1999, Tim Peters wrote: >... > If I had a lot of binary integer functions to declare, I would probably use > a typedef, a la > > decl typedef BinaryFunc(_T) = def(_T, _T) -> _T > decl typedef BinaryIntFunc = BinaryFunc(Int) > ... > decl var intHandlerMap: {string: BinaryIntFunc} > decl var floatHandlerMap: {string: BinaryFunc(Float)} Okay, Tim. I'm going to stop you right here :-) The problem with using "decl" to do typedefs is that it does weird voodoo to associate the typedecl with the name (e.g. BinaryFunc). I believe my unary operator is much clearer to what is happening: BinaryIntFunc = typedef BinaryFunc(Int) In this case, it is (IMO) very clear that you are storing a typedecl object into BinaryIntFunc, for later use. For example, we might see the following code: import types Int = types.IntType List = types.ListType IntList = typedef [Int] ... Hrm. I don't have a ready answer for your first typedef, though. That is a new construct that we haven't seen yet. We've been talking about parameterizing *classes*, rather than typedecls. *ponder* >... > You're not going to get the same level of expressiveness in an > imperative-style Python syntax: it's the right tool for the wrong job. A > type-expression sublanguage with one operator ("|") should suffice. "or" is more Pythonic. 
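A sketch of such a type-expression calculus built from ordinary objects: union is the one operator, and parameterization is just a call. All names here are invented, not part of any proposal on the table.

    class TypeExpr:
        def __or__(self, other):               # the single union operator
            return Union(self, other)

    class Atom(TypeExpr):
        def __init__(self, pytype):
            self.pytype = pytype
        def check(self, value):
            return isinstance(value, self.pytype)

    class Union(TypeExpr):
        def __init__(self, left, right):
            self.left, self.right = left, right
        def check(self, value):
            return self.left.check(value) or self.right.check(value)

    Int, String = Atom(int), Atom(str)
    IntOrString = Int | String                 # a reusable type expression
    print(IntOrString.check(3))                # True
    print(IntOrString.check(3.0))              # False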
> [on varargs] > > Me neither. Perhaps something like: > > > > decldef foo(first=int, second=string, *[int]): > > return int > > > > i.e. all the extra arguments must be ints. > > Hmm! You and Greg both seem to think varargs get implemented as lists > . Bite me. :-) You do raise a good point in another post, however: def foo(*args: (Int)): Looks awfully funny. For a Python programmer, that looks like grouping rather than a tuple. If it had a comma in there, then it would look like a tuple. But of course: there will never be more than one typedecl inside there, so whythehell is there a comma? *grumble* .... I don't have a handy resolution for this one. Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Sat Dec 18 05:06:27 1999 From: gstein@lyra.org (Greg Stein) Date: Fri, 17 Dec 1999 21:06:27 -0800 (PST) Subject: [Types-sig] type declaration syntax In-Reply-To: <385B1466.EA5E1F6B@maxtal.com.au> Message-ID: On Sat, 18 Dec 1999, skaller wrote: > Greg Stein wrote: > ... me talking about the core distro's inferencer ... > > It is possible that Viperc will generate C code for CPython. > In fact, it seems likely. It may be a third part tool, written in > ocaml rather than C, and so not part of the 'core' distribution, > but is a LOT more likely to work than anything that will ever > make it into the core distribution for the simple reasons > that it is written in a language suitable for the task, > unlike C, and it is already under development. > > In fact, IMHO, even Java is a LOT more suitable > for doing this than C will ever be. Perhaps a C version > can be written AFTER a proof of principle version is got working > in a high level language. > > Now, I'd love to be proven wrong, and find a > real Python compiler in the next major distribution, > so my Interscript program actually becomes useful. > But I'm not going to hold my breath, and I guess that the > 'small change left over from DARPA funding' Guido mentions > will not fund a compiler -- indeed, I doubt the WHOLE > of the funding provided would be enough, if it is going > to be written in C. Nobody has ever suggested writing the bugger in C. My assumption is that it will be written in Python. A second assumption is that it will always remain as a lint-like tool rather than integrated into the core compiler. Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Sat Dec 18 08:43:45 1999 From: gstein@lyra.org (Greg Stein) Date: Sat, 18 Dec 1999 00:43:45 -0800 (PST) Subject: [Types-sig] Type Inference I In-Reply-To: <385ACE5C.17CD5684@maxtal.com.au> Message-ID: Whatever you want to call it: inference or deduction or type analysis. I think we will be doing "bottom up type analysis" to use your phrasing. I don't think we need any top-down inferencing at all. In a function, we get our initial type inputs from the arguments and from function return values. With those, we compute the types of each item. Those are the types we pass to functions or return from the function. I still don't see your point. You've gone on at length about exceptions, semantics, and what kinds of inference can or can't be done. What is the point? At the end of this you say something about adding "static-y" stuff to Python? What do you mean? Honestly, the previous email and this one just seems to be a lot of gobbledygook. Long words, short on applicable, useful content. Again: call me small-minded, but I think you're being overly obtuse. Comments below... 
On Sat, 18 Dec 1999, skaller wrote: > Greg Stein wrote: > > > On Sat, 18 Dec 1999, skaller wrote: > > > ... long post about exceptions and semantic definition ... > > > > Sorry John, call me dense, but I really don't see what you're talking > > about. :-( > > It takes a while to understand the impact of conformance > and specifications on semantics .. and that this is not just a > matter of language lawyering, but a real, pragmatic, issue. I'm not an idiot. If it takes me a while, then it is going to take everybody a while. Phrase your discussion so that you're actually saying something, rather than speaking so much in the abstract. You bring up points about boundary cases and how they throw exceptions: great, but nobody cares about those boundary cases (I'm never going to feed my .emacs file into Python). >... > > I don't see a problem with exceptions. That is part of Python. I don't see > > that it causes any problems with type inference, either (it just > > introduces interesting items into the control/data flow graph). > > The problem arises roughly as follows: > type inference works by examining an expression like: > > x + 1 > > and _deducing_ that x MUST (**) be an integer. It cannot > be a file, because it isn't allowed to add a file to an integer. > But in Python you CAN add a file to an integer. It is perfectly > legal, it just throws an exception. You can't deduce/infer anything from x+1. x could be a class instance, in which case you're totally screwed. Otherwise, it could be any numeric type. But even then: as you point out, it could be a string. This is entirely the wrong direction. We aren't trying to figure out what x *should* be. We're trying to say "x is . will it cause an error?" > Do you see? This means we cannot deduce ANYTHING about 'x' > in the example snippet given above. > > Of course, the _expression_ x+1 can only be an integer, > we _can_ deduce that. But that isn't enough. Python You can't deduce that at all. class foo: def __add__(self, value): return "hello" > is too dynamic. We need more constraints to be able > to do effective inference. Not at all. As I mentioned: we'll be doing bottoms-up. We don't need constraints: we just need some type annotations on the input values (e.g. arguments and return values). > (**) This example ignores class instances with __add__ methods, > to make the argument easier to follow. You argument is easy to follow, but I don't see *why* you're making the argument. I don't care what x is from "x+1". I know what x is from where it got assigned a value. > > This whole tangent about feeding an email to Python and claiming it is a > > valid Python program with defined semantics (raise SyntaxError). I > > understand your explanation, but I totally miss the point. So what? > > See above. We cannot infer anything, unless there are rules. > That is, there MUST be set of permitted signatures for > functions/operators, > in order to do inference at all. > > It is possible to do synthetic (bottom up) type analysis, such as: > > x = 1 + 1 > > Here, we know that Int + Int -> Int, and so x (at least at this > point in the program) must be an Int. But that is only > the 'deductive' part of inference, the 'inferential' part > infers the types of _arguments_ from the set of allowable > signatures of functions. That is, we must do the inference > top down, not just bottom up. No. We're only going to do bottom up, as far as I know. Nobody has even ventured a request to have any kind of top-down inference. 
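A minimal sketch of that bottom-up ("synthesized") analysis: types flow from the leaves of an expression upward, seeded by the declared argument types, and nothing flows back down. The node encoding is invented for the example.

    def infer(node, env):
        kind = node[0]
        if kind == 'const':
            return type(node[1])              # a literal has its own type
        if kind == 'name':
            return env[node[1]]               # declared/known type of the name
        if kind == 'add':
            left = infer(node[1], env)
            right = infer(node[2], env)
            if left is int and right is int:
                return int                    # Int + Int -> Int
            raise TypeError("cannot add %s and %s" % (left, right))
        raise ValueError("unknown node %r" % (kind,))

    # with x declared as Int, "x + 1" synthesizes Int
    print(infer(('add', ('name', 'x'), ('const', 1)), {'x': int}))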
In fact, most people don't want anything beyond simple statement-level inference (achievable by declaring the types of all names used). I'd rather see no local declarations because we can infer/deduce the types of all local names. > > Type inferencing for the "1 + file" case is easy. You know the two types, > > and you know they can't be added. Bam. Error. > > but you're wrong, the result of applying the addition > operation is, in fact, well defined: it is NOT an error in the program, > it just throws an exception, rather than returning a value. > If you throw an exception deliberately, that is hardly an error, is it? All right. Now you're just being silly. The entire purpose of this discussion is to local those exceptions at compile time. THAT IS THE PURPOSE HERE. By definition, we are saying it is wrong. Argue semantics all you want about what is correct or not, but raising an exception is exactly what we want to avoid. We want to know about it before we run the program. > > It was a long email, but what exactly were you trying to say? "Define the > > semantics" isn't very clear. I feel Python has very clear semantics. What > > exactly is wrong with them? > > There is no distinction made between 'incorrect' code, > and 'correct' code for which an exception is thrown. What?! Of course there is a distinction. We want to filter out the incorrect code (i.e. that which uses types incorrectly and would throw errors at runtime). > In compiled code, we need the distinction, because there > is a lot of overhead in doing the dynamic type checking required > to throw the exception. The whole point of compilation is to > eliminate the overhead of run time type checking. In Python, the whole point of compilation is to transform source code into something that the PVM can execute. Done. >... more stuff ... > Hope this makes sense: to compile python code effectively, > we need to add some reasonable 'static-y' restrictions. > Where, 'reasonable' means 'suitably pythonic', but not > quite as dynamic as the current CPython 1.5.2 implementation > allows. No, it doesn't make sense. I see that we can do this with declarations and without the need for restrictions. I'm getting the feeling that you are trying to solve an entirely different problem from what we've been discussing over the past week. Your discussions about what is correct and incorrect just doesn't seem to have any basis in the problem being worked on. We want to detect incorrect code before runtime, where "incorrect" is defined as throwing an (unexpected) exception. And it is actually pretty easy to tell whether something is expected or not: did the developer put in a try/catch? Your discussion seems to saying something about removing exceptions. But honestly, I really don't know what you're advocating. I'm sorry, but I'm obviously a little bit tweaked. As Guido said, maybe too much sugar lately :-). More likely, not enough sleep. How about if you write a short email with a concrete suggestion for a change? That may help to define what exactly you're suggesting should happen. All this background "theory" is just noise to me. Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Sat Dec 18 08:51:44 1999 From: gstein@lyra.org (Greg Stein) Date: Sat, 18 Dec 1999 00:51:44 -0800 (PST) Subject: [Types-sig] Viper Type specification In-Reply-To: <385AF9F1.1108A094@maxtal.com.au> Message-ID: On Sat, 18 Dec 1999, skaller wrote: > FYI: here is the Viper file py_types.vy which defines > many Viper types. 
[But there is no reason ALL the types > need be here: you can't see the regexp object here, > nor sockets -- these are defined elsewhere] >... I don't understand the relevancy of this to the types-sig and our recent discussions about adding static typing to Python. ?? -g -- Greg Stein, http://www.lyra.org/ From tismer@appliedbiometrics.com Sat Dec 18 13:57:35 1999 From: tismer@appliedbiometrics.com (Christian Tismer) Date: Sat, 18 Dec 1999 14:57:35 +0100 Subject: [Types-sig] Viper Type specification References: Message-ID: <385B92CF.69937EF6@appliedbiometrics.com> Greg Stein wrote: > > On Sat, 18 Dec 1999, skaller wrote: > > FYI: here is the Viper file py_types.vy which defines > > many Viper types. [But there is no reason ALL the types > > need be here: you can't see the regexp object here, > > nor sockets -- these are defined elsewhere] > >... > > I don't understand the relevancy of this to the types-sig and our recent > discussions about adding static typing to Python. John is telling us his truth, and we have to learn. This is no discussion but a lecture. Look into class PyFileType. My lesson was that I have to learn that 8096 is a power of two :-) enlighted-ly - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From skaller@maxtal.com.au Sat Dec 18 14:07:35 1999 From: skaller@maxtal.com.au (skaller) Date: Sun, 19 Dec 1999 01:07:35 +1100 Subject: [Types-sig] Viper Type specification References: Message-ID: <385B9527.53DDB2F8@maxtal.com.au> Greg Stein wrote: > On Sat, 18 Dec 1999, skaller wrote: > > FYI: here is the Viper file py_types.vy which defines > > many Viper types. > I don't understand the relevancy of this to the types-sig and our recent > discussions about adding static typing to Python. The types-SIG has discussed a 'language' for types, correct? For example, Tim Peters demonstrated a Haskell like syntax. One property of that syntax is that it allows complex specifications, including generics. Correct? Well, I'm showing another way to do it. The file I posted IS the type specification for Viper 2. With this mechanism, there is no need for any 'language' to describe types, the only 'language' permitted or required is 'python expression denoting an object'. This meets the some of the stated requirements of the SIG, and Guido, better than any other language describing types, in particular the 'no new stuff' requirement -- since it is clearly done entirely IN python. I posted the file, mainly for interest, so people could see what type specifications IN python would look like: this file is the one actually used in Viper, it isn't a 'demo', but the real thing. I'm not saying this is 'the' solution, but it ought to be considered because it simultaneously provides a powerful typing model, which is based on the existing model with only a small generalisation, seems easy to implement, and also solves the problem of how to name types. I hope it is clear now, why it is relevant. BTW: w.r.t expr!type, your (Greg's) proposal, what precedence would your give operator ! ? 
-- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From skaller@maxtal.com.au Sat Dec 18 16:04:16 1999 From: skaller@maxtal.com.au (skaller) Date: Sun, 19 Dec 1999 03:04:16 +1100 Subject: [Types-sig] Type Inference I References: Message-ID: <385BB080.A2DBB67C@maxtal.com.au> Greg Stein wrote: []

OK Greg, let's see where we agree and what we understand. First, interpreted Python is too damn slow for some applications. Also, errors sometimes get reported later than we'd like. We'd both like to:

(OPT) be able to translate Python sources into C which runs faster than interpreted Python
(ERR) find errors in a program before running it

I hope we agree on these points so far. Now, here is something I believe, mainly from comments made at various times by Guido, Tim, and others: people have tried compiling Python before, and found that the resulting C code didn't run much faster than the interpreter. That's mainly because these compilers didn't know anything about the types; they just generated API calls corresponding to what the byte code interpreter would execute -- and the interpreter is pretty fast already.

So the question arises: how well can we do if we know what the types of things are, or at least some of them? I am going to assume, without evidence, that we can do better, and I'm going to assume you agree. So the next question is: how can we find out the types? Now, I am going to TELL you that there is some evidence we can in fact do surprisingly well without changing Python one little bit. We can do better if we analyse a whole program. We can do better if we make some assumptions. I'd like you to accept this without argument, because I cannot prove it: I've only done a Mickey Mouse experiment, but JimH has done a less Mickey one, I'm told.

So the question becomes: what is required to make the type inference we can do WITHOUT changing Python one little bit even better, if we make some changes? So, when you say 'we' are not going to do this or that kind of inference, you are missing the point. I surely AM going to. Others certainly WILL be. This is how it is done. When you are writing a compiler, you use every bit of information you can to make it go faster. So the point is how to give the compiler more information, while minimising the impact on 'python' (you can read that as 'keeping it pythonic' if you like).

Now, my point, in Type Inference I and II, is that static type declarations are only ONE way of providing more information, and they are not even the most important one. In fact, type inference is hampered by quite a lot of other factors. I'm sure you will agree on some. For example, I'm sure you understand how 'freezing' a module, or banning rebinding of variables after importing, or disallowing module.attribute = xxx will help type inference: you clearly understand that Python is very dynamic, which makes static analysis difficult. Right? So what I am trying to do is list all the things OTHER than adding optional type declarations which might contribute to our agreed-upon aim, namely, to provide a compiler with enough information to generate fast code.
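A concrete illustration of that last point (function and lambda invented for the example): as long as a caller may rebind a module attribute, the compiler cannot know which function a qualified call names, so it can neither inline nor specialize it.

    import math

    def hypotenuse(a, b):
        return math.sqrt(a * a + b * b)   # which sqrt? only known at run time

    print(hypotenuse(3, 4))               # 5.0

    math.sqrt = lambda x: -1.0            # legal today; changes every caller
    print(hypotenuse(3, 4))               # -1.0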
I think it is clear this is in the scope of the SIG's charter, for example, there seems to be a consensus that module.attribute = xxx is going to be disallowed -- because if it isn't, sophisticated global control flow analysis is required to even be sure _which_ function is being called at some point in the program. Tim Peters said that the standard algoithm for that might not even terminate. Clearly, this is a problem for a compiler :-) Now I want to go back to the original example I gave, and I want you to accept, temporarily, that we have only THREE types: integers, strings, and files. And assume that a function 'add(x,y)' exists, which throws an exception if the types of x and y are not both integers, or both strings. I want you to accept, that given a function call: add(x,1) that deducing that x is an integer is useful to a type inferencer, IF it can be done. The question is: can it be done? And the answer is: it depends on the DOCUMENTED SPECIFICATION OF THE FUNCTION. Consider two cases: 1) The spec says: IF the arguments are both ints .. OR IF the arguments are both strings .. OTHERWISE an exception is thrown 2) The spec says: IF the arguments are both ints .. OR IF the arguments are both strings ... OTHERWISE THE BEHAVIOUR IS UNDEFINED There is a huge difference between these two cases for a compiler. In case (2), the compiler can ASSUME that given the call add(x,1) that x must be an integer. This is a valid type deduction, because the compiler doesn't care what happens if the program has undefined behaviour: the assumption that x is an integer is STILL CORRECT, because it cannot have any consequences which break the language specification. In this case, the compiler could, for example, just keep x in a C int variable, and add 1 to it by using the code x + 1 -- which is much faster than PyAdd(x, One) On the other hand, in case (1), the compiler cannot deduce anything, at least from the given fragment, so it can NOT generate fast code: it has to call PyAdd(x,One) or, perhaps do something like: if (PyTypeIsInt(x)) x->value ++; else PyRaise(SomeException) .. which involves an extra run time check, at least, and is therefore much slower. Therefore, there is a performance advantage in adopting spec (2) as a language specification, instead of (1). Note this does not mean the generated code will crash, if x is not an integer. What it means is that if the compiler detects that x is not an integer, it can generate a compile time error. It is NOT allowed to do that with specification (1). So my point is: the Python documentation contains many examples where it says 'such and such an exception is thrown', and this prevents generating fast code, and it prevents early error detection. The point is that throwing an exception is _well defined_ behaviour, and it would be better if the specification said the program was in error. That way, a compiler can report the error at compile time. At the moment, no errors can be reported by a compiler, because there is no such thing as an erroneous python program -- someone may catch the exception that the specification says is thrown and do something they think is OK, and would rightly claim the compiler is breaking their program and not implementing the Python language faithfully. Just to make it clear, an example: try: return x + 1 except TypeError: return str(x) + "1" is a valid python code fragment, and it relies on x + 1 throwing an exception if x is not an integer. 
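(Spelled out as a runnable function, the fragment really does depend on the exception being raised and caught; both branches are reachable for perfectly ordinary inputs.)

    def bump(x):
        try:
            return x + 1
        except TypeError:
            return str(x) + "1"

    assert bump(41) == 42           # the integer path
    assert bump(None) == "None1"    # the exception path, by design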
So IF we continue to allow this, we cannot deduce that x must be an integer, and this prevents optimising generated code, both here, and in other places in the program. You might just think, seeing x + 1 that if x is not an integer, the code must be an error, but the example above shows that you'd be wrong if you said that. For this reason, I think it is important that the Types SIG also examine when it is legitimate to catch an exception at run time, and do something, and when the code is just plain wrong, and a compiler can reject it. My proposal, and I thought it was fairly 'concrete' as you required, was that apart from EnvironmentErrors, all _standard_ exceptions not trapped within a function body (I said 'lexically local' before, but now I'll be more specific), are in fact programming errors: the programmer may NOT rely on catching any standard exceptions other than environment errors, generated by code inside a client written function. Note this is only a proposal, I'm not sure if I like it, but I hope the reason for proposing it for discussion is easier to understand now. Note: I find this difficult too. I'm not a compiler writer. But I have spent over five years on a standards committee, and have some vague idea of the impact of specifications -- and in particular lack of them -- on the ability to generate fast, conforming code. The way I learned it, was listening to the arguments of compiler writers, explaining the impact of various possible specifications on opportunties for optimisation. Believe me, some of the arguments are pretty devious :-) -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From paul@prescod.net Sat Dec 18 16:58:12 1999 From: paul@prescod.net (Paul Prescod) Date: Sat, 18 Dec 1999 10:58:12 -0600 Subject: [Types-sig] consensus(?) summary (was: Type annotations) References: <38592F43.11042753@prescod.net> <14425.23797.624232.17777@dolphin.mojam.com> Message-ID: <385BBD24.B1DE49D2@prescod.net> Skip Montanaro wrote: > > In particular, is the following a non-local write? > > import sys > p = sys.path > p.append("/usr/local/lib/other") No, only name rebindings are writes. p.append is just a method call. It's type safety is checked by the usual method call type assertion mechanism. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things never trust in: That's the vendor's final bill The promises your boss makes, and the customer's good will http://www.geezjan.org/humor/computers/threes.html From gstein@lyra.org Sat Dec 18 19:31:39 1999 From: gstein@lyra.org (Greg Stein) Date: Sat, 18 Dec 1999 11:31:39 -0800 (PST) Subject: [Types-sig] Viper Type specification In-Reply-To: <385B9527.53DDB2F8@maxtal.com.au> Message-ID: On Sun, 19 Dec 1999, skaller wrote: >... > I'm not saying this is 'the' solution, but it ought to > be considered because it simultaneously provides a > powerful typing model, which is based on the existing > model with only a small generalisation, seems easy > to implement, and also solves the problem of how to name types. > > I hope it is clear now, why it is relevant. Now it is clear, yes. But when it just gets posted with "here is X" rather than "this is how Y could be done", then it is definitely unclear. > BTW: w.r.t expr!type, your (Greg's) proposal, what precedence > would your give operator ! ? Lowest possible (as seen in the type-proposal.html I recently posted here). 
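(Since ! is only proposed syntax, the sketch below fakes the runtime half of it with an ordinary helper; the names are invented. The point of lowest precedence is that an annotation appended to a line applies to the whole expression: return x + 1 ! Int would mean (x + 1) ! Int, not x + (1 ! Int).)

    def bang(value, expected):
        # Hypothetical stand-in for the runtime effect of 'value ! expected'.
        if not isinstance(value, expected):
            raise TypeError("expected %s, got %s"
                            % (expected.__name__, type(value).__name__))
        return value

    def f(x):
        return bang(x + 1, int)     # stands in for: return x + 1 ! Int

    assert f(2) == 3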
I don't have any actual experience with it, but I would think that when somebody is using it to annotate/verify their code, they would just append it to the end of key lines in a function. The lowest precedence creates the correct binding in this case. Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Sat Dec 18 20:38:19 1999 From: gstein@lyra.org (Greg Stein) Date: Sat, 18 Dec 1999 12:38:19 -0800 (PST) Subject: [Types-sig] Type Inference I In-Reply-To: <385BB080.A2DBB67C@maxtal.com.au> Message-ID: On Sun, 19 Dec 1999, skaller wrote: >... > Ok Greg, lets see where we agree and what we understand. > > First, interpreter python is too damn slow for some applications. > Also, errors sometimes get reported later than we'd like. > > We'd both like to: > > (OPT) be able to translate Python sources in to C > which runs faster than interpreter python > > (ERR) find errors in a program, before running it > > I hope we agree on these points so far. Sure. > Now, here is something I believe, mainly from comments > made at various times by Guido, Tim, and others: > people have tried compiling python before, and found that > the resulting C code didn't run much faster than the > interpreter. Thats mainly because these compilers didn't > know anythong about the types, they just generated API > calls corresponding to what the byte code interpreter would > execute -- and the interpreter is pretty fast already. Bill Tutt and I have done it and measured about 30% speed improvement in most cases. Not as lot as most people would hope for, but definitely there. Bill is continuing to improve the code. > So the question arises: how well can we do if we know > what the types of things are, or at least some of them? > I am going to assume, without evidence, that we can do better, > and I'm going to assume you agree. Agreed. > So now the question arises: how can we find out the types? > Now, I am going to TELL you, that there is some evidence, > that we can in fact do surprisingly well without changing > Python one little bit. We can do better, if we analyse a whole > program. We can do better, if we make some assumptions. > I'd like you to accept this, without argument, because I cannot > prove it: I've only done a micky mouse experiment, but JimH > has done a less micky one, I'm told. Sure, I accept that we can. But to state up front: I don't think we want to rely on whole-program analysis. At the moment, I am assuming that a type-checking tool will not be part of the [byte-code] compiler -- that is is just too much and too slow to directly include. However, that obviously negates a number of things that the compiler could do if it knew the types. For example, maybe we introduce some integer-manipulation opcodes because we find they would be beneficial to 90% of Python programs. An external tool doesn't let Python take advantage of them. To that end, I think we might eventually want to integrate something. And to do that, we definitely cannot rely on whole-program analysis. In other words, if we depend on whole-program analysis, then I don't think the builtin, byte-code compiler will ever be able to take advantage of type information. >... > So, when you say 'we' are not going to do this or that > kind of inference, you are missing the point. > I surely AM going to. Others certainly WILL be going to. > This is how it is done. When you are writing a compiler, > you use every bit of information you can to make > it go faster. I agree that you want to use every bit of information possible. 
I disagree that I'm missing the point: I think we are discussing what will happen to the native compiler. To that end, I *am* positing that 'we' will do or . If "third parties" (if you will) want to create an even better compiler, than I'm all for it. However, we still want to improve the native system, and I believe that is through a different path than you are suggesting. > So the point is how to give the compiler more information, > while minimising the impact on 'python' (you can read that > as 'keeping it pythonic' if you like). Yes. > Now, my point, in Type Inference I and II, is that static > type declarations are only ONE way of providing > more information, and they are not even the most important > one. In fact, type inference is hampered by quite a lot > of other factors. This was entirely unclear. I saw it as some kind of weird ramble about changing Python's exception behavior in some unclear way, and for some unknown purpose. Given the above paragraph: know I understand what you are trying to get at. > I'm sure you will agree on some. For example, I'm sure > you understand how 'freezing' a module, or banning > rebinding of variables afer importing, or, disallowing > > module.attribute = xxx > > will help type inference: you clearly understand that > python is very dynamic, which makes static analysis > difficult. Right? Yup. > So what I am trying to do is list all the things > OTHER than adding optional type declarations, > which might contribute to our agreed upon aim, > namely, to provide a compiler with enough information > to generate fast code. All right. >... module attribute assignments ... >... add() example, explaining exceptions mess up compiler ... > > 1) The spec says: > IF the arguments are both ints .. > OR IF the arguments are both strings .. > OTHERWISE an exception is thrown > > 2) The spec says: > IF the arguments are both ints .. > OR IF the arguments are both strings ... > OTHERWISE THE BEHAVIOUR IS UNDEFINED >... > Therefore, there is a performance advantage in adopting > spec (2) as a language specification, instead of (1). > Note this does not mean the generated code will crash, > if x is not an integer. What it means is that if the > compiler detects that x is not an integer, it can > generate a compile time error. It is NOT allowed to > do that with specification (1). Interesting point. As a user of Python, I like (1) and do not want to see Python to change to use (2). Sure, it hurts the compiler, but it ensures that I always know what will happen. > So my point is: the Python documentation contains > many examples where it says 'such and such an exception > is thrown', and this prevents generating fast code, > and it prevents early error detection. The point > is that throwing an exception is _well defined_ behaviour, > and it would be better if the specification said > the program was in error. That way, a compiler > can report the error at compile time. While true, I think the compiler will still know enough about the type information to generate very good code. If somebody needs to squeak even more performance, then I'd say Python is the wrong language for the job. I do understand that you believe Python can be the right language, IFF it relaxes its specification. I hope it doesn't, though, as I like the rigorous definition. >... catching exceptions ... > is a valid python code fragment, and it relies on > x + 1 throwing an exception if x is not an integer. 
> So IF we continue to allow this, we cannot deduce > that x must be an integer, and this prevents optimising > generated code, both here, and in other places in the > program. It only prevents it in the presence of exception handlers. You can still do a lot of optimization outside of them. And a PyIntType_Check() test here and there to validate your assumptions is techically more expensive than not having it, but (IMO) it is not expensive in absolute terms. >... > For this reason, I think it is important that the Types SIG > also examine when it is legitimate to catch an exception > at run time, and do something, and when the code is just > plain wrong, and a compiler can reject it. Sure. > My proposal, and I thought it was fairly 'concrete' as > you required, was that apart from EnvironmentErrors, > all _standard_ exceptions not trapped within a function body > (I said 'lexically local' before, but now I'll be more > specific), are in fact programming errors: the programmer > may NOT rely on catching any standard exceptions > other than environment errors, generated by code inside > a client written function. All right. This is clear now. And it is clearly something that I would not want to see :-) Cheers, -g -- Greg Stein, http://www.lyra.org/ From tim_one@email.msn.com Sat Dec 18 20:56:44 1999 From: tim_one@email.msn.com (Tim Peters) Date: Sat, 18 Dec 1999 15:56:44 -0500 Subject: [Types-sig] type declaration syntax In-Reply-To: <385B08AE.A491CD36@maxtal.com.au> Message-ID: <000001bf499a$64fa9c00$dca2143f@tim> [John Skaller] > ... > But the _return_ type doesn't need to be annotated as much. > Why? Because the inferencer can usually deduce it: > it's an output, the argument types are inputs. > > If the inferencer _cannot_ deduce the return type, > it _also_ cannot check that the function is returning > the correct type. "The" correct type (as opposed to "a type consistent with the operations") is impossible for an inferencer to determine, but this is addressed more to the SIG than to John : My bet is that the vast majority of Python people asking for "static typing" have in mind a conventional explicit system of the Algol/Pascal/C (APC) ilk, and that decisions based on what *inference* schemes can do are going to leave them very unhappy. Inference schemes commit two kinds of gross errors that the APC camp won't abide: 1) Inferring types that aren't general enough. 2) Inferring types that are too general. Both mistakes occur because inference can only look at the code that's written, knowing nothing about the user's *intent*. In APC, explicit type declarations often serve the latter purpose, supplying (& enforcing) design and semantic constraints that can't be deduced from the code: 1) Not general enough. This is usually due to an implementation in progress, where looking at the code that currently exists can't possibly guess what will get implemented tomorrow; e.g., a function that returns an int if it can, but is spec'ed to return None if it can't, but the author hasn't yet gotten around to coding up the latter cases. The clients of this routine must nevertheless accept an IntOrNone result, and explicit declarations can force that on clients long before the routine is actually capable of producing a None. The alternative is a large class of all too familiar last-second "integration crises". 2) Too general. This is very common in numeric programming. An inferencer sees a routine with nothing but +, *, / and infers "ah, any Number will do". 
But it's *unusual* for any such routine to work *correctly* for all Numbers (algorithms appropriate for complex numbers are often wildly different from those appropriate for floats, and likewise for ints). For example, I tell Haskell intgamma 1 = 1 intgamma n = x * intgamma x where x = n-1 and it deduces the type intgamma :: Num a => a -> a Arghghghgh . Yes, every flavor of Num supports all the operations I used, but no, call it with anything other than an Int and the *algorithm* is plain wrong. APC folk (Haskell folk too) routinely use explicit declarations to enforce such constraints. Explicit typing goes beyond what type inference can do "even in theory"; while types must be *consistent* with the code, only the author can know *the* correct type (which may be more-- or less! --general than what an inferencer determines is merely consistent). in-the-apc-tradition-types-communicate-design-ly y'rs - tim From tim_one@email.msn.com Sat Dec 18 20:56:49 1999 From: tim_one@email.msn.com (Tim Peters) Date: Sat, 18 Dec 1999 15:56:49 -0500 Subject: [Types-sig] Type Inference I In-Reply-To: Message-ID: <000101bf499a$670a9040$dca2143f@tim> > ... the _expression_ x+1 can only be an integer, we _can_ deduce that > ... You can't deduce/infer anything from x+1. > ["Yes! No!" * 10 ] We can deduce that x+1 will blow up at runtime with a TypeError (perhaps spelled by some other name ) unless type(x) supports an __add__ method which in turn accepts (at least, and besides self) a single argument of type Int. If type(x) does support an __add__ method which in turn etc, we have no idea whether it will blow up at runtime. But the current incarnation of the Types-SIG (TCIOTTS) doesn't care about that: it's trying (only!) to determine at compile-time when it's certain that type(x) *does* support etc. Toward that end, TCIOTTS assumes that type(x) and all relevant information about type(x).__add__ has been handed to it on a silver platter. The type of x+1 is the union of all the types that T.__add__(1) may return across all types T in the set of possible types for x, and that info constitutes the "silver platter" handed to x+1's context. Bottom-up, all the way, with oracles at the base. AFAIK, TCIOTTS doesn't yet have an explicit policy about what to do in the presence of try/except blocks. Everyone has clearly assumed that, for purposes of type-checking, the possibility of an *up*-level handler will be ignored (and if a user can't live with that, fine, then they can't enable type-checking). Given that this SIG self-destructed the last time it tried to take on too much, and currently has a goal to produce genuinely useful code in a matter of months, I doubt TCIOTTS will be persuaded to move beyond that for now. Indeed, I think it should forget inferencing *entirely* at the start, even for cases like def unity() -> Int: a = 1 # compile-time error in type-check mode -- a not declared return a Inferencing (ya, ya -- *useful* inferencing) is harder than mere checking (indeed, checking is easy enough to write in K&R C ). one-man's-opinion-ly y'rs - tim From paul@prescod.net Sat Dec 18 17:06:10 1999 From: paul@prescod.net (Paul Prescod) Date: Sat, 18 Dec 1999 11:06:10 -0600 Subject: [Types-sig] Optionality and performance References: Message-ID: <385BBF02.E4B1DC2@prescod.net> Greg Stein wrote: > > Skip Montanaro wrote: > > I humbly assert this train of thought rates a *bzzzt*. I thought one core > > requirement was that all type declaration stuff be optional. 
The worst that > the type checker/inferencer should do in the face of incomplete type info is > display a warning. > My entire post was pre-conditioned on the assumption that type-checking > has been enabled. Optionality of type checking is not about it being enabled or disabled. Even when it is enabled, type checking any particular method must be optional. This whole discussion should presume "enabled". But optionality is still important. > IMO, type checking is NOT enabled by default. I believe it will impose a > noticable performance penalty and I'm not willing to pay that in the > general case. I don't see how we can logically treat type checks differently than array bounds checks, overflow checks and so forth. It needs to be on by default and we'll just need to figure out how to minimize its impact. Most type checks should involve quick pointer comparisons and that will be covered up by the other performance enhancement. In particular, when you declare conformance to a class or interface, method calls should no longer be string-dispatched. That means you need interfaces to be like vtables so the type checker's job is to find the right vtable. The type check actually comes "for free" in implementing the name lookup optimization.
-- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things never trust in: That's the vendor's final bill The promises your boss makes, and the customer's good will http://www.geezjan.org/humor/computers/threes.html
From paul@prescod.net Sat Dec 18 17:13:26 1999 From: paul@prescod.net (Paul Prescod) Date: Sat, 18 Dec 1999 11:13:26 -0600 Subject: [Types-sig] Syntax References: <000801bf482c$71670100$63a2143f@tim> Message-ID: <385BC0B6.3879244B@prescod.net> Tim Peters wrote: > > [Martijn Faassen] > > While my agenda is to kill the syntax discussions for the moment, > > ... > > Martijn, in that case you should stop feeding the syntax meta-discussion and > just view all the other notations as virtual spellings for masses of obscure > nested dicts . Let me point out that it was the masses of obscure nested dicts that I was objecting to when I told Greg that the syntax cannot be restricted to Python (by which I meant Python 1.5). Obviously, by definition any syntax that we use for Python 2 becomes "Python". In fact, I don't see a lot of difference between the widely embraced Tim-syntax and the syntax I posted a few days ago (based on the Tim-syntax). But if putting the keyword "decl:" in front makes it feel better then I'm all for that! I'm still thinking that it should go in another file because I want to be able to experiment with this stuff WITHOUT maintaining a new Python interpreter binary.
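(For concreteness, a separate-file experiment could be as crude as the sketch below. The dict layout and every name in it are invented, and the 'interface' is inlined here so the example runs as-is; the real format is exactly what is still being argued about.)

    # What a separate declaration file might hold (invented layout) ...
    declarations = {
        "frob": {"args": ["Int", "String"], "returns": "Int"},
        "missing": {"args": [], "returns": "None"},
    }

    # ... and the module it is supposed to describe, inlined here.
    def frob(n, s):
        return n + len(s)

    def check(namespace, decls):
        # Toy consistency pass: declared functions exist, argument counts match.
        problems = []
        for name, sig in decls.items():
            func = namespace.get(name)
            if func is None:
                problems.append("%s declared but not defined" % name)
            elif func.__code__.co_argcount != len(sig["args"]):
                problems.append("%s: %d args declared, %d defined"
                                % (name, len(sig["args"]), func.__code__.co_argcount))
        return problems

    assert check(globals(), declarations) == ["missing declared but not defined"]

No interpreter changes are needed for this kind of experiment; the open question is only what the declarations should be able to say.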
-- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things never trust in: That's the vendor's final bill The promises your boss makes, and the customer's good will http://www.geezjan.org/humor/computers/threes.html From paul@prescod.net Sat Dec 18 18:06:25 1999 From: paul@prescod.net (Paul Prescod) Date: Sat, 18 Dec 1999 12:06:25 -0600 Subject: [Types-sig] Type Inference I References: <385ACE5C.17CD5684@maxtal.com.au> Message-ID: <385BCD21.1373DDAD@prescod.net> skaller wrote: > Of course, the _expression_ x+1 can only be an integer, > we _can_ deduce that. But that isn't enough. Python > is too dynamic. We need more constraints to be able > to do effective inference. John, I have tried languages that were big on inferencing and I have tried languages that were big on dynamicity and I strongly prefer the latter. I don't see how your global type inferencer is going to handle: a = 1 + unpickle( "foo.pcl" ) b = a + eval( raw_input() ) I don't think that we can make these illegal without alienating most Python users. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things never trust in: That's the vendor's final bill The promises your boss makes, and the customer's good will http://www.geezjan.org/humor/computers/threes.html From paul@prescod.net Sat Dec 18 18:43:31 1999 From: paul@prescod.net (Paul Prescod) Date: Sat, 18 Dec 1999 12:43:31 -0600 Subject: [Types-sig] type declaration syntax References: <385B1119.D036BB9F@maxtal.com.au> Message-ID: <385BD5D3.2BD541B5@prescod.net> skaller wrote: > > I don't agree, but this time it is because I think you have > _understated_ the issue. I just can't win. :) -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things never trust in: That's the vendor's final bill The promises your boss makes, and the customer's good will http://www.geezjan.org/humor/computers/threes.html From paul@prescod.net Sat Dec 18 19:59:28 1999 From: paul@prescod.net (Paul Prescod) Date: Sat, 18 Dec 1999 13:59:28 -0600 Subject: [Types-sig] Type Inference I References: <385BB080.A2DBB67C@maxtal.com.au> Message-ID: <385BE7A0.285282C3@prescod.net> I don't see how this strategy can work. skaller wrote: > > You might just think, seeing > > x + 1 > > that if x is not an integer, the code must be an error, > but the example above shows that you'd be wrong > if you said that. But as you and others have pointed out, Python is protocol-centric, not type-centric. In real Python, x could be anything that with an __add__ function. The optimization opportunity is thus dubious. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things never trust in: That's the vendor's final bill The promises your boss makes, and the customer's good will http://www.geezjan.org/humor/computers/threes.html From paul@prescod.net Sat Dec 18 20:31:13 1999 From: paul@prescod.net (Paul Prescod) Date: Sat, 18 Dec 1999 14:31:13 -0600 Subject: [Types-sig] type declaration syntax References: Message-ID: <385BEF11.A8A6266E@prescod.net> Greg Stein wrote: > > ... > > Nobody has ever suggested writing the bugger in C. My assumption is that > it will be written in Python. A second assumption is that it will always > remain as a lint-like tool rather than integrated into the core compiler. That is not my assumption. If a function creator asks for the function to be type checked, it should be type checked every time it is recompiled unless some option has turned type-checking off. 
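(A cheap way to picture that per-function opt-in, shown only to fix ideas -- the __declare__ attribute, the global switch, and the arity-only check are all invented:)

    CHECKING_ENABLED = 1            # the 'option' that can turn checking off

    def check_function(func):
        declared = getattr(func, "__declare__", None)
        if declared is None or not CHECKING_ENABLED:
            return []               # the creator did not ask to be checked
        actual = func.__code__.co_argcount
        if actual != len(declared):
            return ["%s: %d arguments declared, %d defined"
                    % (func.__name__, len(declared), actual)]
        return []

    def frob(a, b):
        return a + b
    frob.__declare__ = ["Int", "Int"]   # the creator asks for checking

    assert check_function(frob) == []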
The difference between type signatures and lint is that lint is guessing about things that are, strictly speaking, correct, but questionable. Type check declarations are either right or wrong and if they are wrong, the programmer should be told. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things see no end: A loop with exit code done wrong A semaphore untested, and the change that comes along http://www.geezjan.org/humor/computers/threes.html From paul@prescod.net Sat Dec 18 20:37:30 1999 From: paul@prescod.net (Paul Prescod) Date: Sat, 18 Dec 1999 14:37:30 -0600 Subject: [Types-sig] New syntax? References: Message-ID: <385BF08A.6B9BD070@prescod.net> Greg Stein wrote: > > Bite me. :-) > > You do raise a good point in another post, however: > > def foo(*args: (Int)): Python should not use tuples as "read-only lists." From a type-system point of view, a tuple should be a fixed-length, fixed-type data structure defined at compile time. A mathematician would not say: "A foo is a variable-length tuple of X". Rather they would say: "A foo is a variable-langth list of X." The "unary tuple" problem almost always arises when people are using tuples as readonly lists also. We should just make a readonly list type (or readonly type annotation) and be done with it. Heck, we could have read-write tuples also! -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things see no end: A loop with exit code done wrong A semaphore untested, and the change that comes along http://www.geezjan.org/humor/computers/threes.html From paul@prescod.net Sat Dec 18 20:46:05 1999 From: paul@prescod.net (Paul Prescod) Date: Sat, 18 Dec 1999 14:46:05 -0600 Subject: [Types-sig] type declaration syntax References: <385B08AE.A491CD36@maxtal.com.au> Message-ID: <385BF28D.38FE3543@prescod.net> skaller wrote: > > If the inferencer _cannot_ deduce the return type, > it _also_ cannot check that the function is returning > the correct type. Two different issues. Some functions will have return type declarations that are checked at runtime. I strongly believe that it should be legal to declare a return type on a function that cannot be proved to return the type you claim. def foo() -> String : return FunctionThatReturnsStringWhenICallWithString("abc") def foo() -> Int : return FunctionThatReturnsIntWhenICallWithInt(5) Anyhow, the inferencer won't have access to all of the code. We still need to deal with pre-compiled functions. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things see no end: A loop with exit code done wrong A semaphore untested, and the change that comes along http://www.geezjan.org/humor/computers/threes.html From paul@prescod.net Sat Dec 18 20:15:49 1999 From: paul@prescod.net (Paul Prescod) Date: Sat, 18 Dec 1999 14:15:49 -0600 Subject: [Types-sig] List of FOO References: <000d01bf48f6$764ee5a0$32a2143f@tim> <385AF7C9.7CEE78E5@maxtal.com.au> Message-ID: <385BEB75.BD06517B@prescod.net> Thanks for describing how viper does parameterized types. There are a couple of things that I don't understand: skaller wrote: > > PyListOfInt = PyListOf(PyIntType) But does this involve executing arbitrary code defined by PyListOf? That would hurt our ability to do static type checking. > x.append(1) > > ends up calling > > PyListOf.append(PyIntType, x, 1) > > which means it can check that 1 is of type PyIntType. 
Right, but what is the declaration for append and how does it say that it takes a single argument and the argument must be of type PyXType where X can vary? -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things see no end: A loop with exit code done wrong A semaphore untested, and the change that comes along http://www.geezjan.org/humor/computers/threes.html From gstein@lyra.org Sat Dec 18 21:24:37 1999 From: gstein@lyra.org (Greg Stein) Date: Sat, 18 Dec 1999 13:24:37 -0800 (PST) Subject: [Types-sig] Optionality and performance In-Reply-To: <385BBF02.E4B1DC2@prescod.net> Message-ID: On Sat, 18 Dec 1999, Paul Prescod wrote: > Greg Stein wrote: > > > > Skip Montanaro wrote: > > > I humbly assert this train of thought rates a *bzzzt*. I thought one core > > > requirement was that all type declaration stuff be optional. The worst that > > > the type checker/inferencer should do in the face of incomplete type info is > > > display a warning. > > > My entire post was pre-conditioned on the assumption that type-checking > > has been enabled. > > Optionality of type checking is not about it being enabled or disabled. > Even when it is enabled, type checking any particular method must be > optional. This whole discussion should presume "enabled". But > optionality is still important. I'm assuming that we type-check a module at a time -- that we don't have the kind of fine-grained checking you're assuming. If a person doesn't want find-grained checking, then they just shouldn't add type annotations there. > > IMO, type checking is NOT enabled by default. I believe it will impose a > > noticable performance penalty and I'm not willing to pay that in the > > general case. > > I don't see how we can logically treat type checks differently than > array bounds checks, overflow checks and so forth. It needs to be on by The latter are runtime checks. I do agree that *runtime* type checks will always be generated by the compiler (per my other emails) and that the runtime check will always be performed (well, not with -O, just like regular asserts are not enabled when -O is provided). > default and we'll just need to figure out how to minimize its impact. > Most type checks should involve quick pointer comparisons and that will > be covered up by the other performance enhancement. Again: runtime. I was referring to compile-time, static checks. I do not believe those will always be enabled. > In particular, when you declare conformance to a class or interface, > method calls should no longer be string-dispatched. That means you need > interfaces to be like vtables so the type checker's job is to find the > right vtable. The type check actually comes "for free" in implementing > the name lookup optimization. Different issue (and a good/valid one!). I would recommend adding this to your list of issues and deferring it for now. Tim rightly points out: taking on too much right now will just cause another self-destruction. Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Sat Dec 18 21:26:16 1999 From: gstein@lyra.org (Greg Stein) Date: Sat, 18 Dec 1999 13:26:16 -0800 (PST) Subject: [Types-sig] tuples (was: New syntax?) In-Reply-To: <385BF08A.6B9BD070@prescod.net> Message-ID: On Sat, 18 Dec 1999, Paul Prescod wrote: > Greg Stein wrote: > > > > Bite me. :-) > > > > You do raise a good point in another post, however: > > > > def foo(*args: (Int)): > > Python should not use tuples as "read-only lists." 
From a type-system > point of view, a tuple should be a fixed-length, fixed-type data > structure defined at compile time. Ideal or not, this is the current situation. *args is a tuple. Are you suggesting a particular change here? If so, then add it to your issues list :-) [you are maintaining one, right? :-)] Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Sat Dec 18 21:31:23 1999 From: gstein@lyra.org (Greg Stein) Date: Sat, 18 Dec 1999 13:31:23 -0800 (PST) Subject: [Types-sig] Syntax In-Reply-To: <385BC0B6.3879244B@prescod.net> Message-ID: On Sat, 18 Dec 1999, Paul Prescod wrote: > Tim Peters wrote: > > > > [Martijn Faassen] > > > While my agenda is to kill the syntax discussions for the moment, > > > ... > > > > Martijn, in that case you should stop feeding the syntax meta-discussion and > > just view all the other notations as virtual spellings for masses of obscure > > nested dicts . > > Let me point out that it was the masses of obscure nested dicts that I > was objecting to when I told Greg that the syntax cannot be restricted > to Python (by which I meant Python 1.5). Obviously, by definition any > syntax that we use for Python 2 becomes "Python". In fact, I don't see a I'll reiterate: I think our goal is for 1.6. We should assume that 2.0 does not and will not exist. It is too far out to defer any of our goals to that version. Yes, we'll have V1, V2, V3 goals, but I think we ought to shoot for their inclusion into 1.6. Only when Guido says "no, I don't want to put that into 1.6," *then* we start to lobby for Python 2.0 changes." > lot of difference between the widely embraced Tim-syntax and the syntax > I posted a few days ago (based on the Tim-syntax). But if putting the > keyword "decl:" in front makes it feel better then I'm all for that! Sorry. I won't let you rewrite history :-). You were suggesting a new, alternative syntax, rather than adding new syntax to Python. Tim and I (and some others) have lobbied for adding new syntax. In particular, I don't want to see Yet Another Language and Yet Another Parser to deal with a distinct language/syntax for type specifications. > I'm still thinking that it should go in another file because I want to > be able to experiment with this stuff WITHOUT maintaining a new Python > interpreter binary. This will be quite possible. My current development proposal specifies the static, compile-time checker as a separate tool. That tool could easily use a separate file for its input. Regardless: I'd hope that the first step to any implementation is to update the Python grammar and allow us to annotate existing Python programs (i.e. to use inline syntax). Updating the grammar is not super difficult, but I hear you about wanting to not use another binary. But I'll just shrug that off and say that's your problem :-) Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Sat Dec 18 21:34:51 1999 From: gstein@lyra.org (Greg Stein) Date: Sat, 18 Dec 1999 13:34:51 -0800 (PST) Subject: [Types-sig] type declaration syntax In-Reply-To: <385BEF11.A8A6266E@prescod.net> Message-ID: On Sat, 18 Dec 1999, Paul Prescod wrote: > Greg Stein wrote: > > ... > > Nobody has ever suggested writing the bugger in C. My assumption is that > > it will be written in Python. A second assumption is that it will always > > remain as a lint-like tool rather than integrated into the core compiler. > > That is not my assumption. 
If a function creator asks for the function > to be type checked, it should be type checked every time it is > recompiled unless some option has turned type-checking off. If you want to write the C code, then please be my guest. I'm hoping that I'll find time to contribute to actual coding here (between my other projects), and assuming that to be true, then I'll be using Python. I'm structuring my development proposal assuming that Python will be used for the majority of the compile-time checking. > The difference between type signatures and lint is that lint is guessing > about things that are, strictly speaking, correct, but questionable. > Type check declarations are either right or wrong and if they are wrong, > the programmer should be told. Woah!! Do not read "historical implementation of lint" into my phrasing. I meant "a separate tool, separately invoked." I totally agree that it will declare things right/wrong. However, I do not believe that it will be integrated into the core, bytecode compiler any time in the near future. If it does, then its invocation will be optional (IMO). Cheers, -g -- Greg Stein, http://www.lyra.org/ From skaller@maxtal.com.au Sat Dec 18 22:47:44 1999 From: skaller@maxtal.com.au (skaller) Date: Sun, 19 Dec 1999 09:47:44 +1100 Subject: [Types-sig] List of FOO References: <000d01bf48f6$764ee5a0$32a2143f@tim> <385AF7C9.7CEE78E5@maxtal.com.au> <385BEB75.BD06517B@prescod.net> Message-ID: <385C0F10.E003F55E@maxtal.com.au> Paul Prescod wrote: > > Thanks for describing how viper does parameterized types. There are a > couple of things that I don't understand: > > skaller wrote: > > > > PyListOfInt = PyListOf(PyIntType) > > But does this involve executing arbitrary code defined by PyListOf? Yes. > That would hurt our ability to do static type checking. I'm not so certain. I think you have asked the right question. Here's why I'm uncertain: the code for PyListOf is written in Python. Typically, it will be a simple class. A compiler or other static analysis tool can analyse that code just like any other. Now, for _builtin_ types, it will surely help to have _builtin_ semantics, and this is possible, because Python does have a specification for these types. For user defined types, it isn't clear analysing a type object is that much harder or different, to analysing any other python code. Compare with analysing the behaviour of class instances, tracking which classes they are statically. I'm not sure it is much different. In fact, __getattr_ and friends already make analysis of user defined classes difficult .. so perhaps there isn't much difference here. I don't (yet) know. > > x.append(1) > > > > ends up calling > > > > PyListOf.append(PyIntType, x, 1) > > > > which means it can check that 1 is of type PyIntType. > > Right, but what is the declaration for append and how does it say that > it takes a single argument and the argument must be of type PyXType > where X can vary? Well, in Viper, the definition of append would be in the class PyListOf: class PyListOf: ... def append(Type, object, value): if type(value) is not Type): raise TypeError else: object.append(value) Now, here, using a "Guido rambling argument" I think you ( a human ) could deduce what is going on. The explicit type test indicates typeness of value: it tells that the type of value in object.append(value) must be Type. It is harder to deduce that 'object' is a list. Indeed, it might not be, it can be anything with an append method. 
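(A runnable variant of that sketch, with invented details rather than the actual Viper code: keep the checked wrapper and the underlying plain list separate, and let PyListOf(PyIntType) be the ordinary Python expression that builds the type object.)

    PyIntType = int
    PyStringType = str

    class PyListOf:
        def __init__(self, item_type):
            self.item_type = item_type
        def __call__(self):                     # instantiate a checked list
            return CheckedList(self.item_type)

    class CheckedList:
        def __init__(self, item_type):
            self.item_type = item_type
            self.items = []                     # the underlying plain list
        def append(self, value):
            if not isinstance(value, self.item_type):
                raise TypeError("expected %s, got %s"
                                % (self.item_type.__name__, type(value).__name__))
            self.items.append(value)            # delegate, so no recursion

    PyListOfInt = PyListOf(PyIntType)
    x = PyListOfInt()
    x.append(1)                                 # fine
    # x.append("one")                           # would raise TypeError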
Hopefully, the definition I gave won't lead to an infinite recursion. I guess the point I'm making is: suppose the Viper type system works out nicely for the interpreter. Then this suggests a more 'pythonic' way of naming types, the way python programmers do it now: type([1,2,3]) is types.ListType type(user_object) is user_module.MyType where the RHS in both cases is a python expression denoting a type object. The reason I'm suggesting this is worth examining, is that it doesn't require much change to python: the CPython currently uses special type objects for types ... but JPython is a bit different, and Viper just generalises CPython a bit. At least one advantages is that C extensions are well covered by this idea. No, I should say "it seems to me that this might work well with C extensions, possibly better in Python 2 than 1.6 (since the architecture of Python 2 will be reworked)". Might also work better for JPython too. Note I'm not against using a functional language's type description for Python, a'la Tim/Haskell, but it isn't clear that is going to work well either, and it seems to involve 'extra' work, writing a parser for a 'new' language, etc. I think you said 'ignore non builtin types for the moment', and I think I'm giving an argument that this might not be such a restriction after all. -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From faassen@vet.uu.nl Sat Dec 18 22:54:12 1999 From: faassen@vet.uu.nl (Martijn Faassen) Date: Sat, 18 Dec 1999 23:54:12 +0100 Subject: [Types-sig] Type Inference I In-Reply-To: <000101bf499a$670a9040$dca2143f@tim> References: <000101bf499a$670a9040$dca2143f@tim> Message-ID: <19991218235412.A15050@vet.uu.nl> Tim Peters wrote: > Indeed, I think it should forget inferencing *entirely* at the start, even > for cases like > def unity() -> Int: > a = 1 # compile-time error in type-check mode -- a not declared > return a To use my famous phrase again: I agree. The counter argument I got to this before is that inferencing takes place anyway in the case of expressions: def foo(a, b): # Martijn's evil verbose format in yet another form decl: a = Int b = Int return Int return a + b 'a + b' would need inferencing to figure out what the type is of the complete expression. I think that this argument overlooks that this kind of evaluation is a lot more easy than a back-tracking kind of inferencing. > Inferencing (ya, ya -- *useful* inferencing) is harder than mere checking > (indeed, checking is easy enough to write in K&R C ). Though checking could be seen as a kind of inferencing, right? Or are people confusing the issues? Initially I didn't consider the expression evaluation stuff as inferencing either, but there's a good argument to consider it so, not? Regards, Martijn From skaller@maxtal.com.au Sat Dec 18 23:05:41 1999 From: skaller@maxtal.com.au (skaller) Date: Sun, 19 Dec 1999 10:05:41 +1100 Subject: [Types-sig] type declaration syntax References: <000001bf499a$64fa9c00$dca2143f@tim> Message-ID: <385C1345.C21FF180@maxtal.com.au> Tim Peters wrote: > My bet is that the vast majority of Python people asking for "static typing" > have in mind a conventional explicit system of the Algol/Pascal/C (APC) ilk, > and that decisions based on what *inference* schemes can do are going to > leave them very unhappy. I'm not sure why. 
My 'assumption' is that 1) a conservative inferencer is used, which means it tries to optimise code by inference, but if it isn't sure, it falls back to the usual run-time checking -- that is, it faithfully reproduces the expected behaviour no matter what. 2) optional static type declarations allow the performance of the inferencer to be improved; that is, to generate better code 3) it would also help to tighten up the specifications of python, particularly in areas like a) when is it OK to expect an exception b) module freezing etc. I would make the point that, as often is the case, the client is 'asking' for X, but what they actually need is Y, because they don't understand their own requirements. That is, they may be 'asking' for APC style static typing, but they have no idea what the implications are, and if they knew, they would withdraw their application. I guess that NO python programmer wants to declare the type of every single name, which is what APC style static type checking requires. So they 'throw in' the word 'optional', and that changes the whole thing to 'general inference like in a functional programming language, only trickier, because Python isn't' :-) -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From skaller@maxtal.com.au Sat Dec 18 23:15:05 1999 From: skaller@maxtal.com.au (skaller) Date: Sun, 19 Dec 1999 10:15:05 +1100 Subject: [Types-sig] Type Inference I References: <385ACE5C.17CD5684@maxtal.com.au> <385BCD21.1373DDAD@prescod.net> Message-ID: <385C1579.1D937D34@maxtal.com.au> Paul Prescod wrote: > I don't see how your global type inferencer is going to handle: > > a = 1 + unpickle( "foo.pcl" ) > b = a + eval( raw_input() ) > > I don't think that we can make these illegal without alienating most > Python users. I agree. the way I plan to handle this in Viperc is to fall back on the run time system (Viperi). However, you might be surprised how well inference can do. For example, consider b = a + eval( raw_input() ) It may seem that this tells nothing about a or b. But looking closer, both a and b must be 'addable' in some sense. Furthermore, in context, both 'a' and 'b' have to be _used_ elsewhere for the code to be useful, and we can learn more about the typing from examining those contexts. There is no need to always deduce the types: python is not a functional programming language with a full static typing system. It is enough, that we can make significant performance improvements in some places, or report a few definite errors. Short answer: you're right, but it doesn't matter: no one expects a python compiler to produce code that runs as fast as C. -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From skaller@maxtal.com.au Sat Dec 18 23:37:22 1999 From: skaller@maxtal.com.au (skaller) Date: Sun, 19 Dec 1999 10:37:22 +1100 Subject: [Types-sig] Type Inference I References: <385BB080.A2DBB67C@maxtal.com.au> <385BE7A0.285282C3@prescod.net> Message-ID: <385C1AB2.E840F9D6@maxtal.com.au> Paul Prescod wrote: > > I don't see how this strategy can work. > > skaller wrote: > > > > You might just think, seeing > > > > x + 1 > > > > that if x is not an integer, the code must be an error, > > but the example above shows that you'd be wrong > > if you said that. > > But as you and others have pointed out, Python is protocol-centric, not > type-centric. 
In real Python, x could be anything that with an __add__ > function. The optimization opportunity is thus dubious. That depends on the scope of the analyser I think. If you are only analysing a function, by itself, without any type declarations, you are probably right that many cases cannot be optimised. In that case type declarations may help. However, it may well be that it _is_ possible to deduce things in an isolated function in important places, like in the body of an innner loop. On the other hand, if you widen the scope of the analyser to a whole module, or the whole program, then it may be possible to do better. [See Guidos 'rambling' post] Greg wants to write one kind of tool, I'm building a different one. The point is to try to help both these tools, and any others, do a better job for the programmer, by changing the python language. I.e. the goal of the SIG is to recommend language changes NOT to produce any kind of tool (although that is useful to help decide what needs changing, and it may also be useful to end users as well: these are secondary goals) At least, that's my understanding. -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From skaller@maxtal.com.au Sat Dec 18 23:39:45 1999 From: skaller@maxtal.com.au (skaller) Date: Sun, 19 Dec 1999 10:39:45 +1100 Subject: [Types-sig] Type Inference I References: Message-ID: <385C1B41.73F23836@maxtal.com.au> Greg Stein wrote: > Bill Tutt and I have done it and measured about 30% speed improvement in > most cases. Not as lot as most people would hope for, but definitely > there. Bill is continuing to improve the code. That's quite worthwhile, though. -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From gstein@lyra.org Sat Dec 18 23:43:30 1999 From: gstein@lyra.org (Greg Stein) Date: Sat, 18 Dec 1999 15:43:30 -0800 (PST) Subject: [Types-sig] what tools? (was: Type Inference I) In-Reply-To: <385C1AB2.E840F9D6@maxtal.com.au> Message-ID: On Sun, 19 Dec 1999, skaller wrote: >... > Greg wants to write one kind of tool, I'm building a different one. > The point is to try to help both these tools, and any others, > do a better job for the programmer, by changing the python > language. Agreed. So far, I do not believe that adding type annotations (declarations) will hinder your tool. And it certainly will help the standard tools (i.e. those incorporated into the standard distro). > I.e. the goal of the SIG is to recommend language > changes NOT to produce any kind of tool (although that > is useful to help decide what needs changing, and it may > also be useful to end users as well: these are secondary > goals) At least, that's my understanding. I don't believe we are limited to language changes. That is a bit too narrow to solving the problems at hand. I figure that we'll implement an external tool, leaving the integration decision for another day. Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Sat Dec 18 23:44:20 1999 From: gstein@lyra.org (Greg Stein) Date: Sat, 18 Dec 1999 15:44:20 -0800 (PST) Subject: [Types-sig] Type Inference I In-Reply-To: <385C1B41.73F23836@maxtal.com.au> Message-ID: On Sun, 19 Dec 1999, skaller wrote: > Greg Stein wrote: > > > Bill Tutt and I have done it and measured about 30% speed improvement in > > most cases. Not as lot as most people would hope for, but definitely > > there. 
Bill is continuing to improve the code. > > That's quite worthwhile, though. Yup. But when people say "Python is 10X slower", then you want a 10X speed improvement to shut them up :-) Cheers, -g -- Greg Stein, http://www.lyra.org/ From tony@metanet.com Sat Dec 18 23:56:50 1999 From: tony@metanet.com (Tony Lownds) Date: Sat, 18 Dec 1999 15:56:50 -0800 (PST) Subject: [Types-sig] Syntax In-Reply-To: <385BC0B6.3879244B@prescod.net> Message-ID: On Sat, 18 Dec 1999, Paul Prescod wrote: > Tim Peters wrote: > > > > [Martijn Faassen] > > > While my agenda is to kill the syntax discussions for the moment, > > > ... > > > > Martijn, in that case you should stop feeding the syntax meta-discussion and > > just view all the other notations as virtual spellings for masses of obscure > > nested dicts . > > Let me point out that it was the masses of obscure nested dicts that I > was objecting to when I told Greg that the syntax cannot be restricted > to Python (by which I meant Python 1.5). Obviously, by definition any > syntax that we use for Python 2 becomes "Python". In fact, I don't see a > lot of difference between the widely embraced Tim-syntax and the syntax > I posted a few days ago (based on the Tim-syntax). But if putting the > keyword "decl:" in front makes it feel better then I'm all for that! > > I'm still thinking that it should go in another file because I want to > be able to experiment with this stuff WITHOUT maintaining a new Python > interpreter binary. > I think it'd be possible to put type declarations in-line without using a new binary, at least in the short term: 1. make a module that overloads __import__() 2. when a module is imported it asks the syntax handler to parse the file and generate a plain .py file and a .pi (ie "interface") file with appropriately nested dicts in it. 3. Then it asks the type checker to make sure the .pi and .py match up. The type checker may need to call __import__() recursively. 4. Then, __import__() should import the generated .py file. There are a few caveats I can think of: a. eval/exec/execfile couldnt use type declarations b. The outputted .py file would basically be stripped of type declarations, nothing would be added to it. A full-blown system might want to add runtime type checks. c. The syntax handler wouldn't get to use Python's parser "for free". -Tony From gstein@lyra.org Sun Dec 19 00:04:05 1999 From: gstein@lyra.org (Greg Stein) Date: Sat, 18 Dec 1999 16:04:05 -0800 (PST) Subject: [Types-sig] Syntax In-Reply-To: Message-ID: On Sat, 18 Dec 1999, Tony Lownds wrote: >... > I think it'd be possible to put type declarations in-line without using a > new binary, at least in the short term: > > 1. make a module that overloads __import__() > > 2. when a module is imported it asks the syntax handler to parse the file > and generate a plain .py file and a .pi (ie "interface") file with > appropriately nested dicts in it. > > 3. Then it asks the type checker to make sure the .pi and .py match up. > The type checker may need to call __import__() recursively. > > 4. Then, __import__() should import the generated .py file. Interesting approach! However, I'd think that implementing that would be about the same difficulty as altering Python's grammar (i.e. not a walk in the park, but not hard). But if a single binary is important (for now), then your thought is quite valid. Over the next week or so, I'm going to be work on Python's import system. 
Depending on whether Guido likes the changes and if checks them in, then tweaking the import as you mention would get a good deal easier. Cheers, -g -- Greg Stein, http://www.lyra.org/ From skaller@maxtal.com.au Sun Dec 19 00:42:48 1999 From: skaller@maxtal.com.au (skaller) Date: Sun, 19 Dec 1999 11:42:48 +1100 Subject: [Types-sig] Type Inference I References: Message-ID: <385C2A08.DD36C485@maxtal.com.au> Greg Stein wrote: > But to state up front: I don't think we want to rely on whole-program > analysis. Right. I accept this as a reasonable requirement. Perhaps to explain my viewpoint: it is worthwhile seeing just how far it is possible to go, with no changes, and with a few small changes like optional type checking, in each case: whole program analysis, single module, and single function. The reason it is worth considering whole program analysis is that, because there is a LOT more information available, such a tool can do a much better job, and therefore less changes are needed to the python language. Establishing the _minimum_ changes required is useful, even if we both agree we'd like to do more -- and I share your desire to work at a much finer grained level like modules or functions. Viperc has taken the whole program approach, NOT because I like it, but because, at the moment, there is no real alternative. I'd sure like to see things that made per module compilation possible!! > At the moment, I am assuming that a type-checking tool will not > be part of the [byte-code] compiler -- that is is just too much and too > slow to directly include. You are probably right. however, I think the issue here is the Python language, and what _might_ be done if we change it, rather than any particular tool. That is, we should be examining "what can we do, if we make these changes" and not "what tool can we write for the CPython 1.6 distribution". I'm not saying a tool cannot be written, just that the issue isn't writing the tool, but changing the language so tool writers can actually get better results. > To that end, I think we might eventually want to integrate something. Actually, I tend to think the most likely tool which will get people excited is a compiler that generates C code: nothing to do with the bytecode interpreter at all. But I could be wrong. [The JPython people, for example, won't care :-] > And to do that, we definitely cannot rely on whole-program analysis. Agreed. It makes sense to consider how to change Python so a more localised tool can work well. > In other words, if we depend on whole-program analysis, then I don't think the > builtin, byte-code compiler will ever be able to take advantage of type > information. You are probably right, but I don't think the bytecode compiler is the target. Well, i _didn't_ think that, until you said that's what you were interested in right now. Forgive my misunderstanding! > I agree that you want to use every bit of information possible. I disagree > that I'm missing the point: I think we are discussing what will happen to > the native compiler. To that end, I *am* positing that 'we' will do > or . I see. This appears to be where we are crossing wires. It hadn't even occurred to me that this had anything at all to do with the bytecode compiler. My assumption was people were interested in: a) a stand alone type checker to help diagnose errors b) a compiler to convert functions, modules, or whole programs into C. 
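(For (a), a stand alone checker can be bolted onto ordinary imports roughly the way Tony sketched above. The skeleton below uses today's builtins module and an invented run_type_checker stub, purely to show the shape of it:)

    import builtins

    def run_type_checker(name):
        # Hypothetical: examine name.py (and a name.pi interface file, if any)
        # and return a list of error strings.
        return []

    _real_import = builtins.__import__

    def checking_import(name, *args, **kwargs):
        errors = run_type_checker(name)
        if errors:
            raise ImportError("type check failed for %s: %s" % (name, errors))
        return _real_import(name, *args, **kwargs)

    builtins.__import__ = checking_import       # opt in to checked imports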
Thanks for pointing out my assumptions were overly restrictive: you are right, the bytecode compiler might benefit from analysis too. > > Now, my point, in Type Inference I and II, is that static > > type declarations are only ONE way of providing > > more information, and they are not even the most important > > one. In fact, type inference is hampered by quite a lot > > of other factors. > > This was entirely unclear. I saw it as some kind of weird ramble about > changing Python's exception behavior in some unclear way, and for some > unknown purpose. I accept it was a weird ramble. Sorry. I was trying my best to explain something which is, in fact, difficult to understand for me. Perhaps that's why the ramble was so long winded and rambly (or perhaps I write like that anyhow :-) > > I'm sure you will agree on some. For example, I'm sure > > you understand how 'freezing' a module, or banning > > rebinding of variables afer importing, or, disallowing > > > > module.attribute = xxx > > > > will help type inference: you clearly understand that > > python is very dynamic, which makes static analysis > > difficult. Right? > > Yup. So, you agree with my point that there are OTHER things than optional type declarations that can improve the situation wrt typing/optimisation? You understand the freezing issue, but not the exception handling one? I don't fully understand it either. That's why the ramble, to promote discussion: if I fully understood it, I would have posted a tightly worded proposal instead. > >... module attribute assignments ... > >... add() example, explaining exceptions mess up compiler ... > > > > 1) The spec says: > > IF the arguments are both ints .. > > OR IF the arguments are both strings .. > > OTHERWISE an exception is thrown > > > > 2) The spec says: > > IF the arguments are both ints .. > > OR IF the arguments are both strings ... > > OTHERWISE THE BEHAVIOUR IS UNDEFINED > >... > > Therefore, there is a performance advantage in adopting > > spec (2) as a language specification, instead of (1). > > Note this does not mean the generated code will crash, > > if x is not an integer. What it means is that if the > > compiler detects that x is not an integer, it can > > generate a compile time error. It is NOT allowed to > > do that with specification (1). > > Interesting point. As a user of Python, I like (1) and do not want to see > Python to change to use (2). Sure, it hurts the compiler, but it ensures > that I always know what will happen. Right. But you can see the tension between these two specifications and the impact on performance?? So now: let me put it to you, that we could try for specification (3) -- which is a compromise between (1) and (2), which provides BOTH advantages with some restrictions: (3) .... OTHERWISE, IF the function call is enclosed in a try block within the same function as the call, AND that try block has a handler which explicitly catches the exception or a base thereof, or, any exception, THEN an exception will be thrown, OTHERWISE the program is in error. EXAMPLE: def f(x): try: return x + 1 except: return str(x) REQUIRED: x + 1 requires an exception be thrown at run time, and the function f will never fail. No compile time diagnostics can be printed. It is hard to optimise the code, but it does what Python does right now. EXAMPLE: def f(x): return x + 1 PERMITTED: a static analyser can report a compile time error, if it sees a call: def g(x): if type(x) is StringType: return f(x) THIS IS A CHANGE FROM CPYTHON. 
We're saying that this code IS AN ERROR, and that a compiler can REJECT the program as invalid python. Let me explain the rationale: even a simple, local, per function analyser can _see_ that a function call is wrapped inside a try/except clause, and can examine the exceptions that will be handled .. if the exception is a Python defined one, and is named with the Guido given name (rather than a variable bound to it), then this case is 'static' enough to determine that the user DELIBERATELY executed possibly faulty code, with the aim of handling the exception. In this case, we should respect the users wishes. Now, if the user puts the handler well down the stack, somewhere else .. then the user is NOT deliberately trying to use exception handling to do type checking, they're just trying to make the program continue to run without falling over, or, print some diagnostic before terminating .. in other words, in this case, it is reasonable to assume that the function call is a programming error. That is, THE CODE IS NOT VALID PYTHON. In the first example, the code IS valid python though. The question I would ask is: does this rule cover enough existing code, that if it is taken as a Python language rule, it will not break too many programs if a compiler REJECTS the code in the case that the handler is not local to the function call?? And the second question is: will this really provide opportunities for better code generation by a compiler, or better diagnostics from a static analyser? Finally: is there a better rule?? I'm really hoping here that you (Greg) will like the idea, because it helps a per function or per module static checker more than a whole program analyser. It is clear Guido is willing to add restrictions on the existing language rules, to allow better static analysis for ERR or OPT: he already said that module freezing is a goer. I do not know if the rule above is a goer: I don't really know what all the issues are. > While true, I think the compiler will still know enough about the type > information to generate very good code. I'm not so sure. I tried it, and the result was that because of the EH (exception handling) issue I mentioned, it wasn't possible EVEN with global whole program analysis, to get good results -- without doing a full control flow analysis as well. And that, unlike type inference, is probably too hard. > If somebody needs to squeak even > more performance, then I'd say Python is the wrong language for the job. I > do understand that you believe Python can be the right language, IFF it > relaxes its specification. I hope it doesn't, though, as I like the > rigorous definition. I would say python needs to _tighten_ its specification, not relax it, so we had better watch out when we use these terms we don't cross wires, since we clearly agree on the semantics we're refering to :-) [Technically, I think you're right, and I'm wrong] > It only prevents it in the presence of exception handlers. Yes. But 'presence' is a dynamic thing, not a lexical one. You need whole program control flow analysis to detect presence statically. The kind of rule I proposed above changes that. >You can still > do a lot of optimization outside of them. And a PyIntType_Check() test > here and there to validate your assumptions is techically more expensive > than not having it, but (IMO) it is not expensive in absolute terms. That depends on where the check is. 
If it is in a tight inner loop, where the code would otherwise a) do no function calls b) all fit in the machines hardware cache c) reduce to register operations then I'm sure you would agree you were wrong. And these are the cases of most interest, because it is the really tight Python coded inner loops where Python falls so badly behind C in performance. With just the right specifications and tweaks to the language, plus static type declarations, it may well be possible to compile down to extremely fast C code. > All right. This is clear now. And it is clearly something that I would not > want to see :-) So you are willing to throw out optimisation opportunities in favour of preserving the existing semantics. This is valid viewpoint. I accept one vote against any change here. I'd like to see other peoples opinions, and then perhaps a consensus can be reached: I am quite willing to accept whatever is decided by Guido in the end .. I can always add features to Viper than make it much faster than Python :-) But I'd prefer not to, I'd rather it compile Guido Python with good performance :-) -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From skaller@maxtal.com.au Sun Dec 19 00:54:33 1999 From: skaller@maxtal.com.au (skaller) Date: Sun, 19 Dec 1999 11:54:33 +1100 Subject: [Types-sig] Type Inference I References: Message-ID: <385C2CC9.8F4158A7@maxtal.com.au> Greg Stein wrote: > > On Sun, 19 Dec 1999, skaller wrote: > > Greg Stein wrote: > > > > > Bill Tutt and I have done it and measured about 30% speed improvement in > > > most cases. Not as lot as most people would hope for, but definitely > > > there. Bill is continuing to improve the code. > > > > That's quite worthwhile, though. > > Yup. But when people say "Python is 10X slower", then you want a 10X speed > improvement to shut them up :-) To put it in context: I have a literate programming tool, Interscript, written in Python. My problem isn't that it is 10X too slow. It takes four hours to build itself, instead of four seconds. For a more usual project, it would take 20 seconds, where less than a second is required. In other words, it is something like 100x - 1000X too slow. I know that other LP tools written in C can run that much faster, because I have used them. The crux of the problem is in the character by character handling which occurs in tight loops that continually reallocate strings by appending characters to them. I believe it is possible to compile suitably written and annotated python so it will run 100-1000X faster. 30% is worthwhile, but 100000% is better :-) -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From skaller@maxtal.com.au Sun Dec 19 00:58:15 1999 From: skaller@maxtal.com.au (skaller) Date: Sun, 19 Dec 1999 11:58:15 +1100 Subject: [Types-sig] Viper Type specification References: Message-ID: <385C2DA7.9E429569@maxtal.com.au> Greg Stein wrote: > > BTW: w.r.t expr!type, your (Greg's) proposal, what precedence > > would your give operator ! ? > > Lowest possible (as seen in the type-proposal.html I recently posted > here). I don't have any actual experience with it, but I would think that > when somebody is using it to annotate/verify their code, they would just > append it to the end of key lines in a function. The lowest precedence > creates the correct binding in this case. 
OK, I'll implement it, and if you make some test files, I'll run them and send you the results to see if they're what you expected (in an interpreter! no compilation yet :-) [BTW: it will take less than an hour to do: I just finished doing list comprehensions, it took less than 2 hour all up. Assignment operators took longer -- there are more of them to code up] -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From skaller@maxtal.com.au Sun Dec 19 01:04:17 1999 From: skaller@maxtal.com.au (skaller) Date: Sun, 19 Dec 1999 12:04:17 +1100 Subject: [Types-sig] Re: what tools? (was: Type Inference I) References: Message-ID: <385C2F11.C89266A4@maxtal.com.au> Greg Stein wrote: > > On Sun, 19 Dec 1999, skaller wrote: > >... > > Greg wants to write one kind of tool, I'm building a different one. > > The point is to try to help both these tools, and any others, > > do a better job for the programmer, by changing the python > > language. > > Agreed. So far, I do not believe that adding type annotations > (declarations) will hinder your tool. Of course not: I want them badly. > > I.e. the goal of the SIG is to recommend language > > changes NOT to produce any kind of tool (although that > > is useful to help decide what needs changing, and it may > > also be useful to end users as well: these are secondary > > goals) At least, that's my understanding. > > I don't believe we are limited to language changes. That is a bit too > narrow to solving the problems at hand. I figure that we'll implement an > external tool, leaving the integration decision for another day. Yeah, but your tool will probably by CPython centric, that is, it will not work in JPython or Viper because you will hook the AST stuff etc. [If you can do it all in 'pure python' that would be better -- however it STILL won't cope with Viper or Jpython extensions ...] -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From paul@prescod.net Sun Dec 19 01:56:03 1999 From: paul@prescod.net (Paul Prescod) Date: Sat, 18 Dec 1999 19:56:03 -0600 Subject: [Types-sig] Type Inference I References: Message-ID: <385C3B33.13C8AEFC@prescod.net> Greg Stein wrote: > > ... > > Yup. But when people say "Python is 10X slower", then you want a 10X speed > improvement to shut them up :-) Damn would that be sweet. We'll get some nice marketing on the safety side, too. One of the first questions Java programmers ask is: "is it like Perl in that its only going to catch my typing mistakes at runtime??" I'd love to be able to say: "If you ask it to." -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things see no end: A loop with exit code done wrong A semaphore untested, and the change that comes along http://www.geezjan.org/humor/computers/threes.html From tim_one@email.msn.com Sun Dec 19 04:42:50 1999 From: tim_one@email.msn.com (Tim Peters) Date: Sat, 18 Dec 1999 23:42:50 -0500 Subject: [Types-sig] Keyword arg declarations In-Reply-To: <00c101bf48bf$fb9850c0$c355cfc0@ski.org> Message-ID: <000101bf49db$810c5dc0$16a2143f@tim> [David Ascher] > ... > An example of such a signature is familiar to all is the signature for > range(). The docstring for range reads: > > range([start,] stop[, step]) -> list of integers > > which is not expressible with the current syntax. 
A Python > version of range would have to do, much like NumPy's arange does,
>
> def range(start, stop=None, step=1):
>     if (stop == None):
>         stop = start
>         start = 0

Or whrandom's funkier:

    def randrange(self, start, stop=None, step=1,
                  # Do not supply the following arguments
                  int=int, default=None):

> Now, the builtin typechecker can of course be told about > __builtin__.range's signature peculiarities, but is there > any way we can address the more general problem? Or is it, > as I suspect, rare enough that one can ignore it? I suggest you're wrestling with an illusion here: Python *internally* has no such form of argument list as

    range([start,] stop[, step])

That is, that's just the way the *doc* is written, to make it clearer. bltinmodule.c's builtin_range analyzes the snot out of the arglist, much like the Python versions do. A clue that the doc makes no actual sense is that it apparently allows expressing a stop and step without a start. Everything the builtin truly requires can be captured via the declaration

    decl range: def(Int, =Int, =Int) -> [Int]

using =Type notation for optional arguments that are not also keyword arguments. If the builtin also accepted these as keyword arguments, this could be expressed as (dropping my customary "|" in favor of GregS's "or"):

    decl range: def(stop: Int) -> [Int] or \
                def(start: Int, stop: Int, step=:Int) -> [Int]

using name=:Type notation for an optional keyword argument. An alternative is to change the docs to match what actually happens. but-no-need-for-extremes-ly y'rs - tim From da@ski.org Sun Dec 19 06:54:31 1999 From: da@ski.org (David Ascher) Date: Sat, 18 Dec 1999 22:54:31 -0800 Subject: [Types-sig] Keyword arg declarations References: <000101bf49db$810c5dc0$16a2143f@tim> Message-ID: <00ac01bf49ed$e76a3f80$df55cfc0@ski.org> From: Tim Peters > I suggest you're wrestling with an illusion here: Python *internally* has > no such form of argument list as > > range([start,] stop[, step]) > > That is, that's just the way the *doc* is written, to make it clearer. I know that. However, I can imagine that it will be hard to justify to the unwashed masses why they need to use seemingly unrelated syntax to describe the signature for humans and the signature for the compiler. I believe that you raise a similar point in another of your posts, w.r.t the 'int=int, ord=ord' extra junk in your function definition. That said, I suspect that the issue is peripheral and rare enough that I needn't worry. --da From paul@prescod.net Sun Dec 19 09:20:20 1999 From: paul@prescod.net (Paul Prescod) Date: Sun, 19 Dec 1999 03:20:20 -0600 Subject: [Types-sig] Easier? References: <385AD61D.548EB5F8@maxtal.com.au> Message-ID: <385CA354.18C252A7@prescod.net> skaller wrote: > > ... > > I'm not asking for that, just trying to explain how > important conformance issues and specifications are > in optimisation, and in particular, how important > it is that certain operations NOT be defined > (even by a requirement that an exception be thrown). I think that you should build an inferencer for a Python subset where TypeError and module-rebindings are illegal. If you get a serious speedup then those features will de facto fall out of use because people will want to use your inferencer to get serious speedups. At that point we can officially deprecate those features.
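The module rebinding being outlawed in that subset is easy to show concretely (an illustrative sketch using 1.5-era library names, not code from the thread):

    import string

    def sneaky():
        # legal Python today: rebind a name inside another module after
        # import, so "string.upper" can no longer be resolved statically
        string.upper = lambda s: s

An inferencer for the restricted subset would reject sneaky(); once such assignments are illegal, every use of string.upper in a program can be bound -- and type-checked -- at compile time.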
-- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things never trust in: That's the vendor's final bill The promises your boss makes, and the customer's good will http://www.geezjan.org/humor/computers/threes.html From paul@prescod.net Sun Dec 19 09:20:27 1999 From: paul@prescod.net (Paul Prescod) Date: Sun, 19 Dec 1999 03:20:27 -0600 Subject: [Types-sig] Type Inference I References: <385C2A08.DD36C485@maxtal.com.au> Message-ID: <385CA35B.5F919934@prescod.net> If you find that a restriction like this practically allows interesting type inference then I would propose a rule similar to the following: "If a Python compiler can determine that there is a code path through the program that raises TypeError it may reject the program. If it does not reject the program then it must report the TypeError at runtime by throwing an exception." -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things see no end: A loop with exit code done wrong A semaphore untested, and the change that comes along http://www.geezjan.org/humor/computers/threes.html From paul@prescod.net Sun Dec 19 11:44:08 1999 From: paul@prescod.net (Paul Prescod) Date: Sun, 19 Dec 1999 05:44:08 -0600 Subject: [Types-sig] parameterized typing (was: New syntax?) References: Message-ID: <385CC508.D8684CEC@prescod.net> Greg Stein wrote: > > .... > Paul: does this sufficiently address your desire for parameterized types? > Others: how does this look? It seems quite Pythonic to me, and is a basic > extension of previous discussions (and to my thoughts of the design). Without thinking every detail through it looks good to me for handling parameterized classes. I think that parameterized typedecls and functions are still an issue. Also, was it your intent that the _ be required or would the fact that the param was declared obviate that. I am thinking that there may a more general syntax that allows us to parameterize various sorts of things. interface (a,b) foo: ... class (a, b) foo: ... def (a, b) foo(a) -> b: decl foo(a,b) = typedef ... -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things see no end: A loop with exit code done wrong A semaphore untested, and the change that comes along http://www.geezjan.org/humor/computers/threes.html From paul@prescod.net Sun Dec 19 10:30:39 1999 From: paul@prescod.net (Paul Prescod) Date: Sun, 19 Dec 1999 04:30:39 -0600 Subject: [Types-sig] Syntax References: Message-ID: <385CB3CF.AB7343AE@prescod.net> Greg Stein wrote: > > > lot of difference between the widely embraced Tim-syntax and the syntax > > I posted a few days ago (based on the Tim-syntax). But if putting the > > keyword "decl:" in front makes it feel better then I'm all for that! > > Sorry. I won't let you rewrite history :-). You were suggesting a new, > alternative syntax, rather than adding new syntax to Python. I was suggesting a new, alternative syntax that would eventually become a part of the Python runtime system. I said that the new, alternative syntax should go in separate files for now because that makes implementation simpler. What I argued against was restricting ourselves to Python as it exists today, in particular nested dicts and lists. > Tim and I > (and some others) have lobbied for adding new syntax. In particular, I > don't want to see Yet Another Language and Yet Another Parser to deal with > a distinct language/syntax for type specifications. decl has a grammar. It *is* Yet Another Language. 
As Tim says, it is rapidly approaching the complexity of Python itself. :) I am still guessing that for the first version it will also use Yet Another Parser because I don't want to change the real Python parser while we are in development mode. Are we going to set up our own CVS tree and have all of our testers install a new binary when we change the precedence of the ! operator or add a feature to the decl statement? I would rather send them one or two new Python files. *If* 1.6 is coming when we need it to, then we could give it a very informal grammar for decl that basically stops at a comment or line boundary. That could invoke our (Python coded) decl-parser. More likely, we will want to test things out before 1.6 so we will probably stuff decl statements into expression statement-strings or shadow files. Either way, we have our own (sub-)grammar and (sub-)parser based, it seems, on Haskell. > Regardless: I'd hope that the first step to any implementation is to > update the Python grammar and allow us to annotate existing Python > programs (i.e. to use inline syntax). Updating the grammar is not super > difficult, but I hear you about wanting to not use another binary. But > I'll just shrug that off and say that's your problem :-) Updating the grammar is not super-difficult but getting it right the first time is difficult. I cannot believe that nobody in parser-land has written a Python-based Python parser that we can hack. Whatever happened to the ethic that a parser-generator was not done until it could parse the language it was written in? That a Real Programming Language was not done until it could compile itself? :) -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things see no end: A loop with exit code done wrong A semaphore untested, and the change that comes along http://www.geezjan.org/humor/computers/threes.html From paul@prescod.net Sun Dec 19 10:41:35 1999 From: paul@prescod.net (Paul Prescod) Date: Sun, 19 Dec 1999 04:41:35 -0600 Subject: [Types-sig] Optionality and performance References: Message-ID: <385CB65F.FAE20948@prescod.net> Greg Stein wrote: > > > Optionality of type checking is not about it being enabled or disabled. > > Even when it is enabled, type checking any particular method must be > > optional. This whole discussion should presume "enabled". But > > optionality is still important. > > I'm assuming that we type-check a module at a time -- that we don't have > the kind of fine-grained checking you're assuming. If a person doesn't > want find-grained checking, then they just shouldn't add type annotations > there. Here's the issue I tried to address in RFC 1.0 under the term "safety declarations".

    def foo() -> String:
        # 10,000 lines of code
        return str( abc )

This function is guaranteed to meet its type signature if it completes, but it is not necessarily type-safe in the Java sense. Anywhere within it, an integer could be added to a string or a ".foo" invoked on a float. For ERR it is important to be able to say that this function is type-safe if there is some important reason that it really should not fail. The type system can't guarantee termination but it can at least ensure that TypeError and AttributeError will never be triggered by this code. For a big part of our target audience, this assertion is the reason for the exercise. For OPT it is important to be able to say: "I need this code to run like a bat out of hell and I believe that there are no string lookups or runtime type checks required. Please verify that for me."
So we need "decl type-safe foo" or something like that. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things see no end: A loop with exit code done wrong A semaphore untested, and the change that comes along http://www.geezjan.org/humor/computers/threes.html From paul@prescod.net Sun Dec 19 11:12:08 1999 From: paul@prescod.net (Paul Prescod) Date: Sun, 19 Dec 1999 05:12:08 -0600 Subject: [Types-sig] Issue: binding assertions Message-ID: <385CBD88.ECC0009F@prescod.net> Okay, we need to move to conclusion on certain issues so I can make a new RFC. A week ago I posted a proposal that had a concept called "binding assertion": #2. The system must allow authors to make assertions about the sorts of values that may be bound to names. These are called binding assertions. They should behave as if an assertion statement was inserted after every assignment to that name program-wide. Greg argued relatively persuasively that it was more Pythonic to allow the same variable to have varying values over time. This is great in the local case but causes problems in cases like this: def brian(a) -> int: a.spam=42 somefunc( a ) return a.spam+a.parrot Without getting into nightmares of "const" and so forth, we need to deal with the fact that we can't know the type of a.spam and a.parrot without analyzing somefunc and maybe other code. We don't have that option. Therefore we must be able to guarantee at least the type, if not the values, of a.parrot and a.spam. Now if a is a module object then hard-coding the values is equivalent to declaring the static, unchanging type of global variables. So in at least the module and class instance cases, we are binding types to names, not values. This begs the question: is there any reason to treat parameters differently and allow parameters to vary over their lifetime? And what about declared local variables? Surely they must behave the same as global variables! My personal vote is that we treat all variables the same. Someone who wants to allow some particular name to vary its type over its lifetime has the following options: * turn off type checking * declare it to be of type Any * declare it to be of type dict | list of string If we are too strict at first then we could ease the rule based on feedback. I *do* understand that the vast majority of local variables (not parameters) should have their types inferred (perhaps just as "Any") rather than declared. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things see no end: A loop with exit code done wrong A semaphore untested, and the change that comes along http://www.geezjan.org/humor/computers/threes.html From paul@prescod.net Sun Dec 19 11:58:29 1999 From: paul@prescod.net (Paul Prescod) Date: Sun, 19 Dec 1999 05:58:29 -0600 Subject: [Types-sig] Issue: definition of "type" Message-ID: <385CC865.13C842A5@prescod.net> A "static type" is either a statically declared (top-level) class or something declared with a "decl type" statement or whatever we come up with. Jim Fulton and Max Skaller notwithstanding, we do not seem to be moving in the direction that any Python name can serve as a type. For instance, these things are not types: if somefunc(): class spam: foo: String else: class spam: foo: int spam is a class but not a static type. Jim Fulton also defines some ways to make interfaces at runtime. Those are also not "static types" for our purposes. An interface constructed at the top level would be a valid static type. 
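Restated in plain 1.5 syntax (without the proposed member annotations), the distinction looks like this -- an illustration only, no new semantics:

    class Point:                  # top-level and statically visible:
        pass                      # a candidate "static type"

    import whrandom
    if whrandom.random() < 0.5:   # which definition wins depends on run time
        class spam:
            flavour = "string"
    else:
        class spam:
            flavour = "int"
    # afterwards spam *is* a class, but no static checker can say which one,
    # so the name cannot serve as a static type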
-- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things see no end: A loop with exit code done wrong A semaphore untested, and the change that comes along http://www.geezjan.org/humor/computers/threes.html From paul@prescod.net Sun Dec 19 11:59:23 1999 From: paul@prescod.net (Paul Prescod) Date: Sun, 19 Dec 1999 05:59:23 -0600 Subject: [Types-sig] Any References: Message-ID: <385CC89B.D228E925@prescod.net> Can somebody tell me why we are looking at the word "Any" instead of "Object" or "PyObject." I think that it is useful to encourage people to think of Python's type system as a graph and not a set. Or to put it another way, we should emphasize what is common about all Python objects like the ability to convert them to a string. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things see no end: A loop with exit code done wrong A semaphore untested, and the change that comes along http://www.geezjan.org/humor/computers/threes.html From gstein@lyra.org Sun Dec 19 19:10:37 1999 From: gstein@lyra.org (Greg Stein) Date: Sun, 19 Dec 1999 11:10:37 -0800 (PST) Subject: [Types-sig] Optionality and performance In-Reply-To: <385CB65F.FAE20948@prescod.net> Message-ID: I'm not sure what point your're making here -- it seems to be different than my issue. On Sun, 19 Dec 1999, Paul Prescod wrote: > Greg Stein wrote: > > > Optionality of type checking is not about it being enabled or disabled. > > > Even when it is enabled, type checking any particular method must be > > > optional. This whole discussion should presume "enabled". But > > > optionality is still important. > > > > I'm assuming that we type-check a module at a time -- that we don't have > > the kind of fine-grained checking you're assuming. If a person doesn't > > want find-grained checking, then they just shouldn't add type annotations > > there. > > Here's the issue I tried to address in RFC 1.0 under the term "safety > declarations". > > def foo() -> String: > # 10,000 lines of code > return str( abc ) > > This function is guaranteed to meet its type signature if it completes, I agree. Although, I would state that it is only guaranteed if the type-checking is enabled when you compile the *module*. Specifically: the presence of the return type means that the type-checker will verify the return value as matching that type. This process is enabled when the type-checker is enabled; the checking is *not* done if the type-checking is not enabled. You enable/disable compile-time, static type checking on a *module* basis, when you compile the module. Given two declarations: def foo(): ... def bar() -> String: ... The foo() function will always pass the type-checker because it assumes "-> any" as the return value, which is satisfied by whatever foo() might return. The bar() function passes IFF it only returns strings (and doesn't fall off the end of the function, which implies a "return None"). > but it is not necessarily type-safe in the Java sense. Anywhere within > it, an integer could be added to a string or a ".foo" invoked on a > float. Sure... > For ERR it is important to be able to say that this function is > type-safe if there is some important reason that it really should not > fail. The type system can't guarantee termination but it can at least > ensure that TypeError and AttributeError will never be triggered by this > code. For a big part of our target audience, this assertion is the > reason for the exercise. Absolutely. 
> For OPT it is important to be able to say: "I need this code to run like > a bat out of hell and I believe that there are no string lookups or > runtime type checks required. Please verify that for me." So we need > > "decl type-safe foo" > > or something like that. I'm saying we don't declare a need for type-safety. I'm saying that type-safety checking is preconditioned on two things: 1) type-checking is enabled when the module is compiled 2) type annotations are present So: when the compilation process is occurring and type-checking is enabled, then it will verify as many types as possible. Now, maybe it is desirable to have a second switch that says "foo.bar() should fail if you cannot ensure foo has a bar method taking no parameters." I'm presuming that without this switch, the type-checker could assume that anything of type "any" would have any method that is used. Just to clarify since I think we missed each other: 1) the process of type-checking is an aspect of the compilation process and is enabled/disabled at the module level 2) a basic level of type checking states "check whatever you know about" 3) a stricter level of type checking states "all types must be done and verified correct." Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Sun Dec 19 19:23:18 1999 From: gstein@lyra.org (Greg Stein) Date: Sun, 19 Dec 1999 11:23:18 -0800 (PST) Subject: [Types-sig] development approach (was: Syntax) In-Reply-To: <385CB3CF.AB7343AE@prescod.net> Message-ID: On Sun, 19 Dec 1999, Paul Prescod wrote: >... > I am still guessing that for the first version it will also use Yet > Another Parser because I don't want to change the real Python parser > while we are in development mode. Are we going to set up our own CVS > tree and have all of our testers install a new binary when we change the > precedent of the ! operator or add a feature to the decl statement? I > would rather send them one or two new Python files. Our own CVS tree? Nah. I think that once we reach consensus and have Guido Approval, then it goes right into the main CVS tree. I think the issue here would be whether or how much we find we must iterate. I don't see any iteration happening with the grammar, especially once we have Guido Approval. > *If* 1.6 is coming when we need it to, then we could give it a very What do you mean "when we need it to" ? I didn't realize that there was any "need" being discussed. I certainly am not interested in building a system that is some kind of hackery add-on to 1.5. Design, build, and integrate into 1.6, IMO. > informal grammar for decl that basically stops at a comment or line > boundary. That could invoke our (Python coded) decl-parser. More likely, > we will want to test things out before 1.6 so we will probably stuff > decl statements into expression statement-strings or shadow files. > Either way, we have our own (sub-)grammar and (sub-)parser based, it > seems, on Haskell. Well... whoever codes it gets to decide :-). I'm just stating for the record, that I believe the best approach is to directly start working on the grammar changes [rather than use a short-term, throw-away solution]. If I do any coding on this, then it will use that approach. > > Regardless: I'd hope that the first step to any implementation is to > > update the Python grammar and allow us to annotate existing Python > > programs (i.e. to use inline syntax). Updating the grammar is not super > > difficult, but I hear you about wanting to not use another binary. 
But > > I'll just shrug that off and say that's your problem :-) > > Updating the grammar is not super-difficult but getting it right the > first time is difficult. I disagree, but that's okay. > I cannot believe that nobody in parser-land has written a Python-based > Python parser that we can hack. Whatever happened to the ethic that a > parser-generator was not done until it could parse the language it was > written in? That a Real Programming Language was not done until it could > compile itself? :) I think a number of people have done this. Go take a look for it. One of my projects for 1.6 is to insert a hook for the parsing and the compilation process. This would allow Python-level code to replace the parse step, and/or Python code to replace the bytecode compilation. Once those hooks are in, then it will be pretty much a given to have a Python parser written in Python. And a bytecode compiler written in Python. Specifically: can you replace each step, and parse the same files, producing the same set of bytecodes? [ I'll be doing the hooks; not sure if I want to write the replacements, though ] Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Sun Dec 19 19:47:27 1999 From: gstein@lyra.org (Greg Stein) Date: Sun, 19 Dec 1999 11:47:27 -0800 (PST) Subject: [Types-sig] Issue: binding assertions In-Reply-To: <385CBD88.ECC0009F@prescod.net> Message-ID: On Sun, 19 Dec 1999, Paul Prescod wrote: >... > def brian(a) -> int: > a.spam=42 > somefunc( a ) > return a.spam+a.parrot > > Without getting into nightmares of "const" and so forth, we need to deal > with the fact that we can't know the type of a.spam and a.parrot without > analyzing somefunc and maybe other code. We don't have that option. > Therefore we must be able to guarantee at least the type, if not the > values, of a.parrot and a.spam. You cannot know the type of because you did not add a type declaration to the parameter. If you had the following code: class Foo: decl member spam: Int decl member parrot: Int ... def somefunc(x: Foo): ... def brian(a: Foo) -> Int: ... THEN, you could assert type-safety. But if you don't even tell the thing what type your inputs are: well... your problem. What I believe is a distinct issue: while the interface specification of Foo tells you what a.spam *is*, I believe we have a separate problem of deciding whether to *enforce* that. While I am not strictly opposed to enforcing type safety during assignment, I would ask that you please list this as two problems: 1) declaring an interface, 2) enforcing type-safety during assignments. > Now if a is a module object then hard-coding the values is equivalent to > declaring the static, unchanging type of global variables. So in at Declaring module globals is really declaring the module *interface*, IMO. The fact that the interface happens to be implemented with globals is beside the point. In fact, it might be neat to have something like this: module Foo: decl member a: Int decl member b: Int def foo(x: String) -> String: "doc string" # no code! def bar(y: String) -> None: "doc string" Then at some point in your code, you could do: def baz(a: Foo) -> Int: ... While all this is neat, you'll notice that the "module Foo:" is exactly the same as doing "interface Foo:". > least the module and class instance cases, we are binding types to > names, not values. This begs the question: is there any reason to treat > parameters differently and allow parameters to vary over their lifetime? > And what about declared local variables? 
Surely they must behave the > same as global variables! I think you are binding types as part of an interface declaration. That is quite different than binding locals. > My personal vote is that we treat all variables the same. Someone who > wants to allow some particular name to vary its type over its lifetime > has the following options: > > * turn off type checking > * declare it to be of type Any > * declare it to be of type dict | list of string > > If we are too strict at first then we could ease the rule based on > feedback. I *do* understand that the vast majority of local variables > (not parameters) should have their types inferred (perhaps just as > "Any") rather than declared. I believe that if we add syntax to declare locals, then we are going to have a real hard time getting rid of that syntax. I would rather proceed with caution in this regard: if we simply can't get the type inferencing (or deduction as some people would say) working, then I'll relent and advocate declaring locals. But until we reach that step, I'd like to avoid adding new syntax for declaring locals. If people want to use local declarations for commentary purposes, I would just tell them: def foo(a, b): #decl local c: Int ... c = 5 ... In other words, Python isn't going to provide direct assistance for declaring the things. Note that, for a function, I do not believe there is much merit to declaring types for enforcement purposes. I don't want to add syntax and the resulting non-cleanliness to deal with people who do the following: def foo(): decl local c: Int c = "foo" Tough for them. I think most programmers can easily keep track of types within a single function. We will be helping them out, certainly, with the type checking, but it just doesn't happen as a type-enforcement at assign time. [ in other words: I think type problems occur at boundaries (func calls) rather than within the code/variables of a single function ] Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Sun Dec 19 20:02:53 1999 From: gstein@lyra.org (Greg Stein) Date: Sun, 19 Dec 1999 12:02:53 -0800 (PST) Subject: [Types-sig] Issue: definition of "type" In-Reply-To: <385CC865.13C842A5@prescod.net> Message-ID: On Sun, 19 Dec 1999, Paul Prescod wrote: > A "static type" is either a statically declared (top-level) class or > something declared with a "decl type" statement or whatever we come up > with. Jim Fulton and Max Skaller notwithstanding, we do not seem to be > moving in the direction that any Python name can serve as a type. For euh... I don't think so. We should be able to do the following: import types Int = types.IntType int = type(1) def foo(x: Int) -> int: return x I think the compiler/inferencer will understand the "types" module's interface and will understand the type() builtin. There are probably a few other ways to get type information (e.g. foo.__class__) that it also must understand. But each of these mechanisms are quite traceable. For example, as the inferencer is doing its type analysis, if a value has TypeType, then it remembers the value, too. That value can then be used in the future for type assertions. In the above example, the inferencer sees that types.IntType is a TypeType (with value ). As part of its normal effort, it records that Int has a TypeType value. In this case, it also records that the value is . The inferencer also understands that type(1) returns and records the appropriate bits with "int". 
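A toy model of that bookkeeping (a sketch for illustration, done here at run time; the real inferencer would draw the same conclusions from the parse tree without executing anything):

    import types

    known_types = {}

    def record_binding(name, value):
        # plain inference would only note "value is of type TypeType";
        # the extra step is remembering *which* type object it was, so the
        # name can be resolved later when it appears in an annotation
        if type(value) is types.TypeType:
            known_types[name] = value

    record_binding("Int", types.IntType)    # models:  Int = types.IntType
    record_binding("int", type(1))          # models:  int = type(1)
    print known_types["Int"] is known_types["int"]   # 1: same type object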
> instance, these things are not types: > > if somefunc(): > class spam: > foo: String > else: > class spam: > foo: int > > spam is a class but not a static type. I disagree. After the if/else statement, spam is effectively: spam = typedef _internal_interface_spam1 or _internal_interface_spam2 i.e. the type inferencer simply understands that the typedecl "spam" is a class with one of two interfaces. > Jim Fulton also defines some ways to make interfaces at runtime. Those > are also not "static types" for our purposes. An interface constructed > at the top level would be a valid static type. I agree: runtime constructions such as the JimF stuff are not static types. However, I disagree with JimF's mechanism for interface definition. I believe our syntax changes are sufficient. [ still need the associated functions such as implements() ] Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Sun Dec 19 20:04:32 1999 From: gstein@lyra.org (Greg Stein) Date: Sun, 19 Dec 1999 12:04:32 -0800 (PST) Subject: [Types-sig] Any In-Reply-To: <385CC89B.D228E925@prescod.net> Message-ID: On Sun, 19 Dec 1999, Paul Prescod wrote: > Can somebody tell me why we are looking at the word "Any" instead of > "Object" or "PyObject." I think that it is useful to encourage people to > think of Python's type system as a graph and not a set. Or to put it > another way, we should emphasize what is common about all Python objects > like the ability to convert them to a string. Comes from CORBA/IDL. I don't think most people view the number 5 as an object. Sure, Python happens to view it that way, but people will say "that's a number!". From a human perspective, "any" makes more sense than "object". Cheers, -g -- Greg Stein, http://www.lyra.org/ From paul@prescod.net Sun Dec 19 21:58:38 1999 From: paul@prescod.net (Paul Prescod) Date: Sun, 19 Dec 1999 15:58:38 -0600 Subject: [Types-sig] Issue: binding assertions References: Message-ID: <385D550E.AD1F0F2F@prescod.net> Greg Stein wrote: > > ... > You cannot know the type of because you did not add a type declaration > to the parameter. Yes, I meant to do so. > What I believe is a distinct issue: while the interface specification of > Foo tells you what a.spam *is*, I believe we have a separate problem of > deciding whether to *enforce* that. While I am not strictly opposed to > enforcing type safety during assignment, I would ask that you please list > this as two problems: 1) declaring an interface, 2) enforcing type-safety > during assignments. That's fine. Do you support enforcing type safety during assignments? If not, doesn't the type declaration become meaningless documentation? And if you support enforcing type declarations during assignments, do you support doing so for assignments to: a) instance variables b) module variables c) local variables d) parameters If you could summarize your proposed syntax/semantics for the four types of assertions in a small chart, that would help a lot. > I believe that if we add syntax to declare locals, then we are going to > have a real hard time getting rid of that syntax. The syntax to declare locals would be the same syntax used to declare globals and instance variables. It would just be in the function context. Anyhow, I wasn't saying that we would ever get rid of the syntax. We could just allow variables so declared to vary across their lifetime. > I don't want to add syntax and > the resulting non-cleanliness to deal with people who do the following: > > def foo(): > decl local c: Int > c = "foo" Neither do I. 
But I also do not want to illogically restrict the syntax that is used in the module context, and class context from being available in the local context. I also do not want parameter declarations to have a very different semantic from instance variable declarations. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things see no end: A loop with exit code done wrong A semaphore untested, and the change that comes along http://www.geezjan.org/humor/computers/threes.html From paul@prescod.net Sun Dec 19 22:19:20 1999 From: paul@prescod.net (Paul Prescod) Date: Sun, 19 Dec 1999 16:19:20 -0600 Subject: [Types-sig] Issue: binding assertions References: <385D550E.AD1F0F2F@prescod.net> Message-ID: <385D59E7.6DDE668A@prescod.net> Paul Prescod wrote: > > Greg Stein wrote: > > > > ... > > You cannot know the type of because you did not add a type declaration > > to the parameter. > > Yes, I meant to do so. I mean that I meant to put in the type, but forgot to go back and do so. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things see no end: A loop with exit code done wrong A semaphore untested, and the change that comes along http://www.geezjan.org/humor/computers/threes.html From paul@prescod.net Sun Dec 19 22:29:18 1999 From: paul@prescod.net (Paul Prescod) Date: Sun, 19 Dec 1999 16:29:18 -0600 Subject: [Types-sig] Optionality and performance References: Message-ID: <385D5C3E.8609C4FF@prescod.net> Greg Stein wrote: > > ... > I'm saying we don't declare a need for type-safety. I'm saying that > type-safety checking is preconditioned on two things: > > 1) type-checking is enabled when the module is compiled > 2) type annotations are present And I'm saying that there are times when "as many as possible" is not enough. It is my presumption that this function will *always* pass the type checker: def foo() -> String: # 10,000 lines of code print "abc"+eval( raw_input() ) return str( abc ) Its declaration is correct. Sure, it may raise TypeError but Python isn't Java and we should make it easy to write partially type-safe code. The return value is verifiably correct and that is all that matters. But the same function would *never* pass the type checker if it was declared type-safe: decl type-safe foo Because its declaration is *incorrect*. Even though it returns what it should it is NOT typesafe because it can trigger one of TypeError or AttributeError. type-safe means safe in the Java/C++ sense which is a very different issue. There is no reason to require foo to be moved to a separate module if it is the only function that requires that level of safety. I can think of no interesting reason to require all functions in a module to have the same safety level. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things see no end: A loop with exit code done wrong A semaphore untested, and the change that comes along http://www.geezjan.org/humor/computers/threes.html From tim_one@email.msn.com Sun Dec 19 22:53:37 1999 From: tim_one@email.msn.com (Tim Peters) Date: Sun, 19 Dec 1999 17:53:37 -0500 Subject: [Types-sig] Type Inference I In-Reply-To: <385AB3FE.AE6C2630@maxtal.com.au> Message-ID: <000d01bf4a73$e37edda0$922d153f@tim> [John Skaller] > ... > For example, consider: > > x = 1 > y = open("something") > try: x + y > except: print "OK" > > This code is CORRECT python at the moment (AFAIK). > It is NOT 'illegal' to add a file and an integer, > it is perfectly correct to do it, and then handle > the resulting exception. 
What I wonder is why you imagine Guido didn't intend this; just as he intended that your "open" call above may also raise an exception (if e.g. "something" doesn't exist, or it does but the program doesn't have read permission, or ...). > There is no hope for any kind of type inference > until this is fixed. What must be said is that > this case is an error, and that Python can > do anything if the user does this: the result > of executing the code MUST be undefined. I think this is dead on arrival; almost nothing in Python was intended to be undefined. It might help if you gave an example that actually presented a difficulty : above, *if* you get beyond the "except", x is an int and y is an open file object, and an inferencer shouldn't care what the type of "x + y" is (it's not referenced). Change it to "z = x + y", and an inferencer knows that the type of z is the same as the type it had before entering the block of code shown (because the inferencer knows that int+file will blow up, so z won't get rebound). > The fact that a particular implementation throws an exception, > is good behaviour on the part of that particular implementation, > but it must NOT be required -- because that would prevent > a compiler rejecting the program. This SIG is *adding* (optionally enforced) rules about when compile-time detectable potential TypeErrors can cause compile-time rejection (just "potential" because nobody has signed up to do reachability analysis; that is,

    def f():
        return 2
        3 + open("3")

will probably get rejected in typecheck mode, despite that no runtime TypeError is possible (the offending stmt is unreachable)). > ... > The outcome of this is that really, the only times python > guarantees to raise an exception is for environment errors, > or for typing/indexing/lookup errors which are locally > wrapped. This certainly wasn't the intent! E.g., in certain endcases, and sometimes platform-dependent ones, Python has failed to raise OverflowError when appropriate ((-sys.maxint-1)/-1 on (at least) Pentiums is the most recent example that comes to mind). Guido has always considered such behaviors to be bugs in the implementation, and either fixes them or makes me do it . > ... > ** Keyboard Interrupt: this is wrong wrong wrong!! Agreed there! but-probably-not-a-complaint-for-the-types-sig-ly y'rs - tim From tim_one@email.msn.com Sun Dec 19 22:53:44 1999 From: tim_one@email.msn.com (Tim Peters) Date: Sun, 19 Dec 1999 17:53:44 -0500 Subject: [Types-sig] I've collected my thoughts... In-Reply-To: Message-ID: <000f01bf4a73$e6ef4c40$922d153f@tim> [GregS] > http://www.lyra.org/greg/python/type-proposal.html > ... > I'll keep adding to it as I think of things and hear back from > people. I'm going to try to slow down on this type stuff, though, > as I'm going to be starting work on the new import system. Thanks, Greg! I'm going to vanish entirely Real Soon -- pushed all my vacation time to the end of the year so I could get to the Python Conference (it was scheduled for Dec. when I did this), and just learned that I can't carry it over to next year. That is, "use it or lose it". After a day of thoughtful consideration , "use it" still appears to be the wiser choice.
preparing-to-celebrate-the-first-1000-glorious-years-of- types-sig-ly y'rs - tim From gstein@lyra.org Sun Dec 19 23:08:21 1999 From: gstein@lyra.org (Greg Stein) Date: Sun, 19 Dec 1999 15:08:21 -0800 (PST) Subject: [Types-sig] Issue: binding assertions In-Reply-To: <385D550E.AD1F0F2F@prescod.net> Message-ID: On Sun, 19 Dec 1999, Paul Prescod wrote: > Greg Stein wrote: >... > > What I believe is a distinct issue: while the interface specification of > > Foo tells you what a.spam *is*, I believe we have a separate problem of > > deciding whether to *enforce* that. While I am not strictly opposed to > > enforcing type safety during assignment, I would ask that you please list > > this as two problems: 1) declaring an interface, 2) enforcing type-safety > > during assignments. > > That's fine. Do you support enforcing type safety during assignments? If Generally, I don't support it in V1. I think the assignments are usually being done near to their definition, and only by the original author. In that sense, I think there won't be too many errors in type-incorrectness. In V2, then sure. But I want to separate the issue and discuss it later. We can implement a system without worrying about assignments. Small bites! > not, doesn't the type declaration become meaningless documentation? Absolutely not. In the following: class Foo: decl member bar: Int Anytime that the type-checker/inferencer refers to Foo_instance.bar, it knows what the type is. Very important. > And if you support enforcing type declarations during assignments, do > you support doing so for assignments to: > > a) instance variables > b) module variables The above two are part of assigning values to an interface's attributes. In the future: sure, enforcement would make sense. > c) local variables I don't think we should even be declaring these, thus rendering type-enforcement moot. > d) parameters Unsure. I'm punting my thoughts to V2. [ V2 meaning Type System V2, not Python 2.0 ... I don't even think Python 2.0 should be mentioned in our discussions... ] > If you could summarize your proposed syntax/semantics for the four types > of assertions in a small chart, that would help a lot. -- No enforcement at all for any assignment. -- All references use the declared type info (if any) for purposes of type checking > > I believe that if we add syntax to declare locals, then we are going to > > have a real hard time getting rid of that syntax. > > The syntax to declare locals would be the same syntax used to declare > globals and instance variables. It would just be in the function I disagree. The latter two are part of an interface declaration (of a module or a class instance). Locals are not part of an interface, so I don't think they fall into the same category at all. decl member a: Int That is an incorrect semantic for locals. And I don't support adding a "decl var" or "decl local" for local declaration. IMO, of course :-) > context. Anyhow, I wasn't saying that we would ever get rid of the > syntax. We could just allow variables so declared to vary across their > lifetime. My comment about getting rid of the syntax was based on the assumption that we *might* have local declarations until the inferencer is up-to-snuff. At that point, local declarations would be redundant. Problem is: the interim code that used the new syntax would break once we tried to remove that interim syntax. 
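To make the deferred "enforcement at assignment" concrete: what checking a "decl member bar: Int" on assignment would amount to can be expressed as a run-time analogue (a sketch only; a V2 system would do this statically in the compiler, and set_member() is a hypothetical helper, not proposed syntax):

    import types

    class Foo:
        # stand-in for "decl member bar: Int"
        member_types = {"bar": types.IntType}

        def set_member(self, name, value):
            # the check an enforcing type system would perform on assignment
            expected = self.member_types.get(name)
            if expected is not None and type(value) is not expected:
                raise TypeError, "%s must be %s, not %s" % (
                    name, expected, type(value))
            self.__dict__[name] = value

    f = Foo()
    f.set_member("bar", 42)        # accepted
    f.set_member("bar", "spam")    # rejected under enforcement: TypeError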
> > I don't want to add syntax and > > the resulting non-cleanliness to deal with people who do the following: > > > > def foo(): > > decl local c: Int > > c = "foo" > > Neither do I. But I also do not want to illogically restrict the syntax > that is used in the module context, and class context from being > available in the local context. I also do not want parameter > declarations to have a very different semantic from instance variable > declarations. I think the module/class context declarations are the same: interface member declarations. Parameters, locals, and return values have a different semantic altogether. Therefore, I don't see any illogic to keeping the syntax separate. I don't want to worry about enforcement, but I do want to worry about declaring parameters and return values so that we can properly infer/deduce the types of all expressions within a function. Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Sun Dec 19 23:14:51 1999 From: gstein@lyra.org (Greg Stein) Date: Sun, 19 Dec 1999 15:14:51 -0800 (PST) Subject: [Types-sig] Optionality and performance In-Reply-To: <385D5C3E.8609C4FF@prescod.net> Message-ID: Good point. I think that I kind of described this difference in a previous email, but you've stated it much more clearly. I see type-safety as a second generation step, building on top of type-checking. Can we defer these issues to a V2 system? In general: I'd be happy with adding a "typesafe" keyword, but I think it would behoove us to keep things small and simple. Let's just do type-checking and defer type-safety and assignment enforcement. Cheers, -g On Sun, 19 Dec 1999, Paul Prescod wrote: > Greg Stein wrote: > > ... > > I'm saying we don't declare a need for type-safety. I'm saying that > > type-safety checking is preconditioned on two things: > > > > 1) type-checking is enabled when the module is compiled > > 2) type annotations are present > > And I'm saying that there are times when "as many as possible" is not > enough. It is my presumption that this function will *always* pass the > type checker: > > def foo() -> String: > # 10,000 lines of code > print "abc"+eval( raw_input() ) > return str( abc ) > > Its declaration is correct. Sure, it may raise TypeError but Python > isn't Java and we should make it easy to write partially type-safe code. > The return value is verifiably correct and that is all that matters. > > But the same function would *never* pass the type checker if it was > declared type-safe: > > decl type-safe foo > > Because its declaration is *incorrect*. Even though it returns what it > should it is NOT typesafe because it can trigger one of TypeError or > AttributeError. type-safe means safe in the Java/C++ sense which is a > very different issue. > > There is no reason to require foo to be moved to a separate module if it > is the only function that requires that level of safety. I can think of > no interesting reason to require all functions in a module to have the > same safety level. > > -- Greg Stein, http://www.lyra.org/ From tim_one@email.msn.com Sun Dec 19 23:47:02 1999 From: tim_one@email.msn.com (Tim Peters) Date: Sun, 19 Dec 1999 18:47:02 -0500 Subject: [Types-sig] Type Inference II In-Reply-To: <385AE827.42ECB891@maxtal.com.au> Message-ID: <001701bf4a7b$59ab61e0$922d153f@tim> [John Skaller] > ... > Another case which inhibits optimisation is loop variables. > In the loop: > > for x in y: .... > > is it allowed to assign to x? Yes (explicitly allowed by section 7.3 of the Lang Ref). > What about mutating y? 
Yes but delicate (see 7.3 again); e.g., I have a great deal of code that does breadth-first traversals via: sawit = {root: 1} # set of things seen so far sequence = [root] for thing in sequence: for child in thing.children(): if not sawit.has_key(child): sawit[child] = 1 sequence.append(child) return sequence # a breadth-first list of everything reachable > What about mutating x? Sure; this one's so obvious it doesn't need to be said <0.9 wink>. > [Also, an aside: the code > > for x[1] in y: ... > for x.attr in y: ... > > is allowed but I can't see a real use for it. > Is there one? I used this once, many years ago, but felt silly doing it. There was no real need for it. Can't recall ever seeing it in other folks' code, either. Wouldn't miss it. > Could we simplify the syntax, and required the loop control to > be a whole variable, or tuple of whole variables (recurively), > so that the names involved are always bound directly? OK by me, but do note it would make CPython's grammar more complicated (in the sense that Grammar/Grammar would grow larger); the current for_stmt: 'for' exprlist 'in' testlist ':' suite ['else' ':' suite] shares "exprlist" with the "del" stmt. > ... > Tightening up for loops will break code that does things like: > > (1) do extra increments on an loop variable to skip cases Nope. Assigning to the loop variable has no effect on the values assigned *to* it by the for stmt: >>> for i in range(5): print i i = i ** 10 print i 0 0 1 1 2 1024 3 59049 4 1048576 >>> > (2) mutate a list while scanning it Again this shouldn't break anything. "The rules" are already set up to reflect a simple one-at-a-time implementation. When you talk about "caching" the list, though, I don't know what that could mean other than making a *copy* of it -- but that would be a pessimization, not a speedup. In any case, loop overhead is usually minor. It's usually the guts of the loop that consume the time. > ... > _f = f # protect our f > from MODULE import * # might destroy f > f = _f # set f back > del _f # get rid of temporary _f > > This is ugly. It also makes inference harder, because > there is now a control flow issue. > > One idea I had was this: > > import X as Y # import X, but name it Y in this module > from M import x as v Has been suggested many times on c.l.py. To the best of my knowledge, Guido has never responded to one of these. But in any case I'd say that a "real inferencer" that can't deal with a branch-free basic block isn't real enough for the Types-SIG to worry about -- "import as" belongs in a different forum. "import *" is a legit Types-SIG headache, though; Guido already voted "if they do that, tough nuts to them". > ... > I note python currently supports privacy by name mangling, > but really, this is a hack: for Python 2, a more sophisticated > architecture would be better. It had a more sophisticated one, years ago (the now-defunct "access" stmt); experience with that was bad, so it was tossed and __ was added; Guido isn't likely to backtrack here. it's-a-*cute*-hack-that-works-well-in-practice-ly y'rs - tim From tim_one@email.msn.com Mon Dec 20 00:59:33 1999 From: tim_one@email.msn.com (Tim Peters) Date: Sun, 19 Dec 1999 19:59:33 -0500 Subject: [Types-sig] List of FOO In-Reply-To: <385AF7C9.7CEE78E5@maxtal.com.au> Message-ID: <001801bf4a85$7aba51c0$922d153f@tim> [John Skaller] > ... 
> If I can summarise: there is considerable advantage using > arbitrary objects as type objects: they can be specified > using EXISTING python syntax, using the power of the EXISTING > python interpreter, without needing a special, second class > language, to complicate python, and pose an additional > implementation overhead. One of the groundrules here is that the SIG's work cannot require importing (or executing via any other means) modules. It *may* be OK to compile them and deviously suck out their code objects for inspection, but execution is forbidden. That's why a "special, second class language" is attractive, provided it's recognizable as such: it can be analyzed without execution. If types are arbitrary run-time objects-- and *especially* if they "use the power of the ... Python interpreter", I don't see how the compilation process could get *at* them without execution. statically y'rs - tim From tim_one@email.msn.com Mon Dec 20 00:59:37 1999 From: tim_one@email.msn.com (Tim Peters) Date: Sun, 19 Dec 1999 19:59:37 -0500 Subject: [Types-sig] New syntax? In-Reply-To: Message-ID: <001901bf4a85$7c8ec3a0$922d153f@tim> [Tim] > Yes, Any is good. [GregS] > I've listed this in my proposal as an open question. I'm leaning > to "formally endorsing" it. My only real opposition is whether > it must be a new keyword, or we can find some other way to deal > with it. > > For example: > > import types > Int = types.IntType > String = types.StringType > Any = None > > decl foo: Any > decl bar: String > > The compiler isn't going to have recognized names for the types. I pushed almost everything into "decl" stmts so that type specification really was a sublanguage distinct from current Python, and specifically a declarative (no control flow of its own & no side-effects) sublanguage, fully evaluable at compile-time via simple means. To the extent that that's true, it can enjoy its own "compile time" namespace distinct from the runtime namespaces, and Int, Any, String, Boolean ... can be decreed to "just be there", by magic, *in* declarations, for purposes of compile-time type checking. If instead we have to interpret imports and binding stmts and attribute dereferences and ... to get at names for types, we pretty much have to *execute* the code -- and Guido won't go for that. Or, if he does, he shouldn't . The "static" in "static typing" has implications. > ... > Funny note: looking at the grammar, I've found the following is legal: > > def foo(bar, *args, * *kw): > ... > > In my typedecl syntax, I punted the ability to use "* *" ... you must > use "**". So there :-) Good! It's little-known that e.g. x = 2and 3 is legit Python but x = 2 and3 isn't, and I'm sure Guido would like to suppress that secret too. or-if-he-wouldn't-he-should-ly y'rs - tim From tim_one@email.msn.com Mon Dec 20 01:51:11 1999 From: tim_one@email.msn.com (Tim Peters) Date: Sun, 19 Dec 1999 20:51:11 -0500 Subject: [Types-sig] New syntax? In-Reply-To: Message-ID: <001a01bf4a8c$b22354c0$922d153f@tim> [Tim[ > ... > If I had a lot of binary integer functions to declare, I > would probably use a typedef, a la > > decl typedef BinaryFunc(_T) = def(_T, _T) -> _T > decl typedef BinaryIntFunc = BinaryFunc(Int) > ... > decl var intHandlerMap: {string: BinaryIntFunc} > decl var floatHandlerMap: {string: BinaryFunc(Float)} [GregS] > Okay, Tim. I'm going to stop you right here :-) Good -- the speed was killing me . > The problem with using "decl" to do typedefs is that it does > weird voodoo to associate the typedecl with the name (e.g. 
> BinaryFunc). Perhaps an earlier msg made this clearer: I've viewed "decl"s as (purely!) compile-time expressions. IOW, BinaryFunc is a compile-time name in the above; there's no implication that a name introduced by a "decl typedef" will appear in any runtime namespace (this doesn't preclude that in some modes the implementation may *want* to make a Python object of the same name available at runtime). > I believe my unary operator is much clearer to what is happening: > > BinaryIntFunc = typedef BinaryFunc(Int) This looks like a runtime stmt to me; if so, it's of no use to static (compile-time) type declaration. If it's not a runtime stmt, better to stick a "decl" (or something) in front of it to make that crucial distinction obvious. > In this case, it is (IMO) very clear that you are storing a typedecl > object into BinaryIntFunc, for later use. For example, we might see the > following code: > > import types > Int = types.IntType > List = types.ListType > IntList = typedef [Int] > ... This all looks like runtime code to me -- if so, how is a *compiler* supposed to get any benefit out of it? Or if not, how is a compiler supposed to recognize that it's not runtime code? > Hrm. I don't have a ready answer for your first typedef, though. That > is a new construct that we haven't seen yet. We've been talking about > parameterizing *classes*, rather than typedecls. > > *ponder* In my twisted little universe, I'm using a declarative language for compile-time type expressions, and BinaryFunc(_T) can be thought of as a compile-time macro -- same as the BinaryIntFunc typedef (except the latter doesn't take any arguments -- or does take no arguments ). >> ("|") should suffice. > "or" is more Pythonic. Agreed. I'm not sure what's in vogue among category theorists, though . > Bite me. :-) Yummy! > You do raise a good point in another post, however: > > def foo(*args: (Int)): > > Looks awfully funny. For a Python programmer, that looks like > grouping rather than a tuple. If it had a comma in there, then > it would look like a tuple. Worse, it would look like a tuple of length one, which *args is not. > But of course: there will never be more than one typedecl inside > there, so whythehell is there a comma? I think it should be legal to do, e.g., def foo(*args: (Int, Float, String)) -> whatever: This says the function takes exactly three arguments, of the given types, but gets them as the * tuple. Some people do that (typically if they're just going to pass the arglist on via apply(somfunc, args)). > *grumble* .... I don't have a handy resolution for this one. So let's make one up. The problem is spelling "tuple of unknown length" (and Paul's complaint notwithstanding, that *is* Python so we gotta deal with it). Python has no notation for this. OK: ... Tuple(T1, T2, T3) equivalent_to (T1, T2, T3) Tuple(T1, T2) equivalent_to (T1, T2) Tuple(T1,) equivalent_to (T1,) Tuple(T1) means tuple-of-T1 of unknown length So it's always *legal* to stick "Tuple" in front of a tuple specifier, and it's *required* in the last case. Actually, tuples show up in type specifiers rarely enough-- and look so much like grouping now --that I'd be happy requiring "Tuple" all the time. Again one of those things that could be relaxed later if it proved too irksome. 
if-only-you-could-relax-me-too-ly y'rs - tim From tim_one@email.msn.com Mon Dec 20 03:12:06 1999 From: tim_one@email.msn.com (Tim Peters) Date: Sun, 19 Dec 1999 22:12:06 -0500 Subject: [Types-sig] Type Inference I In-Reply-To: <385BB080.A2DBB67C@maxtal.com.au> Message-ID: <002001bf4a97$ff9a90a0$922d153f@tim> [John Skaller] > ... accept, temporarily, that we have only THREE types: > integers, strings, and > ... a function 'add(x,y)' exists, which throws an exception > if the types of x and y are not both integers, or both strings. > ... > Consider two cases: > > 1) The spec says: > IF the arguments are both ints .. > OR IF the arguments are both strings .. > OTHERWISE an exception is thrown > > 2) The spec says: > IF the arguments are both ints .. > OR IF the arguments are both strings ... > OTHERWISE THE BEHAVIOUR IS UNDEFINED > > There is a huge difference between these two cases for > a compiler. In case (2), the compiler can ASSUME > that given the call > > add(x,1) > > that x must be an integer. The philosophy of (at least) CPython is that core dumps and stack faults are never the user's fault -- any time that happens, it's a bug in Python itself. Some of those bugs will never get fixed <0.9 wink>, but they're considered to be Python bugs all the same. This is unlike C/C++/Fortran etc, and is one of Python's selling points relative to them. Now the kind of assumption you want to make above will lead to generating code that *can* cause core dumps when the assumption is false. For example, if x is actually a file object, you may well generate code that adds 1 to some internal field of the underlying FILE* struct. Not good -- not in Python. Spec #1 is the *intent* of the language. > ... > On the other hand, in case (1), the compiler cannot deduce > anything, at least from the given fragment, so it can NOT > generate fast code: it has to call > > PyAdd(x,One) > > or, perhaps do something like: > > if (PyTypeIsInt(x)) > x->value ++; > else PyRaise(SomeException) > > .. which involves an extra run time check, at least, > and is therefore much slower. The transformation is valid provided your inferencer is strong enough to prove that x *is* an int at this point. If it's not, it may be no significant loss anyway: most program time is spent inside loops, and runtime type checks can often be floated out; when they can't, the compiler can provide feedback that a user who gives a rip about speed can act on. > Therefore, there is a performance advantage in adopting > spec (2) as a language specification, instead of (1). Yes, but I doubt Python will ever go that route. It's harder to optimize Python than Fortran -- but, in return, you get to program in *Python* . > Note this does not mean the generated code will crash, > if x is not an integer. What it means is that if the > compiler detects that x is not an integer, it can > generate a compile time error. It is NOT allowed to > do that with specification (1). No, but under #1, and under the same assumption (that the compiler detects that x is not an int), it can generate code to raise the exception and skip the PyAdd business. That is, *if* the inferencer can *either* say "definitely an int" or "definitely not an int", it makes no practical difference whether spec #1 or #2 is in effect. > So my point is: the Python documentation contains > many examples where it says 'such and such an exception > is thrown', and this prevents generating fast code, > and it prevents early error detection. 
I'm sure that if anything comes of this SIG, people will ask for a compile *option* to warn about (or error on) every nastiness the analysis code can detect. > ... > But I have spent over five years on a standards committee, > and have some vague idea of the impact of specifications -- > and in particular lack of them -- on the ability to generate > fast, conforming code. Speed wasn't one of Python's primary goals; had you been on the ECMAS JavaScript committee instead, you wouldn't have heard anyone arguing about speed <0.9 wink>. C++ is the Fortran of OO languages ... not-that-speed-kills-ly y'rs - tim From tim_one@email.msn.com Mon Dec 20 03:51:12 1999 From: tim_one@email.msn.com (Tim Peters) Date: Sun, 19 Dec 1999 22:51:12 -0500 Subject: [Types-sig] Syntax In-Reply-To: <385BC0B6.3879244B@prescod.net> Message-ID: <002201bf4a9d$76357900$922d153f@tim> [Paul Prescod] > ... > In fact, I don't see a lot of difference between the widely > embraced Tim-syntax and the syntax I posted a few days ago > (based on the Tim-syntax). Me neither. > But if putting the keyword "decl:" in front makes it feel > better then I'm all for that! Ditto if taking "decl" away makes people feel better. I'm getting an increasingly strong suspicion that what I've had in mind doesn't match what *anybody* here has been talking about, though! That is, as covered in earlier msgs tonight, I've taken it as a given that type declarations must be fully identifiable and evaluable at compile time, without code execution. The real point of slopping "decl" in front of everything was to add just one new "compile time" keyword and statement (Guido's happiness will be proportional to 1./math.e**k, where k == the number of new keywords ). > I'm still thinking that it should go in another file because I > want to be able to experiment with this stuff WITHOUT maintaining > a new Python interpreter binary. I don't know what people are arguing about here. We'll need a separate file to declare the signatures of stuff supplied by C modules anyway. good-enough-to-start-with!-ly y'rs - tim From tim_one@email.msn.com Mon Dec 20 04:50:55 1999 From: tim_one@email.msn.com (Tim Peters) Date: Sun, 19 Dec 1999 23:50:55 -0500 Subject: [Types-sig] List of FOO In-Reply-To: <385C0F10.E003F55E@maxtal.com.au> Message-ID: <002401bf4aa5$cdb7d760$922d153f@tim> [John Skaller] > ... > Note I'm not against using a functional language's > type description for Python, a'la Tim/Haskell, > but it isn't clear that is going to work well either, The only thing that's clear is that *something's* going to work well -- we just have no idea what that is . > and it seems to involve 'extra' work, writing a parser > for a 'new' language, etc. But a language much simpler than Python, and one designed for the task rather than (hard?) pressed into service. There are reasons to be skeptical either way. also-reasons-to-be-optimisitic-ly y'rs - tim From tim_one@email.msn.com Mon Dec 20 04:51:04 1999 From: tim_one@email.msn.com (Tim Peters) Date: Sun, 19 Dec 1999 23:51:04 -0500 Subject: [Types-sig] type declaration syntax In-Reply-To: <385C1345.C21FF180@maxtal.com.au> Message-ID: <002701bf4aa5$d2334d60$922d153f@tim> [Tim] > My bet is that the vast majority of Python people asking for > "static typing" have in mind a conventional explicit system of > the Algol/Pascal/C (APC) ilk, and that decisions based on what > *inference* schemes can do are going to leave them very unhappy. [John Skaller] > I'm not sure why. 
The rest of my msg went on at some length about why they would be unhappy; I'm not going to repeat it here. Or if the "why" is wrt why it's a majority, you yourself recently wrote a very cogent piece on c.l.py explaining that: familiarity. Relatively few Python programmers have any experience with inference schemes; and the two on this SIG who have admitted to it (that's Paul & me, BTW) testified they didn't really care for it (I at least explicitly declare *every* function in Haskell, although I do generally skip delcaring local names). > My 'assumption' is that > > 1) a conservative inferencer is used, > which means it tries to optimise code by inference, > but if it isn't sure, it falls back to the usual run-time > checking -- that is, it faithfully reproduces the expected > behaviour no matter what. > > 2) optional static type declarations allow > the performance of the inferencer to be improved; > that is, to generate better code > > 3) it would also help to tighten up > the specifications of python, particularly > in areas like > > a) when is it OK to expect an exception > b) module freezing > > etc. All of which are irrelevant to the points in the msg to which you're replying. People from APC are accustomed to declaring types to communicate design requirements and semantic restrictions beyond the competence of inference to determine. If my first example was too subtle , at least read the intgamma example. > I would make the point that, as often is the case, > the client is 'asking' for X, but what they actually > need is Y, because they don't understand their own requirements. > That is, they may be 'asking' for APC style static typing, > but they have no idea what the implications are, > and if they knew, they would withdraw their application. Ditto. > I guess that NO python programmer wants to declare the type > of every single name, which is what APC style static type > checking requires. I would be delighted to if it sped some of my "marginal" programs by a factor of 2. My employer would be delighted to if it saved them from runtime TypeErrors next week instead of next year. Any tradeoff you can think of has a larger audience than you can imagine . repetitively y'rs - tim From tim_one@email.msn.com Mon Dec 20 04:51:02 1999 From: tim_one@email.msn.com (Tim Peters) Date: Sun, 19 Dec 1999 23:51:02 -0500 Subject: [Types-sig] Type Inference I In-Reply-To: <19991218235412.A15050@vet.uu.nl> Message-ID: <002601bf4aa5$d0c390c0$922d153f@tim> [Martijn Faassen] > ... > The counter argument I got to this before is that inferencing takes > place anyway in the case of expressions: > > def foo(a, b): > # Martijn's evil verbose format in yet another form > decl: > a = Int > b = Int > return Int > return a + b Tony also had a "declaration block" construct; they look nice. > 'a + b' would need inferencing to figure out what the type is of > the complete expression. I think that this argument overlooks > that this kind of evaluation is a lot more easy than a back- > tracking kind of inferencing. > ... > Though checking could be seen as a kind of inferencing, right? Or > are people confusing the issues? Initially I didn't consider the > expression evaluation stuff as inferencing either, but there's a > good argument to consider it so, not? Not to me -- it's logic-chopping. This is like the "compiler vs interpreter" arguments that pop up on c.l.py from time to time: yes, there's a fine line between compilation and interpretation, but Python today is nowhere near that line. It's an interpreter. 
Likewise there's a fine line between inferencing and checking, but in the common usage of the words, deducing the type of "a + b" *given* the types of a and b, and *given* the signatures of a.__add__ and b.__radd__, is not called inferencing. Insisting that it is cheapens the currency of the marketplace of verbal discourse . To the extent that you take away one or more of the the four givens, "inferencing" gets more and more appropriate. Rule of thumb: If it's something Algol and Fortran did Since The Beginning, it's unhelpful to call it inferencing. hmm!-we-could-just-begin-every-python-identifier-with-its-type-name- and-call-it-quits-early-ly y'rs - tim From tim_one@email.msn.com Mon Dec 20 04:51:08 1999 From: tim_one@email.msn.com (Tim Peters) Date: Sun, 19 Dec 1999 23:51:08 -0500 Subject: [Types-sig] type declaration syntax In-Reply-To: <19991219005308.A29210@chronis.pobox.com> Message-ID: <002801bf4aa5$d45d3240$922d153f@tim> For whatever reason, "xxx" sent the attached to me in email, but not to the SIG. Since it's relevent, I'm passing it on. yes-their-name-really-is-xxx-ly y'rs - tim -----Original Message----- From: xxx Sent: Sunday, December 19, 1999 12:53 AM To: Tim Peters Subject: Re: [Types-sig] type declaration syntax On Sat, Dec 18, 1999 at 03:56:44PM -0500, Tim Peters wrote: > [John Skaller] > > ... > > But the _return_ type doesn't need to be annotated as much. > > Why? Because the inferencer can usually deduce it: > > it's an output, the argument types are inputs. > > > > If the inferencer _cannot_ deduce the return type, > > it _also_ cannot check that the function is returning > > the correct type. > > "The" correct type (as opposed to "a type consistent with the > operations") is impossible for an inferencer to determine, but > this is addressed more to the SIG than to John : > > My bet is that the vast majority of Python people asking for > "static typing" have in mind a conventional explicit system of > the Algol/Pascal/C (APC) ilk, and that decisions based on what > *inference* schemes can do are going to leave them very unhappy. If this is so, I am a member of the majority. > Inference schemes commit two kinds of gross errors that the APC > camp won't abide: > > 1) Inferring types that aren't general enough. > 2) Inferring types that are too general. I always had more trouble following the inferencer in ml than simply declaring everything as in C. I think we should aim at one thing, a way to declare types, and what the different types will be. Once there is a standard for this in core python, a myriad of tools from optimizers to code checkers to cross-language compilers will become much more feasible. A type inferencer is one of those things, and it could be built for those who wished to have their types inferenced from a minimal set of type declarations. xxx From tim_one@email.msn.com Mon Dec 20 06:15:50 1999 From: tim_one@email.msn.com (Tim Peters) Date: Mon, 20 Dec 1999 01:15:50 -0500 Subject: [Types-sig] Keyword arg declarations In-Reply-To: <00ac01bf49ed$e76a3f80$df55cfc0@ski.org> Message-ID: <002c01bf4ab1$a96e2ce0$922d153f@tim> >> range([start,] stop[, step]) >> >> This is, that's just the way the *doc* is written, to make it >> clearer. [DavidA] > I know that. However, I can imagine that it will be hard to > justify to the unwashed masses why they need to use seemingly > unrelated syntax to describe the signature for humans and the > signature for the compiler. How would that need arise? Signatures for the builtins will come with the system. 
If an unwashed mass is clever enough to document their own function in a way that doesn't reflect the way they implemented it, they already understand the two points of view fine. I think there's more potential for confusion from automated tools (like IDLE's calltips) that dig out the actual signature instead of the documented one; e.g., for randrange, calltips pops up (start, stop=None, step=1, int=, default=None) > I believe that you raise a similar point in another of your > posts, w.r.t the 'int=int, ord=ord' extra junk in your function > definition. Yes; I'm wondering whether it's possible (or wise, or something) to *lie* about the true signature. > That said, I suspect that the issue is peripheral and rare enough > that I needn't worry. I agree; but if you do want to worry, think about calltips too . it's-the-end-users-i'm-not-worried-about-ly y'rs - tim From tim_one@email.msn.com Mon Dec 20 06:30:12 1999 From: tim_one@email.msn.com (Tim Peters) Date: Mon, 20 Dec 1999 01:30:12 -0500 Subject: [Types-sig] Syntax In-Reply-To: <385CB3CF.AB7343AE@prescod.net> Message-ID: <002d01bf4ab3$ab7c1360$922d153f@tim> [Paul Prescod] > I cannot believe that nobody in parser-land has written a > Python-based Python parser that we can hack. John Aycock's extremely general (any CF grammar, ambiguous or not) parsing framework comes with a Python grammar. I know Gordon McMillan hacked on that to fix some of the productions, but don't know whether John folded that back in yet (or, indeed, whether G gave it back to J!). It's probably the fastest way in the universe to get a parser working -- as well as the slowest way to actually parse . Definitely worth a look, anyway. > Whatever happened to the ethic that a parser-generator was not > done until it could parse the language it was written in? That a > Real Programming Language was not done until it could compile > itself? :) Coming from a background in Fortran compiler development, that always struck me as a charming myth kept alive by people in other fields. kinda-like-the-myth-that-rocket-scientists-are-smart-ly y'rs - tim From tim_one@email.msn.com Mon Dec 20 06:50:26 1999 From: tim_one@email.msn.com (Tim Peters) Date: Mon, 20 Dec 1999 01:50:26 -0500 Subject: [Types-sig] Issue: binding assertions In-Reply-To: <385CBD88.ECC0009F@prescod.net> Message-ID: <002e01bf4ab6$7f9d9400$922d153f@tim> [Paul Prescod] > ... > I *do* understand that the vast majority of local variables > (not parameters) should have their types inferred (perhaps just as > "Any") rather than declared. Just a reminder that if type decorations do someone actual good, to get that they'll put up with declaring everything at first. Inference is frosting. in-a-land-without-cake-ly y'rs - tim From tim_one@email.msn.com Mon Dec 20 06:50:31 1999 From: tim_one@email.msn.com (Tim Peters) Date: Mon, 20 Dec 1999 01:50:31 -0500 Subject: [Types-sig] Issue: definition of "type" In-Reply-To: <385CC865.13C842A5@prescod.net> Message-ID: <002f01bf4ab6$820a9c60$922d153f@tim> [PaulP] > For instance, these things are not types: > > if somefunc(): > class spam: > foo: String > else: > class spam: > foo: int > > spam is a class but not a static type. True, but it can be given a static type *name*; e.g., decl type spam Provided that the attributes of spam actually referenced outside of spam have the same signatures, static type checking outside of spam shouldn't care that it doesn't know about spam's internals. 
Or, IOW, if the two dynamic versions of spam present the same external interface to the compiler, it doesn't matter how the *class* spam comes into being at runtime. > Jim Fulton also defines some ways to make interfaces at runtime. Those > are also not "static types" for our purposes. An interface constructed > at the top level would be a valid static type. As above, shouldn't matter. what-isn't-built-until-runtime-can-yet-be-declared-at- compile-time-ly y'rs - tim From gstein@lyra.org Mon Dec 20 09:20:22 1999 From: gstein@lyra.org (Greg Stein) Date: Mon, 20 Dec 1999 01:20:22 -0800 (PST) Subject: [Types-sig] New syntax? In-Reply-To: <001901bf4a85$7c8ec3a0$922d153f@tim> Message-ID: On Sun, 19 Dec 1999, Tim Peters wrote: >... > [GregS] >... > > decl foo: Any > > decl bar: String > > > > The compiler isn't going to have recognized names for the types. > > I pushed almost everything into "decl" stmts so that type specification > really was a sublanguage distinct from current Python, and specifically a > declarative (no control flow of its own & no side-effects) sublanguage, > fully evaluable at compile-time via simple means. To the extent that that's > true, it can enjoy its own "compile time" namespace distinct from the > runtime namespaces, and Int, Any, String, Boolean ... can be decreed to > "just be there", by magic, *in* declarations, for purposes of compile-time > type checking. Ack. Now you're talking about a whole new set of names to introduce to the language. I think that is a Bad Thing. I can understand the desire to simplify the task for the compiler, but creating a distinct, partitioned namespace is just that. It doesn't mesh well into Python itself. > If instead we have to interpret imports and binding stmts > and attribute dereferences and ... to get at names for types, we pretty much > have to *execute* the code -- and Guido won't go for that. Or, if he does, > he shouldn't . The "static" in "static typing" has implications. Nah. No execution needs to take place. Just some data flow analysis. And thankfully, Python doesn't have "goto" ... the hardest control structure to model is try/except. The others are pretty basic. Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Mon Dec 20 09:29:00 1999 From: gstein@lyra.org (Greg Stein) Date: Mon, 20 Dec 1999 01:29:00 -0800 (PST) Subject: [Types-sig] New syntax? In-Reply-To: <001a01bf4a8c$b22354c0$922d153f@tim> Message-ID: On Sun, 19 Dec 1999, Tim Peters wrote: >... > > The problem with using "decl" to do typedefs is that it does > > weird voodoo to associate the typedecl with the name (e.g. > > BinaryFunc). > > Perhaps an earlier msg made this clearer: I've viewed "decl"s as (purely!) > compile-time expressions. IOW, BinaryFunc is a compile-time name in the > above; there's no implication that a name introduced by a "decl typedef" > will appear in any runtime namespace (this doesn't preclude that in some > modes the implementation may *want* to make a Python object of the same name > available at runtime). I think that we definitely want to be able to construct and use typedecl objects at runtime. That's why I prefer the typedef unary operator over your "sub-language." Viewing the "decl" stuff as a sub-language is kind of icky. Where is the integration with Python itself? Having a clean integration is a good measure that you have a Pythonic syntax and feel. 
> > I believe my unary operator is much clearer to what is happening: > > > > BinaryIntFunc = typedef BinaryFunc(Int) > > This looks like a runtime stmt to me; if so, it's of no use to static > (compile-time) type declaration. If it's not a runtime stmt, better to > stick a "decl" (or something) in front of it to make that crucial > distinction obvious. It definitely is a runtime statement. But the compiler can easily track what is happening. We're doing data flow and type checking already: that's what the SIG is about. Tracking the result of a typedef is cake once you have that. > > In this case, it is (IMO) very clear that you are storing a typedecl > > object into BinaryIntFunc, for later use. For example, we might see the > > following code: > > > > import types > > Int = types.IntType > > List = types.ListType > > IntList = typedef [Int] > > ... > > This all looks like runtime code to me -- if so, how is a *compiler* > supposed to get any benefit out of it? Or if not, how is a compiler > supposed to recognize that it's not runtime code? It is runtime code. The runtime is going to need those objects to execute the runtime type checks (on function entry and for the type-assert operator; possibly for assignment enforcement). But the compiler can extract a lot of benefit from the above statements. As I mentioned: the compiler can/should understand the types module and the type() builtin (plus things like __class__). Then you're quite fine. No magic voodoo involved. > > Hrm. I don't have a ready answer for your first typedef, though. That > > is a new construct that we haven't seen yet. We've been talking about > > parameterizing *classes*, rather than typedecls. > > > > *ponder* > > In my twisted little universe, I'm using a declarative language for > compile-time type expressions, and BinaryFunc(_T) can be thought of as a > compile-time macro -- same as the BinaryIntFunc typedef (except the latter > doesn't take any arguments -- or does take no arguments ). I know that. I meant that I didn't have a response that fits into *my* universe :-) >... tuple stuff ... > > *grumble* .... I don't have a handy resolution for this one. > > So let's make one up. The problem is spelling "tuple of unknown length" > (and Paul's complaint notwithstanding, that *is* Python so we gotta deal > with it). Python has no notation for this. OK: > > ... > Tuple(T1, T2, T3) equivalent_to (T1, T2, T3) > Tuple(T1, T2) equivalent_to (T1, T2) > Tuple(T1,) equivalent_to (T1,) > Tuple(T1) means tuple-of-T1 of unknown length > > So it's always *legal* to stick "Tuple" in front of a tuple specifier, and > it's *required* in the last case. > > Actually, tuples show up in type specifiers rarely enough-- and look so much > like grouping now --that I'd be happy requiring "Tuple" all the time. Again > one of those things that could be relaxed later if it proved too irksome. A little wordy to include that keyword(?) in there, while things like List and Dict don't require it. While a valiant attempt to solve the readability of tuple type declarators, it just doesn't seem right... :-( Cheers, -g -- Greg Stein, http://www.lyra.org/ From paul@prescod.net Mon Dec 20 13:18:29 1999 From: paul@prescod.net (Paul Prescod) Date: Mon, 20 Dec 1999 07:18:29 -0600 Subject: [Types-sig] Python parser in Python? References: <002d01bf4ab3$ab7c1360$922d153f@tim> Message-ID: <385E2CA5.E88FE7B5@prescod.net> Tim Peters wrote: > > John Aycock's extremely general (any CF grammar, ambiguous or not) parsing > framework comes with a Python grammar. 
It depends on Python's built-in lexer: # # Why would I write my own when GvR maintains this one? # import tokenize Doesn't that remove the possibility for new keywords? -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself The occasional act of disrespect for the American flag creates but a flickering insult to the values of democracy -- unless it provokes America into limiting the freedoms that are its hallmark. -- Paul Tash, executive editor of the St. Petersburg Times From paul@prescod.net Mon Dec 20 13:28:18 1999 From: paul@prescod.net (Paul Prescod) Date: Mon, 20 Dec 1999 07:28:18 -0600 Subject: [Types-sig] Issue: definition of "type" References: <002f01bf4ab6$820a9c60$922d153f@tim> Message-ID: <385E2EF2.6CE888DC@prescod.net> Tim Peters wrote: > > > spam is a class but not a static type. > > True, but it can be given a static type *name*; e.g., > > decl type spam > > Provided that the attributes of spam actually referenced outside of spam > have the same signatures, static type checking outside of spam shouldn't > care that it doesn't know about spam's internals. Or, IOW, if the two > dynamic versions of spam present the same external interface to the > compiler, it doesn't matter how the *class* spam comes into being at > runtime. Okay, but do you or do you not agree that in the simple case of: class spam: def a(self) -> String: return "abc" a type object should be made implicitly as if someone had actually typed in the decl. I certainly would not support a position that said that the entire signature of spam had to be re-declared. I MIGHT support a position that said that the user had to explicitly declare spam as being available to the static type system. I'm on the fence about this last requirement because I would like to think that all of the code out there with class statements is *already* defining a bunch of types. A minority of it depends on runtime information and we can easily detect those cases. So why not let the simple case of "defined class that doesn't depend on runtime information" be a shortcut for a type declaration? -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself The occasional act of disrespect for the American flag creates but a flickering insult to the values of democracy -- unless it provokes America into limiting the freedoms that are its hallmark. -- Paul Tash, executive editor of the St. Petersburg Times From paul@prescod.net Mon Dec 20 13:55:42 1999 From: paul@prescod.net (Paul Prescod) Date: Mon, 20 Dec 1999 07:55:42 -0600 Subject: [Types-sig] New syntax? References: Message-ID: <385E355E.9B15FA42@prescod.net> Greg Stein wrote: > > ... > > Nah. No execution needs to take place. Just some data flow analysis. Let's be concrete: 1. if somefunction(): class a: def b(self)->String: return "abc" else: class a: def b(self)->Int: return 5 How many type objects are created? What are there names? What is the type of a? 2. class a: def b(self)->String: return "abc" for i in sys.argv: class a: def b(self)->Int: return 5 3. def makeClass(): class a: def b( self ): return "abc" return a j=makeClass()() -------------------- This seems intractable to me. I got around this in my original proposal by requiring all declaring classes to be *top-level*. In other words I formally defined the subset of Python that does not require code execution. If you can formally define the semantics of "data flow" then I will be able to compare the proposals. Note that I am half-way between you and Tim. 
I think that type objects should be more like Python objects but I am willing to restrict where they are created to make the problem tractable and the semantics understandable. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself The occasional act of disrespect for the American flag creates but a flickering insult to the values of democracy -- unless it provokes America into limiting the freedoms that are its hallmark. -- Paul Tash, executive editor of the St. Petersburg Times From tratt@dcs.kcl.ac.uk Mon Dec 20 10:18:43 1999 From: tratt@dcs.kcl.ac.uk (Laurence Tratt) Date: Mon, 20 Dec 1999 10:18:43 GMT Subject: [Types-sig] Type Inference II In-Reply-To: <385AE827.42ECB891@maxtal.com.au> References: <385AE827.42ECB891@maxtal.com.au> Message-ID: <3665527349.laurie@btinternet.com> In message <385AE827.42ECB891@maxtal.com.au> skaller wrote: > I note python currently supports privacy by name mangling, but really, > this is a hack: for Python 2, a more sophisticated architecture would be > better. Nnngg. I'm not keen on Python ever gaining privacy (the __ name mangling is nasty, I agree). It just doesn't really seem in the spirit of things; I always tend to think of the Larry Wall quote "Perl would rather you kept out of its living room because you weren't invited, not because it has a shotgun". In my recent projects, I denote "private" (there's no distinction between private, protected etc as there is in, say, Java) by just preceeding names with a "_". I've actually found that highly effective, and it makes it obvious that "self._method()" and so on are private calls. This approach also tends to make modules fairly "from module import *" safe. The only argument I can imagine for privacy is that "from module import *" tends to import module names etc as well which can make it confusing; but when we use that feature we deserve everything we get . Laurie -- http://eh.org/~laurie/comp/python/ From guido@CNRI.Reston.VA.US Mon Dec 20 15:14:44 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Mon, 20 Dec 1999 10:14:44 -0500 Subject: [Types-sig] tuples (was: New syntax?) In-Reply-To: Your message of "Sat, 18 Dec 1999 13:26:16 PST." References: Message-ID: <199912201514.KAA04222@eric.cnri.reston.va.us> > On Sat, 18 Dec 1999, Paul Prescod wrote: > > Greg Stein wrote: > > > > > > Bite me. :-) > > > > > > You do raise a good point in another post, however: > > > > > > def foo(*args: (Int)): > > > > Python should not use tuples as "read-only lists." From a type-system > > point of view, a tuple should be a fixed-length, fixed-type data > > structure defined at compile time. [Greg again] > Ideal or not, this is the current situation. *args is a tuple. > > Are you suggesting a particular change here? If so, then add it to your > issues list :-) [you are maintaining one, right? :-)] I don't think there's a good, deep reason why *args yields a tuple; only a historical one. I think that originally, all argument lists were internally passed around as tuples, because (in *very* early Python) argument lists *were* tuples. There's no particular reason why it should be immutable -- after all **kwargs returns a dict, which is mutable. The only reason not to switch to tuples is backwards compatibility -- in particular there is a lot of code (e.g. in the std library) that creates new arg lists by adding tuples to *args. This could be solved by allowing + to operate on a mix of lists and tuples. I think the result should yield a list. 
Not a strong argument to do this, just a relaxation of Greg's argument that *args is a tuple -- it needn't be, if we have a good reason to change it. --Guido van Rossum (home page: http://www.python.org/~guido/) From m.faassen@vet.uu.nl Mon Dec 20 15:44:18 1999 From: m.faassen@vet.uu.nl (Martijn Faassen) Date: Mon, 20 Dec 1999 16:44:18 +0100 Subject: [Types-sig] typedefs (was: New syntax?) References: Message-ID: <385E4ED2.C3EEB28E@vet.uu.nl> Greg Stein wrote: > > On Fri, 17 Dec 1999, Martijn Faassen wrote: [snip] > > typedef Footype(int, int): > > return int > > > > var handlermap = {string: Footype} > > I see typedefs as a way to associate a typedecl with a name. In your > example here, I'm not sure how to do a typedef of something like > List. You seem to have pegged typedef to only do function > typedefs. And class typedefs, I suppose, but you're right. Though you could do this: typedef Footype: List(Int) I should finally work out my syntax proposal into something sensible because now I'm confusing myself. :) I do still think there's something interesting to be learned from the 'class instantiation' - 'typedef instantiation' and 'value assignment' - 'type assignment' analogy. [snip] > In any case, I think using "def" inline to define a function typedecl is > fine. A typedef is merely used to create an alias, to clarify a later > declaration. Yes, but you basically have the same setup with current Python if you exclude Lambdas. A function definition is merely used to create an 'alias' for a piece of code, to clarify other pieces of code. If you assume for the moment lambdas are bad, we may want to assume by analogy that inline defs are not a good idea either. Regards, Martijn From rmasse@cnri.reston.va.us Mon Dec 20 16:16:32 1999 From: rmasse@cnri.reston.va.us (Roger Masse) Date: Mon, 20 Dec 1999 11:16:32 -0500 (EST) Subject: [Types-sig] Low-hanging fruit: recognizing builtins In-Reply-To: <000a01bf48f1$56213bc0$32a2143f@tim> References: <3859FD66.E47352E@lemburg.com> <000a01bf48f1$56213bc0$32a2143f@tim> Message-ID: <14430.22112.622745.998835@nobot.cnri.reston.va.us> Tim Peters writes: > [M.-A. Lemburg] > > ... > > BTW, just to make buying one of those new microwave > > ovens more attractive: what is the pystone rating for > > the new Athlon and Pentium III chips ? > I just bought a cute little 600MHz compaq with the K7 for around $1400. (Chrismas gift for my girlfriend) Performance is 7487.21/pystones per sec. -Roger From m.faassen@vet.uu.nl Mon Dec 20 16:17:25 1999 From: m.faassen@vet.uu.nl (Martijn Faassen) Date: Mon, 20 Dec 1999 17:17:25 +0100 Subject: [Types-sig] Issue: definition of "type" References: <002f01bf4ab6$820a9c60$922d153f@tim> <385E2EF2.6CE888DC@prescod.net> Message-ID: <385E5695.1CAEF90B@vet.uu.nl> Paul Prescod wrote: [snip] > I'm on the fence about this last requirement because I would like to > think that all of the code out there with class statements is *already* > defining a bunch of types. A minority of it depends on runtime > information and we can easily detect those cases. So why not let the > simple case of "defined class that doesn't depend on runtime > information" be a shortcut for a type declaration? Are you sure that in fact a minority depends on runtime information? Often Python code is used without any inheritance link, like this: class Foo: def doSomething(self): ... class Bar: def doSomething(self): ... a = [Foo(), Bar()] for el in a: el.doSomething() Doesn't this rely on run-time information? How would a type system deal with this? 
I suppose I'm entering the domain of interfaces now... Regards, Martijn From paul@prescod.net Mon Dec 20 16:22:48 1999 From: paul@prescod.net (Paul Prescod) Date: Mon, 20 Dec 1999 10:22:48 -0600 Subject: [Types-sig] Issue: definition of "type" References: <002f01bf4ab6$820a9c60$922d153f@tim> <385E2EF2.6CE888DC@prescod.net> <385E5695.1CAEF90B@vet.uu.nl> Message-ID: <385E57D8.E5518928@prescod.net> Martijn Faassen wrote: > >... > > Doesn't this rely on run-time information? How would a type system deal > with this? I suppose I'm entering the domain of interfaces now... Yes, that is the role of interfaces. Nobody has yet suggested that the code you described would be type-safe. The two doSomething methods are unrelated. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself The occasional act of disrespect for the American flag creates but a flickering insult to the values of democracy -- unless it provokes America into limiting the freedoms that are its hallmark. -- Paul Tash, executive editor of the St. Petersburg Times From paul@prescod.net Mon Dec 20 16:24:32 1999 From: paul@prescod.net (Paul Prescod) Date: Mon, 20 Dec 1999 10:24:32 -0600 Subject: [Types-sig] Issue: definition of "type" References: <002f01bf4ab6$820a9c60$922d153f@tim> <385E2EF2.6CE888DC@prescod.net> <385E5695.1CAEF90B@vet.uu.nl> Message-ID: <385E5840.EF5ED124@prescod.net> Martijn Faassen wrote: > > Paul Prescod wrote: > [snip] > > I'm on the fence about this last requirement because I would like to > > think that all of the code out there with class statements is *already* > > defining a bunch of types. A minority of it depends on runtime > > information and we can easily detect those cases. So why not let the > > simple case of "defined class that doesn't depend on runtime > > information" be a shortcut for a type declaration? > > Are you sure that in fact a minority depends on runtime information? Note that I'm saying that the vast majority of Python classes are statically declared, not that the vast majority of Python *code* is statically type checkable. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself The occasional act of disrespect for the American flag creates but a flickering insult to the values of democracy -- unless it provokes America into limiting the freedoms that are its hallmark. -- Paul Tash, executive editor of the St. Petersburg Times From m.faassen@vet.uu.nl Mon Dec 20 16:39:38 1999 From: m.faassen@vet.uu.nl (Martijn Faassen) Date: Mon, 20 Dec 1999 17:39:38 +0100 Subject: [Types-sig] A challenge References: <002501bf4878$d2faf240$63a2143f@tim> <385A7495.D7A25EC4@appliedbiometrics.com> Message-ID: <385E5BCA.E238292E@vet.uu.nl> Christian Tismer wrote: > > Tim Peters wrote: > > [stuff on name equivalence] > That sounds very right, since it allows to create different > things even if they look the same from structure. You get more > strength in error checking, since using the parameter in the wrong > context can be detected even if a foo's components look like a bar's. Okay, but then I'll repeat the question I asked before: class Foo: def getIt(self)->String: ... class Bar: def getIt(self)->String: ... list = [Foo(), Bar()] for el in list: print el.doIt() This wouldn't work, even though the interfaces are similar. This brings us into two domains: * inheritance I haven't seen too much discussion on how types are going to interact with the inheritance system. I could of course let Foo and Bar derive from a common base class which defines doIt() as well. 
This is a common way to do it, if type annotations get inherited from base classes, etc. * interfaces Another way to do it is to use interfaces and say Foo and Bar both conform to some interface which supports doIt(). This was something we wouldn't discuss in this SIG, but can we in fact avoid it? Regards, Martijn From guido@CNRI.Reston.VA.US Mon Dec 20 16:46:52 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Mon, 20 Dec 1999 11:46:52 -0500 Subject: [Types-sig] development approach (was: Syntax) In-Reply-To: Your message of "Sun, 19 Dec 1999 11:23:18 PST." References: Message-ID: <199912201646.LAA04958@eric.cnri.reston.va.us> [Greg] > I think that once we reach consensus and have Guido Approval, then > it goes right into the main CVS tree. I don't think so. I'd like to see an experimental implementation first. Consensus about a proposal is very different than a working implementation! --Guido van Rossum (home page: http://www.python.org/~guido/) From m.faassen@vet.uu.nl Mon Dec 20 16:53:24 1999 From: m.faassen@vet.uu.nl (Martijn Faassen) Date: Mon, 20 Dec 1999 17:53:24 +0100 Subject: [Types-sig] minimal or major change? (was: RFC 0.1) References: <001f01bf486d$09feec80$63a2143f@tim> Message-ID: <385E5F04.1B218E20@vet.uu.nl> Tim Peters wrote: > > [Martijn Faassen, reasonably demands ...] > > So that's where I'm coming from. It's important for our proposal > > to actually come up with a workable development plan, because > > adding type checking to Python is rather involved. So I've been > > pushing one course of implementation towards a testable/hackable > > system that seems to give us the minimal amount of development > > complexities. I haven't seen clear development paths from others > > yet; most proposals seem to involve both run-time and compile- > > time developments at the same time. > > > > So I'm interested to see other development proposals; possibly > > there's a simpler approach or equally complex approach with more > > payoff, that I'm missing. > > I haven't given a lick of thought to development, beyond sketching "the > usual" approach to type inference for Guido, and having a hard-won intuition > about what is and isn't reasonably parseable. This SIG has been "alive > again" for on the order of just one week: design precedes implementation, > and I won't bemoan the lack of implementation details even if they're > delayed for *another* whole week . Of course I can wait a couple of weeks longer. :) Now I'll add some buts: But implementation possibilities do influence design. I wasn't asking for actual implementation proposals, I was thinking about how to go about development. What brings us early payoffs. What is most effective and least complex. What development difficulties may appear that we simply can't avoid (so we can brace ourselves :). > At that point, it's fine by me if the first cut is *spelled* using plain > dicts and docstrings etc to ease development. But before that point, we > don't even know what we want it to *do*. remember-there-something-called-'prototyping'-ly yours, Martijn From skaller@maxtal.com.au Mon Dec 20 17:18:01 1999 From: skaller@maxtal.com.au (skaller) Date: Tue, 21 Dec 1999 04:18:01 +1100 Subject: [Types-sig] New syntax? References: Message-ID: <385E64C9.A82E84@maxtal.com.au> Greg Stein wrote: > I think that we definitely want to be able to construct and use typedecl > objects at runtime. That's why I prefer the typedef unary operator over > your "sub-language." Are these options mutually exclusive? I've implemented operator ! 
in Viper now, x!t checks type(x) is t, and returns x if it is, otherwise it raises a TypeError. The precedence is one level tighter than Greg recommended, mainly because that was slightly easier for me to implement quickly. Here is some code I wrote using it: def append(self,object,value): object.list.append(value ! self.Type) which is quite terse, seems readable and 'pythonic', and works as I expected. Without this operator, the code would read: def append(self,object,value): if type(value) is self.Type: object.list.append(value) else: raise TypeError My current feeling: I quite like it -- but the above is the only use I have tried, other than specifically for testing it. My feeling, also, is that in those circumstances where the test would fail, then the program should be considered in error (that is, it is not legitimate practice to catch and handle the TypeError, so that if a compiler can prove it would be raised, it is entitled to reject the program, and a lint like checker, to issue a diagnostic. [The explicit test, like in the second example above, should be used if it is desired to catch and handle the raised TypeError] This means that the x!t can be optimised to x, without affecting strictly conforming program semantics. Comments? Greg? -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From paul@prescod.net Mon Dec 20 17:28:39 1999 From: paul@prescod.net (Paul Prescod) Date: Mon, 20 Dec 1999 11:28:39 -0600 Subject: [Types-sig] Basic questions References: <002501bf4878$d2faf240$63a2143f@tim> <385A7495.D7A25EC4@appliedbiometrics.com> <385E5BCA.E238292E@vet.uu.nl> Message-ID: <385E6747.4831D32@prescod.net> Martijn Faassen wrote: > > Another way to do it is to use interfaces and say Foo and Bar both > conform to some interface which supports doIt(). This was something we > wouldn't discuss in this SIG, but can we in fact avoid it? It seems to me that we've been discussing it for about a week now! You are right that we can't avoid it. > * inheritance > > I haven't seen too much discussion on how types are going to interact > with the inheritance system. I think it would work more or less as it does in other object oriented languages. I, personally, am concentrating on the parts of the system that I feel I don't understand. Those parts mostly have to do with Python's dynamism and not with its already existing type system. Of course subtypes of "foo" should follow "foo"'s interface and should be recognized as "foo"s. But the much more basic question is whether: class foo: pass even *defines* a type that can be used in type declarations. Greg says yes, even if the declaration is buried in code. Tim says no,(I think) not unless it is preceded with a decl statement. I'm trying to figure out which one is right. We can get to inheritance and interfaces later. Basic questions. 1. Is this valid: class foo: pass def a( arg: foo ): pass 2. Is this valid: if someFunc(): class foo: "abc" else: class foo: "def" def a( arg: foo ): pass -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself The occasional act of disrespect for the American flag creates but a flickering insult to the values of democracy -- unless it provokes America into limiting the freedoms that are its hallmark. -- Paul Tash, executive editor of the St. 
Petersburg Times From aycock@csc.UVic.CA Mon Dec 20 17:33:20 1999 From: aycock@csc.UVic.CA (John Aycock) Date: Mon, 20 Dec 1999 09:33:20 -0800 Subject: [Types-sig] Re: [String-SIG] Python parser in Python? Message-ID: <199912201733.JAA12537@valdes.csc.UVic.CA> | From: Paul Prescod | | Tim Peters wrote: | > | > John Aycock's extremely general (any CF grammar, ambiguous or not) parsing | > framework comes with a Python grammar. | | It depends on Python's built-in lexer: | | import tokenize The tokenize module doesn't interface with the lexer inside Python -- it does its work using a set of ugly-looking regular expressions. | Doesn't that remove the possibility for new keywords? Not at all. If the new keywords (here I'm assuming reserved words) are of the same form as identifiers, as would most likely be the case, then you can easily pick them out after tokenize splits them apart. That's what my Python lexer does: piggybacks on tokenize, then flags reserved words. (Some people advocate such a splitting of lexical analysis tasks this way, into a scanner (tokenize) and a screener (postprocessing of tokens).) Of course, if you want odd-looking keywords, you could always modify a provate copy of tokenize :-) John From skaller@maxtal.com.au Mon Dec 20 17:36:58 1999 From: skaller@maxtal.com.au (skaller) Date: Tue, 21 Dec 1999 04:36:58 +1100 Subject: [Types-sig] Type Inference II References: <001701bf4a7b$59ab61e0$922d153f@tim> Message-ID: <385E693A.94A2BF83@maxtal.com.au> Tim Peters wrote: > > ... > > Tightening up for loops will break code that does things like: > > > > (1) do extra increments on an loop variable to skip cases > > Nope. Assigning to the loop variable has no effect on the values assigned > *to* it by the for stmt: > > >>> for i in range(5): > print i > i = i ** 10 > print i Ooops. You're right. Comment withdrawn. [Anyone that actually wrote that should probably be withdrawn too :-] -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From evan@4-am.com Mon Dec 20 17:51:08 1999 From: evan@4-am.com (Evan Simpson) Date: Mon, 20 Dec 1999 11:51:08 -0600 Subject: [Types-sig] New syntax? References: <19991220045108.747A41CD8C@dinsdale.python.org> Message-ID: <385E6C8C.635293B2@4-am.com> [Tim Peters wrote:] > So let's make one up. The problem is spelling "tuple of unknown length" > (and Paul's complaint notwithstanding, that *is* Python so we gotta deal > with it). Python has no notation for this. In one of the many messages I started composing for this SIG, then never sent, I mixed regexp-style notation into your ML-style declarations. How's about: (T*) means T-tuple of unknown length, (T+) means length at least one, (T1?, T2{1,2}, T3) means optional T1 followed by one or two T2's and exactly one T3. This still requires (T,) for a single-T tuple, but all other uses are distinguishable from grouping. > Actually, tuples show up in type specifiers rarely enough-- and look so much > like grouping now --that I'd be happy requiring "Tuple" all the time. Again > one of those things that could be relaxed later if it proved too irksome. Naturally, this isn't restricted to tuples; Argument (and regular) lists could also use it. In particular, the much-discussed range could be declared as: decl def range(start as Int?, stop as Int, step as Int?) 
return [Int*] (and I still think "default" arguments used for closures should be separable from regular arguments by a ';', but that's another SIG) Still can't spell 'map', though. decl def(ResultType, *SeqTypes) map(def(SeqTypes) return ResultType, map(lambda x: Sequence(x), SeqTypes) ) return [Result{len(SeqTypes)}] Cheers, Evan @ 4-am From m.faassen@vet.uu.nl Mon Dec 20 17:58:34 1999 From: m.faassen@vet.uu.nl (Martijn Faassen) Date: Mon, 20 Dec 1999 18:58:34 +0100 Subject: [Types-sig] Basic questions References: <002501bf4878$d2faf240$63a2143f@tim> <385A7495.D7A25EC4@appliedbiometrics.com> <385E5BCA.E238292E@vet.uu.nl> <385E6747.4831D32@prescod.net> Message-ID: <385E6E4A.4D56A904@vet.uu.nl> Paul Prescod wrote: > > Martijn Faassen wrote: [snip] > > * inheritance > > > > I haven't seen too much discussion on how types are going to interact > > with the inheritance system. > > I think it would work more or less as it does in other object oriented > languages. I, personally, am concentrating on the parts of the system > that I feel I don't understand. Okay, I'll focus away from inheritance for now. > Those parts mostly have to do with > Python's dynamism and not with its already existing type system. Of > course subtypes of "foo" should follow "foo"'s interface and should be > recognized as "foo"s. > > But the much more basic question is whether: > > class foo: pass > > even *defines* a type that can be used in type declarations. Greg says > yes, even if the declaration is buried in code. Tim says no,(I think) > not unless it is preceded with a decl statement. I'm trying to figure > out which one is right. I'm in the make it explicit camp here. Nothing defines any type (functions or classes) unless we explicitly say that it does. Otherwise it may default to 'everything is the Any' type which should be equivalent to basic Python. Note that we will probably lose static type info in any code path that passes through basic Python, to any path exiting Python into statically typed Python should have runtime checks (or give compile time errors). I wonder if in practice this will mean people will start to assign types to *everything* to make it work well (or efficient) with types at all. If so then we need to somehow avoid this. > We can get to inheritance and interfaces later. I actually need interfaces in my following discussion, so I apologize in advance. :) > Basic questions. > > 1. Is this valid: > > class foo: pass > > def a( arg: foo ): pass Not unless somewhere it says explicitly that class foo defines a static type, I'd say. > 2. Is this valid: > > if someFunc(): > class foo: "abc" > else: > class foo: "def" > > def a( arg: foo ): pass This is the really interesting one.. Perhaps interfaces can help here. One rule could be this: You can't define the same name multiple times in the same scope. You have to do 'class foo1' and 'class foo2' instead, and then say they both conform to the interface 'foo'. Consequences: * A separate interface declaration syntax would seem to be required. Consequences I describe at the alternative rule apply too, I think. An alternative rule would be the following: Any class names that are defined multiple times in the same scope are taken to support an interface with that same name. This interface is the only type you can use elsewhere; you can't use the class type directly. It is a compile time error if classes with the same name define different interfaces. 
Consequences: * This may mean we enter access-rule land; it would be okay classes conforming to an interface to define different member variables, as long as these are private. * The interface needs to be hooked up to the actual implementation during runtime. This may happen as soon as a class (that the compiler has seen has multiple definitions) is actually being defined at run-time. * These are odd interfaces in the sense that it looks as if you can instantiate from them! What 'in fact' happens is that the interfaces passes any instantiation requests to the actual class that's doing the implementation -- the interface is a simple factory. The same story would apply to function signatures/prototypes; if the same function name occurs multiple times in the same scope they're all taken to define the same prototype, which would be the actual type used. Regards, Martijn From m.faassen@vet.uu.nl Mon Dec 20 18:10:00 1999 From: m.faassen@vet.uu.nl (Martijn Faassen) Date: Mon, 20 Dec 1999 19:10:00 +0100 Subject: [Types-sig] Return of the Docstring: The Typening References: <385966D5.BAF592C4@4-am.com> Message-ID: <385E70F8.95EBFA07@vet.uu.nl> Evan Simpson wrote: [snip] > I still like the Sparrow/SPython concept, too . That was Swallow, as in African or European swallows. :) And that doesn't do any analysis, it basically declares *all* types exhaustively anywhere and restricts the heck out of what is allowed. All for OPT, of course. :) Regards, Martijn From m.faassen@vet.uu.nl Mon Dec 20 18:15:15 1999 From: m.faassen@vet.uu.nl (Martijn Faassen) Date: Mon, 20 Dec 1999 19:15:15 +0100 Subject: [Types-sig] Issue: definition of "type" References: <002f01bf4ab6$820a9c60$922d153f@tim> <385E2EF2.6CE888DC@prescod.net> <385E5695.1CAEF90B@vet.uu.nl> <385E57D8.E5518928@prescod.net> Message-ID: <385E7233.74DF151A@vet.uu.nl> Paul Prescod wrote: > Martijn Faassen wrote: [snip] > > Doesn't this rely on run-time information? How would a type system deal > > with this? I suppose I'm entering the domain of interfaces now... > > Yes, that is the role of interfaces. Nobody has yet suggested that the > code you described would be type-safe. The two doSomething methods are > unrelated. I understood that, but I am saying that this type of thing is quite common in Python, and I was reacting to what you said here: > ...because I would like to > think that all of the code out there with class statements is *already* > defining a bunch of types. A minority of it depends on runtime > information and we can easily detect those cases. I was pointing out this common idiom in Python as an argument against your statement that a minority depends on runtime information (that we can easily detect). Lots of Python code depends on this idiom so it's good to address it. Regards, Martijn From gstein@lyra.org Mon Dec 20 18:32:04 1999 From: gstein@lyra.org (Greg Stein) Date: Mon, 20 Dec 1999 10:32:04 -0800 (PST) Subject: [Types-sig] private names (was: Type Inference II) In-Reply-To: <3665527349.laurie@btinternet.com> Message-ID: I just wanted to jump in with a "me too!" :-) On Mon, 20 Dec 1999, Laurence Tratt wrote: > In message <385AE827.42ECB891@maxtal.com.au> > skaller wrote: > > I note python currently supports privacy by name mangling, but really, > > this is a hack: for Python 2, a more sophisticated architecture would be > > better. > > Nnngg. I'm not keen on Python ever gaining privacy (the __ name mangling is > nasty, I agree). 
It just doesn't really seem in the spirit of things; I > always tend to think of the Larry Wall quote "Perl would rather you kept out > of its living room because you weren't invited, not because it has a > shotgun". That's an excellent quote :-). I agree. Guido has always phrased it as "we're all adults here [so we know what to do and what not to do]." But I agree: I never really liked the __ name mangling. I liked relying on adulthood. However, there was a secondary reason for the mangling, not just privacy. It was added to help prevent conflicts between super/subclass' use of attributes. Personally, I think that Python is transparent enough that a subclass is going to know what attributes its parent class uses and will avoid those. [ this may also be a result of my tendency towards shallow hierarchies ] > In my recent projects, I denote "private" (there's no distinction between > private, protected etc as there is in, say, Java) by just preceeding names > with a "_". I've actually found that highly effective, and it makes it > obvious that "self._method()" and so on are private calls. This approach > also tends to make modules fairly "from module import *" safe. Same here. > The only argument I can imagine for privacy is that "from module import *" > tends to import module names etc as well which can make it confusing; but > when we use that feature we deserve everything we get . Absolutely! IMO, the construct should just go away. I do believe that the following is goodness: from deep.package import subpackage subpackage.somefunc() In this case, you are still importing a module, but you've stripped down some of its hierarchy for easier access. But you *don't* import "somefunc" directly, because then you would lose the "subpackage." when you call the thing. I believe that people should always use "module.foo" references rather than just "foo". About the only exception that I make for this is with the "types" module. Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Mon Dec 20 18:38:24 1999 From: gstein@lyra.org (Greg Stein) Date: Mon, 20 Dec 1999 10:38:24 -0800 (PST) Subject: [Types-sig] Re: decl f: def(_T, _T) -> _T (fwd) Message-ID: misfire... redirecting... ---------- Forwarded message ---------- Date: Tue, 04 Apr 2000 15:30:21 +0100 From: Edward Welbourne To: gstein@lyra.org Subject: Re: decl f: def(_T, _T) -> _T Hey ! I remember that type. Ponder calls it Boolean. def true(this, that): return this def false(this, that): return that def or(this, that): return lambda i,a,_i=this,_a=that: _i(i,_a(i,a)) def and(this, that): return lambda i,a,_i=this,_a=that: _i(_a(i,a),a) wow ! the rest of the type transcribes cleanly too ;^> I like the spec (of course, I want to change it, too). More comments to the list when I've read more (notably, I see Tim Peters has responded ...) A suggestion: decl f(_T): def(_T, _T) -> _T that is, the `foralltype' names are parameters of the decl ? Eddy. -- I believe in getting into hot water; it keeps you clean. -- G. K. Chesterton. 
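A runnable sketch of the encoding Eddy outlines above, for anyone who wants to poke at it: 'or' and 'and' are Python keywords, so the combinators are renamed Or and And here; nothing else is changed from his definitions.

    def true(this, that):
        return this

    def false(this, that):
        return that

    def Or(p, q):
        # true if either p or q selects its first argument
        return lambda this, that, p=p, q=q: p(this, q(this, that))

    def And(p, q):
        # true only if both p and q select their first argument
        return lambda this, that, p=p, q=q: p(q(this, that), that)

    # true and false both have the declared shape def(_T, _T) -> _T,
    # where _T is the type of this/that:
    assert Or(true, false)("yes", "no") == "yes"
    assert And(true, false)("yes", "no") == "no"
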
From m.faassen@vet.uu.nl Mon Dec 20 18:35:38 1999 From: m.faassen@vet.uu.nl (Martijn Faassen) Date: Mon, 20 Dec 1999 19:35:38 +0100 Subject: [Types-sig] Interfaces (was: definition of "type") References: <002f01bf4ab6$820a9c60$922d153f@tim> <385E2EF2.6CE888DC@prescod.net> <385E5695.1CAEF90B@vet.uu.nl> <385E5840.EF5ED124@prescod.net> Message-ID: <385E76FA.48F953DF@vet.uu.nl> Paul Prescod wrote: > > Martijn Faassen wrote: > > > > Paul Prescod wrote: > > [snip] > > > I'm on the fence about this last requirement because I would like to > > > think that all of the code out there with class statements is *already* > > > defining a bunch of types. A minority of it depends on runtime > > > information and we can easily detect those cases. So why not let the > > > simple case of "defined class that doesn't depend on runtime > > > information" be a shortcut for a type declaration? > > > > Are you sure that in fact a minority depends on runtime information? > > Note that I'm saying that the vast majority of Python classes are > statically declared, not that the vast majority of Python *code* is > statically type checkable. [just responded to your other response to me, but here you address my concern in that response, so...I've hopelessly confused everyone now] Okay. We should look into this issue, though. Ideally it should be as easy as possible for the current Python programmer to adapt his code to use types. I think interfaces are the answer here more than a common base class. Here I'll go off on a tangent that may help here: Possibly this is a wild idea (or possibly it's old hat to everyone), but what about a system to produces interfaces without having to declare them? Take the intersection of two class interfaces and call this a new interface; all methods with the same signature (and possibly members). class Foo conforms FooInter: def getS(self)->String: ... def otherstuff(self): ... class Bar conforms FooInter: def getS(self)->String: ... def otherstuff(self, a, b): ... class Baz: pass Foo ! fooInter # works Bar ! fooInter # works Baz ! fooInter # TypeError Alternatively you could move the interface declaration code outside the classes, into something like this: decl interface FooInter: intersection(Foo, Bar) This way programmers don't need to explicitly declare interfaces and still have them. I don't know if this is a good idea though; there's a lot to say for explicitness. These intersections may be too big, containing overlaps you aren't interested in. Though of course it's easy to explicitize it and prevent this, too, if you want: class ExplicitInterface conforms FooInter: def getS(self)->String: pass Though in this case you could make the interface too small if not all conforming classes actually implement a method. So you'd need something like: class ExplicitInterface defines FooInter: ... to be really sure you get compiler errors if not all classes conform. Sorry to fill the SIG with discussions on interfaces. They just seem unavoidable if you want to preserve lots of interesting Python features. Python right now is after all often used in a way that decouples interface from implementation without explicitizing the interfaces (for instance cStringIO, File, etc). This idea would at least make them more explicit with minimum programmer-hassle, while still providing for full explicity if desired. In this way it is similar to current Python practice; you can make interfaces fully explicit by using a common base class. 
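(A rough, runnable illustration of the idea -- helper names invented, no new syntax assumed: the 'intersection' of two classes can already be computed by introspection. This toy version only compares method names; a real checker would compare signatures too, which would drop otherstuff from the result.)

    def intersection(class_a, class_b):
        # shared public method names defined by both classes
        shared = []
        for name, value in class_a.__dict__.items():
            if name.startswith('_'):
                continue
            if callable(value) and callable(class_b.__dict__.get(name)):
                shared.append(name)
        shared.sort()
        return shared

    class Foo:
        def getS(self): return "abc"
        def otherstuff(self): pass

    class Bar:
        def getS(self): return "def"
        def otherstuff(self, a, b): pass

    print(intersection(Foo, Bar))   # ['getS', 'otherstuff'] -- names only
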
Regards, Martijn From skaller@maxtal.com.au Mon Dec 20 18:40:50 1999 From: skaller@maxtal.com.au (skaller) Date: Tue, 21 Dec 1999 05:40:50 +1100 Subject: [Types-sig] Basic questions References: <002501bf4878$d2faf240$63a2143f@tim> <385A7495.D7A25EC4@appliedbiometrics.com> <385E5BCA.E238292E@vet.uu.nl> <385E6747.4831D32@prescod.net> Message-ID: <385E7832.81D4CFC7@maxtal.com.au> Paul Prescod wrote: > I think it would work more or less as it does in other object oriented > languages. I, personally, am concentrating on the parts of the system > that I feel I don't understand. Those parts mostly have to do with > Python's dynamism and not with its already existing type system. Of > course subtypes of "foo" should follow "foo"'s interface and should be > recognized as "foo"s. Sure, but how do you know what types are subtypes? You cannot tell from inheritance: subtyping isn't related to inheritance, at least in Python (same in ocaml). Example: class Foo: def f(x): return None class Bar(Foo): def f(x,y): return int(x)+int(y) Foo is a base of Bar, Bar is not a subtype of Foo. Classes do not specify types. They are simply constructions which make constructing instances easy: all the instances have the same type, Instance... you could argue that instances of a particular class X have type 'Instance of X', but the behaviour is only default (since attributes can be dynamically altered). > But the much more basic question is whether: > > class foo: pass > > even *defines* a type that can be used in type declarations. The declaration defines a class. It specifies the initial attributes of the class. In CPython 1.5 at least, class declarations do not define types. If you expand the syntax so that, if the type is Instance, then you can give a class name instead, then this would imply that _any_ class object which can be refered to can be used. If I may: there is an issue here which some people may not have realised: recursive types. In an interface file, this can be handled by two passes. In implementation files, it is much harder, since scoping rules are dynamic. This is a good argument for interface files. Example: class X: def f(y:Y): ... class Y: def g(x:X): ... Resolving this in a single pass requires backpatching, which is messy: but using two passes leads to difficult ambiguities due to renaming: class X: def h(): ... Ok -- so now, which X does the X in g refer to? In python, names are bound dynamically, which resolves the problem. In an interface file, renaming can be banned. Can it be banned, for classes, in implementation files? -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From skaller@maxtal.com.au Mon Dec 20 19:20:38 1999 From: skaller@maxtal.com.au (skaller) Date: Tue, 21 Dec 1999 06:20:38 +1100 Subject: [Types-sig] Interfaces (was: definition of "type") References: <002f01bf4ab6$820a9c60$922d153f@tim> <385E2EF2.6CE888DC@prescod.net> <385E5695.1CAEF90B@vet.uu.nl> <385E5840.EF5ED124@prescod.net> <385E76FA.48F953DF@vet.uu.nl> Message-ID: <385E8186.3BC32DF4@maxtal.com.au> Martijn Faassen wrote: > class Foo conforms FooInter: How about class Foo is a FooInter: .. -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From tony@metanet.com Mon Dec 20 19:37:22 1999 From: tony@metanet.com (Tony Lownds) Date: Mon, 20 Dec 1999 11:37:22 -0800 (PST) Subject: [Types-sig] tuples (was: New syntax?) 
In-Reply-To: <199912201514.KAA04222@eric.cnri.reston.va.us> Message-ID: On Mon, 20 Dec 1999, Guido van Rossum wrote: > > The only reason not to switch to tuples is backwards compatibility -- > in particular there is a lot of code (e.g. in the std library) that > creates new arg lists by adding tuples to *args. This could be solved > by allowing + to operate on a mix of lists and tuples. I think the > result should yield a list. > There would be forwards compatability issues too; people might starting writing: class A: def foo(self, *args): args[:0] = [self] apply(foo, args) def bar(self, *args): ... This code would not work on existing Pythons. If this change would just be because of a lack of a way to say, a tuple of any length of type A, then may I suggest "tuple of A", e.g. def foo(*args: tuple of float) -> (float, float): # return the median and mode of its arguments ... -Tony From tim_one@email.msn.com Mon Dec 20 22:18:51 1999 From: tim_one@email.msn.com (Tim Peters) Date: Mon, 20 Dec 1999 17:18:51 -0500 Subject: [Types-sig] RE: [String-SIG] Python parser in Python? In-Reply-To: <385E2CA5.E88FE7B5@prescod.net> Message-ID: <000801bf4b38$32b5f2e0$b3a0143f@tim> I can only make time for one easy one, and ... lessee ... Paul wins! [Tim] > John Aycock's ... framework comes with a Python grammar. [Paul Prescod] > It depends on Python's built-in lexer: > > # > # Why would I write my own when GvR maintains this one? > # > import tokenize > > Doesn't that remove the possibility for new keywords? I'm going to respond a little more than John did, because tokenize.py has a funky API that takes some getting used to. Run the attached, and things will be clearer. tokenize.py doesn't know about keywords per se; all alphanumeric names (whether keyword or identifier) come back with the NAME token type. Deciding what's a keyword is a post-lexing decision (i.e., that's up to tokenize's caller). So unless the Types-SIG decides to prototype syntax unreasonably different from current Python's, the only likely way in which tokenize.py may need to be altered is in extending its Operator regexp. For example, the "->" in the attached is tokenized as two distinct OP tokens, "-" and ">". You can easily live with that by defining a *grammar* production to recognize that pair, but then you can't stop e.g. "- >" from getting treated as "->" too (tokenize suppresses intraline whitespace). Good enough for a prototype! Note that "-" followed by ">" is never legit Python today. Subtleties for tokenize newbies: a NEWLINE token terminates a stmt. An NL token is produced for an *intra*-stmt newline (NL does not terminate a stmt; you can usually ignore NL, and COMMENT, tokens). Changes in nesting level are signaled by INDENT and DEDENT tokens. Watch out for files whose final line is indented but doesn't end with \n (that's the only time you'll see a sequence of DEDENT tokens not immediately preceded by a NEWLINE token); Mark Hammond has no other kind of file . I'll be back next year, if not next week. Americans should leave cookies and milk out for Santa and his reindeer; people in other countries should set deadly traps for evil goat gods -- or whatever other foolishness they believe in. 
and-remember-that-whoever-writes-code-first-wins-l y'rs - tim import tokenize class TokDemo: def __init__(self, file): self.f = file def run(self): tokenize.tokenize(self.f.readline, self.gobbler) def gobbler(self, ttype, token, (sline, scol), (eline, ecol), line): print tokenize.tok_name[ttype], `token` example = """ def rootlist(n: Int, r: Real) -> [Real]: decl var result: [Real] result = [] decl var i: Int for i in range(n): result.append(i ** (1/r)) return result """ import StringIO d = TokDemo(StringIO.StringIO(example)) d.run() From paul@prescod.net Tue Dec 21 00:16:22 1999 From: paul@prescod.net (Paul Prescod) Date: Mon, 20 Dec 1999 18:16:22 -0600 Subject: [Types-sig] Basic questions References: <002501bf4878$d2faf240$63a2143f@tim> <385A7495.D7A25EC4@appliedbiometrics.com> <385E5BCA.E238292E@vet.uu.nl> <385E6747.4831D32@prescod.net> <385E6E4A.4D56A904@vet.uu.nl> Message-ID: <385EC6D6.E55B0A27@prescod.net> Martijn Faassen wrote: > > > ... I wonder if in practice this will mean > people will start to assign types to *everything* to make it work well > (or efficient) with types at all. If so then we need to somehow avoid > this. This is why I think that "make everything explicit" is too strong of a rule in practice. I want type-checked code and untype-checked code to work together more or less seamlessly. On the other hand, I don't want to get into complicated data flow analysis. Even if someone implemented it, how would we explain it to Python programmers? "In order to understand what types your program is producing, follow this complicated algorithm." That's why we are running away from strict (non-conservative) type inferencing in the first place. I think that the middle ground is more or less what I proposed last week. This is a class/static type definition: class a: pass This is not: if 1: class a: pass This is a function declaration where the function's type (Any->Any) is known at compile time: def a( b ): return "foo" This is not a static function declaration and cannot be used from static code without a type assertion: if 1: def a( b ): return "foo" I'm trying to keep thing simple. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself The occasional act of disrespect for the American flag creates but a flickering insult to the values of democracy -- unless it provokes America into limiting the freedoms that are its hallmark. -- Paul Tash, executive editor of the St. Petersburg Times From billtut@microsoft.com Tue Dec 21 00:55:14 1999 From: billtut@microsoft.com (Bill Tutt) Date: Mon, 20 Dec 1999 16:55:14 -0800 Subject: [Types-sig] Re: Py2C speed benefit Message-ID: <4D0A23B3F74DD111ACCD00805F31D8101D8BCB6F@RED-MSG-50> > From: Greg Stein [mailto:gstein@lyra.org] > > On Sun, 19 Dec 1999, skaller wrote: > >... > > > Now, here is something I believe, mainly from comments > > made at various times by Guido, Tim, and others: > > people have tried compiling python before, and found that > > the resulting C code didn't run much faster than the > > interpreter. Thats mainly because these compilers didn't > > know anythong about the types, they just generated API > > calls corresponding to what the byte code interpreter would > > execute -- and the interpreter is pretty fast already. > > Bill Tutt and I have done it and measured about 30% speed > improvement in > most cases. Not as lot as most people would hope for, but definitely > there. Bill is continuing to improve the code. > To clarify, this is just an approximate speed improvement in pystone. 
This doesn't (as yet) reflect a speed benefit when using typical OOP-like production code. I'm hoping to eventually find some time to optimize class-based method calls in Py2C. Bill From skaller@maxtal.com.au Tue Dec 21 17:29:22 1999 From: skaller@maxtal.com.au (skaller) Date: Wed, 22 Dec 1999 04:29:22 +1100 Subject: [Types-sig] Basic questions References: <002501bf4878$d2faf240$63a2143f@tim> <385A7495.D7A25EC4@appliedbiometrics.com> <385E5BCA.E238292E@vet.uu.nl> <385E6747.4831D32@prescod.net> <385E6E4A.4D56A904@vet.uu.nl> <385EC6D6.E55B0A27@prescod.net> Message-ID: <385FB8F2.343ED658@maxtal.com.au> Paul Prescod wrote: > This is why I think that "make everything explicit" is too strong of a > rule in practice. I want type-checked code and untype-checked code to > work together more or less seamlessly. I agree. > On the other hand, I don't want > to get into complicated data flow analysis. Even if someone implemented > it, how would we explain it to Python programmers? But this I do not understand. When an inferencer assigns types to a variable or function, there are three cases: (1) the types are what the programmer expected. Programmers usually expect a particular result. (2) the type is more general than the programmer expected. This is easy to explain: the inferencer isn't as smart as the programmer. If you want better types for these cases, add explicit type declarations. (3) the type is not what the programmer expected. In this case, there is a definite bug in the programmer's understanding of the code. (assuming the inference engine actually works correctly). [The bug may be in a function, or in a function call: that is, the programmer has to sort out whether the serving code is wrong, or the client code: does the function have to be generalised to meet the client's requirements, or does the client need to adjust the code to use the function as it was intended to be used??] It is only case (3) which is difficult. But, the difficulty is less than that which would result from a run time error, so the inferencing cannot reduce the programmer's understanding, only make them realise there are bugs earlier than they might wish to be reminded :-) Of course, there is a fourth case: the inferencer is producing the wrong answer. This would certainly confuse the programmer(s) -- probably both the client programmer and the author(s) of the inferencer :-) -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From gstein@lyra.org Tue Dec 21 19:21:43 1999 From: gstein@lyra.org (Greg Stein) Date: Tue, 21 Dec 1999 11:21:43 -0800 (PST) Subject: [Types-sig] parameterized typing In-Reply-To: <385CC508.D8684CEC@prescod.net> Message-ID: On Sun, 19 Dec 1999, Paul Prescod wrote: > Greg Stein wrote: > > .... > > Paul: does this sufficiently address your desire for parameterized types? > > Others: how does this look? It seems quite Pythonic to me, and is a basic > > extension of previous discussions (and to my thoughts of the design). > > Without thinking every detail through it looks good to me for handling > parameterized classes. I think that parameterized typedecls and > functions are still an issue. True. > Also, was it your intent that the _ be required or would the fact that > the param was declared obviate that. I am thinking that there may be a more > general syntax that allows us to parameterize various sorts of things. The "_" was just following Tim's lead. Certainly, there shouldn't be a requirement.
Maybe not even a convention (e.g. "self" is a convention rather than a requirement). I kind of like the leading underscore as it differentiates the param from regular arguments. > interface (a,b) foo: ... > class (a, b) foo: ... > def (a, b) foo(a) -> b: > decl foo(a,b) = typedef ... This has possibilities. Remember, though: I believe that parameterization is only useful for the type-checker. The Python runtime doesn't need it. In other words, our basis for choosing to do parameterization is based solely on a need for type checking. Is that need sufficient? I'm on the fence with parameterization altogether. The problem is that I'm not sure we can defer this one to a second phase because the type declarator syntax will probably be affected dramatically by the requirement for parameterization. i.e. we have to design it now to get the long-term type decl syntax correct :-( So: your idea for parameterization is nice, but I'd like to understand whether there is a strong feeling for having it in the first place. (before spending a lot of brain-cycles on the issue) Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Tue Dec 21 19:24:49 1999 From: gstein@lyra.org (Greg Stein) Date: Tue, 21 Dec 1999 11:24:49 -0800 (PST) Subject: [Types-sig] Issue: definition of "type" In-Reply-To: <385E2EF2.6CE888DC@prescod.net> Message-ID: On Mon, 20 Dec 1999, Paul Prescod wrote: > Tim Peters wrote: > > > spam is a class but not a static type. > > > > True, but it can be given a static type *name*; e.g., > > > > decl type spam > > > > Provided that the attributes of spam actually referenced outside of spam > > have the same signatures, static type checking outside of spam shouldn't > > care that it doesn't know about spam's internals. Or, IOW, if the two > > dynamic versions of spam present the same external interface to the > > compiler, it doesn't matter how the *class* spam comes into being at > > runtime. > > Okay, but do you or do you not agree that in the simple case of: > > class spam: > def a(self) -> String: > return "abc" > > a type object should be made implicitly as if someone had actually typed > in the decl. I certainly would not support a position that said that the > entire signature of spam had to be re-declared. I MIGHT support a > position that said that the user had to explicitly declare spam as being > available to the static type system. I believe that a typedecl object would be created implicitly with the above class definition. Even if the ->String wasn't present. Every class definition implies an interface typedecl. I concur: having to redeclare would suck. Explicit declaration is unneeded -- why should a person have to declare that the implicit type is usable? It is there, so it can be used. > I'm on the fence about this last requirement because I would like to > think that all of the code out there with class statements is *already* > defining a bunch of types. A minority of it depends on runtime > information and we can easily detect those cases. So why not let the > simple case of "defined class that doesn't depend on runtime > information" be a shortcut for a type declaration? Absolutely. Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Tue Dec 21 19:43:23 1999 From: gstein@lyra.org (Greg Stein) Date: Tue, 21 Dec 1999 11:43:23 -0800 (PST) Subject: [Types-sig] computing typedecl objects (was: New syntax?) In-Reply-To: <385E355E.9B15FA42@prescod.net> Message-ID: On Mon, 20 Dec 1999, Paul Prescod wrote: > Greg Stein wrote: > > > > ... > > > > Nah. 
No execution needs to take place. Just some data flow analysis. > > Let's be concrete: > 1. > > if somefunction(): > class a: > def b(self)->String: return "abc" > else: > class a: > def b(self)->Int: return 5 > > How many type objects are created? What are there names? What is the > type of a? There are three typedecl objects (not "type object"!). Let's annotate your code a bit to make this clearer: if somefunction(): class a: ... _internal_interface_a_1 = make_interface(a) # compiler implies this a.__typedecl = _internal_interface_a_1 # compiler remembering the type first_interface = typedef a # == _internal_interface_a_1 else: class a: ... _internal_interface_a_2 = make_interface(a) a.__typedecl = _internal_interface_a_2 second_interface = typedef a # type inferencer unions the type of a a.__typedecl = typedef _internal_interface_a_1 or _internal_interface_a_2 final_interface = typedef a The compiler does not create any names for these typedecl objects, although it does imply them (the "_internal.*" names demo this). I've also annotated that the compiler is remembering/associating a particular typedecl with the class object. Finally, I've shown where the user is explicitly fetching the typedecl object using the "typedef" keyword. At the end of the above code, "a" has a union typedecl (for the purposes of type checking/inferencing). The user has three typedecl objects held in three variables (first_interface, second_interface, final_interface). One point to make: "a" is a name referring to a class object. That is not the same as a typedecl object, although it can be used in some contexts where typedecl objects are needed. "typedef a" is definitely a typedecl object and it cannot be used to instantiate an object. It refers to an interface definition, actually. > 2. > > class a: > def b(self)->String: return "abc" > for i in sys.argv: > class a: > def b(self)->Int: return 5 Just before the "for" statement, "typedef a" returns one typedecl. After the "for" loop, it returns a different typedecl. Again: it will be a union of the two interfaces (because we don't know whether the loop executes zero or more iterations, so we can't know whether the class was redefined). If somebody gets smart and upgrades the type inferencer, it might be able to detect: class a: ... for i in range(10): class a: ... In this case, the inferencer knows the redefinition occurred, so it does not have to create a union type. > 3. > > def makeClass(): > class a: > def b( self ): > return "abc" > return a > > j=makeClass()() In this case, the "def" marks an analysis boundary. Its return type is "any". In a type-safe world, the makeClass()() fails because we cannot verify that a callable object was returned from makeClass. In a type-checked world, there is nothing wrong with the above code. > -------------------- > This seems intractable to me. I got around this in my original proposal > by requiring all declaring classes to be *top-level*. In other words I > formally defined the subset of Python that does not require code > execution. If you can formally define the semantics of "data flow" then > I will be able to compare the proposals. The data flow merely replaces typedecls [that are associated with names], or unions them if there are alternate code paths. The conditionals and things that push classes away from "top-level" do not confuse an inferencer. The type of your object will be a bit looser at type-check time than at runtime, however. 
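(As a toy illustration of "replace or union" -- invented names only, not any proposed checker API -- think of each branch as producing a name-to-types map; joining alternate paths unions the types recorded for each name:)

    def join(env_a, env_b):
        # union the possible types recorded for each name on two paths
        merged = {}
        for name in list(env_a.keys()) + list(env_b.keys()):
            union = []
            for t in env_a.get(name, ['Undefined']) + env_b.get(name, ['Undefined']):
                if t not in union:
                    union.append(t)
            merged[name] = union
        return merged

    # if somefunction():  class a: ...   -> implied interface I1
    # else:               class a: ...   -> implied interface I2
    then_path = {'a': ['I1']}
    else_path = {'a': ['I2']}
    print(join(then_path, else_path))   # {'a': ['I1', 'I2']} -- a union typedecl
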
The union will occur frequently when try/except is present: a = 1 try: a = "foo" ... except: ... typedef a == typedef Int or String > Note that I am half-way between you and Tim. I think that type objects > should be more like Python objects but I am willing to restrict where > they are created to make the problem tractable and the semantics > understandable. Please call them typedecl objects to avoid confusion with TypeType objects. typedecl objects are created through type declarators or implicitly by the inferencer and/or compiler. Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Tue Dec 21 19:46:46 1999 From: gstein@lyra.org (Greg Stein) Date: Tue, 21 Dec 1999 11:46:46 -0800 (PST) Subject: [Types-sig] typedefs (was: New syntax?) In-Reply-To: <385E4ED2.C3EEB28E@vet.uu.nl> Message-ID: On Mon, 20 Dec 1999, Martijn Faassen wrote: >... > I should finally work out my syntax proposal into something sensible > because now I'm confusing myself. :) I do still think there's something > interesting to be learned from the 'class instantiation' - 'typedef > instantiation' and 'value assignment' - 'type assignment' analogy. A summary would be good. I'm not sure at all where your position is because you've been discussing from each position at different times. Please create a bit of coherence :-) > [snip] > > In any case, I think using "def" inline to define a function typedecl is > > fine. A typedef is merely used to create an alias, to clarify a later > > declaration. > > Yes, but you basically have the same setup with current Python if you > exclude Lambdas. A function definition is merely used to create an > 'alias' for a piece of code, to clarify other pieces of code. If you I disagree that a function def is merely an alias. It provides a new namespace, parameter binding, and capabilities such as deferred execution. I definitely don't see it as simply an alias. > assume for the moment lambdas are bad, we may want to assume by analogy > that inline defs are not a good idea either. I don't think that argument follows. Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Tue Dec 21 20:04:07 1999 From: gstein@lyra.org (Greg Stein) Date: Tue, 21 Dec 1999 12:04:07 -0800 (PST) Subject: [Types-sig] Issue: definition of "type" In-Reply-To: <385E5695.1CAEF90B@vet.uu.nl> Message-ID: On Mon, 20 Dec 1999, Martijn Faassen wrote: > Paul Prescod wrote: > [snip] > > I'm on the fence about this last requirement because I would like to > > think that all of the code out there with class statements is *already* > > defining a bunch of types. A minority of it depends on runtime > > information and we can easily detect those cases. So why not let the > > simple case of "defined class that doesn't depend on runtime > > information" be a shortcut for a type declaration? > > Are you sure that in fact a minority depends on runtime information? > Often Python code is used without any inheritance link, like this: > > class Foo: > def doSomething(self): > ... > > class Bar: > def doSomething(self): > ... > > a = [Foo(), Bar()] > > for el in a: > el.doSomething() > > Doesn't this rely on run-time information? How would a type system deal > with this? I suppose I'm entering the domain of interfaces now... The type of "a" is a List where the elements' type is the union of the type of each initialization value. In this case: typedef a == typedef [Foo or Bar] Pretty straightforward, but I'd be happy to detail this. 
When the checker gets to el.doSomething, it knows that the type of el is "Foo or Bar". It then goes through each alternative and verifies that ".doSomething" is legal for that possibility. No problem :-) Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Tue Dec 21 20:05:33 1999 From: gstein@lyra.org (Greg Stein) Date: Tue, 21 Dec 1999 12:05:33 -0800 (PST) Subject: [Types-sig] Issue: definition of "type" In-Reply-To: <385E57D8.E5518928@prescod.net> Message-ID: On Mon, 20 Dec 1999, Paul Prescod wrote: > Martijn Faassen wrote: > >... > > Doesn't this rely on run-time information? How would a type system deal > > with this? I suppose I'm entering the domain of interfaces now... > > Yes, that is the role of interfaces. Nobody has yet suggested that the > code you described would be type-safe. The two doSomething methods are > unrelated. I maintain that it could be declared type-safe. In fact, it is reasonably straight-forward to generate the type information at each point, for each value, and then to verify that the .doSomething is valid. Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Tue Dec 21 20:09:36 1999 From: gstein@lyra.org (Greg Stein) Date: Tue, 21 Dec 1999 12:09:36 -0800 (PST) Subject: [Types-sig] polymorphic code (was: A challenge) In-Reply-To: <385E5BCA.E238292E@vet.uu.nl> Message-ID: On Mon, 20 Dec 1999, Martijn Faassen wrote: >... > Okay, but then I'll repeat the question I asked before: > > class Foo: > def getIt(self)->String: > ... > > class Bar: > def getIt(self)->String: > ... > list = [Foo(), Bar()] > > for el in list: > print el.doIt() > > This wouldn't work, even though the interfaces are similar. This brings I maintain that it will work :-) [ assuming your doIt() is a typo, and you intended getIt() ] >... > * interfaces > > Another way to do it is to use interfaces and say Foo and Bar both > conform to some interface which supports doIt(). This was something we > wouldn't discuss in this SIG, but can we in fact avoid it? We don't need any explicit interfaces to resolve the above code to determine that it is type-safe. "el" has one of two types: Foo or Bar. The (implicit) interface of each has a method named getIt that takes zero parameters. In this case, the "print" statement can take any type, so we don't even need to worry about the return types (even though they happen to be the same). Specifically: the type-checker does not have to unify the interface to verify type-safety or type-checks, it merely needs to check that each alternative for the type of "el" supports the required method and signature. Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Tue Dec 21 20:13:27 1999 From: gstein@lyra.org (Greg Stein) Date: Tue, 21 Dec 1999 12:13:27 -0800 (PST) Subject: [Types-sig] type-assert operator optimizations (was: New syntax?) In-Reply-To: <385E64C9.A82E84@maxtal.com.au> Message-ID: On Tue, 21 Dec 1999, skaller wrote: > Greg Stein wrote: > > > I think that we definitely want to be able to construct and use typedecl > > objects at runtime. That's why I prefer the typedef unary operator over > > your "sub-language." > > Are these options mutually exclusive? I'm not sure that I understand this question. I think some context was lost (i.e. what is the sub-language). > I've implemented operator ! in Viper now, x!t checks type(x) is t, >... > My current feeling: I quite like it -- but the above > is the only use I have tried, other than specifically > for testing it. 
My feeling, also, is that in those > circumstances where the test would fail, then the > program should be considered in error (that is, > it is not legitimate practice to catch and handle > the TypeError, so that if a compiler can prove it would > be raised, it is entitled to reject the program, > and a lint like checker, to issue a diagnostic. > [The explicit test, like in the second example above, > should be used if it is desired to catch and handle > the raised TypeError] > > This means that the x!t can be optimised to x, > without affecting strictly conforming program > semantics. If the compiler can definitively state that the test will never fail, then it doesn't have to include a runtime check. If the compiler can definitively state that the test will always fail, then it can issue an error and refuse to compile. [ with the caveat of catching exceptions ] If the compiler believes that it might fail in some cases, then it could issue a warning (and go ahead and insert a runtime check). [ and yes, there can be switches to avoid issuing warnings ] Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Tue Dec 21 20:17:09 1999 From: gstein@lyra.org (Greg Stein) Date: Tue, 21 Dec 1999 12:17:09 -0800 (PST) Subject: [Types-sig] Basic questions In-Reply-To: <385E6747.4831D32@prescod.net> Message-ID: On Mon, 20 Dec 1999, Paul Prescod wrote: >... > But the much more basic question is whether: > > class foo: pass > > even *defines* a type that can be used in type declarations. Greg says > yes, even if the declaration is buried in code. Tim says no,(I think) Definitely yes. The typedecl syntax allows the use of a class object as a way to specify a typedecl. Internally, the class contains a reference to an interface definition; the interface is the "real" typedecl. >... > 1. Is this valid: > > class foo: pass > > def a( arg: foo ): pass Absolutely. The compiler understands that "foo" refers to a class object, so it is allowed in a typedecl. > 2. Is this valid: > > if someFunc(): > class foo: "abc" > else: > class foo: "def" > > def a( arg: foo ): pass Absolutely. The compiler understands that "foo" refers to a class object, although it doesn't know which one. No matter, though, as it just associates a union typedecl with the name "foo". Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Tue Dec 21 20:19:22 1999 From: gstein@lyra.org (Greg Stein) Date: Tue, 21 Dec 1999 12:19:22 -0800 (PST) Subject: [Types-sig] New syntax? In-Reply-To: <385E6C8C.635293B2@4-am.com> Message-ID: On Mon, 20 Dec 1999, Evan Simpson wrote: >... > In one of the many messages I started composing for this SIG, then never sent, > I mixed regexp-style notation into your ML-style declarations. How's about: > > (T*) means T-tuple of unknown length, (T+) means length at least one, (T1?, > T2{1,2}, T3) means optional T1 followed by one or two T2's and exactly one T3. > This still requires (T,) for a single-T tuple, but all other uses are > distinguishable from grouping. An interesting approach, but it is a bit cryptic. Almost shades of Perl... :-) But hey: I haven't offered a better approach, so who am I to say? :-) Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Tue Dec 21 20:26:11 1999 From: gstein@lyra.org (Greg Stein) Date: Tue, 21 Dec 1999 12:26:11 -0800 (PST) Subject: [Types-sig] Basic questions In-Reply-To: <385E6E4A.4D56A904@vet.uu.nl> Message-ID: On Mon, 20 Dec 1999, Martijn Faassen wrote: >... > > 2. 
Is this valid: > > > > if someFunc(): > > class foo: "abc" > > else: > > class foo: "def" > > > > def a( arg: foo ): pass > > This is the really interesting one.. Perhaps interfaces can help here. > > One rule could be this: > > You can't define the same name multiple times in the same scope. You > have to do 'class foo1' and 'class foo2' instead, and then say they > both conform to the interface 'foo'. Icky. No way. Even if I didn't believe that the inferencer could resolve the above code fragment, I wouldn't like having to use different names. That feels like too much of an imposition (on the part of the type-checker) onto my code. > Consequences: > > * A separate interface declaration syntax would seem to be required. Nah. A class implies an interface. > Consequences I describe at the alternative rule apply too, I think. > > An alternative rule would be the following: > > Any class names that are defined multiple times in the same scope are > taken to support an interface with that same name. This interface is the > only type you can use elsewhere; you can't use the class type directly. > It is a compile time error if classes with the same name define > different interfaces. The typedecl is a union of the two (implied) interfaces. No reason to impose a single interface or to refuse the usage of the class name. > Consequences: > > * This may mean we enter access-rule land; it would be okay classes > conforming to an interface to define different member variables, as long > as these are private. I don't see this happening. > * The interface needs to be hooked up to the actual implementation > during runtime. This may happen as soon as a class (that the compiler > has seen has multiple definitions) is actually being defined at > run-time. I do agree that a class object would have an associated typedecl object at runtime. The typedecl would define the class' interface. > * These are odd interfaces in the sense that it looks as if you can > instantiate from them! What 'in fact' happens is that the interfaces > passes any instantiation requests to the actual class that's doing the > implementation -- the interface is a simple factory. I disagree. I would say the interface is just another typedecl. And typedecl objects are not callable (and are certainly not factories). Even if you wanted to make an interface instantiable, that just can't happen: an interface could be used by multiple classes. > The same story would apply to function signatures/prototypes; if the > same function name occurs multiple times in the same scope they're all > taken to define the same prototype, which would be the actual type used. Again: I disagree. The inferencer would associate different typedecl objects (signatures) with the name at different points in the execution. Depending on the control flow, each redefinition will cause a union of typedecl objects, or a replacment. Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Tue Dec 21 20:37:48 1999 From: gstein@lyra.org (Greg Stein) Date: Tue, 21 Dec 1999 12:37:48 -0800 (PST) Subject: [Types-sig] Basic questions In-Reply-To: <385E7832.81D4CFC7@maxtal.com.au> Message-ID: On Tue, 21 Dec 1999, skaller wrote: >... > If I may: there is an issue here which > some people may not have realised: recursive types. These are not possible in Python because definitions are actually constructed at runtime. The particular name/object must be available at that point in the execution. > In an interface file, this can be handled by > two passes. 
> > In implementation files, it is much > harder, since scoping rules are dynamic. > This is a good argument for interface files. > Example: > > class X: > def f(y:Y): ... This fails. Y is not defined. > class Y: > def g(x:X): ... > > Resolving this in a single pass requires > backpatching, which is messy: but using two > passes leads to difficult ambiguities > due to renaming: > > class X: > def h(): ... > > Ok -- so now, which X does the X in g refer to? Y.g referred to the X that existed at that point in time. > In python, names are bound dynamically, which resolves > the problem. In an interface file, renaming can be banned. > Can it be banned, for classes, in implementation files? No need to ban it. Y.g refers to the first X. Simple. Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Tue Dec 21 20:39:52 1999 From: gstein@lyra.org (Greg Stein) Date: Tue, 21 Dec 1999 12:39:52 -0800 (PST) Subject: [Types-sig] Interfaces In-Reply-To: <385E8186.3BC32DF4@maxtal.com.au> Message-ID: On Tue, 21 Dec 1999, skaller wrote: > Martijn Faassen wrote: > > class Foo conforms FooInter: > > How about > > class Foo is a FooInter: .. I don't think we should be worrying about how to explicitly declare and associate interfaces with classes. The type system can easily infer an interface from a class definition, and we can work with that. I also believe they do not need to be explicit for the type system to function. A later phase can make interfaces explicit. Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Tue Dec 21 20:41:13 1999 From: gstein@lyra.org (Greg Stein) Date: Tue, 21 Dec 1999 12:41:13 -0800 (PST) Subject: [Types-sig] compatibility (was: tuples) In-Reply-To: Message-ID: On Mon, 20 Dec 1999, Tony Lownds wrote: > On Mon, 20 Dec 1999, Guido van Rossum wrote: > > The only reason not to switch to tuples is backwards compatibility -- > > in particular there is a lot of code (e.g. in the std library) that > > creates new arg lists by adding tuples to *args. This could be solved > > by allowing + to operate on a mix of lists and tuples. I think the > > result should yield a list. > > There would be forwards compatability issues too; people might starting > writing: > > class A: > def foo(self, *args): > args[:0] = [self] > apply(foo, args) > > def bar(self, *args): > ... > > This code would not work on existing Pythons. This kind of stuff happens all the time. There is code out there with "assert" statements that don't work on old versions of Python. Python 1.6 has methods on the string objects; old versions do not. Cheers, -g -- Greg Stein, http://www.lyra.org/ From guido@CNRI.Reston.VA.US Tue Dec 21 20:42:33 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Tue, 21 Dec 1999 15:42:33 -0500 Subject: [Types-sig] Recursive types (was: Basic questions) In-Reply-To: Your message of "Tue, 21 Dec 1999 12:37:48 PST." References: Message-ID: <199912212042.PAA13406@eric.cnri.reston.va.us> > From: Greg Stein > > On Tue, 21 Dec 1999, skaller wrote: > >... > > If I may: there is an issue here which > > some people may not have realised: recursive types. > > These are not possible in Python because definitions are actually > constructed at runtime. The particular name/object must be available at > that point in the execution. Huh? "Recursive types" typically refers to all sorts of nodes, graphs and trees (where an instance attribute has the same type as its container). Certainly these are possible in Python! > > In an interface file, this can be handled by > > two passes. 
> > > > In implementation files, it is much > > harder, since scoping rules are dynamic. > > This is a good argument for interface files. > > Example: > > > > class X: > > def f(y:Y): ... > > This fails. Y is not defined. If I understand the context correctly (X defined before Y but using Y) I disagree. Since this works fine without type declarations (as long as the instantiations happen after the classes are defined) I don't see why adding static typing should break this. Also, I think that static typing should have a much more liberal view about forward referencing than Python itself. Since it is quite legal to have def f(): g() def g(): ... print f() I think that typecheckers should deal with this, and not complain about the forward reference to g in f! (Except when f is called before g is defined. Flow analysis should allow this distinction.) (Incidentally, this is one of the things that annoys me about Aaron Watters's kjpylint: it warns about all forward references. This conflicts with the top-down coding style that I currently prefer for procedural coding, where main() precedes everything it calls, and so on.) --Guido van Rossum (home page: http://www.python.org/~guido/) From gstein@lyra.org Tue Dec 21 20:50:44 1999 From: gstein@lyra.org (Greg Stein) Date: Tue, 21 Dec 1999 12:50:44 -0800 (PST) Subject: [Types-sig] Basic questions In-Reply-To: <385EC6D6.E55B0A27@prescod.net> Message-ID: On Mon, 20 Dec 1999, Paul Prescod wrote: >... > This is why I think that "make everything explicit" is too strong of a > rule in practice. I want type-checked code and untype-checked code to > work together more or less seamlessly. On the other hand, I don't want > to get into complicated data flow analysis. Even if someone implemented > it, how would we explain it to Python programmers? "In order to > understand what types your program is producing, follow this complicated > algorithm." That's why we are running away from strict > (non-conservative) type inferencing in the first place. Euh... why does it have to be explained? Why do Python programmers care what the types are? They know. The inferencer is just figuring out what the programmer did. The programmer doesn't have to understand it to produce valid programs. John responded to this in a different email listing the kinds of mismatches between the programmer's intent and the inferencer's deductions. He explains the situation well... >... > This is a class/static type definition: > > class a: pass Yes. > This is not: > > if 1: > class a: pass I disagree. This does create an implicit typedecl which can be used. In addition the class name "a" can be used. Caveat: typedef a == typedef or Undefined Specifically, the compiler may warn the programmer that "a" could possibly be undefined. [ because I really don't think we want to do constant evaluation in the inferencer. although if somebody does... cool! it could then removed the Undefined alternative. ] > This is a function declaration where the function's type (Any->Any) is > known at compile time: > > def a( b ): return "foo" Agreed. > This is not a static function declaration and cannot be used from static > code without a type assertion: > > if 1: > def a( b ): return "foo" I disagree again :-), for the same reasons as the class. > I'm trying to keep thing simple. An approach that I heartily agree with! However, I'd rather make it simple for the Python programmer: define it wherever you want, in whatever style you're using -- we won't force you to use a particular style. 
In other words, I think the rule of "it must be at the top-level, by which I mean yah.. at the globals level, but not inside an 'if' statement. oh, or inside a 'for' statement or a 'while' statement, for that matter. hrm. imports might enter into this somehow, too... lemme think..." I say just let them do what they want. I believe we can figure it out. Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Tue Dec 21 21:07:27 1999 From: gstein@lyra.org (Greg Stein) Date: Tue, 21 Dec 1999 13:07:27 -0800 (PST) Subject: [Types-sig] recursive types, type safety, and flow analysis (was: Recursive types) In-Reply-To: <199912212042.PAA13406@eric.cnri.reston.va.us> Message-ID: On Tue, 21 Dec 1999, Guido van Rossum wrote: > > From: Greg Stein > > > > On Tue, 21 Dec 1999, skaller wrote: > > >... > > > If I may: there is an issue here which > > > some people may not have realised: recursive types. > > > > These are not possible in Python because definitions are actually > > constructed at runtime. The particular name/object must be available at > > that point in the execution. > > Huh? "Recursive types" typically refers to all sorts of nodes, graphs > and trees (where an instance attribute has the same type as its > container). Certainly these are possible in Python! True... I use the stuff, too... I should have clarified that I don't think his particular example would work because of the compile- / definition- time recursion of the names. Runtime? Sure. It's fine. > > > In an interface file, this can be handled by > > > two passes. > > > > > > In implementation files, it is much > > > harder, since scoping rules are dynamic. > > > This is a good argument for interface files. > > > Example: > > > > > > class X: > > > def f(y:Y): ... > > > > This fails. Y is not defined. > > If I understand the context correctly (X defined before Y but using Y) > I disagree. Since this works fine without type declarations (as long > as the instantiations happen after the classes are defined) I don't > see why adding static typing should break this. Because Y is not defined. This is analogous to the following code: class Foo: def build_it(self, x, y, cls=Bar): return cls(x, y) class Bar: ... The above code breaks. I am positing that if you put "Y" into a declarator, then it should exist at that point in time. Where "time" is specified as following the flow of execution as the functions/classes are defined. > Also, I think that static typing should have a much more liberal view > about forward referencing than Python itself. Since it is quite legal > to have > > def f(): g() > def g(): ... > print f() > > I think that typecheckers should deal with this, and not complain > about the forward reference to g in f! (Except when f is called > before g is defined. Flow analysis should allow this distinction.) Good point. I don't think we can detect that call-before-definition, though. I think your point can be restated as: Can we type-check the following code? def f() -> String: return g() def g() -> String: ... I haven't thought about this particular scenario or the resulting impact on the inferencer. We probably require some kind of a two-pass analysis as John points out. Maybe it is as simple as deferring analysis of function bodies until the global "body" is analyzed. Actually, I think the deferral mechanism is sufficient, as that mirrors the execution environment: at the time the function body is executed, the globals are defined. [ with the caveat of call-before-define, but we can't determine that ] Hrm. 
The whole call-before-define problem might actually axe Paul's desire for type safety. I don't think we can *ever* guarantee that a name will exist at the time it is used. For example: value = 1 def typesafe f(): somefunc(value) del value f() If you start doing *control* flow analysis, then you might be able to definitely state the above code is in error. But then, I'll just throw this wrench at it: if sometest(): del value f() Now what? The type inferencing that I believe we can/should be using is based on some basic data flow, without regard to definitively determining whether a particular branch is reached or not. If a possibility exists, then the possible types are unioned in. If type-safety is defined as "no NameErrors", then we have a problem, as control flow is required. Note that I believe the following can be handled: value = 1 def typesafe f() func_taking_int(value) value = "foo" f() In this case, the global variable "value" has a typedecl of: Int or String. This would fail the func_taking_int() function call. Back to the other point: I do believe that you should not use a name in a type-declarator if it isn't defined. Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Tue Dec 21 21:11:21 1999 From: gstein@lyra.org (Greg Stein) Date: Tue, 21 Dec 1999 13:11:21 -0800 (PST) Subject: [Types-sig] Basic questions In-Reply-To: <385FB8F2.343ED658@maxtal.com.au> Message-ID: On Wed, 22 Dec 1999, skaller wrote: > Paul Prescod wrote: > > This is why I think that "make everything explicit" is too strong of a > > rule in practice. I want type-checked code and untype-checked code to > > work together more or less seamlessly. > > I agree. Me too! :-) [ although, strictly speaking, I'm not sure of the granularity of enabling type-checking, other than the presence/absence of type declarators. is the checking on a module level? if a function level, how do we indicate that? a new keyword(s)? ] > > On the other hand, I don't want > > to get into complicated data flow analysis. Even if someone implemented > > it, how would we explain it to Python programmers? > > But this I do not understand. When an inferencer assigns > types to a variable of function, there are three cases: > > (1) the types are what the programmer expected. >... > (2) the type is more general than the programmer expected. >... > (3) the type is not what the programmer expected. >... > It is only case (3) which is difficult. But, the difficulty > is less than that which would result from a run time error, so the > inferencing cannot reduce the programmers understanding, only > make them realise there are bugs earlier than they might wish to > be reminded :-) Excellent analysis! I heartily concur! > Of course, there is a fouth case: the inferencer is > producing the wrong answer. This would certainly confuse the > programmer(s) -- probably both the client programmer and the > author(s) of the inferencer :-) Ssshhh! Quiet! We don't talk about that around here. Cheers, -g -- Greg Stein, http://www.lyra.org/ From guido@CNRI.Reston.VA.US Tue Dec 21 23:38:58 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Tue, 21 Dec 1999 18:38:58 -0500 Subject: [Types-sig] recursive types, type safety, and flow analysis (was: Recursive types) In-Reply-To: Your message of "Tue, 21 Dec 1999 13:07:27 PST." 
References: Message-ID: <199912212338.SAA13830@eric.cnri.reston.va.us> [Greg Stein] > I should have clarified that I don't think his particular example would > work because of the compile- / definition- time recursion of the names. > Runtime? Sure. It's fine. Hm... Since type checking is essentially a compile time activity, I think it would be better if the run time order of events didn't matter. Yes, in Pascal or C you need to declare everything before it's used. This is a compromise because of old-fashioned one pass compiler technology. I don't see a reason why we should adopt this rule in Python. Note that Java doesn't have this either -- you can declare your methods in any order you like and the compiler will figure it out. But, you may say, in Python we have a certain run time order of events that defines the validity of names. Names must be defined before they are used, and a later redefinition overrides an earlier one. Of course. (Although I wouldn't mind getting at least a compile time warning when I define two classes or methods with the same name; it can be frustrating when you're editing the first definition but your program keeps using the second one! :-) But checking that names are defined by the time they are used at run time is a different kind of check. (Java does this too, to a decent extent.) I personally find this a very useful check. But it doesn't necessarily affect the compile time static typechecking. > Because Y is not defined. This is analogous to the following code: > > class Foo: > def build_it(self, x, y, cls=Bar): > return cls(x, y) > > class Bar: > ... > > The above code breaks. I am positing that if you put "Y" into a > declarator, then it should exist at that point in time. Where "time" is > specified as following the flow of execution as the functions/classes are > defined. I disagree. See above -- there's no reason to burden the compile time type checker with run time ordering. > I don't think we can detect that call-before-definition, though. But I think you can, in by far the most cases. There may be a few borderline cases where it's impossible to tell, and I don't mind requiring a little help to make the type checker happy. For example, in C code I frequently add an initialization of a local variable to 0 which isn't really necessary because it is initialized in a for loop, but the compiler isn't smart to figure out that the for loop will execute at least once. Gcc -Wall complains about such cases, and shutting it up completely every once in a while is sufficiently satisfying that I'll add redundant code. Of course, I'd be happier if gcc was smarter, and I hope that Python's type checker will usually be smarter -- and then in the remaining cases I think it's okay to help it. > I think your point can be restated as: Can we type-check the following > code? > > def f() -> String: > return g() > > def g() -> String: > ... > > I haven't thought about this particular scenario or the resulting impact > on the inferencer. We probably require some kind of a two-pass analysis as > John points out. Maybe it is as simple as deferring analysis of function > bodies until the global "body" is analyzed. Actually, I think the deferral > mechanism is sufficient, as that mirrors the execution environment: at the > time the function body is executed, the globals are defined. > [ with the caveat of call-before-define, but we can't determine that ] I don't see a big problem here for the type checker. 
Assuming that there's only one definition of g, and that we disallow changes to g from outside the module (and from exec statements), the type checker will have no trouble discovering that g is a global function definition, and it can collect its type info to help checking f. There may even be arbitrary cross references; the solution (from the type checking point of view) is to iterate until all definitions are found. Again, checking that g is actually defined by the time f is called is a separate thing; but again in most cases this will be easy, since there is usually no executable code between the definitions of f and g (except perhaps other function definitions). It's a simple flow check. > Hrm. The whole call-before-define problem might actually axe Paul's desire > for type safety. I don't think we can *ever* guarantee that a name will > exist at the time it is used. For example: > > value = 1 > def typesafe f(): > somefunc(value) > > del value > f() > > If you start doing *control* flow analysis, then you might be able to > definitely state the above code is in error. But then, I'll just throw > this wrench at it: > > if sometest(): > del value > f() > > Now what? Simple. After the if statement has executed, value is "possibly undefined". This warrants a warning. > The type inferencing that I believe we can/should be using is based on > some basic data flow, without regard to definitively determining whether a > particular branch is reached or not. If a possibility exists, then the > possible types are unioned in. If type-safety is defined as "no > NameErrors", then we have a problem, as control flow is required. I don't see the problem. I claim that the examples you give are accidents waiting to happen, so it's only helpful if the type checker complains about them! > Note that I believe the following can be handled: > > value = 1 > def typesafe f() > func_taking_int(value) > > value = "foo" > f() > > In this case, the global variable "value" has a typedecl of: Int or > String. This would fail the func_taking_int() function call. Yes. In my view, "possibly undefined" is no different than "int or string". > Back to the other point: I do believe that you should not use a name in a > type-declarator if it isn't defined. I like the idea better (I think proposed by Tim Peters) that the names used for type declarations live in a separate compile time namespace where different rules apply. (Even though there are obvious correspondences, e.g. the names of defined or imported classes should probably be available both at compile time and at run time.) --Guido van Rossum (home page: http://www.python.org/~guido/) From gstein@lyra.org Wed Dec 22 00:33:24 1999 From: gstein@lyra.org (Greg Stein) Date: Tue, 21 Dec 1999 16:33:24 -0800 (PST) Subject: [Types-sig] recursive types, type safety, and flow analysis In-Reply-To: <199912212338.SAA13830@eric.cnri.reston.va.us> Message-ID: On Tue, 21 Dec 1999, Guido van Rossum wrote: > [Greg Stein] > > I should have clarified that I don't think his particular example would > > work because of the compile- / definition- time recursion of the names. > > Runtime? Sure. It's fine. > > Hm... Since type checking is essentially a compile time activity, I > think it would be better if the run time order of events didn't > matter. But runtime order does matter. a = 1 func_taking_int(a) a = "foo" func_taking_string(a) I can come up with all kinds of variants, but that's the basic pattern. The code is perfectly type-safe, and depends on the order of events. 
If we waited until the end, then "a" will either have type "String," or type "Int or String." Either way, it produces false errors. > Yes, in Pascal or C you need to declare everything before > it's used. This is a compromise because of old-fashioned one pass > compiler technology. I don't see a reason why we should adopt this > rule in Python. Note that Java doesn't have this either -- you can > declare your methods in any order you like and the compiler will > figure it out. My new position (after your prodding my brain :-) is that each suite is evaluated in order. The global suite first, then each of the function bodies (in arbitrary order). This basically gives us a multiple pass, and allows functions, variables, classes, etc to be defined in any order. But: it still doesn't allow for the recursive type declarators. To be clear, it allows: def f() -> String: return g() def g() -> String: return "abc" But it does not allow: def f(x: Foo): ... class Foo: ... I believe the compiler should be recording information about the function arguments' typedecls. Unless the compiler is going to have multiple passes then the name should be defined before usage. Or rather, let's assume that the function argument information is constructed and recorded at runtime (as part of the standard function object construction at runtime). Then you really have to ensure that name is available, so the appropriate value can be stored into the function object. (this is, of course, predicated on recording signatures in the function object for use at runtime; I feel strongly that we should do this, as it will dramatically assist some runtime tools/apps such as IDLE) > But, you may say, in Python we have a certain run time order of events > that defines the validity of names. Names must be defined before they > are used, and a later redefinition overrides an earlier one. > > Of course. (Although I wouldn't mind getting at least a compile time > warning when I define two classes or methods with the same name; it > can be frustrating when you're editing the first definition but your > program keeps using the second one! :-) We can surely issue a warning for a redefinition that changes the type. > But checking that names are defined by the time they are used at run > time is a different kind of check. (Java does this too, to a decent > extent.) I personally find this a very useful check. But it doesn't > necessarily affect the compile time static typechecking. But we have runtime information to record, and I also believe that we have some runtime type checks to perform. In the above example, I think we should be implicitly inserting code like so: def f(x: Foo): x ! Foo ... While we can do a lot of useful compile-time checks, I think we still have runtime considerations that impose ordering. > > Because Y is not defined. This is analogous to the following code: > > > > class Foo: > > def build_it(self, x, y, cls=Bar): > > return cls(x, y) > > > > class Bar: > > ... > > > > The above code breaks. I am positing that if you put "Y" into a > > declarator, then it should exist at that point in time. Where "time" is > > specified as following the flow of execution as the functions/classes are > > defined. > > I disagree. See above -- there's no reason to burden the compile time > type checker with run time ordering. I've loosened up to "globals first, then function bodies." That provides a lot of relaxation of the requirements. 
(but it still does not allow for recursive declarators) To resolve the recursive declarator problem, I think we'd simply want a notion of an undefined interface (much like an undefined struct). I'm not sure of the mechanics for runtime resolution, but this would allow us to do something like: decl class Foo class Bar: def method(self, x: Foo)->None: ... class Foo: def method(self, x: Bar)->None: ... We still have a runtime issue, however :-( > > I don't think we can detect that call-before-definition, though. > > But I think you can, in by far the most cases. There may be a few You can within a single block of code. It is very difficult across code bodies, and requires an entirely different kind of analysis. My point was about different code bodies. > borderline cases where it's impossible to tell, and I don't mind > requiring a little help to make the type checker happy. For example, > in C code I frequently add an initialization of a local variable to 0 > which isn't really necessary because it is initialized in a for loop, > but the compiler isn't smart to figure out that the for loop will > execute at least once. Gcc -Wall complains about such cases, and > shutting it up completely every once in a while is sufficiently > satisfying that I'll add redundant code. Of course, I'd be happier if > gcc was smarter, and I hope that Python's type checker will usually be > smarter -- and then in the remaining cases I think it's okay to help > it. Sure. This is all within a single code body. I agree that we can provide use-before-definition errors. Going back to the original problem: def f(): g() def g(): ... There isn't a way to easily know that g is defined at the time f is called. We don't even record that g is used by f! The best that we can do is note that g is available in the global namespace and that it has a proper type. But we can't determine whether it might be Undefined or not. Specifically, consider the two cases: 1) del g f() 2) f() del g Unless the compiler knows that f() is going to use g, it can't do anything here. It has to do some *serious* control flow analysis and record a lot of information about f's requirements. We might be able to go one step beyond and say (during f's analysis) that g is possibly undefined, but we would get a lot of those warnings. That's because we don't really have/record information that distinguishes these patterns: 1) f() def g(): ... 2) def g(): ... f() 3) def g(): ... del g f() 4) def g(): ... f() del g In case 1, you could say that g is "Func or Undefined" simply by stating a policy that *any* global is "... or Undefined". The analysis of f() would then raise an appropriate flag. In case 2, any assumption of "or Undefined" is invalid. The code is fine. In cases 3 and 4, maybe we're assuming that not all globals get an "or Undefined" unless we see a "del" statement. That's fine, but we may have a false warning because we can't different cases 3 and 4. > > I think your point can be restated as: Can we type-check the following > > code? > > > > def f() -> String: > > return g() > > > > def g() -> String: > > ... > > > > I haven't thought about this particular scenario or the resulting impact > > on the inferencer. We probably require some kind of a two-pass analysis as > > John points out. Maybe it is as simple as deferring analysis of function > > bodies until the global "body" is analyzed. 
Actually, I think the deferral > > mechanism is sufficient, as that mirrors the execution environment: at the > > time the function body is executed, the globals are defined. > > [ with the caveat of call-before-define, but we can't determine that ] > > I don't see a big problem here for the type checker. Assuming that > there's only one definition of g, and that we disallow changes to g > from outside the module (and from exec statements), the type checker > will have no trouble discovering that g is a global function > definition, and it can collect its type info to help checking f. > There may even be arbitrary cross references; the solution (from the > type checking point of view) is to iterate until all definitions are > found. I agree. We can do this. This was my track about deferral. > Again, checking that g is actually defined by the time f is called is > a separate thing; but again in most cases this will be easy, since > there is usually no executable code between the definitions of f and g > (except perhaps other function definitions). It's a simple flow check. I disagree that it will be easy or that it is "a simple flow check." Checking for undefined names is really only easy within a single code body. As I outline further above, I don't think you can tell whether g is really defined by the time f is called. > > Hrm. The whole call-before-define problem might actually axe Paul's desire > > for type safety. I don't think we can *ever* guarantee that a name will > > exist at the time it is used. For example: > > > > value = 1 > > def typesafe f(): > > somefunc(value) > > > > del value > > f() > > > > If you start doing *control* flow analysis, then you might be able to > > definitely state the above code is in error. But then, I'll just throw > > this wrench at it: > > > > if sometest(): > > del value > > f() > > > > Now what? > > Simple. After the if statement has executed, value is "possibly > undefined". This warrants a warning. Right. But we don't cross-reference the fact that at after the if statement is one of the times that we call f() and that f() happens to need "value." Recording this information about "when" is control flow. I think we just want to record possible types and feed that into the type-checker. With the presence of "del" in the above code, maybe we can record "or Undefined", but that isn't really going to do what we'd like. > > The type inferencing that I believe we can/should be using is based on > > some basic data flow, without regard to definitively determining whether a > > particular branch is reached or not. If a possibility exists, then the > > possible types are unioned in. If type-safety is defined as "no > > NameErrors", then we have a problem, as control flow is required. > > I don't see the problem. I claim that the examples you give are > accidents waiting to happen, so it's only helpful if the type checker > complains about them! I agree, and it would be nice to have a warning. I don't think it is possible (given the scope of analysis that I'm thinking of). You would need a LOT more analysis to determine "undefined." And it would probably have to be global (cross-module). > > Note that I believe the following can be handled: > > > > value = 1 > > def typesafe f() > > func_taking_int(value) > > > > value = "foo" > > f() > > > > In this case, the global variable "value" has a typedecl of: Int or > > String. This would fail the func_taking_int() function call. > > Yes. In my view, "possibly undefined" is no different than "int or > string". 
Agreed. See above. > > Back to the other point: I do believe that you should not use a name in a > > type-declarator if it isn't defined. > > I like the idea better (I think proposed by Tim Peters) that the names > used for type declarations live in a separate compile time namespace > where different rules apply. (Even though there are obvious > correspondences, e.g. the names of defined or imported classes should > probably be available both at compile time and at run time.) I think we're going to want those names at runtime, which means they should be defined at the time of their usage. If we have a separate namespace, then I think the output of the compiler will need a bit more magic because it would need to build that namespace right at the beginning, for use by the code later on. i.e. some prologue which sets up the typedecl namespace. That just doesn't strike me as a good thing. Cheers, -g -- Greg Stein, http://www.lyra.org/ From guido@CNRI.Reston.VA.US Wed Dec 22 01:19:05 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Tue, 21 Dec 1999 20:19:05 -0500 Subject: [Types-sig] recursive types, type safety, and flow analysis In-Reply-To: Your message of "Tue, 21 Dec 1999 16:33:24 PST." References: Message-ID: <199912220119.UAA14134@eric.cnri.reston.va.us> [me] > > Hm... Since type checking is essentially a compile time activity, I > > think it would be better if the run time order of events didn't > > matter. [Greg] > But runtime order does matter. > > a = 1 > func_taking_int(a) > a = "foo" > func_taking_string(a) > > I can come up with all kinds of variants, but that's the basic pattern. > The code is perfectly type-safe, and depends on the order of events. If we > waited until the end, then "a" will either have type "String," or type > "Int or String." Either way, it produces false errors. If this pattern occurs locally (a is a local variable), fine. The flow analyzer will have no problem with this, and shouldn't find any type errors. But if a were a global, I'd say this was bad taste and asking for trouble. I'm giving up on responding point-by-point -- let's just agree that we differ in opinion on this matter. > But: it still doesn't allow for the recursive type declarators. To be > clear, it allows: > > def f() -> String: > return g() > def g() -> String: > return "abc" > > But it does not allow: > > def f(x: Foo): > ... > class Foo: > ... If there's only one Foo (which is usually the case) I still think this is too strict, and I don't see a technical reason why it would be necessary. > I believe the compiler should be recording information about the function > arguments' typedecls. Unless the compiler is going to have multiple passes > then the name should be defined before usage. > > Or rather, let's assume that the function argument information is > constructed and recorded at runtime (as part of the standard function > object construction at runtime). Then you really have to ensure that name > is available, so the appropriate value can be stored into the function > object. OK, this is why we disagree. I am only interested in compile time type checking; I can admit that some run time checking is necessary, but only in order to assert certain invariants that are assumed by the compile time checker. E.g. if I'm deducing that global X is a constant, I'm going to make sure at run time it really won't change. 
This catches several things: (1) dynamically loaded or generated code that surreptitiously tries to change the value of a constant (memories of Fortran...:); (2) other cases where (e.g. through unexpected aliasing) the constant might be changed. A form of type checking that happens completely at run time (the way you describe it) is uninteresting to me, and using such a system as the semantic basis for a type checker seems to be a mistake. Yes, this follows Python's semantics closer than what I am proposing. But I don't think that it is closer to what the user expects the type checker to do. Here's the crux of my argument: Python's dynamic semantics can often be surprising. Compile time checking should warn the user about these surprises, it shouldn't try to assume that these surprises are what the user wanted! (I've skipped the rest of what you wrote, because of the agreement to disagree.) --Guido van Rossum (home page: http://www.python.org/~guido/) From gstein@lyra.org Wed Dec 22 04:06:28 1999 From: gstein@lyra.org (Greg Stein) Date: Tue, 21 Dec 1999 20:06:28 -0800 (PST) Subject: [Types-sig] recursive types, type safety, and flow analysis In-Reply-To: <199912220119.UAA14134@eric.cnri.reston.va.us> Message-ID: On Tue, 21 Dec 1999, Guido van Rossum wrote: >... > I'm giving up on responding point-by-point -- let's just agree that we > differ in opinion on this matter. I'm not sure that I'm there yet :-) Basically, I think your request to find and report on use-before-definition is "intractable" *when* you're talking about multiple bodies of code (e.g. two functions, or the global space and a function). [ by "intractable", I mean within the scope of what I believe we want to build; the problem is certainly doable but I believe it would involve complex, global, control-flow analysis. ] >... > > But it does not allow: > > > > def f(x: Foo): > > ... > > class Foo: > > ... > > If there's only one Foo (which is usually the case) I still think this > is too strict, and I don't see a technical reason why it would be > necessary. I want compile time checks, but I also want function objects to contain typedecl information at runtime. I'm not talking about runtime type checks, just recording more information with the function objects. For example, I'd like to be able to say something like: for i in range(func.func_code.co_argcount): print func.func_code.co_varnames[i], ':', func.func_argtypes[i] > > I believe the compiler should be recording information about the function > > arguments' typedecls. Unless the compiler is going to have multiple passes > > then the name should be defined before usage. > > > > Or rather, let's assume that the function argument information is > > constructed and recorded at runtime (as part of the standard function > > object construction at runtime). Then you really have to ensure that name > > is available, so the appropriate value can be stored into the function > > object. > > OK, this is why we disagree. I am only interested in compile time > type checking; I can admit that some run time checking is necessary, > but only in order to assert certain invariants that are assumed by the > compile time checker. Actually, I'm assuming that runtime checks are *only* present to verify parameter values and when the type-assert operator is used. I do not believe we would ever insert them outside of these two cases. Asserting the types of parameters could be arguable. Back to the point: I think we're in agreement on compile-time vs run-time checks. 
The difference is that I have one more requirement: the typedecl information should be available at runtime (for introspection purposes). >... > A form of type checking that happens completely at run time (the way > you describe it) is uninteresting to me, and using such a system as > the semantic basis for a type checker seems to be a mistake. Sorry, I have been unclear if this was the result. I do not want a runtime-based type checker. I want compile time (*). The runtime checks for function parameters are just assertions to ensure that non-type-checked code does not pass the wrong thing. The runtime check for the type-assert operator is present because the person requested it. [ although it possible that the compiler can optimize away the assertion generatd by the type-assert operator if the compiler can determine that it will always fail or that it will never fail. ] Neither of these two classes of runtime checks are intended to replace any compile-time type checks. (*) strictly speaking, I don't care about compile-time checks, as I'm in this for (OPT) :-), but I'm attempting to design a solution that encompasses (ERR), too. > Yes, > this follows Python's semantics closer than what I am proposing. But > I don't think that it is closer to what the user expects the type > checker to do. I agree with you. > Here's the crux of my argument: > > Python's dynamic semantics can often be surprising. Compile time > checking should warn the user about these surprises, it shouldn't > try to assume that these surprises are what the user wanted! Agreed. > (I've skipped the rest of what you wrote, because of the agreement to > disagree.) Our difference lies in two items: * I do not believe that you can do cross-function, compile-time checks to determine if a name is undefined. [ or if a name has different types over time, which type it may be ] * I am requiring the ability to associate typedecl objects with a function object at runtime. This imposes the requirement on a typedecl name (such as a class' name) being defined at the point that a function is defined. [ I also want typedecl objects associated with a class object and a module object so that we can reflect on their interface at runtime ] We can agree to disagree on the first item (I'll let you write the code to do that :-). I'd like your opinion on the second. Cheers, -g -- Greg Stein, http://www.lyra.org/ From scott@chronis.pobox.com Wed Dec 22 04:44:15 1999 From: scott@chronis.pobox.com (scott) Date: Tue, 21 Dec 1999 23:44:15 -0500 Subject: [Types-sig] recursive types, type safety, and flow analysis In-Reply-To: References: <199912220119.UAA14134@eric.cnri.reston.va.us> Message-ID: <19991221234415.A12628@chronis.pobox.com> On Tue, Dec 21, 1999 at 08:06:28PM -0800, Greg Stein wrote: > On Tue, 21 Dec 1999, Guido van Rossum wrote: [...] > >... > > Basically, I think your request to find and report on > use-before-definition is "intractable" *when* you're talking about > multiple bodies of code (e.g. two functions, or the global space and a > function). > > [ by "intractable", I mean within the scope of what I believe we want to > build; the problem is certainly doable but I believe it would involve > complex, global, control-flow analysis. ] I'd agree that this has been demonstrated, but only for examples of code which seem like great candidates for compile time warnings. Are there examples which strike you otherwise? [...] > > I want compile time checks, but I also want function objects to contain > typedecl information at runtime. 
I'm not talking about runtime type > checks, just recording more information with the function objects. > > For example, I'd like to be able to say something like: > > for i in range(func.func_code.co_argcount): > print func.func_code.co_varnames[i], ':', func.func_argtypes[i] > This sounds great, but to what extent do you think it should affect the initial coding design? It seems to me like this sort of functionality is more likely a candidate for something quite post-prototype-version-1 code, and to a large extent could be added to a compile-time checking system that could store it's type assertions in a form usable at runtime. If that information is in the byte code (is that even feasible a remotely backword compatible fashion?), then planning for this needs to happen earlier. If it's acceptible that that information could be stored elsewhere, perhaps even (optionally) in the interpreter itself, It seems like this functionality could be relatively easy to add to an existing compile-time-only static typing mechanism without cluttering the initial develop of that compile-time-only static typing mechanism with what seems like a rather large new set of complexities. The way I see all this compile-time vs. run-time stuff is that 1) run time is much more complex and undesirable for several already stated reasons. 2) run time has the ability to further resolve some inadequately-typed-at-compile-time code. 3) run time offers myriads of cool python code interfaces for interacting with a stronger typing system. To me, it seems to make the most sense to develop a compile time base first, paying enough attention to the benefits of run-time use so as not to preclude it or present it with unduly large obstacles. And as the compile-time code begins to stabilize a bit, define an interface though which the compile time type information may become available to the interpreter at run time. I'd much rather have a fairly closed compile time type system in several months than a fairy closed compile+run type system in a few years, especially if the former doesn't preclude the latter. scott From gstein@lyra.org Wed Dec 22 06:02:08 1999 From: gstein@lyra.org (Greg Stein) Date: Tue, 21 Dec 1999 22:02:08 -0800 (PST) Subject: [Types-sig] recursive types, type safety, and flow analysis In-Reply-To: <19991221234415.A12628@chronis.pobox.com> Message-ID: On Tue, 21 Dec 1999, scott wrote: > On Tue, Dec 21, 1999 at 08:06:28PM -0800, Greg Stein wrote: >... > > Basically, I think your request to find and report on > > use-before-definition is "intractable" *when* you're talking about > > multiple bodies of code (e.g. two functions, or the global space and a > > function). > > > > [ by "intractable", I mean within the scope of what I believe we want to > > build; the problem is certainly doable but I believe it would involve > > complex, global, control-flow analysis. ] > > I'd agree that this has been demonstrated, but only for examples of > code which seem like great candidates for compile time warnings. Are > there examples which strike you otherwise? One of my points was that I do not believe you can issue warnings because you can't know whether a problem might exist. Basically, it boils to not knowing whether a global used by a function exists at the time the function is called. So you either issues warnings for all global usage, or you issue none. You can make a few guesses based on what happens in the global code body, but I don't think the guesses will really improve the quality of warnings. Examples? 
No, I don't really have any handy. Any example would be a short code snippet and people would say, "yah. that's bad. it should fail." But the issue is with larger bodies of code... that's what we're trying to fix! So... No, I don't have a non-trivial example. > [...] > > I want compile time checks, but I also want function objects to contain > > typedecl information at runtime. I'm not talking about runtime type > > checks, just recording more information with the function objects. > > > > For example, I'd like to be able to say something like: > > > > for i in range(func.func_code.co_argcount): > > print func.func_code.co_varnames[i], ':', func.func_argtypes[i] > > This sounds great, but to what extent do you think it should affect > the initial coding design? The origination of this discussion was based on the recursive type issue. If we have runtime objects, then I doubt we could support the recursive type thing without some additional work. Or, as I'm suggesting, you do not allow an undefined name (as specified by runtime/execution order) to be used in a typedecl. The design of how to handle recursive types depends on the decision to include/exclude runtime objects that define function, class, or module typedecl information. Even if we defer the runtime creation of those objects, it will affect the design today. > It seems to me like this sort of > functionality is more likely a candidate for something quite > post-prototype-version-1 code, and to a large extent could be added to > a compile-time checking system that could store it's type assertions > in a form usable at runtime. I'm all for deferring stuff, but unfortunately, I believe this affects the V1 design. > If that information is in the byte code (is that even feasible a > remotely backword compatible fashion?), then planning for this needs > to happen earlier. Bytecodes do not really need to be backwards compatible. The magic value in the header of a .pyc prevents use of an incorrect version of bytecodes. (see line 80 or so in Python/import.c) I do believe the information goes into the bytecode, but I don't think that is the basis for needing to plan now. Instead, we have to define the semantics of when/where those typedecl objects exist. Do we have them at runtime? Does a name have to exist (in terms of runtime execution) for it to be used in a typedecl, or does it just have to exist *somewhere*? If names must exist before usage, then how is the recursive type thing handled? With unspecified typedecls? (like an unspecified struct) Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Wed Dec 22 09:45:43 1999 From: gstein@lyra.org (Greg Stein) Date: Wed, 22 Dec 1999 01:45:43 -0800 (PST) Subject: [Types-sig] recursive types, type safety, and flow analysis In-Reply-To: <19991222033636.A14007@chronis.pobox.com> Message-ID: On Wed, 22 Dec 1999, scott wrote: >... > > One of my points was that I do not believe you can issue warnings because > > you can't know whether a problem might exist. Basically, it boils to not > > knowing whether a global used by a function exists at the time the > > function is called. So you either issues warnings for all global usage, or > > you issue none. You can make a few guesses based on what happens in the > > global code body, but I don't think the guesses will really improve the > > quality of warnings. 
>
> I personally can't imagine that it would be an issue to treat globals
> in functions as anything other than a simple flat-rule: for type
> checking purposes, globals must be defined at compile time in the
> global namespace, that's just me, but I'd probably fire any of the
> python programmers that work for me if they did what you describe
> above with globals in a large project :)

So it sounds like we agree? Treat globals simply, using a union of all the types that they may have in the global space (of course, noting that most sane people won't be changing the type!). Do not worry about control flow: specifically, what the type and/or defined-status is when a function is called.

> > Examples? No, I don't really have any handy. Any example would be a short
> > code snippet and people would say, "yah. that's bad. it should fail." But
> > the issue is with larger bodies of code... that's what we're trying to
> > fix! So... No, I don't have a non-trivial example.
>
> I can't even imagine one, so if there's any way to describe this
> global issue a little further without putting too much effort into it,
> I'd appreciate it.

I posted a set of 4 cases a few messages ago. Without control flow analysis, the type checker cannot determine which of the four cases is being used when it analyzes f(). Now just take one of those four patterns and drop it into a large module. Given that big old module, it would be nice to find problems with sequencing of type/defined-ness and function calls (because it is too big to eyeball; we want compiler support), but I'm saying "punt -- the compiler is not going to be able to provide any kind of adequate warning."

The compiler *will* be able to generally verify types. It just can't determine which of a set of alternatives an object will have at a specific point in time (assuming that object occurs in a different body of code than that which is being analyzed).

Am I being clear enough? It seems like I've said this about three times so far...

> > The origination of this discussion was based on the recursive type issue.
> > If we have runtime objects, then I doubt we could support the recursive
> > type thing without some additional work. Or, as I'm suggesting, you do not
> > allow an undefined name (as specified by runtime/execution order) to be
> > used in a typedecl.
>
> you could even allow typedecl to import modules for the sake of
> gaining access to the names, where those imports would only occur when
> the optional type checking is turned on. I'd agree that the use of an
> undefined name should be disallowed. With the presence of
> type-check-only import, following the same
> no-mutually-recursive-imports rule of the regular import, but only
> importing typedecl statements, you could achieve all this at compile
> time.

Actually, the recursive import issue is resolved by having a module registered which is incomplete. If you have:

--- a.py
import b

--- b.py
import a

>>> import a

Module "a" will get partially defined and then its code will be run. During that execution, the "import b" occurs and the "b" module is imported. Now the code for "b" runs and it says "import a". Since "a" has been partially defined (specifically, a name/module is entered into sys.modules), b.py can create a local name "a" referring to the module object that it finds in sys.modules (which is about to be filled in when the "import b" completes).

I'm suggesting a similar mechanism be made available to resolve the recursive typedecl issue.
Specifically, we provide a way to create a partially-defined ("incomplete") typedecl object and bind that to a name. That name can then be used; later, the name will become fully specified. More thought is needed here, but I'll hold off as this is still premised on runtime typedecl availability. >... > > I do believe the information goes into the bytecode, but I don't think > > that is the basis for needing to plan now. Instead, we have to define the > > semantics of when/where those typedecl objects exist. Do we have them at > > runtime? > > in the above, no, though we do have the ability to find a name > anywhere at compile time. > > >Does a name have to exist (in terms of runtime execution) for it > > to be used in a typedecl, or does it just have to exist *somewhere*? > > in the above, it has to exist in the typedecl 'execution' model, which > is during compile time. > > >If > > names must exist before usage, then how is the recursive type thing > > handled? With unspecified typedecls? (like an unspecified struct) > > How about an iterative model which continues until all typedecl names > are filled in? These three items form a possible alternative. You wouldn't really need an iterative model to gather typedecl names; two passes is sufficient. >... > For me, it is sufficient to proceed from the premiss that you can't > have static typing work on code that redefines types at run time, and > to limit runtime checking (for the time being) to optionally have the > interpreter take some action (warn or abort) when that happens. That > requirement alone implies that typedecl'd names and their typedecl > bodies need to be available at run time, which is sufficient to > support just about any future developments in a static-typeing > interface in pure python. I definitely agree with the second part. For the first, if I assume "redefines types at run time" as being "through some shady mechanism, redefining a typedecl object", then yes: we can/should limit static checking. If you're talking about a name having multiple types over a period of time, then I disagree: we can handle that case. Also, I think the runtime objects are for more than the occasional type assertion. >... > As an aside, I'm glad to learn it wouldn't be difficult to have python > put static type information in it's byte code. That seems like a good > place for it. I'm hoping it would be hung from the function, class, and module objects. > As weird as it is to have a separate type-decl name model, it seems > infintely to depict dynamic typing in a static typing model. I don't follow/parse this line... Cheers, -0g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Wed Dec 22 09:47:33 1999 From: gstein@lyra.org (Greg Stein) Date: Wed, 22 Dec 1999 01:47:33 -0800 (PST) Subject: FWD: [Types-sig] recursive types, type safety, and flow analysis Message-ID: I tried to "bounce" this to the SIG, but it looks like it got held/discarded for admin approval since it didn't have types-sig in the To: header. Forwarding this time... ---------- Forwarded message ---------- Date: Wed, 22 Dec 1999 03:36:36 -0500 From: scott To: Greg Stein Subject: Re: [Types-sig] recursive types, type safety, and flow analysis On Tue, Dec 21, 1999 at 10:02:08PM -0800, Greg Stein wrote: > On Tue, 21 Dec 1999, scott wrote: > > On Tue, Dec 21, 1999 at 08:06:28PM -0800, Greg Stein wrote: > >... > > > Basically, I think your request to find and report on > > > use-before-definition is "intractable" *when* you're talking about > > > multiple bodies of code (e.g. 
two functions, or the global space and a > > > function). [...] > > I'd agree that this has been demonstrated, but only for examples of > > code which seem like great candidates for compile time warnings. Are > > there examples which strike you otherwise? > > One of my points was that I do not believe you can issue warnings because > you can't know whether a problem might exist. Basically, it boils to not > knowing whether a global used by a function exists at the time the > function is called. So you either issues warnings for all global usage, or > you issue none. You can make a few guesses based on what happens in the > global code body, but I don't think the guesses will really improve the > quality of warnings. I personally can't imagine that it would be an issue to treat globals in functions as anything other than a simple flat-rule: for type checking purposes, globals must be defined at compile time in the global namespace, that's just me, but I'd probably fire any of the python programmers that work for me if they did what you describe above with globals in a large project :) > > Examples? No, I don't really have any handy. Any example would be a short > code snippet and people would say, "yah. that's bad. it should fail." But > the issue is with larger bodies of code... that's what we're trying to > fix! So... No, I don't have a non-trivial example. I can't even imagine one, so if there's any way to describe this global issue a little further without putting too much effort into it, I'd appreciate it. [...] > > The origination of this discussion was based on the recursive type issue. > If we have runtime objects, then I doubt we could support the recursive > type thing without some additional work. Or, as I'm suggesting, you do not > allow an undefined name (as specified by runtime/execution order) to be > used in a typedecl. you could even allow typedecl to import modules for the sake of gaining access to the names, where those imports would only occur when the optional type checking is turned on. I'd agree that the use of an undefined name should be disallowed. With the presence of type-check-only import, following the same no-mutually-recursive-imports rule of the regular import, but only importing typedecl statements, you could achieve all this at compile time. I've run into this issue on large projects, importing a classname, just to run assert isinstance(foo, thatclass), "complain meaningfully" But it hasn't come up with recursive types in any code I've seen, just deeply-complex types in terms of container and class hierarchy relationships. > > The design of how to handle recursive types depends on the decision to > include/exclude runtime objects that define function, class, or module > typedecl information. Even if we defer the runtime creation of those > objects, it will affect the design today. > indeed. [...] > > I do believe the information goes into the bytecode, but I don't think > that is the basis for needing to plan now. Instead, we have to define the > semantics of when/where those typedecl objects exist. Do we have them at > runtime? in the above, no, though we do have the ability to find a name anywhere at compile time. >Does a name have to exist (in terms of runtime execution) for it > to be used in a typedecl, or does it just have to exist *somewhere*? in the above, it has to exist in the typedecl 'execution' model, which is during compile time. >If > names must exist before usage, then how is the recursive type thing > handled? With unspecified typedecls? 
(like an unspecified struct) How about an iterative model which continues until all typedecl names are filled in? I understand your concern about 2 distinct namespace models being unsettling. It raises issues of what exactly we want out of static typing, and what sets of existing and future python code may benefit from static typing, and these are indeed big issues. For me, it is sufficient to proceed from the premiss that you can't have static typing work on code that redefines types at run time, and to limit runtime checking (for the time being) to optionally have the interpreter take some action (warn or abort) when that happens. That requirement alone implies that typedecl'd names and their typedecl bodies need to be available at run time, which is sufficient to support just about any future developments in a static-typeing interface in pure python. As an aside, I'm glad to learn it wouldn't be difficult to have python put static type information in it's byte code. That seems like a good place for it. As weird as it is to have a separate type-decl name model, it seems infintely to depict dynamic typing in a static typing model. scott From mwh21@cam.ac.uk Wed Dec 22 11:33:03 1999 From: mwh21@cam.ac.uk (Michael Hudson) Date: Wed, 22 Dec 1999 11:33:03 +0000 Subject: [Types-sig] recursive types, type safety, and flow analysis Message-ID: Whew! I go on holiday for a week and 400+ messages turn up on types-sig! I've scanned them all, but I'm not sure I'm not repeating others here. Exciting times... ---------- >From: Greg Stein >To: types-sig@python.org >Subject: Re: [Types-sig] recursive types, type safety, and flow analysis >Date: Wed, Dec 22, 1999, 6:02 am > >On Tue, 21 Dec 1999, scott wrote: >> On Tue, Dec 21, 1999 at 08:06:28PM -0800, Greg Stein wrote: >>... >> > Basically, I think your request to find and report on >> > use-before-definition is "intractable" *when* you're talking about >> > multiple bodies of code (e.g. two functions, or the global space and a >> > function). >> > >> > [ by "intractable", I mean within the scope of what I believe we want to >> > build; the problem is certainly doable but I believe it would involve >> > complex, global, control-flow analysis. ] >> >> I'd agree that this has been demonstrated, but only for examples of >> code which seem like great candidates for compile time warnings. Are >> there examples which strike you otherwise? > >One of my points was that I do not believe you can issue warnings because >you can't know whether a problem might exist. Basically, it boils to not >knowing whether a global used by a function exists at the time the >function is called. Which is because you CAN'T! For the very simple case (i.e. name assigned to at toplevel of module, never referred to in a "del" statement), you know everything about the lifetime of the variable, and for other cases you in general know nothing, because to know more for arbitrary cases involves solving the halting problem. If people want to typecheck code along the lines of a = 0 if some_function(): del a then, frankly, sod 'em. You could make allowances for code along the lines of if __debug__: verbose = 1 else: verbose = 0 but I don't think it's worth it. (which leads to an argument for being able to restrict types assigned to names, thinking about it...) 
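The bookkeeping for that sort of case is not deep; a small sketch of the branch-merging view (merge_branches and the type names here are invented for illustration, nothing more):

    UNDEFINED = 'Undefined'

    def merge_branches(*branch_envs):
        # Join the environments left by the branches of an "if": a name gets
        # the union of the types it had on each path, and a name missing
        # from some path picks up Undefined as one of its alternatives.
        names = set()
        for env in branch_envs:
            names.update(env)
        return {name: {env.get(name, UNDEFINED) for env in branch_envs}
                for name in names}

    # if __debug__: verbose = 1
    # else:         verbose = 0
    print(merge_branches({'verbose': 'Int'}, {'verbose': 'Int'}))
    # verbose is Int on both paths -- nothing to complain about

    # a = 0
    # if some_function(): del a
    print(merge_branches({}, {'a': 'Int'}))
    # a is Int or Undefined -- "possibly undefined", worth a warning

The __debug__ case merges cleanly; the del case is exactly the "possibly undefined" union discussed earlier in the thread.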
[snip] On a separate point, there is only one language that I can think of that is as dynamic (probably more so, actually) than Python, yet has optional static typing (mainly for OPT, to be sure, but you get some ERR, too), and that is ANSI Common Lisp. The more I read of this present discussion, the more I repsect the way CL is designed. The things that seem most immediately relavent are: 1) Just declaring the type for names is enough most of the time (eg. (declare (type *))), but occasionally you want to type expressions too (eg. (the expr)). 2) It's worth having distinguished "read" and "compile" phases (and a "macro expansion" phase would surely be nice...). 3) Type inference need not be demanded in a compiler, but it's nice when it's there. If anyone partaking in this discussion doesn't know CL's approach to types and typing, I'd heartily enjoin them to find out. Dylan might be relavent too, but I don't know that. From scott@chronis.pobox.com Wed Dec 22 11:47:24 1999 From: scott@chronis.pobox.com (scott) Date: Wed, 22 Dec 1999 06:47:24 -0500 Subject: [Types-sig] recursive types, type safety, and flow analysis In-Reply-To: References: <19991222033636.A14007@chronis.pobox.com> Message-ID: <19991222064724.A14726@chronis.pobox.com> On Wed, Dec 22, 1999 at 01:45:43AM -0800, Greg Stein wrote: > On Wed, 22 Dec 1999, scott wrote: > > I posted a set of 4 cases a few messages ago. Without control flow > analysis, the type checker cannot determine which of the four cases is > being used when it analyzes f(). for easy reference, they're at: http://www.python.org/pipermail/types-sig/1999-December/000935.html 2 of those 4 cases use 'del' in the global space. I'm inclined to believe that those 2 cases are beyond the scope of static type checking. I've even recently written code that does just that, creating a module-as-object interface, where the deleted variables are parallell to private variables in a class -- basically inaccessible. I would not expect static typing mechanism to grok that module, and would be fine with simply having it ignore the possibility of undefined names that result from 'del' or other runtime behavior. It certainly seems like an exceptional case that has undue complications in a static type system. Do you see any relatively easy way to handle it more accurately ? The other 2 cases seem handleable at compile time under a system which builds typedecl information by reference (*) -- iteratively or 2-pass, whichever works out best, sans flow-control analysis, and at compile time, with a separate typedecl-name namespace mechanism. Any information distinguishing case 1 from case 2 (like redefining g() inside f) would either fall under local namespace type-ing or something which should, IMO, not be handled by a static type system. [...] > > The compiler *will* be able to generally verify types. It just can't > handle a determine which of a set of alternatives an object will have at a > specific point in type (assuming that object occurs in a different body of > code than that which is being analyzed). > > Am I being clear enough? It seems like I've said this about three times so > far... Yes, I got it, or so I think :) But I think we may have 2 different expectations of something fairly basic: decl f(x: Int) -> Int|None decl g(x: Int) -> Int def g(x): return f(max(x, 1)) def f(x): if x > 0: return x else: return None I *want* the static typing to complain and to be warned or blow up with the message that according to the type information alone, g() is not verifiably an Int. 
It seems like you want this to work without complaints. Is this correct? > > > > The origination of this discussion was based on the recursive type issue. > > > If we have runtime objects, then I doubt we could support the recursive > > > type thing without some additional work. Or, as I'm suggesting, you do not > > > allow an undefined name (as specified by runtime/execution order) to be > > > used in a typedecl. I still don't see how enforcing your suggestion allows any compile time checking at all -- unless you you further qualify it with 'used in a typedecl as it operates at run time'. What happens at run time should become more clear once we come up with a way to provide run time access to compile time static typing. Run time behavior could even be programmable. [...] > I'm suggesting a similar mechanism be made available to resolve the > recursive typedecl issue. Specifically, we provide a way to create a > partially-defined ("incomplete") typedecl object and bind that to a name. > That name can then be used; later, the name will become fully specified. > More thought is needed here, but I'll hold off as this is still premised > on runtime typedecl availability. (*) yes, essentially the same way that struct node { int i; node * n; }; is OK, but struct node { int i; node n; } isn't. (gives 'incomplete type' with gcc). You need to be able to have a reference to a type and alter it as the type declarations are processed. > > >... > > > I do believe the information goes into the bytecode, but I don't think > > > that is the basis for needing to plan now. Instead, we have to define the > > > semantics of when/where those typedecl objects exist. Do we have them at > > > runtime? > > > > in the above, no, though we do have the ability to find a name > > anywhere at compile time. I'd like to recant this statement and replace it with: 1) The typedecl information is stored in an application-wide static type model which is created at compile time (implies typedecl specific import/#include mechanism). 2)The model is mapped to something potentially available at run time, eg bytecode with associated module, classe and function objects. 3)The runtime environment can do with that information what it pleases, but 1) and 2) need to be done first, and have a lot of potential for use, even without 3). > > > > >Does a name have to exist (in terms of runtime execution) for it > > > to be used in a typedecl, or does it just have to exist *somewhere*? > > > > in the above, it has to exist in the typedecl 'execution' model, which > > is during compile time. > > > > >If > > > names must exist before usage, then how is the recursive type thing > > > handled? With unspecified typedecls? (like an unspecified struct) > > > > How about an iterative model which continues until all typedecl names > > are filled in? > > These three items form a possible alternative. You wouldn't really need an > iterative model to gather typedecl names; two passes is sufficient. > > >... [...] > checking. If you're talking about a name having multiple types over a > period of time, then I disagree: we can handle that case. perhaps for local variables, but I don't see how with global variables unless that global variable is explicitly stated to be a union by the programmer, and the type model works out OK -- with atleast the option of working with my expectation of static typing of (global) unions as described above. > > Also, I think the runtime objects are for more than the occasional type > assertion. 
Indeed, there are lots of places where optimizations can be made at runtime if the types are known. I just don't think we're there yet. I don't think undefined names are an issue with possible future run-time optimizations and introspective interfaces, as it seems like a NameError would do it's job only after all namespaces are searched, and that something introspective wouldn't even run if a NameError was going to be raised. For the moment, it seems sufficient to make static, compile-time type information available at run time (optionally), and then we can decide what to do with that information at run time - optimize code, keep tabs on the types of variables at run time and make sure they match the static type information, etc. > > As weird as it is to have a separate type-decl name model, it seems > > infintely to depict dynamic typing in a static typing model. ^ insert 'weirder ' here guido should threaten to kill sigs more often :) scott __off_topic_but_interesting_from_here_down__ > > > > you could even allow typedecl to import modules for the sake of > > gaining access to the names, where those imports would only occur when > > the optional type checking is turned on. I'd agree that the use of an > > undefined name should be disallowed. With the presence of > > type-check-only import, following the same > > no-mutually-recursive-imports rule of the regular import, but only > > importing typedecl statements, you could achieve all this at compile > > time. > > Actually, the recursive import issue is resolved by have a module > registered which is incomplete. If you have: > > --- a.py > import b > > --- b.py > import a > > >>> import a right, but there are limits: --- a.py from b import c d = 1 --- b.py from a import d c = 1 doesn't work: 1222 04:47 shock:~% python a.py Traceback (innermost last): File "a.py", line 1, in ? from b import c File "b.py", line 1, in ? from a import d File "a.py", line 1, in ? from b import c ImportError: cannot import name c > scott From skaller@maxtal.com.au Wed Dec 22 19:12:50 1999 From: skaller@maxtal.com.au (skaller) Date: Thu, 23 Dec 1999 06:12:50 +1100 Subject: [Types-sig] type-assert operator optimizations (was: New syntax?) References: Message-ID: <386122B2.5B2932B0@maxtal.com.au> Greg Stein wrote: > > This means that the x!t can be optimised to x, > > without affecting strictly conforming program > > semantics. > > If the compiler can definitively state that the test will never fail, then > it doesn't have to include a runtime check. > > If the compiler can definitively state that the test will always fail, > then it can issue an error and refuse to compile. > [ with the caveat of catching exceptions ] > > If the compiler believes that it might fail in some cases, then it could > issue a warning (and go ahead and insert a runtime check). > [ and yes, there can be switches to avoid issuing warnings ] These semantics are (a) incoherent/inconsistent (b) not the same as what I propose I want to explain both points. (b) first: I propose that if the test were to fail at any point in the execution of the program, the program is invalid, and the translator can do anything it wants: behaviour is undefined. So the test can be elided. If the test would always succeed, then it can be elided. It follows the test can ALWAYS be elided. Now for (a). There is an assumption: that there is no definite algorithm given for deducing if the test will fail. 
In this case, it is possible that compiler (A) deduces the test will always fail, and rejects the program, while compiler (B) isn't smart enough, and compiles code to raise an exception. In this case, a programmer may catch the exception and handle it, and this behaviour would be required of the language. But that is NOT the behaviour (A) produced. If one compiler can reject the program, it CANNOT be a valid program, and in that case, a requirement on a compiler (that it throw an exception if it is dumb) cannot be made to stick, since the program is in error. I think you must decide that the semantics require a run time error ALWAYS, or, that the test can be elided ALWAYS. There is no half way ground. The current requirements for assertion statements are, effectively, that the test can be elided, and therefore, invalid programs exist. The fact that the current CPython interpreter in non-optimising mode raises an exception is nicety of that particular implementation, not a requirement of the language. I'm assuming 1) there is ONE python language 2) both the optimising and non-optimising byteocode compiler conform to the semantics and from this I deduce the above language semantics. Remember language semantics are constraints on translators, they're not specifications of what a particular tool does in cases that no particular behaviour is required. -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From skaller@maxtal.com.au Wed Dec 22 19:45:40 1999 From: skaller@maxtal.com.au (skaller) Date: Thu, 23 Dec 1999 06:45:40 +1100 Subject: [Types-sig] typedefs (was: New syntax?) References: Message-ID: <38612A64.12E5F327@maxtal.com.au> Greg Stein wrote: > > Yes, but you basically have the same setup with current Python if you > > exclude Lambdas. A function definition is merely used to create an > > 'alias' for a piece of code, to clarify other pieces of code. If you > > I disagree that a function def is merely an alias. It provides a new > namespace, parameter binding, and capabilities such as deferred execution. > I definitely don't see it as simply an alias. Greg is correct in at least one sense: when a 'def' is executed, the function is given a particular name which can be retrieved, the same applies to classes. These names are independent of a variable which happens to be bound to the function or class: def f(): pass g = f del f The function refered to by variable g has name 'f'. Def is not an alias for a lambda, even if lamda were extended to provide a suite in which statements could be written and locals exist. The case of functions is not interesting here, but the case of classes certainly is, a point I missed in a previous post. When a class is refered to in a type declaration, we have to decide if the reference is an evaluable expression, or is the name of a class -- independent of any variables. In the second case, we must disallow two classes having the same name, or permit an ambiguity in the type declaration. I'm half guessing that Guido would be happy to prohibit definition of two classes with the same name. Unfortunately, this leads to a problem: clearly this restriction is not global, but only 'per namespace'. Which opens up the question: how are namespaces identified. For modules, the module name? For functions, the function name? 
Sounds good: we can ban duplicate definitions of classes, identifying a class by its fully qualified name: the full package name of the containing module, then any enclosing functions and classes, finally the class name. But note that for a class enclosed in a function, the class has a transitory life. So in this case, it isn't clear that static analysis means anything, since two _distinct_ classes can have the same name at different times. This also applies to other scopes, where class objects can be deleted. To make the issue even more complex, the lifetime of a class is not determined solely by the existence of a variable bound to it -- since a class can be a base of another, or refered to by one of its instances. So a ban on duplicate names might be hard to enforce, and wouldn't make any sense in the function case .. so I'm half guessing Guido will not be so happy to ban duplicate definitions :-) Now, where does this lead? I think it leads to a requirement for syntax to _specify_ a class is static. In this case, the class is immortal, will not be deleted, and no other of the same name may be created. And THEN, we can use the class names in type declarations (of function parameters). Summary: if you want a function parameter to have a specified class type, then the class definition must be specified as static. This can be assumed in an interface file, but must be specified by additional syntax in an implementation file. It looks like Guido is right: interface files are the only way to get enough control to actually do static analysis/checking, without adding a lot more syntax to the python (implementation) language. BTW: I want to propose some terminology here: when a type declaration is given in a function definition, that declaration is called an INLINE declaration, to distinguish it from a declaration in an interface file, or a stand-alone prototype which is not a definition. def f(x : Int): ... ^^^^^ this is an INLINE declaration given that terminology, it would seem that inline declarations cannot work well UNLESS there is also a non-inline declaration of the type being used (Int in the example). -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From gstein@lyra.org Wed Dec 22 19:57:20 1999 From: gstein@lyra.org (Greg Stein) Date: Wed, 22 Dec 1999 11:57:20 -0800 (PST) Subject: [Types-sig] type-assert operator optimizations In-Reply-To: <386122B2.5B2932B0@maxtal.com.au> Message-ID: On Thu, 23 Dec 1999, skaller wrote: > Greg Stein wrote: > > > This means that the x!t can be optimised to x, > > > without affecting strictly conforming program > > > semantics. > > > > If the compiler can definitively state that the test will never fail, then > > it doesn't have to include a runtime check. > > > > If the compiler can definitively state that the test will always fail, > > then it can issue an error and refuse to compile. > > [ with the caveat of catching exceptions ] > > > > If the compiler believes that it might fail in some cases, then it could > > issue a warning (and go ahead and insert a runtime check). > > [ and yes, there can be switches to avoid issuing warnings ] > > These semantics are > > (a) incoherent/inconsistent They're fine to me :-) > (b) not the same as what I propose Tough. I proposed it first. :-) > I want to explain both points. 
(b) first: > I propose that if the test were to fail at any point > in the execution of the program, the program is invalid, Sorry, but that is with your "exceptions-are-evil" model, and we've covered that before. Exceptions are part of Python, and their rigorous specification is also part of Python. The type-assert operator is *defined* to raise an exception if the type is wrong. Period. There are no statements about the validity of the program. >... point moot; I disagree with the premise ... > > Now for (a). There is an assumption: that there is no definite > algorithm given for deducing if the test will fail. If the compiler sees: x = 1 ! type(1) Assuming that "type" cannot be altered, then the compiler can most assuredly elide the check because it knows it will always succeed. Given: x = 1 ! type("") It will always fail, and the compiler may as well stop right there. [ with the caveat of possible switches that turn this in a warning or ignore it or whatever... ] I do grant you, however, that in most cases the compiler *cannot* make a definitive statement about what will happen at runtime with that operator. So it just puts the assertion code in, and continues on. No biggy. [ personally, I probably wouldn't even bother with trying to elide the code; the type checker could easily comment on it, though ] > In this case, it is possible that compiler (A) deduces > the test will always fail, and rejects the program, > while compiler (B) isn't smart enough, and compiles code > to raise an exception. In this case, a programmer > may catch the exception and handle it, and this > behaviour would be required of the language. > But that is NOT the behaviour (A) produced. Okay. Let's work on the assumption that we have different behaviors from the compilers. > If one compiler can reject the program, it CANNOT be > a valid program, and in that case, a requirement > on a compiler (that it throw an exception if it is dumb) > cannot be made to stick, since the program is in error. If one compile rejects it, then it is just smarter. It doesn't say anything about the validity of the program. ---- foo.py raise "an exception" ---- I maintain the above is an entirely valid program. Your smart compiler might say "sorry, that always raises an exception, so I'm not going to bother compiling it." Fine. But that doesn't alter the fact the program is valid. > I think you must decide that the semantics require > a run time error ALWAYS, or, that the test can > be elided ALWAYS. There is no half way ground. Yes, there is a half-way ground. In the "exceptions-are-evil" model, maybe not. But in the standard Python model, I can certainly elide tests that I know will always succeed. Granted: it is hard to know that, but when I *can*, then I'm free to remove the tests. Runtime errors are fine, and they do not imply that the program is in error and the compiler should have rejected it. It would *nice* if the compiler did, but life is a bitch. Very few C compilers will complain about: void main(void) { *(int *)0 = 5; } But it certainly bombs quite quickly. > The current requirements for assertion statements > are, effectively, that the test can be elided, > and therefore, invalid programs exist. The fact > that the current CPython interpreter in non-optimising > mode raises an exception is nicety of that particular > implementation, not a requirement of the language. IMO, you have this entirely wrong. CPython defines the language (where it isn't explicit in the language and library reference manuals). 
The fact that exceptions are raised means they are part of the language, and therefore a requirement. Guido has been using his JPython chips every now and then to define the language, but generally speaking: CPython is the definition and states the requirements for all Python implementations. > I'm assuming > > 1) there is ONE python language > 2) both the optimising and non-optimising > byteocode compiler conform to the semantics > > and from this I deduce the above language semantics. > Remember language semantics are constraints on translators, > they're not specifications of what a particular tool does > in cases that no particular behaviour is required. Your two assumptions are correct. But I think you're assuming the wrong semantics. Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Wed Dec 22 19:33:43 1999 From: gstein@lyra.org (Greg Stein) Date: Wed, 22 Dec 1999 11:33:43 -0800 (PST) Subject: [Types-sig] recursive types, type safety, and flow analysis In-Reply-To: <19991222064724.A14726@chronis.pobox.com> Message-ID: On Wed, 22 Dec 1999, scott wrote: > On Wed, Dec 22, 1999 at 01:45:43AM -0800, Greg Stein wrote: > > On Wed, 22 Dec 1999, scott wrote: > > > > I posted a set of 4 cases a few messages ago. Without control flow > > analysis, the type checker cannot determine which of the four cases is > > being used when it analyzes f(). > > for easy reference, they're at: > http://www.python.org/pipermail/types-sig/1999-December/000935.html > > 2 of those 4 cases use 'del' in the global space. I'm inclined to > believe that those 2 cases are beyond the scope of static type > checking. >... > The other 2 cases seem handleable at compile time under a system which > builds typedecl information by reference (*) -- iteratively or 2-pass, > whichever works out best, sans flow-control analysis, and at compile > time, with a separate typedecl-name namespace mechanism. Any > information distinguishing case 1 from case 2 (like redefining g() > inside f) would either fall under local namespace type-ing or something > which should, IMO, not be handled by a static type system. Yes, 2 can be handled and 2 cannot. My point was that while you're analyzing f(), you cannot know which of the 4 cases are present in the global script. Therefore you cannot issue warnings for f's use of g(). [ and funny heuristics may just serve to make the warnings issue non-deterministic from a human's standpoint ] > [...] > > The compiler *will* be able to generally verify types. It just can't > > handle a determine which of a set of alternatives an object will have at a > > specific point in type (assuming that object occurs in a different body of > > code than that which is being analyzed). > > > > Am I being clear enough? It seems like I've said this about three times so > > far... > > Yes, I got it, or so I think :) But I think we may have 2 different > expectations of something fairly basic: > > decl f(x: Int) -> Int|None > decl g(x: Int) -> Int > > def g(x): > return f(max(x, 1)) > > def f(x): > if x > 0: > return x > else: > return None > > I *want* the static typing to complain and to be warned or blow up > with the message that according to the type information alone, g() is > not verifiably an Int. It seems like you want this to work without > complaints. Is this correct? Incorrect. I said that we can perform type verification, but we cannot know a global name's type (or existance) at a point in time. We collect (and union if necessary) the types of globals. Then we analyze the functions. 
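Concretely, the first pass can be pictured as nothing more than accumulating a per-name union while walking the module body, and the second pass as checking function bodies against those unions. A toy sketch with invented data (a real analyzer would of course walk the parse tree, not a hand-written list):

module_body = [
    ("a", "Int"),       # a = 1
    ("a", "String"),    # a = "foo"
    ("b", "Int"),       # b = 2
]

globals_seen = {}
for name, typ in module_body:               # pass 1: collect and union
    types = globals_seen.setdefault(name, [])
    if typ not in types:
        types.append(typ)

for name in sorted(globals_seen.keys()):    # what pass 2 would then assume
    print("decl global %s: %s" % (name, " or ".join(globals_seen[name])))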
In your example above, we find the types of f and g (as you listed in the decl statements). When we turn to analyzing g(), we find a mismatch between the return value of f() and g's return value and flag it. Note that you don't have to declare f and g beforehand, but could just rely on their definition to work: def g(x: Int)->Int: return f(max(x, 1)) def f(x: Int)->Int or None: ... Since we collect the globals' type information before looking at the function bodies, we know the typedecl for f() before we analyze g. > > > > The origination of this discussion was based on the recursive type issue. > > > > If we have runtime objects, then I doubt we could support the recursive > > > > type thing without some additional work. Or, as I'm suggesting, you do not > > > > allow an undefined name (as specified by runtime/execution order) to be > > > > used in a typedecl. > > I still don't see how enforcing your suggestion allows any compile time > checking at all -- unless you you further qualify it with 'used in a typedecl > as it operates at run time'. Given: Int = type(1) String = type("") def f(x=len(sys.path): Int)->String: return str(x) At runtime, Int and String must be defined when the function object is constructed (because their values are stored into the function object). This is analogous to requiring that sys.path and len must be defined. That is the specified runtime behavior. The question: what happens at compile time? The compiler knows that Int and String are defined (earlier in the program) and what they represent. It performs the type checking as we expect it to. Let's be concrete and say this declaration occurs in the global code body. As I've previously stated, the global body is analyzed first. It is analyzed in runtime order! (top to bottom) As it steps through, and reaches the f() definition, it knows the types/values of Int and String; it then records that information as part of f's signature for later use. If the line assigning to String is shift below f(), then we have an error: when the analyzer reaches f(), it has not yet seen a definition for String. >... > yes, essentially the same way that > > struct node { > int i; > node * n; > }; > > is OK, but > > struct node { > int i; > node n; > } > > isn't. (gives 'incomplete type' with gcc). You need to be able to have > a reference to a type and alter it as the type declarations are > processed. Correct. That is what I'm suggesting. >... > > > > I do believe the information goes into the bytecode, but I don't think > > > > that is the basis for needing to plan now. Instead, we have to define the > > > > semantics of when/where those typedecl objects exist. Do we have them at > > > > runtime? > > > > > > in the above, no, though we do have the ability to find a name > > > anywhere at compile time. > > I'd like to recant this statement and replace it with: > > 1) The typedecl information is stored in an application-wide static > type model which is created at compile time (implies typedecl specific > import/#include mechanism). > > 2)The model is mapped to something potentially available at run time, > eg bytecode with associated module, classe and function objects. > > 3)The runtime environment can do with that information what it > pleases, but 1) and 2) need to be done first, and have a lot of > potential for use, even without 3). Yes, yes, and yes. And: 4) the design needs to incorporate (3) even though we may be deferring its implementation. >... > > checking. 
If you're talking about a name having multiple types over a > > period of time, then I disagree: we can handle that case. > > perhaps for local variables, but I don't see how with global variables > unless that global variable is explicitly stated to be a union by the > programmer, and the type model works out OK -- with atleast the option > of working with my expectation of static typing of (global) unions as > described above. The global case is the same as the local case. Sequence through the statements, looking at what happens to the types at each point. Take this example: a = 1 f(a) a = "foo" f(a) As we step through, we know that "a" starts as an Int and that we call f() with an Int. "a" then becomes a String and we call f() with a String. Now that we've processed the global body, we start processing the function bodies. But first, we create a union of all the types that "a" ever used; effectively, the functions see the following declaration: decl global a: Int or String [ it is possible to consider adding "or Undefined" in there for completeness, but that may cause more trouble than it saves ] So, to answer your question: we can handle varying types. But it does take a qualification -- it can only be handled within a single code body. When you consider the global body vs. a function body, then we must be conservative and take the union of all the types a name ever had. Cheers, -g -- Greg Stein, http://www.lyra.org/ From skaller@maxtal.com.au Wed Dec 22 20:04:43 1999 From: skaller@maxtal.com.au (skaller) Date: Thu, 23 Dec 1999 07:04:43 +1100 Subject: [Types-sig] Issue: definition of "type" References: Message-ID: <38612EDB.6C1C119D@maxtal.com.au> Greg Stein wrote: > > a = [Foo(), Bar()] > > > > for el in a: > > el.doSomething() > > > > Doesn't this rely on run-time information? How would a type system deal > > with this? I suppose I'm entering the domain of interfaces now... > > The type of "a" is a List where the elements' type is the union of the > type of each initialization value. No. Extra values can be appended of other types. If you want to have lists of a particular type, this must be declared as a constraint. -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From gstein@lyra.org Wed Dec 22 20:16:33 1999 From: gstein@lyra.org (Greg Stein) Date: Wed, 22 Dec 1999 12:16:33 -0800 (PST) Subject: [Types-sig] Issue: definition of "type" In-Reply-To: <38612EDB.6C1C119D@maxtal.com.au> Message-ID: On Thu, 23 Dec 1999, skaller wrote: > Greg Stein wrote: > > > a = [Foo(), Bar()] > > > > > > for el in a: > > > el.doSomething() > > > > > > Doesn't this rely on run-time information? How would a type system deal > > > with this? I suppose I'm entering the domain of interfaces now... > > > > The type of "a" is a List where the elements' type is the union of the > > type of each initialization value. > > No. Extra values can be appended of other types. If you want to have > lists of a particular type, this must be declared as a constraint. Extra values could be appended, but in the above code, the union algorithm is sufficient to determine the type of "a". Later in the code, the list may change type or we may raise errors if some appends something not part of its type -- I don't care and it doesn't matter here. We're trying to figure out the type of the list to know whether the loop will succeed or not. 
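One way to picture that: the element type of the literal is the union of the initializers' types, and the loop is accepted only if every member of that union supports the attribute being called. The sketch below does this with runtime reflection purely for illustration (a real checker would work from the source; the Foo/Bar bodies are invented):

class Foo:
    def doSomething(self): return 1

class Bar:
    def doSomething(self): return "x"

a = [Foo(), Bar()]

element_types = []
for item in a:                      # union of the initializers' classes
    if item.__class__ not in element_types:
        element_types.append(item.__class__)

# "for el in a: el.doSomething()" is fine only if every member of the
# union provides a doSomething attribute
for cls in element_types:
    if not hasattr(cls, "doSomething"):
        print("warning: %s has no doSomething" % cls.__name__)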
I also believe that we would ignore the fact that el.doSomething() could theoretically alter "a". [ again, a case where Paul's desire for type-safety breaks -- we can't know that "a" doesn't get changed during that call. ] Cheers, -g -- Greg Stein, http://www.lyra.org/ From guido@CNRI.Reston.VA.US Wed Dec 22 20:17:01 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Wed, 22 Dec 1999 15:17:01 -0500 Subject: [Types-sig] recursive types, type safety, and flow analysis In-Reply-To: Your message of "Tue, 21 Dec 1999 20:06:28 PST." References: Message-ID: <199912222017.PAA20990@eric.cnri.reston.va.us> This is the last thing I'm saying in this thread before the new year: > Our difference lies in two items: > > * I do not believe that you can do cross-function, compile-time checks to > determine if a name is undefined. > [ or if a name has different types over time, which type it may be ] I'm assuming that a global name in a module won't be undefined once it is defined if there are no deletions of it anywhere in the module. I believe this catches 99.9% of all module globals (including functions, classes, and imported modules). > * I am requiring the ability to associate typedecl objects with a function > object at runtime. This imposes the requirement on a typedecl name (such > as a class' name) being defined at the point that a function is defined. > [ I also want typedecl objects associated with a class object and a > module object so that we can reflect on their interface at runtime ] I only care about that as a secondary objective. The run time information made available follows whatever we decide we do at compile time. > We can agree to disagree on the first item (I'll let you write the code > to do that :-). I'd like your opinion on the second. I don't think the second requirement should affect the type checking rules. --Guido van Rossum (home page: http://www.python.org/~guido/) From paul@prescod.net Wed Dec 22 20:19:06 1999 From: paul@prescod.net (Paul Prescod) Date: Wed, 22 Dec 1999 14:19:06 -0600 Subject: [Types-sig] Non-conservative inferencing considered harmful References: Message-ID: <3861323A.6C23ACCA@prescod.net> Greg Stein wrote: > > ... > > I maintain that it could be declared type-safe. In fact, it is reasonably > straight-forward to generate the type information at each point, for each > value, and then to verify that the .doSomething is valid. Greg, you are getting into exactly what I want to avoid. Let's say the first doSomething returns an int and the second doSomething returns a string. Now you are trying to statically bind it to an integer-bearing parameter. What's the appropriate error message: > > a = [Foo(), Bar()] > > > > ... 10,000 lines of code ... > > > > for el in a: > > j = el.doSomething() > > > > ... 10,000 lines of code ... > > decl k: Int > > k = j "Warning: integer|string cannot be assigned to integer." Note also that the point where the error message occurs may be miles from the place where you did the funky thing that allowed the variable to take two types. This is EXACTLY the sort of error message that made me run screaming from the last language that tried to "help" me with data flow analysis and type inferencing. To put this is in the strongest possible terms, this sort of data flow analysis/inferencing "helps" me like MS Word's guessing what I mean when I "misspell" happy face (e.g. Python code) and Word fixes it for me. We are trying too hard and the result will be non-intuitive. 
There is no need to complicate our system in order to handle these corner cases. If the user wants a to be a List of (Foo|Bar) then they can darn well SAY SO. It is because of this experience that I am strongly of the belief that we should do NO flow analysis in the normative part of our type system. I am willing to support these basic kinds type inferencing: * if a variable is consistently assigned a particular type within its scope, we inference the type. * if a variable is inconsistently assigned we infer it as "Any" * if a non-error-checking optimizer wants to inference anything else and that inference doesn't change the language semantics then I am all for it. Paul Prescod From paul@prescod.net Wed Dec 22 20:50:15 1999 From: paul@prescod.net (Paul Prescod) Date: Wed, 22 Dec 1999 14:50:15 -0600 Subject: [Types-sig] recursive types, type safety, and flow analysis (was: Recursive types) References: <199912212338.SAA13830@eric.cnri.reston.va.us> Message-ID: <38613987.4D6C32F1@prescod.net> Guido van Rossum wrote: > > Hm... Since type checking is essentially a compile time activity, I > think it would be better if the run time order of events didn't > matter. I agree. > > I haven't thought about this particular scenario or the resulting impact > > on the inferencer. > ... > I don't see a big problem here for the type checker. The problem is that you and I are thinking about a Algol-style type checker. Greg is talking about a complex, sophisticated type inferencer that tries to understand what your Python code *means*. I am only willing to try to understand Python in the very simplest cases: top level definitions that have no dependence on procedural code. > I like the idea better (I think proposed by Tim Peters) that the names > used for type declarations live in a separate compile time namespace > where different rules apply. (Even though there are obvious > correspondences, e.g. the names of defined or imported classes should > probably be available both at compile time and at run time.) I agree. We will soon have to move forward on this basis. The static type checker is a very different tool and it does not, in general, try to understand "Python code." Two *convenient exceptions* are top-level class and function declarations. But these are just convenient exceptions. I feel strongly that this: if doSomething(): class a: def doSomething(self): pass else: class a: def doSomething(self): pass is NOT TYPE CHECKABLE because we do NOT do data-flow analysis. Paul Prescod From skaller@maxtal.com.au Wed Dec 22 21:13:59 1999 From: skaller@maxtal.com.au (skaller) Date: Thu, 23 Dec 1999 08:13:59 +1100 Subject: [Types-sig] type-assert operator optimizations References: Message-ID: <38613F17.AF01AD63@maxtal.com.au> Greg Stein wrote: > The type-assert operator is *defined* to raise an exception if the type is > wrong. Period. That is not my defintion. And yours has not made it into the Guido distribution yet, so I am free to make my own definition :-) > > I'm assuming > > > > 1) there is ONE python language > > 2) both the optimising and non-optimising > > byteocode compiler conform to the semantics > > > Your two assumptions are correct. I am glad you agree. Consider: assert 0 Run that through CPython. It raises an exception, right? Now run it again, this time with optimisation enabled. Nothing happens. No exception. The assert statement was elided by turning on optimisation. But both compilers are conforming by agreement, so it follows that an assert statement does NOT have to raise an exception. 
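(The experiment is easy to reproduce; the file name below is arbitrary.)

# --- assert_demo.py
assert 0, "this assertion always fails"

# $ python assert_demo.py      -> stops with an AssertionError
# $ python -O assert_demo.py   -> runs to completion; the assert is compiled away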
What is required then? Well, it isn't defined! The program is invalid: it isn't a valid python program. A third compiler could reject it as such. [Of course, you COULD specify that either a) an exception is raised OR b) nothing happens so that anything else -- like deleting your hard disk, or rejecting the program -- is not allowed. In other words, you could make the behaviour non-determinate, without making it undefined altogether.] -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From skaller@maxtal.com.au Wed Dec 22 21:21:29 1999 From: skaller@maxtal.com.au (skaller) Date: Thu, 23 Dec 1999 08:21:29 +1100 Subject: [Types-sig] Non-conservative inferencing considered harmful References: <3861323A.6C23ACCA@prescod.net> Message-ID: <386140D9.3AA7D25D@maxtal.com.au> Paul Prescod wrote: > > * if a variable is consistently assigned a particular type within its > scope, we inference the type. > * if a variable is inconsistently assigned we infer it as "Any" I'd like to suggest ONE extra case worth considering: * particular type OR None The reason I suggest this, is that it is common enough for either None or a single type to be used, for example, often function parameters have a default of None. It would be a pity to lump this case into the case Any. -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From gstein@lyra.org Wed Dec 22 21:39:26 1999 From: gstein@lyra.org (Greg Stein) Date: Wed, 22 Dec 1999 13:39:26 -0800 (PST) Subject: [Types-sig] recursive types, type safety, and flow analysis In-Reply-To: <38613987.4D6C32F1@prescod.net> Message-ID: On Wed, 22 Dec 1999, Paul Prescod wrote: >... > The problem is that you and I are thinking about a Algol-style type > checker. Greg is talking about a complex, sophisticated type inferencer > that tries to understand what your Python code *means*. I am only "Means"? Not at all. I'm only suggesting we watch what happens with the types. We have to do that *anyhow* to do the type checking. You're saying we declare types up front and verify them when we run into the name. To verify it, that implies we need to know the implicit typedecl when we find the name. I'm saying we use the implied typedecl. Why make it explicit, when we have all the information we already need? >... > > I like the idea better (I think proposed by Tim Peters) that the names > > used for type declarations live in a separate compile time namespace > > where different rules apply. (Even though there are obvious > > correspondences, e.g. the names of defined or imported classes should > > probably be available both at compile time and at run time.) > > I agree. We will soon have to move forward on this basis. As long as we have a solution for making the typedecl objects available at runtime. If we are introducing type information to Python, then we ought be able to introspect on that information. > The static type checker is a very different tool and it does not, in > general, try to understand "Python code." It has to understand it to some level to ensure that your types are used properly. There is no avoiding that. I think you're assuming there is too much of an increment between type checking and the level of inference that I'm proposing (actually, "deduction" as John would call it). > Two *convenient exceptions* > are top-level class and function declarations. 
But these are just > convenient exceptions. I feel strongly that this: > > if doSomething(): > class a: > def doSomething(self): > pass > else: > class a: > def doSomething(self): > pass > > is NOT TYPE CHECKABLE because we do NOT do data-flow analysis. I think we do. Not pure data flow in the classic sense, but really type flow. We are doing that anyhow as part of the checking. With declared names, the scope of your type analysis can be minimized (i.e. on a per-statement level rather than per-function). Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Wed Dec 22 21:46:52 1999 From: gstein@lyra.org (Greg Stein) Date: Wed, 22 Dec 1999 13:46:52 -0800 (PST) Subject: [Types-sig] type-assert operator optimizations In-Reply-To: <38613F17.AF01AD63@maxtal.com.au> Message-ID: On Thu, 23 Dec 1999, skaller wrote: >... > I am glad you agree. Consider: > > assert 0 > > Run that through CPython. It raises an exception, right? > Now run it again, this time with optimisation enabled. > Nothing happens. No exception. The assert statement > was elided by turning on optimisation. > But both compilers are conforming by agreement, > so it follows that an assert statement does NOT > have to raise an exception. The assert statement has different meanings based on the compiler you use. That's all. Nothing funny about that. In one case, it is defined to test and value and raise an exception. In the other, it is a no-op. > What is required then? Well, it isn't defined! Yes, it is. It is just defined differently for the two compilers. > The program is invalid: it isn't a valid python > program. It is entirely valid. > A third compiler could reject it as such. Sure: it could say "this thing will always fail, so I won't compile this code." But you're still missing the point. "assert" has a very specific definition, and that is to raise an exception if its value is zero. It is also defined to be a no-op in an optimizing compiler. I do not see where you're going with all this. Exceptions are part of Python. They exist, and they must be raised when needed. Your attempt to redefine their role in Python is getting awfully tiresome. Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Wed Dec 22 21:54:29 1999 From: gstein@lyra.org (Greg Stein) Date: Wed, 22 Dec 1999 13:54:29 -0800 (PST) Subject: [Types-sig] Non-conservative inferencing considered harmful In-Reply-To: <386140D9.3AA7D25D@maxtal.com.au> Message-ID: On Thu, 23 Dec 1999, skaller wrote: > Paul Prescod wrote: > > > > * if a variable is consistently assigned a particular type within its > > scope, we inference the type. > > * if a variable is inconsistently assigned we infer it as "Any" > > I'd like to suggest ONE extra case worth considering: > > * particular type OR None > > The reason I suggest this, is that it is common enough > for either None or a single type to be used, for example, > often function parameters have a default of None. Agreed, but I believe Paul's response would simply be "then declare it." But then: I have different thoughts on the base issue :-) Cheers, -g -- Greg Stein, http://www.lyra.org/ From paul@prescod.net Wed Dec 22 21:56:44 1999 From: paul@prescod.net (Paul Prescod) Date: Wed, 22 Dec 1999 15:56:44 -0600 Subject: [Types-sig] Back to basics Message-ID: <3861491C.48DDE5F4@prescod.net> I think that our version 1 is going to spin out of control if we spend too much energy trying to reverse engineer what Python code means in terms of a type system. 
Yes, I think that we can easily extract type declarations from 90% of all Python code. We will do that for version 2. Yes, I think that there are more sophisticated ways of doing data flow analysis, we may do that for version 3. For version *1* we should going to require all declarations to be 100% explicit. We will allow out-of-line declarations in "shadow files" to allow the annotation of "old" Python and C modules. Inline declarations will be treated as Tim Peters suggests. They are identical in semantics to out-of-line declarations. Inline declarations should always be preceded by the "decl" keyword so that they can be easily stripped. We can even write a small script that extracts them from Python code and generates an out-of-line file so that the semantics are clearly not context-dependent. In version 1 there is no automatic anything. There will be two syntaxes for declaring types: interface declarations and compound typedecls. Both are parameterizable. This should help us to answer (in the short term) many of the tricky questions that have been raised. "del foo" is merely illegal if it violates the declared interface of a module and is not otherwise. a=5 a='abc' is illegal if it violates the declared interface of a module and is not otherwise. In version 2 and subsequent versions we can get to automatic type detection and maybe dataflow and type inferencing. But for version 1 we've got to KISS if we're going to make progress. Paul Prescod From gstein@lyra.org Wed Dec 22 22:08:40 1999 From: gstein@lyra.org (Greg Stein) Date: Wed, 22 Dec 1999 14:08:40 -0800 (PST) Subject: [Types-sig] Non-conservative inferencing considered harmful In-Reply-To: <3861323A.6C23ACCA@prescod.net> Message-ID: On Wed, 22 Dec 1999, Paul Prescod wrote: > Greg Stein wrote: > > ... > > I maintain that it could be declared type-safe. In fact, it is reasonably > > straight-forward to generate the type information at each point, for each > > value, and then to verify that the .doSomething is valid. > > Greg, you are getting into exactly what I want to avoid. Let's say the > first doSomething returns an int and the second doSomething returns a > string. Now you are trying to statically bind it to an integer-bearing > parameter. What's the appropriate error message: > > > > a = [Foo(), Bar()] > > > > > > ... 10,000 lines of code ... > > > > > > for el in a: > > > j = el.doSomething() > > > > > > ... 10,000 lines of code ... > > > decl k: Int > > > k = j > > "Warning: integer|string cannot be assigned to integer." > > Note also that the point where the error message occurs may be miles > from the place where you did the funky thing that allowed the variable > to take two types. That is *exactly* what would happen in my proposal. Yes, the error is located at the assignment, and that is a good distance from the assignments to "a" and "j". If you want to tighten up the code, then declare "a" or "j". It will fail at whatever point you feel the code is "wrong". BUT!! -- you never said what the error in the program was, and what the type-checker was supposed to find: 1) was "a" filled in with inappropriate types of values? 2) was "j" assigned a type it wasn't supposed to hold? 3) was "k" declared wrong? In the absence of knowing which of the three cases is wrong, I *strongly* maintain that the error at the "k = j" assignment is absolutely correct. How is the compiler to know that "a" or "j" is wrong? You didn't tell it that their types were restricted (and violated). In other words, the error message is NOT "miles away". 
It is exactly where it should be. When that error hit, the programmer could have gone "oops! I declared k wrong. I see that j is supposed to be an Int or a String. okay... lemme fix that..." Find another example -- this one doesn't support your position that inferencing is harmful. > This is EXACTLY the sort of error message that made me run screaming > from the last language that tried to "help" me with data flow analysis > and type inferencing. To put this is in the strongest possible terms, > this sort of data flow analysis/inferencing "helps" me like MS Word's > guessing what I mean when I "misspell" happy face (e.g. Python code) and > Word fixes it for me. We are trying too hard and the result will be > non-intuitive. There is no need to complicate our system in order to > handle these corner cases. If the user wants a to be a List of (Foo|Bar) > then they can darn well SAY SO. Sure. And in the absence of saying so, the inferencer above did exactly what it was supposed to do. You have one of three problems in your code, yet no way for any compiler to know which one you meant. If you want to sprinkle declarations throughout your code, then be my guest. But you don't have to. Python has very clean and rigorous execution semantics from which we can easily and deterministically determine type information. I hate the thought that people are going to start feeling that they should put declarations into their code. We will lose one of the best things of Python -- the ability to toss out and ignore all that declaration crap from Algol-like languages. If you give a means to people to declare their variables, then they'll start using it. "But it's optional!" you'll say. Well, that won't be heard. People will just blindly follow their C and Java knowledge and we will lose the cleanliness of syntax that Python has enjoyed for so long. > It is because of this experience that I am strongly of the belief that > we should do NO flow analysis in the normative part of our type system. I disagree, and your example does not support that position. Cheers, -g -- Greg Stein, http://www.lyra.org/ From paul@prescod.net Wed Dec 22 22:09:24 1999 From: paul@prescod.net (Paul Prescod) Date: Wed, 22 Dec 1999 16:09:24 -0600 Subject: [Types-sig] recursive types, type safety, and flow analysis References: Message-ID: <38614C14.48075352@prescod.net> Greg Stein wrote: > >... > > "Means"? Not at all. I'm only suggesting we watch what happens with the > types. We have to do that *anyhow* to do the type checking. I say: "given the name of the thing and the type associated with the name, we check that the operations are legal." Seems straightforward. We've been doing it since the 60's. You say: "given the name of the thing we will intuit a type (often an anonymous/union type (which I consider confusing)) and then check that the operations are legal." "Classic" type inference works from the operators to figure out what the types are. "Classic" type checking works from the names to validate the operators. You want both. I don't think that it will work in practice. The semantics will be confusing as hell and the error messages will be totally indecipherable. I am willing to iteratively open up the design to types of type inferencing and discovery as time goes by but I don't yet understand your model well enough to be able to write a specification around it.
Paul Prescod From gstein@lyra.org Wed Dec 22 22:24:59 1999 From: gstein@lyra.org (Greg Stein) Date: Wed, 22 Dec 1999 14:24:59 -0800 (PST) Subject: [Types-sig] recursive types, type safety, and flow analysis In-Reply-To: <38614C14.48075352@prescod.net> Message-ID: On Wed, 22 Dec 1999, Paul Prescod wrote: > Greg Stein wrote: > > "Means"? Not at all. I'm only suggesting we watch what happens with the > > types. We have to do that *anyhow* to do the type checking. > > I say: "given the name of the thing and the type associated with the > name, we check that the operations are legal." Seems straightforward. > We've been doing it since the 60's. > > You say: "given the name of the thing we will intuit a type (often an > anonymous/union type (which I consider confusing)) and then check that > the operations are legal." That's not what I've been saying. I take blame for not making myself clearer in this case. > "Classic" type inference works from the operators to figure out what the > types are. "Classic" type checking works from the names to validate the > operators. You want both. I don't think that it will work in practice. > The will be the semantics will be confusing as hell and the error > messages will be totally indecipherable. > > I am willing to iteratively open up the design to types of type > inferencing and discovery as time goes by but I don't yet understand > your model well enough to be able to write a specification around it. Given: x = a + b I don't want to infer anything from the "+" there. So no... this wouldn't be called "classic type inferencing." John Skaller best described it as type deduction. I want to know the type of x, given the types of a and b. 1) In the above statement, we already know the types of "a" and "b" (if not, then they haven't been assigned to and we can raise an error!). 2) We know what the "+" operator does, given those two types. 3) We produce a result type. 4) We associate that result type with "x". In your model, (4) is replaced by "validate the result type against the declared type of 'x'". Otherwise, there is no difference between our proposals. Your statements about not working in practice, confusing semantics, and indecipherable error messages are simply FUD. You said yourself you were unsure of the model that I was proposed. And FUD like this generally peeves me. I'm talking about figuring out types from the assignments, not from declarations. That's it. Forget declarations and the clutter that they bring to programs. -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Wed Dec 22 22:30:32 1999 From: gstein@lyra.org (Greg Stein) Date: Wed, 22 Dec 1999 14:30:32 -0800 (PST) Subject: [Types-sig] Back to basics In-Reply-To: <3861491C.48DDE5F4@prescod.net> Message-ID: This is all fine as long as the design does not preclude the availability of typedecl information at runtime. Some of these discussions about new namespaces or not worrying about names being defined could prevent that. I've proposed plenty of syntax for the typedecls and interface declarations. I don't think there has been a solid proposal yet for parameterizations. I would recommend that the syntax design at least starts with the proposal that I set up to save some work and provide a basis for discussion about how to add parameterization. [ personally: I'd recommend parameterization get punted to V2, although I worry that if we don't take its syntax into account, we might preclude its addition later on. 
] -g On Wed, 22 Dec 1999, Paul Prescod wrote: > I think that our version 1 is going to spin out of control if we spend > too much energy trying to reverse engineer what Python code means in > terms of a type system. Yes, I think that we can easily extract type > declarations from 90% of all Python code. We will do that for version 2. > Yes, I think that there are more sophisticated ways of doing data flow > analysis, we may do that for version 3. > > For version *1* we should going to require all declarations to be 100% > explicit. We will allow out-of-line declarations in "shadow files" to > allow the annotation of "old" Python and C modules. Inline declarations > will be treated as Tim Peters suggests. They are identical in semantics > to out-of-line declarations. Inline declarations should always be > preceded by the "decl" keyword so that they can be easily stripped. > > We can even write a small script that extracts them from Python code and > generates an out-of-line file so that the semantics are clearly not > context-dependent. In version 1 there is no automatic anything. > > There will be two syntaxes for declaring types: interface declarations > and compound typedecls. Both are parameterizable. > > This should help us to answer (in the short term) many of the tricky > questions that have been raised. "del foo" is merely illegal if it > violates the declared interface of a module and is not otherwise. > > a=5 > a='abc' > > is illegal if it violates the declared interface of a module and is not > otherwise. > > In version 2 and subsequent versions we can get to automatic type > detection and maybe dataflow and type inferencing. But for version 1 > we've got to KISS if we're going to make progress. > > Paul Prescod -- Greg Stein, http://www.lyra.org/ From sjmachin@lexicon.net Wed Dec 22 23:55:22 1999 From: sjmachin@lexicon.net (John Machin) Date: Thu, 23 Dec 1999 09:55:22 +1000 Subject: [Types-sig] type declaration syntax In-Reply-To: <002701bf4aa5$d2334d60$922d153f@tim> References: <385C1345.C21FF180@maxtal.com.au> Message-ID: <19991222224650088.AAA118.228@max41101.izone.net.au> [John Skaller] > > I guess that NO python programmer wants to declare the type > > of every single name, which is what APC style static type > > checking requires. > [Tim] > I would be delighted to if it sped some of my "marginal" programs by a > factor of 2. My employer would be delighted to if it saved them from > runtime TypeErrors next week instead of next year. Any tradeoff you can > think of has a larger audience than you can imagine includes an audience for Viperish global inference too!>. > Indeed. Further, from what we of the APC persuasion also desire salvation is the speed-driven need to cache references to methods and functions in local variables e.g. def myfunc(real_arg_1, real_arg_2, srepl = string.replace): # known favourite trick of Tim's # as Tim has said recently, it's not robust in the face of caller # going myfunc("foo", "bar", "you lose") or one of mine: wlist = [] wapp = wlist.append for x in some_long_sequence: wapp(some_func(x)) Now I do appreciate the possibilities and practicalities of replacing functions and methods on the fly, the reflective capabilities, the whole dynamic thing, but please please please can't we have a way of declaring that we're peeking, not poking, 99% of the time? 
Some possibilities: (a) declare that certain objects are unpokeable --- pragma nopokeever at the top of the sys module would trap sys.maxint = "gotcha" at compile-time (b) others may be pokeable initially but then frozen some_module.some_func = my_better_func nopokeanymore(some_module) or my_soon_to_be_read_only_data_structure = load_from_file(.......) nopokeanymore(my_soon...) so that run-time checks could made on attempts to poke at the object? (c) declare that code in the current module doesn't poke in replacements for methods in other modules (or in itself) and neither do the modules it calls and if they do then I'll take the rap .... so that an optimising compiler could move the resolution of wlist.append outside the loop as in my example, or even as in Tim's trick do the resolution when the module is loaded ... (d) I really miss #define and const and enum; how about const INITIAL_STATE = 0 const NEXT_STATE = 1 const ANOTHER = 2 so that faster and more robust code could be generated? Don't-stop-at-bytecode-hacks-the-ultimate-dynamic-language-would-allow- "poke(arbitrary_memory_address)=expression"-ly yours John From paul@prescod.net Wed Dec 22 23:00:02 1999 From: paul@prescod.net (Paul Prescod) Date: Wed, 22 Dec 1999 17:00:02 -0600 Subject: [Types-sig] Non-conservative inferencing considered harmful References: Message-ID: <386157F2.183DA4AC@prescod.net> Greg Stein wrote: > > BUT!! -- you never said what the error in the program was, and what the > type-checker was supposed to find: > > 1) was "a" filled in with inappropriate types of values? > 2) was "j" assigned a type it wasn't supposed to hold? > 3) was "k" declared wrong? > > In the absence of knowing which of the three cases is wrong, I *strongly* > maintain that the error at the "k = j" assignment is absolutely correct. > How is the compiler to know that "a" or "j" is wrong? You didn't tell it > that their types were restricted (and violated). That's right. This is all valid Python code and should *not* give an error message. That's why I'm been trying to make a distinction between a type safety declaration and a type declaration. If you didn't ask for this code to be type-safe then it won't cause a problem. Here's where the error might arise: ... 10,000 lines of code ... for el in a: j = el.doSomething() ... 10,000 lines of code ... type-safe def foo( k: Int ): k = j > In other words, the error message is NOT "miles away". It is exactly where > it should be. When that error hit, the programmer could have went "oops! I > declared k wrong. I see that j is supposed to be an Int or a String. > okay... lemme fix that..." No, here's what really happens (based on my experience with ML): "Oooops. Something weird has happened why would it expect an int or string? Where does this variable get its value? Humm. What are the possible types of el? Humm, what are the possible contents of a? Hummm. Why does this language make it so hard for me to find my errors?" > Find another example -- this one doesn't support your position that > inferencing is harmful. It absolutely does. If I can't jump immediately from the error message to the line that causes the problem then something is wrong with the type system. I shouldn't have to "debug" compiler errors by inserting declarations here and there. > Sure. And in the absence of saying so, the inferencer above did exactly > what it was supposed to do. You have one of three problems in your code, > yet no way for any compiler to know which one you meant. 
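(A present-day runtime analogy of that opt-in idea -- not the proposed syntax, and not static checking at all; the type_safe helper below is invented -- but it shows the intent: code that never asked for checking is left alone, and the complaint surfaces only at the boundary that asked for it.)

def type_safe(func):
    def wrapper(k):
        # enforce foo's Int expectation at its own boundary only
        if not isinstance(k, int):
            raise TypeError("foo() wanted an Int, got %r" % (k,))
        return func(k)
    return wrapper

@type_safe
def foo(k):
    return k * 2

print(foo(3))          # fine
try:
    foo("abc")         # the complaint appears here, not in the untyped code
except TypeError as err:
    print(err)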
It shouldn't have let me get so far without saying *exactly what I mean*. It shouldn't have tried to read my mind. I'm the human asking for the computer's help in keeping things straight. If it tries to read my mind it's going to be making the same mistakes I make! Rather it should tell me when I've first done something fishy. > I hate the thought that people are going to start feeling that they should > put declarations into their code. That's *inevitable*. People who want statically type-checked code are inevitably going to start feeling that they must put declarations in their code. That's the case with ML. That's the case with Haskell. That will be the case with Python. We aren't smarter here then all of the programming language researchers in the world. Just as programmers mistrust the ML and Haskell inferencers, they will distrust the Python inferencer. Just as programmers (me, Tim, the editor of the Journal of Functional Programmers) always put in declarations in ML and Haskell code, we will do so for Python. One or two cryptic error messages that take you fifteen minute to "debug" are enough to turn you off inferencing really quick. Therefore our real decision is: do we want to force programmers who want static type checking to sprinkle their code with declarations *explicitly* or do we want to wait until they get frustrated with the inferencer? It seems to me that "simple and explicit" is more Pythonic than "we'll guess what you mean and you can being explicit if we guess wrong." If you don't want to put in declarations, don't use static type checking! > We will lose one of the best things of > Python -- the ability to toss out and ignore all that declaration crap > from Algol-like languages. If you give a means to people to declare their > variables, then they'll start using it. "But it's optional!" you'll say. > Well, that won't be heard. People will just blindly follow their C and > Java knowledge and we will lose the cleanliness of syntax that Python has > enjoyed for so long. I don't see the absence of type declarations as having much to do with Python's cleanliness. As Tim said, declarations improve readability by serving as documentation. Paul Prescod From paul@prescod.net Wed Dec 22 23:10:19 1999 From: paul@prescod.net (Paul Prescod) Date: Wed, 22 Dec 1999 17:10:19 -0600 Subject: [Types-sig] recursive types, type safety, and flow analysis References: Message-ID: <38615A5B.D2E2B608@prescod.net> Greg Stein wrote: > > I'm talking about figuring out types from the assignments, not from > declarations. That's it. Forget declarations and the clutter that they > bring to programs. Okay, let's try again. a.py: a=5 if something(): a = 5 elif somethingElse(): a = "abc" elif somethingElse2(): a = 32L elif somethingElse3(): a = ["ab"] else: del a b.py: import a a.a = "jab" c.py: import a a.a = ("abc",5) d.py: import a, b, c j: Int j = a.a What's the error message? What are the list of valid types that a.a could have at this point and how would the type system have inferred them? Paul Prescod From skaller@maxtal.com.au Wed Dec 22 23:18:01 1999 From: skaller@maxtal.com.au (skaller) Date: Thu, 23 Dec 1999 10:18:01 +1100 Subject: [Types-sig] type-assert operator optimizations References: Message-ID: <38615C29.2F26AB8F@maxtal.com.au> Greg Stein wrote: > The assert statement has different meanings based on the compiler you use. > That's all. Nothing funny about that. > > In one case, it is defined to test and value and raise an > exception. In the other, it is a no-op. 
The semantics of a programming language are a property of the LANGUAGE and not the compiler. So what you are saying is that there are TWO python languages, which you disagreed with before. > But you're still missing the point. "assert" has a very specific > definition, and that is to raise an exception if its value is zero. It is > also defined to be a no-op in an optimizing compiler. Make up your mind :-) > I do not see where you're going with all this. I know. I've got five or so years intensive experience with standardisation committees -- and quite a few more as a 'lingerer'. :-) It takes a long while to understand some of the subtle distinctions and issues, and I can't claim to understand them myself. This is NOT a put down, I know you are not stupid, etc etc .. I just happen to know something about these issues because I spent lots of time dealing with them. Be glad I have done so, and spared you most of the agony :-) A programming language isn't defined by an implementation. Semantics comes from documented specifications -- and sometimes the specifications don't agree with practice, are suboptimal, etc .. which makes discussing changing the specifications all that much harder. At present, the python language reference confuses what a single implementation does with semantics. [but, it's fairly good, because the distinction IS made in some places] This is because it was written as a description of a single implementation. However, there are now many implementations -- several versions of CPython including 1.5.0, 1.5.1, 1.5.2, 1.6 (under development) and 2.0, as well as versions of JPython, Viper, and other tools like type checkers. And then, it runs on multiple platforms. Whew! That's a lot of Pythons. What do they ALL have in common? What will run on ALL these implementations on ALL platforms? It is no longer sensible to merely describe what a particular version of CPython does on one platform: the Python Language itself needs an implementation independent specification. This is so (ultimately) programmers can write programs and DEPEND on a particular behaviour -- even when compiled with an optimising compiler (and I'm not talking about merely optimising byte code) One of my complaints about Python was that it requires exceptions in far too many places, where in fact one would consider that the program is in error: the fact that CPython 1.5.2 raises exceptions in these cases ought to be considered a quality of implementation issue, and NOT mandatory semantics. The reason is that all these requirements for exceptions prevent effective compilation, by preventing many optimisations to be done, and by preventing early error diagnosis. There is an exception, effectively, in the case of assert statements -- in this case the intention is quite clearly reflected in the optimising compiler: the language semantics required for assert are 'none'. This specification only applies to correct programs, so that each implementation is free to choose a behviour for incorrect ones -- throw an exception, do nothing, or reject the program -- or even core dump. The point is the semantics of assert are there to allow the program to declare intentions, and discover if in fact their program is wrong. The same applies in many other areas, where one can NOT produce an efficient code generator unless some of the requirements are relaxed. For example, if a program recurses past some reasonable limit, CPython reports the error. 
But this cannot be a sensible language requirement, because stack overflow is VERY expensive to check for in compiled code. So it is a Quality Of Implementation (QOI) issue what happens on stack overflow, NOT part of the language specification. [At least, this is a sensible position if one wants efficient code to be generated] Did YOU check for stack overflow in the last extension module you wrote? What I expect you (Greg) will find when you try to write the type checking stuff, is that just a few, judicious, language changes, -- mainly restrictions on what is considered a correct Python program -- will vastly improve what you can deduce. You will need to experience this yourself, to really begin to understand how sensitive optimisation is to small changes in semantic requirements. [And when you do, please report your experiences!] There is a classic example of this: FORTRAN vs C. It is well known that Fortran produces considerably more efficient code than C. It is also well known why, and that in retrospect, C was just a bit too dynamic. The problem relates to aliasing, especially of function arguments -- Fortran does NOT permit this. C does, and it pays dearly. The C committee tried to fix this with a new keyword 'noalias', and they stuffed up the design, and had to withdraw it. A new proposal for keyword 'restricted' has been accepted for C9X: the semantics are subtly different. It remains to be seen whether this allows C to catch up with Fortran. Now, I myself have a desire to compile Python to efficient code. I know that the lack of explicit type declarations, by itself, does not prevent this -- although allowing explicit declarations can certainly improve performance, provided that a failure to comply means that the program is in error -- and NOT that exceptions be throw: that would defeat the (OPT) purpose. (Not entirely though) It is in fact quite conventional in most programming languages to provide assertions with NO semantics, to allow debugging dynamic situations (for example, Eiffel). I can't see any reason Python shouldn't follow this conformance model. I also cannot see a reason why CPython should not raise exceptions in these cases. But it would be doing so because that is what a quality INTERPRETER would do, and NOT because it was a requirement of the language. That is why I tried to distinguish two uses of exceptions: for reporting program errors, and for reporting environmental problems (like end of file, or cannot open file) -- in the latter case, we would not dispute that the program is correct and should handle the exception (whether or not it is interpreted or compiled). In the other cases there is a tradeoff-- the more code that is banned, the faster the code we can generate, and the more likely we are to get a core dump or unexpected behaviour. in trying to decide on where the tradeoff should lie, your position at one extreme is not helping. The optimisations which can be performed by NOT requiring assert statements to raise exceptions are minimal but existant. The reason that they're minimal is that, apart from eliding the actual test, it is hard to 'understand' what an assertion is asserting (for a compiler). 
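[To illustrate the point -- hypothetical code, not from either proposal: an arbitrary assert is opaque to a checker, while a bare type test is something an inferencer could actually use:

    def f(xs, func):
        # opaque: a compiler would have to understand len() and callable()
        assert len(xs) > 0 and callable(func)
        # transparent: after this line a checker may treat xs as a list
        assert type(xs) is type([])
        return [func(x) for x in xs]
]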
On the other hand, your typecheck operator seems to provide MORE information, precisely because it is a far more specific test: assert statements can check almost anything, whereas the typecheck operator very specifically asserts something has a particular type, allowing inferencing to do a lot more work, and a lot of typechecking and method lookup to be reduced or eliminated. You designed it for this reason! But we will lose quite a bit of performance if we have to keep checking at run time, and raising exceptions. The reason is that if the exception is caught -- the type that it asserts an object have cannot be assumed anymore -- the object can have any type: try: y = x ! Int # assert y,x are ints .. except TypeError: pass y .. x We cannot say anything about the type of y and x here. To be sure, we can do some control flow analysis in this case, and notice that we lose information where the TypeError is caught. So you COULD argue that requiring an exception be thrown doesn't cost that much (only the test) because information is lost only outside the 'try' body. I might even agree. But the point is that we would then be discussing the cost of various possible restrictions -- not insisting that every exception must be generated exactly as in the CPython 1.5.2 interpreter. We do not HAVE to require the CPython 1.5.2 behaviour in a compiler, we should just weigh up the tradeoffs. Just to give you an example: try: y = x ! int except: ValueError ... You might think this didn't catch a failed exception, and that y and x had to be ints .. until you realised that because Python is so dynamic, someone may have said: ValueError = TypeError somewhere. So you might have to throw away all type information whenever you saw ANY except clause, just to be sure you conformed to the requirements. And you might just want to ban changing the meaning of the standard exceptions! I have a final word. I have a vested interest in exceptions being thrown in many cases: my interscript program relies on catching exceptions when client script contains errors: it reports the error, and then continues on. The program is robust: a useful property for a literate programming tool, where bugs result in scrambled output, rather than a core dump. But I ALSO want interscript to run at least 1000 times faster. A compromise is needed which supports BOTH usages. So I'm NOT in the 'exceptions are evil camp' after all, I in the 'what compromises preserve as much existing behaviour as possible while still allowing effective optimisation and type checking?' I.e. I think we're on the same side, or ought to be: looking for a suitable compromise. Probably, we can at best analyse -- and try out -- various possibilities, with Guido's Guidance to help, and knowing he's making the final decision (at least for CPython ) anyhow. 
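[A tiny runnable demonstration of the ValueError = TypeError hazard described above:

    ValueError = TypeError       # perfectly legal Python
    try:
        "abc" + 3                # raises TypeError
    except ValueError:           # the name now refers to TypeError...
        print("caught it")       # ...so this handler catches it anyway
]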
-- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From skaller@maxtal.com.au Wed Dec 22 23:58:39 1999 From: skaller@maxtal.com.au (skaller) Date: Thu, 23 Dec 1999 10:58:39 +1100 Subject: [Types-sig] type declaration syntax References: <385C1345.C21FF180@maxtal.com.au> <19991222224650088.AAA118.228@max41101.izone.net.au> Message-ID: <386165AF.F6E6BF81@maxtal.com.au> John Machin wrote: > Now I do appreciate the possibilities and practicalities of replacing > functions and methods on the fly, the reflective capabilities, the > whole dynamic thing, but please please please can't we have a way of > declaring that we're peeking, not poking, 99% of the time? Why not the OTHER way around, to match the statistics? 1) Ban rebinding module variables after importing is done 2) Specify that 'def' creates an immutable binding. You can't rebind (or del) the name. Same for 'class'. [This includes defs inside classes] 3) It isn't necessary to introduce 'const', at least for modules and classes: Here's why: Given (1), the only ways to change a module level variable are: a) a module level assignment -- easy to detect b) an assignement in a function defined in the same module, which has a global directive for that variable c) exec statement [if you see one, abandon hope of optimisation] d) __dict__ hackery from outside the module The only hard case is (d). This cannot be statically checked. It can be banned all the same (and on the programmers head be it), I.e. TWO bans fix most problems. The ban on module level rebindings is a significant restriction. The ban on changing 'def' and 'class' bindings can be worked around [use another variable] and so no functionality is lost, provided that the author takes this into account. EG: class X: def f(..): pass g = f # g can be rebound, f cannot be I think this covers: caching all module and class level variables caching instance methods what is not covered is caching instance attributes. -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From skaller@maxtal.com.au Thu Dec 23 00:14:44 1999 From: skaller@maxtal.com.au (skaller) Date: Thu, 23 Dec 1999 11:14:44 +1100 Subject: [Types-sig] recursive types, type safety, and flow analysis References: <38615A5B.D2E2B608@prescod.net> Message-ID: <38616974.AFC365D6@maxtal.com.au> Paul Prescod wrote: > a.py: > > a=5 > > if something(): > a = 5 > elif somethingElse(): > a = "abc" > elif somethingElse2(): > a = 32L > elif somethingElse3(): > a = ["ab"] > else: > del a The type of a is one of: int, string, long, tuple, or terminal where 'terminal' means 'a doesn't exist'. > b.py: > > import a > a.a = "jab" module attribute assignment is banned > c.py: > > import a > a.a = ("abc",5) module attribute assignment is banned > d.py: > > import a, b, c > > j: Int > j = a.a > > What's the error message? There isn't one at compile time. a could be an int, the inference engine cannot know if it is or not, so it keeps quiet. At run time, a TypeError is raised if something other than int is assigned. [Yes, I know I changed the rules on you by saying the module attribute assignments were banned] Note: Viper will do better than this compiling whole programs. It loads modules dynamically at compile time, by running the interpreter, so all modules are fully built at compile time. 
Then, it is easy to know not only the types of every (module) variable, but also their values! Obviously, this will not work on a 'per module' basis. -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From gstein@lyra.org Thu Dec 23 00:50:52 1999 From: gstein@lyra.org (Greg Stein) Date: Wed, 22 Dec 1999 16:50:52 -0800 (PST) Subject: [Types-sig] examples (was: recursive types, type safety, and flow analysis) In-Reply-To: <38615A5B.D2E2B608@prescod.net> Message-ID: On Wed, 22 Dec 1999, Paul Prescod wrote: > Greg Stein wrote: > > I'm talking about figuring out types from the assignments, not from > > declarations. That's it. Forget declarations and the clutter that they > > bring to programs. > > Okay, let's try again. > > a.py: > > a=5 > > if something(): > a = 5 > elif somethingElse(): > a = "abc" > elif somethingElse2(): > a = 32L > elif somethingElse3(): > a = ["ab"] > else: > del a As John stated: at the end of this code block, "a" is one of Int, String, Long, List, or Undefined. There are no errors. "a" does not form part of a.py's interface as it is not explicitly mentioned in a "decl member" statement. [ below, I explore the ramifications of changing this feature/design ] > b.py: > > import a > a.a = "jab" I would say this is perfectly acceptable. Remember: I'm proposing to defer all assignment enforcement. No errors. > c.py: > > import a > a.a = ("abc",5) Same as b.py. > d.py: > > import a, b, c > > j: Int > j = a.a > > What's the error message? What are the list of valid types that a.a > could have at this point and how would the type system have inferred > them? None of the modules above export an interface. Especially with regard to module attributes (i.e. module globals). I do believe that a module exports function signatures as part of its interface; but unless you include a "decl member" in there, the module does not export defined attributes. In my proposal, I would state that ".a" is not part of a's interface, so the reference simply fails. This is analogous to: some_instance.undefined_attribute In other words, Module "a" and "some_instance" both export an interface. It is an error to reference something that is NOT part of that interface. For discussion's sake, because I think you are seeking more detail around this particular fragment... Let us posit that referencing "a.a" is allowed (for discussion: let's say we make allowances for module interfaces). Given that: the expression "a.a" is not type-checked at all. We know that "a" is a Module, but nothing more. a.a might raise an AttributeError, but we can't know, and we don't flag that. If it does have a value, we will treat it as type "Any". The assignment to "j" succeeds because I don't want local declarations or assignment enforcement. ========================================== Now. Let's weaken some of my assumptions/requirements and/or look further ahead. 1) module globals implicitly form part of an exported interface. This would imply that "a.a" has an exported type (the union described above). Things like "b.a" would also be exported (with a type of Module), but that doesn't enter into this discussion. Given this, the type-checker will know that the RHS of "j = a.a" has that union type. No error is raised because enforcement does not exist. 2) Enable assignment enforcement (and local declarations) d.py issues an error at the assignment. 
3) Enable module attribute assignment enforcement (by virtue of an exported interface, clients must respect the typedecls specified) b.py does not raise an error. String is a valid type for a.a. c.py raises an error. A tuple is an invalid type for a.a. 4) Module attribute assignment is outright forbidden. b.py and c.py raise errors. These assignments are forbidden. 5) Separate track: references to an attribute that is not part of a Module interface is noted as an error. b.py, c.py, and d.py would raise an error because "a" is not part of the Module interface (it was not declared). [ per my note above: I do think that these references to "a.a" would cause an error because of the interface violation -- a.a is not exported by the Module. ] Items 2, 3, 4 are conditioned upon "a" being part of Module a's interface (implicitly per #1, or explicit via a declaration). In summary, the deferred parts of my proposal are: 1) a module global does not form part of its interface unless it is explicitly declared with a "decl member" statement. 2) assignment enforcement does not exist 3) module attribute assignments are legal and un-type-checked Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Thu Dec 23 01:16:00 1999 From: gstein@lyra.org (Greg Stein) Date: Wed, 22 Dec 1999 17:16:00 -0800 (PST) Subject: [Types-sig] Non-conservative inferencing considered harmful In-Reply-To: <386157F2.183DA4AC@prescod.net> Message-ID: On Wed, 22 Dec 1999, Paul Prescod wrote: > Greg Stein wrote: > > > > BUT!! -- you never said what the error in the program was, and what the > > type-checker was supposed to find: > > > > 1) was "a" filled in with inappropriate types of values? > > 2) was "j" assigned a type it wasn't supposed to hold? > > 3) was "k" declared wrong? > > > > In the absence of knowing which of the three cases is wrong, I *strongly* > > maintain that the error at the "k = j" assignment is absolutely correct. > > How is the compiler to know that "a" or "j" is wrong? You didn't tell it > > that their types were restricted (and violated). > > That's right. This is all valid Python code and should *not* give an > error message. Woah... if you presume enforcement of assignments, then a type-check error exists somewhere. We're *trying* to raise errors. Did you typo here? > That's why I'm been trying to make a distinction between > a type safety declaration and a type declaration. If you didn't ask for > this code to be type-safe then it won't cause a problem. All right. This is a different problem than I was responding to. You asked me whether an error would be raised by a specified piece of code. I said yes. Now you're saying the code was actually correct and an error shouldn't be raised? Or that there was some type-safety vs type-decl difference being discussed? You're confusing me. Regardless: in your world of assignment-enforcement, the code you specified *would* raise an error. Either at the assignment to "a", "j", or "k", depending on which ones you had declared ahead of time. > Here's where > the error might arise: > > ... 10,000 lines of code ... > > for el in a: > j = el.doSomething() > > ... 10,000 lines of code ... > > type-safe > def foo( k: Int ): > k = j Sure. You'd get an error at the "k = j" line. What's the point? We agree on this one. [presuming assignment enforcement] > > In other words, the error message is NOT "miles away". It is exactly where > > it should be. When that error hit, the programmer could have went "oops! I > > declared k wrong. 
I see that j is supposed to be an Int or a String. > > okay... lemme fix that..." > > No, here's what really happens (based on my experience with ML): > > "Oooops. Something weird has happened why would it expect an int or > string? Where does this variable get its value? Humm. What are the > possible types of el? Humm, what are the possible contents of a? Hummm. > Why does this language make it so hard for me to find my errors?" All right. This is because you are specifying the error was #1 or #2. If the error was #3, then it would be immediately obvious what happened (you declared "k" wrong). Given error #1 or #2, this implies that at the "k = j" statement, you had the misconception that "j" was of type Int. The compiler tells you that your impression is wrong. No harm in that, and exactly what you're seeking from the type checker. As a thrifty Python programmer, knowing how to declare variables, you simply insert "j: Int" somewhere. Now you get the error up there at the doSomething() call. If your error was back at #1 when you constructed "a", then the error at doSomething() will make you pause. Again, you'll realize that your assumptions were incorrect and you insert another declaration to verify your assumption. Now, you get an error at the assignment to "a". I see no head-scratching in here, other than that caused by the programmer's failure to understand their code. That is what we are trying to solve -- find cases where their misunderstanding is going to cause problems at run time. > > Find another example -- this one doesn't support your position that > > inferencing is harmful. > > It absolutely does. If I can't jump immediately from the error message > to the line that causes the problem then something is wrong with the > type system. I shouldn't have to "debug" compiler errors by inserting > declarations here and there. But I'm telling you: in this case, there is ONE of THREE problems. If the problem was #3, then you *can* immediately jump to the spot to fix the declaration of "k". Otherwise, your assumptions are incorrect and the compiler just told you that. That is exactly what it is supposed to do. When I get a segfault in my C code, it is rarely caused by a local problem. If you believe that all error messages and their cause are supposed to be localized, then you are truly mistaken. On one hand, you say that you don't want to "debug compiler errors" by inserting declarations. But that is *exactly* what you're proposing to do! You just want to be pre-emptive and force people to declare them up front. What if "a" and "j" were *supposed* to be of type "any"? There is no compiler error in your example and my proposed response. You specifically did not want to enforce any types on "a" or "j". The compiler did exactly what you told it to: no type checks on those. Later, you made an assertion that "j" was an Int by virtue of believing that you could assign it to "k". Well, the compiler just *helped* you out by telling you that assumption was incorrect. What is the problem with that? > > Sure. And in the absence of saying so, the inferencer above did exactly > > what it was supposed to do. You have one of three problems in your code, > > yet no way for any compiler to know which one you meant. > > It shouldn't have let me get so far without saying *exactly what I > mean*. It shouldn't have tried to read my mind. I'm the human asking for > the computer's help in keeping things straight. If it tries to read my > mind it's going to be making the same mistakes I make! 
Rather it should > tell me when I've first done something fishy. But you didn't do anything fishy! Not until you assumed that "j" was an Int, when it really wasn't. The compiler isn't trying to read your mind. It just told you that you messed up. The assignments to "a" and "j" were perfectly valid. If, in your mind, they were not, then you should have made that clear. But I'm trying to say that we should not require that. We can get along quite fine with knowing the type information as it gets resolved, rather than having to declare it all up front. >... > Therefore our real decision is: do we want to force programmers who want > static type checking to sprinkle their code with declarations > *explicitly* or do we want to wait until they get frustrated with the > inferencer? You are assuming that it will cause frustration. It isn't making a single guess. It is tracking exactly what you are doing with your types -- it is entirely logical and deterministic. When it tells that you goofed, that is because you really did. That is valid, helpful information. I do not want to require explicit declaration. There is no need for it. > It seems to me that "simple and explicit" is more Pythonic > than "we'll guess what you mean and you can being explicit if we guess > wrong." If you don't want to put in declarations, don't use static type > checking! Stop that. There are no guesses occurring! > > We will lose one of the best things of > > Python -- the ability to toss out and ignore all that declaration crap > > from Algol-like languages. If you give a means to people to declare their > > variables, then they'll start using it. "But it's optional!" you'll say. > > Well, that won't be heard. People will just blindly follow their C and > > Java knowledge and we will lose the cleanliness of syntax that Python has > > enjoyed for so long. > > I don't see the absence of type declarations as having much to do with > Python's cleanliness. I do. Python is remarkably devoid of syntactic sugar. The lack of declarations is a huge factor in that. > As Tim said, declarations improve readability by > serving as documentation. I agree and submit: # foo is an Integer Perfectly valid documentation if that is what you need. No reason to introduce syntax for that. Cheers, -g -- Greg Stein, http://www.lyra.org/ From paul@prescod.net Thu Dec 23 01:15:15 1999 From: paul@prescod.net (Paul Prescod) Date: Wed, 22 Dec 1999 19:15:15 -0600 Subject: [Types-sig] Back to basics References: Message-ID: <386177A3.86F0D505@prescod.net> Greg Stein wrote: > > This is all fine as long as the design does not preclude the availability > of typedecl information at runtime. I totally agree. I would even like a type-checking function/operator (preferably the former for implementation reasons). > Some of these discussions about new > namespaces or not worrying about names being defined could prevent that. I don't follow the part following "or". > I've proposed plenty of syntax for the typedecls and interface > declarations. I don't think there has been a solid proposal yet for > parameterizations. I would recommend that the syntax design at least > starts with the proposal that I set up to save some work and provide a > basis for discussion about how to add parameterization. Agreed. > [ personally: I'd recommend parameterization get punted to V2, although I > worry that if we don't take its syntax into account, we might preclude > its addition later on. ] Agreed. Tim Peters convinced me that it isn't actually too big of a deal. 
Parameterization is almost like string substitution. In some lexical scope, _T stands for a paramater that is substituted in when a concrete object is declared. If you treat it like string substitution then the semantics are pretty simple. One minor detail to work out is whether to predeclare the list of parameter variables or just look for names beginning with underscores. Paul Prescod From gstein@lyra.org Thu Dec 23 09:35:40 1999 From: gstein@lyra.org (Greg Stein) Date: Thu, 23 Dec 1999 01:35:40 -0800 (PST) Subject: [Types-sig] Back to basics In-Reply-To: <386177A3.86F0D505@prescod.net> Message-ID: On Wed, 22 Dec 1999, Paul Prescod wrote: > Greg Stein wrote: > > This is all fine as long as the design does not preclude the availability > > of typedecl information at runtime. >... > > Some of these discussions about new > > namespaces or not worrying about names being defined could prevent that. > > I don't follow the part following "or". Sorry. There are at least a couple things that have been discussed recently might prevent us from having typedecl objects at runtime: - shifting names of typedecl objects into a distinct namespace (how does the runtime access this namespace? when is the namespace created? etc) - undefined names, per the runtime execution order (at the time a function object is created, we need a typedecl object for one of its args; if the name referring to the typedecl is undefined at that point in the execution, then we fail or we don't get a typedecl; both situations are untenable for me) In light of these possible directions in discussion, I'm concerned that following them will mean we don't have runtime typedecls. While walking to dinner tonite, I came up with a great example for runtime typedecl objects: debuggers. I imagine there are other IDE functions that would find the information useful. [ of course, I can also imagine that, in some situations, an IDE needs typedecl objects without loading a module ] >... > > [ personally: I'd recommend parameterization get punted to V2, although I > > worry that if we don't take its syntax into account, we might preclude > > its addition later on. ] > > Agreed. Tim Peters convinced me that it isn't actually too big of a > deal. Parameterization is almost like string substitution. In some > lexical scope, _T stands for a paramater that is substituted in when a > concrete object is declared. If you treat it like string substitution > then the semantics are pretty simple. One minor detail to work out is > whether to predeclare the list of parameter variables or just look for > names beginning with underscores. Sure. Parameterization isn't a hard concept, but coming up with a nice syntax :-). I would hope that we don't base semantics on the presence of a leading underscore. In your original "Back to basic" note (which started this thread), you state that out-of-line files would be used. I'm beginning to agree with the need for separate interface files. But for a different reason :-). Let's say that you have a module that has a few "decl member" statements here and there. Those decl statements along with function and class definitions form its interface. How do we cache that interface so the type-checker can use it later? If we are analyzing module "foo" and it imports "bar", then where do we get bar's interface? I hope the answer isn't that we go and parse bar to derive it. 
[ actually, I'd hope we somehow generate a central database of interfaces, but that is an implementation detail best left for later :-) ] Cheers, -g -- Greg Stein, http://www.lyra.org/ From guido@CNRI.Reston.VA.US Thu Dec 23 13:37:44 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Thu, 23 Dec 1999 08:37:44 -0500 Subject: [Types-sig] type declaration syntax In-Reply-To: Your message of "Thu, 23 Dec 1999 10:58:39 +1100." <386165AF.F6E6BF81@maxtal.com.au> References: <385C1345.C21FF180@maxtal.com.au> <19991222224650088.AAA118.228@max41101.izone.net.au> <386165AF.F6E6BF81@maxtal.com.au> Message-ID: <199912231337.IAA21818@eric.cnri.reston.va.us> [John Skaller] > 1) Ban rebinding module variables after importing is done > > 2) Specify that 'def' creates an immutable binding. You can't > rebind (or del) the name. Same for 'class'. [This includes > defs inside classes] > > 3) It isn't necessary to introduce 'const', at least for modules and > classes: > Here's why: > Given (1), the only ways to change a module level variable are: > > a) a module level assignment -- easy to detect > b) an assignement in a function defined in the same module, > which has a global directive for that variable > c) exec statement [if you see one, abandon hope of optimisation] > d) __dict__ hackery from outside the module > > The only hard case is (d). This cannot be statically checked. > It can be banned all the same (and on the programmers head be it), > > I.e. TWO bans fix most problems. The ban on module level rebindings > is a significant restriction. The ban on changing 'def' and 'class' > bindings can be worked around [use another variable] and so > no functionality is lost, provided that the author takes > this into account. EG: > > class X: > def f(..): pass > g = f # g can be rebound, f cannot be > > I think this covers: > > caching all module and class level variables > caching instance methods > > what is not covered is caching instance attributes. Agreed. I proposed most of this a week ago. However, I don't see why you propose to disallow rebinding def and class. I proposed to have a warning for this. If they are rebound, it is easily detected, so the effects can simply be calculated by the type checker. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@CNRI.Reston.VA.US Thu Dec 23 13:47:32 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Thu, 23 Dec 1999 08:47:32 -0500 Subject: [Types-sig] recursive types, type safety, and flow analysis In-Reply-To: Your message of "Thu, 23 Dec 1999 11:14:44 +1100." <38616974.AFC365D6@maxtal.com.au> References: <38615A5B.D2E2B608@prescod.net> <38616974.AFC365D6@maxtal.com.au> Message-ID: <199912231347.IAA21844@eric.cnri.reston.va.us> [a.a can have type int, string, or a few other possibilities] > > j: Int > > j = a.a > > > > What's the error message? > > There isn't one at compile time. > a could be an int, the inference engine cannot > know if it is or not, so it keeps quiet. > At run time, a TypeError is raised if something > other than int is assigned. I don't like this rule, and I don't think this kind of rule exists in other languages. I would say a.a is a union, and most languages dealing with unions require the programmer to explicitly code a type test. Never mind that *we* might know that the original program in fact will only reach this point with an int in a.a; the fact that the type checker can't see that (and it would have to solve the halting problem to see it!) means that I'd be happy to get an error message here. 
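[A sketch of the kind of explicit test meant here, written as plain runtime Python against the a.py from Paul's example rather than in any proposed syntax:

    import a
    if type(a.a) is type(0):
        j = a.a                  # on this branch a checker knows a.a is an int
    else:
        raise TypeError("expected an int in a.a, got %s" % type(a.a))
]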
Paul's issue was that in ML the error message is typically ununderstandable. I have never used ML (it's a language for people with excess IQ points) but I don't think that is the right level of critique. In any case I think we can do better by simply referring to the line number(s) where a.a gets assigned a non-int value. Good error messages are a human factors issue, not a type system issue. --Guido van Rossum (home page: http://www.python.org/~guido/) From skaller@maxtal.com.au Thu Dec 23 16:48:59 1999 From: skaller@maxtal.com.au (skaller) Date: Fri, 24 Dec 1999 03:48:59 +1100 Subject: [Types-sig] Run time arg checking implemented References: <386177A3.86F0D505@prescod.net> Message-ID: <3862527B.99B783C8@maxtal.com.au> I have implemented run time argument checking in Viper, using Greg's ! operator. The syntax (so far) is like: def f( p ! t = dflt): pass and the semantics are to check that an argument has the nominated type: f(a) checks like: if type(a) is not t: raise TypeError "messge" There is no return type declaration or checking, and the type can be an arbitrary expression. [In Viper, any object can be a type] Even if a parameter has a default argument, it is not checked at the point of definition. The type is bound at the point of definition. Implementation time, a few hours. Tuple parameters are checked componentwise: def f((a!t1, b!t2)) : pass We're not talking any complex type checking, inference, or anything else here .. but this could, IMHO, be implemented easily in CPython, well in time for 1.6. Again, it _looks_ nice to me, and it sits well with the expression form x ! t Example: ---------------------------------------- >>>t = type(1) ...def f(x!t): pass ...f("no") ... Uncaught Python Exception at top level!! .. Kind: Instance of TypeError .. Attributes: args --> Argument 1 in call to function f has type but type is required -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From skaller@maxtal.com.au Thu Dec 23 17:40:29 1999 From: skaller@maxtal.com.au (skaller) Date: Fri, 24 Dec 1999 04:40:29 +1100 Subject: [Types-sig] type declaration syntax References: <385C1345.C21FF180@maxtal.com.au> <19991222224650088.AAA118.228@max41101.izone.net.au> <386165AF.F6E6BF81@maxtal.com.au> <199912231337.IAA21818@eric.cnri.reston.va.us> Message-ID: <38625E8D.5280EA8A@maxtal.com.au> Guido van Rossum wrote: > > 2) Specify that 'def' creates an immutable binding. You can't > > rebind (or del) the name. Same for 'class'. [This includes > > defs inside classes] > Agreed. I proposed most of this a week ago. Yes, I guess I was trying to summarise more than propose anything new. > However, I don't see why > you propose to disallow rebinding def and class. I proposed to have > a warning for this. If they are rebound, it is easily detected, so > the effects can simply be calculated by the type checker. You're at least right that my stance on 'const' variables and classes/functions is inconsistent. The rebinding rule was also intended to conver nested classes and functions. At present: def f(): class X: pass return X() x1 = X() x2 = X() assert classof(x1) != classof(x2) I hoped to eliminate this excess dynamism. .. redefining the class every function invocation is rarely desirable. But it may be harder than I thought, because even though CPython does not lexically scope function bodies (Viper does .. so my rule would break it), default arguments _are_ lexically scoped. 
And perhaps it is best left alone, since nested classes and functions are rarely used. [I mean, nested in functions] Hmm. I have already put a test in Viper for module.attr = value. I found I used it in a couple of places, initialising the sys module. [specifically, assigning the standard input/output files] I like the warning. I guess I could put duplicate defintion warnings in easily, and see what warnings I get. -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From guido@CNRI.Reston.VA.US Thu Dec 23 18:08:55 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Thu, 23 Dec 1999 13:08:55 -0500 Subject: [Types-sig] type declaration syntax In-Reply-To: Your message of "Fri, 24 Dec 1999 04:40:29 +1100." <38625E8D.5280EA8A@maxtal.com.au> References: <385C1345.C21FF180@maxtal.com.au> <19991222224650088.AAA118.228@max41101.izone.net.au> <386165AF.F6E6BF81@maxtal.com.au> <199912231337.IAA21818@eric.cnri.reston.va.us> <38625E8D.5280EA8A@maxtal.com.au> Message-ID: <199912231808.NAA22805@eric.cnri.reston.va.us> [John Skaller] > CPython does not lexically scope function bodies I'm not sure what you mean by lexically scoped here, since in my opinion Python functions *are* lexically scoped -- however the scopes don't nest like they do in Pascal etc. (The opposite of lexical scoping is dynamic scoping, which is emphatically *not* used in Python -- variables bound in outer stack frames don't affect a function's use of variable names.) > default arguments _are_ lexically scoped. I'm not sure what you mean here either. Defaults are evaluated in the containing scope. I'm not sure how that makes them any more lexically scoped. --Guido van Rossum (home page: http://www.python.org/~guido/) From skaller@maxtal.com.au Fri Dec 24 02:01:27 1999 From: skaller@maxtal.com.au (skaller) Date: Fri, 24 Dec 1999 13:01:27 +1100 Subject: [Types-sig] type declaration syntax References: <385C1345.C21FF180@maxtal.com.au> <19991222224650088.AAA118.228@max41101.izone.net.au> <386165AF.F6E6BF81@maxtal.com.au> <199912231337.IAA21818@eric.cnri.reston.va.us> <38625E8D.5280EA8A@maxtal.com.au> <199912231808.NAA22805@eric.cnri.reston.va.us> Message-ID: <3862D3F7.1D71AC9B@maxtal.com.au> Guido van Rossum wrote: > > [John Skaller] > > CPython does not lexically scope function bodies > > I'm not sure what you mean by lexically scoped here, since in my > opinion Python functions *are* lexically scoped -- however the scopes > don't nest like they do in Pascal etc. (The opposite of lexical > scoping is dynamic scoping, which is emphatically *not* used in Python > -- variables bound in outer stack frames don't affect a function's > use of variable names.) > > > default arguments _are_ lexically scoped. > > I'm not sure what you mean here either. Defaults are evaluated in the > containing scope. I'm not sure how that makes them any more lexically > scoped. I agree with what you say, your guess at what I meant is correct. I should have given a more detailed description. The issue remains, independently of the terminology used to describe it: def f(x): def g(a): .... return g Here, each g created on invocation of f is has the same behaviour on each invocation of f, no matter what the arguments of f are, and no matter what the values of the locals of f are. But each invocation of f() returns a _new_ function called g: a distinct object. 
This is not the case if g has a default argument: def f(x): def g(a=x): return a return g g1 = f(1) g2 = f(2) print g1(), g2() prints 1,2: g1 and g2 have different behaviours, even though the bodies of the functions are the same, because the default arguments differ. So _part_ of a function definition may depend on the definition context, namely, its default arguments. Unlike C and C++, in Python default arguments are part of the function. -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From skaller@maxtal.com.au Fri Dec 24 03:16:04 1999 From: skaller@maxtal.com.au (skaller) Date: Fri, 24 Dec 1999 14:16:04 +1100 Subject: [Types-sig] Multiple dispatch [for interest only] References: <385C1345.C21FF180@maxtal.com.au> <19991222224650088.AAA118.228@max41101.izone.net.au> <386165AF.F6E6BF81@maxtal.com.au> <199912231337.IAA21818@eric.cnri.reston.va.us> <38625E8D.5280EA8A@maxtal.com.au> <199912231808.NAA22805@eric.cnri.reston.va.us> Message-ID: <3862E574.156465BB@maxtal.com.au> [This is mainly for interest, it is not a 'proposal' of any kind] A day after implementing type checked function arguments in Viper, using the syntax def f(a! t1, b!t2) ... I am thinking of some interesting new possibilities this leads to for polymorphism. This comes about because the type information is part of the run-time function object. Consider, just for the moment: def f(x!t1): ... def f(x!t2): ... and suppose this was allowed, and matched the arguments of a function call in reverse order of definition: f(a) is notionally expanded to if type(a) is t2: f_t2(a) else if type(a) is t1: f_t1(a) where f_t1, and f_t2 are the two functions named f, defined by the above definitions, respectively. This could be useful, and the mechanism has a name: it's called multiple dispatch. In particular, it is _message_ based multiple dispatch. [The table of possible signatures is centralised, and associated with the function name, rather than distributed between 'objects' as methods] Here's an example: suppose you have an algorithm: # module System def add(a,b): # add standard integer types def gcd(a,b): ... add(a,b) ... add(b,a) It's well known that Euclids algorithm for calculating the greatest common divisor is generic: it works when both numbers elements of the same integral domain such as the integers. Python already has two integer types, 'int' and 'long': the algorithm will work for both. What happens if a new type, MyType, is created? You would write: _add = add def add(a,b): if type(a) is MyNumber and type(b) is MyNumber: ... # details for adding MyNumbers else: _add(a,b) # call old function System.add = add I note this is somewhat insecure, since _add is likely to get redefined in a subsequent attempt at the same thing, Viper has a syntax to fix this, but that's another story :-] I also note that this _requires_ invading the module System, and breaking the 'freezing'. Let me call this an incremental function defintion. What we're doing, is overloading the add function. This is part of the theoretical 'type' of MyNumber (that you can add them). Adding a new meaning to the System 'add' function is essential for making the generic algorithm work with MyNumbers. 
[There is another way to do this, using classes and __add__ and __radd__ methods, but this does NOT generalise because of the usual covariance problem with object orientation: please consider a three argument function to see this, I've used a two argument one for brevity] So, I was thinking that it would be interesting to consider what would happen if the run time system, instead of just looking a function up by name, ALSO used the types of the arguments to help choose which one to call. [This could _also_ be applied to methods, except that the method would not be applied to the 'object', since the class of the object already acts as a lookup namespace] Just to summarise: I'm thinking about some way of permitting 'incremental' function defintions, and an associate 'multiple dispatch'. Before considering syntax, there are architectural issues. The invisaged implementation for _builtin_ functions, such as 'len', would be to keep a list of the functions named 'len' associated with the string 'len': instead of the lookup table I'm using at present, which is like: {'len': len, 'abs': abs .. ] the table would be changed to {'len', [len1, len2 ..] .. } and instead of dispatch return table['len'](arg) we'd have something like: for f in table['len']: if type(arg) = f.param_types[0]: return f(arg) This is NOT particularly new for Python, it is similar to the way exceptions are matched against handlers. I note: the actual implementation is trivial. The extra overhead calling functions is also minimal. The issues here are semantics, and to some extent, syntax. ------------------------------------------------------- FYI: what excites me about this is that it is well known that object oriented polymorphism doesn't work in the sense that it doesn't extend gracefully to multiple arguments. Dynamic multiple dispatch is a fairly unprincipled way of supporting the notion of genericity; however proper genericity cannot so easily be implemented: no one knows how to do it, it is a very active research area in which there are currently a diverse collection of theories. Many functional languages provide _correct_, well principled support for genericity, using type variables, modules, Haskell classes/monads .. etc, but all of these type systems are severely limited in their expressivity. Generally, less well principled systems like C++ templates, or dynamic disptching, are less secure .. but more expressive. For those interested in why someone like me, interested in genericity, is VERY interested in Python: this is the reason: it is possible to use the relatively unprincipled dynamism to _implement_ generic things in python, even though such implementations are not very secure. This is somewhat like using casts in C. As any Python enthusiast will tell you, Pythons dynamic nature makes its object orientation more powerful (expressive) than that provided by many statically typed languages .. and also more dangerous. However, OO doesn't work well, and it is interesting to experiment with a dynamic language which does not have the constraints of _both_ strict object orientation and also static typing: a suitable dynamic system can suggest how to 'design' a static type system that retains the expressivity enabled by dynamism, but also provide the extra security static typing traditionally provides. 
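[A minimal sketch of the dispatch-table idea described above, as ordinary Python; the names table, register and dispatch are invented for the sketch and are not part of Viper:

    table = {}                   # function name -> list of (param types, function)

    def register(name, types, func):
        # newest definition first, so lookup runs in reverse order of definition
        table.setdefault(name, []).insert(0, (types, func))

    def dispatch(name, *args):
        for types, func in table[name]:
            # exact type match on every argument, as in the description above
            if len(types) == len(args) and all(type(a) is t for a, t in zip(args, types)):
                return func(*args)
        raise TypeError("no matching definition of %s" % name)

    def add_ints(a, b): return a + b
    register("add", (int, int), add_ints)
    dispatch("add", 2, 3)        # -> 5; an "add" for MyNumber can be registered later
]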
-- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 Date: Wed, 22 Dec 1999 03:36:36 -0500 From: scott To: Greg Stein Subject: Re: [Types-sig] recursive types, type safety, and flow analysis Message-ID: <19991222033636.A14007@chronis.pobox.com> References: <19991221234415.A12628@chronis.pobox.com> Resent-From: Greg Stein Resent-To: types-sig@python.org On Tue, Dec 21, 1999 at 10:02:08PM -0800, Greg Stein wrote: > On Tue, 21 Dec 1999, scott wrote: > > On Tue, Dec 21, 1999 at 08:06:28PM -0800, Greg Stein wrote: > >... > > > Basically, I think your request to find and report on > > > use-before-definition is "intractable" *when* you're talking about > > > multiple bodies of code (e.g. two functions, or the global space and a > > > function). [...] > > I'd agree that this has been demonstrated, but only for examples of > > code which seem like great candidates for compile time warnings. Are > > there examples which strike you otherwise? > > One of my points was that I do not believe you can issue warnings because > you can't know whether a problem might exist. Basically, it boils to not > knowing whether a global used by a function exists at the time the > function is called. So you either issues warnings for all global usage, or > you issue none. You can make a few guesses based on what happens in the > global code body, but I don't think the guesses will really improve the > quality of warnings. I personally can't imagine that it would be an issue to treat globals in functions as anything other than a simple flat-rule: for type checking purposes, globals must be defined at compile time in the global namespace, that's just me, but I'd probably fire any of the python programmers that work for me if they did what you describe above with globals in a large project :) > > Examples? No, I don't really have any handy. Any example would be a short > code snippet and people would say, "yah. that's bad. it should fail." But > the issue is with larger bodies of code... that's what we're trying to > fix! So... No, I don't have a non-trivial example.
I can't even imagine one, so if there's any way to describe this global issue a little further without putting too much effort into it, I'd appreciate it. [...] > > The origination of this discussion was based on the recursive type issue. > If we have runtime objects, then I doubt we could support the recursive > type thing without some additional work. Or, as I'm suggesting, you do not > allow an undefined name (as specified by runtime/execution order) to be > used in a typedecl. you could even allow typedecl to import modules for the sake of gaining access to the names, where those imports would only occur when the optional type checking is turned on. I'd agree that the use of an undefined name should be disallowed. With the presence of type-check-only import, following the same no-mutually-recursive-imports rule of the regular import, but only importing typedecl statements, you could achieve all this at compile time. I've run into this issue on large projects, importing a classname, just to run assert isinstance(foo, thatclass), "complain meaningfully" But it hasn't come up with recursive types in any code I've seen, just deeply-complex types in terms of container and class hierarchy relationships. > > The design of how to handle recursive types depends on the decision to > include/exclude runtime objects that define function, class, or module > typedecl information. Even if we defer the runtime creation of those > objects, it will affect the design today. > indeed. [...] > > I do believe the information goes into the bytecode, but I don't think > that is the basis for needing to plan now. Instead, we have to define the > semantics of when/where those typedecl objects exist. Do we have them at > runtime? in the above, no, though we do have the ability to find a name anywhere at compile time. >Does a name have to exist (in terms of runtime execution) for it > to be used in a typedecl, or does it just have to exist *somewhere*? in the above, it has to exist in the typedecl 'execution' model, which is during compile time. >If > names must exist before usage, then how is the recursive type thing > handled? With unspecified typedecls? (like an unspecified struct) How about an iterative model which continues until all typedecl names are filled in? I understand your concern about 2 distinct namespace models being unsettling. It raises issues of what exactly we want out of static typing, and what sets of existing and future python code may benefit from static typing, and these are indeed big issues. For me, it is sufficient to proceed from the premiss that you can't have static typing work on code that redefines types at run time, and to limit runtime checking (for the time being) to optionally have the interpreter take some action (warn or abort) when that happens. That requirement alone implies that typedecl'd names and their typedecl bodies need to be available at run time, which is sufficient to support just about any future developments in a static-typeing interface in pure python. As an aside, I'm glad to learn it wouldn't be difficult to have python put static type information in it's byte code. That seems like a good place for it. As weird as it is to have a separate type-decl name model, it seems infintely to depict dynamic typing in a static typing model. 
scott From skaller@maxtal.com.au Sat Dec 25 21:33:41 1999 From: skaller@maxtal.com.au (skaller) Date: Sun, 26 Dec 1999 08:33:41 +1100 Subject: [Types-sig] Viper module compiler begun Message-ID: <38653835.11199C33@maxtal.com.au> I have started work on a function which takes an already imported module, and generates CPython 1.5.2 CAPI compatible C code. [i.e. the idea is that you can compile it as a replacement for the source python script] This function does not as yet use the type information which the function argument type declarations provide [recall I have implemented 'def f(x!typeexpr)'] However, it _does_ use the type information available from the module level objects (obviously, this is necessary to distinguish functions from other kinds of object :-) Note also, the model under investigation does not 'compile' from script source, instead, the script is executed by the interpreter first, to build the module dictionary, and then code is generate to make each of the objects found in the module dictionary. Because of this approach, a lot of the 'dynamism' involved in constructing a module is bypassed -- it is handled in the usual way by the interpreter. Clearly, this approach is limited. One of the problems is that any objects 'imported' from elsewhere will get duplicated in each such module (rather than being shared). I'd call this a 'leaf compiler' for this reason: it works best with modules that can be regarded as leaves (not needing to import from other modules). I'm investigating how to get around this (look at the source, to see what is imported, and actually import it, rather than building copies of the objects). I'll report back later on further progress. I note that while my code is written in ML, if the tool proves useful, in that it can compile code for some modules, and that code runs significantly faster than script, then a translation into Python should be possible. In particular, such a translator could be compiled with itself .. :-) -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From gstein@lyra.org Sat Dec 25 21:38:47 1999 From: gstein@lyra.org (Greg Stein) Date: Sat, 25 Dec 1999 13:38:47 -0800 (PST) Subject: [Types-sig] Viper module compiler begun In-Reply-To: <38653835.11199C33@maxtal.com.au> Message-ID: On Sun, 26 Dec 1999, skaller wrote: >... > Note also, the model under investigation does not > 'compile' from script source, instead, the script > is executed by the interpreter first, to build the module > dictionary, and then code is generate to make each of the > objects found in the module dictionary. > > Because of this approach, a lot of the 'dynamism' > involved in constructing a module is bypassed -- it is > handled in the usual way by the interpreter. Interesting approach. Dunno why, but for those that need the dynamism or want to avoid importing, the Python2C compiler can be used: http://www.mudlib.org/~rassilon/p2c/ And it exists today :-) Cheers, -g -- Greg Stein, http://www.lyra.org/ From skaller@maxtal.com.au Sun Dec 26 01:04:19 1999 From: skaller@maxtal.com.au (skaller) Date: Sun, 26 Dec 1999 12:04:19 +1100 Subject: [Types-sig] Viper module compiler begun References: Message-ID: <38656993.58E50914@maxtal.com.au> Greg Stein wrote: > > Because of this approach, a lot of the 'dynamism' > > involved in constructing a module is bypassed -- it is > > handled in the usual way by the interpreter. > > Interesting approach. 
> > Dunno why, but for those that need the dynamism or want to avoid > importing, the Python2C compiler can be used: > > http://www.mudlib.org/~rassilon/p2c/ Last time I tried that, it crashed unceremoniously. Has that been fixed? -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From gstein@lyra.org Sun Dec 26 09:45:59 1999 From: gstein@lyra.org (Greg Stein) Date: Sun, 26 Dec 1999 01:45:59 -0800 (PST) Subject: [Types-sig] Viper module compiler begun In-Reply-To: <38656993.58E50914@maxtal.com.au> Message-ID: On Sun, 26 Dec 1999, skaller wrote: > Greg Stein wrote: > > > Because of this approach, a lot of the 'dynamism' > > > involved in constructing a module is bypassed -- it is > > > handled in the usual way by the interpreter. > > > > Interesting approach. > > > > Dunno why, but for those that need the dynamism or want to avoid > > importing, the Python2C compiler can be used: > > > > http://www.mudlib.org/~rassilon/p2c/ > > Last time I tried that, it crashed unceremoniously. > Has that been fixed? Not much of a bug report. Get serious. How the heck should I know whether that particular bug has been fixed? "oh. it broke. fix it." *snort* As far as I know, P2C can successfully convert *any* module into a Python extension model. -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Sun Dec 26 12:13:41 1999 From: gstein@lyra.org (Greg Stein) Date: Sun, 26 Dec 1999 04:13:41 -0800 (PST) Subject: [Types-sig] merry christmas... here is a demo! Message-ID: This message is in MIME format. The first part should be readable text, while the remaining parts are likely unreadable without MIME-aware tools. Send mail to mime@docserver.cac.washington.edu for more info. --1658348780-1663354158-946210421=:412 Content-Type: TEXT/PLAIN; charset=US-ASCII Hi all, I banged together a rough prototype for a type checker. It provides some interesting errors/warnings, but totally ignores a bazillion others :-) But: it does provide a complete structure for handling that stuff. It understands a variety of types and composite types and whatnot. It analyzes a parse tree of a target module. It provides for looking up names in the builtin, global, and local namespaces; each name has an associated type. No declarations exist, but it does extract a bit of type information based on what is going on. It runs in "verbose" mode. In this mode, all assignments (or dels) and return statements are printed, along with the type that will be assigned or returned. It's fun to do something like: return (1, "ab", []) And watch it print: line 1: return tuple> I figured it would be nice to go ahead and dump a copy to the SIG. Merry Christmas! :-) [ and for those who don't celebrate christmas, figure this to be an early New Year's Gift... for those who celebrate Chinese New Years... consider this a *really* early gift :-) ... for those... hehe... just have fun! ] For fun: run "check.py" on itself. 
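A rough sketch of the kind of verbose-mode reporting described above, using the modern ast module as a stand-in for the parse-tree machinery; report_types and its output format are invented here for illustration and are not the actual check.py:

import ast

def report_types(source):
    """Rough imitation of a 'verbose mode' pass: print a guessed type for
    every assignment and return statement found in the given source."""

    def guess(node):
        # Only literals get a concrete guess; everything else is "Any".
        if isinstance(node, ast.Constant):
            return type(node.value).__name__
        if isinstance(node, ast.List):
            return "list"
        if isinstance(node, ast.Dict):
            return "dict"
        if isinstance(node, ast.Tuple):
            return "tuple<%s>" % ", ".join(guess(e) for e in node.elts)
        return "Any"

    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Assign):
            print("line %d: assign: %s" % (node.lineno, guess(node.value)))
        elif isinstance(node, ast.Return) and node.value is not None:
            print("line %d: return %s" % (node.lineno, guess(node.value)))

# Example, mirroring the sample statement in the demo announcement:
report_types('def f():\n    return (1, "ab", [])\n')
# prints: line 2: return tuple<int, str, list>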
Cheers, -g -- Greg Stein, http://www.lyra.org/ [Attachment "typesys.tar.gz" (base64-encoded tarball) omitted] From skaller@maxtal.com.au Sun Dec 26 15:48:52 1999 From: skaller@maxtal.com.au (skaller) Date: Mon, 27 Dec 1999 02:48:52 +1100 Subject: [Types-sig] merry christmas... here is a demo! References: Message-ID: <386638E4.541677F7@maxtal.com.au> Greg Stein wrote: > > Hi all, > > I banged together a rough prototype for a type checker. It provides some > interesting errors/warnings, but totally ignores a bazillion others :-) Here's the output of check.py, run on itself (only the last few lines .. :-) line 800: WARNING: (module<...>).LPAR may cause an AttributeError line 802: WARNING: (module<...>).LSQB may cause an AttributeError line 803: WARNING: (Any)._check_assign_subscriptlist may cause an AttributeError line 803: return Any line 805: WARNING: (module<...>).DOT may cause an AttributeError line 810: WARNING: (module<...>).LValueAttr may cause an AttributeError line 810: return Any line 817: WARNING: (Any)._check_node may cause an AttributeError line 817: assign: sub_td = Any line 819: WARNING: (Any).type may cause an AttributeError line 819: WARNING: (module<...>).SliceType may cause an AttributeError line 821: WARNING: (module<...>).LValueSlice may cause an AttributeError line 821: return Any line 824: WARNING: (module<...>).LValueIndex may cause an AttributeError line 824: return Any line 831: WARNING: (Any)._check_node may cause an AttributeError line 835: WARNING: (Any)._check_node may cause an AttributeError line 835: return list<*(Any)> line 837: assign: valueTDs = list<*(Any)> line 838: for: i = Any line 840: return list<*(Any)> line 844: WARNING: (module<...>).testlist may cause an AttributeError line 846: assign: tds = list<*(Any)> line 847: for: arg = Any line 848: WARNING: (module<...>).COMMA may cause an AttributeError line 849: WARNING: (module<...>).test may cause an AttributeError line 855: WARNING: (module<...>).TDVarLenList may cause an AttributeError line 855: return Any line 859: WARNING: (module<...>).dictmaker may cause an AttributeError line 861: assign: key_tds = list<*(Any)> line 862: assign: value_tds = list<*(Any)> line 863: for: i = Any line 867: WARNING: (module<...>).TDDictionary may cause an AttributeError line 867: return Any line 873: assign: n = Any line 874: WARNING: (module<...>).LPAR may cause an AttributeError line 875: WARNING: (Any)._check_function_call may cause an AttributeError line 875: return Any line 876: WARNING: (module<...>).LSQB may cause an AttributeError line 877: WARNING: (Any)._check_node may cause an AttributeError line 877: assign: sub_td = Any line 879: return Any line 881: WARNING: (module<...>).DOT may cause an AttributeError line 882: assign: name = Any line 883: WARNING: (Any).hasattr may cause an AttributeError line 883: assign: has = Any line 884: WARNING: (module<...>).NO may cause an AttributeError line 887: WARNING: (module<...>).Any may cause an AttributeError line 887: assign: td = Any line 888: WARNING: (module<...>).MAYBE may cause an AttributeError line 891: WARNING: (module<...>).Any may cause an AttributeError line 891: assign: td = Any line 893: return Any line 897: return Any line 903: assign: (Any).nodeargs = Any line 904: assign: (Any).td = Any -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From skaller@maxtal.com.au Sun Dec 26
17:50:22 1999 From: skaller@maxtal.com.au (skaller) Date: Mon, 27 Dec 1999 04:50:22 +1100 Subject: [Types-sig] Viper module compiler begun References: Message-ID: <3866555E.B0C44B59@maxtal.com.au> Greg Stein wrote: > > > http://www.mudlib.org/~rassilon/p2c/ > > > > Last time I tried that, it crashed unceremoniously. > > Has that been fixed? > > Not much of a bug report. Get serious. How the heck should I know whether > that particular bug has been fixed? It wasn't a bug report. It was a comment: when I last tried it, it failed so catastrophically that I just junked it. > As far as I know, P2C can successfully convert *any* module into a Python > extension model. I'll try it again, since you seem to believe the current version actually works, and the one I downloaded, was probably an early release (I was pretty eager!). -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From skaller@maxtal.com.au Sun Dec 26 18:01:35 1999 From: skaller@maxtal.com.au (skaller) Date: Mon, 27 Dec 1999 05:01:35 +1100 Subject: [Types-sig] Viper module compiler begun References: Message-ID: <386657FF.86736C80@maxtal.com.au> Greg Stein wrote: > > Last time I tried that, it crashed unceremoniously. > > Has that been fixed? > > Not much of a bug report. Get serious. How the heck should I know whether > that particular bug has been fixed? > > "oh. it broke. fix it." *snort* > > As far as I know, P2C can successfully convert *any* module into a Python > extension model. Here's what I get with the latest version: Did I do something wrong? [root@ruby] ~/py2c>python gencode.py gencode.py __gencode.c _gencode.py Traceback (innermost last): File "gencode.py", line 35, in ? genc.Generator(args[0], args[1], args[2]) File "genc.py", line 91, in __init__ tree = t.parsefile(input) File "transformer.py", line 176, in parsefile return self.parsesuite(file.read()) File "transformer.py", line 166, in parsesuite return self.transform(parser.suite(text)) parser.ParserError: Could not parse string. -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From paul@prescod.net Mon Dec 27 03:45:12 1999 From: paul@prescod.net (Paul Prescod) Date: Sun, 26 Dec 1999 22:45:12 -0500 Subject: [Types-sig] PyDL RFC 0.01 Message-ID: <3866E0C8.C1867223@prescod.net> I've been off-list for a few days so if this RFC doesn't include the last few days' feedback, I apologize in advance. PyDL RFC 0.01 ============= A PyDL file declares the interface for a Python module. PyDL files declare interfaces, objects and the required interfaces of objects. At some point in the future, PyDL files will likely be generated from source code using a combination of declarations within Python code and some sorts of interface deduction and inferencing based on the contents of those files. For version 1, however, PyDL files are separate although they do have some implications for the Python runtime. This document describes the behavior of a class of software modules called "static interface interpreters" and "static interface checkers". Interface interpreters are run as part of the regular Python module interpretation process. They read PyDL files and make the type objects available to the Python interpreter. Interface checkers read interfaces and Python code to verify conformance of the code to the interface.
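As a purely illustrative sketch of the interpreter side, the association between a module and its PyDL file (same base name, ".pydl" or ".gpydl" extension, as spelled out under Behavior below) could be located like this; find_pydl and read_declarations are hypothetical names, not part of the proposal:

import os

def find_pydl(module_path):
    # Look next to ".../foo.py" for a ".../foo.pydl" (hand-written) or
    # ".../foo.gpydl" (generated) file with the same base name.
    base = os.path.splitext(module_path)[0]
    found = []
    for suffix in (".pydl", ".gpydl"):
        candidate = base + suffix
        if os.path.exists(candidate):
            found.append(candidate)
    return found

def read_declarations(pydl_path):
    # Stand-in for the real interface interpreter: just return the raw,
    # non-blank declaration lines so a checker could look at them.
    with open(pydl_path) as f:
        return [line.strip() for line in f if line.strip()]

if __name__ == "__main__":
    for path in find_pydl("spam.py"):   # "spam.py" is a made-up module name
        for decl in read_declarations(path):
            print(decl)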
Concepts: ========= An interface is a Python object with the following attributes: __conforms__ : def (obj: Any ) -> boolean __class_conforms__ : def (obj: Class ) -> boolean (the rest of the interface reflection API will be worked out later) Interfaces can be created through interface definitions and typedefs. There may also be facilities for creating interfaces at runtime but they are neither available nor relevant to the interface interpreter. Interface definitions are similar to Python class definitions. They use the keyword "interface" instead of the keyword "class". Sometimes an interface can be specialized for working with specific types. For instance a list could be specialized for working with integers. We call this "parameterization". A type with unresolved parameter variables is said to be "parameterizable". A type with some resolved parameter variables is said to be "partially resolved." A type with all parameter variables resolved is said to be "fully resolved." Typedefs allow us to give names to partially or fully resolved instantiations of interfaces. In addition to defining interfaces, it is possible to declare other attributes of the module. Each declaration associates an interface with the name of the attribute. Values associated with the name in the module namespace must never violate the declaration. Furthermore, by the time the module has been imported each name must have an associated value. Behavior: ========= The Python interpreter invokes the static interface interpreter and optionally the interface checker on a Python file and its associated PyDL file. Typically a PyDL file is associated with a Python file through placement in the same path with the same base name and a ".pydl" or ".gpydl" extension. "Non-standard" importer modules may find PyDL files using other mechanisms such as through a look-up in an relational database. The interface interpreter reads the interface file and builds the relevant type objects. If the interface file refers to other modules then the interface interpreter can read the interface files associated with those other modules. The interface interpreter maintains its own module dictionary so that it does not import the same module twice. The Python interpreter can optionally invoke the interface checker after the interface interpreter has built type objects and before it interprets the Python module. Once it interprets the Python code, the type objects are available to the runtime code through a special namespace called the "interface namespace". This namespace is interposed in the name search order between the module's namespace and the built-in namespace. Type expression language: ========================= Type expressions are used to declare the types of variables and to make new types. In a type expression you may: 1. refer to a "dotted name" (local name or name in an imported module) 2. make a union of two or more types: integer or float or complex 3. parametrize a type: Array( Integer, 50 ) Note that the arguments can be either types or simple Python expressions. A "simple" Python expression is an expression that does not involve a function call. 4. use a syntactic shortcut: [Foo] => Sequence( Foo ) # sequence of Foo's {a:b} => Mapping( a, b ) # Mapping from a's to b's (a,b,c) => Record( a, b, c ) # 3-element sequence of type a, followed by b followed by c 5. Declare un-modifability: const [const Array( Integer )] Declarations in a PyDL file: ============================ (formal grammar to follow) 1. 
Imports An import statement in an interface file loads another interface file. 2. Basic attribute type declarations: decl myint as Integer # basic decl intarr as Array( Integer, 50 ) # parameterized decl intarr2 as Array( size = 40, elements = Integer ) # using keyword syntax Attribute declarations are not parameteriable. Furthermore, they must resolve to fully parameterized (not parameterizable!) types. 3. Callable object type declarations: Functions are the most common sort of callable object but class instances can also be callable. They may be runtime parameterized and/or type parameterized. For instance, there might be a method "add" that takes two numbers of the same type and returns a number of that type. decl Add(_X: Number) as def( a: const _X, b: const _X )-> _X 4. Class Declarations A class is a callable object that can be subclassed. Currently the only way to make those (short of magic) is with a class declaration, but one could imagine that there might someday be an __subclass__ magic method that would allow any old object instance to also stand in as a class. decl TreeNode(_X: Number) as class( a: _X, Right: TreeNode( _X ) or None, Left: TreeNode( _X ) or None ) -> ParentClasses, Interfaces 5. Interface declarations: interface (_x,_y) foo(a, b ): decl shared somemember as _x decl someOtherMember as _y decl shared const someClassAttr as List( _x ) decl shared const someFunction as def( a: Integer, b: Float ) -> String 6. Typedefs: Typedefs allow interfaces to be renamed and for parameterized variations of interfaces to be given names. typedef PositiveInteger as BoundedInt( 0, maxint ) typedef NullableInteger as Integer or None typedef Dictionary(_Y) as {String:_Y} The Undefined Object: ===================== The undefined object is used as the value of unassigned attributes and the return value of functions that do not return a value. It may not be bound to a name. a = Undefined # raises UndefinedValueError a = b # raises UndefinedValueError if b has not been assigned Undefined CAN be compared. if a==Undefined: blah blah blah New Runtime Function: ===================== conforms( x: Any, y: Interface ) -> Any or Undefined This function can be used in various ways. Here it is used as an assertion: j = conforms( j, Integer ) which is equivalent to: if isinstance( j, Integer ): raise UndefinedValueError Here it is test: if conforms( j, Integer )!=Undefined: anint = conforms( j, Integer ) which is equivalent to the very similar isinstance based code: if isinstance( j, Integer ): anint = j Experimental syntax: ==================== There is a backwards compatible syntax for embedding declarations in a Python 1.5x file: "decl","myint as Integer" "typedef","PositiveInteger as BoundedInt( 0, maxint )" There will be a tool that extracts these declarations from a Python file to generate a .gpydl (for "generated PyDL") file. These files are used alongside hand-crafted PyDL files. The "effective interface" of the file is evaluated by combining the declarations from the same file as if they were concatenated together (more or less...exact details to follow). The two files must not contradict each other, just as declarations within a single file must not contradict each other. Over time the generation of the .gpydl file may be more intelligent and may deduce type information based on code outside of explicit declarations (for instance function and class definitions, assignment statements and so forth). 
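A minimal sketch of the extraction tool mentioned above, assuming the embedded declarations appear one per line exactly in the quoted string-pair form; extract_decls and write_gpydl are made-up names, and a real tool would use the parser rather than a regular expression:

import re

# Matches expression statements of the experimental embedded form:
#   "decl","myint as Integer"
#   "typedef","PositiveInteger as BoundedInt( 0, maxint )"
_DECL_RE = re.compile(r'^\s*"(decl|typedef)"\s*,\s*"(.*)"\s*$')

def extract_decls(python_source):
    """Collect embedded declarations from a module's source text."""
    decls = []
    for line in python_source.splitlines():
        match = _DECL_RE.match(line)
        if match:
            keyword, body = match.groups()
            decls.append("%s %s" % (keyword, body))
    return decls

def write_gpydl(python_path, gpydl_path):
    with open(python_path) as src, open(gpydl_path, "w") as out:
        for decl in extract_decls(src.read()):
            out.write(decl + "\n")

# Example:
print(extract_decls('"decl","myint as Integer"\nx = 1\n'))
# -> ['decl myint as Integer']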
Runtime Implications: ===================== All of the named types defined in a PyDL file are available in the "types" dictionary that is searched between the module dictionary and the built-in dictionary. The runtime should not allow an assignment or function call to violate the declarations in the PyDL file. In an "optimized speed mode" those checks would be disabled. From scott@chronis.pobox.com Mon Dec 27 05:15:24 1999 From: scott@chronis.pobox.com (scott) Date: Mon, 27 Dec 1999 00:15:24 -0500 Subject: [Types-sig] PyDL RFC 0.01 In-Reply-To: <3866E0C8.C1867223@prescod.net> References: <3866E0C8.C1867223@prescod.net> Message-ID: <19991227001524.A39501@chronis.pobox.com> On Sun, Dec 26, 1999 at 10:45:12PM -0500, Paul Prescod wrote: > I've been off-list for a few days so if this RFC doesn't include the > last few day's feedback, I apologize in advance. Very grateful for you providing some more centralized direction with this RFC. Some questions follow, mostly intended to make sure I'm on the same page as the intent of the RFC. > > PyDL RFC 0.01 > ============= [...] > > An interface is a Python object with the following attributes: > > __conforms__ : def (obj: Any ) -> boolean > __class_conforms__ : def (obj: Class ) -> boolean What is the rational behind separating __conforms__ and __class_conforms__? It seems to me like __conforms__ could do everything __class_conforms__ is supposed to. Am I missing something? [...] [...] > Type expression language: > ========================= > > Type expressions are used to declare the types of variables and to > make new types. In a type expression you may: > > 1. refer to a "dotted name" (local name or name in an imported module) > > 2. make a union of two or more types: > > integer or float or complex > > 3. parametrize a type: > > Array( Integer, 50 ) By `50', do you intend length of 50? > > Note that the arguments can be either types or simple Python > expressions. A "simple" Python expression is an expression that does > not involve a function call. > > 4. use a syntactic shortcut: > > [Foo] => Sequence( Foo ) # sequence of Foo's > {a:b} => Mapping( a, b ) # Mapping from a's to b's > (a,b,c) => Record( a, b, c ) # 3-element sequence of type a, followed by > b > followed by c > > 5. Declare un-modifability: > > const [const Array( Integer )] By nesting const declarations, do you intend that checks against modifiability at runtime are shallow? For example, if I declare array A as a constant array of Foo instances, are those foo instances (or their attributes) modifiable? > > Declarations in a PyDL file: > ============================ > > (formal grammar to follow) > > 1. Imports > > An import statement in an interface file loads another interface file. Do you envision the import statement to be similar in syntax to regular python? (eg from m import v; from m2 import *) > > 2. Basic attribute type declarations: > > decl myint as Integer # basic > decl intarr as Array( Integer, 50 ) # parameterized > decl intarr2 as Array( size = 40, elements = Integer ) > # using keyword syntax > > Attribute declarations are not parameteriable. Furthermore, they must > resolve to fully parameterized (not parameterizable!) types. what do you mean by 'attribute declarations'? I'd hate to see classes that couldn't have attributes that are parameterizable, but agree that resolving parameters needs to end somewhere. [...] > 5. 
Interface declarations: > > interface (_x,_y) foo(a, b ): > decl shared somemember as _x > decl someOtherMember as _y > decl shared const someClassAttr as List( _x ) > > decl shared const someFunction as def( a: Integer, b: Float ) -> > String what do you mean by 'shared' in the above? Are you referring to the distinction between class attributes and instance attributes? [...] > The Undefined Object: > ===================== > > The undefined object is used as the value of unassigned attributes and > the return value of functions that do not return a value. It may not > be bound to a name. By functions that do not return a value, do you mean functions that return None, or that may return None? > > a = Undefined # raises UndefinedValueError > a = b # raises UndefinedValueError if b has not been assigned by 'b has not been assigned', do you mean assigned a type, or is this in your view a replacement for NameError? I'm a little unclear where your going with Undefined. > > Undefined CAN be compared. > > if a==Undefined: > blah > blah > blah > scott From paul@prescod.net Mon Dec 27 11:01:05 1999 From: paul@prescod.net (Paul Prescod) Date: Mon, 27 Dec 1999 06:01:05 -0500 Subject: [Types-sig] PyDL RFC 0.01 References: <3866E0C8.C1867223@prescod.net> <19991227001524.A39501@chronis.pobox.com> Message-ID: <386746F1.64CB72B5@prescod.net> Every one of your questions addresses an issue of ambiguity in the spec. Thanks! I'll quote you pieces of the NEW and IMPROVED spec that your comments generated. scott wrote: > > > An interface is a Python object with the following attributes: > > > > __conforms__ : def (obj: Any ) -> boolean > > __class_conforms__ : def (obj: Class ) -> boolean > > What is the rational behind separating __conforms__ and > __class_conforms__? It seems to me like __conforms__ could do > everything __class_conforms__ is supposed to. Am I missing something? Every interface object (remember, interfaces are just Python objects!) has the following method : __conforms__ : def (obj: Any ) -> boolean This method can be used at runtime to determine whether an object conforms to the interface. It would check the signature for sure but might also check the actual values of particular attributes. There is also a global function with this signature: class_conforms : def ( obj: Class, Obj: Interface ) -> boolean This function can be used either at compile time (e.g. by an implementation of an interface checker) or runtime to check that a class will generate objects that have the right signature to conform to the interface. > > Array( Integer, 50 ) > By `50', do you intend length of 50? 3. parameterize a type: Array( Integer, 50 ) Array( length=50, elements=Integer ) > > const [const Array( Integer )] > > By nesting const declarations, do you intend that checks against > modifiability at runtime are shallow? For example, if I declare > array A as a constant array of Foo instances, are those foo instances > (or their attributes) modifiable? Right. That's my feeling right now but I could probably be convinced otherwise. > Do you envision the import statement to be similar in syntax to > regular python? (eg from m import v; from m2 import *) An import statement in an interface file loads another interface file. The import statement works just like Python's except that it loads the PyDL file found with the referenced module, not the module itself. (of course we will make this definition more formal in the future) > what do you mean by 'attribute declarations'? 
I'd hate to see > classes that couldn't have attributes that are parameterizable, but > agree that resolving parameters needs to end somewhere. I'm talking both about module, interface and class attributes. I think that it is sufficient that a class' attributes can be parameterized and can use class parameters. They don't need to be independently parameterizable. So this is allowed: class (_X,_Y) spam( A, B ): decl someInstanceMember as _X decl someOtherMember as Array( _X, 50 ) .... These are NOT allowed: decl someModuleMember(_X) as Array( _X, 50 ) class (_Y) spam( A, B ): decl someInstanceMember(_X) as Array( _X, 50 ) Because that would allow you to create a "spam" without getting around to saying what _X is for that spam's someInstanceMember. That strikes me as overly dynamic for a static type-check system (at least for version 1). > what do you mean by 'shared' in the above? Are you referring to the > distinction between class attributes and instance attributes? Yes. But in retrospect the concept may not jibe very well with the idea that there can be many classes that implement a particular interface. There is no way to share state between all of these objects. I think I'll take that out. Here's the new section on Undefined: The Undefined Object: ===================== The undefined object is used as the value of unassigned attributes and the return value of functions that do not return a value. It may not be bound to a name. a = Undefined # raises UndefinedValueError a = b # raises UndefinedValueError if b has not been assigned Undefined can be thought of as a subtype of NameError. Undefined is needed because it is now possible to declare names at compile time but never get around to assigning to them. In ordinary Python this is not possible. The only useful thing you can do with Undefined is check whether an object "is" Undefined: if a is Undefined: doSomethingWithA(a) else: doSomethingElse() This is equivalent to: try: doSomethingWithA( a ) except NameError: doSomethingElse It is debatable whether we still need NameError for anything other than backwards compatibility. We could say that any referenced variable is automatically initialized to "undefined". Undefined is sufficiently restrictive that this will not lead to buggy programs. Undefined also corrects a long-term unsafe issue with functions. Now, functions that do not explicitly return a value return Undefined instead of None. That means that this is no longer possible a = list.sort() With Undefined, it will blow up because it is not possible to assign the Undefined value. Before Undefined, the code did not blow up but it also did not do the "right thing." It assigned None to "a" which was seldom what was intended. Paul Prescod From paul@prescod.net Mon Dec 27 11:02:04 1999 From: paul@prescod.net (Paul Prescod) Date: Mon, 27 Dec 1999 06:02:04 -0500 Subject: [Types-sig] PyDL RFC 0.02 Message-ID: <3867472C.7ECBAC55@prescod.net> PyDL RFC 0.02 A PyDL file declares the interface for a Python module. PyDL files declare interfaces, objects and the required interfaces of objects. At some point in the future, PyDL files will likely be generated from source code using a combination of declarations within Python code and some sorts of interface deduction and inferencing based on the contents of those files. For version 1, however, PyDL files are separate although they do have some implications for the Python runtime. 
This document describes the behavior of a class of software modules called "static interface interpreters" and "static interface checkers". Interface interpreters are run as part of the regular Python module interpretation process. They read PyDL files and make the interface objects available to the Python interpreter. Interface checkers read PyDL files and Python code to verify conformance of the code to the interface. Interfaces: =========== Interfaces are the central concept in a PyDL file. Interfaces are Python objects like anything else but they are created by the interface interpreter and available to the static interface checker before Python interpretation begins. The PyDL file itself generates an interface object that describes the attributes of the module. It may also contain interface definitions for class instances and other objects. These other interfaces can be created through interface definitions and typedefs. There may also be facilities for creating interfaces at runtime but they are neither available to nor relevant to the interface interpreter. Interface definitions are similar to Python class definitions. They use the keyword "interface" instead of the keyword "class". Sometimes an interface can be specialized for working with specific other interfaces. For instance a list could be specialized for working with integers. We call this "parameterization". An interface with unresolved parameter variables is said to be "parameterizable". A type with some resolved parameter variables is said to be "partially resolved." A type with all parameter variables resolved is said to be "fully resolved." Typedefs allow us to give names to partially or fully resolved instantiations of interfaces. In addition to defining interfaces, it is possible to declare other attributes of the module. Each declaration associates an interface with the name of the attribute. Values associated with the name in the module namespace must never violate the declaration. Furthermore, by the time the module has been imported each name must have an associated value. Behavior: ========= The Python interpreter invokes the static interface interpreter and optionally the interface checker on a Python file and its associated PyDL file. Typically a PyDL file is associated with a Python file through placement in the same path with the same base name and a ".pydl" or ".gpydl" extension. If both are available, the module's interface is created by combining the declarations in the ".pydl" and ".gpydl" files. "Non-standard" importer modules may find PyDL files using other mechanisms such as through a look-up in a relational database, just as they find modules themselves using non-standard mechanisms. The interface interpreter reads the PyDL file and builds the relevant interface objects. If the PyDL file refers to other modules then the interface interpreter can read the PyDL files associated with those other modules. The interface interpreter maintains its own module dictionary so that it does not import the same module twice. The Python interpreter can optionally invoke the interface checker after the interface interpreter has built interface objects and before it interprets the Python module. Once it interprets the Python code, the interface objects are available to the runtime code through a special namespace called the "interface namespace". This namespace is interposed in the name search order between the module's namespace and the built-in namespace.
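A crude way to picture the interposed lookup order is a chained-namespace resolver; the lookup_name function and the dictionaries below are illustrative only and say nothing about how the interpreter would actually implement the interface namespace:

import builtins

def lookup_name(name, module_ns, interface_ns):
    # Proposed search order: module namespace first, then the interface
    # namespace built from the PyDL file, then the built-in namespace.
    for ns in (module_ns, interface_ns, vars(builtins)):
        if name in ns:
            return ns[name]
    raise NameError(name)

# Tiny demonstration with made-up contents:
interface_ns = {"Integer": int}      # stand-in for a real interface object
module_ns = {"x": 42}
print(lookup_name("x", module_ns, interface_ns))        # found in the module
print(lookup_name("Integer", module_ns, interface_ns))  # found in the interface namespace
print(lookup_name("len", module_ns, interface_ns))      # falls through to built-ins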
Interface expression language: ============================== Interface expressions are used to declare that attributes must conform to certain interfaces. In a interface expression you may: 1. refer to a "dotted name" (local name or name in the PyDL of an imported module ). 2. make a union of two or more interfaces: integer or float or complex 3. parameterize a interface: Array( Integer, 50 ) Array( length=50, elements=Integer ) Note that the arguments can be either interfaces or simple Python expressions. A "simple" Python expression is an expression that does not involve a function call. 4. use a syntactic shortcut: [Foo] => Sequence( Foo ) # sequence of Foo's {A:B} => Mapping( A, B ) # Mapping from A's to B's (A,B,C) => Record( A, B, C ) # 3-element sequence of interface a, followed # by b followed by c 5. Declare un-modifiability: const [const Array( Integer )] (the semantics of un-modifiability need to be worked out) Declarations in a PyDL file: ============================ (formal grammar to follow) 1. Imports An import statement in an interface file loads another interface file. The import statement works just like Python's except that it loads the PyDL file found with the referenced module, not the module itself. (of course we will make this definition more formal in the future) 2. Basic attribute interface declarations: decl myint as Integer # basic decl intarr as Array( Integer, 50 ) # parameterized decl intarr2 as Array( size = 40, elements = Integer ) # using keyword syntax Attribute declarations are not parameteriable. Furthermore, they must resolve to fully parameterized (not parameterizable!) interfaces. So this is allowed: class (_X,_Y) spam( A, B ): decl someInstanceMember as _X decl someOtherMember as Array( _X, 50 ) .... These are NOT allowed: decl someModuleMember(_X) as Array( _X, 50 ) class (_Y) spam( A, B ): decl someInstanceMember(_X) as Array( _X, 50 ) Because that would allow you to create a "spam" without getting around to saying what _X is for that spam's someInstanceMember. That strikes me as overly dynamic for a static type-check system (at least for version 1). 3. Callable object interface declarations: Functions are the most common sort of callable object but class instances can also be callable. Callables may be runtime parameterized and/or interface parameterized. For instance, there might be a method "add" that takes two numbers of the same interface and returns a number of that interface. decl Add(_X: Number) as def( a: const _X, b: const _X )-> _X _X is the interface parameter. a and b are the runtime parameters. 4. Class Declarations A class is a callable object that can be subclassed. Currently the only way to make those (short of magic) is with a class declaration, but one could imagine that there might someday be an __subclass__ magic method that would allow any old object instance to also stand in as a class. Here is the syntax for a class definition: decl TreeNode(_X: Number) as class( a: _X, Right: TreeNode( _X ) or None, Left: TreeNode( _X ) or None ) -> ParentClasses, Interfaces What we are really defining is the constructor. The signature of the created object can be described in an interface declaration. 5. Interface declarations: interface (_X,_Y) spam( a, b ): decl somemember as _X decl someOtherMember as _Y decl const someClassAttr as [ _X ] decl const someFunction as def( a: Integer, b: Float ) -> String 6. Typedefs: Typedefs allow interfaces to be renamed and for parameterized variations of interfaces to be given names. 
typedef PositiveInteger as BoundedInt( 0, maxint ) typedef NegativeInteger as BoundedInt( max=-1, min=minint ) typedef NullableInteger as Integer or None typedef Dictionary(_Y) as {String:_Y} The Undefined Object: ===================== The Undefined object is used as the value of unassigned attributes and the return value of functions that do not return a value. It may not be bound to a name. a = Undefined # raises UndefinedValueError a = b # raises UndefinedValueError if b has not been assigned Undefined can be thought of as a subtype of NameError. Undefined is needed because it is now possible to declare names at compile time but never get around to assigning to them. In ordinary Python this is not possible. The only useful thing you can do with Undefined is check whether an object "is" Undefined: if a is Undefined: doSomethingWithA(a) else: doSomethingElse() This is equivalent to: try: doSomethingWithA( a ) except NameError: doSomethingElse() It is debatable whether we still need NameError for anything other than backwards compatibility. We could say that any referenced variable is automatically initialized to "Undefined". Undefined is sufficiently restrictive that this will not lead to buggy programs. Undefined also corrects a long-term unsafe issue with functions. Now, functions that do not explicitly return a value return Undefined instead of None. That means that this is no longer possible: a = list.sort() With Undefined, it will blow up because it is not possible to assign the Undefined value. Before Undefined, the code did not blow up but it also did not do the "right thing." It assigned None to "a" which was seldom what was intended. New Runtime Functions: ====================== conforms( x: Any, y: Interface ) -> Any or Undefined This function can be used in various ways. The most basic way to use it is as a test: if conforms( j, Integer ) is not Undefined: anint = conforms( j, Integer ) Because of the behavior of Undefined, it can also be used as an assertion: j = conforms( j, Integer ) which is equivalent to: if not isinstance( j, Integer ): raise UndefinedValueError Every interface object (remember, interfaces are just Python objects!) has the following method: __conforms__ : def (obj: Any ) -> boolean This method can be used at runtime to determine whether an object conforms to the interface. It would check the signature for sure but might also check the actual values of particular attributes. There is also a global function with this signature: class_conforms : def ( obj: Class, Obj: Interface ) -> boolean This function can be used either at compile time (e.g. by an implementation of an interface checker) or runtime to check that a class will generate objects that have the right signature to conform to the interface. (the rest of the interface reflection API will be worked out later) Experimental syntax: ==================== There is a backwards compatible syntax for embedding declarations in a Python 1.5x file: "decl","myint as Integer" "typedef","PositiveInteger as BoundedInt( 0, maxint )" There will be a tool that extracts these declarations from a Python file to generate a .gpydl (for "generated PyDL") file. These files are used alongside hand-crafted PyDL files. The "effective interface" of the file is evaluated by combining the declarations from the same file as if they were concatenated together (more or less...exact details to follow). The two files must not contradict each other, just as declarations within a single file must not contradict each other.
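To make the Undefined and conforms() behavior concrete, here is a small non-normative sketch in plain Python; the Interface class, conforms helper and bind function are stand-ins for the proposal, and ordinary Python cannot actually forbid binding Undefined to a name, so that rule is only simulated:

class _UndefinedType:
    def __repr__(self):
        return "Undefined"

Undefined = _UndefinedType()      # sentinel returned by conforms() on failure

class UndefinedValueError(Exception):
    pass

class Interface:
    """Toy interface object: conformance is a bare attribute check."""
    def __init__(self, required_attrs):
        self.required_attrs = required_attrs

    def __conforms__(self, obj):
        return all(hasattr(obj, name) for name in self.required_attrs)

def conforms(x, interface):
    # Return x itself when it conforms to the interface, Undefined otherwise.
    return x if interface.__conforms__(x) else Undefined

def bind(value):
    # Stand-in for the "may not be bound to a name" rule; the proposal would
    # have the interpreter itself raise on any assignment of Undefined.
    if value is Undefined:
        raise UndefinedValueError("cannot assign the Undefined value")
    return value

# Usage, mirroring the RFC's test and assertion idioms:
Readable = Interface(["read"])    # made-up interface
import sys
if conforms(sys.stdin, Readable) is not Undefined:
    source = bind(conforms(sys.stdin, Readable))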
Over time the .gpydl generator will get more intelligent and may deduce type information based on code outside of explicit declarations (for instance function and class definitions, assignment statements and so forth). Summary of Major Runtime Implications: ===================== All of the named interfaces defined in a PyDL file are available in the "interfaces" dictionary that is searched between the module dictionary and the built-in dictionary. The runtime should not allow an assignment or function call to violate the declarations in the PyDL file. In an "optimized speed mode" those checks would be disabled. Several new object interfaces and functions are needed. The new "Undefined" object is needed and assignments need to check for "Undefined". From paul@prescod.net Mon Dec 27 11:54:10 1999 From: paul@prescod.net (Paul Prescod) Date: Mon, 27 Dec 1999 06:54:10 -0500 Subject: [Types-sig] recursive types, type safety, and flow analysis References: <38615A5B.D2E2B608@prescod.net> <38616974.AFC365D6@maxtal.com.au> <199912231347.IAA21844@eric.cnri.reston.va.us> Message-ID: <38675362.AF29942E@prescod.net> Guido van Rossum wrote: > > Paul's issue was that in ML the error message is typically > ununderstandable. I have never used ML (it's a language for people > with excess IQ points) My impression was that the ONLY thing that makes ML tricky was the type system. And the type system was built around the idea of type inferencing. > but I don't think that is the right level of > critique. In any case I think we can do better by simply referring to > the line number(s) where a.a gets assigned a non-int value. Good > error messages are a human factors issue, not a type system issue. It's a little bit more subtle than that. The problem is that we are generating anonymous types left right and center: if a: def foo(): # anon type foo1 def bar(self): # anon type foo1->String if self.something: return "Abc" else: return None else: def foo2(): # anon type foo2 def bar(self): # anon type foo2->String if self.something: return 123 else: return 45L k = [foo, foo] # anon type [foo1_class or foo2_class] j = [] for i in k: j.append(i()) #oops. how do we handle this? #okay, another try: k=[foo().bar(), foo().bar()] # anon type [String or None or Integer or Long ] mvar: String or Integer or Long myvar = k Now you have to back-track a LOT of code. Merely reporting: "bar can return None on line 13" is not very helpful because I have to trace a path from bar to where I am which is harder when this code is embedded in other code. Anyhow, I won't say (anymore) that this sort of deduction is unequivocally a bad idea. If we generate these PyDL files then you will have a useful debugging tool which may clear up a lot of these problems. It comforts me to know that if the inferencing goes horibbly wrong you can look at the PyDL file to figure out what the inferencer was thinking and override it. Paul Prescod From gstein@lyra.org Mon Dec 27 12:43:38 1999 From: gstein@lyra.org (Greg Stein) Date: Mon, 27 Dec 1999 04:43:38 -0800 (PST) Subject: [Types-sig] type inference/deduction (was: recursive types, type safety, and flow analysis) In-Reply-To: <38675362.AF29942E@prescod.net> Message-ID: On Mon, 27 Dec 1999, Paul Prescod wrote: > Guido van Rossum wrote: >... > > but I don't think that is the right level of > > critique. In any case I think we can do better by simply referring to > > the line number(s) where a.a gets assigned a non-int value. Good > > error messages are a human factors issue, not a type system issue. 
> > It's a little bit more subtle than that. The problem is that we are > generating anonymous types left right and center: > > if a: > def foo(): # anon type foo1 This would be def()->Any. And for discussion, we'll call this type "foo1" Oh. Wait. I just saw. You actually meant: class foo: And yah: we'll refer to this version as "foo1" > def bar(self): # anon type foo1->String Huh? This would be: def(foo1)->Any > if self.something: > return "Abc" > else: > return None > else: > def foo2(): # anon type foo2 We'll assume: class foo And we'll refer to this version as "foo2" > def bar(self): # anon type foo2->String > if self.something: > return 123 > else: > return 45L > > k = [foo, foo] # anon type [foo1_class or foo2_class] > > j = [] This would be: [Any] > for i in k: > j.append(i()) #oops. how do we handle this? No problem. j can take any element. > #okay, another try: > > k=[foo().bar(), foo().bar()] > # anon type [String or None or Integer or Long ] > > mvar: String or Integer or Long myvar: ... > myvar = k myvar = k[0] (??) Assuming assignment enforcement, you would get an error here. Nowhere else. It would give some error message about k having a type which is incompatible with myvar. > Now you have to back-track a LOT of code. Merely reporting: "bar can > return None on line 13" is not very helpful because I have to trace a > path from bar to where I am which is harder when this code is embedded > in other code. I do not believe that we would issue an error message like that. The message would be about a type conflict at the assignment. > Anyhow, I won't say (anymore) that this sort of deduction is > unequivocally a bad idea. If we generate these PyDL files then you will > have a useful debugging tool which may clear up a lot of these problems. > It comforts me to know that if the inferencing goes horibbly wrong you > can look at the PyDL file to figure out what the inferencer was thinking > and override it. I would expect to be able to generate/cache the PyDL files. I do not see how things can go "horribly wrong." The types are the types. As I mentioned previously: we have to know the type of the RHS in an assignment. I say we use that info, you say we check it against the type of the LHS. Based on the prototype that I posted, I think that I'm going to modify my position a bit: * if you declare a variable/attribute, then you get assignment enforcement (note this also applies to func param names) * if you do not declare, then there is no enforcement (the type is deduced from the RHS) Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Mon Dec 27 12:55:08 1999 From: gstein@lyra.org (Greg Stein) Date: Mon, 27 Dec 1999 04:55:08 -0800 (PST) Subject: [Types-sig] PyDL RFC 0.02 In-Reply-To: <3867472C.7ECBAC55@prescod.net> Message-ID: Only a few quick comments for now... On Mon, 27 Dec 1999, Paul Prescod wrote: > PyDL RFC 0.02 PyDL ?! Don't tell me "Python Definition Language." That is the wrong semantic... you're talking about an "inteface file", not a language. I'd seriously recommend a new acronym before you confuse people :-) .pyi or something. >... > The Python interpreter invokes the static interface interpreter and > optionally the interface checker on a Python file and its associated > PyDL file. Typically a PyDL file is associated with a Python file > through placement in the same path with the same base name and a > ".pydl" or ".gpydl" extension. 
If both are avaiable, the module'sj > interface is created by combining the declarations in the ".pydl" and > ".gpydl" files. The notion of two types of files just adds complexity. There is no reason that a generated file would be *any* different in form/syntax than a human's file. The human just gets to add funky comments, indentation, etc. In other words: design around a single file. >... > Once it interprets the Python code, the interface objects are > available to the runtime code through a special namespace called the > "interface namespace". This namespace is interposed in the name search > order between the module's namespace and the built-in namespace. Search *another* namespace? Eek! We're already seeing people avoiding the time with things like: def foo(len=len): ... Adding another namespace will just exacerbate the situation. I don't recommend adding another distinct namespace, but *IF* you are going to do so, then I might suggest that it is only available for use from withing a typedecl. >... > 5. Declare un-modifiability: > > const [const Array( Integer )] > > (the semantics of un-modifiability need to be worked out) Wasn't the notion of "const" (successfully) argued against inclusion? >... > The Undefined Object: > ===================== > > The Undefined object is used as the value of unassigned attributes and > the return value of functions that do not return a value. It may not > be bound to a name. I don't think this is going to work as you expect. The Python interpreter can't work with "Undefined" unless it is an object (otherwise, you're talking about a near-impossible revamp). Therefore, Undefined is an object and you're going to have some *real* serious issues trying to keep that out of some kind of assignment or other usage. Pass it as a parameter? Shove it into a list or tuple? Check for Undefined on every name binding? What about indexed or slice assignment? >... > conforms( x: Any, y: Interface ) -> Any or Undefined This is predicated upon the "Undefined" concept. I believe that Undefined isn't possible as you're currently defined it, thereby making conforms() unusable. Cheers, -g -- Greg Stein, http://www.lyra.org/ From scott@chronis.pobox.com Mon Dec 27 16:39:39 1999 From: scott@chronis.pobox.com (scott) Date: Mon, 27 Dec 1999 11:39:39 -0500 Subject: [Types-sig] PyDL RFC 0.02 In-Reply-To: References: <3867472C.7ECBAC55@prescod.net> Message-ID: <19991227113939.B41570@chronis.pobox.com> On Mon, Dec 27, 1999 at 04:55:08AM -0800, Greg Stein wrote: > Only a few quick comments for now... > > On Mon, 27 Dec 1999, Paul Prescod wrote: > > PyDL RFC 0.02 > > PyDL ?! > > Don't tell me "Python Definition Language." That is the wrong semantic... > you're talking about an "inteface file", not a language. > > I'd seriously recommend a new acronym before you confuse people :-) > > .pyi or something. One thing to consider is that windows/dos users can't have a 4-char suffix on a file name reliably. I like .pyi as greg suggests. The shorter the suffix, the better IMO. > > >... > > The Python interpreter invokes the static interface interpreter and > > optionally the interface checker on a Python file and its associated > > PyDL file. Typically a PyDL file is associated with a Python file > > through placement in the same path with the same base name and a > > ".pydl" or ".gpydl" extension. If both are avaiable, the module'sj > > interface is created by combining the declarations in the ".pydl" and > > ".gpydl" files. > > The notion of two types of files just adds complexity. 
There is no reason > that a generated file would be *any* different in form/syntax than a > human's file. The human just gets to add funky comments, indentation, etc. > > In other words: design around a single file. Greg, are you suggesting a single file which gets generated type info appened automatically? If so, I don't see it being harmful. A simple comment header denoting the beginning of machine generated info would suffice IMO, and it would facilitate some of the problems with working with extra files... like permissions denying writes of the interface file and what not. > > >... > > Once it interprets the Python code, the interface objects are > > available to the runtime code through a special namespace called the > > "interface namespace". This namespace is interposed in the name search > > order between the module's namespace and the built-in namespace. > > Search *another* namespace? Eek! We're already seeing people avoiding the > time with things like: > > def foo(len=len): > ... > > Adding another namespace will just exacerbate the situation. > > I don't recommend adding another distinct namespace, but *IF* you are > going to do so, then I might suggest that it is only available for use > from withing a typedecl. I believe greg has a good point here. But I also think another namespace gives us a good degree of flexibility in development. Perhaps the extra namespace could also just be available at runtime iff python is invoked in such a way as to run type checking and interface interpretting (not the default). just another way to possibly minimize the extra lookup overhead. Another idea: Perhaps an additional attribute or set of attributes would achieve a similar level of modularity of the type checking system? For example, maybe each existing namespace could have an __interfaces__ attribute or __types__ or something that would contain the type information without affecting lookup time of builtins so much. Intuitively, this seems like something which could be extended in the future to work with local namespace type checking or optimization more easily. This area could use some more specification before we start anything too serious, IMO. I think modularity of this type checking is important because it seems like it will facilitate making the necessary changes as time goes on: a potential big efficiency winner in the ongoing development of type checking. > > >... > > 5. Declare un-modifiability: > > > > const [const Array( Integer )] > > > > (the semantics of un-modifiability need to be worked out) > > Wasn't the notion of "const" (successfully) argued against inclusion? Any pointers to this discussion? scott From paul@prescod.net Mon Dec 27 17:16:54 1999 From: paul@prescod.net (Paul Prescod) Date: Mon, 27 Dec 1999 12:16:54 -0500 Subject: [Types-sig] PyDL RFC 0.02 References: Message-ID: <38679F06.37B6DFDD@prescod.net> PyDL is a pun. These files behave just like IDL in COM and CORBA. They are a separate language, at least for now. Greg Stein wrote: > > > The notion of two types of files just adds complexity. There is no reason > that a generated file would be *any* different in form/syntax than a > human's file. The human just gets to add funky comments, indentation, etc. From PyDL RFC 0.03: We have two different files so that hand-crafted PyDL files will not be overwritten by generated ones. The syntax of the files is identical. > In other words: design around a single file. I did. 
The existence of two files affects the language semantics not one whit but it makes the system much more safe and arguably much more usable. > Search *another* namespace? Eek! We're already seeing people avoiding the > time with things like: > > def foo(len=len): > ... > > Adding another namespace will just exacerbate the situation. The whole point of static type checking is that the checks of the namespaces should be done at *compile time*. It isn't like a name reference from six levels of lexical nesting in a C++ file requires more time to look up than a "local" reference. That's just a temporary Python bug that we're trying to fix. > Wasn't the notion of "const" (successfully) argued against inclusion? Maybe. But I think that the situation changed when I moved from talking about "lists", "tuples" and "dictionaries" to talking about "sequences", "mappings" and "records" because we have no way of saying "read-only record." Maybe that's a big deal for version 1. Maybe it isn't. I'm open to opinions. > I don't think this is going to work as you expect. The Python interpreter > can't work with "Undefined" unless it is an object (otherwise, you're > talking about a near-impossible revamp). Therefore, Undefined is an object > and you're going to have some *real* serious issues trying to keep that > out of some kind of assignment or other usage. > > Pass it as a parameter? Shove it into a list or tuple? Check for Undefined > on every name binding? What about indexed or slice assignment? I think that there are finite number of such issues and you've listed most of them. In each of these parts of the interpreter we need to add two or three lines of C code. From a performance perspective, we will just be doing a pointer comparison and branch which is tiny compared to the rest of the interpreter overhead. Paul Prescod From paul@prescod.net Mon Dec 27 17:54:54 1999 From: paul@prescod.net (Paul Prescod) Date: Mon, 27 Dec 1999 12:54:54 -0500 Subject: [Types-sig] PyDL RFC 0.02 References: <3867472C.7ECBAC55@prescod.net> <19991227113939.B41570@chronis.pobox.com> Message-ID: <3867A7EE.5F7EB2D0@prescod.net> scott wrote: > > One thing to consider is that windows/dos users can't have a 4-char > suffix on a file name reliably. Well, DOS users....Windows 9x/NT users will have no problem. I'm not sure if I care enough about DOS to think that we should change this. > Greg, are you suggesting a single file which gets generated type info > appened automatically? If so, I don't see it being harmful. A simple > comment header denoting the beginning of machine generated info would > suffice IMO, How do you combine the hand-maintained portions of the interface file beside the generated parts? How do you prevent a hand-matained interface file from being overwritten by a generated one? > and it would facilitate some of the problems with working > with extra files... like permissions denying writes of the interface > file and what not. I don't follow you at all. We have extra files. One or two, you are going to have potential problems with permissions. That's one reason to NOT use generated interface files in some circumstances. > I believe greg has a good point here. I think I've addressed it. The Python interpreter should not be looking at each namespace in turn. I would expect that in the future we will allow an infinite number of nested namespaces without any performance penalty. > Any pointers to this discussion? I don't have any. I think we just said: "we'll figure out const later." 
There may not have been a big discussion. Paul Prescod From scott@chronis.pobox.com Mon Dec 27 18:39:38 1999 From: scott@chronis.pobox.com (scott) Date: Mon, 27 Dec 1999 13:39:38 -0500 Subject: [Types-sig] PyDL RFC 0.02 In-Reply-To: <3867A7EE.5F7EB2D0@prescod.net> References: <3867472C.7ECBAC55@prescod.net> <19991227113939.B41570@chronis.pobox.com> <3867A7EE.5F7EB2D0@prescod.net> Message-ID: <19991227133937.A42463@chronis.pobox.com> On Mon, Dec 27, 1999 at 12:54:54PM -0500, Paul Prescod wrote: > scott wrote: > > > > One thing to consider is that windows/dos users can't have a 4-char > > suffix on a file name reliably. > > Well, DOS users....Windows 9x/NT users will have no problem. I'm not > sure if I care enough about DOS to think that we should change this. Don't piss off the DOS users! That's dangerous ;) On the other hand, it does seem prudent to have a suffix that works on the Denial Of Service platform if all it takes is a shorter set of suffixes. plus, that would mean less typing and neater output to `ls'. Minor, Minor point though. > > > and it would facilitate some of the problems with working > > with extra files... like permissions denying writes of the interface > > file and what not. > > I don't follow you at all. We have extra files. One or two, you are > going to have potential problems with permissions. That's one reason to > NOT use generated interface files in some circumstances. 2 extra files with potential permissions problems leads to more combinations of problem scenarios to deal with than a single one. It's not that big a deal to me one way or the other, though. 2 files is fine. > > > I believe greg has a good point here. > > I think I've addressed it. The Python interpreter should not be looking > at each namespace in turn. I would expect that in the future we will > allow an infinite number of nested namespaces without any performance > penalty. Perhaps, but when? I haven't seen any indication that this will happen in the near future, and predicting such things in the longer run seems to be asking for problems both in the meantime and in how the long run might actually work out. IMO, the most important goal is modularity of the system, the second most important goal is clean accessibility, and the least important goal is performance. Does this seem like a reasonable set of goals for deciding where to store this info? One thing to think about with the extra run-time namespace scheme is accidentally overwriting the values of typedefs and what not. It may well be more modular to put this stuff in a special place which does not affect the regular run-time environment at all. > > > Any pointers to this discussion? > > I don't have any. I think we just said: "we'll figure out const later." > There may not have been a big discussion. Sounds good to me. scott From gstein@lyra.org Mon Dec 27 19:30:47 1999 From: gstein@lyra.org (Greg Stein) Date: Mon, 27 Dec 1999 11:30:47 -0800 (PST) Subject: [Types-sig] PyDL RFC 0.02 In-Reply-To: <19991227133937.A42463@chronis.pobox.com> Message-ID: On Mon, 27 Dec 1999, scott wrote: > On Mon, Dec 27, 1999 at 12:54:54PM -0500, Paul Prescod wrote: > > scott wrote: > > > > > > One thing to consider is that windows/dos users can't have a 4-char > > > suffix on a file name reliably. > > > > Well, DOS users....Windows 9x/NT users will have no problem. I'm not > > sure if I care enough about DOS to think that we should change this. Windows 9x people can very well have problems. The underlying filesystem is still 8.3. 
I continued to see issues with the name mapping between long and short. Mostly, it appears with certain APIs and the registry. Seriously: avoid more than .3 if possible. >... > > > I believe greg has a good point here. > > > > I think I've addressed it. The Python interpreter should not be looking > > at each namespace in turn. I would expect that in the future we will > > allow an infinite number of nested namespaces without any performance > > penalty. > > Perhaps, but when? I haven't seen any indication that this will > happen in the near future, and predicting such things in the longer > run seems to be asking for problems both in the meantime and in how > the long run might actually work out. Yah. What he said. "in the future" is a *long* ways off when there hasn't been any real discussion on if/how to deal with the multiple namespace issue. Relying on a solution to appear is asking for trouble (IMO). There are also a number of auxilliary things that would need to occur and changes to programs to realize that more namespaces exist in the standard lookup (a true partition of purpose would avoid this). However: I'm still against adding a whole new namespace. I haven't seen a good argument for why it is needed. Can somebody come up with a concise rationale? >... > > > Any pointers to this discussion? > > > > I don't have any. I think we just said: "we'll figure out const later." > > There may not have been a big discussion. > > Sounds good to me. Hrm. I'll try to dig it up. I thought I remembered somebody saying "and is why const isn't really needed." Happy Holidays, -g -- Greg Stein, http://www.lyra.org/ From scott@chronis.pobox.com Mon Dec 27 19:37:31 1999 From: scott@chronis.pobox.com (scott) Date: Mon, 27 Dec 1999 14:37:31 -0500 Subject: [Types-sig] PyDL RFC 0.02 In-Reply-To: References: <19991227133937.A42463@chronis.pobox.com> Message-ID: <19991227143731.A43112@chronis.pobox.com> On Mon, Dec 27, 1999 at 11:30:47AM -0800, Greg Stein wrote: > On Mon, 27 Dec 1999, scott wrote: > > > On Mon, Dec 27, 1999 at 12:54:54PM -0500, Paul Prescod wrote: > > > scott wrote: > > > > > > However: I'm still against adding a whole new namespace. I haven't seen a > good argument for why it is needed. Can somebody come up with a concise > rationale? > In my understanding of it, a separate namespace is needed for the generation of compile-time checking, simply because compile time checking can't know everything that happens in the run-time namespace. In other words, the static-type interpreter in the RFC needs it's own way of dealing with variable names. This perspective, however, is 100% independent of the idea of a separate namespace at run time. I don't see a need for a separate run time namespace at all, only for a modular, cleanly accessible way of accessing type information at run time. scott From gstein@lyra.org Mon Dec 27 19:41:20 1999 From: gstein@lyra.org (Greg Stein) Date: Mon, 27 Dec 1999 11:41:20 -0800 (PST) Subject: [Types-sig] PyDL RFC 0.02 In-Reply-To: <19991227113939.B41570@chronis.pobox.com> Message-ID: On Mon, 27 Dec 1999, scott wrote: > On Mon, Dec 27, 1999 at 04:55:08AM -0800, Greg Stein wrote: >... > > > The Python interpreter invokes the static interface interpreter and > > > optionally the interface checker on a Python file and its associated > > > PyDL file. Typically a PyDL file is associated with a Python file > > > through placement in the same path with the same base name and a > > > ".pydl" or ".gpydl" extension. 
If both are avaiable, the module'sj > > > interface is created by combining the declarations in the ".pydl" and > > > ".gpydl" files. > > > > The notion of two types of files just adds complexity. There is no reason > > that a generated file would be *any* different in form/syntax than a > > human's file. The human just gets to add funky comments, indentation, etc. > > > > In other words: design around a single file. > > Greg, are you suggesting a single file which gets generated type info > appened automatically? Nope. It sounded like Paul was suggesting different formats, suffixes, and purpose. I don't think we should go that route. It would seem best to have a .pyi file that a human can craft and maintain. It would be quite easy to have the type-check mode warn the user that they haven't declared some interface or something (so they can go and add it in). Heck, maybe the user did that on purpose, because the class isn't public. It would also be quite possible to invoke the type-checker with a mode that says "generate a .pyi file for me." The user can then edit the thing as needed. I also think that we'd want to avoid "combining the declarations" of two files. Again, the user may not want the second group of declarations. And the combination rules might be a bit hard to describe or handle (from the human's standpoint). Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Mon Dec 27 20:06:57 1999 From: gstein@lyra.org (Greg Stein) Date: Mon, 27 Dec 1999 12:06:57 -0800 (PST) Subject: [Types-sig] PyDL RFC 0.02 In-Reply-To: <38679F06.37B6DFDD@prescod.net> Message-ID: On Mon, 27 Dec 1999, Paul Prescod wrote: > PyDL is a pun. These files behave just like IDL in COM and CORBA. They > are a separate language, at least for now. Let's name it for its purpose, not a pun. Please. I know where the derivation came from, and the file is an interface file. The file is not a "python definition language." If it uses a separate language, then that still doesn't mean we're talking about something other than an interface file. > Greg Stein wrote: > > The notion of two types of files just adds complexity. There is no reason > > that a generated file would be *any* different in form/syntax than a > > human's file. The human just gets to add funky comments, indentation, etc. > > >From PyDL RFC 0.03: > > We have two different files so that hand-crafted PyDL files will not be > overwritten by generated ones. The syntax of the files is identical. In my reply to Scott: I think we should be choosing one or the other rather than taking two files and combining them. > > In other words: design around a single file. > > I did. The existence of two files affects the language semantics not one > whit but it makes the system much more safe and arguably much more > usable. Two more files for each module doesn't make it more usable. Having zero extra files (and inline declarations) is more usable/maintainable. How does it make it more safe? > > Search *another* namespace? Eek! We're already seeing people avoiding the > > time with things like: > > > > def foo(len=len): > > ... > > > > Adding another namespace will just exacerbate the situation. > > The whole point of static type checking is that the checks of the > namespaces should be done at *compile time*. Duh. > It isn't like a name > reference from six levels of lexical nesting in a C++ file requires more > time to look up than a "local" reference. That's just a temporary Python > bug that we're trying to fix. 
Per my other email: we should not be relying on vapor to solve our problems. We should avoid exacerbating the situation. Regardless, I haven't seen a good rationale for needing a new namespace yet. There has been conversation where people have thrown out "well, just move it into a separate namespace," but I haven't seen a clear/cogent description of the real need. I think these names should follow standard Python rules. It goes into the global, local, or class namespace depending upon the context. > > Wasn't the notion of "const" (successfully) argued against inclusion? > > Maybe. But I think that the situation changed when I moved from talking > about "lists", "tuples" and "dictionaries" to talking about "sequences", > "mappings" and "records" because we have no way of saying "read-only > record." Maybe that's a big deal for version 1. Maybe it isn't. I'm open > to opinions. Record? How is that different from a sequence? That is new terminology, and it seems it is just a bare cover for saying "tuple." Why don't we stick to "tuple" instead of introducing a new term. > > I don't think this is going to work as you expect. The Python interpreter > > can't work with "Undefined" unless it is an object (otherwise, you're > > talking about a near-impossible revamp). Therefore, Undefined is an object > > and you're going to have some *real* serious issues trying to keep that > > out of some kind of assignment or other usage. > > > > Pass it as a parameter? Shove it into a list or tuple? Check for Undefined > > on every name binding? What about indexed or slice assignment? > > I think that there are finite number of such issues and you've listed > most of them. In each of these parts of the interpreter we need to add > two or three lines of C code. From a performance perspective, we will > just be doing a pointer comparison and branch which is tiny compared to > the rest of the interpreter overhead. It is a lot more prevalent than what I just listed. I was just spouting off some examples Some more: tuple unpacking, in an "is" expression, in an "==" expression, or passed to the __cmp__() instance method. I bet that I could come up with more. That latter one should throw a nice screw into the machine. Next up: you suggested pre-assigning all names to the Undefined object. Now dir(some_instance) produces an incorrect list of valid names. Or some_instance.__dict__.items() (uh oh! how does the items() return a two-tuple with Undefined in there?!). But don't just look at class instances, we have the same issue at the global level. Right here in Lib/symbol.py, I see a globals().items() on line 73. And what does __builtins__['Undefined'] return? If I'm trying to establish a restricted mode of execution, how do I insert Undefined into the constructed builtins dictionary? When I'm writing a C extension, do I have to check for the Undefined object now? What happens if I see one? Raise an error? Can I return one? I would venture that the Undefined concept would require a pretty fundamental change to Python's (internal) object model. Given more thought, I might find additional issues, and that worries me. 
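To make that concrete, here is a tiny sketch of the leakage, using a plain Python sentinel to stand in for the proposed Undefined (purely hypothetical; there is no interpreter support behind any of this):

    class _UndefinedType:
        # stand-in for the proposed Undefined singleton
        def __repr__(self):
            return "Undefined"

    Undefined = _UndefinedType()

    class Tree:
        def __init__(self):
            # per the RFC, declared-but-unassigned attributes would start
            # out as Undefined rather than simply not existing
            self.left = Undefined
            self.right = Undefined

    t = Tree()
    print t.__dict__.items()           # Undefined escapes through items()
    print t.__dict__['left'] == None   # __cmp__ now sees Undefined
    for name, value in t.__dict__.items():
        pass                           # ...and tuple unpacking just bound it to a name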
Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Mon Dec 27 20:14:41 1999 From: gstein@lyra.org (Greg Stein) Date: Mon, 27 Dec 1999 12:14:41 -0800 (PST) Subject: [Types-sig] PyDL RFC 0.02 In-Reply-To: <19991227143731.A43112@chronis.pobox.com> Message-ID: On Mon, 27 Dec 1999, scott wrote: > On Mon, Dec 27, 1999 at 11:30:47AM -0800, Greg Stein wrote: > > However: I'm still against adding a whole new namespace. I haven't seen a > > good argument for why it is needed. Can somebody come up with a concise > > rationale? > > In my understanding of it, a separate namespace is needed for the > generation of compile-time checking, simply because compile time > checking can't know everything that happens in the run-time namespace. > In other words, the static-type interpreter in the RFC needs it's own > way of dealing with variable names. > > This perspective, however, is 100% independent of the idea of a > separate namespace at run time. I don't see a need for a separate run > time namespace at all, only for a modular, cleanly accessible way of > accessing type information at run time. Right -- a compile-time "namespace". But really: that is just an abbreviated form of the runtime namespaces rather than a separate compile-time namespace (so "... needed for the generation of compile-time checking, ..." doesn't hold). Regardless of how the compile-time namespace is viewed, Paul was suggesting a new runtime namespace in the RFC. Note: the compile-time checking *does* need to know everything that happens in the run-time namespaces. It must check the assignments and usage of values in the namespaces. Happy Holidays, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Mon Dec 27 20:34:18 1999 From: gstein@lyra.org (Greg Stein) Date: Mon, 27 Dec 1999 12:34:18 -0800 (PST) Subject: [Types-sig] const (was: PyDL RFC 0.02) In-Reply-To: <3867A7EE.5F7EB2D0@prescod.net> Message-ID: On Mon, 27 Dec 1999, Paul Prescod wrote: > scott wrote: >... > > Any pointers to this discussion? > > I don't have any. I think we just said: "we'll figure out const later." > There may not have been a big discussion. Paul's right, and I'm senile :-) The only discussion of "const" that I found is in Paul's own email at: http://www.python.org/pipermail/types-sig/1999-December/000599.html I must be thinking of another concept that was raised and subsequently dismissed... Cheers, -g p.s. I'd recommend assignment enforcement over the notion of const; the former seems to be more easily enforcable at runtime. -- Greg Stein, http://www.lyra.org/ From billtut@microsoft.com Mon Dec 27 20:54:06 1999 From: billtut@microsoft.com (Bill Tutt) Date: Mon, 27 Dec 1999 12:54:06 -0800 Subject: [Types-sig] Viper module compiler begun Message-ID: <4D0A23B3F74DD111ACCD00805F31D8101D8BCB8E@RED-MSG-50> > -----Original Message----- > From: skaller [mailto:skaller@maxtal.com.au] > > Greg Stein wrote: > > > skaller wrote: > > > > > > Last time I tried that, it crashed unceremoniously. > > > Has that been fixed? > > > > Not much of a bug report. Get serious. How the heck should > I know whether > > that particular bug has been fixed? > > > > "oh. it broke. fix it." *snort* > > > > As far as I know, P2C can successfully convert *any* module > into a Python > > extension model. > > Here's what I get with the latest version: > Did I do something wrong? > > [root@ruby] ~/py2c>python gencode.py gencode.py __gencode.c > _gencode.py > Traceback (innermost last): > File "gencode.py", line 35, in ? 
> genc.Generator(args[0], args[1], args[2]) > File "genc.py", line 91, in __init__ > tree = t.parsefile(input) > File "transformer.py", line 176, in parsefile > return self.parsesuite(file.read()) > File "transformer.py", line 166, in parsesuite > return self.transform(parser.suite(text)) > parser.ParserError: Could not parse string. Well, that indeed is a strange incident given that Python's internal parser liked thef ile and then subsequently didn't like the file. :) I've gone and stuck the current contents of CVS at: http://lima.mudlib.org/~rassilon/p2c/p2c-cvs.zip This should produce slightly happier output, the compiled transformer.py actually works and shaves one whole second off the execution time of translating genc.py. I haven't yet been able to test genc.py's compiled C code completly since MSVC has this annoying habit of stop emitting line #s in its debug info after the 64kth line. (Ugh.) Bill From billtut@microsoft.com Mon Dec 27 22:56:12 1999 From: billtut@microsoft.com (Bill Tutt) Date: Mon, 27 Dec 1999 14:56:12 -0800 Subject: [Types-sig] Viper module compiler begun Message-ID: <4D0A23B3F74DD111ACCD00805F31D8101D8BCB90@RED-MSG-50> > -----Original Message----- > From: Bill Tutt [mailto:billtut@microsoft.com] > I've gone and stuck the current contents of CVS at: > http://lima.mudlib.org/~rassilon/p2c/p2c-cvs.zip > Err.. skaller just reminded me that to use this particular release you need pyclbr.py from the python CVS repository. http://www.python.org/download/cvs.html Bill From scott@chronis.pobox.com Mon Dec 27 22:59:55 1999 From: scott@chronis.pobox.com (scott) Date: Mon, 27 Dec 1999 17:59:55 -0500 Subject: [Types-sig] PyDL RFC 0.02 In-Reply-To: References: <19991227143731.A43112@chronis.pobox.com> Message-ID: <19991227175955.B44344@chronis.pobox.com> On Mon, Dec 27, 1999 at 12:14:41PM -0800, Greg Stein wrote: > On Mon, 27 Dec 1999, scott wrote: > > On Mon, Dec 27, 1999 at 11:30:47AM -0800, Greg Stein wrote: > > > However: I'm still against adding a whole new namespace. I haven't seen a > > > good argument for why it is needed. Can somebody come up with a concise > > > rationale? > > > > In my understanding of it, a separate namespace is needed for the > > generation of compile-time checking, simply because compile time > > checking can't know everything that happens in the run-time namespace. > > In other words, the static-type interpreter in the RFC needs it's own > > way of dealing with variable names. > > > > This perspective, however, is 100% independent of the idea of a > > separate namespace at run time. I don't see a need for a separate run > > time namespace at all, only for a modular, cleanly accessible way of > > accessing type information at run time. > > Right -- a compile-time "namespace". But really: that is just an > abbreviated form of the runtime namespaces rather than a separate > compile-time namespace (so "... needed for the generation of compile-time > checking, ..." doesn't hold). We already had a big discussion about this that was never resolved. > > Regardless of how the compile-time namespace is viewed, Paul was > suggesting a new runtime namespace in the RFC. Yes. > > Note: the compile-time checking *does* need to know everything that > happens in the run-time namespaces. It must check the assignments and > usage of values in the namespaces. I don't see how compile-time checking can know much of anything about runtime-specific namespaces without running code. If it runs code, it is no longer compile-time checking. 
Furthermore, if the compile-time checker assumes that the running of code can do anything it can today, there's not much of anything that can be checked at compile time to begin with. This is why it seems to me that checks done at compile-time must be done based on a compile-time specific model of the namespaces, and that model must be more restrictive in naming and scoping usage than python currently is. Example restrictions that seem to help meet this end are: don't delete typed variables, don't use different types for variables at different times, unless that variable is pre-set as a union of both types, etc. scott From skaller@maxtal.com.au Mon Dec 27 23:37:52 1999 From: skaller@maxtal.com.au (skaller) Date: Tue, 28 Dec 1999 10:37:52 +1100 Subject: [Types-sig] PyDL RFC 0.02 References: <3867472C.7ECBAC55@prescod.net> Message-ID: <3867F850.14D60FCF@maxtal.com.au> Paul Prescod wrote: > > PyDL RFC 0.02 > > A PyDL file declares the interface for a Python module. PyDL files > declare interfaces, objects and the required interfaces of objects. Please stick to a syntax which can _also_ be embedded in .py files. In this case, an interface file is ordinary Python, except that it consists only of compile time directives. If I understand you correctly, interface files are used to provide module interfaces: there is no other sensible way to do that at present, since .py files define modules. IF there were a way to create modules like: module X: # stuff that normally goes in a module file in python, then there would be a corresponding interface module X: # stuff that normally goes in a module interface file In other words, interface files should be regarded as an _artefact_ of the existing 'lack of syntax for defining a module'. [Which Viper may correct :-] On this basis, some comments: > Interface definitions are similar to Python class definitions. They > use the keyword "interface" instead of the keyword "class". > > Sometimes an interface can be specialized for working with specific > other interfaces. For instance a list could be specialized for working > with integers. No. I think you have to make up your mind here. You must choose. Either 'List' is an interface, or, it is an interface generator, it cannot be both. [In your terminology, you can't use a parameterised interface where a fully resolved one is required; so List cannot be both partly unresolved and also fully resolved] > In addition to defining interfaces, it is possible to declare other > attributes of the module. Each declaration associates an interface > with the name of the attribute. Values associated with the name in the > module namespace must never violate the declaration. Furthermore, by > the time the module has been imported each name must have an > associated value. OK. This is the crux of the semantics: you are applying interfaces to names, rather than values/objects. > The interface interpreter reads the PyDL file and builds the > relevant interface objects. Furthermore, the Python compiler will do it too; that is, it will process embedded interface specifications. >If the PyDL file refers to other modules > then the interface interpreter can read the PyDL files associated > with those other modules. Yeah, but you would do well to get out of the habit of saying 'can' and 'may'. Use the word 'shall'. Meaning, that the damn thing is REQUIRED to do something :-) Dont give permission. Specify requirements. > The interface interpreter maintains its own > module dictionary so that it does not import the same module twice. 
That's better, but should be marked as 'commentary', since it has no semantic implications. > Interface expression language: > ============================== > > Interface expressions are used to declare that attributes must conform > to certain interfaces. In a interface expression you may: Do NOT say 'may'. Do not refer to 'you', the programmer, we're not interested in what the programmer does, we're interested in what the interface compiler does. And it SHALL interpret certain grammatical constructions in a particular way, no 'may' about it. -- Point 0: Paul, list the predefined names like Integer, or whatever. Say if they are keywords or plain identifiers. Use a grammar production like: basic_if_name ::= "Integer" | "Float" > 1. refer to a "dotted name" (local name or name in the PyDL of an > imported module ). This doesn't make any sense to me. > 2. make a union of two or more interfaces: > integer or float or complex Give the grammar. EG: if_alt ::= if_name "or" if_alt | if_name > 3. parameterize a interface: > > Array( Integer, 50 ) > Array( length=50, elements=Integer ) grammar? > Note that the arguments can be either interfaces or simple Python > expressions. A "simple" Python expression is an expression that does > not involve a function call. No. See above. List(Int) already involves a 'function call'. > 4. use a syntactic shortcut: > > [Foo] => Sequence( Foo ) # sequence of Foo's > {A:B} => Mapping( A, B ) # Mapping from A's to B's > (A,B,C) => Record( A, B, C ) # 3-element sequence of interface a, > followed > # by b followed by c Forget this, for the moment. Add syntact sugar later, when the core grammar and semantics are more settled. > 5. Declare un-modifiability: > > const [const Array( Integer )] > > (the semantics of un-modifiability need to be worked out) Again, forget it, for the moment. This one can be real nasty. > Declarations in a PyDL file: > ============================ > > (formal grammar to follow) > > 1. Imports > > An import statement in an interface file loads another interface file. > The import statement works just like Python's except that it loads the > PyDL file found with the referenced module, not the module itself. (of > course we will make this definition more formal in the future) No. Use a distinct keyword like 'include'. There is a good reason for this: consider embedded declarations. Then it is a) impossible to load an interface but not the module b) impossible to load a module, but not the interface A separate keyword resolves the ambiguity when embedded: import X # load the module include X # load the interface Note that importing a module implicitly loads the interface anyhow. However, it will do so in an appropriate namespace. It is necessary to load interfaces even when modules are not imported (by the client module). There are other ways to get at stuff from a module than import it. For example, a function call f() can return an object whose class is defined in a module X the calling module has not imported: we may want to type check the returned object, which requires importing the module X's interface -- without importing the module X itself. > 2. Basic attribute interface declarations: > > decl myint as Integer # basic > decl intarr as Array( Integer, 50 ) # parameterized > decl intarr2 as Array( size = 40, elements = Integer ) # using keyword > syntax > > Attribute declarations are not parameteriable. Furthermore, they must > resolve to fully parameterized (not parameterizable!) interfaces. grammar. 
Again, distinguish interfaces from interface generators, and the above ambiguity in the wording disappears. > 3. Callable object interface declarations: > > Functions are the most common sort of callable object but class > instances can also be callable. Callables may be runtime parameterized > and/or interface parameterized. For instance, there might be a method > "add" that takes two numbers of the same interface and returns a number > of > that interface. > > decl Add(_X: Number) as def( a: const _X, b: const _X )-> _X > > _X is the interface parameter. a and b are the runtime parameters. I am already using ! not : here, following Greg Stein. There are enough ":"'s in python already :-) OTOH, I am also using 'as' in another context :-( > 4. Class Declarations > > A class is a callable object that can be subclassed. Currently the > only way to make those (short of magic) is with a class declaration, They can be created in extension modules. > Here is the syntax for a class definition: > > decl TreeNode(_X: Number) as > class( a: _X, Right: TreeNode( _X ) or None, > Left: TreeNode( _X ) or None ) > -> ParentClasses, Interfaces Hmmm. Confusing me. Especially the newline after 'as'. Python requires brackets, or a colon, to start a newline, you can't do it in the middle of a statement. > What we are really defining is the constructor. The signature of the > created object can be described in an interface declaration. Not good enough. The semantics of class instance attributes would be 'when you assign to this attribute, it had better have this type'. This doesn't mean that you can be sure an access gives that type, the attribute might not exist. This defeats optimisation. You'd need to say something like: AFTER the constructor has run, and BEFORE the destructor has run the attribute exists and has the designated type. Enforcing that might be tricky :-) > 5. Interface declarations: > > interface (_X,_Y) spam( a, b ): > decl somemember as _X > decl someOtherMember as _Y > decl const someClassAttr as [ _X ] > > decl const someFunction as def( a: Integer, b: Float ) -> String Semantics? > The Undefined Object: > ===================== > > The Undefined object is used as the value of unassigned attributes and > the return value of functions that do not return a value. No. All Python functions return a value. If one is not returned explicitly, None is returned implicitly. People check for that: for example: def f(x): if x in [1,2,3]: return x if f(99): print 'Got it' else: print 'Not 1,2 or 3' Your spec would break this code. You can argue that your spec is a better spec -- but it isn't Python compatible. FYI: In Viper, uninitialised, statically declared variables are initialised with the special object PyInitial. Another special object, PyTerminal, also exists. These objects are useful in the internal workings of the implementation, for bounding things (i.e. as sentinels). For example, it makes calculating max( .... ) much easier. [PyInitial is less than all other objects] > Undefined also corrects a long-term unsafe issue with functions. Now, > functions that do not explicitly return a value return Undefined > instead of None. No. That would break compatibility. > Experimental syntax: > ==================== > > There is a backwards compatible syntax for embedding declarations in a > Python 1.5x file: > > "decl","myint as Integer" > "typedef","PositiveInteger as BoundedInt( 0, maxint )" Nice. 
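For what it's worth, as I read it each embedded declaration is just a pair of string literals at statement level -- a tuple that 1.5.x evaluates and throws away -- so an annotated module might look roughly like this (the declarations are lifted from the examples quoted above; none of this is settled):

    # spam.py -- runs unchanged under Python 1.5.x; a PyDL-aware tool
    # could harvest the string pairs below into a generated interface file.

    "typedef","PositiveInteger as BoundedInt( 0, maxint )"
    "decl","myint as Integer"
    "decl","intarr as Array( Integer, 50 )"

    myint = 0
    intarr = [0] * 50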
> Summary of Major Runtime Implications: > ===================== > > All of the named interfaces defined in a PyDL file are available in the > "interfaces" dictionary that is searched between the module dictionary > and > the built-in dictionary. I _think_ you mean that the interface dictionary is 'per module'? And you can refer to an interface in another module with other.interfx notation? > The runtime should not allow an assignment or function call to violate > the declarations in the PyDL file. In an "optimized speed mode" those > checks would be disabled. I think you have to think very carefully about what constitues an error here: see my posts about errors in python. It is not acceptable to specify that an exception be thrown. That would NOT permit an optimiser to elide checks, except when it could prove they were not needed. Much better, you deem a violating program is not valid, and then the language processor can do whatever it wants: it may raise an exception, or it may core dump, or it may reject the program early. -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From paul@prescod.net Mon Dec 27 23:49:01 1999 From: paul@prescod.net (Paul Prescod) Date: Mon, 27 Dec 1999 18:49:01 -0500 Subject: [Types-sig] PyDL RFC 0.02 References: Message-ID: <3867FAED.BAEC3B6@prescod.net> Greg Stein wrote: > > Windows 9x people can very well have problems. The underlying filesystem > is still 8.3. I continued to see issues with the name mapping between long > and short. Mostly, it appears with certain APIs and the registry. > > Seriously: avoid more than .3 if possible. Okay, .pyi will be the extension but I won't give up on the pun as the formal name for the language without more teeth pulling (and I've just had my wisdom's removed so my tolerance level is high). > "in the future" is a *long* ways off when there hasn't been any real > discussion on if/how to deal with the multiple namespace issue. Relying on > a solution to appear is asking for trouble (IMO). It seems to me that the simplest solution is to move the "types" namespace BEHIND the __builtin__ namespace. > However: I'm still against adding a whole new namespace. I haven't seen a > good argument for why it is needed. Can somebody come up with a concise > rationale? Well there are a few issues and I admit to having not thought all of them through completely yet: * importing modules are supposed to only see exported attributes. For instance dir() should only show exported attributes. * the two namespace arrangement is similar to the way that a class' namespace is segmented from that of instances. * Types are independent objects but variable declarations need to be somehow unified with the declared objects. * But we also need an API to query type information associated with a name (instead of the value bound to the name) * Type expressions can make forward references. So when they are embedded in Python code we still won't think of them as ordinary assignments. I have not put a lot of thought into this part of the system and am open to suggestions of how to get all of this to work. 
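To make the alternatives concrete, here is a toy model of the two search orders being discussed (pure illustration -- it ignores local scopes and is not a description of how the interpreter actually resolves names):

    def lookup_rfc(name, module_ns, interfaces_ns, builtin_ns):
        # RFC 0.02 order: module, then interfaces, then built-ins
        for ns in (module_ns, interfaces_ns, builtin_ns):
            if ns.has_key(name):
                return ns[name]
        raise NameError, name

    def lookup_behind_builtins(name, module_ns, interfaces_ns, builtin_ns):
        # the "move it behind __builtin__" idea: ordinary lookups that stop
        # in the module or built-in dictionaries never touch the interfaces
        # dictionary at all
        for ns in (module_ns, builtin_ns, interfaces_ns):
            if ns.has_key(name):
                return ns[name]
        raise NameError, name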
Paul Prescod From skaller@maxtal.com.au Tue Dec 28 00:14:18 1999 From: skaller@maxtal.com.au (skaller) Date: Tue, 28 Dec 1999 11:14:18 +1100 Subject: [Types-sig] PyDL RFC 0.02 References: <19991227143731.A43112@chronis.pobox.com> <19991227175955.B44344@chronis.pobox.com> Message-ID: <386800DA.7DBE1B7D@maxtal.com.au> scott wrote: > This is why it seems to me that checks done at compile-time must be > done based on a compile-time specific model of the namespaces, and > that model must be more restrictive in naming and scoping usage than > python currently is. No: the way I see it, the 'optional type checking' when added to the python language makes it more expressive, not more restrictive -- even though constraints on declared names are part of the extension, it is a genuine extension. > Example restrictions that seem to help meet this > end are: don't delete typed variables, don't use different types for > variables at different times, unless that variable is pre-set as a > union of both types, etc. This is correct. There must be a set of 'text files' which are not valid python programs. As you say, assigning the wrong type to a statically typed variable would render the program 'not python': but this situation cannot occur in python 1.5, because there are no static type declarations. What this means is that there are some files which technically have incompatible semantics to Python 1.5, technically, running these files currently requires a SyntaxError to be raised. Under the modified semantics, there are two cases: the program runs 'more or less as if the declarations were not there', which will happen if the execution of the program obeys the declared type constraints, or, 'the file is not valid python', which means we don't care what happens. In the latter case, a run time exception would be useful from a non-optimising interpreter, and a compile time diagnostic from a type-checking translator, but in those cases where, in particular, the type checker cannot ensure the constraints are met, an optimiser is entitled to ASSUME that they're met, optimise accordingly, and core dump if, in fact, they're not. It is ALSO possible to _require_ a diagnostic in some cases of an invalid program. But much care is need specifying them, to make sure it is possible to for _all_ language translators to detect these cases. [And usually, what happens after that is undefined anyhow] -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From paul@prescod.net Tue Dec 28 08:38:36 1999 From: paul@prescod.net (Paul Prescod) Date: Tue, 28 Dec 1999 03:38:36 -0500 Subject: [Types-sig] const (was: PyDL RFC 0.02) References: Message-ID: <3868770C.7B427BC@prescod.net> Greg Stein wrote: > > p.s. I'd recommend assignment enforcement over the notion of const; the > former seems to be more easily enforcable at runtime. I think we need both. We need to be able to enforce the TYPES of assignments and we need to sometimes say that an object is not modifiable, for all of the things we currently use tuples, files open for read and other read-only objects for. Paul Prescod From paul@prescod.net Tue Dec 28 08:38:42 1999 From: paul@prescod.net (Paul Prescod) Date: Tue, 28 Dec 1999 03:38:42 -0500 Subject: [Types-sig] PyDL RFC 0.02 References: Message-ID: <38687712.1F7E3714@prescod.net> Greg Stein wrote: > > Nope. It sounded like Paul was suggesting different formats, suffixes, and > purpose. I don't think we should go that route. 
One format. One purpose. Two suffixes. Two maintenance strategies. > It would seem best to have a .pyi file that a human can craft and > maintain. It would be quite easy to have the type-check mode warn the user > that they haven't declared some interface or something (so they can go > and add it in). Heck, maybe the > user did that on purpose, because the class isn't public. It would also be > quite possible to invoke the type-checker with a mode that says "generate > a .pyi file for me." The user can then edit the thing as needed. But the whole point is that we don't want to be forced to maintain the thing in a separate file. If you want to put some or all of the declarations in your source file then we need a place to extract those to. I could have just banished in-file declarations but it seemed that we could easily extract them so why not allow the convenience? > I also think that we'd want to avoid "combining the declarations" of two > files. Again, the user may not want the second group of declarations. Then they shouldn't put declarations in their Python file. > And > the combination rules might be a bit hard to describe or handle (from the > human's standpoint). It's just concatenation! There is nothing hard about it. Paul Prescod From gstein@lyra.org Tue Dec 28 09:08:29 1999 From: gstein@lyra.org (Greg Stein) Date: Tue, 28 Dec 1999 01:08:29 -0800 (PST) Subject: [Types-sig] const (was: PyDL RFC 0.02) In-Reply-To: <3868770C.7B427BC@prescod.net> Message-ID: On Tue, 28 Dec 1999, Paul Prescod wrote: > Greg Stein wrote: > > p.s. I'd recommend assignment enforcement over the notion of const; the > > former seems to be more easily enforcable at runtime. > > I think we need both. We need to be able to enforce the TYPES of > assignments and we need to sometimes say that an object is not > modifiable, for all of the things we currently use tuples, files open > for read and other read-only objects for. Um... Are you suggesting that we add a readonly flag to the list and dict types? Short of that, I'm not sure how you would do "const". IMO, adding a readonly flag to those types seems wrong. Cheers, -g -- Greg Stein, http://www.lyra.org/ From paul@prescod.net Tue Dec 28 09:14:30 1999 From: paul@prescod.net (Paul Prescod) Date: Tue, 28 Dec 1999 04:14:30 -0500 Subject: [Types-sig] Interface files References: <3867472C.7ECBAC55@prescod.net> <3867F850.14D60FCF@maxtal.com.au> Message-ID: <38687F76.47B415C7@prescod.net> skaller wrote: > > In other words, interface files should be regarded as an _artefact_ > of the existing 'lack of syntax for defining a module'. > [Which Viper may correct :-] If the normative spec. is in terms of interface files then we can deal with various situations through transformation TO interface files: * C modules * "read-only" Python modules (like library modules that you don't want to change) * modules (in any language) already defined by IDL * Python modules with embedded declarations * Python modules without embedded declarations that "use" non-conservative type inferencing Paul Prescod From gstein@lyra.org Tue Dec 28 09:22:26 1999 From: gstein@lyra.org (Greg Stein) Date: Tue, 28 Dec 1999 01:22:26 -0800 (PST) Subject: [Types-sig] PyDL RFC 0.02 In-Reply-To: <38687712.1F7E3714@prescod.net> Message-ID: On Tue, 28 Dec 1999, Paul Prescod wrote: > Greg Stein wrote: > > Nope. It sounded like Paul was suggesting different formats, suffixes, and > > purpose. I don't think we should go that route. > > One format. One purpose. Two suffixes. Two maintenance strategies. Fine. 
It just didn't sound like that in your proposal, so I was concerned. > > It would seem best to have a .pyi file that a human can craft and > > maintain. It would be quite easy to have the type-check mode warn the user > > that they haven't declared some interface or something (so they can go > > and add it in). Heck, maybe the > > user did that on purpose, because the class isn't public. It would also be > > quite possible to invoke the type-checker with a mode that says "generate > > a .pyi file for me." The user can then edit the thing as needed. > > But the whole point is that we don't want to be forced to maintain the > thing in a separate file. I totally agree here. > If you want to put some or all of the > declarations in your source file then we need a place to extract those > to. While true, I could just as easily argue that they should be stored as pickles in a central database. It might be nice to start very simple: there is one file that we look for (a .pyi). Whether that was hand-created or computer-created, we just don't care. The file would be used for accessing a module's interface without needing to actually load the module. In a type-check mode, it can be verified against declarations (if any) in the source module. > I could have just banished in-file declarations but it seemed that > we could easily extract them so why not allow the convenience? Euh... How could you have "just banished in-file declarations" ?? > > I also think that we'd want to avoid "combining the declarations" of two > > files. Again, the user may not want the second group of declarations. > > Then they shouldn't put declarations in their Python file. I do not believe this is a valid position. Specifically: I would put all the declarations inline. If I create a .pyi, it would simply be as an extract from the inline declarations *or* to create a public subset of the items in the source file. Your scheme would mean that I couldn't use the type stuff internally -- it would be exposed through the automagic generated portion. > > And > > the combination rules might be a bit hard to describe or handle (from the > > human's standpoint). > > It's just concatenation! There is nothing hard about it. It doesn't seem to be simple concatenation. How are conflicts handled? How are merges done? e.g. a method is not declared in the interface file, but it does have a declaration in the source. Cheers, -g -- Greg Stein, http://www.lyra.org/ From paul@prescod.net Tue Dec 28 10:42:48 1999 From: paul@prescod.net (Paul Prescod) Date: Tue, 28 Dec 1999 05:42:48 -0500 Subject: [Types-sig] RFC Comments Message-ID: <38689428.4953374D@prescod.net> > No. I think you have to make up your mind here. > You must choose. Either 'List' is an interface, > or, it is an interface generator, it cannot be both. > [In your terminology, you can't use a parameterised interface > where a fully resolved one is required; so List cannot > be both partly unresolved and also fully resolved] Okay, I will use your terminology. > Yeah, but you would do well to get out of the habit > of saying 'can' and 'may'. Use the word 'shall'. Meaning, > that the damn thing is REQUIRED to do something :-) > Dont give permission. Specify requirements. I expect to rewrite the specification from scratch (with grammar) before I am done. Consider this version a prototype. Once we have the design down I will generate the normative spec. > Point 0: Paul, list the predefined names like Integer, > or whatever. Say if they are keywords or plain identifiers. 
I've been putting this off because there are some tricky issues around file objects. > > Note that the arguments can be either interfaces or simple Python > > expressions. A "simple" Python expression is an expression that does > > not involve a function call. > > No. See above. List(Int) already involves a 'function call'. List is (according to your terminology) an interface generator, not a function. > > 5. Declare un-modifiability: > > > > const [const Array( Integer )] > > > > (the semantics of un-modifiability need to be worked out) > > Again, forget it, for the moment. Isn't that what I did? :) > This one can be real nasty. Agreed. > No. Use a distinct keyword like 'include'. > There is a good reason for this: consider embedded declarations. > Then it is > > a) impossible to load an interface but not the module > b) impossible to load a module, but not the interface > > A separate keyword resolves the ambiguity when embedded: > > import X # load the module > include X # load the interface > > Note that importing a module implicitly loads the interface anyhow. > However, it will do so in an appropriate namespace. I don't understand your model of namespaces and inclusions. I don't understand mine either so don't feel bad. > It is necessary to load interfaces even when modules > are not imported (by the client module). There are other > ways to get at stuff from a module than import it. > For example, a function call f() can return an object whose > class is defined in a module X the calling module has > not imported: we may want to type check the returned > object, which requires importing the module X's interface > -- without importing the module X itself. We can have an API like: load_interface("foo") I don't think that the needs of a very specific tool like a static type checker should drive syntax to that extent. The other 99% of code will never do an "include" and the keyword will be wasted. > I am already using ! not : here, following Greg Stein. I'm going to presume that that isn't a backwards-compatibility argument. :) > There are enough ":"'s in python already :-) Debatable. I would also be amenable to "as", "is" or "isa". "!" means not to me. > > What we are really defining is the constructor. The signature of the > > created object can be described in an interface declaration. > > Not good enough. The semantics of class instance > attributes would be 'when you assign to this attribute, > it had better have this type'. This doesn't mean that > you can be sure an access gives that type, > the attribute might not exist. This defeats optimisation. The attribute will either have the type or something like "undefined". Since undefined is not a "useful" value, you can optimize away. > Your spec would break this code. You can argue that your > spec is a better spec -- but it isn't Python compatible. Agreed. I will clarify that the behavior of "dropped off" functions is just a suggestion of how Python 2 might be improved using the features of the new object. > FYI: In Viper, uninitialised, statically > declared variables are initialised with the special object PyInitial. > Another special object, PyTerminal, also exists. These objects > are useful in the internal workings of the implementation, > for bounding things (i.e. as sentinels). For example, > it makes calculating max( .... ) much easier. [PyInitial > is less than all other objects] It sounds like None re-invented. My only reason for wanting a new object (not None) is because None is way too flexible. 
You could pass a None through ten thousand lines of code accidently. So I wouldn't want Undefined to be useful to "max" or anything else other than "is", "str" and "repr". > I _think_ you mean that the interface dictionary > is 'per module'? And you can refer to an interface > in another module with other.interfx notation? True. > > The runtime should not allow an assignment or function call to violate > > the declarations in the PyDL file. In an "optimized speed mode" those > > checks would be disabled. > > I think you have to think very carefully about what > constitues an error here: see my posts about errors in python. > It is not acceptable to specify that an exception be thrown. > That would NOT permit an optimiser to elide checks, except > when it could prove they were not needed. > > Much better, you deem a violating program > is not valid, and then the language processor can do whatever > it wants: it may raise an exception, or it may core dump, > or it may reject the program early. I will consider this. An alternate technique is to list allowed recovery strategies: "It is an error if this leaves more than one match. An XSLT processor may signal the error; if it does not signal the error, it must recover by choosing, from amongst the matches that are left, the one that occurs last in the stylesheet." "It is an error if instantiating the content of xsl:processing-instruction creates nodes other than text nodes. An XSLT processor may signal the error; if it does not signal the error, it must recover by ignoring the offending nodes together with their content." Paul Prescod From skip@mojam.com (Skip Montanaro) Tue Dec 28 14:53:52 1999 From: skip@mojam.com (Skip Montanaro) (Skip Montanaro) Date: Tue, 28 Dec 1999 08:53:52 -0600 (CST) Subject: [Types-sig] A plea for quoting consistency... In-Reply-To: <38689428.4953374D@prescod.net> References: <38689428.4953374D@prescod.net> Message-ID: <14440.52992.308594.392137@dolphin.mojam.com> >>>>> "Paul" == Paul Prescod writes: >> No. I think you have to make up your mind here. You must >> choose. Either 'List' is an interface, or, it is an interface >> generator, it cannot be both. [In your terminology, you can't use a >> parameterised interface where a fully resolved one is required; so >> List cannot be both partly unresolved and also fully resolved] Paul> Okay, I will use your terminology. ... Unfortunately, since this super-thread has grown so enormous, I wind up reading things a bit out of order and/or in multiple chunks, separated by significant time gaps. Please, if you don't CC the author on your response, at least list the author's name somewhere near the beginning of your response. Thx, Skip Montanaro | http://www.mojam.com/ skip@mojam.com | http://www.musi-cal.com/ 847-971-7098 | Python: Programming the way Guido indented... From skaller@maxtal.com.au Tue Dec 28 16:31:51 1999 From: skaller@maxtal.com.au (skaller) Date: Wed, 29 Dec 1999 03:31:51 +1100 Subject: [Types-sig] PyDL RFC 0.02 References: <38687712.1F7E3714@prescod.net> Message-ID: <3868E5F7.3B52D1ED@maxtal.com.au> Paul Prescod wrote: > > Greg Stein wrote: > > I also think that we'd want to avoid "combining the declarations" of two > > files. I don't think that this is possible. Separate interface files seem necessary, if only for C extensions. Embedded declarations seem important, at least to me. > Then they shouldn't put declarations in their Python file. I agree. > > And > > the combination rules might be a bit hard to describe or handle (from the > > human's standpoint). 
> > It's just concatenation! There is nothing hard about it. If that is so, please give the rules. In particular, you will need to cover the issue of duplicate declarations. In c and C++ in particular, these issues turned out to be non-trivial. -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From skaller@maxtal.com.au Tue Dec 28 16:39:29 1999 From: skaller@maxtal.com.au (skaller) Date: Wed, 29 Dec 1999 03:39:29 +1100 Subject: [Types-sig] Interface files References: <3867472C.7ECBAC55@prescod.net> <3867F850.14D60FCF@maxtal.com.au> <38687F76.47B415C7@prescod.net> Message-ID: <3868E7C1.665CD6AC@maxtal.com.au> Paul Prescod wrote: > > skaller wrote: > > > > In other words, interface files should be regarded as an _artefact_ > > of the existing 'lack of syntax for defining a module'. > > [Which Viper may correct :-] > > If the normative spec. is in terms of interface files then we can deal > with various situations through transformation TO interface files: Yes, this is possible to some extent, but not totally: first, the grammar needs to be compatible with existing python to permit embedding in the first place, and secondly _references_ to typedecls seem to require embedding, even if the decls themselves are not embedded. -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From skaller@maxtal.com.au Tue Dec 28 16:57:15 1999 From: skaller@maxtal.com.au (skaller) Date: Wed, 29 Dec 1999 03:57:15 +1100 Subject: [Types-sig] PyDL RFC 0.02 References: Message-ID: <3868EBEB.BDAFABE0@maxtal.com.au> Greg Stein wrote: > It doesn't seem to be simple concatenation. How are conflicts handled? How > are merges done? e.g. a method is not declared in the interface file, but > it does have a declaration in the source. Yes. Can we please assume the following position: 1) declarations can be embedded. 2) declarations can also be given in a separate file 3) Processing module X commences by loading the separate interface file 4) Next, the .py file is scanned for declarations 5) The results of (3) and (4) are merged somehow 6) The .py files is scanned again by the code generator We must make some decisions here. Question: what happens if a typedecl kind of name is declared more than once? Partial Answer 1: This must be permitted, because this is _exactly_ what will happen if the .pi file is generated from the .py file by scanning for declarations. It is not necessary to permit such declarations twice in the _same_ file though. One possible solution: require the declarations be identical, token for token: this is what C++ requires. Another solution: the declarations must be semanically equivalent. What this means is that the processor is free to chose either declaration as 'the' definition. Another solution: use the second declaration. Another solution: require _both_ apply (the product: combine constraints). Another: require _either_ apply: (the sum: take the 'union') I could do some analysis on these alternatives, but first, we need to agree there is an issue here. 
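To make those alternatives concrete, here is a minimal sketch, in today's Python, of the strictest policy -- a duplicate declaration is accepted only if it matches the earlier one token for token. The "decl" line format and the merge_decls helper are invented purely for illustration; they are not from any RFC:

    def merge_decls(interface_decls, embedded_decls):
        # Merge declarations scanned from a .pyi file with declarations
        # embedded in the .py source; duplicates must match token for token.
        merged = dict(interface_decls)
        for name, decl in embedded_decls.items():
            if name in merged and merged[name].split() != decl.split():
                raise ValueError("conflicting declarations for %r: %r / %r"
                                 % (name, merged[name], decl))
            merged[name] = decl
        return merged

    pyi_decls = {"f": "decl f: def(Int) -> Int"}
    py_decls  = {"f": "decl f: def(Int) -> String"}   # same name, different type
    try:
        merge_decls(pyi_decls, py_decls)
    except ValueError as exc:
        print(exc)

The same skeleton could implement any of the other resolutions (take the second, intersect, union) just by changing the branch that handles an already-declared name.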
I note there is plenty of existing practice -- with different resolutions :-) -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From skaller@maxtal.com.au Tue Dec 28 17:55:36 1999 From: skaller@maxtal.com.au (skaller) Date: Wed, 29 Dec 1999 04:55:36 +1100 Subject: [Types-sig] RFC Comments References: <38689428.4953374D@prescod.net> Message-ID: <3868F998.594C4EEF@maxtal.com.au> Paul Prescod wrote: > > Yeah, but you would do well to get out of the habit > > of saying 'can' and 'may'. Use the word 'shall'. Meaning, > > that the damn thing is REQUIRED to do something :-) > > Dont give permission. Specify requirements. > > I expect to rewrite the specification from scratch (with grammar) before > I am done. Consider this version a prototype. Once we have the design > down I will generate the normative spec. I do consider your RFC's prototypes -- but they're already quite good specifications, so it is already time to try to tighten them up. IMHO. Doing this will also help uncover ambiguities and problems, that 'loose' wording will cover up. > > Point 0: Paul, list the predefined names like Integer, > > or whatever. Say if they are keywords or plain identifiers. > > I've been putting this off because there are some tricky issues around > file objects. The leave them out. Temporarily. If your proposal is coherent and well principled, but doesn't quite cover all the territory, it should be possible to extend it. If you try to make it cover too much, it may be harder to get something concrete enough to extend. > > > 5. Declare un-modifiability: > > > > > > const [const Array( Integer )] > > > > > > (the semantics of un-modifiability need to be worked out) > > > > Again, forget it, for the moment. > > Isn't that what I did? :) No, you mentioned it in point 5. :-) > I don't understand your model of namespaces and inclusions. I don't > understand mine either so don't feel bad. I agree. I'll try again; perhaps an example: # file m.py import n include p Here, in the module m, we import n. This has to actually import the module n at run time. In pass 1, we read the interface file n.pyi. In pass 2, we generate code to actually load module n. Agree? But, for p, we ONLY read in the interface file p.pyi. We do not generate code to import p. Why would we do this? The answer is, we may gain access to classes and functions of the module p, even though we have not imported it. For example, consider a function def f(x): import p # local import return p.someclass() def f(x): # get at module p from module n return n.p.someclass() We cannot state the interface of f, in particular the return type, without the name of the interface of the class 'someclass' which is defined in the interface p. But p isn't imported into module m. So: we have to be able to load an interface, without that necessarily implying the module be imported. On the other hand, in an _interface_ file, we cannot import anything: importing implies run time code generation, to bind a name to a module object. So the correct way to load an interface, but not import anything, requires a separate keyword like 'include'. The semantics are distinct: import implies include the converse is not the case > We can have an API like: > > load_interface("foo") Yes, that would be possible but ugly. :-) > I don't think that the needs of a very specific tool like a static type > checker should drive syntax to that extent. 
The other 99% of code will > never do an "include" and the keyword will be wasted. but you cannot write that in an implementation file because it would be interpreted as a function call to be done at run time, whereas loading the interface must be done at compile time. > > I am already using ! not : here, following Greg Stein. > > I'm going to presume that that isn't a backwards-compatibility argument. > :) Sure it is. It is only a minor one though. The reason I chose "!" for argument declarations was that it was already being used in similar way for the _expression_: x ! t as in: y = x ! t and in this context, ":" cannot be used. > > There are enough ":"'s in python already :-) > > Debatable. I would also be amenable to "as", "is" or "isa". "!" means > not to me. OK. You should proceed with _some_ fixed syntax. Perhaps it makes sense to seek feedback from users on c.l.p? I'll implement whatever you decide [provided it fits with the grammar of course :-] > > > What we are really defining is the constructor. The signature of the > > > created object can be described in an interface declaration. > > > > Not good enough. The semantics of class instance > > attributes would be 'when you assign to this attribute, > > it had better have this type'. This doesn't mean that > > you can be sure an access gives that type, > > the attribute might not exist. This defeats optimisation. > > The attribute will either have the type or something like "undefined". > Since undefined is not a "useful" value, you can optimize away. I understand that this is your intent, but I am questioning it. My argument is something like this: a requirement that an attribute have type X IF it exists, is weaker than one that doesn't require anything at all, since the typing requirement is contingent on the existence requirement. What I mean is that, the purpose of the typing requirement can be stated as 'you can be sure when you access this name that the object it is bound to has the specified type', but that purpose is not met, if the name isn't bound to an object. you cannot safely optimise an access, because you don't know if the name is bound. Uggg. I'm not explaining this very well. What I'm saying is that type safe access isn't type safe at all unless the access is also safe, irrespective of whether it is typesafe: it has to be safe, before being typesafe is any use. > > Your spec would break this code. You can argue that your > > spec is a better spec -- but it isn't Python compatible. > > Agreed. I will clarify that the behavior of "dropped off" functions is > just a suggestion of how Python 2 might be improved using the features > of the new object. The new Undefined object is an implementation detail in this respect: It is not required, at all, to specify that Python functions be required to explictly return a value, and may not drop off the end, or, weaker, that IF a function drops off the end, the return value may not be used. [Yes, I know you added some extra semantics allowing the dropped of the end returns to be tested -- more debatable, I think] > > FYI: In Viper, uninitialised, statically > > declared variables are initialised with the special object PyInitial. > It sounds like None re-invented. It is, except that clients may refer to None explicitly, but NOT to PyInitial: x = None # valid Python x = PyInitial # NameError, no such thing > My only reason for wanting a new object > (not None) is because None is way too flexible. You could pass a None > through ten thousand lines of code accidently. 
So I wouldn't want > Undefined to be useful to "max" or anything else other than "is", "str" > and "repr". Perhaps you misunderstood: PyInitial is used in the IMPLEMENTATION of 'max', which is written in ocaml. It is not available to the client python programmer. > I will consider this. An alternate technique is to list allowed recovery > strategies: > > "It is an error if this leaves more than one match. An XSLT processor > may signal the error; if it does not signal the error, it must recover > by choosing, from amongst the matches that are left, the one that occurs > last in the stylesheet." Style sheets have different requirements: there is some kind of need for robustness: compilers should be fragile. [If it is at all possible to break the users program, do it!] It is, of course, possible to specify _anything_. But it is not a good idea, IMHO. For example, Greg Stein might argue that two options be allowed: a compile time diagnostic OR a run time diagnostic. This is dangerous: it limits the kind of processors to what Greg thinks is important today. The general rule of standards bodies is that if there is no consensus, leave it out -- don't define anything. This gives implementors maximum freedom, and restricts the programmer most. It also gives the standardisers the option of adding more constraints on implementors _later_: it is much harder to undo a rule, than to add a new one. Note that NO ONE likes 'undefined behaviour'. On the other hand, most of us prefer 'deterministic behaviour', that is, exactly one option is given the implementor, and the programmer can rely on it. But the next best thing is 'don't do it -- it is not defined'. Two or more choices is a very weak compromise (usually), because the programmer cannot rely on a particular behaviour, and will usually have to avoid it for this reason: meaning the implementor is constrained needlessly, providing a feature the programmer cannot use. -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From skaller@maxtal.com.au Tue Dec 28 18:30:48 1999 From: skaller@maxtal.com.au (skaller) Date: Wed, 29 Dec 1999 05:30:48 +1100 Subject: [Types-sig] const (was: PyDL RFC 0.02) References: Message-ID: <386901D8.9CD61B50@maxtal.com.au> Greg Stein wrote: > > On Tue, 28 Dec 1999, Paul Prescod wrote: > > Greg Stein wrote: > > > p.s. I'd recommend assignment enforcement over the notion of const; the > > > former seems to be more easily enforcable at runtime. > > > > I think we need both. We need to be able to enforce the TYPES of > > assignments and we need to sometimes say that an object is not > > modifiable, for all of the things we currently use tuples, files open > > for read and other read-only objects for. > > Um... Are you suggesting that we add a readonly flag to the list and dict > types? Short of that, I'm not sure how you would do "const". IMO, adding a > readonly flag to those types seems wrong. 
'const', IMHO, in Paul's name based model, means the name cannot be rebound: const x = 1 # x is always bound to 1 But: const x = [] x.append(1) # fine, x is still bound to the same list This does not require a readonly flag, it can be enforced at compile time (in the absence of 'exec' statements :-) In some sense, this kind of const is a _stronger_ constraint that a type constraint: x: int = y since any name which is not rebindable is necessarily bound to the same object, and therefore has an invariant type during its lifetime**: there is no need to give the type for the purpose of checking assignments, since any such asssignment is an error (because it violates the no-rebinding requirement). [** this is not true for raw objects in Viper, where the type object can be dynamically changed .. but that is another story :-] -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From skaller@maxtal.com.au Tue Dec 28 18:41:06 1999 From: skaller@maxtal.com.au (skaller) Date: Wed, 29 Dec 1999 05:41:06 +1100 Subject: [Types-sig] Help? References: Message-ID: <38690442.35EFCE57@maxtal.com.au> Um, I feel dumb asking this but .. I'm having some trouble figuring out how the C API works with functions and methods. Consider the script: def f(self,arg): pass class X: g = f x = X() X.g(x, 1) f(2,1) x.g(1) Here, there is only a single function object, f. A call to f requires two arguments: in C, the declaration PyObject *f(PyObject *self, PyObject *args) would have args be a two argument tuple, and self NULL, for the call: f(2,1) Now, when f is called by _either_ X.g(x,1) x.g(1) then x is the 'self' argument of the function, and the tuple 'args' contains only one element. Right? So HOW do I convert f to a C function? It does not seem possible. When used as 'f', there are two arguments in the 'args' tuple, but when used as g, the first arg is the self pointer. The python script indicates a _single_ function can be correctly used in both cases, but I cannot see how this is possible if a C function is used. Sorry to ask a dumb question. Can anyone correct my misconceptions? -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From gstein@lyra.org Tue Dec 28 21:54:24 1999 From: gstein@lyra.org (Greg Stein) Date: Tue, 28 Dec 1999 13:54:24 -0800 (PST) Subject: [Types-sig] Help? In-Reply-To: <38690442.35EFCE57@maxtal.com.au> Message-ID: On Wed, 29 Dec 1999, skaller wrote: >... > PyObject *f(PyObject *self, PyObject *args) In this form, the "self" argument is determined entirely by the C implementation. When the C function is a method on a C-based Type, then look in the module for a Py_FindMethod() call. The second parameter is "self". When the C function is a module-level function, then look at the InitModule call. If the call is Py_InitModule() or Py_InitModule3(), then self will always be NULL. If the call is Py_InitModule4(), then the fourth parameter will be passed as self. > would have args be a two argument tuple, and self NULL, > for the call: > > f(2,1) > > Now, when f is called by _either_ > > X.g(x,1) > x.g(1) > > then x is the 'self' argument of the function, > and the tuple 'args' contains only one element. > Right? See above. It is based on the C implementation, rather than the style of call. > So HOW do I convert f to a C function? > It does not seem possible. 
When used as 'f', > there are two arguments in the 'args' tuple, > but when used as g, the first arg is the self > pointer. The python script indicates a _single_ > function can be correctly used in both cases, > but I cannot see how this is possible if > a C function is used. The C function is called in only one way. And that is based on the C implementation and how you fetch the function (directly from a module or via an object [of some type]). > Sorry to ask a dumb question. Can anyone correct > my misconceptions? I hope that I have. Ask for more detail if I haven't been clear somewhere. Cheers, -g -- Greg Stein, http://www.lyra.org/ From paul@prescod.net Tue Dec 28 21:56:52 1999 From: paul@prescod.net (Paul Prescod) Date: Tue, 28 Dec 1999 16:56:52 -0500 Subject: [Types-sig] Type checks References: <3867472C.7ECBAC55@prescod.net> <3867F850.14D60FCF@maxtal.com.au> Message-ID: <38693224.490174D1@prescod.net> skaller wrote: > > I think you have to think very carefully about what > constitues an error here: see my posts about errors in python. > It is not acceptable to specify that an exception be thrown. > That would NOT permit an optimiser to elide checks, except > when it could prove they were not needed. If people use the static type check system extensively then it would OFTEN be able to elide the checks. If you use type declarations as aggressively (say) as you would in Java then you should get exactly as many type checks. So I am leaning toward throwing an exception. Paul Prescod From gstein@lyra.org Tue Dec 28 22:04:22 1999 From: gstein@lyra.org (Greg Stein) Date: Tue, 28 Dec 1999 14:04:22 -0800 (PST) Subject: [Types-sig] Type checks In-Reply-To: <38693224.490174D1@prescod.net> Message-ID: On Tue, 28 Dec 1999, Paul Prescod wrote: > skaller wrote: > > I think you have to think very carefully about what > > constitues an error here: see my posts about errors in python. > > It is not acceptable to specify that an exception be thrown. > > That would NOT permit an optimiser to elide checks, except > > when it could prove they were not needed. > > If people use the static type check system extensively then it would > OFTEN be able to elide the checks. If you use type declarations as > aggressively (say) as you would in Java then you should get exactly as > many type checks. So I am leaning toward throwing an exception. Python is also very deterministic. "Implementation-defined" really does not exist. Dunno Guido's policy or leanings on this matter, but I've been assuming that it would remain that way. And that CPython would generally be the reference platform/definition when the language manual is not clear enough. Errors in Python raise exceptions. That is how it is defined, and that is the general style/pattern for the language. Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Tue Dec 28 22:40:22 1999 From: gstein@lyra.org (Greg Stein) Date: Tue, 28 Dec 1999 14:40:22 -0800 (PST) Subject: [Types-sig] const (was: PyDL RFC 0.02) In-Reply-To: <386901D8.9CD61B50@maxtal.com.au> Message-ID: On Wed, 29 Dec 1999, skaller wrote: > Greg Stein wrote: > > On Tue, 28 Dec 1999, Paul Prescod wrote: > > > Greg Stein wrote: > > > > p.s. I'd recommend assignment enforcement over the notion of const; the > > > > former seems to be more easily enforcable at runtime. > > > > > > I think we need both. 
We need to be able to enforce the TYPES of > > > assignments and we need to sometimes say that an object is not > > > modifiable, for all of the things we currently use tuples, files open > > > for read and other read-only objects for. > > > > Um... Are you suggesting that we add a readonly flag to the list and dict > > types? Short of that, I'm not sure how you would do "const". IMO, adding a > > readonly flag to those types seems wrong. > > 'const', IMHO, in Paul's name based model, means the name > cannot be rebound: > > const x = 1 # x is always bound to 1 > > But: > > const x = [] > x.append(1) # fine, x is still bound to the same list > > This does not require a readonly flag, it can be > enforced at compile time (in the absence of 'exec' > statements :-) Please re-read Paul's posts. In the quoted section above, he says we need to say "that an object is not modifiable." In a previous post, he had the following example code: const [ const Array( Integer )] These two points said (to me) that he wanted to disable your second example. I disagree with the notion of add const-ness to objects. I could agree with preventing rebinding (more agreement on preventing external rebinding; less agreement on marking names as not rebindable at all). If Paul means something else, then I'd ask for clarification. Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Wed Dec 29 02:19:26 1999 From: gstein@lyra.org (Greg Stein) Date: Tue, 28 Dec 1999 18:19:26 -0800 (PST) Subject: [Types-sig] PyDL RFC 0.02 In-Reply-To: <19991227175955.B44344@chronis.pobox.com> Message-ID: On Mon, 27 Dec 1999, scott wrote: > On Mon, Dec 27, 1999 at 12:14:41PM -0800, Greg Stein wrote: >... > > Note: the compile-time checking *does* need to know everything that > > happens in the run-time namespaces. It must check the assignments and > > usage of values in the namespaces. > > I don't see how compile-time checking can know much of anything about > runtime-specific namespaces without running code. It doesn't have to run code. Try out the prototype that I posted to this list a few days ago. It can tell you a lot about what, when, and where values are stored into the different namespaces. And it doesn't run code -- it just walks the parse tree. > If it runs code, it > is no longer compile-time checking. Furthermore, if the compile-time > checker assumes that the running of code can do anything it can today, > there's not much of anything that can be checked at compile time to > begin with. You'd be surprised at what it can check :-) The checker can easily track type usage and find things that should not be allowed. The check.py (and friends) that I posted only does a couple things, but the framework is there for more. I just need to start filling stuff in. I went for breadth-first so that people could see what a type checker would look like. > This is why it seems to me that checks done at compile-time must be > done based on a compile-time specific model of the namespaces, and > that model must be more restrictive in naming and scoping usage than > python currently is. Nope. 
I posted an "existence proof" that I believe contradicts this :-) Cheers, -g -- Greg Stein, http://www.lyra.org/ From scott@chronis.pobox.com Wed Dec 29 06:05:07 1999 From: scott@chronis.pobox.com (scott) Date: Wed, 29 Dec 1999 01:05:07 -0500 Subject: [Types-sig] PyDL RFC 0.02 In-Reply-To: References: <19991227175955.B44344@chronis.pobox.com> Message-ID: <19991229010507.A53430@chronis.pobox.com> On Tue, Dec 28, 1999 at 06:19:26PM -0800, Greg Stein wrote: > On Mon, 27 Dec 1999, scott wrote: > > On Mon, Dec 27, 1999 at 12:14:41PM -0800, Greg Stein wrote: > >... > > > Note: the compile-time checking *does* need to know everything that > > > happens in the run-time namespaces. It must check the assignments and > > > usage of values in the namespaces. > > > > I don't see how compile-time checking can know much of anything about > > runtime-specific namespaces without running code. > > It doesn't have to run code. > > Try out the prototype that I posted to this list a few days ago. It can > tell you a lot about what, when, and where values are stored into the > different namespaces. And it doesn't run code -- it just walks the parse > tree. OK, done that. > > > If it runs code, it > > is no longer compile-time checking. Furthermore, if the compile-time > > checker assumes that the running of code can do anything it can today, > > there's not much of anything that can be checked at compile time to > > begin with. > > You'd be surprised at what it can check :-) While there may be a lot of value in walking the parse tree as your checker does, it doesn't seem to do much in terms of what I expect out of a type checker. What I want to be able to do: declare types, and have things which contradict the declarations reported nicely at compile time. A little searching through the code you posted didn't show any clear way to declare types, it just seems to spit out lots of attribute warnings when run it on itself, and it fails to detect anything wrong with the few simple cases I've thrown at it. for example: a = 1 b = "3" a + b yields no warnings, but is an error I'd expect a type checker to understand. def foo(x, y): return x + y foo(2) also yields no warnings, and is something I'd expect a type checker to understand. > > The checker can easily track type usage and find things that should not be > allowed. The check.py (and friends) that I posted only does a couple > things, but the framework is there for more. I just need to start filling > stuff in. I went for breadth-first so that people could see what a type > checker would look like. > > > This is why it seems to me that checks done at compile-time must be > > done based on a compile-time specific model of the namespaces, and > > that model must be more restrictive in naming and scoping usage than > > python currently is. > > Nope. I posted an "existence proof" that I believe contradicts this :-) The checker you posted either falls way short of being able to declare and check static types, or it's sufficiently unclear how to make it do that that I'd only accept existence proof as a series of examples of making it do that. For example, how do you make it check the two examples above properly? How can I declare variable 'a' to be an integer, and then have the checker report something remotely meaningful when I assign a string to the variable 'a' in the same namespace? in another namespace via ``global''? 
scott scott > > Cheers, > -g > > -- > Greg Stein, http://www.lyra.org/ > From gstein@lyra.org Wed Dec 29 06:52:56 1999 From: gstein@lyra.org (Greg Stein) Date: Tue, 28 Dec 1999 22:52:56 -0800 (PST) Subject: [Types-sig] check.py (was: PyDL RFC 0.02) In-Reply-To: <19991229010507.A53430@chronis.pobox.com> Message-ID: On Wed, 29 Dec 1999, scott wrote: >... > While there may be a lot of value in walking the parse tree as your > checker does, it doesn't seem to do much in terms of what I expect out > of a type checker. It isn't done. I thought that I made that very clear. It provides a framework for how to do this work. It tracks expression types, records the types that should be associated with variables, etc. The problem is that it does not yet have a way to declare types. Also, some of the type recording (e.g. for types for an interface) is not yet complete. > What I want to be able to do: declare types, and have things which > contradict the declarations reported nicely at compile time. A little > searching through the code you posted didn't show any clear way to > declare types, To do this, I would need to change the Python grammar, or suck in .pyi files. I plan to do the latter once some kind of formal grammar is specified. If that doesn't happen soon, then I'll be using the grammar that I posted in my type-proposal.html. It is complete and is sufficient (yet Paul seems to be starting from scratch... :-( ). > it just seems to spit out lots of attribute warnings > when run it on itself, and it fails to detect anything wrong with the > few simple cases I've thrown at it. for example: > > a = 1 > b = "3" > a + b > > yields no warnings, but is an error I'd expect a type checker to > understand. > > > def foo(x, y): return x + y > > foo(2) > > also yields no warnings, and is something I'd expect a type checker to > understand. Correct. It does not check these types of errors yet. Try this, however: a = { } a.append(1) b = [ ] b.append(1) You will get an error on that a.append. The attribute does not exist. But it allows the b.append. This demonstrates that it is tracking that "a" is a dictionary and that "b" is a list. Further, it understands that "append" is only defined on a list. The first problem you list "a + b" is because _arith_expr() is not filled in. It does not handle verification of the left/right operands as being compatible with the "+" operator. The second problem (with the foo(2)) is because _check_function_call() is not yet filled in. However, the code *does* know that foo() has two parameters named "x" and "y" (of type "Any" right now). This implies that _check_function_call() has enough information to check the number of arguments and to verify that if you use keywords, they must be "x" or "y". [ but I don't record defaults yet, handle varargs or keyword funcs, or deal with things like: def foo(x, (y, z)):. ] >... > The checker you posted either falls way short of being able to declare > and check static types, It does fall way short. It is a prototype/demo. It is *not* complete. It can be filled in to provide for this -- the necessary structure is there. > or it's sufficiently unclear how to make it do > that that I'd only accept existence proof as a series of examples of > making it do that. Fine. I'll accept that you don't see it as having the future capability to do this. Not a problem, as I'll just work on it some more until it reaches that point. 
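As a concrete illustration of the parse-tree-walking approach described above -- this is not code from check.py, just a toy re-creation in today's Python using the ast module, since the old parser-module interface is long gone -- the sketch records what each name was assigned and warns when an attribute is used that the recorded type does not provide:

    import ast

    class ToyChecker(ast.NodeVisitor):
        """Record the type of simple literal assignments, then warn when an
        attribute is used that the recorded type does not provide."""

        def __init__(self):
            self.types = {}                    # name -> Python type (or None)

        def visit_Assign(self, node):
            value_type = {ast.List: list, ast.Dict: dict}.get(type(node.value))
            if isinstance(node.value, ast.Constant):
                value_type = type(node.value.value)
            for target in node.targets:
                if isinstance(target, ast.Name):
                    self.types[target.id] = value_type
            self.generic_visit(node)

        def visit_Attribute(self, node):
            if isinstance(node.value, ast.Name):
                known = self.types.get(node.value.id)
                if known is not None and not hasattr(known, node.attr):
                    print("warning: %s is a %s; it has no attribute %r"
                          % (node.value.id, known.__name__, node.attr))
            self.generic_visit(node)

    source = "a = {}\na.append(1)\nb = []\nb.append(1)\n"
    ToyChecker().visit(ast.parse(source))      # warns about a.append only

Running it reproduces the dict/list example above: only a.append is flagged, because the tracked type of "a" is dict.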
I feel that it *does* show you can do full namespace tracking without running code (the original issue that stemmed this mini-thread). I believe it also provides a good structure for writing a type-checker (in fact, if somebody else were to write a type-checker, I think it would have so much of the same form that I would recommend against duplication of work; I'd rather see a couple people contributing to the same chunk o' code). > For example, how do you make it check the two > examples above properly? Described above. Fill in _arith_expr() and _check_function_call(). The type information is present, although I need to think of a way to have a TypeDeclarator object say "I can support addition" (at the moment, it can only say "I have attribute"). > How can I declare variable 'a' to be an > integer, We need an external file format and/or to change the grammar. It just isn't possible right now since it is using Python's internal parser. > and then have the checker report something remotely > meaningful when I assign a string to the variable 'a' in the same > namespace? Currently, the checker understands the difference between something being declared, and something having a specified type by virtue of an assignment. It will issue an error for the former case, and allow a redefinition in the latter case. But: since you can't declare something to have a given type, this functionality can't be exercised. But #2: it raises a namespace.TypeMismatchError (and stops) rather than printing an error; I simply need to add the appropriate try/except for that and print the right message. > in another namespace via ``global''? Dunno. I haven't thought about how to handle the "global" statement yet. I suspect that the Namespace class will simply understand that it must delegate certain names to a different namespace; that target namespace will then raise the appropriate error in case of a type mismatch. Cheers, -g -- Greg Stein, http://www.lyra.org/ From scott@chronis.pobox.com Wed Dec 29 11:42:52 1999 From: scott@chronis.pobox.com (scott) Date: Wed, 29 Dec 1999 06:42:52 -0500 Subject: [Types-sig] check.py (was: PyDL RFC 0.02) In-Reply-To: References: <19991229010507.A53430@chronis.pobox.com> Message-ID: <19991229064252.A55464@chronis.pobox.com> On Tue, Dec 28, 1999 at 10:52:56PM -0800, Greg Stein wrote: > On Wed, 29 Dec 1999, scott wrote: > >... > > While there may be a lot of value in walking the parse tree as your > > checker does, it doesn't seem to do much in terms of what I expect out > > of a type checker. > > It isn't done. I thought that I made that very clear. It provides a I was just working under the assumption that if it was a complete framework -- filled in or not, there'd be a way to do things like declare types. Then I threw a couple of off-the-cuff basic things at it, and it didn't do well, so I figured it wasn't done enough to warrant a basic framework to develop on. While I still sortof feel that way, your message has made a lot of what's going on in check.py more clear -- and shows some really cool things about one approach to it all. So we all know that exactly how .pyi info and embedded declarations maps runtime namespaces is a touchy issue -- we can't really account for exec, and there are lots of things which may act odd, such as get/set-attr hooks and global and del what not that can cause some real issues. 
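A few of the constructs being alluded to, in today's Python syntax; each one changes a namespace in a way that no parse-tree walk can see (illustrative only):

    class Config:
        def __getattr__(self, name):      # attributes invented at runtime
            return name.upper()

    print(Config().anything)              # works, yet appears in no declaration

    exec("limit = 10")                    # binds a module-level name invisibly
    print(limit)

    globals()["mode"] = "fast"            # rebinds a global through a plain dict
    del mode                              # ...and unbinds it again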
We also know that whatever inaccuries or mismatches there might be between the picture of runtime namespaces available at compile time and how they work at runtime will probably become a royal pain in the ass down the road. So check.py uses the parser module to gain a pretty darn accurate picture of runtime name spaces via the parse tree. That's a *very* good thing when compared to some half-assed namespace picture that sortof works like you expect, but is bound to blow up in way too many cases and create reams of new faqs down the road. It also has some drawbacks: it's a little awkward to have compile time activity depend so heavily on a module that is optional. Also, compile-time activity IMO is rightly done in C (or Java or whatever) and not in the language that is being interpretted, though prototyping can of course be anything. The module seems to be built primarily for availability from within python, and not so much from the interpreter itself; while the end product seems like it should be (mostly atleast) in the interpreter itself. The list of drawbacks goes on a bit, but all the points rest on one question that I'm not sure of: Does the framework presented in check.py actually depend on the parser module, or is this just a functional relationship that can be met by some reasonable alternative means in the interpreter itself? scott From gstein@lyra.org Wed Dec 29 12:14:15 1999 From: gstein@lyra.org (Greg Stein) Date: Wed, 29 Dec 1999 04:14:15 -0800 (PST) Subject: [Types-sig] check.py In-Reply-To: <19991229064252.A55464@chronis.pobox.com> Message-ID: On Wed, 29 Dec 1999, scott wrote: >... > I was just working under the assumption that if it was a complete > framework -- filled in or not, there'd be a way to do things like > declare types. Then I threw a couple of off-the-cuff basic things at > it, and it didn't do well, so I figured it wasn't done enough to > warrant a basic framework to develop on. If you want to hack some code, you can declare types. The framework is there, you just need to figure out how to do something like: self.ns.declare('x', typedecl.Int) (and then fix the code to handle the exception condition that will arise if you attempt to assign something other than Int to 'x') But your point still stands: it is missing some of the more interesting stuff that is being generated in this SIG lately. But hey... I only hacked on it for a day or two :-) >... > So we all know that exactly how .pyi info and embedded declarations > maps runtime namespaces is a touchy issue -- we can't really account > for exec, and there are lots of things which may act odd, such as > get/set-attr hooks and global and del what not that can cause some > real issues. Right. > We also know that whatever inaccuries or mismatches there might be > between the picture of runtime namespaces available at compile time > and how they work at runtime will probably become a royal pain in the > ass down the road. That would suck. But I'm pretty darn sure that we can figure out at compile-time what 99% of the software out there will do at runtime (in terms of storing values into namespaces). For that other 1%, I'm not sure if we just won't work right, or whether we can at least warn the person that we won't work as expected. [ in other words, there are times when we can detect a bad situation, but can't do anything about it. other times, we just outright fail :-) ] > So check.py uses the parser module to gain a pretty darn accurate > picture of runtime name spaces via the parse tree. 
That's a *very* > good thing Yup. And yah, I think so :-) > when compared to some half-assed namespace picture that > sortof works like you expect, but is bound to blow up in way too many > cases and create reams of new faqs down the road. Quite true. I wouldn't think of doing it some other way. > It also has some drawbacks: it's a little awkward to have compile > time activity depend so heavily on a module that is optional. Well, we actually use just a single function from the parser module. And the underlying C code is quite simplistic. Most of the parser module actually deals with building AST nodes from Python and passing the result to the Python bytecode compiler. You could really view the parser module as an interface to two things: to the parser output, and to the compiler input. We just want the parser output. Optional? The module may be, but the parser itself isn't :-). The parser is enabled by default in recent distributions. Some code shifting or other structural changes could ensure that we always have access to parser output. We could also just say "type checking not available unless the parser module is built." > Also, > compile-time activity IMO is rightly done in C (or Java or whatever) > and not in the language that is being interpretted, though prototyping > can of course be anything. 1) This isn't necessarily a compile-time activity. It could be an external tool that is occasionally run. We could also argue semantics and say that type-checking isn't part of compilation (since the output is not necessarily used/consumed by the compilation step). 2) I disagree that it is "rightly done in C", but recognize the "IMO" you inserted there :-). I see no issue whatsoever in using Python as part of the Python runtime environment. In fact, I would hope that Python 1.6 allows you to write its parser and compiler entirely in Python. The only C code would be the builtin types and the VM. > The module seems to be built primarily for > availability from within python, and not so much from the interpreter > itself; while the end product seems like it should be (mostly atleast) > in the interpreter itself. I'm not sure that I understand the basis of this perception. However, I don't really need to, I think... we can certainly restructure some of the interfaces to make it follow whatever requirements/pattern that you're thinking of. > The list of drawbacks goes on a bit, but all the points rest on one > question that I'm not sure of: Does the framework presented in > check.py actually depend on the parser module, or is this just a > functional relationship that can be met by some reasonable alternative > means in the interpreter itself? If you integrate the thing directly into the interpreter, then the need for the parser module doesn't exist. The parser module is just a Python API for the internal C API to the parser -- the interpreter definitely has that access. But again: I would disagree with the notion of integrating it tightly into the interpreter. check.py is currently sitting at 929 lines of code. My historical yardstick says this would expand to 9290 lines of C code -- for its CURRENT form. I believe that check.py is going to get bigger once all those missing expression handling checks are inserted. Maybe 2000 to 3000 lines of Python. Dropping that into C increases your bug count and reduces flexibility/maintenance. But hey... as I mentioned to Paul a week ago or so: feel free to code a type-checker in C. I won't stop you. 
But I can guarantee that a Python version will be ready before yours :-). And when the SIG comes up with additional, nifty rules to check for... the Python version will have them implemented much faster. In fact, people could very well present the new rules as patches to the type-checker. Cheers, -g -- Greg Stein, http://www.lyra.org/ From paul@prescod.net Wed Dec 29 13:14:14 1999 From: paul@prescod.net (Paul Prescod) Date: Wed, 29 Dec 1999 08:14:14 -0500 Subject: [Types-sig] PyDL RFC 0.02 References: <38687712.1F7E3714@prescod.net> <3868E5F7.3B52D1ED@maxtal.com.au> Message-ID: <386A0926.84689DCE@prescod.net> skaller wrote: > > > > > And > > > the combination rules might be a bit hard to describe or handle (from the > > > human's standpoint). > > > > It's just concatenation! There is nothing hard about it. > > If that is so, please give the rules. > In particular, you will need to cover the issue of duplicate > declarations. In c and C++ in particular, these issues > turned out to be non-trivial. The RFC says that the rules for duplicate and conflicting declarations between a .pyi and a .gpi are the same as those within a .pyi. The issue is simply orthogonal. Paul Prescod From paul@prescod.net Wed Dec 29 13:16:51 1999 From: paul@prescod.net (Paul Prescod) Date: Wed, 29 Dec 1999 08:16:51 -0500 Subject: [Types-sig] Interface files References: <3867472C.7ECBAC55@prescod.net> <3867F850.14D60FCF@maxtal.com.au> <38687F76.47B415C7@prescod.net> <3868E7C1.665CD6AC@maxtal.com.au> Message-ID: <386A09C3.8EF9153D@prescod.net> skaller wrote: > > > If the normative spec. is in terms of interface files then we can deal > > with various situations through transformation TO interface files: > > Yes, this is possible to some extent, but not totally: > first, the grammar needs to be compatible with existing > python to permit embedding in the first place, That's why we preced everything with "decl" or "typedef" and thus get our own sublanguage. > and secondly > _references_ to typedecls seem to require embedding, > even if the decls themselves are not embedded. True enough. The references are going to be dotted names which Python will look for at runtime. Paul Prescod From paul@prescod.net Wed Dec 29 13:48:46 1999 From: paul@prescod.net (Paul Prescod) Date: Wed, 29 Dec 1999 08:48:46 -0500 Subject: [Types-sig] PyDL RFC 0.02 References: <19991227175955.B44344@chronis.pobox.com> <19991229010507.A53430@chronis.pobox.com> Message-ID: <386A113E.2FF6EE79@prescod.net> scott wrote: > > What I want to be able to do: declare types, and have things which > contradict the declarations reported nicely at compile time. A little > searching through the code you posted didn't show any clear way to > declare types, it just seems to spit out lots of attribute warnings > when run it on itself, and it fails to detect anything wrong with the > few simple cases I've thrown at it. The important thing is that Greg's code (presumably!) knows how to propogate types around expressions and suites like: j = k or q(foo() and bar()) That's quite an accomplishment considering how quickly he coded it. 
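For what it is worth, the usual way a checker propagates types through "or"/"and" is to take the union of the operand types, since the expression may evaluate to either operand. A tiny sketch with invented type sets (this is not how check.py actually represents types):

    def bool_op_type(left_types, right_types):
        # 'left or right' (and 'left and right') may evaluate to either
        # operand, so the inferred type is the union of the operand types.
        return left_types | right_types

    k_type     = {"Int"}
    foo_result = {"String"}
    bar_result = {"None", "List"}
    q_result   = {"Int"}                  # pretend q() is declared to return Int

    # j = k or q(foo() and bar())
    q_argument = bool_op_type(foo_result, bar_result)   # String | None | List
    j_type     = bool_op_type(k_type, q_result)         # {'Int'}
    print(sorted(q_argument), sorted(j_type))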
Paul Prescod From skaller@maxtal.com.au Wed Dec 29 16:05:37 1999 From: skaller@maxtal.com.au (skaller) Date: Thu, 30 Dec 1999 03:05:37 +1100 Subject: [Types-sig] const (was: PyDL RFC 0.02) References: Message-ID: <386A3151.8F5AFCD7@maxtal.com.au> Greg Stein wrote: > > 'const', IMHO, in Paul's name based model, means the name > > cannot be rebound: > > > > const x = 1 # x is always bound to 1 > > > > But: > > > > const x = [] > > x.append(1) # fine, x is still bound to the same list > > > > This does not require a readonly flag, it can be > > enforced at compile time (in the absence of 'exec' > > statements :-) > > Please re-read Paul's posts. In the quoted section above, he says we need > to say "that an object is not modifiable." I know, but that is my point: it isn't consistent with a model in which checking is applied to _names_: we'd need to model access like in C/C++ with pointers. This is pervasive, and it doesn't seem to me to sit well with optional declarations. Declaring a name non-rebindable on the other hand fits well with current semantics (a function cannot rebind non-local names unless declared 'global') -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From skaller@maxtal.com.au Wed Dec 29 16:07:53 1999 From: skaller@maxtal.com.au (skaller) Date: Thu, 30 Dec 1999 03:07:53 +1100 Subject: [Types-sig] Type checks References: Message-ID: <386A31D9.A29EA65E@maxtal.com.au> Greg Stein wrote: > Python is also very deterministic. "Implementation-defined" really does > not exist. I agree, more or less. There is some indeterminism with bitwise operators (depends on the underlying C implementation, which sucks :-) > Errors in Python raise exceptions. That is how it is defined, and that is > the general style/pattern for the language. Not true for assertions. And type constraints are assertions. -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From skaller@maxtal.com.au Wed Dec 29 16:17:59 1999 From: skaller@maxtal.com.au (skaller) Date: Thu, 30 Dec 1999 03:17:59 +1100 Subject: [Types-sig] PyDL RFC 0.02 References: <19991227175955.B44344@chronis.pobox.com> <19991229010507.A53430@chronis.pobox.com> <386A113E.2FF6EE79@prescod.net> Message-ID: <386A3437.741BCC93@maxtal.com.au> Paul Prescod wrote: > The important thing is that Greg's code (presumably!) knows how to > propogate types around expressions and suites like: > > j = k or q(foo() and bar()) > > That's quite an accomplishment considering how quickly he coded it. I agree. More to the point, Greg says this is primarily a framework -- clearly, it is currently a pretty lousy checker, but there's scope to add rules to improve it. The same is true of the code I'm doing for the cgen_module function in Viper: it generates pretty lousy code at the moment -- but that can be fixed later. What's important at first is a working implementation that covers the territory. 
-- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From skip@mojam.com (Skip Montanaro) Wed Dec 29 17:11:02 1999 From: skip@mojam.com (Skip Montanaro) (Skip Montanaro) Date: Wed, 29 Dec 1999 11:11:02 -0600 (CST) Subject: [Types-sig] check.py (was: PyDL RFC 0.02) In-Reply-To: <19991229064252.A55464@chronis.pobox.com> References: <19991229010507.A53430@chronis.pobox.com> <19991229064252.A55464@chronis.pobox.com> Message-ID: <14442.16550.539070.199835@dolphin.mojam.com> scott> It also has some drawbacks: it's a little awkward to have compile scott> time activity depend so heavily on a module that is optional. scott> Also, compile-time activity IMO is rightly done in C (or Java or scott> whatever) and not in the language that is being interpretted, scott> though prototyping can of course be anything. The module seems scott> to be built primarily for availability from within python, and scott> not so much from the interpreter itself; while the end product scott> seems like it should be (mostly atleast) in the interpreter scott> itself. There's no reason the parser module needs to always be optional. Also, coding the thing in Python makes sense for programmer productivity reasons. Over time, as type information gets known more completely, tools like Greg's & Bill's Python2C will be able to convert that code into fairly efficient C. Skip Montanaro | http://www.mojam.com/ skip@mojam.com | http://www.musi-cal.com/ 847-971-7098 | Python: Programming the way Guido indented... From gstein@lyra.org Wed Dec 29 20:17:30 1999 From: gstein@lyra.org (Greg Stein) Date: Wed, 29 Dec 1999 12:17:30 -0800 (PST) Subject: [Types-sig] const (was: PyDL RFC 0.02) In-Reply-To: <386A3151.8F5AFCD7@maxtal.com.au> Message-ID: On Thu, 30 Dec 1999, skaller wrote: > Greg Stein wrote: > > > 'const', IMHO, in Paul's name based model, means the name > > > cannot be rebound: > > > > > > const x = 1 # x is always bound to 1 > > > > > > But: > > > > > > const x = [] > > > x.append(1) # fine, x is still bound to the same list > > > > > > This does not require a readonly flag, it can be > > > enforced at compile time (in the absence of 'exec' > > > statements :-) > > > > Please re-read Paul's posts. In the quoted section above, he says we need > > to say "that an object is not modifiable." > > I know, but that is my point: it isn't consistent > with a model in which checking is applied to _names_: > we'd need to model access like in C/C++ with pointers. > This is pervasive, and it doesn't seem to me to sit well > with optional declarations. Declaring a name non-rebindable > on the other hand fits well with current semantics > (a function cannot rebind non-local names unless declared 'global') Then we are in agreement. Paul say "readonly objects." I said no. You "explained" Paul's point and said no. I said the explanation wasn't necessary because I had agreed with you and said no. Fun, huh? :-) Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Wed Dec 29 20:21:32 1999 From: gstein@lyra.org (Greg Stein) Date: Wed, 29 Dec 1999 12:21:32 -0800 (PST) Subject: [Types-sig] Type checks In-Reply-To: <386A31D9.A29EA65E@maxtal.com.au> Message-ID: On Thu, 30 Dec 1999, skaller wrote: > Greg Stein wrote: > > Python is also very deterministic. "Implementation-defined" really does > > not exist. > > I agree, more or less. 
There is some indeterminism with > bitwise operators (depends on the underlying C implementation, > which sucks :-) If this is the case, then let Guido know. He has generally taken the pain to ensure that cases like this just don't exist. > > Errors in Python raise exceptions. That is how it is defined, and that is > > the general style/pattern for the language. > > Not true for assertions. > And type constraints are assertions. Stop being a nit-pick. But since you are, let me rephrase: [In general,] errors in Python raise exceptions. [This is the pattern used for all errors. One error, AssertionError, as raised by the "assert" statement will not be raised by the compiler in "debug" mode, or in code generated when optimization is enabled.] Essentially, even the assert statement is rigidly defined. I strongly believe that type assertions would follow the exact pattern of regular assertions. -g -- Greg Stein, http://www.lyra.org/ From paul@prescod.net Wed Dec 29 16:49:31 1999 From: paul@prescod.net (Paul Prescod) Date: Wed, 29 Dec 1999 11:49:31 -0500 Subject: [Types-sig] RFC Comments References: <38689428.4953374D@prescod.net> <3868F998.594C4EEF@maxtal.com.au> Message-ID: <386A3B9B.43E94DB9@prescod.net> skaller wrote: > > I do consider your RFC's prototypes -- but they're already > quite good specifications, so it is already time to try to tighten them > up. > IMHO. Doing this will also help uncover ambiguities and problems, > that 'loose' wording will cover up. I'm thinking that my current work will evolve into a tutorial and the spec will be separate. I'm a believer in short, formal specs and long, wordy, expanatory tutorials. > No, you mentioned it in point 5. :-) Okay, you win. Const is out for now. > So: we have to be able to load an interface, without > that necessarily implying the module be imported. I see why you would sometimes only care about interfaces and not about implementations but I do not see what it harms to do a "real import" of the module. In languages like Java and C++ you might import a package or header file only for interfaces. I guess I need to know the difference which module-import semantics you are trying to avoid. > On the other hand, in an _interface_ file, we cannot > import anything: importing implies run time code > generation, to bind a name to a module object. Okay, but if it can be shown that no code is ever executed from the module then you don't have to generate that code. > The reason I chose "!" for argument declarations was that it > was already being used in similar way for the _expression_: > > y = x ! t > > and in this context, ":" cannot be used. Right. My RFC uses a function call syntax inline. It seems more Pythonic and can cause no no precedence confusion. It is also compatible with the Python 1.5.x grammar. > OK. You should proceed with _some_ fixed syntax. I used "as" everywhere else. The colon was just a lapse. > My argument is something like this: I'm lost in this subthread. I never understood what change you were proposing. Can we start again? Paul Prescod From paul@prescod.net Wed Dec 29 16:49:27 1999 From: paul@prescod.net (Paul Prescod) Date: Wed, 29 Dec 1999 11:49:27 -0500 Subject: [Types-sig] check.py References: Message-ID: <386A3B97.42AEB426@prescod.net> Greg Stein wrote: > > ... > > 2) I disagree that it is "rightly done in C", but recognize the "IMO" you > inserted there :-). I see no issue whatsoever in using Python as part > of the Python runtime environment. 
In fact, I would hope that Python > 1.6 allows you to write its parser and compiler entirely in Python. The > only C code would be the builtin types and the VM. I agree with you Greg. By Python 2 we may have sufficient performance that the "standard" compiler can be written in Python. Paul Prescod From paul@prescod.net Wed Dec 29 16:49:29 1999 From: paul@prescod.net (Paul Prescod) Date: Wed, 29 Dec 1999 11:49:29 -0500 Subject: [Types-sig] const (was: PyDL RFC 0.02) References: Message-ID: <386A3B99.74EE9F6E@prescod.net> It is quite possible that at some point I used const inconsistently to mean both "non-rebindable name" and "immutable." Greg is right that I was thinking about the latter when I put it in. In the long run we really need both, but I will remove them from version 1 for now. For now, we just need a solid definition of what types of rebinding are legal. There are four kinds of names: * module -- we must always disallow rebinding these because we don't have a notion of two modules with the "same interface". Maybe in some future version we could. * class -- rebinding is fine as long as the new class has a signuture that will produce instances that conform to the declared interface(s). * functions and other objects -- rebinding is fine as long as the new function conforms to the declared interface. Paul Prescod From paul@prescod.net Wed Dec 29 16:49:33 1999 From: paul@prescod.net (Paul Prescod) Date: Wed, 29 Dec 1999 11:49:33 -0500 Subject: [Types-sig] Type checks References: Message-ID: <386A3B9D.228BD31B@prescod.net> Greg Stein wrote: > > Python is also very deterministic. "Implementation-defined" really does > not exist. > > Dunno Guido's policy or leanings on this matter, but I've been assuming > that it would remain that way. And that CPython would generally be the > reference platform/definition when the language manual is not clear > enough. To faithfully represent CPython, an optimizing compiler would need to silently compile: jfoieawjofij fewajofijeawofj fjowiaejfowei to: raise SyntaxError So yes, we do have to allow some flexibility to the implementor. I agree with Greg that as much as possible we should try to keep the undefined stuff at *compile time* instead of runtime. Paul Prescod From paul@prescod.net Wed Dec 29 16:49:30 1999 From: paul@prescod.net (Paul Prescod) Date: Wed, 29 Dec 1999 11:49:30 -0500 Subject: Duplicate declarations Re: [Types-sig] PyDL RFC 0.02 References: <3868EBEB.BDAFABE0@maxtal.com.au> Message-ID: <386A3B9A.C8DFE00D@prescod.net> skaller wrote: > > 1) declarations can be embedded. > 2) declarations can also be given in a separate file > 3) Processing module X commences by loading the > separate interface file > 4) Next, the .py file is scanned for declarations > 5) The results of (3) and (4) are merged somehow > 6) The .py files is scanned again by the code generator Agreed. > We must make some decisions here. > > Question: what happens if a typedecl kind of name is > declared more than once? > > Partial Answer 1: This must be permitted, because this > is _exactly_ what will happen if the .pi file is generated > from the .py file by scanning for declarations. In my model: Human creates .pi Human creates .py Type extractor scans .py and generates .gpi Type checker reads .pi and .gpi So we have no problem with the same declaration being read twice. Thus I would say that for version 1 we should ban duplicate declarations. 
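A minimal sketch of that merge step, for illustration only: the declaration format is just the RFC's plain "decl NAME as EXPR" lines, while the function names, the error handling, and the line-based parsing are invented here and are not part of the proposal.

    # Hypothetical sketch: combine hand-written (.pi) and generated (.gpi)
    # declarations, banning duplicate names as proposed for version 1.

    def read_decls(path):
        # Read "decl NAME as EXPR" lines into a {name: expr} mapping.
        decls = {}
        for line in open(path):
            line = line.strip()
            if line.startswith("decl "):
                parts = line[len("decl "):].split(" as ", 1)
                if len(parts) == 2:
                    decls[parts[0].strip()] = parts[1].strip()
        return decls

    def merge_interfaces(pi_path, gpi_path):
        pi = read_decls(pi_path)      # hand-written .pi declarations
        gpi = read_decls(gpi_path)    # declarations extracted from the .py
        duplicates = [name for name in gpi.keys() if name in pi]
        if duplicates:
            # version 1 rule: duplicate declarations are simply banned
            raise ValueError("duplicate declarations: %s" % ", ".join(duplicates))
        merged = {}
        merged.update(pi)
        merged.update(gpi)
        return merged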
Paul Prescod From paul@prescod.net Wed Dec 29 16:49:28 1999 From: paul@prescod.net (Paul Prescod) Date: Wed, 29 Dec 1999 11:49:28 -0500 Subject: [Types-sig] check.py (was: PyDL RFC 0.02) References: Message-ID: <386A3B98.35DF5735@prescod.net> Greg Stein wrote: > > ... > > To do this, I would need to change the Python grammar, or suck in .pyi > files. I plan to do the latter once some kind of formal grammar is > specified. If that doesn't happen soon, then I'll be using the grammar > that I posted in my type-proposal.html. It is complete and is sufficient > (yet Paul seems to be starting from scratch... :-( ). My syntax is mostly based on your web page. I switched "!" for "as" based on my belief that it isn't Pythonic to use random keyboard characters in ways that are not universally understood. And I put decl and typedecl at the front instead of making them operators because I agree with Tim Peters that we are designing a sub-language that needs to be understood as being separate by virtue of being evaluated BEFORE the code is executed. It is my personal opinion that the grammar should be the last thing you integrate into your system. In order to avoid maintaining a whole compiler while the grammar shifts, I would suggest you define classes like this: class ParameterizedInterface: .... class ConcreteInterface: .... class MethodSignature: and so forth. You need these classes regardless. Then your interface file becomes: Array = new ParamterizedInterface( parameters=["elements", "array"], attributes=[new MethodSignature( arguments=... )] ) We need this API anyhow so it would help alot if you could design it while you are writing your package. Paul Prescod From skaller@maxtal.com.au Wed Dec 29 16:17:59 1999 From: skaller@maxtal.com.au (skaller) Date: Thu, 30 Dec 1999 03:17:59 +1100 Subject: [Types-sig] PyDL RFC 0.02 References: <19991227175955.B44344@chronis.pobox.com> <19991229010507.A53430@chronis.pobox.com> <386A113E.2FF6EE79@prescod.net> Message-ID: <386A3437.741BCC93@maxtal.com.au> Paul Prescod wrote: > The important thing is that Greg's code (presumably!) knows how to > propogate types around expressions and suites like: > > j = k or q(foo() and bar()) > > That's quite an accomplishment considering how quickly he coded it. I agree. More to the point, Greg says this is primarily a framework -- clearly, it is currently a pretty lousy checker, but there's scope to add rules to improve it. The same is true of the code I'm doing for the cgen_module function in Viper: it generates pretty lousy code at the moment -- but that can be fixed later. What's important at first is a working implementation that covers the territory. -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From skaller@maxtal.com.au Wed Dec 29 23:51:24 1999 From: skaller@maxtal.com.au (skaller) Date: Thu, 30 Dec 1999 10:51:24 +1100 Subject: [Types-sig] Type checks References: Message-ID: <386A9E7C.C170AC5C@maxtal.com.au> Greg Stein wrote: > > On Thu, 30 Dec 1999, skaller wrote: > > Greg Stein wrote: > > > Python is also very deterministic. "Implementation-defined" really does > > > not exist. > > > > I agree, more or less. There is some indeterminism with > > bitwise operators (depends on the underlying C implementation, > > which sucks :-) > > If this is the case, then let Guido know. He has generally taken the pain > to ensure that cases like this just don't exist. I agree. 
[I gave up 'advising' Guido on most things some time ago] There's more 'indeterminism' or 'unspecified behaviour' in Python than you might think -- although it is hard to say, since the specification is not itself entirely precise, being worded somewhat informally. > > > Errors in Python raise exceptions. That is how it is defined, and that is > > > the general style/pattern for the language. > > > > Not true for assertions. > > And type constraints are assertions. > > Stop being a nit-pick. Why? Programs are executed by deterministic electro-mechanical automata. They picky. Worse, language specifications describe formal systems, which are also sensitive to nits. :-) >But since you are, let me rephrase: > > [In general,] errors in Python raise exceptions. [This is the pattern used > for all errors. One error, AssertionError, as raised by the "assert" > statement will not be raised by the compiler in "debug" mode, or in code > generated when optimization is enabled.] Nit: usually, there's no such thing as an 'error' in Python. > Essentially, even the assert statement is rigidly defined. I strongly > believe that type assertions would follow the exact pattern of regular > assertions. So do I. But I don't agree that the _formal_ semantics of the language define the behaviour of an assertion when it fails. The behaviour when it succeeds is 'none'. Of couse, since the 'reference manual' is not as formal as it might be, my belief is not backed up by a formal specification. -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From skaller@maxtal.com.au Thu Dec 30 00:06:21 1999 From: skaller@maxtal.com.au (skaller) Date: Thu, 30 Dec 1999 11:06:21 +1100 Subject: [Types-sig] check.py (was: PyDL RFC 0.02) References: <386A3B98.35DF5735@prescod.net> Message-ID: <386AA1FD.C726C130@maxtal.com.au> Paul Prescod wrote: > > My syntax is mostly based on your web page. I switched "!" for "as" > based on my belief that it isn't Pythonic to use random keyboard > characters in ways that are not universally understood. Then you had better think again. 'as' is an ENGLISH word. English is not 'universally' understood. > It is my personal opinion that the grammar should be the last thing you > integrate into your system. I don't agree. This viewpoint is equivalent to the old-fashioned notion of top down analysis, in which a design is completed in every detail before it is implemented. Object oriented programming is utterly contrary to this paradigm, being bottom up: it specifies bottom up development, with early coding of low level design parts. To be more concrete, what you are saying would require a vast amount of human brain power, instead of permitting early, partial implementations, which would allow machines to aid in our analysis. I have implemented 'x!t' in Viper, and then, later, I implemented 'def f(x!t)' -- the uses require grammar modifications in different places and are technically distinct. As I am now implementing a C code generator, I am noticing the effects of the optional typing on a compiler (although I'm not actually using the information yet). In particular, since my implementation is entirely dynamic, it fits well with cgen_module, which uses an already loaded module. 
I have not tried a static compiler which 'parses' text to generate code yet, but I suspect this will make my dynamic interpretation difficult to implement -- on the other hand, Greg Stein HAS tried this kind of tool -- and so I'd like to hear from him what the impact of the 'at run time' meaning would be, if he has looked at this. >In order to avoid maintaining a whole > compiler while the grammar shifts, I would suggest you define classes > like this: > > class ParameterizedInterface: > .... > > class ConcreteInterface: > .... > > class MethodSignature: > > and so forth. You need these classes regardless. I don't. My implementation is ML based, and the static compilations tools are likely to use native constructions not Python ones .. although I'm not sure. And in order to implement _anything_ I need a starting point, which is a formal grammar for the syntax extensions. -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From skaller@maxtal.com.au Thu Dec 30 00:11:46 1999 From: skaller@maxtal.com.au (skaller) Date: Thu, 30 Dec 1999 11:11:46 +1100 Subject: [Types-sig] check.py References: <386A3B97.42AEB426@prescod.net> Message-ID: <386AA342.A9AF80AA@maxtal.com.au> Paul Prescod wrote: > I agree with you Greg. By Python 2 we may have sufficient performance > that the "standard" compiler can be written in Python. Irrelevant. What is relevant is: a) the compiler uses efficient algorithms b) it is powerful enough to compile itself c) it generates efficient code In which case a Python written compiler can be used to generate fast code fast, for any Python code, including itself, by compiling itself. I personally don't believe Python 2 will have much better performance than 1.5, because I don't think Guido will add the features, and write the specification, in such a way that high performance is possible. [It would be too 'unpythonic' :-[ -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From skaller@maxtal.com.au Thu Dec 30 00:38:14 1999 From: skaller@maxtal.com.au (skaller) Date: Thu, 30 Dec 1999 11:38:14 +1100 Subject: [Types-sig] RFC Comments References: <38689428.4953374D@prescod.net> <3868F998.594C4EEF@maxtal.com.au> <386A3B9B.43E94DB9@prescod.net> Message-ID: <386AA976.F268268@maxtal.com.au> Paul Prescod wrote: > > > No, you mentioned it in point 5. :-) > > Okay, you win. Const is out for now. Make an appendix for 'to be considered' items. > > So: we have to be able to load an interface, without > > that necessarily implying the module be imported. > > I see why you would sometimes only care about interfaces and not about > implementations but I do not see what it harms to do a "real import" of > the module. I do, I will try to explain, but the reason is that it is inconsistent with the compilation module you specified, at least as I understand it. The way I undertand it, a new-fangled python language translator is required to behave 'as if' two passes are performed on script: the first pass gleans static type information, but generates no executable code, while the second generates executable code. In this 'two pass' model, it is inconsistent to 'import' a module in pass 1, since 'importing' a module requires a recursive tranlation pass involving TWO passes, and we know that the second pass can even involve recursive module execution. 
So it isn't _possible_ to import a module during pass 1. It won't work. It _is_ possible to import only the interface of a module, and this should be done when 'import X' is seen. In pass two, a full two pass importation is triggered, but the interface loading is skipped because the interface is already loaded. However, the import still requires TWO passes DURING PASS 2, because the implementation file may also include inline declarations. It follows FROM THE MODEL that these declarations are effectively private. You _could_ change the detail of the model which makes that so, to perform pass 1 on the implementation file during pass 1. I'm not sure what the impact is, or what your intent is. But one thing you cannot do is actually import a module during pass 1. Summmary: pass 1 processing only permits pass 1 processing to occur recursively, whereas pass 2 imports may invoke a full two phase translation. So because the semantics of importing a module (two passes) are quite distinct from only importing interfaces, and even that has two possible variants, it seems useful, if not essential, to permit a pass 1 only kind of importation -- 'include'. > > The reason I chose "!" for argument declarations was that it > > was already being used in similar way for the _expression_: > > > > y = x ! t > > > > and in this context, ":" cannot be used. > > Right. My RFC uses a function call syntax inline. It seems more Pythonic > and can cause no no precedence confusion. It is also compatible with the > Python 1.5.x grammar. Yes, but it cannot be used for function parameters. This IS an 'inline' use, even though it is distinct from a run time expression check (being applied to the parameter, syntactically, not the argument). > I used "as" everywhere else. The colon was just a lapse. OK. I like '!' because it is terse. 'as' requires four characters (two spaces are needed). This will clutter function definitions: def f(self as X, x as X, y as X) .. def f(self!X, x!X, y!X) but my taste is only a minor point here, I'll run with 'as' if that is the final choice. But your use of a function call like: interface_check(x,i) for a run time test is not as simple as reusing "!" for the same purpose: you could specify (x as i) be allowed in an expression instead -- I know this cannot be ambiguous, because 'as' is a keyword, and so is '!', and I have implemented the latter. -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From skaller@maxtal.com.au Thu Dec 30 01:03:55 1999 From: skaller@maxtal.com.au (skaller) Date: Thu, 30 Dec 1999 12:03:55 +1100 Subject: [Types-sig] Conformance model References: <386A3B9D.228BD31B@prescod.net> Message-ID: <386AAF7B.9EA4E3F3@maxtal.com.au> Paul Prescod wrote: > To faithfully represent CPython, an optimizing compiler would need to > silently compile: > > jfoieawjofij > fewajofijeawofj > fjowiaejfowei > > to: > > raise SyntaxError Precisely. In particular, assuming the name 'silly' for the above module: # module X try: import silly x = 0 except SyntaxError: x = 0 .. more code using x .. clearly indicates why this is necessary with the current specification, and why it is a bad specification to optimise. Indeed, it is lucky that 'SyntaxError' is a properly of a whole file so that: try: lkjhglkjhsdf lkhslkjhsdf except SyntaxError: .... raises a SyntaxError which is NOT trapped by the except clause (since SyntaxErrors are raised by the compiler, and apply to the WHOLE file). 
My argument is that this is what should be specified for some other 'errors' in some contexts. Given that python is dynamic, my argument is that, say, for a type error, it might make sense to ALLOW: try: 1 + "Hello" except TypeError: pass but mandate that def f(x): 1 + "Hello" is not a valid Python program -- a compiler can reject the program, rather than being forced to implement: def f(x): raise TypeError which is what is currently required. in particular, most people who use compilers RELY on the compiler, when it rejects code, NOT providing a .o file, which prevents the linker linking the code, which prevents the 'erroneous' code actually being run. C/C++ compilers are not required to do this, but they're not prevented from doing it either. A Python compiler would be, if we do not modify the semantics. At least as I see it, Greg in particular is not seeing this project the way I am -- I see that we are TRYING to make python LESS dynamic: Guido never intended it to be as dynamic as it is. That is, we DO NOT WANT to actually preserve the existing semantics. We WANT to break some naughty programs. It's desirable. It's clear Guido agrees. He has been the leader in suggesting changes to the specification supporting this, including backing up, if not actually first suggesting, the idea that module.value = x should be disallowed. You just can't sensibly talk about adding static typing to a language, without also saying that some text is not IN the language: it is ill-formed, invalid, an error, or just plain NOT PYTHON. In fact, the more stuff we ban the better, the condition we must observe is that we do not break TOO much sensible code. Deciding exactly what should be 'banned' is not easy, but it is not even possible if don't first agree that things like hgkjgaskhgad khgsdfkhgsdfkhg are not in fact valid Python programs. BTW: this tract belongs in a thread marked 'conformance model', rather than being mixed up with the static typing specification as given in Paul's RFC. There is a relation, but I'd not like the conformance issues to muddle, say, specification of the static typing grammar extensions. -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From skaller@maxtal.com.au Thu Dec 30 01:23:16 1999 From: skaller@maxtal.com.au (skaller) Date: Thu, 30 Dec 1999 12:23:16 +1100 Subject: [Types-sig] const (was: PyDL RFC 0.02) References: Message-ID: <386AB404.928FCFF7@maxtal.com.au> Greg Stein wrote: > Then we are in agreement. Oh dear, and I cannot think of any way out of this. I'm agreeing with someone. Someone is agreeing with me. For the moment :-> -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From skaller@maxtal.com.au Thu Dec 30 01:36:29 1999 From: skaller@maxtal.com.au (skaller) Date: Thu, 30 Dec 1999 12:36:29 +1100 Subject: [Types-sig] const (was: PyDL RFC 0.02) References: <386A3B99.74EE9F6E@prescod.net> Message-ID: <386AB71D.CFF4DE70@maxtal.com.au> Paul Prescod wrote: > It is quite possible that at some point I used const inconsistently to > mean both "non-rebindable name" and "immutable." Greg is right that I > was thinking about the latter when I put it in. In the long run we > really need both, but I will remove them from version 1 for now. Hang on! 
removing 'const' is not the same as adding a restriction preventing certain name rebindings -- as Guido pointed out in response to one of my posts, it is possible to detect many rebindings, and it is possible to ban some -- without needing a 'const' specification. > For now, we just need a solid definition of what types of rebinding are > legal. There are four kinds of names: No. There is only one kind of name. Perhaps you mean 'a name can be bound to one of four kinds of object'? > * module -- we must always disallow rebinding these because we don't > have a notion of two modules with the "same interface". Maybe in some > future version we could. I'm not so sure. Consider: import m module_x = m.x module_x = m.y Here, 'module_x' is a name bound to a module, namely m.x, which is rebound to another module, m.y. Module objects can be accessed just like any other python object. And in this case, you may not even KNOW that 'module_x' is bound to a module object. I'm thinking that your intent is that the name 'module_x' be declared as a module (with some properties), but then .. > * class -- rebinding is fine as long as the new class has a signuture > that will produce instances that conform to the declared interface(s). .. should be the same as for modules. Rebinding is allowed, provided the constraints implied by a declaration the name conforms to a particular interface are met. > * functions and other objects -- rebinding is fine as long as the new > function conforms to the declared interface. I think a better and simpler rule is to be found by simply taking this rule without the words 'functions and other' -- i.e. the rule applies uniformly to all objects. [not sure though] BUT: the is an important reason to do more, namely, caching of module functions and class methods. In this case, merely requiring interface conformance is not enough: we'd actually want to prevent rebinding. This can be done by banning: x.f = g where x is a module or class: it is still permitted where x is an instance. (Other cases??) -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From skaller@maxtal.com.au Thu Dec 30 01:48:20 1999 From: skaller@maxtal.com.au (skaller) Date: Thu, 30 Dec 1999 12:48:20 +1100 Subject: Duplicate declarations Re: [Types-sig] PyDL RFC 0.02 References: <3868EBEB.BDAFABE0@maxtal.com.au> <386A3B9A.C8DFE00D@prescod.net> Message-ID: <386AB9E4.74E286B0@maxtal.com.au> Paul Prescod wrote: > In my model: > > Human creates .pi > Human creates .py > Type extractor scans .py and generates .gpi > Type checker reads .pi and .gpi > > So we have no problem with the same declaration being read twice. Thus I > would say that for version 1 we should ban duplicate declarations. But you have not addressed the possibility that the .pi and .gpi contain a declaration for the same name: more precisely, at this point the above description does not describe exactly how the set of declarations (interfaces) is constructed. Presumably, the following axiom holds: name in names if name in (gpi xor pi) where we're talking about sets. What happens if: name in (gpi and pi) Here, there are TWO declarations of the same name. I don't think you can ban this, because it is not only likely to be a common case, it is likely to be almost EVERY case -- since many people will use a 'genpi' tool to extract embedded declarations into a separate interface file -- but won't remove the embedded declarations. 
A rule is required, but the one you mention (don't allow it), will not work in practice (IMHO). Or perhaps I misunderstand completely, and what you are saying is that the type checker's work is exactly to check that the gpi names do not conflict with the .pi names? [Hmmm: the more I think about it, the more this seems to be your intent ..??] -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From tim.hochberg@ieee.org Thu Dec 30 02:29:15 1999 From: tim.hochberg@ieee.org (Tim Hochberg) Date: Wed, 29 Dec 1999 19:29:15 -0700 Subject: [Types-sig] Conformance model References: <386A3B9D.228BD31B@prescod.net> <386AAF7B.9EA4E3F3@maxtal.com.au> Message-ID: <00c701bf526d$b5224c60$87740918@phnx3.az.home.com> John Skaller wrote: > My argument is that this is what should be > specified for some other 'errors' in some > contexts. Given that python is dynamic, > my argument is that, say, for a type error, > it might make sense to ALLOW: > > try: > 1 + "Hello" > except TypeError: pass > > but mandate that > > def f(x): > 1 + "Hello" > > is not a valid Python program -- a compiler > can reject the program, rather than being > forced to implement: > > def f(x): > raise TypeError > > which is what is currently required. > > in particular, most people who use > compilers RELY on the compiler, when it rejects > code, NOT providing a .o file, which prevents > the linker linking the code, which prevents > the 'erroneous' code actually being run. Let me stop lurking for a moment to comment: First off, the function 'f' is close enough to: def g(x): return g + "Hello" that it strikes me as somewhat strange to ban the former but not the later. I would like the compiler to catch, report, and reject a file containing the definition of f(x) when invoked directly (e.g., from the command line) . However, when invoked implicitly (e.g., by import) the compiler would go ahead and compile f(x) to raise a type error. In any event, I fail to see where you gain an efficiency advantage by outlawing f(x). Perhaps someone can elighten me here. Don't get me wrong, I definately see the advantage of having f(x) reported by a compiler. And I see the advantage of not generating .o files by default when invoked from the command line. This has finally made me appreicate where Paul Prescod was going with typesafe. I haven't gone back and reread it, so I apologize if I'm messsing this up. Anyway, it seems that both: typesafe def f(x): return 1 + "hello" typesafe def g(x): return g + "Hello" would both result in compile time errors similar to SyntaxError. (It probably should not be TypeError -- TypeError allready has a runtime meaning, perhaps InterfaceError or StaticTypeError). This, it seems, would allow more efficient code to be generated (at the very least, checks for thrown TypeErrors could be removed). In fact, I would argue that: def h(x) -> Int: return "spam" should also be legal Python (in some sense), although I would like the compiler to catch it by default. However: typesafe def h(x) -> Int: return "spam" would again raise a compile time error. -tim PS, I just realized that typesafe is equivalent to a " throws everythingButTypeError" clause in Java. Doubt it's too important, but I thought it interesting. 
From tim_one@email.msn.com Thu Dec 30 06:09:32 1999 From: tim_one@email.msn.com (Tim Peters) Date: Thu, 30 Dec 1999 01:09:32 -0500 Subject: [Types-sig] check.py (was: PyDL RFC 0.02) In-Reply-To: <386A3B98.35DF5735@prescod.net> Message-ID: <000601bf528c$70a62920$a02d153f@tim> [Paul Prescod] > My syntax is mostly based on your {GregS's] web page. I switched > "!" for "as" based on my belief that it isn't Pythonic to use > random keyboard characters in ways that are not universally > understood... FYI, in Common Lisp the name of this function is the delightful "the"; e.g., (the integer (somefunc i)) looks at the value returned by (somefunc i), passes it along if it's an integer, else raises an error. and-some-people-say-lisp-is-unreadable-ly y'rs - tim From tim_one@email.msn.com Thu Dec 30 06:09:36 1999 From: tim_one@email.msn.com (Tim Peters) Date: Thu, 30 Dec 1999 01:09:36 -0500 Subject: [Types-sig] Type checks In-Reply-To: <386A31D9.A29EA65E@maxtal.com.au> Message-ID: <000801bf528c$72f01920$a02d153f@tim> [John Skaller] > ... > There is some indeterminism with bitwise operators (depends > on the underlying C implementation, which sucks :-) If you know of a platform dependence in longs, it's a bug. Ditto for ints, unless it's one of the handful of shift cases that depends on the native C long size (but, e.g., that Python's right shift sign-extends is guaranteed regardless of what the platform C does with right shifts; ditto for mixing signs across / and %; etc). If you know of a bug, report it! >> Errors in Python raise exceptions. That is how it is defined, >> and that is the general style/pattern for the language. > Not true for assertions. > And type constraints are assertions. John, you're not getting anywhere with this approach -- drop it. This is not the ISO C++ committee, and we're not bound by the latter's conventions & conceits. The behavior of assertions in Python depends on the setting of a processor option. The notion that "you can't do that!!" is an arbitrary rule you're carrying in from C++ (*they* can't do that, because that's the rule *they* agreed to live by -- we did not, and all evidence says nobody else here is about to). The Python language does not define the means by which processor options are specified, but does define their effects. It is not required that a processor implement the processor option that we informally refer to as "-O mode" -- but if it does, its effect is defined. it's-not-hard-to-read-between-the-lines-when-they're-a- kilometer-apart-ly y'rs - tim From paul@prescod.net Wed Dec 29 22:12:41 1999 From: paul@prescod.net (Paul Prescod) Date: Wed, 29 Dec 1999 17:12:41 -0500 Subject: [Types-sig] const (was: PyDL RFC 0.02) References: <386A3B99.74EE9F6E@prescod.net> <386AB71D.CFF4DE70@maxtal.com.au> Message-ID: <386A8759.C2567113@prescod.net> skaller wrote: > > Hang on! removing 'const' is not the same as > adding a restriction preventing certain name rebindings -- > as Guido pointed out in response to one of my posts, > it is possible to detect many rebindings, and it is possible > to ban some -- without needing a 'const' specification. I don't follow you and anyhow I don't see why banning some name rebindings needs to be high on our list of things to do. The optimization and safety benefits are not as high as those of ordinary type checking. > > For now, we just need a solid definition of what types of rebinding are > > legal. There are four kinds of names: > > No. There is only one kind of name. 
> Perhaps you mean 'a name can be bound to one of four kinds of object'? Yes and no. We are now able to associate types with names so we need to be cognizant of both the type of the name and the type of the object. > > * module -- we must always disallow rebinding these because we don't > > have a notion of two modules with the "same interface". Maybe in some > > future version we could. > > I'm not so sure. Consider: > > import m > module_x = m.x > module_x = m.y > > Here, 'module_x' is a name bound to a module, namely m.x, > which is rebound to another module, m.y. Module objects > can be accessed just like any other python object. > > And in this case, you may not even KNOW that 'module_x' > is bound to a module object. > > I'm thinking that your intent is that the name 'module_x' > be declared as a module (with some properties), but then .. > > > * class -- rebinding is fine as long as the new class has a signuture > > that will produce instances that conform to the declared interface(s). > > .. should be the same as for modules. Rebinding > is allowed, provided the constraints implied by a declaration > the name conforms to a particular interface are met. My point was that we need to treat modules differently than classes because two modules cannot export the same interface whereas two classes can. interface Foo: decl bar: def( int ) -> int decl foo1: def( ) -> Foo decl foo2: def( ) -> Foo class foo1: __impements__=[Foo] def bar( num ): return 5 class foo2: __impements__=[Foo] def bar( num ): return 6 foo1 = foo2 # okay There is no equivalent code for modules because module interfaces are not named and modules do not claim to conform to particular interfaces. > BUT: the is an important reason to do more, > namely, caching of module functions and class methods. > In this case, merely requiring interface conformance > is not enough: I cannot see that there is much to be gained in caching class methods. The vast majority of time you will be handed an *instance* of the class and you will look up the method at runtime using vtables or whatever. Paul Prescod From paul@prescod.net Wed Dec 29 22:16:06 1999 From: paul@prescod.net (Paul Prescod) Date: Wed, 29 Dec 1999 17:16:06 -0500 Subject: [Types-sig] Re: Conformance model References: <386A3B9D.228BD31B@prescod.net> <386AAF7B.9EA4E3F3@maxtal.com.au> Message-ID: <386A8826.23694955@prescod.net> skaller wrote: > > ... > > Given that python is dynamic, > my argument is that, say, for a type error, > it might make sense to ... mandate that > > def f(x): > 1 + "Hello" > > is not a valid Python program -- a compiler > can reject the program, rather than being > forced to implement: > > def f(x): > raise TypeError > > which is what is currently required. My feeling could be summed up thus: "The following actions are illegal. A Python compiler may report them and refuse to compile the program or it may run the program and generate some form of Error exception." I would only willing to go further if you would describe overwhelming optimization benefits in allowing undefined behavior. You are swimming against the tide of history here. Java doesn't have much undefined behavior either. "Programmers these days" are more interested in determinism than performance. Programmers are generally more interested in optimizing coding efficiency rather than program efficiency. Java and Python do not even array-bounds checks to be elided even though there is no excuse for a valid program overwriting array bounds. 
Paul Prescod From paul@prescod.net Wed Dec 29 22:19:51 1999 From: paul@prescod.net (Paul Prescod) Date: Wed, 29 Dec 1999 17:19:51 -0500 Subject: [Types-sig] RFC Comments References: <38689428.4953374D@prescod.net> <3868F998.594C4EEF@maxtal.com.au> <386A3B9B.43E94DB9@prescod.net> <386AA976.F268268@maxtal.com.au> Message-ID: <386A8907.906DC149@prescod.net> skaller wrote: > > ... > It _is_ possible to import only the interface > of a module, and this should be done when 'import X' is > seen. In pass two, a full two pass importation is triggered, > but the interface loading is skipped because the interface > is already loaded. However, the import still requires TWO > passes DURING PASS 2, because the implementation file > may also include inline declarations. Do you mean old fashioned Python function and class declarations or newfangled decls and typedecls? > It follows FROM THE MODEL > that these declarations are effectively private. It was never my intent that decls and typedecls could be private. > ... > Summmary: pass 1 processing only permits > pass 1 processing to occur recursively, whereas > pass 2 imports may invoke a full two phase translation. My plan was: do everything relating to types, in ALL modules and then do everything relating to code generation in all modules. > So because the semantics of importing a module > (two passes) are quite distinct from only importing > interfaces, and even that has two possible variants, > it seems useful, if not essential, to permit a > pass 1 only kind of importation -- 'include'. Still not sold on "include". > but my taste is only a minor point here, I'll run with 'as' > if that is the final choice. We'll do a poll once other details are worked out. > But your use of a function > call like: > > interface_check(x,i) > > for a run time test is not as simple as reusing "!" > for the same purpose: I am mildly uncomfortable with new expression syntax but my arguments against it are not watertight so I will document the inline "as" unless someone else feels as I do. To be honest, I would prefer colons in function defs and "as" in other contexts if it came down to it. If terseness is important enough to sacrifice readability for, I would rather sacrifice it in favor of a mild inconsistency instead of a whole new meaning for a punctuation character. Paul Prescod From paul@prescod.net Thu Dec 30 09:13:14 1999 From: paul@prescod.net (Paul Prescod) Date: Thu, 30 Dec 1999 04:13:14 -0500 Subject: [Types-sig] check.py (was: PyDL RFC 0.02) References: <386A3B98.35DF5735@prescod.net> <386AA1FD.C726C130@maxtal.com.au> Message-ID: <386B222A.4AD0EC24@prescod.net> skaller wrote: > > > and so forth. You need these classes regardless. > > I don't. My implementation is ML based, > and the static compilations tools are likely > to use native constructions not Python ones .. although > I'm not sure. You need these classes because they are the Python equivalent of Java's reflection API. They are available to the Python programmer. I think that in the interests of determinism, you must keep this information around as Python objects unless you can demonstrate that the programmer does not "ask for" them. This will actually be pretty easy to demonstrate if our API is explicit enough. (e.g. 
if you don't import type_reflect then you can't get at the information so the compiler can throw it away) Paul Prescod From paul@prescod.net Thu Dec 30 09:44:49 1999 From: paul@prescod.net (Paul Prescod) Date: Thu, 30 Dec 1999 04:44:49 -0500 Subject: Duplicate declarations Re: [Types-sig] PyDL RFC 0.02 References: <3868EBEB.BDAFABE0@maxtal.com.au> <386A3B9A.C8DFE00D@prescod.net> <386AB9E4.74E286B0@maxtal.com.au> Message-ID: <386B2991.F91CC926@prescod.net> skaller wrote: > > ... > > Here, there are TWO declarations of the same name. > I don't think you can ban this, because it is not > only likely to be a common case, it is likely > to be almost EVERY case -- since many people > will use a 'genpi' tool to extract embedded declarations > into a separate interface file -- but won't remove > the embedded declarations. The virtue of the model that you described so succinctly is that there is no reason to run "genpi" explicitly. It is run during the first pass of the compilation FOR you. > Or perhaps I misunderstand completely, and what you are > saying is that the type checker's work is exactly to > check that the gpi names do not conflict with the .pi > names? That is also a reasonable argument and probably one I would tend toward in future versions of the spec. Paul Prescod From paul@prescod.net Thu Dec 30 09:44:36 1999 From: paul@prescod.net (Paul Prescod) Date: Thu, 30 Dec 1999 04:44:36 -0500 Subject: [Types-sig] const (was: PyDL RFC 0.02) References: <386A3151.8F5AFCD7@maxtal.com.au> Message-ID: <386B2984.DEE409B@prescod.net> skaller wrote: > > ... > I know, but that is my point: it isn't consistent > with a model in which checking is applied to _names_: Yes it is. Readonly-ness is part of an object's interface. We could make ReadOnlyMapping types, ReadOnlyFile types, ReadOnlyList types, ReadOnlyBankAccount types and so forth. It makes more sense to me, however, to separate out ReadOnly-ness because it is so pervasive. But not for version 1. Paul Prescod From skip@mojam.com (Skip Montanaro) Thu Dec 30 15:46:34 1999 From: skip@mojam.com (Skip Montanaro) (Skip Montanaro) Date: Thu, 30 Dec 1999 09:46:34 -0600 (CST) Subject: [Types-sig] Run time arg checking implemented In-Reply-To: <3862527B.99B783C8@maxtal.com.au> References: <386177A3.86F0D505@prescod.net> <3862527B.99B783C8@maxtal.com.au> Message-ID: <14443.32346.368043.41998@dolphin.mojam.com> skaller> I have implemented run time argument checking in Viper, using skaller> Greg's ! operator. The syntax (so far) is like: skaller> def f( p ! t = dflt): pass skaller> and the semantics are to check that an argument has the skaller> nominated type: skaller> f(a) skaller> checks like: skaller> if type(a) is not t: skaller> raise TypeError "messge" Any reason this isn't assert type(a) is t ? Skip Montanaro | http://www.mojam.com/ skip@mojam.com | http://www.musi-cal.com/ 847-971-7098 | Python: Programming the way Guido indented... From skip@mojam.com (Skip Montanaro) Thu Dec 30 15:50:35 1999 From: skip@mojam.com (Skip Montanaro) (Skip Montanaro) Date: Thu, 30 Dec 1999 09:50:35 -0600 (CST) Subject: [Types-sig] type declaration syntax In-Reply-To: <386165AF.F6E6BF81@maxtal.com.au> References: <385C1345.C21FF180@maxtal.com.au> <19991222224650088.AAA118.228@max41101.izone.net.au> <386165AF.F6E6BF81@maxtal.com.au> Message-ID: <14443.32587.574186.48706@dolphin.mojam.com> skaller> I.e. TWO bans fix most problems. The ban on module level skaller> rebindings is a significant restriction. I'll say. 
The common idiom for trapping stdout or stderr is to rebind sys.stdout/stderr to a file-like object. How would that be accomplished in such a straightforward way if module-level rebindings are disallowed? Skip Montanaro | http://www.mojam.com/ skip@mojam.com | http://www.musi-cal.com/ 847-971-7098 | Python: Programming the way Guido indented... From skaller@maxtal.com.au Thu Dec 30 16:38:12 1999 From: skaller@maxtal.com.au (skaller) Date: Fri, 31 Dec 1999 03:38:12 +1100 Subject: [Types-sig] type declaration syntax References: <385C1345.C21FF180@maxtal.com.au> <19991222224650088.AAA118.228@max41101.izone.net.au> <386165AF.F6E6BF81@maxtal.com.au> <14443.32587.574186.48706@dolphin.mojam.com> Message-ID: <386B8A74.3617D3E9@maxtal.com.au> Skip Montanaro wrote: > > skaller> I.e. TWO bans fix most problems. The ban on module level > skaller> rebindings is a significant restriction. > > I'll say. The common idiom for trapping stdout or stderr is to rebind > sys.stdout/stderr to a file-like object. How would that be accomplished in > such a straightforward way if module-level rebindings are disallowed? Special case the sys module. A bit messy -- but so is the sys module :-) -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From skaller@maxtal.com.au Thu Dec 30 16:44:06 1999 From: skaller@maxtal.com.au (skaller) Date: Fri, 31 Dec 1999 03:44:06 +1100 Subject: [Types-sig] Run time arg checking implemented References: <386177A3.86F0D505@prescod.net> <3862527B.99B783C8@maxtal.com.au> <14443.32346.368043.41998@dolphin.mojam.com> Message-ID: <386B8BD6.4E494370@maxtal.com.au> Skip Montanaro wrote: > > skaller> I have implemented run time argument checking in Viper, using > skaller> Greg's ! operator. The syntax (so far) is like: > > skaller> def f( p ! t = dflt): pass > > skaller> and the semantics are to check that an argument has the > skaller> nominated type: > > skaller> f(a) > > skaller> checks like: > > skaller> if type(a) is not t: > skaller> raise TypeError "messge" > > Any reason this isn't > > assert type(a) is t > > ? Different message. -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From skaller@maxtal.com.au Thu Dec 30 17:46:42 1999 From: skaller@maxtal.com.au (skaller) Date: Fri, 31 Dec 1999 04:46:42 +1100 Subject: [Types-sig] RFC Comments References: <38689428.4953374D@prescod.net> <3868F998.594C4EEF@maxtal.com.au> <386A3B9B.43E94DB9@prescod.net> <386AA976.F268268@maxtal.com.au> <386A8907.906DC149@prescod.net> Message-ID: <386B9A82.ED38623E@maxtal.com.au> Paul Prescod wrote: > > skaller wrote: > > > > ... > > It _is_ possible to import only the interface > > of a module, and this should be done when 'import X' is > > seen. In pass two, a full two pass importation is triggered, > > but the interface loading is skipped because the interface > > is already loaded. However, the import still requires TWO > > passes DURING PASS 2, because the implementation file > > may also include inline declarations. > > Do you mean old fashioned Python function and class declarations or > newfangled decls and typedecls? Sorry, I mean't 'decls': i.e. interface declarations. [which can also be embedded like 'def f(x as t)'] > > It follows FROM THE MODEL > > that these declarations are effectively private. > > It was never my intent that decls and typedecls could be private. OK. 
I think this is necessary for correct interface design. But I concede that the default could sensibly be public; and a 'private' keyword used? > > Summmary: pass 1 processing only permits > > pass 1 processing to occur recursively, whereas > > pass 2 imports may invoke a full two phase translation. > > My plan was: do everything relating to types, in ALL modules and then do > everything relating to code generation in all modules. I see. I'm not sure that can work. The reason is, interface loading is dynamic. Because module loading is dynamic. [This can't be changed, it would break Interscript, for example] My thought was the static processing would be done by making the compiler two pass. And the compiler currently runs when a module is first loaded -- at varying times during code execution when 'import' statements are executed. > Still not sold on "include". Leave it out, and see what happens. Write some code: invent a mini-language (easy to parse in python), and implement the model. I think something like: import decl y check y where check y prints either 'y is defined' or 'y is not defined': this code is to be executed at 'run time' to report on what names are visible. Write the py -> pi tool. [!] > I am mildly uncomfortable with new expression syntax Rightly so. I'm not insisting, just presenting a feeling. I _do_ feel reasonably comfortable with 'as'. [I write lots of code. I can't type well. I like code compact. Means I can fit more information on a screen. I like python string/sequence handling precisely because it is a terse notation] -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From paul@prescod.net Thu Dec 30 17:44:39 1999 From: paul@prescod.net (Paul Prescod) Date: Thu, 30 Dec 1999 12:44:39 -0500 Subject: [Types-sig] PyDL RFC 0.4 Message-ID: <386B9A07.57234970@prescod.net> PyDL RFC 0.03 A PyDL file declares the interface for a Python module. PyDL files declare interfaces, objects and the required interfaces of objects. This document (loosely, informally) describes the behavior of a class of software modules called "static interface interpreters" and "static interface checkers". Interface interpreters are run as part of the regular Python module interpetation process. They read PyDL files and make the interface objects available to the Python compiler. Interface checkers read PyDL files and Python code to verify conformance of the code to the interface. Once this design is done we will write a formal specification. PyDL Files: =========== A PyDL file can be either created by a programmer or auto-generated. The syntax and semantics of the two types are identical. An auto-generated file is created by scanning a Python module for inline declarations. Interfaces are the central concept in PyDL files. Interfaces are Python objects like anything else but they are created by the interface interpreter. They are made available to the static interface checker before Python compilation begins. In addition to defining interfaces, it is possible to declare other attributes of the module. Each declaration associates an interface with the name of the attribute. Values associated with the name in the module namespace must always conform to the declared interface. Furthermore, by the time the module has been imported each name must have an associated value. It is not necessary for the static interface checker to prove that these rules will not be violated. 
It is also acceptable to check at runtime. Grammar: ======== In the very short term, implementors are encouraged to use any grammar that allows every example in this document. Contributions of proposals for the grammar are solicited. Interfaces: =========== Interfaces are created through interface definitions and interface expressions. There may also be facilities for creating interfaces at runtime but they are neither available to nor relevant to the interface interpreter. Interface definitions are similar to Python class definitions. They use the keyword "interface" instead of the keyword "class". Interfaces are either complete or incomplete. An incomplete interface takes parameters and a complete interface does not. It is not possible to create Python objects that conform to incomplete interfaces. They are just a reuse mechanism analogous to functions in Python. An example of an incomplete interface would be "Sequence". It is incomplete because we need to define the interface of the contents of the sequence. In an interface expression the programmer can provide parameters to generate a new interface. Typedefs allow us to give names to complete or incomplete interfaces described by interface expressions. Typedefs are an interface expression re-use mechanism. Interfaces have an intuitive concept of equivalence which will be formalized later in the document. Behavior: ========= For our purposes, we will presume that every Python environment has some form of compilation phase. This is true of all existing Python environments. The Python compiler invokes the static interface interpreter and optionally the interface checker on a Python file and its associated PyDL file. Typically a PyDL file is associated with a Python file through placement in the same path with the same base name and a ".pydl" or ".gpydl" extension. If both are avaiable, the module's interface is created by combining the declarations in the ".pydl" and ".gpydl" files. "Non-standard" importer modules may find PyDL files using other mechanisms such as through a look-up in an relational database, just as they find modules themselves using non-standard mechanisms. The interface interpreter reads the PyDL file and builds the relevant interface objects. If the PyDL file refers to other modules then the interface interpreter can read the PyDL files associated with those other modules after generating them if necessary. It is acceptable to use date-stamps, CRCs and other heuristics to demonstrate that a generated PyDL file is not likely to be inconsistent with its module. The Python compiler may invoke the interface checker after the interface interpreter has built interface objects and before it interprets the Python module. Once it interprets the Python code, the interface objects are available to the runtime code through a special namespace called the "interface namespace". There is one such namespace per module. It is accessible from the module's namespace via the name "__interfaces__". This namespace is interposed in the name search order between the module's namespace and the built-in namespace. Built-in Interfaces: ==================== Any Number Integral Int Long Float Complex Sequence String Record Mapping Modules Callable Class Function Methods UnboundMethods BoundMethods Null File Certain interfaces may have only one implementation. These "primitive" types are Int, Long, Float, String, UnboundMethods, BoundMethods, Module, Function and Null. 
Over time this list may get shorter as the Python implementation is generalized to work mostly by interfaces. Note: In rare cases it may be necessary to create new primitive types with only a single implementation (such as "window handle" or "file handle"). This is the case when the object's actual bit-pattern is more important than its interface. Note: The Python interface graph may not always be a tree. For instance there might someday be a type that is both a mapping and a sequence. The details of each interface remain to be worked out. Volunteers are solicited. Interface expression language: ============================== Interface expressions are used to declare that attributes must conform to certain interfaces. In a interface expression you may: 1. refer to an interface by name. The name can either be simple or it may be of the form "module.interfacename" where "interfacename" is a name in one of two PyDL files for the named module. The expression evaluates to the referenced interface. Two expressions consisting only of names are equivalent if the referenced interface objects are equivalent. 2. make a union of two or more interfaces: integer or float integer or float or complex The expression evaluates to an interface object I such that a value V conforms to I iff it conforms to any interface in ths list. Two union expressions X and Y are equivalent if their lists are the same length and each element in X has an equivalent in Y and vice versa. 3. parameterize a interface: Array( Int, 50 ) Array( length=50, elements=Int ) Note that the arguments can be either interface expressions or simple Python expressions. A "simple" Python expression is an expression that does not involve a function call or variable reference. The expression evaluates to a complete instantiation of the referenced incomplete interface. Two parameterization expressions are equivalent if the parameterized interface is equivalent and each parameter is equivalent. 4. use a syntactic shortcut: [Foo] => Sequence( Foo ) # sequence of Foo's {String:Int} => Mapping( String, Int ) # Mapping from A's to B's (A,B,C) => Record( A, B, C ) # 3-element sequence of interface a, followed # by b followed by c The expression evaluates to the same thing as the expanded versions. Equivalence is identical to the situation for the expanded versions. 5. generate a callable interface: def( Arg1 as Type1, Arg2 as Type2 ) -> ReturnType The argument name may be elided: def( Int, String ) -> None Note: this is provided for compatibiity with libraries and tools that may not support named arguments. Python programmers are strongly encouraged to use argument names as they are good documentation and are useful for development environments and other reflective tools. It is possible to declare variable length argument lists. They must always be declared as sequences but the element interface may vary. def( Arg1 as String, * as [Int] ) -> Int # callable taking a String, and some un-named Int # arguments Finally, it is possible to declare keyword argument lists. They must always be declared as mappings from string to some interface. def( Arg1 as Int , ** as {String: Int}) - > Int Note that at this point in time, every Python callable returns something, even if it is None. The return value can be named, merely as documentation: def( Arg1 as Int , ** as {String: Int}) - > ReturnCode as Int The expression evaluates to a callable interface that takes the described arguments and returns the described value. 
Declarations in a PyDL file: ============================ 1. Imports An import statement in an interface file loads another interface file. The import statement works just like Python's except that it loads the PyDL file found with the referenced module, not the module itself. (of course we will make this definition more formal in the future) 2. Basic attribute interface declarations: decl myint as Int # basic decl intarr as Array( Int, 50 ) # parameterized decl intarr2 as Array( size = 40, elements = Int ) # using keyword syntax Attribute declarations are not parameteriable. Furthermore, they must resolve to complete interfaces. So this is allowed: class (_X,_Y) spam( A, B ): decl someInstanceMember as _X decl someOtherMember as Array( _X, 50 ) .... These are NOT allowed: decl someModuleMember(_X) as Array( _X, 50 ) class (_Y) spam( A, B ): decl someInstanceMember(_X) as Array( _X, 50 ) Because that would allow you to create a "spam" without getting around to saying what _X is for that spam's someInstanceMember. That would disallow static type checking. 3. Callable object interface declarations: Functions are the most common sort of callable object but class instances can also be callable. Callables may be runtime parameterized and/or interface parameterized. For instance, there might be a method "add" that takes two objects with the same interface and returns an object with that interface. decl DoSomething( _X ) as def( a as _X, b as _X )-> _X _X is the interface parameter. By convention these start with underscores. a and b are the runtime parameters. Note: it is usually possible to coerce a parameterized function into a fully polymorphic function where the arguments can vary from each other quite widely despite being declared to have the same parameter type. You can do this by instantiating the function with "Any" as the parametric type. It is possible to allow _X to vary to some extent but still require it to always be a Number: decl Add(_X as Number) as def( a as _X, b as _X )-> _X So this function could take two longs or two floats but not two strings. Note: as above, you could create a version that would take a float and a long by referring to a common base interface like Number itself. 4. Class Declarations A class is a callable object that can be subclassed. Currently the only way to make those (short of magic) is with a class declaration, but one could imagine that there might someday be an __subclass__ magic method that would allow any old object instance to also stand in as a class. The syntax for a class definition is identical to that for a function with the keyword "def" replaced by "class". What we are really defining is the constructor. The signature of the created object can be described in an interface declaration. decl TreeNode(_X) as class( a as _X, Right as TreeNode( _X ) or None, Left as TreeNode( _X ) or None ) -> ParentClasses, Interfaces When the initialization completes, every attribute in the declared interfaces should have a value. 5. Interface declarations: An interface decaration starts with the keyword "interface", optionally has interface parameters in parentheses and then continues with the interface name and the names of super-interfaces. This interface inherits and must not contradict the signature of the parent interfaces. The interface body is made up of attribute declarations. interface (_X,_Y) spam( a, b ): decl somemember as _X decl someOtherMember as _Y decl someClassAttr as [ _X ] decl someFunction as def( a as Int, b as Float ) -> String 6. 
Typedefs:

Typedefs allow interfaces to be renamed and for parameterized variations of interfaces to be given names.

    typedef PositiveInt as BoundedInt( 0, maxint )
    typedef NegativeInt as BoundedInt( max=-1, min=minint )
    typedef NullableInt as Int or None
    typedef Dictionary(_Y) as {String:_Y}

New Module Syntax:
==================

In a future version of Python, declarations will be allowed in Python code and will have the same meanings. They will be extracted to a generated PyDL file and evaluated there (along with hand-written declarations in the PyDL file). In the meantime, there is a backwards compatible syntax explained later.

"typesafe":
===========

In addition to decl and typedecl, the keyword "typesafe" can be used to indicate that a function or method uses types in such a way that each operation can be checked at compile time and demonstrated not to call any function or operation with the wrong types.

The keyword precedes the function definition:

    typesafe def foo( a, b ):
        ...

The typesafe keyword can also be used before a class definition. That means that every method in the class is declared to be type safe.

The typesafe keyword can be used with the "module" modifier before the first function or class definitions in a module to state that all of the functions and classes in the module are type safe:

    import spam
    import rabbit
    import orphanage

    typesafe module

An interface checker's job is to ensure that methods that claim to be typesafe actually are. It must report, and refuse to compile, modules that misuse the keyword, and it may not refuse to compile modules that do not. The interface checker may optionally warn the programmer about other suspect constructs in Python code.

Note: typesafe is the only change to class definition or module definition syntax.

"as"
====

The "as" operator takes an expression and an interface expression and verifies at runtime that the expression evaluates to an object that conforms to the interface described by the expression. It returns the expression's value if it succeeds and raises TypeAssertionError (a subtype of AssertionError) otherwise.

    foostr = foo as [String]  # verifies that foo is a sequence of Strings
                              # and re-assigns it.

This operation can be used in various ways. The most basic way to use it is as a test:

    >>> j = getData()
    >>> j as Int
    >>> j = j + 1

The "as" operator has the lowest precedence of the binary operators.

Interface objects
=================

Every interface object (remember, interfaces are just Python objects!) has the following method:

    __conforms__ : def (obj: Any ) -> boolean

This method can be used at runtime to determine whether an object conforms to the interface. It would check the signature for sure but might also check the actual values of particular attributes.

There is also a global function with this signature:

    class_conforms : def ( obj as Class, Obj as Interface ) -> boolean

This function can be used either at compile time (e.g. by an implementation of an interface checker) or at runtime to check that a class will generate objects that have the right signature to conform to the interface.

(the rest of the interface reflection API will be worked out later)

Experimental syntax:
====================

There is a backwards compatible syntax for embedding declarations in a Python 1.5.x file:

    "decl","myint as Integer"
    "typedef","PositiveInteger as BoundedInt( 0, maxint )"

    "typesafe"
    def ...( ... ):
        ...

    "typesafe module"

There will be a tool that extracts these declarations from a Python file to generate a .gpydl (for "generated PyDL") file.
These files are used alongside hand-crafted PyDL files. The "effective interface" of the module is evaluated by combining the declarations from the two files as if they were concatenated together (more or less... exact details to follow). The two files must not contradict each other, just as declarations within a single file must not contradict each other. This means that names that are declared twice must evaluate to equivalent types.

Over time the .gpydl generator will get more intelligent and may deduce type information based on code outside of explicit declarations (for instance function and class definitions, assignment statements and so forth).

The "as" keyword is replaced in the backwards-compatible syntax with

Summary of Major Runtime Implications:
======================================

All of the named interfaces defined in a PyDL file are available in the "__interfaces__" dictionary that is searched between the module dictionary and the built-in dictionary.

The runtime should not allow an assignment or function call to violate the declarations in the PyDL file. In an "optimized speed mode" those checks would be disabled. In non-optimized mode, these assignments would generate an IncompatibleAssignmentError.

The runtime should not allow a read from an unassigned attribute. It should raise NotAssignedError if it detects this at runtime instead of at compile time.

Several new object interfaces and functions are needed.

Future Directions:
==================

Inferencing/Deduction:
======================

At some point in the future, PyDL files will likely be generated from source code using a combination of declarations within Python code and some sorts of interface deduction and inferencing based on various kinds of assignment.

Const-ness/Readonly-ness:
=========================

We need to be able to say that some attributes cannot be re-bound and that some attributes and parameters are immutable.

Idea: The Undefined Object:
===========================

The Undefined object is used as the value of unassigned attributes and the return value of functions that do not return a value. It may not be bound to a name.

    a = Undefined  # raises UndefinedValueError
    a = b          # raises UndefinedValueError if b has not been assigned

Undefined can be thought of as a subtype of NameError. Undefined is needed because it is now possible to declare names at compile time but never get around to assigning to them. In ordinary Python this is not possible.

The only useful thing you can do with Undefined is check whether an object "is" Undefined:

    if a is Undefined:
        doSomethingWithA(a)
    else:
        doSomethingElse()

This is equivalent to:

    try:
        doSomethingWithA( a )
    except NameError:
        doSomethingElse()

It is debatable whether we still need NameError for anything other than backwards compatibility. We could say that any referenced variable is automatically initialized to "Undefined". Undefined is sufficiently restrictive that this will not lead to buggy programs.

Undefined also corrects a long-term unsafe issue with functions. Now, functions that do not explicitly return a value return Undefined instead of None. That means that this is no longer possible:

    a = list.sort()

With Undefined, it will blow up because it is not possible to assign the Undefined value. Before Undefined, the code did not blow up but it also did not do the "right thing." It assigned None to "a", which was seldom what was intended.
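To make the runtime behaviour of the "as" operator concrete, here is a minimal sketch of the check it performs, written as a plain function since today's Python has no such operator. The helper name "conform" is illustrative only; the proposal spells this as "value as InterfaceExpr":

    class TypeAssertionError(AssertionError):
        # specified above as a subtype of AssertionError
        pass

    def conform(value, interface):
        # return the value unchanged if it conforms, otherwise raise
        if interface.__conforms__(value):
            return value
        raise TypeAssertionError("value does not conform to interface")

    # usage corresponding to:  foostr = foo as [String]
    # foostr = conform(foo, SequenceOfString)   # SequenceOfString: some
    #                                           # interface object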
From gstein@lyra.org Thu Dec 30 18:06:12 1999 From: gstein@lyra.org (Greg Stein) Date: Thu, 30 Dec 1999 10:06:12 -0800 (PST) Subject: [Types-sig] RFC Comments In-Reply-To: <386A3B9B.43E94DB9@prescod.net> Message-ID: On Wed, 29 Dec 1999, Paul Prescod wrote: >... > > The reason I chose "!" for argument declarations was that it > > was already being used in similar way for the _expression_: > > > > y = x ! t > > > > and in this context, ":" cannot be used. > > Right. My RFC uses a function call syntax inline. It seems more Pythonic > and can cause no no precedence confusion. It is also compatible with the > Python 1.5.x grammar. I've discussed the notion of function call syntax before. It is Not Good. It causes problems with people believing that an actual function exists that can be referenced, passed around, etc. I think that I had a few other reasons. If you're going to introduce a new operator, then it should be a new operator. Not a function call. > > OK. You should proceed with _some_ fixed syntax. > > I used "as" everywhere else. The colon was just a lapse. "as" has the wrong semantic. x = y as Int That looks like you want to use y "as an integer". Of course, that isn't what is happening. You're asserting that y *is* an integer. Either use "!" or "isa". But definitely do not use "as". Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Thu Dec 30 18:18:51 1999 From: gstein@lyra.org (Greg Stein) Date: Thu, 30 Dec 1999 10:18:51 -0800 (PST) Subject: [Types-sig] rebinding (was: const) In-Reply-To: <386A3B99.74EE9F6E@prescod.net> Message-ID: On Wed, 29 Dec 1999, Paul Prescod wrote: >... > For now, we just need a solid definition of what types of rebinding are > legal. There are four kinds of names: Why not just make them all legal, and worry about this later? > * module -- we must always disallow rebinding these because we don't > have a notion of two modules with the "same interface". Maybe in some > future version we could. Untrue. Ever look at the "anydbm" module and its cohorts? How about the DBAPI modules? I've said before: modules and classes both have the notion of an interface. We ought to be able to associate an interface with a module! > * class -- rebinding is fine as long as the new class has a signuture > that will produce instances that conform to the declared interface(s). > > * functions and other objects -- rebinding is fine as long as the new > function conforms to the declared interface. These make sense. Note on functions: how is a function declared to have a particular signature? If the function itself is declaring the signature, then rebinding could be allowed. For example: interface foo: def bar(x: Int)->String: "doc string" class baz(foo): def bar(x: Int)->String: ... def func(x: String)->String: ... def func(x: Int)->String: ... In the above example, baz.bar must conform to foo.bar since the class is supposed to conform to the foo interface. func() is defining the module's interface, so the second func() is simply tweaking the "final" interface. Cheers, -g -- Greg Stein, http://www.lyra.org/ From skaller@maxtal.com.au Thu Dec 30 18:17:34 1999 From: skaller@maxtal.com.au (skaller) Date: Fri, 31 Dec 1999 05:17:34 +1100 Subject: [Types-sig] const (was: PyDL RFC 0.02) References: <386A3151.8F5AFCD7@maxtal.com.au> <386B2984.DEE409B@prescod.net> Message-ID: <386BA1BE.7F92DBC6@maxtal.com.au> Paul Prescod wrote: > > skaller wrote: > > > > ... 
> > I know, but that is my point: it isn't consistent > > with a model in which checking is applied to _names_: > > Yes it is. Readonly-ness is part of an object's interface. No. Objects don't HAVE interfaces. NAMES have interfaces. Access to the object is constrained by the interface associated with the name. A standard dictionary could, if we had some kind of const, be accessed via two names: the first, allowing only read access, and the second read and write. An object can be 'compatible' with many interfaces. It can only be _accessed_ via a name, through the declared interface. Two names, two interfaces. One object. your model! Example: an object is accessed as a Sequence. Same object is also accessed by a different name bound to a different interface, as List. Should work across function call boundaries to support protocol based polymorphism: decl x as List def f(y as Sequence): .. f(x) # should be OK: a List 'is a' Sequence So: 'const' in an interface is an access control. Perhaps object bound to the name with an interface is immutable, and perhaps not: immutability is a _runtime_ property. 'const' is a compile time one. Objects exist at run time. Names are bound to interfaces at compile time. We could make > ReadOnlyMapping types, ReadOnlyFile types, ReadOnlyList types, > ReadOnlyBankAccount types and so forth. It makes more sense to me, > however, to separate out ReadOnly-ness because it is so pervasive. But > not for version 1. No. If you want an 'immutable dictionary', you need a new type. I mean, an actual new Python run-time type object with different 'methods' defined. The properties of an object of a certain type are entirely a run time matter. The static type system doesn't enter into it. Interfaces do not change the type of an object. They restrict how the object can be accessed. If you 'see' something as an Immutable Mapping, you cannot insert a new key: the static type checker would not let you. This prevents a run time error; but if the object were a standard dictionary, the static type system will _still_ stop you accessing the object as a dictionary. 'Seeing is not being'-ly yours. -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From gstein@lyra.org Thu Dec 30 18:26:30 1999 From: gstein@lyra.org (Greg Stein) Date: Thu, 30 Dec 1999 10:26:30 -0800 (PST) Subject: [Types-sig] syntax (was: check.py) In-Reply-To: <386A3B98.35DF5735@prescod.net> Message-ID: On Wed, 29 Dec 1999, Paul Prescod wrote: > Greg Stein wrote: > > ... > > To do this, I would need to change the Python grammar, or suck in .pyi > > files. I plan to do the latter once some kind of formal grammar is > > specified. If that doesn't happen soon, then I'll be using the grammar > > that I posted in my type-proposal.html. It is complete and is sufficient > > (yet Paul seems to be starting from scratch... :-( ). > > My syntax is mostly based on your web page. I switched "!" for "as" > based on my belief that it isn't Pythonic to use random keyboard > characters in ways that are not universally understood. I covered this under a separate email. If you insist on using a word rather than '!', then at least use the right semantic. > And I put decl > and typedecl at the front instead of making them operators because I > agree with Tim Peters that we are designing a sub-language that needs to > be understood as being separate by virtue of being evaluated BEFORE the > code is executed. I disagree. 
Making a "sub-language" will simply serve to create something that is not integrated with Python. I see no reason to separate anything that is happening here -- that is a poor requirement/direction to take. The typedef unary operator allows a Python programmer to manipulate type declarator objects. That will be important for things such as an IDE, a debugger, or some more sophisticated analysis tools. The "decl" statement is fine, but we shouldn't look at it as an escape hatch for anything that we'd like to do. > It is my personal opinion that the grammar should be the last thing you > integrate into your system. I'd rather see the grammar *now* so that check.py can take advantage of it. That kind of defeats your claim :-) Otherwise, we're just coding in the dark, hoping that the app will work when the grammar finally gets implemented. > In order to avoid maintaining a whole > compiler while the grammar shifts, I would suggest you define classes > like this: > > class ParameterizedInterface: > .... > > class ConcreteInterface: > .... > > class MethodSignature: > > and so forth. You need these classes regardless. I've got classes like this. Go look at typedecl.py in my posting. > Then your interface file becomes: > > Array = new ParamterizedInterface( > parameters=["elements", "array"], > attributes=[new MethodSignature( arguments=... )] > ) > > We need this API anyhow so it would help alot if you could design it > while you are writing your package. Have you looked at check.py? -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Thu Dec 30 18:30:14 1999 From: gstein@lyra.org (Greg Stein) Date: Thu, 30 Dec 1999 10:30:14 -0800 (PST) Subject: [Types-sig] english words in Python (was: check.py) In-Reply-To: <386AA1FD.C726C130@maxtal.com.au> Message-ID: On Thu, 30 Dec 1999, skaller wrote: > Paul Prescod wrote: > > My syntax is mostly based on your web page. I switched "!" for "as" > > based on my belief that it isn't Pythonic to use random keyboard > > characters in ways that are not universally understood. > > Then you had better think again. 'as' is an ENGLISH word. > English is not 'universally' understood. So what? The Python language uses English words. "as" is the wrong semantic and should be rejected based on that. But not because it is English. Next, you'll say that we should replace "import", "class", and "assert" with funny little characters. Soon enough, we'll end up with APL or Perl. > I have implemented 'x!t' in Viper, and then, later, > I implemented 'def f(x!t)' -- the uses require grammar modifications > in different places and are technically distinct. Yes. I recommend '!' for the operator and ':' for the funcdefs. > As I am now implementing a C code generator, I am noticing > the effects of the optional typing on a compiler (although I'm > not actually using the information yet). > > In particular, since my implementation is entirely dynamic, > it fits well with cgen_module, which uses an already loaded module. > I have not tried a static compiler which 'parses' text to generate > code yet, but I suspect this will make my dynamic interpretation > difficult to implement -- on the other hand, Greg Stein HAS > tried this kind of tool -- and so I'd like to hear from him > what the impact of the 'at run time' meaning would be, > if he has looked at this. I don't follow your question here. I don't understand your distinctions between dynamic and runtime and static... 
Cheers,
-g

--
Greg Stein, http://www.lyra.org/

From gstein@lyra.org Thu Dec 30 18:38:59 1999
From: gstein@lyra.org (Greg Stein)
Date: Thu, 30 Dec 1999 10:38:59 -0800 (PST)
Subject: [Types-sig] import vs include (was: RFC Comments)
In-Reply-To: <386AA976.F268268@maxtal.com.au>
Message-ID: 

On Thu, 30 Dec 1999, skaller wrote:
>...
> In this 'two pass' model, it is inconsistent to 'import' a module in
> pass 1, since 'importing' a module requires a recursive tranlation pass
> involving TWO passes, and we know that the second pass can even involve
> recursive module execution. So it isn't _possible_ to import a module
> during pass 1. It won't work.

Python importing does *not* allow recursive module execution.

a.py:
    import b
    some_code()

b.py:
    import a
    more_code()

Let's say that you import a.py. b will be imported, the "import a" will establish a reference to the "a" module (which is incomplete at that point in time), and then more_code() is executed. The import of b completes and some_code() is then executed. After a.py completes, the module is then filled in and becomes available to other modules (such as b.py).

[ note that if you *run* a.py, it is named __main__ so b's import will grab something different ]

Given that recursive module execution cannot occur, there is no problem in doing a real import and acquiring interface information from the imported module. In other words: I agree with Paul that we do not need to separate the notions of import and include.

In check.py, however, I do not plan to perform a true import. Interfaces of other modules must be specified in a .pi file, or check.py must be allowed to parse and construct an interface from the target module.

[ I'd rather not open/read/parse/extract an interface from another module because it would definitely impact the performance too much. ]

Cheers,
-g

--
Greg Stein, http://www.lyra.org/

From skaller@maxtal.com.au Thu Dec 30 18:41:03 1999
From: skaller@maxtal.com.au (skaller)
Date: Fri, 31 Dec 1999 05:41:03 +1100
Subject: [Types-sig] const (was: PyDL RFC 0.02)
References: <386A3B99.74EE9F6E@prescod.net> <386AB71D.CFF4DE70@maxtal.com.au> <386A8759.C2567113@prescod.net>
Message-ID: <386BA73F.71C1F176@maxtal.com.au>

Paul Prescod wrote:
> I don't follow you and anyhow I don't see why banning some name
> rebindings needs to be high on our list of things to do. The
> optimization and safety benefits are not as high as those of ordinary
> type checking.

I do not agree. One of the most sought after optimisations in Python is caching. Preventing rebindings in loaded modules and/or defined classes permits function and method caching. It may be that the benefits are not as great in total as those typing would bring, but they are certainly significant, and they're very high on our priority list because the changes required to enforce such a rule are trivial (a one line change in Viper now prints a warning message for the module case). The change is also trivial to document and specify.

I contend there is a very large benefit for a very small effort here. A proposal is likely to be accepted, Guido has been heard to murmur in favour of it :-) And, it is likely to make it into Python 1.6, even if the more ambitious typing proposal produced does not. This should win Brownie points for the Sig. :-)

> Yes and no. We are now able to associate types with names so we need to
> be cognizant of both the type of the name and the type of the object.

Precisely, yes, I agree, except I do not agree that names have types associated with them.
They have _interfaces_ associated with them. Objects have _types_ associated with them. That is, actual TypeType objects.

Example of the difference: a module has a type. All modules have the _same_ type. Most modules have _different_ interfaces. An interface declaration not only asserts that a name is associated with an object of module type at run time, but also that the module has certain attributes of certain types.

> My point was that we need to treat modules differently than classes
> because two modules cannot export the same interface whereas two classes
> can.

Why can't two modules export the same interface? The _names_ of the interfaces may differ.

> There is no equivalent code for modules because module interfaces are
> not named and modules do not claim to conform to particular interfaces.

Why not? I think they do. They are named. The name is encoded in the interface file name (.pi file).

--
John Skaller, mailto:skaller@maxtal.com.au
10/1 Toxteth Rd Glebe NSW 2037 Australia
homepage: http://www.maxtal.com.au/~skaller
voice: 61-2-9660-0850

From gstein@lyra.org Thu Dec 30 18:46:03 1999
From: gstein@lyra.org (Greg Stein)
Date: Thu, 30 Dec 1999 10:46:03 -0800 (PST)
Subject: [Types-sig] class/module interfaces (was: const)
In-Reply-To: <386A8759.C2567113@prescod.net>
Message-ID: 

On Wed, 29 Dec 1999, Paul Prescod wrote:
>...
> My point was that we need to treat modules differently than classes
> because two modules cannot export the same interface whereas two classes
> can.

We should be treating them *exactly* the same. I've been saying this for a while now :-)

A module can export the same interface as another module. Heck, it can export the same interface as a class.

---- a.py ----
def foo(x: Int)->String:
    return "hi " + str(x)
bar = 5
--------------

import a
print a.foo(5), a.bar

class xyz:
    bar = 5
    def foo(self, x:Int)->String:
        return "hi " + str(x)

b = xyz()
print b.foo(5), b.bar
-------------

In the above code, "a" and "b" have the same interface. We could even go and declare an interface, specify that the module and the class conform to that interface, and then declare a and b to use the interface.

Cheers,
-g

--
Greg Stein, http://www.lyra.org/

From gstein@lyra.org Thu Dec 30 18:54:45 1999
From: gstein@lyra.org (Greg Stein)
Date: Thu, 30 Dec 1999 10:54:45 -0800 (PST)
Subject: [Types-sig] decls and typedefs (was: RFC Comments)
In-Reply-To: <386A8907.906DC149@prescod.net>
Message-ID: 

On Wed, 29 Dec 1999, Paul Prescod wrote:
> skaller wrote:
...
> > It follows FROM THE MODEL
> > that these declarations are effectively private.

I don't see how that follows. But the comment was in regards to the distinction between an import/include process. I believe that distinction is bunk (as I explained in the other note), so the notion of "follows" is probably moot.

> It was never my intent that decls and typedecls could be private.

A "decl" does not establish a name in my mind. It defines the type that a name *will* use, but nothing more. So far, we have decls to specify interface attributes. I think there is something in there for declaring parameterized types, but I'm still not clear on that syntax. In any case, they only declare type information -- you still need an assignment somewhere to establish a value.

To associate information with a name, I think that we want to use an assignment or the classdef/funcdef pattern (... name: suite).
This is where the "typedef" operator came from, allowing you to do:

    IntOrString = typedef Int or String

Have no fears -- the type checker can easily understand the above code.

Cheers,
-g

--
Greg Stein, http://www.lyra.org/

From skaller@maxtal.com.au Thu Dec 30 18:53:17 1999
From: skaller@maxtal.com.au (skaller)
Date: Fri, 31 Dec 1999 05:53:17 +1100
Subject: [Types-sig] Re: Conformance model
References: <386A3B9D.228BD31B@prescod.net> <386AAF7B.9EA4E3F3@maxtal.com.au> <386A8826.23694955@prescod.net>
Message-ID: <386BAA1D.BD332F0A@maxtal.com.au>

Paul Prescod wrote:
>
> skaller wrote:
> >
> > ...
> >
> > Given that python is dynamic, my argument is that, say, for a type
> > error, it might make sense to ... mandate that
> >
> > def f(x):
> >     1 + "Hello"
> >
> > is not a valid Python program -- a compiler can reject the program,
> > rather than being forced to implement:
> >
> > def f(x):
> >     raise TypeError
> >
> > which is what is currently required.
>
> My feeling could be summed up thus:
>
> "The following actions are illegal. A Python compiler may report them
> and refuse to compile the program or it may run the program and generate
> some form of Error exception."

Seems fair ..

> I would only willing to go further if you would describe overwhelming
> optimization benefits in allowing undefined behavior.

How would you account for 'assert'? Assert provides a run time test that can be optimised away. A compiler could perhaps be permitted to report an error IF it could detect one would occur -- but currently, there is no requirement to actually generate an error, either at compile time or run time: the current optimising compiler does neither.

> You are swimming against the tide of history here. Java doesn't have
> much undefined behavior either.

Java and Python are not ISO standardised. C and C++ are, and allow undefined behaviour in some places, because it is necessary for performance. Plenty of people use C and C++ to write code. :-) Plenty of people use Python and Java and wish they ran faster.

--
John Skaller, mailto:skaller@maxtal.com.au
10/1 Toxteth Rd Glebe NSW 2037 Australia
homepage: http://www.maxtal.com.au/~skaller
voice: 61-2-9660-0850

From skaller@maxtal.com.au Thu Dec 30 19:00:59 1999
From: skaller@maxtal.com.au (skaller)
Date: Fri, 31 Dec 1999 06:00:59 +1100
Subject: [Types-sig] Conformance model
References: <386A3B9D.228BD31B@prescod.net> <386AAF7B.9EA4E3F3@maxtal.com.au> <00c701bf526d$b5224c60$87740918@phnx3.az.home.com>
Message-ID: <386BABEB.586103A2@maxtal.com.au>

Tim Hochberg wrote:
> First off, the function 'f' is close enough to:
>
> def g(x):
>     return g + "Hello"
>
> that it strikes me as somewhat strange to ban the former but not the later.

I'd ban _anything_ that raised _any_ error, except an environment error, unless the error was caught in the function that raises the error. This implies NO system exceptions can be raised by a function. Only user defined exceptions. And they can ONLY be raised by an explicit 'raise' statement.

This means that, when calling a function that does NOT raise any errors, we don't have to check for them. Such checking is a huge overhead in typical C code and an obstacle to inlining, which reduces one of Python's biggest performance bottlenecks -- function calling.
-- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From skaller@maxtal.com.au Thu Dec 30 19:07:01 1999 From: skaller@maxtal.com.au (skaller) Date: Fri, 31 Dec 1999 06:07:01 +1100 Subject: [Types-sig] Type checks References: <000801bf528c$72f01920$a02d153f@tim> Message-ID: <386BAD55.25B3A05B@maxtal.com.au> Tim Peters wrote: > John, you're not getting anywhere with this approach -- drop it. > The Python language does not define the means by which processor options are > specified, but does define their effects. It is not required that a > processor implement the processor option that we informally refer to as "-O > mode" -- but if it does, its effect is defined. Ok, I'll give up. -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From gstein@lyra.org Thu Dec 30 21:40:08 1999 From: gstein@lyra.org (Greg Stein) Date: Thu, 30 Dec 1999 13:40:08 -0800 (PST) Subject: [Types-sig] Re: Conformance model In-Reply-To: <386BAA1D.BD332F0A@maxtal.com.au> Message-ID: On Fri, 31 Dec 1999, skaller wrote: >... > > You are swimming against the tide of history here. Java doesn't have > > much undefined behavior either. > > java and python are not ISO standardised. > C and C++ are, and allow undefined behaviour in some places, > because it is necessary for performance. > Pleny of people use C and C++ to write code. :-) > Plenty of people use python and java and wish they ran > faster. That does not negate what Tim, Paul, and myself (among others) have been saying: you're going to be unsuccessful in trying to introduce undefined behavior in Python. Give it up already. I believe that you have a possibility to get Guido to define a language feature that says "it operates , but on it will operate like ." The assert statement and some JPython-related issues are examples. However, this is the wrong forum for this since it is *VERY* dependent upon Guido's thoughts. As I've said before: bringing up this issue is growing awfully tiresome. I wish you would stop. Happy Holidays, -g -- Greg Stein, http://www.lyra.org/ From sjmachin@lexicon.net Thu Dec 30 22:59:26 1999 From: sjmachin@lexicon.net (John Machin) Date: Fri, 31 Dec 1999 08:59:26 +1000 Subject: Anti-poking lobby (was:Re: [Types-sig] type declaration syntax) In-Reply-To: <14443.32587.574186.48706@dolphin.mojam.com> References: <386165AF.F6E6BF81@maxtal.com.au> Message-ID: <19991230215036009.AAA186.69@max41121.izone.net.au> Skip said: > > skaller> I.e. TWO bans fix most problems. The ban on module level > skaller> rebindings is a significant restriction. > > I'll say. The common idiom for trapping stdout or stderr is to rebind > sys.stdout/stderr to a file-like object. How would that be accomplished > in such a straightforward way if module-level rebindings are disallowed? > sys.stdout = "you lose" is a bit too straightforward for my liking. Once we have banned poking from outside a module, can't we fix the presumably-few cases of missing-but-required functionality by supplying functions? For example, previous_stdout = sys.divert_stdout(new_stdout_file-like_object) [default argument is original stdout] with maybe some checking as determined by the module's "owner" [is it truly file-like?], and maybe some extra functionality e.g. 
sys.divert_stdout("my_stdout.log") # interprets string as name of file; appends if existing, else creates From tim_one@email.msn.com Fri Dec 31 03:35:30 1999 From: tim_one@email.msn.com (Tim Peters) Date: Thu, 30 Dec 1999 22:35:30 -0500 Subject: [Types-sig] Type checks In-Reply-To: <386BAD55.25B3A05B@maxtal.com.au> Message-ID: <000301bf5340$1604f820$e12d153f@tim> [Tim] > John, you're not getting anywhere with this approach -- drop it. > [and a paragraph of pseudo-standardese] [skaller] > Ok, I'll give up. John! Are you feeling OK? I was prepared for any conceivable response -- except for that one . and-a-happy-new-millennium-to-all-ly y'rs - tim From paul@prescod.net Fri Dec 31 09:31:54 1999 From: paul@prescod.net (Paul Prescod) Date: Fri, 31 Dec 1999 04:31:54 -0500 Subject: [Types-sig] Re: rebinding (was: const) References: Message-ID: <386C780A.3BFEE183@prescod.net> Greg Stein wrote: > > ... > > * module -- we must always disallow rebinding these because we don't > > have a notion of two modules with the "same interface". Maybe in some > > future version we could. > > Untrue. Ever look at the "anydbm" module and its cohorts? How about the > DBAPI modules? I didn't say that there was no notion of modules with the same interface. I said that our type declaration sub-language does not have such a notion. > I've said before: modules and classes both have the notion of an > interface. We ought to be able to associate an interface with a module! There is some elegance in this model but it is also pretty weird the way Pythonista's use modules instead of classes for some types of polymorphism. I want to know where Guido wants to go in people's thinking on modules. > Note on functions: how is a function declared to have a particular > signature? Just through an attribute declaration. It can be either the top level or in an interface declaration. Paul Prescod From paul@prescod.net Fri Dec 31 09:38:47 1999 From: paul@prescod.net (Paul Prescod) Date: Fri, 31 Dec 1999 04:38:47 -0500 Subject: [Types-sig] Re: syntax (was: check.py) References: Message-ID: <386C79A7.CBB008E@prescod.net> Greg Stein wrote: > > ... > > > And I put decl > > and typedecl at the front instead of making them operators because I > > agree with Tim Peters that we are designing a sub-language that needs to > > be understood as being separate by virtue of being evaluated BEFORE the > > code is executed. > > I disagree. Making a "sub-language" will simply serve to create something > that is not integrated with Python. I see no reason to separate anything > that is happening here -- that is a poor requirement/direction to take. Compile time stuff is inherently separate because it is *compile time stuff*. It follows different import rules, it is evaluated in a different namespace, and so forth. This, for instance, is not legal: a = doSomething() b = typedef a Python programmers need to understand these sorts of things. The decl syntax and "gpydl" semantics makes it very clear that these declarations are *separate* and are evaluated in a different time in a different execution context with a different namespace. > The typedef unary operator allows a Python programmer to manipulate type > declarator objects. That will be important for things such as an IDE, a > debugger, or some more sophisticated analysis tools. This is a completely orthogonal issue. There is no syntax in Python for a traceback or frame object but IDEs can work with traceback and frame objects. 
Classes are not created by a unary operator and assignment but they are still runtime-available objects. Paul Prescod From da@ski.org Sun Dec 5 07:00:10 1999 From: da@ski.org (David Ascher) Date: Sat, 4 Dec 1999 23:00:10 -0800 (Pacific Standard Time) Subject: [Types-sig] Static typing considered HARD In-Reply-To: <384A00D6.C3015C9D@fourthought.com> Message-ID: On Sat, 4 Dec 1999, Uche Ogbuji wrote: > David Ascher wrote: > > > No language that I know of does even a tenth of the job of configuration > > > management, error-handling or testing for anybody. They are not matters > > > for a programming language to address. > > > > I guess we'll have to agree to disagree. > > > > I've been doing some playing with Swing using JPython. Because it's > > wicked slow to start, (due to Java mostly) the > > edit-run-traceback-edit-run-traceback cycle is significantly longer than > > with with CPython. That's when I curse the fact that the compile-time > > analysis didn't catch simple typos, trivial mistakes in signatures, etc. I > > *love* Python's dynamicity. But mostly I use its 'wicked cool' dynamic > > features, like modifying the type of a variable in a function call or > > changing the __class__ of an object once in a very blue moon. > > I can agree to disagree as well as anyone, but I'll confess I'm still > baffled at how you claim that any language automates configuration > management, error-handling or testing to any significant extent. I > guess we'll also have to agree to not understand each other. I'm unsure all that you mean by 'configuration management, error-handling and testing'. All I'm pointing out is that I, others doing large-scale systems (e.g. eGroups) and many of my students all complain that Python isn't doing 'as much as it could' in the area of compile-time type checking and signature verification. > Also, I don't think I've _ever_ done anything as off-the-wall as > "modifying the type of a variable in a function call or changing the > __class__ of an object". I hope this isn't anyone's benchmark of > Python's dynamicism. Just in case it wasn't clear, what I meant by 'modifying the type of a variable in a function call' is: def a(x): x = len(x) The point is: Python is extremely dynamic, and those are extreme examples of this dynamicity. When you expressed quite strong reactions to Paul's proposal to add static types, I wanted to point out that there are things which could be done which would alleviate some of the problems that many folks are having in doing programming-in-the-large (and in-the-small as well), while not hindering most programmers most of the time (what was it P.T. Barnum said? =). Let's try a different approach. What is your benchmark of Python's dynamicity? What aspects do you care about keeping? Not modifying the __class__, that's clear. What, then? Or is it the syntactic lemon (opposite of sugar?) which lurks behind some static typing proposals which got you worried? > I program in Python perhaps 40 hours a week, and have done so for a long > time. Most of what I work on are large-scale systems. Very strange > that my typos (and they are legion) are much less catastrophic than your > own. Ah, well, probably you're just better at it than I am. =) My programs are typically small and run for a long time. They also change ten times daily due to the changing nature of the requirements. There is no 'finished' program in my current line of work. Just a different way of doing business. Note that developing a test suite for this sort of code is unrealistic. 
I'm paid to do science, not to do regression tests, and the regression suite is likely to be longer and buggier than the actual code.

Perhaps it's best if we took this off-line though -- I think we're straying from the types-sig charter.

--david

From da@ski.org Sun Dec 5 05:36:53 1999
From: da@ski.org (David Ascher)
Date: Sat, 4 Dec 1999 21:36:53 -0800 (Pacific Standard Time)
Subject: [Types-sig] Static typing considered HARD
In-Reply-To: <3849AC89.1173B163@fourthought.com>
Message-ID: 

On Sat, 4 Dec 1999, Uche Ogbuji wrote:

> Is their problem performance or defect-management? Again, there is an
> important difference. I agree that typing can help the former: I am
> doubtful that it is a panacea for the latter.

The latter. The quote (paraphrased from memory) is "When someone changes a function interface, there's no way to know if we've caught all of the calls to that function in the tens of thousands of lines of code that we have except to run the code". Note that I don't think anyone is arguing 'panacea'. Just 'we could do better'.

> > I see two very distinct problems, though -- one is the use of 'statically
> > typed variables', which requires fundamental changes to Python's
> > typesystem. The other is 'compile-time type/signature/interface checking',
> > which could probably be done coarsely with add-on tools without changing
> > the syntax or type system one iota (ok, maybe one or two iotas).
> >
> > > see this "misspelling" problem. Proper configuration-management
> > > procedures and testing, along with intelligent error-recovery, prevent
> > > such problems, which can also occur in the most strongly-typed systems.
> >
> > Wouldn't you agree that enforcing these 'proper procedures' is much harder
> > in a language which doesn't do half the job for you?
>
> No language that I know of does even a tenth of the job of configuration
> management, error-handling or testing for anybody. They are not matters
> for a programming language to address.

I guess we'll have to agree to disagree.

I've been doing some playing with Swing using JPython. Because it's wicked slow to start (due to Java mostly), the edit-run-traceback-edit-run-traceback cycle is significantly longer than with CPython. That's when I curse the fact that the compile-time analysis didn't catch simple typos, trivial mistakes in signatures, etc. I *love* Python's dynamicity. But mostly I use its 'wicked cool' dynamic features, like modifying the type of a variable in a function call or changing the __class__ of an object once in a very blue moon.

IIRC, JimH mentioned in the early part of his talk (before it got heated) a system which allowed one to change whether a particular symbol could be considered 'static' or not, and suggested what seemed to me reasonable defaults, like the names of builtins being considered 'known' at compile-time. With two new syntactic mechanisms called e.g. 'freeze' and 'thaw', one could maintain exactly the same dynamicity, while allowing the user to 'tell the compiler' that some things could be trusted not to change in the lifetime of the program (and the runtime would enforce those, of course). And if you really wanted to redefine 'open', then you still could.

In other words, I'm just suggesting that given that (I'd guess) 95% of the code out there is such that variables maintain their type throughout the life of the program and that the builtins don't typically get overridden, it seems a shame not to play the numbers. And we don't have to cover all the cases.
Just the 80% which give the largest payoff. Another trivial example: I can never remember whether it's pickle.dump(object, file) or pickle.dump(file, object). I tend to remember that I don't remember after the simulation has run for two hours (if I'm lucky) and the saving of state fails... --david
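For what it's worth, the real signature puts the object first -- pickle.dump(obj, file) -- and getting it backwards is only caught once the program actually reaches that call. A minimal sketch (the file name and data are invented for illustration):

    import pickle

    data = {"trial": 1, "results": [0.2, 0.5, 0.9]}

    # correct order: the object to serialize comes first, then the open file
    f = open("state.pkl", "wb")
    pickle.dump(data, f)
    f.close()

    # swapping the arguments -- pickle.dump(f, data) -- is accepted by the
    # parser and fails only at runtime, when pickle tries to use data.write,
    # which does not exist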