From guido@CNRI.Reston.VA.US Thu Dec 2 21:17:19 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Thu, 02 Dec 1999 16:17:19 -0500 Subject: [Types-sig] The Types-SIG is comatose. Let's retire it. Message-ID: <199912022117.QAA15195@eric.cnri.reston.va.us> It's time for the twice yearly ritual of looking for comatose SIGs. From the archives, it looks like the types-sig is the only dud amongst the crowd: all other SIGs are doing well (some are doing *extremely* well, like the doc-sig and the matrix-sig). The types-sig hasn't had traffic since August (4 messages) and in all of 1999 it has had only 12 messages. Type-sig, what do you have to say for yourself? --Guido van Rossum (home page: http://www.python.org/~guido/) From mengx@nielsenmedia.com Thu Dec 2 21:52:33 1999 From: mengx@nielsenmedia.com (mengx@nielsenmedia.com) Date: Thu, 2 Dec 1999 16:52:33 -0500 (EST) Subject: [Types-sig] The Types-SIG is comatose. Let's retire it. Message-ID: <199912022152.QAA29677@p5mts.nielsenmedia.com> Perhaps this proved that trying to (optionally) add TYPEs to the python language itself is unpopular. At the start of this list, I suggested embedding type hints inside doc strings, or some other non-breaking method which only requires python engine implementation changes instead of adding new keywords or symbols to the code. Or, instead of diving into uncertain language research, accept and enhance CXX to ease extension writing, which may solve many issues related to the need for TYPED python. Thanks -Ted Meng > From POP3 Thu Dec 2 16:18:02 1999 > Delivered-To: types-sig@dinsdale.python.org > To: types-sig@python.org > Cc: meta-sig@python.org > Date: Thu, 02 Dec 1999 16:17:19 -0500 > From: Guido van Rossum > Subject: [Types-sig] The Types-SIG is comatose. Let's retire it. > X-BeenThere: types-sig@python.org > X-Mailman-Version: 1.2 (experimental) > List-Id: Special Interest Group on the Python type system > > It's time for the twice yearly ritual of looking for comatose SIGs. > From fdrake@acm.org Thu Dec 2 22:18:07 1999 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Thu, 2 Dec 1999 17:18:07 -0500 (EST) Subject: [meta-sig] Re: [Types-sig] The Types-SIG is comatose. Let's retire it. In-Reply-To: <199912022152.QAA29677@p5mts.nielsenmedia.com> References: <199912022152.QAA29677@p5mts.nielsenmedia.com> Message-ID: <14406.61471.268274.986137@weyr.cnri.reston.va.us> mengx@nielsenmedia.com writes: > Perhaps this proved that trying to (optionally) add TYPEs to the python language > itself is unpopular. At the start of this list, I suggested embedding > type hints inside doc strings, or some other non-breaking method which > only requires python engine implementation changes instead of adding new > keywords or symbols to the code. Or, instead of diving into uncertain > language research, accept and enhance CXX to ease extension writing, > which may solve many issues related to the need for TYPED python. Actually, someone suggested encoding type information in docstrings just recently in the Doc-SIG. See: http://dinsdale.python.org/pipermail/doc-sig/1999-December/001607.html http://dinsdale.python.org/pipermail/doc-sig/1999-December/001610.html http://dinsdale.python.org/pipermail/doc-sig/1999-December/001623.html http://dinsdale.python.org/pipermail/doc-sig/1999-December/001627.html Since the decision was to table it for now, I don't think it warrants keeping alive a dead SIG. -Fred -- Fred L. Drake, Jr. 
Corporation for National Research Initiatives From paul@prescod.net Thu Dec 2 22:47:25 1999 From: paul@prescod.net (Paul Prescod) Date: Thu, 02 Dec 1999 16:47:25 -0600 Subject: [Types-sig] The Types-SIG is comatose. Let's retire it. References: <199912022152.QAA29677@p5mts.nielsenmedia.com> Message-ID: <3846F6FD.4FCCCD1@prescod.net> I'm not speaking on behalf of or in favor of the types-sig. mengx@nielsenmedia.com wrote: > > Perhaps this proved that trying to (optionally) add TYPEs to the python language > itself is unpopular. I don't think so. I think that there were just too many ideas of how it should work. I think that's why revolutionary programming language features cannot be designed by committee. > Or, instead of diving into uncertain > language research, accept and enhance CXX to ease extension writing, > which may solve many issues related to the need for TYPED python I don't see how CXX can help. Python programmers choose not to program in C++ for a reason. Here's an approach that we didn't try because it is likely to be wildly unpopular: There exists a popular programming language that uses optional type checking and is nearly as dynamic as Python: Visual Basic. The overall type system is weak, (e.g. no concept of common interface) but the optional type checking part seems to work pretty well. We wouldn't have to do "uncertain language research" to rip its behavior (and even some of its syntax) off. It strikes me as a pretty common sense approach. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself "I always wanted to be somebody, but I should have been more specific." --Lily Tomlin From guido@CNRI.Reston.VA.US Thu Dec 2 22:51:03 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Thu, 02 Dec 1999 17:51:03 -0500 Subject: [Types-sig] The Types-SIG is comatose. Let's retire it. In-Reply-To: Your message of "Thu, 02 Dec 1999 16:47:25 CST." <3846F6FD.4FCCCD1@prescod.net> References: <199912022152.QAA29677@p5mts.nielsenmedia.com> <3846F6FD.4FCCCD1@prescod.net> Message-ID: <199912022251.RAA15968@eric.cnri.reston.va.us> > Here's an approach that we didn't try because it is likely to be wildly > unpopular: Why would it be unpopular? > There exists a popular programming language that uses optional type > checking and is nearly as dynamic as Python: Visual Basic. The overall > type system is weak, (e.g. no concept of common interface) but the > optional type checking part seems to work pretty well. We wouldn't have > to do "uncertain language research" to rip its behavior (and even some > of its syntax) off. It strikes me as a pretty common sense approach. I don't know the details, never having studied VB manuals, although I once saw the source of a file that described the linkage to a C module (pretty ugly but effective and no need for wrappers). Do you have the time to describe this in somewhat more detail for us lucky folks who haven't had the pleasure to learn VB? --Guido van Rossum (home page: http://www.python.org/~guido/) From gstein@lyra.org Fri Dec 3 00:05:14 1999 From: gstein@lyra.org (Greg Stein) Date: Thu, 2 Dec 1999 16:05:14 -0800 (PST) Subject: [Types-sig] Re: The Types-SIG is comatose. Let's retire it. In-Reply-To: <199912022251.RAA15968@eric.cnri.reston.va.us> Message-ID: Meta issue: I'm not sure that I agree the types-sig should stay alive simply because some traffic is inserted when a threat-of-execution has arisen. The impetus for the SIG is (IMO) obviously gone, despite some people's unstated desires to see work done along this path. 
I'd recommend closing the SIG and letting this discussion move elsewhere. Cheers, -g On Thu, 2 Dec 1999, Guido van Rossum wrote: > > Here's an approach that we didn't try because it is likely to be wildly > > unpopular: > > Why would it be unpopular? > > > There exists a popular programming language that uses optional type > > checking and is nearly as dynamic as Python: Visual Basic. The overall > > type system is weak, (e.g. no concept of common interface) but the > > optional type checking part seems to work pretty well. We wouldn't have > > to do "uncertain language research" to rip its behavior (and even some > > of its syntax) off. It strikes me as a pretty common sense approach. > > I don't know the details, never having studied VB manuals, although I > once saw the source of a file that described the linkage to a C > module (pretty ugly but effective and no need for wrappers). > > Do you have the time to describe this in somewhat more detail for us > lucky folks who haven't had the pleasure to learn VB? > > --Guido van Rossum (home page: http://www.python.org/~guido/) > > _______________________________________________ > Types-SIG mailing list > Types-SIG@python.org > http://www.python.org/mailman/listinfo/types-sig > -- Greg Stein, http://www.lyra.org/ From jeremy@cnri.reston.va.us Fri Dec 3 00:08:54 1999 From: jeremy@cnri.reston.va.us (Jeremy Hylton) Date: Thu, 2 Dec 1999 19:08:54 -0500 (EST) Subject: [Types-sig] Re: [meta-sig] Re: The Types-SIG is comatose. Let's retire it. In-Reply-To: References: <199912022251.RAA15968@eric.cnri.reston.va.us> Message-ID: <14407.2582.897430.477756@goon.cnri.reston.va.us> >>>>> "GS" == Greg Stein writes: GS> Meta issue: I'm not sure that I agree the types-sig should stay GS> alive simply because some traffic is inserted when a GS> threat-of-execution has arisen. The impetus for the SIG is (IMO) GS> obviously gone, despite some people's unstated desires to see GS> work done along this path. I don't remember anymore what the impetus was. The problem I see is that a lot of work is going to be required to make much progress on extending the type system. In the absence of anyone willing and able to do the work (whatever it is), there's not much point to a SIG. GS> I'd recommend closing the SIG and letting this discussion move GS> elsewhere. Yes. Jeremy From paul@prescod.net Fri Dec 3 02:53:24 1999 From: paul@prescod.net (Paul Prescod) Date: Thu, 02 Dec 1999 20:53:24 -0600 Subject: [Types-sig] VB Types References: <199912022152.QAA29677@p5mts.nielsenmedia.com> <3846F6FD.4FCCCD1@prescod.net> <199912022251.RAA15968@eric.cnri.reston.va.us> Message-ID: <384730A4.7BFC4933@prescod.net> Guido van Rossum wrote: > > > Here's an approach that we didn't try because it is likely to be wildly > > unpopular: > > Why would it be unpopular? Stealing ideas from Visual Basic? Shudder! > Do you have the time to describe this in somewhat more detail for us > lucky folks who haven't had the pleasure to learn VB? The declarations are totally optional. If you don't declare something then it is a "Variant" which is a grab-bag like void * or PyObject. So the semantics of an untyped program are similar to Pythons: Private Function Foo() b = "foo" MsgBox (b) b = 5 MsgBox (b) End Function b is a Variant. So is the return value of the function. I could change that: Public Function Foo() As Slide Set Foo = ActivePresentation.Slides(0) End Function Ignore the word "set". It's a hack and I think that even in VB there isn't a good reason it is necessary. 
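(Purely for comparison, and not part of any proposal: the untyped Variant version above behaves much like ordinary Python as it already is, since every Python name is effectively a Variant. The names here just mirror the VB example.)

def foo():
    b = "foo"     # b holds a string
    print b
    b = 5         # now it holds an integer; nothing is declared or checked
    print b
    return b      # the return value is just as dynamically typed
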
Their word for "declare" is "dim":

Dim i as Integer

As soon as you Dim something, the IDE tries to help you with its method signatures. That's useful enough to encourage type declarations for things of known type...which in turn can help you catch type errors more quickly. I've prototyped some COM apps in VB and then ported them to Python because the method signatures stuff is important when the COM object isn't well documented (usually!). For some reason, it isn't compile time type safe. This would cause a runtime error:

Dim b As Integer
MsgBox (b)
b = "foo"
b = 5

I don't think I've ever got a compile time type error message. Perhaps they don't want to give a false sense of "type safety" because it is still very possible to make type errors (because of the variants). Still, in a Python implementation I would expect IDEs to have a "check all types" menu item and the Python interpreter would also need a check all types command line option. The default value of integers is "0". Parameters can be typed or implicitly variant:

Public Function Foo(a As Integer, b, c as String) As Collection
    Set Foo = New Collection
End Function

Classes are types so you can create new types easily. There is no concept of predefined interfaces (other than interfaces in a typelib) but that could be added easily. There is no union type: you would have to use variants. As you point out, these same definitions can be used to interface to statically typed languages without good introspection and libraries but that also depends on built-in language features. I can't think of anything else that is relevant. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself "I always wanted to be somebody, but I should have been more specific." --Lily Tomlin From da@ski.org Fri Dec 3 04:53:00 1999 From: da@ski.org (David Ascher) Date: Thu, 2 Dec 1999 20:53:00 -0800 (Pacific Standard Time) Subject: [Types-sig] VB Types In-Reply-To: <384730A4.7BFC4933@prescod.net> Message-ID: On Thu, 2 Dec 1999, Paul Prescod wrote: > Guido van Rossum wrote: > > > > > Here's an approach that we didn't try because it is likely to be wildly > > > unpopular: > > > > Why would it be unpopular? > > Stealing ideas from Visual Basic? Shudder! Nobody's got to know outside of these four walls. =) > The declarations are totally optional. If you don't declare something > then it is a "Variant" which is a grab-bag like void * or PyObject. So > the semantics of an untyped program are similar to Pythons: [...] Sounds quite a bit like JimH's proposal @ IPC7 (or was it 6?). He got booed, IIRC, but that was just an emotional reaction, methinks. =) How does VB handle specifying types which are not one of the atomic types? (e.g. list of (tuple or dictionaries) of length 5 or fewer?) --david From tim_one@email.msn.com Fri Dec 3 05:58:48 1999 From: tim_one@email.msn.com (Tim Peters) Date: Fri, 3 Dec 1999 00:58:48 -0500 Subject: [Types-sig] RE: [meta-sig] The Types-SIG is comatose. Let's retire it. In-Reply-To: <199912022117.QAA15195@eric.cnri.reston.va.us> Message-ID: <000801bf3d53$77a44f20$3a2d153f@tim> [Guido] > It's time for the twice yearly ritual of looking for comatose SIGs. > ... > The types-sig hasn't had traffic since August (4 messages) and in all > of 1999 it has only has 12 messages. > > Type-sig, what do you have to say for yourself? The Types-SIG was very active at its inception; indeed, I still have 142 old Types-SIG msgs in my inbox I haven't yet read! 
Note that the traffic dropped to essentially nothing several weeks after you (Guido) posted your own last msg to it. I don't think that's coincidence. You were an active initial participant, and when you dropped out most of us likely figured you had some other ideas in mind for Python2 and there was little point to proceeding without you. So, like everything else that goes wrong in the Python world, it was entirely Gordon McMillan's fault. I'd kill the SIG due to lack of activity. I'm sure interest in the topics remains high among many, though ("Types SIG"-related debates have continued non-stop on c.l.py). taking-no-more-from-this-than-that-a-successful-sig-needs-a- focused-charter-ly y'rs - tim From paul@prescod.net Fri Dec 3 12:52:22 1999 From: paul@prescod.net (Paul Prescod) Date: Fri, 03 Dec 1999 06:52:22 -0600 Subject: [Types-sig] VB Types References: Message-ID: <3847BD06.C2FA6743@prescod.net> David Ascher wrote: > > How does VB handle specifying types which are not one of the atomic types? > (e.g. list of (tuple or dictionaries) of length 5 or fewer?) Good question. It seems to handle fixed (and variable?) length arrays.

Dim Washington(1 To 100) As StateData

It also has a concept of a "struct" which they call a "user defined type".

Type StateData
    CityCode (1 To 100) As Integer   ' Declare a static array.
    County As String
End Type

Of course for Python, we would use square brackets for array bounds and classes for structs. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself "I always wanted to be somebody, but I should have been more specific." --Lily Tomlin From jim@digicool.com Fri Dec 3 14:27:38 1999 From: jim@digicool.com (Jim Fulton) Date: Fri, 03 Dec 1999 09:27:38 -0500 Subject: [Types-sig] The Types-SIG is comatose. Let's retire it. References: <199912022117.QAA15195@eric.cnri.reston.va.us> Message-ID: <3847D35A.59EB5770@digicool.com> Guido van Rossum wrote: > > It's time for the twice yearly ritual of looking for comatose SIGs. > From the archives, it looks like the types-sig is the only dud amongst > the crowd: all other SIGs are doing well (some are doing *extremely* > well, like the doc-sig and the matrix-sig). > > The types-sig hasn't had traffic since August (4 messages) and in all > of 1999 it has only has 12 messages. > > Type-sig, what do you have to say for yourself? As others have pointed out, there is clear evidence that the SIG is inactive and should be deactivated. I was a bit frustrated that the SIG tried to address three topics that I consider independent:

- Interfaces
- Classes vs types
- Static typing

This hurt the focus of the sig and emotion from some topics tended to bleed over to others. For example, I think the interfaces work was hurt by association with the typing work. I'll find some time over the next few days to try to summarize and report on work in the sig on the first two topics. Perhaps someone else will do the same for static typing. Even if the SIG goes away, I think some report on the SIG's activity should be made at IPC8 (assuming there is a SIG status discussion). Jim -- Jim Fulton mailto:jim@digicool.com Python Powered! Technical Director (888) 344-4332 http://www.python.org Digital Creations http://www.digicool.com http://www.zope.org Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list with out my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats. 
From paul@prescod.net Fri Dec 3 14:32:13 1999 From: paul@prescod.net (Paul Prescod) Date: Fri, 03 Dec 1999 08:32:13 -0600 Subject: [Types-sig] RE: [meta-sig] The Types-SIG is comatose. Let's retire it. References: <000801bf3d53$77a44f20$3a2d153f@tim> Message-ID: <3847D46D.17C79972@prescod.net> David Ascher wrote: > Sounds quite a bit like JimH's proposal @ IPC7 (or was it 6?). He > got booed, IIRC, but that was just an emotional reaction, methinks. =) There is no non-trivial Python extension that will not get booed. I like the Visual Basic approach because it is simple, seems intuitive to me, does not depend on any new ideas at all and thus does not require a lot of debate. To me, Python's brilliance is in eschewing innovation. People come to it and say: "this is the language I have been looking for". Other than whitespace there is no "weird stuff." It just takes the best ideas from every other language and simplifies the hell out of them. While I'm ranting, the other problem new people have is the whole reference/copy issue. Is there any language that has more understandable (perhaps more explicit) semantics for that stuff that we could steal for Py2? P.S. I brainwashed another one today. Literal quote: "This is the language I've been looking for." -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself "I always wanted to be somebody, but I should have been more specific." --Lily Tomlin From paul@prescod.net Fri Dec 3 14:32:32 1999 From: paul@prescod.net (Paul Prescod) Date: Fri, 03 Dec 1999 08:32:32 -0600 Subject: [Types-sig] RE: [meta-sig] The Types-SIG is comatose. Let's retire it. References: <000801bf3d53$77a44f20$3a2d153f@tim> Message-ID: <3847D480.12A31666@prescod.net> > taking-no-more-from-this-than-that-a-successful-sig-needs-a- > focused-charter-ly y'rs - tim I propose that the types sig be re-commissioned with a much tighter commission. Let's focus on ONE of the three problems listed in our old charter: http://www.python.org/sigs/types-sig/ And let's start with a clear direction from the Powers that Be. I propose:

* the goal is an optional static type system for version 2.
* presume that the type/class dichotomy has been removed in V2
* backwards compatibility with current code is relatively important
* compatibility with the Python 1.x interpreter is NOT important
* interfaces are not an issue
* parameterized (template) types are not available
* names are type checked, not expressions
* for now, only named types (types and classes) can be declared, not lists and tuples of types

(many of these restrictions are easy to work around in Python: for instance, making a list-of-strings subclass of UserList) Start from these (very similar!) proposals:

http://www.python.org/~rmasse/papers/python-types/
The current Visual Basic type system
Something somewhere from JimH
The type declaration part of Strongtalk
The first half of this: http://www.foretec.com/python/workshops/1998-11/greg-type-ideas.html

We should appoint an "editor" as they do in standards bodies. If there are issues that just cannot be worked out by consensus, Guido rules. Ideally, it should work much like the docstring discussion going on in the doc-sig. If we had a particularly ambitious editor (unlikely) then we could have an RFC by the Python conference. Later, we could do the same thing for the class/type dichotomy. ...then interfaces ...then parameterized types. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself "I always wanted to be somebody, but I should have been more specific." 
--Lily Tomlin From guido@CNRI.Reston.VA.US Fri Dec 3 14:47:07 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Fri, 03 Dec 1999 09:47:07 -0500 Subject: [Types-sig] RE: [meta-sig] The Types-SIG is comatose. Let's retire it. In-Reply-To: Your message of "Fri, 03 Dec 1999 08:32:32 CST." <3847D480.12A31666@prescod.net> References: <000801bf3d53$77a44f20$3a2d153f@tim> <3847D480.12A31666@prescod.net> Message-ID: <199912031447.JAA16565@eric.cnri.reston.va.us> Paul, do you want to be the head honcho for the reborn types SIG? You seem to have the right ideas, and you're the only one so far who has spoken up to keep it alive. I doubt that anyone else will volunteer, so if you don't, we will retire the SIG. I'll give you till June 2000 (the same expiration date as for other SIGs) to show that there's life in the subject. --Guido van Rossum (home page: http://www.python.org/~guido/) From bwarsaw@cnri.reston.va.us (Barry A. Warsaw) Fri Dec 3 15:29:47 1999 From: bwarsaw@cnri.reston.va.us (Barry A. Warsaw) (Barry A. Warsaw) Date: Fri, 3 Dec 1999 10:29:47 -0500 (EST) Subject: [meta-sig] Re: [Types-sig] The Types-SIG is comatose. Let's retire it. References: <199912022117.QAA15195@eric.cnri.reston.va.us> <3847D35A.59EB5770@digicool.com> Message-ID: <14407.57835.318063.51763@anthem.cnri.reston.va.us> >>>>> "JF" == Jim Fulton writes: JF> Even if the SIG goes away, I think some report on the SIGs JF> activity should be made at IPC8 (assuming there is a SIG JF> status discussion). Let's see how many other topics get championed. If there's time, I say this is a great idea. -Barry From m.faassen@vet.uu.nl Fri Dec 3 17:08:38 1999 From: m.faassen@vet.uu.nl (Martijn Faassen) Date: Fri, 03 Dec 1999 18:08:38 +0100 Subject: [Types-sig] The Types-SIG is comatose. Let's retire it. References: <199912022117.QAA15195@eric.cnri.reston.va.us> Message-ID: <3847F916.35743ADB@vet.uu.nl> Guido van Rossum wrote: > > It's time for the twice yearly ritual of looking for comatose SIGs. > >From the archives, it looks like the types-sig is the only dud amongst > the crowd: all other SIGs are doing well (some are doing *extremely* > well, like the doc-sig and the matrix-sig). > > The types-sig hasn't had traffic since August (4 messages) and in all > of 1999 it has only has 12 messages. > > Type-sig, what do you have to say for yourself? Oddly enough, there has been quite some discussion on types on comp.lang.python since then. John Skaller's viper discussions and the discussions on Ruby are an example. I agree with others that the problem of the types-SIG is a lack of focus of discussion (too many different topics all having to do somewhat with types), and nobody doing the brunt of the work. John Skaller does appear to be doing lots of work on types in Python, but he seems to prefer working alone with his source. It's not as if there's no interest for type issues in the Python community; far from that. It just seems that there's nobody who has enough time/knowledge to work on them. Having studied the Zope sources I'm becoming painfully aware for the need of something like interfaces. Zope's source would really be far more understandable if it were rewritten with interfaces, I think. I understand Jim Fulton's motivation concerning interfaces far better since my foray into those sources. I'm still interested in static types as well, mostly in the interests of compilation. It's ridiculous to split a SIG that doesn't talk, of course, but perhaps better would be to have a 'compiler-SIG' and an 'interfaces-SIG'. 
I'd expect the interface-SIG to come with results far more quickly than the compiler-SIG. Regards, Martijn From m.faassen@vet.uu.nl Fri Dec 3 17:15:26 1999 From: m.faassen@vet.uu.nl (Martijn Faassen) Date: Fri, 03 Dec 1999 18:15:26 +0100 Subject: [Types-sig] RE: [meta-sig] The Types-SIG is comatose. Let's retire it. References: <000801bf3d53$77a44f20$3a2d153f@tim> <3847D480.12A31666@prescod.net> <199912031447.JAA16565@eric.cnri.reston.va.us> Message-ID: <3847FAAE.B8D74FDD@vet.uu.nl> Guido van Rossum wrote: > > Paul, do you want to be the head honcho for the reborn types SIG? You > seem to have the right ideas, and you're the only one so far who has > spoken up to keep it alive. I doubt that anyone else will volunteer, > so if you don't, we will retire the SIG. I'll give you till June 2000 > (the same expiration date as for other SIGs) to show that there's life > in the subject. Okay, I'd like to keep the place alive as well. I'll endeavor contribute by replying to Paul's messages, or anybody else who posts here. (I was actually quite excited to suddenly discover my types-SIG mailbox had lots of new messages in it :). Perhaps I'll get more time to actually work on these issues next year. Then my only thing lacking is actual knowledge and experience, but I can work on that. :) Regards, Martijn From jeremy@cnri.reston.va.us Fri Dec 3 17:33:55 1999 From: jeremy@cnri.reston.va.us (Jeremy Hylton) Date: Fri, 3 Dec 1999 12:33:55 -0500 (EST) Subject: [Types-sig] RE: [meta-sig] The Types-SIG is comatose. Let's retire it. In-Reply-To: <3847D46D.17C79972@prescod.net> References: <000801bf3d53$77a44f20$3a2d153f@tim> <3847D46D.17C79972@prescod.net> Message-ID: <14407.65283.532788.640647@goon.cnri.reston.va.us> >>>>> "PP" == Paul Prescod writes: PP> David Ascher wrote: >> Sounds quite a bit like JimH's proposal @ IPC7 (or was it 6?). >> He got booed, IIRC, but that was just an emotional reaction, >> methinks. =) Jim's proposal was to extend Python with Java-style syntax and semantics. The Modula-3 fans cried foul. PP> There is no non-trivial Python extension that will not PP> get booed. :-) PP> While I'm ranting, the other problem new people have is the PP> whole reference/copy issue. Is there any language that has more PP> understandable (perhaps more explicit) semantics for that stuff PP> that we could steal for Py2? I think Python's rules are pretty simple already! I think newbies get confused by the general design issue, rather than Python's semantics. I read The Practice of Programming a few months ago and much appreciated the discussion of resource (e.g. memory) management. The authors said: "One of the most difficult problems in designing the interface for a library (or a class or a package) is to manage resources that are owned by the library and shared by the library and those who call it." (p. 103) Memory management issues, in particular, don't simply disappear in garbage-collected languages. The designer still has to determine when to use copies and when to use shared objects. I don't think the language can do a lot more to help with this issue except have clear semantics. Jeremy From m.faassen@vet.uu.nl Fri Dec 3 17:40:16 1999 From: m.faassen@vet.uu.nl (Martijn Faassen) Date: Fri, 03 Dec 1999 18:40:16 +0100 Subject: [Types-sig] RE: [meta-sig] The Types-SIG is comatose. Let's retire it. 
References: <000801bf3d53$77a44f20$3a2d153f@tim> <3847D480.12A31666@prescod.net> Message-ID: <38480080.3403BDDF@vet.uu.nl> Paul Prescod wrote: > > > taking-no-more-from-this-than-that-a-successful-sig-needs-a- > > focused-charter-ly y'rs - tim > > I propose that the types sig be re-commissioned with a much tighter > commission. Let's focus on ONE of the three problems listed in our old > charter: > > http://www.python.org/sigs/types-sig/ > > And let's start with a clear direction from the Powers that Be. > > I propose: > > * the goal is a optional static type system for version 2. Okay, I'll assume this goal for now. I'd like to see something happen with interfaces too, but I'll just assume/hope that an interface proposal will arise 'naturally' from any static type system we come up with. > * presume that the type/class dichotomy has been removed in V2 Gladly. So, what does this mean in practice? A particular class is another type? I don't want to accidentally start the discussion on the dichotomy itself here, I just want to know what Python 2 is like in practice. For now I'll assume that if I declare a class in Python 2, that class becomes a type. > * backwards compatibility with current code is relatively important All right, though you'll run into trouble with any current code that messes too much with types, so we can just forget about that trouble, as it'd be caused by the solving of the class/type dichotomy in any case. > * compatibility with the Python 1.x interpreter is NOT important So we don't care if we can add static typing to the Python 1.x interpreter line? > * interfaces are not an issue Presumably they'll arise naturally, as I said before. :) > * parameterized (template) types are not available Darn! I like these, if I understand what you mean. Don't we need things like 'a list of integers'? Or 'a list of objects that have class-type Foo' (objects of class Bar may be of class-type Foo too if Bar derives from Foo). If you want to actually use static types for compilation 'Swallow style' (only compile those functions/classes that are *fully* static type described) you'd need something like parameterized types. Also, if you don't have parameterized types, you'll effectively lose track (statically) of the type of any object once you put it in a list. > * names are type checked, not expressions What does this mean?

a = 4 @ IntegerType
b = a @ IntegerType      # checked if a is indeed IntegerType
b = a + a @ IntegerType  # not checked, as a + a is an expression

class Foo:
    def doFoo():
        print "Foo!"

a = Foo() @ FooType
a.doFoo()                # does this do a typecheck for a?

> * got now, only named types (types and classes) can be declared, not > lists and tuples of types That fits in with the no template types idea, right? > (many of these restrictions are easy to work around in Python: for > instance making a list of string subclass of userlist) Hm. But tuples, lists and dictionaries are very basic in Python. If the type system does not support them that would seem to be a bit incongruous (and inconvenient). > Start from these (very similar!) proposals: > > http://www.python.org/~rmasse/papers/python-types/ This does talk about interfaces (protocols) though. What part of this proposal do you mean? > The current Visual Basic type system I'll reread your posts on that. > Something somewhere from JimH > The type declaration part of strongtalk Any references? 
> The first half of this: > http://www.foretec.com/python/workshops/1998-11/greg-type-ideas.html > > We should appoint an "editor" as they do in standards bodies. If there > are issues that just cannot be worked out by consensus, Guido rules. Guido would rule in any case if Guido disagrees with consensus, right? :) Regards, Martijn From jeremy@cnri.reston.va.us Fri Dec 3 17:52:42 1999 From: jeremy@cnri.reston.va.us (Jeremy Hylton) Date: Fri, 3 Dec 1999 12:52:42 -0500 (EST) Subject: [Types-sig] RE: [meta-sig] The Types-SIG is comatose. Let's retire it. In-Reply-To: <3847D480.12A31666@prescod.net> References: <000801bf3d53$77a44f20$3a2d153f@tim> <3847D480.12A31666@prescod.net> Message-ID: <14408.874.505464.996655@goon.cnri.reston.va.us> Paul Prescod proposes a new charter for the types-sig: > * the goal is a optional static type system for version 2. > * presume that the type/class dichotomy has been removed in V2 > * backwards compatibility with current code is relatively important > * compatibility with the Python 1.x interpreter is NOT important > * interfaces are not an issue > * parameterized (template) types are not available > * names are type checked, not expressions > * got now, only named types (types and classes) can be declared, not >lists and tuples of types If you're going to develop a static type system to describe Python programs (optional or otherwise), then I think you can't punt on all the things you want to punt on. > * interfaces are not an issue Yes, they are :-). > * parameterized (template) types are not available They need to be. > * names are type checked, not expressions Expressions need type checking, too! I'm thinking of the "the" special form in Common Lisp. (I don't have much experience with CL, so I'd appreciate input from someone who is.) Regardless of these minor quibbles, my largest complaint is: > * the goal is a optional static type system for version 2. What exactly is the deliverable. Saying an "optional static type system" is a bit vague. What is it specifically? A formal specification of the type system? A stand-alone utility that reports type errors? A new compiler? If this is a type system for Python 2, it seems that the best a SIG can hope for right now is a specification of the type system. Since Py2 design hasn't even started. Jeremy From jim@digicool.com Fri Dec 3 17:55:12 1999 From: jim@digicool.com (Jim Fulton) Date: Fri, 03 Dec 1999 12:55:12 -0500 Subject: [Types-sig] RE: [meta-sig] The Types-SIG is comatose. Let's retire it. References: <000801bf3d53$77a44f20$3a2d153f@tim> <3847D480.12A31666@prescod.net> <38480080.3403BDDF@vet.uu.nl> Message-ID: <38480400.D3EE8A6@digicool.com> Martijn Faassen wrote: > > Paul Prescod wrote: > > > > > taking-no-more-from-this-than-that-a-successful-sig-needs-a- > > > focused-charter-ly y'rs - tim > > > > I propose that the types sig be re-commissioned with a much tighter > > commission. Let's focus on ONE of the three problems listed in our old > > charter: > > > > http://www.python.org/sigs/types-sig/ I really agree with this. > > And let's start with a clear direction from the Powers that Be. > > > > I propose: > > > > * the goal is a optional static type system for version 2. > > Okay, I'll assume this goal for now. I'd like to see something happen > with interfaces too, but I'll just assume/hope that an interface > proposal will arise 'naturally' from any static type system we come up > with. I intend to summarize the interfaces discussion and report back. 
I also intend to go ahead and release the interface implementation based on requirements that we agreed to at Spam7 and mostly agreed to in the SIG. We'll also start folding it into Zope. Based on actual experience using it, we'll have a basis for future discussions. I desperately hope these future discussions happen somewhere other than the reinvented types sig. > > * presume that the type/class dichotomy has been removed in V2 > > Gladly. So, what does this mean in practice? A particular class is > another > type? I don't want to accidentally start the discussion on the dichotomy > itself > here, I just want to know what Python 2 is like in practice. For now > I'll > assume that if I declare a class in Python 2, that class becomes a type. I vaguely remember agreement on a number of issues. As I said in a previous post, I'll try to summarise the progress made and report back. We can decide what to do based on that. (Alternatively, if someone else wants to summarize that's OK with me.) (snip, I don't really care that much about static typing, except that I'm generally wary of it. ;) Jim -- Jim Fulton mailto:jim@digicool.com Python Powered! Technical Director (888) 344-4332 http://www.python.org Digital Creations http://www.digicool.com http://www.zope.org Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list with out my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats. From GoldenH@littoncorp.com Fri Dec 3 18:03:04 1999 From: GoldenH@littoncorp.com (Golden, Howard) Date: Fri, 3 Dec 1999 10:03:04 -0800 Subject: [Types-SIG] Python vs. Smalltalk/Strongtalk, etc. Was: The Types- SIG is comatose. Message-ID: Paul Prescod wrote: > And let's start with a clear direction from the Powers that Be. > > I propose: > > * the goal is a optional static type system for version 2. > * presume that the type/class dichotomy has been removed in V2 > * backwards compatibility with current code is relatively important > * compatibility with the Python 1.x interpreter is NOT important > * interfaces are not an issue > * parameterized (template) types are not available > * names are type checked, not expressions > * got now, only named types (types and classes) can be declared, not > lists and tuples of types There are a lot of different proposals. Do we all agree on all of these points? (Unlikely!) > Start from these (very similar!) proposals: > > http://www.python.org/~rmasse/papers/python-types/ > The current Visual Basic type system > Something somewhere from JimH > The type declaration part of strongtalk > The first half of this: > http://www.foretec.com/python/workshops/1998-11/greg-type-ideas.html It must be serendipity, but I was just thinking about this subject yesterday, and I went so far as to look up Strongtalk and download Squeak. Syntax differences aside, what I think we would benefit from is a comparison of the capabilities of Python1.x and proposed Python2 to Smalltalk/Strongtalk/Squeak, Visual Basic, etc. For me, I am looking at Python as a general purpose language, rather than a scripting language, so programming-in-the-large features are important. -- Specific questions: -- What if the C definition of functions and methods were extended by adding a signature object? (If so, how can signatures be specified?) Could the signatures then be used to generate more efficient code? Should there be function/method choice by signature? 
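To make the first question concrete, here is a rough sketch of the kind of information a signature object might record, written as plain Python data that a checker or code generator could inspect. (The names and layout are entirely made up for illustration; this is not a proposal for actual syntax or a real API.)

def distance(x, y):
    return abs(x - y)

# Hypothetical: a table of signatures, keyed by function name, that a
# type checker or compiler could consult before generating code.
signatures = {
    'distance': {'args': [('x', 'Integer'), ('y', 'Integer')],
                 'returns': 'Integer'},
}
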
Maybe I'm trying to make Python into something it wasn't intended to be, but I have this wish that I wouldn't have to use different languages for different tasks. Howard B. Golden Software developer Litton Industries, Inc. Woodland Hills, California From m.faassen@vet.uu.nl Fri Dec 3 18:08:29 1999 From: m.faassen@vet.uu.nl (Martijn Faassen) Date: Fri, 03 Dec 1999 19:08:29 +0100 Subject: [Types-sig] RE: [meta-sig] The Types-SIG is comatose. Let's retire it. References: <000801bf3d53$77a44f20$3a2d153f@tim> <3847D480.12A31666@prescod.net> <38480080.3403BDDF@vet.uu.nl> <38480400.D3EE8A6@digicool.com> Message-ID: <3848071D.5A994688@vet.uu.nl> Jim Fulton wrote: > > Martijn Faassen wrote: > > > > Paul Prescod wrote: > > > > > > > taking-no-more-from-this-than-that-a-successful-sig-needs-a- > > > > focused-charter-ly y'rs - tim > > > > > > I propose that the types sig be re-commissioned with a much tighter > > > commission. Let's focus on ONE of the three problems listed in our old > > > charter: > > > > > > http://www.python.org/sigs/types-sig/ > > I really agree with this. But I suppose you disagree with Paul on what this focus problem should be? You'd prefer interfaces, right? Or seeing what you said later on in your post, perhaps an interface-SIG? I expect I'd contribute to any discussion of interfaces *or* static types. I'd probably be able to contribute more of practical value to any interface development right now. I don't have so much to contribute about the class/type dichotomy. > > > And let's start with a clear direction from the Powers that Be. > > > > > > I propose: > > > > > > * the goal is a optional static type system for version 2. > > > > Okay, I'll assume this goal for now. I'd like to see something happen > > with interfaces too, but I'll just assume/hope that an interface > > proposal will arise 'naturally' from any static type system we come up > > with. > > I intend to summarize the interfaces discussion and report back. That'd be really helpful. > I also intend to go ahead and release the interface implementation > based on requirements that we agreed to at Spam7 and mostly agreed to > in the SIG. That'd be even more helpful. > We'll also start folding it into Zope. And that'd be wonderful! I am starting to feel that need after getting lost in the Zope sources too often. I'd like to contribute; perhaps by documenting something for starters. Any ideas? > Based on actual > experience using it, we'll have a basis for future discussions. So practical.! :) I'd like to get in on this early on. I assume I'll catch your announcement on the release of the interface implementation, but I'd also be very interested to follow the process of rolling it into Zope from the start. Not that I'm likely to be able to contribute much at the start, but it just sounds really interesting to me. Any idea on how this could be accomplished? > I desperately hope these future discussions happen somewhere other than > the reinvented types sig. The interfaces-SIG? :) [snip class/type discussion] [lots on static typing] > (snip, I don't really care that much about static typing, except that > I'm generally wary of it. ;) *grin* Okay, I suggest another design goal for the revived types-SIG: 'Pass the Fulton Test'. We must strive for a static type system so wonderful that even Jim Fulton will like it. 
:) Regards, Martijn From m.faassen@vet.uu.nl Fri Dec 3 18:17:26 1999 From: m.faassen@vet.uu.nl (Martijn Faassen) Date: Fri, 03 Dec 1999 19:17:26 +0100 Subject: [Types-sig] RE: [meta-sig] The Types-SIG is comatose. Let's retire it. References: <000801bf3d53$77a44f20$3a2d153f@tim> <3847D480.12A31666@prescod.net> <14408.874.505464.996655@goon.cnri.reston.va.us> Message-ID: <38480936.270617B9@vet.uu.nl> Jeremy Hylton wrote: > > Paul Prescod proposes a new charter for the types-sig: > > * the goal is a optional static type system for version 2. > > * presume that the type/class dichotomy has been removed in V2 > > * backwards compatibility with current code is relatively important > > * compatibility with the Python 1.x interpreter is NOT important > > * interfaces are not an issue > > * parameterized (template) types are not available > > * names are type checked, not expressions > > * got now, only named types (types and classes) can be declared, not > >lists and tuples of types > > If you're going to develop a static type system to describe Python > programs (optional or otherwise), then I think you can't punt on all > the things you want to punt on. I probably agree with you (at least partially). See my previous post. > > * interfaces are not an issue > Yes, they are :-). Why, exactly? > > * parameterized (template) types are not available > They need to be. Why, exactly? :) > > * names are type checked, not expressions > Expressions need type checking, too! I'm thinking of the "the" > special form in Common Lisp. (I don't have much experience with CL, > so I'd appreciate input from someone who is.) I'm even less familiar with CL than you are, so I don't know... > Regardless of these minor quibbles, my largest complaint is: > > * the goal is a optional static type system for version 2. > > What exactly is the deliverable. Saying an "optional static type > system" is a bit vague. What is it specifically? A formal > specification of the type system? A stand-alone utility that reports > type errors? A new compiler? Very good question. We need to agree on a deliverable. > If this is a type system for Python 2, it seems that the best a SIG > can hope for right now is a specification of the type system Unfortunately this kind of goal may be too vague to actually involve people. Not being able to try things out in some kind of implementation may disconnect the discussion from reality. > Since > Py2 design hasn't even started. When will this start, by the way? Anybody know or is this still pure speculation? The conference? I started wondering when I saw this in the 'A Date with Tim Peters...' post by Guido on comp.lang.python: - a developers' day where the feature set of Python 2.0 is worked out. Regards, Martijn From Paul@digicool.com Fri Dec 3 18:33:35 1999 From: Paul@digicool.com (Paul Everitt) Date: Fri, 3 Dec 1999 13:33:35 -0500 Subject: [Types-sig] RE: [meta-sig] The Types-SIG is comatose. Let's retire it. Message-ID: <613145F79272D211914B0020AFF64019262F5D@gandalf.digicool.com> Hey folks, isn't this technical discussion better handled on the types-sig? :^) --Paul > -----Original Message----- > From: Martijn Faassen [mailto:m.faassen@vet.uu.nl] > Sent: Friday, December 03, 1999 1:17 PM > Cc: types-sig@python.org; meta-sig@python.org > Subject: Re: [Types-sig] RE: [meta-sig] The Types-SIG is > comatose. Let's > retire it. > > > Jeremy Hylton wrote: > > > > Paul Prescod proposes a new charter for the types-sig: > > > * the goal is a optional static type system for version 2. 
> > > * presume that the type/class dichotomy has been removed in V2 > > > * backwards compatibility with current code is relatively > important > > > * compatibility with the Python 1.x interpreter is NOT important > > > * interfaces are not an issue > > > * parameterized (template) types are not available > > > * names are type checked, not expressions > > > * got now, only named types (types and classes) can be > declared, not > > >lists and tuples of types > > > > If you're going to develop a static type system to describe Python > > programs (optional or otherwise), then I think you can't punt on all > > the things you want to punt on. > > I probably agree with you (at least partially). See my previous post. > > > > * interfaces are not an issue > > Yes, they are :-). > > Why, exactly? > > > > * parameterized (template) types are not available > > They need to be. > > Why, exactly? :) > > > > * names are type checked, not expressions > > Expressions need type checking, too! I'm thinking of the "the" > > special form in Common Lisp. (I don't have much experience with CL, > > so I'd appreciate input from someone who is.) > > I'm even less familiar with CL than you are, so I don't know... > > > Regardless of these minor quibbles, my largest complaint is: > > > * the goal is a optional static type system for version 2. > > > > What exactly is the deliverable. Saying an "optional static type > > system" is a bit vague. What is it specifically? A formal > > specification of the type system? A stand-alone utility > that reports > > type errors? A new compiler? > > Very good question. We need to agree on a deliverable. > > > If this is a type system for Python 2, it seems that the best a SIG > > can hope for right now is a specification of the type system > > Unfortunately this kind of goal may be too vague to actually involve > people. Not being able to try things out in some kind of > implementation > may disconnect the discussion from reality. > > > Since > > Py2 design hasn't even started. > > When will this start, by the way? Anybody know or is this still pure > speculation? The conference? I started wondering when I saw > this in the > 'A Date with Tim Peters...' post by Guido on comp.lang.python: > > - a developers' day where the feature set of Python 2.0 is > worked out. > > Regards, > > Martijn > > _______________________________________________ > Meta-sig maillist - Meta-sig@python.org > http://www.python.org/mailman/listinfo/meta-sig > From janssen@parc.xerox.com Fri Dec 3 19:21:53 1999 From: janssen@parc.xerox.com (Bill Janssen) Date: Fri, 3 Dec 1999 11:21:53 PST Subject: [Types-sig] RE: [meta-sig] The Types-SIG is comatose. Let's retire it. In-Reply-To: Your message of "Fri, 03 Dec 1999 09:52:42 PST." <14408.874.505464.996655@goon.cnri.reston.va.us> Message-ID: <99Dec3.112202pst."3586"@watson.parc.xerox.com> > Regardless of these minor quibbles, my largest complaint is: > > * the goal is a optional static type system for version 2. > > What exactly is the deliverable. Saying an "optional static type > system" is a bit vague. What is it specifically? A formal > specification of the type system? A stand-alone utility that reports > type errors? A new compiler? I share some of Jeremy's concerns about the single goal. My favorite tack on these things is to focus on what the problem is. In my view, the largest single technical problem with Python is that it doesn't afford the static type checking that Java has. 
This, in my experience when I ask people about it, always turns out to mean that there's no way to type-check the use of an imported module. So I'd make the priority be the ability to optionally declare types in both callable signatures and in the code itself, and to have types checked at least across use of imported modules. Note that, contrary to Jeremy's assertion, this doesn't explicitly mention interfaces, and doesn't necessarily involve them. Of course, defining a module always implicitly defines an interface, so one could argue that interfaces are always a factor. Bill From prescod@prescod.net Fri Dec 3 19:38:05 1999 From: prescod@prescod.net (Paul) Date: Fri, 3 Dec 1999 13:38:05 -0600 (CST) Subject: [Types-sig] RE: [meta-sig] The Types-SIG is comatose. Let's retire it. In-Reply-To: <14408.874.505464.996655@goon.cnri.reston.va.us> Message-ID: On Fri, 3 Dec 1999, Jeremy Hylton wrote: > > If you're going to develop a static type system to describe Python > programs (optional or otherwise), then I think you can't punt on all > the things you want to punt on. Forever, no? For a first draft? Yes. Type systems can be extensible. C didn't foresee objects but C++ added them, and C++ didn't support parameterized types (at first) but added those too. I'm always torn on these design issues between trying to get it all right the first time and doing it incrementally. There are big risks either way but insofar as we never get anywhere when we try to do it all at once...that seems like the bigger risk. > > * interfaces are not an issue > Yes, they are :-). Not in Visual Basic. :) > > * parameterized (template) types are not available > They need to be. At some point, yes. For us to be able to say that foo is an integer and bar is a string, no. A lot of people would LOVE to have that level of type safety. > > * names are type checked, not expressions > Expressions need type checking, too! Maybe someday... or let me say that I'm all for expressions being type *checked* but not for a syntax for declaring the type of an expression. I'm not in favor of a "cast" or "assert-type" statement in version 1 of our type system. > > * the goal is a optional static type system for version 2. > > What exactly is the deliverable. Saying an "optional static type > system" is a bit vague. What is it specifically? A formal > specification of the type system? A stand-alone utility that reports > type errors? A new compiler? A formal specification of the type system that Guido likes enough to say: "yes, this will be the basis of Python 2's static type checking. Now go improve it and build on it." > If this is a type system for Python 2, it seems that the best a SIG > can hope for right now is a specification of the type system. Since > Py2 design hasn't even started. Agreed. I was only talking about a document that could serve first as an RFC and then later as a specification. Paul Prescod From jeremy@cnri.reston.va.us Fri Dec 3 20:15:39 1999 From: jeremy@cnri.reston.va.us (Jeremy Hylton) Date: Fri, 3 Dec 1999 15:15:39 -0500 (EST) Subject: [Types-sig] RE: [meta-sig] The Types-SIG is comatose. Let's retire it. 
In-Reply-To: <99Dec3.112202pst."3586"@watson.parc.xerox.com> References: <14408.874.505464.996655@goon.cnri.reston.va.us> Message-ID: <14408.9451.432414.245360@goon.cnri.reston.va.us> >>>>> "PP" == Paul writes: PP> On Fri, 3 Dec 1999, Jeremy Hylton wrote: >> If you're going to develop a static type system to describe >> Python programs (optional or otherwise), then I think you can't >> punt on all the things you want to punt on. PP> Forever, no? For a first draft? Yes. Type systems can be PP> extensible. C didn't forsee objects but C++ added them and C++ PP> doesn't support parameterized types (at first) but added those PP> two. And Java didn't support them at first, but lots of people gripe about it and several people have proposed solutions. If we learn a lesson from C++ and Java here, it is that parameterized types are an important part of the type system. PP> I'm always torn on these design issues between trying to get it PP> all right the first time and doing it incrementally. There are PP> big risks either way but insofar as we never get anywhere when PP> we try to do it all at once...that seems like the bigger risk. I think I see where you're coming from now. I might agree that some of the issues (e.g. parameterized types) aren't important for the first draft. They will need to be added at some point before the work is complete, so that SIG charter shouldn't specifically exclude them. Bill Janssen made a different and good suggestion about what the product of the SIG would be: a specification and a mechanism to type check the use of a module. A potentially interesting variant of that is to type-check the use of Java object by JPython programs. Which is one reason why I think interfaces, for example, need to be part of the type system. BJ> Note that, contrary to Jeremy's assertion, this doesn't BJ> explicitly mention interfaces, and doesn't necessarily involve BJ> them. Of course, defining a module always implicitly defines an BJ> interface, so one could argue that interfaces are always a BJ> factor. We want to be able to say something like: "Method expects a file-like object as its second argument." Specifying "file-like object" requires something like an interface. [tangent?] I've looked very briefly at MzScheme, a Scheme implementation done by the PLT group at Rice. It supports objects and interfaces, and units (modules) and signatures. At first glance, it appears to be a carefully thought-out way to add type checking to an object-oriented, dynamically-typed language. Jeremy From jim@digicool.com Sat Dec 4 00:25:15 1999 From: jim@digicool.com (Jim Fulton) Date: Fri, 03 Dec 1999 19:25:15 -0500 Subject: [Types-sig] RE: [meta-sig] The Types-SIG is comatose. Let's retire it. References: <000801bf3d53$77a44f20$3a2d153f@tim> <3847D480.12A31666@prescod.net> <38480080.3403BDDF@vet.uu.nl> <38480400.D3EE8A6@digicool.com> <3848071D.5A994688@vet.uu.nl> Message-ID: <38485F6B.E4D18FE1@digicool.com> Martijn Faassen wrote: > > Jim Fulton wrote: > > > > Martijn Faassen wrote: > > > > > > Paul Prescod wrote: > > > > > > > > > taking-no-more-from-this-than-that-a-successful-sig-needs-a- > > > > > focused-charter-ly y'rs - tim > > > > > > > > I propose that the types sig be re-commissioned with a much tighter > > > > commission. Let's focus on ONE of the three problems listed in our old > > > > charter: > > > > > > > > http://www.python.org/sigs/types-sig/ > > > > I really agree with this. > > But I suppose you disagree with Paul on what this focus problem should > be? 
I don't care what "this" problem is. I see three problems and, while there may be some interdependency, I think we would make better use of our time thinking of them and working on them separately. I endorse having the type system work on "static typing" (uh, whatever that is...) as long as it doesn't work on interfaces and removing the class/type dicotomy. > You'd prefer interfaces, right? Or seeing what you said later on in > your post, perhaps an interface-SIG? Yes, although at this point, I don't care if it's a SIG. In fact, I think a better course of action would be to release my interface module and let people use it and develop opinions based on it. (And, BTW, address some Zope issues. :) > I expect I'd contribute to any > discussion of interfaces *or* static types. I'd probably be able to > contribute more of practical value to any interface development right > now. Maybe you can pitch in to applying interfaces in Zope. Have you read my proposal from waaaaay back? > I don't have so much to contribute about the class/type dichotomy. Note that, as a Zope user, you enjoy the benefits of removing it. (Most Zope classes, including ZClasses are also types via ExtensionClass. An additional related issue is to make classes first-class in the sense that they have their own methods/attributes. This would have made ZClasses easier.) (snip) > > We'll also start folding it into Zope. > > And that'd be wonderful! I am starting to feel that need after getting > lost in the Zope sources too often. Yee ha! > I'd like to contribute; perhaps by > documenting something for starters. Any ideas? Maybe you should take the lead on folding them into Zope? Any way you want to contribute would be welcome. :) > > Based on actual > > experience using it, we'll have a basis for future discussions. > > So practical.! :) I'd like to get in on this early on. I assume I'll > catch your announcement on the release of the interface implementation, > but I'd also be very interested to follow the process of rolling it into > Zope from the start. Yee ha! > Not that I'm likely to be able to contribute much > at the start, but it just sounds really interesting to me. Any idea on > how this could be accomplished? Frankly, I haven't thought about it in a while. I'm sure I'll have some thoughts and some specific suggestions when I review the types sig material. In any case, that discussion should happen elsewhere, either in private email or on Zope-dev. (snip) > > (snip, I don't really care that much about static typing, except that > > I'm generally wary of it. ;) > > *grin* Okay, I suggest another design goal for the revived types-SIG: > 'Pass the Fulton Test'. We must strive for a static type system so > wonderful that even Jim Fulton will like it. :) Not necessary. I'm confident that there are plenty of other skeptics out there. :) Jim -- Jim Fulton mailto:jim@digicool.com Python Powered! Technical Director (888) 344-4332 http://www.python.org Digital Creations http://www.digicool.com http://www.zope.org Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list with out my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats. From janssen@parc.xerox.com Sat Dec 4 03:06:15 1999 From: janssen@parc.xerox.com (Bill Janssen) Date: Fri, 3 Dec 1999 19:06:15 PST Subject: [Types-sig] RE: [meta-sig] The Types-SIG is comatose. Let's retire it. 
In-Reply-To: Your message of "Fri, 03 Dec 1999 16:25:15 PST." <38485F6B.E4D18FE1@digicool.com> Message-ID: <99Dec3.190623pst."3586"@watson.parc.xerox.com> > Maybe you can pitch in to applying interfaces in Zope. Have you read > my proposal from waaaaay back? You know, it would be great if the types-sig had a page pointing to various documents, like "Jim's proposal from waaaaaay back". Bill From jim@digicool.com Sat Dec 4 15:50:54 1999 From: jim@digicool.com (Jim Fulton) Date: Sat, 04 Dec 1999 15:50:54 +0000 Subject: [Types-sig] RE: [meta-sig] The Types-SIG is comatose. Let's retire it. References: <99Dec3.190623pst."3586"@watson.parc.xerox.com> Message-ID: <3849385E.3B381984@digicool.com> Bill Janssen wrote: > > > Maybe you can pitch in to applying interfaces in Zope. Have you read > > my proposal from waaaaay back? > > You know, it would be great if the types-sig had a page pointing to > various documents, like "Jim's proposal from waaaaaay back". I'll make this available. Stay tuned. :) Jim -- Jim Fulton mailto:jim@digicool.com Technical Director (888) 344-4332 Python Powered! Digital Creations http://www.digicool.com http://www.python.org Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list with out my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats. From paul@prescod.net Sat Dec 4 16:32:50 1999 From: paul@prescod.net (Paul Prescod) Date: Sat, 04 Dec 1999 10:32:50 -0600 Subject: [Types-sig] Static typing considered HARD Message-ID: <38494232.C1381ED9@prescod.net> I'm still not sure what to do about the static typing and the types sig. The more I thought about types the less I became convinced that a quick "low hanging fruit" approach would work. I no longer propose a quick RFC on static typing. Here's the problem: in Visual Basic, Java, ML and other languages I am most familiar with, compilation is conceptually three pass: * parse * "resolve names to type/code/variable references" * execute In other words, the entire universe of types is figured out before a single line of code executes. In that context the words "static typing" have an obvious meaning. While you are resolve name references you do a bunch of checks to make sure that they are used consistently. But in Python, type objects only come about *through* the execution of code. This makes Python incredibly dynamic but it also means that the question of what exactly static type checking means is confused. Simple example: import sys if sys.argv[0]=="weirdness": from foo_mod import foo_class else: from foo_mod2 import foo_class One could imagine that in some Python 2, import statements and class definitions could be limited to being at the top, before "code". There might be some special syntax (e.g. __import__, __define_class__ ) for doing module-loading and type definition at runtime. Still, I don't consider that something for the types-sig to work out. My personal opinion is that it would be a Good Thing for Python to become a tad less dynamic in the "core syntax" in exchange for compile-time checking of names. Note that in a lot of ways, Java is "as dynamic" as Python. You can introduce new functions and classes "at runtime." The difference is that Java's syntax for doing so is brutally complex and verbose so you are disinclined to do it. I think that there must be a middle ground where our "default semantics" are static but it is easy enough to do dynamic things (e.g. 
foo_mod = __import__( "foo.py")) that we don't feel burdened. Our innovation beyond Java would not just be syntax. We could recognize that modules and types introduced "at runtime" are pyobjects and just allow them to be used with no casting or special syntax. Only the *introduction syntax* would be special. So where Java would say something like: this.that.Module mod = this.that.LoadModule( "foo" ) this.that.Class cls = mod.loadClass( "myclass" ) this.that.Method meth = cls.loadMethod( "doit" ) this.that.Arglist args = new ArgList() args.addArg( "arg1" ) args.addArg( "arg2" ) Object rc = meth.Invoke( args ) Python would say something like: foo = __import__( "foo" ) foo.myclass.doit( "arg1", "arg2" ) Once again, Visual Basic (shudder) is a good guide here. Although I am not consciously cloning Visual Basic, my ideas seem to be naturally tending towards it. Once again it seems to have a pretty common sense (to me!) approach to static type checking. Even if we ignore static type checking Python 2 really has to do something about the "misspelling problem." One extra character on a method name can crash a server that has been running for weeks. Once this problem is fixed, the term "static type checking" will become meaningful. In the current environment, it is probably not and thus should not be the first focus of a new types-sig. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself "I always wanted to be somebody, but I should have been more specific." --Lily Tomlin From uche.ogbuji@fourthought.com Sat Dec 4 17:18:38 1999 From: uche.ogbuji@fourthought.com (Uche Ogbuji) Date: Sat, 04 Dec 1999 10:18:38 -0700 Subject: [Types-sig] Static typing considered HARD References: <38494232.C1381ED9@prescod.net> Message-ID: <38494CEE.ED11604A@fourthought.com> Paul Prescod wrote: > Here's the problem: in Visual Basic, Java, ML and other languages I am > most familiar with, compilation is conceptually three pass: > > * parse > * "resolve names to type/code/variable references" > * execute This seems all out of whack to me. First of all, symbol-table management may or may not belong to the "parse" step, depending on your preferences. The Dragon book ducusses this matter in good detail. I don't know about VB, but Java and C/C++ certainly merge your steps 1 & 2. C/C++ also does not have "execute" as any recognizable part of compilation, unless you mean cpp and template instantiation. I don't think Java has "execute" as part of compilation either. ML, at least the version I used a few years ago, is something of its own breed of fish. > But in Python, type objects only come about *through* the execution of > code. This makes Python incredibly dynamic but it also means that the > question of what exactly static type checking means is confused. Simple > example: > > import sys > > if sys.argv[0]=="weirdness": > from foo_mod import foo_class > else: > from foo_mod2 import foo_class This is the sort of thing that gives Python its power, and it is the sort of thing without which I'm not sure I wouldn't be considering another language. > One could imagine that in some Python 2, import statements and class > definitions could be limited to being at the top, before "code". There > might be some special syntax (e.g. __import__, __define_class__ ) for > doing module-loading and type definition at runtime. Still, I don't > consider that something for the types-sig to work out. 
My personal > opinion is that it would be a Good Thing for Python to become a tad less > dynamic in the "core syntax" in exchange for compile-time checking of > names. This is exactly the sort of idea that terrifies me about Python 2, as I've done a poor job of expressing before. My hope is that Python 2 remains Python, and such artificial constraints as "imports only at the top" and all that in order to satisfy IMHO mis-placed notions of type safety are dropped in the nearest dustbin. > Note that in a lot of ways, Java is "as dynamic" as Python. You can > introduce new functions and classes "at runtime." The difference is that > Java's syntax for doing so is brutally complex and verbose so you are > disinclined to do it. No! No! No! If you are talking about Java reflections and introspection, I have no inkling how these features lend it even a modicum of Python's dynamicism. Note that Python's true introspection and dynamic typing is one of my most powerful tools in converting Java programmers to the language. I have heard Java described as "programming in a straight jacket". That is a very accurate observation, and the precise reason I don't want Python to even start in that direction. > I think that there must be a middle ground where > our "default semantics" are static but it is easy enough to do dynamic > things (e.g. foo_mod = __import__( "foo.py")) that we don't feel > burdened. I'll look out warily for the sort of middle ground in question. If it's something such as "imports only at the top", I guess I'll just have to scream blood and bile. > Even if we ignore static type checking Python 2 really has to do > something about the "misspelling problem." One extra character on a > method name can crash a server that has been running for weeks. Once > this problem is fixed, the term "static type checking" will become > meaningful. In the current environment, it is probably not and thus > should not be the first focus of a new types-sig. I keep hearing this sort of thing, and I keep saying that it's a red herring. Lack of static typing does _not_ prevent Python from being scalable to large-scale and production environments. Our experience at FourThought, where many of our projects are small-enterprise systems built with Python and sometimes CORBA, will make it very hard for anyone to convince me so. I think the experience of users such as eGroups supports my feeling. If anything, it is Java that I think is tremendously over-rated for large-scale projects and I predict its failure in that space will soon be an industry scandal. I also don't see this "misspelling" problem. Proper configuration-management procedures and testing, along with intelligent error-recovery, prevent such problems, which can also occur in the most strongly-typed systems. -- Uche Ogbuji FourThought LLC, IT Consultants uche.ogbuji@fourthought.com (970)481-0805 Software engineering, project management, Intranets and Extranets http://FourThought.com http://OpenTechnology.org From uche.ogbuji@fourthought.com Sat Dec 4 17:25:19 1999 From: uche.ogbuji@fourthought.com (Uche Ogbuji) Date: Sat, 04 Dec 1999 10:25:19 -0700 Subject: [Types-sig] So What is Python Anyway? References: <38494232.C1381ED9@prescod.net> Message-ID: <38494E7F.375BD82F@fourthought.com> All these radical suggestions for the transmogrification of Python 2 leads me to the overwhelming question. What is Python? What makes us use this language? What are the particular use-cases that we think impede our use of this language? 
I think that maybe a comprehensive and convincing description of the problem that the types-sig is trying to solve is essential before we go down the road of more proposals to cripple Python's dynamicism and all that. -- Uche Ogbuji FourThought LLC, IT Consultants uche.ogbuji@fourthought.com (970)481-0805 Software engineering, project management, Intranets and Extranets http://FourThought.com http://OpenTechnology.org From faassen@vet.uu.nl Sat Dec 4 18:21:43 1999 From: faassen@vet.uu.nl (Martijn Faassen) Date: Sat, 4 Dec 1999 19:21:43 +0100 Subject: [Types-sig] Static typing considered HARD In-Reply-To: <38494CEE.ED11604A@fourthought.com> References: <38494232.C1381ED9@prescod.net> <38494CEE.ED11604A@fourthought.com> Message-ID: <19991204192143.A25667@vet.uu.nl> Uche Ogbuji wrote: > Paul Prescod wrote: [snip] > > But in Python, type objects only come about *through* the execution of > > code. This makes Python incredibly dynamic but it also means that the > > question of what exactly static type checking means is confused. Simple > > example: > > > > import sys > > > > if sys.argv[0]=="weirdness": > > from foo_mod import foo_class > > else: > > from foo_mod2 import foo_class > > This is the sort of thing that gives Python its power, and it is the > sort of thing without which I'm not sure I wouldn't be considering > another language. > > > One could imagine that in some Python 2, import statements and class > > definitions could be limited to being at the top, before "code". There > > might be some special syntax (e.g. __import__, __define_class__ ) for > > doing module-loading and type definition at runtime. Still, I don't > > consider that something for the types-sig to work out. My personal > > opinion is that it would be a Good Thing for Python to become a tad less > > dynamic in the "core syntax" in exchange for compile-time checking of > > names. > > This is exactly the sort of idea that terrifies me about Python 2, as > I've done a poor job of expressing before. My hope is that Python 2 > remains Python, and such artificial constraints as "imports only at the > top" and all that in order to satisfy IMHO mis-placed notions of type > safety are dropped in the nearest dustbin. It's good that someone expressed this. While I myself would argue for some form of static typing being added to (part of) Python, I do think Python's dynamicism should be kept in mind very strongly. [snip more arguments against any curtailing of Python's dynamicism] > > Even if we ignore static type checking Python 2 really has to do > > something about the "misspelling problem." One extra character on a > > method name can crash a server that has been running for weeks. Once > > this problem is fixed, the term "static type checking" will become > > meaningful. In the current environment, it is probably not and thus > > should not be the first focus of a new types-sig. > > I keep hearing this sort of thing, and I keep saying that it's a red > herring. Lack of static typing does _not_ prevent Python from being > scalable to large-scale and production environments. Our experience at > FourThought, where many of our projects are small-enterprise systems > built with Python and sometimes CORBA, will make it very hard for anyone > to convince me so. I think the experience of users such as eGroups > supports my feeling. Likewise the experiences of the Zope user base. I've been debugging my own Zope products, which had syntax errors and misspellings all over the place. 
Zope itself, however, keeps running happily, as it'll catch the exceptions. As you say in a part of your post that I snipped, good exception handling facilities and testing procedures alleviate a lot of the problems with misspellings and the like.

That said, I am interested in attempts that make Python even more robust. I do occasionally worry about code that may contain bugs but that is not exercised enough during debugging. Of course this happens with statically typed languages as well, but at least the compiler catches some problems.

> If anything, it is Java that I think is
> tremendously over-rated for large-scale projects and I predict its
> failure in that space will soon be an industry scandal.

Interesting. Anyway, it's good that your view is present on the types-SIG.

My take on static types in Python has been the Swallow proposal. The idea is that we want some early-result points in the project to add static types to Python. With quite a few others I deem the possible speed payoff of adding static types at least as important as the possible code-quality payoff.

Adding static types to Python proper is hard, and undesirable if it entails giving up too much of Python's dynamicism, as has been observed.

The assumption of Swallow is that many parts of a typical Python program do not profit a lot from Python's dynamic typing, though of course other parts do. Traditionally the only way to gain speed with Python programs has been to move parts that can be static anyway to C. This is however a rather big step. It would be nicer if our extension modules could be more like Python itself. This way there is a gradual transition from dynamic Python to static Python code.

The Swallow proposal is to find a subset of Python (Swallow) that is horrible in all the ways Uche so emphatically dislikes. :) Get rid of whatever is necessary in Python to make Swallow amenable to static types; restrict imports to the top, restrict what magic one can do with classes, etc. The important point is that Swallow is a strict subset of Python, not adding any facilities or different semantics of its own, as much as possible. Then, provide a facility to describe the type signature of any class, function or variable in the Swallow code. No fancy type inference, just the programmer describing everything. After that the Swallow code could be compiled (or translated to C).

Of course I'm skimming over lots and lots of problems here; Swallow code can't for instance use any non-Swallow module. Writing a C translator is hard. Writing a static type checker for Swallow is hard. Identifying the Swallow subset is hard, and preserving Python semantics in it is hard. Still, it seems to me it's less hard than adding optional static types to Python itself, while still keeping lots of the payoff. It'd be great if Python 2 had a Swallow subsystem.

A possible activity for the types-SIG could in fact be to identify the proper Swallow subset of Python; that subset of Python amenable to static types and fairly straightforward translation to C code.

Of course I keep pushing Swallow without writing any actual code, so you all may be bored of it by now.
:) Regards, Martijn From uche.ogbuji@fourthought.com Sat Dec 4 21:37:30 1999 From: uche.ogbuji@fourthought.com (Uche Ogbuji) Date: Sat, 04 Dec 1999 14:37:30 -0700 Subject: [Types-sig] Static typing considered HARD References: <38494232.C1381ED9@prescod.net> <38494CEE.ED11604A@fourthought.com> <19991204192143.A25667@vet.uu.nl> Message-ID: <3849899A.DC694EB2@fourthought.com> Martijn Faassen wrote: > My take on static types in Python has been the Swallow proposal. The idea > is that we want some early-result points in the project to add static > types to Python. With quite a few others I deem the possible speed payoff > of adding static types as least as important as the possible code-quality > payoff. > > Adding static types to Python proper is hard, and undesirable if it entails > giving up too much of Python's dynamicism, as has been observed. > > The assumption of Swallow is that many parts of a typical Python program > do not profit a lot from Python's dynamic typing, though of course other > parts do. Traditionally the only way to gain speed with Python programs > has been to move parts that can be static anyway to C. This is however a > rather big step. It would be nicer if our extension modules could be more > like Python itself. This way there is a gradual transition between > dynamic Python to static Python code. > > The Swallow proposal is to find a subset of Python (Swallow) that is > horrible in all the ways Uche so empathically dislikes. :) Get rid of > whatever is necessary in Python to make Swallow amenable to static types; > restrict imports to the top, restrict what magic one can do with classes, > etc. The important point is that Swallow is a strict subset of Python, > not adding any facilities or different semantics of its own, as much as > possible. I actually don't have too much problem with this approach. I don't like to entirely shun the voices that clamor for dynamic typing: my main concern is that such mechanisms are entirely optional and transparent to those who don't want them. Your discussion of optimization exactly meets my experience. When we run into speed problems, we find a part of the 20% of the code that is really doing all the work, and we re-write it in C. An open-source example is in 4XSLT, which at first did all the Path expression parsing in Python. We found that this was having far too heavy an effect on performance and re-wrote it mostly in C. If there were a way to take _only those sections to be optimized_ and instead re-write them in a Pythonic syntax that could then be compiled to bare-metal speeds, I would appreciate it and use it as much as anyone else. I wouldn't expect or desire such a facility in the language core, however. The type-safety issue is entirely different, and IMHO this is where the real fantasy comes in: people thinking that statically-typed languages are really less susceptible to semantic errors than Python. Nevertheless some _very_ smart people here say that static typing will solve their code-quality problems, so I say, why can't we deal with this using a separate static-type-checker, maybe with some interface-definition language embedded in DocStrings or a separate spec file? Of course this doesn't address the problem of dynamic type-modification, but if that's so scary, why use Python? 
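For what it's worth, a minimal sketch of the docstring route, written against today's Python for brevity. The "types:" line format and the argument-count check are invented here purely for illustration, not a proposed convention:

    import inspect

    def declared_signature(func):
        # Pull a "types: a, b -> c" line out of the docstring, if there is one.
        doc = inspect.getdoc(func) or ""
        for line in doc.splitlines():
            line = line.strip()
            if line.startswith("types:"):
                args_part, _, result = line[len("types:"):].partition("->")
                args = [a.strip() for a in args_part.split(",") if a.strip()]
                return args, result.strip()
        return None

    def check_function(func):
        # Complain when the declared argument list disagrees with the code.
        decl = declared_signature(func)
        if decl is None:
            return
        declared_args, _ = decl
        actual = func.__code__.co_varnames[:func.__code__.co_argcount]
        if len(declared_args) != len(actual):
            print("%s: docstring declares %d argument(s), code takes %d"
                  % (func.__name__, len(declared_args), len(actual)))

    def replace_first(s, old):
        """Remove the first occurrence of old from s.

        types: string, string -> string
        """
        return s.replace(old, "", 1)

    check_function(replace_first)   # silent: declaration and code agree

A real checker would of course parse the declared types and follow calls across modules; the point is only that the declarations can live in docstrings and be checked by a tool that never calls the functions it inspects.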
-- Uche Ogbuji FourThought LLC, IT Consultants uche.ogbuji@fourthought.com (970)481-0805 Software engineering, project management, Intranets and Extranets http://FourThought.com http://OpenTechnology.org From da@ski.org Sat Dec 4 23:18:56 1999 From: da@ski.org (David Ascher) Date: Sat, 4 Dec 1999 15:18:56 -0800 (Pacific Standard Time) Subject: [Types-sig] Static typing considered HARD In-Reply-To: <38494CEE.ED11604A@fourthought.com> Message-ID: On Sat, 4 Dec 1999, Uche Ogbuji wrote: > > Even if we ignore static type checking Python 2 really has to do > > something about the "misspelling problem." One extra character on a > > method name can crash a server that has been running for weeks. Once > > this problem is fixed, the term "static type checking" will become > > meaningful. In the current environment, it is probably not and thus > > should not be the first focus of a new types-sig. > > I keep hearing this sort of thing, and I keep saying that it's a red > herring. Lack of static typing does _not_ prevent Python from being > scalable to large-scale and production environments. Our experience at > FourThought, where many of our projects are small-enterprise systems > built with Python and sometimes CORBA, will make it very hard for anyone > to convince me so. I think the experience of users such as eGroups > supports my feeling. Actually, I think you've picked the wrong example here. The engineering manager at eGroups is frustrated at his inability to check their Python code at compile-time, and it's not an accident that Scott Hassan (CTO of egroups) coauthored with another eGrouper the pylint type-checking tool they announced a few weeks ago. Typechecking at compile time is a huge issue for them. (Interestingly, as of a few months ago, Python wasn't their bottleneck -- their DB system was). I see two very distinct problems, though -- one is the use of 'statically typed variables', which requires fundamental changes to Python's typesystem. The other is 'compile-time type/signature/interface checking', which could probably be done coarsely with add-on tools without changing the syntax or type system one iota (ok, maybe one or two iotas). > see this "misspelling" problem. Proper configuration-management > procedures and testing, along with intelligent error-recovery, prevent > such problems, which can also occur in the most strongly-typed systems. Wouldn't you agree that enforcing these 'proper procedures' is much harder in a language which doesn't do half the job for you? --david [Please folks, let's keep this off of meta-sig. Fix the reply-to headers!] From uche.ogbuji@fourthought.com Sun Dec 5 00:06:33 1999 From: uche.ogbuji@fourthought.com (Uche Ogbuji) Date: Sat, 04 Dec 1999 17:06:33 -0700 Subject: [Types-sig] Static typing considered HARD References: Message-ID: <3849AC89.1173B163@fourthought.com> David Ascher wrote: > > > Even if we ignore static type checking Python 2 really has to do > > > something about the "misspelling problem." One extra character on a > > > method name can crash a server that has been running for weeks. Once > > > this problem is fixed, the term "static type checking" will become > > > meaningful. In the current environment, it is probably not and thus > > > should not be the first focus of a new types-sig. > > > > I keep hearing this sort of thing, and I keep saying that it's a red > > herring. Lack of static typing does _not_ prevent Python from being > > scalable to large-scale and production environments. 
Our experience at > > FourThought, where many of our projects are small-enterprise systems > > built with Python and sometimes CORBA, will make it very hard for anyone > > to convince me so. I think the experience of users such as eGroups > > supports my feeling. > > Actually, I think you've picked the wrong example here. The engineering > manager at eGroups is frustrated at his inability to check their Python > code at compile-time, and it's not an accident that Scott Hassan (CTO of > egroups) coauthored with another eGrouper the pylint type-checking tool > they announced a few weeks ago. Typechecking at compile time is a huge > issue for them. (Interestingly, as of a few months ago, Python wasn't > their bottleneck -- their DB system was). Is their problem performance or defect-management? Again, there is an important difference. I agree that typing can help the former: I am doubtful that it is a panacea for the latter. > I see two very distinct problems, though -- one is the use of 'statically > typed variables', which requires fundamental changes to Python's > typesystem. The other is 'compile-time type/signature/interface checking', > which could probably be done coarsely with add-on tools without changing > the syntax or type system one iota (ok, maybe one or two iotas). > > > see this "misspelling" problem. Proper configuration-management > > procedures and testing, along with intelligent error-recovery, prevent > > such problems, which can also occur in the most strongly-typed systems. > > Wouldn't you agree that enforcing these 'proper procedures' is much harder > in a language which doesn't do half the job for you? No language that I know of does even a tenth of the job of configuration management, error-handling or testing for anybody. They are not matters for a programming language to address. -- Uche Ogbuji FourThought LLC, IT Consultants uche.ogbuji@fourthought.com (970)481-0805 Software engineering, project management, Intranets and Extranets http://FourThought.com http://OpenTechnology.org From uche.ogbuji@fourthought.com Sun Dec 5 09:04:31 1999 From: uche.ogbuji@fourthought.com (Uche Ogbuji) Date: Sun, 05 Dec 1999 02:04:31 -0700 Subject: [Types-sig] Static typing considered HARD References: Message-ID: <384A2A9F.39DCAA84@fourthought.com> David Ascher wrote: > > I program in Python perhaps 40 hours a week, and have done so for a long > > time. Most of what I work on are large-scale systems. Very strange > > that my typos (and they are legion) are much less catastrophic than your > > own. > > Ah, well, probably you're just better at it than I am. =) > > My programs are typically small and run for a long time. They also change > ten times daily due to the changing nature of the requirements. There is > no 'finished' program in my current line of work. Just a different way of > doing business. Note that developing a test suite for this sort of code > is unrealistic. I'm paid to do science, not to do regression tests, and > the regression suite is likely to be longer and buggier than the actual > code. > > Perhaps it's best if we took this off-line though -- I think we're > straying from the types-sig charter. I'll just quickly round things up by saying that many of the hard lessons I've learned about software defects pre-date my use of Python. 
Lessons such as "the open/closed principle", "dependencies between modules should be as much as possible in the form of a DAG", "testing should bubble up from low-level object interfaces and coverage to high-level object-collaboration and sequence". These ideas are neither helped nor hurt by Python's dynamicism. All the latter is is a tool to improve the expressiveness of programming. This expressiveness, in my experience, lowers the cost of Python programming independently of the other factors, and it is what attracts me to the language. As an aside, re: expressiveness: ideas of type and all that are not "natural", which is why I wonder that your students clamor so much for static typing. I've programmed C++ for 6 years or more and Java for at least a couple of years, and in my experience, developers of similar skill will inject many more defects into an application using C++ and Java than they will using Python. That's why I am resisting radical change of the status quo. I worry that we might upset the formula that works so well for Python. But I guess there's no point continuing to gripe until I know the nature of the poison. Until I see some concrete proposal, I guess, I'll end the thread. -- Uche Ogbuji FourThought LLC, IT Consultants uche.ogbuji@fourthought.com (970)481-0805 Software engineering, project management, Intranets and Extranets http://FourThought.com http://OpenTechnology.org From uche.ogbuji@fourthought.com Sun Dec 5 06:09:18 1999 From: uche.ogbuji@fourthought.com (Uche Ogbuji) Date: Sat, 04 Dec 1999 23:09:18 -0700 Subject: [Types-sig] Static typing considered HARD References: Message-ID: <384A018E.9F7F811C@fourthought.com> David Ascher wrote: > > Is their problem performance or defect-management? Again, there is an > > important difference. I agree that typing can help the former: I am > > doubtful that it is a panacea for the latter. > > The latter. The quote (paraphrased from memory) is "When someone changes > a function interface, there's no way to know if we've caught all of the > calls to that function in the tens of thousands of line of code that we > have except to run the code'. Have they heard of Bertrand Meyer's open/closed principle? As I suspected, the root problem is poor software engineering, and has little to do with Python. -- Uche Ogbuji FourThought LLC, IT Consultants uche.ogbuji@fourthought.com (970)481-0805 Software engineering, project management, Intranets and Extranets http://FourThought.com http://OpenTechnology.org From uche.ogbuji@fourthought.com Sun Dec 5 06:06:14 1999 From: uche.ogbuji@fourthought.com (Uche Ogbuji) Date: Sat, 04 Dec 1999 23:06:14 -0700 Subject: [Types-sig] Static typing considered HARD References: Message-ID: <384A00D6.C3015C9D@fourthought.com> David Ascher wrote: > > No language that I know of does even a tenth of the job of configuration > > management, error-handling or testing for anybody. They are not matters > > for a programming language to address. > > I guess we'll have to agree to disagree. > > I've been doing some playing with Swing using JPython. Because it's > wicked slow to start, (due to Java mostly) the > edit-run-traceback-edit-run-traceback cycle is significantly longer than > with with CPython. That's when I curse the fact that the compile-time > analysis didn't catch simple typos, trivial mistakes in signatures, etc. I > *love* Python's dynamicity. 
But mostly I use its 'wicked cool' dynamic > features, like modifying the type of a variable in a function call or > changing the __class__ of an object once in a very blue moon. I can agree to disagree as well as anyone, but I'll confess I'm still baffled at how you claim that any language automates configuration management, error-handling or testing to any significant extent. I guess we'll also have to agree to not understand each other. Also, I don't think I've _ever_ done anything as off-the-wall as "modifying the type of a variable in a function call or changing the __class__ of an object". I hope this isn't anyone's benchmark of Python's dynamicism. > In other words, I'm just suggesting that given that (I'd guess) 95% of the > code out there is such that variable maintain their type throughout the > life of the program and that the builtins don't typically get overriden, > it seems a shame not to play the numbers. And we don't have to cover all > the cases. Just the 80% which give the largest payoff. > > Another trivial example: I can never remember whether it's > pickle.dump(object, file) or pickle.dump(file, object). I tend to > remember that I don't remember after the simulation has run for two hours > (if I'm lucky) and the saving of state fails... I program in Python perhaps 40 hours a week, and have done so for a long time. Most of what I work on are large-scale systems. Very strange that my typos (and they are legion) are much less catastrophic than your own. -- Uche Ogbuji FourThought LLC, IT Consultants uche.ogbuji@fourthought.com (970)481-0805 Software engineering, project management, Intranets and Extranets http://FourThought.com http://OpenTechnology.org From skip@mojam.com (Skip Montanaro) Sun Dec 5 14:33:08 1999 From: skip@mojam.com (Skip Montanaro) (Skip Montanaro) Date: Sun, 5 Dec 1999 08:33:08 -0600 (CST) Subject: [Types-sig] Static typing considered HARD In-Reply-To: <38494232.C1381ED9@prescod.net> References: <38494232.C1381ED9@prescod.net> Message-ID: <14410.30628.368117.966134@dolphin.mojam.com> Paul> The more I thought about types the less I became convinced that a Paul> quick "low hanging fruit" approach would work. I no longer propose Paul> a quick RFC on static typing. Static typing/type inference/do nothing trichotomy has been around for so long that had any low hanging fruit been available to pluck, it would have already been done. If there is still some low hanging fruit that we'd have missed it would be spoiling on the ground by now... Welcome to the type zoo. ;-) I noticed that nobody has yet complained about the continued presence of meta-sig on the distribution list. Perhaps it's time to remove it, since the death of the types-sig seems to have been averted and we are now actually discussing types (or are we just leaving it there to make sure it gets enough traffic that it doesn't die?). Skip Montanaro | http://www.mojam.com/ skip@mojam.com | http://www.musi-cal.com/ 847-971-7098 | Python: Programming the way Guido indented... From gmcm@hypernet.com Mon Dec 6 20:21:09 1999 From: gmcm@hypernet.com (Gordon McMillan) Date: Mon, 6 Dec 1999 15:21:09 -0500 Subject: [Types-sig] Static typing considered HARD In-Reply-To: <384A018E.9F7F811C@fourthought.com> Message-ID: <1267611206-22768816@hypernet.com> Uche Ogbuji wrote: [David Ascher on eGroups] > > The latter. 
The quote (paraphrased from memory) is "When > > someone changes a function interface, there's no way to know if > > we've caught all of the calls to that function in the tens of > > thousands of line of code that we have except to run the code'. > > Have they heard of Bertrand Meyer's open/closed principle? As I > suspected, the root problem is poor software engineering, and has > little to do with Python. More practically, have they heard of grep? While I will certainly agree that it's very irritating to bomb on a typo after you've been processing for half an hour, I'm skeptical that there's a "cure" worth the price, (I favor the Lint approach to safety, because Lint is free to warn of questionable practices without outlawing them). I'm at the moment optimizing / debugging someone's Java applet that contains 90 (yes, ninety) classes. Vast amounts of this code exists purely to satisfy the Java compiler on questions of type-safety. Despite all this work, it's still not safe code. The equivalent Python would probably take no more than a dozen classes and be enormously easier to understand. Safer off the bat? No. Easier to make truly safe? Yes. My interest in "optional static typing" has always been in the possbility of optimizations. - Gordon From jim@digicool.com Mon Dec 6 22:28:10 1999 From: jim@digicool.com (Jim Fulton) Date: Mon, 06 Dec 1999 17:28:10 -0500 Subject: [Types-sig] Interfaces: Scarecrow implementation v 0.1 isavailable References: Message-ID: <384C387A.1B0F5B8A@digicool.com> John, Skaller, skaller@maxtal.com.au wrote: > > [Scarecrow v 0.1] Wow, talk about a slow response (from me). :) I'm trying to wrap this phase of the "interface" project up and need to response to this, er, one comment on the .1 version of the interface implementation. Note that the .1 release is not currently available but the .2 release soon will be. > > I'll try to add this to interscript, and integrate it with my protocols > module. :-) Cool. Any progress? > Sigh. It's a special case of a protocol. > > > Special-case handling of classes > > > > Special handling is required for Python classes to make assertions > > about the interfaces a class implements, as opposed to the > > interfaces that the instances of the class implement. You cannot > > simply define an '__implements__' attribute for the class because > > class "attributes" apply to instances. > > Yes you can. And you must. See below. > > > By default, classes are assumed to implement the Interface.Standard.Class > > interface. A class may override the default by providing a > > '__class_implements__' attribute which will be treated as if it were > > the '__implements__' attribute of the class. > > This cannot work. Uh, but it does. > What you need to do is fix the lookup routines, > that is, the routines that test if an object provides an interface, etc, > so that they look in the dictionary of an object directly! > > Don't use 'getattr', use > > object.__dict__.has_key('__implements__') > > and > > object.__class__.__dict__.has_key('__class_implements__') I don't see how this can work for the following two reasons: 1. if you evaluate:: someInterface.implementedBy(someClass) you'll get the answer for the class' instances, not the class. Then again, maybe I'm missunderstanding you. Perhaps you could give a complete alternative implementation for 'implementedBy'. 2. A compromise made by the scarecrow proposal is to allow "implements" assertions to be interited. For this, getattr is needed. 
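To make the class/instance wrinkle above concrete, here is a small sketch of the lookup behaviour in question; plain strings stand in for real interface objects, so this only illustrates why getattr and a separate __class_implements__ attribute are both involved:

    class Base:
        # An assertion about *instances* of Base (and, by inheritance, of Derived).
        __implements__ = "ISequence"

    class Derived(Base):
        # An assertion about the *class object itself*, per the Scarecrow draft.
        __class_implements__ = "IClass"

    # getattr on the class finds the instance-level assertion, which is why a
    # class cannot simply be given its own __implements__ attribute:
    print(getattr(Derived, "__implements__"))         # ISequence -- about instances
    print(getattr(Derived, "__class_implements__"))   # IClass -- about the class

    # Looking only in the class __dict__, as suggested, avoids that clash but
    # also misses assertions inherited from base classes:
    print("__implements__" in Derived.__dict__)        # False

    obj = Derived()
    print(getattr(obj, "__implements__"))              # ISequence, inherited as intended

The direct __dict__ lookup removes the special case, but, as point 2 notes, it also loses inherited assertions, which is what getattr buys.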
> This works, it is what I do in my protocols module, > and it gets rid of the special case, which is a sure sign of a design fault. > > > Trial baloon: abstract implementations > > > > Tim Peter's has expressed the desire to provide abstract > > implementations in an interface definitions, where, presumably, an > > abstract implementation uses only features defined by the > > interface. For example: > > > > class ListInterface(Interface.Standard.MutableSequence): > > > > def append(self, v): > > "add a value to the end of the object" > > > > def push(self, v): > > "add a value to the end of the object" > > self.append(v) > > Yes. This is useful. It is the basis for mixins in C++. > But one has to ask the question: why not just use a class, > and add a 'defer' keyword to Python. Because you will still be creating a separate interface. You'll use the 'defered' method on the interface to compute an base class with the implementation. I'll try to clarify this in the documentation. > Then again, you could just say 'pass'. > > >Issues > > > > o What should the objects that define attributes look like? > > They shouldn't *be* the attributes, but should describe the > > the attributes. > > Obviously, they should themselves be interfaces. > Since attributes are just objects. :-) I think that an attributes description could include it's interface, but it might include other information as well, such as it's documentation. Jim -- Jim Fulton mailto:jim@digicool.com Python Powered! Technical Director (888) 344-4332 http://www.python.org Digital Creations http://www.digicool.com http://www.zope.org Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list with out my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats. From gstein@lyra.org Tue Dec 7 00:50:11 1999 From: gstein@lyra.org (Greg Stein) Date: Mon, 6 Dec 1999 16:50:11 -0800 (PST) Subject: [Types-sig] changing variable types (was: Static typing considered HARD) In-Reply-To: <384A00D6.C3015C9D@fourthought.com> Message-ID: On Sat, 4 Dec 1999, Uche Ogbuji wrote: >... > Also, I don't think I've _ever_ done anything as off-the-wall as > "modifying the type of a variable in a function call or changing the > __class__ of an object". I hope this isn't anyone's benchmark of > Python's dynamicism. I think he means something like: names = { } for elem in whatever: names[extract_foo(elem)] = 1 names = names.keys() I've done this a number of times. It can be argued that using a single name ("names") for a single semantic/concept is a good thing, despite the fact that its type changes within the function. Introducing two names is certainly clearer from a type standpoint, but I'd argue that a reader doesn't care about *types*, but about what is happening (the semantics). At least, that's how I rationalize the pattern :-) Cheers, -g -- Greg Stein, http://www.lyra.org/ From paul@prescod.net Sun Dec 5 18:28:15 1999 From: paul@prescod.net (Paul Prescod) Date: Sun, 05 Dec 1999 13:28:15 -0500 Subject: [Types-sig] Static name checking Message-ID: <384AAEBF.BE3C989C@prescod.net> I stand by my position that static type checking is not possibile without static name checking. Therefore I have begun to think what static name checking would require. It isn't as draconian as what I suggested before. Functions, modules and variables can be declared "static". 
In current Python this would be done like this: import frozen frozen def foo( a ): return string.replace(a,"b") Frozen names can only refer to names in frozen namespace. Frozen namespaces cannot be changed at runtime. They may not refer to names in regular ("dynamic") namespaces. The namespaces may be in the same or other modules. Therefore, they can be checked without actually loading the module or instantiating the class. Aliases for frozen namespaces should also be frozen automatically. A frozen name checker would work by loading a document and parsing it looking for every reference to the name "frozen". Then it would look at the next line and verify that all referenced objects really are frozen. Then it would check that frozen namespaces are not modified. Of course a frozen name checker isn't trivial but it also isn't brain surgery. Anyone bored and underworked out there? From there, we could move to a first-class frozen keyword in Python 1.6: frozen def foo(a): return string.replace(a, "b" ) The definition of frozen objects cannot depend on runtime state like this: if a: frozen def foo(a): ... else: frozen def foo(b): ... So frozen functions and classes should be top-level. Methods in a frozen class are frozen. The word "freeze" already has baggage in the Python world but my second choice "static" does also. I am not voting for or against the continuation of the types-sig. At this point we probably need code more than talk. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Math -- that most logical of sciences -- teaches us that the truth can be highly counterintuitive and that sense is hardly common. K.C.Cole, "The Universe and the Teacup" From paul@prescod.net Sun Dec 5 18:29:39 1999 From: paul@prescod.net (Paul Prescod) Date: Sun, 05 Dec 1999 13:29:39 -0500 Subject: [Types-sig] Static typing considered HARD References: <38494232.C1381ED9@prescod.net> <38494CEE.ED11604A@fourthought.com> Message-ID: <384AAF13.2EBFD41C@prescod.net> Uche Ogbuji wrote: > > This seems all out of whack to me. First of all, symbol-table > management may or may not belong to the "parse" step, depending on your > preferences. The Dragon book ducusses this matter in good detail. I > don't know about VB, but Java and C/C++ certainly merge your steps 1 & > 2. Yes, in terms of implementation, but no, matching names to objects is not the responsibility of the parser. It is conceptually another layer that works on the output of the parser. Whether it works on a complete parse tree or incrementally is another issue. > C/C++ also does not have "execute" Sorry, I didn't mean to talk just about compilation. I was talking about the whole path from raw text to executable code. I need to talk about the whole path because Python does name recognition at runtime. > This is the sort of thing that gives Python its power, and it is the > sort of thing without which I'm not sure I wouldn't be considering > another language. Nobody is suggesting that we take those features out. > > Note that in a lot of ways, Java is "as dynamic" as Python. You can > > introduce new functions and classes "at runtime." The difference is that > > Java's syntax for doing so is brutally complex and verbose so you are > > disinclined to do it. > > No! No! No! If you are talking about Java reflections and > introspection, I have no inkling how these features lend it even a > modicum of Python's dynamicism. What can you do dynamically in Python that you cannot do with reflections and introspection? 
I've written "map", "apply" and the Y combinator in Java so I'm pretty confident that the issue is really just syntax and ease of use, not capabilities. You could prove me wrong by showing a Python programming pattern that could not be straightforwardly duplicated using Java reflection. > I keep hearing this sort of thing, and I keep saying that it's a red > herring. Lack of static typing does _not_ prevent Python from being > scalable to large-scale and production environments. You can build large-scale and production environments in TCL or Basic if you are dedicated enough. The question is whether the language is working with your or working against you. It seems obvious to me that it is not too much to ask for a language compiler to help you avoid mistakes at least the same degree that PowerPoint does. > I also don't > see this "misspelling" problem. Proper configuration-management > procedures and testing, along with intelligent error-recovery, prevent > such problems, which can also occur in the most strongly-typed systems. So in Java I find spelling mistakes by typing: "javac foo.java" and in Perl I find them by typing: use strict perl foo.pl and in Python I find them by hiring a team of testers to test every code path (perhaps through a GUI), find bugs, report them through a bugtracking system and have developers work on the reports. Does this sound competitive? I am teaching Python today to XML people. When I got to the attributes part the first thing they caught onto (without me hinting) was that spelling mistakes could go undetected for weeks. One student said that if I knew someone "on the inside" I need to talk to them about it because it is a major problem for him. Another student said that in another dynamic language they used, misspellings were 60% of the bug reports from users in the field. Yes, testing is important, but if you elevate it to the status of "any bugs not caught are the fault of testers" then you get to the point where the language takes no responsibility at all. That way lies Perl: "oh, didn't you WANT me to convert that boolean to a socket object for you? You should have tested better." If that's our mentality, Python throws way too many exceptions for problems it could silently leave to testers. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Math -- that most logical of sciences -- teaches us that the truth can be highly counterintuitive and that sense is hardly common. K.C.Cole, "The Universe and the Teacup" From m.faassen@vet.uu.nl Tue Dec 7 12:01:02 1999 From: m.faassen@vet.uu.nl (Martijn Faassen) Date: Tue, 07 Dec 1999 13:01:02 +0100 Subject: [Types-sig] Static name checking References: <384AAEBF.BE3C989C@prescod.net> Message-ID: <384CF6FE.BC2B7C2C@vet.uu.nl> Paul Prescod wrote: [freezing system] > A frozen name checker would work by loading a document and parsing it > looking for every reference to the name "frozen". Then it would look at > the next line and verify that all referenced objects really are frozen. > Then it would check that frozen namespaces are not modified. Of course a > frozen name checker isn't trivial but it also isn't brain surgery. > Anyone bored and underworked out there? But what do you do with lists (for instance)? You can't check at compile-time if an object that comes from a list is a string, and integer, or an object. If you then try to refer to a name in it (object.foo()) then you run into trouble with this concept. Or am I missing something? 
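A concrete version of that question, with made-up names: even if every module-level name below were frozen, the element type is only discovered at run time.

    class Plain:
        pass

    items = ["a string", 42, Plain()]

    for thing in items:
        # Whether this attribute reference is legal depends on what 'thing'
        # turns out to be, which a frozen-namespace checker cannot know.
        try:
            print(thing.upper())
        except AttributeError:
            print("%r has no upper() method -- only discovered at run time" % (thing,))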
[snip] > I am not voting for or against the continuation of the types-sig. At > this point we probably need code more than talk. I think at least I need a little bit more talk before I could come even close to designing the code for your proposal (not that I'm offering right now :). Currently it's not clear to me how you'd do name checking without some form of static type checking or type inference... Regards, Martijn From jim@digicool.com Tue Dec 7 14:08:20 1999 From: jim@digicool.com (Jim Fulton) Date: Tue, 07 Dec 1999 14:08:20 +0000 Subject: [Types-sig] Intefaces work summary and Python code available Message-ID: <384D14D4.C6959613@digicool.com> I've written up a summary of the interface work at: http://www.zope.org/Members/jim/PythonInterfaces/Summary In addition to the summary, there are a number of reference links, including a link to the Python implementation. Jim -- Jim Fulton mailto:jim@digicool.com Technical Director (888) 344-4332 Python Powered! Digital Creations http://www.digicool.com http://www.python.org Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list with out my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats. From paul@prescod.net Tue Dec 7 14:28:44 1999 From: paul@prescod.net (Paul Prescod) Date: Tue, 07 Dec 1999 09:28:44 -0500 Subject: [Types-sig] Types sig dead or alive Message-ID: <384D199C.C6771285@prescod.net> Okay, I am willing to try and lead the types-sig only until the conference and see if we can try to come up with something concrete as a proposal. We can circulate that at the conference and get comments. Jim can at the same time circulate an interfaces proposal. I am only interested at this point in the static type checking problem. The first step, I think, is for me to write up my static name checking proposal and get consensus on that. It would be a syntax for stating that module namespaces are immutable and that classes and functions only refer to immutable module namespaces. Another deliverable (probably not by the conference) would be code that checked that code conforms to those rules. At that point we would have a concept of "statically resolvable names." The next step would be to attach type signatures to statically resolvable names. I've reconsidered my opinion that Python 2 is our only concern. We should probably test out our ideas in Python 1.x so that we can be confident of them for Python 2. For purposes of checking, a "static type" is a statically resolvable name of a class. In Python 2, every "type" will also be a class (and vice versa) so we don't want to spend a lot of energy working around the class/type dichotomy. When we (later!) reach concensus on the structure of interfaces, those will also be usable as static types. There won't be anything (at first) like "list of integers" unless you create a ListOfIntegers class (which is certainly possible!). The static type checking system will not declare any existing code non-conformant. I am happy to have interfaces discussions in the sig but I don't want them to be the *same discussion* because I don't want to recurse into a meta-discussion about "what is a type system" or "why do we want static type checking" or "is static type checking as important as interfaces" etc. 
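As an aside on the "ListOfIntegers class" remark, one minimal sketch of such a class follows. UserList is imported from its present-day location, and only append and insert are guarded, so this is an illustration rather than a complete container:

    from collections import UserList

    class ListOfIntegers(UserList):
        def _check(self, value):
            if not isinstance(value, int):
                raise TypeError("ListOfIntegers takes integers, got %r" % (value,))

        def append(self, value):
            self._check(value)
            UserList.append(self, value)

        def insert(self, index, value):
            self._check(value)
            UserList.insert(self, index, value)

    nums = ListOfIntegers()
    nums.append(3)
    nums.append(4)
    # nums.append("five") would raise TypeError here, at the point of insertion.

The check still happens at run time, at the moment of insertion; the goal of the proposal is to establish the same guarantee before the program runs.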
-- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Math -- that most logical of sciences -- teaches us that the truth can be highly counterintuitive and that sense is hardly common. K.C.Cole, "The Universe and the Teacup" From paul@prescod.net Tue Dec 7 14:43:39 1999 From: paul@prescod.net (Paul Prescod) Date: Tue, 07 Dec 1999 09:43:39 -0500 Subject: [Types-sig] Static name checking References: <384AAEBF.BE3C989C@prescod.net> <384CF6FE.BC2B7C2C@vet.uu.nl> Message-ID: <384D1D1A.A321EC07@prescod.net> Martijn Faassen wrote: > > [freezing system] > > A frozen name checker would work by loading a document and parsing it > > looking for every reference to the name "frozen". Then it would look at > > the next line and verify that all referenced objects really are frozen. > > Then it would check that frozen namespaces are not modified. Of course a > > frozen name checker isn't trivial but it also isn't brain surgery. > > Anyone bored and underworked out there? > > But what do you do with lists (for instance)? You can't check at > compile-time if an object that comes from a list is a string, and > integer, or an object. If you then try to refer to a name in it > (object.foo()) then you run into trouble with this concept. Or am I > missing something? Yes, but that's my fault, not yours. My static name checker is not intended to work on attributes (including methods). Checking attributes is inextricably tied to real *type checking*. In fact it is type checking. My assertion is that the first step is to statically check coherence among Python's three (?) (function, module, builtin) runtime namespaces. Until that nut is cracked, static *type* checking (and thus attribute name checking) won't be possible. Once we have name checking then we can design a syntax to statically associate types with names. THEN we can do static type checking. I could be wrong but it seems to me that once this is done, the definition of swallow will be trivial: "A statically compilable Python module is a file where every name is frozen and every name has a type declaration." If you restrict yourself to that subset then you've essentially re-invented Java. But of course the whole point (hi Gordon and Uche!) is that you can choose WHEN to restrict yourself to that subset whereas Java gives you no option...and neither does Perl. If Python is stuck[1] between a rock (Perl) and a hard place (Java) then optional static type checking is the dynamite that frees us up. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Math -- that most logical of sciences -- teaches us that the truth can be highly counterintuitive and that sense is hardly common. K.C.Cole, "The Universe and the Teacup" From jim@digicool.com Tue Dec 7 15:15:58 1999 From: jim@digicool.com (Jim Fulton) Date: Tue, 07 Dec 1999 15:15:58 +0000 Subject: [Types-sig] Types sig dead or alive References: <384D199C.C6771285@prescod.net> Message-ID: <384D24AE.AF76BCA7@digicool.com> Paul Prescod wrote: > (snip) > Jim > can at the same time circulate an interfaces proposal. nah nah nah ... Jim is out of the interface proposal business, at least for now. There was alot of discussion last year that smelled reasonbly much like consensus. I put out a v0.1 release and waited the usual 1 year for comments. I've updated the release to reflect the 1 comment I got and have re-released the software. 
See: http://www.zope.org/Members/jim/PythonInterfaces/Summary I'll be releasing this more widely (comp.lang.python) and I imagine that someone will learn alot more about it while incorporating it into Zope. After we get some experience using it, we should devide what, if anything more to do with it, especially in standard Python releases. > I am only interested at this point in the static type checking problem. Cool. (snip) > I am happy to have interfaces discussions in the sig Please, lets refocuss the SIG and let interfaces escape. If anyone really cares about interfaces, lets form a separate SIG or mailing list. BTW, let's let the "Classes vs. types dichotomy" escape too. I promise to attempt a summary of the earlier discussions. Then we can decide what, if anything, to do next based on that. Jim -- Jim Fulton mailto:jim@digicool.com Technical Director (888) 344-4332 Python Powered! Digital Creations http://www.digicool.com http://www.python.org Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list with out my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats. From m.faassen@vet.uu.nl Tue Dec 7 17:19:22 1999 From: m.faassen@vet.uu.nl (Martijn Faassen) Date: Tue, 07 Dec 1999 18:19:22 +0100 Subject: [Types-sig] Static name checking References: <384AAEBF.BE3C989C@prescod.net> <384CF6FE.BC2B7C2C@vet.uu.nl> <384D1D1A.A321EC07@prescod.net> Message-ID: <384D419A.36E7E09D@vet.uu.nl> Paul Prescod wrote: > > Martijn Faassen wrote: > > > > [freezing system] > > > A frozen name checker would work by loading a document and parsing it > > > looking for every reference to the name "frozen". Then it would look at > > > the next line and verify that all referenced objects really are frozen. > > > Then it would check that frozen namespaces are not modified. Of course a > > > frozen name checker isn't trivial but it also isn't brain surgery. > > > Anyone bored and underworked out there? > > > > But what do you do with lists (for instance)? You can't check at > > compile-time if an object that comes from a list is a string, and > > integer, or an object. If you then try to refer to a name in it > > (object.foo()) then you run into trouble with this concept. Or am I > > missing something? > > Yes, but that's my fault, not yours. My static name checker is not > intended to work on attributes (including methods). Checking attributes > is inextricably tied to real *type checking*. In fact it is type > checking. > > My assertion is that the first step is to statically check coherence > among Python's three (?) (function, module, builtin) runtime namespaces. What about classes referring to attributes of 'self', for instance, though? I'm still not entirely clear on what you're trying to accomplish, I'm afraid. > Until that nut is cracked, static *type* checking (and thus attribute > name checking) won't be possible. > > Once we have name checking then we can design a syntax to statically > associate types with names. THEN we can do static type checking. I could > be wrong but it seems to me that once this is done, the definition of > swallow will be trivial: "A statically compilable Python module is a > file where every name is frozen and every name has a type declaration." And every imported module has the same properties, including this one. Though of course you may mean this with 'every name'. 
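To make the quoted definition a bit more concrete, here is a minimal run-time sketch of "every name has a type declaration", using an ordinary dictionary as a stand-in for the yet-to-be-designed syntax; the __declarations__ name and the name-to-type mapping are invented purely for illustration:

    import types

    __declarations__ = {
        "greeting": str,
        "repeat":   types.FunctionType,
    }

    greeting = "hello"

    def repeat(text, count):
        return text * count

    def check_declarations(namespace, declarations):
        # Report declared names that are unbound or bound to an unexpected type.
        for name, expected in declarations.items():
            if name not in namespace:
                print("declared but never bound: %s" % name)
            elif not isinstance(namespace[name], expected):
                print("%s is %s, declared as %s"
                      % (name, type(namespace[name]).__name__, expected.__name__))

    check_declarations(globals(), __declarations__)   # silent when everything matches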
I don't think it'll be that trivial, as you haven't defined 'type declaration'; you run into complexities here, especially if you involve classes and objects. Can't you go about it the other way around? First, you make a type declaration for all names in a module. Then you check (somehow) if there isn't code that contradicts this type definition; that is, there should be no assignments of one name to another that violates the static type definitions, no attribute accesses to undefined attributes, and so on. Of course you instantly produce errors if any name doesn't have a type definition. I don't see how your frozen idea helps a lot in this. A possible intermediate drop-off point resembling frozen may simply be a checker that determines if all names in a module are known to the static type system, without actually defining these types, though you run into trouble here with attribute accesses. The real hard part is the construction of the type checker. Somewhat easier is the definition of a generic type system. I'm still proposing to use standard Python objects such as dictionaries and tuples to define these types in, initially. Later on we can look at syntax, but you can get the type checker going without any syntax extensions. Another prerequisite for a type checker is the determination of the Swallow subset. For instance one can imagine that in Swallow it's illegal to import modules except at the top. I imagine these limitations will become more obvious after a type system has been developed. I have two possible approaches for the type checker in mind currently that leverage current Python; one is an AST based type checker, and another is a bytecode based typechecker. I'm not sure which one would be easier as I don't know enough about either Python's ASTs or bytecodes, but in the happy abstract space of insufficient information I can wrap my mind better around a bytecode based checker than around an AST based checker. There are bytecodes for assignment and attribute access and the various other operations that need the scrutiny of a type checker. For each such bytecode you'd need to write a type check. Checking a module is then going through all bytecodes of the module to see if they do legal things. As an aside, another task would be the writing of an interface layer between swallowed modules and non-swallowed ones. Any name that enters a swallowed module should have a static type description associated with it. A run-time layer can check whether each python object that is sent into Swallow conforms to the type definitions; a function expecting an int must indeed be sent an int object. If not, some kind of exception should be raised. Calling non-swallow modules from a Swallow module is more tricky, but again the Swallowed module provides type definitions for any name used in it, so it should provide interface definitions for any non-Swallow function used in a Swallowed module as well. So you can make run-time type checks on that interface as well. But I'm sure I'm missing a lot of subtleties here. :) Regards, Martijn From paul@prescod.net Wed Dec 8 13:10:16 1999 From: paul@prescod.net (Paul Prescod) Date: Wed, 08 Dec 1999 08:10:16 -0500 Subject: [Types-sig] Types sig dead or alive References: <384D199C.C6771285@prescod.net> <384D24AE.AF76BCA7@digicool.com> Message-ID: <384E58B8.4CE80095@prescod.net> Jim, I am happy to temporarily banish interface discussions from the sig. But...is it likely that you guys will have some Zope-interface experience by the conference?
I'm sure you guys are as up-to-the-wazoo as the rest of us but it would be cool if we could get a mini-report on whether the interfaces proposal *works*. Let me stress that I understand that you are probably more focused on having new features in Zope for the conference. Paul Prescod Jim Fulton wrote: > > Paul Prescod wrote: > > > > (snip) > > > Jim > > can at the same time circulate an interfaces proposal. > > nah nah nah ... Jim is out of the interface proposal business, > at least for now. There was alot of discussion last year that > smelled reasonbly much like consensus. I put out a v0.1 release > and waited the usual 1 year for comments. I've updated the release > to reflect the 1 comment I got and have re-released the software. > > See: http://www.zope.org/Members/jim/PythonInterfaces/Summary > > I'll be releasing this more widely (comp.lang.python) > and I imagine that someone will learn alot more about it > while incorporating it into Zope. After we get some > experience using it, we should devide what, if anything more > to do with it, especially in standard Python releases. > > > I am only interested at this point in the static type checking problem. > > Cool. > > (snip) > > > I am happy to have interfaces discussions in the sig > > Please, lets refocuss the SIG and let interfaces escape. > If anyone really cares about interfaces, lets form a separate > SIG or mailing list. > > BTW, let's let the "Classes vs. types dichotomy" escape too. > I promise to attempt a summary of the earlier discussions. > Then we can decide what, if anything, to do next based on that. > > Jim > > -- > Jim Fulton mailto:jim@digicool.com > Technical Director (888) 344-4332 Python Powered! > Digital Creations http://www.digicool.com http://www.python.org > > Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email > address may not be added to any commercial mail list with out my > permission. Violation of my privacy with advertising or SPAM will > result in a suit for a MINIMUM of $500 damages/incident, $1500 for > repeats. > > _______________________________________________ > Types-SIG mailing list > Types-SIG@python.org > http://www.python.org/mailman/listinfo/types-sig -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Floggings will increase until morale improves. From paul@prescod.net Wed Dec 8 14:41:01 1999 From: paul@prescod.net (Paul Prescod) Date: Wed, 08 Dec 1999 09:41:01 -0500 Subject: [Types-sig] Plea for help. References: <384AAEBF.BE3C989C@prescod.net> <384CF6FE.BC2B7C2C@vet.uu.nl> <384D1D1A.A321EC07@prescod.net> <384D419A.36E7E09D@vet.uu.nl> Message-ID: <384E6DFC.58AE0201@prescod.net> > I have two possible approaches for the type checker in mind currently > that leverage current Python; one is an AST based type checker, and > another is a bytecode based typechecker. I'm not sure which one would be > easier as I don't know enough about either Python's ASTs or bytecodes, > but in the happy abstract space of insufficient information I can wrap > my mind better around a bytecode based checker than around an AST based > checker. My feeling is the opposite. The AST follows the structure of the Python syntax more closely. Plus it has a superset of the bytecode information. Plus the Python grammar is the same for JPython but the bytecodes are not. The one virtue I can see in doing the checks on the bytecode is for Java-style opaque bytecode security. 
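[A minimal illustration, not part of the original thread: the standard `parser' and `token' modules already expose the parse tree that an AST-based checker like the one discussed above would walk, as nested tuples. The toy walker below just collects NAME tokens; a real checker would record where each name is bound.]

    import parser, token

    def collect_names(node, found=None):
        # productions look like (symbol, child, ...); leaves look like (token, string)
        if found is None:
            found = []
        if token.ISTERMINAL(node[0]):
            if node[0] == token.NAME:
                found.append(node[1])
        else:
            for child in node[1:]:
                collect_names(child, found)
        return found

    tree = parser.suite("x = 1\ndef f(y): return x + y\n").totuple()
    print(collect_names(tree))   # every NAME token in the module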
Here's my plea for help: among the many "Python compiler" projects out there, there must be some good Python code for walking around ASTs building type (or at least module) representation objects. I think that JPython is written in Java. What else should I be looking at? -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Floggings will increase until morale improves. From paul@prescod.net Wed Dec 8 14:41:30 1999 From: paul@prescod.net (Paul Prescod) Date: Wed, 08 Dec 1999 09:41:30 -0500 Subject: [Types-sig] Static name checking References: <384AAEBF.BE3C989C@prescod.net> <384CF6FE.BC2B7C2C@vet.uu.nl> <384D1D1A.A321EC07@prescod.net> <384D419A.36E7E09D@vet.uu.nl> Message-ID: <384E6E1A.3BA9DABF@prescod.net> Martijn Faassen wrote: > > > My assertion is that the first step is to statically check coherence > > among Python's three (?) (function, module, builtin) runtime namespaces. > > What about classes referring to attributes of 'self', for instance, > though? I'm still not entirely clear on what you're trying to > accomplish, I'm afraid. Let me see if I can do this with some invented notation. You'll have to cut me some slack for typos and omitted (hopefully irrelevant) details like about a dozen levels of the parse tree. Let's pretend we are talking about Java. Let's pretend that we are implementing a Java interpreter (including compiler) in the most straight-forward (not efficient) way. Consider the code:

    class J{
        String a;
        void foo(){
            a.whatever();
        }
    }

1. Parse it into tokens: (roughly)

    (classdef "J"
        (attributedef name: "a" type: (name-ref "String" ))
        (functiondef "foo"
            (function-body
                (method-call object: (name-ref a) method: "whatever"))))

That's very rough because you guys know about parse trees already.

2. Build "compile time objects" and replace variable references with pointers to "compile time objects":

    [class java.lang.String ....]
    [class J
        attributes: {"a": <the java.lang.String class object>}
        functions: {"foo": ... (method-call object: <the "a" attribute object> method: "whatever"))))

Step 2 is the step that Python doesn't have right now. Note that at the end of this step, the references to names in "static" namespaces have all been resolved but names in methods have NOT been resolved. One obvious reason for this is that Java and Python both allow forward references so maybe I don't even know what the methods of Strings are yet.

3. Conceptually, once all of the type and variable objects are built, THEN I can go through and check that the operations applied to types are legal. ONE SUCH OPERATION is ".whatever". It becomes possible to check that ".whatever" is legal at the same time that it becomes possible to check whether "a+b" is legal.

4. Generate bytecode.

5. Run it.

Python has steps 1, 4 and 5 but skips steps 2 and 3. I am trying to get us to the point where we can do step 2 so that we can get to step 3 eventually. I once wrote a compiler and I beat my head against a wall until I realized that foo.bar resolution is a massively different problem if foo is a module (doesn't rely on type system) or a class (does rely on type system). The point of the "static" keyword is to allow a Python author to say: "Some of my modules are static like Java modules. Please resolve references to these at compile time, not runtime." > > Once we have name checking then we can design a syntax to statically > > associate types with names. THEN we can do static type checking.
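[A toy rendering, mine rather than Paul's, of the "step 2" tables he describes above: module-level names get collected into compile-time tables, and dotted references into modules declared static are resolved before anything runs. The table contents are made up for illustration.]

    compile_time_tables = {
        "string": {"upper": "function"},
        "mymod":  {"VERSION": "string", "main": "function"},
    }

    def resolve(dotted):
        modname, attr = dotted.split(".")
        table = compile_time_tables.get(modname)
        if table is None:
            raise NameError("module %s is not static" % modname)
        if attr not in table:
            raise NameError("%s has no name %s" % (modname, attr))
        return table[attr]

    print(resolve("mymod.VERSION"))    # 'string'
    # resolve("mymod.typo") would be reported before run time

[Attribute resolution on class instances is exactly the part such a table cannot handle without a type system, which is the point Paul makes above.]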
I could > > be wrong but it seems to me that once this is done, the definition of > > swallow will be trivial: "A statically compilable Python module is a > > file where every name is frozen and every name has a type declaration." > > And every imported module has the same properties, including this one. > Though of course you may mean this with 'every name'. I don't think > it'll be that trivial, as you haven't defined 'type declaration'; you > run into complexities here, > especially if you involve classes and objects. That's true. That's why I'm not trying to solve that part of the problem yet. > Can't you go about it the other way around? First, you make a type > declaration for > all names in a module. There are four declarations we could imagine. Each is a little bit stronger than the previous. 1. "I believe that every name in this module/class that is not an attribute name can be statically resolved." 2. "I believe that this module can be used in other modules where every non-attribute name is supposed to be statically resolved." 3. "I believe that every name in this module/class can be statically type checked (including attribute name checking)". 4. "I believe that this module/class can be used in other modules where every name can be statically type checked." "freeze" could be 1. There is probably not much virtue in separating 1 and 2 so we could rather say that "freeze" actually means 2 which implies 1. A new, "type-safe" keyword might be used for 4 which again would imply 3 (and 2, and 1). If this is to be optional, off-by-default type and name checking then we need a way to turn it ON. "freeze" might be enough to allow some type inferencing and early binding (for performance, not safety) . "type-safe" would be used for performance and safety at a price that it would require you to stick to the "Java subset". > Then you check (somehow) if there isn't code that > contradicts this type definition; that is, there should be no > assignments of one name to another that violates the static type > definitions, no attribute accesses to undefined attributes, and so on. > Of course you instantly produce errors if any name doesn't have a type > definition. Here's the rub. In my mind, this should be legal code: def doubleString( String b ): return b*2 def doit(): doubleString( eval( raw_input() ) ) doit() That's what Python users will expect and that's also what VB does. This should be checked at *runtime*. On the other hand, THIS would cause a static type error: type-safe def doit(): doubleString( eval( raw_input() ) ) That's illegal because it claims to be type-safe but isn't really. The same goes for static: static def foo(): return this.that.b() is only valid if this.that is statically resolvable. (e.g. a module, not a class, and a module that is itself static) > Another > prerequisite for a type checker is the determination of the Swallow > subset. For instance one can imagine that in Swallow it's illegal to > import modules except on the top. I imagine these limitations will > become more obvious after a type system has been developed. I think we disagree on the granularity of the project. It should be possible to declare individual functions, classes or methods statically type checkable, not just whole modules. > A run-time layer can check whether each python object that is sent > into Swallow conforms to the type definitions; a function expecting an > int must indeed be sent an int object. If not, somekind of exception > should be raised. I agree. 
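[A bare-bones sketch, not from the thread, of the run-time layer Martijn describes and Paul agrees with above: values crossing into declared code (like Paul's doubleString) are checked against the declared types, and a mismatch surfaces as an ordinary exception. The names and the conformance rule are mine.]

    def conforms(value, classification):
        # stand-in conformance test; a real system would also honour interfaces
        return isinstance(value, classification)

    def call_checked(func, arg_types, args):
        for value, wanted in zip(args, arg_types):
            if not conforms(value, wanted):
                raise TypeError("%r does not conform to %r" % (value, wanted))
        return func(*args)

    def double_string(b):
        return b * 2

    print(call_checked(double_string, (str,), ("abc",)))    # 'abcabc'
    # call_checked(double_string, (str,), (5,)) raises TypeError at the boundary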
But I think we also need a way to say: "I want you to check this code at compile time, not runtime." -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Floggings will increase until morale improves. From gmcm@hypernet.com Wed Dec 8 16:28:52 1999 From: gmcm@hypernet.com (Gordon McMillan) Date: Wed, 8 Dec 1999 11:28:52 -0500 Subject: [Types-sig] Plea for help. In-Reply-To: <384E6DFC.58AE0201@prescod.net> Message-ID: <1267452506-32324318@hypernet.com> Paul Prescod wrote: > Here's my plea for help: among the many "Python compiler" > projects out there there must be some good Python code for > walking around ASTs building type (or at least module) > representation objects. I think that JPython is wirtten in Java. > What else should I be looking at? Probably the Python2C stuff that reformats a standard Python parse tree into something saner. Another possibility might be John Aycock's stuff; but his Python grammar doesn't produce an AST (it only verifies), and the grammar has some errors. I think Aaron Watters also did a Python grammar (for 1.4?), but I never looked at that. - Gordon From guido@CNRI.Reston.VA.US Wed Dec 8 16:34:44 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Wed, 08 Dec 1999 11:34:44 -0500 Subject: [Types-sig] Plea for help. In-Reply-To: Your message of "Wed, 08 Dec 1999 11:28:52 EST." <1267452506-32324318@hypernet.com> References: <1267452506-32324318@hypernet.com> Message-ID: <199912081634.LAA04169@eric.cnri.reston.va.us> > Paul Prescod wrote: > > > Here's my plea for help: among the many "Python compiler" > > projects out there there must be some good Python code for > > walking around ASTs building type (or at least module) > > representation objects. I think that JPython is wirtten in Java. > > What else should I be looking at? GMcM replied: > Probably the Python2C stuff that reformats a standard Python > parse tree into something saner. Another possibility might be > John Aycock's stuff; but his Python grammar doesn't produce > an AST (it only verifies), and the grammar has some errors. I > think Aaron Watters also did a Python grammar (for 1.4?), but > I never looked at that. Aaron's kjpylint contains a Python parser: http://www.chordate.com/kwParsing/ David Jeske's pylink also contains one: http://www.chat.net/~jeske/Projects/PyLint/download/pylint-19991121.py I seem to be having problems with pylint, which is much newer; the current kjpylint's parser is pretty robust as far as I can tell. --Guido van Rossum (home page: http://www.python.org/~guido/) From jeremy@cnri.reston.va.us Wed Dec 8 17:45:37 1999 From: jeremy@cnri.reston.va.us (Jeremy Hylton) Date: Wed, 8 Dec 1999 12:45:37 -0500 (EST) Subject: [Types-sig] Plea for help. In-Reply-To: <199912081634.LAA04169@eric.cnri.reston.va.us> References: <1267452506-32324318@hypernet.com> <199912081634.LAA04169@eric.cnri.reston.va.us> Message-ID: <14414.39233.255818.554490@goon.cnri.reston.va.us> >> Paul Prescod wrote: >> >> > Here's my plea for help: among the many "Python compiler" > >> projects out there there must be some good Python code for > >> walking around ASTs building type (or at least module) > >> representation objects. I think that JPython is wirtten in Java. >> > What else should I be looking at? Gordon and Guido offered some suggestions. I have done some noodling with the Py2C AST, and I think it is an excellent candidate. I was going to suggest that a good near-term goal for the type sig would be to write a Python compiler in Python, but I see that Paul has beaten me to it. 
I believe this project was also discussed on python-dev a few months ago (as part of the warnings discussion). I think it's a good project to tackle because it has usefulness beyond the specific approaches to static typing, which remain controversial. When I was using the Py2C transformer class, I made some modifications to the AST generated to make it a little easier to use interactively. The original definition for the AST was:

    class Node:
        def __init__(self, *args):
            self.__children = args
            self.lineno = None
        def __getitem__(self, index):
            return self.__children[index]
        def __str__(self):
            return str(self.__children)
        def __repr__(self):
            return "<Node %s>" % self.__children[0]
        def __len__(self):
            return len(self.__children)
        def __getslice__(self, low, high):
            return self.__children[low:high]
        def getChildren(self):
            return self.__children
        def getType(self):
            return self.__children[0]
        def asList(self):
            return tuple(asList(self.__children))

A tree of these nodes is created by the Transformer class, which walks the parse trees created by the parser module. I modified Node to be BaseNode and created specific classes for each type of node:

    class Function(BaseNode):
        def __init__(self, name, argnames, defaults, flags, doc, code):
            self.name = name
            self.argnames = argnames
            self.defaults = defaults
            self.flags = flags
            self.doc = doc
            self.code = code
            self._children = ('function', name, argnames, defaults, flags, doc, code)
        def __repr__(self):
            return "Function(%s,%s,%s,%s,%s,%s)" % self._children[1:]
        def __str__(self):
            return "func:%s" % self.name

Jeremy From eddy@chaos.org.uk Wed Dec 8 18:39:26 1999 From: eddy@chaos.org.uk (Edward Welbourne) Date: Wed, 08 Dec 1999 18:39:26 +0000 Subject: [Types-sig] Static typing considered ... UGLY Message-ID: <384EA5DE.3ADD8801@lsl.co.uk> Might I humbly suggest that:
    to incorporate static typing into python would change it beyond recognition
    it would probably be better to start from Algol and pythonify it, if that's where you want to go (hint: I don't)
    the right name for the relevant language would be typhoon because it's almost an anagram
    the real reason for doing it is speed
    when it breaks things it won't half tear them into little pieces
? (albeit viper is already out there and doubtless good ;^) A more pythonic approach would be to deploy some byte-code hacks which notice assertions of form assert isinstance(x, IntType) and optimise ensuing code around the presumption that the value in x when that assertion was executed is an int, allowing that all will go horribly wrong if it isn't (which won't be checked unless __debug__), but then we all know that speed kills. But only do this if the user has asked for type-asserted enhancements, and use a different .pyc extension for it. Might need a TypeException for throwing when it all goes horribly wrong. In a similar vein: could the interpreter and compiler exploit knowledge of an assertion a function makes (about its return value) just before returning ? i.e. calls to the function could presume the truth of what the function asserted ... not that I'm convinced that this is worth it, just that if you *insist* on static type notions, these are pythonic ways to approach it. But this is all `speed enhancement' (I refuse to call it optimisation: I have no evidence it gets anywhere near the optimum). There is a better way (I'll tell you about it late in January). Note: I believe the function type() should be removed totally, and isinstance should be replaced by (hint: think `type(x) in ...'
instead of `type(x) == ...') def isinstance(x, *what): """True if x is an instance of any of the given types or classes.""" for mode in what: if oldisinstance(x, mode): return 1 return 0 Then `try: ... except (tuple, of, exceptions):' would, of course, be using the given tuple as *types when checking the exception raised. I'd vote to keep this sig open for unification (all objects are objects and support the same protocols - a module with __call__ in its namespace is callable, for instance) but if all that's to be discussed is static typing, I'd vote for closure (prematurely and *with* prejudice). I intend to follow up this bull-headedness in January. See y'all at IPC8. Eddy. -- was it Sam Johnson who said something about knowledge of impending death concentrating the mind ? Hence The Grim Guido re-woke the types-sig. From jim@digicool.com Wed Dec 8 18:48:44 1999 From: jim@digicool.com (Jim Fulton) Date: Wed, 08 Dec 1999 13:48:44 -0500 Subject: [Types-sig] Types sig dead or alive References: <384D199C.C6771285@prescod.net> <384D24AE.AF76BCA7@digicool.com> <384E58B8.4CE80095@prescod.net> Message-ID: <384EA80C.42B533F6@digicool.com> Paul Prescod wrote: > > Jim, I am happy to temporarily banish interface discussions from the > sig. I'd prefer that it be permanent. I'll also reiterate that I'd like the Class-Type unification to be taken out too. > But...is it likely that you guys will have some Zope-interface > experience by the conference? Don't know. It probably depends on Martijn Faassen, who sort of volunteered to do Zope integration. :) > I'm sure you guys are as up-to-the-wazoo > as the rest of us but it would be cool if we could get a mini-report on > whether the interfaces proposal *works*. Yes it would, however I can't make any promises to do this myself (or commit DC), but I *am* willing to work with Martijn or anyone else who wants to take the lead for now. Note that, even if the implementation isn't exercised in the next month, there would be progress since Spam7 to report on. Jim -- Jim Fulton mailto:jim@digicool.com Python Powered! Technical Director (888) 344-4332 http://www.python.org Digital Creations http://www.digicool.com http://www.zope.org Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list with out my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats. From bwarsaw@cnri.reston.va.us (Barry A. Warsaw) Wed Dec 8 18:51:44 1999 From: bwarsaw@cnri.reston.va.us (Barry A. Warsaw) (Barry A. Warsaw) Date: Wed, 8 Dec 1999 13:51:44 -0500 (EST) Subject: [Types-sig] Types sig dead or alive References: <384D199C.C6771285@prescod.net> <384D24AE.AF76BCA7@digicool.com> <384E58B8.4CE80095@prescod.net> <384EA80C.42B533F6@digicool.com> Message-ID: <14414.43200.610908.557195@anthem.cnri.reston.va.us> >>>>> "JF" == Jim Fulton writes: JF> Note that, even if the implementation isn't exercised in the JF> next month, there would be progress since Spam7 to report on. Do you want another devday session, Jim? -Barry From jim@digicool.com Wed Dec 8 19:05:32 1999 From: jim@digicool.com (Jim Fulton) Date: Wed, 08 Dec 1999 14:05:32 -0500 Subject: [Types-sig] Types sig dead or alive References: <384D199C.C6771285@prescod.net> <384D24AE.AF76BCA7@digicool.com> <384E58B8.4CE80095@prescod.net> <384EA80C.42B533F6@digicool.com> <14414.43200.610908.557195@anthem.cnri.reston.va.us> Message-ID: <384EABFC.EF38BA24@digicool.com> "Barry A. 
Warsaw" wrote: > > >>>>> "JF" == Jim Fulton writes: > > JF> Note that, even if the implementation isn't exercised in the > JF> next month, there would be progress since Spam7 to report on. > > Do you want another devday session, Jim? No, but if recommendations are made on a devday, there should be some time spent on the next devday to report on progress. So, I think there should be some time (session, whatever) spent giving progress on projects launched by devday'98. Jim -- Jim Fulton mailto:jim@digicool.com Python Powered! Technical Director (888) 344-4332 http://www.python.org Digital Creations http://www.digicool.com http://www.zope.org Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list with out my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats. From jim@digicool.com Wed Dec 8 19:15:56 1999 From: jim@digicool.com (Jim Fulton) Date: Wed, 08 Dec 1999 14:15:56 -0500 Subject: [Types-sig] I like the new look of the types-sig page! Message-ID: <384EAE6C.1CF299A6@digicool.com> I like the new look at: http://www.python.org/sigs/types-sig/. Thanks! Jim -- Jim Fulton mailto:jim@digicool.com Python Powered! Technical Director (888) 344-4332 http://www.python.org Digital Creations http://www.digicool.com http://www.zope.org Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list with out my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats. From m.faassen@vet.uu.nl Wed Dec 8 20:11:50 1999 From: m.faassen@vet.uu.nl (Martijn Faassen) Date: Wed, 08 Dec 1999 21:11:50 +0100 Subject: [Types-sig] Types sig dead or alive References: <384D199C.C6771285@prescod.net> <384D24AE.AF76BCA7@digicool.com> <384E58B8.4CE80095@prescod.net> <384EA80C.42B533F6@digicool.com> Message-ID: <384EBB85.E402AD2D@vet.uu.nl> Jim Fulton wrote: > > Paul Prescod wrote: > > > > Jim, I am happy to temporarily banish interface discussions from the > > sig. > > I'd prefer that it be permanent. I'll also reiterate that > I'd like the Class-Type unification to be taken out too. > > > But...is it likely that you guys will have some Zope-interface > > experience by the conference? > > Don't know. It probably depends on Martijn Faassen, who sort > of volunteered to do Zope integration. :) Currently I'm way too busy, and I volunteered to be involved in it, not to do it all myself. :) That said, I *hope* I'll get more time in january and actually explore the issue better then. Don't know how far I'll get as there's more I need to do. > > I'm sure you guys are as up-to-the-wazoo > > as the rest of us but it would be cool if we could get a mini-report on > > whether the interfaces proposal *works*. > > Yes it would, however I can't make any promises to do this myself > (or commit DC), but I *am* willing to work with Martijn or anyone > else who wants to take the lead for now. I hope to get some time by the end of this month, but I can't say how much right now. ZFormulator, XMLWidgets, and a whole lot of other stuff still needs to be worked on. > Note that, even if the implementation isn't exercised in the next > month, there would be progress since Spam7 to report on. That's true. I did read through your documentation on it yesterday. It looks good. 
I recall having read it before way back when, too, but I think I understand it better now. Regards, Martijn From guido@CNRI.Reston.VA.US Wed Dec 8 22:01:31 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Wed, 08 Dec 1999 17:01:31 -0500 Subject: [Types-sig] I like the new look of the types-sig page! In-Reply-To: Your message of "Wed, 08 Dec 1999 14:15:56 EST." <384EAE6C.1CF299A6@digicool.com> References: <384EAE6C.1CF299A6@digicool.com> Message-ID: <199912082201.RAA04898@eric.cnri.reston.va.us> > I like the new look at: http://www.python.org/sigs/types-sig/. > > Thanks! > > Jim You're welcome. I basically ripped out three pages saying "coming soon" that were last edited a year ago, plus everything that referred to them. There wasn't much left after that. ;-) --Guido van Rossum (home page: http://www.python.org/~guido/) From paul@prescod.net Wed Dec 8 22:47:29 1999 From: paul@prescod.net (Paul Prescod) Date: Wed, 08 Dec 1999 17:47:29 -0500 Subject: [Types-sig] Re: Static typing considered ... UGLY References: <384EA5DE.3ADD8801@lsl.co.uk> Message-ID: <384EE001.7D040EFE@prescod.net> Edward Welbourne wrote: > > Might I humbly suggest that: > > to incorporate static typing into python would change it > beyond recognition Our intention is that all existing Python code would continue to be valid modulo the possible introduction of a couple of keywords. If that still doesn't sound like it meets your needs then I'll just have to apologize in advance. A few other points: * if Python is never allowed to make major changes it will die prematurely. The other languages (except Scheme and other arguably dead languages) grow and evolve. * I would be more interested in your technical concerns. "It's ugly" is too subjective....especially when nobody considers it "ugly" in every other programming language that has type declarations. Rather, I think that type declarations improve code readability. * the assertion syntax strikes me as doubly ugly. Imagine a function that takes 10 arguments with 10 of those assertions. * for me, the goal is not performance. Performance considerations are secondary. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Floggings will increase until morale improves. From gstein@lyra.org Thu Dec 9 23:04:17 1999 From: gstein@lyra.org (Greg Stein) Date: Thu, 9 Dec 1999 15:04:17 -0800 (PST) Subject: [Types-sig] Plea for help. In-Reply-To: <14414.39233.255818.554490@goon.cnri.reston.va.us> Message-ID: On Wed, 8 Dec 1999, Jeremy Hylton wrote: >... > Gordon and Guido offered some suggestions. > > I have done some noodling with the Py2C AST, and I think it is an > excellent candidate. http://www.mudlib.org/~rassilon/p2c/ Specifically, the file transformer.py in that distribution. I've threatened before to break it out and make it available on my Python page... ought to do that sometime. >... > When I was using the Py2C transformer class, I made some modifications > to the AST generated to make it a little easier to use interactively. > The original defintion for the AST was: >... > A tree of these nodes is created by the Transformer class, which walks > the parse trees created by the parser module. > > I modified Node to be BaseNode and created specific classes for each > of type of node: The Node subclasses were on Bill's to-do list. That's cool that you've already done it! 
Cheers, -g -- Greg Stein, http://www.lyra.org/ From GoldenH@littoncorp.com Fri Dec 10 01:48:45 1999 From: GoldenH@littoncorp.com (Golden, Howard) Date: Thu, 9 Dec 1999 17:48:45 -0800 Subject: [Types-sig] "Open-World" design using generic Java: Lesson for Py thon? Message-ID: I recommend you look at the paper "Safe 'Open-World' Designs in Java and GJ," by Marco Nissen and Karsten Weihe, ftp://ftp.fmi.uni-konstanz.de/pub/preprints/1998/preprint-066-02.ps.Z . In the paper (Section 4), they distinguish two use scenarios for static type safety. In one use, static safety is of no benefit. However, in the other case, it will lead to a more reliable system. I hope those of you who question the value of typing will read the paper and consider their argument. (Note: In the paper the authors conclude that Java is deficient in meeting the second use case. They find that GJ, the generic version of Java developed by Bracha, et al., has the necessary feature of parametric polymorphism to be used this way.) Howard B. Golden Software developer Litton Industries, Inc. Woodland Hills, California From paul@prescod.net Fri Dec 10 15:05:05 1999 From: paul@prescod.net (Paul Prescod) Date: Fri, 10 Dec 1999 10:05:05 -0500 Subject: [Types-sig] Plea for help. References: <1267452506-32324318@hypernet.com> <199912081634.LAA04169@eric.cnri.reston.va.us> <14414.39233.255818.554490@goon.cnri.reston.va.us> Message-ID: <385116A1.69E1BD05@prescod.net> Jeremy Hylton wrote: > > I was going to suggest that a good near-term goal for the type sig > would be to write a Python compiler in Python, but I see that Paul has > beaten me to it. Not quite. I'm not going to do anything about generating bytecodes. But it seems to me like that would be another cool project. Someone should do py2pyc and add it to the py2c distribution. But I'm not going to... Yes, we will eventually want such a beast in order to allow for some runtime checks (since changing and distributing a Python-coded compiler is probably easier than changing the C-coded interpreter). Here's what I *would* like to do. I would like to subclass your node objects and build "statically resolved" subtypes. This will be a natural base class for a new version of py2c (which as far as I know does no static resolution) and for an optimizing bytecode compiler. Hopefully a big chunk of the Python 2 compiler can be written in Python 2. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself "A writer is also a citizen, a political animal, whether he likes it or not. But I do not accept that a writer has a greater obligation to society than a musician or a mason or a teacher. Everyone has a citizen's commitment." - Wole Soyinka, Africa's first Nobel Laureate From paul@prescod.net Fri Dec 10 15:04:39 1999 From: paul@prescod.net (Paul Prescod) Date: Fri, 10 Dec 1999 10:04:39 -0500 Subject: [Types-sig] Plea for help. References: Message-ID: <38511687.73E793C6@prescod.net> Greg Stein wrote: > > The Node subclasses were on Bill's to-do list. That's cool that you've > already done it! Would it be possible to update the standard py2c distribution with these changes so that we don't have a code fork? -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself "A writer is also a citizen, a political animal, whether he likes it or not. But I do not accept that a writer has a greater obligation to society than a musician or a mason or a teacher. Everyone has a citizen's commitment." 
- Wole Soyinka, Africa's first Nobel Laureate From jeremy@cnri.reston.va.us Fri Dec 10 16:44:44 1999 From: jeremy@cnri.reston.va.us (Jeremy Hylton) Date: Fri, 10 Dec 1999 11:44:44 -0500 (EST) Subject: [Types-sig] Plea for help. In-Reply-To: <38511687.73E793C6@prescod.net> References: <38511687.73E793C6@prescod.net> Message-ID: <14417.11772.747280.226120@goon.cnri.reston.va.us> >>>>> "PP" == Paul Prescod writes: PP> Greg Stein wrote: >> The Node subclasses were on Bill's to-do list. That's cool that >> you've already done it! PP> Would it be possible to update the standard py2c distribution PP> with these changes so that we don't have a code fork? I'm going to send patches to Greg. I'm swamped today, but will get to it Monday at the latest. Jeremy From gstein@lyra.org Sat Dec 11 01:00:41 1999 From: gstein@lyra.org (Greg Stein) Date: Fri, 10 Dec 1999 17:00:41 -0800 (PST) Subject: [Types-sig] transformer.py (was: Plea for help.) In-Reply-To: <38511687.73E793C6@prescod.net> Message-ID: On Fri, 10 Dec 1999, Paul Prescod wrote: > Greg Stein wrote: > > > > The Node subclasses were on Bill's to-do list. That's cool that you've > > already done it! > > Would it be possible to update the standard py2c distribution with these > changes so that we don't have a code fork? Oh. Sure. I'll get right on it. Bill and I have already exchanged mail with Jeremy. The stuff will get folded in at some point. When? Dunno. When we have free time. Bill and I don't spend much time with that code -- it comes in bursts. Code fork? I don't see that occurring at all; it isn't like Jeremy is purposefully going to start producing new releases of transformer.py. If he sends one out, it would simply be to expedite matters. Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Sat Dec 11 01:13:59 1999 From: gstein@lyra.org (Greg Stein) Date: Fri, 10 Dec 1999 17:13:59 -0800 (PST) Subject: [Types-sig] Plea for help. In-Reply-To: <385116A1.69E1BD05@prescod.net> Message-ID: On Fri, 10 Dec 1999, Paul Prescod wrote: > Jeremy Hylton wrote: > > I was going to suggest that a good near-term goal for the type sig > > would be to write a Python compiler in Python, but I see that Paul has > > beaten me to it. > > Not quite. I'm not going to do anything about generating bytecodes. But > it seems to me like that would be another cool project. Someone should > do py2pyc and add it to the py2c distribution. But I'm not going to... P2C is in CVS (see http://www.pythonpros.com/cvs.html). If people really want to get some work done, then we can arrange for access. The P2C framework has been used for a couple output targets, so generating a pyc is definitely workable. > Yes, we will eventually want such a beast in order to allow for some > runtime checks (since changing and distributing a Python-coded compiler > is probably easier than changing the C-coded interpreter). yup. > Here's what I *would* like to do. I would like to subclass your node > objects and build "statically resolved" subtypes. This will be a natural > base class for a new version of py2c (which as far as I know does no > static resolution) and for an optimizing bytecode compiler. We have very limited type handling (and certainly no inference). > Hopefully a big chunk of the Python 2 compiler can be written in Python > 2. I'm hoping to see a replaceable compiler in 1.6. Shouldn't be hard to move the compilation step behind some hooks. Should be able to hook-ify the parser, too. 
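[A rough sketch, mine and not Greg's plan, of what "the compilation step behind some hooks" could look like: a replaceable front end parses, runs whatever check pass it likes over the parse tree, and only then defers to the builtin compiler.]

    import parser

    def check_pass(tree_tuple):
        pass    # a real checker would walk the nested tuples here

    def checking_compile(source, filename="<string>"):
        check_pass(parser.suite(source).totuple())   # pluggable front-end hook
        return compile(source, filename, "exec")     # hand off to the builtin compiler

    code = checking_compile("x = 40 + 2\n")
    exec(code)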
Cheers, -g -- Greg Stein, http://www.lyra.org/ From steve@websentric.com Mon Dec 13 10:00:53 1999 From: steve@websentric.com (Stephen Purcell) Date: Mon, 13 Dec 1999 11:00:53 +0100 Subject: [Types-sig] Re: Static typing considered HARD Message-ID: <3854C3D5.D7F765A@websentric.com> Paul Prescod wrote: > > What can you do dynamically in Python that you cannot do with > reflections and introspection? I've written "map", "apply" and the Y > combinator in Java so I'm pretty confident that the issue is really just > syntax and ease of use, not capabilities. > > You could prove me wrong by showing a Python programming pattern that > could not be straightforwardly duplicated using Java reflection. I have one: dynamic configuration of exception catching --

    class Test:
        FAILURE_ERROR_TYPE = AssertionError

        def run(self, result):
            try:
                self.runTest()
            except self.FAILURE_ERROR_TYPE:
                result.failed(self)
            except:
                result.error(self)
            else:
                result.success(self)

This is a cut-down version of a real and justifiable example. Try doing that in Java with reflection and without resorting to 'instanceof' or 'Class.isInstance()'. -Steve _______________________________________________ Steve Purcell Squadron Leader WebSentric AG, http://www.websentric.com/ ____"Would you like to look at my Python?"_____ From paul@prescod.net Mon Dec 13 15:49:15 1999 From: paul@prescod.net (Paul Prescod) Date: Mon, 13 Dec 1999 10:49:15 -0500 Subject: [Types-sig] RFC 0.1 Message-ID: <3854EB4B.37EA2888@prescod.net> I don't think we would get anywhere if I just opened up the floor and had everyone yell their opinions about type safety. Here is a very rough starting point. Let's talk freely about it for a few days and then I'll try to direct the conversation based upon addressing the feedback.

Version 0.1 Draft of a Pythonic Type Checking System
====================================================

Guiding Principles in the System's Development
----------------------------------------------

#1. The system exists to serve the dual goals of catching errors earlier in the development process and improving the performance of Python compilers and the Python runtime. Neither goal should be pursued exclusively.

#2. The system must allow authors to make assertions about the sorts of values that may be bound to names. These are called binding assertions. They should behave as if an assertion statement was inserted after every assignment to that name program-wide.

Note: this does in fact put more power in the hands of module developers. For the first time we will be able to say that sys.exit may not be overridden in user code and that sys.maxint cannot be changed to contain a string.

Note: the term "sorts of values" is meant to be ambiguous: the definition of "type" in Python may undergo change in the future.

#3. Binding assertions must always be optional.

#4. There must be declarations that instruct static type checking software to verify that a function cannot violate binding assertions. These are called safety declarations.

#5. The introduction of binding assertions to a module should not change the perceived interface of functions and classes in the module. In other words, code that uses functions and classes from the module should not need to know whether it uses binding assertions or old fashioned assert statements.

#6. In the absence of local safety declarations, a static type checker should not by default report errors in otherwise legal Python code.
In other words, a coder must ask (through function or module level declarations, command line switches or environment variables) for his or her code to be checked. In particular, a module cannot force client modules to be statically type checked (see #5, above).

#7. The attachment of safety declarations to a function should not change the perceived interface of the function. In other words, code that uses the functions should not need to know that the function happens to be statically checkable.

#8. It is not a goal that a statically checkable function should only be able to call other statically checkable functions. Those other functions should be presumed to return a "PyObject" object.

#9. There should be a mechanism to assert that an object has a particular type for purposes of informing the static and dynamic type checkers about something that the programmer knows about the flow of the program.

#10. In general, the mechanism should try to be "pythonic" which includes but is not limited to:
    * maximize simplicity
    * maximize power
    * minimize syntax
    * be explicit
    * be readable
    * interoperate nicely with other features

Temporary Goals and Non-Goals:
------------------------------

#1. The first version of the system will be as neutral as possible on the issue of what defines a "type". Fulton's capability-based interfaces should be legal as types but so should type objects and classes.

Note: a purely interface based system cannot be feasible for testing until interfaces are embedded deeply into the existing Python library. It might be more philosophically pure to test for an abstract CharacterString interface but if the Python expression "abc" does not return an object that conforms to the interface then there is not much we can do. Some future version of the system may be restricted to only allow declared interfaces as types. Or it may be expanded to allow parameterized types.

#2. The first version of the system will not allow the use of types that cannot be referred to as simple Python objects. In particular it will not allow users to refer to things like "List of Integers" and "Functions taking Integers as arguments and returning strings."

#3. The first version of the system will not define the operation of a type inferencing system. For now, all type declarations would need to be explicit.

#4. The first version of the system will be syntactically compatible with Python 1.5.x in order to allow experimentation in the lead-up to an integrated system in Python 2.

Definitions:
------------

Namespace creating suite: The suite contained directly within a module, class or function definition.

Statically available namespace creating suite: The namespace creating suite defined by a module or class definition. We do not consider the suite contained within a function as Statically available because the namespace only becomes available when the function is executed, not when it is declared.

Name binding statement, target: An assignment statement (target), "def" statement ("funcname"), "class" ("classname") statement or "import" statement (module). *** more thought about "from" version ***

Name declaration: A name bound at the most out-dented context of a statically available namespace creating suite.

Classification: Due to a shortage of synonyms for "type" that do not already have a meaning, we use the word "classification."
Given a value v and a value t, v conforms to classification t if
    t is returned by type( v )
    t is returned by v.__class__
    t is in v.__implements__ (the fulton convention)
    t is the "object" classification
    v is the value "None"

Classification Declaration: A statement that precedes a name binding statement and declares the classifications that the name must conform to. The type declaration must textually precede any use of the name.

Classification Constraints: A pair of statements declaring the classifications that values bound to a name must support. There are a few syntactic variations:

1. A name binding statement preceded by a statement referencing a classification.

    types.StringType
    a

    class foo:
        types.IntType
        j=5

This assertion is maintained by a combination of the static and dynamic type checkers. In order for the dynamic checker to work, we will need to modify the module_setattr and class_setattr functions for Python 1.6.

2. A simple expression containing only a tuple where all but the last item reference a classification. The last item should be a locally declared name. The statement must occur in the most out-dented context of a namespace creating statement suite:

    def foo(bar, baz):
        types.IntType, bar
        interfaces.NumericType, interfaces.SignedType, baz

3. The classification of a function is always "function" but its return classification can be specified with a declaration:

    types.StringType
    def foo(): return "abc"

This can be checked through the introduction of "virtual" assertion statements into byte-code:

    types.StringType
    def foo():
        __tmp = "abc"
        assert has_type( __tmp, types.StringType )
        return "abc"

4. The classification of class instance variables comes from the classification of the corresponding class variable.

    class foo:
        types.IntType
        a=5

        types.ListType
        b=None

Classification-testing expression: The function has_type takes a value and a reference to a classification or list of classifications. The return type of the function is the union of the classifications.

Classification-safe Function: a function that can be checked at compile time not to violate any classification constraints by assigning invalid values to any constrained names:

Every reference to a name in a module or class (not instance!) must be to a declared (but perhaps not classification constrained) name. Remember that variables without classification constraints can be presumed to conform to the "Object" type.

Every expression must be type-checked based on the operators, constants and global and local name references.

Attribute assignments and references are checked based upon the asserted classifications of the owning object.

The classification of every assignment must be checked based on the types of constants, variables and function return types in the right-hand side.

The classification of every function parameter must be checked based on the classifications of the argument expression.

All return statements must be checked based on the classifications of the expressions.

-- Paul Prescod - ISOGEN Consulting Engineer speaking for himself "A writer is also a citizen, a political animal, whether he likes it or not. But I do not accept that a writer has a greater obligation to society than a musician or a mason or a teacher. Everyone has a citizen's commitment."
- Wole Soyinka, Africa's first Nobel Laureate From paul@prescod.net Mon Dec 13 15:56:52 1999 From: paul@prescod.net (Paul Prescod) Date: Mon, 13 Dec 1999 10:56:52 -0500 Subject: [Types-sig] Re: Static typing considered HARD References: <3854C3D5.D7F765A@websentric.com> Message-ID: <3854ED13.8093FCE7@prescod.net> Stephen Purcell wrote: > > This is a cut-down version of a real and justifiable example. Try doing > that in Java with reflection and without resorting to 'instanceof' or > 'Class.isInstance()'. What you're saying is that I can't emulate Python's dynamic features without using Java's dynamic features. I would agree with that assertion -- but I'm not convinced it is relevant. instanceof is part of the language core and isInstance is a reflective feature. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself There are only two countries in the world that have not ratified the United Nations convention on the rights of children: Somalia and the United States of America. See: http://www.boes.org/ From steve@websentric.com Mon Dec 13 17:46:39 1999 From: steve@websentric.com (Stephen Purcell) Date: Mon, 13 Dec 1999 18:46:39 +0100 Subject: [Types-sig] Re: Static typing considered HARD References: <3854C3D5.D7F765A@websentric.com> <3854ED13.8093FCE7@prescod.net> Message-ID: <385530FF.70AABC9E@websentric.com> Paul Prescod wrote: > > Stephen Purcell wrote: > > > > This is a cut-down version of a real and justifiable example. Try doing > > that in Java with reflection and without resorting to 'instanceof' or > > 'Class.isInstance()'. > > What you're saying is that I can't emulate Python's dynamic features > without using Java's dynamic features. I would agree with that assertion > -- but I'm not convinced it is relevant. instanceof is part of the > language core and isInstance is a reflective feature. > Thanks, Paul, for noting the lack of clarity in my comment, which I shall endeavour to remedy: The dynamic nature of Python's exception handling is an intrinsic language property that cannot be exactly mirrored in Java's exception handling, using reflection or otherwise. No 'catch' clause in Java will ever work the same way as Python's 'except', and by "resorting to instanceof or isInstance" I meant a subversion of the 'catch' clause's fundamental semantics:

    abstract class Test {
        private Class FAILURE_ERROR_TYPE = AssertionException.class;

        void run(Result r) {
            try {
                runTest();
                r.success(this);
            } catch ( Exception e ) {
                if ( e.getClass() == FAILURE_ERROR_TYPE ) {
                    r.fail(this);
                } else {
                    r.error(this);
                }
            }
        }
    }

This is not the same language construct as the Python version. Your argument is that any functionality implemented in a dynamically-typed language can be mirrored in a statically-typed language. Of course that is true, given enough code. It does *not* imply that the features of the statically typed language are compatible with those of the dynamically typed language, nor that their introduction is desirable and technically possible. It seems to me that the whole static-blah thing clashes with fundamental choices that Guido made when designing Python, and those choices are presumably a large part of Python's appeal and success. I would never presume to second-guess the needs of Python's users. Static typing works very well in Java and suchlike, but those are different languages, and the people who cannot live without static typing use them instead of Python (and Smalltalk).
The rest of us, who do not expect perfection to consist of the union of all possibilities, use Python when appropriate and keep in mind its characteristics. There's something special about Python's elegance, and losing that elegance by a Perl-5-like process of cluttering would be enough for me to abandon the language, and move on to the next 'clean' thing. I care enough to have posted this opinion, but not enough to try to influence Python's development. The booing to which David Ascher alluded in his posting may indeed have been 'just' an emotional reaction to the proposal, but I challenge any avid Python user to fully describe his or her enthusiasm for the language in purely technical terms. I use the language because it somehow makes me feel good. When it no longer gives me that feeling, I'll stop using it. Static typing would have that effect. I suspect that other avid users such as Uche would also stop. Rational? Not entirely. I don't expect anyone to care what I think or if I abandon Python in the future, and I certainly don't imagine that any rational argument I might provide for my opinion would change anybody else's mind. I'll avoid the fray, and vote with my feet when the time comes. -Steve _______________________________________________ Steve Purcell Squadron Leader WebSentric AG, http://www.websentric.com/ ____"Would you like to look at my Python?"_____ From guido@CNRI.Reston.VA.US Mon Dec 13 18:09:15 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Mon, 13 Dec 1999 13:09:15 -0500 Subject: [Types-sig] RFC 0.1 In-Reply-To: Your message of "Mon, 13 Dec 1999 10:49:15 EST." <3854EB4B.37EA2888@prescod.net> References: <3854EB4B.37EA2888@prescod.net> Message-ID: <199912131809.NAA19402@eric.cnri.reston.va.us> > I don't think we would get anywhere if I just opened up the floor and > had everyone yell their opinions about type safety. Here is a very rough > starting point. Let's talk freely about it for a few days and then I'll > try to direct the conversation based upon addressing the feedback. Thanks for starting this, Paul! > Version 0.1 Draft of a Pythonic Type Checking System > ==================================================== > > Guiding Principles in the System's Development > ---------------------------------------------- > > #1. The system exists to serve the dual goals of catching errors > earlier in the development process and improving the performance of > Python compilers and the Python runtime. Neither goal should be > pursued exclusively. Hm, these may at times be very different goals. I had a recent private discussion about types where the two goals were referred to as (OPT), for optimization, and (ERR), for error-detection. One observation is that while for (OPT) you may be able to get away with aggressive whole-program type inferencing only, but for (ERR) you're likely to *want* to declare types in certain cases; e.g. to prepare for possible evolution of a module you may want to fix its API to a subset of what is actually implemented. > #2. The system must allow authors to make assertions about the sorts > of values that may be bound to names. These are called binding > assertions. They should behave as if an assertion statement was > inserted after every assignment to that name program-wide. Technically, Python assert statements are only executed in non-optimizing mode -- "assert 0" has no effect when you happen to use "python -O" to execute your program. But I presume that here you mean assertions in the abstract conceptual sense. 
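[To make Guido's aside concrete -- this is standard Python behaviour, not anything proposed in the thread -- an assert-based type check evaporates under "python -O", which is why assertions alone cannot carry the guarantees being discussed.]

    def double_string(b):
        assert isinstance(b, str), "b must be a string"
        return b * 2

    print(double_string("abc"))    # 'abcabc'
    # double_string(5) raises AssertionError in a normal run,
    # but quietly returns 10 under "python -O".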
> Note: this does in fact put more power in the hands of module > developers. For the first time we will be able to say that > sys.exit may not be overridden in user code and that sys.maxint cannot > be changed to contain a string. I think JPython secretly already imposes some of these restrictions (in particular for the sys module!). > Note: the term "sorts of values" is meant to be ambiguous: the > definition of "type" in Python may undergo change in the future. > > #3. Binding assertions must always be optional. > > #4. There must be declarations that instruct static type checking > software to verify that a function cannot violate binding assertions. > These are called safety declarations. I'm not sure what you mean here and how such declarations differ from type assertions. And I'm worried about the "must" part. Please explain better? > #5. The introduction of binding assertions to a module should not > change the perceived interface of functions and classes in the module. > In other words, code that uses functions and classes from the module > should not need to know whether it uses binding assertions or old > fashioned assert statements. Except that some unintended uses may become illegal while before you might just have gotten away with them. > #6. In the absence of local safety declarations, a static type checker > should not by default report errors in otherwise legal Python code. In > other words, a coder must ask (through function or module level > declarations, command line switches or environment variables) for his > or her code to be checked. In particular, a module cannot force client > modules to be statically type checked (see #5, above). However, there are some examples of dynamic code usage that are fishy. Examples include adding or changing globals in other modules (except for the rare global that is intended to be a settable option), or messing with the __builtin__ module. > #7. The attachment of safety declarations to a function should not > change the perceived interface of the function. In other words, code > that uses the functions should not need to know that the function > happens to be statically checkable. But I'd still like to be able to be diagnosed at compile time instead of at runtime when my code makes a statically illegal call to a function with a safety declaration. > #8. It is not a goal that a statically checkable function should only > be able to call other statically checkable functions. Those other > functions should be presumed to return a "PyObject" object. > > #9. There should be a mechanism to assert that an object has a > particular type for purposes of informing the static and dynamic type > checkers about something that the programmer knows about the flow of > the program. Beyond "assert isinstance(object, type_or_class)" ? > #10. In general, the mechanism should try to be "pythonic" which > includes but is not limited to: > > * maximize simplicity > * maximize power > * minimize syntax > * be explicit > * be readable > * interoperate nicely with other features > > Temporary Goals and Non-Goals: > ------------------------------ > > #1. The first version of the system will be as neutral as possible on > the issue of what defines a "type". Fulton's capability-based > interfaces should be legal as types but so should type objects and > classes. > > Note: a purely interface based system cannot be feasible for testing > until interfaces are embedded deeply into the existing Python library. 
> It might be more philisophically pure to test for an abstract > CharacterString interface but if the Python expression "abc" does not > return an object that conforms to the interface then there is not much > we can do. Some future version of the system may be restricted to only > allow declared interfaces as types. Or it may be expanded to allow > parameterized types. > > #2. The first version of the system will not allow the use of types > that cannot be referred to as simple Python objects. In particular it > will not allow users to refer to things like "List of Integers" and > "Functions taking Integers as arguments and returning strings." It's been said before: that's a shame. Type inference is seriously hindered if it doesn't have such information. (Consider a loop over sys.argv; I want the checker to be able to assume that the items are strings.) > #3. The first version of the system will not define the operation of a > type inferencing system. For now, all type declarations would need to > be explicit. I expect that this will make the system relatively heavy-weight and hence unpythonic. You'd be sprinkling way more type decls over your source code than would be necessary with a somewhat more sophisticated type checker. > #4. The first version of the system will be syntactically compatible > with Python 1.5.x in order to allow experimentation in the lead-up to > an integrated system in Python 2. I think that this is too much of a constraint, and may be informing your preliminary design too much. As long as an easy mechanical transformation to valid Python 1.5.x is available, I'd be happy. > Definitions: > ------------ > Namespace creating suite: > The suite contained directly within a module, class or function > definition. > > Statically available namespace creating suite: > The namespace creating suite defined by a module or class > definition. We do not consider the suite contained with a function as > Statically available because the namespace only becomes available when > the function is executed, not when it is declared. > > Name binding statement, target: > An assignment statement (target), "def" statement ("funcname"), > "class" ("classname") statement or "import" statement (module). *** > more thought about "from" version *** > > Name declaration: > A name bound at the most out-dented context of a statically > available namespace creating suite. The indentation don't enter into it. Consider if win32: def func(): ... # win32 specific version else: def func(): ... # generic version > Classification: > Due to a shortage of synonyms for "type" that do not already have a > meaning, we use the word "classification." Oh, dear. Keep looking for a better synonym! > Given a value v and a value t, v conforms to classification t if > t is returned by type( v ) > t is returned by v.__class__ > t is in v.__implements__ (the fulton convention) > t is the "object" classification > v is the value "None" > > Classification Declaration: > A statement that precedes a name binding statement and declares > the classifications that the name must conform to. The type > declaration must textually precede any use of the name. > > Classification Constraints: > A pair of statements declaring the classifications that values > bound to a name must support. There are a few syntactic variations: > > 1. A name binding statement preceded by a statement referencing a > classification. 
> > > types.StringType > a > > class foo: > types.IntType > j=5 > > > This assertion is maintained by a combination of the static and > dynamic type checkers. In order for the dynamic checker to work, we > will need to modify the module_setattr and class_setattr functions for > Python 1.6. > > 2. A simple expression containing only a tuple where all but the > last item reference a classification. The last item should be a > locally declared name. The statement must occur in the most out-dented > context of a namespace creating statement suite: > > def foo(bar, baz): > types.IntType, bar > interfaces.NumericType, interfaces.SignedType, baz > > 3. The classification of a function is always "function" but its > return classification can be specified with a declaration: > > > types.StringType > def foo(): return "abc" > > > This can be checked through the introduction of "virtual" assertion > statements into byte-code: > > > types.StringType > def foo(): > __tmp = "abc" > assert has_type( __tmp, types.StringType ) > return "abc" > Of course, in certain cases (as in this example) the type checker may be able to prove that the assertion can never fail, and omit it. > 4. The classification of class instance variables comes from the > classification of the corresponding class variable. > > > class foo: > types.IntType > a=5 > > types.ListType > b=None > The initialization for b denies its type declaration. Do you really want to do this? This doesn't look like it should be part of the final (Python 2.0) version -- it's just too ugly. How am I going to explain this to a newbie with no programming *nor* Python experience? > Classification-testing expression: > > The function has_type takes a value and a reference to a > classification or list of classifications. The return type of the > function is the union of the classifications. Perhaps this could be an extension of isinstance()? (That already takes both class and type objects.) > Classification-safe Function: > > a function that can be checked at compile time not to violate any > classification constraints by assigning invalid values to any > constrained names: > > Every reference to a name in a module or class (not instance!) must be > to a declared (but perhaps not classification constrained) name. Explain the reason for excluding instances? Maybe I'm not very clear on what you're proposing here. > > Remember that variables without classification constraints can be > presumed to conform to the "Object" type. > > > Every expression must be type-checked based on the operators, > constants and global and local name references. Ah, good. This implies the "no messing with builtins or other modules' globals" rule that I'm proposing. > Attribute assignments and references are checked based upon the > asserted classifications of the owning object. > > The classification of every assignment must be checked based on the > types of constants, variables and function return types in the > right-hand side. > > The classification of every function parameter must be checked based > on the classifications of the argument expression. > > All return statements must be checked based on the classifications of > the expressions. OK. I'm not sure everywhere whether you want compile-time or run-time checking. Perhaps you can clarify this? 
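(For what it's worth, the "virtual assertion" in example 3 can already be
written out by hand, which also shows the two readings side by side: run
the code and it is a dynamic check; a static checker can see that "abc"
is a string and drop the check entirely. isinstance() stands in for the
has_type() function, which does not exist yet.)

    StringType = type("")

    def foo():
        __tmp = "abc"
        # the assertion the compiler would insert for the declared
        # StringType return classification
        assert isinstance(__tmp, StringType), "foo() must return a string"
        return __tmp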
--Guido van Rossum (home page: http://www.python.org/~guido/) From GoldenH@littoncorp.com Mon Dec 13 18:11:15 1999 From: GoldenH@littoncorp.com (Golden, Howard) Date: Mon, 13 Dec 1999 10:11:15 -0800 Subject: [Types-sig] Re: RFC 0.1 Message-ID: Paul Prescod wrote: > #2. The system must allow authors to make assertions about the sorts > of values that may be bound to names. These are called binding > assertions. They should behave as if an assertion statement was > inserted after every assignment to that name program-wide. I think the system should also allow the author to require declarations of all variables (e.g., via a command-line switch or pragma). > #3. Binding assertions must always be optional. Unless the author requires them using the above mechanism. > #10. In general, the mechanism should try to be "pythonic" which > includes but is not limited to: > * maximize simplicity > * maximize power > * minimize syntax > * be explicit > * be readable > * interoperate nicely with other features This is vague. I'm not sure what it means. > 1. The first version of the system will be as neutral as possible on > the issue of what defines a "type". Fulton's capability-based > interfaces should be legal as types but so should type objects and > classes. I don't understand the ramifications of this. Might it not gut the RFC? > Note: a purely interface based system cannot be feasible for testing > until interfaces are embedded deeply into the existing Python library. > It might be more philisophically pure to test for an abstract > CharacterString interface but if the Python expression "abc" does not > return an object that conforms to the interface then there is not much > we can do. Some future version of the system may be restricted to only > allow declared interfaces as types. Or it may be expanded to allow > parameterized types. Shouldn't it be straightforward to add declarations to the existing library? > #2. The first version of the system will not allow the use of types > that cannot be referred to as simple Python objects. In particular it > will not allow users to refer to things like "List of Integers" and > "Functions taking Integers as arguments and returning strings." Why? I don't think this should be prohibited, only not guaranteed. > #4. The first version of the system will be syntactically compatible > with Python 1.5.x in order to allow experimentation in the lead-up to > an integrated system in Python 2. Does this mean no new syntax? (That's what it appears from your examples.) How about a declaration syntax, e.g., var x : type1, y : type2 Is this prohibited by the RFC? > Definitions: > ------------ I'm confused about this section. Are these requirements or merely terminology? In general, I don't understand the definitions. It would help me if there were some additional explanation of how the defined terms fit together and what benefits are being obtained by making these distinctions. --- Howard B. Golden Software developer Litton Industries, Inc. Woodland Hills, California From jpe@arachne.org Mon Dec 13 18:54:30 1999 From: jpe@arachne.org (John Ehresman) Date: Mon, 13 Dec 1999 13:54:30 -0500 (EST) Subject: [Types-sig] RFC 0.1 In-Reply-To: <199912131809.NAA19402@eric.cnri.reston.va.us> Message-ID: On Mon, 13 Dec 1999, Guido van Rossum wrote: > ... > OK. I'm not sure everywhere whether you want compile-time or run-time > checking. 
I think it might be possible to do both run-time and compile-time checking by defining the system in terms of what happens at run time, but allowing compile time optimizations to be made. For example, we might say the declaration (using C-like syntax) "def IntType atoi(StringType s):" to mean that if a value is passed to atoi that is not a string, a TypeError exception is raised. This declaration might be enough for a lint like program to analyze code before it is run and to flag cases where TypeError would be thrown. I think there's value in having run-time checking to support delayed checking in some cases -- it would allow strongly typed functions to be bound to symbols without any typing info. Otherwise, it unclear how to handle the following: def IntType atoi(StringType s): ... if something: conv = atoi else: conv = function_of_unknown_type print conv('1') Then, if a compiler was able to determine that the value bound to StringType never changed and that a value was a valid StringType, it could optimize away the code to check the type of that value. This could be implemented for functions by separating the code to check argument types from the function body and setting up the calling conventions so that the type checking code was only executed when needed. I don't see anything here that prevents type inferencing to work in either a lint like program or a compiler. For example, from the code: def IntType atoi(StringType s): ... def wrapper(s): return atoi(s) it's relatively easy to determine that wrapper must take a string and returns an integer (assuming the binding for atoi is constant). John From gstein@lyra.org Mon Dec 13 20:15:13 1999 From: gstein@lyra.org (Greg Stein) Date: Mon, 13 Dec 1999 12:15:13 -0800 (PST) Subject: [Types-sig] RFC 0.1 In-Reply-To: <199912131809.NAA19402@eric.cnri.reston.va.us> Message-ID: My comments below come from a writeup that is posted at: http://www.foretec.com/python/workshops/1998-11/greg-type-ideas.html The writeup is from a discussion last year, between Fred, Sjoerd, and myself. I'm not going to replicate the details of that writeup here, but will simply highlight some points. Hit the link to see the background. On Mon, 13 Dec 1999, Guido van Rossum wrote: > Paul Prescode wrote: >... > > #2. The system must allow authors to make assertions about the sorts > > of values that may be bound to names. These are called binding > > assertions. They should behave as if an assertion statement was > > inserted after every assignment to that name program-wide. In our writeup, we posit that it is better (and more Pythonic) to bind the assertions to expressions, rather than names. This came about when we looked at how to supply assertions for things like: x.y = value x[i] = value x[i:j] = value Certainly, function objects would have type information associated with them, but I believe that is different than associating a type with the function's name. > Technically, Python assert statements are only executed in > non-optimizing mode -- "assert 0" has no effect when you happen to use > "python -O" to execute your program. But I presume that here you mean > assertions in the abstract conceptual sense. We proposed a new type-assertion operator. Whether it did anything or not (based on the -O switch) is a different discussion :-) >... > > #9. There should be a mechanism to assert that an object has a > > particular type for purposes of informing the static and dynamic type > > checkers about something that the programmer knows about the flow of > > the program. 
> > Beyond "assert isinstance(object, type_or_class)" ? We also proposed extending isinstance() to allow a callable for the third argument. This allows for arbitrarily complex type checking (e.g. the "list of integers" problem). >... > > #2. The first version of the system will not allow the use of types > > that cannot be referred to as simple Python objects. In particular it > > will not allow users to refer to things like "List of Integers" and > > "Functions taking Integers as arguments and returning strings." > > It's been said before: that's a shame. Type inference is seriously > hindered if it doesn't have such information. (Consider a loop over > sys.argv; I want the checker to be able to assume that the items are > strings.) The mechanism we outlined would allow any dotted-name for specifying a type, and the "isinstance(ob, callable)" mechanism would allow for complex type checking. >... > > #4. The first version of the system will be syntactically compatible > > with Python 1.5.x in order to allow experimentation in the lead-up to > > an integrated system in Python 2. > > I think that this is too much of a constraint, and may be informing > your preliminary design too much. As long as an easy mechanical > transformation to valid Python 1.5.x is available, I'd be happy. I believe we came up with an unambiguous grammer which should easily allow for mechanical translation. [ side note: if we get replaceable parser/compiler functionality in 1.6, then we can start to test these alternative grammers and can compile assertions and things based on them! ] >... > > 4. The classification of class instance variables comes from the > > classification of the corresponding class variable. > > > > > > class foo: > > types.IntType > > a=5 > > > > types.ListType > > b=None > > > > The initialization for b denies its type declaration. Do you really > want to do this? This doesn't look like it should be part of the > final (Python 2.0) version -- it's just too ugly. How am I going to > explain this to a newbie with no programming *nor* Python experience? If type assertions are bound to expressions, rather than names, a data flow analysis will show the types at any point. This could (theoretically) avoid many "declarations". > > Classification-testing expression: > > > > The function has_type takes a value and a reference to a > > classification or list of classifications. The return type of the > > function is the union of the classifications. > > Perhaps this could be an extension of isinstance()? (That already > takes both class and type objects.) See my proposed extension to isinstance(). I believe it is a very clear extension and offers all the functionality you may need. Cheers, -g -- Greg Stein, http://www.lyra.org/ From tim.hochberg@ieee.org Mon Dec 13 21:17:00 1999 From: tim.hochberg@ieee.org (Tim Hochberg) Date: Mon, 13 Dec 1999 14:17:00 -0700 Subject: [Types-sig] Greg Stein's writeup (was RFC 0.1) References: Message-ID: <00a301bf45af$67323440$87740918@phnx3.az.home.com> > My comments below come from a writeup that is posted at: > http://www.foretec.com/python/workshops/1998-11/greg-type-ideas.html > > The writeup is from a discussion last year, between Fred, Sjoerd, and > myself. I'm not going to replicate the details of that writeup here, but > will simply highlight some points. Hit the link to see the background. I just read Greg's writeup and I like it quite a bit. With the exception of those nasty !s, it seems very Pythonic. 
My question is: is there a reason that a digraph couldn't be used instead of the !. In particular "foo->Int" can be read as "foo evaluates_to Int" which seems to have all of the correct associations. Or does this result in ambiguous syntax? All of Greg's examples seem to be OK: x = value->Int (x,y) = value->Coord (x,y) = value->(Int, String) while foo()->Int: ... def foo(x->String)->Int: ... Of course there is the problem that -> has a very different meaning in C/C++, but then so does !. Just my two cents, -tim From paul@prescod.net Tue Dec 14 04:39:36 1999 From: paul@prescod.net (Paul Prescod) Date: Mon, 13 Dec 1999 20:39:36 -0800 Subject: [Types-sig] Plea for help. References: Message-ID: <3855CA08.4EA1BF11@prescod.net> Greg Stein wrote: > > I'm hoping to see a replaceable compiler in 1.6. Shouldn't be hard to move > the compilation step behind some hooks. Should be able to hook-ify the > parser, too. Is there currently any path from high level parse trees to bytecodes? E.g. is there a way to get sane parse trees to "render" themselves as, er, insane parse trees? I don't think so but I'm just checking to avoid extra work. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself "Unwisely, Santa offered a teddy bear to James, unaware that he had been mauled by a grizzly earlier that year." - Timothy Burton, "James" From paul@prescod.net Tue Dec 14 04:39:43 1999 From: paul@prescod.net (Paul Prescod) Date: Mon, 13 Dec 1999 20:39:43 -0800 Subject: [Types-sig] Re: transformer.py (was: Plea for help.) References: Message-ID: <3855CA0F.911CB41A@prescod.net> Greg Stein wrote: > > Bill and I have already exchanged mail with Jeremy. The stuff will get > folded in at some point. When? Dunno. Is it a case of "folding in" or of merging a file? I thought it was the latter because I thought that Jeremy's changes were backwards compatible with py2c. > Code fork? I > don't see that occurring at all; it isn't like Jeremy is purposefully > going to start producing new releases of transformer.py. If he sends one > out, it would simply be to expedite matters. My point is, if I build an interesting application on top of Jeremy's version and you continue to build on the older version, we will have a defacto code fork because one of us will have to update our code in order to re-sync. If Jeremy's code is just a drop-in replacement then that won't be a problem and I'll just use it until you get around to "dropping it in" to py2c. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself "Unwisely, Santa offered a teddy bear to James, unaware that he had been mauled by a grizzly earlier that year." - Timothy Burton, "James" From paul@prescod.net Tue Dec 14 07:34:16 1999 From: paul@prescod.net (Paul Prescod) Date: Mon, 13 Dec 1999 23:34:16 -0800 Subject: [Types-sig] RFC 0.1 References: Message-ID: <3855F2F8.DE1FED31@prescod.net> I did evaluate your proposal but it seemed to me that it was solving a slightly different problem. I think as we compare them we'll find that your ideas were more oriented toward runtime safety checking. Greg Stein wrote: > > > > #2. The system must allow authors to make assertions about the sorts > > > of values that may be bound to names. These are called binding > > > assertions. They should behave as if an assertion statement was > > > inserted after every assignment to that name program-wide. > > In our writeup, we posit that it is better (and more Pythonic) to bind the > assertions to expressions, rather than names. 
This came about when we > looked at how to supply assertions for things like: > > x.y = value > x[i] = value > x[i:j] = value I wouldn't supply assertions for assignments at all. You supply assertions for the names x, y, i, and j. > Certainly, function objects would have type information associated with > them, but I believe that is different than associating a type with the > function's name. But if a function takes as its first argument an int, in what sense is that type associated with an "expression"? It is associated with a name, whatever the name of the first argument is. Plus consider this: type-safe String def foo(): return abc() How can I, at compile time, statically know the type of the value currently contained in the name abc if I don't restrict it in advance like this: String def abc(): return "abc" Rebinding is fine, as long as it doesn't invalidate the type declaration: abc = lambda: "def" > We also proposed extending isinstance() to allow a callable for the third > argument. This allows for arbitrarily complex type checking (e.g. the > "list of integers" problem). I liked that idea but really didn't see how to port it to a compile time static type checker. I'm going out of my way to avoid running arbitrary Python code. Static type checking shouldn't be a security hazard. > [ side note: if we get replaceable parser/compiler functionality in 1.6, > then we can start to test these alternative grammers and can compile > assertions and things based on them! ] That would be way cool! > If type assertions are bound to expressions, rather than names, a data > flow analysis will show the types at any point. This could (theoretically) > avoid many "declarations". Names get their values from expressions so the data flow analysis is the same. If you have to type-check the statement "return a" then you need to be able to know the type of both the variable and the expression. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things to be wary of: A new kid in his prime A man who knows the answers, and code that runs first time http://www.geezjan.org/humor/computers/threes.html From paul@prescod.net Tue Dec 14 04:56:32 1999 From: paul@prescod.net (Paul Prescod) Date: Mon, 13 Dec 1999 20:56:32 -0800 Subject: [Types-sig] Re: Static typing considered HARD References: <3854C3D5.D7F765A@websentric.com> <3854ED13.8093FCE7@prescod.net> <385530FF.70AABC9E@websentric.com> Message-ID: <3855CE00.805CF934@prescod.net> Stephen Purcell wrote: > >... > > Static typing works very well in Java and suchlike, but those are > different languages, and the people who cannot live without static > typing use them instead of Python (and Smalltalk). If Python had not had object orientation 8 years ago, we would now be arguing against the introduction of the "class" operator as being "un-pythonic." Anything elegant, clean and in line with the rest of the language is, in my mind, Pythonic. Since Guido encouraged us to go down this path, at least as a mind experiment, I personally will not be dissuaded based on arguments that we are going against his original intentions. As afraid as you are that we will kill Python by changing it, I fear that we will kill it by stultifying it. We are, after all, in the software industry. > I use the language because it > somehow makes me feel good. When it no longer gives me that feeling, > I'll stop using it. Static typing would have that effect. Had you read the static type checking proposal when you wrote that? 
Have you used languages with optional static typing? I want everybody's
opinions, emotional or otherwise, but I want people's informed opinions.

My proposal is basically about giving people a special,
computer-recognizable syntax for assertions. Are you against assertions?
Does changing the syntax of assertions bother you? Would it bother you to
find that your Python compiler might someday have a declaration that
would allow some assertions to be checked at compile time? Why wouldn't
you just choose not to use that declaration?

-- 
 Paul Prescod - ISOGEN Consulting Engineer
"Unwisely, Santa offered a teddy bear to James, unaware that he had been 
mauled by a grizzly earlier that year." - Timothy Burton, "James"

From paul@prescod.net Tue Dec 14 05:54:41 1999
From: paul@prescod.net (Paul Prescod)
Date: Mon, 13 Dec 1999 21:54:41 -0800
Subject: [Types-sig] Type inferencing
Message-ID: <3855DBA1.9384B6AE@prescod.net>

> > #3. The first version of the system will not define the operation of a
> > type inferencing system. For now, all type declarations would need to
> > be explicit.
> 
> I expect that this will make the system relatively heavy-weight and
> hence unpythonic. You'd be sprinkling way more type decls over your
> source code than would be necessary with a somewhat more sophisticated
> type checker.

Point taken. I am only willing to do type inferencing up to a function
level. After my "ML Experience" I am not willing to do it globally. A
method with no type declaration should be presumed to return Object, even
if it is like this:

def foo(): return "abc"

Otherwise you get the problem where changing a line of code in the middle
of a function somewhere breaks code somewhere far away:

type-check
StringType
def a(): return b()

def b(): return c()

def c():
    if something():
        return "abc"
    else:
        return 1

Under my plan, the very first function would never have been statically
checkable. So the code far away couldn't have broken it.

But I am willing to take type inference this far:

type-check
StringType
def a():
    a="abc"
    return a

This seems no harder than the type inferencing you need to do to check
the type of expressions. Probably the only reason that Java and C don't
do this is because they want to know how much space to allocate on the
stack. Of course we won't worry about that.

-- 
 Paul Prescod - ISOGEN Consulting Engineer speaking for himself
"Unwisely, Santa offered a teddy bear to James, unaware that he had been 
mauled by a grizzly earlier that year." - Timothy Burton, "James"

From paul@prescod.net Tue Dec 14 05:55:07 1999
From: paul@prescod.net (Paul Prescod)
Date: Mon, 13 Dec 1999 21:55:07 -0800
Subject: [Types-sig] List of FOO
Message-ID: <3855DBBB.6D1B462A@prescod.net>

> > #2. The first version of the system will not allow the use of types
> > that cannot be referred to as simple Python objects. In particular it
> > will not allow users to refer to things like "List of Integers" and
> > "Functions taking Integers as arguments and returning strings."
> 
> It's been said before: that's a shame. Type inference is seriously
> hindered if it doesn't have such information. (Consider a loop over
> sys.argv; I want the checker to be able to assume that the items are
> strings.)

It took two years to get the parameterized version of the Java type
system up and running.
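(Before the question below, a rough illustration of how quickly these
spellings compound. ListOf and Callable are invented names, used here
only to write the composite types down as ordinary Python objects:)

    class ListOf:
        def __init__(self, item):
            self.item = item

    class Callable:
        def __init__(self, args, result):
            self.args = args          # list of argument classifications
            self.result = result

    StringType, IntType = type(""), type(0)

    # "list of strings" is still perfectly readable:
    t1 = ListOf(StringType)

    # "callable taking a list of callables from string to int, and
    # returning a string" is already a mouthful however it is spelled:
    t2 = Callable([ListOf(Callable([StringType], IntType))], StringType)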
Let me ask this your opinion on this question (seriously, not sarcastically), should we include a spelling for "list of string" and not "callable taking list of callables taking strings returning integers returning string" and what about "callable taking list of callables taking and R returning list of callables taking and returning ." You see my problem? I could special case "list of" as Java and C did if we agreed to take our chances that my syntax would be extensible. We could even steal that weird "[]" thing that C and Java do: StringType [] foo -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself "Unwisely, Santa offered a teddy bear to James, unaware that he had been mauled by a grizzly earlier that year." - Timothy Burton, "James" From paul@prescod.net Tue Dec 14 06:22:31 1999 From: paul@prescod.net (Paul Prescod) Date: Mon, 13 Dec 1999 22:22:31 -0800 Subject: [Types-sig] IsInstance References: <3854EB4B.37EA2888@prescod.net> <199912131809.NAA19402@eric.cnri.reston.va.us> Message-ID: <3855E227.AE33907@prescod.net> Guido van Rossum wrote: > Perhaps this could be an extension of isinstance()? (That already > takes both class and type objects.) I wanted the function to return an object: myList=isinstance( foo, types.ListType ) if not myList: myDict=isinstance( foo, types.DictionaryType ) Then we can do the inferencing by looking at a single statement. Compare it to this: if isinstance( foo, types.ListType ): myList=foo elif isinstance( foo, types.DictionaryType ): myDict=foo That inferencing is just too hard. It isn't a proper cast operator anymore. If you are willing to change isinstance to return the object if it matches then I would like to use it. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things to be wary of: A new kid in his prime A man who knows the answers, and code that runs first time http://www.geezjan.org/humor/computers/threes.html From paul@prescod.net Tue Dec 14 06:41:25 1999 From: paul@prescod.net (Paul Prescod) Date: Mon, 13 Dec 1999 22:41:25 -0800 Subject: [Types-sig] Module protection Message-ID: <3855E695.B1180A86@prescod.net> > However, there are some examples of dynamic code usage that are > fishy. Examples include adding or changing globals in other modules > (except for the rare global that is intended to be a settable option), > or messing with the __builtin__ module. I am glad you agree. Actually I took out a feature that allowed you to say that a module namespace (or particular name) was constant. I'll leave that to you since it is not directly required for static typing. All I require for static typing is that you don't replace sys.exit with a function that returns a string or replace sys.version with a file object. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things to be wary of: A new kid in his prime A man who knows the answers, and code that runs first time http://www.geezjan.org/humor/computers/threes.html From paul@prescod.net Tue Dec 14 06:44:38 1999 From: paul@prescod.net (Paul Prescod) Date: Mon, 13 Dec 1999 22:44:38 -0800 Subject: [Types-sig] type-safe declaration Message-ID: <3855E756.174CE40D@prescod.net> > > #4. There must be declarations that instruct static type checking > > software to verify that a function cannot violate binding assertions. > > These are called safety declarations. > > I'm not sure what you mean here and how such declarations differ from > type assertions. And I'm worried about the "must" part. Please explain > better? 
"must" is an instruction to the specification writers (us) not to Python programmers. It means that we must provide a mechanism that would allow a programmer to say that a function is type-safe: type-safe StringType def double(a): StringType, a; return a*a Unlike Java, if you don't ask for a function to be statically type checked then it just isn't. Newbies can work without type checking until they feel it would be useful for them. My feeling that declaring a return type is just declaring a return type. It doesn't mean that you are willing to PROVE (statically) that the return type declaration will be accurate. > But I'd still like to be able to be diagnosed at compile time instead > of at runtime when my code makes a statically illegal call to a > function with a safety declaration. Under my plan, you would need a static declaration on YOUR code. I mean if your code can NEVER be right (e.g. range( "abc" ) ) then maybe a smart checker could report that. Java actually requires this of implementors. But if your code COULD be right (which is much more often the case in Python) then it should wait until runtime to check: a=callSomeUnTypedFunction() range( a ) > OK. I'm not sure everywhere whether you want compile-time or run-time > checking. Perhaps you can clarify this? Static type checking if you ask for it (with a type-check declaration) or dynamic type checking otherwise (unless you turn it off with an optimization option). -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things to be wary of: A new kid in his prime A man who knows the answers, and code that runs first time http://www.geezjan.org/humor/computers/threes.html From paul@prescod.net Tue Dec 14 06:53:04 1999 From: paul@prescod.net (Paul Prescod) Date: Mon, 13 Dec 1999 22:53:04 -0800 Subject: [Types-sig] RFC 0.1 References: <3854EB4B.37EA2888@prescod.net> <199912131809.NAA19402@eric.cnri.reston.va.us> Message-ID: <3855E950.AE0E3E19@prescod.net> Thanks for all of your feedback! It's good stuff. Guido van Rossum wrote: > > > #1. The system exists to serve the dual goals of catching errors > > earlier in the development process and improving the performance of > > Python compilers and the Python runtime. Neither goal should be > > pursued exclusively. > > Hm, these may at times be very different goals. I had a recent > private discussion about types where the two goals were referred to as > (OPT), for optimization, and (ERR), for error-detection. One > observation is that while for (OPT) you may be able to get away with > aggressive whole-program type inferencing only, In theory, but in practice "whole-program X" seems to never get implemented (in Python or elsewhere!), as in "whole program type checks" and "whole program optimization" and "whole program flow analysis." "Whole program analysis" tends to be an excuse to put off work (roughly like "type inference"). > Technically, Python assert statements are only executed in > non-optimizing mode -- "assert 0" has no effect when you happen to use > "python -O" to execute your program. But I presume that here you mean > assertions in the abstract conceptual sense. No, I was thinking of actually compiling to the same byte-codes. It isn't really "safe" to turn off type-checks at runtime but it also isn't safe to turn off assertions. They are both there to guarantee program correctness at the price of performance. But maybe we would make a different command line option to control type checking. 
> I think JPython secretly already imposes some of these restrictions > (in particular for the sys module!). Good, then programmers are warmed up. :) > > In other words, code that uses functions and classes from the module > > should not need to know whether it uses binding assertions or old > > fashioned assert statements. > > Except that some unintended uses may become illegal while before you > might just have gotten away with them. Yes and no. In the past, we didn't do many type checks because many of us were philosophically against "type" and "class" checks. We wanted capability checks. Jim Fulton (et. al.) is working on that with interfaces. So with or without static type checking we should start seeing interface assertions. We're just giving them a nicer syntax (which may, admittedly, lead to more of them). Still, I want to put the blame squarely in Jim's corner (even if I was also in that corner). > > #9. There should be a mechanism to assert that an object has a > > particular type for purposes of informing the static and dynamic type > > checkers about something that the programmer knows about the flow of > > the program. > > Beyond "assert isinstance(object, type_or_class)" ? There are two issues here. First, I avoided using existing Python "spellings" for things that are going to take on magical meanings because people will expect other logical variations to work: typeobj = callSomeRandomFunction() assert isinstance(object, typeobj) If we invent new, syntactically distinct spellings then we can syntactically recognize them and complain if they aren't spelled "exactly right" (i.e. in a statically analyzable way). > I think that this is too much of a constraint, and may be informing > your preliminary design too much. As long as an easy mechanical > transformation to valid Python 1.5.x is available, I'd be happy. Okay. I'll keep this in mind. > > Name declaration: > > A name bound at the most out-dented context of a statically > > available namespace creating suite. > > The indentation don't enter into it. Consider > > if win32: > def func(): ... # win32 specific version > else: > def func(): ... # generic version That's precisely what I'm trying to disallow. I don't know the value of win32 until runtime! The pyc could be moved from Unix to win32. And more to the point, the value win32 might be computed based on arbitrarily complex code. So that's why I said out-dented. An out-dented name binding statement cannot depend (much) on a computed value. Computed base classes are going to have to be explicitly disallowed for statically checkable classes: class foo( dosomething() ): ... > > Classification: > > Due to a shortage of synonyms for "type" that do not already have a > > meaning, we use the word "classification." > > Oh, dear. Keep looking for a better synonym! You just had to put "type" and "class" in the same language! I could redefine the term type in this context and refer to the old concept of type as I did below: > > Given a value v and a value t, v conforms to classification t if > > t is returned by type( v ) > > 4. The classification of class instance variables comes from the > > classification of the corresponding class variable. > > > > > > class foo: > > types.IntType > > a=5 > > > > types.ListType > > b=None > > > > The initialization for b denies its type declaration. Do you really > want to do this? None is a valid value for any type as with NULL in C or SQL. > This doesn't look like it should be part of the > final (Python 2.0) version -- it's just too ugly. 
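(Spelled out as code, the conformance rules from the draft -- including
the None rule just invoked -- come to something like the following.
__implements__ is the Fulton interface convention, and "object" stands in
for the draft's catch-all classification:)

    def conforms(v, t):
        # rough transcription of the draft's rules, not a proposal
        if v is None:
            return 1                      # None conforms to anything
        if t is object:
            return 1                      # the "object" classification
        if type(v) is t:
            return 1
        if getattr(v, '__class__', None) is t:
            return 1
        if t in getattr(v, '__implements__', ()):
            return 1
        return 0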
How am I going to > explain this to a newbie with no programming *nor* Python experience? With all due respect my problem is that you took the obvious (or at least traditional) instance variable declaration syntax and used it as a class variable declaring syntax. Okay, let's try this: class foo: types.IntType, a=5 def __init__( self ): types.ListType, self.b That looks equally ugly to me. Got any other ideas? On a separate track: I don't think that the whole static type system is for newbies, just as all of Python is not for newbies (think __getattr__). You shouldn't even start thinking about static typing until you are trying to "tighten up" your code for performance or safety. I don't want to use that as an excuse to make things difficult but if we are ever going to get to full polymorphic parametric static type checking we will have to acknowledge that the type system will have hard parts just as the language has hard parts. > > Classification-safe Function: > > > > a function that can be checked at compile time not to violate any > > classification constraints by assigning invalid values to any > > constrained names: > > > > Every reference to a name in a module or class (not instance!) must be > > to a declared (but perhaps not classification constrained) name. > > Explain the reason for excluding instances? Maybe I'm not very clear > on what you're proposing here. I think that that was from an earlier draft. Obviously we can't check instance variables in the same way that you check class and module namespaces but we do want to check them. The thought gives me a headache. It's my fourth year compiler class all over again. Make it stop! Maybe if I just specify it, some fourth year student will implement it as a project. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself "Unwisely, Santa offered a teddy bear to James, unaware that he had been mauled by a grizzly earlier that year." - Timothy Burton, "James" From paul@prescod.net Tue Dec 14 07:05:24 1999 From: paul@prescod.net (Paul Prescod) Date: Mon, 13 Dec 1999 23:05:24 -0800 Subject: [Types-sig] Re: RFC 0.1 References: Message-ID: <3855EC34.F78D9A99@prescod.net> "Golden, Howard" wrote: > > I think the system should also allow the author to require declarations of > all variables (e.g., via a command-line switch or pragma). I think that's a good idea for a particular implementation but I'm not going to put it in the type system specification. If I were Guido I would be unwilling to instruct every standard library package maintainer to supply all type declarations in order to please the minority who want to use Python in a manner that is as restrictive as Java. > > 1. The first version of the system will be as neutral as possible on > > the issue of what defines a "type". Fulton's capability-based > > interfaces should be legal as types but so should type objects and > > classes. > > I don't understand the ramifications of this. Might it not gut the RFC? I don't think so (yet). The main point is that we need to support "types", "classes" and the new "interfaces" > Shouldn't it be straightforward to add declarations to the existing library? Not just declarations: someone needs to actually define the set of "standard interfaces." There are probably a few weeks worth of work there and even a few weeks of work are hard to find since we all have other jobs. > > #2. The first version of the system will not allow the use of types > > that cannot be referred to as simple Python objects. 
In particular it > > will not allow users to refer to things like "List of Integers" and > > "Functions taking Integers as arguments and returning strings." > > Why? I don't think this should be prohibited, only not guaranteed. How can we allow it without defining the syntax? > > #4. The first version of the system will be syntactically compatible > > with Python 1.5.x in order to allow experimentation in the lead-up to > > an integrated system in Python 2. > > Does this mean no new syntax? (That's what it appears from your examples.) > > How about a declaration syntax, e.g., > > var x : type1, y : type2 > > Is this prohibited by the RFC? Yes, but I may change my mind on this issue based on Guido's feedback. > I'm confused about this section. Are these requirements or merely > terminology? The definitions turned into the spec. The long and short of it is that you can declare the types of variables: StringType a = "abc" and functions: StringType def a(): return "abc" and you can state that you want a function to be statically type checked: StringType def a(): return "abc" The spec is complex because I have to restrict the set of circumstances where this "works" to things that can be detected statically. I explicitly do not support stuff like this: import somefunction import a import b if somefunction.doit(): mod=a else: mod=b a.SomeType foo1 = None b.SomeType foo2 = None mod func( arg ): return a #valid or not?? -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things to be wary of: A new kid in his prime A man who knows the answers, and code that runs first time http://www.geezjan.org/humor/computers/threes.html From paul@prescod.net Tue Dec 14 07:13:35 1999 From: paul@prescod.net (Paul Prescod) Date: Mon, 13 Dec 1999 23:13:35 -0800 Subject: [Types-sig] RFC 0.1 References: Message-ID: <3855EE1F.9E8B4C1B@prescod.net> John Ehresman wrote: > > I think it might be possible to do both run-time and compile-time > checking by defining the system in terms of what happens at run time, but > allowing compile time optimizations to be made. We are almost on the same track, but are not completely in sync. Static type checking isn't just an optimization. It's also a way of making more robust code. We use the "type-safe" declaration to say that the function/class/module should never throw a TypeError (thought it might propagate one from un-typesafe code). Note that even in C++ and Java it is possible for type-safe code to be required to propagate those language's equivalent of a type error. I'm not happy with a "lint-like-tool". I want static type checking to be formally defined in the language definition as it is in other languages. If you want it, you should be able to get it, reliably and at compile time. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things to be wary of: A new kid in his prime A man who knows the answers, and code that runs first time http://www.geezjan.org/humor/computers/threes.html From gstein@lyra.org Tue Dec 14 11:08:59 1999 From: gstein@lyra.org (Greg Stein) Date: Tue, 14 Dec 1999 03:08:59 -0800 (PST) Subject: [Types-sig] RFC 0.1 In-Reply-To: <3855F2F8.DE1FED31@prescod.net> Message-ID: On Mon, 13 Dec 1999, Paul Prescod wrote: > I did evaluate your proposal but it seemed to me that it was solving a > slightly different problem. I think as we compare them we'll find that > your ideas were more oriented toward runtime safety checking. 
True, but I might posit that (due to Python's dynamic nature) you really aren't going to come up with a good compile-time system. That leaves runtime. > Greg Stein wrote: > > > > > > #2. The system must allow authors to make assertions about the sorts > > > > of values that may be bound to names. These are called binding > > > > assertions. They should behave as if an assertion statement was > > > > inserted after every assignment to that name program-wide. > > > > In our writeup, we posit that it is better (and more Pythonic) to bind the > > assertions to expressions, rather than names. This came about when we > > looked at how to supply assertions for things like: > > > > x.y = value > > x[i] = value > > x[i:j] = value > > I wouldn't supply assertions for assignments at all. You supply > assertions for the names x, y, i, and j. I know you wouldn't. I was offering a different tack (and one that seemed to work better). In Python, names have no semantics other than an identifier, a scope, and that they are a reference. We thought it would be nice to retain the notion that names are just names -- it is the objects and what you're doing with them that is important. > > Certainly, function objects would have type information associated with > > them, but I believe that is different than associating a type with the > > function's name. > > But if a function takes as its first argument an int, in what sense is > that type associated with an "expression"? It is associated with a name, > whatever the name of the first argument is. Plus consider this: I think I wasn't clear enough here. In the following statement: a = b We suggested that type checking is defined and applied to the value (b), rather than associating a type with "a" and performing an assertion at assignment time. The concept of "this variable name can only contain values of type" is a standard, classical approach. We didn't think it applied as well to Python (for a number of reasons). If you're doing type inferencing, then you are actually tracking values -- the types associated with a name are very artificial/unnecessary during type inferencing. For example: a = [1, 2] foo(a) a = {1: 2} bar(a) a = 1.2 baz(a) The above code is quite legal in Python (and no, I don't want to hear arguments that it shouldn't be :-). With a type system that is associated with expressions/values rather than names, then we can do proper type inferencing, checking, etc on the above code. The only thing in our outline that has associated type information is a function object (note: not a function name). Reflection on the function can get the information for you (obviously, only useful for runtime tools; compilers would be using syntactic markers only). > type-safe > String > def foo(): > return abc() > > How can I, at compile time, statically know the type of the value > currently contained in the name abc if I don't restrict it in advance > like this: > > String > def abc(): return "abc" You would. I never said otherwise :-) But I see it as data (type) flow: "abc happens to refer to an object, which is typed as a function returning a string", rather than saying "abc is a function returning a string." Just as objects have types ("is-a"), I think a function object should expand a bit and record the types of its params and return value(s). > Rebinding is fine, as long as it doesn't invalidate the type > declaration: > > abc = lambda: "def" So you say :-) I say rebind all you want. 
Base your assertions and type-checks on what it has at whatever lexical
point in your program. For the case of:

if condition:
    x = 1
else:
    x = "abc"

I would say that the type of "x" is a set, rather than a particular type.
If you're going to do type-checking/assertions, then any uses of "x"
better be able to accept all types in the set.

> > We also proposed extending isinstance() to allow a callable for the third
> > argument. This allows for arbitrarily complex type checking (e.g. the
> > "list of integers" problem).
> 
> I liked that idea but really didn't see how to port it to a compile time
> static type checker. I'm going out of my way to avoid running arbitrary
> Python code. Static type checking shouldn't be a security hazard.

I believe that Python is too rich in data types and composition of types
to be able to add *syntax* for all type declarations. I think you better
stop and realize that before you get in too deep :-)

In your RFC 0.1, you punted on the complex/composited data types issue to
keep the solution tractable. I posit that you will *never* solve the
problem of coming up with sufficient syntactical expression; therefore,
you will always have to resort to a procedural component in your type
system *if* you want full coverage.

>...
> > If type assertions are bound to expressions, rather than names, a data
> > flow analysis will show the types at any point. This could (theoretically)
> > avoid many "declarations".
> 
> Names get their values from expressions so the data flow analysis is the
> same.

Partially true, but as I mentioned above: names are just points in your
data flow. They are a side-effect. A name "receives" a type from the data
-- it does not "drive" the data flow. I think it is clearer to just avoid
the attaching of a type to a name and to just look at the data.

Note that one benefit of associating types with names is that you can
shortcut the data flow analysis (so the analysis is not necessarily the
same). But: you cannot have a name refer to different types of objects
(which I don't like; it reduces some of Python's polymorphic and dynamic
behavior (interfaces solve the polymorphism stuff in a typed world)).

> If you have to type-check the statement "return a" then you need to be
> able to know the type of both the variable and the expression.

[ by expression, I presume you mean the object referenced by "a". if you
mean the expression "a", well yah... but that's a degenerate case which
doesn't serve as a good example of what you're trying to say. ]

In the above case, you need to know the type of the variable *OR* the
expression (the referenced object). If you have the type of the variable,
then you simply assert that type rather than using data flow to know what
is referenced.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/

From paul@prescod.net Tue Dec 14 14:13:09 1999
From: paul@prescod.net (Paul Prescod)
Date: Tue, 14 Dec 1999 06:13:09 -0800
Subject: [Types-sig] RFC 0.1
References: 
Message-ID: <38565074.9E51515F@prescod.net>

Greg Stein wrote:
> 
> True, but I might posit that (due to Python's dynamic nature) you really
> aren't going to come up with a good compile-time system. That leaves
> runtime.

Okay, but my mandate is to come up with a static (compile-time) system. I
already have a variety of runtime tools for doing type checking.

http://www.python.org/sigs/types-sig/

> In Python, names have no semantics other than an identifier, a scope, and
> that they are a reference.
We thought it would be nice to retain the > notion that names are just names -- it is the objects and what you're > doing with them that is important. But we WANT some names to have semantics. sys.version should be an integer. sys.path should be a list of strings. __builtins__.dir should be object->list and so forth. > The above code is quite legal in Python (and no, I don't want to hear > arguments that it shouldn't be :-). I am perfectly happy to have it be legal Python code. I just don't intend for it to be *statically type checkable* Python code. No, you cannot use all of the flexibility of Python and expect to get all of the static type checking of Java. For each function you choose one or the other. > With a type system that is associated > with expressions/values rather than names, then we can do proper type > inferencing, checking, etc on the above code. That code could not be legally inferenced in any inferencing system I am familiar with. (ML and inferenced Ada) If you write a formal specification for "data flow analysis" that can be implemented by two independent compilers based on the spec then I will take this approach seriously. But my impression from my time in the scheme world is that "data flow analysis" is an unconscious code-word for "let's put this problem off and hope that someone else figures out some magic that I haven't figured out yet." If a static type checking system is hard for Python, a static type inferencing one is going to be doubly hard! > > Rebinding is fine, as long as it doesn't invalidate the type > > declaration: > > > > abc = lambda: "def" > > So you say :-) I say rebind all you want. Base your assertions and > type-checks on what it has at whatever lexical point in your program. What if the rebinding happened in some other function, class or module? > I believe that Python is too rich in data types and composition of types > to be able to add *syntax* for all type declarations. I think you better > stop and realize that before you get in too deep :-) I have a few different answers here: 1. I don't have to be able to describe every possible type. If you can't statically check that "foo is a callable from T,T to callable from T" tough bloody luck, at least for the time being. Java can't do that. Neither could mid-90's C++. And forget about it for ANSI C. Python is not the world's most OO programming language. It is just a good one. It may not have the world's most static type checker. It will just have a good one. No type system makes type errors impossible so that is not my goal. My goal is that if a module uses type checks as religiously as Java module would, that module would be roughly as type-safe. 2. Python is no richer in types than any other language with an extensible type system. This includes ML, Haskell and even Java. There is no language today without a list type or mapping type. Yes, some Python complexity comes from the fact that there are dozens of non-reflective types "built-in" but we can and should fix that. 3. Compositions of types are complex, but not infinitely complex. We have about two decades in parameterized type research to rely on. Within a year and a half, two of the world's most popular languages (C++ and Java) will have parameterized types. > In your RFC 0.1, you punted on the complex/composited data types issue too > keep the solution tractable. 
I posit that you will *never* solve the > problem of coming up with sufficient syntactical expression; therefore, > you will always have to resort to a procedural component in your type > system *if* you want full coverage. I am happy to have a runtime component. I just don't see that we need any new syntax for this runtime component. And I don't think that we should give up on a formally defined static system. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things to be wary of: A new kid in his prime A man who knows the answers, and code that runs first time http://www.geezjan.org/humor/computers/threes.html From guido@CNRI.Reston.VA.US Tue Dec 14 15:19:52 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Tue, 14 Dec 1999 10:19:52 -0500 Subject: [Types-sig] A case study Message-ID: <199912141519.KAA23476@eric.cnri.reston.va.us> Here's a long and rambling example of what I think a type inferencer could do -- without type declarations of any sort. I wrote this down while thinking about the type checker that I would like to see in IDLE. --Guido van Rossum (home page: http://www.python.org/~guido/) """Let's analyze a simple program. Here's an example script -- let's call it pyfind.py -- that prints the names of all Python files in a given directory tree.""" #---------------------------------------------------------------------- import sys, find def main(): dir = "." if sys.argv[1:]: dir = sys.argv[1] list = find.find("*.py", dir) list.sort() for name in list: print name if __name__ == "__main__": main() #---------------------------------------------------------------------- """Our task is to check whether this is a correct program. I won't define correctness rigidly here, but it has something to do with under what circumstances the program will execute to completion. At the top level, we see an import statement, a function definition, and an if statement. Analyzing the import statement, we notice that sys is a well-known standard module. The find module is in the standard library. (In Python 1.6 it will be obsolete, but it's a convenient example.) Let's have a look at find.py just to see if there's any weirdness there:""" #---------------------------------------------------------------------- import fnmatch import os _debug = 0 _prune = ['(*)'] def find(pattern, dir = os.curdir): list = [] names = os.listdir(dir) names.sort() for name in names: if name in (os.curdir, os.pardir): continue fullname = os.path.join(dir, name) if fnmatch.fnmatch(name, pattern): list.append(fullname) if os.path.isdir(fullname) and not os.path.islink(fullname): for p in _prune: if fnmatch.fnmatch(name, p): if _debug: print "skip", `fullname` break else: if _debug: print "descend into", `fullname` list = list + find(pattern, fullname) return list #---------------------------------------------------------------------- """This imports two more modules, and then defines two variables and a function. Module os is a standard library module with special status. It's written in Python, but its source code is actually pretty hairy and dynamic; we can assume that its effective behavior can be hardcoded in the analyzer somehow, or perhaps we show the analyzer an idealized version of its source code. (This is an example of one trick our analyzer can use to make its life easier. It's equivalent to the concept of a "lint library" for the Unix/C lint tool.) 
Let's look at the fnmatch source code (I've left out some doc strings):""" #---------------------------------------------------------------------- import re _cache = {} def fnmatch(name, pat): import os name = os.path.normcase(name) pat = os.path.normcase(pat) return fnmatchcase(name, pat) def fnmatchcase(name, pat): if not _cache.has_key(pat): res = translate(pat) _cache[pat] = re.compile(res) return _cache[pat].match(name) is not None def translate(pat): i, n = 0, len(pat) res = '' while i < n: c = pat[i] i = i+1 if c == '*': res = res + '.*' elif c == '?': res = res + '.' elif c == '[': j = i if j < n and pat[j] == '!': j = j+1 if j < n and pat[j] == ']': j = j+1 while j < n and pat[j] != ']': j = j+1 if j >= n: res = res + '\\[' else: stuff = pat[i:j] i = j+1 if stuff[0] == '!': stuff = '[^' + stuff[1:] + ']' elif stuff == '^'*len(stuff): stuff = '\\^' else: while stuff[0] == '^': stuff = stuff[1:] + stuff[0] stuff = '[' + stuff + ']' res = res + stuff else: res = res + re.escape(c) return res + "$" #---------------------------------------------------------------------- """This in turn imports the re module. This one is a bit too long to include here; let's assume that, like the os module, it's known as a special case to the analyzer. Just a variable initialization and three function definitions here, no other executable code. Let's go back to the top-level script. I think the analyzer can easily recognize the idiom ``if __name__ == "__main__": ...'': it can know that since this is the root of the program, __name__ is indeed equal to "__main__", so it knows that main() gets called. Now we need to analyze main() further. Here it is again, with line numbers:""" #---------------------------------------------------------------------- def main(): # 1 dir = "." # 2 if sys.argv[1:]: # 3 dir = sys.argv[1] # 4 list = find.find("*.py", dir) # 5 list.sort() # 6 for name in list: # 7 print name # 8 #---------------------------------------------------------------------- """In line 1, we see that there are no arguments. Line 2 initializes the variable dir with the constant value ".", so we know its type is a string at this point. There's one other assignment to dir, on line 4. How do we know that this is also assigning a string to dir? My reasoning as the human reader of the program is that sys.argv is initially a list of strings, so sys.argv[x] for any x either raises an exception or yields a string value. The initial type of sys.argv can be known to the analyzer. How does the analyzer know that no other code has assigned anything different to sys.argv? Exhaustive analysis can probably show that there are no assignments to sys.argv or its items (or slices) anywhere in all the modules used by the program, nor are there calls to any of the list-modifying methods. We may be able to restrict ourselves to the code that may already have run by the time we reach this statement -- but then we have to prove that there are no other calls to main(). Maybe the analyzer can be primed with special knowledge about sys.argv, e.g. that its type cannot change. Then statements that cannot be proved to keep its type the same can be flagged as errors. Of course this gets muddy in the light of aliasing -- we'd need to keep around the information that some variable might point to sys.argv. Fortunately that kind of information seems useful in general. OK, so we know that dir is a string. 
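As a contrived illustration (none of these statements occur in the program under analysis), these are the kinds of assignments such an exhaustive check would have to rule out or flag, including mutation through an alias:
#----------------------------------------------------------------------
import sys

args = sys.argv        # an alias the analyzer would have to remember
sys.argv[0] = 3.14     # item assignment: the items are no longer all strings
args.append(None)      # mutation through the alias has the same effect
sys.argv = 42          # outright rebinding: sys.argv is not even a list now
#----------------------------------------------------------------------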
Let's go back to line 3: the checker notes that sys is an imported module (the sys module) which has indeed an argv argument that is sliceable. It will also note that the expression ``1'' has the type integer which is a valid slice index. (There are no out-of-bounds exceptions for slice indices.) In line 4, we need to have another look at the expression ``sys.argv[1]''. (Note: I'm not saying that the analyzer jumps around haphazardly like this, I'm just making a case for what kinds of processes typically go on in the analyzer. In reality it probably goes at it in a much more orderly fashion.) Again, sys.argv is recognized as a list, so it can be indexed, and the expression ``1'' has the correct type. Now, indexing may cause an IndexError exception. Can the checker prove that we won't (ever!) get an IndexError at this particular line because of the test in the if statement on the previous line? I think that may be asking a bit much. But it knows that if the index is valid, the result will be a string (see above). Next, line 5. Here the analyzer knows that find is a module we imported, and that find.find is a function defined in that module. We call it with two arguments. The first is a string literal; the second is our local variable dir, which is also a string. Let's have a look at the function definition again:""" #---------------------------------------------------------------------- def find(pattern, dir = os.curdir): # 1 list = [] # 2 names = os.listdir(dir) # 3 names.sort() # 4 for name in names: # 5 if name in (os.curdir, os.pardir): # 6 continue # 7 fullname = os.path.join(dir, name) # 8 if fnmatch.fnmatch(name, pattern): # 9 list.append(fullname) # 10 if os.path.isdir(fullname) and not os.path.islink(fullname): # 11 for p in _prune: # 12 if fnmatch.fnmatch(name, p): # 13 if _debug: print "skip", `fullname` # 14 break # 15 else: # 16 if _debug: print "descend into", `fullname` # 17 list = list + find(pattern, fullname) # 18 return list # 19 #---------------------------------------------------------------------- """Indeed the function takes two arguments. It's also reassuring that the second argument has a default argument of type string (os.curdir is easily recognized as a string, using similar reasoning as for sys.argv). Now let's analyze it further. Line 2 defines a local variable list initialized to an empty list. Will it always have the type List, throughout this function? There's an assignment further down to this variable from the expression ``list + find(...)''. How can we prove that the type of that variable is List? I can show it in any of two ways, neither is very satisfactory: 1. List objects support a + operator only with a List right operand, and the result is a List. The problem with this is that the right operand might be a class instance that defines __radd__ and returns something else, so it's not a valid proof. 2. Using induction: if recursion level N returns a List, the assignment ``list = list + find(...)'' shows that recursion level N+1 has type List; recursion level 0 (no recursive calls) has return type List; so all recursion levels have type List. This is a valid proof (though I should write it down more carefully) but I'm not sure if I can assume that the analyzer is smart enough to deduce it! I'm not sure how to rescue myself out of this conundrum; perhaps there's value in John Aycock's assumption that variables typically don't change their type unless shown otherwise; then we could assume list was a List throughout. 
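To spell out the counterexample behind point 1 (the class and its name are invented purely for illustration):
#----------------------------------------------------------------------
class Joint:
    def __radd__(self, other):
        # adding a Joint to a list hands back a string, not a list
        return "surprise"

list = []                 # same spelling as in find()
list = list + Joint()     # no exception is raised, but list is now a string
print type(list)
#----------------------------------------------------------------------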
Still thin ice, but this is a common pattern. The rest is a bit simpler. Line 3 calls a known system function taking a string and returning a list of strings, or raising os.error. We remember that the dir argument is a string so this is valid. We also note that this might raise os.error. As indeed it will when we pass it a non-directory as a command line argument. The return type shows us that the local variable names is set to a list of strings. Line 4 sorts that list. The analyzer should know that this calls the built-in function cmp() pairwise on items of the list; comparing strings is fine so there's no chance of an error here. In line 5 we iterate over names, which is a list of strings, so we know name is a string. Plodding along: ``name in (os.curdir, os.pardir)'' is a valid test; the rhs of the in operator is a tuple of strings and we know that the in operator calls cmp(name, x) for each x in the tuple; again, this is fine. Line 8, ``fullname = os.path.join(dir, name)'': we can know that os.path.join is a function of 1 or more string arguments returning a string; the arguments are both strings and we now know that the local variable fullname is a string, too. Line 9 calls fnmatch.fnmatch(). I postulate that it's obvious that this takes two arguments and returns a Boolean:""" #---------------------------------------------------------------------- def fnmatch(name, pat): import os name = os.path.normcase(name) pat = os.path.normcase(pat) return fnmatchcase(name, pat) def fnmatchcase(name, pat): if not _cache.has_key(pat): res = translate(pat) _cache[pat] = re.compile(res) return _cache[pat].match(name) is not None #---------------------------------------------------------------------- """I leave it as an exercise that the argument types (strings again) are correct and that no other errors can occur. Line 10 calls list.append(fullname). We know that list is a List object and that fullname is a string. We should also know the effect of a list's append method; the call is correct (it takes one argument of arbitrary type). What do we now know about the type of the list variable? It was initialized to an empty List. It's still a List, and we know that at least some of its items are strings. Are all its items strings? This gets us into similar issues as the recursive call to find() before, and just as there, I'm not sure that we really know, so maybe we need to continue the single type hypothesis. (One way out would be to assume the single type hypothesis until we see positive proof to the contrary, and if so, redo the analysis with a less restricted type.) I'll leave the rest of this function as an exercise; no new principles are employed. Note that _prune is a global variable initialized to a list of strings, and, with John, we'll assume that that is its final type; this makes everything work. This function is recursive. Could we prove that the recursion will terminate? Probably not; it would require knowing filesystem properties. Note that if it weren't for the os.path.islink() test, it would be possible to create a structure in the filesystem that would cause infinite recursion here! So our analyzer might flag this function as questionably recursive. We can now finish the analysis of our original main function:""" #---------------------------------------------------------------------- def main(): # 1 dir = "."
# 2 if sys.argv[1:]: # 3 dir = sys.argv[1] # 4 list = find.find("*.py", dir) # 5 list.sort() # 6 for name in list: # 7 print name # 8 #---------------------------------------------------------------------- """We know that find.find() returns a list of strings, so this is the type of the list variable. We already talked about sorting a list of strings. I can probably prove that sorting here is redundant, given the way find() sorts its list of names, but that will be hard for the analyzer, so I doubt that it will find this subtle optimization tip. The final for loop and print statement have no further problems; we know that list is a List, which is a sequence, so a for loop can iterate over it; the print statement calls str() on each of its arguments and this function is always safe on strings (as it does for most types, except instances or extension types that raise exception in their __str__ implementation).""" From paul@prescod.net Tue Dec 14 15:20:54 1999 From: paul@prescod.net (Paul Prescod) Date: Tue, 14 Dec 1999 07:20:54 -0800 Subject: [Types-sig] Avoiding innovation Message-ID: <38566056.70679872@prescod.net> In response to Greg's message I want to add a design goal: #11. Wherever possible the system should try to build upon existing implemented type systems and research rather than being designed from scratch for Python. It will build much more closely on dynamic language type annotation systems such as those in Smalltalk, Common Lisp, Dylan and Visual Basic. Java and C++ are of secondary interest as models. --- Python is just another syntax and virtual machine for the lambda calculus. It obeys the same mathematical laws as other programming languages. I think it would be a mistake to throw out everything that we know about type systems and implement something idiosyncratic. Python IS a classical dynamic object/procedural programming language. It is not a research language and I dislike attempts to put in untested new ideas, especially in the area of type checks. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things to be wary of: A new kid in his prime A man who knows the answers, and code that runs first time http://www.geezjan.org/humor/computers/threes.html From guido@CNRI.Reston.VA.US Tue Dec 14 15:37:11 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Tue, 14 Dec 1999 10:37:11 -0500 Subject: [Types-sig] RFC 0.1 In-Reply-To: Your message of "Tue, 14 Dec 1999 03:08:59 PST." References: Message-ID: <199912141537.KAA23487@eric.cnri.reston.va.us> [Greg Stein] > Note that one benefit of associating types with names, is that you can > shortcut the data flow analysis (so the analysis is not necessarily the > same). But: you cannot have a name refer to different types of objects > (which I don't like; it reduces some of Python's polymorphic and dynamic > behavior (interfaces solve the polymorphism stuff in a typed world)). This is a bogus argument. From the point of view of human readability, I find this: s = "the quick brown fox" s = string.split(s) del s[1] # the fox is getting old s = string.join(s) less readable and more confusing than this: s = "the quick brown fox" w = string.split(s) del w[1] # the fox is getting old s = string.join(w) The first version gives polymorphism a bad name; it's like a sloppy physicist using the same symbol for velocity and accelleration. The polymorphism that is worth having deals with function arguments and containers and the like. 
For example: def sum(l, zero): s = zero for x in l: s = s + x return s Here the type of l is sequence of <T> and the type of zero is <T>; the only implied requirement for <T> is that <T> + <T> returns another <T>. The fact that this works just as well for lists of ints, floats, strings, or even matrixes, given the appropriate zero, is valuable polymorphism. Other languages can only do this using parametrized types; they get more type checking, but at a terrible cost. Note that a type inferencer may not be able to deduce the rules I stated above, since you could construct an example where there is no single type and yet the whole thing works. E.g. I could create a list [1, 2, 3, joint, "a", "b", "c"] where joint is an instance of a class that when added to an int returns a string. However if we had a typesystem and notation that couldn't express this easily but that could express the stricter rules, I bet that no-one would mind adding the stricter type declarations to the code, since those rules most likely express the programmer's intent better. --Guido van Rossum (home page: http://www.python.org/~guido/) From paul@prescod.net Tue Dec 14 15:44:30 1999 From: paul@prescod.net (Paul Prescod) Date: Tue, 14 Dec 1999 07:44:30 -0800 Subject: [Types-sig] Sorry! Message-ID: <385665DE.9963174B@prescod.net> ...for spamming you guys all night. I want to make sure that everybody's concerns get addressed. I'll slow down tonight. One issue that Greg raised was the difficulty of checking builtin types (in addition to the hairy parameterized types stuff). I've been thinking about this and I think that the doc-sig and the types-sig have the same problem. How do we sniff out the parameters and docstrings for methods without running dangerous binary code? I think that Java (and many other languages) has the right plan with "shadow libraries." The CORBA guys already use IDL as a static library syntax. I think that we should support both IDL and a strongly-typed Pythonic syntax. It might work something like this: def Int foo(a, b): "Foo, defined in module" pass def String foo(c, d, *args ): "Foo, defined in module" pass import _foo locals().update( _foo.__dict__ ) Maybe we would have some kind of a keyword instead of the locals() hack. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things to be wary of: A new kid in his prime A man who knows the answers, and code that runs first time http://www.geezjan.org/humor/computers/threes.html From guido@CNRI.Reston.VA.US Tue Dec 14 15:58:44 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Tue, 14 Dec 1999 10:58:44 -0500 Subject: [Types-sig] Re: RFC 0.1 In-Reply-To: Your message of "Mon, 13 Dec 1999 23:05:24 PST." <3855EC34.F78D9A99@prescod.net> References: <3855EC34.F78D9A99@prescod.net> Message-ID: <199912141558.KAA23531@eric.cnri.reston.va.us> [Paul Prescod] > If I were Guido I would be unwilling to instruct every standard > library package maintainer to supply all type declarations in order > to please the minority who want to use Python in a manner that is as > restrictive as Java. Don't assume that! I think that for standard library modules (either in Python or in C), having the types can be a great boon -- it acts as documentation, guidelines for future API evolution, etc. Well worth having. Probably will catch some bugs in contributed code too! :-) > Not just declarations: someone needs to actually define the set of > "standard interfaces."
There are probably a few weeks worth of work > there and even a few weeks of work are hard to find since we all have > other jobs. This can be done incrementally, like the documentation got done. > > How about a declaration syntax, e.g., > > > > var x : type1, y : type2 > > > > Is this prohibited by the RFC? > > Yes, but I may change my mind on this issue based on Guido's feedback. Feedback: I think adding type declarations is too important to be crippled by a "no new keywords, no new syntax" rule. --Guido van Rossum (home page: http://www.python.org/~guido/) From tismer@appliedbiometrics.com Tue Dec 14 15:57:40 1999 From: tismer@appliedbiometrics.com (Christian Tismer) Date: Tue, 14 Dec 1999 16:57:40 +0100 Subject: [Types-sig] RFC 0.1 References: Message-ID: <385668F4.2340C4B2@appliedbiometrics.com> Greg Stein wrote: > > On Mon, 13 Dec 1999, Paul Prescod wrote: > > I did evaluate your proposal but it seemed to me that it was solving a > > slightly different problem. I think as we compare them we'll find that > > your ideas were more oriented toward runtime safety checking. > > True, but I might posit that (due to Python's dynamic nature) you really > aren't going to come up with a good compile-time system. That leaves > runtime. This is very true. Are you completely moving away from type declaration, or do you still propose an expr ! typeid notation? ... > a = b > > We suggested that type checking is defined and applied to the value (b), > rather than associating a type with "a" and performing an assertion at > assignment time. The concept of "this variable name can only contain > values of type" is a standard, classical approach. We didn't think > it applied as well to Python (for a number of reasons). If you're doing > type inferencing, then you are actually tracking values -- the types > associated with a name are very artificial/unnecessary during type > inferencing. For example: > > a = [1, 2] > foo(a) > a = {1: 2} > bar(a) > a = 1.2 > baz(a) One could have both behaviors at the same time, I think. Type restriction would be a property of the involved namespace. The namespace responsible for the assignment could be an extended dictionary object with the desired rules defined. ... > For the case of: > > if condition: > x = 1 > else: > x = "abc" > > I would say that the type of "x" is a set, rather than a particular type. > If you're going to do type-checking/assertions, then any uses of "x" > better be able to accept all types in the set. Allow me a question about types: Where are the limits between types, values, and properties of values? Assume a function which returns either [1, 2, 3] or the number 42. We now know that we either get a list or an integer. But in this case, we also know that we get a list of three integer elements which are known constants, or we get the integer 42 which is even, for instance. So what is 'type', how abstract or concrete should it be, where is the cut? > I believe that Python is too rich in data types and composition of types > to be able to add *syntax* for all type declarations. At the same time, Python is so rich from self-inspection that writing a dynamic type inference machine seems practicable, so how about not declaring types, but asking your code about its type? I could imagine two concepts working together: Having optional interfaces, which is a different issue and looks fine (Jim's 0.1.1 implementation). Having dynamic type inference, which is implemented by cached type info at runtime. 
(I hope this idea isn't too simple minded) Assume for instance the string module, implemented in Python. It would have an interface which defines what goes in and out of its functions. At "compile" time of string.py, type inference can partially take place already when the code objects are created. The interface part creates restrictions on argument values, which can be used for further inference. It can also be deduced whether the return values already obey the interface or if deduction for imported functions is necessary. This info is saved in some cache with the compilation. Changes to the module object simply break the cache. When I dynamically redefine a piece of the module where it depends of (say I assign something new to "_lower"), then the analysis must be carried out again, recursively invalidating other cached info as necessary. Well, this is an example where I think the restriction to type checking of expressions still applies, but something more is needed to trigger this check early. The involved namespace object is the string module's __dict__, which should know that it is referenced by this expression: def lower(s): res = '' for c in s: res = res + _lower[ord(c)] return res And by assignment to the name "_lower" in this case, it could invalidate code object lower's type cache. lower can no more assure that it will return a string result and will trigger its interface object to re-check consistency. The latter will raise an interface_error if the rule doesn't match. It remains an open question for me how deeply possible values should be checkable, i.e. "this arg has to be a list which is not empty". Minor point, maybe. Did I make some sense, or am I off the track? - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From paul@prescod.net Tue Dec 14 16:07:12 1999 From: paul@prescod.net (Paul Prescod) Date: Tue, 14 Dec 1999 08:07:12 -0800 Subject: [Types-sig] Re: Inferencing: A case study References: <199912141519.KAA23476@eric.cnri.reston.va.us> Message-ID: <38566B30.36608D4E@prescod.net> Guido van Rossum wrote: > > Here's a long and rambling example of what I think a type inferencer > could do -- without type declarations of any sort. I wrote this down > while thinking about the type checker that I would like to see in > IDLE. Okay, but let me ask this: if TOTAL Java-level type safety ONLY required type declarations for all "non-local" variables (including functions and instance variables) would that be acceptable to you? Your inferencer heuristics are fine for an interactive GUI environment where failure is merely an inconvenience but if we are going to have a formally checkable notion of "this is statically type-safe" and "this is not" then I worry about the "non-local breakage" problem. Oops, did changing that variable to an "int" break your module way over there? I spoke to the Journal of Functional Programmers at a conference recently. I asked him about why ML's type inferencer made the language so hard to use. He said: "oh, you should always put the type declarations in. The type inferencer is mostly just an educational tool." Of course that's not what the type inferencer was SUPPOSED to be, but I think that that's what it has become. 
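To make the non-local breakage worry concrete, here is an invented example (neither the module nor the function exists anywhere; the names mean nothing):

# settings.py (hypothetical), version 1
def get_port():
    return 8080                # inferred: returns an integer

# ...far away, perhaps in another module, with no annotations anywhere:
port = get_port() + 1          # fine as long as get_port() yields an int
print port

# If the author later changes get_port() to ``return "8080"'', a global
# inferencer reports the failure here, in the caller, far from the line
# that actually changed.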
"Global" type inferencing scares me and I think that it has the unintended consequence of making the static type checker (and thus the language) harder to understand. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things to be wary of: A new kid in his prime A man who knows the answers, and code that runs first time http://www.geezjan.org/humor/computers/threes.html From Edward Welbourne Tue Dec 14 16:18:19 1999 From: Edward Welbourne (Edward Welbourne) Date: Tue, 14 Dec 1999 16:18:19 +0000 Subject: [Types-sig] Re: Static typing considered ... UGLY Message-ID: > too subjective OK, fair enough: why *do* I find it ugly ? Substantially, prejudice and paranoia. However ... If I implement a datatype (probably as a class) whose objects (instances) behave *just the same as* integers in all pythonic respects, I demand to be able to use it everywhere that I am allowed to use an integer. If static typing breaks that, it's right out. If the way you're doing static typing is based on `what interfaces does this object support' questions instead of `type' (and I realise your deliberate vaguery on what you mean by `kind' of value may allow this), then I'm much less concerned, though I have deep reservations about changing the syntax of python in order to provide syntactic sugar for stuff that can, at present, be done using assertions. Furthermore, boring though it may be to begin a function with as many assertions as arguments, the assert mechanism leaves ample scope for the programmer to identify just exactly what it is the programmer wanted to say as the constraint on the integer (not only is it an integer, and non-negative, but *it's even*, say): and this without having to invent new and fascinating syntactic forms to express it. It's all very well to say that `existing python code will be unaffected' but if existing python programmers come across > import frozen > > frozen > def foo( a ): > return string.replace(a,"b") we're not going to be happy with being expected to understand that foo is now a name we can't modify. The existing semantics of evaluating an expression (which is how I'm reading `frozen' the second time it appears) are that the expression is evaluated and thrown away and doing so hasn't changed the semantics of how the interpreter modifies namespaces thereafter. The fact that the last-executed expression yielded (and discarded) a type object should *not* have any impact on the meaning of the code following. And existing python programmers might sensibly write something like: try: types.MagicMethodType # Check we're using python 2.0 version = '2' except AttributeError: # Cope if we're not version = '1' and be unhappy about the typerror because '2' isn't a magic method. Indeed, if any of the 1.x chain have added values to types, the above code may appear awful close to verbatim in reality ... On the other hand, if you want an object whose attributes are of pre-decided kinds, or a namespace in which certain names are reserved for certain values, use a setattr hack (or, if you're feeling very brave, some variant on the wrapper defined by URL: http://www.chaos.org.uk/~eddy/dev/toy/class.py). Likewise, if you want a namespace (the module in which your code above appeared) which can be initialised `in the usual way' but which (except with severe hassle which should alert folk to the folly of doing so) can't be modified after initialisation, use an initspace ... 
see .../~eddy/dev/toy/object.html and, in the same directory, python.py > if Python is never allowed to make major changes ... I'm not suggesting `no change' - only `not in that direction'. And even type-checking can get past my prejudices if it's approached gently ... ... I've now read the Greg/Fred/Sjoerd attack and I like that: let ! be a new binary operator with grammar anyvalue ! typechecker the value of the expression being that of the given value, but evaluating it'll raise an exception if the typechecker didn't like the value. Now that's a much nicer way to go. Of course, this effectively just amounts to implementing ! as an in-expression assert mechanism ... and I'm not entirely sure how it helps the compiler-writer - is that why you insist on the typechecker being a dotted name, not an arbitrary expression ? Type-checking applies to values ;^) Of course, obstreperous as I am, I immediately want to meddle with the scheme: specifically, though the *default* behaviour might be (in effect) if not isinstance(value, typechecker): raise TypeError else: yield value I'd argue for the semantics to say: evaluate the expressions `value' and `typechecker', look for a __check__ method on the latter: if present invoke it on the value, else use isinstance as above; on false return (no problem) the !-expression yields the given `value', otherwise TypeError with parameter the true value returned. Then we can implement weird and devious __check__ methods for fiddly type-checks (instead of needing to change isinstance in the way proposed - which would conflict with my pet tweak to that, which allows isinstance(value, thistype, thattype, othertype) for when I've got several types I'll accept). Please Greg/Fred/Sjoerd, can you write a proposal which starts where http://www.foretec.com/python/workshops/1998-11/greg-type-ideas.html ends (reading the first 9/10 of that was ... illuminating in hindsight, but off-putting on the way there). It looks pretty promising ... Eddy. From guido@CNRI.Reston.VA.US Tue Dec 14 16:33:17 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Tue, 14 Dec 1999 11:33:17 -0500 Subject: [Types-sig] RFC 0.1 In-Reply-To: Your message of "Mon, 13 Dec 1999 22:53:04 PST." <3855E950.AE0E3E19@prescod.net> References: <3854EB4B.37EA2888@prescod.net> <199912131809.NAA19402@eric.cnri.reston.va.us> <3855E950.AE0E3E19@prescod.net> Message-ID: <199912141633.LAA23558@eric.cnri.reston.va.us> [Paul Prescod again] > In theory, but in practice "whole-program X" seems to never get > implemented (in Python or elsewhere!), as in "whole program type checks" > and "whole program optimization" and "whole program flow analysis." > "Whole program analysis" tends to be an excuse to put off work (roughly > like "type inference"). I actually hope that I can use some of the small change I got from DARPA for CP4E to do this rather than putting it off, but I hear your warning -- and I agree it's a major project. > No, I was thinking of actually compiling to the same byte-codes. It > isn't really "safe" to turn off type-checks at runtime but it also isn't > safe to turn off assertions. They are both there to guarantee program > correctness at the price of performance. But maybe we would make a > different command line option to control type checking. Hm, this is strange -- most of the time you seem to be firmly in the compile-time-checks camp, but here you seem to want run-time checks. I say we already have run-time checks, they just come a little later. 
(If we didn't have runtime checks, an expression like 1+"" would dump core rather than raising a TypeError exception.) If it's (OPT) we're after, adding run-time checks can never obtain your goal. If it's (ERR) we're after, well, *maybe* adding some run-time checks can produce clearer error messages than some of the existing ones, but this doesn't really do anything for my confidence that my program is correct -- if there's a type error in my except clause, what good does it do me to get a type-check error at run time? > > The indentation don't enter into it. Consider > > > > if win32: > > def func(): ... # win32 specific version > > else: > > def func(): ... # generic version > > That's precisely what I'm trying to disallow. I don't know the value of > win32 until runtime! The pyc could be moved from Unix to win32. Most people interested in (OPT) would gladly trade in platform independence for speed. > And more > to the point, the value win32 might be computed based on arbitrarily > complex code. But typically, it isn't. > So that's why I said out-dented. An out-dented name > binding statement cannot depend (much) on a computed value. Computed > base classes are going to have to be explicitly disallowed for > statically checkable classes: > > class foo( dosomething() ): > ... There's an alternative. You could do some analysis on both variants of func() and derive a union for its interface (arguments & return type). If that union is really weird, a static checker might even warn the user that the two versions of func() don't behave the same way! (E.g. if on win32, func() takes more or different arguments or returns a different type, it's hard to write the code that *uses* func() portably, so something is probably wrong in the design.) > > > Classification: > > > Due to a shortage of synonyms for "type" that do not already have a > > > meaning, we use the word "classification." > > > > Oh, dear. Keep looking for a better synonym! > > You just had to put "type" and "class" in the same language! Blame C++ or Java, both of which have separate concepts of type and class. I'll admit that the type() function is pretty bogus -- perhaps it should be matched to isinstance(), which takes either a type object or a class as its second argument. Perhaps it's not too late to use the word type for the concept you need? (We can distinguish by using "type object" to refer to the old concept where we need it.) > I could > redefine the term type in this context and refer to the old concept of > type as I did below: Aha. Proof that I didn't read ahead when I wrote that previous paragraph. :-) > > The initialization for b denies its type declaration. Do you really > > want to do this? > > None is a valid value for any type as with NULL in C or SQL. No. In C, NULL is not a valid integer (at least not conceptually -- it's a pointer). I hate the fact that in Java, NULL is always a valid string, because strings happen to be objects, and so I always run into run-time errors dereferencing NULL. I'd like to be able to declare the possibility that a particular value is None separate from its type -- this feels much more natural and powerful to me. > > This doesn't look like it should be part of the > > final (Python 2.0) version -- it's just too ugly. How am I going to > > explain this to a newbie with no programming *nor* Python experience? 
> > With all due respect my problem is that you took the obvious (or at > least traditional) instance variable declaration syntax and used it as a > class variable declaring syntax. Okay, let's try this: > > class foo: > types.IntType, a=5 > > def __init__( self ): > types.ListType, self.b > > That looks equally ugly to me. Got any other ideas? There have been plenty of suggestions, from int a=5 via a:int = 5 to a!int = 5 and even a = 5!int... > On a separate track: I don't think that the whole static type system is > for newbies, just as all of Python is not for newbies (think > __getattr__). You shouldn't even start thinking about static typing > until you are trying to "tighten up" your code for performance or > safety. I don't want to use that as an excuse to make things difficult > but if we are ever going to get to full polymorphic parametric static > type checking we will have to acknowledge that the type system will have > hard parts just as the language has hard parts. Yes, fair enough. > > Explain the reason for excluding instances? Maybe I'm not very clear > > on what you're proposing here. > > I think that that was from an earlier draft. Obviously we can't check > instance variables in the same way that you check class and module > namespaces but we do want to check them. The thought gives me a > headache. It's my fourth year compiler class all over again. Make it > stop! The hard part is keeping which variables (and arguments, etc.) can contain instances of a given class; if we have that we can track instance variable assignments. A simple rule (which I may just implement in a "stricter Python" for use in early CP4E classes, just like the TeachScheme project starts teaching with a Scheme variant that has only 6 constructs) would be that class instance variables can only be assigned to via self. We can then statically analyze the methods comprising the class body, ignoring all dynamicism allowed in more advanced versions of the language, and deduce a set of instance variable names. The implementation can then be told about this set and disallow setting others (except by derived classes, which are dealt with separately). I'm hoping that this idea can somehow be extended to full Python -- maybe I'm naive? > Maybe if I just specify it, some fourth year student will implement it > as a project. Is John Aycock listening? :-) --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@CNRI.Reston.VA.US Tue Dec 14 16:42:08 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Tue, 14 Dec 1999 11:42:08 -0500 Subject: [Types-sig] type-safe declaration In-Reply-To: Your message of "Mon, 13 Dec 1999 22:44:38 PST." <3855E756.174CE40D@prescod.net> References: <3855E756.174CE40D@prescod.net> Message-ID: <199912141642.LAA23570@eric.cnri.reston.va.us> [still Paul Prescod] > Under my plan, you would need a static declaration on YOUR code. I mean > if your code can NEVER be right (e.g. range( "abc" ) ) then maybe a > smart checker could report that. Java actually requires this of > implementors. But if your code COULD be right (which is much more often > the case in Python) then it should wait until runtime to check: > > a=callSomeUnTypedFunction() > range( a ) If the type checker can prove that callSomeUnTypedFunction() can return non-integer types as well as integers I think I'd be happy to get a warning here (as long as we're in lint mode). 
It's much more likely that the programmer didn't realize this possibility, than that she somehow had tweaked the environment or the arguments so that callSomeUnTypedFunction() would never return a non-int at this particular call site, or that she would be catching the resulting TypeError later. Aside: I also believe that a static typechecker can easily know 99% of all try-except statements that are currently on the call stack. Try-except statements with a variable (that isn't a simple alias) in the exception name slot are extremely rare, in my experience. Of course a lint-style checker should also warn about (1) all unqualified except clauses, and (2) "wide" try clauses -- that is, try clauses around lots of code that could raise the exception that is being caught. Bot of these are caused by sloppy coding much more frequently than they are a necessity in the program. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@CNRI.Reston.VA.US Tue Dec 14 16:45:38 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Tue, 14 Dec 1999 11:45:38 -0500 Subject: [Types-sig] IsInstance In-Reply-To: Your message of "Mon, 13 Dec 1999 22:22:31 PST." <3855E227.AE33907@prescod.net> References: <3854EB4B.37EA2888@prescod.net> <199912131809.NAA19402@eric.cnri.reston.va.us> <3855E227.AE33907@prescod.net> Message-ID: <199912141645.LAA23582@eric.cnri.reston.va.us> > From: Paul Prescod > I wanted the function to return an object: > > myList=isinstance( foo, types.ListType ) > if not myList: > myDict=isinstance( foo, types.DictionaryType ) Good feature idea, but abusing isinstance() is a bad name. In C++ I believe this is called a dynamic cast. Long ago I learned to define virtual functions that would return either an X, if the object was an X, or a null pointer. Besides, the "if not myList" test could fail if foo happened to be an empty list. > Then we can do the inferencing by looking at a single statement. Compare > it to this: > > if isinstance( foo, types.ListType ): > myList=foo > elif isinstance( foo, types.DictionaryType ): > myDict=foo > > That inferencing is just too hard. Are you sure? > It isn't a proper cast operator > anymore. If you are willing to change isinstance to return the object if > it matches then I would like to use it. No, call it something else. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@CNRI.Reston.VA.US Tue Dec 14 16:49:50 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Tue, 14 Dec 1999 11:49:50 -0500 Subject: [Types-sig] List of FOO In-Reply-To: Your message of "Mon, 13 Dec 1999 21:55:07 PST." <3855DBBB.6D1B462A@prescod.net> References: <3855DBBB.6D1B462A@prescod.net> Message-ID: <199912141649.LAA23593@eric.cnri.reston.va.us> > From: Paul Prescod > > > #2. The first version of the system will not allow the use of types > > > that cannot be referred to as simple Python objects. In particular it > > > will not allow users to refer to things like "List of Integers" and > > > "Functions taking Integers as arguments and returning strings." > > > > It's been said before: that's a shame. Type inference is seriously > > hindered if it doesn't have such information. (Consider a loop over > > sys.argv; I want the checker to be able to assume that the items are > > strings.) > > It took two years to get the parameterized version of the Java type > system up and running. Probably because Java was initially conceived as a language with a "classic" type system (like C or Pascal). Python on the other hand already has all this. 
> Let me ask your opinion on this question > (seriously, not sarcastically), should we include a spelling for "list > of string" and not "callable taking list of callables taking strings > returning integers returning string" and what about "callable taking > list of callables taking <T> and R returning list of callables taking > <T> and returning <R>." You see my problem? I could special case "list > of" as Java and C did if we agreed to take our chances that my syntax > would be extensible. We could even steal that weird "[]" thing that C > and Java do: > > StringType [] foo If we could express all those the type checker could do a much better job. If we could at least do the ones without the <T> notation, we'd still be doing a good job. Stopping at "list" is useless. (I'm guessing your use of "R" instead of "<R>" once is a typo and not something deep I've missed?) --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@CNRI.Reston.VA.US Tue Dec 14 16:54:16 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Tue, 14 Dec 1999 11:54:16 -0500 Subject: [Types-sig] Type inferencing In-Reply-To: Your message of "Mon, 13 Dec 1999 21:54:41 PST." <3855DBA1.9384B6AE@prescod.net> References: <3855DBA1.9384B6AE@prescod.net> Message-ID: <199912141654.LAA23612@eric.cnri.reston.va.us> > From: Paul Prescod > > Point taken. I am only willing to do type inferencing up to a function > level. After my "ML Experience" I am not willing to do it globally. [example snipped] I'm disappointed. Jim Hugunin did global analysis on the pystone.py module -- 250 lines containing 14 functions and one class with two methods. (He may actually have left out the class, but I'm pretty sure he did everything else.) He got a 1000x speedup, which I think should be a pretty good motivator for those interested in (OPT). --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@CNRI.Reston.VA.US Tue Dec 14 16:59:27 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Tue, 14 Dec 1999 11:59:27 -0500 Subject: [Types-sig] Plea for help. In-Reply-To: Your message of "Mon, 13 Dec 1999 20:39:36 PST." <3855CA08.4EA1BF11@prescod.net> References: <3855CA08.4EA1BF11@prescod.net> Message-ID: <199912141659.LAA23638@eric.cnri.reston.va.us> > Is there currently any path from high level parse trees to bytecodes? > E.g. is there a way to get sane parse trees to "render" themselves as, > er, insane parse trees? I don't think so but I'm just checking to avoid > extra work. The parser module lets you construct a parse tree and then compile it. The parse tree must be correct before this is allowed. Check out the compileast() function on http://www.python.org/doc/current/lib/Converting_ASTs.html --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@CNRI.Reston.VA.US Tue Dec 14 17:36:44 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Tue, 14 Dec 1999 12:36:44 -0500 Subject: [Types-sig] Avoiding innovation In-Reply-To: Your message of "Tue, 14 Dec 1999 07:20:54 PST." <38566056.70679872@prescod.net> References: <38566056.70679872@prescod.net> Message-ID: <199912141736.MAA23833@eric.cnri.reston.va.us> > In response to Greg's message I want to add a design goal: > > #11. Wherever possible the system should try to build upon existing > implemented type systems and research rather than being designed from > scratch for Python. It will build much more closely on dynamic > language type annotation systems such as those in Smalltalk, Common > Lisp, Dylan and Visual Basic.
Java and C++ are of secondary interest > as models. > > --- > > Python is just another syntax and virtual machine for the lambda > calculus. It obeys the same mathematical laws as other programming > languages. I think it would be a mistake to throw out everything that we > know about type systems and implement something idiosyncratic. > > Python IS a classical dynamic object/procedural programming language. It > is not a research language and I dislike attempts to put in untested new > ideas, especially in the area of type checks. I like this. I have almost always tried to avoid invention for the rest of Python, and some of the few bits of invention are some of my least favorite Python features. I also often think of Python as a particularly dynamic *implementation* of a fairly conventional type system. --Guido van Rossum (home page: http://www.python.org/~guido/) From m.faassen@vet.uu.nl Tue Dec 14 18:07:47 1999 From: m.faassen@vet.uu.nl (Martijn Faassen) Date: Tue, 14 Dec 1999 19:07:47 +0100 Subject: [Types-sig] RFC 0.1 References: <3854EB4B.37EA2888@prescod.net> <199912131809.NAA19402@eric.cnri.reston.va.us> <3855E950.AE0E3E19@prescod.net> Message-ID: <38568773.218B3176@vet.uu.nl> Paul Prescod wrote: [vast snip] > With all due respect my problem is that you took the obvious (or at > least traditional) instance variable declaration syntax and used it as a > class variable declaring syntax. Okay, let's try this: > > class foo: > types.IntType, a=5 > > def __init__( self ): > types.ListType, self.b > > That looks equally ugly to me. Got any other ideas? Let's ignore the syntax issue for now, please? Let's just put the type info in Python lists/dictionaries/etc. Those may look horribly ugly, but they're *there* for use, you can do fancy generic type construction in them if you want to, you can easily whip up a structure for that, and Python can already use them right away! Later on once we've got the horribly ugly system going we can think about syntax. Syntax will be clearer once we've got the semantics going, anyway. Regards, Martijn From GoldenH@littoncorp.com Tue Dec 14 18:23:13 1999 From: GoldenH@littoncorp.com (Golden, Howard) Date: Tue, 14 Dec 1999 10:23:13 -0800 Subject: [Types-sig] Pascal style declarations Message-ID: Since Guido hasn't had a coronary in response to my earlier suggestion, I will be more specific: 1. I propose _optional_ typing, using the Pascal syntax (since this seems to me to be the most "Pythonic" (Isn't that like giving a snake an enema? Sorry.). Actually, I don't care about the specific syntax, just as long as there is one. 2. Specifically, you can declare a variable using the syntax: var x : int, y : string, ... 3. In functions and methods, you can _optionally_ specify the argument type: def funx(x : int, y : string): ... 4. If you use these, then you are making binding assertions about the types of the names, and these assertions can be checked at compile or run time. 5. The parser could be made to strip out these declarations, and ignore them, in which case they would have no effect. 6. The parser should be modified so you can tell it (using a compile-time switch or pragma) to require declarations. 7. It appears to me that this would not change existing code, except if it uses the name "var". 8. I think there should be a parameterized type mechanism. I don't much like the angle bracket notation of C++, but I guess it's well established, so it'll do. In my opinion, this doesn't "muck up" the language (since you don't have to use it). --- Howard B. 
Golden Software developer Litton Industries, Inc. Woodland Hills, California From m.faassen@vet.uu.nl Tue Dec 14 18:27:38 1999 From: m.faassen@vet.uu.nl (Martijn Faassen) Date: Tue, 14 Dec 1999 19:27:38 +0100 Subject: [Types-sig] RFC 0.1 References: <38565074.9E51515F@prescod.net> Message-ID: <38568C1A.D1BCF530@vet.uu.nl> Paul Prescod wrote: > Greg Stein wrote: [assigns objects of various types to the same name and wants this to remain legal Python code] > I am perfectly happy to have it be legal Python code. I just don't > intend for it to be *statically type checkable* Python code. No, you > cannot use all of the flexibility of Python and expect to get all of the > static type checking of Java. For each function you choose one or the > other. I agree with this, which is I am advocating a strong split (for simplicity) of fully-statically checked code and normal python code. Later on you can work on blurring the interface between the two. First *fully* type annotated functions (classes, modules, what you want), which can only refer to other things that are fully annotated. By 'fully annotated' I mean all names have a type. I keep disagreeing with Paul's simplification of initially throwing out constructed types such as list of integer, as that would break my own approach at simplicity. :) [snip] > > I believe that Python is too rich in data types and composition of types > > to be able to add *syntax* for all type declarations. I think you better > > stop and realize that before you get in too deep :-) > > I have a few different answers here: > > 1. I don't have to be able to describe every possible type. If you > can't statically check that "foo is a callable from T,T to callable from > T" tough bloody luck, at least for the time being. Java can't do that. > Neither could mid-90's C++. And forget about it for ANSI C. > > Python is not the world's most OO programming language. It is just a > good one. It may not have the world's most static type checker. It will > just have a good one. No type system makes type errors impossible so > that is not my goal. My goal is that if a module uses type checks as > religiously as Java module would, that module would be roughly as > type-safe. If we throw out the syntax issue and use Python constructs for types until we know more, we'll all be happier, right? :) The syntax will be clear when the semantics is. Guido is good at syntax, let him figure out a good syntax for it, let's just focus on the semantics. Our static type checker/compiler can use the Python type constructions directly. We can put limitations on them to forbid any type constructions that the compiler cannot fully evaluate before the compilation of the actual code, of course, just like we can put limitations on statically typed functions (they shouldn't be able to call any non-static functions in the first iteration of our design, I'm still maintaining) [snip] > 3. Compositions of types are complex, but not infinitely complex. We > have about two decades in parameterized type research to rely on. Within > a year and a half, two of the world's most popular languages (C++ and > Java) will have parameterized types. Doesn't C++ already have parameterized types? (template classes and such?). > > In your RFC 0.1, you punted on the complex/composited data types issue too > > keep the solution tractable. 
I posit that you will *never* solve the > > problem of coming up with sufficient syntactical expression; therefore, > > you will always have to resort to a procedural component in your type > > system *if* you want full coverage. > > I am happy to have a runtime component. I just don't see that we need > any new syntax for this runtime component. And I don't think that we > should give up on a formally defined static system. I agree we should focus on a static system. Regards, Martijn From guido@CNRI.Reston.VA.US Tue Dec 14 18:51:01 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Tue, 14 Dec 1999 13:51:01 -0500 Subject: [Types-sig] Re: Inferencing: A case study In-Reply-To: Your message of "Tue, 14 Dec 1999 08:07:12 PST." <38566B30.36608D4E@prescod.net> References: <199912141519.KAA23476@eric.cnri.reston.va.us> <38566B30.36608D4E@prescod.net> Message-ID: <199912141851.NAA24093@eric.cnri.reston.va.us> > Guido van Rossum wrote: > > > > Here's a long and rambling example of what I think a type inferencer > > could do -- without type declarations of any sort. I wrote this down > > while thinking about the type checker that I would like to see in > > IDLE. > > Okay, but let me ask this: if TOTAL Java-level type safety ONLY required > type declarations for all "non-local" variables (including functions and > instance variables) would that be acceptable to you? > > Your inferencer heuristics are fine for an interactive GUI environment > where failure is merely an inconvenience but if we are going to have a > formally checkable notion of "this is statically type-safe" and "this is > not" then I worry about the "non-local breakage" problem. Oops, did > changing that variable to an "int" break your module way over there? > > I spoke to the Journal of Functional Programmers at a conference > recently. Is Journal some kind of military term, maybe between General and Sergeant? :-) > I asked him about why ML's type inferencer made the language > so hard to use. He said: "oh, you should always put the type > declarations in. The type inferencer is mostly just an educational > tool." Of course that's not what the type inferencer was SUPPOSED to be, > but I think that that's what it has become. "Global" type inferencing > scares me and I think that it has the unintended consequence of making > the static type checker (and thus the language) harder to understand. I agree. Typically, especially for libraries, there should be type decls at the module boundaries to avoid endless "exercises for the reader" as in my case study. (Note that the case study actually stipulates that the re module has a module declaration, and explains why.) --Guido van Rossum (home page: http://www.python.org/~guido/) From m.faassen@vet.uu.nl Tue Dec 14 18:52:03 1999 From: m.faassen@vet.uu.nl (Martijn Faassen) Date: Tue, 14 Dec 1999 19:52:03 +0100 Subject: [Types-sig] Re: RFC 0.1 References: Message-ID: <385691D3.6DC4A36E@vet.uu.nl> "Golden, Howard" wrote: [snip snip] > > #4. The first version of the system will be syntactically compatible > > with Python 1.5.x in order to allow experimentation in the lead-up to > > an integrated system in Python 2. > > Does this mean no new syntax? (That's what it appears from your examples.) > > How about a declaration syntax, e.g., > > var x : type1, y : type2 > > Is this prohibited by the RFC? While my agenda is to kill the syntax discussions for the moment, I'd propose a seperate declaration syntax before all others, because this is the most syntactically compatible with Python. 
And easier on the programmer. Imagine you have a module. Now you want to make it fully statically typed. With most syntax proposals I've seen you'd have to go through the code and add type declarations here and there, mix it with the current code. With either a Python based system as I'm proposing (ugly but powerful and fairly simple), or a seperate type declaration system, you have your type declarations separated from the code itself. This means you easily add and remove type information and switch between a statically typed module and a dynamically typed module easily. On a slightly seperate issue, I propose a classification of modules according to type annotation (or functions or classes, whatever level you prefer thinking about): fully unannotated module: Names have no type annotations. Full type dynamicism. Only run-time type checks by hand are possible. Can use any other kind of module. I.e. this is the good old Python module as we know it now. fully annotated module: All names (local and global, function definitions, classes, class members, class data, etc) in the module have a type annotation. Restricts lots. Can only use other fully annotated modules. object attributes are fixed at compile-time according to type annotations. code that tries to add a new member to an object at run-time will give a run-time error. 'a = "foo"; a = 1' will give a compile time error. I.e. this is like a static language and this can be compiled to fast native code. partially annotated module: Some names, but not all names, have type information. Possibly all names do in fact, but imported modules aren't fully annotated which also breaks things. Restricts some. Will raise a run-time error if it is detected that type annotations are violate, automatically. *may* do limited compile-time checking. *may* try to do type inference to turn this module into a fully annotated one. *may* even do fancy analysis and come up with one or more fully annotated modules (which can be compiled for speed reasons), but keeps a dynamic module around in case the fully annotated modules cannot be used. Regards, Martijn From m.faassen@vet.uu.nl Tue Dec 14 19:03:20 1999 From: m.faassen@vet.uu.nl (Martijn Faassen) Date: Tue, 14 Dec 1999 20:03:20 +0100 Subject: [Types-sig] List of FOO References: <3855DBBB.6D1B462A@prescod.net> <199912141649.LAA23593@eric.cnri.reston.va.us> Message-ID: <38569478.40E29421@vet.uu.nl> Guido van Rossum wrote: > > > From: Paul Prescod > > > > > #2. The first version of the system will not allow the use of types > > > > that cannot be referred to as simple Python objects. In particular it > > > > will not allow users to refer to things like "List of Integers" and > > > > "Functions taking Integers as arguments and returning strings." > > > > > > It's been said before: that's a shame. Type inference is seriously > > > hindered if it doesn't have such information. (Consider a loop over > > > sys.argv; I want the checker to be able to assume that the items are > > > strings.) > > > > It took two years to get the parameterized version of the Java type > > system up and running. > > Probably because Java was initially conceived as a language with a > "classic" type system (like C or Pascal). Python on the other hand > already has all this. 
> > > Let me ask this your opinion on this question > > (seriously, not sarcastically), should we include a spelling for "list > > of string" and not "callable taking list of callables taking strings > > returning integers returning string" and what about "callable taking > > list of callables taking and R returning list of callables taking > > and returning ." You see my problem? I could special case "list > > of" as Java and C did if we agreed to take our chances that my syntax > > would be extensible. We could even steal that weird "[]" thing that C > > and Java do: > > > > StringType [] foo > > If we could express all those the type checker could do a much better > job. If we could at least do the ones without the notation, we'd > still be doing a good job. Stopping at "list" is useless. [snip] I agree completely, and one *can* express most of this pretty easily in current Python, i.e.: types = { "bar": IntType, "baz": ListType(IntType), "hey": IntType, "foo3": FunctionType(args=(IntType,), result=IntType), "crazy" : ListType(FunctionType(args=(ListType(IntType), StringType), result=DictType(StringType, FunctionType(args=None, result=StringType))) } It looks very very ugly, but that's beside the point. It's usable for type reasoning from within Python, directly (I actually have a buggy module which this a little, and features typedefs to boot). One can come up with a more Pythonic syntax (indentation, anyone?) later, once one has the semantics working. Regards, Martijn From guido@CNRI.Reston.VA.US Tue Dec 14 19:09:59 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Tue, 14 Dec 1999 14:09:59 -0500 Subject: [Types-sig] RFC 0.1 In-Reply-To: Your message of "Tue, 14 Dec 1999 19:27:38 +0100." <38568C1A.D1BCF530@vet.uu.nl> References: <38565074.9E51515F@prescod.net> <38568C1A.D1BCF530@vet.uu.nl> Message-ID: <199912141909.OAA24221@eric.cnri.reston.va.us> [Martijn Faassen] > I agree with this, which is I am advocating a strong split (for > simplicity) of fully-statically checked code and normal python code. You can already do this -- write in Java or C. > Later on you can work on blurring the interface between the two. First > *fully* type annotated functions (classes, modules, what you want), > which can only refer to other things that are fully annotated. By 'fully > annotated' I mean all names have a type. I keep disagreeing with Paul's > simplification of initially throwing out constructed types such as list > of integer, as that would break my own approach at simplicity. :) Agreed. List of integer and its friends are important. Also correspondences (see my example of a sum() function taking a list of and an additional single . > If we throw out the syntax issue and use Python constructs for types > until we know more, we'll all be happier, right? :) The syntax will be > clear when the semantics is. Guido is good at syntax, let him figure out > a good syntax for it, let's just focus on the semantics. Thank you. This of course leaves Paul with the question of how to prototype all this -- he'll have to make *something* up. :-) --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@CNRI.Reston.VA.US Tue Dec 14 19:17:23 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Tue, 14 Dec 1999 14:17:23 -0500 Subject: [Types-sig] RFC 0.1 In-Reply-To: Your message of "Tue, 14 Dec 1999 16:57:40 +0100." 
<385668F4.2340C4B2@appliedbiometrics.com> References: <385668F4.2340C4B2@appliedbiometrics.com> Message-ID: <199912141917.OAA24231@eric.cnri.reston.va.us> [Christian Tismer] > Allow me a question about types: > > Where are the limits between types, values, and properties of values? > > Assume a function which returns either > [1, 2, 3] or the number 42. > > We now know that we either get a list or an integer. > But in this case, we also know that we get a list of three > integer elements which are known constants, or we get > the integer 42 which is even, for instance. > > So what is 'type', how abstract or concrete should it be, > where is the cut? Good questions. I'd like to remember all of this information. It can help with optimization (through constant folding). It can help detect unreachable code (e.g. your example function always returns a true value). Etc., etc. Note that this can all be folded into a sufficiently rich type system; a type is nothing more than a (possibly infinite) set of values. > At the same time, Python is so rich from self-inspection that > writing a dynamic type inference machine seems practicable, > so how about not declaring types, but asking your code about its > type? I suppose you could do symbolic execution on the bytecode, but I don't think this is a very fruitful path. (Of course if anyone can prove I'm wrong, it's you. :-) > I could imagine two concepts working together: > > Having optional interfaces, which is a different issue > and looks fine (Jim's 0.1.1 implementation). > > Having dynamic type inference, which is implemented by cached > type info at runtime. Eh? Type inference is supposed to be a compile-time thing. You present your whole Python program to the typechecker and ask it "where could this crash if I sent it on rocket to Mars?" > (I hope this idea isn't too simple minded) > Assume for instance the string module, implemented in Python. > It would have an interface which defines what goes in and > out of its functions. > > At "compile" time of string.py, type inference can partially > take place already when the code objects are created. The interface > part creates restrictions on argument values, which can be used > for further inference. It can also be deduced whether the return > values already obey the interface or if deduction for imported > functions is necessary. > This info is saved in some cache with the compilation. > Changes to the module object simply break the cache. And that's exactly the problem. I want to be able to be told whether the cache might be broken *before* I launch my rocket to Mars. > When I dynamically redefine a piece of the module where it > depends of (say I assign something new to "_lower"), then > the analysis must be carried out again, recursively invalidating > other cached info as necessary. In my scenario, the assignment to _lower is either detected and taken into account by the type checker, or forbidden. But this decision is taken at compile time and if forbidden, it is flagged as a compile time error. If you exec code that could make this assignment that would be a run-time error (it's also forbidden at run-time) but typically, the Mars lander isn't going to accept input for exec from the Martians -- we could probably flag all uses of exec (and eval() and a few others) as errors unless there's a try/except around them. > Well, this is an example where I think the restriction to > type checking of expressions still applies, but something more is > needed to trigger this check early. 
> The involved namespace object is the string module's __dict__, > which should know that it is referenced by this expression: > > def lower(s): > res = '' > for c in s: > res = res + _lower[ord(c)] > return res > > And by assignment to the name "_lower" in this case, it could > invalidate code object lower's type cache. lower can no more > assure that it will return a string result and will trigger > its interface object to re-check consistency. The latter > will raise an interface_error if the rule doesn't match. > > It remains an open question for me how deeply possible > values should be checkable, i.e. "this arg has to be a list > which is not empty". Minor point, maybe. > > Did I make some sense, or am I off the track? - chris Read my case study. --Guido van Rossum (home page: http://www.python.org/~guido/) From m.faassen@vet.uu.nl Tue Dec 14 19:19:34 1999 From: m.faassen@vet.uu.nl (Martijn Faassen) Date: Tue, 14 Dec 1999 20:19:34 +0100 Subject: [Types-sig] RFC 0.1 References: <38565074.9E51515F@prescod.net> <38568C1A.D1BCF530@vet.uu.nl> <199912141909.OAA24221@eric.cnri.reston.va.us> Message-ID: <38569846.853294E7@vet.uu.nl> Guido van Rossum wrote: > > [Martijn Faassen] > > I agree with this, which is I am advocating a strong split (for > > simplicity) of fully-statically checked code and normal python code. > > You can already do this -- write in Java or C. Good answer, but I'd prefer to write more Pythonic code. If I want to translate my Python module to C, I have to work hard. If I want to translate my Python module to a static Python module, I 'just' need to add type annotations and change some parts that are 'too dynamic'. Most Python code is fairly static. And I didn't intend to *stop* at this, I just think it's valuable 'early' payoff. > > Later on you can work on blurring the interface between the two. First > > *fully* type annotated functions (classes, modules, what you want), > > which can only refer to other things that are fully annotated. By 'fully > > annotated' I mean all names have a type. I keep disagreeing with Paul's > > simplification of initially throwing out constructed types such as list > > of integer, as that would break my own approach at simplicity. :) > > Agreed. List of integer and its friends are important. Also > correspondences (see my example of a sum() function taking a list of > and an additional single . > > > If we throw out the syntax issue and use Python constructs for types > > until we know more, we'll all be happier, right? :) The syntax will be > > clear when the semantics is. Guido is good at syntax, let him figure out > > a good syntax for it, let's just focus on the semantics. > > Thank you. This of course leaves Paul with the question of how to > prototype all this -- he'll have to make *something* up. :-) You're welcome. As to the prototype, you can easily make up something in Python. I have posted an example of this to the list in another post. Regards, Martijn From gstein@lyra.org Tue Dec 14 19:56:50 1999 From: gstein@lyra.org (Greg Stein) Date: Tue, 14 Dec 1999 11:56:50 -0800 (PST) Subject: [Types-sig] Plea for help. In-Reply-To: <199912141659.LAA23638@eric.cnri.reston.va.us> Message-ID: On Tue, 14 Dec 1999, Guido van Rossum wrote: > > Is there currently any path from high level parse trees to bytecodes? > > E.g. is there a way to get sane parse trees to "render" themselves as, > > er, insane parse trees? I don't think so but I'm just checking to avoid > > extra work. 
> > The parser module lets you construct a parse tree and then compile > it. The parse tree must be correct before this is allowed. Check out > the compileast() function on > http://www.python.org/doc/current/lib/Converting_ASTs.html While it is certainly possible to go from a transformer-tree back to an ast-tree and then to compile -- if that's what you want, then why use the transformer at all? :-) As Bill said: you can definitely generate a pyc from a transformer tree. I believe it is bit easier than doing it from AST, too. But it isn't a cake-walk... there are a lot of constructs in there. Hrm. Well... the Python bytecodes certainly map better. It was difficult for us to go to C, but maybe generating bytecodes won't be too hard. If anybody is thinking about doing this, then please talk with Bill and I first. genc.py is not the best model. In a proprietary compiler (e.g. I can't release it yet), we built a *much* better model. There are some things that are similar, but others that really need to change. Cheers, -g -- Greg Stein, http://www.lyra.org/ From paul@prescod.net Tue Dec 14 20:24:40 1999 From: paul@prescod.net (Paul Prescod) Date: Tue, 14 Dec 1999 12:24:40 -0800 Subject: [Types-sig] Re: Static typing considered ... UGLY References: Message-ID: <3856A788.A7435117@prescod.net> Edward Welbourne wrote: > > If I implement a datatype (probably as a class) whose objects > (instances) behave *just the same as* integers in all pythonic respects, > I demand to be able to use it everywhere that I am allowed to use an > integer. If static typing breaks that, it's right out. It won't break it. Number will be an interface with operations like "add", "radd", "sub", "mult" and so forth. If you check against the interface instead of against the type, things just work. Anyhow, the decision of whether to do this in an interface-y way or a hard-coded type way is ALREADY up to the author of a module. There are many places in the standard library where module owners check the types of objects and return TypeError if they don't get the data they expect. It is even more common in the built-in modules. How would changing the syntax from def prepend(self, cmd, kind): if type(cmd) <> type(''): raise TypeError, \ 'Template.prepend: cmd must be a string' To: def prepend( self, cmd: String, kind ): ... make anything worse? And is the latter really "uglier" than the former? Or do you propose to outlaw the former? Does the mere fact that the verbose version is essentially useless to the compiler make it more virtuous? > the value of the expression being that of the given value, but > evaluating it'll raise an exception if the typechecker didn't like the > value. Now that's a much nicer way to go. Of course, this effectively > just amounts to implementing ! as an in-expression assert mechanism ... > and I'm not entirely sure how it helps the compiler-writer - is that why > you insist on the typechecker being a dotted name, not an arbitrary > expression ? Exactly. Dotted names help the compiler writer and the compiler writer helps the programmer by finding mistakes and optimizing code. You scratch my back and I'll scratch yours. > Please Greg/Fred/Sjoerd, can you write a proposal which starts where > http://www.foretec.com/python/workshops/1998-11/greg-type-ideas.html > ends (reading the first 9/10 of that was ... illuminating in hindsight, > but off-putting on the way there). It looks pretty promising ... 
Let me point out again that while that approach is interesting, it doesn't solve the problem I was recruited to solve: a *static* *compile-time* checker. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things to be wary of: A new kid in his prime A man who knows the answers, and code that runs first time http://www.geezjan.org/humor/computers/threes.html From paul@prescod.net Tue Dec 14 20:00:12 1999 From: paul@prescod.net (Paul Prescod) Date: Tue, 14 Dec 1999 12:00:12 -0800 Subject: [Types-sig] RFC 0.1 References: <38565074.9E51515F@prescod.net> <38568C1A.D1BCF530@vet.uu.nl> Message-ID: <3856A1CB.B5470782@prescod.net> Martijn Faassen wrote: > > I agree with this, which is I am advocating a strong split (for > simplicity) of fully-statically checked code and normal python code. I don't see this as buying much simplicity. And I do see it as requiring more work later. I also see it as scaring the bejeesus out of many static type system fence sitters. Can you demonstrate that it makes our life easier to figure out integration issues later? > Later on you can work on blurring the interface between the two. First > *fully* type annotated functions (classes, modules, what you want), > which can only refer to other things that are fully annotated. By 'fully > annotated' I mean all names have a type. I think that's a non-starter because it will take forever to become useful because the standard library is not type-safe. Anyhow I fell like I've *already solved* the problem of integration so why would I undo that? > I keep disagreeing with Paul's > simplification of initially throwing out constructed types such as list > of integer, as that would break my own approach at simplicity. :) If I'm making this problem harder than it needs to be then I'm happy to accept your simple solution for parameterized types as soon as I understand it. > If we throw out the syntax issue and use Python constructs for types > until we know more, we'll all be happier, right? :) The syntax will be > clear when the semantics is. Guido is good at syntax, let him figure out > a good syntax for it, let's just focus on the semantics. Well, we need SOME syntax in order to communicate. Anyhow... > Our static type checker/compiler can use the Python type constructions > directly. We can put limitations on them to forbid any type > constructions that the compiler cannot fully evaluate before the > compilation of the actual code, of course, just like we can put > limitations on statically typed functions (they shouldn't be able to > call any non-static functions in the first iteration of our design, I'm > still maintaining) I see no reason for that limitation. The result of a call to a non-static function is a Pyobject. You cast it in your client code to get type safety. Just like the shift from K&R C to ANSI C. Functions always (okay, often) returned "ints" but you could cast them to foo *'s. > Doesn't C++ already have parameterized types? (template classes and > such?). Yes. I was just pointing out that in a year and a half Java will have them too which will put a lot of pressure on us. 
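As a rough picture of what a "parameterized type you can cast against" might look like if prototyped in ordinary Python, here is a sketch; the ListOf class, the cast() helper and untyped_producer() are all made up for the example and are not anyone's proposal:

    class ListOf:
        # A parameterized "list of T" description, usable at run time.
        def __init__(self, item_type):
            self.item_type = item_type

        def check(self, value):
            if not isinstance(value, list):
                return False
            for item in value:
                if not isinstance(item, self.item_type):
                    return False
            return True

    def cast(value, declared):
        # Client-side "cast": verify once at the boundary, then use the
        # value as if it had been statically typed all along.
        if not declared.check(value):
            raise TypeError("%r does not match the declared type" % (value,))
        return value

    def untyped_producer():
        # stands in for a call into dynamic, unchecked code
        return [1, 2, 3]

    nums = cast(untyped_producer(), ListOf(int))   # ok
    # cast(["a", "b"], ListOf(int))                # would raise TypeError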
-- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things to be wary of: A new kid in his prime A man who knows the answers, and code that runs first time http://www.geezjan.org/humor/computers/threes.html From paul@prescod.net Tue Dec 14 20:24:32 1999 From: paul@prescod.net (Paul Prescod) Date: Tue, 14 Dec 1999 12:24:32 -0800 Subject: [Types-sig] Type inferencing References: <3855DBA1.9384B6AE@prescod.net> <199912141654.LAA23612@eric.cnri.reston.va.us> Message-ID: <3856A780.36B6788D@prescod.net> Guido van Rossum wrote: > > > From: Paul Prescod > > > > Point taken. I am only willing to do type inferencing up to a function > > level. After my "ML Experience" I am not willing to do it globally. > [example snipped] > > I'm disappointed. Jim Hugunin did global analysis on the pystone.py > module -- 250 lines containing 14 functions and one class with two > methods. (He may actually have left out the class, but I'm pretty > sure he did everything else.) He got a 1000x speedup, which I think > should be a pretty good motivator for those interested in (OPT). I think that we may be talking at cross purposes. I am trying to define a formal, independently implementable specification for a type system that Python users will understand and like. Some languages use global type inferencing as a formally specified part of the type checker but my impression is that users do not like the resulting languages. Jim created an implementation of an excellent, intelligent optimizing compiler. His work is as, or more, interesting than mine, but it is a different problem he is trying to solve. (OPT) comes into the picture because my work makes his much, much easier and more effective in many cases. I am totally in favor of particular global type inferencing implementations, but am not in favor of requiring global type inference of every static type checker implementation nor of requiring safety-conscious Python users to think in terms of global type inferencing. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things to be wary of: A new kid in his prime A man who knows the answers, and code that runs first time http://www.geezjan.org/humor/computers/threes.html From paul@prescod.net Tue Dec 14 20:12:55 1999 From: paul@prescod.net (Paul Prescod) Date: Tue, 14 Dec 1999 12:12:55 -0800 Subject: [Types-sig] Re: Inferencing: A case study References: <199912141519.KAA23476@eric.cnri.reston.va.us> <38566B30.36608D4E@prescod.net> <199912141851.NAA24093@eric.cnri.reston.va.us> Message-ID: <3856A4C7.1B6132C8@prescod.net> Guido van Rossum wrote: > > > I spoke to the Journal of Functional Programmers at a conference > > recently. > > Is Journal some kind of military term, maybe between General and > Sergeant? :-) I spoke with an editor of said Journal. :) > I agree. Typically, especially for libraries, there should be type > decls at the module boundaries to avoid endless "exercises for the > reader" as in my case study. (Note that the case study actually > stipulates that the re module has a module declaration, and explains > why.) Good, we are in agreement. I've been thinking: I can allow statically checked references to type-inferenced module variables if we make the module namespace write-only outside of the module. The "trick" is that I need to put a boundary around where I expect writes to take place so I can check that I can figure the complete list of possible values the variable can take. If writes can come from outer space then I need to check every write at runtime. 
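A concrete example of the kind of write "from outer space" that forces the runtime checks (the string module is abused here purely as a demo target):

    # Nothing in today's Python stops one module from rebinding another
    # module's globals, so whatever an inferencer learned from reading
    # that module's own source can be invalidated from outside:
    import string
    string.digits = list(range(10))     # rebinds a module-level name
    setattr(string, "brand_new", 1)     # or invents one that never existed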
So, I can do static type checking on a module variable if: * it is declared only "privately writeable" * or the whole module namespace is "privately writeable" * or it has a type declaration. We can provide access to any combination of these options that we decide. Privately writable is more pythonic than "const" which was my first reaction. Of course the vast, vast majority of module variables are privately writable. And one could argue that ALL of them should be. Module namespace writability is a security nightmare and it is SO easy to move writeable variables to an object: sys.path => sys.runtime.path sys.version => sys.impl.version -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things to be wary of: A new kid in his prime A man who knows the answers, and code that runs first time http://www.geezjan.org/humor/computers/threes.html From tismer@appliedbiometrics.com Tue Dec 14 20:23:03 1999 From: tismer@appliedbiometrics.com (Christian Tismer) Date: Tue, 14 Dec 1999 21:23:03 +0100 Subject: [Types-sig] RFC 0.1 References: <385668F4.2340C4B2@appliedbiometrics.com> <199912141917.OAA24231@eric.cnri.reston.va.us> Message-ID: <3856A727.490600C4@appliedbiometrics.com> Guido van Rossum wrote: > > [Christian Tismer] [about is [42]'s type "list", "list with one element", "list with one even int" ] > Good questions. I'd like to remember all of this information. It can > help with optimization (through constant folding). It can help detect > unreachable code (e.g. your example function always returns a true > value). Etc., etc. Fine, in general. > Note that this can all be folded into a sufficiently rich type system; > a type is nothing more than a (possibly infinite) set of values. Yup, it's just open where to cut. I'd like to do compile time checking, but to refine this at any time during the program execution (sometimes maybe), and this needs some abstraction to keep data limited. > > At the same time, Python is so rich from self-inspection that > > writing a dynamic type inference machine seems practicable, > > so how about not declaring types, but asking your code about its > > type? Wrong wording of mine. I don't want to analyse bytecode, but perhaps use AST info at some time. The initial compile time AST is general but currently doesn't try deduction. It could do so. But it could build derived AST's at runtime which know much more. Well I'm still after the JIT idea, so just drop it, I think this thread is for static types, which are a good thing! > I suppose you could do symbolic execution on the bytecode, but I don't > think this is a very fruitful path. (Of course if anyone can prove > I'm wrong, it's you. :-) Will not try again soon, I'm tired. Proving you slightly not right (wrong is too much) costs me half a year of work, finally a little adjustment to truth helped. Changing truth is the easier way :-) [interfaces and "dynamic" type inference] > > Eh? Type inference is supposed to be a compile-time thing. You > present your whole Python program to the typechecker and ask it "where > could this crash if I sent it on rocket to Mars?" I understand. I always think of importing which is already execution of something, and then I miss the need to do it before. Hmm, isn't it AST inspection, and after code is run, you get a new AST instance which is richer? ... > And that's exactly the problem. I want to be able to be told whether > the cache might be broken *before* I launch my rocket to Mars. I see. The Houston traceback. 
You need to close the cache, and also foresee that some module might want to break it and report a syntax error *before*, which sounds hard. A frozen module is a module which has proven its interface and is protected against changes of necessary conditions. That's indeed more than mine. [more numb stuff of mine] > Read my case study. Did that. Great. I think I would use Greg's modified AST and do the analysis there. An interpreter which runs these is also not that hard and keeps more info than bytecodes. AFAIK this is Skaller's approach for Viper. You often said that you want to analyse code form the source, instead of importing/executing stuff and use inspection. I understand that. But analysis needs some simulation as well, to see the effects of running some code. This simulation needs an environment which can track effects of assignments, imports and so on. Instead of re-inventing much stuff, why not use Python inside of a restricted environment? A "virtual Python", run in a real one, could execute steps, undo them, use other control flow paths, record types(==sets of possible values), all as long as there are no permanent side effects to the outside. But the latter are to be avoided in either case, whatever which approach you use. Finally I think this is just another view of the same thing. cheers - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From gstein@lyra.org Tue Dec 14 20:28:34 1999 From: gstein@lyra.org (Greg Stein) Date: Tue, 14 Dec 1999 12:28:34 -0800 (PST) Subject: [Types-sig] Re: Static typing considered ... UGLY In-Reply-To: Message-ID: On Tue, 14 Dec 1999, Edward Welbourne wrote: >... > On the other hand, if you want an object whose attributes are of > pre-decided kinds, or a namespace in which certain names are reserved > for certain values, use a setattr hack (or, if you're feeling very > brave, some variant on the wrapper defined by > URL: http://www.chaos.org.uk/~eddy/dev/toy/class.py). setattr hacks work great for class instances. Try to apply them to module namespaces... :-) This is a pretty old request to Guido: provide a setattr hook for modules. >... > ... I've now read the Greg/Fred/Sjoerd attack and I like that: let ! > be a new binary operator with grammar > > anyvalue ! typechecker > > the value of the expression being that of the given value, but Ah. Right. I didn't make that explicit, but yes. > evaluating it'll raise an exception if the typechecker didn't like the > value. Now that's a much nicer way to go. Of course, this effectively Yes, and I think so, too :-) > just amounts to implementing ! as an in-expression assert mechanism ... Yup. The proposal doesn't even introduce new bytecodes... it could use the same pattern as the assert statement, allowing the Python VM to optimize it during a -O invocation. > and I'm not entirely sure how it helps the compiler-writer - is that why > you insist on the typechecker being a dotted name, not an arbitrary > expression ? Correct. The compiler is going to have a hard enough time with dotted names, let alone arbitrary expressions. The type assertions help the compiler because the compiler can then make assumptions on how to *use* that value. For example: if x!Int: ... 
If you compile this, then you know that you can do a simple integer test, rather than check for an instance and possibly calling __nonzero__. While no biggy for compiling to the Python VM, this is a *huge* win if you're compiling to something like C or the JVM. In the statement: a = 5!String The compiler now knows that will contain a string and can optimize the uses of as appropriate. > Type-checking applies to values ;^) :-) I believe one of the differences is how a person views "type-safety". I don't regard " must only contain integers" as an interesting requirement. "the second param of foo(a,b) must be an integer" is interesting, and asserting specific return types is interesting. Problems with types almost *always* occur at boundaries (function arguments and return values). Type problems just don't occur within a single function (Guido's CP4E system might disagree, tho :-). As a result, I think restricting (variable) names is not nearly as interesting as asserting that your func args/returns are "correct." > Of course, obstreperous as I am, I immediately want to meddle with the > scheme: specifically, though the *default* behaviour might be (in > effect) > > if not isinstance(value, typechecker): raise TypeError > else: yield value > > I'd argue for the semantics to say: evaluate the expressions `value' and > `typechecker', look for a __check__ method on the latter: if present > invoke it on the value, else use isinstance as above; on false return > (no problem) the !-expression yields the given `value', otherwise > TypeError with parameter the true value returned. Then we can implement > weird and devious __check__ methods for fiddly type-checks (instead of > needing to change isinstance in the way proposed - which would conflict > with my pet tweak to that, which allows isinstance(value, thistype, > thattype, othertype) for when I've got several types I'll accept). I think altering isinstance() to accept a callable is preferable to introducing a __check__ method. A callable implies that you can use builtin types to implement type-checkers. You could still use the __check__ concept with builtins, but you would need to add new slots to the type structures (which is, IMO, to be avoided). There is no problem with saying that isinstance() can take more than two parameters, where 2..n can be a type, a class, or a callable. [ and note: it should be apparent that you check for a class before a callable :-) ] > Please Greg/Fred/Sjoerd, can you write a proposal which starts where > http://www.foretec.com/python/workshops/1998-11/greg-type-ideas.html > ends (reading the first 9/10 of that was ... illuminating in hindsight, > but off-putting on the way there). It looks pretty promising ... I'm peripherally interested here. Not enough to go writing :-). I've got about three other projects on my "over the next couple months" plate. An email here or there... sure, I'll do. An "emphatic discussion"... sure. That page was definitely just a series of notes. I think somebody could easily distill a one-page proposal from it. Please feel free! Cheers, -g -- Greg Stein, http://www.lyra.org/ From guido@CNRI.Reston.VA.US Tue Dec 14 20:35:50 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Tue, 14 Dec 1999 15:35:50 -0500 Subject: [Types-sig] Type inferencing In-Reply-To: Your message of "Tue, 14 Dec 1999 12:24:32 PST." 
<3856A780.36B6788D@prescod.net> References: <3855DBA1.9384B6AE@prescod.net> <199912141654.LAA23612@eric.cnri.reston.va.us> <3856A780.36B6788D@prescod.net> Message-ID: <199912142035.PAA24440@eric.cnri.reston.va.us> > I think that we may be talking at cross purposes. I am trying to define > a formal, independently implementable specification for a type system > that Python users will understand and like. Some languages use global > type inferencing as a formally specified part of the type checker but my > impression is that users do not like the resulting languages. OK, you may be right. Although I think that with Python as a starting point we'd end up with something sufficiently different from ML that the jury is still out on whether users will like it or not. > Jim created an implementation of an excellent, intelligent optimizing > compiler. His work is as, or more, interesting than mine, but it is a > different problem he is trying to solve. (OPT) comes into the picture > because my work makes his much, much easier and more effective in many > cases. I am totally in favor of particular global type inferencing > implementations, but am not in favor of requiring global type inference > of every static type checker implementation nor of requiring > safety-conscious Python users to think in terms of global type > inferencing. OK, I see and agree. I think that I would like to make *some* form of type inference (maybe only within the function body) part of the formal specs. Note that in a limited way, inference is already part of Python (and sometimes deplored -- because the diagnostics stink): if you write "a = 1" anywhere in a function body, then a is a local variable everywhere in that function (unless there's a "global a" as well). Now, please make some progress with a design... --Guido van Rossum (home page: http://www.python.org/~guido/) From gstein@lyra.org Tue Dec 14 20:42:25 1999 From: gstein@lyra.org (Greg Stein) Date: Tue, 14 Dec 1999 12:42:25 -0800 (PST) Subject: [Types-sig] expression-based type assertions (was: Static typing considered ...UGLY) In-Reply-To: <3856A788.A7435117@prescod.net> Message-ID: On Tue, 14 Dec 1999, Paul Prescod wrote: >... > > Please Greg/Fred/Sjoerd, can you write a proposal which starts where > > http://www.foretec.com/python/workshops/1998-11/greg-type-ideas.html > > ends (reading the first 9/10 of that was ... illuminating in hindsight, > > but off-putting on the way there). It looks pretty promising ... > > Let me point out again that while that approach is interesting, it > doesn't solve the problem I was recruited to solve: a *static* > *compile-time* checker. Yes, it does :-) As I mentioned in my note to Eddy just now, the compiler can use the assertions to determine an expression's type (assuming it isn't already available through inference). The type can then be used in the checks. Specifically, the "GFS proposal" would lead to the following types of compile-time checks: * is the type correct for each parameter of a function call? * is the type correct for the function return value(s)? * will a type assertion (the '!' operator) possibly fail? And to reiterate a point from my last note: I believe checks associated with shoving a value into a name are not as interesting, as 99% of the errors will occur at code boundaries (function calls), which are handled by the above mechanism. In fact, I would even say that the only type declarations used would be associated with function params and returns (and not variable). 
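None of the proposed syntax exists yet, but the run-time half of it is easy to play with today. Here is a throwaway emulation in which an expect() function stands in for the proposed '!' operator; the name and behaviour are invented for this sketch:

    def expect(value, typ):
        # Stand-in for the proposed "value ! Type" assertion: return the
        # value unchanged if it has the expected type, else complain.
        if not isinstance(value, typ):
            raise TypeError("expected %s, got %r" % (typ.__name__, value))
        return value

    def parse_port(text):
        # roughly "text ! Str" and "int(text) ! Int" in the proposed notation
        text = expect(text, str)
        return expect(int(text), int)

    port = parse_port("8080")    # fine
    # parse_port(8080)           # would fail at the function boundary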
If you are implementing a function and want to ensure that a result has a proper type, then the '!' operator can be used (shoving it into a typed variable isn't going to help you!). In both cases, expression- and name-based type assertions, I think you require type inferencing. So I don't think the problem is simplified by virtue of using name-based assertions. All you really get is a compile-time assertion at assignment time, which is also provided by expression-based typing. In other words: Int a a = foo() vs. a = foo() ! Int In both cases, the compiler will throw a fit if it knows foo() always returns a String. Cheers, -g -- Greg Stein, http://www.lyra.org/ From guido@CNRI.Reston.VA.US Tue Dec 14 20:42:37 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Tue, 14 Dec 1999 15:42:37 -0500 Subject: [Types-sig] Re: Inferencing: A case study In-Reply-To: Your message of "Tue, 14 Dec 1999 12:12:55 PST." <3856A4C7.1B6132C8@prescod.net> References: <199912141519.KAA23476@eric.cnri.reston.va.us> <38566B30.36608D4E@prescod.net> <199912141851.NAA24093@eric.cnri.reston.va.us> <3856A4C7.1B6132C8@prescod.net> Message-ID: <199912142042.PAA24452@eric.cnri.reston.va.us> [Paul] > I've been thinking: I can allow statically checked references to > type-inferenced module variables if we make the module namespace > write-only outside of the module. The "trick" is that I need to put a > boundary around where I expect writes to take place so I can check that > I can figure the complete list of possible values the variable can take. > If writes can come from outer space then I need to check every write at > runtime. Good -- I've been thinking the same thing. Here's what I think would be needed: 1. <module>.<attr> = <value> is simply forbidden (this is setattr for module objects) 2. Somehow we restrict use of <module>.__dict__, globals(), locals(), and vars(). 3. Somehow we restrict exec, eval(), and execfile() when these can touch a module's globals. So the only way a module-level variable can be set will be through assignments in its body (this includes classes and functions contained in its body); such assignments are easily traceable for the typechecker. > So, I can do static type checking on a module variable if: > > * it is declared only "privately writeable" > * or the whole module namespace is "privately writeable" > * or it has a type declaration. > > We can provide access to any combination of these options that we > decide. > > Privately writable is more pythonic than "const" which was my first > reaction. Of course the vast, vast majority of module variables are > privately writable. And one could argue that ALL of them should be. > Module namespace writability is a security nightmare and it is SO easy > to move writeable variables to an object: > > sys.path => sys.runtime.path > sys.version => sys.impl.version Actually, there's never a need to assign to sys.version, and as for sys.path, maybe you can't assign a different object to it, but you can still change its value because a list is mutable. --Guido van Rossum (home page: http://www.python.org/~guido/) From gstein@lyra.org Tue Dec 14 20:55:31 1999 From: gstein@lyra.org (Greg Stein) Date: Tue, 14 Dec 1999 12:55:31 -0800 (PST) Subject: [Types-sig] Type inferencing In-Reply-To: <199912142035.PAA24440@eric.cnri.reston.va.us> Message-ID: On Tue, 14 Dec 1999, Guido van Rossum wrote: > Paul Prescod wrote: >... > > cases.
I am totally in favor of particular global type inferencing > > implementations, but am not in favor of requiring global type inference > > of every static type checker implementation nor of requiring > > safety-conscious Python users to think in terms of global type > > inferencing. > > OK, I see and agree. > > I think that I would like to make *some* form of type inference (maybe > only within the function body) part of the formal specs. Note that in > a limited way, inference is already part of Python (and sometimes I believe that you will always have type inferencing occurring. Maybe I'm just referring to a degenerate case, but you do need inferencing just to deal with: Int a a = foo() + bar() i.e. inference says "foo-result-type + bar-result-type => Int", so the assignment is safe. > Now, please make some progress with a design... I've got a partial one for you :-) * add declarations to "def" statements * add a type-assertion operator (for discussion, this has been '!') * use type inference to check func args and returns, and to (pre)check type-assertion operators Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Tue Dec 14 20:59:39 1999 From: gstein@lyra.org (Greg Stein) Date: Tue, 14 Dec 1999 12:59:39 -0800 (PST) Subject: [Types-sig] Pascal style declarations In-Reply-To: Message-ID: You don't provide a way to declare function return value(s) types. When you do, then I think you're going to run into a problem using the ':' syntactical marker... This was one reason that Fred/Sjoerd/myself moved away from ':'-based declarations, and eventually fell into expression-based type checking. Cheers, -g On Tue, 14 Dec 1999, Golden, Howard wrote: > Since Guido hasn't had a coronary in response to my earlier suggestion, I > will be more specific: > > 1. I propose _optional_ typing, using the Pascal syntax (since this seems > to me to be the most "Pythonic" (Isn't that like giving a snake an enema? > Sorry.). Actually, I don't care about the specific syntax, just as long as > there is one. > > 2. Specifically, you can declare a variable using the syntax: > > var x : int, y : string, ... > > 3. In functions and methods, you can _optionally_ specify the argument > type: > > def funx(x : int, y : string): ... > > 4. If you use these, then you are making binding assertions about the types > of the names, and these assertions can be checked at compile or run time. > > 5. The parser could be made to strip out these declarations, and ignore > them, in which case they would have no effect. > > 6. The parser should be modified so you can tell it (using a compile-time > switch or pragma) to require declarations. > > 7. It appears to me that this would not change existing code, except if it > uses the name "var". > > 8. I think there should be a parameterized type mechanism. I don't much > like the angle bracket notation of C++, but I guess it's well established, > so it'll do. > > In my opinion, this doesn't "muck up" the language (since you don't have to > use it). > > --- > > Howard B. Golden > Software developer > Litton Industries, Inc. 
> Woodland Hills, California > > > _______________________________________________ > Types-SIG mailing list > Types-SIG@python.org > http://www.python.org/mailman/listinfo/types-sig > -- Greg Stein, http://www.lyra.org/ From jeremy@cnri.reston.va.us Tue Dec 14 21:02:38 1999 From: jeremy@cnri.reston.va.us (Jeremy Hylton) Date: Tue, 14 Dec 1999 16:02:38 -0500 (EST) Subject: [Types-sig] expression-based type assertions (was: Static typing considered ...UGLY) In-Reply-To: References: <3856A788.A7435117@prescod.net> Message-ID: <14422.45166.820239.289239@goon.cnri.reston.va.us> >>>>> "GS" == Greg Stein writes: GS> On Tue, 14 Dec 1999, Paul Prescod wrote: >> ... > Please Greg/Fred/Sjoerd, can you write a proposal which >> starts where > >> http://www.foretec.com/python/workshops/1998-11/greg-type-ideas.html >> > ends (reading the first 9/10 of that was ... illuminating in >> hindsight, > but off-putting on the way there). It looks pretty >> promising ... >> >> Let me point out again that while that approach is interesting, >> it doesn't solve the problem I was recruited to solve: a *static* >> *compile-time* checker. [I was starting a response to Paul when Greg's mail arrived, so I merged the responses. Hoping to maximize confusion.] Perhaps we need a charter revision. We need to formally define a type system for Python. It may or may not be statically checkable -- that's just the way type systems work, e.g. Java does array bounds checks at runtime because it can't at compile time. The fact that array bounds are checked at runtime doesn't mean that Java's type system forbids referencing past the end of an array; it just can't be statically checked (or at least no one has figured out a practical way to check it). The point of this digression is to argue that saying you only do "compile-time" checks is a bit of a cop out. GS> Yes, it does :-) I agree. GS> As I mentioned in my note to Eddy just now, the compiler can use GS> the assertions to determine an expression's type (assuming it GS> isn't already available through inference). The type can then be GS> used in the checks. GS> Specifically, the "GFS proposal" would lead to the following GS> types of compile-time checks: GS> * is the type correct for each parameter of a function call? GS> * is the type correct for the function return value(s)? GS> * will a type assertion (the '!' operator) possibly fail? These sound like exactly the right place to start! GS> And to reiterate a point from my last note: I believe checks GS> associated with shoving a value into a name are not as GS> interesting, as 99% of the errors will occur at code boundaries GS> (function calls), which are handled by the above mechanism. GS> In fact, I would even say that the only type declarations used GS> would be associated with function params and returns (and not GS> variable). If you are implementing a function and want to ensure GS> that a result has a proper type, then the '!' operator can be GS> used (shoving it into a typed variable isn't going to help GS> you!). I think I agree with you as far as local variables. It becomes quite interesting when you're talking about attributes of objects, e.g. what is the type of the closed attribute of a builtin file object. (For that matter, what is the type of the builtin open function and how does it differ from a function that returns a StringIO object?)
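For instances of ordinary classes you can at least experiment with attribute typing today; a small, purely illustrative sketch (the __attr_types__ spelling is made up, and nothing like it exists for builtin objects):

    class TypedAttrs:
        # Mix-in that checks attribute assignments against a per-class table.
        __attr_types__ = {}

        def __setattr__(self, name, value):
            expected = self.__attr_types__.get(name)
            if expected is not None and not isinstance(value, expected):
                raise TypeError("%s.%s must be %s, not %r"
                                % (self.__class__.__name__, name,
                                   expected.__name__, value))
            self.__dict__[name] = value

    class FileLike(TypedAttrs):
        __attr_types__ = {"closed": int, "name": str}

    f = FileLike()
    f.closed = 0          # fine
    # f.closed = "yes"    # would raise TypeError

The builtin file object is exactly the hard case: there is nowhere to hang such a table on it, which is where interfaces would have to come in.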
Jeremy From gstein@lyra.org Tue Dec 14 21:33:44 1999 From: gstein@lyra.org (Greg Stein) Date: Tue, 14 Dec 1999 13:33:44 -0800 (PST) Subject: [Types-sig] expression-based type assertions In-Reply-To: <14422.45166.820239.289239@goon.cnri.reston.va.us> Message-ID: On Tue, 14 Dec 1999, Jeremy Hylton wrote: >... > Perhaps we need a charter revision. We need to formally define a type > system for Python. It may or may not be statically checkable -- > that's just the way type systems work, e.g. Java does array bounds In deference to Paul, I think it must be statically checkable. However, your point about "completely checkable" is quite valid! We can have "as much as possible" but not necessarily "completely." Due to complex type issues, it may not ever be possible to be complete. The question really becomes "how close?" (note that expression-based type assertions allow a person to make assertions on sub-components of a complex/composite type while it is being used; this gives expr-based the capability to fill in where name-based falls down because of a lack of syntactic expressability) >... > to check it). The point of this digression is to argue that saying > you only do "compile-time" checks is a bit of a cop out. Damn... :-) Personally, I in the (OPT) camp, rather than (ERR) camp, so I don't care about static checks. But: the GFS proposal still supports it. >... > GS> And to reiterate a point from my last note: I believe checks > GS> associated with shoving a value into a name are not as > GS> interesting, as 99% of the errors will occur at code boundaries > GS> (function calls), which are handled by the above mechanism. > > GS> In fact, I would even say that the only type declarations used > GS> would be associated with function params and returns (and not > GS> variable). If you are implementing a function and want to ensure > GS> that a result has a proper type, then the '!' operator can be > GS> used (shoving it into a typed variable isn't going to help > GS> you!). > > I think I agree with you as far as local variables. It becomes quite > interesting when you're talking about attributes of objects, e.g. what > is the type of the closed attribute of a builtin file object. (For > that matter, what is the type of the builtin open function and how > does it differ from a function that returns a StringIO object?) Ah! Good point. I think this is where interfaces come in. Otherwise, it becomes very difficult to syntactically specify the types of attributes. Note that many of the problems with type decls for builtin types would probably be solved with interfaces, too. Until interfaces arrive, Martijn's proposal of using structures to specify an interface is probably best. A class or type can have an associated structure to specify attribute type information (functions still use syntactical declarators). Cheers, -g -- Greg Stein, http://www.lyra.org/ From GoldenH@littoncorp.com Tue Dec 14 21:45:53 1999 From: GoldenH@littoncorp.com (Golden, Howard) Date: Tue, 14 Dec 1999 13:45:53 -0800 Subject: [Types-sig] Re: Pascal style declarations Message-ID: Greg Stein [mailto:gstein@lyra.org] wrote: > You don't provide a way to declare function return value(s) > types. When > you do, then I think you're going to run into a problem using the ':' > syntactical marker... [refers to:] > > 3. In functions and methods, you can _optionally_ specify > the argument > > type: > > > > def funx(x : int, y : string): ... > > I'll admit that Python already uses the ":" character where Pascal does, but so what? 
You can still specify the return type in other ways. The most obvious (to me) is to use the ":" character twice, e.g., def funx(x : int, y : string): int : ... While I'm not a parsing expert, I believe this would still be parsable. Of course, any other available character could be used instead of the ":", if this would be preferable. (Again, I'm not trying to dictate the final syntax, just suggest a starting point.) > This was one reason that Fred/Sjoerd/myself moved away from ':'-based > declarations, and eventually fell into expression-based type checking. I am suggesting using declarations, rather than expression-based type checking, since that is familiar in other languages. As a declaration, it is clear that I am talking about an invariant assertion, not a dynamic one. Expression-based type checking should also be available, since it is needed when static checking is impossible. I don't think it has to be either/or. From Tony Lownds Tue Dec 14 21:46:03 1999 From: Tony Lownds (Tony Lownds) Date: Tue, 14 Dec 1999 13:46:03 -0800 (PST) Subject: [Types-sig] Pascal style declarations In-Reply-To: Message-ID: Hi, Visual Basic uses "as" to declare types of parameters, and Object Pascal uses "as" as a dynamic cast operator, so consider "as" instead of ! I'll use that below just to try it on for size. My main point is, I think there should be a seperate operator for declaring return types. If I read your proposal right, then def logfn(s as String, *args) as String: ... declares that log is a reference to a function taking a sting and a bunch of unspecified types, returning a string. How would you check that an object is a function with the same signature? # programmer would have to think associativity here log = logfn as (String, *Object) as String That syntax doesnt seem to be easily grokkable. Now if you had another operator that declared return values, say ->, then the statement above is clearer and you could also make a typedef for a function and apply it in the def statement. def logfn(s as String, *args) -> String: ... log = logfn as (String, *Object) -> String -or- log_function = (String, *Object) -> String def logfn(s, *args) as log_function: ... log = logfn as log_function Tim H. also mentioned using -> but he suggested replacing ! with ->, I am suggesting that we'd want a seperate operator for declaring return types. -Tony Lownds On Tue, 14 Dec 1999, Greg Stein wrote: > You don't provide a way to declare function return value(s) types. When > you do, then I think you're going to run into a problem using the ':' > syntactical marker... > > This was one reason that Fred/Sjoerd/myself moved away from ':'-based > declarations, and eventually fell into expression-based type checking. > > Cheers, > -g > > > On Tue, 14 Dec 1999, Golden, Howard wrote: > > > Since Guido hasn't had a coronary in response to my earlier suggestion, I > > will be more specific: > > > > 1. I propose _optional_ typing, using the Pascal syntax (since this seems > > to me to be the most "Pythonic" (Isn't that like giving a snake an enema? > > Sorry.). Actually, I don't care about the specific syntax, just as long as > > there is one. > > > > 2. Specifically, you can declare a variable using the syntax: > > > > var x : int, y : string, ... > > > > 3. In functions and methods, you can _optionally_ specify the argument > > type: > > > > def funx(x : int, y : string): ... > > > > 4. 
If you use these, then you are making binding assertions about the types > > of the names, and these assertions can be checked at compile or run time. > > > > 5. The parser could be made to strip out these declarations, and ignore > > them, in which case they would have no effect. > > > > 6. The parser should be modified so you can tell it (using a compile-time > > switch or pragma) to require declarations. > > > > 7. It appears to me that this would not change existing code, except if it > > uses the name "var". > > > > 8. I think there should be a parameterized type mechanism. I don't much > > like the angle bracket notation of C++, but I guess it's well established, > > so it'll do. > > > > In my opinion, this doesn't "muck up" the language (since you don't have to > > use it). > > > > --- > > > > Howard B. Golden > > Software developer > > Litton Industries, Inc. > > Woodland Hills, California > > > > > > _______________________________________________ > > Types-SIG mailing list > > Types-SIG@python.org > > http://www.python.org/mailman/listinfo/types-sig > > > > -- > Greg Stein, http://www.lyra.org/ > > > _______________________________________________ > Types-SIG mailing list > Types-SIG@python.org > http://www.python.org/mailman/listinfo/types-sig > From gstein@lyra.org Tue Dec 14 22:17:29 1999 From: gstein@lyra.org (Greg Stein) Date: Tue, 14 Dec 1999 14:17:29 -0800 (PST) Subject: [Types-sig] Re: Pascal style declarations In-Reply-To: Message-ID: On Tue, 14 Dec 1999, Golden, Howard wrote: >... > I'll admit that Python already uses the ":" character where Pascal does, but > so what? You can still specify the return type in other ways. The most > obvious (to me) is to use the ":" character twice, e.g., > > def funx(x : int, y : string): int : ... > > While I'm not a parsing expert, I believe this would still be parsable. Of It isn't "easily" parsable :-) "int" is a valid expression, which is valid on the same line after a function definition. For example: def funx(x, y): foo() ; return 5 The parser wouldn't know whether the expression is part of the function body or a return type declaration until hitting the ':'. That would require an arbitrary look-ahead or some funkiness in the grammar. > course, any other available character could be used instead of the ":", if > this would be preferable. (Again, I'm not trying to dictate the final > syntax, just suggest a starting point.) Yes, another character would be used. But which? What construct looks Pythonic? I don't disagree with the basic notion here... just that it is tough to retain Python's clean feel. While we didn't necessarily like the '!' choice for the operator, we felt that the basic concept imposed very little change on Python's clean feel. > > This was one reason that Fred/Sjoerd/myself moved away from ':'-based > > declarations, and eventually fell into expression-based type checking. > > I am suggesting using declarations, rather than expression-based type > checking, since that is familiar in other languages. As a declaration, it > is clear that I am talking about an invariant assertion, not a dynamic one. As I've mentioned in my other email, expression-based checkin also defines an invariant. The compiler can make assumptions based on type declarators in a function or when it sees a type-assert operator. > Expression-based type checking should also be available, since it is needed > when static checking is impossible. I don't think it has to be either/or. 
We already have expr-based (the "assert" statement) -- we can assert types on expressions anywhere. It is just a little less convenient since we must place the expression value into a temporary variable, assert the type of that, then continue with the expression. The "type-assert operator" simplifies this process dramatically. Cheers, -g -- Greg Stein, http://www.lyra.org/ From tismer@appliedbiometrics.com Tue Dec 14 22:53:33 1999 From: tismer@appliedbiometrics.com (Christian Tismer) Date: Tue, 14 Dec 1999 23:53:33 +0100 Subject: [Types-sig] Re: Pascal style declarations References: Message-ID: <3856CA6D.C84785ED@appliedbiometrics.com> Greg Stein wrote: > > On Tue, 14 Dec 1999, Golden, Howard wrote: [snap] > We already have expr-based (the "assert" statement) -- we can assert types > on expressions anywhere. It is just a little less convenient since we must > place the expression value into a temporary variable, assert the type of > that, then continue with the expression. The "type-assert operator" > simplifies this process dramatically. Why not use "assert" instead of "as" as an operator? def f(x): a = x assert int #... stuff return str(x) + g(x) assert string # assert binding low precedence -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From gstein@lyra.org Tue Dec 14 22:59:49 1999 From: gstein@lyra.org (Greg Stein) Date: Tue, 14 Dec 1999 14:59:49 -0800 (PST) Subject: [Types-sig] Re: Pascal style declarations In-Reply-To: <3856CA6D.C84785ED@appliedbiometrics.com> Message-ID: A bit wordy, but that might work! On Tue, 14 Dec 1999, Christian Tismer wrote: > Greg Stein wrote: >... > > We already have expr-based (the "assert" statement) -- we can assert types > > on expressions anywhere. It is just a little less convenient since we must > > place the expression value into a temporary variable, assert the type of > > that, then continue with the expression. The "type-assert operator" > > simplifies this process dramatically. > > Why not use "assert" instead of "as" as an operator? > > def f(x): > a = x assert int > #... stuff > return str(x) + g(x) assert string # assert binding low precedence -- Greg Stein, http://www.lyra.org/ From GoldenH@littoncorp.com Tue Dec 14 23:05:06 1999 From: GoldenH@littoncorp.com (Golden, Howard) Date: Tue, 14 Dec 1999 15:05:06 -0800 Subject: [Types-sig] Re: Pascal style declarations Message-ID: Christian Tismer [mailto:tismer@appliedbiometrics.com] wrote: > def f(x): > a = x assert int > #... stuff > return str(x) + g(x) assert string # assert binding low precedence I'm still trying to get a _declaration_ into the signature, e.g., using your assert: def f(x assert int) assert string : a = x #... stuff return str(x) + g(x) In other words, "assert" is a synonym for Pascal's ":"! :-) From tismer@appliedbiometrics.com Tue Dec 14 23:30:27 1999 From: tismer@appliedbiometrics.com (Christian Tismer) Date: Wed, 15 Dec 1999 00:30:27 +0100 Subject: [Types-sig] Re: Pascal style declarations References: Message-ID: <3856D313.CFE21380@appliedbiometrics.com> > I'm still trying to get a _declaration_ into the signature, e.g., using your > assert: > > def f(x assert int) assert string : > a = x > #... stuff > return str(x) + g(x) > > In other words, "assert" is a synonym for Pascal's ":"! 
:-) Sure, while not mentioning, it was obvious to do this since I proposed a textual replacement for "as" :-) assert is also a synonym for VB's "as" but don't tell 'em :-= -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From paul@prescod.net Tue Dec 14 20:33:56 1999 From: paul@prescod.net (Paul Prescod) Date: Tue, 14 Dec 1999 12:33:56 -0800 Subject: [Types-sig] List of FOO References: <3855DBBB.6D1B462A@prescod.net> <199912141649.LAA23593@eric.cnri.reston.va.us> <38569478.40E29421@vet.uu.nl> Message-ID: <3856A9B4.7636C0C9@prescod.net> Martijn Faassen wrote: > > I agree completely, and one *can* express most of this pretty easily in > current Python, i.e.: > > types = { > "bar": IntType, > "baz": ListType(IntType), > "hey": IntType, > "foo3": FunctionType(args=(IntType,), result=IntType), > > "crazy" : ListType(FunctionType(args=(ListType(IntType), > StringType), result=DictType(StringType, > > FunctionType(args=None, > result=StringType))) > } Questions: 1. This system is supposed to be extensible, right? So I could, for instance, define a binary tree module and have "binary trees of ints" and "binary trees of strings." How do I define the binary tree class and state that it is parameterizable? 2. How does this work with interfaces? "ListType" is cheating. We need SequenceType because that's not implementation specific. And SequenceType needs to be defined by an interface, not a class. 3. What does "tuple of int, string" look like? And should we have list length parameters? -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things to be wary of: A new kid in his prime A man who knows the answers, and code that runs first time http://www.geezjan.org/humor/computers/threes.html From paul@prescod.net Tue Dec 14 20:45:28 1999 From: paul@prescod.net (Paul Prescod) Date: Tue, 14 Dec 1999 12:45:28 -0800 Subject: [Types-sig] Shadow File Opinions? References: <385691D3.6DC4A36E@vet.uu.nl> Message-ID: <3856AC68.2D5FCF37@prescod.net> Martijn Faassen wrote: > > ... > While my agenda is to kill the syntax discussions for the moment, I'd > propose a seperate declaration syntax before all others, because this is > the most syntactically compatible with Python. And easier on the > programmer. I'm considering your argument carefully. If we make separate interface files then we get Python 1.5 (hell, Python 1.0) compatibility "for free" and we can experiment with different syntaxes without breaking Python code. Plus we could use IDL and type libraries for type analysis *already*. I think the final product must allow inline declarations but I am starting to think that in the short term, "interface definition" files are the way to go not just for builtin modules but for all modules. Do others agree? > Imagine you have a module. Now you want to make it fully statically > typed. With most syntax proposals I've seen you'd have to go through the > code and add type declarations here and there, mix it with the current > code. I think that any proposal that requires you to keep two separate files "in sync" is bound to fail in the long term. I left that crap behind in C++. But in the short term...okay. 
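For the sake of discussion, here is roughly what such a sibling "interface definition" file could look like if it were itself plain Python. Everything below is made up for illustration -- the file name, the dictionary layout, and the use of bare type objects are assumptions, not an agreed format:

    # foo_types.py -- a hand-written shadow interface for a hypothetical foo.py.
    IntType, StringType, ListType = type(0), type(""), type([])

    interface = {
        "parse":   {"args": (StringType,), "returns": ListType},
        "count":   {"args": (ListType, IntType), "returns": IntType},
        "VERSION": StringType,
    }

A checker, or even a throwaway unit test, can then import both files and complain when they drift apart, which is exactly the keeping-in-sync cost being weighed here.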
> With either a Python based system as I'm proposing (ugly but powerful > and fairly simple), or a seperate type declaration system, you have your > type declarations separated from the code itself. This means you easily > add and remove type information and switch between a statically typed > module and a dynamically typed module easily. But there is not going to be alot of "switching". You add declarations and you leave them there. You update them when they get out of sync with the code. Why would you want to take a nice, safe, optimized module that you have gone to the effort of type annotating and hide the annotations? -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things to be wary of: A new kid in his prime A man who knows the answers, and code that runs first time http://www.geezjan.org/humor/computers/threes.html From paul@prescod.net Tue Dec 14 20:47:48 1999 From: paul@prescod.net (Paul Prescod) Date: Tue, 14 Dec 1999 12:47:48 -0800 Subject: [Types-sig] Compile-time or runtime checks? References: <385691D3.6DC4A36E@vet.uu.nl> Message-ID: <3856ACF4.E19023D8@prescod.net> Martijn Faassen wrote: > > On a slightly seperate issue, I propose a classification of modules > according to type annotation (or functions or classes, whatever level > you prefer thinking about): I'm trying hard to separate the axes of: "I have some type declarations" and "I want a static type checker to gurantee that this code is totally type safe." This should be legal: StringType def foo(): a=eval( sys.argv[1] ) return a That means I want a runtime check. This should be illegal: type-safe StringType def foo(): a=eval( sys.argv[1] ) return a Here I've specifically asked for a compile time check and my code is not up to snuff. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things to be wary of: A new kid in his prime A man who knows the answers, and code that runs first time http://www.geezjan.org/humor/computers/threes.html From paul@prescod.net Wed Dec 15 01:16:19 1999 From: paul@prescod.net (Paul Prescod) Date: Tue, 14 Dec 1999 17:16:19 -0800 Subject: [Types-sig] Type inferencing References: Message-ID: <3856EBE3.74D8FD2F@prescod.net> Greg Stein wrote: > > I believe that you will always have type inferencing occurring. Maybe I'm > just referring to a degenerate case, but you do need inferencing just to > deal with: > > Int a > a = foo() + bar() That's absolutely true. I agree with everyone else that argument values must be type checked and that most of the rest can be inferred. > * add declarations to "def" statements Agreed. > * add a type-assertion operator (for discussion, this has been '!') Prefer function call syntax. Or maybe Java/C++ (cast) syntax. > * use type inference to check func args and returns, and to (pre)check > type-assertion operators Agreed. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things to be wary of: A new kid in his prime A man who knows the answers, and code that runs first time http://www.geezjan.org/humor/computers/threes.html From guido@CNRI.Reston.VA.US Wed Dec 15 03:03:17 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Tue, 14 Dec 1999 22:03:17 -0500 Subject: [Types-sig] Compile-time or runtime checks? In-Reply-To: Your message of "Tue, 14 Dec 1999 12:47:48 PST." 
<3856ACF4.E19023D8@prescod.net> References: <385691D3.6DC4A36E@vet.uu.nl> <3856ACF4.E19023D8@prescod.net> Message-ID: <199912150303.WAA00737@eric.cnri.reston.va.us> > I'm trying hard to separate the axes of: "I have some type declarations" > and "I want a static type checker to gurantee that this code is totally > type safe." This should be legal: > > StringType > def foo(): > a=eval( sys.argv[1] ) > return a > > That means I want a runtime check. This should be illegal: > > type-safe > StringType > def foo(): > a=eval( sys.argv[1] ) > return a > > Here I've specifically asked for a compile time check and my code is not > up to snuff. I would strongly advise to focus on the type-safe axis. Run-time checks can already be implemented using various assert statements. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@CNRI.Reston.VA.US Wed Dec 15 03:05:47 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Tue, 14 Dec 1999 22:05:47 -0500 Subject: [Types-sig] Shadow File Opinions? In-Reply-To: Your message of "Tue, 14 Dec 1999 12:45:28 PST." <3856AC68.2D5FCF37@prescod.net> References: <385691D3.6DC4A36E@vet.uu.nl> <3856AC68.2D5FCF37@prescod.net> Message-ID: <199912150305.WAA00748@eric.cnri.reston.va.us> > I think the final product must allow inline declarations but I am > starting to think that in the short term, "interface definition" files > are the way to go not just for builtin modules but for all modules. > > Do others agree? Yes on both counts. I think this has been suggested long ago (maybe by Jack Jansen?). It never went anywhere, probably because the whole idea never went anywhere. Note that there's one case where separate interface files may be the end solution: when the source itself is in another language. This has been discussed already. Note that the doc-sig is also considering that for documenting C extensions. And of course Java does this for native methods (both for docs and for typedecls!). --Guido van Rossum (home page: http://www.python.org/~guido/) From janssen@parc.xerox.com Wed Dec 15 03:27:14 1999 From: janssen@parc.xerox.com (Bill Janssen) Date: Tue, 14 Dec 1999 19:27:14 PST Subject: [Types-sig] Shadow File Opinions? In-Reply-To: Your message of "Tue, 14 Dec 1999 12:45:28 PST." <3856AC68.2D5FCF37@prescod.net> Message-ID: <99Dec14.192724pst."3587"@watson.parc.xerox.com> > I'm considering your argument carefully. If we make separate interface > files then we get Python 1.5 (hell, Python 1.0) compatibility "for free" > and we can experiment with different syntaxes without breaking Python > code. Plus we could use IDL and type libraries for type analysis > *already*. > > I think the final product must allow inline declarations but I am > starting to think that in the short term, "interface definition" files > are the way to go not just for builtin modules but for all modules. > > Do others agree? Hey, I agreed with this five years ago! The tricky part is type-checking your use of that module without type declarations in the usage-side code. But yes, the standard process is: 1) Add separate interface files, containing declarations of the interface exported from a module file. This is documentation even if used for no other purpose. 2) Add a type inferencer that checks code using a module against the interface for that module. Provided you don't kill yourself writing the type inferencer (which almost happened here attempting the type inferencing system for SchemeXerox :-), you can now make some limited type checking available. 
3) Move the type declaration syntax you developed for step 1 into the language proper. The parser is initially rigged to ignore it (and maybe it always will). 4) Now the type inferencer/checker is re-written to take advantage of the real type annotations in the usage-side code. 5) (Optional) Do away with the separate interfaces developed in step 1 and move the type declarations into the implementation of the module. Bill From paul@prescod.net Wed Dec 15 02:03:24 1999 From: paul@prescod.net (Paul Prescod) Date: Tue, 14 Dec 1999 18:03:24 -0800 Subject: [Types-sig] Re: Pascal style declarations References: Message-ID: <3856F6EC.4FD5860E@prescod.net> Greg Stein wrote: > > "int" is a valid expression, which is valid on the same line after a > function definition. For example: > > def funx(x, y): foo() ; return 5 Well, first, I don't think that we are going to allow functions as return type specifications. Use assert for runtime assertions. Second, Python needs to use look-ahead to tell the difference between parentheses used for parsing a tuple and used for bracketing, doesn't it? > We already have expr-based (the "assert" statement) -- we can assert types > on expressions anywhere. It is just a little less convenient since we must > place the expression value into a temporary variable, assert the type of > that, then continue with the expression. The "type-assert operator" > simplifies this process dramatically. Sure, but why not just use function call syntax? Or maybe Java/C++ (cast) syntax? > Jeremy wrote: > > I think I agree with you as far as local variables. It becomes quite > > interesting when you're talking about attributes of objects, e.g. what > > is the type of the closed attribute of a builtin file object. (For > > that matter, what is the type of the builtin open function and how > > does it differ from a function that returns a StringIO object?) > Greg Stein wrote: > Ah! Good point. I think this is where interfaces come in. Otherwise, it > becomes very difficult to syntactically specify the types of attributes. When we specify the types of attributes, we will be talking about those attributes by name, not by expression or value. So we need a syntax for specifying types of names *and* expressions. If we use function syntax for expressions casts then we can reduced the syntactic overload. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things to be wary of: A new kid in his prime A man who knows the answers, and code that runs first time http://www.geezjan.org/humor/computers/threes.html From paul@prescod.net Wed Dec 15 02:39:55 1999 From: paul@prescod.net (Paul Prescod) Date: Tue, 14 Dec 1999 18:39:55 -0800 Subject: [Types-sig] Re: expression-based type assertions (was: Static typing considered...UGLY) References: Message-ID: <3856FF7B.3AA32F18@prescod.net> Greg Stein wrote: > > ... > In fact, I would even say that the only type declarations used would be > associated with function params and returns (and not variable). How do we handle attribute values? We can't just say "interfaces" unless we agree that interfaces allow type declarations to be associated with instance variables. And if we start associating type declarations with attribute names as we do parameter names, why wouldn't we also allow that for local and global variables? 
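One low-tech way to picture "types of attributes by name", without any new syntax at all, is a per-class table consulted at assignment time. This is only an illustration of the idea; the __attr_types__ name and the runtime check are inventions of this sketch, not a proposal:

    class Point:
        __attr_types__ = {"x": int, "y": int, "label": str}

        def __setattr__(self, name, value):
            expected = self.__attr_types__.get(name)
            if expected is not None and not isinstance(value, expected):
                raise TypeError("%s must be %s" % (name, expected.__name__))
            self.__dict__[name] = value

    p = Point()
    p.x = 3                       # fine, x is declared int
    try:
        p.label = 42              # rejected, label is declared str
    except TypeError:
        print("caught the bad assignment")

A compile-time checker would read the same table statically instead of paying for the check on every assignment.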
-- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things to be wary of: A new kid in his prime A man who knows the answers, and code that runs first time http://www.geezjan.org/humor/computers/threes.html From paul@prescod.net Wed Dec 15 02:41:56 1999 From: paul@prescod.net (Paul Prescod) Date: Tue, 14 Dec 1999 18:41:56 -0800 Subject: [Types-sig] Shadow File Opinions? References: <99Dec14.192724pst."3587"@watson.parc.xerox.com> Message-ID: <3856FFF4.1C1A4AD@prescod.net> Okay, shadow files seem to be a hit. Bill, while you're here, could you help me out with the CORBA IDL POV on generic types? Does IDL support parameterization? > 2) Add a type inferencer that checks code using a module against the > interface for that module. Provided you don't kill yourself writing > the type inferencer (which almost happened here attempting the type > inferencing system for SchemeXerox :-), you can now make some limited > type checking available. This is the part that scares the hell out of me! -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things to be wary of: A new kid in his prime A man who knows the answers, and code that runs first time http://www.geezjan.org/humor/computers/threes.html From peter.sommerfeld@gmx.de Wed Dec 15 06:36:10 1999 From: peter.sommerfeld@gmx.de (Peter Sommerfeld) Date: Wed, 15 Dec 1999 06:36:10 +0000 Subject: [Types-sig] Re: Pascal style declarations Message-ID: <199912150536.AAA25023@python.org> Paul Prescod wrote: > Well, first, I don't think that we are going to allow functions as > return type specifications. Use assert for runtime assertions. I don't see a reason for this limitation. It would seriously restrict future introduction of closures into python. def format(string s) -> def(string); -- Peter From tim_one@email.msn.com Wed Dec 15 09:08:44 1999 From: tim_one@email.msn.com (Tim Peters) Date: Wed, 15 Dec 1999 04:08:44 -0500 Subject: [Types-sig] List of FOO In-Reply-To: <3855DBBB.6D1B462A@prescod.net> Message-ID: <000d01bf46db$fdeccd00$05a0143f@tim> [Paul Prescod] > It took two years to get the parameterized version of the Java type > system up and running. Ya, but they took this stuff seriously . > Let me ask this your opinion on this question (seriously, not > sarcastically), should we include a spelling for "list of > string" [""] > and not "callable taking list of callables taking strings returning > integers returning string" ["" -> 0] -> "" > and what about "callable taking list of callables taking > and R returning list of callables taking and returning ." The last "returning " is ambiguous. You may mean: [(T, R) -> [R -> None]] -> T or [(T, R) -> [R -> T]] -> None > You see my problem? I don't. The convolution comes not from the concepts but from the attempt to express them in English. If the formalism introduced above is too concise, there are a gazillion other ways to spell it; e.g., List of String Func(List of Func(String)->Int)->String Func(List of Func(T, R)->List of Func(R))->T Func(List of Func(T, R)->List of Func(R)->T) > I could special case "list of" as Java and C did if we agreed to > take our chances that my syntax would be extensible. Ack, no -- start with a general scheme, so special cases aren't necessary. Although it's *pleasant* if Python's builtin types get especially nice syntax. BTW, the concise form above is much like what Haskell uses. 
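Purely to show that the concise notation maps onto plain data, here is one made-up encoding of the third signature above; the Func and ListOf helpers are inventions of this sketch, not anybody's proposed spelling:

    class Func:
        def __init__(self, args, result):
            self.args, self.result = args, result

    class ListOf:
        def __init__(self, item):
            self.item = item

    # Func(List of Func(String)->Int)->String
    sig = Func(args=(ListOf(Func(args=(str,), result=int)),), result=str)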
panic-is-always-premature-ly y'rs - tim From tim_one@email.msn.com Wed Dec 15 09:08:50 1999 From: tim_one@email.msn.com (Tim Peters) Date: Wed, 15 Dec 1999 04:08:50 -0500 Subject: [Types-sig] RFC 0.1 In-Reply-To: <3855E950.AE0E3E19@prescod.net> Message-ID: <000e01bf46dc$0050ada0$05a0143f@tim> [Paul Prescod] > ... > In theory, but in practice "whole-program X" seems to never get > implemented (in Python or elsewhere!), as in "whole program type checks" > and "whole program optimization" and "whole program flow analysis." > "Whole program analysis" tends to be an excuse to put off work (roughly > like "type inference"). Whole-program type inference is the *norm* in the functional language world -- although they design the languages to make this provably possible in all cases(possible != easy -- it's not). Experienced f.p. programmers nevertheless explicitly name all their types and explictly declare all their vrbls of non-trivial types; else the unification algorithms that deduce most-general types yield incomprehensible error msgs; e.g., if you have a function that you *think* of as taking a list of ints, you don't know what the compiler is talking about if you forget to declare it as such and the type inferencer bitches about being unable to unify two type expressions that take five lines each to spell <0.5 wink>. The more general the language, the more benefit there is for *people* to be able to declare types, in their role as code readers. So in addition to Guido's OPT and ERR, add COM -- for "make this mess COMprehensible" . > ... > If we invent new, syntactically distinct spellings then we can > syntactically recognize them and complain if they aren't spelled > "exactly right" (i.e. in a statically analyzable way). [Guido] >> As long as an easy mechanical transformation to valid Python >> 1.5.x is available, I'd be happy. [PP] > ... > With all due respect my problem is that you took the obvious (or at > least traditional) instance variable declaration syntax and used it > as a class variable declaring syntax. Okay, let's try this: > > class foo: > types.IntType, a=5 > > def __init__( self ): > types.ListType, self.b > > That looks equally ugly to me. Got any other ideas? Don't try to overload existing syntax, either asserts or (as above) tuple syntax. That confuses both the overloader and the overloadee. Guido just *begged* us to suck up a new keyword! For lack of a better word, say type declarations are in "decl" stmts. It doesn't matter to me, but what does matter is that once you get your own statement, you can also define the syntax of that statement; e.g., class foo: decl a: int # slop in a const too, if you like a = 5 def __init__(self): decl member b: List of Any # or put that at class level -- where it belongs Resist the dubious temptation to conflate declaration with initialization, and "an easy mechanical transformation to valid Python 1.5.x" consists of commenting out the decl stmts! Heck, call the keyword "#\s+decl\s+" and it's a nop. > ... > if we are ever going to get to full polymorphic parametric static > type checking we will have to acknowledge that the type system will > have hard parts just as the language has hard parts. Indeed it will. 
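The "mechanical transformation" really is mechanical; a throwaway filter along these lines would do it, assuming (an assumption of this sketch, not part of any proposal) that decl only ever starts a logical line:

    import re

    _DECL = re.compile(r"^(\s*)decl\b")

    def strip_decls(source):
        # Comment out any line whose first token is "decl"; leave the rest alone.
        out = []
        for line in source.split("\n"):
            m = _DECL.match(line)
            if m:
                out.append(m.group(1) + "# " + line[len(m.group(1)):])
            else:
                out.append(line)
        return "\n".join(out)

    print(strip_decls("class foo:\n    decl a: int\n    a = 5"))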
but-in-a-pythonically-soft-way-ly y'rs - tim From m.faassen@vet.uu.nl Wed Dec 15 09:32:07 1999 From: m.faassen@vet.uu.nl (Martijn Faassen) Date: Wed, 15 Dec 1999 10:32:07 +0100 Subject: [Types-sig] RFC 0.1 References: <38565074.9E51515F@prescod.net> <38568C1A.D1BCF530@vet.uu.nl> <3856A1CB.B5470782@prescod.net> Message-ID: <38576017.75252FF7@vet.uu.nl> Paul Prescod wrote: > > Martijn Faassen wrote: > > > > I agree with this, which is I am advocating a strong split (for > > simplicity) of fully-statically checked code and normal python code. > > I don't see this as buying much simplicity. And I do see it as requiring > more work later. I also see it as scaring the bejeesus out of many > static type system fence sitters. Can you demonstrate that it makes our > life easier to figure out integration issues later? Sure, but we're bound to scare the bejeesus out of everyone anyway; we're proposing a major change to Python. The 'simplicity' part comes in because you don't need *any* type inferencing. Conceptually it's quite simple; all names need a type. > > Later on you can work on blurring the interface between the two. First > > *fully* type annotated functions (classes, modules, what you want), > > which can only refer to other things that are fully annotated. By 'fully > > annotated' I mean all names have a type. > > I think that's a non-starter because it will take forever to become > useful because the standard library is not type-safe. Anyhow I fell like > I've *already solved* the problem of integration so why would I undo > that? Okay, I will need to figure out your solution then. :) > > I keep disagreeing with Paul's > > simplification of initially throwing out constructed types such as list > > of integer, as that would break my own approach at simplicity. :) > > If I'm making this problem harder than it needs to be then I'm happy to > accept your simple solution for parameterized types as soon as I > understand it. I'll try to clean up my swallow.py demo module. It doesn't demonstrate much, just a way a type system could work using Python dicts and such to construct complicated types. > > If we throw out the syntax issue and use Python constructs for types > > until we know more, we'll all be happier, right? :) The syntax will be > > clear when the semantics is. Guido is good at syntax, let him figure out > > a good syntax for it, let's just focus on the semantics. > > Well, we need SOME syntax in order to communicate. Anyhow... Right, but just Python code will do for communication. It's clear as we all understand it already. It looks horrible, but we can work on that later. > > Our static type checker/compiler can use the Python type constructions > > directly. We can put limitations on them to forbid any type > > constructions that the compiler cannot fully evaluate before the > > compilation of the actual code, of course, just like we can put > > limitations on statically typed functions (they shouldn't be able to > > call any non-static functions in the first iteration of our design, I'm > > still maintaining) > > I see no reason for that limitation. The result of a call to a > non-static function is a Pyobject. You cast it in your client code to > get type safety. Just like the shift from K&R C to ANSI C. Functions > always (okay, often) returned "ints" but you could cast them to foo *'s. Sure, that's why I say it's easy to start blurring things later. 
This would require runtime manipulation of bytecodes or something to insert a type cast or assertion, while a fully annotated module can be fully checked statically and thus this type of runtime manipulation can be delayed until later. > > Doesn't C++ already have parameterized types? (template classes and > > such?). > > Yes. I was just pointing out that in a year and a half Java will have > them too which will put a lot of pressure on us. We already have parameterized types that are fully dynamic in Python now, don't we, really? :) Regards, Martijn From m.faassen@vet.uu.nl Wed Dec 15 09:39:34 1999 From: m.faassen@vet.uu.nl (Martijn Faassen) Date: Wed, 15 Dec 1999 10:39:34 +0100 Subject: [Types-sig] Shadow File Opinions? References: <385691D3.6DC4A36E@vet.uu.nl> <3856AC68.2D5FCF37@prescod.net> Message-ID: <385761D6.BD8B8E5F@vet.uu.nl> Paul Prescod wrote: > > Martijn Faassen wrote: > > > > ... > > While my agenda is to kill the syntax discussions for the moment, I'd > > propose a seperate declaration syntax before all others, because this is > > the most syntactically compatible with Python. And easier on the > > programmer. > > I'm considering your argument carefully. If we make separate interface > files then we get Python 1.5 (hell, Python 1.0) compatibility "for free" > and we can experiment with different syntaxes without breaking Python > code. Plus we could use IDL and type libraries for type analysis > *already*. > > I think the final product must allow inline declarations but I am > starting to think that in the short term, "interface definition" files > are the way to go not just for builtin modules but for all modules. > > Do others agree? I agree. I'm not sure I'm others, though. :) > > Imagine you have a module. Now you want to make it fully statically > > typed. With most syntax proposals I've seen you'd have to go through the > > code and add type declarations here and there, mix it with the current > > code. > > I think that any proposal that requires you to keep two separate files > "in sync" is bound to fail in the long term. I left that crap behind in > C++. But in the short term...okay. Right - in the longer term we'll have a nice syntax, but it's too soon for syntax right now. > > With either a Python based system as I'm proposing (ugly but powerful > > and fairly simple), or a seperate type declaration system, you have your > > type declarations separated from the code itself. This means you easily > > add and remove type information and switch between a statically typed > > module and a dynamically typed module easily. > > But there is not going to be alot of "switching". You add declarations > and you leave them there. You update them when they get out of sync with > the code. Why would you want to take a nice, safe, optimized module that > you have gone to the effort of type annotating and hide the annotations? Hiding the annotations may be useful (on the short term, at least). You can use existing the Python interpreter to test your module even if you have added type annotations. That's nice for development/debugging, including the development and debugging of the type annotation system. You can say 'hey, Python does this to my code when I only pass strings in, but our static type checker/compiler/asserter barfs at it'. If you have Python code already sprinkled with annotations you need two source files for the same module, one with annotations, one without. You can automate this but it's not as nice. 
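Since the annotations in this scheme are ordinary Python data, today's interpreter really does ignore them, and a first "checker" can be nothing more than a loop over the table. A toy version, covering only the degenerate case of simple value types (the __types__ name and the isinstance test are just illustrative):

    def check_module(mod):
        # Compare each annotated name against the object actually bound to it.
        problems = []
        for name, expected in getattr(mod, "__types__", {}).items():
            if not hasattr(mod, name):
                problems.append("%s is declared but not defined" % name)
            elif not isinstance(getattr(mod, name), expected):
                problems.append("%s is not a %s" % (name, expected.__name__))
        return problems

    import string                       # any module will do for the demo
    string.__types__ = {"digits": str}  # pretend the module carried this table
    print(check_module(string))         # -> []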
Regards, Martijn From m.faassen@vet.uu.nl Wed Dec 15 09:53:42 1999 From: m.faassen@vet.uu.nl (Martijn Faassen) Date: Wed, 15 Dec 1999 10:53:42 +0100 Subject: [Types-sig] List of FOO References: <3855DBBB.6D1B462A@prescod.net> <199912141649.LAA23593@eric.cnri.reston.va.us> <38569478.40E29421@vet.uu.nl> <3856A9B4.7636C0C9@prescod.net> Message-ID: <38576526.78AD224C@vet.uu.nl> Paul Prescod wrote: > > Martijn Faassen wrote: > > > > I agree completely, and one *can* express most of this pretty easily in > > current Python, i.e.: > > > > types = { > > "bar": IntType, > > "baz": ListType(IntType), > > "hey": IntType, > > "foo3": FunctionType(args=(IntType,), result=IntType), > > > > "crazy" : ListType(FunctionType(args=(ListType(IntType), > > StringType), result=DictType(StringType, > > > > FunctionType(args=None, > > result=StringType))) > > } > > Questions: > > 1. This system is supposed to be extensible, right? So I could, for > instance, define a binary tree module and have "binary trees of ints" > and "binary trees of strings." How do I define the binary tree class and > state that it is parameterizable? Good question; so far I only thought about making built in types (such as list) parameterizable. One could however do something similar with classes, though: __typedefs__ = { "parameterized_class" : ParameterizedClassTypeDef(parameters=('foo',), members = { "alpha" : 'foo', "beta" : IntType } ) } __types__ = { "integer_class" : ParameterizedClassType('parameterized_class', parameters = { "foo" : IntegerType }) } Something like that, at least. I know it looks absolutely horrible, but it's workable. :) > 2. How does this work with interfaces? "ListType" is cheating. We need > SequenceType because that's not implementation specific. And > SequenceType needs to be defined by an interface, not a class. I just basically took the standard module types and replaced them with parameterizable classes, but you could come up with SequenceType if you like. I'm often in quite an OPT frame of mind. But even outside that, ListType does say something about the interface. A TupleType parameter cannot be changed inside the function, but a ListType parameter can. That's a huge difference for the interface. > 3. What does "tuple of int, string" look like? And should we have list > length parameters? I haven't fully worked this out yet, but you can fill in details yourself. :) HeterogenousTupleType(elementtypes = (IntType, StringType)) I don't know if we should have list length parameters. Regards, Martijn From tim_one@email.msn.com Wed Dec 15 10:08:54 1999 From: tim_one@email.msn.com (Tim Peters) Date: Wed, 15 Dec 1999 05:08:54 -0500 Subject: [Types-sig] A case study In-Reply-To: <199912141519.KAA23476@eric.cnri.reston.va.us> Message-ID: <000f01bf46e4$650557c0$05a0143f@tim> [Guido] > ... > What do we now know about the type of the list variable? It was > initialized to an empty List. It's still a List, and we know > that at least some of its items are lists. Are all its items > lists? This gets us into similar issues as the recursive call > to find() before, and just as there, I'm not sure that we really > do, so maybe we need to continue the single type hypothesis. (One > way out would be to assume the single type hypothesis until we > see positive proof to the contrary, and if so, redo the analysis > with a less restricted type.) These are all std problems in dataflow analysis. 
Conceptually: you have a program graph (rooted and directed), where nodes are basic blocks (single entry, single exit), and arcs represent control flow. Associate (still conceptually) with every node a table mapping every name to the set of types it may have upon entry to the block, and another table doing likewise for block exit. Initialize all these to empty sets (that is, replace your "single type" hypothesis with the "no type" hypothesis!). Traverse the graph. Each block has certain effects on its exit type mappings. These need to propogate to the block's successors. At each block entry, the set of types a name maps to is just the union of the set of types the name maps to at the exits of all predecessor blocks. The root of a function's graph is a slightly special case, in that the arglist acts like a predecessor block for this purpose. You continue propagating changes until you reach a steady state. Meaning that, for each node, the entry map equals the union of the predecessors' exit maps, and the exit map is consistent with the entry map as modified by the bindings in the block. The hard parts are changing this from intuitive conception to efficient implementation (global dataflow analysis can consume enormous amounts of memory -- all blocks * all names * all functions * all modules == a whole lot), and in crafting the type system so that you know a priori that you *must* reach a steady state (e.g., it's probably not a good idea to say that lists whose length is a prime number constitute "a type" <0.31 wink>). Freebie: if, at the end of this, there exists a block and a local name such that the first occurrence of the name within the block is a reference, and the name is still associated with the empty set in the block's entry map, you've got an UnboundLocalError waiting to happen (provided the block is reachable). Semi-freebie: If the first occurrence of a local name within a block is a reference, and at least one of the block's predecessors associates this name with the empty set in its exit map, you've got something very close to a violation of Java's "definite assignment" rules. That is, there is *a* path in which this name may not be bound before reference; although you cannot, in general, prove that it's *possible* for that path to occur at runtime. Java gives a fatal error anyway, and after the first hour I came to like that. So your intuition is on the right track here. What I can add as a former Professional Compiler Writer is my Professional Assurance that making this all run efficiently (in either time or space) is a Professional Pain in the Professional Ass. Because of this, global analysis never works out in practice unless you invent an efficient database format to cache the results of analysis, keeping that in synch with the source base under mutation. It's all too easy to come up with a toy system that absolutely will not scale to real life! Python has an advantage, though, in that most people write very small functions and methods most of the time. If you can, in addition, avoiding needing to deduce the types of most globals, it could actually fly before we're all dead . 
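As a toy illustration of that propagation, and nothing more -- the graph shape, the per-block effects and the type names below are all invented for the example -- the union-until-steady-state loop fits in a few lines:

    def propagate(blocks, preds, effects):
        # entries/exits map each name to the set of type names it may have there.
        entries = {b: {} for b in blocks}
        exits = {b: {} for b in blocks}
        changed = True
        while changed:                          # iterate until a steady state
            changed = False
            for b in blocks:
                new_entry = {}
                for p in preds[b]:              # union of the predecessors' exits
                    for name, ts in exits[p].items():
                        new_entry.setdefault(name, set()).update(ts)
                new_exit = {n: set(ts) for n, ts in new_entry.items()}
                new_exit.update(effects[b])     # names re-bound in the block win
                if new_entry != entries[b] or new_exit != exits[b]:
                    entries[b], exits[b] = new_entry, new_exit
                    changed = True
        return entries, exits

    # The result-dict-then-list shape, squeezed into three invented blocks:
    blocks = ["init", "loop", "tail"]
    preds = {"init": [], "loop": ["init", "loop"], "tail": ["loop"]}
    effects = {"init": {"result": {"dict"}},
               "loop": {"candidate": {"int"}},
               "tail": {"result": {"list"}}}
    entries, exits = propagate(blocks, preds, effects)
    print(exits["tail"])    # result ends up a list here; candidate may be an int

A real implementation has all the efficiency headaches described above; the point of the sketch is only that the fixed-point idea itself is small.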
but-civilization-ends-in-a-few-weeks-anyway-ly y'rs - tim From tim_one@email.msn.com Wed Dec 15 10:32:20 1999 From: tim_one@email.msn.com (Tim Peters) Date: Wed, 15 Dec 1999 05:32:20 -0500 Subject: [Types-sig] RFC 0.1 In-Reply-To: <199912141537.KAA23487@eric.cnri.reston.va.us> Message-ID: <001001bf46e7$aafc14a0$05a0143f@tim> [Greg Stein] > Note that one benefit of associating types with names, is that > you can shortcut the data flow analysis (so the analysis is not > necessarily the same). But: you cannot have a name refer to > different types of objects (which I don't like; it reduces some > of Python's polymorphic and dynamic behavior (interfaces solve > the polymorphism stuff in a typed world)). [Guido] > This is a bogus argument. From the point of view of human > readability, I find this: > > s = "the quick brown fox" > s = string.split(s) > del s[1] # the fox is getting old > s = string.join(s) > > less readable and more confusing than this: > > s = "the quick brown fox" > w = string.split(s) > del w[1] # the fox is getting old > s = string.join(w) > > The first version gives polymorphism a bad name; it's like a sloppy > physicist using the same symbol for velocity and accelleration. It's an excellent example, but to me the *first* is easier to follow! In the 2nd I'm left wondering what further use will be made of w, so have to try to keep w *and* s alive in my short-term memory. In the 1st, I can scrub my brain cleaner harder oftener. Heck, I wrote this just last week -- and deliberately: result = {} for i in xrange(k): # The expected # of times thru the next loop is n/(n-i). # Since i < k <= n/2, n-i > n/2, so n/(n-i) < 2 and is # usually closer to 1: on average, this succeeds very # quickly! while 1: candidate = int(random() * n) if not result.has_key(candidate): result[candidate] = 1 break result = result.keys() result.sort() return result At the start of its life, the result is a (conceptual) set, and at the end it's a list with the same stuff. That's not confusing -- it's helpful! It wouldn't confuse a decent type-inference engine, either ("result" is a dict until the block starting with the .keys() call, and is a list thereafter; it's not even a "union type" -- at any given point, it's always one or the other). > ... > Note that a type inferencer may not be able to deduce the rules I > stated above, since you could construct an example where there is no > single type and yet the whole thing works. E.g. I could > create a list [1, 2, 3, joint, "a", "b", "c"] where joint is an > instance of a class that when added to an int returns a string. Now *that's* what gives polymorphism a bad name <0.9 wink>. > However if we had a typesystem and notation that couldn't express > this easily but that could express the stricter rules, I bet that > no-one would mind adding the stricter type declarations to the > code, since those rules most likely express the programmer's intent > better. I agree there's little payback in making a type system that can represent everything possible, simply because 99% of the benefit is in capturing vanilla types (which certainly includes lists of X and dicts mapping X to Y and functions taking lists of X returning dicts mapping Y to lists of Z ...). 
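For what it's worth, the "joint" object is easy to conjure up, which is exactly why it gives an inferencer fits. A throwaway version, with the class and its string results made up to match the description:

    class Joint:
        def __radd__(self, other):          # handles 3 + joint
            return "%d and counting" % other
        def __add__(self, other):           # handles joint + "a"
            return "joint" + other

    joint = Joint()
    seq = [1, 2, 3, joint, "a", "b", "c"]
    total = seq[0]
    for item in seq[1:]:
        total = total + item                # int+int, int+Joint, then str+str ...
    print(total)                            # one string at the end, no errors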
shocked-at-what-some-people-find-unreadable-ly y'rs - tim From gstein@lyra.org Wed Dec 15 10:42:25 1999 From: gstein@lyra.org (Greg Stein) Date: Wed, 15 Dec 1999 02:42:25 -0800 (PST) Subject: [Types-sig] Re: Pascal style declarations In-Reply-To: <3856F6EC.4FD5860E@prescod.net> Message-ID: On Tue, 14 Dec 1999, Paul Prescod wrote: > Greg Stein wrote: > > > > "int" is a valid expression, which is valid on the same line after a > > function definition. For example: > > > > def funx(x, y): foo() ; return 5 > > Well, first, I don't think that we are going to allow functions as > return type specifications. Use assert for runtime assertions. I was pointing out that the above code is currently-valid, "real" code. It is a counter-example to the notion that you can use the "def foo(): type:" syntax. I've already stated in the past that type declarators should only be dotted names. > Second, Python needs to use look-ahead to tell the difference between > parentheses used for parsing a tuple and used for bracketing, doesn't > it? A very different problem. In both cases, you can have an expression following that open parentheses. When you see a comma or a right-parent, then you know what to do with the whole thing. In the function definition, you don't know whether the thing following the colon is a type-declarator, or an expression (where an expression is a valid type of statement, which is part of a suite). In other words, when the parser starts consuming stuff, it doesn't know whether it is consuming a "suite" or a "typedecl". Therefore, you must create a pseudo grammar element which means "it is one of these two, but I don't know YET." At some point you figure it out and transition to the right part of the grammar. This is what I meant when I said it would be possible, but not pretty. I might even venture to say that Guido just plain wouldn't allow this kind of thing in the Python grammar! :-) > > We already have expr-based (the "assert" statement) -- we can assert types > > on expressions anywhere. It is just a little less convenient since we must > > place the expression value into a temporary variable, assert the type of > > that, then continue with the expression. The "type-assert operator" > > simplifies this process dramatically. > > Sure, but why not just use function call syntax? Or maybe Java/C++ > (cast) syntax? Grammar construction issues. The cast would be difficult -- again the issue of determining "(" typedecl ")" vs. "(" expr ")" (presuming that a typedecl cannot be an arbitrary expression. Until you know what the parse element is (typedecl vs expr), you cannot apply the appropriate restrictions (e.g. typedecl is only a dotted name or some other new typedecl syntax that gets invented). C/C++/Java can tell because a name has an associated name-type (e.g. typedef, variable); once the first symbol inside that "(" is seen, it can figure out which parsing form is occurring. A function call syntax could be possible, but again: is the function part (before the open paren) an expression or a typedecl? If it looks just like a function call, then how do you know it is a type assertion? For example: class Foo: ... x = Foo(y) Is that an assertion that y is of type Foo, or is it a constructor? > > Jeremy wrote: > > > I think I agree with you as far as local variables. It becomes quite > > > interesting when you're talking about attributes of objects, e.g. what > > > is the type of the closed attribute of a builtin file object. 
(For > > > that matter, what is the type of the builtin open function and how > > > does it differ from a function that returns a StringIO object?) > > > Greg Stein wrote: > > Ah! Good point. I think this is where interfaces come in. Otherwise, it > > becomes very difficult to syntactically specify the types of attributes. > > When we specify the types of attributes, we will be talking about those > attributes by name, not by expression or value. So we need a syntax for > specifying types of names *and* expressions. If we use function syntax > for expressions casts then we can reduced the syntactic overload. As mentioned above: you cannot function syntax (somebody educate me if you believe otherwise). All right. I'll modify my statement: * new syntax to specify param and return types * new syntax to specify attribute types [as part of an interface defn?] * type assert operator Note the specific lack of syntax for specifying *variable* types. We aren't typing names, just interfaces (and yes, they happen to have names, but it is truly a different concept right there). In other words, I don't agree with your statement about typing names and expressions. I say we provide: * types for function params, return values * types for attributes [via interfaces rather than syntax?] * a type assertion operator * compile-time (and runtime) checks for the above usages [ and in a case where all your (called) functions have type decls (e.g. os.listdir()), then your code probably doesn't need any assertions since inferencing is enough; Guido's case study shows that you can infer *everything*, but it would be a lot easier once you have inference boundaries established at the function/method calls. ] Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Wed Dec 15 10:47:24 1999 From: gstein@lyra.org (Greg Stein) Date: Wed, 15 Dec 1999 02:47:24 -0800 (PST) Subject: [Types-sig] Re: expression-based type assertions In-Reply-To: <3856FF7B.3AA32F18@prescod.net> Message-ID: On Tue, 14 Dec 1999, Paul Prescod wrote: > Greg Stein wrote: > > > > ... > > In fact, I would even say that the only type declarations used would be > > associated with function params and returns (and not variable). > > How do we handle attribute values? We can't just say "interfaces" unless > we agree that interfaces allow type declarations to be associated with > instance variables. And if we start associating type declarations with > attribute names as we do parameter names, why wouldn't we also allow > that for local and global variables? This was covered elsewhere, but for completeness... We handle attribute values thru interfaces, which associate typedecls with attributes. (and yes, an instance variable is an attribute) I do not see a logical extension of that framework that states you should also provide typedecls for variables (local/global). Specifying the type of an attribute is a very different matter from specifying the type of a global. As I've stated: I think specifying the type of a local/global is needless syntactic sugar, which Python (thankfully) has a minimum of. Note that modules and classes each have an interface (to establish type info). 
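A back-of-the-envelope picture of "interfaces associate typedecls with attributes", just to make the shape of the idea visible; the Interface class, its attrs table and the conforms() check are inventions of this sketch, not the design under discussion:

    class Interface:
        def __init__(self, **attrs):        # attribute name -> expected type
            self.attrs = attrs
        def conforms(self, obj):
            for name, expected in self.attrs.items():
                if not isinstance(getattr(obj, name, None), expected):
                    return False
            return True

    IPoint = Interface(x=int, y=int)

    class Point:
        def __init__(self):
            self.x, self.y = 0, 0

    print(IPoint.conforms(Point()))         # true: the attribute types line up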
Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Wed Dec 15 10:50:44 1999 From: gstein@lyra.org (Greg Stein) Date: Wed, 15 Dec 1999 02:50:44 -0800 (PST) Subject: [Types-sig] Re: Pascal style declarations In-Reply-To: <199912150536.AAA25023@python.org> Message-ID: On Wed, 15 Dec 1999, Peter Sommerfeld wrote: > Paul Prescod wrote: > > Well, first, I don't think that we are going to allow functions as > > return type specifications. Use assert for runtime assertions. > > I don't see a reason for this limitation. It would seriously > restrict future introduction of closures into python. > > def format(string s) -> def(string); I think he was saying that you can't use a runtime-computed type declaration. That is different than saying you can't define functional types. In other words: nobody is suggesting that you cannot declare a function type as a return value. Regardless, this thread is bogus. Nobody even said that runtime-computed types should be allowed. Paul mistook my counter-example as a typedecl. See my response to his email. Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Wed Dec 15 11:09:46 1999 From: gstein@lyra.org (Greg Stein) Date: Wed, 15 Dec 1999 03:09:46 -0800 (PST) Subject: [Types-sig] RFC 0.1 In-Reply-To: <001001bf46e7$aafc14a0$05a0143f@tim> Message-ID: Woo hoo! Tim to the rescue! :-) On Wed, 15 Dec 1999, Tim Peters wrote: > [Greg Stein] ... me saying that it is nice for names to have different types... > > [Guido] ... Guido saying that "feature" is less readable ... > > It's an excellent example, but to me the *first* is easier to follow! In > the 2nd I'm left wondering what further use will be made of w, so have to > try to keep w *and* s alive in my short-term memory. In the 1st, I can > scrub my brain cleaner harder oftener. Yup. I might use two variables myself in that example, but using a single name can definitely be easier in some cases... > Heck, I wrote this just last week -- and deliberately: ... Tim's example code ... > At the start of its life, the result is a (conceptual) set, and at the end > it's a list with the same stuff. That's not confusing -- it's helpful! It > wouldn't confuse a decent type-inference engine, either ("result" is a dict > until the block starting with the .keys() call, and is a list thereafter; > it's not even a "union type" -- at any given point, it's always one or the > other). Ha! I posted something just like this just the other day: http://www.python.org/pipermail/types-sig/1999-December/000518.html Basically: I *totally* agree, and this is primarily the time when I use a single variable name for two different types. This is also a reason why I'd like to avoid the notion of associating a type with a [variable] name. Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Wed Dec 15 11:33:46 1999 From: gstein@lyra.org (Greg Stein) Date: Wed, 15 Dec 1999 03:33:46 -0800 (PST) Subject: [Types-sig] List of FOO In-Reply-To: <3856A9B4.7636C0C9@prescod.net> Message-ID: On Tue, 14 Dec 1999, Paul Prescod wrote: > Martijn Faassen wrote: ... example ... > 1. This system is supposed to be extensible, right? So I could, for > instance, define a binary tree module and have "binary trees of ints" > and "binary trees of strings." How do I define the binary tree class and > state that it is parameterizable? Dunno. I'll leave that for some other brainiac. :-) As Tim pointed out: you'll get 99% of your benefit from handling a half-dozen builtin types and their composites. Lessee... 
int, long, float, complex, list, dict, tuple, func, class 2nd order: numeric, sequence, mapping I think a big question is whether you provide syntax, like what Tim just posted recently (e.g. ["" -> 0] -> None), and/or whether you use/allow names (which refer to Types) (e.g. [StringType -> IntType] -> None). If you allow names, rather than pure syntax, then the compiler will need to infer what type the name refers to. Note that the presence of classes means that names are probably required in some way. > 2. How does this work with interfaces? "ListType" is cheating. We need > SequenceType because that's not implementation specific. And > SequenceType needs to be defined by an interface, not a class. We need both. It is perfectly acceptable to state that a List is required. > 3. What does "tuple of int, string" look like? And should we have list > length parameters? I think Martijn answered this one. Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Wed Dec 15 11:43:53 1999 From: gstein@lyra.org (Greg Stein) Date: Wed, 15 Dec 1999 03:43:53 -0800 (PST) Subject: [Types-sig] Shadow File Opinions? In-Reply-To: <3856AC68.2D5FCF37@prescod.net> Message-ID: On Tue, 14 Dec 1999, Paul Prescod wrote: > Martijn Faassen wrote: > > > > ... > > While my agenda is to kill the syntax discussions for the moment, I'd > > propose a seperate declaration syntax before all others, because this is > > the most syntactically compatible with Python. And easier on the > > programmer. > > I'm considering your argument carefully. If we make separate interface > files then we get Python 1.5 (hell, Python 1.0) compatibility "for free" > and we can experiment with different syntaxes without breaking Python > code. Plus we could use IDL and type libraries for type analysis > *already*. > > I think the final product must allow inline declarations but I am > starting to think that in the short term, "interface definition" files > are the way to go not just for builtin modules but for all modules. > > Do others agree? Interface files and/or Martijn's approach. Personally, I like Martijn's a bit better because you don't have to juggle two files. But yes: it solves a short-term problem of "what is the syntax for defining a module/class interface (its func and attr signatures)". Although I think func signatures are an easy syntactic extension which several people have provided samples for, so the interface can use that. The attributes of a module/class are the hard part. And no... I haven't read JimF's proposal yet to see his suggestion for how this might be done... it does apply to this problem. And here we tried to separate interfaces from the discussion :-) Suggestion: defer consideration of interfaces (whether via Martijn's approach or a separate file) for V2 of the type system design. For V1, let's concentrate on applying type signatures to functions (and variables if people insist :-), and any type inferencing that may be needed. I believe there are a lot of associated problems to handle before needing to throw the interface problem into the mix. Seriously, I only see interfaces as providing a way to define type info for attributes (within the context of this discussion; they have other uses). We have issues dealing with the existing modules, backwards/forwards compatibility, what constitutes type-safety, what checks are available, what runtime switches are used, etc. [ many of these types of details goes into the RFC Paul is putting together ] >... 
> > With either a Python based system as I'm proposing (ugly but powerful > > and fairly simple), or a seperate type declaration system, you have your > > type declarations separated from the code itself. This means you easily > > add and remove type information and switch between a statically typed > > module and a dynamically typed module easily. > > But there is not going to be alot of "switching". You add declarations > and you leave them there. You update them when they get out of sync with > the code. Why would you want to take a nice, safe, optimized module that > you have gone to the effort of type annotating and hide the annotations? Agreed. We don't need to support switching. Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Wed Dec 15 11:51:11 1999 From: gstein@lyra.org (Greg Stein) Date: Wed, 15 Dec 1999 03:51:11 -0800 (PST) Subject: [Types-sig] Compile-time or runtime checks? In-Reply-To: <3856ACF4.E19023D8@prescod.net> Message-ID: On Tue, 14 Dec 1999, Paul Prescod wrote: >... > I'm trying hard to separate the axes of: "I have some type declarations" > and "I want a static type checker to gurantee that this code is totally > type safe." This should be legal: Agreed. Good separation. > StringType > def foo(): > a=eval( sys.argv[1] ) > return a > > That means I want a runtime check. This should be illegal: > > type-safe > StringType > def foo(): > a=eval( sys.argv[1] ) > return a > > Here I've specifically asked for a compile time check and my code is not > up to snuff. Add the following in: type-safe StringType def foo(): a = eval(sys.argv[1]) ! StringType return a Now you have a type-safe function. :-) Of course, it might raise an exception, but your types are clean. (heck, the eval could raise an exception... type safety does not imply "no exceptions") (yah yah.. I recognize the same could be done with an "assert" statement, but I think the inferencer would not be as pleased trying to deal with that, as with the type-assert operator) Oh. That just made me think of something. "Exceptions which might be raised" is technically part of a signature. I say punt that to V2 :-) [ we could make some accomodation in the FuncObject to record a tuple of possible exceptions, and it would always contain (Exception,) in it for now... ] Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Wed Dec 15 12:12:37 1999 From: gstein@lyra.org (Greg Stein) Date: Wed, 15 Dec 1999 04:12:37 -0800 (PST) Subject: [Types-sig] RFC 0.1 In-Reply-To: <38576017.75252FF7@vet.uu.nl> Message-ID: On Wed, 15 Dec 1999, Martijn Faassen wrote: > Paul Prescod wrote: > > Martijn Faassen wrote: > > > I agree with this, which is I am advocating a strong split (for > > > simplicity) of fully-statically checked code and normal python code. > > > > I don't see this as buying much simplicity. And I do see it as requiring > > more work later. I also see it as scaring the bejeesus out of many > > static type system fence sitters. Can you demonstrate that it makes our > > life easier to figure out integration issues later? > > Sure, but we're bound to scare the bejeesus out of everyone anyway; > we're proposing a major change to Python. "We" ? I'm advocating a minimal change. Add a bit of grammar to function definitions. Add a new type-assert operator. Add Tim's "decl" statement for interfaces (caveat/todo: rationalize against JimF's proposal). Leave out the complexity of variable declarations. Note that I'd be okay with punting the "decl" / interfaces for now. 
That leaves a bit of "def" grammar changing and a new operator. To the Python programmer: *very* little change. > The 'simplicity' part comes in because you don't need *any* type > inferencing. Conceptually it's quite simple; all names need a type. 1) There is *no* way that I'm going to give every name a type. I may as well switch to Java, C, or C++ (per Guido's advice in another email :-) 2) You *still* need inferencing. "a = foo() + bar()" implies that some inferencing occurs. (for a compile-time check; the compiler can insert a runtime check to assert the type being assigned to "a" (but you know my opinion there...)) >... > > > Later on you can work on blurring the interface between the two. First > > > *fully* type annotated functions (classes, modules, what you want), > > > which can only refer to other things that are fully annotated. By 'fully > > > annotated' I mean all names have a type. > > > > I think that's a non-starter because it will take forever to become > > useful because the standard library is not type-safe. Anyhow I fell like > > I've *already solved* the problem of integration so why would I undo > > that? Agreed. Also, if I grab some module Foo from Joe, and he didn't add typedecls, then why shouldn't I be able to use it? (and I'd just add some type-asserts if that even mattered to me) >... > > > Our static type checker/compiler can use the Python type constructions > > > directly. We can put limitations on them to forbid any type > > > constructions that the compiler cannot fully evaluate before the > > > compilation of the actual code, of course, just like we can put > > > limitations on statically typed functions (they shouldn't be able to > > > call any non-static functions in the first iteration of our design, I'm > > > still maintaining) The compiler can issue a warning and insert a type assertion for a runtime check. IMO, it should not forbid you from doing anything simply because it can't figure out some type. Python syntax's "type agnosticism" is one of its major strengths. > > I see no reason for that limitation. The result of a call to a > > non-static function is a Pyobject. You cast it in your client code to > > get type safety. Just like the shift from K&R C to ANSI C. Functions Bunk! It is *not* a cast. You cannot cast in Python. It is a type assertion. An object is an object -- you cannot cast it to something else. Forget function call syntax and casting syntax -- they don't work grammatically, and that is the wrong semantic (if you're using that format to create some semantic equivalent to a cast). Cheers, -g -- Greg Stein, http://www.lyra.org/ From m.faassen@vet.uu.nl Wed Dec 15 12:54:25 1999 From: m.faassen@vet.uu.nl (Martijn Faassen) Date: Wed, 15 Dec 1999 13:54:25 +0100 Subject: [Types-sig] RFC 0.1 References: Message-ID: <38578F81.3F262F89@vet.uu.nl> Greg Stein wrote: > > On Wed, 15 Dec 1999, Martijn Faassen wrote: > > Paul Prescod wrote: > > > Martijn Faassen wrote: > > > > I agree with this, which is I am advocating a strong split (for > > > > simplicity) of fully-statically checked code and normal python code. > > > > > > I don't see this as buying much simplicity. And I do see it as requiring > > > more work later. I also see it as scaring the bejeesus out of many > > > static type system fence sitters. Can you demonstrate that it makes our > > > life easier to figure out integration issues later? > > > > Sure, but we're bound to scare the bejeesus out of everyone anyway; > > we're proposing a major change to Python. > > "We" ? 
> > I'm advocating a minimal change. Ahum: > Add a bit of grammar to function > definitions. Add a new type-assert operator. Add Tim's "decl" statement > for interfaces (caveat/todo: rationalize against JimF's proposal). Leave > out the complexity of variable declarations. > > Note that I'd be okay with punting the "decl" / interfaces for now. That > leaves a bit of "def" grammar changing and a new operator. > > To the Python programmer: *very* little change. The programmer needs to deal with the following new things and their consequences: * New grammar for function definitions. * A whole new operator (which you can't overload..or can you?), which does something quite unusual (most programmers associate types with names, not with expressions). The operation also doesn't actually return much that's useful to the program, so the semantics are weird too. * Interfaces with a new 'decl' statement. [If you punt on this, you'll have to tell the innocent Python programmer that he can't use the static type system with instances? Or will this be inferenced?] * Unspecified syntax to actually *specify* types; a ! operator with something syntactically wholly new behind it may not be that simple for the Python programmer either. It's not that hard with IntType and so on, but it gets complex if you have function types, class types, etc. * And then there's the type inferencer which will interact with the Python programmer's code as well, right? And the interpreter will spew out errors if compile time checks fail on types? And you call this '*very* little change'? I'll call adding a list with names of static type associations to the module 'an even *smaller* change' then, as you don't need any new operator or statement, at least to start with. :) Adding anything like static type checking to Python entails fairly major changes to the language, I'd think. Not that we shouldn't aim at keeping those transparent and mostly compatible with Python as it is now, but what we'll add will still be major. > > The 'simplicity' part comes in because you don't need *any* type > inferencing. Conceptually it's quite simple; all names need a type. > > 1) There is *no* way that I'm going to give every name a type. I may as > well switch to Java, C, or C++ (per Guido's advice in another email :-) Sure, but we're looking at *starting* the process. Perhaps we can do away with specifying the type of each local variable very quickly by using type inferencing, but at least we'll have a working implementation! > 2) You *still* need inferencing. "a = foo() + bar()" implies that some > inferencing occurs. > (for a compile-time check; the compiler can insert a runtime check to > assert the type being assigned to "a" (but you know my opinion > there...)) Sure, that's true. [me] > > > > Later on you can work on blurring the interface between the two. First > > > > *fully* type annotated functions (classes, modules, what you want), > > > > which can only refer to other things that are fully annotated. By 'fully > > > > annotated' I mean all names have a type. [Paul] > > > I think that's a non-starter because it will take forever to become > > > useful because the standard library is not type-safe. Anyhow I fell like > > > I've *already solved* the problem of integration so why would I undo > > > that? > > Agreed. Also, if I grab some module Foo from Joe, and he didn't add > typedecls, then why shouldn't I be able to use it?
> (and I'd just add some type-asserts if that even mattered to me) I'm not saying this is a good situation, it's just a way to get off the ground without having to deal with quite a few complexities such as inferencing (outside expressions), interaction with modules that don't have type annotations, and so on. I'm *not* advocating this as the end point, but I am advocating this as an intermediate point where it's actually functional. [me] > > > > Our static type checker/compiler can use the Python type constructions > > > > directly. We can put limitations on them to forbid any type > > > > constructions that the compiler cannot fully evaluate before the > > > > compilation of the actual code, of course, just like we can put > > > > limitations on statically typed functions (they shouldn't be able to > > > > call any non-static functions in the first iteration of our design, I'm > > > > still maintaining) > > The compiler can issue a warning and insert a type assertion for a runtime > check. IMO, it should not forbid you from doing anything simply because it > can't figure out some type. Python syntax's "type agnosticism" is one of > its major strengths. Yes, but now you're building a static type checker *and* a Python compiler inserting run time checks into bytecodes. This is two things. This is more work, and more interacting systems, before you get *any* payoff. My sequence would be: * build system that can do compile-time checking of fully annotated code * now you can work on interfacing this with non-fully annotated code. You can also looking at including run-time assertions. * in parallel, now you can work on type inferencing the local variable annotations out of function type signatures, interface declarations, and so on. If you don't separate out your development path like this you end up having to do it all at once, which is harder and less easy to test. [Paul] > > > I see no reason for that limitation. The result of a call to a > > > non-static function is a Pyobject. You cast it in your client code to > > > get type safety. Just like the shift from K&R C to ANSI C. Functions > > Bunk! It is *not* a cast. You cannot cast in Python. It is a type > assertion. An object is an object -- you cannot cast it to something else. > Forget function call syntax and casting syntax -- they don't work > grammatically, and that is the wrong semantic (if you're using that format > to create some semantic equivalent to a cast). This'd be only implementable with run-time assertions, I think, unless you do inferencing and know what the type the object is after all. So that's why I put the limitation there. Don't allow unknown objects entering a statically typed function before you have the basic static type system going. After that you can work on type inference or cleaner interfaces with regular Python. But perhaps I'm mistaken and local variables don't need type descriptions, as it's easy to do type inferencing from the types of the function arguments and what the function returns, as well as the types of any instance attributes involved. I'd like to see some actual examples of how this'd work first, though. For instance: def brilliant() ! IntType: a = [] a.append(1) a.append("foo") return a[0] What's the inferred type of 'a' now? A list with heterogenous contents, that's about all you can say, and how hard is it for a type inferencer to deduce even that? 
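(To make the bookkeeping concrete, here is a toy sketch, not a real inferencer: it only understands literal arguments to list.append(), and it parses a plain def because today's parser doesn't know the proposed "! IntType" annotation.)

import ast

def element_types(func_source, list_name):
    # Collect the element types a local list could hold, by special-casing
    # <list_name>.append(<literal>) calls -- the kind of knowledge about
    # builtin methods a real inferencer would need.
    types_seen = set()
    for node in ast.walk(ast.parse(func_source)):
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Attribute)
                and node.func.attr == "append"
                and isinstance(node.func.value, ast.Name)
                and node.func.value.id == list_name
                and len(node.args) == 1
                and isinstance(node.args[0], ast.Constant)):
            types_seen.add(type(node.args[0].value).__name__)
    return types_seen

src = '''
def brilliant():
    a = []
    a.append(1)
    a.append("foo")
    return a[0]
'''
print(element_types(src, "a"))    # {'int', 'str'}: a heterogeneous list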
But for optimization purposes, at least, but it could also help with error checking, if 'a' was a list of IntType, or StringType, or something like that? It seems tough for the type inferencer to be able to figure out that this is so, but perhaps I'm overestimating the difficulty. Regards, Martijn From tismer@appliedbiometrics.com Wed Dec 15 14:06:09 1999 From: tismer@appliedbiometrics.com (Christian Tismer) Date: Wed, 15 Dec 1999 15:06:09 +0100 Subject: [Types-sig] Shadow File Opinions? References: Message-ID: <3857A051.406A2FC@appliedbiometrics.com> Greg Stein wrote: > > On Tue, 14 Dec 1999, Paul Prescod wrote: > > Martijn Faassen wrote: [- seperate IF files, forget about syntax -] [Paul] [- compatibility for free, plus IDL option -] [- IF files for all modules possible -] > > Do others agree? [Greg] > Interface files and/or Martijn's approach. Personally, I like Martijn's a > bit better because you don't have to juggle two files. It doesn't matter if there is an extra file, or you insert a function call into your module, like system.interface("""triple quoted string defining interface""") without changes to the language but experimental syntaxes for these IF files/strings. > But yes: it solves a short-term problem of "what is the syntax for > defining a module/class interface (its func and attr signatures)". I think JimF has the best answer yet. Just look into his code. > Although I think func signatures are an easy syntactic extension which > several people have provided samples for, so the interface can use that. > The attributes of a module/class are the hard part. And no... I haven't > read JimF's proposal yet to see his suggestion for how this might be > done... it does apply to this problem. And here we tried to separate > interfaces from the discussion :-) > > Suggestion: defer consideration of interfaces (whether via Martijn's > approach or a separate file) for V2 of the type system design. For V1, > let's concentrate on applying type signatures to functions (and variables > if people insist :-), and any type inferencing that may be needed. Hmm, I hink the opposite is the way to go. Forget about type signatures for functions et al at all, just use interface info, and prove the interface by type inference. The interface is correct if and only if it can be proven. Given that, I see no reason to spoil Python with extra type annotation syntax. It's the other way round: If there is a correct interface, then type inference can be run in parallel as you are typing, like code colorizing, and python can tell you the set of types which any expression might have. I'm telling types by using stuff with known type. That is either literals, or functions which come from other modules which already have an interface. At any time, my IDE can tell me what type the object at the cursor has, nad worst case this is just PyObject. The empty interface which just says "every visible is exported" and "everything is a PyObject" is always fulfilled. An interface which specifies restrictions on input values (as parameters to functions and arguments of setattr calls of objects) provides the information which is used to calculate types in your code. An interface which restricts output values (function return values and results of getattr calls of objects) provides the constraints which have to be proven. What am I missing when I say: We need interfaces only and an inference machine to prove it, and that's all! Forget about extra info in the Python code. What would it help? 
I believe this is the whole story, and building upon JimF's startup, we would just need to write the inferencer now. not-trying-to-say-this-were-easy - ly chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From guido@CNRI.Reston.VA.US Wed Dec 15 14:21:59 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Wed, 15 Dec 1999 09:21:59 -0500 Subject: [Types-sig] A challenge Message-ID: <199912151421.JAA01106@eric.cnri.reston.va.us> There seem to be several proposals for type declaration syntaxes out there, with (mostly implied) suggestions on how to spell various types etc. I personally am losing track of all the various proposals. I would encourage the proponents of each approach to sit down with some sample code and mark it up using your proposed syntax. Or write the corresponding interface file, if that's your fancy. I recommend using the sample code that I posted as a case study, including some of the imported modules -- this should be a reasonable but not extreme challenge. --Guido van Rossum (home page: http://www.python.org/~guido/) From gstein@lyra.org Wed Dec 15 14:40:03 1999 From: gstein@lyra.org (Greg Stein) Date: Wed, 15 Dec 1999 06:40:03 -0800 (PST) Subject: [Types-sig] minimal or major change? (was: RFC 0.1) In-Reply-To: <38578F81.3F262F89@vet.uu.nl> Message-ID: On Wed, 15 Dec 1999, Martijn Faassen wrote: ... me: stating the "GFS proposal" isn't that major of a change ... > The programmer needs to deal with the following new things and their > consequences: > > * New grammar with function definitions. Right. And this is optional. I don't see this extension of the grammar or semantic as difficult to deal with. > * A whole new operator (which you can't overload..or can you?), which > does something quite unusual (most programmers associate types with > names, not with expressions). The operation also doesn't actually return > much that's useful to the program, so the semantics are weird too. No, you cannot overload the operator. That would be a Bad Thing, I think. That would throw the whole type system into the garbage :-). The operator is not unusual: it is an inline type assertion. It is not a "new-fangled way to declare the type of something." It is simply a new operation. The compiler happens to be able to create associations from it, but that does *not* alter the basic semantic of the operation. Given: x = y or z In the above statement, it returns "y" if it is "true". In the statement: x = y ! z It returns "y" if it has "z" type; otherwise, throws an exception. The semantics aren't all the difficult or unusual. Programmers are confronted with "new stuff" all the time. How about: values = cgi.parse() Just because the above happens to be a method invocation rather than a syntactical construction does not reduce the amount of new semantics that a programmer must learn. In summary: a new operator isn't that much of a burden. > * Interfaces with a new 'decl' statement. [If you punt on this you'll > have to the innocent Python programmer he can't use the static type > system with instances? or will we this be inferenced?] Yes, I'd prefer to punt this for a while, as it is a much larger can of worms. It is another huge discussion piece. 
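To pin the semantics down, "y ! z" would behave at runtime roughly like the ordinary function below. This is a sketch only: type_assert is a hypothetical helper, str stands in for StringType, and whether subclasses count as "having the type" is left open.

def type_assert(value, expected_type):
    # semantic model of "value ! expected_type"
    if not isinstance(value, expected_type):
        raise TypeError("expected %s, got %s"
                        % (expected_type.__name__, type(value).__name__))
    return value

x = type_assert("abc", str)      # evaluates to "abc"
# type_assert("abc", int)        # would raise TypeError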
In the current discussion, I believe that we can factor out the interface issue quite easily -- we can do a lot of work now, and when interfaces arrive, they will slide right in without interfering with the V1 work. In other words, I believe there is very little coupling between the proposal as I've outline, and the next set of type system extensions (via interfaces). Without interfaces (or the "decl" statement, or whatever), I *do* posit that the type system will not be applicable to attributes. And no: we cannot infer their type -- that would require global type inferencing. Thankfully, I believe the inferencing required by the "GFS proposal" is local to a single function at a time. > * Unspecified syntax to actually *specify* types, I mean, a ! operator > with > something syntactically wholly new behind it may not be that simple for > the Python programmer either. It's not that hard with IntType and so on, > but it gets complex if you have function types, class types, etc. True. I've been suggesting the use of dotted names, but also allowing for the fact that new syntax can be designed to generate typedecl objects. Specifying a typedecl is necessary to introduce any typing. That is a hit that we take no matter what. I don't see it as a "major" change, though, since we can keep the syntax simple and limit where/how they are used. > * And then there's the type inferencer which will interact with the > Python programmer's code as well, right? And the interpreter will spew > out errors if compile time checks fail on types? This is behind the scenes. The Python programmer is usually not impacted, so yes... again a minimal impact. IMO, the compile-time checks are not enabled by default. If you want them, then you can deal with the errors and warnings. > And you call this: '*very* little change' ? Yes. From the standpoint of the Python programmer, there is not much more to learn or to deal with. [unless we introduce interfaces, IMO] > I'll call adding a list with > names of static type associations to the module an 'an even *smaller* > change' then, as you don't need any new operator or statement, at least > to start with. :) I never said yours was more complex :-). I just said that we aren't necessarily creating a "major change". I'd like to see variable decls punted and interfaces deferred. Add a new semantic (typedecls), a new operator, and an extension to the "def" statement. Done. (hehe... if only the code backing that were so easy...) > Adding anything like static type checking to Python entails fairly major > changes to the language, I'd think. Not that we shouldn't aim at keeping > those transparant and mostly compatible with Python as it is now, but > what we'll add will still be major. Sure. I think we're just viewing it a bit differently. To me, something like the metaclass stuff was a big change: it is capable of altering the very semantics of class construction. Adding package support was the same -- Python moved from a flat import space to an entirely new semantic for importing and application packaging. > > > The 'simplicity' part comes in because you don't need *any* type > > > inferencing. Conceptually it's quite simple; all names need a type. > > > > 1) There is *no* way that I'm going to give every name a type. I may as > > well switch to Java, C, or C++ (per Guido's advice in another email :-) > > Sure, but we're looking at *starting* the process. 
Perhaps we can do > away with specifying the type of each local variable very quickly by > using type inferencing, but at least we'll have a working > implementation! I don't want to start there. I don't believe we need to start there. And my point (2) below blows away your premise of simplicity. Since you still need inferencing, the requirement to declare every name is not going to help, so you may as well relax that requirement. > > 2) You *still* need inferencing. "a = foo() + bar()" implies that some > > inferencing occurs. > > (for a compile-time check; the compiler can insert a runtime check to > > assert the type being assigned to "a" (but you know my opinion > > there...)) > > Sure, that's true. > > [me] > > > > > Later on you can work on blurring the interface between the two. First > > > > > *fully* type annotated functions (classes, modules, what you want), > > > > > which can only refer to other things that are fully annotated. By 'fully > > > > > annotated' I mean all names have a type. > > [Paul] > > > > I think that's a non-starter because it will take forever to become > > > > useful because the standard library is not type-safe. Anyhow I fell like > > > > I've *already solved* the problem of integration so why would I undo > > > > that? > > > > Agreed. Also, if I grab some module Foo from Joe, and he didn't add > > typedecls, then why shouldn't I be able to use it? > > (and I'd just add some type-asserts if that even mattered to me) > > I'm not saying this is a good situation, it's just a way to get off the > ground without having to deal with quite a few complexities such as > inferencing (outside expressions), interaction with modules that don't > have type annotations, and so on. I'm *not* advocating this as the end > point, but I am advocating this as an intermediate point where it's > actually functional. IMO, it is better to assume "PyObject" when you don't have type information, rather than throw an error. Detecting the lack of type info is the same in both cases, and the resolution of the lack is easy in both mehtods: throw an error, or substitute "PyObject". I prefer the latter so that I don't have to update every module I even get close to. > [me] > > > > > Our static type checker/compiler can use the Python type constructions > > > > > directly. We can put limitations on them to forbid any type > > > > > constructions that the compiler cannot fully evaluate before the > > > > > compilation of the actual code, of course, just like we can put > > > > > limitations on statically typed functions (they shouldn't be able to > > > > > call any non-static functions in the first iteration of our design, I'm > > > > > still maintaining) > > > > The compiler can issue a warning and insert a type assertion for a runtime > > check. IMO, it should not forbid you from doing anything simply because it > > can't figure out some type. Python syntax's "type agnosticism" is one of > > its major strengths. > > Yes, but now you're building a static type checker *and* a Python > compiler inserting run time checks into bytecodes. This is two things. > This is more work, and more interacting systems, before you get *any* > payoff. My sequence would be: Who says *both* must be implemented in V0.1? If the compiler can't figure it out, then it just issues a warning and continues. Some intrepid programmer comes along and tweaks the AST to insert a runtime check. Done. The project is easily phased to give you a working system very quickly. 
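To give a feel for what "tweak the AST to insert a runtime check" could mean, here is a toy pass over the earlier brilliant() example. Illustrative only: __typecheck__ is a hypothetical runtime helper, the declared return type is hard-coded, and ast.unparse needs a modern Python.

import ast

class InsertReturnChecks(ast.NodeTransformer):
    # Wrap every "return <expr>" in a call to a runtime type check -- the
    # kind of rewrite the compiler could fall back to when it cannot prove
    # the return type statically.
    def visit_Return(self, node):
        self.generic_visit(node)
        if node.value is not None:
            node.value = ast.Call(
                func=ast.Name(id="__typecheck__", ctx=ast.Load()),
                args=[node.value, ast.Name(id="IntType", ctx=ast.Load())],
                keywords=[])
        return node

src = '''
def brilliant():
    a = []
    a.append(1)
    a.append("foo")
    return a[0]
'''
tree = ast.fix_missing_locations(InsertReturnChecks().visit(ast.parse(src)))
print(ast.unparse(tree))   # return becomes: return __typecheck__(a[0], IntType)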
Heck, it may even be easier for the compiler to insert runtime checks in V0.1. Static checking might come later. Or maybe an external tool does the checking at first; later to be built into the compiler. ... proposed implementation order ... > If you don't separate out your development path like this you end up > having to do it all at once, which is harder and less easy to test. Of course. Nobody is suggesting a "do it all at once" course of implementation. > [Paul] > > > > I see no reason for that limitation. The result of a call to a > > > > non-static function is a Pyobject. You cast it in your client code to > > > > get type safety. Just like the shift from K&R C to ANSI C. Functions > > > > Bunk! It is *not* a cast. You cannot cast in Python. It is a type > > assertion. An object is an object -- you cannot cast it to something else. > > Forget function call syntax and casting syntax -- they don't work > > grammatically, and that is the wrong semantic (if you're using that format > > to create some semantic equivalent to a cast). > > This'd be only implementable with run-time assertions, I think, unless > you do inferencing and know what the type the object is after all. So > that's why I put the limitation there. Don't allow unknown objects > entering a statically typed function before you have the basic static > type system going. After that you can work on type inference or cleaner > interfaces with regular Python. Why not allow unknown objects? Just call it a PyObject and be done with it. Note that the type-assert operator has several purposes: * a run-time assertion (and possibly: unless -O is used) * signal to the compiler that the expression value will have that type (because otherwise, an exception would hav been raised) * provides a mechanism to type-check: if the compiler discovers (thru inferencing) that the value has a different type than the right-hand side, then it can flag an error. The limitation you propose would actually slow things down. People would not be able to use the type system until a lot of modules were type-annotated. > But perhaps I'm mistaken and local variables don't need type > descriptions, as it's easy to do type inferencing from the types of the > function arguments and what the function returns, That is my (alas: unproven) belief. > as well as the types > of any instance attributes involved. These would always be "PyObject" (or "Any" if you prefer) until we introduce some kind of "decl" or interface mechanism. Needless to say, I do agree that this would be very difficult. > I'd like to see some actual > examples of how this'd work first, though. For instance: > > def brilliant() ! IntType: > a = [] > a.append(1) > a.append("foo") > return a[0] > > What's the inferred type of 'a' now? A list with heterogenous contents, > that's about all you can say, and how hard is it for a type inferencer > to deduce even that? It would be very difficult for an inferencer. It would have to understand the semantics of ListType.append(). Specifically, that the type of the argument is added to the set of possible types for the List elements. Certainly: a good inferencer would understand all the builtin types and their methods' semantics. > But for optimization purposes, at least, but it > could also help with error checking, if 'a' was a list of IntType, or > StringType, or something like that? It would still need to understand the semantics to do this kind of checking. In my no-variable-declaration world, the type error would be raised at the return statement. 
a[0] would have the type set: (IntType, StringType). The compiler would flag an error stating "return value may be a StringType or an IntType, but it must only be an IntType". > It seems tough for the type > inferencer to be able to figure out that this is so, but perhaps I'm > overestimating the difficulty. Yes it would be tough -- you aren't overestimating :-) Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Wed Dec 15 14:47:29 1999 From: gstein@lyra.org (Greg Stein) Date: Wed, 15 Dec 1999 06:47:29 -0800 (PST) Subject: [Types-sig] hehe... sorry, too... Message-ID: Well, I count nine messages tonite, plus the others earlier today. Sorry about that... Maybe Paul can pull all the threads together and toss out a new RFC with references (not necessarily details) to the different options/threads. We can start again from that point! And Guido's challenge wouldn't be a bad idea... :-) Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Wed Dec 15 15:37:52 1999 From: gstein@lyra.org (Greg Stein) Date: Wed, 15 Dec 1999 07:37:52 -0800 (PST) Subject: [Types-sig] challenge response (was: A challenge) In-Reply-To: <199912151421.JAA01106@eric.cnri.reston.va.us> Message-ID: On Wed, 15 Dec 1999, Guido van Rossum wrote: > There seem to be several proposals for type declaration syntaxes out > there, with (mostly implied) suggestions on how to spell various types > etc. > > I personally am losing track of all the various proposals. > > I would encourage the proponents of each approach to sit down with > some sample code and mark it up using your proposed syntax. Or write > the corresponding interface file, if that's your fancy. > > I recommend using the sample code that I posted as a case study, > including some of the imported modules -- this should be a > reasonable but not extreme challenge. #---------------------------------------------------------------------- import sys, find # 1 # 2 def main(): # 3 dir = "." # 4 if sys.argv[1:]: # 5 dir = sys.argv[1] # 6 list = find.find("*.py", dir) # 7 list.sort() # 8 for name in list: # 9 print name # 10 # 11 if __name__ == "__main__": # 12 main() # 13 #---------------------------------------------------------------------- Presume that find.find is declared as: def find(pattern!StringType, dir = os.curdir!StringType)!ListType: At the moment, I'm going to use '!' for the type declarators and the type-assert operator. A placeholder. Also, the ListType in the above declaration could be updated at some point, if we presume that more complex type declarator syntax is designed (allowing List). In my view, the type system does not prove correctness, but only type safety. In that sense, we don't need to worry about things like termination or whether main() actually gets called, or whatever. [ Guido's case study was concerned with algorithmic correctness ] At the beginning of the compilation/inference process, the compiler knows that __name__ is a string (technically: it knows the type for each name in the module's namespace (the others are '__builtins__' and '__doc__')). The import states that "sys" and "find" are ModuleType (which means sys.argv and find.find will be okay from an operational point; it would still need to check if they exist). Line 3: the compiler defines main() to take no arguments and to have a PyObject return value. Line 4: "dir" now has a String value. Line 5: The compiler knows sys is a module so the attribute access is fine. It now must verify that argv exists. 
Problem #1: I'm not sure how it would do this without loading the module. Caveat #2: I do not have a proposal for stating sys.argv's type. This would be part of the interface stuff (which I would defer). Caveat #3: alternative to #2: we could hardcode knowledge of "sys" For this line, the compiler cannot ensure that the [1:] would not raise an error. This can be solved by introducing type info (interfaces), or we can alter the line to: if (sys.argv!ListType)[1:]: # 5 At this point, the compiler will assume it is a List for the rest of the function. Caveat #4: we would need to determine how strict we want to be about the possibility of external changes to objects. Another thread could change the type before the next usage (or the find.find() could, but we don't use sys.argv after that) If the compiler knows it is a list, then it also knows the [1:] would succeed. Line 6: in the absence of List type information, the "dir" would now become a (PyObject, String) which is simplified to (PyObject,). Caveat #5: adding List concepts would keep "dir" as a String, as the inferencer would understand the indexing operation. Line 7: per caveat #1, assume the compiler can access the find.find() function. From that, it knows the signature. The first parameter has a matching type, but the second (PyObject) does not match the required type (String), so an error is raised. If caveat #5 is resolved, then the second parameter matches. It is also possible to avoid the error by rewriting: list = find.find("*.py", dir!StringType) # 7 "list" is now a ListType, based on the find.find() return value. (see caveat #5 -- it could be possible to refine this knowledge). Line 8: this is fine -- the inferencer knows List has a sort() method and what the sort method's signature is. Line 9: again, this is okay, based on the the inferencer's knowledge of "for" statements and Lists. "name" is assigned a PyObject type (unless we resolve caveat #5). Line 10: the print succeeds, as any object can be printed. Line 12: any comparison is valid, so this is fine. The compiler does happen to know that __name__ is a string, though. Line 13: the invocation matches the definition. No problem. ----------------------------- IMO, the best thing to improve the system here is to introduce parameterized lists (and dicts, tuples, etc). In fact, this would be necessary to avoid the error at line 7 (without rewriting). The following problem needs to be resolved: 1) how to fetch type information from modules without necessarily loading them [ if this is unsolved then all attribute accesses become PyObject values. The type system would be pretty useless since you wouldn't even get function information. ] The caveats listed are desired to be resolved: 2) how to specify types for module/class attributes 3) hardcode type information in the absence of a solution for #2 4) what sort of notions of "const" do we provide -- can the types of things change? (this may be moot with an interface present) 5) provide a syntax for composite/complex types Summary of changes to the case study code: 1) find.find() definition altered. [ it is certainly possible that fnmatch.* could be altered, but that is not necessary from the standpoint of the example code. The inferencer goes no further than the find.find() definition. ] Optional changes, if the caveats are not resolved: 1) add !ListType to line 5 2) add !StringType to line 7 Miscellaneous notes: * note the absence of declarations for "dir", "list", and "name". 
* only one change was made in the "find" library to support type safety for the example code. The example code itself had no alterations (subject to the noted caveats) Underlying proposal: * add type declarator syntax * add declarators to function args and return value (example provided for discussion purposes; I do believe a ':' is not possible and that a "name syntactic-marker typedecl" form is the proper form) * add type-assert operator ('!' for discussion purposes) * add type inferencing for associating types with names in the global and local scope. all other type information is imported rather than globally computed. inferencing does not occur over multiple, local scopes (in other words, we can process one function at a time, independent of the other functions) TODO: * I just realized the presence of the "global" statement throws off a lot of stuff. Type inferencing and/or checking will be harder and/or require a second pass if a global is used and new type is added to the possible set of types for the global. Cheers, -g -- Greg Stein, http://www.lyra.org/ From skip@mojam.com (Skip Montanaro) Wed Dec 15 15:45:08 1999 From: skip@mojam.com (Skip Montanaro) (Skip Montanaro) Date: Wed, 15 Dec 1999 09:45:08 -0600 Subject: [Types-sig] Global analysis - anything available? Message-ID: <199912151545.JAA07256@dolphin.mojam.com> Guido wrote: > Jim Hugunin did global analysis on the pystone.py module -- 250 lines > containing 14 functions and one class with two methods. (He may actually > have left out the class, but I'm pretty sure he did everything else.) He > got a 1000x speedup, which I think should be a pretty good motivator for > those interested in (OPT). Did anything concrete fall out of this exercise? Did Jim write code or do it manually? Skip Montanaro | http://www.mojam.com/ skip@mojam.com | http://www.musi-cal.com/ 847-971-7098 | Python: Programming the way Guido indented... From guido@CNRI.Reston.VA.US Wed Dec 15 15:48:42 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Wed, 15 Dec 1999 10:48:42 -0500 Subject: [Types-sig] Global analysis - anything available? In-Reply-To: Your message of "Wed, 15 Dec 1999 09:45:08 CST." <199912151545.JAA07256@dolphin.mojam.com> References: <199912151545.JAA07256@dolphin.mojam.com> Message-ID: <199912151548.KAA01393@eric.cnri.reston.va.us> > Guido wrote: > > > Jim Hugunin did global analysis on the pystone.py module -- 250 lines > > containing 14 functions and one class with two methods. (He may actually > > have left out the class, but I'm pretty sure he did everything else.) He > > got a 1000x speedup, which I think should be a pretty good motivator for > > those interested in (OPT). > > Did anything concrete fall out of this exercise? Did Jim write code or do > it manually? He write Python code that would do this in general, with a limited subset of Python as input. Since Jim left the project it has not been taken to the next step, but I'm sure Barry has Jim's code somewhere. --Guido van Rossum (home page: http://www.python.org/~guido/) From paul@prescod.net Wed Dec 15 13:53:52 1999 From: paul@prescod.net (Paul Prescod) Date: Wed, 15 Dec 1999 05:53:52 -0800 Subject: [Types-sig] Progress Message-ID: <38579D70.75E477BA@prescod.net> We are actually making progress among all of the sound and fury here. You guys have a lot of good ideas and I think that we are converging more than it seems. 1. Most people seem to agree with the idea that shadow files allow us a nice way to separate type assertions out so that their syntax can vary. 
I think Greg disagreed but perhaps not violently enough to argue about it. Interface files are in. Inline syntax is temporarily out. Syntactic "details" to be worked out. 2. Everybody but me is comfortable with defining genericity/templating/parameterization only for built-in types for now. But now that we are separating interfaces from implementations I am thinking that I may be able to think more clearly about parameterizability. It may be possible to define parameterizable interfaces by IPC8. Parameterization is in. Syntactic "details" to be worked out. 3. We agree that we need a syntax for asserting the types of expressions at runtime. Greg proposes ! but says he is flexible on the issue. The original RFC spelled this as: has_type( foo, types.StringType ) which returns (in this case) a string or NULL. This strikes me as more flexible than ! because you can use it in an assertion but you don't have to. You can also use it like this: j=has_type( foo, types.StringType ) or has_type( foo, types.ListType ): 4. The Python misfeature that modules are externally writable by default is gone. Only Guido has expressed an opinion on whether they should be writeable at all. His opinion is no. Unless I hear otherwise, externally writable modules are gone. (I have this vague feeling that maybe we should think of modules as classes with methods and properties that happen to be a subtype of a new base class "module", in that case the rules for modules and classes should be identical) 5. It isn't clear WHAT we can specify in "PyDL" interface files. Clearly we can define function, class/interface and method interfaces. a. do we allow declarations for the type of non-method instance variables? b. do we check assignments to class and module attributes from other modules at runtime? We need to expect that some cross-module assignments will come from modules that are not statically type checked. c. should we perhaps just disallow writing to "declared" attributes from other modules? d. is it possible to write to UN-declared attributes from other modules? And what are the type safety implications of doing so? -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things to be wary of: A new kid in his prime A man who knows the answers, and code that runs first time http://www.geezjan.org/humor/computers/threes.html From rmasse@cnri.reston.va.us Wed Dec 15 16:36:34 1999 From: rmasse@cnri.reston.va.us (Roger Masse) Date: Wed, 15 Dec 1999 11:36:34 -0500 (EST) Subject: [Types-sig] Re: Pascal style declarations In-Reply-To: References: Message-ID: <14423.50066.233550.951115@nobot.cnri.reston.va.us> Golden, Howard writes: > Greg Stein [mailto:gstein@lyra.org] wrote: > > > You don't provide a way to declare function return value(s) > > types. When > > you do, then I think you're going to run into a problem using the ':' > > syntactical marker... > > [refers to:] > > > > 3. In functions and methods, you can _optionally_ specify > > the argument > > > type: > > > > > > def funx(x : int, y : string): ... > > > > > I'll admit that Python already uses the ":" character where Pascal does, but > so what? You can still specify the return type in other ways. The most > obvious (to me) is to use the ":" character twice, e.g., > > def funx(x : int, y : string): int : ... > > While I'm not a parsing expert, I believe this would still be parsable. Of > course, any other available character could be used instead of the ":", if > this would be preferable. 
(Again, I'm not trying to dictate the final > syntax, just suggest a starting point.) > I don't want to further muddy the waters because I think Paul has some really good ideas that he should start to run with... (I hope he has the time and resources to write some code) The optional static typing syntax has been talked about for quite some time... IMHO specifying the return type as outlined in my static types paper from last year's conference is more readable (i.e. return type of the function *before* the parameter list). See dev-day review from last year http://www.foretec.com/python/workshops/1998-11/dd-rmasse-sum.html For example: def myCallable : Int( i : Int, f : Float, m : myType): ...And parsable. (Jon Reihl developed the grammar for the proposal) Syntax is easy (contentious but easy), semantics that are "Pythonic" and don't get in the way and *actually* improve safety is much harder. -Roger From gstein@lyra.org Wed Dec 15 16:46:17 1999 From: gstein@lyra.org (Greg Stein) Date: Wed, 15 Dec 1999 08:46:17 -0800 (PST) Subject: [Types-sig] Progress In-Reply-To: <38579D70.75E477BA@prescod.net> Message-ID: On Wed, 15 Dec 1999, Paul Prescod wrote: >... > 1. Most people seem to agree with the idea that shadow files allow us a > nice way to separate type assertions out so that their syntax can vary. > I think Greg disagreed but perhaps not violently enough to argue about > it. Interface files are in. Inline syntax is temporarily out. Syntactic > "details" to be worked out. I stated a preference for allowing this information to reside in the same file as the implementation. i.e. I don't want to maintain two files. I'll go further and state that we should not use a new language for this. It should just be Python. (and this is where Martijn's __types__ thing comes in, although I'm not advocating that format) This should be equivalent to JimF's document (with extensions: I read it and he does not define typedecl mechanims). Where we disagree, change, or reinvent, then we provide feedback. Where we extend, we fold that back in. >... > 3. We agree that we need a syntax for asserting the types of expressions > at runtime. Greg proposes ! but says he is flexible on the issue. The Flexible on the character(s) used for the operator. That's a bit different than flexibility on the issue :-) > original RFC spelled this as: has_type( foo, types.StringType ) which > returns (in this case) a string or NULL. This strikes me as more > flexible than ! because you can use it in an assertion but you don't > have to. You can also use it like this: > > j=has_type( foo, types.StringType ) or has_type( foo, types.ListType ): You'll have issues with empty strings and empty lists, as Guido pointed out. has_type() does not create a *definitive* type assertion. The compiler cannot extract any information from the presence of has_type(). Using an operator which raises an exception allows the compiler to make the assertion (and thereby assist with type inferencing and type checking). >... > writable modules are gone. (I have this vague feeling that maybe we > should think of modules as classes with methods and properties that > happen to be a subtype of a new base class "module", in that case the > rules for modules and classes should be identical) This is an interesting way to view the application of an interface to either a module or a class. i.e. restate it as "apply interfaces to classes only; modules become classes so they can have interfaces applied." Note that this will also solve the setattr "problem" with modules. 
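One way to picture that idea, as a toy and with no syntax implied: a namespace object whose __setattr__ checks writes against a table of declared attribute types, which is roughly what a module-that-acts-like-a-class-instance would buy. The declaration table here is hypothetical.

class TypedNamespace:
    # Toy module-like namespace: __setattr__ checks writes against a table
    # of declared attribute types; undeclared attributes are left alone.
    def __init__(self, declarations):
        object.__setattr__(self, "_decl", dict(declarations))

    def __setattr__(self, name, value):
        expected = self._decl.get(name)
        if expected is not None and not isinstance(value, expected):
            raise TypeError("%s must be %s, got %s"
                            % (name, expected.__name__, type(value).__name__))
        object.__setattr__(self, name, value)

m = TypedNamespace({"version": str, "max_retries": int})
m.version = "1.0"            # fine
# m.max_retries = "three"    # would raise TypeError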
> 5. It isn't clear WHAT we can specify in "PyDL" interface files. Clearly > we can define function, class/interface and method interfaces. > > a. do we allow declarations for the type of non-method instance > variables? Yes. My reluctance to specify types for instance variables is caused by problems with designing a nice, inline syntax for it. If you're not worrying about an inline syntax, then you can definitely add typedecls for instance and class attributes. Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Wed Dec 15 17:01:56 1999 From: gstein@lyra.org (Greg Stein) Date: Wed, 15 Dec 1999 09:01:56 -0800 (PST) Subject: [Types-sig] Shadow File Opinions? In-Reply-To: <3857A051.406A2FC@appliedbiometrics.com> Message-ID: On Wed, 15 Dec 1999, Christian Tismer wrote: >... > It doesn't matter if there is an extra file, or you insert a > function call into your module, like > > system.interface("""triple quoted string defining interface""") > > without changes to the language but experimental syntaxes for > these IF files/strings. The compiler needs the information. This implies that you can't add the information procedurally. The mechanism must be "transparent" to the compiler. > > But yes: it solves a short-term problem of "what is the syntax for > > defining a module/class interface (its func and attr signatures)". > > I think JimF has the best answer yet. Just look into his code. I don't like the implementation at all (too many modules, too many "from foo import ...), but the ideas seem to be sound. I believe there should be a proposal for syntactical representations. The compiler can't pull its information from JimF's current interface mechanism. >... > > Although I think func signatures are an easy syntactic extension which > > several people have provided samples for, so the interface can use that. > > The attributes of a module/class are the hard part. And no... I haven't > > read JimF's proposal yet to see his suggestion for how this might be > > done... it does apply to this problem. And here we tried to separate > > interfaces from the discussion :-) > > > > Suggestion: defer consideration of interfaces (whether via Martijn's > > approach or a separate file) for V2 of the type system design. For V1, > > let's concentrate on applying type signatures to functions (and variables > > if people insist :-), and any type inferencing that may be needed. > > Hmm, I hink the opposite is the way to go. Forget about type signatures > for functions et al at all, just use interface info, and prove the > interface by type inference. The interface must be defined syntactically (or at least very transparently to the compiler). Using a procedural mechanism only helps with runtime issues. Given the presumption of syntactical interface definitions, this leads to type signatures for functions. > The interface is correct if and only if it can be proven. This is a different problem, IMO. I would like to see interfaces used to tell callers about the type information. I don't care whether the interface is truly representative of the code or not. > Given that, I see no reason to spoil Python with extra type > annotation syntax. It's the other way round: > If there is a correct interface, then type inference can be run > in parallel as you are typing, like code colorizing, and python > can tell you the set of types which any expression might have. For runtime applications: yes. For compile-time static checks, you most likely require new syntax. >... 
> What am I missing when I say: > We need interfaces only and an inference machine to prove it, > and that's all! Forget about extra info in the Python code. > What would it help? I believe this is the whole story, and > building upon JimF's startup, we would just need to write > the inferencer now. You're missing the requirement that a compiler must be able to extract useful information. I'm not sure that a compiler cannot do this with the current JimF proposal: 1) there is no mechanism for signatures (or a defined way for the compiler to extract/parse them) 2) procedural definition is allowed (e.g. instantiating Method()), which prevents the compiler from extracting the info. New syntax, or well-known mechanism such as __types__ is needed. I believe JimF's proposal could be used (pull the info from __implements__ and the definition of the interface), but some of its dynamicism must be torched. (and maybe where it is required, we only allow it in one or two specific cases (internal to the Interfaces implementation) and then hard-code those details into the compiler/inferencer). Cheers, -g -- Greg Stein, http://www.lyra.org/ From guido@CNRI.Reston.VA.US Wed Dec 15 17:07:30 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Wed, 15 Dec 1999 12:07:30 -0500 Subject: [Types-sig] Low-hanging fruit: recognizing builtins Message-ID: <199912151707.MAA02639@eric.cnri.reston.va.us> It's always bothered me from a performance point of view that using a built-in costs at least two dict lookups, one failing (in the modules' globals), one succeeding (in the builtins). This is done so that you can define globals or locals that override the occasional builtin; which is good since new Python versions can define new builtins, and if you weren't allowed to override builtins this would break old code. Here's a way that per-module analysis plus a conservative assumption plus an addition to the PVM (Python Virtual Machine) bytecode can remove *both* dict lookups for most uses of builtins. Per-module analysis can easily detect that there are no global variables named "len", say. In this case, any expression calling len() on some object can be transformed into a new bytecode that calls PyObjectt_Length() on the object at the top of the stack. Thus, a sequence like LOAD_GLOBAL 0 (len) LOAD_FAST 0 (a) CALL_FUNCTION 1 can be replaced by LOAD_FAST 0 (a) UNARY_LEN which saves one PVM roundtrip and two dictionary lookups, plus the argument checking code inside the len() function. There are plenty of bytecodes available. In addition, we can now optimize common idioms involving builtins, the most important one probably for i in range(n): ... We lose a tiny bit of dynamic semantics: if some bozo replaces __builtin__.len with something that always returns 0, this won't affect our module. Do we care? I don't. We don't have to do this for *every* builtin; for example __import__() has as explicit semantics that you can replace it with something else; for open() I would guess that there must be plenty of programs that play tricks with it. But range()? Or len()? Or type()? I don't think anybody would care if these were less dynamic. Note that you can still override these easily as globals, it just has to be visible to the global analyzer. The per-module analysis required is major compared to what's currently happening in compile.c (which only checks one function at a time looking for assignments to locals) but minor compared to any serious type inferencing. 
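A rough sketch of that per-module scan, using the modern dis module purely for illustration (the real thing would live in compile.c): walk every code object in the module looking for stores to builtin names; any builtin never stored to is a candidate for a dedicated opcode.

import dis, builtins

def overridden_builtins(module_code):
    # Builtin names the module ever assigns or deletes at global scope.
    # Anything *not* in this set (e.g. len, range, type) could safely be
    # compiled to a specialized opcode for this module.
    found = set()
    def walk(code):
        for ins in dis.get_instructions(code):
            if (ins.opname in ("STORE_GLOBAL", "STORE_NAME", "DELETE_GLOBAL")
                    and ins.argval in vars(builtins)):
                found.add(ins.argval)
        for const in code.co_consts:          # nested functions, classes, ...
            if hasattr(const, "co_code"):
                walk(const)
    walk(module_code)
    return found

src = "def main():\n    return len('abc')\n"
print(overridden_builtins(compile(src, "<demo>", "exec")))   # set(): len is safe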
Clearly this does nothing for (ERR), but I bet it could speed up the typical Python program with a substantial factor... Any takers? --Guido van Rossum (home page: http://www.python.org/~guido/) From GoldenH@littoncorp.com Wed Dec 15 17:52:54 1999 From: GoldenH@littoncorp.com (Golden, Howard) Date: Wed, 15 Dec 1999 09:52:54 -0800 Subject: [Types-sig] What is the Essence of Python? (Was: Low-hanging frui t: recognizing builtins) Message-ID: Guido van Rossum wrote: > We lose a tiny bit of dynamic semantics: if some bozo replaces > __builtin__.len with something that always returns 0, this won't > affect our module. Do we care? I don't. We don't have to do this > for *every* builtin; for example __import__() has as explicit > semantics that you can replace it with something else; for open() I > would guess that there must be plenty of programs that play tricks > with it. But range()? Or len()? Or type()? I don't think anybody > would care if these were less dynamic. I reiterate that we should define what is the essence of Python, so we know what sort of dynamicism and flexibility we are trying to preserve, and what is superfluous. Until we do this, we are dealing with a squishy set of requirements. In the various comments in the last few days, I have the sense that many of you are using Python in very innovative ways, far beyond my pedestrian style. Therefore, I find some of the arguments very esoteric. Even if no one is willing to attempt a definition, I would certainly benefit if someone would point me in the direction of examples of dynamic Python that you want to preserve. - Howard From paul@prescod.net Wed Dec 15 18:36:44 1999 From: paul@prescod.net (Paul Prescod) Date: Wed, 15 Dec 1999 10:36:44 -0800 Subject: [Types-sig] Handling attributes Message-ID: <3857DFBC.A2B7ADAC@prescod.net> > Yes. My reluctance to specify types for instance variables is caused by > problems with designing a nice, inline syntax for it. If you're not > worrying about an inline syntax, then you can definitely add typedecls for > instance and class attributes. Okay, but what about all of the other questions (updated slightly): b. do we check assignments to class and module attributes from other modules at runtime? We need to expect that some cross-module assignments will come from modules that are not statically type checked. c. should we perhaps just disallow writing to "declared" attributes from other classes and modules? d. is it possible to write to UN-declared attributes from other people's classes and modules? And what are the type safety implications of doing so? -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things to be wary of: A new kid in his prime A man who knows the answers, and code that runs first time http://www.geezjan.org/humor/computers/threes.html From paul@prescod.net Wed Dec 15 18:37:40 1999 From: paul@prescod.net (Paul Prescod) Date: Wed, 15 Dec 1999 10:37:40 -0800 Subject: [Types-sig] Implementability References: <000f01bf46e4$650557c0$05a0143f@tim> Message-ID: <3857DFF4.F02097CB@prescod.net> I was wondering when our Professional Compiler Writer and resident skeptic would jump in and tell us what we were doing wrong! Thanks. Tim Peters wrote: > > ... > > So your intuition is on the right track here. What I can add as a former > Professional Compiler Writer is my Professional Assurance that making this > all run efficiently (in either time or space) is a Professional Pain in the > Professional Ass. 
According to the principle of "from each according to their talents" you should be writing this optimizing, static type checker. > Because of this, global analysis never works out in > practice unless you invent an efficient database format to cache the results > of analysis, keeping that in synch with the source base under mutation. Bah. The scope of compilation is the module. The scope of inference is a namespace defining suite (e.g. a module, class body or method, but not an "if" or "try"). > It's all too easy to come up with a toy system that absolutely will not > scale to real life! Python has an advantage, though, in that most people > write very small functions and methods most of the time. If you can, in > addition, avoiding needing to deduce the types of most globals, it could > actually fly before we're all dead . The types of globals from other modules should be explicitly declared. If they aren't, they are presumed to have type PyObject or to return PyObject. Or they just aren't available if you are in strict static type check mode. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things to be wary of: A new kid in his prime A man who knows the answers, and code that runs first time http://www.geezjan.org/humor/computers/threes.html From paul@prescod.net Wed Dec 15 18:38:21 1999 From: paul@prescod.net (Paul Prescod) Date: Wed, 15 Dec 1999 10:38:21 -0800 Subject: [Types-sig] check_type() Message-ID: <3857E01D.6C699075@prescod.net> > > original RFC spelled this as: has_type( foo, types.StringType ) which > > returns (in this case) a string or NULL. This strikes me as more > > flexible than ! because you can use it in an assertion but you don't > > have to. You can also use it like this: > > > > j=has_type( foo, types.StringType ) or has_type( foo, types.ListType ): > > You'll have issues with empty strings and empty lists, as Guido pointed > out. Yes, you have to use it in ways that follow Python's boolean rules. A better name would be check_type. j=check_type( foo, types.StringType ) > has_type() does not create a *definitive* type assertion. The compiler > cannot extract any information from the presence of has_type(). Using an > operator which raises an exception allows the compiler to make the > assertion (and thereby assist with type inferencing and type checking). j=check_type( foo, types.StringType) j is *guaranteed* to be either a string or None. Note that check_type is actually an operator in that it cannot be overwritten or shadowed. It just happens to be an operator that looks like a function and that returns a useful value instead of immediately causing an exception. It also happens to be compatible with the current Python grammar. I have big aesthetic problems with adding a special character to a language that uses the word "or" to mean, well "or" and "not" to mean "not". I might be able to live with "k = eval('1') as int" if it isn't too horribly ambiguous. Sorry, I've seen so many posts back and forth that I can't remember what the consensus on this was. That's my fault as moderator. We'll have to start focusing on individual issues soon because the fireman's hose approach is reaching its limits (but it was great at first!) 
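For reference, the check_type() spelling above written out as a plain function, so the contrast with the '!' operator is visible: it filters rather than raises. A sketch only; str stands in for types.StringType.

def check_type(obj, expected_type):
    # returns the object if it has the expected type, else None
    if isinstance(obj, expected_type):
        return obj
    return None

foo = "abc"
j = check_type(foo, str) or check_type(foo, list)    # j == "abc"
# The caveat noted above still applies: check_type("", str) returns "", which
# is false in a boolean context, so the "or" idiom misfires on empty strings
# and lists.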
-- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things to be wary of: A new kid in his prime A man who knows the answers, and code that runs first time http://www.geezjan.org/humor/computers/threes.html From paul@prescod.net Wed Dec 15 18:38:35 1999 From: paul@prescod.net (Paul Prescod) Date: Wed, 15 Dec 1999 10:38:35 -0800 Subject: [Types-sig] Interface files References: Message-ID: <3857E02B.53CF27AC@prescod.net> Greg Stein wrote: > > I stated a preference for allowing this information to reside in the same > file as the implementation. i.e. I don't want to maintain two files. The nice thing about having separate files is that it becomes instantly clear what is "interesting" to the compiler. We have no backwards compatibility constraints. We have no questions about what variable are "in scope" and "available". It's just plain simpler. There is also something deeply elegant and useful about a separation of interface from implementation. Sure, you don't always want to be REQUIRED to separate them. I acknowledge that we will one day have to support inline declarations but I'm going to put it off unless I hear some screaming. > I'll go further and state that we should not use a new language for this. > It should just be Python. (and this is where Martijn's __types__ thing > comes in, although I'm not advocating that format) I think that that's an unreasonable (and unreadable) constraint. The language should probably be pythonic, but not necessarily Python. Python doesn't have a type declaration syntax and none of Python's existing syntax was meant to be used AS a type declaration syntax. It just gets too unreadable for quasi-complicated declarations. We need to support polymorphic and parameteric higher order functions! -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things to be wary of: A new kid in his prime A man who knows the answers, and code that runs first time http://www.geezjan.org/humor/computers/threes.html From paul@prescod.net Wed Dec 15 18:56:17 1999 From: paul@prescod.net (Paul Prescod) Date: Wed, 15 Dec 1999 10:56:17 -0800 Subject: [Types-sig] Low-hanging fruit: recognizing builtins References: <199912151707.MAA02639@eric.cnri.reston.va.us> Message-ID: <3857E451.F7E4B6EC@prescod.net> What is the proposed framework for these sorts of experiments? Perhaps the first project should be an interpreter that can be extended with new bytecodes perhaps through a registration mechanism...and a hook to call Python code between parsing and generating bytecodes. You have specifically commissioned this experiment so it has a good chance of being "rolled in" but in the more general case... -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things to be wary of: A new kid in his prime A man who knows the answers, and code that runs first time http://www.geezjan.org/humor/computers/threes.html From guido@CNRI.Reston.VA.US Wed Dec 15 19:21:40 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Wed, 15 Dec 1999 14:21:40 -0500 Subject: [Types-sig] Low-hanging fruit: recognizing builtins In-Reply-To: Your message of "Wed, 15 Dec 1999 10:56:17 PST." <3857E451.F7E4B6EC@prescod.net> References: <199912151707.MAA02639@eric.cnri.reston.va.us> <3857E451.F7E4B6EC@prescod.net> Message-ID: <199912151921.OAA07698@eric.cnri.reston.va.us> > What is the proposed framework for these sorts of experiments? 
Perhaps > the first project should be an interpreter that can be extended with new > bytecodes perhaps through a registration mechanism...and a hook to call > Python code between parsing and generating bytecodes. You have > specifically commissioned this experiment so it has a good chance of > being "rolled in" but in the more general case... Dynamic bytecode registration would slow the PVM too much. Just hack a few special cases into ceval.c and then go hack on the bytecode. Note that the bytecode hacking could conceivably be done entirely in Python. I think Skip Montanaro may have tools to do this already. A first approximation would be to go hunt through all existing code objects in a module for LOAD_GLOBAL and STORE_GLOBAL opcodes with built-in names; for all such built-in names that have no STORE_GLOBAL anywhere, it's "safe enough" to use the special opcode. Then of course you will have to hunt through the bytecode for sequences of LOAD_GLOBAL(), followed by arbitrary code to load an object, followed by CALL_FUNCTION(1). --Guido van Rossum (home page: http://www.python.org/~guido/) From Edward Welbourne Wed Dec 15 19:59:10 1999 From: Edward Welbourne (Edward Welbourne) Date: Wed, 15 Dec 1999 19:59:10 +0000 Subject: [Types-sig] expression-based type assertions Message-ID: [Due to confusion on my part, this begins by echoing other stuff] [First: something I sent to Greg while forgetting it wasn't to all] A reason why I want not-just-dotted-names as the type checking object ... a generator for type-checking tuples (say), which takes some parameters (checkers for the individual members of a tuple) and returns a checker for that flavour of tuple. Of course, a compiler can only *exploit* this if it `knows' what the relevant type checker `really will be'. However, I see no reason against *allowing* the checker construct to be applied using checkers that no compiler could hope to recognise - these will buy you robustness but no speed - while the spec for checking can say that compilers are at liberty to exploit any checkers they *do* recognise. Even dotted names, if the compiler doesn't `know' the relevant checker, won't prevent this: and parameterised checkers, like the tuple example above, can be known (by some compilers) despite involving more than just dotted names. Note: one common type for functions is the whatever-or-None used for default arguments, notably in the case where the default for the argument is an empty dict (or list), but the standard gotcha about using {} or [] as default obliges one to use None and begin the function with if arg is None: arg = { } # or [ ] as may be This is going to provide irritation for the syntax of checking unless something sensible is dreamed up (or it's not done in the argument list). A possibility would be `if a default value is given, this value is also tolerated for the argument, even if it fails the type check' subject to a presumption that the function must begin with something which copes with the default, replacing it with something that matches the type spec ... [To which Greg replied (I've shrunk my excerpts here):] EW> A reason why I want ... returns a checker for that flavour of tuple. Sure. Syntactical type declarators are fine. But arbitrary expressions would prevent the compiler from understanding what was going on. In fact, I proposed this exact kind of thin in the "GFS proposal". Did you read that? For example: x = foo() ! Int x, y = foo() ! (Int, String) In the first case, you have a dotted name. 
In the second, the parser and compiler understand that the parens mean tuple-of-these-types. EW> Of course, a compiler ... exploit any checkers they *do* recognise. A dotted name allows the following construct: MyChecker = SomeCheckerGenerator(...) x = foo() ! MyChecker Again, this was in the GFS proposal. Since you can always do an assignment such as above, I felt it was quite fine to say "only dotted names." EW> Even dotted names, ... involving more than just dotted names. Well... it gets pretty tough for the compiler, the further you move from simple dotted names. Worst case, the compiler can issue a warning saying it doesn't understand how to do the compile-time check, and then insert a runtime check. EW> Note: ... whatever-or-None ... something that matches the type spec ... True. Syntactic markers can created and used to state " or None". [OK, sorry about that, back to the present] Two issues are developing in the list: one is name-checking, the other is value-checking. The two are mostly seperable - however, the parameters of a function provide a clash: value-checking says that the parameter is tagged with the type of value the *caller* may supply, name-checking tags it with a constraint that stops the function subsequently using that name to hold any other type (while incidentally doing the value-checker's check when the function is invoked). This is an area of difficulties. I wish to state explicitly that: A quite natural consolidation of the existing python object model will leave us in a position where attribute modification is *always* done by thing.__setattr__(key, val) or thing.__delattr__(key) for any thing you care to mention (albeit with some subtleties). The resulting framework *will* make it easy to: * set up a name-space such that the suite executed to initialise it performs sensible attribute modification (without any fancy syntax of fuss within the suite), yet: once it is initialised it has no attribute modification methods - if you really want, access to its __dict__ can be unavailable. * police any restrictions you can specify on what values may be stored against which names in a given namespace: you do this with a __setattr__ hack. * have your locals and globals just be namespaces, so there's nothing special about them - i.e. you can do the above with them. It is unnecessary to add syntax to the language for the purpose of specifying what you can arrange to have __setattr__ do. In particular, use of setattr hacks is the right way to implement any fascist policy a name-space wants to use in controlling modifications to itself - not new grammar. However, that only affects the robustness side of matters - it doesn't provide for the compiler to know it can presume the relevant hacks are in place: but value-checking can be used here too, with care. (Have a checker which corresponds to the assertion: this object has a namespace containing the name foo, with value matching bah-spec; optionally specifying that this can't be changed - which the checker can check by trying to, or by looking for the attribute modification methods.) So I'm falling into the purely value-checked camp. Type-specifications on values (of which function argument/return type-specs are examples) are valuable: my one reservation about them is that they are syntactic sugar for assertions - but I accept they are a good scheme which will genuinely increase the extent to which folk will make assertions: which improves robustness and maintainability. I will suppose that ! 
is the operator to be used for this (since : is so widely used already that another use would be irksome), but I dont' really care what the symbol is (though: the type assertion makes sense as an exclamation, e.g. `7 + (x ! IntType)' reads quite neatly as 7 plus x, which *is* an int ! as it were). I have some sympathy with someone's suggestion that the return-type of a function gets a different spelling for !, e.g. -> I *really* like TimP's Haskellish spelling of type-specifiers. The function of inline assertions is then to ensure that the compiler can perform (rudimentary) type inference based on that which is asserted. There are two ways a compiler can respond to !-assertions (indeed, it has the same choice with assert): * exploit - generate faster code, presuming the asserted truth * check - verify the assertion The latter is obvious - perform the test, raise the exception on failure. One of the easiest ways to exploit an assertion is to infer that some subsequent assertion is guaranteed true and elide its check. The inferencer has other sources of data than the !-assertions - for instance, immediately after the expression `x+7' has been evaluated, we know that x's value supports addition, at least with integers, at least this way round - either that, or we've raised an exception and aren't executing the code which followed. I believe it is philosophically valid for the inferencer to presume the truth of anything that is asserted (by existing assert statements) even if __debug__ is false. How much umbrage does that raise ? The inferencer takes all these sources of information and infers what it will, the compiler exploits the inferred truths in hunting for efficient ways to implement the given code. Any part of the code that the inferencer doesn't know how to exploit, it simply (silently) doesn't exploit. Whether it bothers to check will depend on optimise/debug flags its been given and how anal the compiler-writer is. On the dotted-names issue: so long as it's valid to say > MyChecker = SomeCheckerGenerator(...) > x = foo() ! MyChecker there is no mileage in forbidding x = foo() ! SomeCheckerGenerator(...) it just obliges me to polute my namespace (unless, of course, I intend to re-use the checker). The objection that the compiler has a hard time coping with this is without substance: * the difficulties involved in recognising SomeCheckerGenerator(...) are present in both of the above and in no way reduced by storing the result in a variable in the mean time: the compiler's knowledge of what `! MyChecker' buys it is entirely dependent on making sense of the code it's just seen which gives MyChecker a value. * each compiler is only going to recognise a sub-set of the type-specifiers deployed, even with the dotted-name constraint. * when the compiler recognises a checker, it can generate code which exploits the truth asserted. * the right thing for a compiler to do about unrecognised checks is to not exploit them. After all, it has nothing to exploit. and, in any case, dotted names (may) involve function calls: __getattr__. > I think altering isinstance() to accept a callable is preferable to > introducing a __check__ method. Ah, have I mis-understood you ... I thought you said that isinstance would take a third argument which is callable ... were you saying that it accepts a third kind of thing as its second argument ? In which case I see where you're going and that sounds great. 
def isinstance(thing, *modes):
    for mode in modes:
        if type(mode) is TypeType:
            if type(thing) is mode: return 1
        elif type(mode) is ClassType:
            try:
                if issubclass(thing.__class__, mode): return 1
            except AttributeError: pass
        else: # mode is presumed to be a callable
            if mode(thing): return 1
    return 0

Associativity: 7 ! IntType ! (Prime ! TypeChecker)
7 is an integer, indeed it's a prime (oh, and Prime is a type-checker). So ! groups to the left (I think that's what + et al. do).

Paul said: > There is also something deeply elegant and useful about a separation > of interface from implementation.

Conceptual separation - yes. Physical separation - quite the opposite. When programs and documentation live apart, they drift apart. The only way the interface spec can live apart from the implementation is when the implementation can be checked for compatibility with the interface (so that the implementation change which changes the interface gets recognised as such the first time the changed code is used). If we can check the implementation for compatibility with its interface spec, we're already specifying the interface in the implementation, so why bother having a second copy in a separate file ?

This list is too busy. Time I went home, before I make it any worse. Eddy. -- Yes, I did read the GFS proposal.

From Edward Welbourne Wed Dec 15 20:26:25 1999 From: Edward Welbourne (Edward Welbourne) Date: Wed, 15 Dec 1999 20:26:25 +0000 Subject: [Types-sig] Re: [Doc-SIG] Sorry! In-Reply-To: <3857E6F9.52FC29F2@prescod.net> References: <385665DE.9963174B@prescod.net> <38569FFF.78EC8DA@prescod.net> <3857CDD1.D77331AD@prescod.net> <3857E6F9.52FC29F2@prescod.net> Message-ID:

Damn. Another message arrived before I could escape ;^/ Paul said (we'd gone off line due to another of my confusions): > Greg admits that his proposal does nothing about attributes. Thus far I've only seen him saying he doesn't consider them worth attention. > a whole interface definition mechanism. Which brings us back to the > idea of interfaces separate from implementations, which brings us back > to shadow files, even for Python code. whoa. I don't follow the inferences here.

The only interface definition mechanism I can see needed is the one that lets us specify the analogue of C structures and function types - that is, the equivalent of a typedef. One interface thus defined can be deployed for several objects that support it - this does not mean that we have to have a separate *file* in which to say it, let alone a separate file in which to re-iterate the specification of the interface for each of the files which defines an export which matches that interface.

The !-assertion mechanism will, indeed, depend (when taken to its logical extreme) on some way of saying `an object which has attribute foo, which is an integer and which *you* cannot modify though the object might', mutatis mutandis. For that we'll need an IDL, in some guise, which can produce an object which encodes that spec. Such an object can be used in many places. While that object gets constructed someplace else than the objects it gets used to describe, it needn't be in a separate file; nor need this place know anything about the objects that will be described by the interface description object, least of all their names.
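For illustration only, such an interface description object could be as plain as the following sketch; the class name and its protocol of simply being callable are invented here, and the `cannot modify' half of the spec is left out:

    class HasTypedAttr:
        # encodes "an object with attribute <name> whose value passes <checker>"
        def __init__(self, name, checker):
            self.name = name
            self.checker = checker

        def __call__(self, thing):
            try:
                value = getattr(thing, self.name)
            except AttributeError:
                return False
            return self.checker(value)

    def is_int(value):
        return isinstance(value, int)

    has_int_foo = HasTypedAttr("foo", is_int)   # "has attribute foo, an integer"

Such a checker is an ordinary value: it can sit on the right of a !-assertion, be handed to a setattr hook, or be published by an object as an interface it claims to support.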
As long as we can construct objects which encode interface descriptions, we can * use !-assertions on values in-line, where those values are computed, * use those interface objects when filtering using a setattr hack * provide some mechanism for an object to `publish' the fact that it supports some given interface (as I understand it, someone's done this). and if we can't define such objects then there is no amount of fun and games with shadow files can possibly help us. (If I keep this up much longer, I shalln't have time to write about my object-unification schemes in time for IPC8 ... which I care about much more than type-checking, and which may simplify all this anyway.) Dinner time, Eddy. From janssen@parc.xerox.com Wed Dec 15 20:26:14 1999 From: janssen@parc.xerox.com (Bill Janssen) Date: Wed, 15 Dec 1999 12:26:14 PST Subject: [Types-sig] Shadow File Opinions? In-Reply-To: Your message of "Tue, 14 Dec 1999 18:41:56 PST." <3856FFF4.1C1A4AD@prescod.net> Message-ID: <99Dec15.122615pst."3601"@watson.parc.xerox.com> > Bill, while you're here, could you help me out with the CORBA IDL POV on > generic types? Does IDL support parameterization? Nope. It believes in inheritance and mixins, which give you a different set of capabilities. Actually, with valuetypes, that's become a much more reasonable position. Bill From mal@lemburg.com Wed Dec 15 18:14:20 1999 From: mal@lemburg.com (M.-A. Lemburg) Date: Wed, 15 Dec 1999 19:14:20 +0100 Subject: [Types-sig] Low-hanging fruit: recognizing builtins References: <199912151707.MAA02639@eric.cnri.reston.va.us> Message-ID: <3857DA7C.75433A13@lemburg.com> Guido van Rossum wrote: > > It's always bothered me from a performance point of view that using a > built-in costs at least two dict lookups, one failing (in the modules' > globals), one succeeding (in the builtins). This is done so that you > can define globals or locals that override the occasional builtin; > which is good since new Python versions can define new builtins, and > if you weren't allowed to override builtins this would break old code. > > Here's a way that per-module analysis plus a conservative assumption > plus an addition to the PVM (Python Virtual Machine) bytecode can > remove *both* dict lookups for most uses of builtins. > > Per-module analysis can easily detect that there are no global > variables named "len", say. In this case, any expression calling > len() on some object can be transformed into a new bytecode that calls > PyObjectt_Length() on the object at the top of the stack. Thus, a > sequence like > > LOAD_GLOBAL 0 (len) > LOAD_FAST 0 (a) > CALL_FUNCTION 1 > > can be replaced by > > LOAD_FAST 0 (a) > UNARY_LEN > > which saves one PVM roundtrip and two dictionary lookups, plus the > argument checking code inside the len() function. > > There are plenty of bytecodes available. > > In addition, we can now optimize common idioms involving builtins, the > most important one probably > > for i in range(n): ... > > We lose a tiny bit of dynamic semantics: if some bozo replaces > __builtin__.len with something that always returns 0, this won't > affect our module. Do we care? I don't. We don't have to do this > for *every* builtin; for example __import__() has as explicit > semantics that you can replace it with something else; for open() I > would guess that there must be plenty of programs that play tricks > with it. But range()? Or len()? Or type()? I don't think anybody > would care if these were less dynamic. 
Note that you can still > override these easily as globals, it just has to be visible to the > global analyzer. > > The per-module analysis required is major compared to what's currently > happening in compile.c (which only checks one function at a time > looking for assignments to locals) but minor compared to any serious > type inferencing. Clearly this does nothing for (ERR), but I bet it > could speed up the typical Python program with a substantial factor... I like this :-) How about also adding caching of globals which are not modified within the module in locals ? This would save another cylce or two. The caching would have to take place during function creation time. I'm currently doing this by hand which results in ugly code... :-( but faster execution :-) Note that interning the builtins as byte codes could be a security risk when executing in a restricted environment, though. Some builtin operations might not be allowed and but would still be available via bytecode. -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 16 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From guido@CNRI.Reston.VA.US Wed Dec 15 22:20:03 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Wed, 15 Dec 1999 17:20:03 -0500 Subject: [Types-sig] Low-hanging fruit: recognizing builtins In-Reply-To: Your message of "Wed, 15 Dec 1999 19:14:20 +0100." <3857DA7C.75433A13@lemburg.com> References: <199912151707.MAA02639@eric.cnri.reston.va.us> <3857DA7C.75433A13@lemburg.com> Message-ID: <199912152220.RAA07942@eric.cnri.reston.va.us> > How about also adding caching of globals > which are not modified within the module in locals ? This > would save another cylce or two. The caching would have > to take place during function creation time. I'm currently doing > this by hand which results in ugly code... :-( but faster execution > :-) Indeed -- the same analysis I was proposing would also support this. However there's a common pattern that can be a problem here (and isn't a problem for the built-in functions analysis): modules often have a few global variables that are initialized only once in the module, but are clearly (e.g. through comments) intended to be modified by using modules. Examples: default files, debug levels, and the like. I'm not sure how to detect this pattern reliably, unless you decide to cache only functions, classes, and imported modules. > Note that interning the builtins as byte codes could be > a security risk when executing in a restricted environment, > though. Some builtin operations might not be allowed and but would > still be available via bytecode. Of course a restricted environment should not accept arbitrary bytecode! Also you could simply not define bytecodes for security-sensitive built-ins; the only ones I cna think of right now are __import__() and open(), which I already mentioned as exceptions. Note that a bunch of built-in constants can also be optimized using this same mechanism: None and perhaps exception names. I'm not sure that exception names are worth it though; they don't tend to be touched in inner loops where performance gains are made. But None is definitely worth its own 1-byte opcode. 
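A rough sketch of the per-module analysis described above: walk a module's code objects and report built-in names that are only ever loaded, never stored, so a specialised opcode (or cached lookup) is "safe enough". It is written against the later dis.get_instructions API purely for brevity (the 1.5-era version would grovel over co_code by hand, or use Skip's tools), and it ignores complications such as import *:

    import dis
    import types
    import builtins

    def safe_builtins(module):
        loaded, stored = set(), set()

        def walk(code):
            for instr in dis.get_instructions(code):
                if instr.opname in ("LOAD_GLOBAL", "LOAD_NAME"):
                    loaded.add(instr.argval)
                elif instr.opname in ("STORE_GLOBAL", "STORE_NAME"):
                    stored.add(instr.argval)
            for const in code.co_consts:        # recurse into nested code objects
                if isinstance(const, types.CodeType):
                    walk(const)

        for obj in vars(module).values():
            if isinstance(obj, types.FunctionType):
                walk(obj.__code__)

        # builtins that are read but never rebound anywhere in this module
        return {name for name in loaded - stored if hasattr(builtins, name)}

    # safe_builtins(some_module) might return e.g. {"len", "range", "type"}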
--Guido van Rossum (home page: http://www.python.org/~guido/) From paul@prescod.net Wed Dec 15 23:55:05 1999 From: paul@prescod.net (Paul Prescod) Date: Wed, 15 Dec 1999 15:55:05 -0800 Subject: [Types-sig] Low-hanging fruit: recognizing builtins References: <199912151707.MAA02639@eric.cnri.reston.va.us> <3857E451.F7E4B6EC@prescod.net> <199912151921.OAA07698@eric.cnri.reston.va.us> Message-ID: <38582A59.74BEA4CC@prescod.net> Guido van Rossum wrote: > > > Dynamic bytecode registration would slow the PVM too much. I was thinking of just changing this: default: handler = handlers[opcode]; if( handler ){ handler( f ); }else{ fprintf(stderr, "XXX lineno: %d, opcode: %d\n", f->f_lineno, opcode); PyErr_SetString(PyExc_SystemError, "unknown opcode"); why = WHY_EXCEPTION; break; } -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things to be wary of: A new kid in his prime A man who knows the answers, and code that runs first time http://www.geezjan.org/humor/computers/threes.html From paul@prescod.net Thu Dec 16 02:51:54 1999 From: paul@prescod.net (Paul Prescod) Date: Wed, 15 Dec 1999 18:51:54 -0800 Subject: [Types-sig] What is the Essence of Python? (Was: Low-hanging fruit: recognizing builtins) References: Message-ID: <385853CA.62850B22@prescod.net> "Golden, Howard" wrote: > > I reiterate that we should define what is the essence of Python, so we know > what sort of dynamicism and flexibility we are trying to preserve, and what > is superfluous. Until we do this, we are dealing with a squishy set of > requirements. I think that that is always the case in language design. What one person hates is what another loves: even in Python! I don't know how to answer your question. I think that we can only argue about particular features "when we get to them." Most people probably do not use dynamicity to the same extent as the real power users but on the other hand they are the ones who are most fanatical about the language. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things to be wary of: A new kid in his prime A man who knows the answers, and code that runs first time http://www.geezjan.org/humor/computers/threes.html From gstein@lyra.org Thu Dec 16 03:05:01 1999 From: gstein@lyra.org (Greg Stein) Date: Wed, 15 Dec 1999 19:05:01 -0800 (PST) Subject: [Types-sig] What is the Essence of Python? In-Reply-To: <385853CA.62850B22@prescod.net> Message-ID: On Wed, 15 Dec 1999, Paul Prescod wrote: > "Golden, Howard" wrote: > > > > I reiterate that we should define what is the essence of Python, so we know > > what sort of dynamicism and flexibility we are trying to preserve, and what > > is superfluous. Until we do this, we are dealing with a squishy set of > > requirements. > > I think that that is always the case in language design. What one person > hates is what another loves: even in Python! I don't know how to answer > your question. I think that we can only argue about particular features > "when we get to them." I agree. It is like asking somebody to describe the color "blue" :-) I think there is a yardstick in there somewhere, that you can hold up to a feature or design and say "that's Pythonic" or "that's not". But it is very subjective and incapable of being described... 
Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Thu Dec 16 03:49:18 1999 From: gstein@lyra.org (Greg Stein) Date: Wed, 15 Dec 1999 19:49:18 -0800 (PST) Subject: [Types-sig] Handling attributes In-Reply-To: <3857DFBC.A2B7ADAC@prescod.net> Message-ID: On Wed, 15 Dec 1999, Paul Prescod wrote: > > Yes. My reluctance to specify types for instance variables is caused by > > problems with designing a nice, inline syntax for it. If you're not > > worrying about an inline syntax, then you can definitely add typedecls for > > instance and class attributes. > > Okay, but what about all of the other questions (updated slightly): I didn't reply to them because I didn't really have much of an opinion :-) In general, I might say: punt. Don't worry about that stuff right now. Worry about phase 1. Refining assignment behavior can come later, as that "should" be independent of what occurs in the first phase. Cheers, -g -- Greg Stein, http://www.lyra.org/ From tim_one@email.msn.com Thu Dec 16 03:55:18 1999 From: tim_one@email.msn.com (Tim Peters) Date: Wed, 15 Dec 1999 22:55:18 -0500 Subject: [Types-sig] A challenge In-Reply-To: <199912151421.JAA01106@eric.cnri.reston.va.us> Message-ID: <000501bf4779$5e566b40$58a2143f@tim> [Guido] > There seem to be several proposals for type declaration syntaxes > out there, with (mostly implied) suggestions on how to spell > various types etc. > > I personally am losing track of all the various proposals. You're not alone . > I would encourage the proponents of each approach to sit down with > some sample code and mark it up using your proposed syntax. Or write > the corresponding interface file, if that's your fancy. I like interface files fine, but will stick to inline "decl"s below. Apparently unlike anyone else here, I think explicit declarations can make code easier for *human readers* to understand -- so I'm not interested in hiding them from view. > I recommend using the sample code that I posted as a case study, > including some of the imported modules -- this should be a > reasonable but not extreme challenge. Sorry, but if we avoid excessive novelty, it's just a finger exercise <0.5 wink>. Note that you convert this back to Python 1.5.x code simply by commenting out the decl stmts. if-it-looks-a-lot-like-every-other-reasonable-declaration- syntax-you've-ever-seen-it-met-its-goal-ly y'rs - tim import sys, find decl main: def() -> None def main(): decl dir: String, list: [String], name: String dir = "." 
if sys.argv[1:]: dir = sys.argv[1] list = find.find("*.py", dir) list.sort() for name in list: print name if __name__ == "__main__": main() ---------------------------------------------------- import fnmatch import os decl _debug: Int # but Boolean makes more sense; see below _debug = 0 decl _prune: [String] _prune = ['(*)'] decl find: def(String, optional dir: String) -> [String] def find(pattern, dir = os.curdir): decl list, names: [String], name: String list = [] names = os.listdir(dir) names.sort() for name in names: decl name, fullname: String if name in (os.curdir, os.pardir): continue fullname = os.path.join(dir, name) if fnmatch.fnmatch(name, pattern): list.append(fullname) if os.path.isdir(fullname) and not os.path.islink(fullname): decl p: String for p in _prune: if fnmatch.fnmatch(name, p): if _debug: print "skip", `fullname` break else: if _debug: print "descend into", `fullname` list = list + find(pattern, fullname) return list #---------------------------------------------------------------------- import re # Declaring the type of _cache is irritating, because so far # as current Python is concerned a compiled regexp is of # type Instance, and that's too inclusive to be interesting. # I'm giving its class name instead. decl _cache: {String: RegexObject} _cache = {} # Assuming a Boolean "type" exists, if for no other reason # than to support meaningful (to humans!) type declarations. # Declaring all the function signatures in a block here, for # the heck of it. BTW, this is an example of how decls can # aid human comprehension -- e.g., I had to reverse-engineer # the code to figure out whether the "pat" arguments were # supposed to be strings or compiled regexps. They don't # both work, and the name "pat" doesn't answer it. decl fnmatch: def(String, String) -> Boolean, \ fnmatchcase: def(String, String) -> Boolean, \ translate: def(String) -> String def fnmatch(name, pat): import os name = os.path.normcase(name) pat = os.path.normcase(pat) return fnmatchcase(name, pat) def fnmatchcase(name, pat): if not _cache.has_key(pat): decl res: String res = translate(pat) _cache[pat] = re.compile(res) return _cache[pat].match(name) is not None def translate(pat): decl i, n: Int, res: String i, n = 0, len(pat) res = '' while i < n: decl c: String c = pat[i] i = i+1 if c == '*': res = res + '.*' elif c == '?': res = res + '.' elif c == '[': decl j: Int j = i if j < n and pat[j] == '!': j = j+1 if j < n and pat[j] == ']': j = j+1 while j < n and pat[j] != ']': j = j+1 if j >= n: res = res + '\\[' else: decl stuff: String stuff = pat[i:j] i = j+1 if stuff[0] == '!': stuff = '[^' + stuff[1:] + ']' elif stuff == '^'*len(stuff): stuff = '\\^' else: while stuff[0] == '^': stuff = stuff[1:] + stuff[0] stuff = '[' + stuff + ']' res = res + stuff else: res = res + re.escape(c) return res + "$" From tim_one@email.msn.com Thu Dec 16 03:55:23 1999 From: tim_one@email.msn.com (Tim Peters) Date: Wed, 15 Dec 1999 22:55:23 -0500 Subject: [Types-sig] Implementability In-Reply-To: <3857DFF4.F02097CB@prescod.net> Message-ID: <000601bf4779$614092e0$58a2143f@tim> [Tim] > making this [global type inference] all run efficiently (in either > time or space) is a Professional Pain in the Professional Ass. [Paul Prescod] > According to the principle of "from each according to their > talents" I'm afraid you've mistaken Benevolent Dictatorship for some variant of Communism . > you should be writing this optimizing, static type checker. Guido is the one interested in magical type inference; I'm not. 
I'm happy to explicitly declare the snot out of everything when I want something from static typing. Merely checking that my declared types match my usage is much easier (doesn't require any flow analysis). The good news is that I couldn't make time to write an inferencer even if I wanted to; if I had time, I'd be much more likely to write something that *used* explicit declarations to generate faster code. >> Because of this, global analysis never works out in practice >> unless you invent an efficient database format to cache the >> results of analysis, keeping that in synch with the source >> base under mutation. > Bah. The scope of compilation is the module. The scope of inference is > a namespace defining suite (e.g. a module, class body or method, but > not an "if" or "try"). "Bah"? I didn't say anything about the granularity of the cached analysis. For general Python use, module level sounds good to me too. But note that in the msg I was responding to, Guido was blue-skying a type checker for IDLE: for interactive use, he'll probably want quicker feedback than that (if a newbie breaks the type correctness of an "if" test with an edit, they should probably be told about it as soon as they move the cursor off the line!). >> ... >> If you can, in addition, avoid needing to deduce the types of >> most globals, it could actually fly before we're all dead . > The types of globals from other modules should be explicitly declared. A global type inferencer can usually figure that out on its own. There's more than one issue being discussed here, alas -- blame Guido <0.9 wink>. > If they aren't, they are presumed to have type PyObject or to return > PyObject. Or they just aren't available if you are in strict static > type check mode. In the language of the msg to which you're replying, they're associated with the universal set (the set of all types) -- same thing. Then e.g. declared_int = unknown is an error, but unknown1 = unknown2 is not. Whether unknown = declared_int should be an error is a policy issue. Many will claim it should be an error, but the correct answer is that it should not. Types form a lattice, in which "unknown" is the top element, and the basic rule of type checking is that the binding lhs = rhs is OK iff type(lhs) >= type(rhs) where ">=" is wrt the partial ordering defined by the type lattice (or, in English , only "widening" bindings are OK; like assigning an int to a real, or a subclass to a base class etc, but not their converses). phrase-it-that-way-or-not-you-end-up-with-the-same-rules-ly y'rs - tim From paul@prescod.net Thu Dec 16 03:02:00 1999 From: paul@prescod.net (Paul Prescod) Date: Wed, 15 Dec 1999 19:02:00 -0800 Subject: [Types-sig] Re: [Doc-SIG] Sorry! References: <385665DE.9963174B@prescod.net> <38569FFF.78EC8DA@prescod.net> <3857CDD1.D77331AD@prescod.net> <3857E6F9.52FC29F2@prescod.net> Message-ID: <38585628.D4B4934F@prescod.net> Edward Welbourne wrote: > > The only interface definition mechanism I can see needed is the one that > lets us specify the analogue of C structures and function types - that > is, the equivalent of a typedef. One interface thus defined can be > deployed for several objects that support it - this does not mean that > we have to have a separate *file* in which to say it, let alone a > separate file in which to re-iterate the specification of the interface > for each of the files which defines an export which matches that > interface. Who said anything about a separate file for every interface? 
The benefits of the shadow files have been documented in other messages including those in the thread "Shadow File Opinions" and "Progress" and "Interface Files". You said that static type checking was ugly to start with so I would have thought that you would prefer a proposal that separated the type declarations from your code. This is one of the reasons I like this strategy: to comfort those that didn't want static types in Python code. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things to be wary of: A new kid in his prime A man who knows the answers, and code that runs first time http://www.geezjan.org/humor/computers/threes.html From skip@mojam.com (Skip Montanaro) Thu Dec 16 05:38:18 1999 From: skip@mojam.com (Skip Montanaro) (Skip Montanaro) Date: Wed, 15 Dec 1999 23:38:18 -0600 (CST) Subject: [Types-sig] challenge response (was: A challenge) In-Reply-To: References: <199912151421.JAA01106@eric.cnri.reston.va.us> Message-ID: <14424.31434.689571.714592@dolphin.mojam.com> Greg> Line 7: per caveat #1, assume the compiler can access the Greg> find.find() function. From that, it knows the signature. The first Greg> parameter has a matching type, but the second (PyObject) does not Greg> match the required type (String), so an error is raised. If caveat Greg> #5 is resolved, then the second parameter matches. It is also Greg> possible to avoid the error by rewriting: Greg> list = find.find("*.py", dir!StringType) # 7 Greg> "list" is now a ListType, based on the find.find() return Greg> value. (see caveat #5 -- it could be possible to refine this Greg> knowledge). I humbly assert this train of thought rates a *bzzzt*. I thought one core requirement was that all type declaration stuff be optional. The worst that the type checker/inferencer should do in the face of incomplete type info is display a warning. I don't think you can flag an error unless the programmer sets some sort of PY_ANAL_TYPE_CHECKING_AND_I_REALLY_MEAN_IT environment variable. Skip Montanaro | http://www.mojam.com/ skip@mojam.com | http://www.musi-cal.com/ 847-971-7098 | Python: Programming the way Guido indented... From skip@mojam.com (Skip Montanaro) Thu Dec 16 05:54:20 1999 From: skip@mojam.com (Skip Montanaro) (Skip Montanaro) Date: Wed, 15 Dec 1999 23:54:20 -0600 (CST) Subject: [Types-sig] Interface files In-Reply-To: <3857E02B.53CF27AC@prescod.net> References: <3857E02B.53CF27AC@prescod.net> Message-ID: <14424.32396.684602.505977@dolphin.mojam.com> Greg> I stated a preference for allowing this information to reside in Greg> the same file as the implementation. i.e. I don't want to maintain Greg> two files. Paul> The nice thing about having separate files is that it becomes Paul> instantly clear what is "interesting" to the compiler. We have no Paul> backwards compatibility constraints. We have no questions about Paul> what variable are "in scope" and "available". It's just plain Paul> simpler. If you're determined to have some sort of syntax to support declarations, why not separate files for 1.6 and modified syntax for 2.0? 
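To make the separate-file idea concrete, a shadow interface file for the find module of Guido's challenge might be nothing more than ordinary Python in the spirit of Martijn's __types__ suggestion. The file name, the __types__ name, and the signature encoding below are all invented for illustration; none of it is settled syntax:

    # find_interface.py -- hypothetical shadow module for find.py
    __types__ = {
        "_debug": int,                      # module global
        "_prune": [str],                    # list-of-String
        "find":   ([str, str], [str]),      # (argument types, return type)
    }

A checker could import this shadow module and compare it against find.py, without the implementation file carrying any annotations at all.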
Skip From skip@mojam.com (Skip Montanaro) Thu Dec 16 05:57:36 1999 From: skip@mojam.com (Skip Montanaro) (Skip Montanaro) Date: Wed, 15 Dec 1999 23:57:36 -0600 (CST) Subject: [Types-sig] Low-hanging fruit: recognizing builtins In-Reply-To: <199912151921.OAA07698@eric.cnri.reston.va.us> References: <199912151707.MAA02639@eric.cnri.reston.va.us> <3857E451.F7E4B6EC@prescod.net> <199912151921.OAA07698@eric.cnri.reston.va.us> Message-ID: <14424.32592.50142.921358@dolphin.mojam.com> Guido> A first approximation would be to go hunt through all existing Guido> code objects in a module for LOAD_GLOBAL and STORE_GLOBAL opcodes Guido> with built-in names; for all such built-in names that have no Guido> STORE_GLOBAL anywhere, it's "safe enough" to use the special Guido> opcode. Then of course you will have to hunt through the Guido> bytecode for sequences of LOAD_GLOBAL(), followed by Guido> arbitrary code to load an object, followed by CALL_FUNCTION(1). Don't you also have to watch out for the dreaded from my_rewritten_builtins import * ? Skip From mal@lemburg.com Thu Dec 16 10:02:49 1999 From: mal@lemburg.com (M.-A. Lemburg) Date: Thu, 16 Dec 1999 11:02:49 +0100 Subject: [Types-sig] Low-hanging fruit: recognizing builtins References: <199912151707.MAA02639@eric.cnri.reston.va.us> <3857DA7C.75433A13@lemburg.com> <199912152220.RAA07942@eric.cnri.reston.va.us> Message-ID: <3858B8C9.24962AAD@lemburg.com> Guido van Rossum wrote: > > > How about also adding caching of globals > > which are not modified within the module in locals ? This > > would save another cylce or two. The caching would have > > to take place during function creation time. I'm currently doing > > this by hand which results in ugly code... :-( but faster execution > > :-) > > Indeed -- the same analysis I was proposing would also support this. > However there's a common pattern that can be a problem here (and isn't > a problem for the built-in functions analysis): modules often have a > few global variables that are initialized only once in the module, but > are clearly (e.g. through comments) intended to be modified by using > modules. Examples: default files, debug levels, and the like. I'm > not sure how to detect this pattern reliably, unless you decide to > cache only functions, classes, and imported modules. In the long run it would be better to wrap those module globals with write access functions (the write action would then be recognized by the optimizer). I haven't followed the thread too closely, but isn't there some way to tell the optimizer which modules to treat at what optimization level ? Old modules should only use the "safe" caching strategy then while modules compiled with full optimization would be caching all read-only globals. BTW, instead of adding oodles of new byte code, how about grouping them... e.g. instead of UNARY_LEN, BUILD_RANGE, etc. why not have a CALL_BUILTIN which takes an index into a predefined set of builtin functions. The same could be done with some often used constants such as None, '', 1, 0: LOAD_SYSTEM_CONST with an index into a constants array. The advantage is that you can easily extend both sets of prefetched constants while not adding too many new new byte codes to the inner loop. Note that the loop as it is built now is already too large for common Intel+compatible based CPUs. Adding even more byte codes to the huge single loop would probably result in a decrease of CPU cache hits. 
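The grouped-opcode idea is easy to mock up: a single CALL_BUILTIN opcode whose operand indexes a fixed table, rather than one new opcode per builtin. The toy below is plain Python for illustration only, not a sketch of the actual ceval.c dispatch loop:

    BUILTIN_TABLE = (len, range, type, abs, min, max)   # frozen at compile time

    def call_builtin(index, *args):
        # what a CALL_BUILTIN <index> opcode would do: one table lookup,
        # no dict lookups in either the module globals or the builtins
        return BUILTIN_TABLE[index](*args)

    call_builtin(0, [1, 2, 3])     # same as len([1, 2, 3]) -> 3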
(I split the Great Switch in two switch statements and got some good results out of this: the first switch handles often used byte codes while the second takes care of the more exotic ones.) A note on range() and for: the common usage of for i in range(const): ... could be compiled into a completely different set of opcodes not creating any list or tuple at all. Since the FOR_LOOP opcode generates loop integers on each iteration the creation of a range tuple or list is not needed. The loop opcode would only have to check for the upper bound "const". I've added a new counter type (basically a mutable integer type that allows for fast increment and decrement) to simplify this even more. For the curious, it's in the old patch: http://starship.skyport.net/~lemburg/mxPython-1.5.patch.gz > > Note that interning the builtins as byte codes could be > > a security risk when executing in a restricted environment, > > though. Some builtin operations might not be allowed and but would > > still be available via bytecode. > > Of course a restricted environment should not accept arbitrary > bytecode! Also you could simply not define bytecodes for > security-sensitive built-ins; the only ones I cna think of right now > are __import__() and open(), which I already mentioned as exceptions. > > Note that a bunch of built-in constants can also be optimized using > this same mechanism: None and perhaps exception names. I'm not sure > that exception names are worth it though; they don't tend to be > touched in inner loops where performance gains are made. But None is > definitely worth its own 1-byte opcode. See above: I'd rather like see the addition of more generic opcodes than many different new ones for each common constant. -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 15 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From m.faassen@vet.uu.nl Thu Dec 16 13:31:46 1999 From: m.faassen@vet.uu.nl (Martijn Faassen) Date: Thu, 16 Dec 1999 14:31:46 +0100 Subject: [Types-sig] minimal or major change? (was: RFC 0.1) References: Message-ID: <3858E9C2.E2722B88@vet.uu.nl> Greg Stein wrote: > > On Wed, 15 Dec 1999, Martijn Faassen wrote: > > ... me: stating the "GFS proposal" isn't that major of a change ... [I'm disagreeing with the 'isn't that big of a change' thesis, Greg defends fairly well that it is, but I still disagree with him. I don't think our disagreeing will matter much in the future, though, so let's forget about it.. I'll answer some points he raised in the following, but not to defend my point of view :)] [snip] > > * A whole new operator (which you can't overload..or can you?), which > > does something quite unusual (most programmers associate types with > > names, not with expressions). The operation also doesn't actually return > > much that's useful to the program, so the semantics are weird too. > > No, you cannot overload the operator. That would be a Bad Thing, I think. > That would throw the whole type system into the garbage :-). Okay, in that sense the operator would be special, as generally operators in Python can be overloaded (directly or indirectly). I'd agree you shouldn't be able to overload this one, though. > The operator is not unusual: it is an inline type assertion. It is not a > "new-fangled way to declare the type of something." But it's quite unusual to the programmer coming from most other languages, still. 
That doesn't mean it's bad, but Python isn't an experimental language, so this could be an objection to the operator approach. > It is simply a new > operation. The compiler happens to be able to create associations from it, > but that does *not* alter the basic semantic of the operation. > > Given: > > x = y or z > > In the above statement, it returns "y" if it is "true". In the statement: > > x = y ! z > > It returns "y" if it has "z" type; otherwise, throws an exception. The > semantics aren't all the difficult or unusual. Okay, that isn't that unusual as other operator operations can throw exceptions under some circumstances as well. Well defended. :) [snip] > > * Interfaces with a new 'decl' statement. [If you punt on this you'll > > have to the innocent Python programmer he can't use the static type > > system with instances? or will we this be inferenced?] > > Yes, I'd prefer to punt this for a while, as it is a much larger can of > worms. It is another huge discussion piece. In the current discussion, I > believe that we can factor out the interface issue quite easily -- we > can do a lot of work now, and when interfaces arrive, they will slide > right in without interfering with the V1 work. In other words, I believe > there is very little coupling between the proposal as I've outline, and > the next set of type system extensions (via interfaces). Hm, I'm still having some difficulty with this; as I understand it your proposal would initially only work with functions (not methods) which only use built-in types (not class instances). Am I right, or perhaps I'm missing something.. [snip] > > Adding anything like static type checking to Python entails fairly major > > changes to the language, I'd think. Not that we shouldn't aim at keeping > > those transparant and mostly compatible with Python as it is now, but > > what we'll add will still be major. > > Sure. You say 'sure' to me saying it'll still be major? :) Oh, wait, I wasn't arguing about that anymore! > I think we're just viewing it a bit differently. To me, something > like the metaclass stuff was a big change: it is capable of altering the > very semantics of class construction. Adding package support was the same > -- Python moved from a flat import space to an entirely new semantic for > importing and application packaging. Both happened before I was involved with Python, and I still don't know much about metaclasses, so I can't comment on this one. > > > > The 'simplicity' part comes in because you don't need *any* type > > > > inferencing. Conceptually it's quite simple; all names need a type. > > > > > > 1) There is *no* way that I'm going to give every name a type. I may as > > > well switch to Java, C, or C++ (per Guido's advice in another email :-) > > > > Sure, but we're looking at *starting* the process. Perhaps we can do > > away with specifying the type of each local variable very quickly by > > using type inferencing, but at least we'll have a working > > implementation! > > I don't want to start there. I don't believe we need to start there. And > my point (2) below blows away your premise of simplicity. Since you still > need inferencing, the requirement to declare every name is not going to > help, so you may as well relax that requirement. But you'd only need expression inferencing, which I was ('intuitively' :) assuming is easier than the larger scale thing. 
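For reference, the run-time half of the ! operator Greg describes can be mocked up as an ordinary function. "!" is not legal Python, and type_assert below is a hypothetical stand-in used only to show the semantics (return the value if it has the asserted type, otherwise raise), in contrast with check_type, which returns None:

    def type_assert(value, expected_type):
        # x = y ! z   would behave roughly like   x = type_assert(y, z)
        if isinstance(value, expected_type):
            return value
        raise TypeError("expected %s, got %s" % (expected_type, type(value)))

    x = type_assert(42, int)       # passes; a checker may now treat x as an int
    # type_assert("42", int)       # would raise TypeError at run time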
[snip] > > I'm not saying this is a good situation, it's just a way to get off the > > ground without having to deal with quite a few complexities such as > > inferencing (outside expressions), interaction with modules that don't > > have type annotations, and so on. I'm *not* advocating this as the end > > point, but I am advocating this as an intermediate point where it's > > actually functional. > > IMO, it is better to assume "PyObject" when you don't have type > information, rather than throw an error. Detecting the lack of type info > is the same in both cases, and the resolution of the lack is easy in both > mehtods: throw an error, or substitute "PyObject". I prefer the latter so > that I don't have to update every module I even get close to. I still don't understand how making it a PyObject will help here. Would this mean a run-time check would need to be inserted whenever PyObject occurs in a function with type annotations? In my approach this would be part of the Python/Static Python interface work. How does it fit in for you? [snip] > > Yes, but now you're building a static type checker *and* a Python > > compiler inserting run time checks into bytecodes. This is two things. > > This is more work, and more interacting systems, before you get *any* > > payoff. My sequence would be: > > Who says *both* must be implemented in V0.1? If the compiler can't figure > it out, then it just issues a warning and continues. Some intrepid > programmer comes along and tweaks the AST to insert a runtime check. Done. > The project is easily phased to give you a working system very quickly. > > Heck, it may even be easier for the compiler to insert runtime checks in > V0.1. Static checking might come later. Or maybe an external tool does the > checking at first; later to be built into the compiler. That's true; the other approach would start with adding run-time checks and proceed to a static checker later. > ... proposed implementation order ... > > If you don't separate out your development path like this you end up > > having to do it all at once, which is harder and less easy to test. > > Of course. Nobody is suggesting a "do it all at once" course of > implementation. So that's where I'm coming from. It's important for our proposal to actually come up with a workable development plan, because adding type checking to Python is rather involved. So I've been pushing one course of implementation towards a testable/hackable system that seems to give us the minimal amount of development complexities. I haven't seen clear development paths from others yet; most proposals seem to involve both run-time and compile-time developments at the same time. So I'm interested to see other development proposals; possibly there's a simpler approach or equally complex approach with more payoff, that I'm missing. > > [Paul] > > > > > I see no reason for that limitation. The result of a call to a > > > > > non-static function is a Pyobject. You cast it in your client code to > > > > > get type safety. Just like the shift from K&R C to ANSI C. Functions > > > > > > Bunk! It is *not* a cast. You cannot cast in Python. It is a type > > > assertion. An object is an object -- you cannot cast it to something else. > > > Forget function call syntax and casting syntax -- they don't work > > > grammatically, and that is the wrong semantic (if you're using that format > > > to create some semantic equivalent to a cast). 
> > > > This'd be only implementable with run-time assertions, I think, unless > > you do inferencing and know what the type the object is after all. So > > that's why I put the limitation there. Don't allow unknown objects > > entering a statically typed function before you have the basic static > > type system going. After that you can work on type inference or cleaner > > interfaces with regular Python. > > Why not allow unknown objects? Just call it a PyObject and be done with > it. Hm, I suppose I'm looking at it from the OPT point of view; I'd like to see a compiler that exploits the type information. If you have PyObjects this seems to get more difficult; could be solved if you had an interpreter waiting in the sidelines that would handle stuff like this that can't be compiled. > Note that the type-assert operator has several purposes: > > * a run-time assertion (and possibly: unless -O is used) > * signal to the compiler that the expression value will have that type > (because otherwise, an exception would hav been raised) > * provides a mechanism to type-check: if the compiler discovers (thru > inferencing) that the value has a different type than the right-hand > side, then it can flag an error. > > The limitation you propose would actually slow things down. People would > not be able to use the type system until a lot of modules were > type-annotated. I think I'm starting to see where you're coming from now, with the ! operator. It allows you to say 'from this point on, this value is an int, otherwise the operator would've raised an exception'. The inferencer and checker can exploit this. The point where I am coming from is however that you lose compile-time checkability as soon as you use any function that inserts PyObjects into the mix. I'm afraid that even with the operator, you wouldn't be able to check most of the code, if PyObjects are freely allowed. Perhaps I'm wrong, but I'd like to see some more debate about this. > > But perhaps I'm mistaken and local variables don't need type > > descriptions, as it's easy to do type inferencing from the types of the > > function arguments and what the function returns, > > That is my (alas: unproven) belief. How do we set about to prove it? Here I'll come with my approach again; if you have a type checker that can handle a fully annotated function (all names used in the function have type annotations), then you have a platform you can build on to develop a type checker. Then you can figure out what does need type annotations and what doesn't. You simply try to build code that adds type annotations itself, based on inferences. You can spew out warnings: "full type inferencing not possible, cannot figure out type of 'foo'". The programmer can then go add type info for 'foo'. If all types are known one way (specified) or the other (inferred), a compiler can start to do heavy duty optimization on that code. [snip] > > I'd like to see some actual > > examples of how this'd work first, though. For instance: > > > > def brilliant() ! IntType: > > a = [] > > a.append(1) > > a.append("foo") > > return a[0] > > > > What's the inferred type of 'a' now? A list with heterogenous contents, > > that's about all you can say, and how hard is it for a type inferencer > > to deduce even that? > > It would be very difficult for an inferencer. It would have to understand > the semantics of ListType.append(). Specifically, that the type of the > argument is added to the set of possible types for the List elements. 
> > Certainly: a good inferencer would understand all the builtin types and > their methods' semantics. > > > But for optimization purposes, at least, but it > > could also help with error checking, if 'a' was a list of IntType, or > > StringType, or something like that? > > It would still need to understand the semantics to do this kind of > checking. In my no-variable-declaration world, the type error would be > raised at the return statement. a[0] would have the type set: (IntType, > StringType). The compiler would flag an error stating "return value may be > a StringType or an IntType, but it must only be an IntType". Right, I think this would be the right behavior. But it becomes a lot easier to get a working implementation if you get to specify the type of 'a'. If you say a is a list of StringType, it's then relatively easy for a compile time checker to notice that you can't add an integer to it. And possibly it also becomes clearer for the programmer; I had to think to figure out why your compiler would complain about a[0]. I had to play type inferencer myself. I don't have to think as much if I get to specify what list 'a' may contain; obviously if something else it put into it, there should be an error. > > It seems tough for the type > > inferencer to be able to figure out that this is so, but perhaps I'm > > overestimating the difficulty. > > Yes it would be tough -- you aren't overestimating :-) What would your path towards successful implementation be, then? Regards, Martijn From Edward Welbourne Thu Dec 16 14:09:02 1999 From: Edward Welbourne (Edward Welbourne) Date: Thu, 16 Dec 1999 14:09:02 +0000 Subject: [Types-sig] Re: [Doc-SIG] Sorry! In-Reply-To: <38585628.D4B4934F@prescod.net> References: <385665DE.9963174B@prescod.net> <38569FFF.78EC8DA@prescod.net> <3857CDD1.D77331AD@prescod.net> <3857E6F9.52FC29F2@prescod.net> <38585628.D4B4934F@prescod.net> Message-ID: [Ooops - sent it to Paul but not the group, again] > Who said anything about a separate file for every interface? Dunno - I wasn't supposing anyone had. You are asking for a separate file for each module, for the purpose of saying which interfaces the things in the module support. My (admittedly poorly expressed) point was that any mechanism for doing this depends on ways of saying what an interface is; which could as readilly be stated in the source file as in a separate one. If it's so big and unweildy that it needs to go in a separate source file, then: * anything else I write that has the same interface has to include a copy of this big and unweildy thing, instead of just referencing the same interface-description object * it's big and unweildy, so its right out. > The benefits of the shadow files have been documented ... Hmm, I'm not sure I read Guido's > I think that any proposal that requires you to keep two separate files > "in sync" is bound to fail in the long term. I left that crap behind > in C++. But in the short term...okay. as anything but a `yes, we could use this as a temporary measure to let us experiment with static typing within python 1'. And it appears to be typical of the comments to date. I guess this means I should ask: do you consider the changes you're proposing to be * temporary measures or * how python 2 will do this ? 
If the former, then we have nothing to argue about - my sole concern is how python 2 can do this (and, for context, I'm not particularly keen on bothering - but if it's going to be done, I want to be *very* sure it isn't going to foul some of the truly lovely things python 2 could be ...) > You said that static type checking was ugly to start with so I would > have thought that you would prefer a proposal that separated the type > declarations from your code. I also consider many kinds of large industrial plant to be ugly: and whether I can see them or not doesn't enter into it (except that the ones no-one can see can get away with uglier visual appearance than the others): and the `ugliness' I'm sensing isn't a surface-thing. What my prejudices are pinging off is (something I can't properly express, or I wouldn't call it a prejudice, but it's) about the fact that we're *saying things about* what interfaces an object supports, rather than just leaving exceptions to get raised when those interfaces of it get exercised. As with large industrial plant, I can see how it may serve a useful purpose ... but I'd far sooner see that purpose served some other way, or find some way of dodging the need to serve that purpose. Besides, various of the proposals I've seen since being so horribly judgemental have swayed me towards a more ... restrained ... view of type-checking. I'm not keen on it, but maybe it's not as bad as my first impressions. However, trying to hide it so that I'll forget it's there is *much* less welcome than confronting me with something I find ugly. > This is one of the reasons I like this strategy: to comfort those that > didn't want static types in Python code. Don't bother. I don't want comforted. (Although, I confess, I sometimes wish I had my teddy-bear, but don't tell the shrinks ...) If there's going to be a mechanism for saying, in the source, which variables have (and/or which expressions hold) which kinds of value: let it be * straightforward * general (or, in the first instance, straightforwardly extensible to full generality) * part of the source code I'm *much* happier with Tim Peters' scheme, using typedecls, than with any scheme involving my source living in a separate place from something that will make a difference to how it gets compiled. (Tim: the type Boolean is a (useful) synonym for PyObject. It probably includes some added semantics about how you should be trying to use it.) (I'd even be happy with the typedecl incorporating the docstring, which is part of the interface spec after all: and would make the run-time thing actually called be lighter-weight in some probably-irrelevant sense.) Actually, there are two doc-strings in two places: one is the doc-string of (say) the function object - it says what the function does - the other is where some object carrying that function documents the role of the attribute as which it stores that object. This is directly analogous to the two forms of type declaration: one describes an object, regardless of any names we may be calling it, the other describes what some namespace holds under some name. 
OK, now for a *technical* reason in favour of same-sourcefile typedecls: specifically, for the typedecl of a name to appear in the namespace to which it is local: A typedecl's execution (c.f.: del) can lead to the namespace setting up, within its infrastructure (stuff like __dict__), the magic setattr hookery that can implement (possibly under the bonnet) * such enforcing of type-checking (on the names stipulated) as the namespace is prepared to bother with (and the typedecl requests). * whatever machinery distant code is meant to use to ask about the types of the attributes of the objects whose namespace this is. and, incidentally, this `each namespace is responsible for managing itself' mentality says that one can stipulate stuff *within* a function which isn't ever going to be visible to the outside world (e.g. the shadow file) - for instance, if a function contains code which defines and returns a class which defines some method under a name controlled by the arguments recieved by the function, how on earth is the shadow file going to say anything at all about the return type of the function that doesn't simply ignore the argument-dependent stuff ? (OK, I know that was hard to read, so a wilfully perverse example follows.) def crazyIknow(methname): class lumpy (previously, defined, bases): pass def doit(self, kinky, key): return some(expression, involving, self, kinky, key, andmaybe, methname) setattr(lumpy, methname, doit) return lumpy A declspec (or, indeed, type-assertion-for-values) scheme can say something useful about all this, specifically what the type of doit is. How would a shadow file cope ? Work calls, Eddy. From m.faassen@vet.uu.nl Thu Dec 16 14:52:12 1999 From: m.faassen@vet.uu.nl (Martijn Faassen) Date: Thu, 16 Dec 1999 15:52:12 +0100 Subject: [Types-sig] A challenge References: <199912151421.JAA01106@eric.cnri.reston.va.us> Message-ID: <3858FC9C.8472CB62@vet.uu.nl> Hi there, Here's my approach towards type annotations. Note that this syntax is not be very readable, but it is very powerful as it's Python; readable syntax can be developed later. I was lucky because no classes are defined in any of the modules in Guido's example, this means it's fairly readable. :) Type annotations for function locals generally follow under the function definition, type annotations for the entire module follow at the bottom of the module. This order is taken mostly because __types__ uses the other type declarations in itself; the type checker can simply look at __types__ and find all information in there; the __types_foo__ notation is just for convenience. All the type annotations could of course also reside in external interface files. I'm also hinting at a typedef system for complicated composite types. Regards, Martijn #---------------------------------------------------------------------- import sys, find # assume static type classes and such are builtin, for this example def main(): dir = "." 
if sys.argv[1:]: dir = sys.argv[1] list = find.find("*.py", dir) list.sort() for name in list: print name __types_main__ = { 'dir' : StringType, 'list' : ListType(StringType), 'sys.argv' : ListType(StringType) # supply extra types by hand # the type checker should look the 'find' module for more type information # automatically } if __name__ == "__main__": main() __types__ = { 'main' : FunctionType(args=None, result=None, local=__types_main__) 'name' : StringType, '__name__' : StringType # might already be defined somewhere else } #---------------------------------------------------------------------- #---------------------------------------------------------------------- import fnmatch import os _debug = 0 _prune = ['(*)'] def find(pattern, dir = os.curdir): list = [] names = os.listdir(dir) names.sort() for name in names: if name in (os.curdir, os.pardir): continue fullname = os.path.join(dir, name) if fnmatch.fnmatch(name, pattern): list.append(fullname) if os.path.isdir(fullname) and not os.path.islink(fullname): for p in _prune: if fnmatch.fnmatch(name, p): if _debug: print "skip", `fullname` break else: if _debug: print "descend into", `fullname` list = list + find(pattern, fullname) return list __types_find__ = { 'list' : ListType(StringType), 'names' : ListType(StringType), 'name' : StringType, 'os.curdir' : StringType, 'os.pardir' : StringType, 'fullname' : StringType, 'os.path.isdir' : ImpFunctionType(args=(StringType,), result=IntegerType), 'os.path.islink' : ImpFunctionType(args=(StringType,), result=IntegerType), 'p' : StringType, } __types__ = { '_debug' : IntegerType, '_prune' : ListType(StringType), 'find' : FunctionType(args=(StringType, StringType), result=ListType(StringType), local=__types_find__) } #---------------------------------------------------------------------- #---------------------------------------------------------------------- import re _cache = {} def fnmatch(name, pat): import os name = os.path.normcase(name) pat = os.path.normcase(pat) return fnmatchcase(name, pat) __types_fnmatch__ = { 'os.path.normcase' : ImpFunctionType(args=(StringType,), result=StringType), } def fnmatchcase(name, pat): if not _cache.has_key(pat): res = translate(pat) _cache[pat] = re.compile(res) return _cache[pat].match(name) is not None __types_fnmatchcase__ = { 'res' : StringType, } def translate(pat): i, n = 0, len(pat) res = '' while i < n: c = pat[i] i = i+1 if c == '*': res = res + '.*' elif c == '?': res = res + '.' elif c == '[': j = i if j < n and pat[j] == '!': j = j+1 if j < n and pat[j] == ']': j = j+1 while j < n and pat[j] != ']': j = j+1 if j >= n: res = res + '\\[' else: stuff = pat[i:j] i = j+1 if stuff[0] == '!': stuff = '[^' + stuff[1:] + ']' elif stuff == '^'*len(stuff): stuff = '\\^' else: while stuff[0] == '^': stuff = stuff[1:] + stuff[0] stuff = '[' + stuff + ']' res = res + stuff else: res = res + re.escape(c) return res + "$" __types_translate__ = { 'i' : IntegerType, 'n' : IntegerType, 'res' : StringType, 'c' : StringType, # or CharType ? 
'j' : IntegerType, 'stuff' : StringType, } __types__ = { # cheating; I'm assuming re has a ReObjectType defined somewhere # this is probably a very complicated construction # we're also assuming re functions are defined in re '_cache' : DictType(key=StringType, value=re.__typedefs__['ReObjectType']), 'fnmatch' : FunctionType(args=(StringType, StringType), result=IntegerType, local=__types_fnmatch__), 'fnmatchcase' : FunctionType(args=(StringType, StringType), result=IntegerType, local=__types_fnmatchcase__), 'translate' : FunctionType(args=(StringType,), result=StringType, local=__types_translate__), } #---------------------------------------------------------------------- From m.faassen@vet.uu.nl Thu Dec 16 15:03:29 1999 From: m.faassen@vet.uu.nl (Martijn Faassen) Date: Thu, 16 Dec 1999 16:03:29 +0100 Subject: [Types-sig] Progress References: <38579D70.75E477BA@prescod.net> Message-ID: <3858FF41.501B12F1@vet.uu.nl> Paul Prescod wrote: > > We are actually making progress among all of the sound and fury here. > You guys have a lot of good ideas and I think that we are converging > more than it seems. > > 1. Most people seem to agree with the idea that shadow files allow us a > nice way to separate type assertions out so that their syntax can vary. > I think Greg disagreed but perhaps not violently enough to argue about > it. Interface files are in. Inline syntax is temporarily out. Syntactic > "details" to be worked out. Actually I can see the arguments for including type annotations in the module files themselves as there's something to say for keeping it together. As long as the syntax isn't inline in our first design I'm fine. See the syntax example I just posted to the list. > 2. Everybody but me is comfortable with defining > genericity/templating/parameterization only for built-in types for now. What do you mean by 'built-in types'? Does this include classes? > But now that we are separating interfaces from implementations I am > thinking that I may be able to think more clearly about > parameterizability. It may be possible to define parameterizable > interfaces by IPC8. Parameterization is in. Syntactic "details" to be > worked out. See my syntax response to Guido's challenge for my take on things. > 3. We agree that we need a syntax for asserting the types of expressions > at runtime. I'm not sure I do agree with this. It's an intruiging concept but I'm still not convinced we shouldn't go with annotating names instead. This may be easier to think about for the programmer, see an earlier response of mine to the list for an example. [snip] > 5. It isn't clear WHAT we can specify in "PyDL" interface files. Clearly > we can define function, class/interface and method interfaces. > > a. do we allow declarations for the type of non-method instance > variables? Yes, eventually at least. We could focus on functions first, but I think supporting classes will become necessary very quickly. > b. do we check assignments to class and module attributes from other > modules at runtime? We need to expect that some cross-module assignments > will come from modules that are not statically type checked. You can manually add extra annotations for the names you use from other modules that those other modules don't annotate; see my syntax proposal. > c. should we perhaps just disallow writing to "declared" attributes > from other modules? Hm, yes, this could become complicated, even for run-time checks. We should come up with somekind of rule. 
run-time checks can help, but we need to figure out when they're necessary, and when they aren't; i.e if you write to a declared attribute from a module with something that doesn't have a compile-time type associated, a run time check should occur. But otherwise, it shouldn't. > d. is it possible to write to UN-declared attributes from other > modules? And what are the type safety implications of doing so? This would generally be fine; undeclared attributes can contain objects of any type, right? What will be tricky (and which is why I'm clamoring for full type annotations and other strictness, at least initially) is *reading* from these.. Regards, Martijn From m.faassen@vet.uu.nl Thu Dec 16 15:08:42 1999 From: m.faassen@vet.uu.nl (Martijn Faassen) Date: Thu, 16 Dec 1999 16:08:42 +0100 Subject: [Types-sig] Interface files References: <3857E02B.53CF27AC@prescod.net> Message-ID: <3859007A.FD369FB2@vet.uu.nl> Paul Prescod wrote: > > Greg Stein wrote: > > > > I stated a preference for allowing this information to reside in the same > > file as the implementation. i.e. I don't want to maintain two files. > > The nice thing about having separate files is that it becomes instantly > clear what is "interesting" to the compiler. We have no backwards > compatibility constraints. We have no questions about what variable are > "in scope" and "available". It's just plain simpler. Look at my proposal (response to Guido's challenge). It's in the same file, and backwardly compatible, and it's instantly clear what the compiler looks at. > There is also something deeply elegant and useful about a separation of > interface from implementation. It can be helpful, but that doesn't mean it needs to be in a separate file. :) > Sure, you don't always want to be REQUIRED to separate them. I > acknowledge that we will one day have to support inline declarations but > I'm going to put it off unless I hear some screaming. Right, but Greg can't put it off, as he is advocating his operators, which have to be inline. > > I'll go further and state that we should not use a new language for this. > > It should just be Python. (and this is where Martijn's __types__ thing > > comes in, although I'm not advocating that format) > > I think that that's an unreasonable (and unreadable) constraint. The > language should probably be pythonic, but not necessarily Python. Python > doesn't have a type declaration syntax and none of Python's existing > syntax was meant to be used AS a type declaration syntax. It just gets > too unreadable for quasi-complicated declarations. We need to support > polymorphic and parameteric higher order functions! It may become fairly readable if you support typedefs (which can be used in type anontations). But I agree that this isn't the final solution; the final solution should probably be some nice Pythonic syntax. But for now: * it's quickly implementable * it's instantly usable by tools written in Python * it's understandable by anyone who can read Python * it's backwards compatible * we don't have to debate about syntax anymore and can actually think about semantics without syntax confusion. These are major advantages during the development. Regards, Martijn From m.faassen@vet.uu.nl Thu Dec 16 15:14:21 1999 From: m.faassen@vet.uu.nl (Martijn Faassen) Date: Thu, 16 Dec 1999 16:14:21 +0100 Subject: [Types-sig] Handling attributes References: Message-ID: <385901CD.258DD7DB@vet.uu.nl> Greg Stein wrote: > > On Wed, 15 Dec 1999, Paul Prescod wrote: > > > Yes. 
My reluctance to specify types for instance variables is caused by > > > problems with designing a nice, inline syntax for it. If you're not > > > worrying about an inline syntax, then you can definitely add typedecls for > > > instance and class attributes. > > > > Okay, but what about all of the other questions (updated slightly): > > I didn't reply to them because I didn't really have much of an opinion :-) > > In general, I might say: punt. Don't worry about that stuff right now. > Worry about phase 1. Refining assignment behavior can come later, as that > "should" be independent of what occurs in the first phase. I'll note that this fits in with my agenda; I was thinking about worrying about a single module for now, that has full static type annotations. You can then ignore a lot of the problems you get when interfacing with non-annotated modules. If you want to use things (functions, classes) from other modules, you can put in temporary annotations for them, but not in those modules; you put them in the module that's using them (see again my response to Guido's challenge for examples). Basically you define not only the module's interface seen from the outside, but also how the module interfaces with the other modules on the inside. You define *all* interfacing in any direction that the module is involved in. Regards, Martijn From m.faassen@vet.uu.nl Thu Dec 16 15:22:49 1999 From: m.faassen@vet.uu.nl (Martijn Faassen) Date: Thu, 16 Dec 1999 16:22:49 +0100 Subject: [Types-sig] Implementability References: <000601bf4779$614092e0$58a2143f@tim> Message-ID: <385903C9.98B69A30@vet.uu.nl> Tim Peters wrote: > > [Tim] > > making this [global type inference] all run efficiently (in either > > time or space) is a Professional Pain in the Professional Ass. > > [Paul Prescod] > > According to the principle of "from each according to their > > talents" > > I'm afraid you've mistaken Benevolent Dictatorship for some variant of > Communism . > > > you should be writing this optimizing, static type checker. > > Guido is the one interested in magical type inference; I'm not. I'm happy > to explicitly declare the snot out of everything when I want something from > static typing. Yay, someone on my side, and it's Tim too! (now I get to watch Tim drag himself quickly out of this and into a position completely incompatible with mine :) > Merely checking that my declared types match my usage is > much easier (doesn't require any flow analysis). The good news is that I > couldn't make time to write an inferencer even if I wanted to; if I had > time, I'd be much more likely to write something that *used* explicit > declarations to generate faster code. Right -- this would be the first step towards a magic inferencer anyway; you simply let it come up with 'explicit' declarations by itself, which you then fit into your optimizer. [snip] > > The types of globals from other modules should be explicitly declared. > > A global type inferencer can usually figure that out on its own. There's > more than one issue being discussed here, alas -- blame Guido <0.9 wink>. > > > If they aren't, they are presumed to have type PyObject or to return > > PyObject. Or they just aren't available if you are in strict static > > type check mode. > > In the language of the msg to which you're replying, they're associated with > the universal set (the set of all types) -- same thing. Then e.g. 
> > declared_int = unknown > > is an error, but Or, if you're interfacing with untyped python, this could raise a run-time exception if unknown doesn't turn out to be an integer. Or do you disagree with this? > unknown1 = unknown2 > > is not. Whether > > unknown = declared_int > > should be an error is a policy issue. Many will claim it should be an > error, but the correct answer is that it should not. This would seem to be the natural way to do it; I'm not sure why many would claim it should be an error. Could you explain? > Types form a > lattice, in which "unknown" is the top element, and the basic rule of type > checking is that the binding > > lhs = rhs > > is OK iff > > type(lhs) >= type(rhs) > > where ">=" is wrt the partial ordering defined by the type lattice (or, in > English , only "widening" bindings are OK; like assigning an int to a > real, or a subclass to a base class etc, but not their converses). I agree. Regards, Martijn From guido@CNRI.Reston.VA.US Thu Dec 16 15:38:24 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Thu, 16 Dec 1999 10:38:24 -0500 Subject: [Types-sig] Low-hanging fruit: recognizing builtins In-Reply-To: Your message of "Wed, 15 Dec 1999 23:57:36 CST." <14424.32592.50142.921358@dolphin.mojam.com> References: <199912151707.MAA02639@eric.cnri.reston.va.us> <3857E451.F7E4B6EC@prescod.net> <199912151921.OAA07698@eric.cnri.reston.va.us> <14424.32592.50142.921358@dolphin.mojam.com> Message-ID: <199912161538.KAA08333@eric.cnri.reston.va.us> > Guido> A first approximation would be to go hunt through all existing > Guido> code objects in a module for LOAD_GLOBAL and STORE_GLOBAL opcodes > Guido> with built-in names; for all such built-in names that have no > Guido> STORE_GLOBAL anywhere, it's "safe enough" to use the special > Guido> opcode. Then of course you will have to hunt through the > Guido> bytecode for sequences of LOAD_GLOBAL(), followed by > Guido> arbitrary code to load an object, followed by CALL_FUNCTION(1). > > Don't you also have to watch out for the dreaded > > from my_rewritten_builtins import * A module using "from whatever import *" loses the benefits of this optimization. Serves them right. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@CNRI.Reston.VA.US Thu Dec 16 15:44:04 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Thu, 16 Dec 1999 10:44:04 -0500 Subject: [Types-sig] Low-hanging fruit: recognizing builtins In-Reply-To: Your message of "Thu, 16 Dec 1999 11:02:49 +0100." <3858B8C9.24962AAD@lemburg.com> References: <199912151707.MAA02639@eric.cnri.reston.va.us> <3857DA7C.75433A13@lemburg.com> <199912152220.RAA07942@eric.cnri.reston.va.us> <3858B8C9.24962AAD@lemburg.com> Message-ID: <199912161544.KAA08348@eric.cnri.reston.va.us> [MAL] > In the long run it would be better to wrap those module > globals with write access functions (the write action would > then be recognized by the optimizer). Yes, but we need to deal with the current idiom or we'd break too much code. (When I have to break some valid code, I'd rather do it in an explicit way, e.g. by adding a keyword, rather than silently changing working code into non-working code for an obscure reason.) > I haven't followed the thread too closely, but isn't there > some way to tell the optimizer which modules to treat at > what optimization level ? Old modules should only use the > "safe" caching strategy then while modules compiled with > full optimization would be caching all read-only globals. That hasn't been discussed this time around. 
I think you have proposed more optimization control in the past; that's still a good idea. > BTW, instead of adding oodles of new byte code, how about > grouping them... e.g. instead of UNARY_LEN, BUILD_RANGE, etc. > why not have a CALL_BUILTIN which takes an index into > a predefined set of builtin functions. > > The same could be done with some often used constants > such as None, '', 1, 0: LOAD_SYSTEM_CONST with an index > into a constants array. > > The advantage is that you can easily extend both sets > of prefetched constants while not adding too many > new new byte codes to the inner loop. Good ideas. > Note that the loop as it is built now is already too large > for common Intel+compatible based CPUs. Adding even more byte > codes to the huge single loop would probably result in a > decrease of CPU cache hits. (I split the Great Switch > in two switch statements and got some good results out of > this: the first switch handles often used byte codes while the > second takes care of the more exotic ones.) Sigh -- I wish C compilers took care of this. I like a single switch because it's so simple. > A note on range() and for: the common usage of > > for i in range(const): > ... > > could be compiled into a completely different set of opcodes > not creating any list or tuple at all. Since the FOR_LOOP > opcode generates loop integers on each iteration the creation > of a range tuple or list is not needed. The loop opcode would only > have to check for the upper bound "const". Yes, this is what I had in mind. > I've added a new > counter type (basically a mutable integer type that allows > for fast increment and decrement) to simplify this even more. > For the curious, it's in the old patch: > > http://starship.skyport.net/~lemburg/mxPython-1.5.patch.gz Or there could be something even more ad-hoc (and faster). --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@CNRI.Reston.VA.US Thu Dec 16 15:47:55 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Thu, 16 Dec 1999 10:47:55 -0500 Subject: [Types-sig] Re: [Doc-SIG] Sorry! In-Reply-To: Your message of "Thu, 16 Dec 1999 14:09:02 GMT." References: <385665DE.9963174B@prescod.net> <38569FFF.78EC8DA@prescod.net> <3857CDD1.D77331AD@prescod.net> <3857E6F9.52FC29F2@prescod.net> <38585628.D4B4934F@prescod.net> Message-ID: <199912161547.KAA08382@eric.cnri.reston.va.us> > Hmm, I'm not sure I read Guido's > > I think that any proposal that requires you to keep two separate files > > "in sync" is bound to fail in the long term. I left that crap behind > > in C++. But in the short term...okay. > as anything but a `yes, we could use this as a temporary measure to let > us experiment with static typing within python 1'. And it appears to be > typical of the comments to date. I didn't write the inner quote. --Guido van Rossum (home page: http://www.python.org/~guido/) From GoldenH@littoncorp.com Thu Dec 16 17:15:37 1999 From: GoldenH@littoncorp.com (Golden, Howard) Date: Thu, 16 Dec 1999 09:15:37 -0800 Subject: [Types-sig] A challenge Message-ID: Tim Peters wrote: > I like interface files fine, but will stick to inline "decl"s below. > Apparently unlike anyone else here, I think explicit > declarations can make > code easier for *human readers* to understand -- so I'm not > interested in > hiding them from view. I _don't_ like interface files, precisely for this reason! Making the code easier to understand is my highest goal. 
> if-it-looks-a-lot-like-every-other-reasonable-declaration- > syntax-you've-ever-seen-it-met-its-goal-ly y'rs - tim I completely support this style! I won't quibble about 'decl' vs. 'var', though I suggest the latter, all else being equal, since it has a proud heritage. - Howard From GoldenH@littoncorp.com Thu Dec 16 17:21:49 1999 From: GoldenH@littoncorp.com (Golden, Howard) Date: Thu, 16 Dec 1999 09:21:49 -0800 Subject: [Types-sig] What is the Essence of Python? (Was: Low-hanging fruit: recognizing builtins) Message-ID: Paul Prescod wrote: > "Golden, Howard" wrote: > > I reiterate that we should define what is the essence of Python, so we know > > what sort of dynamicism and flexibility we are trying to preserve, and what > > is superfluous. Until we do this, we are dealing with a squishy set of > > requirements. > I think that that is always the case in language design. What one person > hates is what another loves: even in Python! I don't know how to answer > your question. I think that we can only argue about particular features > "when we get to them." Then I am suggesting an "Annotated Python Reference Manual" to act as a taxonomy of the features. This could become the basis for our arguments! > Most people probably do not use dynamicity to the same extent as the > real power users but on the other hand they are the ones who are most > fanatical about the language. Again, I hope someone will suggest some good examples for me to study. - Howard From GoldenH@littoncorp.com Thu Dec 16 17:31:21 1999 From: GoldenH@littoncorp.com (Golden, Howard) Date: Thu, 16 Dec 1999 09:31:21 -0800 Subject: [Types-sig] What is the Essence of Python? Message-ID: Greg Stein wrote: > On Wed, 15 Dec 1999, Paul Prescod wrote: > > "Golden, Howard" wrote: > > > I reiterate that we should define what is the essence of Python, so we know > > > what sort of dynamicism and flexibility we are trying to preserve, and what > > > is superfluous. Until we do this, we are dealing with a squishy set of > > > requirements. > > I think that that is always the case in language design. What one person > > hates is what another loves: even in Python! I don't know how to answer > > your question. I think that we can only argue about particular features > > "when we get to them." > I agree. It is like asking somebody to describe the color "blue" :-) > I think there is a yardstick in there somewhere, that you can hold up to a > feature or design and say "that's Pythonic" or "that's not". But it is > very subjective and incapable of being described... This reminds me of the judge's comment that he couldn't define obscenity, but he knew it when he saw it. Unfortunately, it's also not very useful in communicating between people. I hope some of you will make an extra effort to help newbies like me! - Howard From Edward Welbourne Thu Dec 16 17:50:43 1999 From: Edward Welbourne (Edward Welbourne) Date: Thu, 16 Dec 1999 17:50:43 +0000 Subject: [Types-sig] Re: [Doc-SIG] Sorry! In-Reply-To: <199912161547.KAA08382@eric.cnri.reston.va.us> References: <385665DE.9963174B@prescod.net> <38569FFF.78EC8DA@prescod.net> <3857CDD1.D77331AD@prescod.net> <3857E6F9.52FC29F2@prescod.net> <38585628.D4B4934F@prescod.net> <199912161547.KAA08382@eric.cnri.reston.va.us> Message-ID: >> Hmm, I'm not sure I read Guido's >>> I think that any proposal that requires you to keep two separate files >>> "in sync" is bound to fail in the long term. I left that crap behind >>> in C++. But in the short term...okay. > I didn't write the inner quote. 
Oops - sorry: in fact, that was Paul (drawback of snatching a look at the threaded list during compiles and such) on http://www.python.org/pipermail/types-sig/1999-December/000617.html From that being Paul, I guess I should infer that the answer to the question I posed later would be that the two-file scheme is a `for the present' idea, which greatly reduces my twitchiness about it. Eddy. -- I have almost enough time either to read all the relevant information or to respond to it. From paul@prescod.net Thu Dec 16 13:29:50 1999 From: paul@prescod.net (Paul Prescod) Date: Thu, 16 Dec 1999 05:29:50 -0800 Subject: [Types-sig] Type annotations Message-ID: <3858E94E.B7D86846@prescod.net> Okay, I see four different approaches to the syntax short-term type annotation. Personally, I do not think that it is too early to talk about syntax because we need to communicate these ideas. Here are my metrics: Python 1.5 compatibility: Sill the Python 1.5 compiler accept it? Will it have reasonable semantics? Logical separation: Will users be able to understand that runtime objects are not available? Convenience: How easy is it to edit? Syntactic Cleanliness: How "obvious" is it what the declaration means? 1. separate file: Python 1.5 compatibility: high Logical separation: high Convenience: low Syntactic Cleanliness: high 2. labelled string expressions: (like 3, but in strings) Python 1.5 compatibility: high Logical separation: high Convenience: medium Syntactic Cleanliness: medium 3. in separate decl statements: (Incompatible with Python 1.5, but easily converted) Python 1.5 compatibility: low Logical separation: high Convenience: medium Syntactic Cleanliness: high 4. in-line in "other" declarations Python 1.5 compatibility: low Logical separation: low Convenience: high Syntactic Cleanliness: high 5. in dictionaries, lists, and other basic Python objects, "overloaded" with special meaning Python 1.5 compatibility: high Logical separation: medium Convenience: high Syntactic Cleanliness: low Of course if we use a backwards-incompatible expression syntax then backwards-incompatibility is not an issue. That is one reason I propose check_type( expr, type ) which can be interpreted in old Python as a function call. My preference is to allow three different syntaxes according to a schedule: January: separate files (we need this anyhow to define types for the builtin modules) February: in-module string declarations and build separate files Python 2.0: either 3 or 4 I admit that I don't find the compatibility benefits of 5 to be worth the obfuscation. Parsing is not THAT hard. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things never trust in: That's the vendor's final bill The promises your boss makes, and the customer's good will http://www.geezjan.org/humor/computers/threes.html From paul@prescod.net Thu Dec 16 13:29:59 1999 From: paul@prescod.net (Paul Prescod) Date: Thu, 16 Dec 1999 05:29:59 -0800 Subject: [Types-sig] Type annotations Message-ID: <3858E957.CBF8F4DF@prescod.net> Okay, I see four different approaches to the syntax short-term type annotation. Personally, I do not think that it is too early to talk about syntax because we need to communicate these ideas. Here are my metrics: Python 1.5 compatibility: Sill the Python 1.5 compiler accept it? Will it have reasonable semantics? Logical separation: Will users be able to understand that runtime objects are not available? Convenience: How easy is it to edit? Syntactic Cleanliness: How "obvious" is it what the declaration means? 
1. separate file: Python 1.5 compatibility: high Logical separation: high Convenience: low Syntactic Cleanliness: high 2. labelled string expressions: (like 3, but in strings) Python 1.5 compatibility: high Logical separation: high Convenience: medium Syntactic Cleanliness: medium 3. in separate decl statements: (Incompatible with Python 1.5, but easily converted) Python 1.5 compatibility: low Logical separation: high Convenience: medium Syntactic Cleanliness: high 4. in-line in "other" declarations Python 1.5 compatibility: low Logical separation: low Convenience: high Syntactic Cleanliness: high 5. in dictionaries, lists, and other basic Python objects, "overloaded" with special meaning Python 1.5 compatibility: high Logical separation: medium Convenience: high Syntactic Cleanliness: low Of course if we use a backwards-incompatible expression syntax then backwards-incompatibility is not an issue. That is one reason I propose check_type( expr, type ) which can be interpreted in old Python as a function call. My preference is to allow three different syntaxes according to a schedule: January: separate files (we need this anyhow to define types for the builtin modules) February: in-module string declarations and build separate files Python 2.0: either 3 or 4 I no longer find the compatibility benefits of 5 to be worth the obfuscation if we are really going to move into supporting parametric polymorphism and the rest. It just gets too hairy. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things never trust in: That's the vendor's final bill The promises your boss makes, and the customer's good will http://www.geezjan.org/humor/computers/threes.html From Edward Welbourne Thu Dec 16 18:14:17 1999 From: Edward Welbourne (Edward Welbourne) Date: Thu, 16 Dec 1999 18:14:17 +0000 Subject: [Types-sig] What is the Essence of Python? Message-ID: >>>> I reiterate that we should define what is the essence of Python, >> ... like asking somebody to describe the color "blue" :-) > I hope some of you will make an extra effort to help newbies like me! The nearest you could hope to get to `what is the essence of Python' would be if each of the folk in the present discussion ignored one another's answers (and opinions) and told you our own individual answers, but you mustn't go expecting our answers to agree ... So ... what is the essence of Python ? Eddy's answer: A bunch of protocols for manipulating namespaces and functions. An object is a namespace if getattr knows how to ask it for attributes. Anything you want to do with a namespace, you do by: * finding the protocol that describes what you wanted to do * looking up the attributes the protocol specifies * calling the function (it usually is a function) you just got back, with the arguments the protocol specifies, and * trusting that this has either: - achieved the effect you had in mind, or - raised an exception (probably stipulated by the protocol) There are a few handy built-in types and functions which suffice to boot-strap the protocols python defines, and to let you do `most' of the things you will ever want to do. These suffice for implementation of everything else you might want to do. The base protocols are specified in terms of various names, typically beginning and ending `__'. Now, with any luck, other answers will be so different you'll doubt we were talking about the same language as one another ... then you'll begin to understand why, though your question is sensible, we can't give you a sensible answer ;^> Eddy. 
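For anyone wondering what that protocol machinery looks like in concrete terms, ordinary Python already demonstrates it (the class below is invented purely for illustration): an object joins the sequence protocol simply by supplying the attributes that len() and the for-statement go looking for.

    class Countdown:
        "Supports the sequence protocol by defining __len__ and __getitem__."
        def __init__(self, n):
            self.n = n
        def __len__(self):              # looked up by len()
            return self.n
        def __getitem__(self, i):       # looked up by indexing and by for-loops
            if i < 0 or i >= self.n:
                raise IndexError, i
            return self.n - i

    c = Countdown(3)
    print len(c)    # -> 3
    for x in c:     # the loop simply asks for c[0], c[1], ... until IndexError
        print x     # -> 3, 2, 1

No table of "what an object must be" is consulted anywhere; whatever answers the protocol's attribute lookups is acceptable, which is the point Eddy is making.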
From m.faassen@vet.uu.nl Thu Dec 16 18:20:15 1999 From: m.faassen@vet.uu.nl (Martijn Faassen) Date: Thu, 16 Dec 1999 19:20:15 +0100 Subject: [Types-sig] Type annotations References: <3858E94E.B7D86846@prescod.net> Message-ID: <38592D5F.FBF12AF0@vet.uu.nl> Paul Prescod wrote: [snip snip] > My preference is to allow three different syntaxes according to a > schedule: > > January: separate files (we need this anyhow to define types for the > builtin modules) > February: in-module string declarations and build separate files > Python 2.0: either 3 or 4 > > I admit that I don't find the compatibility benefits of 5 to be worth > the obfuscation. Parsing is not THAT hard. But it doesn't completely obfuscate; it's _in Python_. Python programmers already grok Python syntax. It looks fairly horrible, but it's also readable by Python programmers, and so it's easier to communicate. And you don't have to deal with parsing only; that isn't the main problem. The main thing is that we need a way to express complex, composite types. Python is very expressive. You can make a Pythonic language to express types in later, but we can't yet as we don't fully know yet what we want to express. Yet another advantange of going the 'in Python' route is that you already have the backend for your parser. And if you have an implementation (that we'll undoubtedly will change several times, another advantage of using Python), you can actually start thinking about a good syntax with *knowledge*. You know what kind of data structures are actually involved, you have working experience. Something we don't really have right now. Regards, Martijn From m.faassen@vet.uu.nl Thu Dec 16 18:23:30 1999 From: m.faassen@vet.uu.nl (Martijn Faassen) Date: Thu, 16 Dec 1999 19:23:30 +0100 Subject: [Types-sig] Type annotations References: <3858E957.CBF8F4DF@prescod.net> Message-ID: <38592E22.7EC44C45@vet.uu.nl> Paul Prescod wrote: > > I no longer find the compatibility benefits of 5 to be worth the > obfuscation if we are really going to move into supporting parametric > polymorphism and the rest. It just gets too hairy. Oh, cool, yet another, slightly different objection. :) I disagree that it gets too hairy. I'm advocating using Python *precisely* because of the complex types. Python expressions can deal with that kind of complexity right now. What's all this obsession with syntax early on about anyway? It only distracts us from the real topic, in my opinion.. Regards, Martijn From paul@prescod.net Thu Dec 16 18:17:56 1999 From: paul@prescod.net (Paul Prescod) Date: Thu, 16 Dec 1999 10:17:56 -0800 Subject: [Types-sig] Type annotations References: <3858E94E.B7D86846@prescod.net> <38592D5F.FBF12AF0@vet.uu.nl> Message-ID: <38592CD4.963694B0@prescod.net> Martijn Faassen wrote: > > And you don't have to deal with parsing only; that isn't the main > problem. The main thing is that we need a way to express complex, > composite types. Python is very expressive. You can make a Pythonic > language to express types in later, but we can't yet as we don't fully > know yet what we want to express. Actually, we do. The Python type system is well understood. We just don't have a way of talking about it statically. I'll attach some of my half-formed ideas and then shut up for a little while. > Yet another advantange of going the 'in Python' route is that you > already have the backend for your parser. 
> And if you have an
> implementation (that we'll undoubtedly will change several times,
> another advantage of using Python), you can actually start thinking
> about a good syntax with *knowledge*. You know what kind of data
> structures are actually involved, you have working experience. Something
> we don't really have right now.

I don't see how dictionaries are a decent back-end. The real back-end
will be type objects with direct references to other type objects.

----

Here are some Haskell-ish syntax ideas for type declarations:

First we need to be able to talk about types. We need a "type
expression" which evaluates to a type.

Rough Grammar:

Type : Type ['|' Type] # allow unions
Unit : dotted_name | Parameterized | Function | Tuple | List | Dict
Parameterized : dotted_name '(' Basic (',' Basic)* ')'
Basic : dotted_name | PythonLiteral | "*" # * means anything.
PythonLiteral : atom
Function : Type '->' Type
Tuple : "(" Type ("," Type )* )
List: "[" Type "]"
Dict: "{" Type ":" Type "}"

Examples:

String
[(Int, Int)]
{(String,Int), String}
BTree( String )
BTree( somepackage.somemod.someclass )

There is another syntax for declaring instance interface types, it
follows Python's class declaration syntax. More on that later.

Now we probably want to be able to invent names for types. This is like
C's typedef. We'll use simple Python assignment syntax.

Typedef = NAME["(" args ")"] '=' Type

Examples:

StringOrList = String | List( String )
ElementNode = XMLNode( "Element" )
MyTuple = ( Integer, String, List( String ) )
Str50 = BoundedString( 0, 50 )
PositiveInteger = BoundedInteger( 0, sys.maxint )
PositiveInteger = BoundedInteger( -sys.maxint-1, 0 )
len = sequence(*) -> int
maptype(intype, outtype) = (( intype -> outtype ), List( intype )) -> List( outtype )
intmap = maptype( int, int )
lenmap = maptype( sequence(*), int )

Interfaces look like Python classes but they use an "interface" keyword.

interfacedef: 'interface' NAME ['(' testlist ')'] asdecl ':' interfacebody
interfacebody: funcdef | classdef | instancevar | interfacevar
asdecl: "as" type
funcdef: 'def' NAME parameters ':' docstring?
parameters: '(' [varargslist] ')'
varargslist: (like Python's, but with an added "as" operator)

"Interface" and instance variables may also be declared.

interface (a,b) foo_interface(base_interface):
    static:
        k as String
    instance:
        j as Integer

    def bar( self, arg1 as a ) as b:
        "This takes an argument of one parameterized type and returns the other."

    def baz( self, arg1 as a ) as b:
        "This takes an argument of one parameterized type and returns the other."
We can also export individual instances and other objects: a as String b as foo_interface c as foo_class const path as ["String"] const version as Integer -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things never trust in: That's the vendor's final bill The promises your boss makes, and the customer's good will http://www.geezjan.org/humor/computers/threes.html From paul@prescod.net Thu Dec 16 18:18:36 1999 From: paul@prescod.net (Paul Prescod) Date: Thu, 16 Dec 1999 10:18:36 -0800 Subject: [Types-sig] Interface files References: <3857E02B.53CF27AC@prescod.net> <3859007A.FD369FB2@vet.uu.nl> Message-ID: <38592CFC.DD175CAB@prescod.net> Martijn Faassen wrote: > > * we don't have to debate about syntax anymore and can actually think > about > semantics without syntax confusion. Clean syntax helps comprehension. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things never trust in: That's the vendor's final bill The promises your boss makes, and the customer's good will http://www.geezjan.org/humor/computers/threes.html From paul@prescod.net Thu Dec 16 18:18:38 1999 From: paul@prescod.net (Paul Prescod) Date: Thu, 16 Dec 1999 10:18:38 -0800 Subject: [Types-sig] Re: The role of PyObjects References: <3858E9C2.E2722B88@vet.uu.nl> Message-ID: <38592CFE.695545D9@prescod.net> Martijn Faassen wrote: > > I'm afraid that > even with the operator, you wouldn't be able to check most of the code, > if PyObjects are freely allowed. Perhaps I'm wrong, but I'd like to see > some more debate about this. PyObjects are just another type. In Python or any OO language it is ABSOLUTELY impossible to know the type of every object at compile time because of polymophism: a = CGIHTTPServer() b = BaseHTTPServer() startServer( a ) startServer( b ) startServer does not know the exact types at compile time. The basic nature of the problem does not change if we have a function that expects just a "PyObject" (the base class of all base classes). Of course if the function is to be statically type checked then you cannot use operations on the object other than those allowed by PyObjects, but the basic principle is the same. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things never trust in: That's the vendor's final bill The promises your boss makes, and the customer's good will http://www.geezjan.org/humor/computers/threes.html From paul@prescod.net Thu Dec 16 18:19:36 1999 From: paul@prescod.net (Paul Prescod) Date: Thu, 16 Dec 1999 10:19:36 -0800 Subject: [Types-sig] Attributes proposal Message-ID: <38592D38.63057A1A@prescod.net> My proposal for handling attributes is this: An attribute's type can be declared. Writes to the attribute from the same module can be statically type checked (if requested). Writes to the attribute from other modules are checked at runtime. That way we can always know the type of the attribute value and can therefore make reasonable use of the attribute in statically type checked functions. Opinions? -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things never trust in: That's the vendor's final bill The promises your boss makes, and the customer's good will http://www.geezjan.org/humor/computers/threes.html From GoldenH@littoncorp.com Thu Dec 16 18:54:17 1999 From: GoldenH@littoncorp.com (Golden, Howard) Date: Thu, 16 Dec 1999 10:54:17 -0800 Subject: [Types-sig] What is the Essence of Python? Message-ID: Edward Welbourne wrote: > So ... what is the essence of Python ? 
Eddy's answer: > > A bunch of protocols for manipulating namespaces and functions. > > An object is a namespace if getattr knows how to ask it for > attributes. > Anything you want to do with a namespace, you do by: > * finding the protocol that describes what you wanted to do > * looking up the attributes the protocol specifies > * calling the function (it usually is a function) you just got back, > with the arguments the protocol specifies, and > * trusting that this has either: > - achieved the effect you had in mind, or > - raised an exception (probably stipulated by the protocol) > > There are a few handy built-in types and functions which suffice to > boot-strap the protocols python defines, and to let you do > `most' of the > things you will ever want to do. These suffice for implementation of > everything else you might want to do. The base protocols are > specified > in terms of various names, typically beginning and ending `__'. This is the _mechanism_ of Python, but is it the _essence_? It's just like my talk yesterday with my 10 year-old son about the essence of the movie "Field of Dreams." He said it was about a man building a baseball field in an Iowa cornfield. I said it was about a man coming to terms with the conflict with his dead father. My son is very literal in his thinking, so I understand his analysis. I was trying to encourage him to think below the surface. Your answer about Python, and its appeal to you reminds me of how I felt about Forth, when I first learned it around 1980. Again, you have a very simple mechanism which is easily extensible to do whatever you want. It is interactive, too. I suspect that many people are still using Forth, but you seldom hear about it any more. Probably many of those using Forth have added all sorts of object-oriented, generic programming, parametric polymorphism extensions. My question is: Is that still Forth? I think you could argue either side, but the important point is that it wouldn't _look_ like 1980's Forth. "What is Python?" is really Guido's decision. (If I agree, I'll use it, and if not, I'll vote with my feet.) But I am arguing that it is more than just a clear syntax wrapped around a flexible namespace. - Howard From tismer@appliedbiometrics.com Thu Dec 16 18:53:48 1999 From: tismer@appliedbiometrics.com (Christian Tismer) Date: Thu, 16 Dec 1999 19:53:48 +0100 Subject: [Types-sig] A challenge References: <000501bf4779$5e566b40$58a2143f@tim> Message-ID: <3859353C.4106B165@appliedbiometrics.com> Tim Peters wrote: > > [Guido] > > I personally am losing track of all the various proposals. > > You're not alone . Trackless Python. I'm loosing track every day now, when there are between 2 to six new posts each of about 7 people, and over 70% of cited text. Hard to follow since I'm a learner still. > if-it-looks-a-lot-like-every-other-reasonable-declaration- > syntax-you've-ever-seen-it-met-its-goal-ly y'rs - tim ... This is what I can read. What a delight :-) Just a question, please: > import fnmatch > import os > > decl _debug: Int # but Boolean makes more sense; see below Is this meant to be lexically true in the globals scope from here on? > _debug = 0 > > decl _prune: [String] > _prune = ['(*)'] > > decl find: def(String, optional dir: String) -> [String] > > def find(pattern, dir = os.curdir): > decl list, names: [String], name: String > list = [] > names = os.listdir(dir) > names.sort() > for name in names: > decl name, fullname: String Same question: "name" is redefined from here on? 
Would this behave (or be as behaviorless) like the "global" declaration, or lexical, or do you open a new type scope with "for"? (New "variable, with C's {} in mind). The latter cannot be since "for" declared it already. make code, not words :)- ly 'y'rs - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From lannert@uni-duesseldorf.de Thu Dec 16 19:05:24 1999 From: lannert@uni-duesseldorf.de (lannert@uni-duesseldorf.de) Date: Thu, 16 Dec 1999 20:05:24 +0100 (MET) Subject: [Types-sig] A lurker's comment In-Reply-To: <19991216170010.65B751CEEF@dinsdale.python.org> from "types-sig-admin@python.org" at "Dec 16, 99 12:00:10 pm" Message-ID: <19991216190524.16032.qmail@lannert.rz.uni-duesseldorf.de> "types-sig-admin@python.org" wrote: [Apologies first. Although being subscribed to the digest only, I hardly manage to follow the current volume of this list. Is there a life beyond work, Types-SIG and a minimum of sleep?? The discussion may well be past the points I'm addressing at the time of this writing ...] > Paul Prescod wrote: > > > > Greg Stein wrote: > > > > > > I stated a preference for allowing this information to reside in the same > > > file as the implementation. i.e. I don't want to maintain two files. > > > > The nice thing about having separate files is that it becomes instantly > > clear what is "interesting" to the compiler. We have no backwards > > compatibility constraints. We have no questions about what variable are > > "in scope" and "available". It's just plain simpler. Please, don't introduce separate spec files. It's OK for a quick hack while doing a proof of concept, but not for actual use. A C[+-]* program that consists of .c, .h, .cpp and some other files usually resides in a directory of its own, but when it's compiled, it usually collapses into just one file that you can freely move around. I'm already not too happy with a Python program that needs a few special-purpose modules to accompany it wherever it goes. > > There is also something deeply elegant and useful about a separation of > > interface from implementation. > > It can be helpful, but that doesn't mean it needs to be in a separate > file. :) Seconded! Wouldn't it be a Pythonic solution to regard a restricted namespace as a "restricted dictionary" which can (a) refuse to accept new items once it is declared closed (or frozen or fixated), and (b) refuse to accept values for certain keys unless these values are compatible with a (list of) type/class/interface spec(s)? (I guess Chris T. had something similar in mind; hadn't you?) d = RestrictedDict() d.declare_type("i", IntType) d.declare_type("j", (IntType, NoneType)) d["i"] = 5 d["j"] = None d["i"] = None # raises TypeError d.fixate() d["spam"] = "foo" # raises KeyError Modules, classes, and instances can offer this sort of __dict__, providing type and name safety; for a function's locals() it has to be simulated. If there is an unambiguous syntax for these restrictions, a compiler can use them for (OPT): def count: IntType # == __dict__.declare_type("count", IntType) def finally # == __dict__.fixate() (Or whatever syntax there will be.) 
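A minimal sketch of such a restricted dictionary is easy to write; the method names below follow Detlef's example, while the storage layout and the tuple-of-types convention are merely one illustrative way to fill in the details (checking against classes or interfaces would need isinstance-style tests and is left out).

    from types import IntType, NoneType

    class RestrictedDict:
        "A mapping that can type-restrict values and refuse new keys once fixated."
        def __init__(self):
            self.data = {}       # the actual bindings
            self.types = {}      # key -> tuple of acceptable types
            self.fixated = 0
        def declare_type(self, key, types):
            if type(types) is not type(()):
                types = (types,)
            self.types[key] = types
        def fixate(self):
            self.fixated = 1
        def __setitem__(self, key, value):
            if self.fixated and not self.data.has_key(key):
                raise KeyError, "namespace is fixated; cannot add %s" % `key`
            allowed = self.types.get(key)
            if allowed and type(value) not in allowed:
                raise TypeError, "%s must be one of %s" % (`key`, `allowed`)
            self.data[key] = value
        def __getitem__(self, key):
            return self.data[key]

    d = RestrictedDict()
    d.declare_type("i", IntType)
    d.declare_type("j", (IntType, NoneType))
    d["i"] = 5
    d["j"] = None
    d.fixate()
    # d["i"] = None      would raise TypeError
    # d["spam"] = "foo"  would raise KeyError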
Of course the variable declarations should be performed at definition/compile time; as for "global"s, the variable must not be used before the declaration. Anyway, I'd like to have something as open to inspection and manipulation as Python's __dict__s etc. to achieve type and name safety. (Even a d.unfixate() would be nice for testing a program with the interpreter.) And if I knew how to do the declarations right, I'd help the compiler to implement my count=count+1 as a simple machine code integer increment. [Tim Peters, iirc:] > Types form a > lattice, in which "unknown" is the top element, and the basic rule of type > checking is that the binding > > lhs = rhs > > is OK iff > > type(lhs) >= type(rhs) > > where ">=" is wrt the partial ordering defined by the type lattice (or, in > English , only "widening" bindings are OK; like assigning an int to a > real, or a subclass to a base class etc, but not their converses). This would also be valid for alternative types: (NoneType, IntType) >= (IntType,). An assignment with lhs: (IntType,), rhs: (NoneType, IntType) should not be rejected by the interpreter if rhs happens to be an Int, but by the compiler. Finally, while I'm just bothering you anyway, an irrelevant opinion on the difficulty of declaring lists of tuples of (int, string, someclass, ...): Wouldn't a simplistic approach, which leaves the ultimate responsibility to the user, suffice for the time being? We can't _prove_ the correctness of a program (yet; did I miss something?), but we can help a human to avoid the most frequent errors. class MyListType(ListType): pass # suppose we have types as classes ... def ml: MyListType def al: ListType ml = MyListType([1, 2, "many"]) # OK al = ml # OK ml = al # rejected by the compiler ml = MyListType(al) # OK (it's up to myself to do it right, # I proved my awareness) Detlef From Edward Welbourne Thu Dec 16 19:18:41 1999 From: Edward Welbourne (Edward Welbourne) Date: Thu, 16 Dec 1999 19:18:41 +0000 Subject: [Types-sig] What is the Essence of Python? In-Reply-To: References: Message-ID: > This is the _mechanism_ of Python, but is it the _essence_? well, it's *part of* the mechanism ... the part that grabs me. The mechanism/essence distinction is one it's a lot easier to make in the case of a story ... the closer one gets to the concrete world, the harder it gets to make ... what is the essence of stone ? ... be prepared to cope with the essence and mechanism of a real thing overlapping rather more severely than arises for stories - especially stories written by someone explicitly trying to put across a message. In such cases, the essence may well *be* part of the mechanism. And folk can't be relied on to agree about which part. > ... those using Forth have added all sorts of ... extensions. > My question is: Is that still Forth? or, to put your original question (what is the essence) another way: what is it about python that you can't change because if you did it wouldn't be python any more ? To me, the answer to that is >> A bunch of protocols for manipulating namespaces and functions. (albeit words like `sufficient', `good' and `well' need added in several places there). There are some important `pythonic theses' I saw (by Tim Peters, I think) but I've lost the bookmark ... ask Tim Peters, they were good. They might come closer to satisfying your criteria of essentiality. Eddy. 
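Returning briefly to the lattice rule Tim is quoted with above, and to Detlef's reading of a type as a tuple of alternatives: the acceptance test for a binding is only a few lines. Everything below (the tuple representation, the ANY marker) is an illustrative toy; subclass widening and real inference are left out.

    from types import IntType, NoneType

    ANY = "unknown"      # the top of the lattice: the set of all types

    def binding_ok(lhs, rhs):
        "lhs = rhs is acceptable iff type(lhs) >= type(rhs), i.e. widening only."
        if lhs is ANY:
            return 1         # binding anything to an undeclared name is fine
        if rhs is ANY:
            return 0         # narrowing: reject, or fall back to a run-time check
        for t in rhs:
            if t not in lhs:
                return 0
        return 1

    print binding_ok((NoneType, IntType), (IntType,))    # 1: widening, accepted
    print binding_ok((IntType,), (NoneType, IntType))    # 0: the compiler flags it,
                                                         #    or a run-time check is needed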
From gstein@lyra.org Thu Dec 16 19:24:55 1999 From: gstein@lyra.org (Greg Stein) Date: Thu, 16 Dec 1999 11:24:55 -0800 (PST) Subject: [Types-sig] Interface files In-Reply-To: <3857E02B.53CF27AC@prescod.net> Message-ID: On Wed, 15 Dec 1999, Paul Prescod wrote: > Greg Stein wrote: > > > > I stated a preference for allowing this information to reside in the same > > file as the implementation. i.e. I don't want to maintain two files. > > The nice thing about having separate files is that it becomes instantly > clear what is "interesting" to the compiler. We have no backwards > compatibility constraints. We have no questions about what variable are > "in scope" and "available". It's just plain simpler. > > There is also something deeply elegant and useful about a separation of > interface from implementation. In your opinion, sure. I just got done telling you my opinion :-). And that is that separate files are Not Nice. Elegant? Bah. It's extra files to deal with and coordinate. > Sure, you don't always want to be REQUIRED to separate them. I > acknowledge that we will one day have to support inline declarations but > I'm going to put it off unless I hear some screaming. *SCREAM* How's that? > > I'll go further and state that we should not use a new language for this. > > It should just be Python. (and this is where Martijn's __types__ thing > > comes in, although I'm not advocating that format) > > I think that that's an unreasonable (and unreadable) constraint. The > language should probably be pythonic, but not necessarily Python. Python > doesn't have a type declaration syntax and none of Python's existing > syntax was meant to be used AS a type declaration syntax. It just gets > too unreadable for quasi-complicated declarations. We need to support > polymorphic and parameteric higher order functions! Why in the heck should I have to go and code up a separate file? In a separate language? That is nonsense. Really. And no, I'd rather not be diplomatic here. Saying that we are going to use Yet Another Goddamned Language is the wrong move. I'm going to stop now. I could go on, but it probably would not be productive. Cheers, -g -- Greg Stein, http://www.lyra.org/ From tismer@appliedbiometrics.com Thu Dec 16 19:25:29 1999 From: tismer@appliedbiometrics.com (Christian Tismer) Date: Thu, 16 Dec 1999 20:25:29 +0100 Subject: [Types-sig] A lurker's comment References: <19991216190524.16032.qmail@lannert.rz.uni-duesseldorf.de> Message-ID: <38593CA9.37E36AB3@appliedbiometrics.com> lannert@lannert.rz.uni-duesseldorf.de wrote: ... > > It can be helpful, but that doesn't mean it needs to be in a separate > > file. :) > > Seconded! > > Wouldn't it be a Pythonic solution to regard a restricted namespace as a > "restricted dictionary" which can (a) refuse to accept new items once > it is declared closed (or frozen or fixated), and (b) refuse to accept > values for certain keys unless these values are compatible with a (list > of) type/class/interface spec(s)? (I guess Chris T. had something similar > in mind; hadn't you?) Yes of course. When I can get an effect by adding some sugar to semantics, and I can avoid any syntactic changes, then I try since I hate syntax. > d = RestrictedDict() ...and so on, easy to implement between supper and X chapters... What I was missing was the fact that you cannot get out of this is a static check that your ship will make it to the mars before you travel. This example from Guido really struck me. 
Still I'm not convinced that compile time and run time are different things, since Python itself is at the moment the best counterexample. There must be a third concept between runtime checks and compiletime syntactic distortion which we are misssing. Python's simplicity together with cleverness is one of its most attractive things for me. While I shouted "yeah" when it came to the type discussion, I quickly recognized that I don't want it to happen. Something inside me cries veto, wrong track. But I can't publish this without providing a better one. ciao - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From paul@prescod.net Thu Dec 16 18:28:13 1999 From: paul@prescod.net (Paul Prescod) Date: Thu, 16 Dec 1999 10:28:13 -0800 Subject: [Types-sig] Module Attribute visibility References: <000e01bf46dc$0050ada0$05a0143f@tim> Message-ID: <38592F3D.10A7AFA4@prescod.net> Tim Peters wrote: > > Resist the dubious temptation to conflate declaration with initialization, > and "an easy mechanical transformation to valid Python 1.5.x" consists of > commenting out the decl stmts! Heck, call the keyword "#\s+decl\s+" and > it's a nop. Okay, but doesn't Python already conflate declaration with initialization? When I refer to mymod.foo I am referring to an object that was assigned, somewhere to the name foo in the module mymod. Are we going to say that statically type checked code can only refer to declared (not merely assigned) variables in other modules? Would it be safe to say that undeclared variables are simply not available for type checking? Would you suggest that this is even the case for functions? I.e. def foo( str ): return str*2 is invisible to the type checker until we add: decl foo: str -> str Or would foo have an implicit declaration: decl foo: PyObject -> PyObject And if that foo has an implicit declaration, shouldn't this foo also: foo = lambda x: x*2 -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things never trust in: That's the vendor's final bill The promises your boss makes, and the customer's good will http://www.geezjan.org/humor/computers/threes.html From paul@prescod.net Thu Dec 16 18:28:19 1999 From: paul@prescod.net (Paul Prescod) Date: Thu, 16 Dec 1999 10:28:19 -0800 Subject: [Types-sig] Type annotations References: <3858E957.CBF8F4DF@prescod.net> <38592E22.7EC44C45@vet.uu.nl> Message-ID: <38592F43.11042753@prescod.net> Martijn Faassen wrote: > > I disagree that it gets too hairy. I'm advocating using Python > *precisely* because of the complex types. Python expressions can deal > with that kind of complexity right now. What's all this obsession with > syntax early on about anyway? It only distracts us from the real topic, > in my opinion.. I see the situation this way: Python has a type system. It works. (though there are some subtle improvements coming) Our job is to define a syntax and semantics for type assertions and also a) the operation of a software processor called a "type checker" b) changes to the runtime behavior of the PVM to support the accuracy of a) I don't see us as being at that "early on" of a stage. In my head, at least, the pieces are coming together nicely. 
At this point, I seem to be in agreement on most non-syntactic issues with Tim, Greg and Guido so I think that we are converging. I admit that I have not yet integrated all of the ideas of you, Edward and a few other people. I don't have time to read everyone else's work carefully and nobody has time to read everyone else's either! Maybe this email will help with that. It outlines what I see as the consensus so that we can debate these things one last time and put them behind us. I don't have time to write up all of the semantics of the system yet but the major parts are: * local variables types are usually inferred * module variables and instance variables may have type declarations * non-local writes are checked at runtime (by default) * for optimization, the checks may be stripped based on type inferenced information * function return types are NEVER inferred * ...they must be declared or assumed to be PyObject * "types" can be Python primitive types, or declared classes or interfaces * built-in types are declared through "shadow files" * but a function return statement could be verified based on inferencing to conform to its declaration * expression assertions support within-function assertions * function parameters can have declarations * function calls and assignments are checked at runtime if they cannot be verified at compile time * but you can ask for an explicit verification at compile time * which enables faster code to be generated * ...and verifies your understanding of what you are doing * types can be parameterized * which means that the compile/runtime checks need to be more sophisticated * we do not yet handle the "exception interface" of a function -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things never trust in: That's the vendor's final bill The promises your boss makes, and the customer's good will http://www.geezjan.org/humor/computers/threes.html From Edward Welbourne Thu Dec 16 19:35:55 1999 From: Edward Welbourne (Edward Welbourne) Date: Thu, 16 Dec 1999 19:35:55 +0000 Subject: [Types-sig] Attributes proposal Message-ID: > My proposal for handling attributes is this: > An attribute's type can be declared. Writes to the attribute from the > same module can be statically type checked (if requested). Writes to > the attribute from other modules are checked at runtime. That way we > can always know the type of the attribute value and can therefore make > reasonable use of the attribute in statically type checked functions. > Opinions? Sounds like a good way to cut that pie. At least for modules, also for classes within a module (at outer scope). For a class defined in the body of a function, ... hmm ... the right scope in which to static-check is the function, anything else in the module is outside (i.e. runtime-check). For (attributes of) an instance of a class, we seem to have a messier situation (its class may have lots of bases in lots of files ... so which module is playing as host ?). Did you intend this to apply to instances ? If so how ? Or did you intend to apply this only to attributes of modules ? Sometimes a package might want to modify its submodules and have such activity included in the static checking ... but punting on that sounds reasonable at this stage. Eddy. From paul@prescod.net Thu Dec 16 19:45:55 1999 From: paul@prescod.net (Paul Prescod) Date: Thu, 16 Dec 1999 11:45:55 -0800 Subject: [Types-sig] New syntax? References: Message-ID: <38594173.8762D205@prescod.net> There are two separate issues here. 
Separate files and separate syntaxes. And there are two different time periods here: today and in Python 2. Separate files are a necessity to handle C-coded types. Ergo anything on top of that is more work and given that we are still talking about something useful in a month (though that is looking less and less likely) I am not inclined to take on the extra work of new operators and an inline syntax. As far as separate syntaxes go, we are designing a new syntax regardless. There is no way to define the type of "map" in Python today. The question is whether the new syntax is built by overloading the meaning of Python basic types or whether it is just new and different. I mean we could outlaw new syntaxes in Python: from re import * compile( union( repeat( character_class( ["abc"] ), optional( negate( character_class ( ["def"]) ) ) That makes no sense to me. If you or someone proposes a completely Pythonic syntax that can handle type unions, parameterized types, lists and tuples gracefully then we can compare some declaration examples to a designed-from-scratch syntax and let Guido decide. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things never trust in: That's the vendor's final bill The promises your boss makes, and the customer's good will http://www.geezjan.org/humor/computers/threes.html From m.faassen@vet.uu.nl Thu Dec 16 19:54:23 1999 From: m.faassen@vet.uu.nl (Martijn Faassen) Date: Thu, 16 Dec 1999 20:54:23 +0100 Subject: [Types-sig] Interface files References: <3857E02B.53CF27AC@prescod.net> <3859007A.FD369FB2@vet.uu.nl> <38592CFC.DD175CAB@prescod.net> Message-ID: <3859436F.90846D66@vet.uu.nl> Paul Prescod wrote: > > Martijn Faassen wrote: > > > > * we don't have to debate about syntax anymore and can actually think > > about > > semantics without syntax confusion. > > Clean syntax helps comprehension. 5 syntaxes with uncertain semantics destroy comprehension. We won't know semantics until implementation. It's tough to design a nice syntax before you have tested your semantics. Then again, perhaps one of the syntaxes will blow me away and I'll relent. :) Regards, Martijn From gstein@lyra.org Thu Dec 16 20:28:13 1999 From: gstein@lyra.org (Greg Stein) Date: Thu, 16 Dec 1999 12:28:13 -0800 (PST) Subject: [Types-sig] separate files (was: Sorry!) In-Reply-To: Message-ID: On Thu, 16 Dec 1999, Edward Welbourne wrote: > >> Hmm, I'm not sure I read Guido's > >>> I think that any proposal that requires you to keep two separate files > >>> "in sync" is bound to fail in the long term. I left that crap behind > >>> in C++. But in the short term...okay. > > I didn't write the inner quote. > > Oops - sorry: in fact, that was Paul (drawback of snatching a look at > the threaded list during compiles and such) on > http://www.python.org/pipermail/types-sig/1999-December/000617.html > > >From that being Paul, I guess I should infer that the answer to the > question I posed later would be that the two-file scheme is a `for the > present' idea, which greatly reduces my twitchiness about it. But I don't think anybody should be planning on 2.0 to resolve things. That is at least two or three years away, if I'm not mistaken. I think we need an inline syntax in the 1.x series. I like Tim's approach so far; it seems like it should work although I might suggest some tweaks. 
Cheers, -g -- Greg Stein, http://www.lyra.org/ From scott@chronis.pobox.com Thu Dec 16 20:27:41 1999 From: scott@chronis.pobox.com (scott) Date: Thu, 16 Dec 1999 15:27:41 -0500 Subject: [Types-sig] separate files (was: Sorry!) In-Reply-To: References: Message-ID: <19991216152741.A6338@chronis.pobox.com> On Thu, Dec 16, 1999 at 12:28:13PM -0800, Greg Stein wrote: [...] > But I don't think anybody should be planning on 2.0 to resolve things. > That is at least two or three years away, if I'm not mistaken. > > I think we need an inline syntax in the 1.x series. I like Tim's approach > so far; it seems like it should work although I might suggest some tweaks. For what it's worth, I'd like to second this. scott From m.faassen@vet.uu.nl Thu Dec 16 20:26:57 1999 From: m.faassen@vet.uu.nl (Martijn Faassen) Date: Thu, 16 Dec 1999 21:26:57 +0100 Subject: [Types-sig] New syntax? References: <38594173.8762D205@prescod.net> Message-ID: <38594B11.766AE8A3@vet.uu.nl> Paul Prescod wrote: > > There are two separate issues here. Separate files and separate > syntaxes. And there are two different time periods here: today and in > Python 2. > > Separate files are a necessity to handle C-coded types. Um? Why? > Ergo anything on > top of that is more work and given that we are still talking about > something useful in a month (though that is looking less and less > likely) I am not inclined to take on the extra work of new operators and > an inline syntax. What about putting this extra information inside the module file itself? You need a separate file because you want to come up with your own syntax, but even then you can do: __types__ = """ def foo(int, int): hey: int hoi: [int] result: string bar: string grok: [string] class Mine: hm : float def __init__(self, int, string): self.yahoo: [int] self.dict: {string : int} temp: int def getYahoo(self): result: [int] def more(Mine, Mine): temp: int result: Mine class Parametric: firstparam: param secondparam: param def __init__(self): self.a: firstparam self.b: secondparam def hullo(self): result: firstparam whoops: Parametric(string, float) def optional(int, *(int, string, int)): pass def anotheroptional(int, *[Mine]): pass union: int or string """ which incidentally would be a neat Pythonic syntax. :) Regards, Martijn From guido@CNRI.Reston.VA.US Thu Dec 16 20:35:01 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Thu, 16 Dec 1999 15:35:01 -0500 Subject: [Types-sig] Interface files In-Reply-To: Your message of "Thu, 16 Dec 1999 11:24:55 PST." References: Message-ID: <199912162035.PAA11433@eric.cnri.reston.va.us> > Why in the heck should I have to go and code up a separate file? In a > separate language? That is nonsense. Really. And no, I'd rather not be > diplomatic here. Saying that we are going to use Yet Another Goddamned > Language is the wrong move. I'm not taking sides here, but I want to note that none of the takers on my latest challenge have shown separate interface files. All the ones I've seen used inline syntax. So perhaps it's not even necessary to get all bent out of shape over this one. Or perhaps one of the proponents could post an example instead of responding directly to Greg's screaming. (Too much sugar again, Greg? 
:-) --Guido van Rossum (home page: http://www.python.org/~guido/) From m.faassen@vet.uu.nl Thu Dec 16 20:36:31 1999 From: m.faassen@vet.uu.nl (Martijn Faassen) Date: Thu, 16 Dec 1999 21:36:31 +0100 Subject: [Types-sig] Interface files References: <199912162035.PAA11433@eric.cnri.reston.va.us> Message-ID: <38594D4F.E81989C4@vet.uu.nl> Guido van Rossum wrote: > > > Why in the heck should I have to go and code up a separate file? In a > > separate language? That is nonsense. Really. And no, I'd rather not be > > diplomatic here. Saying that we are going to use Yet Another Goddamned > > Language is the wrong move. > > I'm not taking sides here, but I want to note that none of the takers > on my latest challenge have shown separate interface files. All the > ones I've seen used inline syntax. So perhaps it's not even necessary > to get all bent out of shape over this one. Or perhaps one of the > proponents could post an example instead of responding directly to > Greg's screaming. (Too much sugar again, Greg? :-) I'm not a real proponent of interface files (I used an inline syntax before) but I just posted an example of a non inline syntax in this very thread. I'm hereby maliciously choosing both sides at the same time. :) Basically I came up with this non inline syntax to prove my point that even that can be turned into an inline one easily, but it may have merits of its own. Regards, Martijn From gstein@lyra.org Thu Dec 16 20:57:54 1999 From: gstein@lyra.org (Greg Stein) Date: Thu, 16 Dec 1999 12:57:54 -0800 (PST) Subject: [Types-sig] Module Attribute visibility In-Reply-To: <38592F3D.10A7AFA4@prescod.net> Message-ID: IMO, let's solve static type checking. Leave visibility and modification rules to another phase. They are orthogonal problems, and we would do well to reduce our problem set (and the amount of discussion thereby engendered (my 25 cent word for the day :-)). Really: please, can we table discussions on visibility and modification? Cheers, -g On Thu, 16 Dec 1999, Paul Prescod wrote: > Tim Peters wrote: > > > > Resist the dubious temptation to conflate declaration with initialization, > > and "an easy mechanical transformation to valid Python 1.5.x" consists of > > commenting out the decl stmts! Heck, call the keyword "#\s+decl\s+" and > > it's a nop. > > Okay, but doesn't Python already conflate declaration with > initialization? When I refer to mymod.foo I am referring to an object > that was assigned, somewhere to the name foo in the module mymod. > > Are we going to say that statically type checked code can only refer to > declared (not merely assigned) variables in other modules? Would it be > safe to say that undeclared variables are simply not available for type > checking? > > Would you suggest that this is even the case for functions? I.e. > > def foo( str ): return str*2 > > is invisible to the type checker until we add: > > decl foo: str -> str > > Or would foo have an implicit declaration: > > decl foo: PyObject -> PyObject > > And if that foo has an implicit declaration, shouldn't this foo also: > > foo = lambda x: x*2 > > -- Greg Stein, http://www.lyra.org/ From tismer@appliedbiometrics.com Thu Dec 16 20:55:10 1999 From: tismer@appliedbiometrics.com (Christian Tismer) Date: Thu, 16 Dec 1999 21:55:10 +0100 Subject: [Types-sig] New syntax? References: <38594173.8762D205@prescod.net> <38594B11.766AE8A3@vet.uu.nl> Message-ID: <385951AE.3D3142FD@appliedbiometrics.com> Martijn Faassen wrote: > > Paul Prescod wrote: ... 
> What about putting this extra information inside the module file itself? > You need a separate file because you want to come up with your own > syntax, but even then you can do: > > __types__ = """ > > def foo(int, int): > hey: int > hoi: [int] > result: string ... and so on ... Yes I like that, but tried this earlier in some thread, see here: [Greg, vaporizing runtime-looking type checks:-] ''' On Wed, 15 Dec 1999, Christian Tismer wrote: >... > It doesn't matter if there is an extra file, or you insert a > function call into your module, like > > system.interface("""triple quoted string defining interface""") > > without changes to the language but experimental syntaxes for > these IF files/strings. The compiler needs the information. This implies that you can't add the information procedurally. The mechanism must be "transparent" to the compiler. ''' So where is the difference. The compiler would have to treat __types__ as a special keyword and not an assignment target. In my example, it would need to know what "system.interface" kind of animal is. I used this explicitly to make my point clear, but this doesn't seem to help. I think people want new syntax since this assures a new meaning to some characters. I don't share this. If a new construct just happens to fit into the existing language, why must we forcibly invent a new escape? Yes this is all about escapes. We escape into syntax, or escape into different files. But nobody cares about None, which yet *can* be overwritten, and which has a special role although not being a special object. Also nobody cares that we use namespaces to "escape" semantics. Those __init__ constructs are escaping animals which are still in the language but have different meaning. Nobody would refuse to parse a class definition and see if it has an __init__, but for types we need a real new language? We escape to justify a wrong idea. > which incidentally would be a neat Pythonic syntax. :) Very nice, IMO. The string looks like a module in the module. But it needn't be a string. I'm not against a new concept if it fits other ideas. Opening a new context with new rules, why not? We have classes, functions etc, which all impose different semantics with nearly the same language. Now if we define an interface object which has the exceptional rule that it can *not* be generated dynamically by some tricks, but can only be written statically down (which is nonsense since I can write source code by program), then there would be just one keyword necessary to tell the type checker that there is something immutable in this module. This interface object can btw. of course contain code which is executed at compile time and be part of the type checking system. Well, I see an idea coming... sory, talking at length, I should go to sleep now - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From gstein@lyra.org Thu Dec 16 21:16:20 1999 From: gstein@lyra.org (Greg Stein) Date: Thu, 16 Dec 1999 13:16:20 -0800 (PST) Subject: [Types-sig] consensus(?) summary (was: Type annotations) In-Reply-To: <38592F43.11042753@prescod.net> Message-ID: On Thu, 16 Dec 1999, Paul Prescod wrote: >... 
> I don't have time to write up all of the semantics of the system yet but > the major parts are: > > * local variables types are usually inferred Woo! :-) > * module variables and instance variables may have type declarations Yes. I believe this is because these variables fall under the same rubric of "interface declarations." One interface for the module, another for a class. > * non-local writes are checked at runtime (by default) Hrm. Is there an easy rule to determine this? I might suggest deferring this unless/until we have a clear set of rules. Shades of C++'s "friend" modifier are forming in my head when we talk about this... > * for optimization, the checks may be stripped based on type inferenced > information Which checks? I think runtime checks are *ignored* if you run with -O. Python doesn't (yet) have different forms of compilation (or did I miss something?). Certainly, in 1.6 we can have different compilations by virtue of substituting a new compiler, but I think it would be nice to retain a single form of compilation. In reference to type-inferred information: I don't think runtime checks would ever be added if the type has been inferred. Issue: what are the rules for inserting runtime checks? When are they added and when are they not? Strawman: 1) they are added for function arguments which have type declarators (i.e. added as a function prologue). 2) they are added when the type-assert operator is used. > * function return types are NEVER inferred > * ...they must be declared or assumed to be PyObject > * "types" can be Python primitive types, or declared classes or > interfaces Agreed. > * built-in types are declared through "shadow files" This is somewhat problematic. How do we map from a builtin type to this shadow file? Do they reside in a well-known location? Second issue: keeping them in sync, version mismatches, distribution and install problems, etc. My recommendation would be to enable a mechanism by which modules can internally declare their interface. I recognize this is complex and would therefore defer any discussion regarding interfaces for builtin types. Note: and when I say "builtin type", I'm referring to things like "socket" rather than the "core types" such as List or Dict. > * but a function return statement could be verified based on > inferencing to conform to its declaration Yes. This would be a compile-time static check. > * expression assertions support within-function assertions > * function parameters can have declarations Agreed. > * function calls and assignments are checked at runtime if they cannot > be verified at compile time Function calls: yes. I'm not sure we would ever check assignments. See my response above, regarding knowing when the proper time is. Instead, I think that an interface is a statement (to users) about the types, but we don't necessarily have to enforce it. Hrm. This kind of falls under the concept of "verifying an implementation conforms to an interface." I would prefer to avoid that. > * but you can ask for an explicit verification at compile time > * which enables faster code to be generated > * ...and verifies your understanding of what you are doing I'm not clear on your points here. > * types can be parameterized > * which means that the compile/runtime checks need to be more > sophisticated Yes, although I might modify it somewhat and say "only core types can be parameterized." 
> * we do not yet handle the "exception interface" of a function Thank you :-) Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Thu Dec 16 21:28:28 1999 From: gstein@lyra.org (Greg Stein) Date: Thu, 16 Dec 1999 13:28:28 -0800 (PST) Subject: [Types-sig] New syntax? In-Reply-To: <38594B11.766AE8A3@vet.uu.nl> Message-ID: On Thu, 16 Dec 1999, Martijn Faassen wrote: > Paul Prescod wrote: > > There are two separate issues here. Separate files and separate > > syntaxes. And there are two different time periods here: today and in > > Python 2. > > > > Separate files are a necessity to handle C-coded types. > > Um? Why? > > > Ergo anything on > > top of that is more work and given that we are still talking about > > something useful in a month (though that is looking less and less > > likely) I am not inclined to take on the extra work of new operators and > > an inline syntax. > > What about putting this extra information inside the module file itself? > You need a separate file because you want to come up with your own > syntax, but even then you can do: > > __types__ = """ ... > """ > > which incidentally would be a neat Pythonic syntax. :) Really. We don't want a separate syntax. Think about the parsing. Who is going to parse it? Are you suggesting that we have the Python parser doing some code parsing, then we invoke another parser to parse interface information, then we pass those blobs off to the compiler (and type inferencer/checker/optimizer/etc) ? No way. Use one parser for code and interface information. Inline vs. external is a different question (and I vote for former). But different syntaxes is a big problem that is easily avoided. Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Thu Dec 16 21:29:37 1999 From: gstein@lyra.org (Greg Stein) Date: Thu, 16 Dec 1999 13:29:37 -0800 (PST) Subject: [Types-sig] New syntax? In-Reply-To: <38594173.8762D205@prescod.net> Message-ID: On Thu, 16 Dec 1999, Paul Prescod wrote: >... > If you or someone proposes a completely Pythonic syntax that can handle > type unions, parameterized types, lists and tuples gracefully then we > can compare some declaration examples to a designed-from-scratch syntax > and let Guido decide. Tim has come up with a good first pass. If we can formalize that (and I'd like to tweak it), then we have the basis we need to move forward. Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Thu Dec 16 21:36:55 1999 From: gstein@lyra.org (Greg Stein) Date: Thu, 16 Dec 1999 13:36:55 -0800 (PST) Subject: [Types-sig] challenge response (was: A challenge) In-Reply-To: <14424.31434.689571.714592@dolphin.mojam.com> Message-ID: On Wed, 15 Dec 1999, Skip Montanaro wrote: > Greg> Line 7: per caveat #1, assume the compiler can access the > Greg> find.find() function. From that, it knows the signature. The first > Greg> parameter has a matching type, but the second (PyObject) does not > Greg> match the required type (String), so an error is raised. If caveat > Greg> #5 is resolved, then the second parameter matches. It is also > Greg> possible to avoid the error by rewriting: > > Greg> list = find.find("*.py", dir!StringType) # 7 > > Greg> "list" is now a ListType, based on the find.find() return > Greg> value. (see caveat #5 -- it could be possible to refine this > Greg> knowledge). > > I humbly assert this train of thought rates a *bzzzt*. I thought one core > requirement was that all type declaration stuff be optional. 
The worst that > the type checker/inferencer should do in the face of incomplete type info is > display a warning. I don't think you can flag an error unless the > programmer sets some sort of PY_ANAL_TYPE_CHECKING_AND_I_REALLY_MEAN_IT > environment variable. My entire post was pre-conditioned on the assumption that type-checking has been enabled. IMO, type checking is NOT enabled by default. I believe it will impose a noticable performance penalty and I'm not willing to pay that in the general case. Periodically, I'll turn it on and run it over my code (and in that sense, type-checking as a lint-like tool is probably okay with me; I'm more interested in typing for its (OPT) features). As a side note: pulling in my Strawman from another thread, re: when to insert runtime checks -- the determination is entirely based on syntax, rather than a type analysis (or failure thereof). In other words, even if we disable compile-time checking, we still end up with the same output (which includes runtime checks where applicable). Cheers, -g -- Greg Stein, http://www.lyra.org/ From skip@mojam.com (Skip Montanaro) Thu Dec 16 21:39:45 1999 From: skip@mojam.com (Skip Montanaro) (Skip Montanaro) Date: Thu, 16 Dec 1999 15:39:45 -0600 (CST) Subject: [Types-sig] New syntax? In-Reply-To: <38594B11.766AE8A3@vet.uu.nl> References: <38594173.8762D205@prescod.net> <38594B11.766AE8A3@vet.uu.nl> Message-ID: <14425.23585.74942.844600@dolphin.mojam.com> Paul> Separate files are a necessity to handle C-coded types. Martijn> Um? Why? My guess is so that declaration and definition are separated. If "definition" roughly means "import", you'd like to get at an object's interface without actually importing (or perhaps even parsing it?) the module it is defined in and risking the side effects of import. Skip Montanaro | http://www.mojam.com/ skip@mojam.com | http://www.musi-cal.com/ 847-971-7098 | Python: Programming the way Guido indented... From skip@mojam.com (Skip Montanaro) Thu Dec 16 21:43:17 1999 From: skip@mojam.com (Skip Montanaro) (Skip Montanaro) Date: Thu, 16 Dec 1999 15:43:17 -0600 (CST) Subject: [Types-sig] consensus(?) summary (was: Type annotations) In-Reply-To: References: <38592F43.11042753@prescod.net> Message-ID: <14425.23797.624232.17777@dolphin.mojam.com> >> * non-local writes are checked at runtime (by default) Greg> Hrm. Is there an easy rule to determine this? In particular, is the following a non-local write? import sys p = sys.path p.append("/usr/local/lib/other") Skip Montanaro | http://www.mojam.com/ skip@mojam.com | http://www.musi-cal.com/ 847-971-7098 | Python: Programming the way Guido indented... From m.faassen@vet.uu.nl Thu Dec 16 21:47:43 1999 From: m.faassen@vet.uu.nl (Martijn Faassen) Date: Thu, 16 Dec 1999 22:47:43 +0100 Subject: [Types-sig] New syntax? References: Message-ID: <38595DFF.B0B1751D@vet.uu.nl> Greg Stein wrote: [snip Pythonic syntax stuff I perversely proposed] > Really. We don't want a separate syntax. > > Think about the parsing. Who is going to parse it? Are you suggesting that > we have the Python parser doing some code parsing, then we invoke another > parser to parse interface information, then we pass those blobs off to the > compiler (and type inferencer/checker/optimizer/etc) ? That's not really what I'm proposing; I was proposing using Python at least for the first shot at things. But, this does appear to be what Paul's proposing. Paul doesn't consider writing a new parser a problem, I do think it'll hold us back when we could better be discussing semantics. 
But since Paul thinks syntax is important I'm obliging with something that seems Pythonic. Because I'm Dutch I get bonus points anyway. ;) > No way. Use one parser for code and interface information. All right: foo = 1: def bar(i, j): return i + j vardef bar(int, int): return int class Foo: alpha = 1 def __init__(self, beta): self.beta = beta def getbeta(self): return self.beta varclass Foo: alpha: int def __init__(self, int): self.beta: int def getbeta(self): result: int :) # not part of syntax > Inline vs. external is a different question (and I vote for former). But > different syntaxes is a big problem that is easily avoided. So what are you suggesting if you would be voting for external, then? A Python based system such as the one I proposed earlier? Or is this why you're voting for internal? Regards, Martijn From gstein@lyra.org Thu Dec 16 21:55:54 1999 From: gstein@lyra.org (Greg Stein) Date: Thu, 16 Dec 1999 13:55:54 -0800 (PST) Subject: [Types-sig] doc strings (was: Sorry!) In-Reply-To: Message-ID: On Thu, 16 Dec 1999, Edward Welbourne wrote: >... > (I'd even be happy with the typedecl incorporating the docstring, which > is part of the interface spec after all: and would make the run-time > thing actually called be lighter-weight in some probably-irrelevant > sense.) Actually, there are two doc-strings in two places: one is the > doc-string of (say) the function object - it says what the function does > - the other is where some object carrying that function documents the > role of the attribute as which it stores that object. This is directly I agree with the doc string thing. Specifically, imagine something like this: ---------- # module foo decl some_global: String "some_global is used for ..." decl some_func: def(Int) -> None "this function does ..." ---------- Note that JimF's interface proposal allows attaching doc strings to the elements of an interface. I believe the "decl" statement and associated doc strings would be the (syntactical) subtitution for his runtime solution. Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Thu Dec 16 22:13:11 1999 From: gstein@lyra.org (Greg Stein) Date: Thu, 16 Dec 1999 14:13:11 -0800 (PST) Subject: [Types-sig] New syntax? In-Reply-To: <38595DFF.B0B1751D@vet.uu.nl> Message-ID: On Thu, 16 Dec 1999, Martijn Faassen wrote: >... example syntax for a Python syntax to declare interfaces ... Ah. Good. That is better than your/Paul's previous suggestions. > > Inline vs. external is a different question (and I vote for former). But > > different syntaxes is a big problem that is easily avoided. > > So what are you suggesting if you would be voting for external, then? A > Python based system such as the one I proposed earlier? Or is this why > you're voting for internal? By "for former", I meant that I want an internal syntax. Something like Tim's suggestion. It keeps the declaration closest to the implementation, which (IMO) is best. It is kind of like comments and code: they can easily drift apart, especially if the two are distant from each other. In your example above, I think it would be a bit painful to flip back and forth between the "class" and "varclass" every time you wanted to add a method. In fact, I don't even like Tim's notion of declaring a function since a "def" is more than adequate for doing that. I would like to see something like: #--------------------------------------------------------- class Foo: decl class a: Int "The class variable is for ..." a = 1 decl b: String "Member variable. 
Alternative location for a doc string" def bar(x: Int, y: String) -> List: "Doc string goes here" ... return some_list #--------------------------------------------------------- Note that an interface definition would look exactly the same, except that "interface" would be used instead of "class", variable assignments are not allowed, and functions cannot have a body (only a doc string). Note the use of "decl class ..." to define class variables, while "decl ..." is for member variables. I'm not sure if we should instead use Tim's suggestion of "decl member ...", though. Given the position of the declaration, I think "decl member" might actually be better because it makes it clear that is a *member* variable, despite being in a location that is normally used for class variables. An alternative would be a different "decl" keyword just for members. [ I *really* don't like member declarations in the __init__() method as some people have shown. Those could be confused with declarations of local vars, which I hope we aren't going to have. ] Consider the above example, my latest proposal for syntax changes in support of declarations. Obviously, a bit more detail is needed for things like parameterized types, but I think the above is representative of where I'd like to see things go. And I won't suggest anything for an external syntax, since I don't support that :-) Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Thu Dec 16 22:16:18 1999 From: gstein@lyra.org (Greg Stein) Date: Thu, 16 Dec 1999 14:16:18 -0800 (PST) Subject: [Types-sig] New syntax? In-Reply-To: <14425.23585.74942.844600@dolphin.mojam.com> Message-ID: On Thu, 16 Dec 1999, Skip Montanaro wrote: > Paul> Separate files are a necessity to handle C-coded types. > > Martijn> Um? Why? > > My guess is so that declaration and definition are separated. If > "definition" roughly means "import", you'd like to get at an object's > interface without actually importing (or perhaps even parsing it?) the > module it is defined in and risking the side effects of import. I don't think so... Paul was referring to C-coded extension types. Therefore, Python syntax (or any other syntax) is not available. The interface would need to be programmatically defined, or it would occur in a separate file. As I mentioned in another thread, I think we should defer this problem. There are too many issues, none of which move us (or hinder us) from getting to a first-version type system. I think we can revisit this and add some new concepts, code, whatever, to solve the problem (without tearing down or interfering with what we did in Rev 1). Cheers, -g -- Greg Stein, http://www.lyra.org/ From m.faassen@vet.uu.nl Thu Dec 16 22:19:50 1999 From: m.faassen@vet.uu.nl (Martijn Faassen) Date: Thu, 16 Dec 1999 23:19:50 +0100 Subject: [Types-sig] New syntax? References: Message-ID: <38596586.63D9E29F@vet.uu.nl> Greg Stein wrote: > > On Thu, 16 Dec 1999, Martijn Faassen wrote: > > >... example syntax for a Python syntax to declare interfaces ... > > Ah. Good. That is better than your/Paul's previous suggestions. It was almost exactly the same as before, though. > > > Inline vs. external is a different question (and I vote for former). But > > > different syntaxes is a big problem that is easily avoided. > > > > So what are you suggesting if you would be voting for external, then? A > > Python based system such as the one I proposed earlier? Or is this why > > you're voting for internal? > > By "for former", I meant that I want an internal syntax. 
Yes, I understood that. I just was curious what you meant when you stated you'd be something parsed with Python anyway, even if it were external. It's hard to come up with external type annotation syntax that doesn't include a new language. > Something like > Tim's suggestion. It keeps the declaration closest to the implementation, > which (IMO) is best. It is kind of like comments and code: they can easily > drift apart, especially if the two are distant from each other. That's true. It is a disadvantage. > In your example above, I think it would be a bit painful to flip back and > forth between the "class" and "varclass" every time you wanted to add a > method. Yes, but I think my proposal is rather easy to understand for Python programmers, as it looks almost exactly like Python in structure. The flipping back and forth is a bit painful, though, I agree. The advantage of separation though is that it can actually be made to look exactly like Python structures, which is rather neat. [snip] > [ I *really* don't like member declarations in the __init__() method as > some people have shown. Those could be confused with declarations of > local vars, which I hope we aren't going to have. ] Well, my syntax proposal avoids this confusion by following Python's lead: varclass Foo: alpha: int def __init__(self): self.member: int local: int > Consider the above example, my latest proposal for syntax changes in > support of declarations. Obviously, a bit more detail is needed for things > like parameterized types, but I think the above is representative of where > I'd like to see things go. Didn't you think parameterized types looked fairly straightforward in my syntax proposal? > And I won't suggest anything for an external syntax, since I don't support > that :-) Right, that was what I was curious about. :) Regards, Martijn From evan@4-am.com Thu Dec 16 22:25:26 1999 From: evan@4-am.com (Evan Simpson) Date: Thu, 16 Dec 1999 16:25:26 -0600 Subject: [Types-sig] Return of the Docstring: The Typening Message-ID: <385966D5.BAF592C4@4-am.com> I'm sure this was bandied around in the (distant) past, but since backward-compatible inline syntaxes are being proposed, I thought I'd resurrect it: Put the type constraints (of whatever syntax) in docstrings (and ignored strings). Shadow files for extensions can simply be .pyt's with dummy objects definitions, or could be more compact. Example: def foo(x1, x2): '''A foo function ::(int, int) -> int''' return x1+x2 'bar:: string'; bar = "I'm a string!" class Mine: '''My class! Mine! hm, uh:: float''' hm = 1.0 uh = 3.14 This could also serve as a shadow file, or perhaps a more compact notation: def foo::(int, int) ->int bar:: string class Mine: hm, uh:: float That is, if we're going to have name-type declarations at all. I'm rather partial to expression-type constraints with 'as' instead of '!'. Perform single-module analysis at compile time (if requested) to produce a type-inference graph such as Tim(?) described. Save the graph in a *.pyt file, then have a tool which uses them to do full-program type-checking, and possibly rewrite the *.pyc if optimization is possible and requested. I still like the Sparrow/SPython concept, too . Cheers, Evan @ 4-am From gstein@lyra.org Thu Dec 16 23:05:58 1999 From: gstein@lyra.org (Greg Stein) Date: Thu, 16 Dec 1999 15:05:58 -0800 (PST) Subject: [Types-sig] bunch o' stuff (was: minimal or major change?) In-Reply-To: <3858E9C2.E2722B88@vet.uu.nl> Message-ID: On Thu, 16 Dec 1999, Martijn Faassen wrote: >... 
> [I'm disagreeing with the 'isn't that big of a change' thesis, Greg > defends fairly > well that it is, but I still disagree with him. I don't think our > disagreeing will matter much in the future, though, so let's forget > about it.. Not a problem. "Agree to disagree" is quite civilized and proper :-) >... > > > * A whole new operator (which you can't overload..or can you?), which > > > does something quite unusual (most programmers associate types with > > > names, not with expressions). The operation also doesn't actually return > > > much that's useful to the program, so the semantics are weird too. > > > > No, you cannot overload the operator. That would be a Bad Thing, I think. > > That would throw the whole type system into the garbage :-). > > Okay, in that sense the operator would be special, as generally > operators > in Python can be overloaded (directly or indirectly). I'd agree you > shouldn't be able to overload this one, though. Well, I hope to not consider it "special". In my mind, it is just another operator. It has some semantics the compiler can take advantage of, sure, but it isn't like a pragma or some other meta-level thing. However, its semantics don't lend well to overloading. *shrug* Assuming we end up proposing this operator to Guido for inclusion as part of the new type system, then he can certainly make a call on whether it should be possible to overload it. > > The operator is not unusual: it is an inline type assertion. It is not a > > "new-fangled way to declare the type of something." > > But it's quite unusual to the programmer coming from most other > languages, still. That doesn't mean it's bad, but Python isn't an > experimental language, so this could be an objection to the operator > approach. Perl has the ~= operator, which has unusual semantics for a programmer coming from Python. Python's slice operator is not available to a C or C++ programmer, but people don't complain about that. Point is: each language has its own set of operators to solve problems within that language's domain. I see this operator as a pretty neat and clean way to resolve Python's (current) lack of type declarations. And I disagree with the notion "Python isn't an experimental language." It is one of the few to natively support complex types, sophisticated slicing, builtin dictionary types, and keyword arguments. >... > > > * Interfaces with a new 'decl' statement. [If you punt on this you'll > > > have to the innocent Python programmer he can't use the static type > > > system with instances? or will we this be inferenced?] > > > > Yes, I'd prefer to punt this for a while, as it is a much larger can of > > worms. It is another huge discussion piece. In the current discussion, I > > believe that we can factor out the interface issue quite easily -- we > > can do a lot of work now, and when interfaces arrive, they will slide > > right in without interfering with the V1 work. In other words, I believe > > there is very little coupling between the proposal as I've outline, and > > the next set of type system extensions (via interfaces). > > Hm, I'm still having some difficulty with this; as I understand it your > proposal would initially only work with functions (not methods) which > only use built-in types (not class instances). Am I right, or perhaps > I'm missing something.. Methods are actually function objects. When I've referred to functions, I'm talking about functions and methods. 
In other words: class Foo: def bar(x: String, y:String) -> String: pass In the above code, the bar() method has a type signature, which can be type-checked. Since writing the quoted text, I've read the interface proposal and thought more on the "decl" statement. I am now in favor of including "decl" in V1, thus providing types for all portions of an interface (attributes and method). >... > > > Adding anything like static type checking to Python entails fairly major > > > changes to the language, I'd think. Not that we shouldn't aim at keeping > > > those transparant and mostly compatible with Python as it is now, but > > > what we'll add will still be major. > > > > Sure. > > You say 'sure' to me saying it'll still be major? :) Oh, wait, I wasn't > arguing about that anymore! I'm not sure what I was referring to. Sorry about that. I think I meant, "yes, we should aim at keeping things transparent and compatible." At least, that's what I mean now when I re-read and re-comment on your text :-) >... > > > > > The 'simplicity' part comes in because you don't need *any* type > > > > > inferencing. Conceptually it's quite simple; all names need a type. > > > > > > > > 1) There is *no* way that I'm going to give every name a type. I may as > > > > well switch to Java, C, or C++ (per Guido's advice in another email :-) > > > > > > Sure, but we're looking at *starting* the process. Perhaps we can do > > > away with specifying the type of each local variable very quickly by > > > using type inferencing, but at least we'll have a working > > > implementation! > > > > I don't want to start there. I don't believe we need to start there. And > > my point (2) below blows away your premise of simplicity. Since you still > > need inferencing, the requirement to declare every name is not going to > > help, so you may as well relax that requirement. > > But you'd only need expression inferencing, which I was ('intuitively' > :) assuming is easier than the larger scale thing. Yes, expression-level inferencing is easier, as you don't have to worry about code like this: a = 1 while 1: func_which_takes_int(a) a = "foo" The above code should raise a type-check error. Tim referred to the above problem when he talked about "reaching a stable state," although it probably wasn't obvious to most readers :-) If names have types, then the a="foo" line would raise an error. In a purely inferenced world, the inferencer (eventually) figures out that can have one of two types at the time of the function call. It then raises an error saying "func_which_takes_int expect an Int, but a may be a String." >... > > > I'm not saying this is a good situation, it's just a way to get off the > > > ground without having to deal with quite a few complexities such as > > > inferencing (outside expressions), interaction with modules that don't > > > have type annotations, and so on. I'm *not* advocating this as the end > > > point, but I am advocating this as an intermediate point where it's > > > actually functional. > > > > IMO, it is better to assume "PyObject" when you don't have type > > information, rather than throw an error. Detecting the lack of type info > > is the same in both cases, and the resolution of the lack is easy in both > > mehtods: throw an error, or substitute "PyObject". I prefer the latter so > > that I don't have to update every module I even get close to. > > I still don't understand how making it a PyObject will help here. 
Would > this mean a run-time check would need to be inserted whenever PyObject > occurs in a function with type annotations? In my approach this would be > part of the Python/Static Python interface work. How does it fit in for > you? The PyObject approach means that you don't throw an error. There are no runtime checks or compile time checks. They are simply unavailable since you have no type information. Using PyObject will help because it means you aren't raising errors simply because some module has not added type declarations. Instead, the compiler just uses the "unknown" (PyObject) type and keeps going. Of course, that may cause type errors later, but that is resolvable with the type-assert operator (which inserts a run-time check, and tells the compiler what type you're expecting it to be). >... > > > Yes, but now you're building a static type checker *and* a Python > > > compiler inserting run time checks into bytecodes. This is two things. > > > This is more work, and more interacting systems, before you get *any* > > > payoff. My sequence would be: > > > > Who says *both* must be implemented in V0.1? If the compiler can't figure > > it out, then it just issues a warning and continues. Some intrepid > > programmer comes along and tweaks the AST to insert a runtime check. Done. > > The project is easily phased to give you a working system very quickly. > > > > Heck, it may even be easier for the compiler to insert runtime checks in > > V0.1. Static checking might come later. Or maybe an external tool does the > > checking at first; later to be built into the compiler. > > That's true; the other approach would start with adding run-time checks > and proceed to a static checker later. Yes, that's what I said :-) First, we add the new typedecl syntax. Then, if the compiler inserts runtime checks for function arguments and as a result of the type-assert operator, then we have a good first pass. Next comes an external tool to consume type information and perform type inferencing and checking. Finally, we decide on integrating the external tool into the compiler proper. >... > So that's where I'm coming from. It's important for our proposal to > actually come up with a workable development plan, because adding type > checking to Python is rather involved. So I've been pushing one course > of implementation towards a testable/hackable system that seems to give > us the minimal amount of development complexities. I haven't seen clear > development paths from others yet; most proposals seem to involve both > run-time and compile-time developments at the same time. I haven't seen any, let alone clear, discussions from others about development paths :-) But I don't think anybody is going to advocate a system that will take a while to bring up, so I think we're all in agreement here. >... > > > This'd be only implementable with run-time assertions, I think, unless > > > you do inferencing and know what the type the object is after all. So > > > that's why I put the limitation there. Don't allow unknown objects > > > entering a statically typed function before you have the basic static > > > type system going. After that you can work on type inference or cleaner > > > interfaces with regular Python. > > > > Why not allow unknown objects? Just call it a PyObject and be done with > > it. > > Hm, I suppose I'm looking at it from the OPT point of view; I'd like to > see a compiler that exploits the type information. 
If you have PyObjects > this seems to get more difficult; could be solved if you had an > interpreter waiting in the sidelines that would handle stuff like this > that can't be compiled. The compiler can exploit type information, sure. But we're talking about the case where type information is not available. Rather than just failing, the compiler just doesn't optimize. Using the type-assert operator, you can get the compiler cranking up again (of course, you could also go and add type annotations to the code being called). > > Note that the type-assert operator has several purposes: > > > > * a run-time assertion (and possibly: unless -O is used) > > * signal to the compiler that the expression value will have that type > > (because otherwise, an exception would hav been raised) > > * provides a mechanism to type-check: if the compiler discovers (thru > > inferencing) that the value has a different type than the right-hand > > side, then it can flag an error. > > > > The limitation you propose would actually slow things down. People would > > not be able to use the type system until a lot of modules were > > type-annotated. > > I think I'm starting to see where you're coming from now, with the ! > operator. It allows you to say 'from this point on, this value is an > int, otherwise the operator would've raised an exception'. The > inferencer and checker can exploit this. Exactly! The compiler can also use it to perform various optimizations, since it now knows the (guaranteed) type. > The point where I am coming > from is however that you lose compile-time checkability as soon as you > use any function that inserts PyObjects into the mix. I'm afraid that > even with the operator, you wouldn't be able to check most of the code, > if PyObjects are freely allowed. Perhaps I'm wrong, but I'd like to see > some more debate about this. Yes, you lose it, but that doesn't mean you throw the baby out with the bath water. The compiler just degrades gracefully in the presence of a PyObject. With the type-assert operator, you effectively convert that PyObject into a known type which the compiler can then use in later checks and optimizations. > > > But perhaps I'm mistaken and local variables don't need type > > > descriptions, as it's easy to do type inferencing from the types of the > > > function arguments and what the function returns, > > > > That is my (alas: unproven) belief. > > How do we set about to prove it? I don't need a proof :-). I think we *can* use inferencing to avoid decls for local variables. In fact, I am positive of it, and instead would like to hear a counter-proof. > Here I'll come with my approach again; > if you have a type checker that can handle a fully annotated function > (all names used in the function have type annotations), then you have a > platform you can build on to develop a type checker. Then you can figure > out what does need type annotations and what doesn't. You simply try to > build code that adds type annotations itself, based on inferences. You > can spew out warnings: "full type inferencing not possible, cannot > figure out type of 'foo'". The programmer can then go add type info for > 'foo'. If all types are known one way (specified) or the other > (inferred), a compiler can start to do heavy duty optimization on that > code. I do not believe that developing a type checker for fully-annotated functions is going to help in any way towards building an inferencer. In other words, we just build the inferencer. 
However, I do see that a compiler that knows all types is a good first step. Using those types, it can do various things (e.g. type checks on func args, various optimizations). Where it gets the type information is the point of discussion here :-) I'd rather just start with inferencing rather than modifying the syntax to support typing of locals, only to pull that syntax change out later. Note: your proposal of __types__ would be useful during development of the compiler (presuming that occurs before the inferencer is available). __types__ requires no syntax changes, so it can give the compiler the info right away. Later, we just stop looking for __types__ and use the inferencer. >... > [snip] > > > I'd like to see some actual > > > examples of how this'd work first, though. For instance: > > > > > > def brilliant() ! IntType: > > > a = [] > > > a.append(1) > > > a.append("foo") > > > return a[0] > > > > > > What's the inferred type of 'a' now? A list with heterogenous contents, > > > that's about all you can say, and how hard is it for a type inferencer > > > to deduce even that? > > > > It would be very difficult for an inferencer. It would have to understand > > the semantics of ListType.append(). Specifically, that the type of the > > argument is added to the set of possible types for the List elements. > > > > Certainly: a good inferencer would understand all the builtin types and > > their methods' semantics. > > > > > But for optimization purposes, at least, but it > > > could also help with error checking, if 'a' was a list of IntType, or > > > StringType, or something like that? > > > > It would still need to understand the semantics to do this kind of > > checking. In my no-variable-declaration world, the type error would be > > raised at the return statement. a[0] would have the type set: (IntType, > > StringType). The compiler would flag an error stating "return value may be > > a StringType or an IntType, but it must only be an IntType". > > Right, I think this would be the right behavior. But it becomes a lot > easier to get a working implementation if you get to specify the type of > 'a'. If you say a is a list of StringType, it's then relatively easy for > a compile time checker to notice that you can't add an integer to it. Well, kind of. The checker would sitll have to understand that a.append() is going to insert that value into the list, so that appending an Int would generate a type conflict. Re: working implementation faster: this presumes that the compiler will use the type declarations before the inferencer is available. > And possibly it also becomes clearer for the programmer; I had to think > to figure out why your compiler would complain about a[0]. I had to play > type inferencer myself. I don't have to think as much if I get to > specify what list 'a' may contain; obviously if something else it put > into it, there should be an error. The programmer never has to think about type inferencing. That only exists to create type-check warnings/errors. The programmer believes that he has a list of integers and codes that way. The inferencer then comes along and tells him that he goofed up. Declaring up front simply moves the error from the return statement to the point where the wrong type was inserted into . It is arguable that this is preferable. > > > It seems tough for the type > > > inferencer to be able to figure out that this is so, but perhaps I'm > > > overestimating the difficulty. 
> > > > Yes it would be tough -- you aren't overestimating :-) > > What would your path towards successful implementation be, then? * add the syntax changes (decl, def changes, and !) * change the compiler to use the new syntax to insert runtime checks * develop external tool to do type checking * possibly integrate the tool into the compiler Note that the external tool will start with rudimentary type inference and analysis. It will then grow in complexity as more capability is added. For example, initially, it might only know "a" + 1 is a type error. Later, it would be able to do some simple inference based on data flow. Later still, it would recognize problems like the "while" example I listed above. Also note that I'm not sure we ever put type-checking into the core interpreter. If it isn't going to alter the compilation output, then why put it in? In other words: somebody ought to come up with a list of things they expect the compiler to alter in the *bytecodes* based on the type information (Python doesn't really have type-specific bytecodes (yet)). Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Thu Dec 16 23:11:27 1999 From: gstein@lyra.org (Greg Stein) Date: Thu, 16 Dec 1999 15:11:27 -0800 (PST) Subject: [Types-sig] New syntax? In-Reply-To: <38596586.63D9E29F@vet.uu.nl> Message-ID: On Thu, 16 Dec 1999, Martijn Faassen wrote: >... > > [ I *really* don't like member declarations in the __init__() method as > > some people have shown. Those could be confused with declarations of > > local vars, which I hope we aren't going to have. ] > > Well, my syntax proposal avoids this confusion by following Python's > lead: > > varclass Foo: > alpha: int > > def __init__(self): > self.member: int > local: int Quite true. This is much clearer. But I still want "decl" rather than "varclass" :-) > > Consider the above example, my latest proposal for syntax changes in > > support of declarations. Obviously, a bit more detail is needed for things > > like parameterized types, but I think the above is representative of where > > I'd like to see things go. > > Didn't you think parameterized types looked fairly straightforward in my > syntax proposal? Yes. I would expect something like that for a new typedecl syntax. Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Thu Dec 16 23:20:21 1999 From: gstein@lyra.org (Greg Stein) Date: Thu, 16 Dec 1999 15:20:21 -0800 (PST) Subject: [Types-sig] Attributes proposal In-Reply-To: <38592D38.63057A1A@prescod.net> Message-ID: On Thu, 16 Dec 1999, Paul Prescod wrote: > My proposal for handling attributes is this: > > An attribute's type can be declared. Writes to the attribute from the > same module can be statically type checked (if requested). Writes to the > attribute from other modules are checked at runtime. That way we can > always know the type of the attribute value and can therefore make > reasonable use of the attribute in statically type checked functions. > > Opinions? Punt issues of writeability to a later revision. Concentrate on type checking instead. Assume that an attribute's declared type is correct. Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Thu Dec 16 23:29:21 1999 From: gstein@lyra.org (Greg Stein) Date: Thu, 16 Dec 1999 15:29:21 -0800 (PST) Subject: [Types-sig] check_type() In-Reply-To: <3857E01D.6C699075@prescod.net> Message-ID: On Wed, 15 Dec 1999, Paul Prescod wrote: > Greg Stein wrote: ... 
> > > j=has_type( foo, types.StringType ) or has_type( foo, types.ListType ): > > > > You'll have issues with empty strings and empty lists, as Guido pointed > > out. > > Yes, you have to use it in ways that follow Python's boolean rules. A > better name would be check_type. > > j=check_type( foo, types.StringType ) > > > has_type() does not create a *definitive* type assertion. The compiler > > cannot extract any information from the presence of has_type(). Using an > > operator which raises an exception allows the compiler to make the > > assertion (and thereby assist with type inferencing and type checking). > > j=check_type( foo, types.StringType) > > j is *guaranteed* to be either a string or None. But that is a problem right there: you've introduced the possibility that might be None. While the compiler / type-checker can certainly do something useful with that concept, this does not provide a way to guarantee a *single* type. j = check_type(foo, String) func_taking_string(j) The above will fail because the compiler will flag the possibility of being None. j = foo ! String func_taking_string(j) Now that works :-) > Note that check_type is actually an operator in that it cannot be > overwritten or shadowed. It just happens to be an operator that looks > like a function and that returns a useful value instead of immediately > causing an exception. It also happens to be compatible with the current > Python grammar. Icky. Either the compiler now has to understand that NAME(...) could possibly require special processing, or the parser recognizes it and constructs a new AST node. The former is badness, and the latter says you're changing the parser, so why not use a "real" operator? People might also be tempted to do: ct = check_type ct(x, String) ct(y, Int) ct(z, List) Of course, this will fail because check_type isn't really a valid name. > I have big aesthetic problems with adding a special character to a > language that uses the word "or" to mean, well "or" and "not" to mean > "not". I might be able to live with > > "k = eval('1') as int" > > if it isn't too horribly ambiguous. Using "as" instead of "!" would be non-ambiguous. However, the word "as" seems to imply "use it as an int" rather than an assertion that it *is* an integer. Of course, we can't use the word "is" because it can already be used inside an expression. "isa" might even be more appropriate, and is available for usage. For now, I'll keeping using '!', but I'm on record as being open to alternate representations for the operator. Cheers, -g -- Greg Stein, http://www.lyra.org/ From tim_one@email.msn.com Fri Dec 17 01:17:10 1999 From: tim_one@email.msn.com (Tim Peters) Date: Thu, 16 Dec 1999 20:17:10 -0500 Subject: [Types-sig] Re: RFC 0.1 In-Reply-To: <385691D3.6DC4A36E@vet.uu.nl> Message-ID: <000801bf482c$71670100$63a2143f@tim> [Martijn Faassen] > While my agenda is to kill the syntax discussions for the moment, > ... Martijn, in that case you should stop feeding the syntax meta-discussion and just view all the other notations as virtual spellings for masses of obscure nested dicts . notation-is-an-aid-to-comprehension-ly y'rs - tim From tim_one@email.msn.com Fri Dec 17 01:17:05 1999 From: tim_one@email.msn.com (Tim Peters) Date: Thu, 16 Dec 1999 20:17:05 -0500 Subject: [Types-sig] RFC 0.1 In-Reply-To: <199912141633.LAA23558@eric.cnri.reston.va.us> Message-ID: <000601bf482c$6e075b40$63a2143f@tim> [Guido] > ... > If it's (OPT) we're after, adding run-time checks can never obtain > your goal. 
That's unclear: declared types don't have to be known correct to be useful! They at least tell you what the user *expects* to be true. Code can then be generated *assuming* everything the user said is true, with a block of code preceding to *verify* it's true. Optimizing compilers do this routinely under the covers, where the (misnamed in this case) "verification" code simply branches to a slower all-purpose translation of the code if the assumptions turn out to be false at runtime. Trivial example: for i in range(n): x[i] = i In the presence of decl x: [Int] the generated (pseduo)code if type(x) is not ListType: raise TypeError("lying bastard!") else: setter = ListType.__setitem__ for i in range(n): setter(x, i, i) is a good bet and already saves n lookups of the proper __setitem__ method. It's a comparatively small step from there for a compiler to say "ah, but I know all about Lists! I can generate list "setter" code inline". And again from there to "ah, now that all the code is exposed, I know the net effect on each i's refcount, so can skip useless inc+dec pairs". I'm not saying you need to do this; just saying that all information *can* be valuable to a gung-ho optimizer -- even wrong information! Optimization is a probability game, and while certainty is helpful it isn't essential. > ... if there's a type error in my except clause, what good does it > do me to get a type-check error at run time? Frankly, I think the "safety" arguments are the weakest -- if someone has untested code paths in their program, they should *assume* all such paths are broken! What good does it do you to have a statically type-correct except clause if it raises an OverflowError at runtime <0.5 wink>? (Speaking of which, I routinely see error paths in C++ apps blow up with memory errors due to null pointers.) >>> The initialization for b denies its type declaration. Do you really >>> want to do this? >> None is a valid value for any type as with NULL in C or SQL. > No. In C, NULL is not a valid integer (at least not conceptually -- > it's a pointer). I hate the fact that in Java, NULL is always a valid > string, because strings happen to be objects, and so I always run into > run-time errors dereferencing NULL. I'd like to be able to declare > the possibility that a particular value is None separate from its type > -- this feels much more natural and powerful to me. Paul later semi-suggested borrowing Haskell's notation for union types. This looks good to me (despite that "my" syntax looks concrete, it's abstract ): decl i: Int # "i = None" not allowed decl j: Int | None # "j = None" is OK As you said earlier, "a type" is a set of values. So if None is a legit value for a name, then None is in the set of values that name can take on, so None is certainly a part of its type. We don't need tricks or compromises here: we can say what's intended directly. > The hard part is keeping which variables (and arguments, etc.) can > contain instances of a given class; if we have that we can track > instance variable assignments. I don't see the problems here, at least not for explicit declaration schemes. The inference schemes are harder -- because they're, well, *harder* . 
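Returning to the union for a moment: at run time, enforcing a declaration like Int | None needs nothing more than a check of this shape (the function name is invented; only the test matters):

    def expects_int_or_none(j):
        # what enforcing "decl j: Int | None" amounts to at run time
        if j is not None and not isinstance(j, type(0)):
            raise TypeError("j declared Int | None, got %s" % type(j))
        return j

Whether such a check survives under -O is a separate question.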
loathe-to-see-hard-problems-prevent-solving-easy-ones-ly y'rs - tim From tim_one@email.msn.com Fri Dec 17 01:17:12 1999 From: tim_one@email.msn.com (Tim Peters) Date: Thu, 16 Dec 1999 20:17:12 -0500 Subject: [Types-sig] RFC 0.1 In-Reply-To: <199912141909.OAA24221@eric.cnri.reston.va.us> Message-ID: <000901bf482c$72738f00$63a2143f@tim> [Guido] > Agreed. List of integer and its friends are important. Also > correspondences (see my example of a sum() function taking a > list of and an additional single . Assuming an object of type C is declared decl x: C and an object of type "list of C" is declared decl y: [C] then for a function taking a list of some type and a scalar of that type, returning a binary tree of objects of that type , I'd suggest: decl sum: def([_T], _T) -> BinaryTree(_T) I'm just warping Haskell's system to Python conventions. As I've noted before, Haskell is the most Pythonic of all the languages that are entirely unlike Python <0.9 wink>. Correspondences require a formal type *variable*. C++ templates use an ugly angle-bracket notation to surround the formal type variables. Haskell uses identifiers that begin with a lowercase letter, conventionally a one-letter name from the end of the alphabet. I suggest a leading underscore in Python, to suggest that there's something special about the name, and to suggest that it's "local" to the type expression in which it appears. it's-easy-if-you-don't-think-ly y'rs - tim From paul@prescod.net Fri Dec 17 01:28:38 1999 From: paul@prescod.net (Paul Prescod) Date: Thu, 16 Dec 1999 17:28:38 -0800 Subject: [Types-sig] Module Attribute visibility References: Message-ID: <385991C6.5B2917B4@prescod.net> Greg Stein wrote: > > IMO, let's solve static type checking. Leave visibility and modification > rules to another phase. They are orthogonal problems, and we would do well > to reduce our problem set (and the amount of discussion thereby > engendered (my 25 cent word for the day :-)). They are not orthogonal at all. I can't statically check a file that uses sys.version unless I know that sys.version has not been overwritten with a string. We can't allow the runtime system to violate the expectations of the static type engine. We also don't want every user of sys.version to need to assert its type. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things never trust in: That's the vendor's final bill The promises your boss makes, and the customer's good will http://www.geezjan.org/humor/computers/threes.html From tim_one@email.msn.com Fri Dec 17 02:04:05 1999 From: tim_one@email.msn.com (Tim Peters) Date: Thu, 16 Dec 1999 21:04:05 -0500 Subject: [Types-sig] RFC 0.1 In-Reply-To: Message-ID: <000c01bf4832$ff321460$63a2143f@tim> [Greg Stein] > Woo hoo! Tim to the rescue! :-) Na -- I just like to stir up trouble . [Tim code that uses one name for both a dict and that dict's list of keys] > Ha! I posted something just like this just the other day: > http://www.python.org/pipermail/types-sig/1999-December/000518.html Yes, I noticed that at the time but forgot -- it's such a common idiom in my code that it didn't "stick". > Basically: I *totally* agree, and this is primarily the time when > I use a single variable name for two different types. This is also > a reason why I'd like to avoid the notion of associating a type > with a [variable] name. 
Associating types with names is thoroughly conventional, and thoroughly appropriate if a given name is in fact intended to have a fixed type -- and I expect that's most names (e.g., likely every name in __builtin__ and sys!). If I have a class with a dozen methods and they all treat e.g. self.x as a list of floats, I certainly don't want to have to decorate every reference to and binding of self.x[i] to say that over and over again. I'd rather use a distinct name in the few places I "cheat" now. Heck, given a suitable set of predefined interfaces, I could declare my dict/list name as being of type (or implementing the interface) Subscriptable . Or of the universal type (Paul's PyObject). Although the less specific I am, the less help I can expect to get from typing -- that's my tradeoff to make. Given that you *have* to associate types at least with formal argument names, "avoiding the notion of associating a type with *a* name" is a lost cause. A further distinction between "[variable] name"s and "all names" isn't compelling -- although I hope Guido doesn't listen to me and presses on with his type inference schemes anyway, because given the types of globals and arguments, the types of almost all local variables are indeed easy to infer . doesn't-mean-i-want-to-prevent-people-from-declaring-'em- though-they're-*used*-to-it-from-other-languages-and- there's-no-good-reason-to-outlaw-it-ly y'rs - tim From tim_one@email.msn.com Fri Dec 17 02:04:07 1999 From: tim_one@email.msn.com (Tim Peters) Date: Thu, 16 Dec 1999 21:04:07 -0500 Subject: [Types-sig] List of FOO In-Reply-To: Message-ID: <000d01bf4833$0070fd00$63a2143f@tim> > So I could, for instance, define a binary tree module and > have "binary trees of ints" and "binary trees of strings." > How do I define the binary tree class and state that it > is parameterizable? Via: decl type BinaryTree(_T) class BinaryTree: # exactly as today exploiting type variables (see earlier msg), and that I named "decl" "decl" instead of "var" precisely because "var"iable declarations aren't the only kinds of "decl"arations that will be needed before the blood stops flowing. decl-will-encompass-a-sublanguage-bigger-than-python-ly y'rs - tim From tim_one@email.msn.com Fri Dec 17 02:19:21 1999 From: tim_one@email.msn.com (Tim Peters) Date: Thu, 16 Dec 1999 21:19:21 -0500 Subject: [Types-sig] RFC 0.1 In-Reply-To: Message-ID: <000e01bf4835$20e04b20$63a2143f@tim> [GregS] > ... > 2) You *still* need inferencing. "a = foo() + bar()" implies that some > inferencing occurs. The type of the RHS expression is the union of the types returned by foo.__add__ and bar.__radd__. I wouldn't call that inferencing, any more than I'd say it required inferencing to determine the return type of math.sin(3.0) Now you *can* call that inferencing, but doing so wouldn't be helpful . > ... > The compiler can issue a warning and insert a type assertion for > a runtime check. IMO, it should not forbid you from doing anything > simply because it can't figure out some type. Python syntax's "type > agnosticism" is one of its major strengths. It sure is! OTOH, many people write code that doesn't exploit that, and would rather not see runtime surprises when it's *possible* to catch them at compile-time. And some of those would rather not be allowed to write any code that *could* yield a runtime surprise. Different strokes, and I say don't worry about it! 
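For the record, the kind of code that leans hardest on that agnosticism -- the two-types-for-one-name idiom pointed at earlier -- looks roughly like this (reconstructed for illustration, not quoted from the archived post):

    def unique_sorted(words):
        seen = {}
        for w in words:
            seen[w] = 1                 # use a dict as a cheap set
        seen = list(seen.keys())        # same name, now a list of the keys
        seen.sort()
        return seen

A strict compile-time mode would refuse the rebinding of "seen"; a permissive one would warn or wave it through. Using a second name for the value's second life is the easy way out.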
What's important is that the type system be defined sufficiently well that the compiler can either *know* that it knows the type of a given expression at compile-time, or know that it *doesn't* know it. What it does in the latter case can be determined by a compile option. There's really no other realistic choice, since Python's dynamicism is too useful to allow defining a type system that's guaranteed always resolvable at compile time. Some people are going to want to die when it's not, others are going to want to press on. and-both-positions-are-ridiculous-ly y'rs - tim From paul@prescod.net Fri Dec 17 03:25:40 1999 From: paul@prescod.net (Paul Prescod) Date: Thu, 16 Dec 1999 19:25:40 -0800 Subject: [Types-sig] consensus(?) summary (was: Type annotations) References: Message-ID: <3859AD34.17FABEC5@prescod.net> Greg Stein wrote: > > > * non-local writes are checked at runtime (by default) > > Hrm. Is there an easy rule to determine this? Yes: if the code's module object is not the module/class whose namespace we are writing to, the write fails. This is a fast pointer comparison in CPython. > I might suggest deferring > this unless/until we have a clear set of rules. Shades of C++'s "friend" > modifier are forming in my head when we talk about this... Simple languages should have simple protection rules. > > * for optimization, the checks may be stripped based on type inferenced > > information > > Which checks? I think runtime checks are *ignored* if you run with -O. > ... > In reference to type-inferred information: I don't think runtime checks > would ever be added if the type has been inferred. That's what I meant. > > * built-in types are declared through "shadow files" > > This is somewhat problematic. How do we map from a builtin type to this > shadow file? Do they reside in a well-known location? They reside on the PYTHONPATH. > Second issue: keeping them in sync, version mismatches, distribution and > install problems, etc. Keeping them in sync: coder's responsibility. Version mismatch: coder's responsibility. Distribution and install programs: I'll toss this to distutils. > > * types can be parameterized > > * which means that the compile/runtime checks need to be more > > sophisticated > > Yes, although I might modify it somewhat and say "only core types can be > parameterized." I don't any longer see a reason to restrict it that way. Parameterizing types is actually not so bad. I don't know how C++'s got so complex as to be a full programming language -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things never trust in: That's the vendor's final bill The promises your boss makes, and the customer's good will http://www.geezjan.org/humor/computers/threes.html From paul@prescod.net Fri Dec 17 03:25:47 1999 From: paul@prescod.net (Paul Prescod) Date: Thu, 16 Dec 1999 19:25:47 -0800 Subject: [Types-sig] New syntax? References: Message-ID: <3859AD3B.E841643A@prescod.net> Greg Stein wrote: > > On Thu, 16 Dec 1999, Martijn Faassen wrote: > > >... example syntax for a Python syntax to declare interfaces ... > > Ah. Good. That is better than your/Paul's previous suggestions. and > Tim has come up with a good first pass. If we can formalize that (and I'd > like to tweak it), then we have the basis we need to move forward. There are two problems. One is defining interfaces. The other is referring to compound types. My syntax for the former was almost the same as Martijn's (and thus non-Python). My syntax for the other was directly based on Tim's. 
Neither of my syntaxes are "Python" but neither are Tim and Martijn's. Perhaps you can quote the syntax that you object to and propose an alternative. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things never trust in: That's the vendor's final bill The promises your boss makes, and the customer's good will http://www.geezjan.org/humor/computers/threes.html From paul@prescod.net Fri Dec 17 03:25:51 1999 From: paul@prescod.net (Paul Prescod) Date: Thu, 16 Dec 1999 19:25:51 -0800 Subject: [Types-sig] New syntax? References: <38596586.63D9E29F@vet.uu.nl> Message-ID: <3859AD3F.CFB0ED0A@prescod.net> Martijn Faassen wrote: > > ... > Didn't you think parameterized types looked fairly straightforward in my > syntax proposal? I must have missed something. Could you show me how to do Btree of X and then make concrete types Btree of Int and Btree of Functions From String to Int? Paul Prescod -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things never trust in: That's the vendor's final bill The promises your boss makes, and the customer's good will http://www.geezjan.org/humor/computers/threes.html From tim_one@email.msn.com Fri Dec 17 04:05:49 1999 From: tim_one@email.msn.com (Tim Peters) Date: Thu, 16 Dec 1999 23:05:49 -0500 Subject: [Types-sig] challenge response (was: A challenge) In-Reply-To: Message-ID: <001101bf4844$0085adc0$63a2143f@tim> [GregS] > ... > IMO, type checking is NOT enabled by default. IMO2. oo?-ly y'rs - tim From tim_one@email.msn.com Fri Dec 17 04:05:51 1999 From: tim_one@email.msn.com (Tim Peters) Date: Thu, 16 Dec 1999 23:05:51 -0500 Subject: [Types-sig] New syntax? In-Reply-To: Message-ID: <001201bf4844$01f56a60$63a2143f@tim> [GregS] > ... > In fact, I don't even like Tim's notion of declaring a function since a > "def" is more than adequate for doing that. I thought it would be easier to get one new stmt than to modify existing stmts, and *much* easier to write a dirt-simple tool to strip them out again (vis a vis Guido's requirement). In real life I would certainly prefer annotating "def" stmts directly. I think a declaration statement needs the *ability* to specify full function signatures, though; e.g., decl handlerMap: {String: def(Int, Int)->Int} handlerMap = {"+": lambda x, y: x+y, "*": lambda x, y: x*y, ... } In either case, I'm not sure what to do about varargs (the "*rest" form of argument). > ... > Note the use of "decl class ..." to define class variables, while > "decl ..." is for member variables. I'm not sure if we should > instead use Tim's suggestion of "decl member ...", though. I am: I didn't think about this at all. Member vrbls are far more common than class vrbls, so practicality beats purity . > Given the position of the declaration, I think "decl member" might > actually be better because it makes it clear that is a *member* > variable, despite being in a location that is normally used for class > variables. That's the purity argument <2.0 wink>. > An alternative would be a different "decl" keyword just > for members. And that's the bozo argument <3.0 wink>. "decl" doesn't mean "here's a variable", it means "here's a declaration of 'something'"; e.g., on some days I would have killed to be able to say: decl builtin int, ord # stop looking these up in the inner loop! In fact, I think of "decl" as a devious way of writing "pragma" -- and all that that implies. 
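The lookup cost that "decl builtin" is aimed at can already be dodged by hand today, which hints at what such a pragma could hand to the compiler; the classic (if slightly shabby) idiom is to freeze the builtin into a default argument:

    def ordinals(s, ord=ord):
        # the builtin is captured once, when the def executes; inside
        # the loop 'ord' is a local, so no global-then-builtin lookup
        result = []
        for ch in s:
            result.append(ord(ch))
        return result

A "decl builtin" declaration would let the compiler do the same thing without the wart in the signature.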
although-less-than-that-demands-ly y'rs - tim From tim_one@email.msn.com Fri Dec 17 04:05:44 1999 From: tim_one@email.msn.com (Tim Peters) Date: Thu, 16 Dec 1999 23:05:44 -0500 Subject: [Types-sig] Interface files In-Reply-To: <199912162035.PAA11433@eric.cnri.reston.va.us> Message-ID: <001001bf4843$fe435ee0$63a2143f@tim> [Guido] > I'm not taking sides here, but I want to note that none of the takers > on my latest challenge have shown separate interface files. All the > ones I've seen used inline syntax. No, you're confusing my concrete syntax with my abstract syntax : in my submission, you move all the module-level "decl" statements that aren't declaring module-private names into a separate file. Bingo: interface file. I just happened to sprinkle them around inline to aid readability . heck-you-could-even-write-a-perl-script-to-do-it-ly y'rs - tim From tim_one@email.msn.com Fri Dec 17 07:25:55 1999 From: tim_one@email.msn.com (Tim Peters) Date: Fri, 17 Dec 1999 02:25:55 -0500 Subject: [Types-sig] Handling attributes In-Reply-To: <3857DFBC.A2B7ADAC@prescod.net> Message-ID: <001401bf485f$f47d2b40$63a2143f@tim> [PP] > ... > b. do we check assignments to class and module > attributes from other modules at runtime? As an eventual end user, if I declare the type of a name (*any* name), and I've enabled type checking, I expect that there is no possibility of that name getting bound to an object not of the declared type. I expect to get an error at compile-time if that's feasible, but I understand it may not be. In the latter case I expect a runtime error pointing at the offending binding. I also accept that the program may run slower because of this! > ... > c. should we perhaps just disallow writing to "declared" > attributes from other classes and modules? OK by me at the start. It's one way to satisfy my "no possibility", about which I'm serious because users will be serious. Unfortunately, I'm also a typical user in demanding the impossible -- that is, "no" is a very strong word, covering things like "disguised" rebindings via direct __dict__ access too. So as a reasonable user, I settle for "no possibility, with this peculiar but precise meaning of 'no': ...". > d. is it possible to write to UN-declared attributes from > other people's classes and modules? And what are the type safety > implications of doing so? Sure and none, for some peculiar but precise meanings of "sure" and "none" . For example, it may or may not totally screw up the conclusions reached by a type inferencer -- we'd have to see the type inferencer first to know for sure. Or if Guido's "optimize builtin" (yay!) idea is implemented, my doing yourmodule.len = lambda any: 42 will likely have no visible effect (for some peculiar but precise ...). but-i'd-call-that-a-feature!-ly y'rs - tim From tim_one@email.msn.com Fri Dec 17 07:25:50 1999 From: tim_one@email.msn.com (Tim Peters) Date: Fri, 17 Dec 1999 02:25:50 -0500 Subject: [Types-sig] Low-hanging fruit: recognizing builtins In-Reply-To: <199912151707.MAA02639@eric.cnri.reston.va.us> Message-ID: <001301bf485f$f2c12360$63a2143f@tim> [Guido] > ... > LOAD_GLOBAL 0 (len) > LOAD_FAST 0 (a) > CALL_FUNCTION 1 vs > can be replaced by > > LOAD_FAST 0 (a) > UNARY_LEN > > which saves one PVM roundtrip and two dictionary lookups, plus the > argument checking code inside the len() function. To get the latter, I believe we'd need to write an additional len() function. 
That is, the current checking len has to stick around to deal with stuff like apply(f, args) when f happens to be bound to __builtin__.len. > There are plenty of bytecodes available. Likely many more than there are compilers that can tolerate another case in eval_code2's switch . > ... > The per-module analysis required is major compared to what's > currently happening in compile.c (which only checks one function > at a time looking for assignments to locals) but minor compared > to any serious type inferencing. Note too that it's a length-changing transformation, which is also brand new. That is, the only optimizations in compile.c now replace a bytecode with another of the same size. So (at least) jump offsets and the line-number table would need to be recomputed too. Not hard, there's simply no machinery there to build on now. OTOH, compile.c needn't be involved; the analysis & transformations *could* be done via a bytecode-fiddling Python program. i-knew-michael-hudson-would-be-useful-for-*something*-ly y'rs - tim From tim_one@email.msn.com Fri Dec 17 07:32:59 1999 From: tim_one@email.msn.com (Tim Peters) Date: Fri, 17 Dec 1999 02:32:59 -0500 Subject: [Types-sig] expression-based type assertions In-Reply-To: Message-ID: <001b01bf4860$f19690a0$63a2143f@tim> [Edward Welbourne] > ... [and 14Kb later] ... > This list is too busy. Actually, I heard it was being killed for lack of activity . being-overwhelmed-is-a-way-of-life-ly y'rs - tim From tim_one@email.msn.com Fri Dec 17 08:08:22 1999 From: tim_one@email.msn.com (Tim Peters) Date: Fri, 17 Dec 1999 03:08:22 -0500 Subject: [Types-sig] Low-hanging fruit: recognizing builtins In-Reply-To: <3858B8C9.24962AAD@lemburg.com> Message-ID: <001d01bf4865$e2da0920$63a2143f@tim> [M.-A. Lemburg] > ... > I haven't followed the thread too closely, but isn't there > some way to tell the optimizer which modules to treat at > what optimization level ? No. I'm trying to introduce a "decl" stmt, though, that can in principle express any thought capable of human expression . > BTW, instead of adding oodles of new byte code, how about > grouping them... e.g. instead of UNARY_LEN, BUILD_RANGE, etc. > why not have a CALL_BUILTIN which takes an index into > a predefined set of builtin functions. It's another tradeoff. UNARY_LEN is simple enough that the code for builtin_len could be put in the case stmt inline, but skipping the argument check. Read it out of a table instead, and you're back to Yet Another Function call, or an embedded switch stmt in CALL_BUILTIN's implementation. > ... > Note that the loop as it is built now is already too large > for common Intel+compatible based CPUs. I assume this is Flowery Language for your particularly lame AMD K6 . > Adding even more byte codes to the huge single loop would > probably result in a decrease of CPU cache hits. (I split the > Great Switch in two switch statements and got some good results > out of this: the first switch handles often used byte codes > while the second takes care of the more exotic ones.) Good strategy! silly-cpus-ly y'rs - tim From gstein@lyra.org Fri Dec 17 08:37:26 1999 From: gstein@lyra.org (Greg Stein) Date: Fri, 17 Dec 1999 00:37:26 -0800 (PST) Subject: [Types-sig] Module Attribute visibility In-Reply-To: <385991C6.5B2917B4@prescod.net> Message-ID: On Thu, 16 Dec 1999, Paul Prescod wrote: > Greg Stein wrote: > > IMO, let's solve static type checking. Leave visibility and modification > > rules to another phase. 
They are orthogonal problems, and we would do well > > to reduce our problem set (and the amount of discussion thereby > > engendered (my 25 cent word for the day :-)). > > They are not orthogonal at all. I can't statically check a file that > uses sys.version unless I know that sys.version has not been overwritten > with a string. We can't allow the runtime system to violate the > expectations of the static type engine. We also don't want every user of > sys.version to need to assert its type. You certainly can statically check a file. Assume that sys.version is a string and remains a string. Done. Why can't the runtime system violate the expectations? Seriously: I doubt you can prevent it. Python is simply too dynamic. I'd be surprised if you could completely stop me from changing sys.version if I want really trying to do so. This falls back to what Tim was stating: you can output code that assumes a particular type and runs better, but falls back to a slower version if the type is wrong (or maybe raises an error). I certainly would hope that the compiler/PVM will not bomb because somebody managed to change the type of something where the type check system didn't think it would be changed. I think the type check system will work to signal errors at compile time, but I don't think it needs to go very far past that (e.g. hard-line restrictions on modification). This is basically a corollary of "we're all adults here." i.e. don't be a child and put a list into sys.version :-) Cheers, -g -- Greg Stein, http://www.lyra.org/ From tim_one@email.msn.com Fri Dec 17 08:59:33 1999 From: tim_one@email.msn.com (Tim Peters) Date: Fri, 17 Dec 1999 03:59:33 -0500 Subject: [Types-sig] minimal or major change? (was: RFC 0.1) In-Reply-To: <3858E9C2.E2722B88@vet.uu.nl> Message-ID: <001f01bf486d$09feec80$63a2143f@tim> [Martijn Faassen, reasonably demands ...] > So that's where I'm coming from. It's important for our proposal > to actually come up with a workable development plan, because > adding type checking to Python is rather involved. So I've been > pushing one course of implementation towards a testable/hackable > system that seems to give us the minimal amount of development > complexities. I haven't seen clear development paths from others > yet; most proposals seem to involve both run-time and compile- > time developments at the same time. > > So I'm interested to see other development proposals; possibly > there's a simpler approach or equally complex approach with more > payoff, that I'm missing. I haven't given a lick of thought to development, beyond sketching "the usual" approach to type inference for Guido, and having a hard-won intuition about what is and isn't reasonably parseable. This SIG has been "alive again" for on the order of just one week: design precedes implementation, and I won't bemoan the lack of implementation details even if they're delayed for *another* whole week . At that point, it's fine by me if the first cut is *spelled* using plain dicts and docstrings etc to ease development. But before that point, we don't even know what we want it to *do*. "we"-being-the-consensus-"we"-ly y'rs - tim From tim_one@email.msn.com Fri Dec 17 08:59:36 1999 From: tim_one@email.msn.com (Tim Peters) Date: Fri, 17 Dec 1999 03:59:36 -0500 Subject: [Types-sig] Re: [Doc-SIG] Sorry! In-Reply-To: Message-ID: <002001bf486d$0b457640$63a2143f@tim> [Edward Welbourne] > ... > (Tim: the type Boolean is a (useful) synonym for PyObject. I agree. 
I'm trying to avoid the mess C and then C++ got into by refusing to define a bool type for so many years (and so 10,000 development groups typedef'ed it in 20,000 different ways, many pairwise incompatable -- the runtime doesn't have much use for a distinct bool type, but it says something vital in signatures for *people*). > It probably includes some added semantics about how you should be > trying to use it.) Python's runtime rules have no restrictions on what a true/false object may be, or how one may be manipulated, and I don't want to impose any that Guido didn't see fit to impose from the start (I think we should be trying to provide notation for Python's actual types, not invent brand new types -- so "synonym" is what I want!). I just want that specific type name *there* so people don't run off defining their own in mutually incompatible ways. someday-i-*might*-want-to-run-somebody-else's-code-ly y'rs - tim From mal@lemburg.com Fri Dec 17 09:07:50 1999 From: mal@lemburg.com (M.-A. Lemburg) Date: Fri, 17 Dec 1999 10:07:50 +0100 Subject: [Types-sig] Low-hanging fruit: recognizing builtins References: <001d01bf4865$e2da0920$63a2143f@tim> Message-ID: <3859FD66.E47352E@lemburg.com> Tim Peters wrote: > > [M.-A. Lemburg] > > BTW, instead of adding oodles of new byte code, how about > > grouping them... e.g. instead of UNARY_LEN, BUILD_RANGE, etc. > > why not have a CALL_BUILTIN which takes an index into > > a predefined set of builtin functions. > > It's another tradeoff. UNARY_LEN is simple enough that the code for > builtin_len could be put in the case stmt inline, but skipping the argument > check. Read it out of a table instead, and you're back to Yet Another > Function call, or an embedded switch stmt in CALL_BUILTIN's implementation. Sure its a tradeoff, but it has the advantage of allowing to extend it later without adding too much cruft to the inner loop. What it basically does is avoid the global *and* local lookups by replacing them with a C array index lookup. Of course, for very common things such as len and range some other strategy might be worth persuing. > > ... > > Note that the loop as it is built now is already too large > > for common Intel+compatible based CPUs. > > I assume this is Flowery Language for your particularly lame AMD K6 ah, the satsifaction of being a Pure Wintel Guy!>. The performance improvement mentioned below is not really noticable on machines with different architectures, e.g. Sun SPARC. That's where I drew my conclusion from. But then, I tested a few years ago, so perhaps the new Pentiums and Athlon don't gripe about the size of the inner loop anymore. BTW, just to make buying one of those new microwave ovens more attractive: what is the pystone rating for the new Athlon and Pentium III chips ? > > Adding even more byte codes to the huge single loop would > > probably result in a decrease of CPU cache hits. (I split the > > Great Switch in two switch statements and got some good results > > out of this: the first switch handles often used byte codes > > while the second takes care of the more exotic ones.) > > Good strategy! Thanks :-) We are getting a little off-topic here, I'm afraid, but it was fun looking at old optimization strategies again... I haven't touched that code in years. 
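At the Python level the grouping idea can be mocked up like so -- purely illustrative, nothing like the real eval loop: the "compiler" resolves a builtin to a small integer once, and the "inner loop" indexes a fixed table instead of doing dictionary lookups.

    BUILTIN_TABLE = (len, range, ord)
    LEN, RANGE, ORD = 0, 1, 2

    def call_builtin(index, arg):
        # one C-array-style index instead of a global + builtin lookup
        return BUILTIN_TABLE[index](arg)

    # e.g. call_builtin(LEN, "abc") yields 3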
-- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 14 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From gstein@lyra.org Fri Dec 17 09:16:09 1999 From: gstein@lyra.org (Greg Stein) Date: Fri, 17 Dec 1999 01:16:09 -0800 (PST) Subject: [Types-sig] New syntax? In-Reply-To: <001201bf4844$01f56a60$63a2143f@tim> Message-ID: On Thu, 16 Dec 1999, Tim Peters wrote: > [GregS] > > ... > > In fact, I don't even like Tim's notion of declaring a function since a > > "def" is more than adequate for doing that. > > I thought it would be easier to get one new stmt than to modify existing > stmts, and *much* easier to write a dirt-simple tool to strip them out again > (vis a vis Guido's requirement). > > In real life I would certainly prefer annotating "def" stmts directly. I > think a declaration statement needs the *ability* to specify full function > signatures, though; e.g., > > decl handlerMap: {String: def(Int, Int)->Int} Ah. Right. Good point. I guess that does mean that something like: decl a: def(Int)->None would be possible. e.g. is a member holding a ref to a function object. Of course, the type of in this case is no different than: def a(Int x)->None: It is just that one declares a member and the other declares a method. There is a subtle difference there :-) In fact, these two are probably equivalent: decl class a: def(Int)->None def a(Int x)->None: > handlerMap = {"+": lambda x, y: x+y, > "*": lambda x, y: x*y, > ... > } > > In either case, I'm not sure what to do about varargs (the "*rest" form of > argument). What's wrong with: decl a: def(Int, *)->Int decl b: def(Int, **)->Int decl c: def(Int, *, **)->Int I don't see any ambiguity in the grammar there, unless you use "*" to mean unknown (as Paul once mentioned). I think the unknown type should be "Any" (or "any"), since it really means "take any type of value." > > ... > > Note the use of "decl class ..." to define class variables, while > > "decl ..." is for member variables. I'm not sure if we should > > instead use Tim's suggestion of "decl member ...", though. > > I am: I didn't think about this at all. Member vrbls are far more common > than class vrbls, so practicality beats purity . That was my first thought. Then I started thinking too hard about the problem... :-) I'm not sure whether to go for practical or pure. Cheers, -g -- Greg Stein, http://www.lyra.org/ From tim_one@email.msn.com Fri Dec 17 09:28:12 1999 From: tim_one@email.msn.com (Tim Peters) Date: Fri, 17 Dec 1999 04:28:12 -0500 Subject: [Types-sig] Implementability In-Reply-To: <385903C9.98B69A30@vet.uu.nl> Message-ID: <002101bf4871$09fad100$63a2143f@tim> [Tim] > declared_int = unknown > > is an error, but [Martijn Faassen] > Or, if you're interfacing with untyped python, this could raise a > run-time exception if unknown doesn't turn out to be an integer. Or do > you disagree with this? Yes and no . I think "type-checked Python" needs at least three different compile modes, because nobody is claiming we *can* check the entire language in a bulletproof way -- and when we can't, different people will want different behaviors at different times for legitimate reasons. 1. Anything that can't be proven safe at compile-time is a compile-time error (that's where I disagree with you above). 2. Anything that can't be proven safe at compile-time is checked at runtime (that's where I agree with you above). 3. 
Anything that can't be proven safe at compile-time emits a compile-time warning, and there's no guarantees one way or the other about what happens at runtime (where I don't even agree with myself). If someone doesn't want any of those behaviors, fine, don't enable type-checking. >> unknown1 = unknown2 >> >> is not. Whether >> >> unknown = declared_int >> >> should be an error is a policy issue. Many will claim it should >> be an error, but the correct answer is that it should not. > This would seem to be the natural way to do it; I'm not sure why many > would claim it should be an error. Could you explain? This is what I call a "self-denying prophecy". That is, by implicitly ridiculing the position before anyone took it, I graciously saved everyone from suggesting it . > I agree. I'm going to save that sentence and paste it into other replies as needed. thanks!-ly y'rs - tim From tim_one@email.msn.com Fri Dec 17 09:28:15 1999 From: tim_one@email.msn.com (Tim Peters) Date: Fri, 17 Dec 1999 04:28:15 -0500 Subject: [Types-sig] A challenge In-Reply-To: Message-ID: <002201bf4871$0baf37c0$63a2143f@tim> [Golden, Howard] > I completely support this style! I won't quibble about 'decl' vs. > 'var', though I suggest the latter, all else being equal, since > it has a proud heritage. Despite that I think of "my syntax" as an abtract one, I've tried to give one that *could* serve as a concrete syntax. Toward that end, I deliberately avoided "var" because we're going to need more kinds of declarations in the future, and Guido's willingness to add a keyword is a miracle I don't want to risk abusing . So it was "decl" for "declaration -- of anything whatsoever". For example, just about everywhere I wrote decl x ... so far I'd be just as happy with decl var x ... Other things that are going to need declaration include type synonyms and parameterized type declarations; e.g., decl type BinaryTree(_T) # a parameterized type decl typedef IntTree = BinaryTree(Int) # a type synonym Stuff like Java's final methods will eventually attract rabid enthusiasts too, and decl can do *anything*. the-miracle-that-is-delayed-definition-ly y'rs - tim From tim_one@email.msn.com Fri Dec 17 10:05:06 1999 From: tim_one@email.msn.com (Tim Peters) Date: Fri, 17 Dec 1999 05:05:06 -0500 Subject: [Types-sig] Type annotations In-Reply-To: <3858E94E.B7D86846@prescod.net> Message-ID: <002301bf4876$318e1880$63a2143f@tim> [Paul] > 3. in separate decl statements: (Incompatible with Python 1.5, but > easily converted) > > Python 1.5 compatibility: low Noting that decl x: Int could just as well be spelled # decl x: Int That would knock it down on the "syntactic cleanliness" scale, though! From tim_one@email.msn.com Fri Dec 17 10:05:07 1999 From: tim_one@email.msn.com (Tim Peters) Date: Fri, 17 Dec 1999 05:05:07 -0500 Subject: [Types-sig] Type annotations In-Reply-To: <38592CD4.963694B0@prescod.net> Message-ID: <002401bf4876$327173a0$63a2143f@tim> I'm burned out on SIG msgs for tonight, so one quickie: [Paul] > ... > Here are some Haskell-ish syntax ideas for type declarations: > > First we need to be able to talk about types. We need a "type > expression" which evalutates to a type. > > Rough Grammar: > > Type : Type ['|' Type] # allow unions > Unit : dotted_name | Parameterized | Function | Tuple | List | Dict > Parameterized : dotted_name '(' Basic (',' Basic)* ')' > Basic : dotted_name | PythonLiteral | "*" # * means anything. 
> PythonLiteral : atom > Function : Type '->' Type > Tuple : "(" Type ("," Type )* ) > List: "[" Type "]" > Dict: "{" Type ":" Type "}" The Function defn above is appropriate for Haskell because all functions there are curried (exactly one argument). The LHS should be different in Python, because we have multiple-argument functions, and some arglist gimmicks Haskell doesn't have. In my examples I've been using Function : 'def' '(' arglist ')' '->' Type I think that's what's required. With the "def" it's obvious. Without the "def" it's too easy to mistake the parenthesized arglist for a tuple of some sort (yes, the '->' later disambiguates it, but unbounded lookahead isn't machine- or human-friendly). An explict def also allows the variant Function : 'def' '(' arglist ')' for functions that don't return results. The "*" and "**" elements of arglists need also to be addressed. BTW, there appear to be two holes in Tuple: the empty tuple, and a tuple with unknown length. Shivery as it is, I expect we have to follow Python in this regard, and say (T) is a tuple-of-T of arbitrary length, while (T,) is a tuple containing one T. > ... > maptype(intype, outtype) = > (( intype -> outtype ), List( intype )) -> List( outtype ) Don't the parens around (intype->outtype) say that it's a tuple containing a function? BTW, Python's actual "map" function is quite a puzzle to describe! I can't do it: map: def(def(_A)->_B, Seq(_A) -> [_B] | def(def(_A, _B)->_C, Seq(_A), Seq(_B)) -> [_C] | ... is just the start of an unbounded sequence of legit map signatures. Haskell avoids the difficulty here thanks to currying. > ... > Interfaces look like Python classes but they use an "interface" > keyword. Weren't we leaving interfaces to JimF ? too-late-now-ly y'rs - tim From tim_one@email.msn.com Fri Dec 17 10:23:56 1999 From: tim_one@email.msn.com (Tim Peters) Date: Fri, 17 Dec 1999 05:23:56 -0500 Subject: [Types-sig] A challenge In-Reply-To: <3859353C.4106B165@appliedbiometrics.com> Message-ID: <002501bf4878$d2faf240$63a2143f@tim> [Christian Tismer] > ... > Just a question, please: > >> import fnmatch >> import os >> >> decl _debug: Int # but Boolean makes more sense; see below > > Is this meant to be lexically true in the globals scope from > here on? I'm not sure I grasp the question. The implicit model is the "global" stmt: a type declaration applies to the entire compilation unit in which it appears (module, class or def), and referencing a declared name before the declaration should be verboten. "global" currently allows redundant declarations, and you get a Fabulous Prize for later noticing the instance of redundant type declarations that I snuck into one of the examples . Conflicting declarations should be a compile-time error, and of the two traditional approaches to that I vote for name equivalence (as opposed to structural (content) equivalence). That is: class C1: decl member real, imag: Float real = imag = 0 and class C2: decl member real, imag: Float real = imag = 0 are incompatible classes, despite that their guts are identical. >> _debug = 0 >> >> decl _prune: [String] >> _prune = ['(*)'] >> >> decl find: def(String, optional dir: String) -> [String] >> >> def find(pattern, dir = os.curdir): >> decl list, names: [String], name: String # LINE1 >> list = [] >> names = os.listdir(dir) >> names.sort() >> for name in names: >> decl name, fullname: String # LINE2 > Same question: "name" is redefined from here on? My intent was to illustrate redundant declaration. 
I'm definitely not trying to invent new scoping rules! The semantics would be exactly the same if "name" were absent from either LINE1 or LINE2; it would be an error if e.g. "decl name: Int" appeared in the "for" loop instead; and e.g. it would be illegal to reference fullname before LINE2. > Would this behave (or be as behaviorless) like > the "global" declaration, or lexical, or do > you open a new type scope with "for"? (New > "variable, with C's {} in mind). > The latter cannot be since "for" declared it already. That's right. No new scopes are implied here; just saying something about the names in the current scopes. how-do-we-declare-the-type-of-a-continuation?-ly y'rs - tim From tim_one@email.msn.com Fri Dec 17 11:00:06 1999 From: tim_one@email.msn.com (Tim Peters) Date: Fri, 17 Dec 1999 06:00:06 -0500 Subject: [Types-sig] Module Attribute visibility In-Reply-To: <38592F3D.10A7AFA4@prescod.net> Message-ID: <002601bf487d$e0b16fe0$63a2143f@tim> [Paul] > Okay, but doesn't Python already conflate declaration with > initialization? I don't think so. Its only declaration today is "global"! > When I refer to mymod.foo I am referring to an object that was > assigned, somewhere to the name foo in the module mymod. I don't follow this, unless you think of referring to mymod.foo as "a declaration" -- I don't. It's just a reference. > Are we going to say that statically type checked code can only > refer to declared (not merely assigned) variables in other > modules? I wouldn't say that, although I bet some people will want that as an option. > Would it be safe to say that undeclared variables are simply not > available for type checking? Don't know what "not available" means. Yell at me if it means more than what you're talking about below. > Would you suggest that this is even the case for functions? I.e. > > def foo( str ): return str*2 > > is invisible to the type checker Not invisible, but that its argument and its return type are (in the absence of inference) both associated with the universal set (the set of all types), with rules as sketched earlier (widening bindings only; "int = universal" bad (& whether that's a compile-time error, or *potential* run-time error, or compile-time warning, is an option); "universal = int" good). > until we add: > > decl foo: str -> str > > Or would foo have an implicit declaration: > > decl foo: PyObject -> PyObject Yes, your PyObject appears to be a spelling of what I called the "universal set" above. Although, since this *is* Python, you could probably drop the "Py" prefix without risk of confusion . > And if that foo has an implicit declaration, shouldn't this foo also: > > foo = lambda x: x*2 I can't imagine any reason why it shouldn't! If we're disagreeing here, I don't see how -- unless it's that you believe I mean something by "declaration" and/or "initialization" that I don't. All I was getting at is that decl x: Int = 5 (combining declaration with initialization) is more dubious than decl x: Int x = 5 (leaving declaration (1st line) separate from initialization (2nd line)) in a language where some people clearly don't want to look at type declarations *at all*. Keeping the binding out of the declaration makes it trivial to "comment it out", set an editor mode to suppress displaying "decl" lines, copy decl lines into interface files, and so on. 
different-purpose-different-statement-ly y'rs - tim From guido@CNRI.Reston.VA.US Fri Dec 17 14:32:25 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Fri, 17 Dec 1999 09:32:25 -0500 Subject: [Types-sig] Module Attribute visibility In-Reply-To: Your message of "Fri, 17 Dec 1999 00:37:26 PST." References: Message-ID: <199912171432.JAA12387@eric.cnri.reston.va.us> > Why can't the runtime system violate the expectations? Seriously: I doubt > you can prevent it. Python is simply too dynamic. I'd be surprised if you > could completely stop me from changing sys.version if I want really trying > to do so. Nonsense. You are confusing one particular implementation with what's possible. In JPython, things like this *are* being enforced in an absolute way. If that's what we want, we can do it. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@CNRI.Reston.VA.US Fri Dec 17 14:42:35 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Fri, 17 Dec 1999 09:42:35 -0500 Subject: [Types-sig] Keyword arg declarations In-Reply-To: Your message of "Fri, 17 Dec 1999 01:16:09 PST." References: Message-ID: <199912171442.JAA12404@eric.cnri.reston.va.us> I just realized that Tim's decl syntax that's currently being bandied around doesn't declare the names of arguments. That's fine for a language like C, but in Python, any argument with a name (*args excluded) can be used as a keyword argument. I think it will be useful for the decl syntax to allow leaving out or supplying argument names -- that tells whether keyword arguments are allowed for this particular function. And that is part of a function's signature. (Note that not all builtins support keyword arguments; in fact most don't.) (Un)related: I think it makes sense to be able to restrict the types of *varargs arguments. E.g. eons ago (last week in the types-sig) someone proposed an extension to isinstance() allowing one to write isinstance(x, type1, type2, type3, ...). Clearly the varargs are all type objects here. Not so sure about **kwargs, but these should probably be treated the same way. --Guido van Rossum (home page: http://www.python.org/~guido/) From gstein@lyra.org Fri Dec 17 14:54:32 1999 From: gstein@lyra.org (Greg Stein) Date: Fri, 17 Dec 1999 06:54:32 -0800 (PST) Subject: [Types-sig] Keyword arg declarations In-Reply-To: <199912171442.JAA12404@eric.cnri.reston.va.us> Message-ID: On Fri, 17 Dec 1999, Guido van Rossum wrote: > I just realized that Tim's decl syntax that's currently being bandied > around doesn't declare the names of arguments. That's fine for a > language like C, but in Python, any argument with a name (*args > excluded) can be used as a keyword argument. > > I think it will be useful for the decl syntax to allow leaving out or > supplying argument names -- that tells whether keyword arguments are > allowed for this particular function. And that is part of a > function's signature. Shouldn't be hard to add these names. IMO, the syntax for functions in a typedecl should look just like the "def" syntax (which should be updated to allow typedecls). >... > (Un)related: I think it makes sense to be able to restrict the types > of *varargs arguments. E.g. eons ago (last week in the types-sig) > someone proposed an extension to isinstance() allowing one to write > isinstance(x, type1, type2, type3, ...). Clearly the varargs are all > type objects here. > > Not so sure about **kwargs, but these should probably be treated the > same way. 
Shouldn't be a problem: def foo(bar, *args: [Int], **kw: {String: Float}) -> None: ... Cheers, -g -- Greg Stein, http://www.lyra.org/ From skip@mojam.com (Skip Montanaro) Fri Dec 17 17:26:46 1999 From: skip@mojam.com (Skip Montanaro) (Skip Montanaro) Date: Fri, 17 Dec 1999 11:26:46 -0600 (CST) Subject: [Types-sig] doc-sig/types-sig clash? Message-ID: <14426.29270.837380.905106@dolphin.mojam.com> One of the proposals regarding typing seems to be inserting stuff into doc strings. It is perhaps worth noting for those who don't subscribe to the doc-sig that that bunch of ne'er-do-wells (I mean esteemed colleagues) also has their eyes on the doc string (imagine that!). Just raising a small flag to make sure people don't assume the doc string is their private sandbox. Apologies if this has already been addressed. I'm still wading through all the recent Python activity. Skip From skip@mojam.com (Skip Montanaro) Fri Dec 17 17:32:45 1999 From: skip@mojam.com (Skip Montanaro) (Skip Montanaro) Date: Fri, 17 Dec 1999 11:32:45 -0600 (CST) Subject: [Types-sig] RFC 0.1 In-Reply-To: <000601bf482c$6e075b40$63a2143f@tim> References: <199912141633.LAA23558@eric.cnri.reston.va.us> <000601bf482c$6e075b40$63a2143f@tim> Message-ID: <14426.29629.400734.628570@dolphin.mojam.com> Tim> Optimizing compilers do this routinely under the covers, where the Tim> (misnamed in this case) "verification" code simply branches to a Tim> slower all-purpose translation of the code if the assumptions turn Tim> out to be false at runtime. As does/did Self (without explicit type declarations)... ;-) Skip From faassen@vet.uu.nl Fri Dec 17 17:32:45 1999 From: faassen@vet.uu.nl (Martijn Faassen) Date: Fri, 17 Dec 1999 18:32:45 +0100 Subject: [Types-sig] New syntax? In-Reply-To: <3859AD3F.CFB0ED0A@prescod.net> References: <38596586.63D9E29F@vet.uu.nl> <3859AD3F.CFB0ED0A@prescod.net> Message-ID: <19991217183245.A12905@vet.uu.nl> Paul Prescod wrote: > Martijn Faassen wrote: > > > > ... > > Didn't you think parameterized types looked fairly straightforward in my > > syntax proposal? > > I must have missed something. Could you show me how to do Btree of X and > then make concrete types Btree of Int and Btree of Functions From String > to Int? I don't know enough about Btrees to give a good example of that, but this would be the idea: # declclass, decldef to define classes and functions of same name declclass Test: whatevertype: param def __init__(self, whatevertype): self.data: whatevertype def getvalue(self): result: whatevertype # need to come up with new typedefinition syntax here # for classes this isn't necessary as a class definition should be a type # definition, but for functions it may be necesary. This needs to be # thought out. I think typedeffing functions when necessary is better than # devising some syntax to declare a function inline in a parametric # type instantiation. typedef Func(string): result: int foo: Test(int) bar: Test(Func) baz: Test(Test(int)) Something like that. The keywords aren't ideal yet, but the syntax is fairly Pythonic. By the way, I thought of an alternative syntax that might be more Pythonic as it cuts down on the use of ':' declclass Foo: def __init__(self, value=[int]): self.data = [int] self.moredata = string def dosomething(self, one=int, two=string): three = [int] return int This one might be more Pythonic and I think I'll advocate this syntax from now on. :) Transform to external interface by removing all type assignments to local variables. 
Transform to interface without exposing member data by removing all type assignments in method bodies. Did you all notice how the terms 'type assignment' and 'type instantiation' nicely map to Pythonic syntax? Regards, Martijn From tismer@appliedbiometrics.com Fri Dec 17 17:36:21 1999 From: tismer@appliedbiometrics.com (Christian Tismer) Date: Fri, 17 Dec 1999 18:36:21 +0100 Subject: [Types-sig] A challenge References: <002501bf4878$d2faf240$63a2143f@tim> Message-ID: <385A7495.D7A25EC4@appliedbiometrics.com> Tim Peters wrote: > > [Christian Tismer] > > ... > > Just a question, please: > > > >> import fnmatch > >> import os > >> > >> decl _debug: Int # but Boolean makes more sense; see below > > > > Is this meant to be lexically true in the globals scope from > > here on? > > I'm not sure I grasp the question. The implicit model is the "global" stmt: > a type declaration applies to the entire compilation unit in which it > appears (module, class or def), and referencing a declared name before the > declaration should be verboten. "global" currently allows redundant > declarations, and you get a Fabulous Prize for later noticing the instance > of redundant type declarations that I snuck into one of the examples . Understood. There is always just one possible type in a scope, and if defined again, it has to match. [name equivalence] That sounds very right, since it allows to create different things even if they look the same from structure. You get more strength in error checking, since using the parameter in the wrong context can be detected even if a foo's components look like a bar's. ... > >> decl list, names: [String], name: String # LINE1 > >> decl name, fullname: String # LINE2 > > > Same question: "name" is redefined from here on? > > My intent was to illustrate redundant declaration. I'm definitely not > trying to invent new scoping rules! The semantics would be exactly the same > if "name" were absent from either LINE1 or LINE2; it would be an error if > e.g. "decl name: Int" appeared in the "for" loop instead; and e.g. it would > be illegal to reference fullname before LINE2. Isn't this in conflict with one of your earlier posts where you wanted the same variable to take different types in sequence? I found that example very clean. You assigned a dict's keys() tot he variable which held the dict. Is this idea gone? > how-do-we-declare-the-type-of-a-continuation?-ly y'rs - tim Let's see :-) PythonWin 1.5.42c1 (#0, Dec 15 1999, 01:48:37) [MSC 32 bit (Intel)] on win32 Copyright 1991-1995 Stichting Mathematisch Centrum, Amsterdam Portions Copyright 1994-1999 Mark Hammond (MHammond@skippinet.com.au) >>> import continuation >>> co = continuation.caller() >>> co >>> type(co) >>> co.__doc__ "I am a continuation object, Deleting 'link' kills me." >>> callable(co) 1 >>> I think the type of a continuation is Continuation. Thanks for the good question - ciao - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! 
Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From skip@mojam.com (Skip Montanaro) Fri Dec 17 17:40:29 1999 From: skip@mojam.com (Skip Montanaro) (Skip Montanaro) Date: Fri, 17 Dec 1999 11:40:29 -0600 (CST) Subject: [Types-sig] Module Attribute visibility In-Reply-To: <385991C6.5B2917B4@prescod.net> References: <385991C6.5B2917B4@prescod.net> Message-ID: <14426.30093.817662.654658@dolphin.mojam.com> Paul> Greg Stein wrote: >> IMO, let's solve static type checking. Leave visibility and >> modification rules to another phase. They are orthogonal problems, >> and we would do well to reduce our problem set (and the amount of >> discussion thereby engendered (my 25 cent word for the day :-)). Paul> They are not orthogonal at all. I can't statically check a file Paul> that uses sys.version unless I know that sys.version has not been Paul> overwritten with a string. We can't allow the runtime system to Paul> violate the expectations of the static type engine. We also don't Paul> want every user of sys.version to need to assert its type. Depending what version of Python we are proposing this for, I think you can punt on the issues of visibility and modification if you allow the programmer to state (perhaps with a command line argument) that the elements of all core modules (sys, os, posix, math, ...) have stable type representations. This allows you (us?) to write a set of declarations for these modules akin to function prototypes in C or class declarations in C++. If that doesn't leave enough wiggle room for some objects, perhaps you need an "Object" declaration that tells the type checker the object is of an indeterminate (my 25-cent word for the day) type. Skip From faassen@vet.uu.nl Fri Dec 17 17:41:40 1999 From: faassen@vet.uu.nl (Martijn Faassen) Date: Fri, 17 Dec 1999 18:41:40 +0100 Subject: [Types-sig] New syntax? In-Reply-To: <001201bf4844$01f56a60$63a2143f@tim> References: <001201bf4844$01f56a60$63a2143f@tim> Message-ID: <19991217184140.B12905@vet.uu.nl> Tim Peters wrote: > [GregS] > > ... > > In fact, I don't even like Tim's notion of declaring a function since a > > "def" is more than adequate for doing that. > > I thought it would be easier to get one new stmt than to modify existing > stmts, and *much* easier to write a dirt-simple tool to strip them out again > (vis a vis Guido's requirement). > > In real life I would certainly prefer annotating "def" stmts directly. I > think a declaration statement needs the *ability* to specify full function > signatures, though; e.g., > > decl handlerMap: {String: def(Int, Int)->Int} > > handlerMap = {"+": lambda x, y: x+y, > "*": lambda x, y: x*y, > ... > } I think inline type declarations like def(Int, Int)->Int may not be necessary if you allow typedefs. People often give the advice to avoid Lambdas in Python anyway; why not avoid a lambda like construct in our type definition language as well? typedef Footype(int, int): return int var handlermap = {string: Footype} > In either case, I'm not sure what to do about varargs (the "*rest" form of > argument). Me neither. Perhaps something like: decldef foo(first=int, second=string, *[int]): return int i.e. all the extra arguments must be ints. Note that I'm currently in the out-of-line camp with Paul. 
:) Regards, Martijn From faassen@vet.uu.nl Fri Dec 17 18:10:05 1999 From: faassen@vet.uu.nl (Martijn Faassen) Date: Fri, 17 Dec 1999 19:10:05 +0100 Subject: [Types-sig] Type annotations In-Reply-To: <002401bf4876$327173a0$63a2143f@tim> References: <38592CD4.963694B0@prescod.net> <002401bf4876$327173a0$63a2143f@tim> Message-ID: <19991217191005.C12905@vet.uu.nl> Tim Peters wrote: > BTW, Python's actual "map" function is quite a puzzle to describe! I can't > do it: > > map: def(def(_A)->_B, Seq(_A) -> [_B] | > def(def(_A, _B)->_C, Seq(_A), Seq(_B)) -> [_C] | ... > > is just the start of an unbounded sequence of legit map signatures. Haskell > avoids the difficulty here thanks to currying. I can't do it either, but for the simple version of the function, isn't this more readable (I already notice a typo involving a closeing ) in yours!): # another variety on my proposal presented here :) decl: typedef Func(_A): return _B decl: typedef map(Func(_A, _B), [_A]): return [_B] actualmap = map(_A=int, _B=string) But perhaps it's just a matter of getting used to things. Regards, Martijn From tony@metanet.com Fri Dec 17 18:37:06 1999 From: tony@metanet.com (Tony Lownds) Date: Fri, 17 Dec 1999 10:37:06 -0800 (PST) Subject: [Types-sig] A challenge In-Reply-To: <000501bf4779$5e566b40$58a2143f@tim> Message-ID: Here is another syntactical variant. ------------------------------------ import sys, find def main() -> None: #Note: I have to rename the "list" variable here, because list is # used as a type a_list: list of str name: str dir:= "." #Note: type implied from the literal. if sys.argv[1:]: dir = sys.argv[1] a_list = find.find("*.py", dir) a_list.sort() for name in a_list: print name if __name__ == "__main__": main() ---------------------------------------------- import fnmatch import os _debug: = 0 _prune: list of string = ['(*)'] def find(pattern: str, dir: str = os.curdir) -> list of str: #Note: again, renaming the var named "list" a_list: list of str = [] names:= os.listdir(dir) #Note: asking for type to be implied here names.sort() for name:str in names: if name in (os.curdir, os.pardir): continue #Note: not asking for fullname to be typed; its usage should be # easy to type check fullname = os.path.join(dir, name) if fnmatch.fnmatch(name, pattern): a_list.append(fullname) if os.path.isdir(fullname) and not os.path.islink(fullname): for p:str in _prune: if fnmatch.fnmatch(name, p): if _debug: print "skip", `fullname` break else: if _debug: print "descend into", `fullname` a_list = a_list + find(pattern, fullname) return a_list #---------------------------------------------------------------------- import re #Note: I'm showing dictionaries' types are declared using a literal (ie # {str: int}) rather than a parameterized type name (list of int) because # a) the consistent choice for the name of a dictionary ("dict") doesnt # exist in python right now # b) actually tuples and lists can be declared using a literal, but that # would declare a tuple/list of exactly that size. # # also note that RegexObject is in a module and is accessed as such. _cache:{str:re.RegexObject} = {} # Declaring all the function signatures in a block here, to follow Tim's # format. # # The reason that "declare" is being used is that if I simply declared the # type of the variable just like the other local variable then when the # type checker gets to the actual def statement, which is really just an # assignment, it should raise an error if it cant determine that the new # value assigned does not match the definition. 
# # Also I am assuming a "bool" type. The corresponding builtin # function "bool" (str, int, float, etc. all have corresponding builtin # functions and I think that is a Good Thing) could in essence be: # # def bool(value): # if any: # return 1 # else: # return 0 # declare: fnmatch: (str, str) -> bool fnmatchcase: (str, str) -> bool translate: str -> str # showing this one again with parameter names; without them you # restrict the users of this function from using a keyword calling syntax. declare: fnmatchcase: (name:str, pat:str) -> bool def fnmatch(name, pat): import os name = os.path.normcase(name) pat = os.path.normcase(pat) return fnmatchcase(name, pat) def fnmatchcase(name, pat): if not _cache.has_key(pat): res:str = translate(pat) _cache[pat] = re.compile(res) return _cache[pat].match(name) is not None def translate(pat): #Note: the next line was originally: i, n = 0, len(pat) # I had to add parens to use the implied type sugar and its not as # easy to read. Which makes me wonder if that bit of sugar is a wart. (i, n) := 0, len(pat) res := '' while i < n: # Note: introducing chr as a type c:chr = pat[i] i = i+1 if c == '*': res = res + '.*' elif c == '?': res = res + '.' elif c == '[': j:int = i if j < n and pat[j] == '!': j = j+1 if j < n and pat[j] == ']': j = j+1 while j < n and pat[j] != ']': j = j+1 if j >= n: res = res + '\\[' else: stuff:str = pat[i:j] i = j+1 if stuff[0] == '!': stuff = '[^' + stuff[1:] + ']' elif stuff == '^'*len(stuff): stuff = '\\^' else: while stuff[0] == '^': stuff = stuff[1:] + stuff[0] stuff = '[' + stuff + ']' res = res + stuff else: res = res + re.escape(c) return res + "$" Thats it. Here is a point-by-point summary, so you can quickly point out what you dislike (and if a few people do so off-line then I'll post a summary and maybe save a bit of traffic): - types are either classes or builtin type names or type expressions. - the builtin functions involving types are overloaded when used in type expressions. - "bool" is a new type and a new builtin - parameterized types are instantiated (er, whats the real term for this?) with the "of" operator. - dictionaries of any size are defined by a literal dictionary syntax - lists and tuples of exact size are defined by a literal tuple syntax - callables are shown with the -> operator - the arguments to a signature is not a tuple; you can have names and * and ** keywords. - you can omit the type expression altogether if it is an assignment from a well-known function (e.g. builtin) or a simple literal (e.g. 0, '') - variables created by for loops are typed in-line - if you are specifying a signature for a function that occurs later it should be in a declare block - using "list", "int", etc. for variable names potentially shadows the builtin function and the type name. -Tony From da@ski.org Fri Dec 17 18:53:18 1999 From: da@ski.org (David Ascher) Date: Fri, 17 Dec 1999 10:53:18 -0800 Subject: [Types-sig] Keyword arg declarations References: <199912171442.JAA12404@eric.cnri.reston.va.us> Message-ID: <00c101bf48bf$fb9850c0$c355cfc0@ski.org> From: "Guido van Rossum" > I just realized that Tim's decl syntax that's currently being bandied > around doesn't declare the names of arguments. That's fine for a > language like C, but in Python, any argument with a name (*args > excluded) can be used as a keyword argument. This brings to mind a point which may or may not be relevant. 
Sometimes Python users use some tricks to do 'deferred' type checking and other argument manipulation, because the syntax doesn't allow one to specify the interface which one needs. An example of such a signature is familiar to all is the signature for range(). The docstring for range reads: range([start,] stop[, step]) -> list of integers which is not expressible with the current syntax. A Python version of range would have to do, much like NumPy's arange does, def range(start, stop=None, step=1): if (stop == None): stop = start start = 0 Now, the builtin typechecker can of course be told about __builtin__.range's signature peculiarities, but is there any way we can address the more general problem? Or is it, as I suspect, rare enough that one can ignore it? > (Note that not all builtins support keyword arguments; in fact most > don't.) And a shame it is, IMO. Would it make sense to consider for 2.0 a mechanism which allows keyword arguments almost by default? That way I could do pickle.dump(object=foo, file=myfile) and never have to worry about which came first... --david From gstein@lyra.org Fri Dec 17 19:02:04 1999 From: gstein@lyra.org (Greg Stein) Date: Fri, 17 Dec 1999 11:02:04 -0800 (PST) Subject: [Types-sig] typedefs (was: New syntax?) In-Reply-To: <19991217184140.B12905@vet.uu.nl> Message-ID: On Fri, 17 Dec 1999, Martijn Faassen wrote: > Tim Peters wrote: > > [GregS] > > > ... > > > In fact, I don't even like Tim's notion of declaring a function since a > > > "def" is more than adequate for doing that. > > > > I thought it would be easier to get one new stmt than to modify existing > > stmts, and *much* easier to write a dirt-simple tool to strip them out again > > (vis a vis Guido's requirement). > > > > In real life I would certainly prefer annotating "def" stmts directly. I > > think a declaration statement needs the *ability* to specify full function > > signatures, though; e.g., > > > > decl handlerMap: {String: def(Int, Int)->Int} > > > > handlerMap = {"+": lambda x, y: x+y, > > "*": lambda x, y: x*y, > > ... > > } > > I think inline type declarations like def(Int, Int)->Int may not be necessary > if you allow typedefs. People often give the advice to avoid Lambdas in Python > anyway; why not avoid a lambda like construct in our type definition language > as well? > > typedef Footype(int, int): > return int > > var handlermap = {string: Footype} I see typedefs as a way to associate a typedecl with a name. In your example here, I'm not sure how to do a typedef of something like List. You seem to have pegged typedef to only do function typedefs. Per the GFS proposal, I would recommend that "typedef" is a unary operator keyword. The operand is a typedecl, in the form that we see to the right of a "decl" statement. The result of the operator is a typedecl object. This typedecl object can, of course, be used in further typedecl constructions. For example: HandlerMapType = typedef {String: def(Int, Int)->Int} decl std_handlers: [HandlerMapType] def foo(m: HandlerMapType)->Int: ... In any case, I think using "def" inline to define a function typedecl is fine. A typedef is merely used to create an alias, to clarify a later declaration. Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Fri Dec 17 19:52:07 1999 From: gstein@lyra.org (Greg Stein) Date: Fri, 17 Dec 1999 11:52:07 -0800 (PST) Subject: [Types-sig] parameterized typing (was: New syntax?) In-Reply-To: <19991217183245.A12905@vet.uu.nl> Message-ID: Hrm. Looks like I deleted Paul's original note. 
I've got some ideas now on how to do parameterized types, so I'll just piggyback on Martijn's :-) On Fri, 17 Dec 1999, Martijn Faassen wrote: > Paul Prescod wrote: > > Martijn Faassen wrote: > > > ... > > > Didn't you think parameterized types looked fairly straightforward in my > > > syntax proposal? > > > > I must have missed something. Could you show me how to do Btree of X and > > then make concrete types Btree of Int and Btree of Functions From String > > to Int? The first issue to handle with parameterized classes is how to specify the parameter(s) to be used in your class definition. Tim's "decl" keyword comes into play here. So lets jump in with definition-by-example: class Btree: decl param _T: any # the parameter can have any type def insert(self, value: _T) -> None: ... def asList(self) -> [_T]: ... Now that we have defined and declared our Btree class, we can parameterize it in later declarations: decl tree1: Btree(Int) # Paul's first request decl tree2: Btree(def(String)->Int) # Paul's second request tree1.insert("foo") # causes type-check error; we're passing wrong type l = tree1.asList() # we know that is [Int] Other examples of param declarations might be: decl param _T: Int or String or Tuple decl param _S: [Int] or [Float] It is interesting to note that the above lines are almost exactly the same as doing: _T = typedef Int or String or Tuple _S = typedef [Int] or [Float] The only difference is that (in the decl case) Python understands that _T and _S are type-substitution parameters. If a true typedef was used in the Btree class definition, then we would simply end up with a non-parameterizable class that had "any" in some of its declarations. Note that the compiler *does* treat the declarations using the typedef notion: it can perform optimizations, type checks, and other stuff as if "any" was used. To be clear: class Foo: decl param _T: Int or Float def bar(self, value: _T) -> _T: return 2 * value In the above code, the compiler uses _T as a typedef and optimizes the "2 * value" line, knowing that "value" is either an Int or a Float. Runtime checks are present, as usual. The "decl param" is only important to *users* of class Foo -- the _T becomes part of Foo's interface and type substitutions can be made. ---- Point (1) (this will be important later) One issue is that Btree has no (runtime) argument checks in the above example. "any" is allowed for the insert() parameter, which effectively means "no type check." In other words, Btree is still an abstract implementation; the concreteness is only present in compile-time type checks. Recall my previous note regarding type declarators. Note that a type declarator object can be a type, a class, an interface, or a composite. For example: t1 = typedef Int # typedecl of a type t2 = typedef Btree # typedecl of a class t3 = typedef Sequence # typedecl of an interface t4 = typedef Btree(Int) # typedecl of a composite I might suggest that, in the same way instances are created, we can create concrete classes through the use of type declarators: BtreeInt = typedef Btree(Int) tree = BtreeInt() In other words, typedecl objects are callable. A typedecl that is a class or a parameterized class will instantiate an object. So what does "concrete class" actually give you? (note that I mean something other than the type checks mentioned above) I think a concrete class would be an on-the-fly constructed subclass of the abstract class. 
Each method would be overridden: a method arg runtime check is done, then a call to the abstract method if performed. For example: class __compiler_built_Btree_Int(Btree): def insert(self, value: Int)->None: return Btree.insert(self, value) def asList(self)->[Int]: return Btree.asList(self) The notion of "concrete class" (to me) simply means the addition of runtime checks to enforce the type constraint. Theoretically, the system could recompile the Btree class and perform various optimizations. But: I don't think that is really possible, given Python's model (the source may not be readily available, and there isn't a way for the compiler to reach into the middle of a source file to grab the Btree class source and rebuild a concrete version). Given that we have compile-time checks with the simplest notion of parameterized types, I don't see the runtime checks offering a whole lot more. Especially for the complexity involved. I think I'll take Tim's tack here: the notion of on-the-fly building concrete classes won't work; the example above is a before-somebody- suggests-it counterproof. So, I would say everything above Point (1) is valid. Everything below should not be dealt with. Paul: does this sufficiently address your desire for parameterized types? Others: how does this look? It seems quite Pythonic to me, and is a basic extension of previous discussions (and to my thoughts of the design). Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Fri Dec 17 20:04:37 1999 From: gstein@lyra.org (Greg Stein) Date: Fri, 17 Dec 1999 12:04:37 -0800 (PST) Subject: [Types-sig] Keyword arg declarations In-Reply-To: <00c101bf48bf$fb9850c0$c355cfc0@ski.org> Message-ID: On Fri, 17 Dec 1999, David Ascher wrote: > From: "Guido van Rossum" > > I just realized that Tim's decl syntax that's currently being bandied > > around doesn't declare the names of arguments. That's fine for a > > language like C, but in Python, any argument with a name (*args > > excluded) can be used as a keyword argument. I responded to this elsewhere; I believe we can easily declare varargs and keywords with an unambigous syntax. >... > An example of such a signature is familiar to all is the signature for > range(). The docstring for range reads: > > range([start,] stop[, step]) -> list of integers > > which is not expressible with the current syntax. A Python version of range > would have to do, much like NumPy's arange does, I believe it is expressible: def range(start: Int, stop=None: Int or None, step=1: Int) -> [Int]: ... The only caveat is that somebody could do: range(3, None, 1) Which (technically) is not supposed to be allowed. >... > Now, the builtin typechecker can of course be told about __builtin__.range's > signature peculiarities, but is there any way we can address the more > general problem? Or is it, as I suspect, rare enough that one can ignore > it? Well, the above isn't necessarily the prettiest, but it *is* possible with at least one proposal for syntax extensions. I believe this kind of argument funkiness is pretty rare and we don't need to provide any special handling or consideration for the problem. >... > > (Note that not all builtins support keyword arguments; in fact most > > don't.) > > And a shame it is, IMO. Would it make sense to consider for 2.0 a mechanism > which allows keyword arguments almost by default? That way I could do I think the builtins need to start using PyArg_ParseTupleAndKeywords(). 
I seem to recall that there have been problems with that function in the past, but I think they've been cleared up in the 1.5 series. That should keyword-enable those functions... Cheers, -g -- Greg Stein, http://www.lyra.org/ From skaller@maxtal.com.au Fri Dec 17 22:06:54 1999 From: skaller@maxtal.com.au (skaller) Date: Sat, 18 Dec 1999 09:06:54 +1100 Subject: [Types-sig] Type Inference I References: Message-ID: <385AB3FE.AE6C2630@maxtal.com.au> I've just been reading the December Types-Sig archive. Wow, from an almost dead SIG ... I've got a lot of comments, and I'm not sure if to roll them all into one post, or lots of small ones. So I'll try a small one, to see if I get suitably flammed :-) Proposition: it is possible to do whole program type analysis of existing Python programs, and generate optimised code. Evidence: JimH has in fact done this, and claimed some astounding (almost unbelievable) results. I have done some 'micky mouse' testing, and found that in some cases, it can indeed work. So I think the proposition is proven that it can be _done_, but how effective is it? What obstacles stand in the way? Here are some thoughts. OBSTACLES TO TYPE INFERENCE IN PYTHON, I: POOR SPECIFICATIONS ---------------------------------------------------------- First, the biggest obstacle to doing this is the state of the current language definition! The exclamation mark is there because I know you didn't expect this. [You thought that optional type declarations would be the biggest help] Of course, Guido (and ONLY guido) could go ahead and do some type inferencing, and then 'explain' some extra restrictions on Python. But no other third party could do this, and claim to faithfully compile Python. I've presented this argument before, but I'll give it again anyhow. At present, there is no such thing as an 'erroneous' or 'ill formed' python program. Every single file on your computer is a correct python program. If you run python on some arbitrary file, and it throws a SyntaxError, then the program is correct: it has well defined semantics, namely to raise a SyntaxError. Perhaps you think this example is extreme (it is, deliberately), but when we come down to compiling files which are more 'obviously' python, this issue almost completely prevents any optimisation by type inference -- and that is why the most important change for Python is to very carefully define what stuff is NOT correct python. It is very important that in these cases, the language specification does NOT say an exception is raised because that is _precisely_ what the problem is. Raising an exception is well specified behaviour, and when it happens according to specification, the client code which causes it to happen is CORRECT python -- precisely because the behaviour is specified. For example, consider: x = 1 y = open("something") try: x + y except: print "OK" This code is CORRECT python at the moment (AFAIK). It is NOT 'illegal' to add a file and an integer, it is perfectly correct to do it, and then handle the resulting exception. There is no hope for any kind of type inference until this is fixed. What must be said is that this case is an error, and that Python can do anything if the user does this: the result of executing the code MUST be undefined. The fact that a particular implementation throws an exception, is good behavour on the part of that particular implementation, but it must NOT be required -- because that would prevent a compiler rejecting the program. 
FIXING THE PROBLEM -- WITHOUT DOING TOO MUCH WORK ------------------------------------------------- It is relatively easy to fix SyntaxError: it is easy to say that a 'python' program is one that (at least) conforms to the grammar. It is not so easy to fix all the other cases, because in _some_ of these cases, we would want 'undefined' behaviour, and in other cases, we would actually _want_ to throw an exception. For example, it is common to say this: try: import X except ImportError: import Y and many people think this is reasonable, and changing the existing semantics would break a lot of existing code. It is possible to manually look at EVERY possible place in the language specification, and decide in which cases an exception must be raised, and in which cases the _program_ is plain wrong, and the result is undefined -- and perhaps go on to specify implementation details such as "The current CPython implementation raises an XXX exception here". But that is a lot of work. The example of "SyntaxError" suggests an alternative: we examine exceptions directly, and specify which ones must be thrown, and which ones are the result of an invalid program. This _might_ require some reworking of the exception tree (I can't spell heirarchy :-) The way to do this is to pretend you are a compiler implementor for Python, and want to know which things you MUST generate code for, and which things you will either assume (and let the clients program go haywire or coredump if the assumption is violated), or which things you can reject at compile time as WRONG. For example, os errors clearly require runtime support: they're not (necessarily) program errors. On the other hand , TypeError and ValueError are difficult. Here's why: when the top level catches a type error, it is almost certainly a program error. But in the case: def f(arg): try: return "<" + arg + ">" except TypeError: return repr(arg) the client is doing something sensibly pythonic . Here, a much more complex rule may be required: [Badly worded WRAPPING rule} ---------------------------- "If an operator has 'invalid' argument combinations, then, unless it is wrapped in a lexically enclosing try block which catches a TypeError, the program is ill formed." In other words, if you really want to catch a TypeError when evaluating some expression, you are REQUIRED to make sure it is lexically enclosed in a try block which explicitly catches the TypeError. The reason is clear -- a type inference algorithm can note the try/except block here, and generate code that does type checking. But in the case of plain old: "<" + arg + ">" _without_ lexical enclosure, the type inference is entitles to assume that 'arg' is a string. And then, when the client calls the function containing this fragment, a compiler may reject the program because it can see that 'arg' is not a string, as required by the semantics. The key word in that sentence is 'required'. At present, the semantics don't really _require_ anything, and so optimising python by local type inference is impossible. [This does not prevent whole program analysis, but it does make the results much less effective] Finally, I note that contrary to my simplified assumptions, the current reference DOES in fact say, in places, that something is 'illegal' (or whatever). The contention of this particular article is that this is the first, and most important, work that can be done by people wanting to compile python to efficient code: clean up the semantics. 
In many cases, I feel Guido will already have an opinion on what is 'intended' to be an error in the program, and what is 'intended' to throw an exception: that is, I think Guido could sensibly resolve some of the cases that other SIG members disagreed on, or found difficult. Here is the 1.5.2 exception tree, with comments: Exception(*) | +-- SystemExit +-- StandardError(*) | +-- KeyboardInterrupt NO! This should be TOP LEVEL +-- ImportError RAISED ONLY IF WRAPPED, ELSE ERROR **** +-- EnvironmentError(*) MUST BE RAISED IN COMPILED CODE | | | +-- IOError | +-- OSError(*) | +-- EOFError MUST BE RAISED IN COMPILED CODE +-- RuntimeError | | | +-- NotImplementedError(*) UNDEFINED BEHAVIOUR | +-- NameError UNDEFINED BEHAVIOUR (except in 'exec, eval') +-- AttributeError UNDEFINED (use getattr to get defined behaviour) +-- SyntaxError UNDEFINED +-- TypeError UNDEFINED UNLESS WRAPPED LOCALLY +-- AssertionError UNDEFINED +-- LookupError(*) UNDEFINED UNLESS WRAPPED LOCALLY | | | +-- IndexError | +-- KeyError | +-- ArithmeticError(*) UNDEFINED | | | +-- OverflowError | +-- ZeroDivisionError | +-- FloatingPointError | +-- ValueError UNDEFINED UNLESS WRAPPED +-- SystemError UNDEFINED +-- MemoryError UNDEFINED The outcome of this is that really, the only times python guarrantees to raise an exception is for environment errors, or for typing/indexing/lookup errors which are locally wrapped. The biggest problem here is clearly ImportError, since it cannot be raised _unless_ the importer catches an exception -- and the only possible exception it could catch would be an environment error, which is unlikely, except if the module cannot be found on the file system. I'm NOT sure I like the wrapping rule. An alternative is just to say that the client is required to test, instead of relying on exceptions. But we need _something_. Comments needed here: this is the hard part (a suitably 'pythonic' rule) ------------------------------------------------------------ ** Keyboard Interrupt: this is wrong wrong wrong!! Ocaml does this too. The reason is, that Ctrl-C (or whatever) can occur _anywhere_ in a program, and catching it therefore clashes with: try: something except: handler Here, 'except' is probably intended to catch SYNCHRONOUS errors caused by code in 'something', but it inadvertantly catches a KeyboardInterrupt as well. SOLUTION: At least, KeyboardInterrupt should be a top level exception, or, better, divide the exception tree into two kinds: synchronous and asynchronous. At least then the programmer can write: try: something except SynchronousException: handler which will still let the Ctrl-C escape through to the top level and kill the program (or, get explicitly handled). A better solution, perhaps, is to get rid of Keyboard Interrupt altogether, and handle signals by a different mechanism (perhaps in the signal module, perhaps with core language support) -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From gstein@lyra.org Fri Dec 17 22:58:25 1999 From: gstein@lyra.org (Greg Stein) Date: Fri, 17 Dec 1999 14:58:25 -0800 (PST) Subject: [Types-sig] Type Inference I In-Reply-To: <385AB3FE.AE6C2630@maxtal.com.au> Message-ID: On Sat, 18 Dec 1999, skaller wrote: > ... long post about exceptions and semantic definition ... Sorry John, call me dense, but I really don't see what you're talking about. :-( I don't see a problem with exceptions. That is part of Python. 
I don't see that it causes any problems with type inference, either (it just introduces interesting items into the control/data flow graph). This whole tangent about feeding an email to Python and claiming it is a valid Python program with defined semantics (raise SyntaxError). I understand your explanation, but I totally miss the point. So what? Type inferencing for the "1 + file" case is easy. You know the two types, and you know they can't be added. Bam. Error. And this whole thing about wrapping ImportError or TypeError or whatever... I just don't see your point. It was a long email, but what exactly were you trying to say? "Define the semantics" isn't very clear. I feel Python has very clear semantics. What exactly is wrong with them? Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Fri Dec 17 23:31:40 1999 From: gstein@lyra.org (Greg Stein) Date: Fri, 17 Dec 1999 15:31:40 -0800 (PST) Subject: [Types-sig] I've collected my thoughts... Message-ID: I've grabbed up the various snippets that I've blathered on about and dumped them all into a web page: http://www.lyra.org/greg/python/type-proposal.html The page is weak on semantics discussion, but strict on detail. I've got syntax changes defined, and a (starting?) list of runtime and compile-time semantic (changes). I'll keep adding to it as I think of things and hear back from people. I'm going to try to slow down on this type stuff, though, as I'm going to be starting work on the new import system. Cheers, -g -- Greg Stein, http://www.lyra.org/ From tim_one@email.msn.com Fri Dec 17 23:29:04 1999 From: tim_one@email.msn.com (Tim Peters) Date: Fri, 17 Dec 1999 18:29:04 -0500 Subject: [Types-sig] A lurker's comment In-Reply-To: <19991216190524.16032.qmail@lannert.rz.uni-duesseldorf.de> Message-ID: <000501bf48e6$81a33ce0$32a2143f@tim> [lannert@lannert.rz.uni-duesseldorf.de] > .... > An assignment with lhs: (IntType,), rhs: (NoneType, IntType) > should not be rejected by the interpreter if rhs happens to > be an Int, but by the compiler. ??? If it's rejected by the compiler, the interpreter will never get to see it. See earlier msgs for disussion of "modes". Some people will want a compile-time error on that; others will want a runtime error iff rhs==None obtains; others will want a compile-time warning but not a compile-time error or runtime expense. Does one of those cover your view of the world, or do you have a 4th idea in mind? Note that GregS's "!" operator gives another approach to cases like this (explicit runtime check, spelled in a convenient way but on an expression-by-expression bais). From tim_one@email.msn.com Fri Dec 17 23:39:47 1999 From: tim_one@email.msn.com (Tim Peters) Date: Fri, 17 Dec 1999 18:39:47 -0500 Subject: [Types-sig] What is the Essence of Python? In-Reply-To: Message-ID: <000701bf48e8$00ce5d00$32a2143f@tim> [Eddie, cogently attempting to tell Howard this isn't a movie ] > ... > There are some important `pythonic theses' I saw (by Tim Peters, I > think) but I've lost the bookmark ... ask Tim Peters, they were good. > They might come closer to satisfying your criteria of essentiality. They've propagated to surprising places, from Andrew Kuchling's "Python Quotes" page (on the Starship) to Linux News(!). Guido still needs to fill in the 20th, though. 
the-essence-of-stone-is-the-quality-of-stoniness-ly y'rs - tim From skaller@maxtal.com.au Fri Dec 17 23:59:24 1999 From: skaller@maxtal.com.au (skaller) Date: Sat, 18 Dec 1999 10:59:24 +1100 Subject: [Types-sig] Type Inference I References: Message-ID: <385ACE5C.17CD5684@maxtal.com.au> Greg Stein wrote: > On Sat, 18 Dec 1999, skaller wrote: > > ... long post about exceptions and semantic definition ... > > Sorry John, call me dense, but I really don't see what you're talking > about. :-( It takes a while to understand the impact of conformance and specifications on semantics .. and that this is not just a matter of language lawyering, but a real, pragmatic, issue. > I don't see a problem with exceptions. That is part of Python. I don't see > that it causes any problems with type inference, either (it just > introduces interesting items into the control/data flow graph). The problem arises roughly as follows: type inference works by examining an expression like: x + 1 and _deducing_ that x MUST (**) be an integer. It cannot be a file, because it isn't allowed to add a file to an integer. But in Python you CAN add a file to an integer. It is perfectly legal, it just throws an exception. Do you see? This means we cannot deduce ANYTHING about 'x' in the example snippet given above. Of course, the _expression_ x+1 can only be an integer, we _can_ deduce that. But that isn't enough. Python is too dynamic. We need more constraints to be able to do effective inference. (**) This example ignores class instances with __add__ methods, to make the argument easier to follow. > This whole tangent about feeding an email to Python and claiming it is a > valid Python program with defined semantics (raise SyntaxError). I > understand your explanation, but I totally miss the point. So what? See above. We cannot infer anything, unless there are rules. That is, there MUST be set of permitted signatures for functions/operators, in order to do inference at all. It is possible to do synthetic (bottom up) type analysis, such as: x = 1 + 1 Here, we know that Int + Int -> Int, and so x (at least at this point in the program) must be an Int. But that is only the 'deductive' part of inference, the 'inferential' part infers the types of _arguments_ from the set of allowable signatures of functions. That is, we must do the inference top down, not just bottom up. > Type inferencing for the "1 + file" case is easy. You know the two types, > and you know they can't be added. Bam. Error. but you're wrong, the result of applying the addition operation is, in fact, well defined: it is NOT an error in the program, it just throws an exception, rather than returning a value. If you throw an exception deliberately, that is hardly an error, is it? > It was a long email, but what exactly were you trying to say? "Define the > semantics" isn't very clear. I feel Python has very clear semantics. What > exactly is wrong with them? There is no distinction made between 'incorrect' code, and 'correct' code for which an exception is thrown. In compiled code, we need the distinction, because there is a lot of overhead in doing the dynamic type checking required to throw the exception. The whole point of compilation is to eliminate the overhead of run time type checking. This can be done if we know what the type of an expression MUST be, but the current semantics don't allow this in enough cases. 
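To make the bottom-up half of that concrete, here is a toy synthesizer
(a sketch only: it works over a made-up tuple encoding of expressions,
not real Python parse trees, and it ignores classes with __add__):

    # An "expression" here is an Int constant, a String constant, or a
    # tuple ('add', left, right).  Entirely made up, for illustration.
    def synth(expr):
        if type(expr) is type(0):
            return "Int"
        if type(expr) is type(""):
            return "String"
        if type(expr) is type(()) and expr[0] == 'add':
            lt, rt = synth(expr[1]), synth(expr[2])
            if lt == "Int" and rt == "Int":
                return "Int"
            if lt == "String" and rt == "String":
                return "String"
            # The current semantics do NOT let us call this an error:
            # the program may be relying on the TypeError being raised.
            return "Any"
        return "Any"

    # synth(('add', 1, 1))  -> 'Int'
    # synth(('add', 1, "")) -> 'Any', not 'error'

The fall-through is the whole problem: for the mixed case the honest
answer is "could be anything, and might raise", which helps the code
generator not at all.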
Let me invent a temporary syntax, in which there is no exception throwing, instead, the value returned by an expression is either 'something nice', or 'exceptional'. This IS what happens now in C code, where NULL represents 'exceptional', right? Now, the type of x + 1 can be 'nice or exceptional', which tells us nothing about the type of x. But if the type is 'nice', we know that x must be an integer. And then, the compiler can generate code that doesn't bother to check the result of calling PyAdd(x, &One), because it cannot be NULL. We know PyAdd will return an actual object. I'm sure YOU have done that yourself, writing C extensions. [In fact, we can do better, we can peek into the Int data structure and add 1 to the result, and rewrap it as a PyObject; that is, we can INLINE the PyAdd function, and throw out the cases that cannot occur -- with any luck, within a larger code fragment, we can get rid of PyAdd altogether, and just use 'x++' -- a single machine instruction.] Hope this makes sense: to compile python code effectively, we need to add some reasonable 'static-y' restrictions. Where, 'reasonable' means 'suitably pythonic', but not quite as dynamic as the current CPython 1.5.2 implementation allows. -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From skaller@maxtal.com.au Sat Dec 18 00:32:29 1999 From: skaller@maxtal.com.au (skaller) Date: Sat, 18 Dec 1999 11:32:29 +1100 Subject: [Types-sig] Easier? References: Message-ID: <385AD61D.548EB5F8@maxtal.com.au> Paul Prescod wrote: >> Jim created an implementation of an excellent, intelligent optimizing compiler. His work is as, or more, interesting than mine, but it is a different problem he is trying to solve. (OPT) comes into the picture because my work makes his much, much easier and more effective in many cases. << I think you are wrong, and right: you are wrong that declarations makes inferencing easier, it makes it harder, because there is more syntax to analyse, and so more code to add to the inference engine. But of course, you are right it makes the algorithm more effective, in particular, if it is only applied locally (say, to a module, rather than a whole program). While I'm here, I'd also like to correct a misconception that type inference is 'hard'. Irrespective of the case for static languages like C++ or ML, a Python type inferencer is not harder, but MUCH easier to write. Here's the explanation: for static systems, the inferencer needs to be as complete as possible: the client is going to be pretty pissed off it it crashes, or if it fails to deduce a type (both these thing can happen in the ocaml inferencer). That's because in these languages, it's necessary to deduce the type for compilation to proceed. This is not true in Python. It is not necessary for inferencing to be 'complete' in the sense it is strongly desired for static languages. In Python, the following inferencer is acceptable, if not very good: def infer_type(expr): return "PyObject" This inferencer will not help optimise compiled code, but it is CORRECT. I will call such an inferencer a _conservative_ inferencer: it will always work and always give the correct results, even if they do not help much with optimisation. The inferencer above is probably the first one I will use for Viperc: it will be a brain dead code generator that does nothing more than wrap the equivalent of bytecode instructions into the equivalent C function calls. 
This has to work ANYHOW, as a fallback if the inferencer cannot infer enough. Now, you can try for a better inferencer, and, provided it is conservative, you can probably only get better performance. The point is that you can build a compiler with a lousy inference engine, and improve the inferencer -- and the code generator -- later. All that is required of the inferencer is that it be correct (conservative). The issue then arises, that even a good inferencer will yield poor performing code. That is where 'aggressive' inferencers come into play -- these are inferencers that make extra assumptions, in order to get better deductions. And my 'Type Inference I' post is all about adding extra constraints to Python, so that more aggressive inferencers can be considered conservative -- that is, guarranteed correct according to language specifications. Apart from the exception handling problem, another 'assumption' that inferencers can benefit from is 'freezing': assuming that, after loading, the bindings in a module are immutable. Indeed, Viperi actually allows freezing modules now (i.e. even the interpreter can beneift) but you have to explicitly call a function to do it -- because otherwise the interpreter would be breaking the Python language. I could dispense with that requirement (that the user call the freezing function) if Guido were to mandate "thou shalt not change the binding of a module after it is imported". I'm not asking for that, just trying to explain how important conformance issues and specifications are in optimisation, and in particular, how important it is that certain operations NOT be defined (even by a requirement that an exception be thrown). It is, in fact, fairly true to say that it is the things which are NOT legal, which are the very things which permit optimisation. Perhaps Tim or Guido can explain this better. -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From tim_one@email.msn.com Sat Dec 18 00:46:35 1999 From: tim_one@email.msn.com (Tim Peters) Date: Fri, 17 Dec 1999 19:46:35 -0500 Subject: [Types-sig] Low-hanging fruit: recognizing builtins In-Reply-To: <3859FD66.E47352E@lemburg.com> Message-ID: <000a01bf48f1$56213bc0$32a2143f@tim> [M.-A. Lemburg] > ... > BTW, just to make buying one of those new microwave > ovens more attractive: what is the pystone rating for > the new Athlon and Pentium III chips ? No idea. AMD and Intel both put in new instructions to speed speech recognition, so that's a clear direction for Python's implementation to follow . AMD's solution to lousy cache performance was to add prefetch instructions, allowing the assembly-language programmer to tell the memory system which addresses they *expect* to be reading from "pretty soon". Helps some SR inner loops a lot. That's data, though, not instruction space. eval_code2's problem is that it contains more code than the English language has words . chances-are-it-scales-with-the-clock-rate-ly y'rs - tim From tim_one@email.msn.com Sat Dec 18 00:46:37 1999 From: tim_one@email.msn.com (Tim Peters) Date: Fri, 17 Dec 1999 19:46:37 -0500 Subject: [Types-sig] New syntax? In-Reply-To: Message-ID: <000b01bf48f1$572dc9c0$32a2143f@tim> [Greg Stein] > ... > I guess that does mean that something like: > > decl a: def(Int)->None > > would be possible. e.g. is a member holding a ref to a function > object. If it weren't possible, it would be quite a hole in the type description mechanism! 
> Of course, the type of in this case is no different than: > > def a(Int x)->None: > > It is just that one declares a member and the other declares a method. > There is a subtle difference there :-) I'd say it's exactly as subtle as the difference between Class.f = somefunc and Class().f = somefunc today. BTW, I wouldn't object to requiring that the class/member distinction be explicit. decl class a: ... decl member a: ... If "decl" gets used for more stuff down the road, it could be a real help to make the syntax explicit from the start: ofwhat : 'class' | 'member' | 'var' | 'type' | 'frozen' | ... decl-stmt : 'decl' ofwhat > In fact, these two are probably equivalent: > > decl class a: def(Int)->None > def a(Int x)->None: WRT type, yes, but (of course!) the former is merely a declaration while the latter is the initial stmt of a definition. >> In either case, I'm not sure what to do about varargs (the >> "*rest" form of argument). > What's wrong with: > > decl a: def(Int, *)->Int > decl b: def(Int, **)->Int > decl c: def(Int, *, **)->Int > > I don't see any ambiguity in the grammar there, unless you use "*" > to mean unknown (as Paul once mentioned). I think the unknown type > should be "Any" (or "any"), since it really means "take any type > of value." Yes, Any is good. The problem with * and ** is that people are going to want to express restrictions, like "only strings from here on in" or "all the keyword args must be of int type". Under the theory that things work well if you just don't think about them , decl c: def(Int, *: (String), **: {String: Int})->Int > ... > I'm not sure whether to go for practical or pure. I'm leaning toward the "always explicit" above. Restrictions can always be loosened later if they prove too confining, but tightening a permissive spec is usually impossible. despite-guido's-charming-belief-that-we-could-actually-ban- intractably-magical-namespace-mutation-ly y'rs - tim From tim_one@email.msn.com Sat Dec 18 01:17:27 1999 From: tim_one@email.msn.com (Tim Peters) Date: Fri, 17 Dec 1999 20:17:27 -0500 Subject: [Types-sig] Keyword arg declarations In-Reply-To: <199912171442.JAA12404@eric.cnri.reston.va.us> Message-ID: <000c01bf48f5$a60fbd60$32a2143f@tim> [Guido van Rossum] > I just realized that Tim's decl syntax that's currently being > bandied around doesn't declare the names of arguments. That's > fine for a language like C, but in Python, any argument with a > name (*args excluded) can be used as a keyword argument. I never specified the full syntax, and partly because regurgitating the full arglist syntax at this point would lose the idea to the details! Arglists in Python are complex beasts. > I think it will be useful for the decl syntax to allow leaving out or > supplying argument names -- that tells whether keyword arguments are > allowed for this particular function. And that is part of a > function's signature. Definitely part of the signature. Optional arguments too! Are default *values* also part of the signature? def increment(x, bump=1): ... If this got declared via e.g. decl increment: def(Int, Int=1) -> Int then *call* sites could generate code to build the full argument list appropriately, and invoke a leaner entry point to the eval loop that didn't have to deduce the correct arglist at runtime via an all-purpose algorithm. This could be a valuable time-saving (albeit code-bloating) optimization. 
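Spelled out (a sketch of the intent only -- nothing generates this today):

    n = increment(x)     # what the programmer writes
    n = increment(x, 1)  # what the call site could be compiled as: the
                         # declared default filled in, no runtime arglist
                         # deduction, no keyword machinery

The leaner entry point then just pops exactly two positional arguments.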
OTOH, I have too much abusive code that looks like: def whatever(arg1, arg2, _int=int, _ord=ord): and I've been secretly hoping I could abuse the declaration mechanism to "export" this as decl whatever: def(Any, Any) -> Any This addresses the rare but real errors wherein a caller inadvertently passes "too many" arguments, overwriting one of the speed-hack default args. Then there's also decl yadda: def(Int, =Int) -> None for the case where the 2nd argument is optional but the user doesn't want it treated as a keyword argument. My idea on that one was to say "tough luck -- you need to give the name here". What do (the generic) you say? > (Note that not all builtins support keyword arguments; in fact most > don't.) So fix that . > (Un)related: I think it makes sense to be able to restrict the types > of *varargs arguments. E.g. eons ago (last week in the types-sig) > someone proposed an extension to isinstance() allowing one to write > isinstance(x, type1, type2, type3, ...). Clearly the varargs are all > type objects here. > > Not so sure about **kwargs, but these should probably be treated the > same way. Coincidentally addressed that in an earlier msg. Don't think it's a problem. decl isinstance: def(Any, Type) | def(Any, Type, *: (Type)) although the use of "(thing)" to mean "tuple of things of arbitrary length" does look like a syntax awaiting regret <0.9 wink>, and more than one of us has stuck "extra" parens in declaration examples for clarity. unicode-will-supply-many-more-matching-brackets-ly y'rs - tim From tim_one@email.msn.com Sat Dec 18 01:23:17 1999 From: tim_one@email.msn.com (Tim Peters) Date: Fri, 17 Dec 1999 20:23:17 -0500 Subject: [Types-sig] Keyword arg declarations In-Reply-To: Message-ID: <000d01bf48f6$764ee5a0$32a2143f@tim> [Greg Stein] > ... > Shouldn't be a problem: > > def foo(bar, *args: [Int], **kw: {String: Float}) -> None: > ... Except that *args is a tuple of ints (not a list of 'em). (Int) is really unattractive for this. I know: (!Int). tee-hee-ing-ly y'rs - tim From skaller@maxtal.com.au Sat Dec 18 01:49:27 1999 From: skaller@maxtal.com.au (skaller) Date: Sat, 18 Dec 1999 12:49:27 +1100 Subject: [Types-sig] Type Inference II References: Message-ID: <385AE827.42ECB891@maxtal.com.au> In part I of this, I discussed the issues related to the Python language specification and exception handling and the impact on type inference. Summary: certain operations which can throw exception limit the effectiveness of type inference, but it may be relatively easy to clean up the semantics by specifying that many of the exceptions must be caught locally or the program is in error. In this part, I continue by looking at other parts of the language which may restrict the ability to generate optimised code. TYPE INFERENCE II: SOME SPECIAL CASES ------------------------------------- Freezing. --------- Type inference can be severely inhibited by the fact that Python permits programmers to modify modules after they have been imported. Indeed, it is not only _type_ inference that is affected, but also inability to cache values, or inline functions, that is affected. For example, if a module m contains a variable 'x' and a function f, and we know 'x' is not modified after the module is loaded, then we can replace m.x with the actual value of x. There are a lot of such constants in Python library modules, for example, in socket and errno. A similar argument applies to functions: we cannot inline a function call, if we do not know the body of the function. 
In the case of a call like: m.f(1,2,3) we cannot easily inline f, because we don't know the user hasn't replaced the original f with something else. It may seem at first this issue is independent of 'type inference', but that is not the case. First, consider C and C++, in which 'const' is part of the type system. Second, consider inference applied to an expression y + m.x Here, if we know m.x is an integer -- we don't care about the value -- we can deduce that y is also an integer (ignoring __add__ methods for clarity). But if m.x could be rebound after module loading, we don't know what the type is. Now, I want to explain a bit about how Viperc is going to work. [At least, my current ideas .. which may change :-] Viperc begins by loading modules: it _executes_ the modules using the interpreter Viperi. There's no compilation here! But AFTER the modules are imported dynamically -- with the full power of all the dynamism of the interpreter -- the modules are 'frozen'. The compiler assumes that the modules cannot be tampered with. As a result, Viperc can do type inference on the mainline, since the type of _every_ module attribute is known, and indeed, the values are known too. Indeed, it can inline the functions, and other nice 'global analysis' -- but it is NOT working with the 'source code' or 'AST' for a module: it is working directly with built, run time, objects. Of course, none of this can work, if the mainline is allowed to modify the module contents after the modules are imported: the importing is done at compile time using the interpreter, the modules don't even _exist_ in the generated executable. [There is a more difficult case: freezing classes, and even instances. Perhaps more, later.] Ok, now I'm going to backstep. Assuming all imported modules are frozen after importing is overkill. I think -- not sure -- that we need bit more flexibility than that: the optimisation is massive, but it is useless if it kills sensible semantics. To see what 'sensible semantics' means, I will first consider where freezing is more or less mandatory: the answer is, when you have a threaded implementation or other re-entering code, then modifying global variables is a bad idea. But not everyone is doing that. For a start, it isn't always clear when importing is finished: interscript, for example imports modules 'on the fly' in response to user requests -- this is necessary, because some of the modules represent typesetters (LaTex, HTML etc), and the set of these is 'user installable'. Secondly, it is common to 'patch' functions in modules, just once, in other modules: for example MA-Lemburg's Tools do that. There is an obvious 'semantic equivalence' requirement here so code that doesn't need the extra functionality will still work, but there is still a use for dynamically changing module variables after modules are loaded. What can help here? I don't know. Some ideas include const MYCONSTANT = 1 with a specification that changing a 'const' value after initialisation is an error. For imports, import const MODULE says that changing any attribute of MODULE after importing IN THE CALLING MODULE is an error (this does not stop other importers changing it though .. :-( I'd sure like to see some 'pythonic' ideas here. LOOP VARIABLES -------------- Another case which inhibits optimisation is loop variables. In the loop: for x in y: .... is it allowed to assign to x? What about mutating y? What about mutating x? [Also, an aside: the code for x[1] in y: ... for x.attr in y: ... 
is allowed but I can't see a real use for it. Is there one? Could we simplify the syntax, and require the loop control to be a whole variable, or a tuple of whole variables (recursively), so that the names involved are always bound directly?

The idea with loop optimisation is that we can:

(1) keep the loop index in a register
(2) cache the sequence
(3) generate sequence values for 'range(..)' lazily

Tightening up for loops will break code that does things like:

(1) do extra increments on a loop variable to skip cases
(2) mutate a list while scanning it

but these seem to lead to newbie posts on c.l.p anyhow.

RESTRICTED SCOPE EXTENSION/RENAMING?
------------------------------------

At present it is sometimes necessary to destroy temporary variables like this:

    x = value
    ...
    del x

There are some examples in the standard library; this occurs when a temporary is created doing calculations in a module, but is not meant to be there to use after the module is imported. Another case occurs like this:

    _f = f                  # protect our f
    from MODULE import *    # might destroy f
    f = _f                  # set f back
    del _f                  # get rid of temporary _f

This is ugly. It also makes inference harder, because there is now a control flow issue. One idea I had was this:

    import X as Y       # import X, but name it Y in this module
    from M import x as v

A related idea is:

    import private X
    private x = 1
    private def f(qqq): ...

which makes these names visible in the module, so you can use unqualified lookup within the module, but so that

    MODULE.X   # fails, X is not visible via M
    MODULE.x   # fails
    MODULE.f   # fails

that is, so external clients cannot see X as an attribute of MODULE. This is actually easy to implement in Viper, since it uses a concept of 'environments' for unqualified lookup: a private name would be put into a special private dictionary which is looked up inside the module, but which is not part of the module dictionary. [The 'environment' looks up both dictionaries; qualified access via operator dot does not.]

Yet another idea is genuinely temporary variables:

    temporary x = 1

which are automatically destroyed at the end of module loading.

I note Python currently supports privacy by name mangling, but really, this is a hack: for Python 2, a more sophisticated architecture would be better.

------------

A final comment: it is useful to _implement_ ideas to test them out. It is damn hard to do that with CPython, because it is written in C, unless you have extensive familiarity with the source. Viper, on the other hand, is written in ocaml, and it is MUCH easier to play with extensions (and write compilers) in ocaml than in C. Viper is available for evaluation, if anyone wants to play with it, but you will need to know some ML. OTOH, I will be playing with some extensions anyhow, so if people here have ideas to try out, I might be able to implement them relatively easily. [For example: I extended the grammar to include list comprehensions in 15 minutes; I expect that implementing the semantics will take less than an hour. This will be available in the second alpha.]

One thing that is required, though, is a _concrete_ syntax. I can't implement an idea, even if the semantics are specified, if there is no syntax! I'm very keen to try 'optional type declarations', but I need a definite syntax, not just for the declaration (easy, but also easy to argue about for a long time), but also for the type (I think this is quite hard). [Remember Viper uses an extended type system: a type can be any object!]
-- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From tim_one@email.msn.com Sat Dec 18 01:51:31 1999 From: tim_one@email.msn.com (Tim Peters) Date: Fri, 17 Dec 1999 20:51:31 -0500 Subject: [Types-sig] doc-sig/types-sig clash? In-Reply-To: <14426.29270.837380.905106@dolphin.mojam.com> Message-ID: <001001bf48fa$68c618a0$32a2143f@tim> [Skip Montanaro] > ... > Just raising a small flag to make sure people don't assume the > doc string is their private sandbox. Don't worry, Skip: I'm acutely aware that docstrings are *my* private sandbox, and I won't let these fellows break doctest . determinedly y'rs - tim From tim_one@email.msn.com Sat Dec 18 02:11:00 1999 From: tim_one@email.msn.com (Tim Peters) Date: Fri, 17 Dec 1999 21:11:00 -0500 Subject: [Types-sig] A challenge In-Reply-To: <385A7495.D7A25EC4@appliedbiometrics.com> Message-ID: <001201bf48fd$210ba400$32a2143f@tim> [Christian Tismer] > ... > Isn't this in conflict with one of your earlier posts where you > wanted the same variable to take different types in sequence? > I found that example very clean. You assigned a dict's keys() > to the variable which held the dict. Is this idea gone? Not in *my* code it isn't, but I don't think a type system has to cater to every abuse I can come up with . Greg can handle this fine with expression-based type operators, but when I look at my own code I think name-based type declaration is overwhelmingly the less bothersome approach. This leaves me with several choices; at least: + Don't ask for static typing on code that "cheats" this way. + Declare "result" as a union type; e.g., decl typedef Set(_T) = {_T: Int} ... decl var result: Set(Int) | [Int] + Harass Guido to add a core dlict type . + Use Greg's form of dynamic cast (which appears to me to have real merit whether or not declaration stmts are introduced). + Kill any chance of adding declaration stmts by introducing a maze of bizarre new rules just to cater to line-by-line redeclaration. Note that the last is trickier than it may appear, because the crucial line: result = result.keys() uses result as an [Int] on the LHS but as a Set(Int) on the RHS. So it's wholly unnatural for name-based typing -- and that doesn't bother me a bit. >> how-do-we-declare-the-type-of-a-continuation?-ly y'rs - tim > > Let's see :-) > > PythonWin 1.5.42c1 (#0, Dec 15 1999, 01:48:37) [MSC 32 bit (Intel)] on > win32 > Copyright 1991-1995 Stichting Mathematisch Centrum, Amsterdam > Portions Copyright 1994-1999 Mark Hammond (MHammond@skippinet.com.au) > >>> import continuation > >>> co = continuation.caller() > >>> co > > >>> type(co) > > >>> co.__doc__ > "I am a continuation object, Deleting 'link' kills me." > >>> callable(co) > 1 > >>> > > I think the type of a continuation is Continuation. Hey -- makes *my* life easy . will-spend-the-rest-of-the-night-wondering-where-"1"- came-from-ly y'rs - tim From tim_one@email.msn.com Sat Dec 18 02:45:31 1999 From: tim_one@email.msn.com (Tim Peters) Date: Fri, 17 Dec 1999 21:45:31 -0500 Subject: [Types-sig] New syntax? In-Reply-To: <19991217184140.B12905@vet.uu.nl> Message-ID: <001301bf4901$f3a2f040$32a2143f@tim> [Martijn Faassen] > I think inline type declarations like def(Int, Int)->Int may not > be necessary if you allow typedefs. I like typedefs fine too, but couldn't make sense of a system in which a typedef was *essential* for spelling a concept. 
typedefs are traditionally shorthands for things that may be (at worst) clumsy to spell without them. > People often give the advice to avoid Lambdas in Python anyway; > why not avoid a lambda like construct in our type definition > language as well? Hmm. These have nothing in common with lambdas apart from having an argument list -- as do all functions and methods. Indeed, from the type expression def(Int) -> Int we have no clue whether it's *defined* via a lambda or def. And the declaration should not expose that, so all is well (strictly, the word "lambda" makes marginally more sense than "def" in the above, but I don't want to encourage a lambda mindset ). > typedef Footype(int, int): > return int > > var handlermap = {string: Footype} If I had a lot of binary integer functions to declare, I would probably use a typedef, a la decl typedef BinaryFunc(_T) = def(_T, _T) -> _T decl typedef BinaryIntFunc = BinaryFunc(Int) ... decl var intHandlerMap: {string: BinaryIntFunc} decl var floatHandlerMap: {string: BinaryFunc(Float)} etc. The "deep" problem I have with your "Pythonic" notations is that while Python excels at expressing imperative algorithms, type specification is a purely declarative task. Type *expressions* allow for a convenient, precise and concise calculus of type-specification "equations". As in the above example, the common parts of common patterns can be factored out and resued with ease. This is useful! You're not going to get the same level of expressiveness in an imperative-style Python syntax: it's the right tool for the wrong job. A type-expression sublanguage with one operator ("|") should suffice. [on varargs] > Me neither. Perhaps something like: > > decldef foo(first=int, second=string, *[int]): > return int > > i.e. all the extra arguments must be ints. Hmm! You and Greg both seem to think varargs get implemented as lists . > Note that I'm currently in the out-of-line camp with Paul. :) Aren't we all? This has been an intense week for the Types-SIG! The good news is that Paul must be taking most of it out on his family & not us . i-asked-sinterklaas-and-we're-*all*-getting-nice-presents-ly y'rs - tim From skaller@maxtal.com.au Sat Dec 18 02:56:09 1999 From: skaller@maxtal.com.au (skaller) Date: Sat, 18 Dec 1999 13:56:09 +1100 Subject: [Types-sig] List of FOO References: <000d01bf48f6$764ee5a0$32a2143f@tim> Message-ID: <385AF7C9.7CEE78E5@maxtal.com.au> Martin wrote, in reponse to Paul: >>1. This system is supposed to be extensible, right? So I could, for >> instance, define a binary tree module and have "binary trees of ints" >> and "binary trees of strings." How do I define the binary tree class and >> state that it is parameterizable? > >Good question; so far I only thought about making built in types (such >as list) parameterizable. One could however do something similar with >classes, though: Let me turn that around. In Viper, there are NO fixed names for any types: types are just any objects. This leads to the question Martin asked immediately. Before I proceed, another observation: in Viper, there is a class class PyListType: .... for lists, but it is planned to allow class PyListOf: ... The idea is that this is a parameterized type, and the instance is specified by constructing an object: PyListOfInt = PyListOf(PyIntType) The class PyListOf contains methods like 'append' which take THREE arguments, instead of two: the first argument is the type. 
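In plain Python (not Viper), the mechanism can be sketched roughly like this; all names here are invented for illustration:

    class PyListOf:
        def __init__(self, elemtype):
            self.elemtype = elemtype
        def append(self, lst, item):
            # 'self' carries the element type: effectively the extra first argument
            if not isinstance(item, self.elemtype):
                raise TypeError("expected %s, got %s" % (self.elemtype, type(item)))
            lst.append(item)

    PyListOfInt = PyListOf(int)        # the concrete type is an *instance*

    data = []
    PyListOfInt.append(data, 1)        # ok
    try:
        PyListOfInt.append(data, "x")  # rejected: wrong element type
    except TypeError as err:
        print(err)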
When an object of kind 'PyListOfInt' is the type object of some object x, then a call like x.append(1) ends up calling PyListOf.append(PyIntType, x, 1) which means it can check that 1 is of type PyIntType. In other words, PyListOf is a meta-type, which happens to be a class, and the instance type, PyListOfInt is an _instance_ of it. In this system, there is no provision for types like [Int] meaning a list of integers, instead, any object which has the type 'list of integers' actually has a type object 'ListOfInt' physically available. One of the points of this type system is that builtin types and user defined types are all the same: they all have type objects which provide methods. [That is, the type system is unified, although it is not the case classes are types, but instead that any object can be a type, and all types are objects] In this system, the client must CONSTRUCT a type as that type. Therefore [1,2,3] has type list, NOT type list of integer. I'm not saying this is good, but I am saying that two problems raised in this list disappear automatically: 1) complicated syntax, aluded to by Paul, cannot occur. Indeed, NO new syntax is required at all (to name types) There is ONLY one way to name a type: by refering to an object. Hence 'function taking list of X of Y returning .....' simply cannot occur: no new syntax for typing is used. Example: decl X: ListOfListOfTupleOfInt is possible, if the client defined that horrible name. You can't say: decl X: ListOf(IntType) because that is a different object to the type object of decl Y:ListOfInt(IntType) since class instance objects are compared by address. 2) Extension objects are not different to builtin ones. BOTH use the same idea, of having type objects associated with them, [In fact, there is a hack in Viper: builtin types' type object is found by an extra indirection, since the objects don't exist when the interpreter is first started] If I can summarise: there is considerable advantage using arbitrary objects as type objects: they can be specified using EXISTING python syntax, using the power of the EXISTING python interpreter, without needing a special, second class language, to complicate python, and pose an additional implementation overhead. In particular, the idea is that the type inference mechanism can _compile_ the type objects like any other, and therefore NO special handling is required for extension types. The ONLY types the inference engine needs to 'know' about are the builtin ones. I'm not sure if this will work :-) BTW: In Viper, extension 'modules' do NOT build any vtables or objects, they're just a table of named functions. No types! The 'types' are constructed in Python, usually with a class: class XType: mymethod = Xmymethod [at present, all 'extensions' have their functions loaded directly into _builtins_, which is something I will soon have to fix] -- One thing I think we DO need though, is a categorical sum. In ML notation: IntOrNone = Int | None Python does not support sum objects, so it isn't obvious how to represent type sums using objects. I'm working on this. In ML, this is done by: type Sum = X | Y of int | Z of float let sum = Y 1 in ... Python, like other languages, needs this construction, it is as fundamental as tuples (in fact, it is precisely the categorical dual of a tuple or struct) Dictionaries can be used for this, as can pairs (kind, value), so it can be represented, just not nicely. 
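A sketch of the (kind, value) representation just mentioned -- workable in today's Python, though as noted, not pretty; the constructor and variant names are invented for the example:

    def make_int(n):                  # variant constructors for Int | None
        return ('Int', n)
    def make_none():
        return ('None', None)

    def describe(sumval):
        kind, value = sumval          # "pattern match" by unpacking and testing the tag
        if kind == 'Int':
            return "an integer: %d" % value
        if kind == 'None':
            return "nothing"
        raise ValueError("unknown variant: %r" % (kind,))

    print(describe(make_int(3)))      # an integer: 3
    print(describe(make_none()))      # nothing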
[I'm still thinking on how to do this 'pythonically' ] -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From skaller@maxtal.com.au Sat Dec 18 03:05:21 1999 From: skaller@maxtal.com.au (skaller) Date: Sat, 18 Dec 1999 14:05:21 +1100 Subject: [Types-sig] Viper Type specification References: <000c01bf48f5$a60fbd60$32a2143f@tim> Message-ID: <385AF9F1.1108A094@maxtal.com.au> FYI: here is the Viper file py_types.vy which defines many Viper types. [But there is no reason ALL the types need be here: you can't see the regexp object here, nor sockets -- these are defined elsewhere] # This module is exclusive to Viper # It is a builtin module, defining classes for all # the Python types. # # Methods and attributes for objects can be found in the # class dictionary of the typing class, even when the object # is not a PyInstance of the class. import string import sys # datum types class PyNoneType: pass class PyIntType : def succ(x): return x + 1 def pred(x): return x - 1 class PyFloatType : pass class PyComplexType : pass class PyLongType : pass class PyRationalType : pass class PyStringType: pass # Sequence types class PyTupleType : pass class PyListType : append = list_append extend = list_extend count= list_count index = list_index insert = list_insert sort = list_sort class PyXrangeType: pass # Others class PyClassType: pass PyTypeType = PyClassType # an alias in Viper class PyInstanceType: pass # Viper doesn't currently support execution frames, so tb_frame is set to None class PyTracebackType: tb_frame = None class PyFunctionType: pass # Python function type class PyModuleType: pass # python module type class PyNativeFunctionType: pass # builtin function type class PyNativeMacroType: pass # a macro is an environment sensitive function (eg globals()) class PyBoundMethodType: pass # a bound method class PyDictionaryType: # dictionary items = dictionary_items clear = dictionary_clear copy = dictionary_copy has_key = dictionary_has_key keys = dictionary_keys update = dictionary_update values = dictionary_values get = dictionary_get class PyExpressionType: pass # partially evaluated expression (general expression type) class PyStatementType: pass # type of a code object class PyEnvironmentType: pass # an environment for unqualified name lookup class PyClosureType: pass # a pair consisting of an expression and an environment class PyThreadType: pass # type of a thread class PyInterpreterType: pass # type of a Viper interpreter object class PyLockType: # type of a mutual exclusion lock for threads def __init__(self): self.mutex = lock_create() def acquire(self, waitflag=None): return lock_acquire(self.mutex, waitflag) def release(self): lock_release(self.mutex) def locked(self): return lock_test(self.mutex) # ---------- GUI ----------------------------------------------------- class PyWidgetType: pass # widget class PyColorType: pass # color class PyFontType: pass # font class PyGraphicsContextType: pass # a graphics context class PyDrawableType: pass # something that can be drawn class PyCanvasType: pass # something we can draw on class PyImageType: pass # an external representation of a picture # ---------- FILES ----------------------------------------------------- # this is type of _native_ files class PyFileType: def close(f): file_close(f) def write(f,s): file_write(f,s) def flush(f): file_flush(f) # note: never raises an exception, does nothing at EOF def read(f,amt=None): try: if amt is None: 
s = "" while 1: b = file_read(f,8096) s = s + b if len(b) < 8096: break return s else: s= file_read(f,amt) return s except: print "EXCEPTION: IOERROR" # this is the class used for _client_ files # we use a class, to support easy subtyping class PyFileClass: def __init__(self): self.buffer = "" def read(self,amt=None): return self.native_file.read(amt) def write(self,s): return self.native_file.write(s) def close(self): self.closed = 1 self.native_file.close() # note: returns '' on end of file (no exception raised) def readline(self): eolpos = string.find(self.buffer, "\n") while eolpos == -1: n = len(self.buffer) data = self.read(1024) self.buffer = self.buffer + data eolpos = string.find(self.buffer, "\n", n) if len(data) == 0: break if eolpos == -1: eolpos = len(self.buffer)-1 line = self.buffer[0:eolpos+1] # include the eol self.buffer = self.buffer[eolpos+1:] return line def readlines(self, hint=None): data = self.native_file.read() return string.split(data,'\n') # this function opens a file def open(filename, mode="r"): try: native_file = file_open(filename, mode) python_file = PyFileClass() python_file.native_file = native_file python_file.name = filename python_file.mode = mode python_file.closed = 0 return python_file except OSError, object: exc = IOError(object.errno, object.strerror, filename) raise exc def make_file_object(native_file, filename, mode): python_file = PyFileClass() python_file.native_file = native_file python_file.filename = filename python_file.mode = mode python_file.closed = 0 return python_file # these functions return _native_ files, not client ones! def get_native_stdin(): return file_get_std_files()[0] def get_native_stdout(): return file_get_std_files()[1] def get_native_stderr(): return file_get_std_files()[2] # these functions return _client_ files! def get_client_stdin(): return make_file_object(get_native_stdin(),"stdin","r") def get_client_stdout(): return make_file_object(get_native_stdout(),"stdout","w") def get_client_stderr(): return make_file_object(get_native_stderr(),"stderr","w") # this is a hack! def set_std_files(): sys.stdin = sys.__stdin__ = get_client_stdin() sys.stdout = sys.__stdout__ = get_client_stdout() sys.stderr = sys.__stderr__ = get_client_stderr() # this is a sucky hack! def type(x): typename = getattr(x,"__typename__") return eval (typename) -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From skaller@maxtal.com.au Sat Dec 18 04:08:14 1999 From: skaller@maxtal.com.au (skaller) Date: Sat, 18 Dec 1999 15:08:14 +1100 Subject: [Types-sig] type declaration syntax Message-ID: <385B08AE.A491CD36@maxtal.com.au> On syntax: for functions, the use of ":" for parameters seems 'natural' to me: def f(a:Int, y:Float=0.0) as it is used in pascal and ML. But the return type is a problem. Suggestions include def f() -> Float and def f: Float () and I'll add using -> in the list: def f(a->Int, y->Float)-> Float: .. but here is another idea: don't bother. The reason is: local type inference needs to know the parameter types, and these are needed for call checking. But the _return_ type doesn't need to be annotated as much. Why? Because the inferencer can usually deduce it: it's an output, the argument types are inputs. If the inferencer _cannot_ deduce the return type, it _also_ cannot check that the function is returning the correct type. 
It is true that knowing the return type can help inferencing, and it is true it is needed for inferencing at the point of call, although in this case the deduced type is (may be) still available. Only an idea .. but "when in doubt, don't", is a good rule for language design :-) One problem with :, that is probably a killer: it cannot work with lambdas: lambda x:Int, y: woops [I'm not saying if this will kill ":" or lambda though :-] -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From gstein@lyra.org Sat Dec 18 04:27:58 1999 From: gstein@lyra.org (Greg Stein) Date: Fri, 17 Dec 1999 20:27:58 -0800 (PST) Subject: [Types-sig] type declaration syntax In-Reply-To: <385B08AE.A491CD36@maxtal.com.au> Message-ID: On Sat, 18 Dec 1999, skaller wrote: >... > But the _return_ type doesn't need to be annotated as much. > Why? Because the inferencer can usually deduce it: > it's an output, the argument types are inputs. Users of the function need the return type. The inferencer won't be global -- it isn't going to look at the function to determine the return type. In order to skip that requirement, we annotate the return type. The caller then simply assumes the return type is correct. When compiling the function in question, the compiler can verify that the declared return type is truly what the function will return. >... > One problem with :, that is probably a killer: it cannot > work with lambdas: > > lambda x:Int, y: woops Good point. I'll need to update my page with this issue. > [I'm not saying if this will kill ":" or lambda though :-] Heh. I would simply state that lambda cannot be annotated. If people want the annotation, then they should use "real" functions. I know that would please Guido's desire to deprecate lambda :-) Cheers, -g -- Greg Stein, http://www.lyra.org/ From skaller@maxtal.com.au Sat Dec 18 04:30:21 1999 From: skaller@maxtal.com.au (skaller) Date: Sat, 18 Dec 1999 15:30:21 +1100 Subject: [Types-sig] Re: [Types Sig] Progress References: <001201bf48fd$210ba400$32a2143f@tim> Message-ID: <385B0DDD.2C5473A2@maxtal.com.au> Paul wrote: >>>> 1. Most people seem to agree with the idea that shadow files allow us a nice way to separate type assertions out so that their syntax can vary. I think Greg disagreed but perhaps not violently enough to argue about it. Interface files are in. Inline syntax is temporarily out. Syntactic "details" to be worked out. <<<< I'm more interested in the inline syntax. Reason: it is easy to modify the Viper grammar to allow it. It is much harder to build a completely new translater for a new 'type' language, and, this will not sit well with Viper's "any object can be a type". I also dislike maintaining separate interface files. [i'm not against this, just stating something I dislike about it] However, here is an idea for interface files: an interface file is an ORDINARY python file. No special stuff. Instead, a new keyword: 'defered'. For example: def f(a:int, b:long): defered 'defered' has the same semantics as 'pass', but it means 'we'll define this function later'. The important thing, then, is that the interface file has a different extension, so that a compiler can get the type information, without building the actual module, and it can match the interface against the actual module. But 'defered' can be used anywhere. >>>>> 2. Everybody but me is comfortable with defining genericity/templating/parameterization only for built-in types for now. 
But now that we are separating interfaces from implementations I am thinking that I may be able to think more clearly about parameterizability. It may be possible to define parameterizable interfaces by IPC8. Parameterization is in. Syntactic "details" to be worked out. <<<<<<<< I agree: parameterisation is important. But I don't think the usual notions used by static languages will work so well in Python. Before proceeding, please consider how Viper is supposed to do this. It's real easy to implement, and it obviates the need for any special new syntax. >>>>>>>>>>> 3. We agree that we need a syntax for asserting the types of expressions at runtime. Greg proposes ! but says he is flexible on the issue. The original RFC spelled this as: has_type( foo, types.StringType ) which returns (in this case) a string or NULL. This strikes me as more flexible than ! because you can use it in an assertion but you don't have to. <<<<<<<<<<< I don't think we agree on this: Guido says that assertions are good enough. I wouldn't argue. >>>>>>>>>>> 4. The Python misfeature that modules are externally writable by default is gone. Only Guido has expressed an opinion on whether they should be writeable at all. His opinion is no. <<<<<<<<<<< I would like this. however a point: we can always write to a class instance attribute instead. And this is just deferring the real problem. Another point: if 'defered' is accepted, it could be OK to write ONCE to a defered variable (and an error to use one that had not been written). >>>>>>>>>>>>>> 5. It isn't clear WHAT we can specify in "PyDL" interface files. Clearly we can define function, class/interface and method interfaces. <<<<<<<<<< Yes it is: if we have them, we have to be able to specify EVERYTHING. :-) -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From skaller@maxtal.com.au Sat Dec 18 04:34:43 1999 From: skaller@maxtal.com.au (skaller) Date: Sat, 18 Dec 1999 15:34:43 +1100 Subject: [Types-sig] type declaration syntax References: Message-ID: <385B0EE3.D05A4945@maxtal.com.au> Greg Stein wrote: > > On Sat, 18 Dec 1999, skaller wrote: > >... > > But the _return_ type doesn't need to be annotated as much. > > Why? Because the inferencer can usually deduce it: > > it's an output, the argument types are inputs. > > Users of the function need the return type. The inferencer won't be > global -- it isn't going to look at the function to determine the return > type. Viperc _will_ use a global inferencer. Please don't assume "python" means CPython. There are two other full scale implementations now. There may be more in the future. And there may be other programs -- not full interpreters or compilers, like PyLint -- which will _use_ the information. > > [I'm not saying if this will kill ":" or lambda though :-] > > Heh. I would simply state that lambda cannot be annotated. OK, agreed. -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From gstein@lyra.org Sat Dec 18 04:40:59 1999 From: gstein@lyra.org (Greg Stein) Date: Fri, 17 Dec 1999 20:40:59 -0800 (PST) Subject: [Types-sig] type declaration syntax In-Reply-To: <385B0EE3.D05A4945@maxtal.com.au> Message-ID: On Sat, 18 Dec 1999, skaller wrote: > Greg Stein wrote: > > On Sat, 18 Dec 1999, skaller wrote: > > >... > > > But the _return_ type doesn't need to be annotated as much. > > > Why? 
Because the inferencer can usually deduce it: > > > it's an output, the argument types are inputs. > > > > Users of the function need the return type. The inferencer won't be > > global -- it isn't going to look at the function to determine the return > > type. > > Viperc _will_ use a global inferencer. > Please don't assume "python" means CPython. There are two other > full scale implementations now. There may be more in the future. > And there may be other programs -- not full interpreters or > compilers, like PyLint -- which will _use_ the information. But I am talking about CPython. Do what you want with Viper, but I'm concerned with the core/authoritative distribution. I do not believe that will have a global inferencer. Sure, maybe it will one day, but my proposal assumes "no". Cheers, -g -- Greg Stein, http://www.lyra.org/ From skaller@maxtal.com.au Sat Dec 18 04:44:09 1999 From: skaller@maxtal.com.au (skaller) Date: Sat, 18 Dec 1999 15:44:09 +1100 Subject: [Types-sig] type declaration syntax References: Message-ID: <385B1119.D036BB9F@maxtal.com.au> Paul wrote: >Martijn Faassen wrote: >> >> * we don't have to debate about syntax anymore and can actually think >> about semantics without syntax confusion. > >Clean syntax helps comprehension. I don't agree, but this time it is because I think you have _understated_ the issue. I think that the syntax is just about the ONLY issue here: what 'semantics' is there to debate? The way I see it, we need a way to declare something is type T, which is a syntax issue. And then we can argue about waht "T" can be, which is, more or less, also a syntax issue. Now, if we debate this, we will find we're getting into the details of the type model, which is not a syntactic issue, but it, well, is 'rendered' in syntax all the same. For example, Viper is using a particular type model which is minor extension of CPython 1.5's own model, which leads to a particular syntax: a python expression denoting an object is what "T" is, rather than some new, invented, syntax (like Tim Peters ML/Haskell like one). In other words, I think we SHOULD focus on the syntax, because it is the representation of the ideas we have, and the one programmers will be using. -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From gstein@lyra.org Sat Dec 18 04:44:53 1999 From: gstein@lyra.org (Greg Stein) Date: Fri, 17 Dec 1999 20:44:53 -0800 (PST) Subject: [Types-sig] Re: [Types Sig] Progress In-Reply-To: <385B0DDD.2C5473A2@maxtal.com.au> Message-ID: On Sat, 18 Dec 1999, skaller wrote: >... > >>>>>>>>>>> > 3. We agree that we need a syntax for asserting the types of expressions > at runtime. Greg proposes ! but says he is flexible on the issue. The > original RFC spelled this as: has_type( foo, types.StringType ) which > returns (in this case) a string or NULL. This strikes me as more > flexible than ! because you can use it in an assertion but you don't > have to. > <<<<<<<<<<< > > I don't think we agree on this: Guido says that assertions > are good enough. I wouldn't argue. The '!' operator is much more than just a new name for "assert". It can assist the compiler in determining the type of an expression value, which leads to the ability to type check and/or optimize. In other words, I believe Guido is wrong (heresy!) -- assertions are not good enough. 
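The '!' operator is proposed syntax and does not run today; a rough runtime stand-in is a check-and-return helper (hypothetical, invented here), which shows why it is more than assert: it yields the value, so it can sit inside an expression.

    def typecheck(value, T):
        # check-and-return: usable wherever an expression is expected
        if not isinstance(value, T):
            raise TypeError("expected %s, got %s" % (T, type(value)))
        return value

    def lookup(table, key):
        # roughly what "return table[key] ! Int" would express
        return typecheck(table[key], int)

    print(lookup({'a': 1}, 'a'))      # 1

A bare assert statement cannot appear in the middle of an expression, and it gives the compiler no hint about the expression's type; the checked form does both.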
Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Sat Dec 18 04:55:12 1999 From: gstein@lyra.org (Greg Stein) Date: Fri, 17 Dec 1999 20:55:12 -0800 (PST) Subject: [Types-sig] New syntax? In-Reply-To: <000b01bf48f1$572dc9c0$32a2143f@tim> Message-ID: On Fri, 17 Dec 1999, Tim Peters wrote: > [Greg Stein] > > ... > > I guess that does mean that something like: > > > > decl a: def(Int)->None > > > > would be possible. e.g. is a member holding a ref to a function > > object. > > If it weren't possible, it would be quite a hole in the type description > mechanism! hehe :-) This syntax is part of my current proposal. I definitely agree it is a requirement to be able to specify functional types. >... > today. BTW, I wouldn't object to requiring that the class/member > distinction be explicit. > > decl class a: ... > decl member a: ... > > If "decl" gets used for more stuff down the road, it could be a real help to > make the syntax explicit from the start: > > ofwhat : 'class' | 'member' | 'var' | 'type' | 'frozen' | ... > decl-stmt : 'decl' ofwhat This seems entirely reasonable to me. Let's see what Mr. Consensus says. > > In fact, these two are probably equivalent: > > > > decl class a: def(Int)->None > > def a(Int x)->None: > > WRT type, yes, but (of course!) the former is merely a declaration while the > latter is the initial stmt of a definition. Correct. I forgot to mention that and noticed the lack later when I read that email. No worries... you won't let me get away with being a slacker... :-) >... > Yes, Any is good. I've listed this in my proposal as an open question. I'm leaning to "formally endorsing" it. My only real opposition is whether it must be a new keyword, or we can find some other way to deal with it. For example: import types Int = types.IntType String = types.StringType Any = None decl foo: Any decl bar: String The compiler isn't going to have recognized names for the types. I think it will be using data flow to figure that out (and maybe some builtin knowledge of the type() builtin and the types module). If the compiler determines that a particular dotted_name leads to the value None (whereas it typically refers to a PyTypeObject, a class object, or a typedecl object), then it says "oh. that is the 'any' construct". This also leads quite naturally to the following: def foo(bar): ... In this case, all the type annotations are not specified -- they are None. Implicitly, that means "any". Damn, I'm smooth. ;-) > The problem with * and ** is that people are going to want to express > restrictions, like "only strings from here on in" or "all the keyword args > must be of int type". Under the theory that things work well if you just > don't think about them , > > decl c: def(Int, *: (String), **: {String: Int})->Int Yah... this has been covered. No problem. Funny note: looking at the grammar, I've found the following is legal: def foo(bar, *args, * *kw): ... In my typedecl syntax, I punted the ability to use "* *" ... you must use "**". So there :-) > > ... > > I'm not sure whether to go for practical or pure. > > I'm leaning toward the "always explicit" above. Restrictions can always be > loosened later if they prove too confining, but tightening a permissive spec > is usually impossible. Yup. Quite a reasonable argument. 
Cheers, -g -- Greg Stein, http://www.lyra.org/ From skaller@maxtal.com.au Sat Dec 18 04:58:14 1999 From: skaller@maxtal.com.au (skaller) Date: Sat, 18 Dec 1999 15:58:14 +1100 Subject: [Types-sig] type declaration syntax References: Message-ID: <385B1466.EA5E1F6B@maxtal.com.au> Greg Stein wrote: > > Viperc _will_ use a global inferencer. > > Please don't assume "python" means CPython. There are two other > > full scale implementations now. There may be more in the future. > > And there may be other programs -- not full interpreters or > > compilers, like PyLint -- which will _use_ the information. > > But I am talking about CPython. Do what you want with Viper, but I'm > concerned with the core/authoritative distribution. I do not believe that > will have a global inferencer. Sure, maybe it will one day, but my > proposal assumes "no". It is possible that Viperc will generate C code for CPython. In fact, it seems likely. It may be a third part tool, written in ocaml rather than C, and so not part of the 'core' distribution, but is a LOT more likely to work than anything that will ever make it into the core distribution for the simple reasons that it is written in a language suitable for the task, unlike C, and it is already under development. In fact, IMHO, even Java is a LOT more suitable for doing this than C will ever be. Perhaps a C version can be written AFTER a proof of principle version is got working in a high level language. Now, I'd love to be proven wrong, and find a real Python compiler in the next major distribution, so my Interscript program actually becomes useful. But I'm not going to hold my breath, and I guess that the 'small change left over from DARPA funding' Guido mentions will not fund a compiler -- indeed, I doubt the WHOLE of the funding provided would be enough, if it is going to be written in C. -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From gstein@lyra.org Sat Dec 18 05:03:17 1999 From: gstein@lyra.org (Greg Stein) Date: Fri, 17 Dec 1999 21:03:17 -0800 (PST) Subject: [Types-sig] New syntax? In-Reply-To: <001301bf4901$f3a2f040$32a2143f@tim> Message-ID: On Fri, 17 Dec 1999, Tim Peters wrote: >... > If I had a lot of binary integer functions to declare, I would probably use > a typedef, a la > > decl typedef BinaryFunc(_T) = def(_T, _T) -> _T > decl typedef BinaryIntFunc = BinaryFunc(Int) > ... > decl var intHandlerMap: {string: BinaryIntFunc} > decl var floatHandlerMap: {string: BinaryFunc(Float)} Okay, Tim. I'm going to stop you right here :-) The problem with using "decl" to do typedefs is that it does weird voodoo to associate the typedecl with the name (e.g. BinaryFunc). I believe my unary operator is much clearer to what is happening: BinaryIntFunc = typedef BinaryFunc(Int) In this case, it is (IMO) very clear that you are storing a typedecl object into BinaryIntFunc, for later use. For example, we might see the following code: import types Int = types.IntType List = types.ListType IntList = typedef [Int] ... Hrm. I don't have a ready answer for your first typedef, though. That is a new construct that we haven't seen yet. We've been talking about parameterizing *classes*, rather than typedecls. *ponder* >... > You're not going to get the same level of expressiveness in an > imperative-style Python syntax: it's the right tool for the wrong job. A > type-expression sublanguage with one operator ("|") should suffice. "or" is more Pythonic. 
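A sketch of such a type-expression calculus built from ordinary objects: union is the one operator, and parameterization is just a call. All names here are invented, not part of any proposal on the table.

    class TypeExpr:
        def __or__(self, other):               # the single union operator
            return Union(self, other)

    class Atom(TypeExpr):
        def __init__(self, pytype):
            self.pytype = pytype
        def check(self, value):
            return isinstance(value, self.pytype)

    class Union(TypeExpr):
        def __init__(self, left, right):
            self.left, self.right = left, right
        def check(self, value):
            return self.left.check(value) or self.right.check(value)

    Int, String = Atom(int), Atom(str)
    IntOrString = Int | String                 # a reusable type expression
    print(IntOrString.check(3))                # True
    print(IntOrString.check(3.0))              # False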
> [on varargs] > > Me neither. Perhaps something like: > > > > decldef foo(first=int, second=string, *[int]): > > return int > > > > i.e. all the extra arguments must be ints. > > Hmm! You and Greg both seem to think varargs get implemented as lists > . Bite me. :-) You do raise a good point in another post, however: def foo(*args: (Int)): Looks awfully funny. For a Python programmer, that looks like grouping rather than a tuple. If it had a comma in there, then it would look like a tuple. But of course: there will never be more than one typedecl inside there, so whythehell is there a comma? *grumble* .... I don't have a handy resolution for this one. Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Sat Dec 18 05:06:27 1999 From: gstein@lyra.org (Greg Stein) Date: Fri, 17 Dec 1999 21:06:27 -0800 (PST) Subject: [Types-sig] type declaration syntax In-Reply-To: <385B1466.EA5E1F6B@maxtal.com.au> Message-ID: On Sat, 18 Dec 1999, skaller wrote: > Greg Stein wrote: > ... me talking about the core distro's inferencer ... > > It is possible that Viperc will generate C code for CPython. > In fact, it seems likely. It may be a third part tool, written in > ocaml rather than C, and so not part of the 'core' distribution, > but is a LOT more likely to work than anything that will ever > make it into the core distribution for the simple reasons > that it is written in a language suitable for the task, > unlike C, and it is already under development. > > In fact, IMHO, even Java is a LOT more suitable > for doing this than C will ever be. Perhaps a C version > can be written AFTER a proof of principle version is got working > in a high level language. > > Now, I'd love to be proven wrong, and find a > real Python compiler in the next major distribution, > so my Interscript program actually becomes useful. > But I'm not going to hold my breath, and I guess that the > 'small change left over from DARPA funding' Guido mentions > will not fund a compiler -- indeed, I doubt the WHOLE > of the funding provided would be enough, if it is going > to be written in C. Nobody has ever suggested writing the bugger in C. My assumption is that it will be written in Python. A second assumption is that it will always remain as a lint-like tool rather than integrated into the core compiler. Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Sat Dec 18 08:43:45 1999 From: gstein@lyra.org (Greg Stein) Date: Sat, 18 Dec 1999 00:43:45 -0800 (PST) Subject: [Types-sig] Type Inference I In-Reply-To: <385ACE5C.17CD5684@maxtal.com.au> Message-ID: Whatever you want to call it: inference or deduction or type analysis. I think we will be doing "bottom up type analysis" to use your phrasing. I don't think we need any top-down inferencing at all. In a function, we get our initial type inputs from the arguments and from function return values. With those, we compute the types of each item. Those are the types we pass to functions or return from the function. I still don't see your point. You've gone on at length about exceptions, semantics, and what kinds of inference can or can't be done. What is the point? At the end of this you say something about adding "static-y" stuff to Python? What do you mean? Honestly, the previous email and this one just seems to be a lot of gobbledygook. Long words, short on applicable, useful content. Again: call me small-minded, but I think you're being overly obtuse. Comments below... 
On Sat, 18 Dec 1999, skaller wrote: > Greg Stein wrote: > > > On Sat, 18 Dec 1999, skaller wrote: > > > ... long post about exceptions and semantic definition ... > > > > Sorry John, call me dense, but I really don't see what you're talking > > about. :-( > > It takes a while to understand the impact of conformance > and specifications on semantics .. and that this is not just a > matter of language lawyering, but a real, pragmatic, issue. I'm not an idiot. If it takes me a while, then it is going to take everybody a while. Phrase your discussion so that you're actually saying something, rather than speaking so much in the abstract. You bring up points about boundary cases and how they throw exceptions: great, but nobody cares about those boundary cases (I'm never going to feed my .emacs file into Python). >... > > I don't see a problem with exceptions. That is part of Python. I don't see > > that it causes any problems with type inference, either (it just > > introduces interesting items into the control/data flow graph). > > The problem arises roughly as follows: > type inference works by examining an expression like: > > x + 1 > > and _deducing_ that x MUST (**) be an integer. It cannot > be a file, because it isn't allowed to add a file to an integer. > But in Python you CAN add a file to an integer. It is perfectly > legal, it just throws an exception. You can't deduce/infer anything from x+1. x could be a class instance, in which case you're totally screwed. Otherwise, it could be any numeric type. But even then: as you point out, it could be a string. This is entirely the wrong direction. We aren't trying to figure out what x *should* be. We're trying to say "x is . will it cause an error?" > Do you see? This means we cannot deduce ANYTHING about 'x' > in the example snippet given above. > > Of course, the _expression_ x+1 can only be an integer, > we _can_ deduce that. But that isn't enough. Python You can't deduce that at all. class foo: def __add__(self, value): return "hello" > is too dynamic. We need more constraints to be able > to do effective inference. Not at all. As I mentioned: we'll be doing bottoms-up. We don't need constraints: we just need some type annotations on the input values (e.g. arguments and return values). > (**) This example ignores class instances with __add__ methods, > to make the argument easier to follow. You argument is easy to follow, but I don't see *why* you're making the argument. I don't care what x is from "x+1". I know what x is from where it got assigned a value. > > This whole tangent about feeding an email to Python and claiming it is a > > valid Python program with defined semantics (raise SyntaxError). I > > understand your explanation, but I totally miss the point. So what? > > See above. We cannot infer anything, unless there are rules. > That is, there MUST be set of permitted signatures for > functions/operators, > in order to do inference at all. > > It is possible to do synthetic (bottom up) type analysis, such as: > > x = 1 + 1 > > Here, we know that Int + Int -> Int, and so x (at least at this > point in the program) must be an Int. But that is only > the 'deductive' part of inference, the 'inferential' part > infers the types of _arguments_ from the set of allowable > signatures of functions. That is, we must do the inference > top down, not just bottom up. No. We're only going to do bottom up, as far as I know. Nobody has even ventured a request to have any kind of top-down inference. 
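A minimal sketch of that bottom-up ("synthesized") analysis: types flow from the leaves of an expression upward, seeded by the declared argument types, and nothing flows back down. The node encoding is invented for the example.

    def infer(node, env):
        kind = node[0]
        if kind == 'const':
            return type(node[1])              # a literal has its own type
        if kind == 'name':
            return env[node[1]]               # declared/known type of the name
        if kind == 'add':
            left = infer(node[1], env)
            right = infer(node[2], env)
            if left is int and right is int:
                return int                    # Int + Int -> Int
            raise TypeError("cannot add %s and %s" % (left, right))
        raise ValueError("unknown node %r" % (kind,))

    # with x declared as Int, "x + 1" synthesizes Int
    print(infer(('add', ('name', 'x'), ('const', 1)), {'x': int}))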
In fact, most people don't want anything beyond simple statement-level inference (achievable by declaring the types of all names used). I'd rather see no local declarations because we can infer/deduce the types of all local names. > > Type inferencing for the "1 + file" case is easy. You know the two types, > > and you know they can't be added. Bam. Error. > > but you're wrong, the result of applying the addition > operation is, in fact, well defined: it is NOT an error in the program, > it just throws an exception, rather than returning a value. > If you throw an exception deliberately, that is hardly an error, is it? All right. Now you're just being silly. The entire purpose of this discussion is to local those exceptions at compile time. THAT IS THE PURPOSE HERE. By definition, we are saying it is wrong. Argue semantics all you want about what is correct or not, but raising an exception is exactly what we want to avoid. We want to know about it before we run the program. > > It was a long email, but what exactly were you trying to say? "Define the > > semantics" isn't very clear. I feel Python has very clear semantics. What > > exactly is wrong with them? > > There is no distinction made between 'incorrect' code, > and 'correct' code for which an exception is thrown. What?! Of course there is a distinction. We want to filter out the incorrect code (i.e. that which uses types incorrectly and would throw errors at runtime). > In compiled code, we need the distinction, because there > is a lot of overhead in doing the dynamic type checking required > to throw the exception. The whole point of compilation is to > eliminate the overhead of run time type checking. In Python, the whole point of compilation is to transform source code into something that the PVM can execute. Done. >... more stuff ... > Hope this makes sense: to compile python code effectively, > we need to add some reasonable 'static-y' restrictions. > Where, 'reasonable' means 'suitably pythonic', but not > quite as dynamic as the current CPython 1.5.2 implementation > allows. No, it doesn't make sense. I see that we can do this with declarations and without the need for restrictions. I'm getting the feeling that you are trying to solve an entirely different problem from what we've been discussing over the past week. Your discussions about what is correct and incorrect just doesn't seem to have any basis in the problem being worked on. We want to detect incorrect code before runtime, where "incorrect" is defined as throwing an (unexpected) exception. And it is actually pretty easy to tell whether something is expected or not: did the developer put in a try/catch? Your discussion seems to saying something about removing exceptions. But honestly, I really don't know what you're advocating. I'm sorry, but I'm obviously a little bit tweaked. As Guido said, maybe too much sugar lately :-). More likely, not enough sleep. How about if you write a short email with a concrete suggestion for a change? That may help to define what exactly you're suggesting should happen. All this background "theory" is just noise to me. Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Sat Dec 18 08:51:44 1999 From: gstein@lyra.org (Greg Stein) Date: Sat, 18 Dec 1999 00:51:44 -0800 (PST) Subject: [Types-sig] Viper Type specification In-Reply-To: <385AF9F1.1108A094@maxtal.com.au> Message-ID: On Sat, 18 Dec 1999, skaller wrote: > FYI: here is the Viper file py_types.vy which defines > many Viper types. 
[But there is no reason ALL the types > need be here: you can't see the regexp object here, > nor sockets -- these are defined elsewhere] >... I don't understand the relevancy of this to the types-sig and our recent discussions about adding static typing to Python. ?? -g -- Greg Stein, http://www.lyra.org/ From tismer@appliedbiometrics.com Sat Dec 18 13:57:35 1999 From: tismer@appliedbiometrics.com (Christian Tismer) Date: Sat, 18 Dec 1999 14:57:35 +0100 Subject: [Types-sig] Viper Type specification References: Message-ID: <385B92CF.69937EF6@appliedbiometrics.com> Greg Stein wrote: > > On Sat, 18 Dec 1999, skaller wrote: > > FYI: here is the Viper file py_types.vy which defines > > many Viper types. [But there is no reason ALL the types > > need be here: you can't see the regexp object here, > > nor sockets -- these are defined elsewhere] > >... > > I don't understand the relevancy of this to the types-sig and our recent > discussions about adding static typing to Python. John is telling us his truth, and we have to learn. This is no discussion but a lecture. Look into class PyFileType. My lesson was that I have to learn that 8096 is a power of two :-) enlighted-ly - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From skaller@maxtal.com.au Sat Dec 18 14:07:35 1999 From: skaller@maxtal.com.au (skaller) Date: Sun, 19 Dec 1999 01:07:35 +1100 Subject: [Types-sig] Viper Type specification References: Message-ID: <385B9527.53DDB2F8@maxtal.com.au> Greg Stein wrote: > On Sat, 18 Dec 1999, skaller wrote: > > FYI: here is the Viper file py_types.vy which defines > > many Viper types. > I don't understand the relevancy of this to the types-sig and our recent > discussions about adding static typing to Python. The types-SIG has discussed a 'language' for types, correct? For example, Tim Peters demonstrated a Haskell like syntax. One property of that syntax is that it allows complex specifications, including generics. Correct? Well, I'm showing another way to do it. The file I posted IS the type specification for Viper 2. With this mechanism, there is no need for any 'language' to describe types, the only 'language' permitted or required is 'python expression denoting an object'. This meets the some of the stated requirements of the SIG, and Guido, better than any other language describing types, in particular the 'no new stuff' requirement -- since it is clearly done entirely IN python. I posted the file, mainly for interest, so people could see what type specifications IN python would look like: this file is the one actually used in Viper, it isn't a 'demo', but the real thing. I'm not saying this is 'the' solution, but it ought to be considered because it simultaneously provides a powerful typing model, which is based on the existing model with only a small generalisation, seems easy to implement, and also solves the problem of how to name types. I hope it is clear now, why it is relevant. BTW: w.r.t expr!type, your (Greg's) proposal, what precedence would your give operator ! ? 
-- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From skaller@maxtal.com.au Sat Dec 18 16:04:16 1999 From: skaller@maxtal.com.au (skaller) Date: Sun, 19 Dec 1999 03:04:16 +1100 Subject: [Types-sig] Type Inference I References: Message-ID: <385BB080.A2DBB67C@maxtal.com.au> Greg Stein wrote: []

OK Greg, let's see where we agree and what we understand. First, interpreted Python is too damn slow for some applications. Also, errors sometimes get reported later than we'd like. We'd both like to:

(OPT) be able to translate Python sources into C which runs faster than interpreted Python
(ERR) find errors in a program before running it

I hope we agree on these points so far. Now, here is something I believe, mainly from comments made at various times by Guido, Tim, and others: people have tried compiling Python before, and found that the resulting C code didn't run much faster than the interpreter. That's mainly because these compilers didn't know anything about the types; they just generated API calls corresponding to what the byte code interpreter would execute -- and the interpreter is pretty fast already.

So the question arises: how well can we do if we know what the types of things are, or at least some of them? I am going to assume, without evidence, that we can do better, and I'm going to assume you agree. So the next question is: how can we find out the types? Now, I am going to TELL you that there is some evidence we can in fact do surprisingly well without changing Python one little bit. We can do better if we analyse a whole program. We can do better if we make some assumptions. I'd like you to accept this without argument, because I cannot prove it: I've only done a Mickey Mouse experiment, but JimH has done a less Mickey one, I'm told.

So the question becomes: what is required to make the type inference we can do WITHOUT changing Python one little bit even better, if we make some changes? So, when you say 'we' are not going to do this or that kind of inference, you are missing the point. I surely AM going to. Others certainly WILL be. This is how it is done. When you are writing a compiler, you use every bit of information you can to make it go faster. So the point is how to give the compiler more information, while minimising the impact on 'python' (you can read that as 'keeping it pythonic' if you like).

Now, my point, in Type Inference I and II, is that static type declarations are only ONE way of providing more information, and they are not even the most important one. In fact, type inference is hampered by quite a lot of other factors. I'm sure you will agree on some. For example, I'm sure you understand how 'freezing' a module, or banning rebinding of variables after importing, or disallowing module.attribute = xxx will help type inference: you clearly understand that Python is very dynamic, which makes static analysis difficult. Right? So what I am trying to do is list all the things OTHER than adding optional type declarations which might contribute to our agreed-upon aim, namely, to provide a compiler with enough information to generate fast code.
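A concrete illustration of that last point (function and lambda invented for the example): as long as a caller may rebind a module attribute, the compiler cannot know which function a qualified call names, so it can neither inline nor specialize it.

    import math

    def hypotenuse(a, b):
        return math.sqrt(a * a + b * b)   # which sqrt? only known at run time

    print(hypotenuse(3, 4))               # 5.0

    math.sqrt = lambda x: -1.0            # legal today; changes every caller
    print(hypotenuse(3, 4))               # -1.0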
I think it is clear this is in the scope of the SIG's charter, for example, there seems to be a consensus that module.attribute = xxx is going to be disallowed -- because if it isn't, sophisticated global control flow analysis is required to even be sure _which_ function is being called at some point in the program. Tim Peters said that the standard algoithm for that might not even terminate. Clearly, this is a problem for a compiler :-) Now I want to go back to the original example I gave, and I want you to accept, temporarily, that we have only THREE types: integers, strings, and files. And assume that a function 'add(x,y)' exists, which throws an exception if the types of x and y are not both integers, or both strings. I want you to accept, that given a function call: add(x,1) that deducing that x is an integer is useful to a type inferencer, IF it can be done. The question is: can it be done? And the answer is: it depends on the DOCUMENTED SPECIFICATION OF THE FUNCTION. Consider two cases: 1) The spec says: IF the arguments are both ints .. OR IF the arguments are both strings .. OTHERWISE an exception is thrown 2) The spec says: IF the arguments are both ints .. OR IF the arguments are both strings ... OTHERWISE THE BEHAVIOUR IS UNDEFINED There is a huge difference between these two cases for a compiler. In case (2), the compiler can ASSUME that given the call add(x,1) that x must be an integer. This is a valid type deduction, because the compiler doesn't care what happens if the program has undefined behaviour: the assumption that x is an integer is STILL CORRECT, because it cannot have any consequences which break the language specification. In this case, the compiler could, for example, just keep x in a C int variable, and add 1 to it by using the code x + 1 -- which is much faster than PyAdd(x, One) On the other hand, in case (1), the compiler cannot deduce anything, at least from the given fragment, so it can NOT generate fast code: it has to call PyAdd(x,One) or, perhaps do something like: if (PyTypeIsInt(x)) x->value ++; else PyRaise(SomeException) .. which involves an extra run time check, at least, and is therefore much slower. Therefore, there is a performance advantage in adopting spec (2) as a language specification, instead of (1). Note this does not mean the generated code will crash, if x is not an integer. What it means is that if the compiler detects that x is not an integer, it can generate a compile time error. It is NOT allowed to do that with specification (1). So my point is: the Python documentation contains many examples where it says 'such and such an exception is thrown', and this prevents generating fast code, and it prevents early error detection. The point is that throwing an exception is _well defined_ behaviour, and it would be better if the specification said the program was in error. That way, a compiler can report the error at compile time. At the moment, no errors can be reported by a compiler, because there is no such thing as an erroneous python program -- someone may catch the exception that the specification says is thrown and do something they think is OK, and would rightly claim the compiler is breaking their program and not implementing the Python language faithfully. Just to make it clear, an example: try: return x + 1 except TypeError: return str(x) + "1" is a valid python code fragment, and it relies on x + 1 throwing an exception if x is not an integer. 
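(Spelled out as a runnable function, the fragment really does depend on the exception being raised and caught; both branches are reachable for perfectly ordinary inputs.)

    def bump(x):
        try:
            return x + 1
        except TypeError:
            return str(x) + "1"

    assert bump(41) == 42           # the integer path
    assert bump(None) == "None1"    # the exception path, by design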
So IF we continue to allow this, we cannot deduce that x must be an integer, and this prevents optimising generated code, both here, and in other places in the program. You might just think, seeing x + 1 that if x is not an integer, the code must be an error, but the example above shows that you'd be wrong if you said that. For this reason, I think it is important that the Types SIG also examine when it is legitimate to catch an exception at run time, and do something, and when the code is just plain wrong, and a compiler can reject it. My proposal, and I thought it was fairly 'concrete' as you required, was that apart from EnvironmentErrors, all _standard_ exceptions not trapped within a function body (I said 'lexically local' before, but now I'll be more specific), are in fact programming errors: the programmer may NOT rely on catching any standard exceptions other than environment errors, generated by code inside a client written function. Note this is only a proposal, I'm not sure if I like it, but I hope the reason for proposing it for discussion is easier to understand now. Note: I find this difficult too. I'm not a compiler writer. But I have spent over five years on a standards committee, and have some vague idea of the impact of specifications -- and in particular lack of them -- on the ability to generate fast, conforming code. The way I learned it, was listening to the arguments of compiler writers, explaining the impact of various possible specifications on opportunties for optimisation. Believe me, some of the arguments are pretty devious :-) -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From paul@prescod.net Sat Dec 18 16:58:12 1999 From: paul@prescod.net (Paul Prescod) Date: Sat, 18 Dec 1999 10:58:12 -0600 Subject: [Types-sig] consensus(?) summary (was: Type annotations) References: <38592F43.11042753@prescod.net> <14425.23797.624232.17777@dolphin.mojam.com> Message-ID: <385BBD24.B1DE49D2@prescod.net> Skip Montanaro wrote: > > In particular, is the following a non-local write? > > import sys > p = sys.path > p.append("/usr/local/lib/other") No, only name rebindings are writes. p.append is just a method call. It's type safety is checked by the usual method call type assertion mechanism. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things never trust in: That's the vendor's final bill The promises your boss makes, and the customer's good will http://www.geezjan.org/humor/computers/threes.html From gstein@lyra.org Sat Dec 18 19:31:39 1999 From: gstein@lyra.org (Greg Stein) Date: Sat, 18 Dec 1999 11:31:39 -0800 (PST) Subject: [Types-sig] Viper Type specification In-Reply-To: <385B9527.53DDB2F8@maxtal.com.au> Message-ID: On Sun, 19 Dec 1999, skaller wrote: >... > I'm not saying this is 'the' solution, but it ought to > be considered because it simultaneously provides a > powerful typing model, which is based on the existing > model with only a small generalisation, seems easy > to implement, and also solves the problem of how to name types. > > I hope it is clear now, why it is relevant. Now it is clear, yes. But when it just gets posted with "here is X" rather than "this is how Y could be done", then it is definitely unclear. > BTW: w.r.t expr!type, your (Greg's) proposal, what precedence > would your give operator ! ? Lowest possible (as seen in the type-proposal.html I recently posted here). 
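(Since ! is only proposed syntax, the sketch below fakes the runtime half of it with an ordinary helper; the names are invented. The point of lowest precedence is that an annotation appended to a line applies to the whole expression: return x + 1 ! Int would mean (x + 1) ! Int, not x + (1 ! Int).)

    def bang(value, expected):
        # Hypothetical stand-in for the runtime effect of 'value ! expected'.
        if not isinstance(value, expected):
            raise TypeError("expected %s, got %s"
                            % (expected.__name__, type(value).__name__))
        return value

    def f(x):
        return bang(x + 1, int)     # stands in for: return x + 1 ! Int

    assert f(2) == 3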
I don't have any actual experience with it, but I would think that when somebody is using it to annotate/verify their code, they would just append it to the end of key lines in a function. The lowest precedence creates the correct binding in this case. Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Sat Dec 18 20:38:19 1999 From: gstein@lyra.org (Greg Stein) Date: Sat, 18 Dec 1999 12:38:19 -0800 (PST) Subject: [Types-sig] Type Inference I In-Reply-To: <385BB080.A2DBB67C@maxtal.com.au> Message-ID: On Sun, 19 Dec 1999, skaller wrote: >... > Ok Greg, lets see where we agree and what we understand. > > First, interpreter python is too damn slow for some applications. > Also, errors sometimes get reported later than we'd like. > > We'd both like to: > > (OPT) be able to translate Python sources in to C > which runs faster than interpreter python > > (ERR) find errors in a program, before running it > > I hope we agree on these points so far. Sure. > Now, here is something I believe, mainly from comments > made at various times by Guido, Tim, and others: > people have tried compiling python before, and found that > the resulting C code didn't run much faster than the > interpreter. Thats mainly because these compilers didn't > know anythong about the types, they just generated API > calls corresponding to what the byte code interpreter would > execute -- and the interpreter is pretty fast already. Bill Tutt and I have done it and measured about 30% speed improvement in most cases. Not as lot as most people would hope for, but definitely there. Bill is continuing to improve the code. > So the question arises: how well can we do if we know > what the types of things are, or at least some of them? > I am going to assume, without evidence, that we can do better, > and I'm going to assume you agree. Agreed. > So now the question arises: how can we find out the types? > Now, I am going to TELL you, that there is some evidence, > that we can in fact do surprisingly well without changing > Python one little bit. We can do better, if we analyse a whole > program. We can do better, if we make some assumptions. > I'd like you to accept this, without argument, because I cannot > prove it: I've only done a micky mouse experiment, but JimH > has done a less micky one, I'm told. Sure, I accept that we can. But to state up front: I don't think we want to rely on whole-program analysis. At the moment, I am assuming that a type-checking tool will not be part of the [byte-code] compiler -- that is is just too much and too slow to directly include. However, that obviously negates a number of things that the compiler could do if it knew the types. For example, maybe we introduce some integer-manipulation opcodes because we find they would be beneficial to 90% of Python programs. An external tool doesn't let Python take advantage of them. To that end, I think we might eventually want to integrate something. And to do that, we definitely cannot rely on whole-program analysis. In other words, if we depend on whole-program analysis, then I don't think the builtin, byte-code compiler will ever be able to take advantage of type information. >... > So, when you say 'we' are not going to do this or that > kind of inference, you are missing the point. > I surely AM going to. Others certainly WILL be going to. > This is how it is done. When you are writing a compiler, > you use every bit of information you can to make > it go faster. I agree that you want to use every bit of information possible. 
I disagree that I'm missing the point: I think we are discussing what will happen to the native compiler. To that end, I *am* positing that 'we' will do or . If "third parties" (if you will) want to create an even better compiler, than I'm all for it. However, we still want to improve the native system, and I believe that is through a different path than you are suggesting. > So the point is how to give the compiler more information, > while minimising the impact on 'python' (you can read that > as 'keeping it pythonic' if you like). Yes. > Now, my point, in Type Inference I and II, is that static > type declarations are only ONE way of providing > more information, and they are not even the most important > one. In fact, type inference is hampered by quite a lot > of other factors. This was entirely unclear. I saw it as some kind of weird ramble about changing Python's exception behavior in some unclear way, and for some unknown purpose. Given the above paragraph: know I understand what you are trying to get at. > I'm sure you will agree on some. For example, I'm sure > you understand how 'freezing' a module, or banning > rebinding of variables afer importing, or, disallowing > > module.attribute = xxx > > will help type inference: you clearly understand that > python is very dynamic, which makes static analysis > difficult. Right? Yup. > So what I am trying to do is list all the things > OTHER than adding optional type declarations, > which might contribute to our agreed upon aim, > namely, to provide a compiler with enough information > to generate fast code. All right. >... module attribute assignments ... >... add() example, explaining exceptions mess up compiler ... > > 1) The spec says: > IF the arguments are both ints .. > OR IF the arguments are both strings .. > OTHERWISE an exception is thrown > > 2) The spec says: > IF the arguments are both ints .. > OR IF the arguments are both strings ... > OTHERWISE THE BEHAVIOUR IS UNDEFINED >... > Therefore, there is a performance advantage in adopting > spec (2) as a language specification, instead of (1). > Note this does not mean the generated code will crash, > if x is not an integer. What it means is that if the > compiler detects that x is not an integer, it can > generate a compile time error. It is NOT allowed to > do that with specification (1). Interesting point. As a user of Python, I like (1) and do not want to see Python to change to use (2). Sure, it hurts the compiler, but it ensures that I always know what will happen. > So my point is: the Python documentation contains > many examples where it says 'such and such an exception > is thrown', and this prevents generating fast code, > and it prevents early error detection. The point > is that throwing an exception is _well defined_ behaviour, > and it would be better if the specification said > the program was in error. That way, a compiler > can report the error at compile time. While true, I think the compiler will still know enough about the type information to generate very good code. If somebody needs to squeak even more performance, then I'd say Python is the wrong language for the job. I do understand that you believe Python can be the right language, IFF it relaxes its specification. I hope it doesn't, though, as I like the rigorous definition. >... catching exceptions ... > is a valid python code fragment, and it relies on > x + 1 throwing an exception if x is not an integer. 
> So IF we continue to allow this, we cannot deduce > that x must be an integer, and this prevents optimising > generated code, both here, and in other places in the > program. It only prevents it in the presence of exception handlers. You can still do a lot of optimization outside of them. And a PyIntType_Check() test here and there to validate your assumptions is techically more expensive than not having it, but (IMO) it is not expensive in absolute terms. >... > For this reason, I think it is important that the Types SIG > also examine when it is legitimate to catch an exception > at run time, and do something, and when the code is just > plain wrong, and a compiler can reject it. Sure. > My proposal, and I thought it was fairly 'concrete' as > you required, was that apart from EnvironmentErrors, > all _standard_ exceptions not trapped within a function body > (I said 'lexically local' before, but now I'll be more > specific), are in fact programming errors: the programmer > may NOT rely on catching any standard exceptions > other than environment errors, generated by code inside > a client written function. All right. This is clear now. And it is clearly something that I would not want to see :-) Cheers, -g -- Greg Stein, http://www.lyra.org/ From tim_one@email.msn.com Sat Dec 18 20:56:44 1999 From: tim_one@email.msn.com (Tim Peters) Date: Sat, 18 Dec 1999 15:56:44 -0500 Subject: [Types-sig] type declaration syntax In-Reply-To: <385B08AE.A491CD36@maxtal.com.au> Message-ID: <000001bf499a$64fa9c00$dca2143f@tim> [John Skaller] > ... > But the _return_ type doesn't need to be annotated as much. > Why? Because the inferencer can usually deduce it: > it's an output, the argument types are inputs. > > If the inferencer _cannot_ deduce the return type, > it _also_ cannot check that the function is returning > the correct type. "The" correct type (as opposed to "a type consistent with the operations") is impossible for an inferencer to determine, but this is addressed more to the SIG than to John : My bet is that the vast majority of Python people asking for "static typing" have in mind a conventional explicit system of the Algol/Pascal/C (APC) ilk, and that decisions based on what *inference* schemes can do are going to leave them very unhappy. Inference schemes commit two kinds of gross errors that the APC camp won't abide: 1) Inferring types that aren't general enough. 2) Inferring types that are too general. Both mistakes occur because inference can only look at the code that's written, knowing nothing about the user's *intent*. In APC, explicit type declarations often serve the latter purpose, supplying (& enforcing) design and semantic constraints that can't be deduced from the code: 1) Not general enough. This is usually due to an implementation in progress, where looking at the code that currently exists can't possibly guess what will get implemented tomorrow; e.g., a function that returns an int if it can, but is spec'ed to return None if it can't, but the author hasn't yet gotten around to coding up the latter cases. The clients of this routine must nevertheless accept an IntOrNone result, and explicit declarations can force that on clients long before the routine is actually capable of producing a None. The alternative is a large class of all too familiar last-second "integration crises". 2) Too general. This is very common in numeric programming. An inferencer sees a routine with nothing but +, *, / and infers "ah, any Number will do". 
But it's *unusual* for any such routine to work *correctly* for all Numbers (algorithms appropriate for complex numbers are often wildly different from those appropriate for floats, and likewise for ints). For example, I tell Haskell intgamma 1 = 1 intgamma n = x * intgamma x where x = n-1 and it deduces the type intgamma :: Num a => a -> a Arghghghgh . Yes, every flavor of Num supports all the operations I used, but no, call it with anything other than an Int and the *algorithm* is plain wrong. APC folk (Haskell folk too) routinely use explicit declarations to enforce such constraints. Explicit typing goes beyond what type inference can do "even in theory"; while types must be *consistent* with the code, only the author can know *the* correct type (which may be more-- or less! --general than what an inferencer determines is merely consistent). in-the-apc-tradition-types-communicate-design-ly y'rs - tim From tim_one@email.msn.com Sat Dec 18 20:56:49 1999 From: tim_one@email.msn.com (Tim Peters) Date: Sat, 18 Dec 1999 15:56:49 -0500 Subject: [Types-sig] Type Inference I In-Reply-To: Message-ID: <000101bf499a$670a9040$dca2143f@tim> > ... the _expression_ x+1 can only be an integer, we _can_ deduce that > ... You can't deduce/infer anything from x+1. > ["Yes! No!" * 10 ] We can deduce that x+1 will blow up at runtime with a TypeError (perhaps spelled by some other name ) unless type(x) supports an __add__ method which in turn accepts (at least, and besides self) a single argument of type Int. If type(x) does support an __add__ method which in turn etc, we have no idea whether it will blow up at runtime. But the current incarnation of the Types-SIG (TCIOTTS) doesn't care about that: it's trying (only!) to determine at compile-time when it's certain that type(x) *does* support etc. Toward that end, TCIOTTS assumes that type(x) and all relevant information about type(x).__add__ has been handed to it on a silver platter. The type of x+1 is the union of all the types that T.__add__(1) may return across all types T in the set of possible types for x, and that info constitutes the "silver platter" handed to x+1's context. Bottom-up, all the way, with oracles at the base. AFAIK, TCIOTTS doesn't yet have an explicit policy about what to do in the presence of try/except blocks. Everyone has clearly assumed that, for purposes of type-checking, the possibility of an *up*-level handler will be ignored (and if a user can't live with that, fine, then they can't enable type-checking). Given that this SIG self-destructed the last time it tried to take on too much, and currently has a goal to produce genuinely useful code in a matter of months, I doubt TCIOTTS will be persuaded to move beyond that for now. Indeed, I think it should forget inferencing *entirely* at the start, even for cases like def unity() -> Int: a = 1 # compile-time error in type-check mode -- a not declared return a Inferencing (ya, ya -- *useful* inferencing) is harder than mere checking (indeed, checking is easy enough to write in K&R C ). one-man's-opinion-ly y'rs - tim From paul@prescod.net Sat Dec 18 17:06:10 1999 From: paul@prescod.net (Paul Prescod) Date: Sat, 18 Dec 1999 11:06:10 -0600 Subject: [Types-sig] Optionality and performance References: Message-ID: <385BBF02.E4B1DC2@prescod.net> Greg Stein wrote: > > Skip Montanaro wrote: > > I humbly assert this train of thought rates a *bzzzt*. I thought one core > > requirement was that all type declaration stuff be optional. 
The worst that > the type checker/inferencer should do in the face of incomplete type info is > display a warning. > My entire post was pre-conditioned on the assumption that type-checking > has been enabled. Optionality of type checking is not about it being enabled or disabled. Even when it is enabled, type checking any particular method must be optional. This whole discussion should presume "enabled". But optionality is still important. > IMO, type checking is NOT enabled by default. I believe it will impose a > noticable performance penalty and I'm not willing to pay that in the > general case. I don't see how we can logically treat type checks differently than array bounds checks, overflow checks and so forth. It needs to be on by default and we'll just need to figure out how to minimize its impact. Most type checks should involve quick pointer comparisons and that will be covered up by the other performance enhancement. In particular, when you declare conformance to a class or interface, method calls should no longer be string-dispatched. That means you need interfaces to be like vtables so the type checker's job is to find the right vtable. The type check actually comes "for free" in implementing the name lookup optimization.
-- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things never trust in: That's the vendor's final bill The promises your boss makes, and the customer's good will http://www.geezjan.org/humor/computers/threes.html
From paul@prescod.net Sat Dec 18 17:13:26 1999 From: paul@prescod.net (Paul Prescod) Date: Sat, 18 Dec 1999 11:13:26 -0600 Subject: [Types-sig] Syntax References: <000801bf482c$71670100$63a2143f@tim> Message-ID: <385BC0B6.3879244B@prescod.net> Tim Peters wrote: > > [Martijn Faassen] > > While my agenda is to kill the syntax discussions for the moment, > > ... > > Martijn, in that case you should stop feeding the syntax meta-discussion and > just view all the other notations as virtual spellings for masses of obscure > nested dicts . Let me point out that it was the masses of obscure nested dicts that I was objecting to when I told Greg that the syntax cannot be restricted to Python (by which I meant Python 1.5). Obviously, by definition any syntax that we use for Python 2 becomes "Python". In fact, I don't see a lot of difference between the widely embraced Tim-syntax and the syntax I posted a few days ago (based on the Tim-syntax). But if putting the keyword "decl:" in front makes it feel better then I'm all for that! I'm still thinking that it should go in another file because I want to be able to experiment with this stuff WITHOUT maintaining a new Python interpreter binary.
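(For concreteness, a separate-file experiment could be as crude as the sketch below. The dict layout and every name in it are invented, and the 'interface' is inlined here so the example runs as-is; the real format is exactly what is still being argued about.)

    # What a separate declaration file might hold (invented layout) ...
    declarations = {
        "frob": {"args": ["Int", "String"], "returns": "Int"},
        "missing": {"args": [], "returns": "None"},
    }

    # ... and the module it is supposed to describe, inlined here.
    def frob(n, s):
        return n + len(s)

    def check(namespace, decls):
        # Toy consistency pass: declared functions exist, argument counts match.
        problems = []
        for name, sig in decls.items():
            func = namespace.get(name)
            if func is None:
                problems.append("%s declared but not defined" % name)
            elif func.__code__.co_argcount != len(sig["args"]):
                problems.append("%s: %d args declared, %d defined"
                                % (name, len(sig["args"]), func.__code__.co_argcount))
        return problems

    assert check(globals(), declarations) == ["missing declared but not defined"]

No interpreter changes are needed for this kind of experiment; the open question is only what the declarations should be able to say.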
-- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things never trust in: That's the vendor's final bill The promises your boss makes, and the customer's good will http://www.geezjan.org/humor/computers/threes.html From paul@prescod.net Sat Dec 18 18:06:25 1999 From: paul@prescod.net (Paul Prescod) Date: Sat, 18 Dec 1999 12:06:25 -0600 Subject: [Types-sig] Type Inference I References: <385ACE5C.17CD5684@maxtal.com.au> Message-ID: <385BCD21.1373DDAD@prescod.net> skaller wrote: > Of course, the _expression_ x+1 can only be an integer, > we _can_ deduce that. But that isn't enough. Python > is too dynamic. We need more constraints to be able > to do effective inference. John, I have tried languages that were big on inferencing and I have tried languages that were big on dynamicity and I strongly prefer the latter. I don't see how your global type inferencer is going to handle: a = 1 + unpickle( "foo.pcl" ) b = a + eval( raw_input() ) I don't think that we can make these illegal without alienating most Python users. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things never trust in: That's the vendor's final bill The promises your boss makes, and the customer's good will http://www.geezjan.org/humor/computers/threes.html From paul@prescod.net Sat Dec 18 18:43:31 1999 From: paul@prescod.net (Paul Prescod) Date: Sat, 18 Dec 1999 12:43:31 -0600 Subject: [Types-sig] type declaration syntax References: <385B1119.D036BB9F@maxtal.com.au> Message-ID: <385BD5D3.2BD541B5@prescod.net> skaller wrote: > > I don't agree, but this time it is because I think you have > _understated_ the issue. I just can't win. :) -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things never trust in: That's the vendor's final bill The promises your boss makes, and the customer's good will http://www.geezjan.org/humor/computers/threes.html From paul@prescod.net Sat Dec 18 19:59:28 1999 From: paul@prescod.net (Paul Prescod) Date: Sat, 18 Dec 1999 13:59:28 -0600 Subject: [Types-sig] Type Inference I References: <385BB080.A2DBB67C@maxtal.com.au> Message-ID: <385BE7A0.285282C3@prescod.net> I don't see how this strategy can work. skaller wrote: > > You might just think, seeing > > x + 1 > > that if x is not an integer, the code must be an error, > but the example above shows that you'd be wrong > if you said that. But as you and others have pointed out, Python is protocol-centric, not type-centric. In real Python, x could be anything that with an __add__ function. The optimization opportunity is thus dubious. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things never trust in: That's the vendor's final bill The promises your boss makes, and the customer's good will http://www.geezjan.org/humor/computers/threes.html From paul@prescod.net Sat Dec 18 20:31:13 1999 From: paul@prescod.net (Paul Prescod) Date: Sat, 18 Dec 1999 14:31:13 -0600 Subject: [Types-sig] type declaration syntax References: Message-ID: <385BEF11.A8A6266E@prescod.net> Greg Stein wrote: > > ... > > Nobody has ever suggested writing the bugger in C. My assumption is that > it will be written in Python. A second assumption is that it will always > remain as a lint-like tool rather than integrated into the core compiler. That is not my assumption. If a function creator asks for the function to be type checked, it should be type checked every time it is recompiled unless some option has turned type-checking off. 
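(A cheap way to picture that per-function opt-in, shown only to fix ideas -- the __declare__ attribute, the global switch, and the arity-only check are all invented:)

    CHECKING_ENABLED = 1            # the 'option' that can turn checking off

    def check_function(func):
        declared = getattr(func, "__declare__", None)
        if declared is None or not CHECKING_ENABLED:
            return []               # the creator did not ask to be checked
        actual = func.__code__.co_argcount
        if actual != len(declared):
            return ["%s: %d arguments declared, %d defined"
                    % (func.__name__, len(declared), actual)]
        return []

    def frob(a, b):
        return a + b
    frob.__declare__ = ["Int", "Int"]   # the creator asks for checking

    assert check_function(frob) == []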
The difference between type signatures and lint is that lint is guessing about things that are, strictly speaking, correct, but questionable. Type check declarations are either right or wrong and if they are wrong, the programmer should be told. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things see no end: A loop with exit code done wrong A semaphore untested, and the change that comes along http://www.geezjan.org/humor/computers/threes.html From paul@prescod.net Sat Dec 18 20:37:30 1999 From: paul@prescod.net (Paul Prescod) Date: Sat, 18 Dec 1999 14:37:30 -0600 Subject: [Types-sig] New syntax? References: Message-ID: <385BF08A.6B9BD070@prescod.net> Greg Stein wrote: > > Bite me. :-) > > You do raise a good point in another post, however: > > def foo(*args: (Int)): Python should not use tuples as "read-only lists." From a type-system point of view, a tuple should be a fixed-length, fixed-type data structure defined at compile time. A mathematician would not say: "A foo is a variable-length tuple of X". Rather they would say: "A foo is a variable-langth list of X." The "unary tuple" problem almost always arises when people are using tuples as readonly lists also. We should just make a readonly list type (or readonly type annotation) and be done with it. Heck, we could have read-write tuples also! -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things see no end: A loop with exit code done wrong A semaphore untested, and the change that comes along http://www.geezjan.org/humor/computers/threes.html From paul@prescod.net Sat Dec 18 20:46:05 1999 From: paul@prescod.net (Paul Prescod) Date: Sat, 18 Dec 1999 14:46:05 -0600 Subject: [Types-sig] type declaration syntax References: <385B08AE.A491CD36@maxtal.com.au> Message-ID: <385BF28D.38FE3543@prescod.net> skaller wrote: > > If the inferencer _cannot_ deduce the return type, > it _also_ cannot check that the function is returning > the correct type. Two different issues. Some functions will have return type declarations that are checked at runtime. I strongly believe that it should be legal to declare a return type on a function that cannot be proved to return the type you claim. def foo() -> String : return FunctionThatReturnsStringWhenICallWithString("abc") def foo() -> Int : return FunctionThatReturnsIntWhenICallWithInt(5) Anyhow, the inferencer won't have access to all of the code. We still need to deal with pre-compiled functions. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things see no end: A loop with exit code done wrong A semaphore untested, and the change that comes along http://www.geezjan.org/humor/computers/threes.html From paul@prescod.net Sat Dec 18 20:15:49 1999 From: paul@prescod.net (Paul Prescod) Date: Sat, 18 Dec 1999 14:15:49 -0600 Subject: [Types-sig] List of FOO References: <000d01bf48f6$764ee5a0$32a2143f@tim> <385AF7C9.7CEE78E5@maxtal.com.au> Message-ID: <385BEB75.BD06517B@prescod.net> Thanks for describing how viper does parameterized types. There are a couple of things that I don't understand: skaller wrote: > > PyListOfInt = PyListOf(PyIntType) But does this involve executing arbitrary code defined by PyListOf? That would hurt our ability to do static type checking. > x.append(1) > > ends up calling > > PyListOf.append(PyIntType, x, 1) > > which means it can check that 1 is of type PyIntType. 
Right, but what is the declaration for append and how does it say that it takes a single argument and the argument must be of type PyXType where X can vary? -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things see no end: A loop with exit code done wrong A semaphore untested, and the change that comes along http://www.geezjan.org/humor/computers/threes.html From gstein@lyra.org Sat Dec 18 21:24:37 1999 From: gstein@lyra.org (Greg Stein) Date: Sat, 18 Dec 1999 13:24:37 -0800 (PST) Subject: [Types-sig] Optionality and performance In-Reply-To: <385BBF02.E4B1DC2@prescod.net> Message-ID: On Sat, 18 Dec 1999, Paul Prescod wrote: > Greg Stein wrote: > > > > Skip Montanaro wrote: > > > I humbly assert this train of thought rates a *bzzzt*. I thought one core > > > requirement was that all type declaration stuff be optional. The worst that > > > the type checker/inferencer should do in the face of incomplete type info is > > > display a warning. > > > My entire post was pre-conditioned on the assumption that type-checking > > has been enabled. > > Optionality of type checking is not about it being enabled or disabled. > Even when it is enabled, type checking any particular method must be > optional. This whole discussion should presume "enabled". But > optionality is still important. I'm assuming that we type-check a module at a time -- that we don't have the kind of fine-grained checking you're assuming. If a person doesn't want find-grained checking, then they just shouldn't add type annotations there. > > IMO, type checking is NOT enabled by default. I believe it will impose a > > noticable performance penalty and I'm not willing to pay that in the > > general case. > > I don't see how we can logically treat type checks differently than > array bounds checks, overflow checks and so forth. It needs to be on by The latter are runtime checks. I do agree that *runtime* type checks will always be generated by the compiler (per my other emails) and that the runtime check will always be performed (well, not with -O, just like regular asserts are not enabled when -O is provided). > default and we'll just need to figure out how to minimize its impact. > Most type checks should involve quick pointer comparisons and that will > be covered up by the other performance enhancement. Again: runtime. I was referring to compile-time, static checks. I do not believe those will always be enabled. > In particular, when you declare conformance to a class or interface, > method calls should no longer be string-dispatched. That means you need > interfaces to be like vtables so the type checker's job is to find the > right vtable. The type check actually comes "for free" in implementing > the name lookup optimization. Different issue (and a good/valid one!). I would recommend adding this to your list of issues and deferring it for now. Tim rightly points out: taking on too much right now will just cause another self-destruction. Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Sat Dec 18 21:26:16 1999 From: gstein@lyra.org (Greg Stein) Date: Sat, 18 Dec 1999 13:26:16 -0800 (PST) Subject: [Types-sig] tuples (was: New syntax?) In-Reply-To: <385BF08A.6B9BD070@prescod.net> Message-ID: On Sat, 18 Dec 1999, Paul Prescod wrote: > Greg Stein wrote: > > > > Bite me. :-) > > > > You do raise a good point in another post, however: > > > > def foo(*args: (Int)): > > Python should not use tuples as "read-only lists." 
From a type-system > point of view, a tuple should be a fixed-length, fixed-type data > structure defined at compile time. Ideal or not, this is the current situation. *args is a tuple. Are you suggesting a particular change here? If so, then add it to your issues list :-) [you are maintaining one, right? :-)] Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Sat Dec 18 21:31:23 1999 From: gstein@lyra.org (Greg Stein) Date: Sat, 18 Dec 1999 13:31:23 -0800 (PST) Subject: [Types-sig] Syntax In-Reply-To: <385BC0B6.3879244B@prescod.net> Message-ID: On Sat, 18 Dec 1999, Paul Prescod wrote: > Tim Peters wrote: > > > > [Martijn Faassen] > > > While my agenda is to kill the syntax discussions for the moment, > > > ... > > > > Martijn, in that case you should stop feeding the syntax meta-discussion and > > just view all the other notations as virtual spellings for masses of obscure > > nested dicts . > > Let me point out that it was the masses of obscure nested dicts that I > was objecting to when I told Greg that the syntax cannot be restricted > to Python (by which I meant Python 1.5). Obviously, by definition any > syntax that we use for Python 2 becomes "Python". In fact, I don't see a I'll reiterate: I think our goal is for 1.6. We should assume that 2.0 does not and will not exist. It is too far out to defer any of our goals to that version. Yes, we'll have V1, V2, V3 goals, but I think we ought to shoot for their inclusion into 1.6. Only when Guido says "no, I don't want to put that into 1.6," *then* we start to lobby for Python 2.0 changes." > lot of difference between the widely embraced Tim-syntax and the syntax > I posted a few days ago (based on the Tim-syntax). But if putting the > keyword "decl:" in front makes it feel better then I'm all for that! Sorry. I won't let you rewrite history :-). You were suggesting a new, alternative syntax, rather than adding new syntax to Python. Tim and I (and some others) have lobbied for adding new syntax. In particular, I don't want to see Yet Another Language and Yet Another Parser to deal with a distinct language/syntax for type specifications. > I'm still thinking that it should go in another file because I want to > be able to experiment with this stuff WITHOUT maintaining a new Python > interpreter binary. This will be quite possible. My current development proposal specifies the static, compile-time checker as a separate tool. That tool could easily use a separate file for its input. Regardless: I'd hope that the first step to any implementation is to update the Python grammar and allow us to annotate existing Python programs (i.e. to use inline syntax). Updating the grammar is not super difficult, but I hear you about wanting to not use another binary. But I'll just shrug that off and say that's your problem :-) Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Sat Dec 18 21:34:51 1999 From: gstein@lyra.org (Greg Stein) Date: Sat, 18 Dec 1999 13:34:51 -0800 (PST) Subject: [Types-sig] type declaration syntax In-Reply-To: <385BEF11.A8A6266E@prescod.net> Message-ID: On Sat, 18 Dec 1999, Paul Prescod wrote: > Greg Stein wrote: > > ... > > Nobody has ever suggested writing the bugger in C. My assumption is that > > it will be written in Python. A second assumption is that it will always > > remain as a lint-like tool rather than integrated into the core compiler. > > That is not my assumption. 
If a function creator asks for the function > to be type checked, it should be type checked every time it is > recompiled unless some option has turned type-checking off. If you want to write the C code, then please be my guest. I'm hoping that I'll find time to contribute to actual coding here (between my other projects), and assuming that to be true, then I'll be using Python. I'm structuring my development proposal assuming that Python will be used for the majority of the compile-time checking. > The difference between type signatures and lint is that lint is guessing > about things that are, strictly speaking, correct, but questionable. > Type check declarations are either right or wrong and if they are wrong, > the programmer should be told. Woah!! Do not read "historical implementation of lint" into my phrasing. I meant "a separate tool, separately invoked." I totally agree that it will declare things right/wrong. However, I do not believe that it will be integrated into the core, bytecode compiler any time in the near future. If it does, then its invocation will be optional (IMO). Cheers, -g -- Greg Stein, http://www.lyra.org/ From skaller@maxtal.com.au Sat Dec 18 22:47:44 1999 From: skaller@maxtal.com.au (skaller) Date: Sun, 19 Dec 1999 09:47:44 +1100 Subject: [Types-sig] List of FOO References: <000d01bf48f6$764ee5a0$32a2143f@tim> <385AF7C9.7CEE78E5@maxtal.com.au> <385BEB75.BD06517B@prescod.net> Message-ID: <385C0F10.E003F55E@maxtal.com.au> Paul Prescod wrote: > > Thanks for describing how viper does parameterized types. There are a > couple of things that I don't understand: > > skaller wrote: > > > > PyListOfInt = PyListOf(PyIntType) > > But does this involve executing arbitrary code defined by PyListOf? Yes. > That would hurt our ability to do static type checking. I'm not so certain. I think you have asked the right question. Here's why I'm uncertain: the code for PyListOf is written in Python. Typically, it will be a simple class. A compiler or other static analysis tool can analyse that code just like any other. Now, for _builtin_ types, it will surely help to have _builtin_ semantics, and this is possible, because Python does have a specification for these types. For user defined types, it isn't clear analysing a type object is that much harder or different, to analysing any other python code. Compare with analysing the behaviour of class instances, tracking which classes they are statically. I'm not sure it is much different. In fact, __getattr_ and friends already make analysis of user defined classes difficult .. so perhaps there isn't much difference here. I don't (yet) know. > > x.append(1) > > > > ends up calling > > > > PyListOf.append(PyIntType, x, 1) > > > > which means it can check that 1 is of type PyIntType. > > Right, but what is the declaration for append and how does it say that > it takes a single argument and the argument must be of type PyXType > where X can vary? Well, in Viper, the definition of append would be in the class PyListOf: class PyListOf: ... def append(Type, object, value): if type(value) is not Type): raise TypeError else: object.append(value) Now, here, using a "Guido rambling argument" I think you ( a human ) could deduce what is going on. The explicit type test indicates typeness of value: it tells that the type of value in object.append(value) must be Type. It is harder to deduce that 'object' is a list. Indeed, it might not be, it can be anything with an append method. 
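(A runnable variant of that sketch, with invented details rather than the actual Viper code: keep the checked wrapper and the underlying plain list separate, and let PyListOf(PyIntType) be the ordinary Python expression that builds the type object.)

    PyIntType = int
    PyStringType = str

    class PyListOf:
        def __init__(self, item_type):
            self.item_type = item_type
        def __call__(self):                     # instantiate a checked list
            return CheckedList(self.item_type)

    class CheckedList:
        def __init__(self, item_type):
            self.item_type = item_type
            self.items = []                     # the underlying plain list
        def append(self, value):
            if not isinstance(value, self.item_type):
                raise TypeError("expected %s, got %s"
                                % (self.item_type.__name__, type(value).__name__))
            self.items.append(value)            # delegate, so no recursion

    PyListOfInt = PyListOf(PyIntType)
    x = PyListOfInt()
    x.append(1)                                 # fine
    # x.append("one")                           # would raise TypeError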
Hopefully, the definition I gave won't lead to an infinite recursion. I guess the point I'm making is: suppose the Viper type system works out nicely for the interpreter. Then this suggests a more 'pythonic' way of naming types, the way python programmers do it now: type([1,2,3]) is types.ListType type(user_object) is user_module.MyType where the RHS in both cases is a python expression denoting a type object. The reason I'm suggesting this is worth examining, is that it doesn't require much change to python: the CPython currently uses special type objects for types ... but JPython is a bit different, and Viper just generalises CPython a bit. At least one advantages is that C extensions are well covered by this idea. No, I should say "it seems to me that this might work well with C extensions, possibly better in Python 2 than 1.6 (since the architecture of Python 2 will be reworked)". Might also work better for JPython too. Note I'm not against using a functional language's type description for Python, a'la Tim/Haskell, but it isn't clear that is going to work well either, and it seems to involve 'extra' work, writing a parser for a 'new' language, etc. I think you said 'ignore non builtin types for the moment', and I think I'm giving an argument that this might not be such a restriction after all. -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From faassen@vet.uu.nl Sat Dec 18 22:54:12 1999 From: faassen@vet.uu.nl (Martijn Faassen) Date: Sat, 18 Dec 1999 23:54:12 +0100 Subject: [Types-sig] Type Inference I In-Reply-To: <000101bf499a$670a9040$dca2143f@tim> References: <000101bf499a$670a9040$dca2143f@tim> Message-ID: <19991218235412.A15050@vet.uu.nl> Tim Peters wrote: > Indeed, I think it should forget inferencing *entirely* at the start, even > for cases like > def unity() -> Int: > a = 1 # compile-time error in type-check mode -- a not declared > return a To use my famous phrase again: I agree. The counter argument I got to this before is that inferencing takes place anyway in the case of expressions: def foo(a, b): # Martijn's evil verbose format in yet another form decl: a = Int b = Int return Int return a + b 'a + b' would need inferencing to figure out what the type is of the complete expression. I think that this argument overlooks that this kind of evaluation is a lot more easy than a back-tracking kind of inferencing. > Inferencing (ya, ya -- *useful* inferencing) is harder than mere checking > (indeed, checking is easy enough to write in K&R C ). Though checking could be seen as a kind of inferencing, right? Or are people confusing the issues? Initially I didn't consider the expression evaluation stuff as inferencing either, but there's a good argument to consider it so, not? Regards, Martijn From skaller@maxtal.com.au Sat Dec 18 23:05:41 1999 From: skaller@maxtal.com.au (skaller) Date: Sun, 19 Dec 1999 10:05:41 +1100 Subject: [Types-sig] type declaration syntax References: <000001bf499a$64fa9c00$dca2143f@tim> Message-ID: <385C1345.C21FF180@maxtal.com.au> Tim Peters wrote: > My bet is that the vast majority of Python people asking for "static typing" > have in mind a conventional explicit system of the Algol/Pascal/C (APC) ilk, > and that decisions based on what *inference* schemes can do are going to > leave them very unhappy. I'm not sure why. 
My 'assumption' is that 1) a conservative inferencer is used, which means it tries to optimise code by inference, but if it isn't sure, it falls back to the usual run-time checking -- that is, it faithfully reproduces the expected behaviour no matter what. 2) optional static type declarations allow the performance of the inferencer to be improved; that is, to generate better code 3) it would also help to tighten up the specifications of python, particularly in areas like a) when is it OK to expect an exception b) module freezing etc. I would make the point that, as often is the case, the client is 'asking' for X, but what they actually need is Y, because they don't understand their own requirements. That is, they may be 'asking' for APC style static typing, but they have no idea what the implications are, and if they knew, they would withdraw their application. I guess that NO python programmer wants to declare the type of every single name, which is what APC style static type checking requires. So they 'throw in' the word 'optional', and that changes the whole thing to 'general inference like in a functional programming language, only trickier, because Python isn't' :-) -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From skaller@maxtal.com.au Sat Dec 18 23:15:05 1999 From: skaller@maxtal.com.au (skaller) Date: Sun, 19 Dec 1999 10:15:05 +1100 Subject: [Types-sig] Type Inference I References: <385ACE5C.17CD5684@maxtal.com.au> <385BCD21.1373DDAD@prescod.net> Message-ID: <385C1579.1D937D34@maxtal.com.au> Paul Prescod wrote: > I don't see how your global type inferencer is going to handle: > > a = 1 + unpickle( "foo.pcl" ) > b = a + eval( raw_input() ) > > I don't think that we can make these illegal without alienating most > Python users. I agree. the way I plan to handle this in Viperc is to fall back on the run time system (Viperi). However, you might be surprised how well inference can do. For example, consider b = a + eval( raw_input() ) It may seem that this tells nothing about a or b. But looking closer, both a and b must be 'addable' in some sense. Furthermore, in context, both 'a' and 'b' have to be _used_ elsewhere for the code to be useful, and we can learn more about the typing from examining those contexts. There is no need to always deduce the types: python is not a functional programming language with a full static typing system. It is enough, that we can make significant performance improvements in some places, or report a few definite errors. Short answer: you're right, but it doesn't matter: no one expects a python compiler to produce code that runs as fast as C. -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From skaller@maxtal.com.au Sat Dec 18 23:37:22 1999 From: skaller@maxtal.com.au (skaller) Date: Sun, 19 Dec 1999 10:37:22 +1100 Subject: [Types-sig] Type Inference I References: <385BB080.A2DBB67C@maxtal.com.au> <385BE7A0.285282C3@prescod.net> Message-ID: <385C1AB2.E840F9D6@maxtal.com.au> Paul Prescod wrote: > > I don't see how this strategy can work. > > skaller wrote: > > > > You might just think, seeing > > > > x + 1 > > > > that if x is not an integer, the code must be an error, > > but the example above shows that you'd be wrong > > if you said that. > > But as you and others have pointed out, Python is protocol-centric, not > type-centric. 
In real Python, x could be anything that with an __add__ > function. The optimization opportunity is thus dubious. That depends on the scope of the analyser I think. If you are only analysing a function, by itself, without any type declarations, you are probably right that many cases cannot be optimised. In that case type declarations may help. However, it may well be that it _is_ possible to deduce things in an isolated function in important places, like in the body of an innner loop. On the other hand, if you widen the scope of the analyser to a whole module, or the whole program, then it may be possible to do better. [See Guidos 'rambling' post] Greg wants to write one kind of tool, I'm building a different one. The point is to try to help both these tools, and any others, do a better job for the programmer, by changing the python language. I.e. the goal of the SIG is to recommend language changes NOT to produce any kind of tool (although that is useful to help decide what needs changing, and it may also be useful to end users as well: these are secondary goals) At least, that's my understanding. -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From skaller@maxtal.com.au Sat Dec 18 23:39:45 1999 From: skaller@maxtal.com.au (skaller) Date: Sun, 19 Dec 1999 10:39:45 +1100 Subject: [Types-sig] Type Inference I References: Message-ID: <385C1B41.73F23836@maxtal.com.au> Greg Stein wrote: > Bill Tutt and I have done it and measured about 30% speed improvement in > most cases. Not as lot as most people would hope for, but definitely > there. Bill is continuing to improve the code. That's quite worthwhile, though. -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From gstein@lyra.org Sat Dec 18 23:43:30 1999 From: gstein@lyra.org (Greg Stein) Date: Sat, 18 Dec 1999 15:43:30 -0800 (PST) Subject: [Types-sig] what tools? (was: Type Inference I) In-Reply-To: <385C1AB2.E840F9D6@maxtal.com.au> Message-ID: On Sun, 19 Dec 1999, skaller wrote: >... > Greg wants to write one kind of tool, I'm building a different one. > The point is to try to help both these tools, and any others, > do a better job for the programmer, by changing the python > language. Agreed. So far, I do not believe that adding type annotations (declarations) will hinder your tool. And it certainly will help the standard tools (i.e. those incorporated into the standard distro). > I.e. the goal of the SIG is to recommend language > changes NOT to produce any kind of tool (although that > is useful to help decide what needs changing, and it may > also be useful to end users as well: these are secondary > goals) At least, that's my understanding. I don't believe we are limited to language changes. That is a bit too narrow to solving the problems at hand. I figure that we'll implement an external tool, leaving the integration decision for another day. Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Sat Dec 18 23:44:20 1999 From: gstein@lyra.org (Greg Stein) Date: Sat, 18 Dec 1999 15:44:20 -0800 (PST) Subject: [Types-sig] Type Inference I In-Reply-To: <385C1B41.73F23836@maxtal.com.au> Message-ID: On Sun, 19 Dec 1999, skaller wrote: > Greg Stein wrote: > > > Bill Tutt and I have done it and measured about 30% speed improvement in > > most cases. Not as lot as most people would hope for, but definitely > > there. 
Bill is continuing to improve the code. > > That's quite worthwhile, though. Yup. But when people say "Python is 10X slower", then you want a 10X speed improvement to shut them up :-) Cheers, -g -- Greg Stein, http://www.lyra.org/ From tony@metanet.com Sat Dec 18 23:56:50 1999 From: tony@metanet.com (Tony Lownds) Date: Sat, 18 Dec 1999 15:56:50 -0800 (PST) Subject: [Types-sig] Syntax In-Reply-To: <385BC0B6.3879244B@prescod.net> Message-ID: On Sat, 18 Dec 1999, Paul Prescod wrote: > Tim Peters wrote: > > > > [Martijn Faassen] > > > While my agenda is to kill the syntax discussions for the moment, > > > ... > > > > Martijn, in that case you should stop feeding the syntax meta-discussion and > > just view all the other notations as virtual spellings for masses of obscure > > nested dicts . > > Let me point out that it was the masses of obscure nested dicts that I > was objecting to when I told Greg that the syntax cannot be restricted > to Python (by which I meant Python 1.5). Obviously, by definition any > syntax that we use for Python 2 becomes "Python". In fact, I don't see a > lot of difference between the widely embraced Tim-syntax and the syntax > I posted a few days ago (based on the Tim-syntax). But if putting the > keyword "decl:" in front makes it feel better then I'm all for that! > > I'm still thinking that it should go in another file because I want to > be able to experiment with this stuff WITHOUT maintaining a new Python > interpreter binary. > I think it'd be possible to put type declarations in-line without using a new binary, at least in the short term: 1. make a module that overloads __import__() 2. when a module is imported it asks the syntax handler to parse the file and generate a plain .py file and a .pi (ie "interface") file with appropriately nested dicts in it. 3. Then it asks the type checker to make sure the .pi and .py match up. The type checker may need to call __import__() recursively. 4. Then, __import__() should import the generated .py file. There are a few caveats I can think of: a. eval/exec/execfile couldnt use type declarations b. The outputted .py file would basically be stripped of type declarations, nothing would be added to it. A full-blown system might want to add runtime type checks. c. The syntax handler wouldn't get to use Python's parser "for free". -Tony From gstein@lyra.org Sun Dec 19 00:04:05 1999 From: gstein@lyra.org (Greg Stein) Date: Sat, 18 Dec 1999 16:04:05 -0800 (PST) Subject: [Types-sig] Syntax In-Reply-To: Message-ID: On Sat, 18 Dec 1999, Tony Lownds wrote: >... > I think it'd be possible to put type declarations in-line without using a > new binary, at least in the short term: > > 1. make a module that overloads __import__() > > 2. when a module is imported it asks the syntax handler to parse the file > and generate a plain .py file and a .pi (ie "interface") file with > appropriately nested dicts in it. > > 3. Then it asks the type checker to make sure the .pi and .py match up. > The type checker may need to call __import__() recursively. > > 4. Then, __import__() should import the generated .py file. Interesting approach! However, I'd think that implementing that would be about the same difficulty as altering Python's grammar (i.e. not a walk in the park, but not hard). But if a single binary is important (for now), then your thought is quite valid. Over the next week or so, I'm going to be work on Python's import system. 
Depending on whether Guido likes the changes and if checks them in, then tweaking the import as you mention would get a good deal easier. Cheers, -g -- Greg Stein, http://www.lyra.org/ From skaller@maxtal.com.au Sun Dec 19 00:42:48 1999 From: skaller@maxtal.com.au (skaller) Date: Sun, 19 Dec 1999 11:42:48 +1100 Subject: [Types-sig] Type Inference I References: Message-ID: <385C2A08.DD36C485@maxtal.com.au> Greg Stein wrote: > But to state up front: I don't think we want to rely on whole-program > analysis. Right. I accept this as a reasonable requirement. Perhaps to explain my viewpoint: it is worthwhile seeing just how far it is possible to go, with no changes, and with a few small changes like optional type checking, in each case: whole program analysis, single module, and single function. The reason it is worth considering whole program analysis is that, because there is a LOT more information available, such a tool can do a much better job, and therefore less changes are needed to the python language. Establishing the _minimum_ changes required is useful, even if we both agree we'd like to do more -- and I share your desire to work at a much finer grained level like modules or functions. Viperc has taken the whole program approach, NOT because I like it, but because, at the moment, there is no real alternative. I'd sure like to see things that made per module compilation possible!! > At the moment, I am assuming that a type-checking tool will not > be part of the [byte-code] compiler -- that is is just too much and too > slow to directly include. You are probably right. however, I think the issue here is the Python language, and what _might_ be done if we change it, rather than any particular tool. That is, we should be examining "what can we do, if we make these changes" and not "what tool can we write for the CPython 1.6 distribution". I'm not saying a tool cannot be written, just that the issue isn't writing the tool, but changing the language so tool writers can actually get better results. > To that end, I think we might eventually want to integrate something. Actually, I tend to think the most likely tool which will get people excited is a compiler that generates C code: nothing to do with the bytecode interpreter at all. But I could be wrong. [The JPython people, for example, won't care :-] > And to do that, we definitely cannot rely on whole-program analysis. Agreed. It makes sense to consider how to change Python so a more localised tool can work well. > In other words, if we depend on whole-program analysis, then I don't think the > builtin, byte-code compiler will ever be able to take advantage of type > information. You are probably right, but I don't think the bytecode compiler is the target. Well, i _didn't_ think that, until you said that's what you were interested in right now. Forgive my misunderstanding! > I agree that you want to use every bit of information possible. I disagree > that I'm missing the point: I think we are discussing what will happen to > the native compiler. To that end, I *am* positing that 'we' will do > or . I see. This appears to be where we are crossing wires. It hadn't even occurred to me that this had anything at all to do with the bytecode compiler. My assumption was people were interested in: a) a stand alone type checker to help diagnose errors b) a compiler to convert functions, modules, or whole programs into C. 
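(For (a), a stand alone checker can be bolted onto ordinary imports roughly the way Tony sketched above. The skeleton below uses today's builtins module and an invented run_type_checker stub, purely to show the shape of it:)

    import builtins

    def run_type_checker(name):
        # Hypothetical: examine name.py (and a name.pi interface file, if any)
        # and return a list of error strings.
        return []

    _real_import = builtins.__import__

    def checking_import(name, *args, **kwargs):
        errors = run_type_checker(name)
        if errors:
            raise ImportError("type check failed for %s: %s" % (name, errors))
        return _real_import(name, *args, **kwargs)

    builtins.__import__ = checking_import       # opt in to checked imports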
Thanks for pointing out my assumptions were overly restrictive: you are right, the bytecode compiler might benefit from analysis too. > > Now, my point, in Type Inference I and II, is that static > > type declarations are only ONE way of providing > > more information, and they are not even the most important > > one. In fact, type inference is hampered by quite a lot > > of other factors. > > This was entirely unclear. I saw it as some kind of weird ramble about > changing Python's exception behavior in some unclear way, and for some > unknown purpose. I accept it was a weird ramble. Sorry. I was trying my best to explain something which is, in fact, difficult to understand for me. Perhaps that's why the ramble was so long winded and rambly (or perhaps I write like that anyhow :-) > > I'm sure you will agree on some. For example, I'm sure > > you understand how 'freezing' a module, or banning > > rebinding of variables afer importing, or, disallowing > > > > module.attribute = xxx > > > > will help type inference: you clearly understand that > > python is very dynamic, which makes static analysis > > difficult. Right? > > Yup. So, you agree with my point that there are OTHER things than optional type declarations that can improve the situation wrt typing/optimisation? You understand the freezing issue, but not the exception handling one? I don't fully understand it either. That's why the ramble, to promote discussion: if I fully understood it, I would have posted a tightly worded proposal instead. > >... module attribute assignments ... > >... add() example, explaining exceptions mess up compiler ... > > > > 1) The spec says: > > IF the arguments are both ints .. > > OR IF the arguments are both strings .. > > OTHERWISE an exception is thrown > > > > 2) The spec says: > > IF the arguments are both ints .. > > OR IF the arguments are both strings ... > > OTHERWISE THE BEHAVIOUR IS UNDEFINED > >... > > Therefore, there is a performance advantage in adopting > > spec (2) as a language specification, instead of (1). > > Note this does not mean the generated code will crash, > > if x is not an integer. What it means is that if the > > compiler detects that x is not an integer, it can > > generate a compile time error. It is NOT allowed to > > do that with specification (1). > > Interesting point. As a user of Python, I like (1) and do not want to see > Python to change to use (2). Sure, it hurts the compiler, but it ensures > that I always know what will happen. Right. But you can see the tension between these two specifications and the impact on performance?? So now: let me put it to you, that we could try for specification (3) -- which is a compromise between (1) and (2), which provides BOTH advantages with some restrictions: (3) .... OTHERWISE, IF the function call is enclosed in a try block within the same function as the call, AND that try block has a handler which explicitly catches the exception or a base thereof, or, any exception, THEN an exception will be thrown, OTHERWISE the program is in error. EXAMPLE: def f(x): try: return x + 1 except: return str(x) REQUIRED: x + 1 requires an exception be thrown at run time, and the function f will never fail. No compile time diagnostics can be printed. It is hard to optimise the code, but it does what Python does right now. EXAMPLE: def f(x): return x + 1 PERMITTED: a static analyser can report a compile time error, if it sees a call: def g(x): if type(x) is StringType: return f(x) THIS IS A CHANGE FROM CPYTHON. 
We're saying that this code IS AN ERROR, and that a compiler can REJECT the program as invalid python. Let me explain the rationale: even a simple, local, per function analyser can _see_ that a function call is wrapped inside a try/except clause, and can examine the exceptions that will be handled .. if the exception is a Python defined one, and is named with the Guido given name (rather than a variable bound to it), then this case is 'static' enough to determine that the user DELIBERATELY executed possibly faulty code, with the aim of handling the exception. In this case, we should respect the users wishes. Now, if the user puts the handler well down the stack, somewhere else .. then the user is NOT deliberately trying to use exception handling to do type checking, they're just trying to make the program continue to run without falling over, or, print some diagnostic before terminating .. in other words, in this case, it is reasonable to assume that the function call is a programming error. That is, THE CODE IS NOT VALID PYTHON. In the first example, the code IS valid python though. The question I would ask is: does this rule cover enough existing code, that if it is taken as a Python language rule, it will not break too many programs if a compiler REJECTS the code in the case that the handler is not local to the function call?? And the second question is: will this really provide opportunities for better code generation by a compiler, or better diagnostics from a static analyser? Finally: is there a better rule?? I'm really hoping here that you (Greg) will like the idea, because it helps a per function or per module static checker more than a whole program analyser. It is clear Guido is willing to add restrictions on the existing language rules, to allow better static analysis for ERR or OPT: he already said that module freezing is a goer. I do not know if the rule above is a goer: I don't really know what all the issues are. > While true, I think the compiler will still know enough about the type > information to generate very good code. I'm not so sure. I tried it, and the result was that because of the EH (exception handling) issue I mentioned, it wasn't possible EVEN with global whole program analysis, to get good results -- without doing a full control flow analysis as well. And that, unlike type inference, is probably too hard. > If somebody needs to squeak even > more performance, then I'd say Python is the wrong language for the job. I > do understand that you believe Python can be the right language, IFF it > relaxes its specification. I hope it doesn't, though, as I like the > rigorous definition. I would say python needs to _tighten_ its specification, not relax it, so we had better watch out when we use these terms we don't cross wires, since we clearly agree on the semantics we're refering to :-) [Technically, I think you're right, and I'm wrong] > It only prevents it in the presence of exception handlers. Yes. But 'presence' is a dynamic thing, not a lexical one. You need whole program control flow analysis to detect presence statically. The kind of rule I proposed above changes that. >You can still > do a lot of optimization outside of them. And a PyIntType_Check() test > here and there to validate your assumptions is techically more expensive > than not having it, but (IMO) it is not expensive in absolute terms. That depends on where the check is. 
If it is in a tight inner loop, where the code would otherwise a) do no function calls b) all fit in the machines hardware cache c) reduce to register operations then I'm sure you would agree you were wrong. And these are the cases of most interest, because it is the really tight Python coded inner loops where Python falls so badly behind C in performance. With just the right specifications and tweaks to the language, plus static type declarations, it may well be possible to compile down to extremely fast C code. > All right. This is clear now. And it is clearly something that I would not > want to see :-) So you are willing to throw out optimisation opportunities in favour of preserving the existing semantics. This is valid viewpoint. I accept one vote against any change here. I'd like to see other peoples opinions, and then perhaps a consensus can be reached: I am quite willing to accept whatever is decided by Guido in the end .. I can always add features to Viper than make it much faster than Python :-) But I'd prefer not to, I'd rather it compile Guido Python with good performance :-) -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From skaller@maxtal.com.au Sun Dec 19 00:54:33 1999 From: skaller@maxtal.com.au (skaller) Date: Sun, 19 Dec 1999 11:54:33 +1100 Subject: [Types-sig] Type Inference I References: Message-ID: <385C2CC9.8F4158A7@maxtal.com.au> Greg Stein wrote: > > On Sun, 19 Dec 1999, skaller wrote: > > Greg Stein wrote: > > > > > Bill Tutt and I have done it and measured about 30% speed improvement in > > > most cases. Not as lot as most people would hope for, but definitely > > > there. Bill is continuing to improve the code. > > > > That's quite worthwhile, though. > > Yup. But when people say "Python is 10X slower", then you want a 10X speed > improvement to shut them up :-) To put it in context: I have a literate programming tool, Interscript, written in Python. My problem isn't that it is 10X too slow. It takes four hours to build itself, instead of four seconds. For a more usual project, it would take 20 seconds, where less than a second is required. In other words, it is something like 100x - 1000X too slow. I know that other LP tools written in C can run that much faster, because I have used them. The crux of the problem is in the character by character handling which occurs in tight loops that continually reallocate strings by appending characters to them. I believe it is possible to compile suitably written and annotated python so it will run 100-1000X faster. 30% is worthwhile, but 100000% is better :-) -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From skaller@maxtal.com.au Sun Dec 19 00:58:15 1999 From: skaller@maxtal.com.au (skaller) Date: Sun, 19 Dec 1999 11:58:15 +1100 Subject: [Types-sig] Viper Type specification References: Message-ID: <385C2DA7.9E429569@maxtal.com.au> Greg Stein wrote: > > BTW: w.r.t expr!type, your (Greg's) proposal, what precedence > > would your give operator ! ? > > Lowest possible (as seen in the type-proposal.html I recently posted > here). I don't have any actual experience with it, but I would think that > when somebody is using it to annotate/verify their code, they would just > append it to the end of key lines in a function. The lowest precedence > creates the correct binding in this case. 
OK, I'll implement it, and if you make some test files, I'll run them and send you the results to see if they're what you expected (in an interpreter! no compilation yet :-) [BTW: it will take less than an hour to do: I just finished doing list comprehensions, it took less than 2 hour all up. Assignment operators took longer -- there are more of them to code up] -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From skaller@maxtal.com.au Sun Dec 19 01:04:17 1999 From: skaller@maxtal.com.au (skaller) Date: Sun, 19 Dec 1999 12:04:17 +1100 Subject: [Types-sig] Re: what tools? (was: Type Inference I) References: Message-ID: <385C2F11.C89266A4@maxtal.com.au> Greg Stein wrote: > > On Sun, 19 Dec 1999, skaller wrote: > >... > > Greg wants to write one kind of tool, I'm building a different one. > > The point is to try to help both these tools, and any others, > > do a better job for the programmer, by changing the python > > language. > > Agreed. So far, I do not believe that adding type annotations > (declarations) will hinder your tool. Of course not: I want them badly. > > I.e. the goal of the SIG is to recommend language > > changes NOT to produce any kind of tool (although that > > is useful to help decide what needs changing, and it may > > also be useful to end users as well: these are secondary > > goals) At least, that's my understanding. > > I don't believe we are limited to language changes. That is a bit too > narrow to solving the problems at hand. I figure that we'll implement an > external tool, leaving the integration decision for another day. Yeah, but your tool will probably by CPython centric, that is, it will not work in JPython or Viper because you will hook the AST stuff etc. [If you can do it all in 'pure python' that would be better -- however it STILL won't cope with Viper or Jpython extensions ...] -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From paul@prescod.net Sun Dec 19 01:56:03 1999 From: paul@prescod.net (Paul Prescod) Date: Sat, 18 Dec 1999 19:56:03 -0600 Subject: [Types-sig] Type Inference I References: Message-ID: <385C3B33.13C8AEFC@prescod.net> Greg Stein wrote: > > ... > > Yup. But when people say "Python is 10X slower", then you want a 10X speed > improvement to shut them up :-) Damn would that be sweet. We'll get some nice marketing on the safety side, too. One of the first questions Java programmers ask is: "is it like Perl in that its only going to catch my typing mistakes at runtime??" I'd love to be able to say: "If you ask it to." -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things see no end: A loop with exit code done wrong A semaphore untested, and the change that comes along http://www.geezjan.org/humor/computers/threes.html From tim_one@email.msn.com Sun Dec 19 04:42:50 1999 From: tim_one@email.msn.com (Tim Peters) Date: Sat, 18 Dec 1999 23:42:50 -0500 Subject: [Types-sig] Keyword arg declarations In-Reply-To: <00c101bf48bf$fb9850c0$c355cfc0@ski.org> Message-ID: <000101bf49db$810c5dc0$16a2143f@tim> [David Ascher] > ... > An example of such a signature is familiar to all is the signature for > range(). The docstring for range reads: > > range([start,] stop[, step]) -> list of integers > > which is not expressible with the current syntax. 
A Python > version of range would have to do, much like NumPy's arange does,
>
> def range(start, stop=None, step=1):
>     if (stop == None):
>         stop = start
>         start = 0

Or whrandom's funkier:

    def randrange(self, start, stop=None, step=1,
                  # Do not supply the following arguments
                  int=int, default=None):

> Now, the builtin typechecker can of course be told about > __builtin__.range's signature peculiarities, but is there > any way we can address the more general problem? Or is it, > as I suspect, rare enough that one can ignore it? I suggest you're wrestling with an illusion here: Python *internally* has no such form of argument list as

    range([start,] stop[, step])

That is, that's just the way the *doc* is written, to make it clearer. bltinmodule.c's builtin_range analyzes the snot out of the arglist, much like the Python versions do. A clue that the doc makes no actual sense is that it apparently allows expressing a stop and step without a start. Everything the builtin truly requires can be captured via the declaration

    decl range: def(Int, =Int, =Int) -> [Int]

using =Type notation for optional arguments that are not also keyword arguments. If the builtin also accepted these as keyword arguments, this could be expressed as (dropping my customary "|" in favor of GregS's "or"):

    decl range: def(stop: Int) -> [Int] or \
                def(start: Int, stop: Int, step=:Int) -> [Int]

using name=:Type notation for an optional keyword argument. An alternative is to change the docs to match what actually happens. but-no-need-for-extremes-ly y'rs - tim From da@ski.org Sun Dec 19 06:54:31 1999 From: da@ski.org (David Ascher) Date: Sat, 18 Dec 1999 22:54:31 -0800 Subject: [Types-sig] Keyword arg declarations References: <000101bf49db$810c5dc0$16a2143f@tim> Message-ID: <00ac01bf49ed$e76a3f80$df55cfc0@ski.org> From: Tim Peters > I suggest you're wrestling with an illusion here: Python *internally* has > no such form of argument list as > > range([start,] stop[, step]) > > That is, that's just the way the *doc* is written, to make it clearer. I know that. However, I can imagine that it will be hard to justify to the unwashed masses why they need to use seemingly unrelated syntax to describe the signature for humans and the signature for the compiler. I believe that you raise a similar point in another of your posts, w.r.t the 'int=int, ord=ord' extra junk in your function definition. That said, I suspect that the issue is peripheral and rare enough that I needn't worry. --da From paul@prescod.net Sun Dec 19 09:20:20 1999 From: paul@prescod.net (Paul Prescod) Date: Sun, 19 Dec 1999 03:20:20 -0600 Subject: [Types-sig] Easier? References: <385AD61D.548EB5F8@maxtal.com.au> Message-ID: <385CA354.18C252A7@prescod.net> skaller wrote: > > ... > > I'm not asking for that, just trying to explain how > important conformance issues and specifications are > in optimisation, and in particular, how important > it is that certain operations NOT be defined > (even by a requirement that an exception be thrown). I think that you should build an inferencer for a Python subset where TypeError and module-rebindings are illegal. If you get a serious speedup then those features will de facto fall out of use because people will want to use your inferencer to get serious speedups. At that point we can officially deprecate those features.
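The module rebinding being outlawed in that subset is easy to show concretely (an illustrative sketch using 1.5-era library names, not code from the thread):

    import string

    def sneaky():
        # legal Python today: rebind a name inside another module after
        # import, so "string.upper" can no longer be resolved statically
        string.upper = lambda s: s

An inferencer for the restricted subset would reject sneaky(); once such assignments are illegal, every use of string.upper in a program can be bound -- and type-checked -- at compile time.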
-- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things never trust in: That's the vendor's final bill The promises your boss makes, and the customer's good will http://www.geezjan.org/humor/computers/threes.html From paul@prescod.net Sun Dec 19 09:20:27 1999 From: paul@prescod.net (Paul Prescod) Date: Sun, 19 Dec 1999 03:20:27 -0600 Subject: [Types-sig] Type Inference I References: <385C2A08.DD36C485@maxtal.com.au> Message-ID: <385CA35B.5F919934@prescod.net> If you find that a restriction like this practically allows interesting type inference then I would propose a rule similar to the following: "If a Python compiler can determine that there is a code path through the program that raises TypeError it may reject the program. If it does not reject the program then it must report the TypeError at runtime by throwing an exception." -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things see no end: A loop with exit code done wrong A semaphore untested, and the change that comes along http://www.geezjan.org/humor/computers/threes.html From paul@prescod.net Sun Dec 19 11:44:08 1999 From: paul@prescod.net (Paul Prescod) Date: Sun, 19 Dec 1999 05:44:08 -0600 Subject: [Types-sig] parameterized typing (was: New syntax?) References: Message-ID: <385CC508.D8684CEC@prescod.net> Greg Stein wrote: > > .... > Paul: does this sufficiently address your desire for parameterized types? > Others: how does this look? It seems quite Pythonic to me, and is a basic > extension of previous discussions (and to my thoughts of the design). Without thinking every detail through it looks good to me for handling parameterized classes. I think that parameterized typedecls and functions are still an issue. Also, was it your intent that the _ be required or would the fact that the param was declared obviate that. I am thinking that there may a more general syntax that allows us to parameterize various sorts of things. interface (a,b) foo: ... class (a, b) foo: ... def (a, b) foo(a) -> b: decl foo(a,b) = typedef ... -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things see no end: A loop with exit code done wrong A semaphore untested, and the change that comes along http://www.geezjan.org/humor/computers/threes.html From paul@prescod.net Sun Dec 19 10:30:39 1999 From: paul@prescod.net (Paul Prescod) Date: Sun, 19 Dec 1999 04:30:39 -0600 Subject: [Types-sig] Syntax References: Message-ID: <385CB3CF.AB7343AE@prescod.net> Greg Stein wrote: > > > lot of difference between the widely embraced Tim-syntax and the syntax > > I posted a few days ago (based on the Tim-syntax). But if putting the > > keyword "decl:" in front makes it feel better then I'm all for that! > > Sorry. I won't let you rewrite history :-). You were suggesting a new, > alternative syntax, rather than adding new syntax to Python. I was suggesting a new, alternative syntax that would eventually become a part of the Python runtime system. I said that the new, alternative syntax should go in separate files for now because that makes implementation simpler. What I argued against was restricting ourselves to Python as it exists today, in particular nested dicts and lists. > Tim and I > (and some others) have lobbied for adding new syntax. In particular, I > don't want to see Yet Another Language and Yet Another Parser to deal with > a distinct language/syntax for type specifications. decl has a grammar. It *is* Yet Another Language. 
As Tim says, it is rapidly approaching the complexity of Python itself. :) I am still guessing that for the first version it will also use Yet Another Parser because I don't want to change the real Python parser while we are in development mode. Are we going to set up our own CVS tree and have all of our testers install a new binary when we change the precedence of the ! operator or add a feature to the decl statement? I would rather send them one or two new Python files. *If* 1.6 is coming when we need it to, then we could give it a very informal grammar for decl that basically stops at a comment or line boundary. That could invoke our (Python coded) decl-parser. More likely, we will want to test things out before 1.6 so we will probably stuff decl statements into expression statement-strings or shadow files. Either way, we have our own (sub-)grammar and (sub-)parser based, it seems, on Haskell. > Regardless: I'd hope that the first step to any implementation is to > update the Python grammar and allow us to annotate existing Python > programs (i.e. to use inline syntax). Updating the grammar is not super > difficult, but I hear you about wanting to not use another binary. But > I'll just shrug that off and say that's your problem :-) Updating the grammar is not super-difficult but getting it right the first time is difficult. I cannot believe that nobody in parser-land has written a Python-based Python parser that we can hack. Whatever happened to the ethic that a parser-generator was not done until it could parse the language it was written in? That a Real Programming Language was not done until it could compile itself? :) -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things see no end: A loop with exit code done wrong A semaphore untested, and the change that comes along http://www.geezjan.org/humor/computers/threes.html From paul@prescod.net Sun Dec 19 10:41:35 1999 From: paul@prescod.net (Paul Prescod) Date: Sun, 19 Dec 1999 04:41:35 -0600 Subject: [Types-sig] Optionality and performance References: Message-ID: <385CB65F.FAE20948@prescod.net> Greg Stein wrote: > > > Optionality of type checking is not about it being enabled or disabled. > > Even when it is enabled, type checking any particular method must be > > optional. This whole discussion should presume "enabled". But > > optionality is still important. > > I'm assuming that we type-check a module at a time -- that we don't have > the kind of fine-grained checking you're assuming. If a person doesn't > want find-grained checking, then they just shouldn't add type annotations > there. Here's the issue I tried to address in RFC 1.0 under the term "safety declarations".

    def foo() -> String:
        # 10,000 lines of code
        return str( abc )

This function is guaranteed to meet its type signature if it completes, but it is not necessarily type-safe in the Java sense. Anywhere within it, an integer could be added to a string or a ".foo" invoked on a float. For ERR it is important to be able to say that this function is type-safe if there is some important reason that it really should not fail. The type system can't guarantee termination but it can at least ensure that TypeError and AttributeError will never be triggered by this code. For a big part of our target audience, this assertion is the reason for the exercise. For OPT it is important to be able to say: "I need this code to run like a bat out of hell and I believe that there are no string lookups or runtime type checks required. Please verify that for me."
So we need "decl type-safe foo" or something like that. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things see no end: A loop with exit code done wrong A semaphore untested, and the change that comes along http://www.geezjan.org/humor/computers/threes.html From paul@prescod.net Sun Dec 19 11:12:08 1999 From: paul@prescod.net (Paul Prescod) Date: Sun, 19 Dec 1999 05:12:08 -0600 Subject: [Types-sig] Issue: binding assertions Message-ID: <385CBD88.ECC0009F@prescod.net> Okay, we need to move to conclusion on certain issues so I can make a new RFC. A week ago I posted a proposal that had a concept called "binding assertion": #2. The system must allow authors to make assertions about the sorts of values that may be bound to names. These are called binding assertions. They should behave as if an assertion statement was inserted after every assignment to that name program-wide. Greg argued relatively persuasively that it was more Pythonic to allow the same variable to have varying values over time. This is great in the local case but causes problems in cases like this: def brian(a) -> int: a.spam=42 somefunc( a ) return a.spam+a.parrot Without getting into nightmares of "const" and so forth, we need to deal with the fact that we can't know the type of a.spam and a.parrot without analyzing somefunc and maybe other code. We don't have that option. Therefore we must be able to guarantee at least the type, if not the values, of a.parrot and a.spam. Now if a is a module object then hard-coding the values is equivalent to declaring the static, unchanging type of global variables. So in at least the module and class instance cases, we are binding types to names, not values. This begs the question: is there any reason to treat parameters differently and allow parameters to vary over their lifetime? And what about declared local variables? Surely they must behave the same as global variables! My personal vote is that we treat all variables the same. Someone who wants to allow some particular name to vary its type over its lifetime has the following options: * turn off type checking * declare it to be of type Any * declare it to be of type dict | list of string If we are too strict at first then we could ease the rule based on feedback. I *do* understand that the vast majority of local variables (not parameters) should have their types inferred (perhaps just as "Any") rather than declared. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things see no end: A loop with exit code done wrong A semaphore untested, and the change that comes along http://www.geezjan.org/humor/computers/threes.html From paul@prescod.net Sun Dec 19 11:58:29 1999 From: paul@prescod.net (Paul Prescod) Date: Sun, 19 Dec 1999 05:58:29 -0600 Subject: [Types-sig] Issue: definition of "type" Message-ID: <385CC865.13C842A5@prescod.net> A "static type" is either a statically declared (top-level) class or something declared with a "decl type" statement or whatever we come up with. Jim Fulton and Max Skaller notwithstanding, we do not seem to be moving in the direction that any Python name can serve as a type. For instance, these things are not types: if somefunc(): class spam: foo: String else: class spam: foo: int spam is a class but not a static type. Jim Fulton also defines some ways to make interfaces at runtime. Those are also not "static types" for our purposes. An interface constructed at the top level would be a valid static type. 
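Restated in plain 1.5 syntax (without the proposed member annotations), the distinction looks like this -- an illustration only, no new semantics:

    class Point:                  # top-level and statically visible:
        pass                      # a candidate "static type"

    import whrandom
    if whrandom.random() < 0.5:   # which definition wins depends on run time
        class spam:
            flavour = "string"
    else:
        class spam:
            flavour = "int"
    # afterwards spam *is* a class, but no static checker can say which one,
    # so the name cannot serve as a static type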
-- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things see no end: A loop with exit code done wrong A semaphore untested, and the change that comes along http://www.geezjan.org/humor/computers/threes.html From paul@prescod.net Sun Dec 19 11:59:23 1999 From: paul@prescod.net (Paul Prescod) Date: Sun, 19 Dec 1999 05:59:23 -0600 Subject: [Types-sig] Any References: Message-ID: <385CC89B.D228E925@prescod.net> Can somebody tell me why we are looking at the word "Any" instead of "Object" or "PyObject." I think that it is useful to encourage people to think of Python's type system as a graph and not a set. Or to put it another way, we should emphasize what is common about all Python objects like the ability to convert them to a string. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things see no end: A loop with exit code done wrong A semaphore untested, and the change that comes along http://www.geezjan.org/humor/computers/threes.html From gstein@lyra.org Sun Dec 19 19:10:37 1999 From: gstein@lyra.org (Greg Stein) Date: Sun, 19 Dec 1999 11:10:37 -0800 (PST) Subject: [Types-sig] Optionality and performance In-Reply-To: <385CB65F.FAE20948@prescod.net> Message-ID: I'm not sure what point your're making here -- it seems to be different than my issue. On Sun, 19 Dec 1999, Paul Prescod wrote: > Greg Stein wrote: > > > Optionality of type checking is not about it being enabled or disabled. > > > Even when it is enabled, type checking any particular method must be > > > optional. This whole discussion should presume "enabled". But > > > optionality is still important. > > > > I'm assuming that we type-check a module at a time -- that we don't have > > the kind of fine-grained checking you're assuming. If a person doesn't > > want find-grained checking, then they just shouldn't add type annotations > > there. > > Here's the issue I tried to address in RFC 1.0 under the term "safety > declarations". > > def foo() -> String: > # 10,000 lines of code > return str( abc ) > > This function is guaranteed to meet its type signature if it completes, I agree. Although, I would state that it is only guaranteed if the type-checking is enabled when you compile the *module*. Specifically: the presence of the return type means that the type-checker will verify the return value as matching that type. This process is enabled when the type-checker is enabled; the checking is *not* done if the type-checking is not enabled. You enable/disable compile-time, static type checking on a *module* basis, when you compile the module. Given two declarations: def foo(): ... def bar() -> String: ... The foo() function will always pass the type-checker because it assumes "-> any" as the return value, which is satisfied by whatever foo() might return. The bar() function passes IFF it only returns strings (and doesn't fall off the end of the function, which implies a "return None"). > but it is not necessarily type-safe in the Java sense. Anywhere within > it, an integer could be added to a string or a ".foo" invoked on a > float. Sure... > For ERR it is important to be able to say that this function is > type-safe if there is some important reason that it really should not > fail. The type system can't guarantee termination but it can at least > ensure that TypeError and AttributeError will never be triggered by this > code. For a big part of our target audience, this assertion is the > reason for the exercise. Absolutely. 
> For OPT it is important to be able to say: "I need this code to run like > a bat out of hell and I believe that there are no string lookups or > runtime type checks required. Please verify that for me." So we need > > "decl type-safe foo" > > or something like that. I'm saying we don't declare a need for type-safety. I'm saying that type-safety checking is preconditioned on two things: 1) type-checking is enabled when the module is compiled 2) type annotations are present So: when the compilation process is occurring and type-checking is enabled, then it will verify as many types as possible. Now, maybe it is desirable to have a second switch that says "foo.bar() should fail if you cannot ensure foo has a bar method taking no parameters." I'm presuming that without this switch, the type-checker could assume that anything of type "any" would have any method that is used. Just to clarify since I think we missed each other: 1) the process of type-checking is an aspect of the compilation process and is enabled/disabled at the module level 2) a basic level of type checking states "check whatever you know about" 3) a stricter level of type checking states "all types must be done and verified correct." Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Sun Dec 19 19:23:18 1999 From: gstein@lyra.org (Greg Stein) Date: Sun, 19 Dec 1999 11:23:18 -0800 (PST) Subject: [Types-sig] development approach (was: Syntax) In-Reply-To: <385CB3CF.AB7343AE@prescod.net> Message-ID: On Sun, 19 Dec 1999, Paul Prescod wrote: >... > I am still guessing that for the first version it will also use Yet > Another Parser because I don't want to change the real Python parser > while we are in development mode. Are we going to set up our own CVS > tree and have all of our testers install a new binary when we change the > precedent of the ! operator or add a feature to the decl statement? I > would rather send them one or two new Python files. Our own CVS tree? Nah. I think that once we reach consensus and have Guido Approval, then it goes right into the main CVS tree. I think the issue here would be whether or how much we find we must iterate. I don't see any iteration happening with the grammar, especially once we have Guido Approval. > *If* 1.6 is coming when we need it to, then we could give it a very What do you mean "when we need it to" ? I didn't realize that there was any "need" being discussed. I certainly am not interested in building a system that is some kind of hackery add-on to 1.5. Design, build, and integrate into 1.6, IMO. > informal grammar for decl that basically stops at a comment or line > boundary. That could invoke our (Python coded) decl-parser. More likely, > we will want to test things out before 1.6 so we will probably stuff > decl statements into expression statement-strings or shadow files. > Either way, we have our own (sub-)grammar and (sub-)parser based, it > seems, on Haskell. Well... whoever codes it gets to decide :-). I'm just stating for the record, that I believe the best approach is to directly start working on the grammar changes [rather than use a short-term, throw-away solution]. If I do any coding on this, then it will use that approach. > > Regardless: I'd hope that the first step to any implementation is to > > update the Python grammar and allow us to annotate existing Python > > programs (i.e. to use inline syntax). Updating the grammar is not super > > difficult, but I hear you about wanting to not use another binary. 
But > > I'll just shrug that off and say that's your problem :-) > > Updating the grammar is not super-difficult but getting it right the > first time is difficult. I disagree, but that's okay. > I cannot believe that nobody in parser-land has written a Python-based > Python parser that we can hack. Whatever happened to the ethic that a > parser-generator was not done until it could parse the language it was > written in? That a Real Programming Language was not done until it could > compile itself? :) I think a number of people have done this. Go take a look for it. One of my projects for 1.6 is to insert a hook for the parsing and the compilation process. This would allow Python-level code to replace the parse step, and/or Python code to replace the bytecode compilation. Once those hooks are in, then it will be pretty much a given to have a Python parser written in Python. And a bytecode compiler written in Python. Specifically: can you replace each step, and parse the same files, producing the same set of bytecodes? [ I'll be doing the hooks; not sure if I want to write the replacements, though ] Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Sun Dec 19 19:47:27 1999 From: gstein@lyra.org (Greg Stein) Date: Sun, 19 Dec 1999 11:47:27 -0800 (PST) Subject: [Types-sig] Issue: binding assertions In-Reply-To: <385CBD88.ECC0009F@prescod.net> Message-ID: On Sun, 19 Dec 1999, Paul Prescod wrote: >... > def brian(a) -> int: > a.spam=42 > somefunc( a ) > return a.spam+a.parrot > > Without getting into nightmares of "const" and so forth, we need to deal > with the fact that we can't know the type of a.spam and a.parrot without > analyzing somefunc and maybe other code. We don't have that option. > Therefore we must be able to guarantee at least the type, if not the > values, of a.parrot and a.spam. You cannot know the type of because you did not add a type declaration to the parameter. If you had the following code: class Foo: decl member spam: Int decl member parrot: Int ... def somefunc(x: Foo): ... def brian(a: Foo) -> Int: ... THEN, you could assert type-safety. But if you don't even tell the thing what type your inputs are: well... your problem. What I believe is a distinct issue: while the interface specification of Foo tells you what a.spam *is*, I believe we have a separate problem of deciding whether to *enforce* that. While I am not strictly opposed to enforcing type safety during assignment, I would ask that you please list this as two problems: 1) declaring an interface, 2) enforcing type-safety during assignments. > Now if a is a module object then hard-coding the values is equivalent to > declaring the static, unchanging type of global variables. So in at Declaring module globals is really declaring the module *interface*, IMO. The fact that the interface happens to be implemented with globals is beside the point. In fact, it might be neat to have something like this: module Foo: decl member a: Int decl member b: Int def foo(x: String) -> String: "doc string" # no code! def bar(y: String) -> None: "doc string" Then at some point in your code, you could do: def baz(a: Foo) -> Int: ... While all this is neat, you'll notice that the "module Foo:" is exactly the same as doing "interface Foo:". > least the module and class instance cases, we are binding types to > names, not values. This begs the question: is there any reason to treat > parameters differently and allow parameters to vary over their lifetime? > And what about declared local variables? 
Surely they must behave the > same as global variables! I think you are binding types as part of an interface declaration. That is quite different than binding locals. > My personal vote is that we treat all variables the same. Someone who > wants to allow some particular name to vary its type over its lifetime > has the following options: > > * turn off type checking > * declare it to be of type Any > * declare it to be of type dict | list of string > > If we are too strict at first then we could ease the rule based on > feedback. I *do* understand that the vast majority of local variables > (not parameters) should have their types inferred (perhaps just as > "Any") rather than declared. I believe that if we add syntax to declare locals, then we are going to have a real hard time getting rid of that syntax. I would rather proceed with caution in this regard: if we simply can't get the type inferencing (or deduction as some people would say) working, then I'll relent and advocate declaring locals. But until we reach that step, I'd like to avoid adding new syntax for declaring locals. If people want to use local declarations for commentary purposes, I would just tell them: def foo(a, b): #decl local c: Int ... c = 5 ... In other words, Python isn't going to provide direct assistance for declaring the things. Note that, for a function, I do not believe there is much merit to declaring types for enforcement purposes. I don't want to add syntax and the resulting non-cleanliness to deal with people who do the following: def foo(): decl local c: Int c = "foo" Tough for them. I think most programmers can easily keep track of types within a single function. We will be helping them out, certainly, with the type checking, but it just doesn't happen as a type-enforcement at assign time. [ in other words: I think type problems occur at boundaries (func calls) rather than within the code/variables of a single function ] Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Sun Dec 19 20:02:53 1999 From: gstein@lyra.org (Greg Stein) Date: Sun, 19 Dec 1999 12:02:53 -0800 (PST) Subject: [Types-sig] Issue: definition of "type" In-Reply-To: <385CC865.13C842A5@prescod.net> Message-ID: On Sun, 19 Dec 1999, Paul Prescod wrote: > A "static type" is either a statically declared (top-level) class or > something declared with a "decl type" statement or whatever we come up > with. Jim Fulton and Max Skaller notwithstanding, we do not seem to be > moving in the direction that any Python name can serve as a type. For euh... I don't think so. We should be able to do the following: import types Int = types.IntType int = type(1) def foo(x: Int) -> int: return x I think the compiler/inferencer will understand the "types" module's interface and will understand the type() builtin. There are probably a few other ways to get type information (e.g. foo.__class__) that it also must understand. But each of these mechanisms are quite traceable. For example, as the inferencer is doing its type analysis, if a value has TypeType, then it remembers the value, too. That value can then be used in the future for type assertions. In the above example, the inferencer sees that types.IntType is a TypeType (with value ). As part of its normal effort, it records that Int has a TypeType value. In this case, it also records that the value is . The inferencer also understands that type(1) returns and records the appropriate bits with "int". 
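A toy model of that bookkeeping (a sketch for illustration, done here at run time; the real inferencer would draw the same conclusions from the parse tree without executing anything):

    import types

    known_types = {}

    def record_binding(name, value):
        # plain inference would only note "value is of type TypeType";
        # the extra step is remembering *which* type object it was, so the
        # name can be resolved later when it appears in an annotation
        if type(value) is types.TypeType:
            known_types[name] = value

    record_binding("Int", types.IntType)    # models:  Int = types.IntType
    record_binding("int", type(1))          # models:  int = type(1)
    print known_types["Int"] is known_types["int"]   # 1: same type object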
> instance, these things are not types: > > if somefunc(): > class spam: > foo: String > else: > class spam: > foo: int > > spam is a class but not a static type. I disagree. After the if/else statement, spam is effectively: spam = typedef _internal_interface_spam1 or _internal_interface_spam2 i.e. the type inferencer simply understands that the typedecl "spam" is a class with one of two interfaces. > Jim Fulton also defines some ways to make interfaces at runtime. Those > are also not "static types" for our purposes. An interface constructed > at the top level would be a valid static type. I agree: runtime constructions such as the JimF stuff are not static types. However, I disagree with JimF's mechanism for interface definition. I believe our syntax changes are sufficient. [ still need the associated functions such as implements() ] Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Sun Dec 19 20:04:32 1999 From: gstein@lyra.org (Greg Stein) Date: Sun, 19 Dec 1999 12:04:32 -0800 (PST) Subject: [Types-sig] Any In-Reply-To: <385CC89B.D228E925@prescod.net> Message-ID: On Sun, 19 Dec 1999, Paul Prescod wrote: > Can somebody tell me why we are looking at the word "Any" instead of > "Object" or "PyObject." I think that it is useful to encourage people to > think of Python's type system as a graph and not a set. Or to put it > another way, we should emphasize what is common about all Python objects > like the ability to convert them to a string. Comes from CORBA/IDL. I don't think most people view the number 5 as an object. Sure, Python happens to view it that way, but people will say "that's a number!". From a human perspective, "any" makes more sense than "object". Cheers, -g -- Greg Stein, http://www.lyra.org/ From paul@prescod.net Sun Dec 19 21:58:38 1999 From: paul@prescod.net (Paul Prescod) Date: Sun, 19 Dec 1999 15:58:38 -0600 Subject: [Types-sig] Issue: binding assertions References: Message-ID: <385D550E.AD1F0F2F@prescod.net> Greg Stein wrote: > > ... > You cannot know the type of because you did not add a type declaration > to the parameter. Yes, I meant to do so. > What I believe is a distinct issue: while the interface specification of > Foo tells you what a.spam *is*, I believe we have a separate problem of > deciding whether to *enforce* that. While I am not strictly opposed to > enforcing type safety during assignment, I would ask that you please list > this as two problems: 1) declaring an interface, 2) enforcing type-safety > during assignments. That's fine. Do you support enforcing type safety during assignments? If not, doesn't the type declaration become meaningless documentation? And if you support enforcing type declarations during assignments, do you support doing so for assignments to: a) instance variables b) module variables c) local variables d) parameters If you could summarize your proposed syntax/semantics for the four types of assertions in a small chart, that would help a lot. > I believe that if we add syntax to declare locals, then we are going to > have a real hard time getting rid of that syntax. The syntax to declare locals would be the same syntax used to declare globals and instance variables. It would just be in the function context. Anyhow, I wasn't saying that we would ever get rid of the syntax. We could just allow variables so declared to vary across their lifetime. > I don't want to add syntax and > the resulting non-cleanliness to deal with people who do the following: > > def foo(): > decl local c: Int > c = "foo" Neither do I. 
But I also do not want to illogically restrict the syntax that is used in the module context, and class context from being available in the local context. I also do not want parameter declarations to have a very different semantic from instance variable declarations. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things see no end: A loop with exit code done wrong A semaphore untested, and the change that comes along http://www.geezjan.org/humor/computers/threes.html From paul@prescod.net Sun Dec 19 22:19:20 1999 From: paul@prescod.net (Paul Prescod) Date: Sun, 19 Dec 1999 16:19:20 -0600 Subject: [Types-sig] Issue: binding assertions References: <385D550E.AD1F0F2F@prescod.net> Message-ID: <385D59E7.6DDE668A@prescod.net> Paul Prescod wrote: > > Greg Stein wrote: > > > > ... > > You cannot know the type of because you did not add a type declaration > > to the parameter. > > Yes, I meant to do so. I mean that I meant to put in the type, but forgot to go back and do so. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things see no end: A loop with exit code done wrong A semaphore untested, and the change that comes along http://www.geezjan.org/humor/computers/threes.html From paul@prescod.net Sun Dec 19 22:29:18 1999 From: paul@prescod.net (Paul Prescod) Date: Sun, 19 Dec 1999 16:29:18 -0600 Subject: [Types-sig] Optionality and performance References: Message-ID: <385D5C3E.8609C4FF@prescod.net> Greg Stein wrote: > > ... > I'm saying we don't declare a need for type-safety. I'm saying that > type-safety checking is preconditioned on two things: > > 1) type-checking is enabled when the module is compiled > 2) type annotations are present And I'm saying that there are times when "as many as possible" is not enough. It is my presumption that this function will *always* pass the type checker: def foo() -> String: # 10,000 lines of code print "abc"+eval( raw_input() ) return str( abc ) Its declaration is correct. Sure, it may raise TypeError but Python isn't Java and we should make it easy to write partially type-safe code. The return value is verifiably correct and that is all that matters. But the same function would *never* pass the type checker if it was declared type-safe: decl type-safe foo Because its declaration is *incorrect*. Even though it returns what it should it is NOT typesafe because it can trigger one of TypeError or AttributeError. type-safe means safe in the Java/C++ sense which is a very different issue. There is no reason to require foo to be moved to a separate module if it is the only function that requires that level of safety. I can think of no interesting reason to require all functions in a module to have the same safety level. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things see no end: A loop with exit code done wrong A semaphore untested, and the change that comes along http://www.geezjan.org/humor/computers/threes.html From tim_one@email.msn.com Sun Dec 19 22:53:37 1999 From: tim_one@email.msn.com (Tim Peters) Date: Sun, 19 Dec 1999 17:53:37 -0500 Subject: [Types-sig] Type Inference I In-Reply-To: <385AB3FE.AE6C2630@maxtal.com.au> Message-ID: <000d01bf4a73$e37edda0$922d153f@tim> [John Skaller] > ... > For example, consider: > > x = 1 > y = open("something") > try: x + y > except: print "OK" > > This code is CORRECT python at the moment (AFAIK). > It is NOT 'illegal' to add a file and an integer, > it is perfectly correct to do it, and then handle > the resulting exception. 
What I wonder is why you imagine Guido didn't intend this; just as he intended that your "open" call above may also raise an exception (if e.g. "something" doesn't exist, or it does but the program doesn't have read permission, or ...). > There is no hope for any kind of type inference > until this is fixed. What must be said is that > this case is an error, and that Python can > do anything if the user does this: the result > of executing the code MUST be undefined. I think this is dead on arrival; almost nothing in Python was intended to be undefined. It might help if you gave an example that actually presented a difficulty : above, *if* you get beyond the "except", x is an int and y is an open file object, and an inferencer shouldn't care what the type of "x + y" is (it's not referenced). Change it to "z = x + y", and an inferencer knows that the type of z is the same as the type it had before entering the block of code shown (because the inferencer knows that int+file will blow up, so z won't get rebound). > The fact that a particular implementation throws an exception, > is good behaviour on the part of that particular implementation, > but it must NOT be required -- because that would prevent > a compiler rejecting the program. This SIG is *adding* (optionally enforced) rules about when compile-time detectable potential TypeErrors can cause compile-time rejection (just "potential" because nobody has signed up to do reachability analysis; that is,

    def f():
        return 2
        3 + open("3")

will probably get rejected in typecheck mode, despite that no runtime TypeError is possible (the offending stmt is unreachable)). > ... > The outcome of this is that really, the only times python > guarantees to raise an exception is for environment errors, > or for typing/indexing/lookup errors which are locally > wrapped. This certainly wasn't the intent! E.g., in certain endcases, and sometimes platform-dependent ones, Python has failed to raise OverflowError when appropriate ((-sys.maxint-1)/-1 on (at least) Pentiums is the most recent example that comes to mind). Guido has always considered such behaviors to be bugs in the implementation, and either fixes them or makes me do it . > ... > ** Keyboard Interrupt: this is wrong wrong wrong!! Agreed there! but-probably-not-a-complaint-for-the-types-sig-ly y'rs - tim From tim_one@email.msn.com Sun Dec 19 22:53:44 1999 From: tim_one@email.msn.com (Tim Peters) Date: Sun, 19 Dec 1999 17:53:44 -0500 Subject: [Types-sig] I've collected my thoughts... In-Reply-To: Message-ID: <000f01bf4a73$e6ef4c40$922d153f@tim> [GregS] > http://www.lyra.org/greg/python/type-proposal.html > ... > I'll keep adding to it as I think of things and hear back from > people. I'm going to try to slow down on this type stuff, though, > as I'm going to be starting work on the new import system. Thanks, Greg! I'm going to vanish entirely Real Soon -- pushed all my vacation time to the end of the year so I could get to the Python Conference (it was scheduled for Dec. when I did this), and just learned that I can't carry it over to next year. That is, "use it or lose it". After a day of thoughtful consideration , "use it" still appears to be the wiser choice.
preparing-to-celebrate-the-first-1000-glorious-years-of- types-sig-ly y'rs - tim From gstein@lyra.org Sun Dec 19 23:08:21 1999 From: gstein@lyra.org (Greg Stein) Date: Sun, 19 Dec 1999 15:08:21 -0800 (PST) Subject: [Types-sig] Issue: binding assertions In-Reply-To: <385D550E.AD1F0F2F@prescod.net> Message-ID: On Sun, 19 Dec 1999, Paul Prescod wrote: > Greg Stein wrote: >... > > What I believe is a distinct issue: while the interface specification of > > Foo tells you what a.spam *is*, I believe we have a separate problem of > > deciding whether to *enforce* that. While I am not strictly opposed to > > enforcing type safety during assignment, I would ask that you please list > > this as two problems: 1) declaring an interface, 2) enforcing type-safety > > during assignments. > > That's fine. Do you support enforcing type safety during assignments? If Generally, I don't support it in V1. I think the assignments are usually being done near to their definition, and only by the original author. In that sense, I think there won't be too many errors in type-incorrectness. In V2, then sure. But I want to separate the issue and discuss it later. We can implement a system without worrying about assignments. Small bites! > not, doesn't the type declaration become meaningless documentation? Absolutely not. In the following: class Foo: decl member bar: Int Anytime that the type-checker/inferencer refers to Foo_instance.bar, it knows what the type is. Very important. > And if you support enforcing type declarations during assignments, do > you support doing so for assignments to: > > a) instance variables > b) module variables The above two are part of assigning values to an interface's attributes. In the future: sure, enforcement would make sense. > c) local variables I don't think we should even be declaring these, thus rendering type-enforcement moot. > d) parameters Unsure. I'm punting my thoughts to V2. [ V2 meaning Type System V2, not Python 2.0 ... I don't even think Python 2.0 should be mentioned in our discussions... ] > If you could summarize your proposed syntax/semantics for the four types > of assertions in a small chart, that would help a lot. -- No enforcement at all for any assignment. -- All references use the declared type info (if any) for purposes of type checking > > I believe that if we add syntax to declare locals, then we are going to > > have a real hard time getting rid of that syntax. > > The syntax to declare locals would be the same syntax used to declare > globals and instance variables. It would just be in the function I disagree. The latter two are part of an interface declaration (of a module or a class instance). Locals are not part of an interface, so I don't think they fall into the same category at all. decl member a: Int That is an incorrect semantic for locals. And I don't support adding a "decl var" or "decl local" for local declaration. IMO, of course :-) > context. Anyhow, I wasn't saying that we would ever get rid of the > syntax. We could just allow variables so declared to vary across their > lifetime. My comment about getting rid of the syntax was based on the assumption that we *might* have local declarations until the inferencer is up-to-snuff. At that point, local declarations would be redundant. Problem is: the interim code that used the new syntax would break once we tried to remove that interim syntax. 
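To make the deferred "enforcement at assignment" concrete: what checking a "decl member bar: Int" on assignment would amount to can be expressed as a run-time analogue (a sketch only; a V2 system would do this statically in the compiler, and set_member() is a hypothetical helper, not proposed syntax):

    import types

    class Foo:
        # stand-in for "decl member bar: Int"
        member_types = {"bar": types.IntType}

        def set_member(self, name, value):
            # the check an enforcing type system would perform on assignment
            expected = self.member_types.get(name)
            if expected is not None and type(value) is not expected:
                raise TypeError, "%s must be %s, not %s" % (
                    name, expected, type(value))
            self.__dict__[name] = value

    f = Foo()
    f.set_member("bar", 42)        # accepted
    f.set_member("bar", "spam")    # rejected under enforcement: TypeError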
> > I don't want to add syntax and > > the resulting non-cleanliness to deal with people who do the following: > > > > def foo(): > > decl local c: Int > > c = "foo" > > Neither do I. But I also do not want to illogically restrict the syntax > that is used in the module context, and class context from being > available in the local context. I also do not want parameter > declarations to have a very different semantic from instance variable > declarations. I think the module/class context declarations are the same: interface member declarations. Parameters, locals, and return values have a different semantic altogether. Therefore, I don't see any illogic to keeping the syntax separate. I don't want to worry about enforcement, but I do want to worry about declaring parameters and return values so that we can properly infer/deduce the types of all expressions within a function. Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Sun Dec 19 23:14:51 1999 From: gstein@lyra.org (Greg Stein) Date: Sun, 19 Dec 1999 15:14:51 -0800 (PST) Subject: [Types-sig] Optionality and performance In-Reply-To: <385D5C3E.8609C4FF@prescod.net> Message-ID: Good point. I think that I kind of described this difference in a previous email, but you've stated it much more clearly. I see type-safety as a second generation step, building on top of type-checking. Can we defer these issues to a V2 system? In general: I'd be happy with adding a "typesafe" keyword, but I think it would behoove us to keep things small and simple. Let's just do type-checking and defer type-safety and assignment enforcement. Cheers, -g On Sun, 19 Dec 1999, Paul Prescod wrote: > Greg Stein wrote: > > ... > > I'm saying we don't declare a need for type-safety. I'm saying that > > type-safety checking is preconditioned on two things: > > > > 1) type-checking is enabled when the module is compiled > > 2) type annotations are present > > And I'm saying that there are times when "as many as possible" is not > enough. It is my presumption that this function will *always* pass the > type checker: > > def foo() -> String: > # 10,000 lines of code > print "abc"+eval( raw_input() ) > return str( abc ) > > Its declaration is correct. Sure, it may raise TypeError but Python > isn't Java and we should make it easy to write partially type-safe code. > The return value is verifiably correct and that is all that matters. > > But the same function would *never* pass the type checker if it was > declared type-safe: > > decl type-safe foo > > Because its declaration is *incorrect*. Even though it returns what it > should it is NOT typesafe because it can trigger one of TypeError or > AttributeError. type-safe means safe in the Java/C++ sense which is a > very different issue. > > There is no reason to require foo to be moved to a separate module if it > is the only function that requires that level of safety. I can think of > no interesting reason to require all functions in a module to have the > same safety level. > > -- Greg Stein, http://www.lyra.org/ From tim_one@email.msn.com Sun Dec 19 23:47:02 1999 From: tim_one@email.msn.com (Tim Peters) Date: Sun, 19 Dec 1999 18:47:02 -0500 Subject: [Types-sig] Type Inference II In-Reply-To: <385AE827.42ECB891@maxtal.com.au> Message-ID: <001701bf4a7b$59ab61e0$922d153f@tim> [John Skaller] > ... > Another case which inhibits optimisation is loop variables. > In the loop: > > for x in y: .... > > is it allowed to assign to x? Yes (explicitly allowed by section 7.3 of the Lang Ref). > What about mutating y? 
Yes but delicate (see 7.3 again); e.g., I have a great deal of code that does breadth-first traversals via: sawit = {root: 1} # set of things seen so far sequence = [root] for thing in sequence: for child in thing.children(): if not sawit.has_key(child): sawit[child] = 1 sequence.append(child) return sequence # a breadth-first list of everything reachable > What about mutating x? Sure; this one's so obvious it doesn't need to be said <0.9 wink>. > [Also, an aside: the code > > for x[1] in y: ... > for x.attr in y: ... > > is allowed but I can't see a real use for it. > Is there one? I used this once, many years ago, but felt silly doing it. There was no real need for it. Can't recall ever seeing it in other folks' code, either. Wouldn't miss it. > Could we simplify the syntax, and required the loop control to > be a whole variable, or tuple of whole variables (recurively), > so that the names involved are always bound directly? OK by me, but do note it would make CPython's grammar more complicated (in the sense that Grammar/Grammar would grow larger); the current for_stmt: 'for' exprlist 'in' testlist ':' suite ['else' ':' suite] shares "exprlist" with the "del" stmt. > ... > Tightening up for loops will break code that does things like: > > (1) do extra increments on an loop variable to skip cases Nope. Assigning to the loop variable has no effect on the values assigned *to* it by the for stmt: >>> for i in range(5): print i i = i ** 10 print i 0 0 1 1 2 1024 3 59049 4 1048576 >>> > (2) mutate a list while scanning it Again this shouldn't break anything. "The rules" are already set up to reflect a simple one-at-a-time implementation. When you talk about "caching" the list, though, I don't know what that could mean other than making a *copy* of it -- but that would be a pessimization, not a speedup. In any case, loop overhead is usually minor. It's usually the guts of the loop that consume the time. > ... > _f = f # protect our f > from MODULE import * # might destroy f > f = _f # set f back > del _f # get rid of temporary _f > > This is ugly. It also makes inference harder, because > there is now a control flow issue. > > One idea I had was this: > > import X as Y # import X, but name it Y in this module > from M import x as v Has been suggested many times on c.l.py. To the best of my knowledge, Guido has never responded to one of these. But in any case I'd say that a "real inferencer" that can't deal with a branch-free basic block isn't real enough for the Types-SIG to worry about -- "import as" belongs in a different forum. "import *" is a legit Types-SIG headache, though; Guido already voted "if they do that, tough nuts to them". > ... > I note python currently supports privacy by name mangling, > but really, this is a hack: for Python 2, a more sophisticated > architecture would be better. It had a more sophisticated one, years ago (the now-defunct "access" stmt); experience with that was bad, so it was tossed and __ was added; Guido isn't likely to backtrack here. it's-a-*cute*-hack-that-works-well-in-practice-ly y'rs - tim From tim_one@email.msn.com Mon Dec 20 00:59:33 1999 From: tim_one@email.msn.com (Tim Peters) Date: Sun, 19 Dec 1999 19:59:33 -0500 Subject: [Types-sig] List of FOO In-Reply-To: <385AF7C9.7CEE78E5@maxtal.com.au> Message-ID: <001801bf4a85$7aba51c0$922d153f@tim> [John Skaller] > ... 
> If I can summarise: there is considerable advantage using > arbitrary objects as type objects: they can be specified > using EXISTING python syntax, using the power of the EXISTING > python interpreter, without needing a special, second class > language, to complicate python, and pose an additional > implementation overhead. One of the groundrules here is that the SIG's work cannot require importing (or executing via any other means) modules. It *may* be OK to compile them and deviously suck out their code objects for inspection, but execution is forbidden. That's why a "special, second class language" is attractive, provided it's recognizable as such: it can be analyzed without execution. If types are arbitrary run-time objects-- and *especially* if they "use the power of the ... Python interpreter", I don't see how the compilation process could get *at* them without execution. statically y'rs - tim From tim_one@email.msn.com Mon Dec 20 00:59:37 1999 From: tim_one@email.msn.com (Tim Peters) Date: Sun, 19 Dec 1999 19:59:37 -0500 Subject: [Types-sig] New syntax? In-Reply-To: Message-ID: <001901bf4a85$7c8ec3a0$922d153f@tim> [Tim] > Yes, Any is good. [GregS] > I've listed this in my proposal as an open question. I'm leaning > to "formally endorsing" it. My only real opposition is whether > it must be a new keyword, or we can find some other way to deal > with it. > > For example: > > import types > Int = types.IntType > String = types.StringType > Any = None > > decl foo: Any > decl bar: String > > The compiler isn't going to have recognized names for the types. I pushed almost everything into "decl" stmts so that type specification really was a sublanguage distinct from current Python, and specifically a declarative (no control flow of its own & no side-effects) sublanguage, fully evaluable at compile-time via simple means. To the extent that that's true, it can enjoy its own "compile time" namespace distinct from the runtime namespaces, and Int, Any, String, Boolean ... can be decreed to "just be there", by magic, *in* declarations, for purposes of compile-time type checking. If instead we have to interpret imports and binding stmts and attribute dereferences and ... to get at names for types, we pretty much have to *execute* the code -- and Guido won't go for that. Or, if he does, he shouldn't . The "static" in "static typing" has implications. > ... > Funny note: looking at the grammar, I've found the following is legal: > > def foo(bar, *args, * *kw): > ... > > In my typedecl syntax, I punted the ability to use "* *" ... you must > use "**". So there :-) Good! It's little-known that e.g. x = 2and 3 is legit Python but x = 2 and3 isn't, and I'm sure Guido would like to suppress that secret too. or-if-he-wouldn't-he-should-ly y'rs - tim From tim_one@email.msn.com Mon Dec 20 01:51:11 1999 From: tim_one@email.msn.com (Tim Peters) Date: Sun, 19 Dec 1999 20:51:11 -0500 Subject: [Types-sig] New syntax? In-Reply-To: Message-ID: <001a01bf4a8c$b22354c0$922d153f@tim> [Tim[ > ... > If I had a lot of binary integer functions to declare, I > would probably use a typedef, a la > > decl typedef BinaryFunc(_T) = def(_T, _T) -> _T > decl typedef BinaryIntFunc = BinaryFunc(Int) > ... > decl var intHandlerMap: {string: BinaryIntFunc} > decl var floatHandlerMap: {string: BinaryFunc(Float)} [GregS] > Okay, Tim. I'm going to stop you right here :-) Good -- the speed was killing me . > The problem with using "decl" to do typedefs is that it does > weird voodoo to associate the typedecl with the name (e.g. 
> BinaryFunc). Perhaps an earlier msg made this clearer: I've viewed "decl"s as (purely!) compile-time expressions. IOW, BinaryFunc is a compile-time name in the above; there's no implication that a name introduced by a "decl typedef" will appear in any runtime namespace (this doesn't preclude that in some modes the implementation may *want* to make a Python object of the same name available at runtime). > I believe my unary operator is much clearer to what is happening: > > BinaryIntFunc = typedef BinaryFunc(Int) This looks like a runtime stmt to me; if so, it's of no use to static (compile-time) type declaration. If it's not a runtime stmt, better to stick a "decl" (or something) in front of it to make that crucial distinction obvious. > In this case, it is (IMO) very clear that you are storing a typedecl > object into BinaryIntFunc, for later use. For example, we might see the > following code: > > import types > Int = types.IntType > List = types.ListType > IntList = typedef [Int] > ... This all looks like runtime code to me -- if so, how is a *compiler* supposed to get any benefit out of it? Or if not, how is a compiler supposed to recognize that it's not runtime code? > Hrm. I don't have a ready answer for your first typedef, though. That > is a new construct that we haven't seen yet. We've been talking about > parameterizing *classes*, rather than typedecls. > > *ponder* In my twisted little universe, I'm using a declarative language for compile-time type expressions, and BinaryFunc(_T) can be thought of as a compile-time macro -- same as the BinaryIntFunc typedef (except the latter doesn't take any arguments -- or does take no arguments ). >> ("|") should suffice. > "or" is more Pythonic. Agreed. I'm not sure what's in vogue among category theorists, though . > Bite me. :-) Yummy! > You do raise a good point in another post, however: > > def foo(*args: (Int)): > > Looks awfully funny. For a Python programmer, that looks like > grouping rather than a tuple. If it had a comma in there, then > it would look like a tuple. Worse, it would look like a tuple of length one, which *args is not. > But of course: there will never be more than one typedecl inside > there, so whythehell is there a comma? I think it should be legal to do, e.g., def foo(*args: (Int, Float, String)) -> whatever: This says the function takes exactly three arguments, of the given types, but gets them as the * tuple. Some people do that (typically if they're just going to pass the arglist on via apply(somfunc, args)). > *grumble* .... I don't have a handy resolution for this one. So let's make one up. The problem is spelling "tuple of unknown length" (and Paul's complaint notwithstanding, that *is* Python so we gotta deal with it). Python has no notation for this. OK: ... Tuple(T1, T2, T3) equivalent_to (T1, T2, T3) Tuple(T1, T2) equivalent_to (T1, T2) Tuple(T1,) equivalent_to (T1,) Tuple(T1) means tuple-of-T1 of unknown length So it's always *legal* to stick "Tuple" in front of a tuple specifier, and it's *required* in the last case. Actually, tuples show up in type specifiers rarely enough-- and look so much like grouping now --that I'd be happy requiring "Tuple" all the time. Again one of those things that could be relaxed later if it proved too irksome. 
if-only-you-could-relax-me-too-ly y'rs - tim From tim_one@email.msn.com Mon Dec 20 03:12:06 1999 From: tim_one@email.msn.com (Tim Peters) Date: Sun, 19 Dec 1999 22:12:06 -0500 Subject: [Types-sig] Type Inference I In-Reply-To: <385BB080.A2DBB67C@maxtal.com.au> Message-ID: <002001bf4a97$ff9a90a0$922d153f@tim> [John Skaller] > ... accept, temporarily, that we have only THREE types: > integers, strings, and > ... a function 'add(x,y)' exists, which throws an exception > if the types of x and y are not both integers, or both strings. > ... > Consider two cases: > > 1) The spec says: > IF the arguments are both ints .. > OR IF the arguments are both strings .. > OTHERWISE an exception is thrown > > 2) The spec says: > IF the arguments are both ints .. > OR IF the arguments are both strings ... > OTHERWISE THE BEHAVIOUR IS UNDEFINED > > There is a huge difference between these two cases for > a compiler. In case (2), the compiler can ASSUME > that given the call > > add(x,1) > > that x must be an integer. The philosophy of (at least) CPython is that core dumps and stack faults are never the user's fault -- any time that happens, it's a bug in Python itself. Some of those bugs will never get fixed <0.9 wink>, but they're considered to be Python bugs all the same. This is unlike C/C++/Fortran etc, and is one of Python's selling points relative to them. Now the kind of assumption you want to make above will lead to generating code that *can* cause core dumps when the assumption is false. For example, if x is actually a file object, you may well generate code that adds 1 to some internal field of the underlying FILE* struct. Not good -- not in Python. Spec #1 is the *intent* of the language. > ... > On the other hand, in case (1), the compiler cannot deduce > anything, at least from the given fragment, so it can NOT > generate fast code: it has to call > > PyAdd(x,One) > > or, perhaps do something like: > > if (PyTypeIsInt(x)) > x->value ++; > else PyRaise(SomeException) > > .. which involves an extra run time check, at least, > and is therefore much slower. The transformation is valid provided your inferencer is strong enough to prove that x *is* an int at this point. If it's not, it may be no significant loss anyway: most program time is spent inside loops, and runtime type checks can often be floated out; when they can't, the compiler can provide feedback that a user who gives a rip about speed can act on. > Therefore, there is a performance advantage in adopting > spec (2) as a language specification, instead of (1). Yes, but I doubt Python will ever go that route. It's harder to optimize Python than Fortran -- but, in return, you get to program in *Python* . > Note this does not mean the generated code will crash, > if x is not an integer. What it means is that if the > compiler detects that x is not an integer, it can > generate a compile time error. It is NOT allowed to > do that with specification (1). No, but under #1, and under the same assumption (that the compiler detects that x is not an int), it can generate code to raise the exception and skip the PyAdd business. That is, *if* the inferencer can *either* say "definitely an int" or "definitely not an int", it makes no practical difference whether spec #1 or #2 is in effect. > So my point is: the Python documentation contains > many examples where it says 'such and such an exception > is thrown', and this prevents generating fast code, > and it prevents early error detection. 
I'm sure that if anything comes of this SIG, people will ask for a compile *option* to warn about (or error on) every nastiness the analysis code can detect. > ... > But I have spent over five years on a standards committee, > and have some vague idea of the impact of specifications -- > and in particular lack of them -- on the ability to generate > fast, conforming code. Speed wasn't one of Python's primary goals; had you been on the ECMAS JavaScript committee instead, you wouldn't have heard anyone arguing about speed <0.9 wink>. C++ is the Fortran of OO languages ... not-that-speed-kills-ly y'rs - tim From tim_one@email.msn.com Mon Dec 20 03:51:12 1999 From: tim_one@email.msn.com (Tim Peters) Date: Sun, 19 Dec 1999 22:51:12 -0500 Subject: [Types-sig] Syntax In-Reply-To: <385BC0B6.3879244B@prescod.net> Message-ID: <002201bf4a9d$76357900$922d153f@tim> [Paul Prescod] > ... > In fact, I don't see a lot of difference between the widely > embraced Tim-syntax and the syntax I posted a few days ago > (based on the Tim-syntax). Me neither. > But if putting the keyword "decl:" in front makes it feel > better then I'm all for that! Ditto if taking "decl" away makes people feel better. I'm getting an increasingly strong suspicion that what I've had in mind doesn't match what *anybody* here has been talking about, though! That is, as covered in earlier msgs tonight, I've taken it as a given that type declarations must be fully identifiable and evaluable at compile time, without code execution. The real point of slopping "decl" in front of everything was to add just one new "compile time" keyword and statement (Guido's happiness will be proportional to 1./math.e**k, where k == the number of new keywords ). > I'm still thinking that it should go in another file because I > want to be able to experiment with this stuff WITHOUT maintaining > a new Python interpreter binary. I don't know what people are arguing about here. We'll need a separate file to declare the signatures of stuff supplied by C modules anyway. good-enough-to-start-with!-ly y'rs - tim From tim_one@email.msn.com Mon Dec 20 04:50:55 1999 From: tim_one@email.msn.com (Tim Peters) Date: Sun, 19 Dec 1999 23:50:55 -0500 Subject: [Types-sig] List of FOO In-Reply-To: <385C0F10.E003F55E@maxtal.com.au> Message-ID: <002401bf4aa5$cdb7d760$922d153f@tim> [John Skaller] > ... > Note I'm not against using a functional language's > type description for Python, a'la Tim/Haskell, > but it isn't clear that is going to work well either, The only thing that's clear is that *something's* going to work well -- we just have no idea what that is . > and it seems to involve 'extra' work, writing a parser > for a 'new' language, etc. But a language much simpler than Python, and one designed for the task rather than (hard?) pressed into service. There are reasons to be skeptical either way. also-reasons-to-be-optimisitic-ly y'rs - tim From tim_one@email.msn.com Mon Dec 20 04:51:04 1999 From: tim_one@email.msn.com (Tim Peters) Date: Sun, 19 Dec 1999 23:51:04 -0500 Subject: [Types-sig] type declaration syntax In-Reply-To: <385C1345.C21FF180@maxtal.com.au> Message-ID: <002701bf4aa5$d2334d60$922d153f@tim> [Tim] > My bet is that the vast majority of Python people asking for > "static typing" have in mind a conventional explicit system of > the Algol/Pascal/C (APC) ilk, and that decisions based on what > *inference* schemes can do are going to leave them very unhappy. [John Skaller] > I'm not sure why. 
The rest of my msg went on at some length about why they would be unhappy; I'm not going to repeat it here. Or if the "why" is wrt why it's a majority, you yourself recently wrote a very cogent piece on c.l.py explaining that: familiarity. Relatively few Python programmers have any experience with inference schemes; and the two on this SIG who have admitted to it (that's Paul & me, BTW) testified they didn't really care for it (I at least explicitly declare *every* function in Haskell, although I do generally skip delcaring local names). > My 'assumption' is that > > 1) a conservative inferencer is used, > which means it tries to optimise code by inference, > but if it isn't sure, it falls back to the usual run-time > checking -- that is, it faithfully reproduces the expected > behaviour no matter what. > > 2) optional static type declarations allow > the performance of the inferencer to be improved; > that is, to generate better code > > 3) it would also help to tighten up > the specifications of python, particularly > in areas like > > a) when is it OK to expect an exception > b) module freezing > > etc. All of which are irrelevant to the points in the msg to which you're replying. People from APC are accustomed to declaring types to communicate design requirements and semantic restrictions beyond the competence of inference to determine. If my first example was too subtle , at least read the intgamma example. > I would make the point that, as often is the case, > the client is 'asking' for X, but what they actually > need is Y, because they don't understand their own requirements. > That is, they may be 'asking' for APC style static typing, > but they have no idea what the implications are, > and if they knew, they would withdraw their application. Ditto. > I guess that NO python programmer wants to declare the type > of every single name, which is what APC style static type > checking requires. I would be delighted to if it sped some of my "marginal" programs by a factor of 2. My employer would be delighted to if it saved them from runtime TypeErrors next week instead of next year. Any tradeoff you can think of has a larger audience than you can imagine . repetitively y'rs - tim From tim_one@email.msn.com Mon Dec 20 04:51:02 1999 From: tim_one@email.msn.com (Tim Peters) Date: Sun, 19 Dec 1999 23:51:02 -0500 Subject: [Types-sig] Type Inference I In-Reply-To: <19991218235412.A15050@vet.uu.nl> Message-ID: <002601bf4aa5$d0c390c0$922d153f@tim> [Martijn Faassen] > ... > The counter argument I got to this before is that inferencing takes > place anyway in the case of expressions: > > def foo(a, b): > # Martijn's evil verbose format in yet another form > decl: > a = Int > b = Int > return Int > return a + b Tony also had a "declaration block" construct; they look nice. > 'a + b' would need inferencing to figure out what the type is of > the complete expression. I think that this argument overlooks > that this kind of evaluation is a lot more easy than a back- > tracking kind of inferencing. > ... > Though checking could be seen as a kind of inferencing, right? Or > are people confusing the issues? Initially I didn't consider the > expression evaluation stuff as inferencing either, but there's a > good argument to consider it so, not? Not to me -- it's logic-chopping. This is like the "compiler vs interpreter" arguments that pop up on c.l.py from time to time: yes, there's a fine line between compilation and interpretation, but Python today is nowhere near that line. It's an interpreter. 
Likewise there's a fine line between inferencing and checking, but in the common usage of the words, deducing the type of "a + b" *given* the types of a and b, and *given* the signatures of a.__add__ and b.__radd__, is not called inferencing. Insisting that it is cheapens the currency of the marketplace of verbal discourse . To the extent that you take away one or more of the the four givens, "inferencing" gets more and more appropriate. Rule of thumb: If it's something Algol and Fortran did Since The Beginning, it's unhelpful to call it inferencing. hmm!-we-could-just-begin-every-python-identifier-with-its-type-name- and-call-it-quits-early-ly y'rs - tim From tim_one@email.msn.com Mon Dec 20 04:51:08 1999 From: tim_one@email.msn.com (Tim Peters) Date: Sun, 19 Dec 1999 23:51:08 -0500 Subject: [Types-sig] type declaration syntax In-Reply-To: <19991219005308.A29210@chronis.pobox.com> Message-ID: <002801bf4aa5$d45d3240$922d153f@tim> For whatever reason, "xxx" sent the attached to me in email, but not to the SIG. Since it's relevent, I'm passing it on. yes-their-name-really-is-xxx-ly y'rs - tim -----Original Message----- From: xxx Sent: Sunday, December 19, 1999 12:53 AM To: Tim Peters Subject: Re: [Types-sig] type declaration syntax On Sat, Dec 18, 1999 at 03:56:44PM -0500, Tim Peters wrote: > [John Skaller] > > ... > > But the _return_ type doesn't need to be annotated as much. > > Why? Because the inferencer can usually deduce it: > > it's an output, the argument types are inputs. > > > > If the inferencer _cannot_ deduce the return type, > > it _also_ cannot check that the function is returning > > the correct type. > > "The" correct type (as opposed to "a type consistent with the > operations") is impossible for an inferencer to determine, but > this is addressed more to the SIG than to John : > > My bet is that the vast majority of Python people asking for > "static typing" have in mind a conventional explicit system of > the Algol/Pascal/C (APC) ilk, and that decisions based on what > *inference* schemes can do are going to leave them very unhappy. If this is so, I am a member of the majority. > Inference schemes commit two kinds of gross errors that the APC > camp won't abide: > > 1) Inferring types that aren't general enough. > 2) Inferring types that are too general. I always had more trouble following the inferencer in ml than simply declaring everything as in C. I think we should aim at one thing, a way to declare types, and what the different types will be. Once there is a standard for this in core python, a myriad of tools from optimizers to code checkers to cross-language compilers will become much more feasible. A type inferencer is one of those things, and it could be built for those who wished to have their types inferenced from a minimal set of type declarations. xxx From tim_one@email.msn.com Mon Dec 20 06:15:50 1999 From: tim_one@email.msn.com (Tim Peters) Date: Mon, 20 Dec 1999 01:15:50 -0500 Subject: [Types-sig] Keyword arg declarations In-Reply-To: <00ac01bf49ed$e76a3f80$df55cfc0@ski.org> Message-ID: <002c01bf4ab1$a96e2ce0$922d153f@tim> >> range([start,] stop[, step]) >> >> This is, that's just the way the *doc* is written, to make it >> clearer. [DavidA] > I know that. However, I can imagine that it will be hard to > justify to the unwashed masses why they need to use seemingly > unrelated syntax to describe the signature for humans and the > signature for the compiler. How would that need arise? Signatures for the builtins will come with the system. 
If an unwashed mass is clever enough to document their own function in a way that doesn't reflect the way they implemented it, they already understand the two points of view fine. I think there's more potential for confusion from automated tools (like IDLE's calltips) that dig out the actual signature instead of the documented one; e.g., for randrange, calltips pops up (start, stop=None, step=1, int=, default=None) > I believe that you raise a similar point in another of your > posts, w.r.t the 'int=int, ord=ord' extra junk in your function > definition. Yes; I'm wondering whether it's possible (or wise, or something) to *lie* about the true signature. > That said, I suspect that the issue is peripheral and rare enough > that I needn't worry. I agree; but if you do want to worry, think about calltips too . it's-the-end-users-i'm-not-worried-about-ly y'rs - tim From tim_one@email.msn.com Mon Dec 20 06:30:12 1999 From: tim_one@email.msn.com (Tim Peters) Date: Mon, 20 Dec 1999 01:30:12 -0500 Subject: [Types-sig] Syntax In-Reply-To: <385CB3CF.AB7343AE@prescod.net> Message-ID: <002d01bf4ab3$ab7c1360$922d153f@tim> [Paul Prescod] > I cannot believe that nobody in parser-land has written a > Python-based Python parser that we can hack. John Aycock's extremely general (any CF grammar, ambiguous or not) parsing framework comes with a Python grammar. I know Gordon McMillan hacked on that to fix some of the productions, but don't know whether John folded that back in yet (or, indeed, whether G gave it back to J!). It's probably the fastest way in the universe to get a parser working -- as well as the slowest way to actually parse . Definitely worth a look, anyway. > Whatever happened to the ethic that a parser-generator was not > done until it could parse the language it was written in? That a > Real Programming Language was not done until it could compile > itself? :) Coming from a background in Fortran compiler development, that always struck me as a charming myth kept alive by people in other fields. kinda-like-the-myth-that-rocket-scientists-are-smart-ly y'rs - tim From tim_one@email.msn.com Mon Dec 20 06:50:26 1999 From: tim_one@email.msn.com (Tim Peters) Date: Mon, 20 Dec 1999 01:50:26 -0500 Subject: [Types-sig] Issue: binding assertions In-Reply-To: <385CBD88.ECC0009F@prescod.net> Message-ID: <002e01bf4ab6$7f9d9400$922d153f@tim> [Paul Prescod] > ... > I *do* understand that the vast majority of local variables > (not parameters) should have their types inferred (perhaps just as > "Any") rather than declared. Just a reminder that if type decorations do someone actual good, to get that they'll put up with declaring everything at first. Inference is frosting. in-a-land-without-cake-ly y'rs - tim From tim_one@email.msn.com Mon Dec 20 06:50:31 1999 From: tim_one@email.msn.com (Tim Peters) Date: Mon, 20 Dec 1999 01:50:31 -0500 Subject: [Types-sig] Issue: definition of "type" In-Reply-To: <385CC865.13C842A5@prescod.net> Message-ID: <002f01bf4ab6$820a9c60$922d153f@tim> [PaulP] > For instance, these things are not types: > > if somefunc(): > class spam: > foo: String > else: > class spam: > foo: int > > spam is a class but not a static type. True, but it can be given a static type *name*; e.g., decl type spam Provided that the attributes of spam actually referenced outside of spam have the same signatures, static type checking outside of spam shouldn't care that it doesn't know about spam's internals. 
Or, IOW, if the two dynamic versions of spam present the same external interface to the compiler, it doesn't matter how the *class* spam comes into being at runtime. > Jim Fulton also defines some ways to make interfaces at runtime. Those > are also not "static types" for our purposes. An interface constructed > at the top level would be a valid static type. As above, shouldn't matter. what-isn't-built-until-runtime-can-yet-be-declared-at- compile-time-ly y'rs - tim From gstein@lyra.org Mon Dec 20 09:20:22 1999 From: gstein@lyra.org (Greg Stein) Date: Mon, 20 Dec 1999 01:20:22 -0800 (PST) Subject: [Types-sig] New syntax? In-Reply-To: <001901bf4a85$7c8ec3a0$922d153f@tim> Message-ID: On Sun, 19 Dec 1999, Tim Peters wrote: >... > [GregS] >... > > decl foo: Any > > decl bar: String > > > > The compiler isn't going to have recognized names for the types. > > I pushed almost everything into "decl" stmts so that type specification > really was a sublanguage distinct from current Python, and specifically a > declarative (no control flow of its own & no side-effects) sublanguage, > fully evaluable at compile-time via simple means. To the extent that that's > true, it can enjoy its own "compile time" namespace distinct from the > runtime namespaces, and Int, Any, String, Boolean ... can be decreed to > "just be there", by magic, *in* declarations, for purposes of compile-time > type checking. Ack. Now you're talking about a whole new set of names to introduce to the language. I think that is a Bad Thing. I can understand the desire to simplify the task for the compiler, but creating a distinct, partitioned namespace is just that. It doesn't mesh well into Python itself. > If instead we have to interpret imports and binding stmts > and attribute dereferences and ... to get at names for types, we pretty much > have to *execute* the code -- and Guido won't go for that. Or, if he does, > he shouldn't . The "static" in "static typing" has implications. Nah. No execution needs to take place. Just some data flow analysis. And thankfully, Python doesn't have "goto" ... the hardest control structure to model is try/except. The others are pretty basic. Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Mon Dec 20 09:29:00 1999 From: gstein@lyra.org (Greg Stein) Date: Mon, 20 Dec 1999 01:29:00 -0800 (PST) Subject: [Types-sig] New syntax? In-Reply-To: <001a01bf4a8c$b22354c0$922d153f@tim> Message-ID: On Sun, 19 Dec 1999, Tim Peters wrote: >... > > The problem with using "decl" to do typedefs is that it does > > weird voodoo to associate the typedecl with the name (e.g. > > BinaryFunc). > > Perhaps an earlier msg made this clearer: I've viewed "decl"s as (purely!) > compile-time expressions. IOW, BinaryFunc is a compile-time name in the > above; there's no implication that a name introduced by a "decl typedef" > will appear in any runtime namespace (this doesn't preclude that in some > modes the implementation may *want* to make a Python object of the same name > available at runtime). I think that we definitely want to be able to construct and use typedecl objects at runtime. That's why I prefer the typedef unary operator over your "sub-language." Viewing the "decl" stuff as a sub-language is kind of icky. Where is the integration with Python itself? Having a clean integration is a good measure that you have a Pythonic syntax and feel. 
> > I believe my unary operator is much clearer to what is happening: > > > > BinaryIntFunc = typedef BinaryFunc(Int) > > This looks like a runtime stmt to me; if so, it's of no use to static > (compile-time) type declaration. If it's not a runtime stmt, better to > stick a "decl" (or something) in front of it to make that crucial > distinction obvious. It definitely is a runtime statement. But the compiler can easily track what is happening. We're doing data flow and type checking already: that's what the SIG is about. Tracking the result of a typedef is cake once you have that. > > In this case, it is (IMO) very clear that you are storing a typedecl > > object into BinaryIntFunc, for later use. For example, we might see the > > following code: > > > > import types > > Int = types.IntType > > List = types.ListType > > IntList = typedef [Int] > > ... > > This all looks like runtime code to me -- if so, how is a *compiler* > supposed to get any benefit out of it? Or if not, how is a compiler > supposed to recognize that it's not runtime code? It is runtime code. The runtime is going to need those objects to execute the runtime type checks (on function entry and for the type-assert operator; possibly for assignment enforcement). But the compiler can extract a lot of benefit from the above statements. As I mentioned: the compiler can/should understand the types module and the type() builtin (plus things like __class__). Then you're quite fine. No magic voodoo involved. > > Hrm. I don't have a ready answer for your first typedef, though. That > > is a new construct that we haven't seen yet. We've been talking about > > parameterizing *classes*, rather than typedecls. > > > > *ponder* > > In my twisted little universe, I'm using a declarative language for > compile-time type expressions, and BinaryFunc(_T) can be thought of as a > compile-time macro -- same as the BinaryIntFunc typedef (except the latter > doesn't take any arguments -- or does take no arguments ). I know that. I meant that I didn't have a response that fits into *my* universe :-) >... tuple stuff ... > > *grumble* .... I don't have a handy resolution for this one. > > So let's make one up. The problem is spelling "tuple of unknown length" > (and Paul's complaint notwithstanding, that *is* Python so we gotta deal > with it). Python has no notation for this. OK: > > ... > Tuple(T1, T2, T3) equivalent_to (T1, T2, T3) > Tuple(T1, T2) equivalent_to (T1, T2) > Tuple(T1,) equivalent_to (T1,) > Tuple(T1) means tuple-of-T1 of unknown length > > So it's always *legal* to stick "Tuple" in front of a tuple specifier, and > it's *required* in the last case. > > Actually, tuples show up in type specifiers rarely enough-- and look so much > like grouping now --that I'd be happy requiring "Tuple" all the time. Again > one of those things that could be relaxed later if it proved too irksome. A little wordy to include that keyword(?) in there, while things like List and Dict don't require it. While a valiant attempt to solve the readability of tuple type declarators, it just doesn't seem right... :-( Cheers, -g -- Greg Stein, http://www.lyra.org/ From paul@prescod.net Mon Dec 20 13:18:29 1999 From: paul@prescod.net (Paul Prescod) Date: Mon, 20 Dec 1999 07:18:29 -0600 Subject: [Types-sig] Python parser in Python? References: <002d01bf4ab3$ab7c1360$922d153f@tim> Message-ID: <385E2CA5.E88FE7B5@prescod.net> Tim Peters wrote: > > John Aycock's extremely general (any CF grammar, ambiguous or not) parsing > framework comes with a Python grammar. 
It depends on Python's built-in lexer: # # Why would I write my own when GvR maintains this one? # import tokenize Doesn't that remove the possibility for new keywords? -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself The occasional act of disrespect for the American flag creates but a flickering insult to the values of democracy -- unless it provokes America into limiting the freedoms that are its hallmark. -- Paul Tash, executive editor of the St. Petersburg Times From paul@prescod.net Mon Dec 20 13:28:18 1999 From: paul@prescod.net (Paul Prescod) Date: Mon, 20 Dec 1999 07:28:18 -0600 Subject: [Types-sig] Issue: definition of "type" References: <002f01bf4ab6$820a9c60$922d153f@tim> Message-ID: <385E2EF2.6CE888DC@prescod.net> Tim Peters wrote: > > > spam is a class but not a static type. > > True, but it can be given a static type *name*; e.g., > > decl type spam > > Provided that the attributes of spam actually referenced outside of spam > have the same signatures, static type checking outside of spam shouldn't > care that it doesn't know about spam's internals. Or, IOW, if the two > dynamic versions of spam present the same external interface to the > compiler, it doesn't matter how the *class* spam comes into being at > runtime. Okay, but do you or do you not agree that in the simple case of: class spam: def a(self) -> String: return "abc" a type object should be made implicitly as if someone had actually typed in the decl. I certainly would not support a position that said that the entire signature of spam had to be re-declared. I MIGHT support a position that said that the user had to explicitly declare spam as being available to the static type system. I'm on the fence about this last requirement because I would like to think that all of the code out there with class statements is *already* defining a bunch of types. A minority of it depends on runtime information and we can easily detect those cases. So why not let the simple case of "defined class that doesn't depend on runtime information" be a shortcut for a type declaration? -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself The occasional act of disrespect for the American flag creates but a flickering insult to the values of democracy -- unless it provokes America into limiting the freedoms that are its hallmark. -- Paul Tash, executive editor of the St. Petersburg Times From paul@prescod.net Mon Dec 20 13:55:42 1999 From: paul@prescod.net (Paul Prescod) Date: Mon, 20 Dec 1999 07:55:42 -0600 Subject: [Types-sig] New syntax? References: Message-ID: <385E355E.9B15FA42@prescod.net> Greg Stein wrote: > > ... > > Nah. No execution needs to take place. Just some data flow analysis. Let's be concrete: 1. if somefunction(): class a: def b(self)->String: return "abc" else: class a: def b(self)->Int: return 5 How many type objects are created? What are there names? What is the type of a? 2. class a: def b(self)->String: return "abc" for i in sys.argv: class a: def b(self)->Int: return 5 3. def makeClass(): class a: def b( self ): return "abc" return a j=makeClass()() -------------------- This seems intractable to me. I got around this in my original proposal by requiring all declaring classes to be *top-level*. In other words I formally defined the subset of Python that does not require code execution. If you can formally define the semantics of "data flow" then I will be able to compare the proposals. Note that I am half-way between you and Tim. 
I think that type objects should be more like Python objects but I am willing to restrict where they are created to make the problem tractable and the semantics understandable. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself The occasional act of disrespect for the American flag creates but a flickering insult to the values of democracy -- unless it provokes America into limiting the freedoms that are its hallmark. -- Paul Tash, executive editor of the St. Petersburg Times From tratt@dcs.kcl.ac.uk Mon Dec 20 10:18:43 1999 From: tratt@dcs.kcl.ac.uk (Laurence Tratt) Date: Mon, 20 Dec 1999 10:18:43 GMT Subject: [Types-sig] Type Inference II In-Reply-To: <385AE827.42ECB891@maxtal.com.au> References: <385AE827.42ECB891@maxtal.com.au> Message-ID: <3665527349.laurie@btinternet.com> In message <385AE827.42ECB891@maxtal.com.au> skaller wrote: > I note python currently supports privacy by name mangling, but really, > this is a hack: for Python 2, a more sophisticated architecture would be > better. Nnngg. I'm not keen on Python ever gaining privacy (the __ name mangling is nasty, I agree). It just doesn't really seem in the spirit of things; I always tend to think of the Larry Wall quote "Perl would rather you kept out of its living room because you weren't invited, not because it has a shotgun". In my recent projects, I denote "private" (there's no distinction between private, protected etc as there is in, say, Java) by just preceeding names with a "_". I've actually found that highly effective, and it makes it obvious that "self._method()" and so on are private calls. This approach also tends to make modules fairly "from module import *" safe. The only argument I can imagine for privacy is that "from module import *" tends to import module names etc as well which can make it confusing; but when we use that feature we deserve everything we get . Laurie -- http://eh.org/~laurie/comp/python/ From guido@CNRI.Reston.VA.US Mon Dec 20 15:14:44 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Mon, 20 Dec 1999 10:14:44 -0500 Subject: [Types-sig] tuples (was: New syntax?) In-Reply-To: Your message of "Sat, 18 Dec 1999 13:26:16 PST." References: Message-ID: <199912201514.KAA04222@eric.cnri.reston.va.us> > On Sat, 18 Dec 1999, Paul Prescod wrote: > > Greg Stein wrote: > > > > > > Bite me. :-) > > > > > > You do raise a good point in another post, however: > > > > > > def foo(*args: (Int)): > > > > Python should not use tuples as "read-only lists." From a type-system > > point of view, a tuple should be a fixed-length, fixed-type data > > structure defined at compile time. [Greg again] > Ideal or not, this is the current situation. *args is a tuple. > > Are you suggesting a particular change here? If so, then add it to your > issues list :-) [you are maintaining one, right? :-)] I don't think there's a good, deep reason why *args yields a tuple; only a historical one. I think that originally, all argument lists were internally passed around as tuples, because (in *very* early Python) argument lists *were* tuples. There's no particular reason why it should be immutable -- after all **kwargs returns a dict, which is mutable. The only reason not to switch to tuples is backwards compatibility -- in particular there is a lot of code (e.g. in the std library) that creates new arg lists by adding tuples to *args. This could be solved by allowing + to operate on a mix of lists and tuples. I think the result should yield a list. 
Not a strong argument to do this, just a relaxation of Greg's argument that *args is a tuple -- it needn't be, if we have a good reason to change it. --Guido van Rossum (home page: http://www.python.org/~guido/) From m.faassen@vet.uu.nl Mon Dec 20 15:44:18 1999 From: m.faassen@vet.uu.nl (Martijn Faassen) Date: Mon, 20 Dec 1999 16:44:18 +0100 Subject: [Types-sig] typedefs (was: New syntax?) References: Message-ID: <385E4ED2.C3EEB28E@vet.uu.nl> Greg Stein wrote: > > On Fri, 17 Dec 1999, Martijn Faassen wrote: [snip] > > typedef Footype(int, int): > > return int > > > > var handlermap = {string: Footype} > > I see typedefs as a way to associate a typedecl with a name. In your > example here, I'm not sure how to do a typedef of something like > List. You seem to have pegged typedef to only do function > typedefs. And class typedefs, I suppose, but you're right. Though you could do this: typedef Footype: List(Int) I should finally work out my syntax proposal into something sensible because now I'm confusing myself. :) I do still think there's something interesting to be learned from the 'class instantiation' - 'typedef instantiation' and 'value assignment' - 'type assignment' analogy. [snip] > In any case, I think using "def" inline to define a function typedecl is > fine. A typedef is merely used to create an alias, to clarify a later > declaration. Yes, but you basically have the same setup with current Python if you exclude Lambdas. A function definition is merely used to create an 'alias' for a piece of code, to clarify other pieces of code. If you assume for the moment lambdas are bad, we may want to assume by analogy that inline defs are not a good idea either. Regards, Martijn From rmasse@cnri.reston.va.us Mon Dec 20 16:16:32 1999 From: rmasse@cnri.reston.va.us (Roger Masse) Date: Mon, 20 Dec 1999 11:16:32 -0500 (EST) Subject: [Types-sig] Low-hanging fruit: recognizing builtins In-Reply-To: <000a01bf48f1$56213bc0$32a2143f@tim> References: <3859FD66.E47352E@lemburg.com> <000a01bf48f1$56213bc0$32a2143f@tim> Message-ID: <14430.22112.622745.998835@nobot.cnri.reston.va.us> Tim Peters writes: > [M.-A. Lemburg] > > ... > > BTW, just to make buying one of those new microwave > > ovens more attractive: what is the pystone rating for > > the new Athlon and Pentium III chips ? > I just bought a cute little 600MHz compaq with the K7 for around $1400. (Chrismas gift for my girlfriend) Performance is 7487.21/pystones per sec. -Roger From m.faassen@vet.uu.nl Mon Dec 20 16:17:25 1999 From: m.faassen@vet.uu.nl (Martijn Faassen) Date: Mon, 20 Dec 1999 17:17:25 +0100 Subject: [Types-sig] Issue: definition of "type" References: <002f01bf4ab6$820a9c60$922d153f@tim> <385E2EF2.6CE888DC@prescod.net> Message-ID: <385E5695.1CAEF90B@vet.uu.nl> Paul Prescod wrote: [snip] > I'm on the fence about this last requirement because I would like to > think that all of the code out there with class statements is *already* > defining a bunch of types. A minority of it depends on runtime > information and we can easily detect those cases. So why not let the > simple case of "defined class that doesn't depend on runtime > information" be a shortcut for a type declaration? Are you sure that in fact a minority depends on runtime information? Often Python code is used without any inheritance link, like this: class Foo: def doSomething(self): ... class Bar: def doSomething(self): ... a = [Foo(), Bar()] for el in a: el.doSomething() Doesn't this rely on run-time information? How would a type system deal with this? 
I suppose I'm entering the domain of interfaces now... Regards, Martijn From paul@prescod.net Mon Dec 20 16:22:48 1999 From: paul@prescod.net (Paul Prescod) Date: Mon, 20 Dec 1999 10:22:48 -0600 Subject: [Types-sig] Issue: definition of "type" References: <002f01bf4ab6$820a9c60$922d153f@tim> <385E2EF2.6CE888DC@prescod.net> <385E5695.1CAEF90B@vet.uu.nl> Message-ID: <385E57D8.E5518928@prescod.net> Martijn Faassen wrote: > >... > > Doesn't this rely on run-time information? How would a type system deal > with this? I suppose I'm entering the domain of interfaces now... Yes, that is the role of interfaces. Nobody has yet suggested that the code you described would be type-safe. The two doSomething methods are unrelated. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself The occasional act of disrespect for the American flag creates but a flickering insult to the values of democracy -- unless it provokes America into limiting the freedoms that are its hallmark. -- Paul Tash, executive editor of the St. Petersburg Times From paul@prescod.net Mon Dec 20 16:24:32 1999 From: paul@prescod.net (Paul Prescod) Date: Mon, 20 Dec 1999 10:24:32 -0600 Subject: [Types-sig] Issue: definition of "type" References: <002f01bf4ab6$820a9c60$922d153f@tim> <385E2EF2.6CE888DC@prescod.net> <385E5695.1CAEF90B@vet.uu.nl> Message-ID: <385E5840.EF5ED124@prescod.net> Martijn Faassen wrote: > > Paul Prescod wrote: > [snip] > > I'm on the fence about this last requirement because I would like to > > think that all of the code out there with class statements is *already* > > defining a bunch of types. A minority of it depends on runtime > > information and we can easily detect those cases. So why not let the > > simple case of "defined class that doesn't depend on runtime > > information" be a shortcut for a type declaration? > > Are you sure that in fact a minority depends on runtime information? Note that I'm saying that the vast majority of Python classes are statically declared, not that the vast majority of Python *code* is statically type checkable. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself The occasional act of disrespect for the American flag creates but a flickering insult to the values of democracy -- unless it provokes America into limiting the freedoms that are its hallmark. -- Paul Tash, executive editor of the St. Petersburg Times From m.faassen@vet.uu.nl Mon Dec 20 16:39:38 1999 From: m.faassen@vet.uu.nl (Martijn Faassen) Date: Mon, 20 Dec 1999 17:39:38 +0100 Subject: [Types-sig] A challenge References: <002501bf4878$d2faf240$63a2143f@tim> <385A7495.D7A25EC4@appliedbiometrics.com> Message-ID: <385E5BCA.E238292E@vet.uu.nl> Christian Tismer wrote: > > Tim Peters wrote: > > [stuff on name equivalence] > That sounds very right, since it allows to create different > things even if they look the same from structure. You get more > strength in error checking, since using the parameter in the wrong > context can be detected even if a foo's components look like a bar's. Okay, but then I'll repeat the question I asked before: class Foo: def getIt(self)->String: ... class Bar: def getIt(self)->String: ... list = [Foo(), Bar()] for el in list: print el.doIt() This wouldn't work, even though the interfaces are similar. This brings us into two domains: * inheritance I haven't seen too much discussion on how types are going to interact with the inheritance system. I could of course let Foo and Bar derive from a common base class which defines doIt() as well. 
This is a common way to do it, if type annotations get inherited from base classes, etc. * interfaces Another way to do it is to use interfaces and say Foo and Bar both conform to some interface which supports doIt(). This was something we wouldn't discuss in this SIG, but can we in fact avoid it? Regards, Martijn From guido@CNRI.Reston.VA.US Mon Dec 20 16:46:52 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Mon, 20 Dec 1999 11:46:52 -0500 Subject: [Types-sig] development approach (was: Syntax) In-Reply-To: Your message of "Sun, 19 Dec 1999 11:23:18 PST." References: Message-ID: <199912201646.LAA04958@eric.cnri.reston.va.us> [Greg] > I think that once we reach consensus and have Guido Approval, then > it goes right into the main CVS tree. I don't think so. I'd like to see an experimental implementation first. Consensus about a proposal is very different than a working implementation! --Guido van Rossum (home page: http://www.python.org/~guido/) From m.faassen@vet.uu.nl Mon Dec 20 16:53:24 1999 From: m.faassen@vet.uu.nl (Martijn Faassen) Date: Mon, 20 Dec 1999 17:53:24 +0100 Subject: [Types-sig] minimal or major change? (was: RFC 0.1) References: <001f01bf486d$09feec80$63a2143f@tim> Message-ID: <385E5F04.1B218E20@vet.uu.nl> Tim Peters wrote: > > [Martijn Faassen, reasonably demands ...] > > So that's where I'm coming from. It's important for our proposal > > to actually come up with a workable development plan, because > > adding type checking to Python is rather involved. So I've been > > pushing one course of implementation towards a testable/hackable > > system that seems to give us the minimal amount of development > > complexities. I haven't seen clear development paths from others > > yet; most proposals seem to involve both run-time and compile- > > time developments at the same time. > > > > So I'm interested to see other development proposals; possibly > > there's a simpler approach or equally complex approach with more > > payoff, that I'm missing. > > I haven't given a lick of thought to development, beyond sketching "the > usual" approach to type inference for Guido, and having a hard-won intuition > about what is and isn't reasonably parseable. This SIG has been "alive > again" for on the order of just one week: design precedes implementation, > and I won't bemoan the lack of implementation details even if they're > delayed for *another* whole week . Of course I can wait a couple of weeks longer. :) Now I'll add some buts: But implementation possibilities do influence design. I wasn't asking for actual implementation proposals, I was thinking about how to go about development. What brings us early payoffs. What is most effective and least complex. What development difficulties may appear that we simply can't avoid (so we can brace ourselves :). > At that point, it's fine by me if the first cut is *spelled* using plain > dicts and docstrings etc to ease development. But before that point, we > don't even know what we want it to *do*. remember-there-something-called-'prototyping'-ly yours, Martijn From skaller@maxtal.com.au Mon Dec 20 17:18:01 1999 From: skaller@maxtal.com.au (skaller) Date: Tue, 21 Dec 1999 04:18:01 +1100 Subject: [Types-sig] New syntax? References: Message-ID: <385E64C9.A82E84@maxtal.com.au> Greg Stein wrote: > I think that we definitely want to be able to construct and use typedecl > objects at runtime. That's why I prefer the typedef unary operator over > your "sub-language." Are these options mutually exclusive? I've implemented operator ! 
in Viper now, x!t checks type(x) is t, and returns x if it is, otherwise it raises a TypeError. The precedence is one level tighter than Greg recommended, mainly because that was slightly easier for me to implement quickly. Here is some code I wrote using it: def append(self,object,value): object.list.append(value ! self.Type) which is quite terse, seems readable and 'pythonic', and works as I expected. Without this operator, the code would read: def append(self,object,value): if type(value) is self.Type: object.list.append(value) else: raise TypeError My current feeling: I quite like it -- but the above is the only use I have tried, other than specifically for testing it. My feeling, also, is that in those circumstances where the test would fail, then the program should be considered in error (that is, it is not legitimate practice to catch and handle the TypeError, so that if a compiler can prove it would be raised, it is entitled to reject the program, and a lint like checker, to issue a diagnostic. [The explicit test, like in the second example above, should be used if it is desired to catch and handle the raised TypeError] This means that the x!t can be optimised to x, without affecting strictly conforming program semantics. Comments? Greg? -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From paul@prescod.net Mon Dec 20 17:28:39 1999 From: paul@prescod.net (Paul Prescod) Date: Mon, 20 Dec 1999 11:28:39 -0600 Subject: [Types-sig] Basic questions References: <002501bf4878$d2faf240$63a2143f@tim> <385A7495.D7A25EC4@appliedbiometrics.com> <385E5BCA.E238292E@vet.uu.nl> Message-ID: <385E6747.4831D32@prescod.net> Martijn Faassen wrote: > > Another way to do it is to use interfaces and say Foo and Bar both > conform to some interface which supports doIt(). This was something we > wouldn't discuss in this SIG, but can we in fact avoid it? It seems to me that we've been discussing it for about a week now! You are right that we can't avoid it. > * inheritance > > I haven't seen too much discussion on how types are going to interact > with the inheritance system. I think it would work more or less as it does in other object oriented languages. I, personally, am concentrating on the parts of the system that I feel I don't understand. Those parts mostly have to do with Python's dynamism and not with its already existing type system. Of course subtypes of "foo" should follow "foo"'s interface and should be recognized as "foo"s. But the much more basic question is whether: class foo: pass even *defines* a type that can be used in type declarations. Greg says yes, even if the declaration is buried in code. Tim says no,(I think) not unless it is preceded with a decl statement. I'm trying to figure out which one is right. We can get to inheritance and interfaces later. Basic questions. 1. Is this valid: class foo: pass def a( arg: foo ): pass 2. Is this valid: if someFunc(): class foo: "abc" else: class foo: "def" def a( arg: foo ): pass -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself The occasional act of disrespect for the American flag creates but a flickering insult to the values of democracy -- unless it provokes America into limiting the freedoms that are its hallmark. -- Paul Tash, executive editor of the St. 
Petersburg Times From aycock@csc.UVic.CA Mon Dec 20 17:33:20 1999 From: aycock@csc.UVic.CA (John Aycock) Date: Mon, 20 Dec 1999 09:33:20 -0800 Subject: [Types-sig] Re: [String-SIG] Python parser in Python? Message-ID: <199912201733.JAA12537@valdes.csc.UVic.CA> | From: Paul Prescod | | Tim Peters wrote: | > | > John Aycock's extremely general (any CF grammar, ambiguous or not) parsing | > framework comes with a Python grammar. | | It depends on Python's built-in lexer: | | import tokenize The tokenize module doesn't interface with the lexer inside Python -- it does its work using a set of ugly-looking regular expressions. | Doesn't that remove the possibility for new keywords? Not at all. If the new keywords (here I'm assuming reserved words) are of the same form as identifiers, as would most likely be the case, then you can easily pick them out after tokenize splits them apart. That's what my Python lexer does: piggybacks on tokenize, then flags reserved words. (Some people advocate such a splitting of lexical analysis tasks this way, into a scanner (tokenize) and a screener (postprocessing of tokens).) Of course, if you want odd-looking keywords, you could always modify a provate copy of tokenize :-) John From skaller@maxtal.com.au Mon Dec 20 17:36:58 1999 From: skaller@maxtal.com.au (skaller) Date: Tue, 21 Dec 1999 04:36:58 +1100 Subject: [Types-sig] Type Inference II References: <001701bf4a7b$59ab61e0$922d153f@tim> Message-ID: <385E693A.94A2BF83@maxtal.com.au> Tim Peters wrote: > > ... > > Tightening up for loops will break code that does things like: > > > > (1) do extra increments on an loop variable to skip cases > > Nope. Assigning to the loop variable has no effect on the values assigned > *to* it by the for stmt: > > >>> for i in range(5): > print i > i = i ** 10 > print i Ooops. You're right. Comment withdrawn. [Anyone that actually wrote that should probably be withdrawn too :-] -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From evan@4-am.com Mon Dec 20 17:51:08 1999 From: evan@4-am.com (Evan Simpson) Date: Mon, 20 Dec 1999 11:51:08 -0600 Subject: [Types-sig] New syntax? References: <19991220045108.747A41CD8C@dinsdale.python.org> Message-ID: <385E6C8C.635293B2@4-am.com> [Tim Peters wrote:] > So let's make one up. The problem is spelling "tuple of unknown length" > (and Paul's complaint notwithstanding, that *is* Python so we gotta deal > with it). Python has no notation for this. In one of the many messages I started composing for this SIG, then never sent, I mixed regexp-style notation into your ML-style declarations. How's about: (T*) means T-tuple of unknown length, (T+) means length at least one, (T1?, T2{1,2}, T3) means optional T1 followed by one or two T2's and exactly one T3. This still requires (T,) for a single-T tuple, but all other uses are distinguishable from grouping. > Actually, tuples show up in type specifiers rarely enough-- and look so much > like grouping now --that I'd be happy requiring "Tuple" all the time. Again > one of those things that could be relaxed later if it proved too irksome. Naturally, this isn't restricted to tuples; Argument (and regular) lists could also use it. In particular, the much-discussed range could be declared as: decl def range(start as Int?, stop as Int, step as Int?) 
return [Int*] (and I still think "default" arguments used for closures should be separable from regular arguments by a ';', but that's another SIG) Still can't spell 'map', though. decl def(ResultType, *SeqTypes) map(def(SeqTypes) return ResultType, map(lambda x: Sequence(x), SeqTypes) ) return [Result{len(SeqTypes)}] Cheers, Evan @ 4-am From m.faassen@vet.uu.nl Mon Dec 20 17:58:34 1999 From: m.faassen@vet.uu.nl (Martijn Faassen) Date: Mon, 20 Dec 1999 18:58:34 +0100 Subject: [Types-sig] Basic questions References: <002501bf4878$d2faf240$63a2143f@tim> <385A7495.D7A25EC4@appliedbiometrics.com> <385E5BCA.E238292E@vet.uu.nl> <385E6747.4831D32@prescod.net> Message-ID: <385E6E4A.4D56A904@vet.uu.nl> Paul Prescod wrote: > > Martijn Faassen wrote: [snip] > > * inheritance > > > > I haven't seen too much discussion on how types are going to interact > > with the inheritance system. > > I think it would work more or less as it does in other object oriented > languages. I, personally, am concentrating on the parts of the system > that I feel I don't understand. Okay, I'll focus away from inheritance for now. > Those parts mostly have to do with > Python's dynamism and not with its already existing type system. Of > course subtypes of "foo" should follow "foo"'s interface and should be > recognized as "foo"s. > > But the much more basic question is whether: > > class foo: pass > > even *defines* a type that can be used in type declarations. Greg says > yes, even if the declaration is buried in code. Tim says no,(I think) > not unless it is preceded with a decl statement. I'm trying to figure > out which one is right. I'm in the make it explicit camp here. Nothing defines any type (functions or classes) unless we explicitly say that it does. Otherwise it may default to 'everything is the Any' type which should be equivalent to basic Python. Note that we will probably lose static type info in any code path that passes through basic Python, to any path exiting Python into statically typed Python should have runtime checks (or give compile time errors). I wonder if in practice this will mean people will start to assign types to *everything* to make it work well (or efficient) with types at all. If so then we need to somehow avoid this. > We can get to inheritance and interfaces later. I actually need interfaces in my following discussion, so I apologize in advance. :) > Basic questions. > > 1. Is this valid: > > class foo: pass > > def a( arg: foo ): pass Not unless somewhere it says explicitly that class foo defines a static type, I'd say. > 2. Is this valid: > > if someFunc(): > class foo: "abc" > else: > class foo: "def" > > def a( arg: foo ): pass This is the really interesting one.. Perhaps interfaces can help here. One rule could be this: You can't define the same name multiple times in the same scope. You have to do 'class foo1' and 'class foo2' instead, and then say they both conform to the interface 'foo'. Consequences: * A separate interface declaration syntax would seem to be required. Consequences I describe at the alternative rule apply too, I think. An alternative rule would be the following: Any class names that are defined multiple times in the same scope are taken to support an interface with that same name. This interface is the only type you can use elsewhere; you can't use the class type directly. It is a compile time error if classes with the same name define different interfaces. 
Consequences: * This may mean we enter access-rule land; it would be okay classes conforming to an interface to define different member variables, as long as these are private. * The interface needs to be hooked up to the actual implementation during runtime. This may happen as soon as a class (that the compiler has seen has multiple definitions) is actually being defined at run-time. * These are odd interfaces in the sense that it looks as if you can instantiate from them! What 'in fact' happens is that the interfaces passes any instantiation requests to the actual class that's doing the implementation -- the interface is a simple factory. The same story would apply to function signatures/prototypes; if the same function name occurs multiple times in the same scope they're all taken to define the same prototype, which would be the actual type used. Regards, Martijn From m.faassen@vet.uu.nl Mon Dec 20 18:10:00 1999 From: m.faassen@vet.uu.nl (Martijn Faassen) Date: Mon, 20 Dec 1999 19:10:00 +0100 Subject: [Types-sig] Return of the Docstring: The Typening References: <385966D5.BAF592C4@4-am.com> Message-ID: <385E70F8.95EBFA07@vet.uu.nl> Evan Simpson wrote: [snip] > I still like the Sparrow/SPython concept, too . That was Swallow, as in African or European swallows. :) And that doesn't do any analysis, it basically declares *all* types exhaustively anywhere and restricts the heck out of what is allowed. All for OPT, of course. :) Regards, Martijn From m.faassen@vet.uu.nl Mon Dec 20 18:15:15 1999 From: m.faassen@vet.uu.nl (Martijn Faassen) Date: Mon, 20 Dec 1999 19:15:15 +0100 Subject: [Types-sig] Issue: definition of "type" References: <002f01bf4ab6$820a9c60$922d153f@tim> <385E2EF2.6CE888DC@prescod.net> <385E5695.1CAEF90B@vet.uu.nl> <385E57D8.E5518928@prescod.net> Message-ID: <385E7233.74DF151A@vet.uu.nl> Paul Prescod wrote: > Martijn Faassen wrote: [snip] > > Doesn't this rely on run-time information? How would a type system deal > > with this? I suppose I'm entering the domain of interfaces now... > > Yes, that is the role of interfaces. Nobody has yet suggested that the > code you described would be type-safe. The two doSomething methods are > unrelated. I understood that, but I am saying that this type of thing is quite common in Python, and I was reacting to what you said here: > ...because I would like to > think that all of the code out there with class statements is *already* > defining a bunch of types. A minority of it depends on runtime > information and we can easily detect those cases. I was pointing out this common idiom in Python as an argument against your statement that a minority depends on runtime information (that we can easily detect). Lots of Python code depends on this idiom so it's good to address it. Regards, Martijn From gstein@lyra.org Mon Dec 20 18:32:04 1999 From: gstein@lyra.org (Greg Stein) Date: Mon, 20 Dec 1999 10:32:04 -0800 (PST) Subject: [Types-sig] private names (was: Type Inference II) In-Reply-To: <3665527349.laurie@btinternet.com> Message-ID: I just wanted to jump in with a "me too!" :-) On Mon, 20 Dec 1999, Laurence Tratt wrote: > In message <385AE827.42ECB891@maxtal.com.au> > skaller wrote: > > I note python currently supports privacy by name mangling, but really, > > this is a hack: for Python 2, a more sophisticated architecture would be > > better. > > Nnngg. I'm not keen on Python ever gaining privacy (the __ name mangling is > nasty, I agree). 
It just doesn't really seem in the spirit of things; I > always tend to think of the Larry Wall quote "Perl would rather you kept out > of its living room because you weren't invited, not because it has a > shotgun". That's an excellent quote :-). I agree. Guido has always phrased it as "we're all adults here [so we know what to do and what not to do]." But I agree: I never really liked the __ name mangling. I liked relying on adulthood. However, there was a secondary reason for the mangling, not just privacy. It was added to help prevent conflicts between super/subclass' use of attributes. Personally, I think that Python is transparent enough that a subclass is going to know what attributes its parent class uses and will avoid those. [ this may also be a result of my tendency towards shallow hierarchies ] > In my recent projects, I denote "private" (there's no distinction between > private, protected etc as there is in, say, Java) by just preceeding names > with a "_". I've actually found that highly effective, and it makes it > obvious that "self._method()" and so on are private calls. This approach > also tends to make modules fairly "from module import *" safe. Same here. > The only argument I can imagine for privacy is that "from module import *" > tends to import module names etc as well which can make it confusing; but > when we use that feature we deserve everything we get . Absolutely! IMO, the construct should just go away. I do believe that the following is goodness: from deep.package import subpackage subpackage.somefunc() In this case, you are still importing a module, but you've stripped down some of its hierarchy for easier access. But you *don't* import "somefunc" directly, because then you would lose the "subpackage." when you call the thing. I believe that people should always use "module.foo" references rather than just "foo". About the only exception that I make for this is with the "types" module. Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Mon Dec 20 18:38:24 1999 From: gstein@lyra.org (Greg Stein) Date: Mon, 20 Dec 1999 10:38:24 -0800 (PST) Subject: [Types-sig] Re: decl f: def(_T, _T) -> _T (fwd) Message-ID: misfire... redirecting... ---------- Forwarded message ---------- Date: Tue, 04 Apr 2000 15:30:21 +0100 From: Edward Welbourne To: gstein@lyra.org Subject: Re: decl f: def(_T, _T) -> _T Hey ! I remember that type. Ponder calls it Boolean. def true(this, that): return this def false(this, that): return that def or(this, that): return lambda i,a,_i=this,_a=that: _i(i,_a(i,a)) def and(this, that): return lambda i,a,_i=this,_a=that: _i(_a(i,a),a) wow ! the rest of the type transcribes cleanly too ;^> I like the spec (of course, I want to change it, too). More comments to the list when I've read more (notably, I see Tim Peters has responded ...) A suggestion: decl f(_T): def(_T, _T) -> _T that is, the `foralltype' names are parameters of the decl ? Eddy. -- I believe in getting into hot water; it keeps you clean. -- G. K. Chesterton. 
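A runnable sketch of the encoding Eddy outlines above, for anyone who wants to poke at it: 'or' and 'and' are Python keywords, so the combinators are renamed Or and And here; nothing else is changed from his definitions.

    def true(this, that):
        return this

    def false(this, that):
        return that

    def Or(p, q):
        # true if either p or q selects its first argument
        return lambda this, that, p=p, q=q: p(this, q(this, that))

    def And(p, q):
        # true only if both p and q select their first argument
        return lambda this, that, p=p, q=q: p(q(this, that), that)

    # true and false both have the declared shape def(_T, _T) -> _T,
    # where _T is the type of this/that:
    assert Or(true, false)("yes", "no") == "yes"
    assert And(true, false)("yes", "no") == "no"
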
From m.faassen@vet.uu.nl Mon Dec 20 18:35:38 1999 From: m.faassen@vet.uu.nl (Martijn Faassen) Date: Mon, 20 Dec 1999 19:35:38 +0100 Subject: [Types-sig] Interfaces (was: definition of "type") References: <002f01bf4ab6$820a9c60$922d153f@tim> <385E2EF2.6CE888DC@prescod.net> <385E5695.1CAEF90B@vet.uu.nl> <385E5840.EF5ED124@prescod.net> Message-ID: <385E76FA.48F953DF@vet.uu.nl> Paul Prescod wrote: > > Martijn Faassen wrote: > > > > Paul Prescod wrote: > > [snip] > > > I'm on the fence about this last requirement because I would like to > > > think that all of the code out there with class statements is *already* > > > defining a bunch of types. A minority of it depends on runtime > > > information and we can easily detect those cases. So why not let the > > > simple case of "defined class that doesn't depend on runtime > > > information" be a shortcut for a type declaration? > > > > Are you sure that in fact a minority depends on runtime information? > > Note that I'm saying that the vast majority of Python classes are > statically declared, not that the vast majority of Python *code* is > statically type checkable. [just responded to your other response to me, but here you address my concern in that response, so...I've hopelessly confused everyone now] Okay. We should look into this issue, though. Ideally it should be as easy as possible for the current Python programmer to adapt his code to use types. I think interfaces are the answer here more than a common base class. Here I'll go off on a tangent that may help here: Possibly this is a wild idea (or possibly it's old hat to everyone), but what about a system to produces interfaces without having to declare them? Take the intersection of two class interfaces and call this a new interface; all methods with the same signature (and possibly members). class Foo conforms FooInter: def getS(self)->String: ... def otherstuff(self): ... class Bar conforms FooInter: def getS(self)->String: ... def otherstuff(self, a, b): ... class Baz: pass Foo ! fooInter # works Bar ! fooInter # works Baz ! fooInter # TypeError Alternatively you could move the interface declaration code outside the classes, into something like this: decl interface FooInter: intersection(Foo, Bar) This way programmers don't need to explicitly declare interfaces and still have them. I don't know if this is a good idea though; there's a lot to say for explicitness. These intersections may be too big, containing overlaps you aren't interested in. Though of course it's easy to explicitize it and prevent this, too, if you want: class ExplicitInterface conforms FooInter: def getS(self)->String: pass Though in this case you could make the interface too small if not all conforming classes actually implement a method. So you'd need something like: class ExplicitInterface defines FooInter: ... to be really sure you get compiler errors if not all classes conform. Sorry to fill the SIG with discussions on interfaces. They just seem unavoidable if you want to preserve lots of interesting Python features. Python right now is after all often used in a way that decouples interface from implementation without explicitizing the interfaces (for instance cStringIO, File, etc). This idea would at least make them more explicit with minimum programmer-hassle, while still providing for full explicity if desired. In this way it is similar to current Python practice; you can make interfaces fully explicit by using a common base class. 
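(A rough, runnable illustration of the idea -- helper names invented, no new syntax assumed: the 'intersection' of two classes can already be computed by introspection. This toy version only compares method names; a real checker would compare signatures too, which would drop otherstuff from the result.)

    def intersection(class_a, class_b):
        # shared public method names defined by both classes
        shared = []
        for name, value in class_a.__dict__.items():
            if name.startswith('_'):
                continue
            if callable(value) and callable(class_b.__dict__.get(name)):
                shared.append(name)
        shared.sort()
        return shared

    class Foo:
        def getS(self): return "abc"
        def otherstuff(self): pass

    class Bar:
        def getS(self): return "def"
        def otherstuff(self, a, b): pass

    print(intersection(Foo, Bar))   # ['getS', 'otherstuff'] -- names only
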
Regards, Martijn From skaller@maxtal.com.au Mon Dec 20 18:40:50 1999 From: skaller@maxtal.com.au (skaller) Date: Tue, 21 Dec 1999 05:40:50 +1100 Subject: [Types-sig] Basic questions References: <002501bf4878$d2faf240$63a2143f@tim> <385A7495.D7A25EC4@appliedbiometrics.com> <385E5BCA.E238292E@vet.uu.nl> <385E6747.4831D32@prescod.net> Message-ID: <385E7832.81D4CFC7@maxtal.com.au> Paul Prescod wrote: > I think it would work more or less as it does in other object oriented > languages. I, personally, am concentrating on the parts of the system > that I feel I don't understand. Those parts mostly have to do with > Python's dynamism and not with its already existing type system. Of > course subtypes of "foo" should follow "foo"'s interface and should be > recognized as "foo"s. Sure, but how do you know what types are subtypes? You cannot tell from inheritance: subtyping isn't related to inheritance, at least in Python (same in ocaml). Example: class Foo: def f(x): return None class Bar(Foo): def f(x,y): return int(x)+int(y) Foo is a base of Bar, Bar is not a subtype of Foo. Classes do not specify types. They are simply constructions which make constructing instances easy: all the instances have the same type, Instance... you could argue that instances of a particular class X have type 'Instance of X', but the behaviour is only default (since attributes can be dynamically altered). > But the much more basic question is whether: > > class foo: pass > > even *defines* a type that can be used in type declarations. The declaration defines a class. It specifies the initial attributes of the class. In CPython 1.5 at least, class declarations do not define types. If you expand the syntax so that, if the type is Instance, then you can give a class name instead, then this would imply that _any_ class object which can be refered to can be used. If I may: there is an issue here which some people may not have realised: recursive types. In an interface file, this can be handled by two passes. In implementation files, it is much harder, since scoping rules are dynamic. This is a good argument for interface files. Example: class X: def f(y:Y): ... class Y: def g(x:X): ... Resolving this in a single pass requires backpatching, which is messy: but using two passes leads to difficult ambiguities due to renaming: class X: def h(): ... Ok -- so now, which X does the X in g refer to? In python, names are bound dynamically, which resolves the problem. In an interface file, renaming can be banned. Can it be banned, for classes, in implementation files? -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From skaller@maxtal.com.au Mon Dec 20 19:20:38 1999 From: skaller@maxtal.com.au (skaller) Date: Tue, 21 Dec 1999 06:20:38 +1100 Subject: [Types-sig] Interfaces (was: definition of "type") References: <002f01bf4ab6$820a9c60$922d153f@tim> <385E2EF2.6CE888DC@prescod.net> <385E5695.1CAEF90B@vet.uu.nl> <385E5840.EF5ED124@prescod.net> <385E76FA.48F953DF@vet.uu.nl> Message-ID: <385E8186.3BC32DF4@maxtal.com.au> Martijn Faassen wrote: > class Foo conforms FooInter: How about class Foo is a FooInter: .. -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From tony@metanet.com Mon Dec 20 19:37:22 1999 From: tony@metanet.com (Tony Lownds) Date: Mon, 20 Dec 1999 11:37:22 -0800 (PST) Subject: [Types-sig] tuples (was: New syntax?) 
In-Reply-To: <199912201514.KAA04222@eric.cnri.reston.va.us> Message-ID: On Mon, 20 Dec 1999, Guido van Rossum wrote: > > The only reason not to switch to tuples is backwards compatibility -- > in particular there is a lot of code (e.g. in the std library) that > creates new arg lists by adding tuples to *args. This could be solved > by allowing + to operate on a mix of lists and tuples. I think the > result should yield a list. > There would be forwards compatability issues too; people might starting writing: class A: def foo(self, *args): args[:0] = [self] apply(foo, args) def bar(self, *args): ... This code would not work on existing Pythons. If this change would just be because of a lack of a way to say, a tuple of any length of type A, then may I suggest "tuple of A", e.g. def foo(*args: tuple of float) -> (float, float): # return the median and mode of its arguments ... -Tony From tim_one@email.msn.com Mon Dec 20 22:18:51 1999 From: tim_one@email.msn.com (Tim Peters) Date: Mon, 20 Dec 1999 17:18:51 -0500 Subject: [Types-sig] RE: [String-SIG] Python parser in Python? In-Reply-To: <385E2CA5.E88FE7B5@prescod.net> Message-ID: <000801bf4b38$32b5f2e0$b3a0143f@tim> I can only make time for one easy one, and ... lessee ... Paul wins! [Tim] > John Aycock's ... framework comes with a Python grammar. [Paul Prescod] > It depends on Python's built-in lexer: > > # > # Why would I write my own when GvR maintains this one? > # > import tokenize > > Doesn't that remove the possibility for new keywords? I'm going to respond a little more than John did, because tokenize.py has a funky API that takes some getting used to. Run the attached, and things will be clearer. tokenize.py doesn't know about keywords per se; all alphanumeric names (whether keyword or identifier) come back with the NAME token type. Deciding what's a keyword is a post-lexing decision (i.e., that's up to tokenize's caller). So unless the Types-SIG decides to prototype syntax unreasonably different from current Python's, the only likely way in which tokenize.py may need to be altered is in extending its Operator regexp. For example, the "->" in the attached is tokenized as two distinct OP tokens, "-" and ">". You can easily live with that by defining a *grammar* production to recognize that pair, but then you can't stop e.g. "- >" from getting treated as "->" too (tokenize suppresses intraline whitespace). Good enough for a prototype! Note that "-" followed by ">" is never legit Python today. Subtleties for tokenize newbies: a NEWLINE token terminates a stmt. An NL token is produced for an *intra*-stmt newline (NL does not terminate a stmt; you can usually ignore NL, and COMMENT, tokens). Changes in nesting level are signaled by INDENT and DEDENT tokens. Watch out for files whose final line is indented but doesn't end with \n (that's the only time you'll see a sequence of DEDENT tokens not immediately preceded by a NEWLINE token); Mark Hammond has no other kind of file . I'll be back next year, if not next week. Americans should leave cookies and milk out for Santa and his reindeer; people in other countries should set deadly traps for evil goat gods -- or whatever other foolishness they believe in. 
and-remember-that-whoever-writes-code-first-wins-l y'rs - tim import tokenize class TokDemo: def __init__(self, file): self.f = file def run(self): tokenize.tokenize(self.f.readline, self.gobbler) def gobbler(self, ttype, token, (sline, scol), (eline, ecol), line): print tokenize.tok_name[ttype], `token` example = """ def rootlist(n: Int, r: Real) -> [Real]: decl var result: [Real] result = [] decl var i: Int for i in range(n): result.append(i ** (1/r)) return result """ import StringIO d = TokDemo(StringIO.StringIO(example)) d.run() From paul@prescod.net Tue Dec 21 00:16:22 1999 From: paul@prescod.net (Paul Prescod) Date: Mon, 20 Dec 1999 18:16:22 -0600 Subject: [Types-sig] Basic questions References: <002501bf4878$d2faf240$63a2143f@tim> <385A7495.D7A25EC4@appliedbiometrics.com> <385E5BCA.E238292E@vet.uu.nl> <385E6747.4831D32@prescod.net> <385E6E4A.4D56A904@vet.uu.nl> Message-ID: <385EC6D6.E55B0A27@prescod.net> Martijn Faassen wrote: > > > ... I wonder if in practice this will mean > people will start to assign types to *everything* to make it work well > (or efficient) with types at all. If so then we need to somehow avoid > this. This is why I think that "make everything explicit" is too strong of a rule in practice. I want type-checked code and untype-checked code to work together more or less seamlessly. On the other hand, I don't want to get into complicated data flow analysis. Even if someone implemented it, how would we explain it to Python programmers? "In order to understand what types your program is producing, follow this complicated algorithm." That's why we are running away from strict (non-conservative) type inferencing in the first place. I think that the middle ground is more or less what I proposed last week. This is a class/static type definition: class a: pass This is not: if 1: class a: pass This is a function declaration where the function's type (Any->Any) is known at compile time: def a( b ): return "foo" This is not a static function declaration and cannot be used from static code without a type assertion: if 1: def a( b ): return "foo" I'm trying to keep thing simple. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself The occasional act of disrespect for the American flag creates but a flickering insult to the values of democracy -- unless it provokes America into limiting the freedoms that are its hallmark. -- Paul Tash, executive editor of the St. Petersburg Times From billtut@microsoft.com Tue Dec 21 00:55:14 1999 From: billtut@microsoft.com (Bill Tutt) Date: Mon, 20 Dec 1999 16:55:14 -0800 Subject: [Types-sig] Re: Py2C speed benefit Message-ID: <4D0A23B3F74DD111ACCD00805F31D8101D8BCB6F@RED-MSG-50> > From: Greg Stein [mailto:gstein@lyra.org] > > On Sun, 19 Dec 1999, skaller wrote: > >... > > > Now, here is something I believe, mainly from comments > > made at various times by Guido, Tim, and others: > > people have tried compiling python before, and found that > > the resulting C code didn't run much faster than the > > interpreter. Thats mainly because these compilers didn't > > know anythong about the types, they just generated API > > calls corresponding to what the byte code interpreter would > > execute -- and the interpreter is pretty fast already. > > Bill Tutt and I have done it and measured about 30% speed > improvement in > most cases. Not as lot as most people would hope for, but definitely > there. Bill is continuing to improve the code. > To clarify, this is just an approximate speed improvement in pystone. 
This doesn't (as yet) reflect a speed benefit when using typical OOP-like production code. I'm hoping to eventually find some time to optimize class-based method calls in Py2C. Bill From skaller@maxtal.com.au Tue Dec 21 17:29:22 1999 From: skaller@maxtal.com.au (skaller) Date: Wed, 22 Dec 1999 04:29:22 +1100 Subject: [Types-sig] Basic questions References: <002501bf4878$d2faf240$63a2143f@tim> <385A7495.D7A25EC4@appliedbiometrics.com> <385E5BCA.E238292E@vet.uu.nl> <385E6747.4831D32@prescod.net> <385E6E4A.4D56A904@vet.uu.nl> <385EC6D6.E55B0A27@prescod.net> Message-ID: <385FB8F2.343ED658@maxtal.com.au> Paul Prescod wrote: > This is why I think that "make everything explicit" is too strong of a > rule in practice. I want type-checked code and untype-checked code to > work together more or less seamlessly. I agree. > On the other hand, I don't want > to get into complicated data flow analysis. Even if someone implemented > it, how would we explain it to Python programmers? But this I do not understand. When an inferencer assigns types to a variable or function, there are three cases: (1) the types are what the programmer expected. Programmers usually expect a particular result. (2) the type is more general than the programmer expected. This is easy to explain: the inferencer isn't as smart as the programmer. If you want better types for these cases, add explicit type declarations. (3) the type is not what the programmer expected. In this case, there is a definite bug in the programmer's understanding of the code. (assuming the inference engine actually works correctly). [The bug may be in a function, or in a function call: that is, the programmer has to sort out whether the serving code is wrong, or the client code: does the function have to be generalised to meet the client's requirements, or does the client need to adjust the code to use the function as it was intended to be used??] It is only case (3) which is difficult. But, the difficulty is less than that which would result from a run time error, so the inferencing cannot reduce the programmer's understanding, only make them realise there are bugs earlier than they might wish to be reminded :-) Of course, there is a fourth case: the inferencer is producing the wrong answer. This would certainly confuse the programmer(s) -- probably both the client programmer and the author(s) of the inferencer :-) -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From gstein@lyra.org Tue Dec 21 19:21:43 1999 From: gstein@lyra.org (Greg Stein) Date: Tue, 21 Dec 1999 11:21:43 -0800 (PST) Subject: [Types-sig] parameterized typing In-Reply-To: <385CC508.D8684CEC@prescod.net> Message-ID: On Sun, 19 Dec 1999, Paul Prescod wrote: > Greg Stein wrote: > > .... > > Paul: does this sufficiently address your desire for parameterized types? > > Others: how does this look? It seems quite Pythonic to me, and is a basic > > extension of previous discussions (and to my thoughts of the design). > > Without thinking every detail through it looks good to me for handling > parameterized classes. I think that parameterized typedecls and > functions are still an issue. True. > Also, was it your intent that the _ be required or would the fact that > the param was declared obviate that. I am thinking that there may be a more > general syntax that allows us to parameterize various sorts of things. The "_" was just following Tim's lead. Certainly, there shouldn't be a requirement.
Maybe not even a convention (e.g. "self" is a convention rather than a requirement). I kind of like the leading underscore as it differentiates the param from regular arguments. > interface (a,b) foo: ... > class (a, b) foo: ... > def (a, b) foo(a) -> b: > decl foo(a,b) = typedef ... This has possibilities. Remember, though: I believe that parameterization is only useful for the type-checker. The Python runtime doesn't need it. In other words, our basis for choosing to do parameterization is based solely on a need for type checking. Is that need sufficient? I'm on the fence with parameterization altogether. The problem is that I'm not sure we can defer this one to a second phase because the type declarator syntax will probably be affected dramatically by the requirement for parameterization. i.e. we have to design it now to get the long-term type decl syntax correct :-( So: your idea for parameterization is nice, but I'd like to understand whether there is a strong feeling for having it in the first place. (before spending a lot of brain-cycles on the issue) Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Tue Dec 21 19:24:49 1999 From: gstein@lyra.org (Greg Stein) Date: Tue, 21 Dec 1999 11:24:49 -0800 (PST) Subject: [Types-sig] Issue: definition of "type" In-Reply-To: <385E2EF2.6CE888DC@prescod.net> Message-ID: On Mon, 20 Dec 1999, Paul Prescod wrote: > Tim Peters wrote: > > > spam is a class but not a static type. > > > > True, but it can be given a static type *name*; e.g., > > > > decl type spam > > > > Provided that the attributes of spam actually referenced outside of spam > > have the same signatures, static type checking outside of spam shouldn't > > care that it doesn't know about spam's internals. Or, IOW, if the two > > dynamic versions of spam present the same external interface to the > > compiler, it doesn't matter how the *class* spam comes into being at > > runtime. > > Okay, but do you or do you not agree that in the simple case of: > > class spam: > def a(self) -> String: > return "abc" > > a type object should be made implicitly as if someone had actually typed > in the decl. I certainly would not support a position that said that the > entire signature of spam had to be re-declared. I MIGHT support a > position that said that the user had to explicitly declare spam as being > available to the static type system. I believe that a typedecl object would be created implicitly with the above class definition. Even if the ->String wasn't present. Every class definition implies an interface typedecl. I concur: having to redeclare would suck. Explicit declaration is unneeded -- why should a person have to declare that the implicit type is usable? It is there, so it can be used. > I'm on the fence about this last requirement because I would like to > think that all of the code out there with class statements is *already* > defining a bunch of types. A minority of it depends on runtime > information and we can easily detect those cases. So why not let the > simple case of "defined class that doesn't depend on runtime > information" be a shortcut for a type declaration? Absolutely. Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Tue Dec 21 19:43:23 1999 From: gstein@lyra.org (Greg Stein) Date: Tue, 21 Dec 1999 11:43:23 -0800 (PST) Subject: [Types-sig] computing typedecl objects (was: New syntax?) In-Reply-To: <385E355E.9B15FA42@prescod.net> Message-ID: On Mon, 20 Dec 1999, Paul Prescod wrote: > Greg Stein wrote: > > > > ... > > > > Nah. 
No execution needs to take place. Just some data flow analysis. > > Let's be concrete: > 1. > > if somefunction(): > class a: > def b(self)->String: return "abc" > else: > class a: > def b(self)->Int: return 5 > > How many type objects are created? What are there names? What is the > type of a? There are three typedecl objects (not "type object"!). Let's annotate your code a bit to make this clearer: if somefunction(): class a: ... _internal_interface_a_1 = make_interface(a) # compiler implies this a.__typedecl = _internal_interface_a_1 # compiler remembering the type first_interface = typedef a # == _internal_interface_a_1 else: class a: ... _internal_interface_a_2 = make_interface(a) a.__typedecl = _internal_interface_a_2 second_interface = typedef a # type inferencer unions the type of a a.__typedecl = typedef _internal_interface_a_1 or _internal_interface_a_2 final_interface = typedef a The compiler does not create any names for these typedecl objects, although it does imply them (the "_internal.*" names demo this). I've also annotated that the compiler is remembering/associating a particular typedecl with the class object. Finally, I've shown where the user is explicitly fetching the typedecl object using the "typedef" keyword. At the end of the above code, "a" has a union typedecl (for the purposes of type checking/inferencing). The user has three typedecl objects held in three variables (first_interface, second_interface, final_interface). One point to make: "a" is a name referring to a class object. That is not the same as a typedecl object, although it can be used in some contexts where typedecl objects are needed. "typedef a" is definitely a typedecl object and it cannot be used to instantiate an object. It refers to an interface definition, actually. > 2. > > class a: > def b(self)->String: return "abc" > for i in sys.argv: > class a: > def b(self)->Int: return 5 Just before the "for" statement, "typedef a" returns one typedecl. After the "for" loop, it returns a different typedecl. Again: it will be a union of the two interfaces (because we don't know whether the loop executes zero or more iterations, so we can't know whether the class was redefined). If somebody gets smart and upgrades the type inferencer, it might be able to detect: class a: ... for i in range(10): class a: ... In this case, the inferencer knows the redefinition occurred, so it does not have to create a union type. > 3. > > def makeClass(): > class a: > def b( self ): > return "abc" > return a > > j=makeClass()() In this case, the "def" marks an analysis boundary. Its return type is "any". In a type-safe world, the makeClass()() fails because we cannot verify that a callable object was returned from makeClass. In a type-checked world, there is nothing wrong with the above code. > -------------------- > This seems intractable to me. I got around this in my original proposal > by requiring all declaring classes to be *top-level*. In other words I > formally defined the subset of Python that does not require code > execution. If you can formally define the semantics of "data flow" then > I will be able to compare the proposals. The data flow merely replaces typedecls [that are associated with names], or unions them if there are alternate code paths. The conditionals and things that push classes away from "top-level" do not confuse an inferencer. The type of your object will be a bit looser at type-check time than at runtime, however. 
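(As a toy illustration of "replace or union" -- invented names only, not any proposed checker API -- think of each branch as producing a name-to-types map; joining alternate paths unions the types recorded for each name:)

    def join(env_a, env_b):
        # union the possible types recorded for each name on two paths
        merged = {}
        for name in list(env_a.keys()) + list(env_b.keys()):
            union = []
            for t in env_a.get(name, ['Undefined']) + env_b.get(name, ['Undefined']):
                if t not in union:
                    union.append(t)
            merged[name] = union
        return merged

    # if somefunction():  class a: ...   -> implied interface I1
    # else:               class a: ...   -> implied interface I2
    then_path = {'a': ['I1']}
    else_path = {'a': ['I2']}
    print(join(then_path, else_path))   # {'a': ['I1', 'I2']} -- a union typedecl
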
The union will occur frequently when try/except is present: a = 1 try: a = "foo" ... except: ... typedef a == typedef Int or String > Note that I am half-way between you and Tim. I think that type objects > should be more like Python objects but I am willing to restrict where > they are created to make the problem tractable and the semantics > understandable. Please call them typedecl objects to avoid confusion with TypeType objects. typedecl objects are created through type declarators or implicitly by the inferencer and/or compiler. Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Tue Dec 21 19:46:46 1999 From: gstein@lyra.org (Greg Stein) Date: Tue, 21 Dec 1999 11:46:46 -0800 (PST) Subject: [Types-sig] typedefs (was: New syntax?) In-Reply-To: <385E4ED2.C3EEB28E@vet.uu.nl> Message-ID: On Mon, 20 Dec 1999, Martijn Faassen wrote: >... > I should finally work out my syntax proposal into something sensible > because now I'm confusing myself. :) I do still think there's something > interesting to be learned from the 'class instantiation' - 'typedef > instantiation' and 'value assignment' - 'type assignment' analogy. A summary would be good. I'm not sure at all where your position is because you've been discussing from each position at different times. Please create a bit of coherence :-) > [snip] > > In any case, I think using "def" inline to define a function typedecl is > > fine. A typedef is merely used to create an alias, to clarify a later > > declaration. > > Yes, but you basically have the same setup with current Python if you > exclude Lambdas. A function definition is merely used to create an > 'alias' for a piece of code, to clarify other pieces of code. If you I disagree that a function def is merely an alias. It provides a new namespace, parameter binding, and capabilities such as deferred execution. I definitely don't see it as simply an alias. > assume for the moment lambdas are bad, we may want to assume by analogy > that inline defs are not a good idea either. I don't think that argument follows. Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Tue Dec 21 20:04:07 1999 From: gstein@lyra.org (Greg Stein) Date: Tue, 21 Dec 1999 12:04:07 -0800 (PST) Subject: [Types-sig] Issue: definition of "type" In-Reply-To: <385E5695.1CAEF90B@vet.uu.nl> Message-ID: On Mon, 20 Dec 1999, Martijn Faassen wrote: > Paul Prescod wrote: > [snip] > > I'm on the fence about this last requirement because I would like to > > think that all of the code out there with class statements is *already* > > defining a bunch of types. A minority of it depends on runtime > > information and we can easily detect those cases. So why not let the > > simple case of "defined class that doesn't depend on runtime > > information" be a shortcut for a type declaration? > > Are you sure that in fact a minority depends on runtime information? > Often Python code is used without any inheritance link, like this: > > class Foo: > def doSomething(self): > ... > > class Bar: > def doSomething(self): > ... > > a = [Foo(), Bar()] > > for el in a: > el.doSomething() > > Doesn't this rely on run-time information? How would a type system deal > with this? I suppose I'm entering the domain of interfaces now... The type of "a" is a List where the elements' type is the union of the type of each initialization value. In this case: typedef a == typedef [Foo or Bar] Pretty straightforward, but I'd be happy to detail this. 
When the checker gets to el.doSomething, it knows that the type of el is "Foo or Bar". It then goes through each alternative and verifies that ".doSomething" is legal for that possibility. No problem :-) Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Tue Dec 21 20:05:33 1999 From: gstein@lyra.org (Greg Stein) Date: Tue, 21 Dec 1999 12:05:33 -0800 (PST) Subject: [Types-sig] Issue: definition of "type" In-Reply-To: <385E57D8.E5518928@prescod.net> Message-ID: On Mon, 20 Dec 1999, Paul Prescod wrote: > Martijn Faassen wrote: > >... > > Doesn't this rely on run-time information? How would a type system deal > > with this? I suppose I'm entering the domain of interfaces now... > > Yes, that is the role of interfaces. Nobody has yet suggested that the > code you described would be type-safe. The two doSomething methods are > unrelated. I maintain that it could be declared type-safe. In fact, it is reasonably straight-forward to generate the type information at each point, for each value, and then to verify that the .doSomething is valid. Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Tue Dec 21 20:09:36 1999 From: gstein@lyra.org (Greg Stein) Date: Tue, 21 Dec 1999 12:09:36 -0800 (PST) Subject: [Types-sig] polymorphic code (was: A challenge) In-Reply-To: <385E5BCA.E238292E@vet.uu.nl> Message-ID: On Mon, 20 Dec 1999, Martijn Faassen wrote: >... > Okay, but then I'll repeat the question I asked before: > > class Foo: > def getIt(self)->String: > ... > > class Bar: > def getIt(self)->String: > ... > list = [Foo(), Bar()] > > for el in list: > print el.doIt() > > This wouldn't work, even though the interfaces are similar. This brings I maintain that it will work :-) [ assuming your doIt() is a typo, and you intended getIt() ] >... > * interfaces > > Another way to do it is to use interfaces and say Foo and Bar both > conform to some interface which supports doIt(). This was something we > wouldn't discuss in this SIG, but can we in fact avoid it? We don't need any explicit interfaces to resolve the above code to determine that it is type-safe. "el" has one of two types: Foo or Bar. The (implicit) interface of each has a method named getIt that takes zero parameters. In this case, the "print" statement can take any type, so we don't even need to worry about the return types (even though they happen to be the same). Specifically: the type-checker does not have to unify the interface to verify type-safety or type-checks, it merely needs to check that each alternative for the type of "el" supports the required method and signature. Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Tue Dec 21 20:13:27 1999 From: gstein@lyra.org (Greg Stein) Date: Tue, 21 Dec 1999 12:13:27 -0800 (PST) Subject: [Types-sig] type-assert operator optimizations (was: New syntax?) In-Reply-To: <385E64C9.A82E84@maxtal.com.au> Message-ID: On Tue, 21 Dec 1999, skaller wrote: > Greg Stein wrote: > > > I think that we definitely want to be able to construct and use typedecl > > objects at runtime. That's why I prefer the typedef unary operator over > > your "sub-language." > > Are these options mutually exclusive? I'm not sure that I understand this question. I think some context was lost (i.e. what is the sub-language). > I've implemented operator ! in Viper now, x!t checks type(x) is t, >... > My current feeling: I quite like it -- but the above > is the only use I have tried, other than specifically > for testing it. 
My feeling, also, is that in those > circumstances where the test would fail, then the > program should be considered in error (that is, > it is not legitimate practice to catch and handle > the TypeError, so that if a compiler can prove it would > be raised, it is entitled to reject the program, > and a lint like checker, to issue a diagnostic. > [The explicit test, like in the second example above, > should be used if it is desired to catch and handle > the raised TypeError] > > This means that the x!t can be optimised to x, > without affecting strictly conforming program > semantics. If the compiler can definitively state that the test will never fail, then it doesn't have to include a runtime check. If the compiler can definitively state that the test will always fail, then it can issue an error and refuse to compile. [ with the caveat of catching exceptions ] If the compiler believes that it might fail in some cases, then it could issue a warning (and go ahead and insert a runtime check). [ and yes, there can be switches to avoid issuing warnings ] Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Tue Dec 21 20:17:09 1999 From: gstein@lyra.org (Greg Stein) Date: Tue, 21 Dec 1999 12:17:09 -0800 (PST) Subject: [Types-sig] Basic questions In-Reply-To: <385E6747.4831D32@prescod.net> Message-ID: On Mon, 20 Dec 1999, Paul Prescod wrote: >... > But the much more basic question is whether: > > class foo: pass > > even *defines* a type that can be used in type declarations. Greg says > yes, even if the declaration is buried in code. Tim says no,(I think) Definitely yes. The typedecl syntax allows the use of a class object as a way to specify a typedecl. Internally, the class contains a reference to an interface definition; the interface is the "real" typedecl. >... > 1. Is this valid: > > class foo: pass > > def a( arg: foo ): pass Absolutely. The compiler understands that "foo" refers to a class object, so it is allowed in a typedecl. > 2. Is this valid: > > if someFunc(): > class foo: "abc" > else: > class foo: "def" > > def a( arg: foo ): pass Absolutely. The compiler understands that "foo" refers to a class object, although it doesn't know which one. No matter, though, as it just associates a union typedecl with the name "foo". Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Tue Dec 21 20:19:22 1999 From: gstein@lyra.org (Greg Stein) Date: Tue, 21 Dec 1999 12:19:22 -0800 (PST) Subject: [Types-sig] New syntax? In-Reply-To: <385E6C8C.635293B2@4-am.com> Message-ID: On Mon, 20 Dec 1999, Evan Simpson wrote: >... > In one of the many messages I started composing for this SIG, then never sent, > I mixed regexp-style notation into your ML-style declarations. How's about: > > (T*) means T-tuple of unknown length, (T+) means length at least one, (T1?, > T2{1,2}, T3) means optional T1 followed by one or two T2's and exactly one T3. > This still requires (T,) for a single-T tuple, but all other uses are > distinguishable from grouping. An interesting approach, but it is a bit cryptic. Almost shades of Perl... :-) But hey: I haven't offered a better approach, so who am I to say? :-) Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Tue Dec 21 20:26:11 1999 From: gstein@lyra.org (Greg Stein) Date: Tue, 21 Dec 1999 12:26:11 -0800 (PST) Subject: [Types-sig] Basic questions In-Reply-To: <385E6E4A.4D56A904@vet.uu.nl> Message-ID: On Mon, 20 Dec 1999, Martijn Faassen wrote: >... > > 2. 
Is this valid: > > > > if someFunc(): > > class foo: "abc" > > else: > > class foo: "def" > > > > def a( arg: foo ): pass > > This is the really interesting one.. Perhaps interfaces can help here. > > One rule could be this: > > You can't define the same name multiple times in the same scope. You > have to do 'class foo1' and 'class foo2' instead, and then say they > both conform to the interface 'foo'. Icky. No way. Even if I didn't believe that the inferencer could resolve the above code fragment, I wouldn't like having to use different names. That feels like too much of an imposition (on the part of the type-checker) onto my code. > Consequences: > > * A separate interface declaration syntax would seem to be required. Nah. A class implies an interface. > Consequences I describe at the alternative rule apply too, I think. > > An alternative rule would be the following: > > Any class names that are defined multiple times in the same scope are > taken to support an interface with that same name. This interface is the > only type you can use elsewhere; you can't use the class type directly. > It is a compile time error if classes with the same name define > different interfaces. The typedecl is a union of the two (implied) interfaces. No reason to impose a single interface or to refuse the usage of the class name. > Consequences: > > * This may mean we enter access-rule land; it would be okay classes > conforming to an interface to define different member variables, as long > as these are private. I don't see this happening. > * The interface needs to be hooked up to the actual implementation > during runtime. This may happen as soon as a class (that the compiler > has seen has multiple definitions) is actually being defined at > run-time. I do agree that a class object would have an associated typedecl object at runtime. The typedecl would define the class' interface. > * These are odd interfaces in the sense that it looks as if you can > instantiate from them! What 'in fact' happens is that the interfaces > passes any instantiation requests to the actual class that's doing the > implementation -- the interface is a simple factory. I disagree. I would say the interface is just another typedecl. And typedecl objects are not callable (and are certainly not factories). Even if you wanted to make an interface instantiable, that just can't happen: an interface could be used by multiple classes. > The same story would apply to function signatures/prototypes; if the > same function name occurs multiple times in the same scope they're all > taken to define the same prototype, which would be the actual type used. Again: I disagree. The inferencer would associate different typedecl objects (signatures) with the name at different points in the execution. Depending on the control flow, each redefinition will cause a union of typedecl objects, or a replacment. Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Tue Dec 21 20:37:48 1999 From: gstein@lyra.org (Greg Stein) Date: Tue, 21 Dec 1999 12:37:48 -0800 (PST) Subject: [Types-sig] Basic questions In-Reply-To: <385E7832.81D4CFC7@maxtal.com.au> Message-ID: On Tue, 21 Dec 1999, skaller wrote: >... > If I may: there is an issue here which > some people may not have realised: recursive types. These are not possible in Python because definitions are actually constructed at runtime. The particular name/object must be available at that point in the execution. > In an interface file, this can be handled by > two passes. 
> > In implementation files, it is much > harder, since scoping rules are dynamic. > This is a good argument for interface files. > Example: > > class X: > def f(y:Y): ... This fails. Y is not defined. > class Y: > def g(x:X): ... > > Resolving this in a single pass requires > backpatching, which is messy: but using two > passes leads to difficult ambiguities > due to renaming: > > class X: > def h(): ... > > Ok -- so now, which X does the X in g refer to? Y.g referred to the X that existed at that point in time. > In python, names are bound dynamically, which resolves > the problem. In an interface file, renaming can be banned. > Can it be banned, for classes, in implementation files? No need to ban it. Y.g refers to the first X. Simple. Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Tue Dec 21 20:39:52 1999 From: gstein@lyra.org (Greg Stein) Date: Tue, 21 Dec 1999 12:39:52 -0800 (PST) Subject: [Types-sig] Interfaces In-Reply-To: <385E8186.3BC32DF4@maxtal.com.au> Message-ID: On Tue, 21 Dec 1999, skaller wrote: > Martijn Faassen wrote: > > class Foo conforms FooInter: > > How about > > class Foo is a FooInter: .. I don't think we should be worrying about how to explicitly declare and associate interfaces with classes. The type system can easily infer an interface from a class definition, and we can work with that. I also believe they do not need to be explicit for the type system to function. A later phase can make interfaces explicit. Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Tue Dec 21 20:41:13 1999 From: gstein@lyra.org (Greg Stein) Date: Tue, 21 Dec 1999 12:41:13 -0800 (PST) Subject: [Types-sig] compatibility (was: tuples) In-Reply-To: Message-ID: On Mon, 20 Dec 1999, Tony Lownds wrote: > On Mon, 20 Dec 1999, Guido van Rossum wrote: > > The only reason not to switch to tuples is backwards compatibility -- > > in particular there is a lot of code (e.g. in the std library) that > > creates new arg lists by adding tuples to *args. This could be solved > > by allowing + to operate on a mix of lists and tuples. I think the > > result should yield a list. > > There would be forwards compatability issues too; people might starting > writing: > > class A: > def foo(self, *args): > args[:0] = [self] > apply(foo, args) > > def bar(self, *args): > ... > > This code would not work on existing Pythons. This kind of stuff happens all the time. There is code out there with "assert" statements that don't work on old versions of Python. Python 1.6 has methods on the string objects; old versions do not. Cheers, -g -- Greg Stein, http://www.lyra.org/ From guido@CNRI.Reston.VA.US Tue Dec 21 20:42:33 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Tue, 21 Dec 1999 15:42:33 -0500 Subject: [Types-sig] Recursive types (was: Basic questions) In-Reply-To: Your message of "Tue, 21 Dec 1999 12:37:48 PST." References: Message-ID: <199912212042.PAA13406@eric.cnri.reston.va.us> > From: Greg Stein > > On Tue, 21 Dec 1999, skaller wrote: > >... > > If I may: there is an issue here which > > some people may not have realised: recursive types. > > These are not possible in Python because definitions are actually > constructed at runtime. The particular name/object must be available at > that point in the execution. Huh? "Recursive types" typically refers to all sorts of nodes, graphs and trees (where an instance attribute has the same type as its container). Certainly these are possible in Python! > > In an interface file, this can be handled by > > two passes. 
> > > > In implementation files, it is much > > harder, since scoping rules are dynamic. > > This is a good argument for interface files. > > Example: > > > > class X: > > def f(y:Y): ... > > This fails. Y is not defined. If I understand the context correctly (X defined before Y but using Y) I disagree. Since this works fine without type declarations (as long as the instantiations happen after the classes are defined) I don't see why adding static typing should break this. Also, I think that static typing should have a much more liberal view about forward referencing than Python itself. Since it is quite legal to have def f(): g() def g(): ... print f() I think that typecheckers should deal with this, and not complain about the forward reference to g in f! (Except when f is called before g is defined. Flow analysis should allow this distinction.) (Incidentally, this is one of the things that annoys me about Aaron Watters's kjpylint: it warns about all forward references. This conflicts with the top-down coding style that I currently prefer for procedural coding, where main() precedes everything it calls, and so on.) --Guido van Rossum (home page: http://www.python.org/~guido/) From gstein@lyra.org Tue Dec 21 20:50:44 1999 From: gstein@lyra.org (Greg Stein) Date: Tue, 21 Dec 1999 12:50:44 -0800 (PST) Subject: [Types-sig] Basic questions In-Reply-To: <385EC6D6.E55B0A27@prescod.net> Message-ID: On Mon, 20 Dec 1999, Paul Prescod wrote: >... > This is why I think that "make everything explicit" is too strong of a > rule in practice. I want type-checked code and untype-checked code to > work together more or less seamlessly. On the other hand, I don't want > to get into complicated data flow analysis. Even if someone implemented > it, how would we explain it to Python programmers? "In order to > understand what types your program is producing, follow this complicated > algorithm." That's why we are running away from strict > (non-conservative) type inferencing in the first place. Euh... why does it have to be explained? Why do Python programmers care what the types are? They know. The inferencer is just figuring out what the programmer did. The programmer doesn't have to understand it to produce valid programs. John responded to this in a different email listing the kinds of mismatches between the programmer's intent and the inferencer's deductions. He explains the situation well... >... > This is a class/static type definition: > > class a: pass Yes. > This is not: > > if 1: > class a: pass I disagree. This does create an implicit typedecl which can be used. In addition the class name "a" can be used. Caveat: typedef a == typedef or Undefined Specifically, the compiler may warn the programmer that "a" could possibly be undefined. [ because I really don't think we want to do constant evaluation in the inferencer. although if somebody does... cool! it could then removed the Undefined alternative. ] > This is a function declaration where the function's type (Any->Any) is > known at compile time: > > def a( b ): return "foo" Agreed. > This is not a static function declaration and cannot be used from static > code without a type assertion: > > if 1: > def a( b ): return "foo" I disagree again :-), for the same reasons as the class. > I'm trying to keep thing simple. An approach that I heartily agree with! However, I'd rather make it simple for the Python programmer: define it wherever you want, in whatever style you're using -- we won't force you to use a particular style. 
In other words, I think the rule of "it must be at the top-level, by which I mean yah.. at the globals level, but not inside an 'if' statement. oh, or inside a 'for' statement or a 'while' statement, for that matter. hrm. imports might enter into this somehow, too... lemme think..." I say just let them do what they want. I believe we can figure it out. Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Tue Dec 21 21:07:27 1999 From: gstein@lyra.org (Greg Stein) Date: Tue, 21 Dec 1999 13:07:27 -0800 (PST) Subject: [Types-sig] recursive types, type safety, and flow analysis (was: Recursive types) In-Reply-To: <199912212042.PAA13406@eric.cnri.reston.va.us> Message-ID: On Tue, 21 Dec 1999, Guido van Rossum wrote: > > From: Greg Stein > > > > On Tue, 21 Dec 1999, skaller wrote: > > >... > > > If I may: there is an issue here which > > > some people may not have realised: recursive types. > > > > These are not possible in Python because definitions are actually > > constructed at runtime. The particular name/object must be available at > > that point in the execution. > > Huh? "Recursive types" typically refers to all sorts of nodes, graphs > and trees (where an instance attribute has the same type as its > container). Certainly these are possible in Python! True... I use the stuff, too... I should have clarified that I don't think his particular example would work because of the compile- / definition- time recursion of the names. Runtime? Sure. It's fine. > > > In an interface file, this can be handled by > > > two passes. > > > > > > In implementation files, it is much > > > harder, since scoping rules are dynamic. > > > This is a good argument for interface files. > > > Example: > > > > > > class X: > > > def f(y:Y): ... > > > > This fails. Y is not defined. > > If I understand the context correctly (X defined before Y but using Y) > I disagree. Since this works fine without type declarations (as long > as the instantiations happen after the classes are defined) I don't > see why adding static typing should break this. Because Y is not defined. This is analogous to the following code: class Foo: def build_it(self, x, y, cls=Bar): return cls(x, y) class Bar: ... The above code breaks. I am positing that if you put "Y" into a declarator, then it should exist at that point in time. Where "time" is specified as following the flow of execution as the functions/classes are defined. > Also, I think that static typing should have a much more liberal view > about forward referencing than Python itself. Since it is quite legal > to have > > def f(): g() > def g(): ... > print f() > > I think that typecheckers should deal with this, and not complain > about the forward reference to g in f! (Except when f is called > before g is defined. Flow analysis should allow this distinction.) Good point. I don't think we can detect that call-before-definition, though. I think your point can be restated as: Can we type-check the following code? def f() -> String: return g() def g() -> String: ... I haven't thought about this particular scenario or the resulting impact on the inferencer. We probably require some kind of a two-pass analysis as John points out. Maybe it is as simple as deferring analysis of function bodies until the global "body" is analyzed. Actually, I think the deferral mechanism is sufficient, as that mirrors the execution environment: at the time the function body is executed, the globals are defined. [ with the caveat of call-before-define, but we can't determine that ] Hrm. 
The whole call-before-define problem might actually axe Paul's desire for type safety. I don't think we can *ever* guarantee that a name will exist at the time it is used. For example: value = 1 def typesafe f(): somefunc(value) del value f() If you start doing *control* flow analysis, then you might be able to definitely state the above code is in error. But then, I'll just throw this wrench at it: if sometest(): del value f() Now what? The type inferencing that I believe we can/should be using is based on some basic data flow, without regard to definitively determining whether a particular branch is reached or not. If a possibility exists, then the possible types are unioned in. If type-safety is defined as "no NameErrors", then we have a problem, as control flow is required. Note that I believe the following can be handled: value = 1 def typesafe f() func_taking_int(value) value = "foo" f() In this case, the global variable "value" has a typedecl of: Int or String. This would fail the func_taking_int() function call. Back to the other point: I do believe that you should not use a name in a type-declarator if it isn't defined. Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Tue Dec 21 21:11:21 1999 From: gstein@lyra.org (Greg Stein) Date: Tue, 21 Dec 1999 13:11:21 -0800 (PST) Subject: [Types-sig] Basic questions In-Reply-To: <385FB8F2.343ED658@maxtal.com.au> Message-ID: On Wed, 22 Dec 1999, skaller wrote: > Paul Prescod wrote: > > This is why I think that "make everything explicit" is too strong of a > > rule in practice. I want type-checked code and untype-checked code to > > work together more or less seamlessly. > > I agree. Me too! :-) [ although, strictly speaking, I'm not sure of the granularity of enabling type-checking, other than the presence/absence of type declarators. is the checking on a module level? if a function level, how do we indicate that? a new keyword(s)? ] > > On the other hand, I don't want > > to get into complicated data flow analysis. Even if someone implemented > > it, how would we explain it to Python programmers? > > But this I do not understand. When an inferencer assigns > types to a variable of function, there are three cases: > > (1) the types are what the programmer expected. >... > (2) the type is more general than the programmer expected. >... > (3) the type is not what the programmer expected. >... > It is only case (3) which is difficult. But, the difficulty > is less than that which would result from a run time error, so the > inferencing cannot reduce the programmers understanding, only > make them realise there are bugs earlier than they might wish to > be reminded :-) Excellent analysis! I heartily concur! > Of course, there is a fouth case: the inferencer is > producing the wrong answer. This would certainly confuse the > programmer(s) -- probably both the client programmer and the > author(s) of the inferencer :-) Ssshhh! Quiet! We don't talk about that around here. Cheers, -g -- Greg Stein, http://www.lyra.org/ From guido@CNRI.Reston.VA.US Tue Dec 21 23:38:58 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Tue, 21 Dec 1999 18:38:58 -0500 Subject: [Types-sig] recursive types, type safety, and flow analysis (was: Recursive types) In-Reply-To: Your message of "Tue, 21 Dec 1999 13:07:27 PST." 
References: Message-ID: <199912212338.SAA13830@eric.cnri.reston.va.us> [Greg Stein] > I should have clarified that I don't think his particular example would > work because of the compile- / definition- time recursion of the names. > Runtime? Sure. It's fine. Hm... Since type checking is essentially a compile time activity, I think it would be better if the run time order of events didn't matter. Yes, in Pascal or C you need to declare everything before it's used. This is a compromise because of old-fashioned one pass compiler technology. I don't see a reason why we should adopt this rule in Python. Note that Java doesn't have this either -- you can declare your methods in any order you like and the compiler will figure it out. But, you may say, in Python we have a certain run time order of events that defines the validity of names. Names must be defined before they are used, and a later redefinition overrides an earlier one. Of course. (Although I wouldn't mind getting at least a compile time warning when I define two classes or methods with the same name; it can be frustrating when you're editing the first definition but your program keeps using the second one! :-) But checking that names are defined by the time they are used at run time is a different kind of check. (Java does this too, to a decent extent.) I personally find this a very useful check. But it doesn't necessarily affect the compile time static typechecking. > Because Y is not defined. This is analogous to the following code: > > class Foo: > def build_it(self, x, y, cls=Bar): > return cls(x, y) > > class Bar: > ... > > The above code breaks. I am positing that if you put "Y" into a > declarator, then it should exist at that point in time. Where "time" is > specified as following the flow of execution as the functions/classes are > defined. I disagree. See above -- there's no reason to burden the compile time type checker with run time ordering. > I don't think we can detect that call-before-definition, though. But I think you can, in by far the most cases. There may be a few borderline cases where it's impossible to tell, and I don't mind requiring a little help to make the type checker happy. For example, in C code I frequently add an initialization of a local variable to 0 which isn't really necessary because it is initialized in a for loop, but the compiler isn't smart to figure out that the for loop will execute at least once. Gcc -Wall complains about such cases, and shutting it up completely every once in a while is sufficiently satisfying that I'll add redundant code. Of course, I'd be happier if gcc was smarter, and I hope that Python's type checker will usually be smarter -- and then in the remaining cases I think it's okay to help it. > I think your point can be restated as: Can we type-check the following > code? > > def f() -> String: > return g() > > def g() -> String: > ... > > I haven't thought about this particular scenario or the resulting impact > on the inferencer. We probably require some kind of a two-pass analysis as > John points out. Maybe it is as simple as deferring analysis of function > bodies until the global "body" is analyzed. Actually, I think the deferral > mechanism is sufficient, as that mirrors the execution environment: at the > time the function body is executed, the globals are defined. > [ with the caveat of call-before-define, but we can't determine that ] I don't see a big problem here for the type checker. 
Assuming that there's only one definition of g, and that we disallow changes to g from outside the module (and from exec statements), the type checker will have no trouble discovering that g is a global function definition, and it can collect its type info to help checking f. There may even be arbitrary cross references; the solution (from the type checking point of view) is to iterate until all definitions are found. Again, checking that g is actually defined by the time f is called is a separate thing; but again in most cases this will be easy, since there is usually no executable code between the definitions of f and g (except perhaps other function definitions). It's a simple flow check. > Hrm. The whole call-before-define problem might actually axe Paul's desire > for type safety. I don't think we can *ever* guarantee that a name will > exist at the time it is used. For example: > > value = 1 > def typesafe f(): > somefunc(value) > > del value > f() > > If you start doing *control* flow analysis, then you might be able to > definitely state the above code is in error. But then, I'll just throw > this wrench at it: > > if sometest(): > del value > f() > > Now what? Simple. After the if statement has executed, value is "possibly undefined". This warrants a warning. > The type inferencing that I believe we can/should be using is based on > some basic data flow, without regard to definitively determining whether a > particular branch is reached or not. If a possibility exists, then the > possible types are unioned in. If type-safety is defined as "no > NameErrors", then we have a problem, as control flow is required. I don't see the problem. I claim that the examples you give are accidents waiting to happen, so it's only helpful if the type checker complains about them! > Note that I believe the following can be handled: > > value = 1 > def typesafe f() > func_taking_int(value) > > value = "foo" > f() > > In this case, the global variable "value" has a typedecl of: Int or > String. This would fail the func_taking_int() function call. Yes. In my view, "possibly undefined" is no different than "int or string". > Back to the other point: I do believe that you should not use a name in a > type-declarator if it isn't defined. I like the idea better (I think proposed by Tim Peters) that the names used for type declarations live in a separate compile time namespace where different rules apply. (Even though there are obvious correspondences, e.g. the names of defined or imported classes should probably be available both at compile time and at run time.) --Guido van Rossum (home page: http://www.python.org/~guido/) From gstein@lyra.org Wed Dec 22 00:33:24 1999 From: gstein@lyra.org (Greg Stein) Date: Tue, 21 Dec 1999 16:33:24 -0800 (PST) Subject: [Types-sig] recursive types, type safety, and flow analysis In-Reply-To: <199912212338.SAA13830@eric.cnri.reston.va.us> Message-ID: On Tue, 21 Dec 1999, Guido van Rossum wrote: > [Greg Stein] > > I should have clarified that I don't think his particular example would > > work because of the compile- / definition- time recursion of the names. > > Runtime? Sure. It's fine. > > Hm... Since type checking is essentially a compile time activity, I > think it would be better if the run time order of events didn't > matter. But runtime order does matter. a = 1 func_taking_int(a) a = "foo" func_taking_string(a) I can come up with all kinds of variants, but that's the basic pattern. The code is perfectly type-safe, and depends on the order of events. 
If we waited until the end, then "a" will either have type "String," or type "Int or String." Either way, it produces false errors. > Yes, in Pascal or C you need to declare everything before > it's used. This is a compromise because of old-fashioned one pass > compiler technology. I don't see a reason why we should adopt this > rule in Python. Note that Java doesn't have this either -- you can > declare your methods in any order you like and the compiler will > figure it out. My new position (after your prodding my brain :-) is that each suite is evaluated in order. The global suite first, then each of the function bodies (in arbitrary order). This basically gives us a multiple pass, and allows functions, variables, classes, etc to be defined in any order. But: it still doesn't allow for the recursive type declarators. To be clear, it allows: def f() -> String: return g() def g() -> String: return "abc" But it does not allow: def f(x: Foo): ... class Foo: ... I believe the compiler should be recording information about the function arguments' typedecls. Unless the compiler is going to have multiple passes then the name should be defined before usage. Or rather, let's assume that the function argument information is constructed and recorded at runtime (as part of the standard function object construction at runtime). Then you really have to ensure that name is available, so the appropriate value can be stored into the function object. (this is, of course, predicated on recording signatures in the function object for use at runtime; I feel strongly that we should do this, as it will dramatically assist some runtime tools/apps such as IDLE) > But, you may say, in Python we have a certain run time order of events > that defines the validity of names. Names must be defined before they > are used, and a later redefinition overrides an earlier one. > > Of course. (Although I wouldn't mind getting at least a compile time > warning when I define two classes or methods with the same name; it > can be frustrating when you're editing the first definition but your > program keeps using the second one! :-) We can surely issue a warning for a redefinition that changes the type. > But checking that names are defined by the time they are used at run > time is a different kind of check. (Java does this too, to a decent > extent.) I personally find this a very useful check. But it doesn't > necessarily affect the compile time static typechecking. But we have runtime information to record, and I also believe that we have some runtime type checks to perform. In the above example, I think we should be implicitly inserting code like so: def f(x: Foo): x ! Foo ... While we can do a lot of useful compile-time checks, I think we still have runtime considerations that impose ordering. > > Because Y is not defined. This is analogous to the following code: > > > > class Foo: > > def build_it(self, x, y, cls=Bar): > > return cls(x, y) > > > > class Bar: > > ... > > > > The above code breaks. I am positing that if you put "Y" into a > > declarator, then it should exist at that point in time. Where "time" is > > specified as following the flow of execution as the functions/classes are > > defined. > > I disagree. See above -- there's no reason to burden the compile time > type checker with run time ordering. I've loosened up to "globals first, then function bodies." That provides a lot of relaxation of the requirements. 
(but it still does not allow for recursive declarators) To resolve the recursive declarator problem, I think we'd simply want a notion of an undefined interface (much like an undefined struct). I'm not sure of the mechanics for runtime resolution, but this would allow us to do something like: decl class Foo class Bar: def method(self, x: Foo)->None: ... class Foo: def method(self, x: Bar)->None: ... We still have a runtime issue, however :-( > > I don't think we can detect that call-before-definition, though. > > But I think you can, in by far the most cases. There may be a few You can within a single block of code. It is very difficult across code bodies, and requires an entirely different kind of analysis. My point was about different code bodies. > borderline cases where it's impossible to tell, and I don't mind > requiring a little help to make the type checker happy. For example, > in C code I frequently add an initialization of a local variable to 0 > which isn't really necessary because it is initialized in a for loop, > but the compiler isn't smart to figure out that the for loop will > execute at least once. Gcc -Wall complains about such cases, and > shutting it up completely every once in a while is sufficiently > satisfying that I'll add redundant code. Of course, I'd be happier if > gcc was smarter, and I hope that Python's type checker will usually be > smarter -- and then in the remaining cases I think it's okay to help > it. Sure. This is all within a single code body. I agree that we can provide use-before-definition errors. Going back to the original problem: def f(): g() def g(): ... There isn't a way to easily know that g is defined at the time f is called. We don't even record that g is used by f! The best that we can do is note that g is available in the global namespace and that it has a proper type. But we can't determine whether it might be Undefined or not. Specifically, consider the two cases: 1) del g f() 2) f() del g Unless the compiler knows that f() is going to use g, it can't do anything here. It has to do some *serious* control flow analysis and record a lot of information about f's requirements. We might be able to go one step beyond and say (during f's analysis) that g is possibly undefined, but we would get a lot of those warnings. That's because we don't really have/record information that distinguishes these patterns: 1) f() def g(): ... 2) def g(): ... f() 3) def g(): ... del g f() 4) def g(): ... f() del g In case 1, you could say that g is "Func or Undefined" simply by stating a policy that *any* global is "... or Undefined". The analysis of f() would then raise an appropriate flag. In case 2, any assumption of "or Undefined" is invalid. The code is fine. In cases 3 and 4, maybe we're assuming that not all globals get an "or Undefined" unless we see a "del" statement. That's fine, but we may have a false warning because we can't different cases 3 and 4. > > I think your point can be restated as: Can we type-check the following > > code? > > > > def f() -> String: > > return g() > > > > def g() -> String: > > ... > > > > I haven't thought about this particular scenario or the resulting impact > > on the inferencer. We probably require some kind of a two-pass analysis as > > John points out. Maybe it is as simple as deferring analysis of function > > bodies until the global "body" is analyzed. 
Actually, I think the deferral > > mechanism is sufficient, as that mirrors the execution environment: at the > > time the function body is executed, the globals are defined. > > [ with the caveat of call-before-define, but we can't determine that ] > > I don't see a big problem here for the type checker. Assuming that > there's only one definition of g, and that we disallow changes to g > from outside the module (and from exec statements), the type checker > will have no trouble discovering that g is a global function > definition, and it can collect its type info to help checking f. > There may even be arbitrary cross references; the solution (from the > type checking point of view) is to iterate until all definitions are > found. I agree. We can do this. This was my track about deferral. > Again, checking that g is actually defined by the time f is called is > a separate thing; but again in most cases this will be easy, since > there is usually no executable code between the definitions of f and g > (except perhaps other function definitions). It's a simple flow check. I disagree that it will be easy or that it is "a simple flow check." Checking for undefined names is really only easy within a single code body. As I outline further above, I don't think you can tell whether g is really defined by the time f is called. > > Hrm. The whole call-before-define problem might actually axe Paul's desire > > for type safety. I don't think we can *ever* guarantee that a name will > > exist at the time it is used. For example: > > > > value = 1 > > def typesafe f(): > > somefunc(value) > > > > del value > > f() > > > > If you start doing *control* flow analysis, then you might be able to > > definitely state the above code is in error. But then, I'll just throw > > this wrench at it: > > > > if sometest(): > > del value > > f() > > > > Now what? > > Simple. After the if statement has executed, value is "possibly > undefined". This warrants a warning. Right. But we don't cross-reference the fact that at after the if statement is one of the times that we call f() and that f() happens to need "value." Recording this information about "when" is control flow. I think we just want to record possible types and feed that into the type-checker. With the presence of "del" in the above code, maybe we can record "or Undefined", but that isn't really going to do what we'd like. > > The type inferencing that I believe we can/should be using is based on > > some basic data flow, without regard to definitively determining whether a > > particular branch is reached or not. If a possibility exists, then the > > possible types are unioned in. If type-safety is defined as "no > > NameErrors", then we have a problem, as control flow is required. > > I don't see the problem. I claim that the examples you give are > accidents waiting to happen, so it's only helpful if the type checker > complains about them! I agree, and it would be nice to have a warning. I don't think it is possible (given the scope of analysis that I'm thinking of). You would need a LOT more analysis to determine "undefined." And it would probably have to be global (cross-module). > > Note that I believe the following can be handled: > > > > value = 1 > > def typesafe f() > > func_taking_int(value) > > > > value = "foo" > > f() > > > > In this case, the global variable "value" has a typedecl of: Int or > > String. This would fail the func_taking_int() function call. > > Yes. In my view, "possibly undefined" is no different than "int or > string". 
Agreed. See above. > > Back to the other point: I do believe that you should not use a name in a > > type-declarator if it isn't defined. > > I like the idea better (I think proposed by Tim Peters) that the names > used for type declarations live in a separate compile time namespace > where different rules apply. (Even though there are obvious > correspondences, e.g. the names of defined or imported classes should > probably be available both at compile time and at run time.) I think we're going to want those names at runtime, which means they should be defined at the time of their usage. If we have a separate namespace, then I think the output of the compiler will need a bit more magic because it would need to build that namespace right at the beginning, for use by the code later on. i.e. some prologue which sets up the typedecl namespace. That just doesn't strike me as a good thing. Cheers, -g -- Greg Stein, http://www.lyra.org/ From guido@CNRI.Reston.VA.US Wed Dec 22 01:19:05 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Tue, 21 Dec 1999 20:19:05 -0500 Subject: [Types-sig] recursive types, type safety, and flow analysis In-Reply-To: Your message of "Tue, 21 Dec 1999 16:33:24 PST." References: Message-ID: <199912220119.UAA14134@eric.cnri.reston.va.us> [me] > > Hm... Since type checking is essentially a compile time activity, I > > think it would be better if the run time order of events didn't > > matter. [Greg] > But runtime order does matter. > > a = 1 > func_taking_int(a) > a = "foo" > func_taking_string(a) > > I can come up with all kinds of variants, but that's the basic pattern. > The code is perfectly type-safe, and depends on the order of events. If we > waited until the end, then "a" will either have type "String," or type > "Int or String." Either way, it produces false errors. If this pattern occurs locally (a is a local variable), fine. The flow analyzer will have no problem with this, and shouldn't find any type errors. But if a were a global, I'd say this was bad taste and asking for trouble. I'm giving up on responding point-by-point -- let's just agree that we differ in opinion on this matter. > But: it still doesn't allow for the recursive type declarators. To be > clear, it allows: > > def f() -> String: > return g() > def g() -> String: > return "abc" > > But it does not allow: > > def f(x: Foo): > ... > class Foo: > ... If there's only one Foo (which is usually the case) I still think this is too strict, and I don't see a technical reason why it would be necessary. > I believe the compiler should be recording information about the function > arguments' typedecls. Unless the compiler is going to have multiple passes > then the name should be defined before usage. > > Or rather, let's assume that the function argument information is > constructed and recorded at runtime (as part of the standard function > object construction at runtime). Then you really have to ensure that name > is available, so the appropriate value can be stored into the function > object. OK, this is why we disagree. I am only interested in compile time type checking; I can admit that some run time checking is necessary, but only in order to assert certain invariants that are assumed by the compile time checker. E.g. if I'm deducing that global X is a constant, I'm going to make sure at run time it really won't change. 
This catches several things: (1) dynamically loaded or generated code that surreptitiously tries to change the value of a constant (memories of Fortran...:); (2) other cases where (e.g. through unexpected aliasing) the constant might be changed. A form of type checking that happens completely at run time (the way you describe it) is uninteresting to me, and using such a system as the semantic basis for a type checker seems to be a mistake. Yes, this follows Python's semantics closer than what I am proposing. But I don't think that it is closer to what the user expects the type checker to do. Here's the crux of my argument: Python's dynamic semantics can often be surprising. Compile time checking should warn the user about these surprises, it shouldn't try to assume that these surprises are what the user wanted! (I've skipped the rest of what you wrote, because of the agreement to disagree.) --Guido van Rossum (home page: http://www.python.org/~guido/) From gstein@lyra.org Wed Dec 22 04:06:28 1999 From: gstein@lyra.org (Greg Stein) Date: Tue, 21 Dec 1999 20:06:28 -0800 (PST) Subject: [Types-sig] recursive types, type safety, and flow analysis In-Reply-To: <199912220119.UAA14134@eric.cnri.reston.va.us> Message-ID: On Tue, 21 Dec 1999, Guido van Rossum wrote: >... > I'm giving up on responding point-by-point -- let's just agree that we > differ in opinion on this matter. I'm not sure that I'm there yet :-) Basically, I think your request to find and report on use-before-definition is "intractable" *when* you're talking about multiple bodies of code (e.g. two functions, or the global space and a function). [ by "intractable", I mean within the scope of what I believe we want to build; the problem is certainly doable but I believe it would involve complex, global, control-flow analysis. ] >... > > But it does not allow: > > > > def f(x: Foo): > > ... > > class Foo: > > ... > > If there's only one Foo (which is usually the case) I still think this > is too strict, and I don't see a technical reason why it would be > necessary. I want compile time checks, but I also want function objects to contain typedecl information at runtime. I'm not talking about runtime type checks, just recording more information with the function objects. For example, I'd like to be able to say something like: for i in range(func.func_code.co_argcount): print func.func_code.co_varnames[i], ':', func.func_argtypes[i] > > I believe the compiler should be recording information about the function > > arguments' typedecls. Unless the compiler is going to have multiple passes > > then the name should be defined before usage. > > > > Or rather, let's assume that the function argument information is > > constructed and recorded at runtime (as part of the standard function > > object construction at runtime). Then you really have to ensure that name > > is available, so the appropriate value can be stored into the function > > object. > > OK, this is why we disagree. I am only interested in compile time > type checking; I can admit that some run time checking is necessary, > but only in order to assert certain invariants that are assumed by the > compile time checker. Actually, I'm assuming that runtime checks are *only* present to verify parameter values and when the type-assert operator is used. I do not believe we would ever insert them outside of these two cases. Asserting the types of parameters could be arguable. Back to the point: I think we're in agreement on compile-time vs run-time checks. 
The difference is that I have one more requirement: the typedecl information should be available at runtime (for introspection purposes). >... > A form of type checking that happens completely at run time (the way > you describe it) is uninteresting to me, and using such a system as > the semantic basis for a type checker seems to be a mistake. Sorry, I have been unclear if this was the result. I do not want a runtime-based type checker. I want compile time (*). The runtime checks for function parameters are just assertions to ensure that non-type-checked code does not pass the wrong thing. The runtime check for the type-assert operator is present because the person requested it. [ although it possible that the compiler can optimize away the assertion generatd by the type-assert operator if the compiler can determine that it will always fail or that it will never fail. ] Neither of these two classes of runtime checks are intended to replace any compile-time type checks. (*) strictly speaking, I don't care about compile-time checks, as I'm in this for (OPT) :-), but I'm attempting to design a solution that encompasses (ERR), too. > Yes, > this follows Python's semantics closer than what I am proposing. But > I don't think that it is closer to what the user expects the type > checker to do. I agree with you. > Here's the crux of my argument: > > Python's dynamic semantics can often be surprising. Compile time > checking should warn the user about these surprises, it shouldn't > try to assume that these surprises are what the user wanted! Agreed. > (I've skipped the rest of what you wrote, because of the agreement to > disagree.) Our difference lies in two items: * I do not believe that you can do cross-function, compile-time checks to determine if a name is undefined. [ or if a name has different types over time, which type it may be ] * I am requiring the ability to associate typedecl objects with a function object at runtime. This imposes the requirement on a typedecl name (such as a class' name) being defined at the point that a function is defined. [ I also want typedecl objects associated with a class object and a module object so that we can reflect on their interface at runtime ] We can agree to disagree on the first item (I'll let you write the code to do that :-). I'd like your opinion on the second. Cheers, -g -- Greg Stein, http://www.lyra.org/ From scott@chronis.pobox.com Wed Dec 22 04:44:15 1999 From: scott@chronis.pobox.com (scott) Date: Tue, 21 Dec 1999 23:44:15 -0500 Subject: [Types-sig] recursive types, type safety, and flow analysis In-Reply-To: References: <199912220119.UAA14134@eric.cnri.reston.va.us> Message-ID: <19991221234415.A12628@chronis.pobox.com> On Tue, Dec 21, 1999 at 08:06:28PM -0800, Greg Stein wrote: > On Tue, 21 Dec 1999, Guido van Rossum wrote: [...] > >... > > Basically, I think your request to find and report on > use-before-definition is "intractable" *when* you're talking about > multiple bodies of code (e.g. two functions, or the global space and a > function). > > [ by "intractable", I mean within the scope of what I believe we want to > build; the problem is certainly doable but I believe it would involve > complex, global, control-flow analysis. ] I'd agree that this has been demonstrated, but only for examples of code which seem like great candidates for compile time warnings. Are there examples which strike you otherwise? [...] > > I want compile time checks, but I also want function objects to contain > typedecl information at runtime. 
I'm not talking about runtime type > checks, just recording more information with the function objects. > > For example, I'd like to be able to say something like: > > for i in range(func.func_code.co_argcount): > print func.func_code.co_varnames[i], ':', func.func_argtypes[i] > This sounds great, but to what extent do you think it should affect the initial coding design? It seems to me like this sort of functionality is more likely a candidate for something quite post-prototype-version-1 code, and to a large extent could be added to a compile-time checking system that could store it's type assertions in a form usable at runtime. If that information is in the byte code (is that even feasible a remotely backword compatible fashion?), then planning for this needs to happen earlier. If it's acceptible that that information could be stored elsewhere, perhaps even (optionally) in the interpreter itself, It seems like this functionality could be relatively easy to add to an existing compile-time-only static typing mechanism without cluttering the initial develop of that compile-time-only static typing mechanism with what seems like a rather large new set of complexities. The way I see all this compile-time vs. run-time stuff is that 1) run time is much more complex and undesirable for several already stated reasons. 2) run time has the ability to further resolve some inadequately-typed-at-compile-time code. 3) run time offers myriads of cool python code interfaces for interacting with a stronger typing system. To me, it seems to make the most sense to develop a compile time base first, paying enough attention to the benefits of run-time use so as not to preclude it or present it with unduly large obstacles. And as the compile-time code begins to stabilize a bit, define an interface though which the compile time type information may become available to the interpreter at run time. I'd much rather have a fairly closed compile time type system in several months than a fairy closed compile+run type system in a few years, especially if the former doesn't preclude the latter. scott From gstein@lyra.org Wed Dec 22 06:02:08 1999 From: gstein@lyra.org (Greg Stein) Date: Tue, 21 Dec 1999 22:02:08 -0800 (PST) Subject: [Types-sig] recursive types, type safety, and flow analysis In-Reply-To: <19991221234415.A12628@chronis.pobox.com> Message-ID: On Tue, 21 Dec 1999, scott wrote: > On Tue, Dec 21, 1999 at 08:06:28PM -0800, Greg Stein wrote: >... > > Basically, I think your request to find and report on > > use-before-definition is "intractable" *when* you're talking about > > multiple bodies of code (e.g. two functions, or the global space and a > > function). > > > > [ by "intractable", I mean within the scope of what I believe we want to > > build; the problem is certainly doable but I believe it would involve > > complex, global, control-flow analysis. ] > > I'd agree that this has been demonstrated, but only for examples of > code which seem like great candidates for compile time warnings. Are > there examples which strike you otherwise? One of my points was that I do not believe you can issue warnings because you can't know whether a problem might exist. Basically, it boils to not knowing whether a global used by a function exists at the time the function is called. So you either issues warnings for all global usage, or you issue none. You can make a few guesses based on what happens in the global code body, but I don't think the guesses will really improve the quality of warnings. Examples? 
No, I don't really have any handy. Any example would be a short code snippet and people would say, "yah. that's bad. it should fail." But the issue is with larger bodies of code... that's what we're trying to fix! So... No, I don't have a non-trivial example. > [...] > > I want compile time checks, but I also want function objects to contain > > typedecl information at runtime. I'm not talking about runtime type > > checks, just recording more information with the function objects. > > > > For example, I'd like to be able to say something like: > > > > for i in range(func.func_code.co_argcount): > > print func.func_code.co_varnames[i], ':', func.func_argtypes[i] > > This sounds great, but to what extent do you think it should affect > the initial coding design? The origination of this discussion was based on the recursive type issue. If we have runtime objects, then I doubt we could support the recursive type thing without some additional work. Or, as I'm suggesting, you do not allow an undefined name (as specified by runtime/execution order) to be used in a typedecl. The design of how to handle recursive types depends on the decision to include/exclude runtime objects that define function, class, or module typedecl information. Even if we defer the runtime creation of those objects, it will affect the design today. > It seems to me like this sort of > functionality is more likely a candidate for something quite > post-prototype-version-1 code, and to a large extent could be added to > a compile-time checking system that could store it's type assertions > in a form usable at runtime. I'm all for deferring stuff, but unfortunately, I believe this affects the V1 design. > If that information is in the byte code (is that even feasible a > remotely backword compatible fashion?), then planning for this needs > to happen earlier. Bytecodes do not really need to be backwards compatible. The magic value in the header of a .pyc prevents use of an incorrect version of bytecodes. (see line 80 or so in Python/import.c) I do believe the information goes into the bytecode, but I don't think that is the basis for needing to plan now. Instead, we have to define the semantics of when/where those typedecl objects exist. Do we have them at runtime? Does a name have to exist (in terms of runtime execution) for it to be used in a typedecl, or does it just have to exist *somewhere*? If names must exist before usage, then how is the recursive type thing handled? With unspecified typedecls? (like an unspecified struct) Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Wed Dec 22 09:45:43 1999 From: gstein@lyra.org (Greg Stein) Date: Wed, 22 Dec 1999 01:45:43 -0800 (PST) Subject: [Types-sig] recursive types, type safety, and flow analysis In-Reply-To: <19991222033636.A14007@chronis.pobox.com> Message-ID: On Wed, 22 Dec 1999, scott wrote: >... > > One of my points was that I do not believe you can issue warnings because > > you can't know whether a problem might exist. Basically, it boils to not > > knowing whether a global used by a function exists at the time the > > function is called. So you either issues warnings for all global usage, or > > you issue none. You can make a few guesses based on what happens in the > > global code body, but I don't think the guesses will really improve the > > quality of warnings. 
>
> I personally can't imagine that it would be an issue to treat globals
> in functions as anything other than a simple flat-rule: for type
> checking purposes, globals must be defined at compile time in the
> global namespace, that's just me, but I'd probably fire any of the
> python programmers that work for me if they did what you describe
> above with globals in a large project :)

So it sounds like we agree? Treat globals simply, using a union of all the types that they may have in the global space (of course, noting that most sane people won't be changing the type!). Do not worry about control flow: specifically, what the type and/or defined-status is when a function is called.

> > Examples? No, I don't really have any handy. Any example would be a short
> > code snippet and people would say, "yah. that's bad. it should fail." But
> > the issue is with larger bodies of code... that's what we're trying to
> > fix! So... No, I don't have a non-trivial example.
>
> I can't even imagine one, so if there's any way to describe this
> global issue a little further without putting too much effort into it,
> I'd appreciate it.

I posted a set of 4 cases a few messages ago. Without control flow analysis, the type checker cannot determine which of the four cases is being used when it analyzes f(). Now just take one of those four patterns and drop it into a large module. Given that big old module, it would be nice to find problems with sequencing of type/defined-ness and function calls (because it is too big to eyeball; we want compiler support), but I'm saying "punt -- the compiler is not going to be able to provide any kind of adequate warning."

The compiler *will* be able to generally verify types. It just can't determine which of a set of alternatives an object will have at a specific point in time (assuming that object occurs in a different body of code than that which is being analyzed).

Am I being clear enough? It seems like I've said this about three times so far...

> > The origination of this discussion was based on the recursive type issue.
> > If we have runtime objects, then I doubt we could support the recursive
> > type thing without some additional work. Or, as I'm suggesting, you do not
> > allow an undefined name (as specified by runtime/execution order) to be
> > used in a typedecl.
>
> you could even allow typedecl to import modules for the sake of
> gaining access to the names, where those imports would only occur when
> the optional type checking is turned on. I'd agree that the use of an
> undefined name should be disallowed. With the presence of
> type-check-only import, following the same
> no-mutually-recursive-imports rule of the regular import, but only
> importing typedecl statements, you could achieve all this at compile
> time.

Actually, the recursive import issue is resolved by having a module registered which is incomplete. If you have:

--- a.py
import b

--- b.py
import a

>>> import a

Module "a" will get partially defined and then its code will be run. During that execution, the "import b" occurs and the "b" module is imported. Now the code for "b" runs and it says "import a". Since "a" has been partially defined (specifically, a name/module is entered into sys.modules), b.py can create a local name "a" referring to the module object that it finds in sys.modules (which is about to be filled in when the "import b" completes).

I'm suggesting a similar mechanism be made available to resolve the recursive typedecl issue.
Specifically, we provide a way to create a partially-defined ("incomplete") typedecl object and bind that to a name. That name can then be used; later, the name will become fully specified. More thought is needed here, but I'll hold off as this is still premised on runtime typedecl availability. >... > > I do believe the information goes into the bytecode, but I don't think > > that is the basis for needing to plan now. Instead, we have to define the > > semantics of when/where those typedecl objects exist. Do we have them at > > runtime? > > in the above, no, though we do have the ability to find a name > anywhere at compile time. > > >Does a name have to exist (in terms of runtime execution) for it > > to be used in a typedecl, or does it just have to exist *somewhere*? > > in the above, it has to exist in the typedecl 'execution' model, which > is during compile time. > > >If > > names must exist before usage, then how is the recursive type thing > > handled? With unspecified typedecls? (like an unspecified struct) > > How about an iterative model which continues until all typedecl names > are filled in? These three items form a possible alternative. You wouldn't really need an iterative model to gather typedecl names; two passes is sufficient. >... > For me, it is sufficient to proceed from the premiss that you can't > have static typing work on code that redefines types at run time, and > to limit runtime checking (for the time being) to optionally have the > interpreter take some action (warn or abort) when that happens. That > requirement alone implies that typedecl'd names and their typedecl > bodies need to be available at run time, which is sufficient to > support just about any future developments in a static-typeing > interface in pure python. I definitely agree with the second part. For the first, if I assume "redefines types at run time" as being "through some shady mechanism, redefining a typedecl object", then yes: we can/should limit static checking. If you're talking about a name having multiple types over a period of time, then I disagree: we can handle that case. Also, I think the runtime objects are for more than the occasional type assertion. >... > As an aside, I'm glad to learn it wouldn't be difficult to have python > put static type information in it's byte code. That seems like a good > place for it. I'm hoping it would be hung from the function, class, and module objects. > As weird as it is to have a separate type-decl name model, it seems > infintely to depict dynamic typing in a static typing model. I don't follow/parse this line... Cheers, -0g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Wed Dec 22 09:47:33 1999 From: gstein@lyra.org (Greg Stein) Date: Wed, 22 Dec 1999 01:47:33 -0800 (PST) Subject: FWD: [Types-sig] recursive types, type safety, and flow analysis Message-ID: I tried to "bounce" this to the SIG, but it looks like it got held/discarded for admin approval since it didn't have types-sig in the To: header. Forwarding this time... ---------- Forwarded message ---------- Date: Wed, 22 Dec 1999 03:36:36 -0500 From: scott To: Greg Stein Subject: Re: [Types-sig] recursive types, type safety, and flow analysis On Tue, Dec 21, 1999 at 10:02:08PM -0800, Greg Stein wrote: > On Tue, 21 Dec 1999, scott wrote: > > On Tue, Dec 21, 1999 at 08:06:28PM -0800, Greg Stein wrote: > >... > > > Basically, I think your request to find and report on > > > use-before-definition is "intractable" *when* you're talking about > > > multiple bodies of code (e.g. 
two functions, or the global space and a > > > function). [...] > > I'd agree that this has been demonstrated, but only for examples of > > code which seem like great candidates for compile time warnings. Are > > there examples which strike you otherwise? > > One of my points was that I do not believe you can issue warnings because > you can't know whether a problem might exist. Basically, it boils to not > knowing whether a global used by a function exists at the time the > function is called. So you either issues warnings for all global usage, or > you issue none. You can make a few guesses based on what happens in the > global code body, but I don't think the guesses will really improve the > quality of warnings. I personally can't imagine that it would be an issue to treat globals in functions as anything other than a simple flat-rule: for type checking purposes, globals must be defined at compile time in the global namespace, that's just me, but I'd probably fire any of the python programmers that work for me if they did what you describe above with globals in a large project :) > > Examples? No, I don't really have any handy. Any example would be a short > code snippet and people would say, "yah. that's bad. it should fail." But > the issue is with larger bodies of code... that's what we're trying to > fix! So... No, I don't have a non-trivial example. I can't even imagine one, so if there's any way to describe this global issue a little further without putting too much effort into it, I'd appreciate it. [...] > > The origination of this discussion was based on the recursive type issue. > If we have runtime objects, then I doubt we could support the recursive > type thing without some additional work. Or, as I'm suggesting, you do not > allow an undefined name (as specified by runtime/execution order) to be > used in a typedecl. you could even allow typedecl to import modules for the sake of gaining access to the names, where those imports would only occur when the optional type checking is turned on. I'd agree that the use of an undefined name should be disallowed. With the presence of type-check-only import, following the same no-mutually-recursive-imports rule of the regular import, but only importing typedecl statements, you could achieve all this at compile time. I've run into this issue on large projects, importing a classname, just to run assert isinstance(foo, thatclass), "complain meaningfully" But it hasn't come up with recursive types in any code I've seen, just deeply-complex types in terms of container and class hierarchy relationships. > > The design of how to handle recursive types depends on the decision to > include/exclude runtime objects that define function, class, or module > typedecl information. Even if we defer the runtime creation of those > objects, it will affect the design today. > indeed. [...] > > I do believe the information goes into the bytecode, but I don't think > that is the basis for needing to plan now. Instead, we have to define the > semantics of when/where those typedecl objects exist. Do we have them at > runtime? in the above, no, though we do have the ability to find a name anywhere at compile time. >Does a name have to exist (in terms of runtime execution) for it > to be used in a typedecl, or does it just have to exist *somewhere*? in the above, it has to exist in the typedecl 'execution' model, which is during compile time. >If > names must exist before usage, then how is the recursive type thing > handled? With unspecified typedecls? 
(like an unspecified struct) How about an iterative model which continues until all typedecl names are filled in? I understand your concern about 2 distinct namespace models being unsettling. It raises issues of what exactly we want out of static typing, and what sets of existing and future python code may benefit from static typing, and these are indeed big issues. For me, it is sufficient to proceed from the premiss that you can't have static typing work on code that redefines types at run time, and to limit runtime checking (for the time being) to optionally have the interpreter take some action (warn or abort) when that happens. That requirement alone implies that typedecl'd names and their typedecl bodies need to be available at run time, which is sufficient to support just about any future developments in a static-typeing interface in pure python. As an aside, I'm glad to learn it wouldn't be difficult to have python put static type information in it's byte code. That seems like a good place for it. As weird as it is to have a separate type-decl name model, it seems infintely to depict dynamic typing in a static typing model. scott From mwh21@cam.ac.uk Wed Dec 22 11:33:03 1999 From: mwh21@cam.ac.uk (Michael Hudson) Date: Wed, 22 Dec 1999 11:33:03 +0000 Subject: [Types-sig] recursive types, type safety, and flow analysis Message-ID: Whew! I go on holiday for a week and 400+ messages turn up on types-sig! I've scanned them all, but I'm not sure I'm not repeating others here. Exciting times... ---------- >From: Greg Stein >To: types-sig@python.org >Subject: Re: [Types-sig] recursive types, type safety, and flow analysis >Date: Wed, Dec 22, 1999, 6:02 am > >On Tue, 21 Dec 1999, scott wrote: >> On Tue, Dec 21, 1999 at 08:06:28PM -0800, Greg Stein wrote: >>... >> > Basically, I think your request to find and report on >> > use-before-definition is "intractable" *when* you're talking about >> > multiple bodies of code (e.g. two functions, or the global space and a >> > function). >> > >> > [ by "intractable", I mean within the scope of what I believe we want to >> > build; the problem is certainly doable but I believe it would involve >> > complex, global, control-flow analysis. ] >> >> I'd agree that this has been demonstrated, but only for examples of >> code which seem like great candidates for compile time warnings. Are >> there examples which strike you otherwise? > >One of my points was that I do not believe you can issue warnings because >you can't know whether a problem might exist. Basically, it boils to not >knowing whether a global used by a function exists at the time the >function is called. Which is because you CAN'T! For the very simple case (i.e. name assigned to at toplevel of module, never referred to in a "del" statement), you know everything about the lifetime of the variable, and for other cases you in general know nothing, because to know more for arbitrary cases involves solving the halting problem. If people want to typecheck code along the lines of a = 0 if some_function(): del a then, frankly, sod 'em. You could make allowances for code along the lines of if __debug__: verbose = 1 else: verbose = 0 but I don't think it's worth it. (which leads to an argument for being able to restrict types assigned to names, thinking about it...) 
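The bookkeeping for that sort of case is not deep; a small sketch of the branch-merging view (merge_branches and the type names here are invented for illustration, nothing more):

    UNDEFINED = 'Undefined'

    def merge_branches(*branch_envs):
        # Join the environments left by the branches of an "if": a name gets
        # the union of the types it had on each path, and a name missing
        # from some path picks up Undefined as one of its alternatives.
        names = set()
        for env in branch_envs:
            names.update(env)
        return {name: {env.get(name, UNDEFINED) for env in branch_envs}
                for name in names}

    # if __debug__: verbose = 1
    # else:         verbose = 0
    print(merge_branches({'verbose': 'Int'}, {'verbose': 'Int'}))
    # verbose is Int on both paths -- nothing to complain about

    # a = 0
    # if some_function(): del a
    print(merge_branches({}, {'a': 'Int'}))
    # a is Int or Undefined -- "possibly undefined", worth a warning

The __debug__ case merges cleanly; the del case is exactly the "possibly undefined" union discussed earlier in the thread.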
[snip] On a separate point, there is only one language that I can think of that is as dynamic (probably more so, actually) than Python, yet has optional static typing (mainly for OPT, to be sure, but you get some ERR, too), and that is ANSI Common Lisp. The more I read of this present discussion, the more I repsect the way CL is designed. The things that seem most immediately relavent are: 1) Just declaring the type for names is enough most of the time (eg. (declare (type *))), but occasionally you want to type expressions too (eg. (the expr)). 2) It's worth having distinguished "read" and "compile" phases (and a "macro expansion" phase would surely be nice...). 3) Type inference need not be demanded in a compiler, but it's nice when it's there. If anyone partaking in this discussion doesn't know CL's approach to types and typing, I'd heartily enjoin them to find out. Dylan might be relavent too, but I don't know that. From scott@chronis.pobox.com Wed Dec 22 11:47:24 1999 From: scott@chronis.pobox.com (scott) Date: Wed, 22 Dec 1999 06:47:24 -0500 Subject: [Types-sig] recursive types, type safety, and flow analysis In-Reply-To: References: <19991222033636.A14007@chronis.pobox.com> Message-ID: <19991222064724.A14726@chronis.pobox.com> On Wed, Dec 22, 1999 at 01:45:43AM -0800, Greg Stein wrote: > On Wed, 22 Dec 1999, scott wrote: > > I posted a set of 4 cases a few messages ago. Without control flow > analysis, the type checker cannot determine which of the four cases is > being used when it analyzes f(). for easy reference, they're at: http://www.python.org/pipermail/types-sig/1999-December/000935.html 2 of those 4 cases use 'del' in the global space. I'm inclined to believe that those 2 cases are beyond the scope of static type checking. I've even recently written code that does just that, creating a module-as-object interface, where the deleted variables are parallell to private variables in a class -- basically inaccessible. I would not expect static typing mechanism to grok that module, and would be fine with simply having it ignore the possibility of undefined names that result from 'del' or other runtime behavior. It certainly seems like an exceptional case that has undue complications in a static type system. Do you see any relatively easy way to handle it more accurately ? The other 2 cases seem handleable at compile time under a system which builds typedecl information by reference (*) -- iteratively or 2-pass, whichever works out best, sans flow-control analysis, and at compile time, with a separate typedecl-name namespace mechanism. Any information distinguishing case 1 from case 2 (like redefining g() inside f) would either fall under local namespace type-ing or something which should, IMO, not be handled by a static type system. [...] > > The compiler *will* be able to generally verify types. It just can't > handle a determine which of a set of alternatives an object will have at a > specific point in type (assuming that object occurs in a different body of > code than that which is being analyzed). > > Am I being clear enough? It seems like I've said this about three times so > far... Yes, I got it, or so I think :) But I think we may have 2 different expectations of something fairly basic: decl f(x: Int) -> Int|None decl g(x: Int) -> Int def g(x): return f(max(x, 1)) def f(x): if x > 0: return x else: return None I *want* the static typing to complain and to be warned or blow up with the message that according to the type information alone, g() is not verifiably an Int. 
It seems like you want this to work without complaints. Is this correct? > > > > The origination of this discussion was based on the recursive type issue. > > > If we have runtime objects, then I doubt we could support the recursive > > > type thing without some additional work. Or, as I'm suggesting, you do not > > > allow an undefined name (as specified by runtime/execution order) to be > > > used in a typedecl. I still don't see how enforcing your suggestion allows any compile time checking at all -- unless you you further qualify it with 'used in a typedecl as it operates at run time'. What happens at run time should become more clear once we come up with a way to provide run time access to compile time static typing. Run time behavior could even be programmable. [...] > I'm suggesting a similar mechanism be made available to resolve the > recursive typedecl issue. Specifically, we provide a way to create a > partially-defined ("incomplete") typedecl object and bind that to a name. > That name can then be used; later, the name will become fully specified. > More thought is needed here, but I'll hold off as this is still premised > on runtime typedecl availability. (*) yes, essentially the same way that struct node { int i; node * n; }; is OK, but struct node { int i; node n; } isn't. (gives 'incomplete type' with gcc). You need to be able to have a reference to a type and alter it as the type declarations are processed. > > >... > > > I do believe the information goes into the bytecode, but I don't think > > > that is the basis for needing to plan now. Instead, we have to define the > > > semantics of when/where those typedecl objects exist. Do we have them at > > > runtime? > > > > in the above, no, though we do have the ability to find a name > > anywhere at compile time. I'd like to recant this statement and replace it with: 1) The typedecl information is stored in an application-wide static type model which is created at compile time (implies typedecl specific import/#include mechanism). 2)The model is mapped to something potentially available at run time, eg bytecode with associated module, classe and function objects. 3)The runtime environment can do with that information what it pleases, but 1) and 2) need to be done first, and have a lot of potential for use, even without 3). > > > > >Does a name have to exist (in terms of runtime execution) for it > > > to be used in a typedecl, or does it just have to exist *somewhere*? > > > > in the above, it has to exist in the typedecl 'execution' model, which > > is during compile time. > > > > >If > > > names must exist before usage, then how is the recursive type thing > > > handled? With unspecified typedecls? (like an unspecified struct) > > > > How about an iterative model which continues until all typedecl names > > are filled in? > > These three items form a possible alternative. You wouldn't really need an > iterative model to gather typedecl names; two passes is sufficient. > > >... [...] > checking. If you're talking about a name having multiple types over a > period of time, then I disagree: we can handle that case. perhaps for local variables, but I don't see how with global variables unless that global variable is explicitly stated to be a union by the programmer, and the type model works out OK -- with atleast the option of working with my expectation of static typing of (global) unions as described above. > > Also, I think the runtime objects are for more than the occasional type > assertion. 
Indeed, there are lots of places where optimizations can be made at runtime if the types are known. I just don't think we're there yet. I don't think undefined names are an issue with possible future run-time optimizations and introspective interfaces, as it seems like a NameError would do it's job only after all namespaces are searched, and that something introspective wouldn't even run if a NameError was going to be raised. For the moment, it seems sufficient to make static, compile-time type information available at run time (optionally), and then we can decide what to do with that information at run time - optimize code, keep tabs on the types of variables at run time and make sure they match the static type information, etc. > > As weird as it is to have a separate type-decl name model, it seems > > infintely to depict dynamic typing in a static typing model. ^ insert 'weirder ' here guido should threaten to kill sigs more often :) scott __off_topic_but_interesting_from_here_down__ > > > > you could even allow typedecl to import modules for the sake of > > gaining access to the names, where those imports would only occur when > > the optional type checking is turned on. I'd agree that the use of an > > undefined name should be disallowed. With the presence of > > type-check-only import, following the same > > no-mutually-recursive-imports rule of the regular import, but only > > importing typedecl statements, you could achieve all this at compile > > time. > > Actually, the recursive import issue is resolved by have a module > registered which is incomplete. If you have: > > --- a.py > import b > > --- b.py > import a > > >>> import a right, but there are limits: --- a.py from b import c d = 1 --- b.py from a import d c = 1 doesn't work: 1222 04:47 shock:~% python a.py Traceback (innermost last): File "a.py", line 1, in ? from b import c File "b.py", line 1, in ? from a import d File "a.py", line 1, in ? from b import c ImportError: cannot import name c > scott From skaller@maxtal.com.au Wed Dec 22 19:12:50 1999 From: skaller@maxtal.com.au (skaller) Date: Thu, 23 Dec 1999 06:12:50 +1100 Subject: [Types-sig] type-assert operator optimizations (was: New syntax?) References: Message-ID: <386122B2.5B2932B0@maxtal.com.au> Greg Stein wrote: > > This means that the x!t can be optimised to x, > > without affecting strictly conforming program > > semantics. > > If the compiler can definitively state that the test will never fail, then > it doesn't have to include a runtime check. > > If the compiler can definitively state that the test will always fail, > then it can issue an error and refuse to compile. > [ with the caveat of catching exceptions ] > > If the compiler believes that it might fail in some cases, then it could > issue a warning (and go ahead and insert a runtime check). > [ and yes, there can be switches to avoid issuing warnings ] These semantics are (a) incoherent/inconsistent (b) not the same as what I propose I want to explain both points. (b) first: I propose that if the test were to fail at any point in the execution of the program, the program is invalid, and the translator can do anything it wants: behaviour is undefined. So the test can be elided. If the test would always succeed, then it can be elided. It follows the test can ALWAYS be elided. Now for (a). There is an assumption: that there is no definite algorithm given for deducing if the test will fail. 
In this case, it is possible that compiler (A) deduces the test will always fail, and rejects the program, while compiler (B) isn't smart enough, and compiles code to raise an exception. In this case, a programmer may catch the exception and handle it, and this behaviour would be required of the language. But that is NOT the behaviour (A) produced. If one compiler can reject the program, it CANNOT be a valid program, and in that case, a requirement on a compiler (that it throw an exception if it is dumb) cannot be made to stick, since the program is in error. I think you must decide that the semantics require a run time error ALWAYS, or, that the test can be elided ALWAYS. There is no half way ground. The current requirements for assertion statements are, effectively, that the test can be elided, and therefore, invalid programs exist. The fact that the current CPython interpreter in non-optimising mode raises an exception is nicety of that particular implementation, not a requirement of the language. I'm assuming 1) there is ONE python language 2) both the optimising and non-optimising byteocode compiler conform to the semantics and from this I deduce the above language semantics. Remember language semantics are constraints on translators, they're not specifications of what a particular tool does in cases that no particular behaviour is required. -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From skaller@maxtal.com.au Wed Dec 22 19:45:40 1999 From: skaller@maxtal.com.au (skaller) Date: Thu, 23 Dec 1999 06:45:40 +1100 Subject: [Types-sig] typedefs (was: New syntax?) References: Message-ID: <38612A64.12E5F327@maxtal.com.au> Greg Stein wrote: > > Yes, but you basically have the same setup with current Python if you > > exclude Lambdas. A function definition is merely used to create an > > 'alias' for a piece of code, to clarify other pieces of code. If you > > I disagree that a function def is merely an alias. It provides a new > namespace, parameter binding, and capabilities such as deferred execution. > I definitely don't see it as simply an alias. Greg is correct in at least one sense: when a 'def' is executed, the function is given a particular name which can be retrieved, the same applies to classes. These names are independent of a variable which happens to be bound to the function or class: def f(): pass g = f del f The function refered to by variable g has name 'f'. Def is not an alias for a lambda, even if lamda were extended to provide a suite in which statements could be written and locals exist. The case of functions is not interesting here, but the case of classes certainly is, a point I missed in a previous post. When a class is refered to in a type declaration, we have to decide if the reference is an evaluable expression, or is the name of a class -- independent of any variables. In the second case, we must disallow two classes having the same name, or permit an ambiguity in the type declaration. I'm half guessing that Guido would be happy to prohibit definition of two classes with the same name. Unfortunately, this leads to a problem: clearly this restriction is not global, but only 'per namespace'. Which opens up the question: how are namespaces identified. For modules, the module name? For functions, the function name? 
Sounds good: we can ban duplicate definitions of classes, identifying a class by its fully qualified name: the full package name of the containing module, then any enclosing functions and classes, finally the class name. But note that for a class enclosed in a function, the class has a transitory life. So in this case, it isn't clear that static analysis means anything, since two _distinct_ classes can have the same name at different times. This also applies to other scopes, where class objects can be deleted. To make the issue even more complex, the lifetime of a class is not determined solely by the existence of a variable bound to it -- since a class can be a base of another, or refered to by one of its instances. So a ban on duplicate names might be hard to enforce, and wouldn't make any sense in the function case .. so I'm half guessing Guido will not be so happy to ban duplicate definitions :-) Now, where does this lead? I think it leads to a requirement for syntax to _specify_ a class is static. In this case, the class is immortal, will not be deleted, and no other of the same name may be created. And THEN, we can use the class names in type declarations (of function parameters). Summary: if you want a function parameter to have a specified class type, then the class definition must be specified as static. This can be assumed in an interface file, but must be specified by additional syntax in an implementation file. It looks like Guido is right: interface files are the only way to get enough control to actually do static analysis/checking, without adding a lot more syntax to the python (implementation) language. BTW: I want to propose some terminology here: when a type declaration is given in a function definition, that declaration is called an INLINE declaration, to distinguish it from a declaration in an interface file, or a stand-alone prototype which is not a definition. def f(x : Int): ... ^^^^^ this is an INLINE declaration given that terminology, it would seem that inline declarations cannot work well UNLESS there is also a non-inline declaration of the type being used (Int in the example). -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From gstein@lyra.org Wed Dec 22 19:57:20 1999 From: gstein@lyra.org (Greg Stein) Date: Wed, 22 Dec 1999 11:57:20 -0800 (PST) Subject: [Types-sig] type-assert operator optimizations In-Reply-To: <386122B2.5B2932B0@maxtal.com.au> Message-ID: On Thu, 23 Dec 1999, skaller wrote: > Greg Stein wrote: > > > This means that the x!t can be optimised to x, > > > without affecting strictly conforming program > > > semantics. > > > > If the compiler can definitively state that the test will never fail, then > > it doesn't have to include a runtime check. > > > > If the compiler can definitively state that the test will always fail, > > then it can issue an error and refuse to compile. > > [ with the caveat of catching exceptions ] > > > > If the compiler believes that it might fail in some cases, then it could > > issue a warning (and go ahead and insert a runtime check). > > [ and yes, there can be switches to avoid issuing warnings ] > > These semantics are > > (a) incoherent/inconsistent They're fine to me :-) > (b) not the same as what I propose Tough. I proposed it first. :-) > I want to explain both points. 
(b) first: > I propose that if the test were to fail at any point > in the execution of the program, the program is invalid, Sorry, but that is with your "exceptions-are-evil" model, and we've covered that before. Exceptions are part of Python, and their rigorous specification is also part of Python. The type-assert operator is *defined* to raise an exception if the type is wrong. Period. There are no statements about the validity of the program. >... point moot; I disagree with the premise ... > > Now for (a). There is an assumption: that there is no definite > algorithm given for deducing if the test will fail. If the compiler sees: x = 1 ! type(1) Assuming that "type" cannot be altered, then the compiler can most assuredly elide the check because it knows it will always succeed. Given: x = 1 ! type("") It will always fail, and the compiler may as well stop right there. [ with the caveat of possible switches that turn this in a warning or ignore it or whatever... ] I do grant you, however, that in most cases the compiler *cannot* make a definitive statement about what will happen at runtime with that operator. So it just puts the assertion code in, and continues on. No biggy. [ personally, I probably wouldn't even bother with trying to elide the code; the type checker could easily comment on it, though ] > In this case, it is possible that compiler (A) deduces > the test will always fail, and rejects the program, > while compiler (B) isn't smart enough, and compiles code > to raise an exception. In this case, a programmer > may catch the exception and handle it, and this > behaviour would be required of the language. > But that is NOT the behaviour (A) produced. Okay. Let's work on the assumption that we have different behaviors from the compilers. > If one compiler can reject the program, it CANNOT be > a valid program, and in that case, a requirement > on a compiler (that it throw an exception if it is dumb) > cannot be made to stick, since the program is in error. If one compile rejects it, then it is just smarter. It doesn't say anything about the validity of the program. ---- foo.py raise "an exception" ---- I maintain the above is an entirely valid program. Your smart compiler might say "sorry, that always raises an exception, so I'm not going to bother compiling it." Fine. But that doesn't alter the fact the program is valid. > I think you must decide that the semantics require > a run time error ALWAYS, or, that the test can > be elided ALWAYS. There is no half way ground. Yes, there is a half-way ground. In the "exceptions-are-evil" model, maybe not. But in the standard Python model, I can certainly elide tests that I know will always succeed. Granted: it is hard to know that, but when I *can*, then I'm free to remove the tests. Runtime errors are fine, and they do not imply that the program is in error and the compiler should have rejected it. It would *nice* if the compiler did, but life is a bitch. Very few C compilers will complain about: void main(void) { *(int *)0 = 5; } But it certainly bombs quite quickly. > The current requirements for assertion statements > are, effectively, that the test can be elided, > and therefore, invalid programs exist. The fact > that the current CPython interpreter in non-optimising > mode raises an exception is nicety of that particular > implementation, not a requirement of the language. IMO, you have this entirely wrong. CPython defines the language (where it isn't explicit in the language and library reference manuals). 
The fact that exceptions are raised means they are part of the language, and therefore a requirement. Guido has been using his JPython chips every now and then to define the language, but generally speaking: CPython is the definition and states the requirements for all Python implementations. > I'm assuming > > 1) there is ONE python language > 2) both the optimising and non-optimising > byteocode compiler conform to the semantics > > and from this I deduce the above language semantics. > Remember language semantics are constraints on translators, > they're not specifications of what a particular tool does > in cases that no particular behaviour is required. Your two assumptions are correct. But I think you're assuming the wrong semantics. Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Wed Dec 22 19:33:43 1999 From: gstein@lyra.org (Greg Stein) Date: Wed, 22 Dec 1999 11:33:43 -0800 (PST) Subject: [Types-sig] recursive types, type safety, and flow analysis In-Reply-To: <19991222064724.A14726@chronis.pobox.com> Message-ID: On Wed, 22 Dec 1999, scott wrote: > On Wed, Dec 22, 1999 at 01:45:43AM -0800, Greg Stein wrote: > > On Wed, 22 Dec 1999, scott wrote: > > > > I posted a set of 4 cases a few messages ago. Without control flow > > analysis, the type checker cannot determine which of the four cases is > > being used when it analyzes f(). > > for easy reference, they're at: > http://www.python.org/pipermail/types-sig/1999-December/000935.html > > 2 of those 4 cases use 'del' in the global space. I'm inclined to > believe that those 2 cases are beyond the scope of static type > checking. >... > The other 2 cases seem handleable at compile time under a system which > builds typedecl information by reference (*) -- iteratively or 2-pass, > whichever works out best, sans flow-control analysis, and at compile > time, with a separate typedecl-name namespace mechanism. Any > information distinguishing case 1 from case 2 (like redefining g() > inside f) would either fall under local namespace type-ing or something > which should, IMO, not be handled by a static type system. Yes, 2 can be handled and 2 cannot. My point was that while you're analyzing f(), you cannot know which of the 4 cases are present in the global script. Therefore you cannot issue warnings for f's use of g(). [ and funny heuristics may just serve to make the warnings issue non-deterministic from a human's standpoint ] > [...] > > The compiler *will* be able to generally verify types. It just can't > > handle a determine which of a set of alternatives an object will have at a > > specific point in type (assuming that object occurs in a different body of > > code than that which is being analyzed). > > > > Am I being clear enough? It seems like I've said this about three times so > > far... > > Yes, I got it, or so I think :) But I think we may have 2 different > expectations of something fairly basic: > > decl f(x: Int) -> Int|None > decl g(x: Int) -> Int > > def g(x): > return f(max(x, 1)) > > def f(x): > if x > 0: > return x > else: > return None > > I *want* the static typing to complain and to be warned or blow up > with the message that according to the type information alone, g() is > not verifiably an Int. It seems like you want this to work without > complaints. Is this correct? Incorrect. I said that we can perform type verification, but we cannot know a global name's type (or existance) at a point in time. We collect (and union if necessary) the types of globals. Then we analyze the functions. 
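Concretely, the first pass can be pictured as nothing more than accumulating a per-name union while walking the module body, and the second pass as checking function bodies against those unions. A toy sketch with invented data (a real analyzer would of course walk the parse tree, not a hand-written list):

module_body = [
    ("a", "Int"),       # a = 1
    ("a", "String"),    # a = "foo"
    ("b", "Int"),       # b = 2
]

globals_seen = {}
for name, typ in module_body:               # pass 1: collect and union
    types = globals_seen.setdefault(name, [])
    if typ not in types:
        types.append(typ)

for name in sorted(globals_seen.keys()):    # what pass 2 would then assume
    print("decl global %s: %s" % (name, " or ".join(globals_seen[name])))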
In your example above, we find the types of f and g (as you listed in the decl statements). When we turn to analyzing g(), we find a mismatch between the return value of f() and g's return value and flag it. Note that you don't have to declare f and g beforehand, but could just rely on their definition to work: def g(x: Int)->Int: return f(max(x, 1)) def f(x: Int)->Int or None: ... Since we collect the globals' type information before looking at the function bodies, we know the typedecl for f() before we analyze g. > > > > The origination of this discussion was based on the recursive type issue. > > > > If we have runtime objects, then I doubt we could support the recursive > > > > type thing without some additional work. Or, as I'm suggesting, you do not > > > > allow an undefined name (as specified by runtime/execution order) to be > > > > used in a typedecl. > > I still don't see how enforcing your suggestion allows any compile time > checking at all -- unless you you further qualify it with 'used in a typedecl > as it operates at run time'. Given: Int = type(1) String = type("") def f(x=len(sys.path): Int)->String: return str(x) At runtime, Int and String must be defined when the function object is constructed (because their values are stored into the function object). This is analogous to requiring that sys.path and len must be defined. That is the specified runtime behavior. The question: what happens at compile time? The compiler knows that Int and String are defined (earlier in the program) and what they represent. It performs the type checking as we expect it to. Let's be concrete and say this declaration occurs in the global code body. As I've previously stated, the global body is analyzed first. It is analyzed in runtime order! (top to bottom) As it steps through, and reaches the f() definition, it knows the types/values of Int and String; it then records that information as part of f's signature for later use. If the line assigning to String is shift below f(), then we have an error: when the analyzer reaches f(), it has not yet seen a definition for String. >... > yes, essentially the same way that > > struct node { > int i; > node * n; > }; > > is OK, but > > struct node { > int i; > node n; > } > > isn't. (gives 'incomplete type' with gcc). You need to be able to have > a reference to a type and alter it as the type declarations are > processed. Correct. That is what I'm suggesting. >... > > > > I do believe the information goes into the bytecode, but I don't think > > > > that is the basis for needing to plan now. Instead, we have to define the > > > > semantics of when/where those typedecl objects exist. Do we have them at > > > > runtime? > > > > > > in the above, no, though we do have the ability to find a name > > > anywhere at compile time. > > I'd like to recant this statement and replace it with: > > 1) The typedecl information is stored in an application-wide static > type model which is created at compile time (implies typedecl specific > import/#include mechanism). > > 2)The model is mapped to something potentially available at run time, > eg bytecode with associated module, classe and function objects. > > 3)The runtime environment can do with that information what it > pleases, but 1) and 2) need to be done first, and have a lot of > potential for use, even without 3). Yes, yes, and yes. And: 4) the design needs to incorporate (3) even though we may be deferring its implementation. >... > > checking. 
If you're talking about a name having multiple types over a > > period of time, then I disagree: we can handle that case. > > perhaps for local variables, but I don't see how with global variables > unless that global variable is explicitly stated to be a union by the > programmer, and the type model works out OK -- with atleast the option > of working with my expectation of static typing of (global) unions as > described above. The global case is the same as the local case. Sequence through the statements, looking at what happens to the types at each point. Take this example: a = 1 f(a) a = "foo" f(a) As we step through, we know that "a" starts as an Int and that we call f() with an Int. "a" then becomes a String and we call f() with a String. Now that we've processed the global body, we start processing the function bodies. But first, we create a union of all the types that "a" ever used; effectively, the functions see the following declaration: decl global a: Int or String [ it is possible to consider adding "or Undefined" in there for completeness, but that may cause more trouble than it saves ] So, to answer your question: we can handle varying types. But it does take a qualification -- it can only be handled within a single code body. When you consider the global body vs. a function body, then we must be conservative and take the union of all the types a name ever had. Cheers, -g -- Greg Stein, http://www.lyra.org/ From skaller@maxtal.com.au Wed Dec 22 20:04:43 1999 From: skaller@maxtal.com.au (skaller) Date: Thu, 23 Dec 1999 07:04:43 +1100 Subject: [Types-sig] Issue: definition of "type" References: Message-ID: <38612EDB.6C1C119D@maxtal.com.au> Greg Stein wrote: > > a = [Foo(), Bar()] > > > > for el in a: > > el.doSomething() > > > > Doesn't this rely on run-time information? How would a type system deal > > with this? I suppose I'm entering the domain of interfaces now... > > The type of "a" is a List where the elements' type is the union of the > type of each initialization value. No. Extra values can be appended of other types. If you want to have lists of a particular type, this must be declared as a constraint. -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From gstein@lyra.org Wed Dec 22 20:16:33 1999 From: gstein@lyra.org (Greg Stein) Date: Wed, 22 Dec 1999 12:16:33 -0800 (PST) Subject: [Types-sig] Issue: definition of "type" In-Reply-To: <38612EDB.6C1C119D@maxtal.com.au> Message-ID: On Thu, 23 Dec 1999, skaller wrote: > Greg Stein wrote: > > > a = [Foo(), Bar()] > > > > > > for el in a: > > > el.doSomething() > > > > > > Doesn't this rely on run-time information? How would a type system deal > > > with this? I suppose I'm entering the domain of interfaces now... > > > > The type of "a" is a List where the elements' type is the union of the > > type of each initialization value. > > No. Extra values can be appended of other types. If you want to have > lists of a particular type, this must be declared as a constraint. Extra values could be appended, but in the above code, the union algorithm is sufficient to determine the type of "a". Later in the code, the list may change type or we may raise errors if some appends something not part of its type -- I don't care and it doesn't matter here. We're trying to figure out the type of the list to know whether the loop will succeed or not. 
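One way to picture that: the element type of the literal is the union of the initializers' types, and the loop is accepted only if every member of that union supports the attribute being called. The sketch below does this with runtime reflection purely for illustration (a real checker would work from the source; the Foo/Bar bodies are invented):

class Foo:
    def doSomething(self): return 1

class Bar:
    def doSomething(self): return "x"

a = [Foo(), Bar()]

element_types = []
for item in a:                      # union of the initializers' classes
    if item.__class__ not in element_types:
        element_types.append(item.__class__)

# "for el in a: el.doSomething()" is fine only if every member of the
# union provides a doSomething attribute
for cls in element_types:
    if not hasattr(cls, "doSomething"):
        print("warning: %s has no doSomething" % cls.__name__)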
I also believe that we would ignore the fact that el.doSomething() could theoretically alter "a". [ again, a case where Paul's desire for type-safety breaks -- we can't know that "a" doesn't get changed during that call. ] Cheers, -g -- Greg Stein, http://www.lyra.org/ From guido@CNRI.Reston.VA.US Wed Dec 22 20:17:01 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Wed, 22 Dec 1999 15:17:01 -0500 Subject: [Types-sig] recursive types, type safety, and flow analysis In-Reply-To: Your message of "Tue, 21 Dec 1999 20:06:28 PST." References: Message-ID: <199912222017.PAA20990@eric.cnri.reston.va.us> This is the last thing I'm saying in this thread before the new year: > Our difference lies in two items: > > * I do not believe that you can do cross-function, compile-time checks to > determine if a name is undefined. > [ or if a name has different types over time, which type it may be ] I'm assuming that a global name in a module won't be undefined once it is defined if there are no deletions of it anywhere in the module. I believe this catches 99.9% of all module globals (including functions, classes, and imported modules). > * I am requiring the ability to associate typedecl objects with a function > object at runtime. This imposes the requirement on a typedecl name (such > as a class' name) being defined at the point that a function is defined. > [ I also want typedecl objects associated with a class object and a > module object so that we can reflect on their interface at runtime ] I only care about that as a secondary objective. The run time information made available follows whatever we decide we do at compile time. > We can agree to disagree on the first item (I'll let you write the code > to do that :-). I'd like your opinion on the second. I don't think the second requirement should affect the type checking rules. --Guido van Rossum (home page: http://www.python.org/~guido/) From paul@prescod.net Wed Dec 22 20:19:06 1999 From: paul@prescod.net (Paul Prescod) Date: Wed, 22 Dec 1999 14:19:06 -0600 Subject: [Types-sig] Non-conservative inferencing considered harmful References: Message-ID: <3861323A.6C23ACCA@prescod.net> Greg Stein wrote: > > ... > > I maintain that it could be declared type-safe. In fact, it is reasonably > straight-forward to generate the type information at each point, for each > value, and then to verify that the .doSomething is valid. Greg, you are getting into exactly what I want to avoid. Let's say the first doSomething returns an int and the second doSomething returns a string. Now you are trying to statically bind it to an integer-bearing parameter. What's the appropriate error message: > > a = [Foo(), Bar()] > > > > ... 10,000 lines of code ... > > > > for el in a: > > j = el.doSomething() > > > > ... 10,000 lines of code ... > > decl k: Int > > k = j "Warning: integer|string cannot be assigned to integer." Note also that the point where the error message occurs may be miles from the place where you did the funky thing that allowed the variable to take two types. This is EXACTLY the sort of error message that made me run screaming from the last language that tried to "help" me with data flow analysis and type inferencing. To put this is in the strongest possible terms, this sort of data flow analysis/inferencing "helps" me like MS Word's guessing what I mean when I "misspell" happy face (e.g. Python code) and Word fixes it for me. We are trying too hard and the result will be non-intuitive. 
There is no need to complicate our system in order to handle these corner cases. If the user wants a to be a List of (Foo|Bar) then they can darn well SAY SO. It is because of this experience that I am strongly of the belief that we should do NO flow analysis in the normative part of our type system. I am willing to support these basic kinds type inferencing: * if a variable is consistently assigned a particular type within its scope, we inference the type. * if a variable is inconsistently assigned we infer it as "Any" * if a non-error-checking optimizer wants to inference anything else and that inference doesn't change the language semantics then I am all for it. Paul Prescod From paul@prescod.net Wed Dec 22 20:50:15 1999 From: paul@prescod.net (Paul Prescod) Date: Wed, 22 Dec 1999 14:50:15 -0600 Subject: [Types-sig] recursive types, type safety, and flow analysis (was: Recursive types) References: <199912212338.SAA13830@eric.cnri.reston.va.us> Message-ID: <38613987.4D6C32F1@prescod.net> Guido van Rossum wrote: > > Hm... Since type checking is essentially a compile time activity, I > think it would be better if the run time order of events didn't > matter. I agree. > > I haven't thought about this particular scenario or the resulting impact > > on the inferencer. > ... > I don't see a big problem here for the type checker. The problem is that you and I are thinking about a Algol-style type checker. Greg is talking about a complex, sophisticated type inferencer that tries to understand what your Python code *means*. I am only willing to try to understand Python in the very simplest cases: top level definitions that have no dependence on procedural code. > I like the idea better (I think proposed by Tim Peters) that the names > used for type declarations live in a separate compile time namespace > where different rules apply. (Even though there are obvious > correspondences, e.g. the names of defined or imported classes should > probably be available both at compile time and at run time.) I agree. We will soon have to move forward on this basis. The static type checker is a very different tool and it does not, in general, try to understand "Python code." Two *convenient exceptions* are top-level class and function declarations. But these are just convenient exceptions. I feel strongly that this: if doSomething(): class a: def doSomething(self): pass else: class a: def doSomething(self): pass is NOT TYPE CHECKABLE because we do NOT do data-flow analysis. Paul Prescod From skaller@maxtal.com.au Wed Dec 22 21:13:59 1999 From: skaller@maxtal.com.au (skaller) Date: Thu, 23 Dec 1999 08:13:59 +1100 Subject: [Types-sig] type-assert operator optimizations References: Message-ID: <38613F17.AF01AD63@maxtal.com.au> Greg Stein wrote: > The type-assert operator is *defined* to raise an exception if the type is > wrong. Period. That is not my defintion. And yours has not made it into the Guido distribution yet, so I am free to make my own definition :-) > > I'm assuming > > > > 1) there is ONE python language > > 2) both the optimising and non-optimising > > byteocode compiler conform to the semantics > > > Your two assumptions are correct. I am glad you agree. Consider: assert 0 Run that through CPython. It raises an exception, right? Now run it again, this time with optimisation enabled. Nothing happens. No exception. The assert statement was elided by turning on optimisation. But both compilers are conforming by agreement, so it follows that an assert statement does NOT have to raise an exception. 
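(The experiment is easy to reproduce; the file name below is arbitrary.)

# --- assert_demo.py
assert 0, "this assertion always fails"

# $ python assert_demo.py      -> stops with an AssertionError
# $ python -O assert_demo.py   -> runs to completion; the assert is compiled away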
What is required then? Well, it isn't defined! The program is invalid: it isn't a valid python program. A third compiler could reject it as such. [Of course, you COULD specify that either a) an exception is raised OR b) nothing happens so that anything else -- like deleting your hard disk, or rejecting the program -- is not allowed. In other words, you could make the behaviour non-determinate, without making it undefined altogether.] -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From skaller@maxtal.com.au Wed Dec 22 21:21:29 1999 From: skaller@maxtal.com.au (skaller) Date: Thu, 23 Dec 1999 08:21:29 +1100 Subject: [Types-sig] Non-conservative inferencing considered harmful References: <3861323A.6C23ACCA@prescod.net> Message-ID: <386140D9.3AA7D25D@maxtal.com.au> Paul Prescod wrote: > > * if a variable is consistently assigned a particular type within its > scope, we inference the type. > * if a variable is inconsistently assigned we infer it as "Any" I'd like to suggest ONE extra case worth considering: * particular type OR None The reason I suggest this, is that it is common enough for either None or a single type to be used, for example, often function parameters have a default of None. It would be a pity to lump this case into the case Any. -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From gstein@lyra.org Wed Dec 22 21:39:26 1999 From: gstein@lyra.org (Greg Stein) Date: Wed, 22 Dec 1999 13:39:26 -0800 (PST) Subject: [Types-sig] recursive types, type safety, and flow analysis In-Reply-To: <38613987.4D6C32F1@prescod.net> Message-ID: On Wed, 22 Dec 1999, Paul Prescod wrote: >... > The problem is that you and I are thinking about a Algol-style type > checker. Greg is talking about a complex, sophisticated type inferencer > that tries to understand what your Python code *means*. I am only "Means"? Not at all. I'm only suggesting we watch what happens with the types. We have to do that *anyhow* to do the type checking. You're saying we declare types up front and verify them when we run into the name. To verify it, that implies we need to know the implicit typedecl when we find the name. I'm saying we use the implied typedecl. Why make it explicit, when we have all the information we already need? >... > > I like the idea better (I think proposed by Tim Peters) that the names > > used for type declarations live in a separate compile time namespace > > where different rules apply. (Even though there are obvious > > correspondences, e.g. the names of defined or imported classes should > > probably be available both at compile time and at run time.) > > I agree. We will soon have to move forward on this basis. As long as we have a solution for making the typedecl objects available at runtime. If we are introducing type information to Python, then we ought be able to introspect on that information. > The static type checker is a very different tool and it does not, in > general, try to understand "Python code." It has to understand it to some level to ensure that your types are used properly. There is no avoiding that. I think you're assuming there is too much of an increment between type checking and the level of inference that I'm proposing (actually, "deduction" as John would call it). > Two *convenient exceptions* > are top-level class and function declarations. 
But these are just > convenient exceptions. I feel strongly that this: > > if doSomething(): > class a: > def doSomething(self): > pass > else: > class a: > def doSomething(self): > pass > > is NOT TYPE CHECKABLE because we do NOT do data-flow analysis. I think we do. Not pure data flow in the classic sense, but really type flow. We are doing that anyhow as part of the checking. With declared names, the scope of your type analysis can be minimized (i.e. on a per-statement level rather than per-function). Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Wed Dec 22 21:46:52 1999 From: gstein@lyra.org (Greg Stein) Date: Wed, 22 Dec 1999 13:46:52 -0800 (PST) Subject: [Types-sig] type-assert operator optimizations In-Reply-To: <38613F17.AF01AD63@maxtal.com.au> Message-ID: On Thu, 23 Dec 1999, skaller wrote: >... > I am glad you agree. Consider: > > assert 0 > > Run that through CPython. It raises an exception, right? > Now run it again, this time with optimisation enabled. > Nothing happens. No exception. The assert statement > was elided by turning on optimisation. > But both compilers are conforming by agreement, > so it follows that an assert statement does NOT > have to raise an exception. The assert statement has different meanings based on the compiler you use. That's all. Nothing funny about that. In one case, it is defined to test and value and raise an exception. In the other, it is a no-op. > What is required then? Well, it isn't defined! Yes, it is. It is just defined differently for the two compilers. > The program is invalid: it isn't a valid python > program. It is entirely valid. > A third compiler could reject it as such. Sure: it could say "this thing will always fail, so I won't compile this code." But you're still missing the point. "assert" has a very specific definition, and that is to raise an exception if its value is zero. It is also defined to be a no-op in an optimizing compiler. I do not see where you're going with all this. Exceptions are part of Python. They exist, and they must be raised when needed. Your attempt to redefine their role in Python is getting awfully tiresome. Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Wed Dec 22 21:54:29 1999 From: gstein@lyra.org (Greg Stein) Date: Wed, 22 Dec 1999 13:54:29 -0800 (PST) Subject: [Types-sig] Non-conservative inferencing considered harmful In-Reply-To: <386140D9.3AA7D25D@maxtal.com.au> Message-ID: On Thu, 23 Dec 1999, skaller wrote: > Paul Prescod wrote: > > > > * if a variable is consistently assigned a particular type within its > > scope, we inference the type. > > * if a variable is inconsistently assigned we infer it as "Any" > > I'd like to suggest ONE extra case worth considering: > > * particular type OR None > > The reason I suggest this, is that it is common enough > for either None or a single type to be used, for example, > often function parameters have a default of None. Agreed, but I believe Paul's response would simply be "then declare it." But then: I have different thoughts on the base issue :-) Cheers, -g -- Greg Stein, http://www.lyra.org/ From paul@prescod.net Wed Dec 22 21:56:44 1999 From: paul@prescod.net (Paul Prescod) Date: Wed, 22 Dec 1999 15:56:44 -0600 Subject: [Types-sig] Back to basics Message-ID: <3861491C.48DDE5F4@prescod.net> I think that our version 1 is going to spin out of control if we spend too much energy trying to reverse engineer what Python code means in terms of a type system. 
Yes, I think that we can easily extract type declarations from 90% of all Python code. We will do that for version 2. Yes, I think that there are more sophisticated ways of doing data flow analysis, we may do that for version 3. For version *1* we should going to require all declarations to be 100% explicit. We will allow out-of-line declarations in "shadow files" to allow the annotation of "old" Python and C modules. Inline declarations will be treated as Tim Peters suggests. They are identical in semantics to out-of-line declarations. Inline declarations should always be preceded by the "decl" keyword so that they can be easily stripped. We can even write a small script that extracts them from Python code and generates an out-of-line file so that the semantics are clearly not context-dependent. In version 1 there is no automatic anything. There will be two syntaxes for declaring types: interface declarations and compound typedecls. Both are parameterizable. This should help us to answer (in the short term) many of the tricky questions that have been raised. "del foo" is merely illegal if it violates the declared interface of a module and is not otherwise. a=5 a='abc' is illegal if it violates the declared interface of a module and is not otherwise. In version 2 and subsequent versions we can get to automatic type detection and maybe dataflow and type inferencing. But for version 1 we've got to KISS if we're going to make progress. Paul Prescod From gstein@lyra.org Wed Dec 22 22:08:40 1999 From: gstein@lyra.org (Greg Stein) Date: Wed, 22 Dec 1999 14:08:40 -0800 (PST) Subject: [Types-sig] Non-conservative inferencing considered harmful In-Reply-To: <3861323A.6C23ACCA@prescod.net> Message-ID: On Wed, 22 Dec 1999, Paul Prescod wrote: > Greg Stein wrote: > > ... > > I maintain that it could be declared type-safe. In fact, it is reasonably > > straight-forward to generate the type information at each point, for each > > value, and then to verify that the .doSomething is valid. > > Greg, you are getting into exactly what I want to avoid. Let's say the > first doSomething returns an int and the second doSomething returns a > string. Now you are trying to statically bind it to an integer-bearing > parameter. What's the appropriate error message: > > > > a = [Foo(), Bar()] > > > > > > ... 10,000 lines of code ... > > > > > > for el in a: > > > j = el.doSomething() > > > > > > ... 10,000 lines of code ... > > > decl k: Int > > > k = j > > "Warning: integer|string cannot be assigned to integer." > > Note also that the point where the error message occurs may be miles > from the place where you did the funky thing that allowed the variable > to take two types. That is *exactly* what would happen in my proposal. Yes, the error is located at the assignment, and that is a good distance from the assignments to "a" and "j". If you want to tighten up the code, then declare "a" or "j". It will fail at whatever point you feel the code is "wrong". BUT!! -- you never said what the error in the program was, and what the type-checker was supposed to find: 1) was "a" filled in with inappropriate types of values? 2) was "j" assigned a type it wasn't supposed to hold? 3) was "k" declared wrong? In the absence of knowing which of the three cases is wrong, I *strongly* maintain that the error at the "k = j" assignment is absolutely correct. How is the compiler to know that "a" or "j" is wrong? You didn't tell it that their types were restricted (and violated). In other words, the error message is NOT "miles away". 
It is exactly where it should be. When that error hit, the programmer could have gone "oops! I declared k wrong. I see that j is supposed to be an Int or a String. okay... lemme fix that..." Find another example -- this one doesn't support your position that inferencing is harmful. > This is EXACTLY the sort of error message that made me run screaming > from the last language that tried to "help" me with data flow analysis > and type inferencing. To put this is in the strongest possible terms, > this sort of data flow analysis/inferencing "helps" me like MS Word's > guessing what I mean when I "misspell" happy face (e.g. Python code) and > Word fixes it for me. We are trying too hard and the result will be > non-intuitive. There is no need to complicate our system in order to > handle these corner cases. If the user wants a to be a List of (Foo|Bar) > then they can darn well SAY SO. Sure. And in the absence of saying so, the inferencer above did exactly what it was supposed to do. You have one of three problems in your code, yet no way for any compiler to know which one you meant. If you want to sprinkle declarations throughout your code, then be my guest. But you don't have to. Python has very clean and rigorous execution semantics from which we can easily and deterministically determine type information. I hate the thought that people are going to start feeling that they should put declarations into their code. We will lose one of the best things of Python -- the ability to toss out and ignore all that declaration crap from Algol-like languages. If you give a means to people to declare their variables, then they'll start using it. "But it's optional!" you'll say. Well, that won't be heard. People will just blindly follow their C and Java knowledge and we will lose the cleanliness of syntax that Python has enjoyed for so long. > It is because of this experience that I am strongly of the belief that > we should do NO flow analysis in the normative part of our type system. I disagree, and your example does not support that position. Cheers, -g -- Greg Stein, http://www.lyra.org/ From paul@prescod.net Wed Dec 22 22:09:24 1999 From: paul@prescod.net (Paul Prescod) Date: Wed, 22 Dec 1999 16:09:24 -0600 Subject: [Types-sig] recursive types, type safety, and flow analysis References: Message-ID: <38614C14.48075352@prescod.net> Greg Stein wrote: > >... > > "Means"? Not at all. I'm only suggesting we watch what happens with the > types. We have to do that *anyhow* to do the type checking. I say: "given the name of the thing and the type associated with the name, we check that the operations are legal." Seems straightforward. We've been doing it since the 60's. You say: "given the name of the thing we will intuit a type (often an anonymous/union type (which I consider confusing)) and then check that the operations are legal." "Classic" type inference works from the operators to figure out what the types are. "Classic" type checking works from the names to validate the operators. You want both. I don't think that it will work in practice. The semantics will be confusing as hell and the error messages will be totally indecipherable. I am willing to iteratively open up the design to types of type inferencing and discovery as time goes by but I don't yet understand your model well enough to be able to write a specification around it.
Paul Prescod From gstein@lyra.org Wed Dec 22 22:24:59 1999 From: gstein@lyra.org (Greg Stein) Date: Wed, 22 Dec 1999 14:24:59 -0800 (PST) Subject: [Types-sig] recursive types, type safety, and flow analysis In-Reply-To: <38614C14.48075352@prescod.net> Message-ID: On Wed, 22 Dec 1999, Paul Prescod wrote: > Greg Stein wrote: > > "Means"? Not at all. I'm only suggesting we watch what happens with the > > types. We have to do that *anyhow* to do the type checking. > > I say: "given the name of the thing and the type associated with the > name, we check that the operations are legal." Seems straightforward. > We've been doing it since the 60's. > > You say: "given the name of the thing we will intuit a type (often an > anonymous/union type (which I consider confusing)) and then check that > the operations are legal." That's not what I've been saying. I take blame for not making myself clearer in this case. > "Classic" type inference works from the operators to figure out what the > types are. "Classic" type checking works from the names to validate the > operators. You want both. I don't think that it will work in practice. > The will be the semantics will be confusing as hell and the error > messages will be totally indecipherable. > > I am willing to iteratively open up the design to types of type > inferencing and discovery as time goes by but I don't yet understand > your model well enough to be able to write a specification around it. Given: x = a + b I don't want to infer anything from the "+" there. So no... this wouldn't be called "classic type inferencing." John Skaller best described it as type deduction. I want to know the type of x, given the types of a and b. 1) In the above statement, we already know the types of "a" and "b" (if not, then they haven't been assigned to and we can raise an error!). 2) We know what the "+" operator does, given those two types. 3) We produce a result type. 4) We associate that result type with "x". In your model, (4) is replaced by "validate the result type against the declared type of 'x'". Otherwise, there is no difference between our proposals. Your statements about not working in practice, confusing semantics, and indecipherable error messages are simply FUD. You said yourself you were unsure of the model that I was proposed. And FUD like this generally peeves me. I'm talking about figuring out types from the assignments, not from declarations. That's it. Forget declarations and the clutter that they bring to programs. -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Wed Dec 22 22:30:32 1999 From: gstein@lyra.org (Greg Stein) Date: Wed, 22 Dec 1999 14:30:32 -0800 (PST) Subject: [Types-sig] Back to basics In-Reply-To: <3861491C.48DDE5F4@prescod.net> Message-ID: This is all fine as long as the design does not preclude the availability of typedecl information at runtime. Some of these discussions about new namespaces or not worrying about names being defined could prevent that. I've proposed plenty of syntax for the typedecls and interface declarations. I don't think there has been a solid proposal yet for parameterizations. I would recommend that the syntax design at least starts with the proposal that I set up to save some work and provide a basis for discussion about how to add parameterization. [ personally: I'd recommend parameterization get punted to V2, although I worry that if we don't take its syntax into account, we might preclude its addition later on. 
] -g On Wed, 22 Dec 1999, Paul Prescod wrote: > I think that our version 1 is going to spin out of control if we spend > too much energy trying to reverse engineer what Python code means in > terms of a type system. Yes, I think that we can easily extract type > declarations from 90% of all Python code. We will do that for version 2. > Yes, I think that there are more sophisticated ways of doing data flow > analysis, we may do that for version 3. > > For version *1* we should going to require all declarations to be 100% > explicit. We will allow out-of-line declarations in "shadow files" to > allow the annotation of "old" Python and C modules. Inline declarations > will be treated as Tim Peters suggests. They are identical in semantics > to out-of-line declarations. Inline declarations should always be > preceded by the "decl" keyword so that they can be easily stripped. > > We can even write a small script that extracts them from Python code and > generates an out-of-line file so that the semantics are clearly not > context-dependent. In version 1 there is no automatic anything. > > There will be two syntaxes for declaring types: interface declarations > and compound typedecls. Both are parameterizable. > > This should help us to answer (in the short term) many of the tricky > questions that have been raised. "del foo" is merely illegal if it > violates the declared interface of a module and is not otherwise. > > a=5 > a='abc' > > is illegal if it violates the declared interface of a module and is not > otherwise. > > In version 2 and subsequent versions we can get to automatic type > detection and maybe dataflow and type inferencing. But for version 1 > we've got to KISS if we're going to make progress. > > Paul Prescod -- Greg Stein, http://www.lyra.org/ From sjmachin@lexicon.net Wed Dec 22 23:55:22 1999 From: sjmachin@lexicon.net (John Machin) Date: Thu, 23 Dec 1999 09:55:22 +1000 Subject: [Types-sig] type declaration syntax In-Reply-To: <002701bf4aa5$d2334d60$922d153f@tim> References: <385C1345.C21FF180@maxtal.com.au> Message-ID: <19991222224650088.AAA118.228@max41101.izone.net.au> [John Skaller] > > I guess that NO python programmer wants to declare the type > > of every single name, which is what APC style static type > > checking requires. > [Tim] > I would be delighted to if it sped some of my "marginal" programs by a > factor of 2. My employer would be delighted to if it saved them from > runtime TypeErrors next week instead of next year. Any tradeoff you can > think of has a larger audience than you can imagine includes an audience for Viperish global inference too!>. > Indeed. Further, from what we of the APC persuasion also desire salvation is the speed-driven need to cache references to methods and functions in local variables e.g. def myfunc(real_arg_1, real_arg_2, srepl = string.replace): # known favourite trick of Tim's # as Tim has said recently, it's not robust in the face of caller # going myfunc("foo", "bar", "you lose") or one of mine: wlist = [] wapp = wlist.append for x in some_long_sequence: wapp(some_func(x)) Now I do appreciate the possibilities and practicalities of replacing functions and methods on the fly, the reflective capabilities, the whole dynamic thing, but please please please can't we have a way of declaring that we're peeking, not poking, 99% of the time? 
Some possibilities: (a) declare that certain objects are unpokeable --- pragma nopokeever at the top of the sys module would trap sys.maxint = "gotcha" at compile-time (b) others may be pokeable initially but then frozen some_module.some_func = my_better_func nopokeanymore(some_module) or my_soon_to_be_read_only_data_structure = load_from_file(.......) nopokeanymore(my_soon...) so that run-time checks could made on attempts to poke at the object? (c) declare that code in the current module doesn't poke in replacements for methods in other modules (or in itself) and neither do the modules it calls and if they do then I'll take the rap .... so that an optimising compiler could move the resolution of wlist.append outside the loop as in my example, or even as in Tim's trick do the resolution when the module is loaded ... (d) I really miss #define and const and enum; how about const INITIAL_STATE = 0 const NEXT_STATE = 1 const ANOTHER = 2 so that faster and more robust code could be generated? Don't-stop-at-bytecode-hacks-the-ultimate-dynamic-language-would-allow- "poke(arbitrary_memory_address)=expression"-ly yours John From paul@prescod.net Wed Dec 22 23:00:02 1999 From: paul@prescod.net (Paul Prescod) Date: Wed, 22 Dec 1999 17:00:02 -0600 Subject: [Types-sig] Non-conservative inferencing considered harmful References: Message-ID: <386157F2.183DA4AC@prescod.net> Greg Stein wrote: > > BUT!! -- you never said what the error in the program was, and what the > type-checker was supposed to find: > > 1) was "a" filled in with inappropriate types of values? > 2) was "j" assigned a type it wasn't supposed to hold? > 3) was "k" declared wrong? > > In the absence of knowing which of the three cases is wrong, I *strongly* > maintain that the error at the "k = j" assignment is absolutely correct. > How is the compiler to know that "a" or "j" is wrong? You didn't tell it > that their types were restricted (and violated). That's right. This is all valid Python code and should *not* give an error message. That's why I'm been trying to make a distinction between a type safety declaration and a type declaration. If you didn't ask for this code to be type-safe then it won't cause a problem. Here's where the error might arise: ... 10,000 lines of code ... for el in a: j = el.doSomething() ... 10,000 lines of code ... type-safe def foo( k: Int ): k = j > In other words, the error message is NOT "miles away". It is exactly where > it should be. When that error hit, the programmer could have went "oops! I > declared k wrong. I see that j is supposed to be an Int or a String. > okay... lemme fix that..." No, here's what really happens (based on my experience with ML): "Oooops. Something weird has happened why would it expect an int or string? Where does this variable get its value? Humm. What are the possible types of el? Humm, what are the possible contents of a? Hummm. Why does this language make it so hard for me to find my errors?" > Find another example -- this one doesn't support your position that > inferencing is harmful. It absolutely does. If I can't jump immediately from the error message to the line that causes the problem then something is wrong with the type system. I shouldn't have to "debug" compiler errors by inserting declarations here and there. > Sure. And in the absence of saying so, the inferencer above did exactly > what it was supposed to do. You have one of three problems in your code, > yet no way for any compiler to know which one you meant. 
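(A present-day runtime analogy of that opt-in idea -- not the proposed syntax, and not static checking at all; the type_safe helper below is invented -- but it shows the intent: code that never asked for checking is left alone, and the complaint surfaces only at the boundary that asked for it.)

def type_safe(func):
    def wrapper(k):
        # enforce foo's Int expectation at its own boundary only
        if not isinstance(k, int):
            raise TypeError("foo() wanted an Int, got %r" % (k,))
        return func(k)
    return wrapper

@type_safe
def foo(k):
    return k * 2

print(foo(3))          # fine
try:
    foo("abc")         # the complaint appears here, not in the untyped code
except TypeError as err:
    print(err)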
It shouldn't have let me get so far without saying *exactly what I mean*. It shouldn't have tried to read my mind. I'm the human asking for the computer's help in keeping things straight. If it tries to read my mind it's going to be making the same mistakes I make! Rather it should tell me when I've first done something fishy. > I hate the thought that people are going to start feeling that they should > put declarations into their code. That's *inevitable*. People who want statically type-checked code are inevitably going to start feeling that they must put declarations in their code. That's the case with ML. That's the case with Haskell. That will be the case with Python. We aren't smarter here then all of the programming language researchers in the world. Just as programmers mistrust the ML and Haskell inferencers, they will distrust the Python inferencer. Just as programmers (me, Tim, the editor of the Journal of Functional Programmers) always put in declarations in ML and Haskell code, we will do so for Python. One or two cryptic error messages that take you fifteen minute to "debug" are enough to turn you off inferencing really quick. Therefore our real decision is: do we want to force programmers who want static type checking to sprinkle their code with declarations *explicitly* or do we want to wait until they get frustrated with the inferencer? It seems to me that "simple and explicit" is more Pythonic than "we'll guess what you mean and you can being explicit if we guess wrong." If you don't want to put in declarations, don't use static type checking! > We will lose one of the best things of > Python -- the ability to toss out and ignore all that declaration crap > from Algol-like languages. If you give a means to people to declare their > variables, then they'll start using it. "But it's optional!" you'll say. > Well, that won't be heard. People will just blindly follow their C and > Java knowledge and we will lose the cleanliness of syntax that Python has > enjoyed for so long. I don't see the absence of type declarations as having much to do with Python's cleanliness. As Tim said, declarations improve readability by serving as documentation. Paul Prescod From paul@prescod.net Wed Dec 22 23:10:19 1999 From: paul@prescod.net (Paul Prescod) Date: Wed, 22 Dec 1999 17:10:19 -0600 Subject: [Types-sig] recursive types, type safety, and flow analysis References: Message-ID: <38615A5B.D2E2B608@prescod.net> Greg Stein wrote: > > I'm talking about figuring out types from the assignments, not from > declarations. That's it. Forget declarations and the clutter that they > bring to programs. Okay, let's try again. a.py: a=5 if something(): a = 5 elif somethingElse(): a = "abc" elif somethingElse2(): a = 32L elif somethingElse3(): a = ["ab"] else: del a b.py: import a a.a = "jab" c.py: import a a.a = ("abc",5) d.py: import a, b, c j: Int j = a.a What's the error message? What are the list of valid types that a.a could have at this point and how would the type system have inferred them? Paul Prescod From skaller@maxtal.com.au Wed Dec 22 23:18:01 1999 From: skaller@maxtal.com.au (skaller) Date: Thu, 23 Dec 1999 10:18:01 +1100 Subject: [Types-sig] type-assert operator optimizations References: Message-ID: <38615C29.2F26AB8F@maxtal.com.au> Greg Stein wrote: > The assert statement has different meanings based on the compiler you use. > That's all. Nothing funny about that. > > In one case, it is defined to test and value and raise an > exception. In the other, it is a no-op. 
The semantics of a programming language are a property of the LANGUAGE and not the compiler. So what you are saying is that there are TWO python languages, which you disagreed with before. > But you're still missing the point. "assert" has a very specific > definition, and that is to raise an exception if its value is zero. It is > also defined to be a no-op in an optimizing compiler. Make up your mind :-) > I do not see where you're going with all this. I know. I've got five or so years intensive experience with standardisation committees -- and quite a few more as a 'lingerer'. :-) It takes a long while to understand some of the subtle distinctions and issues, and I can't claim to understand them myself. This is NOT a put down, I know you are not stupid, etc etc .. I just happen to know something about these issues because I spent lots of time dealing with them. Be glad I have done so, and spared you most of the agony :-) A programming language isn't defined by an implementation. Semantics comes from documented specifications -- and sometimes the specifications don't agree with practice, are suboptimal, etc .. which makes discussing changing the specifications all that much harder. At present, the python language reference confuses what a single implementation does with semantics. [but, it's fairly good, because the distinction IS made in some places] This is because it was written as a description of a single implementation. However, there are now many implementations -- several versions of CPython including 1.5.0, 1.5.1, 1.5.2, 1.6 (under development) and 2.0, as well as versions of JPython, Viper, and other tools like type checkers. And then, it runs on multiple platforms. Whew! That's a lot of Pythons. What do they ALL have in common? What will run on ALL these implementations on ALL platforms? It is no longer sensible to merely describe what a particular version of CPython does on one platform: the Python Language itself needs an implementation independent specification. This is so (ultimately) programmers can write programs and DEPEND on a particular behaviour -- even when compiled with an optimising compiler (and I'm not talking about merely optimising byte code) One of my complaints about Python was that it requires exceptions in far too many places, where in fact one would consider that the program is in error: the fact that CPython 1.5.2 raises exceptions in these cases ought to be considered a quality of implementation issue, and NOT mandatory semantics. The reason is that all these requirements for exceptions prevent effective compilation, by preventing many optimisations to be done, and by preventing early error diagnosis. There is an exception, effectively, in the case of assert statements -- in this case the intention is quite clearly reflected in the optimising compiler: the language semantics required for assert are 'none'. This specification only applies to correct programs, so that each implementation is free to choose a behviour for incorrect ones -- throw an exception, do nothing, or reject the program -- or even core dump. The point is the semantics of assert are there to allow the program to declare intentions, and discover if in fact their program is wrong. The same applies in many other areas, where one can NOT produce an efficient code generator unless some of the requirements are relaxed. For example, if a program recurses past some reasonable limit, CPython reports the error. 
But this cannot be a sensible language requirement, because stack overflow is VERY expensive to check for in compiled code. So it is a Quality Of Implementation (QOI) issue what happens on stack overflow, NOT part of the language specification. [At least, this is a sensible position if one wants efficient code to be generated] Did YOU check for stack overflow in the last extension module you wrote? What I expect you (Greg) will find when you try to write the type checking stuff, is that just a few, judicious, language changes, -- mainly restrictions on what is considered a correct Python program -- will vastly improve what you can deduce. You will need to experience this yourself, to really begin to understand how sensitive optimisation is to small changes in semantic requirements. [And when you do, please report your experiences!] There is a classic example of this: FORTRAN vs C. It is well known that Fortran produces considerably more efficient code than C. It is also well known why, and that in retrospect, C was just a bit too dynamic. The problem relates to aliasing, especially of function arguments -- Fortran does NOT permit this. C does, and it pays dearly. The C committee tried to fix this with a new keyword 'noalias', and they stuffed up the design, and had to withdraw it. A new proposal for keyword 'restricted' has been accepted for C9X: the semantics are subtly different. It remains to be seen whether this allows C to catch up with Fortran. Now, I myself have a desire to compile Python to efficient code. I know that the lack of explicit type declarations, by itself, does not prevent this -- although allowing explicit declarations can certainly improve performance, provided that a failure to comply means that the program is in error -- and NOT that exceptions be throw: that would defeat the (OPT) purpose. (Not entirely though) It is in fact quite conventional in most programming languages to provide assertions with NO semantics, to allow debugging dynamic situations (for example, Eiffel). I can't see any reason Python shouldn't follow this conformance model. I also cannot see a reason why CPython should not raise exceptions in these cases. But it would be doing so because that is what a quality INTERPRETER would do, and NOT because it was a requirement of the language. That is why I tried to distinguish two uses of exceptions: for reporting program errors, and for reporting environmental problems (like end of file, or cannot open file) -- in the latter case, we would not dispute that the program is correct and should handle the exception (whether or not it is interpreted or compiled). In the other cases there is a tradeoff-- the more code that is banned, the faster the code we can generate, and the more likely we are to get a core dump or unexpected behaviour. in trying to decide on where the tradeoff should lie, your position at one extreme is not helping. The optimisations which can be performed by NOT requiring assert statements to raise exceptions are minimal but existant. The reason that they're minimal is that, apart from eliding the actual test, it is hard to 'understand' what an assertion is asserting (for a compiler). 
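[To illustrate the point -- hypothetical code, not from either proposal: an arbitrary assert is opaque to a checker, while a bare type test is something an inferencer could actually use:

    def f(xs, func):
        # opaque: a compiler would have to understand len() and callable()
        assert len(xs) > 0 and callable(func)
        # transparent: after this line a checker may treat xs as a list
        assert type(xs) is type([])
        return [func(x) for x in xs]
]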
On the other hand, your typecheck operator seems to provide MORE information, precisely because it is a far more specific test: assert statements can check almost anything, whereas the typecheck operator very specifically asserts something has a particular type, allowing inferencing to do a lot more work, and a lot of typechecking and method lookup to be reduced or eliminated. You designed it for this reason! But we will lose quite a bit of performance if we have to keep checking at run time, and raising exceptions. The reason is that if the exception is caught -- the type that it asserts an object have cannot be assumed anymore -- the object can have any type: try: y = x ! Int # assert y,x are ints .. except TypeError: pass y .. x We cannot say anything about the type of y and x here. To be sure, we can do some control flow analysis in this case, and notice that we lose information where the TypeError is caught. So you COULD argue that requiring an exception be thrown doesn't cost that much (only the test) because information is lost only outside the 'try' body. I might even agree. But the point is that we would then be discussing the cost of various possible restrictions -- not insisting that every exception must be generated exactly as in the CPython 1.5.2 interpreter. We do not HAVE to require the CPython 1.5.2 behaviour in a compiler, we should just weigh up the tradeoffs. Just to give you an example: try: y = x ! int except: ValueError ... You might think this didn't catch a failed exception, and that y and x had to be ints .. until you realised that because Python is so dynamic, someone may have said: ValueError = TypeError somewhere. So you might have to throw away all type information whenever you saw ANY except clause, just to be sure you conformed to the requirements. And you might just want to ban changing the meaning of the standard exceptions! I have a final word. I have a vested interest in exceptions being thrown in many cases: my interscript program relies on catching exceptions when client script contains errors: it reports the error, and then continues on. The program is robust: a useful property for a literate programming tool, where bugs result in scrambled output, rather than a core dump. But I ALSO want interscript to run at least 1000 times faster. A compromise is needed which supports BOTH usages. So I'm NOT in the 'exceptions are evil camp' after all, I in the 'what compromises preserve as much existing behaviour as possible while still allowing effective optimisation and type checking?' I.e. I think we're on the same side, or ought to be: looking for a suitable compromise. Probably, we can at best analyse -- and try out -- various possibilities, with Guido's Guidance to help, and knowing he's making the final decision (at least for CPython ) anyhow. 
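[A tiny runnable demonstration of the ValueError = TypeError hazard described above:

    ValueError = TypeError       # perfectly legal Python
    try:
        "abc" + 3                # raises TypeError
    except ValueError:           # the name now refers to TypeError...
        print("caught it")       # ...so this handler catches it anyway
]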
-- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From skaller@maxtal.com.au Wed Dec 22 23:58:39 1999 From: skaller@maxtal.com.au (skaller) Date: Thu, 23 Dec 1999 10:58:39 +1100 Subject: [Types-sig] type declaration syntax References: <385C1345.C21FF180@maxtal.com.au> <19991222224650088.AAA118.228@max41101.izone.net.au> Message-ID: <386165AF.F6E6BF81@maxtal.com.au> John Machin wrote: > Now I do appreciate the possibilities and practicalities of replacing > functions and methods on the fly, the reflective capabilities, the > whole dynamic thing, but please please please can't we have a way of > declaring that we're peeking, not poking, 99% of the time? Why not the OTHER way around, to match the statistics? 1) Ban rebinding module variables after importing is done 2) Specify that 'def' creates an immutable binding. You can't rebind (or del) the name. Same for 'class'. [This includes defs inside classes] 3) It isn't necessary to introduce 'const', at least for modules and classes: Here's why: Given (1), the only ways to change a module level variable are: a) a module level assignment -- easy to detect b) an assignement in a function defined in the same module, which has a global directive for that variable c) exec statement [if you see one, abandon hope of optimisation] d) __dict__ hackery from outside the module The only hard case is (d). This cannot be statically checked. It can be banned all the same (and on the programmers head be it), I.e. TWO bans fix most problems. The ban on module level rebindings is a significant restriction. The ban on changing 'def' and 'class' bindings can be worked around [use another variable] and so no functionality is lost, provided that the author takes this into account. EG: class X: def f(..): pass g = f # g can be rebound, f cannot be I think this covers: caching all module and class level variables caching instance methods what is not covered is caching instance attributes. -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From skaller@maxtal.com.au Thu Dec 23 00:14:44 1999 From: skaller@maxtal.com.au (skaller) Date: Thu, 23 Dec 1999 11:14:44 +1100 Subject: [Types-sig] recursive types, type safety, and flow analysis References: <38615A5B.D2E2B608@prescod.net> Message-ID: <38616974.AFC365D6@maxtal.com.au> Paul Prescod wrote: > a.py: > > a=5 > > if something(): > a = 5 > elif somethingElse(): > a = "abc" > elif somethingElse2(): > a = 32L > elif somethingElse3(): > a = ["ab"] > else: > del a The type of a is one of: int, string, long, tuple, or terminal where 'terminal' means 'a doesn't exist'. > b.py: > > import a > a.a = "jab" module attribute assignment is banned > c.py: > > import a > a.a = ("abc",5) module attribute assignment is banned > d.py: > > import a, b, c > > j: Int > j = a.a > > What's the error message? There isn't one at compile time. a could be an int, the inference engine cannot know if it is or not, so it keeps quiet. At run time, a TypeError is raised if something other than int is assigned. [Yes, I know I changed the rules on you by saying the module attribute assignments were banned] Note: Viper will do better than this compiling whole programs. It loads modules dynamically at compile time, by running the interpreter, so all modules are fully built at compile time. 
Then, it is easy to know not only the types of every (module) variable, but also their values! Obviously, this will not work on a 'per module' basis. -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From gstein@lyra.org Thu Dec 23 00:50:52 1999 From: gstein@lyra.org (Greg Stein) Date: Wed, 22 Dec 1999 16:50:52 -0800 (PST) Subject: [Types-sig] examples (was: recursive types, type safety, and flow analysis) In-Reply-To: <38615A5B.D2E2B608@prescod.net> Message-ID: On Wed, 22 Dec 1999, Paul Prescod wrote: > Greg Stein wrote: > > I'm talking about figuring out types from the assignments, not from > > declarations. That's it. Forget declarations and the clutter that they > > bring to programs. > > Okay, let's try again. > > a.py: > > a=5 > > if something(): > a = 5 > elif somethingElse(): > a = "abc" > elif somethingElse2(): > a = 32L > elif somethingElse3(): > a = ["ab"] > else: > del a As John stated: at the end of this code block, "a" is one of Int, String, Long, List, or Undefined. There are no errors. "a" does not form part of a.py's interface as it is not explicitly mentioned in a "decl member" statement. [ below, I explore the ramifications of changing this feature/design ] > b.py: > > import a > a.a = "jab" I would say this is perfectly acceptable. Remember: I'm proposing to defer all assignment enforcement. No errors. > c.py: > > import a > a.a = ("abc",5) Same as b.py. > d.py: > > import a, b, c > > j: Int > j = a.a > > What's the error message? What are the list of valid types that a.a > could have at this point and how would the type system have inferred > them? None of the modules above export an interface. Especially with regard to module attributes (i.e. module globals). I do believe that a module exports function signatures as part of its interface; but unless you include a "decl member" in there, the module does not export defined attributes. In my proposal, I would state that ".a" is not part of a's interface, so the reference simply fails. This is analogous to: some_instance.undefined_attribute In other words, Module "a" and "some_instance" both export an interface. It is an error to reference something that is NOT part of that interface. For discussion's sake, because I think you are seeking more detail around this particular fragment... Let us posit that referencing "a.a" is allowed (for discussion: let's say we make allowances for module interfaces). Given that: the expression "a.a" is not type-checked at all. We know that "a" is a Module, but nothing more. a.a might raise an AttributeError, but we can't know, and we don't flag that. If it does have a value, we will treat it as type "Any". The assignment to "j" succeeds because I don't want local declarations or assignment enforcement. ========================================== Now. Let's weaken some of my assumptions/requirements and/or look further ahead. 1) module globals implicitly form part of an exported interface. This would imply that "a.a" has an exported type (the union described above). Things like "b.a" would also be exported (with a type of Module), but that doesn't enter into this discussion. Given this, the type-checker will know that the RHS of "j = a.a" has that union type. No error is raised because enforcement does not exist. 2) Enable assignment enforcement (and local declarations) d.py issues an error at the assignment. 
3) Enable module attribute assignment enforcement (by virtue of an exported interface, clients must respect the typedecls specified) b.py does not raise an error. String is a valid type for a.a. c.py raises an error. A tuple is an invalid type for a.a. 4) Module attribute assignment is outright forbidden. b.py and c.py raise errors. These assignments are forbidden. 5) Separate track: references to an attribute that is not part of a Module interface is noted as an error. b.py, c.py, and d.py would raise an error because "a" is not part of the Module interface (it was not declared). [ per my note above: I do think that these references to "a.a" would cause an error because of the interface violation -- a.a is not exported by the Module. ] Items 2, 3, 4 are conditioned upon "a" being part of Module a's interface (implicitly per #1, or explicit via a declaration). In summary, the deferred parts of my proposal are: 1) a module global does not form part of its interface unless it is explicitly declared with a "decl member" statement. 2) assignment enforcement does not exist 3) module attribute assignments are legal and un-type-checked Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Thu Dec 23 01:16:00 1999 From: gstein@lyra.org (Greg Stein) Date: Wed, 22 Dec 1999 17:16:00 -0800 (PST) Subject: [Types-sig] Non-conservative inferencing considered harmful In-Reply-To: <386157F2.183DA4AC@prescod.net> Message-ID: On Wed, 22 Dec 1999, Paul Prescod wrote: > Greg Stein wrote: > > > > BUT!! -- you never said what the error in the program was, and what the > > type-checker was supposed to find: > > > > 1) was "a" filled in with inappropriate types of values? > > 2) was "j" assigned a type it wasn't supposed to hold? > > 3) was "k" declared wrong? > > > > In the absence of knowing which of the three cases is wrong, I *strongly* > > maintain that the error at the "k = j" assignment is absolutely correct. > > How is the compiler to know that "a" or "j" is wrong? You didn't tell it > > that their types were restricted (and violated). > > That's right. This is all valid Python code and should *not* give an > error message. Woah... if you presume enforcement of assignments, then a type-check error exists somewhere. We're *trying* to raise errors. Did you typo here? > That's why I'm been trying to make a distinction between > a type safety declaration and a type declaration. If you didn't ask for > this code to be type-safe then it won't cause a problem. All right. This is a different problem than I was responding to. You asked me whether an error would be raised by a specified piece of code. I said yes. Now you're saying the code was actually correct and an error shouldn't be raised? Or that there was some type-safety vs type-decl difference being discussed? You're confusing me. Regardless: in your world of assignment-enforcement, the code you specified *would* raise an error. Either at the assignment to "a", "j", or "k", depending on which ones you had declared ahead of time. > Here's where > the error might arise: > > ... 10,000 lines of code ... > > for el in a: > j = el.doSomething() > > ... 10,000 lines of code ... > > type-safe > def foo( k: Int ): > k = j Sure. You'd get an error at the "k = j" line. What's the point? We agree on this one. [presuming assignment enforcement] > > In other words, the error message is NOT "miles away". It is exactly where > > it should be. When that error hit, the programmer could have went "oops! I > > declared k wrong. 
I see that j is supposed to be an Int or a String. > > okay... lemme fix that..." > > No, here's what really happens (based on my experience with ML): > > "Oooops. Something weird has happened why would it expect an int or > string? Where does this variable get its value? Humm. What are the > possible types of el? Humm, what are the possible contents of a? Hummm. > Why does this language make it so hard for me to find my errors?" All right. This is because you are specifying the error was #1 or #2. If the error was #3, then it would be immediately obvious what happened (you declared "k" wrong). Given error #1 or #2, this implies that at the "k = j" statement, you had the misconception that "j" was of type Int. The compiler tells you that your impression is wrong. No harm in that, and exactly what you're seeking from the type checker. As a thrifty Python programmer, knowing how to declare variables, you simply insert "j: Int" somewhere. Now you get the error up there at the doSomething() call. If your error was back at #1 when you constructed "a", then the error at doSomething() will make you pause. Again, you'll realize that your assumptions were incorrect and you insert another declaration to verify your assumption. Now, you get an error at the assignment to "a". I see no head-scratching in here, other than that caused by the programmer's failure to understand their code. That is what we are trying to solve -- find cases where their misunderstanding is going to cause problems at run time. > > Find another example -- this one doesn't support your position that > > inferencing is harmful. > > It absolutely does. If I can't jump immediately from the error message > to the line that causes the problem then something is wrong with the > type system. I shouldn't have to "debug" compiler errors by inserting > declarations here and there. But I'm telling you: in this case, there is ONE of THREE problems. If the problem was #3, then you *can* immediately jump to the spot to fix the declaration of "k". Otherwise, your assumptions are incorrect and the compiler just told you that. That is exactly what it is supposed to do. When I get a segfault in my C code, it is rarely caused by a local problem. If you believe that all error messages and their cause are supposed to be localized, then you are truly mistaken. On one hand, you say that you don't want to "debug compiler errors" by inserting declarations. But that is *exactly* what you're proposing to do! You just want to be pre-emptive and force people to declare them up front. What if "a" and "j" were *supposed* to be of type "any"? There is no compiler error in your example and my proposed response. You specifically did not want to enforce any types on "a" or "j". The compiler did exactly what you told it to: no type checks on those. Later, you made an assertion that "j" was an Int by virtue of believing that you could assign it to "k". Well, the compiler just *helped* you out by telling you that assumption was incorrect. What is the problem with that? > > Sure. And in the absence of saying so, the inferencer above did exactly > > what it was supposed to do. You have one of three problems in your code, > > yet no way for any compiler to know which one you meant. > > It shouldn't have let me get so far without saying *exactly what I > mean*. It shouldn't have tried to read my mind. I'm the human asking for > the computer's help in keeping things straight. If it tries to read my > mind it's going to be making the same mistakes I make! 
Rather it should > tell me when I've first done something fishy. But you didn't do anything fishy! Not until you assumed that "j" was an Int, when it really wasn't. The compiler isn't trying to read your mind. It just told you that you messed up. The assignments to "a" and "j" were perfectly valid. If, in your mind, they were not, then you should have made that clear. But I'm trying to say that we should not require that. We can get along quite fine with knowing the type information as it gets resolved, rather than having to declare it all up front. >... > Therefore our real decision is: do we want to force programmers who want > static type checking to sprinkle their code with declarations > *explicitly* or do we want to wait until they get frustrated with the > inferencer? You are assuming that it will cause frustration. It isn't making a single guess. It is tracking exactly what you are doing with your types -- it is entirely logical and deterministic. When it tells that you goofed, that is because you really did. That is valid, helpful information. I do not want to require explicit declaration. There is no need for it. > It seems to me that "simple and explicit" is more Pythonic > than "we'll guess what you mean and you can being explicit if we guess > wrong." If you don't want to put in declarations, don't use static type > checking! Stop that. There are no guesses occurring! > > We will lose one of the best things of > > Python -- the ability to toss out and ignore all that declaration crap > > from Algol-like languages. If you give a means to people to declare their > > variables, then they'll start using it. "But it's optional!" you'll say. > > Well, that won't be heard. People will just blindly follow their C and > > Java knowledge and we will lose the cleanliness of syntax that Python has > > enjoyed for so long. > > I don't see the absence of type declarations as having much to do with > Python's cleanliness. I do. Python is remarkably devoid of syntactic sugar. The lack of declarations is a huge factor in that. > As Tim said, declarations improve readability by > serving as documentation. I agree and submit: # foo is an Integer Perfectly valid documentation if that is what you need. No reason to introduce syntax for that. Cheers, -g -- Greg Stein, http://www.lyra.org/ From paul@prescod.net Thu Dec 23 01:15:15 1999 From: paul@prescod.net (Paul Prescod) Date: Wed, 22 Dec 1999 19:15:15 -0600 Subject: [Types-sig] Back to basics References: Message-ID: <386177A3.86F0D505@prescod.net> Greg Stein wrote: > > This is all fine as long as the design does not preclude the availability > of typedecl information at runtime. I totally agree. I would even like a type-checking function/operator (preferably the former for implementation reasons). > Some of these discussions about new > namespaces or not worrying about names being defined could prevent that. I don't follow the part following "or". > I've proposed plenty of syntax for the typedecls and interface > declarations. I don't think there has been a solid proposal yet for > parameterizations. I would recommend that the syntax design at least > starts with the proposal that I set up to save some work and provide a > basis for discussion about how to add parameterization. Agreed. > [ personally: I'd recommend parameterization get punted to V2, although I > worry that if we don't take its syntax into account, we might preclude > its addition later on. ] Agreed. Tim Peters convinced me that it isn't actually too big of a deal. 
Parameterization is almost like string substitution. In some lexical scope, _T stands for a paramater that is substituted in when a concrete object is declared. If you treat it like string substitution then the semantics are pretty simple. One minor detail to work out is whether to predeclare the list of parameter variables or just look for names beginning with underscores. Paul Prescod From gstein@lyra.org Thu Dec 23 09:35:40 1999 From: gstein@lyra.org (Greg Stein) Date: Thu, 23 Dec 1999 01:35:40 -0800 (PST) Subject: [Types-sig] Back to basics In-Reply-To: <386177A3.86F0D505@prescod.net> Message-ID: On Wed, 22 Dec 1999, Paul Prescod wrote: > Greg Stein wrote: > > This is all fine as long as the design does not preclude the availability > > of typedecl information at runtime. >... > > Some of these discussions about new > > namespaces or not worrying about names being defined could prevent that. > > I don't follow the part following "or". Sorry. There are at least a couple things that have been discussed recently might prevent us from having typedecl objects at runtime: - shifting names of typedecl objects into a distinct namespace (how does the runtime access this namespace? when is the namespace created? etc) - undefined names, per the runtime execution order (at the time a function object is created, we need a typedecl object for one of its args; if the name referring to the typedecl is undefined at that point in the execution, then we fail or we don't get a typedecl; both situations are untenable for me) In light of these possible directions in discussion, I'm concerned that following them will mean we don't have runtime typedecls. While walking to dinner tonite, I came up with a great example for runtime typedecl objects: debuggers. I imagine there are other IDE functions that would find the information useful. [ of course, I can also imagine that, in some situations, an IDE needs typedecl objects without loading a module ] >... > > [ personally: I'd recommend parameterization get punted to V2, although I > > worry that if we don't take its syntax into account, we might preclude > > its addition later on. ] > > Agreed. Tim Peters convinced me that it isn't actually too big of a > deal. Parameterization is almost like string substitution. In some > lexical scope, _T stands for a paramater that is substituted in when a > concrete object is declared. If you treat it like string substitution > then the semantics are pretty simple. One minor detail to work out is > whether to predeclare the list of parameter variables or just look for > names beginning with underscores. Sure. Parameterization isn't a hard concept, but coming up with a nice syntax :-). I would hope that we don't base semantics on the presence of a leading underscore. In your original "Back to basic" note (which started this thread), you state that out-of-line files would be used. I'm beginning to agree with the need for separate interface files. But for a different reason :-). Let's say that you have a module that has a few "decl member" statements here and there. Those decl statements along with function and class definitions form its interface. How do we cache that interface so the type-checker can use it later? If we are analyzing module "foo" and it imports "bar", then where do we get bar's interface? I hope the answer isn't that we go and parse bar to derive it. 
[ actually, I'd hope we somehow generate a central database of interfaces, but that is an implementation detail best left for later :-) ] Cheers, -g -- Greg Stein, http://www.lyra.org/ From guido@CNRI.Reston.VA.US Thu Dec 23 13:37:44 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Thu, 23 Dec 1999 08:37:44 -0500 Subject: [Types-sig] type declaration syntax In-Reply-To: Your message of "Thu, 23 Dec 1999 10:58:39 +1100." <386165AF.F6E6BF81@maxtal.com.au> References: <385C1345.C21FF180@maxtal.com.au> <19991222224650088.AAA118.228@max41101.izone.net.au> <386165AF.F6E6BF81@maxtal.com.au> Message-ID: <199912231337.IAA21818@eric.cnri.reston.va.us> [John Skaller] > 1) Ban rebinding module variables after importing is done > > 2) Specify that 'def' creates an immutable binding. You can't > rebind (or del) the name. Same for 'class'. [This includes > defs inside classes] > > 3) It isn't necessary to introduce 'const', at least for modules and > classes: > Here's why: > Given (1), the only ways to change a module level variable are: > > a) a module level assignment -- easy to detect > b) an assignement in a function defined in the same module, > which has a global directive for that variable > c) exec statement [if you see one, abandon hope of optimisation] > d) __dict__ hackery from outside the module > > The only hard case is (d). This cannot be statically checked. > It can be banned all the same (and on the programmers head be it), > > I.e. TWO bans fix most problems. The ban on module level rebindings > is a significant restriction. The ban on changing 'def' and 'class' > bindings can be worked around [use another variable] and so > no functionality is lost, provided that the author takes > this into account. EG: > > class X: > def f(..): pass > g = f # g can be rebound, f cannot be > > I think this covers: > > caching all module and class level variables > caching instance methods > > what is not covered is caching instance attributes. Agreed. I proposed most of this a week ago. However, I don't see why you propose to disallow rebinding def and class. I proposed to have a warning for this. If they are rebound, it is easily detected, so the effects can simply be calculated by the type checker. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@CNRI.Reston.VA.US Thu Dec 23 13:47:32 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Thu, 23 Dec 1999 08:47:32 -0500 Subject: [Types-sig] recursive types, type safety, and flow analysis In-Reply-To: Your message of "Thu, 23 Dec 1999 11:14:44 +1100." <38616974.AFC365D6@maxtal.com.au> References: <38615A5B.D2E2B608@prescod.net> <38616974.AFC365D6@maxtal.com.au> Message-ID: <199912231347.IAA21844@eric.cnri.reston.va.us> [a.a can have type int, string, or a few other possibilities] > > j: Int > > j = a.a > > > > What's the error message? > > There isn't one at compile time. > a could be an int, the inference engine cannot > know if it is or not, so it keeps quiet. > At run time, a TypeError is raised if something > other than int is assigned. I don't like this rule, and I don't think this kind of rule exists in other languages. I would say a.a is a union, and most languages dealing with unions require the programmer to explicitly code a type test. Never mind that *we* might know that the original program in fact will only reach this point with an int in a.a; the fact that the type checker can't see that (and it would have to solve the halting problem to see it!) means that I'd be happy to get an error message here. 
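[A sketch of the kind of explicit test meant here, written as plain runtime Python against the a.py from Paul's example rather than in any proposed syntax:

    import a
    if type(a.a) is type(0):
        j = a.a                  # on this branch a checker knows a.a is an int
    else:
        raise TypeError("expected an int in a.a, got %s" % type(a.a))
]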
Paul's issue was that in ML the error message is typically ununderstandable. I have never used ML (it's a language for people with excess IQ points) but I don't think that is the right level of critique. In any case I think we can do better by simply referring to the line number(s) where a.a gets assigned a non-int value. Good error messages are a human factors issue, not a type system issue. --Guido van Rossum (home page: http://www.python.org/~guido/) From skaller@maxtal.com.au Thu Dec 23 16:48:59 1999 From: skaller@maxtal.com.au (skaller) Date: Fri, 24 Dec 1999 03:48:59 +1100 Subject: [Types-sig] Run time arg checking implemented References: <386177A3.86F0D505@prescod.net> Message-ID: <3862527B.99B783C8@maxtal.com.au> I have implemented run time argument checking in Viper, using Greg's ! operator. The syntax (so far) is like: def f( p ! t = dflt): pass and the semantics are to check that an argument has the nominated type: f(a) checks like: if type(a) is not t: raise TypeError "messge" There is no return type declaration or checking, and the type can be an arbitrary expression. [In Viper, any object can be a type] Even if a parameter has a default argument, it is not checked at the point of definition. The type is bound at the point of definition. Implementation time, a few hours. Tuple parameters are checked componentwise: def f((a!t1, b!t2)) : pass We're not talking any complex type checking, inference, or anything else here .. but this could, IMHO, be implemented easily in CPython, well in time for 1.6. Again, it _looks_ nice to me, and it sits well with the expression form x ! t Example: ---------------------------------------- >>>t = type(1) ...def f(x!t): pass ...f("no") ... Uncaught Python Exception at top level!! .. Kind: Instance of TypeError .. Attributes: args --> Argument 1 in call to function f has type but type is required -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From skaller@maxtal.com.au Thu Dec 23 17:40:29 1999 From: skaller@maxtal.com.au (skaller) Date: Fri, 24 Dec 1999 04:40:29 +1100 Subject: [Types-sig] type declaration syntax References: <385C1345.C21FF180@maxtal.com.au> <19991222224650088.AAA118.228@max41101.izone.net.au> <386165AF.F6E6BF81@maxtal.com.au> <199912231337.IAA21818@eric.cnri.reston.va.us> Message-ID: <38625E8D.5280EA8A@maxtal.com.au> Guido van Rossum wrote: > > 2) Specify that 'def' creates an immutable binding. You can't > > rebind (or del) the name. Same for 'class'. [This includes > > defs inside classes] > Agreed. I proposed most of this a week ago. Yes, I guess I was trying to summarise more than propose anything new. > However, I don't see why > you propose to disallow rebinding def and class. I proposed to have > a warning for this. If they are rebound, it is easily detected, so > the effects can simply be calculated by the type checker. You're at least right that my stance on 'const' variables and classes/functions is inconsistent. The rebinding rule was also intended to conver nested classes and functions. At present: def f(): class X: pass return X() x1 = X() x2 = X() assert classof(x1) != classof(x2) I hoped to eliminate this excess dynamism. .. redefining the class every function invocation is rarely desirable. But it may be harder than I thought, because even though CPython does not lexically scope function bodies (Viper does .. so my rule would break it), default arguments _are_ lexically scoped. 
And perhaps it is best left alone, since nested classes and functions are rarely used. [I mean, nested in functions] Hmm. I have already put a test in Viper for module.attr = value. I found I used it in a couple of places, initialising the sys module. [specifically, assigning the standard input/output files] I like the warning. I guess I could put duplicate defintion warnings in easily, and see what warnings I get. -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From guido@CNRI.Reston.VA.US Thu Dec 23 18:08:55 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Thu, 23 Dec 1999 13:08:55 -0500 Subject: [Types-sig] type declaration syntax In-Reply-To: Your message of "Fri, 24 Dec 1999 04:40:29 +1100." <38625E8D.5280EA8A@maxtal.com.au> References: <385C1345.C21FF180@maxtal.com.au> <19991222224650088.AAA118.228@max41101.izone.net.au> <386165AF.F6E6BF81@maxtal.com.au> <199912231337.IAA21818@eric.cnri.reston.va.us> <38625E8D.5280EA8A@maxtal.com.au> Message-ID: <199912231808.NAA22805@eric.cnri.reston.va.us> [John Skaller] > CPython does not lexically scope function bodies I'm not sure what you mean by lexically scoped here, since in my opinion Python functions *are* lexically scoped -- however the scopes don't nest like they do in Pascal etc. (The opposite of lexical scoping is dynamic scoping, which is emphatically *not* used in Python -- variables bound in outer stack frames don't affect a function's use of variable names.) > default arguments _are_ lexically scoped. I'm not sure what you mean here either. Defaults are evaluated in the containing scope. I'm not sure how that makes them any more lexically scoped. --Guido van Rossum (home page: http://www.python.org/~guido/) From skaller@maxtal.com.au Fri Dec 24 02:01:27 1999 From: skaller@maxtal.com.au (skaller) Date: Fri, 24 Dec 1999 13:01:27 +1100 Subject: [Types-sig] type declaration syntax References: <385C1345.C21FF180@maxtal.com.au> <19991222224650088.AAA118.228@max41101.izone.net.au> <386165AF.F6E6BF81@maxtal.com.au> <199912231337.IAA21818@eric.cnri.reston.va.us> <38625E8D.5280EA8A@maxtal.com.au> <199912231808.NAA22805@eric.cnri.reston.va.us> Message-ID: <3862D3F7.1D71AC9B@maxtal.com.au> Guido van Rossum wrote: > > [John Skaller] > > CPython does not lexically scope function bodies > > I'm not sure what you mean by lexically scoped here, since in my > opinion Python functions *are* lexically scoped -- however the scopes > don't nest like they do in Pascal etc. (The opposite of lexical > scoping is dynamic scoping, which is emphatically *not* used in Python > -- variables bound in outer stack frames don't affect a function's > use of variable names.) > > > default arguments _are_ lexically scoped. > > I'm not sure what you mean here either. Defaults are evaluated in the > containing scope. I'm not sure how that makes them any more lexically > scoped. I agree with what you say, your guess at what I meant is correct. I should have given a more detailed description. The issue remains, independently of the terminology used to describe it: def f(x): def g(a): .... return g Here, each g created on invocation of f is has the same behaviour on each invocation of f, no matter what the arguments of f are, and no matter what the values of the locals of f are. But each invocation of f() returns a _new_ function called g: a distinct object. 
This is not the case if g has a default argument: def f(x): def g(a=x): return a return g g1 = f(1) g2 = f(2) print g1(), g2() prints 1,2: g1 and g2 have different behaviours, even though the bodies of the functions are the same, because the default arguments differ. So _part_ of a function definition may depend on the definition context, namely, its default arguments. Unlike C and C++, in Python default arguments are part of the function. -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From skaller@maxtal.com.au Fri Dec 24 03:16:04 1999 From: skaller@maxtal.com.au (skaller) Date: Fri, 24 Dec 1999 14:16:04 +1100 Subject: [Types-sig] Multiple dispatch [for interest only] References: <385C1345.C21FF180@maxtal.com.au> <19991222224650088.AAA118.228@max41101.izone.net.au> <386165AF.F6E6BF81@maxtal.com.au> <199912231337.IAA21818@eric.cnri.reston.va.us> <38625E8D.5280EA8A@maxtal.com.au> <199912231808.NAA22805@eric.cnri.reston.va.us> Message-ID: <3862E574.156465BB@maxtal.com.au> [This is mainly for interest, it is not a 'proposal' of any kind] A day after implementing type checked function arguments in Viper, using the syntax def f(a! t1, b!t2) ... I am thinking of some interesting new possibilities this leads to for polymorphism. This comes about because the type information is part of the run-time function object. Consider, just for the moment: def f(x!t1): ... def f(x!t2): ... and suppose this was allowed, and matched the arguments of a function call in reverse order of definition: f(a) is notionally expanded to if type(a) is t2: f_t2(a) else if type(a) is t1: f_t1(a) where f_t1, and f_t2 are the two functions named f, defined by the above definitions, respectively. This could be useful, and the mechanism has a name: it's called multiple dispatch. In particular, it is _message_ based multiple dispatch. [The table of possible signatures is centralised, and associated with the function name, rather than distributed between 'objects' as methods] Here's an example: suppose you have an algorithm: # module System def add(a,b): # add standard integer types def gcd(a,b): ... add(a,b) ... add(b,a) It's well known that Euclids algorithm for calculating the greatest common divisor is generic: it works when both numbers elements of the same integral domain such as the integers. Python already has two integer types, 'int' and 'long': the algorithm will work for both. What happens if a new type, MyType, is created? You would write: _add = add def add(a,b): if type(a) is MyNumber and type(b) is MyNumber: ... # details for adding MyNumbers else: _add(a,b) # call old function System.add = add I note this is somewhat insecure, since _add is likely to get redefined in a subsequent attempt at the same thing, Viper has a syntax to fix this, but that's another story :-] I also note that this _requires_ invading the module System, and breaking the 'freezing'. Let me call this an incremental function defintion. What we're doing, is overloading the add function. This is part of the theoretical 'type' of MyNumber (that you can add them). Adding a new meaning to the System 'add' function is essential for making the generic algorithm work with MyNumbers. 
[There is another way to do this, using classes and __add__ and __radd__ methods, but this does NOT generalise because of the usual covariance problem with object orientation: please consider a three argument function to see this, I've used a two argument one for brevity] So, I was thinking that it would be interesting to consider what would happen if the run time system, instead of just looking a function up by name, ALSO used the types of the arguments to help choose which one to call. [This could _also_ be applied to methods, except that the method would not be applied to the 'object', since the class of the object already acts as a lookup namespace] Just to summarise: I'm thinking about some way of permitting 'incremental' function defintions, and an associate 'multiple dispatch'. Before considering syntax, there are architectural issues. The invisaged implementation for _builtin_ functions, such as 'len', would be to keep a list of the functions named 'len' associated with the string 'len': instead of the lookup table I'm using at present, which is like: {'len': len, 'abs': abs .. ] the table would be changed to {'len', [len1, len2 ..] .. } and instead of dispatch return table['len'](arg) we'd have something like: for f in table['len']: if type(arg) = f.param_types[0]: return f(arg) This is NOT particularly new for Python, it is similar to the way exceptions are matched against handlers. I note: the actual implementation is trivial. The extra overhead calling functions is also minimal. The issues here are semantics, and to some extent, syntax. ------------------------------------------------------- FYI: what excites me about this is that it is well known that object oriented polymorphism doesn't work in the sense that it doesn't extend gracefully to multiple arguments. Dynamic multiple dispatch is a fairly unprincipled way of supporting the notion of genericity; however proper genericity cannot so easily be implemented: no one knows how to do it, it is a very active research area in which there are currently a diverse collection of theories. Many functional languages provide _correct_, well principled support for genericity, using type variables, modules, Haskell classes/monads .. etc, but all of these type systems are severely limited in their expressivity. Generally, less well principled systems like C++ templates, or dynamic disptching, are less secure .. but more expressive. For those interested in why someone like me, interested in genericity, is VERY interested in Python: this is the reason: it is possible to use the relatively unprincipled dynamism to _implement_ generic things in python, even though such implementations are not very secure. This is somewhat like using casts in C. As any Python enthusiast will tell you, Pythons dynamic nature makes its object orientation more powerful (expressive) than that provided by many statically typed languages .. and also more dangerous. However, OO doesn't work well, and it is interesting to experiment with a dynamic language which does not have the constraints of _both_ strict object orientation and also static typing: a suitable dynamic system can suggest how to 'design' a static type system that retains the expressivity enabled by dynamism, but also provide the extra security static typing traditionally provides. 
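[A minimal sketch of the dispatch-table idea described above, as ordinary Python; the names table, register and dispatch are invented for the sketch and are not part of Viper:

    table = {}                   # function name -> list of (param types, function)

    def register(name, types, func):
        # newest definition first, so lookup runs in reverse order of definition
        table.setdefault(name, []).insert(0, (types, func))

    def dispatch(name, *args):
        for types, func in table[name]:
            # exact type match on every argument, as in the description above
            if len(types) == len(args) and all(type(a) is t for a, t in zip(args, types)):
                return func(*args)
        raise TypeError("no matching definition of %s" % name)

    def add_ints(a, b): return a + b
    register("add", (int, int), add_ints)
    dispatch("add", 2, 3)        # -> 5; an "add" for MyNumber can be registered later
]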
-- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 Date: Wed, 22 Dec 1999 03:36:36 -0500 From: scott To: Greg Stein Subject: Re: [Types-sig] recursive types, type safety, and flow analysis Message-ID: <19991222033636.A14007@chronis.pobox.com> References: <19991221234415.A12628@chronis.pobox.com> Resent-From: Greg Stein Resent-To: types-sig@python.org On Tue, Dec 21, 1999 at 10:02:08PM -0800, Greg Stein wrote: > On Tue, 21 Dec 1999, scott wrote: > > On Tue, Dec 21, 1999 at 08:06:28PM -0800, Greg Stein wrote: > >... > > > Basically, I think your request to find and report on > > > use-before-definition is "intractable" *when* you're talking about > > > multiple bodies of code (e.g. two functions, or the global space and a > > > function). [...] > > I'd agree that this has been demonstrated, but only for examples of > > code which seem like great candidates for compile time warnings. Are > > there examples which strike you otherwise? > > One of my points was that I do not believe you can issue warnings because > you can't know whether a problem might exist. Basically, it boils to not > knowing whether a global used by a function exists at the time the > function is called. So you either issues warnings for all global usage, or > you issue none. You can make a few guesses based on what happens in the > global code body, but I don't think the guesses will really improve the > quality of warnings. I personally can't imagine that it would be an issue to treat globals in functions as anything other than a simple flat-rule: for type checking purposes, globals must be defined at compile time in the global namespace, that's just me, but I'd probably fire any of the python programmers that work for me if they did what you describe above with globals in a large project :) > > Examples? No, I don't really have any handy. Any example would be a short > code snippet and people would say, "yah. that's bad. it should fail." But > the issue is with larger bodies of code... that's what we're trying to > fix! So... No, I don't have a non-trivial example.
I can't even imagine one, so if there's any way to describe this global issue a little further without putting too much effort into it, I'd appreciate it. [...] > > The origination of this discussion was based on the recursive type issue. > If we have runtime objects, then I doubt we could support the recursive > type thing without some additional work. Or, as I'm suggesting, you do not > allow an undefined name (as specified by runtime/execution order) to be > used in a typedecl. you could even allow typedecl to import modules for the sake of gaining access to the names, where those imports would only occur when the optional type checking is turned on. I'd agree that the use of an undefined name should be disallowed. With the presence of type-check-only import, following the same no-mutually-recursive-imports rule of the regular import, but only importing typedecl statements, you could achieve all this at compile time. I've run into this issue on large projects, importing a classname, just to run assert isinstance(foo, thatclass), "complain meaningfully" But it hasn't come up with recursive types in any code I've seen, just deeply-complex types in terms of container and class hierarchy relationships. > > The design of how to handle recursive types depends on the decision to > include/exclude runtime objects that define function, class, or module > typedecl information. Even if we defer the runtime creation of those > objects, it will affect the design today. > indeed. [...] > > I do believe the information goes into the bytecode, but I don't think > that is the basis for needing to plan now. Instead, we have to define the > semantics of when/where those typedecl objects exist. Do we have them at > runtime? in the above, no, though we do have the ability to find a name anywhere at compile time. >Does a name have to exist (in terms of runtime execution) for it > to be used in a typedecl, or does it just have to exist *somewhere*? in the above, it has to exist in the typedecl 'execution' model, which is during compile time. >If > names must exist before usage, then how is the recursive type thing > handled? With unspecified typedecls? (like an unspecified struct) How about an iterative model which continues until all typedecl names are filled in? I understand your concern about 2 distinct namespace models being unsettling. It raises issues of what exactly we want out of static typing, and what sets of existing and future python code may benefit from static typing, and these are indeed big issues. For me, it is sufficient to proceed from the premiss that you can't have static typing work on code that redefines types at run time, and to limit runtime checking (for the time being) to optionally have the interpreter take some action (warn or abort) when that happens. That requirement alone implies that typedecl'd names and their typedecl bodies need to be available at run time, which is sufficient to support just about any future developments in a static-typeing interface in pure python. As an aside, I'm glad to learn it wouldn't be difficult to have python put static type information in it's byte code. That seems like a good place for it. As weird as it is to have a separate type-decl name model, it seems infintely to depict dynamic typing in a static typing model. 
scott From skaller@maxtal.com.au Sat Dec 25 21:33:41 1999 From: skaller@maxtal.com.au (skaller) Date: Sun, 26 Dec 1999 08:33:41 +1100 Subject: [Types-sig] Viper module compiler begun Message-ID: <38653835.11199C33@maxtal.com.au> I have started work on a function which takes an already imported module, and generates CPython 1.5.2 CAPI compatible C code. [i.e. the idea is that you can compile it as a replacement for the source python script] This function does not as yet use the type information which the function argument type declarations provide [recall I have implemented 'def f(x!typeexpr)'] However, it _does_ use the type information available from the module level objects (obviously, this is necessary to distinguish functions from other kinds of object :-) Note also, the model under investigation does not 'compile' from script source, instead, the script is executed by the interpreter first, to build the module dictionary, and then code is generate to make each of the objects found in the module dictionary. Because of this approach, a lot of the 'dynamism' involved in constructing a module is bypassed -- it is handled in the usual way by the interpreter. Clearly, this approach is limited. One of the problems is that any objects 'imported' from elsewhere will get duplicated in each such module (rather than being shared). I'd call this a 'leaf compiler' for this reason: it works best with modules that can be regarded as leaves (not needing to import from other modules). I'm investigating how to get around this (look at the source, to see what is imported, and actually import it, rather than building copies of the objects). I'll report back later on further progress. I note that while my code is written in ML, if the tool proves useful, in that it can compile code for some modules, and that code runs significantly faster than script, then a translation into Python should be possible. In particular, such a translator could be compiled with itself .. :-) -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From gstein@lyra.org Sat Dec 25 21:38:47 1999 From: gstein@lyra.org (Greg Stein) Date: Sat, 25 Dec 1999 13:38:47 -0800 (PST) Subject: [Types-sig] Viper module compiler begun In-Reply-To: <38653835.11199C33@maxtal.com.au> Message-ID: On Sun, 26 Dec 1999, skaller wrote: >... > Note also, the model under investigation does not > 'compile' from script source, instead, the script > is executed by the interpreter first, to build the module > dictionary, and then code is generate to make each of the > objects found in the module dictionary. > > Because of this approach, a lot of the 'dynamism' > involved in constructing a module is bypassed -- it is > handled in the usual way by the interpreter. Interesting approach. Dunno why, but for those that need the dynamism or want to avoid importing, the Python2C compiler can be used: http://www.mudlib.org/~rassilon/p2c/ And it exists today :-) Cheers, -g -- Greg Stein, http://www.lyra.org/ From skaller@maxtal.com.au Sun Dec 26 01:04:19 1999 From: skaller@maxtal.com.au (skaller) Date: Sun, 26 Dec 1999 12:04:19 +1100 Subject: [Types-sig] Viper module compiler begun References: Message-ID: <38656993.58E50914@maxtal.com.au> Greg Stein wrote: > > Because of this approach, a lot of the 'dynamism' > > involved in constructing a module is bypassed -- it is > > handled in the usual way by the interpreter. > > Interesting approach. 
> > Dunno why, but for those that need the dynamism or want to avoid > importing, the Python2C compiler can be used: > > http://www.mudlib.org/~rassilon/p2c/ Last time I tried that, it crashed unceremoniously. Has that been fixed? -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From gstein@lyra.org Sun Dec 26 09:45:59 1999 From: gstein@lyra.org (Greg Stein) Date: Sun, 26 Dec 1999 01:45:59 -0800 (PST) Subject: [Types-sig] Viper module compiler begun In-Reply-To: <38656993.58E50914@maxtal.com.au> Message-ID: On Sun, 26 Dec 1999, skaller wrote: > Greg Stein wrote: > > > Because of this approach, a lot of the 'dynamism' > > > involved in constructing a module is bypassed -- it is > > > handled in the usual way by the interpreter. > > > > Interesting approach. > > > > Dunno why, but for those that need the dynamism or want to avoid > > importing, the Python2C compiler can be used: > > > > http://www.mudlib.org/~rassilon/p2c/ > > Last time I tried that, it crashed unceremoniously. > Has that been fixed? Not much of a bug report. Get serious. How the heck should I know whether that particular bug has been fixed? "oh. it broke. fix it." *snort* As far as I know, P2C can successfully convert *any* module into a Python extension model. -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Sun Dec 26 12:13:41 1999 From: gstein@lyra.org (Greg Stein) Date: Sun, 26 Dec 1999 04:13:41 -0800 (PST) Subject: [Types-sig] merry christmas... here is a demo! Message-ID: This message is in MIME format. The first part should be readable text, while the remaining parts are likely unreadable without MIME-aware tools. Send mail to mime@docserver.cac.washington.edu for more info. --1658348780-1663354158-946210421=:412 Content-Type: TEXT/PLAIN; charset=US-ASCII Hi all, I banged together a rough prototype for a type checker. It provides some interesting errors/warnings, but totally ignores a bazillion others :-) But: it does provide a complete structure for handling that stuff. It understands a variety of types and composite types and whatnot. It analyzes a parse tree of a target module. It provides for looking up names in the builtin, global, and local namespaces; each name has an associated type. No declarations exist, but it does extract a bit of type information based on what is going on. It runs in "verbose" mode. In this mode, all assignments (or dels) and return statements are printed, along with the type that will be assigned or returned. It's fun to do something like: return (1, "ab", []) And watch it print: line 1: return tuple> I figured it would be nice to go ahead and dump a copy to the SIG. Merry Christmas! :-) [ and for those who don't celebrate christmas, figure this to be an early New Year's Gift... for those who celebrate Chinese New Years... consider this a *really* early gift :-) ... for those... hehe... just have fun! ] For fun: run "check.py" on itself. 
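A rough sketch of the kind of verbose-mode reporting described above, using the modern ast module as a stand-in for the parse-tree machinery; report_types and its output format are invented here for illustration and are not the actual check.py:

import ast

def report_types(source):
    """Rough imitation of a 'verbose mode' pass: print a guessed type for
    every assignment and return statement found in the given source."""

    def guess(node):
        # Only literals get a concrete guess; everything else is "Any".
        if isinstance(node, ast.Constant):
            return type(node.value).__name__
        if isinstance(node, ast.List):
            return "list"
        if isinstance(node, ast.Dict):
            return "dict"
        if isinstance(node, ast.Tuple):
            return "tuple<%s>" % ", ".join(guess(e) for e in node.elts)
        return "Any"

    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Assign):
            print("line %d: assign: %s" % (node.lineno, guess(node.value)))
        elif isinstance(node, ast.Return) and node.value is not None:
            print("line %d: return %s" % (node.lineno, guess(node.value)))

# Example, mirroring the sample statement in the demo announcement:
report_types('def f():\n    return (1, "ab", [])\n')
# prints: line 2: return tuple<int, str, list>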
Cheers, -g -- Greg Stein, http://www.lyra.org/ [Attachment "typesys.tar.gz" (base64-encoded tarball) omitted] From skaller@maxtal.com.au Sun Dec 26 15:48:52 1999 From: skaller@maxtal.com.au (skaller) Date: Mon, 27 Dec 1999 02:48:52 +1100 Subject: [Types-sig] merry christmas... here is a demo! References: Message-ID: <386638E4.541677F7@maxtal.com.au> Greg Stein wrote: > > Hi all, > > I banged together a rough prototype for a type checker. It provides some > interesting errors/warnings, but totally ignores a bazillion others :-) Here's the output of check.py, run on itself (only the last few lines .. :-) line 800: WARNING: (module<...>).LPAR may cause an AttributeError line 802: WARNING: (module<...>).LSQB may cause an AttributeError line 803: WARNING: (Any)._check_assign_subscriptlist may cause an AttributeError line 803: return Any line 805: WARNING: (module<...>).DOT may cause an AttributeError line 810: WARNING: (module<...>).LValueAttr may cause an AttributeError line 810: return Any line 817: WARNING: (Any)._check_node may cause an AttributeError line 817: assign: sub_td = Any line 819: WARNING: (Any).type may cause an AttributeError line 819: WARNING: (module<...>).SliceType may cause an AttributeError line 821: WARNING: (module<...>).LValueSlice may cause an AttributeError line 821: return Any line 824: WARNING: (module<...>).LValueIndex may cause an AttributeError line 824: return Any line 831: WARNING: (Any)._check_node may cause an AttributeError line 835: WARNING: (Any)._check_node may cause an AttributeError line 835: return list<*(Any)> line 837: assign: valueTDs = list<*(Any)> line 838: for: i = Any line 840: return list<*(Any)> line 844: WARNING: (module<...>).testlist may cause an AttributeError line 846: assign: tds = list<*(Any)> line 847: for: arg = Any line 848: WARNING: (module<...>).COMMA may cause an AttributeError line 849: WARNING: (module<...>).test may cause an AttributeError line 855: WARNING: (module<...>).TDVarLenList may cause an AttributeError line 855: return Any line 859: WARNING: (module<...>).dictmaker may cause an AttributeError line 861: assign: key_tds = list<*(Any)> line 862: assign: value_tds = list<*(Any)> line 863: for: i = Any line 867: WARNING: (module<...>).TDDictionary may cause an AttributeError line 867: return Any line 873: assign: n = Any line 874: WARNING: (module<...>).LPAR may cause an AttributeError line 875: WARNING: (Any)._check_function_call may cause an AttributeError line 875: return Any line 876: WARNING: (module<...>).LSQB may cause an AttributeError line 877: WARNING: (Any)._check_node may cause an AttributeError line 877: assign: sub_td = Any line 879: return Any line 881: WARNING: (module<...>).DOT may cause an AttributeError line 882: assign: name = Any line 883: WARNING: (Any).hasattr may cause an AttributeError line 883: assign: has = Any line 884: WARNING: (module<...>).NO may cause an AttributeError line 887: WARNING: (module<...>).Any may cause an AttributeError line 887: assign: td = Any line 888: WARNING: (module<...>).MAYBE may cause an AttributeError line 891: WARNING: (module<...>).Any may cause an AttributeError line 891: assign: td = Any line 893: return Any line 897: return Any line 903: assign: (Any).nodeargs = Any line 904: assign: (Any).td = Any -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From skaller@maxtal.com.au Sun Dec 26
17:50:22 1999 From: skaller@maxtal.com.au (skaller) Date: Mon, 27 Dec 1999 04:50:22 +1100 Subject: [Types-sig] Viper module compiler begun References: Message-ID: <3866555E.B0C44B59@maxtal.com.au> Greg Stein wrote: > > > http://www.mudlib.org/~rassilon/p2c/ > > > > Last time I tried that, it crashed unceremoniously. > > Has that been fixed? > > Not much of a bug report. Get serious. How the heck should I know whether > that particular bug has been fixed? It wasn't a bug report. It was a comment: when I last tried it, it failed so catastrophically that I just junked it. > As far as I know, P2C can successfully convert *any* module into a Python > extension model. I'll try it again, since you seem to believe the current version actually works, and the one I downloaded, was probably an early release (I was pretty eager!). -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From skaller@maxtal.com.au Sun Dec 26 18:01:35 1999 From: skaller@maxtal.com.au (skaller) Date: Mon, 27 Dec 1999 05:01:35 +1100 Subject: [Types-sig] Viper module compiler begun References: Message-ID: <386657FF.86736C80@maxtal.com.au> Greg Stein wrote: > > Last time I tried that, it crashed unceremoniously. > > Has that been fixed? > > Not much of a bug report. Get serious. How the heck should I know whether > that particular bug has been fixed? > > "oh. it broke. fix it." *snort* > > As far as I know, P2C can successfully convert *any* module into a Python > extension model. Here's what I get with the latest version: Did I do something wrong? [root@ruby] ~/py2c>python gencode.py gencode.py __gencode.c _gencode.py Traceback (innermost last): File "gencode.py", line 35, in ? genc.Generator(args[0], args[1], args[2]) File "genc.py", line 91, in __init__ tree = t.parsefile(input) File "transformer.py", line 176, in parsefile return self.parsesuite(file.read()) File "transformer.py", line 166, in parsesuite return self.transform(parser.suite(text)) parser.ParserError: Could not parse string. -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From paul@prescod.net Mon Dec 27 03:45:12 1999 From: paul@prescod.net (Paul Prescod) Date: Sun, 26 Dec 1999 22:45:12 -0500 Subject: [Types-sig] PyDL RFC 0.01 Message-ID: <3866E0C8.C1867223@prescod.net> I've been off-list for a few days so if this RFC doesn't include the last few days' feedback, I apologize in advance. PyDL RFC 0.01 ============= A PyDL file declares the interface for a Python module. PyDL files declare interfaces, objects and the required interfaces of objects. At some point in the future, PyDL files will likely be generated from source code using a combination of declarations within Python code and some sorts of interface deduction and inferencing based on the contents of those files. For version 1, however, PyDL files are separate although they do have some implications for the Python runtime. This document describes the behavior of a class of software modules called "static interface interpreters" and "static interface checkers". Interface interpreters are run as part of the regular Python module interpretation process. They read PyDL files and make the type objects available to the Python interpreter. Interface checkers read interfaces and Python code to verify conformance of the code to the interface.
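As a purely illustrative sketch of the interpreter side, the association between a module and its PyDL file (same base name, ".pydl" or ".gpydl" extension, as spelled out under Behavior below) could be located like this; find_pydl and read_declarations are hypothetical names, not part of the proposal:

import os

def find_pydl(module_path):
    # Look next to ".../foo.py" for a ".../foo.pydl" (hand-written) or
    # ".../foo.gpydl" (generated) file with the same base name.
    base = os.path.splitext(module_path)[0]
    found = []
    for suffix in (".pydl", ".gpydl"):
        candidate = base + suffix
        if os.path.exists(candidate):
            found.append(candidate)
    return found

def read_declarations(pydl_path):
    # Stand-in for the real interface interpreter: just return the raw,
    # non-blank declaration lines so a checker could look at them.
    with open(pydl_path) as f:
        return [line.strip() for line in f if line.strip()]

if __name__ == "__main__":
    for path in find_pydl("spam.py"):   # "spam.py" is a made-up module name
        for decl in read_declarations(path):
            print(decl)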
Concepts: ========= An interface is a Python object with the following attributes: __conforms__ : def (obj: Any ) -> boolean __class_conforms__ : def (obj: Class ) -> boolean (the rest of the interface reflection API will be worked out later) Interfaces can be created through interface definitions and typedefs. There may also be facilities for creating interfaces at runtime but they are neither available nor relevant to the interface interpreter. Interface definitions are similar to Python class definitions. They use the keyword "interface" instead of the keyword "class". Sometimes an interface can be specialized for working with specific types. For instance a list could be specialized for working with integers. We call this "parameterization". A type with unresolved parameter variables is said to be "parameterizable". A type with some resolved parameter variables is said to be "partially resolved." A type with all parameter variables resolved is said to be "fully resolved." Typedefs allow us to give names to partially or fully resolved instantiations of interfaces. In addition to defining interfaces, it is possible to declare other attributes of the module. Each declaration associates an interface with the name of the attribute. Values associated with the name in the module namespace must never violate the declaration. Furthermore, by the time the module has been imported each name must have an associated value. Behavior: ========= The Python interpreter invokes the static interface interpreter and optionally the interface checker on a Python file and its associated PyDL file. Typically a PyDL file is associated with a Python file through placement in the same path with the same base name and a ".pydl" or ".gpydl" extension. "Non-standard" importer modules may find PyDL files using other mechanisms such as through a look-up in an relational database. The interface interpreter reads the interface file and builds the relevant type objects. If the interface file refers to other modules then the interface interpreter can read the interface files associated with those other modules. The interface interpreter maintains its own module dictionary so that it does not import the same module twice. The Python interpreter can optionally invoke the interface checker after the interface interpreter has built type objects and before it interprets the Python module. Once it interprets the Python code, the type objects are available to the runtime code through a special namespace called the "interface namespace". This namespace is interposed in the name search order between the module's namespace and the built-in namespace. Type expression language: ========================= Type expressions are used to declare the types of variables and to make new types. In a type expression you may: 1. refer to a "dotted name" (local name or name in an imported module) 2. make a union of two or more types: integer or float or complex 3. parametrize a type: Array( Integer, 50 ) Note that the arguments can be either types or simple Python expressions. A "simple" Python expression is an expression that does not involve a function call. 4. use a syntactic shortcut: [Foo] => Sequence( Foo ) # sequence of Foo's {a:b} => Mapping( a, b ) # Mapping from a's to b's (a,b,c) => Record( a, b, c ) # 3-element sequence of type a, followed by b followed by c 5. Declare un-modifability: const [const Array( Integer )] Declarations in a PyDL file: ============================ (formal grammar to follow) 1. 
Imports An import statement in an interface file loads another interface file. 2. Basic attribute type declarations: decl myint as Integer # basic decl intarr as Array( Integer, 50 ) # parameterized decl intarr2 as Array( size = 40, elements = Integer ) # using keyword syntax Attribute declarations are not parameteriable. Furthermore, they must resolve to fully parameterized (not parameterizable!) types. 3. Callable object type declarations: Functions are the most common sort of callable object but class instances can also be callable. They may be runtime parameterized and/or type parameterized. For instance, there might be a method "add" that takes two numbers of the same type and returns a number of that type. decl Add(_X: Number) as def( a: const _X, b: const _X )-> _X 4. Class Declarations A class is a callable object that can be subclassed. Currently the only way to make those (short of magic) is with a class declaration, but one could imagine that there might someday be an __subclass__ magic method that would allow any old object instance to also stand in as a class. decl TreeNode(_X: Number) as class( a: _X, Right: TreeNode( _X ) or None, Left: TreeNode( _X ) or None ) -> ParentClasses, Interfaces 5. Interface declarations: interface (_x,_y) foo(a, b ): decl shared somemember as _x decl someOtherMember as _y decl shared const someClassAttr as List( _x ) decl shared const someFunction as def( a: Integer, b: Float ) -> String 6. Typedefs: Typedefs allow interfaces to be renamed and for parameterized variations of interfaces to be given names. typedef PositiveInteger as BoundedInt( 0, maxint ) typedef NullableInteger as Integer or None typedef Dictionary(_Y) as {String:_Y} The Undefined Object: ===================== The undefined object is used as the value of unassigned attributes and the return value of functions that do not return a value. It may not be bound to a name. a = Undefined # raises UndefinedValueError a = b # raises UndefinedValueError if b has not been assigned Undefined CAN be compared. if a==Undefined: blah blah blah New Runtime Function: ===================== conforms( x: Any, y: Interface ) -> Any or Undefined This function can be used in various ways. Here it is used as an assertion: j = conforms( j, Integer ) which is equivalent to: if isinstance( j, Integer ): raise UndefinedValueError Here it is test: if conforms( j, Integer )!=Undefined: anint = conforms( j, Integer ) which is equivalent to the very similar isinstance based code: if isinstance( j, Integer ): anint = j Experimental syntax: ==================== There is a backwards compatible syntax for embedding declarations in a Python 1.5x file: "decl","myint as Integer" "typedef","PositiveInteger as BoundedInt( 0, maxint )" There will be a tool that extracts these declarations from a Python file to generate a .gpydl (for "generated PyDL") file. These files are used alongside hand-crafted PyDL files. The "effective interface" of the file is evaluated by combining the declarations from the same file as if they were concatenated together (more or less...exact details to follow). The two files must not contradict each other, just as declarations within a single file must not contradict each other. Over time the generation of the .gpydl file may be more intelligent and may deduce type information based on code outside of explicit declarations (for instance function and class definitions, assignment statements and so forth). 
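A minimal sketch of the extraction tool mentioned above, assuming the embedded declarations appear one per line exactly in the quoted string-pair form; extract_decls and write_gpydl are made-up names, and a real tool would use the parser rather than a regular expression:

import re

# Matches expression statements of the experimental embedded form:
#   "decl","myint as Integer"
#   "typedef","PositiveInteger as BoundedInt( 0, maxint )"
_DECL_RE = re.compile(r'^\s*"(decl|typedef)"\s*,\s*"(.*)"\s*$')

def extract_decls(python_source):
    """Collect embedded declarations from a module's source text."""
    decls = []
    for line in python_source.splitlines():
        match = _DECL_RE.match(line)
        if match:
            keyword, body = match.groups()
            decls.append("%s %s" % (keyword, body))
    return decls

def write_gpydl(python_path, gpydl_path):
    with open(python_path) as src, open(gpydl_path, "w") as out:
        for decl in extract_decls(src.read()):
            out.write(decl + "\n")

# Example:
print(extract_decls('"decl","myint as Integer"\nx = 1\n'))
# -> ['decl myint as Integer']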
Runtime Implications: ===================== All of the named types defined in a PyDL file are available in the "types" dictionary that is searched between the module dictionary and the built-in dictionary. The runtime should not allow an assignment or function call to violate the declarations in the PyDL file. In an "optimized speed mode" those checks would be disabled. From scott@chronis.pobox.com Mon Dec 27 05:15:24 1999 From: scott@chronis.pobox.com (scott) Date: Mon, 27 Dec 1999 00:15:24 -0500 Subject: [Types-sig] PyDL RFC 0.01 In-Reply-To: <3866E0C8.C1867223@prescod.net> References: <3866E0C8.C1867223@prescod.net> Message-ID: <19991227001524.A39501@chronis.pobox.com> On Sun, Dec 26, 1999 at 10:45:12PM -0500, Paul Prescod wrote: > I've been off-list for a few days so if this RFC doesn't include the > last few day's feedback, I apologize in advance. Very grateful for you providing some more centralized direction with this RFC. Some questions follow, mostly intended to make sure I'm on the same page as the intent of the RFC. > > PyDL RFC 0.01 > ============= [...] > > An interface is a Python object with the following attributes: > > __conforms__ : def (obj: Any ) -> boolean > __class_conforms__ : def (obj: Class ) -> boolean What is the rational behind separating __conforms__ and __class_conforms__? It seems to me like __conforms__ could do everything __class_conforms__ is supposed to. Am I missing something? [...] [...] > Type expression language: > ========================= > > Type expressions are used to declare the types of variables and to > make new types. In a type expression you may: > > 1. refer to a "dotted name" (local name or name in an imported module) > > 2. make a union of two or more types: > > integer or float or complex > > 3. parametrize a type: > > Array( Integer, 50 ) By `50', do you intend length of 50? > > Note that the arguments can be either types or simple Python > expressions. A "simple" Python expression is an expression that does > not involve a function call. > > 4. use a syntactic shortcut: > > [Foo] => Sequence( Foo ) # sequence of Foo's > {a:b} => Mapping( a, b ) # Mapping from a's to b's > (a,b,c) => Record( a, b, c ) # 3-element sequence of type a, followed by > b > followed by c > > 5. Declare un-modifability: > > const [const Array( Integer )] By nesting const declarations, do you intend that checks against modifiability at runtime are shallow? For example, if I declare array A as a constant array of Foo instances, are those foo instances (or their attributes) modifiable? > > Declarations in a PyDL file: > ============================ > > (formal grammar to follow) > > 1. Imports > > An import statement in an interface file loads another interface file. Do you envision the import statement to be similar in syntax to regular python? (eg from m import v; from m2 import *) > > 2. Basic attribute type declarations: > > decl myint as Integer # basic > decl intarr as Array( Integer, 50 ) # parameterized > decl intarr2 as Array( size = 40, elements = Integer ) > # using keyword syntax > > Attribute declarations are not parameteriable. Furthermore, they must > resolve to fully parameterized (not parameterizable!) types. what do you mean by 'attribute declarations'? I'd hate to see classes that couldn't have attributes that are parameterizable, but agree that resolving parameters needs to end somewhere. [...] > 5. 
Interface declarations: > > interface (_x,_y) foo(a, b ): > decl shared somemember as _x > decl someOtherMember as _y > decl shared const someClassAttr as List( _x ) > > decl shared const someFunction as def( a: Integer, b: Float ) -> > String what do you mean by 'shared' in the above? Are you referring to the distinction between class attributes and instance attributes? [...] > The Undefined Object: > ===================== > > The undefined object is used as the value of unassigned attributes and > the return value of functions that do not return a value. It may not > be bound to a name. By functions that do not return a value, do you mean functions that return None, or that may return None? > > a = Undefined # raises UndefinedValueError > a = b # raises UndefinedValueError if b has not been assigned by 'b has not been assigned', do you mean assigned a type, or is this in your view a replacement for NameError? I'm a little unclear where your going with Undefined. > > Undefined CAN be compared. > > if a==Undefined: > blah > blah > blah > scott From paul@prescod.net Mon Dec 27 11:01:05 1999 From: paul@prescod.net (Paul Prescod) Date: Mon, 27 Dec 1999 06:01:05 -0500 Subject: [Types-sig] PyDL RFC 0.01 References: <3866E0C8.C1867223@prescod.net> <19991227001524.A39501@chronis.pobox.com> Message-ID: <386746F1.64CB72B5@prescod.net> Every one of your questions addresses an issue of ambiguity in the spec. Thanks! I'll quote you pieces of the NEW and IMPROVED spec that your comments generated. scott wrote: > > > An interface is a Python object with the following attributes: > > > > __conforms__ : def (obj: Any ) -> boolean > > __class_conforms__ : def (obj: Class ) -> boolean > > What is the rational behind separating __conforms__ and > __class_conforms__? It seems to me like __conforms__ could do > everything __class_conforms__ is supposed to. Am I missing something? Every interface object (remember, interfaces are just Python objects!) has the following method : __conforms__ : def (obj: Any ) -> boolean This method can be used at runtime to determine whether an object conforms to the interface. It would check the signature for sure but might also check the actual values of particular attributes. There is also a global function with this signature: class_conforms : def ( obj: Class, Obj: Interface ) -> boolean This function can be used either at compile time (e.g. by an implementation of an interface checker) or runtime to check that a class will generate objects that have the right signature to conform to the interface. > > Array( Integer, 50 ) > By `50', do you intend length of 50? 3. parameterize a type: Array( Integer, 50 ) Array( length=50, elements=Integer ) > > const [const Array( Integer )] > > By nesting const declarations, do you intend that checks against > modifiability at runtime are shallow? For example, if I declare > array A as a constant array of Foo instances, are those foo instances > (or their attributes) modifiable? Right. That's my feeling right now but I could probably be convinced otherwise. > Do you envision the import statement to be similar in syntax to > regular python? (eg from m import v; from m2 import *) An import statement in an interface file loads another interface file. The import statement works just like Python's except that it loads the PyDL file found with the referenced module, not the module itself. (of course we will make this definition more formal in the future) > what do you mean by 'attribute declarations'? 
I'd hate to see > classes that couldn't have attributes that are parameterizable, but > agree that resolving parameters needs to end somewhere. I'm talking both about module, interface and class attributes. I think that it is sufficient that a class' attributes can be parameterized and can use class parameters. They don't need to be independently parameterizable. So this is allowed: class (_X,_Y) spam( A, B ): decl someInstanceMember as _X decl someOtherMember as Array( _X, 50 ) .... These are NOT allowed: decl someModuleMember(_X) as Array( _X, 50 ) class (_Y) spam( A, B ): decl someInstanceMember(_X) as Array( _X, 50 ) Because that would allow you to create a "spam" without getting around to saying what _X is for that spam's someInstanceMember. That strikes me as overly dynamic for a static type-check system (at least for version 1). > what do you mean by 'shared' in the above? Are you referring to the > distinction between class attributes and instance attributes? Yes. But in retrospect the concept may not jibe very well with the idea that there can be many classes that implement a particular interface. There is no way to share state between all of these objects. I think I'll take that out. Here's the new section on Undefined: The Undefined Object: ===================== The undefined object is used as the value of unassigned attributes and the return value of functions that do not return a value. It may not be bound to a name. a = Undefined # raises UndefinedValueError a = b # raises UndefinedValueError if b has not been assigned Undefined can be thought of as a subtype of NameError. Undefined is needed because it is now possible to declare names at compile time but never get around to assigning to them. In ordinary Python this is not possible. The only useful thing you can do with Undefined is check whether an object "is" Undefined: if a is Undefined: doSomethingWithA(a) else: doSomethingElse() This is equivalent to: try: doSomethingWithA( a ) except NameError: doSomethingElse It is debatable whether we still need NameError for anything other than backwards compatibility. We could say that any referenced variable is automatically initialized to "undefined". Undefined is sufficiently restrictive that this will not lead to buggy programs. Undefined also corrects a long-term unsafe issue with functions. Now, functions that do not explicitly return a value return Undefined instead of None. That means that this is no longer possible a = list.sort() With Undefined, it will blow up because it is not possible to assign the Undefined value. Before Undefined, the code did not blow up but it also did not do the "right thing." It assigned None to "a" which was seldom what was intended. Paul Prescod From paul@prescod.net Mon Dec 27 11:02:04 1999 From: paul@prescod.net (Paul Prescod) Date: Mon, 27 Dec 1999 06:02:04 -0500 Subject: [Types-sig] PyDL RFC 0.02 Message-ID: <3867472C.7ECBAC55@prescod.net> PyDL RFC 0.02 A PyDL file declares the interface for a Python module. PyDL files declare interfaces, objects and the required interfaces of objects. At some point in the future, PyDL files will likely be generated from source code using a combination of declarations within Python code and some sorts of interface deduction and inferencing based on the contents of those files. For version 1, however, PyDL files are separate although they do have some implications for the Python runtime. 
This document describes the behavior of a class of software modules called "static interface interpreters" and "static interface checkers". Interface interpreters are run as part of the regular Python module interpretation process. They read PyDL files and make the interface objects available to the Python interpreter. Interface checkers read PyDL files and Python code to verify conformance of the code to the interface. Interfaces: =========== Interfaces are the central concept in a PyDL file. Interfaces are Python objects like anything else but they are created by the interface interpreter and available to the static interface checker before Python interpretation begins. The PyDL file itself generates an interface object that describes the attributes of the module. It may also contain interface definitions for class instances and other objects. These other interfaces can be created through interface definitions and typedefs. There may also be facilities for creating interfaces at runtime but they are neither available to nor relevant to the interface interpreter. Interface definitions are similar to Python class definitions. They use the keyword "interface" instead of the keyword "class". Sometimes an interface can be specialized for working with specific other interfaces. For instance a list could be specialized for working with integers. We call this "parameterization". An interface with unresolved parameter variables is said to be "parameterizable". A type with some resolved parameter variables is said to be "partially resolved." A type with all parameter variables resolved is said to be "fully resolved." Typedefs allow us to give names to partially or fully resolved instantiations of interfaces. In addition to defining interfaces, it is possible to declare other attributes of the module. Each declaration associates an interface with the name of the attribute. Values associated with the name in the module namespace must never violate the declaration. Furthermore, by the time the module has been imported each name must have an associated value. Behavior: ========= The Python interpreter invokes the static interface interpreter and optionally the interface checker on a Python file and its associated PyDL file. Typically a PyDL file is associated with a Python file through placement in the same path with the same base name and a ".pydl" or ".gpydl" extension. If both are available, the module's interface is created by combining the declarations in the ".pydl" and ".gpydl" files. "Non-standard" importer modules may find PyDL files using other mechanisms such as through a look-up in a relational database, just as they find modules themselves using non-standard mechanisms. The interface interpreter reads the PyDL file and builds the relevant interface objects. If the PyDL file refers to other modules then the interface interpreter can read the PyDL files associated with those other modules. The interface interpreter maintains its own module dictionary so that it does not import the same module twice. The Python interpreter can optionally invoke the interface checker after the interface interpreter has built interface objects and before it interprets the Python module. Once it interprets the Python code, the interface objects are available to the runtime code through a special namespace called the "interface namespace". This namespace is interposed in the name search order between the module's namespace and the built-in namespace.
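A crude way to picture the interposed lookup order is a chained-namespace resolver; the lookup_name function and the dictionaries below are illustrative only and say nothing about how the interpreter would actually implement the interface namespace:

import builtins

def lookup_name(name, module_ns, interface_ns):
    # Proposed search order: module namespace first, then the interface
    # namespace built from the PyDL file, then the built-in namespace.
    for ns in (module_ns, interface_ns, vars(builtins)):
        if name in ns:
            return ns[name]
    raise NameError(name)

# Tiny demonstration with made-up contents:
interface_ns = {"Integer": int}      # stand-in for a real interface object
module_ns = {"x": 42}
print(lookup_name("x", module_ns, interface_ns))        # found in the module
print(lookup_name("Integer", module_ns, interface_ns))  # found in the interface namespace
print(lookup_name("len", module_ns, interface_ns))      # falls through to built-ins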
Interface expression language: ============================== Interface expressions are used to declare that attributes must conform to certain interfaces. In a interface expression you may: 1. refer to a "dotted name" (local name or name in the PyDL of an imported module ). 2. make a union of two or more interfaces: integer or float or complex 3. parameterize a interface: Array( Integer, 50 ) Array( length=50, elements=Integer ) Note that the arguments can be either interfaces or simple Python expressions. A "simple" Python expression is an expression that does not involve a function call. 4. use a syntactic shortcut: [Foo] => Sequence( Foo ) # sequence of Foo's {A:B} => Mapping( A, B ) # Mapping from A's to B's (A,B,C) => Record( A, B, C ) # 3-element sequence of interface a, followed # by b followed by c 5. Declare un-modifiability: const [const Array( Integer )] (the semantics of un-modifiability need to be worked out) Declarations in a PyDL file: ============================ (formal grammar to follow) 1. Imports An import statement in an interface file loads another interface file. The import statement works just like Python's except that it loads the PyDL file found with the referenced module, not the module itself. (of course we will make this definition more formal in the future) 2. Basic attribute interface declarations: decl myint as Integer # basic decl intarr as Array( Integer, 50 ) # parameterized decl intarr2 as Array( size = 40, elements = Integer ) # using keyword syntax Attribute declarations are not parameteriable. Furthermore, they must resolve to fully parameterized (not parameterizable!) interfaces. So this is allowed: class (_X,_Y) spam( A, B ): decl someInstanceMember as _X decl someOtherMember as Array( _X, 50 ) .... These are NOT allowed: decl someModuleMember(_X) as Array( _X, 50 ) class (_Y) spam( A, B ): decl someInstanceMember(_X) as Array( _X, 50 ) Because that would allow you to create a "spam" without getting around to saying what _X is for that spam's someInstanceMember. That strikes me as overly dynamic for a static type-check system (at least for version 1). 3. Callable object interface declarations: Functions are the most common sort of callable object but class instances can also be callable. Callables may be runtime parameterized and/or interface parameterized. For instance, there might be a method "add" that takes two numbers of the same interface and returns a number of that interface. decl Add(_X: Number) as def( a: const _X, b: const _X )-> _X _X is the interface parameter. a and b are the runtime parameters. 4. Class Declarations A class is a callable object that can be subclassed. Currently the only way to make those (short of magic) is with a class declaration, but one could imagine that there might someday be an __subclass__ magic method that would allow any old object instance to also stand in as a class. Here is the syntax for a class definition: decl TreeNode(_X: Number) as class( a: _X, Right: TreeNode( _X ) or None, Left: TreeNode( _X ) or None ) -> ParentClasses, Interfaces What we are really defining is the constructor. The signature of the created object can be described in an interface declaration. 5. Interface declarations: interface (_X,_Y) spam( a, b ): decl somemember as _X decl someOtherMember as _Y decl const someClassAttr as [ _X ] decl const someFunction as def( a: Integer, b: Float ) -> String 6. Typedefs: Typedefs allow interfaces to be renamed and for parameterized variations of interfaces to be given names. 
typedef PositiveInteger as BoundedInt( 0, maxint ) typedef NegativeInteger as BoundedInt( max=-1, min=minint ) typedef NullableInteger as Integer or None typedef Dictionary(_Y) as {String:_Y} The Undefined Object: ===================== The Undefined object is used as the value of unassigned attributes and the return value of functions that do not return a value. It may not be bound to a name. a = Undefined # raises UndefinedValueError a = b # raises UndefinedValueError if b has not been assigned Undefined can be thought of as a subtype of NameError. Undefined is needed because it is now possible to declare names at compile time but never get around to assigning to them. In ordinary Python this is not possible. The only useful thing you can do with Undefined is check whether an object "is" Undefined: if a is Undefined: doSomethingWithA(a) else: doSomethingElse() This is equivalent to: try: doSomethingWithA( a ) except NameError: doSomethingElse() It is debatable whether we still need NameError for anything other than backwards compatibility. We could say that any referenced variable is automatically initialized to "Undefined". Undefined is sufficiently restrictive that this will not lead to buggy programs. Undefined also corrects a long-term unsafe issue with functions. Now, functions that do not explicitly return a value return Undefined instead of None. That means that this is no longer possible: a = list.sort() With Undefined, it will blow up because it is not possible to assign the Undefined value. Before Undefined, the code did not blow up but it also did not do the "right thing." It assigned None to "a" which was seldom what was intended. New Runtime Functions: ====================== conforms( x: Any, y: Interface ) -> Any or Undefined This function can be used in various ways. The most basic way to use it is as a test: if conforms( j, Integer ) is not Undefined: anint = conforms( j, Integer ) Because of the behavior of Undefined, it can also be used as an assertion: j = conforms( j, Integer ) which is equivalent to: if not isinstance( j, Integer ): raise UndefinedValueError Every interface object (remember, interfaces are just Python objects!) has the following method: __conforms__ : def (obj: Any ) -> boolean This method can be used at runtime to determine whether an object conforms to the interface. It would check the signature for sure but might also check the actual values of particular attributes. There is also a global function with this signature: class_conforms : def ( obj: Class, Obj: Interface ) -> boolean This function can be used either at compile time (e.g. by an implementation of an interface checker) or runtime to check that a class will generate objects that have the right signature to conform to the interface. (the rest of the interface reflection API will be worked out later) Experimental syntax: ==================== There is a backwards compatible syntax for embedding declarations in a Python 1.5x file: "decl","myint as Integer" "typedef","PositiveInteger as BoundedInt( 0, maxint )" There will be a tool that extracts these declarations from a Python file to generate a .gpydl (for "generated PyDL") file. These files are used alongside hand-crafted PyDL files. The "effective interface" of the file is evaluated by combining the declarations from the same file as if they were concatenated together (more or less...exact details to follow). The two files must not contradict each other, just as declarations within a single file must not contradict each other.
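To make the Undefined and conforms() behavior concrete, here is a small non-normative sketch in plain Python; the Interface class, conforms helper and bind function are stand-ins for the proposal, and ordinary Python cannot actually forbid binding Undefined to a name, so that rule is only simulated:

class _UndefinedType:
    def __repr__(self):
        return "Undefined"

Undefined = _UndefinedType()      # sentinel returned by conforms() on failure

class UndefinedValueError(Exception):
    pass

class Interface:
    """Toy interface object: conformance is a bare attribute check."""
    def __init__(self, required_attrs):
        self.required_attrs = required_attrs

    def __conforms__(self, obj):
        return all(hasattr(obj, name) for name in self.required_attrs)

def conforms(x, interface):
    # Return x itself when it conforms to the interface, Undefined otherwise.
    return x if interface.__conforms__(x) else Undefined

def bind(value):
    # Stand-in for the "may not be bound to a name" rule; the proposal would
    # have the interpreter itself raise on any assignment of Undefined.
    if value is Undefined:
        raise UndefinedValueError("cannot assign the Undefined value")
    return value

# Usage, mirroring the RFC's test and assertion idioms:
Readable = Interface(["read"])    # made-up interface
import sys
if conforms(sys.stdin, Readable) is not Undefined:
    source = bind(conforms(sys.stdin, Readable))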
Over time the .gpydl generator will get more intelligent and may deduce type information based on code outside of explicit declarations (for instance function and class definitions, assignment statements and so forth). Summary of Major Runtime Implications: ===================== All of the named interfaces defined in a PyDL file are available in the "interfaces" dictionary that is searched between the module dictionary and the built-in dictionary. The runtime should not allow an assignment or function call to violate the declarations in the PyDL file. In an "optimized speed mode" those checks would be disabled. Several new object interfaces and functions are needed. The new "Undefined" object is needed and assignments need to check for "Undefined". From paul@prescod.net Mon Dec 27 11:54:10 1999 From: paul@prescod.net (Paul Prescod) Date: Mon, 27 Dec 1999 06:54:10 -0500 Subject: [Types-sig] recursive types, type safety, and flow analysis References: <38615A5B.D2E2B608@prescod.net> <38616974.AFC365D6@maxtal.com.au> <199912231347.IAA21844@eric.cnri.reston.va.us> Message-ID: <38675362.AF29942E@prescod.net> Guido van Rossum wrote: > > Paul's issue was that in ML the error message is typically > ununderstandable. I have never used ML (it's a language for people > with excess IQ points) My impression was that the ONLY thing that makes ML tricky was the type system. And the type system was built around the idea of type inferencing. > but I don't think that is the right level of > critique. In any case I think we can do better by simply referring to > the line number(s) where a.a gets assigned a non-int value. Good > error messages are a human factors issue, not a type system issue. It's a little bit more subtle than that. The problem is that we are generating anonymous types left right and center: if a: def foo(): # anon type foo1 def bar(self): # anon type foo1->String if self.something: return "Abc" else: return None else: def foo2(): # anon type foo2 def bar(self): # anon type foo2->String if self.something: return 123 else: return 45L k = [foo, foo] # anon type [foo1_class or foo2_class] j = [] for i in k: j.append(i()) #oops. how do we handle this? #okay, another try: k=[foo().bar(), foo().bar()] # anon type [String or None or Integer or Long ] mvar: String or Integer or Long myvar = k Now you have to back-track a LOT of code. Merely reporting: "bar can return None on line 13" is not very helpful because I have to trace a path from bar to where I am which is harder when this code is embedded in other code. Anyhow, I won't say (anymore) that this sort of deduction is unequivocally a bad idea. If we generate these PyDL files then you will have a useful debugging tool which may clear up a lot of these problems. It comforts me to know that if the inferencing goes horibbly wrong you can look at the PyDL file to figure out what the inferencer was thinking and override it. Paul Prescod From gstein@lyra.org Mon Dec 27 12:43:38 1999 From: gstein@lyra.org (Greg Stein) Date: Mon, 27 Dec 1999 04:43:38 -0800 (PST) Subject: [Types-sig] type inference/deduction (was: recursive types, type safety, and flow analysis) In-Reply-To: <38675362.AF29942E@prescod.net> Message-ID: On Mon, 27 Dec 1999, Paul Prescod wrote: > Guido van Rossum wrote: >... > > but I don't think that is the right level of > > critique. In any case I think we can do better by simply referring to > > the line number(s) where a.a gets assigned a non-int value. Good > > error messages are a human factors issue, not a type system issue. 
> > It's a little bit more subtle than that. The problem is that we are > generating anonymous types left right and center: > > if a: > def foo(): # anon type foo1 This would be def()->Any. And for discussion, we'll call this type "foo1" Oh. Wait. I just saw. You actually meant: class foo: And yah: we'll refer to this version as "foo1" > def bar(self): # anon type foo1->String Huh? This would be: def(foo1)->Any > if self.something: > return "Abc" > else: > return None > else: > def foo2(): # anon type foo2 We'll assume: class foo And we'll refer to this version as "foo2" > def bar(self): # anon type foo2->String > if self.something: > return 123 > else: > return 45L > > k = [foo, foo] # anon type [foo1_class or foo2_class] > > j = [] This would be: [Any] > for i in k: > j.append(i()) #oops. how do we handle this? No problem. j can take any element. > #okay, another try: > > k=[foo().bar(), foo().bar()] > # anon type [String or None or Integer or Long ] > > mvar: String or Integer or Long myvar: ... > myvar = k myvar = k[0] (??) Assuming assignment enforcement, you would get an error here. Nowhere else. It would give some error message about k having a type which is incompatible with myvar. > Now you have to back-track a LOT of code. Merely reporting: "bar can > return None on line 13" is not very helpful because I have to trace a > path from bar to where I am which is harder when this code is embedded > in other code. I do not believe that we would issue an error message like that. The message would be about a type conflict at the assignment. > Anyhow, I won't say (anymore) that this sort of deduction is > unequivocally a bad idea. If we generate these PyDL files then you will > have a useful debugging tool which may clear up a lot of these problems. > It comforts me to know that if the inferencing goes horibbly wrong you > can look at the PyDL file to figure out what the inferencer was thinking > and override it. I would expect to be able to generate/cache the PyDL files. I do not see how things can go "horribly wrong." The types are the types. As I mentioned previously: we have to know the type of the RHS in an assignment. I say we use that info, you say we check it against the type of the LHS. Based on the prototype that I posted, I think that I'm going to modify my position a bit: * if you declare a variable/attribute, then you get assignment enforcement (note this also applies to func param names) * if you do not declare, then there is no enforcement (the type is deduced from the RHS) Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Mon Dec 27 12:55:08 1999 From: gstein@lyra.org (Greg Stein) Date: Mon, 27 Dec 1999 04:55:08 -0800 (PST) Subject: [Types-sig] PyDL RFC 0.02 In-Reply-To: <3867472C.7ECBAC55@prescod.net> Message-ID: Only a few quick comments for now... On Mon, 27 Dec 1999, Paul Prescod wrote: > PyDL RFC 0.02 PyDL ?! Don't tell me "Python Definition Language." That is the wrong semantic... you're talking about an "inteface file", not a language. I'd seriously recommend a new acronym before you confuse people :-) .pyi or something. >... > The Python interpreter invokes the static interface interpreter and > optionally the interface checker on a Python file and its associated > PyDL file. Typically a PyDL file is associated with a Python file > through placement in the same path with the same base name and a > ".pydl" or ".gpydl" extension. 
If both are avaiable, the module'sj > interface is created by combining the declarations in the ".pydl" and > ".gpydl" files. The notion of two types of files just adds complexity. There is no reason that a generated file would be *any* different in form/syntax than a human's file. The human just gets to add funky comments, indentation, etc. In other words: design around a single file. >... > Once it interprets the Python code, the interface objects are > available to the runtime code through a special namespace called the > "interface namespace". This namespace is interposed in the name search > order between the module's namespace and the built-in namespace. Search *another* namespace? Eek! We're already seeing people avoiding the time with things like: def foo(len=len): ... Adding another namespace will just exacerbate the situation. I don't recommend adding another distinct namespace, but *IF* you are going to do so, then I might suggest that it is only available for use from withing a typedecl. >... > 5. Declare un-modifiability: > > const [const Array( Integer )] > > (the semantics of un-modifiability need to be worked out) Wasn't the notion of "const" (successfully) argued against inclusion? >... > The Undefined Object: > ===================== > > The Undefined object is used as the value of unassigned attributes and > the return value of functions that do not return a value. It may not > be bound to a name. I don't think this is going to work as you expect. The Python interpreter can't work with "Undefined" unless it is an object (otherwise, you're talking about a near-impossible revamp). Therefore, Undefined is an object and you're going to have some *real* serious issues trying to keep that out of some kind of assignment or other usage. Pass it as a parameter? Shove it into a list or tuple? Check for Undefined on every name binding? What about indexed or slice assignment? >... > conforms( x: Any, y: Interface ) -> Any or Undefined This is predicated upon the "Undefined" concept. I believe that Undefined isn't possible as you're currently defined it, thereby making conforms() unusable. Cheers, -g -- Greg Stein, http://www.lyra.org/ From scott@chronis.pobox.com Mon Dec 27 16:39:39 1999 From: scott@chronis.pobox.com (scott) Date: Mon, 27 Dec 1999 11:39:39 -0500 Subject: [Types-sig] PyDL RFC 0.02 In-Reply-To: References: <3867472C.7ECBAC55@prescod.net> Message-ID: <19991227113939.B41570@chronis.pobox.com> On Mon, Dec 27, 1999 at 04:55:08AM -0800, Greg Stein wrote: > Only a few quick comments for now... > > On Mon, 27 Dec 1999, Paul Prescod wrote: > > PyDL RFC 0.02 > > PyDL ?! > > Don't tell me "Python Definition Language." That is the wrong semantic... > you're talking about an "inteface file", not a language. > > I'd seriously recommend a new acronym before you confuse people :-) > > .pyi or something. One thing to consider is that windows/dos users can't have a 4-char suffix on a file name reliably. I like .pyi as greg suggests. The shorter the suffix, the better IMO. > > >... > > The Python interpreter invokes the static interface interpreter and > > optionally the interface checker on a Python file and its associated > > PyDL file. Typically a PyDL file is associated with a Python file > > through placement in the same path with the same base name and a > > ".pydl" or ".gpydl" extension. If both are avaiable, the module'sj > > interface is created by combining the declarations in the ".pydl" and > > ".gpydl" files. > > The notion of two types of files just adds complexity. 
There is no reason > that a generated file would be *any* different in form/syntax than a > human's file. The human just gets to add funky comments, indentation, etc. > > In other words: design around a single file. Greg, are you suggesting a single file which gets generated type info appened automatically? If so, I don't see it being harmful. A simple comment header denoting the beginning of machine generated info would suffice IMO, and it would facilitate some of the problems with working with extra files... like permissions denying writes of the interface file and what not. > > >... > > Once it interprets the Python code, the interface objects are > > available to the runtime code through a special namespace called the > > "interface namespace". This namespace is interposed in the name search > > order between the module's namespace and the built-in namespace. > > Search *another* namespace? Eek! We're already seeing people avoiding the > time with things like: > > def foo(len=len): > ... > > Adding another namespace will just exacerbate the situation. > > I don't recommend adding another distinct namespace, but *IF* you are > going to do so, then I might suggest that it is only available for use > from withing a typedecl. I believe greg has a good point here. But I also think another namespace gives us a good degree of flexibility in development. Perhaps the extra namespace could also just be available at runtime iff python is invoked in such a way as to run type checking and interface interpretting (not the default). just another way to possibly minimize the extra lookup overhead. Another idea: Perhaps an additional attribute or set of attributes would achieve a similar level of modularity of the type checking system? For example, maybe each existing namespace could have an __interfaces__ attribute or __types__ or something that would contain the type information without affecting lookup time of builtins so much. Intuitively, this seems like something which could be extended in the future to work with local namespace type checking or optimization more easily. This area could use some more specification before we start anything too serious, IMO. I think modularity of this type checking is important because it seems like it will facilitate making the necessary changes as time goes on: a potential big efficiency winner in the ongoing development of type checking. > > >... > > 5. Declare un-modifiability: > > > > const [const Array( Integer )] > > > > (the semantics of un-modifiability need to be worked out) > > Wasn't the notion of "const" (successfully) argued against inclusion? Any pointers to this discussion? scott From paul@prescod.net Mon Dec 27 17:16:54 1999 From: paul@prescod.net (Paul Prescod) Date: Mon, 27 Dec 1999 12:16:54 -0500 Subject: [Types-sig] PyDL RFC 0.02 References: Message-ID: <38679F06.37B6DFDD@prescod.net> PyDL is a pun. These files behave just like IDL in COM and CORBA. They are a separate language, at least for now. Greg Stein wrote: > > > The notion of two types of files just adds complexity. There is no reason > that a generated file would be *any* different in form/syntax than a > human's file. The human just gets to add funky comments, indentation, etc. From PyDL RFC 0.03: We have two different files so that hand-crafted PyDL files will not be overwritten by generated ones. The syntax of the files is identical. > In other words: design around a single file. I did. 
The existence of two files affects the language semantics not one whit but it makes the system much more safe and arguably much more usable. > Search *another* namespace? Eek! We're already seeing people avoiding the > time with things like: > > def foo(len=len): > ... > > Adding another namespace will just exacerbate the situation. The whole point of static type checking is that the checks of the namespaces should be done at *compile time*. It isn't like a name reference from six levels of lexical nesting in a C++ file requires more time to look up than a "local" reference. That's just a temporary Python bug that we're trying to fix. > Wasn't the notion of "const" (successfully) argued against inclusion? Maybe. But I think that the situation changed when I moved from talking about "lists", "tuples" and "dictionaries" to talking about "sequences", "mappings" and "records" because we have no way of saying "read-only record." Maybe that's a big deal for version 1. Maybe it isn't. I'm open to opinions. > I don't think this is going to work as you expect. The Python interpreter > can't work with "Undefined" unless it is an object (otherwise, you're > talking about a near-impossible revamp). Therefore, Undefined is an object > and you're going to have some *real* serious issues trying to keep that > out of some kind of assignment or other usage. > > Pass it as a parameter? Shove it into a list or tuple? Check for Undefined > on every name binding? What about indexed or slice assignment? I think that there are finite number of such issues and you've listed most of them. In each of these parts of the interpreter we need to add two or three lines of C code. From a performance perspective, we will just be doing a pointer comparison and branch which is tiny compared to the rest of the interpreter overhead. Paul Prescod From paul@prescod.net Mon Dec 27 17:54:54 1999 From: paul@prescod.net (Paul Prescod) Date: Mon, 27 Dec 1999 12:54:54 -0500 Subject: [Types-sig] PyDL RFC 0.02 References: <3867472C.7ECBAC55@prescod.net> <19991227113939.B41570@chronis.pobox.com> Message-ID: <3867A7EE.5F7EB2D0@prescod.net> scott wrote: > > One thing to consider is that windows/dos users can't have a 4-char > suffix on a file name reliably. Well, DOS users....Windows 9x/NT users will have no problem. I'm not sure if I care enough about DOS to think that we should change this. > Greg, are you suggesting a single file which gets generated type info > appened automatically? If so, I don't see it being harmful. A simple > comment header denoting the beginning of machine generated info would > suffice IMO, How do you combine the hand-maintained portions of the interface file beside the generated parts? How do you prevent a hand-matained interface file from being overwritten by a generated one? > and it would facilitate some of the problems with working > with extra files... like permissions denying writes of the interface > file and what not. I don't follow you at all. We have extra files. One or two, you are going to have potential problems with permissions. That's one reason to NOT use generated interface files in some circumstances. > I believe greg has a good point here. I think I've addressed it. The Python interpreter should not be looking at each namespace in turn. I would expect that in the future we will allow an infinite number of nested namespaces without any performance penalty. > Any pointers to this discussion? I don't have any. I think we just said: "we'll figure out const later." 
There may not have been a big discussion. Paul Prescod From scott@chronis.pobox.com Mon Dec 27 18:39:38 1999 From: scott@chronis.pobox.com (scott) Date: Mon, 27 Dec 1999 13:39:38 -0500 Subject: [Types-sig] PyDL RFC 0.02 In-Reply-To: <3867A7EE.5F7EB2D0@prescod.net> References: <3867472C.7ECBAC55@prescod.net> <19991227113939.B41570@chronis.pobox.com> <3867A7EE.5F7EB2D0@prescod.net> Message-ID: <19991227133937.A42463@chronis.pobox.com> On Mon, Dec 27, 1999 at 12:54:54PM -0500, Paul Prescod wrote: > scott wrote: > > > > One thing to consider is that windows/dos users can't have a 4-char > > suffix on a file name reliably. > > Well, DOS users....Windows 9x/NT users will have no problem. I'm not > sure if I care enough about DOS to think that we should change this. Don't piss off the DOS users! That's dangerous ;) On the other hand, it does seem prudent to have a suffix that works on the Denial Of Service platform if all it takes is a shorter set of suffixes. plus, that would mean less typing and neater output to `ls'. Minor, Minor point though. > > > and it would facilitate some of the problems with working > > with extra files... like permissions denying writes of the interface > > file and what not. > > I don't follow you at all. We have extra files. One or two, you are > going to have potential problems with permissions. That's one reason to > NOT use generated interface files in some circumstances. 2 extra files with potential permissions problems leads to more combinations of problem scenarios to deal with than a single one. It's not that big a deal to me one way or the other, though. 2 files is fine. > > > I believe greg has a good point here. > > I think I've addressed it. The Python interpreter should not be looking > at each namespace in turn. I would expect that in the future we will > allow an infinite number of nested namespaces without any performance > penalty. Perhaps, but when? I haven't seen any indication that this will happen in the near future, and predicting such things in the longer run seems to be asking for problems both in the meantime and in how the long run might actually work out. IMO, the most important goal is modularity of the system, the second most important goal is clean accessibility, and the least important goal is performance. Does this seem like a reasonable set of goals for deciding where to store this info? One thing to think about with the extra run-time namespace scheme is accidentally overwriting the values of typedefs and what not. It may well be more modular to put this stuff in a special place which does not affect the regular run-time environment at all. > > > Any pointers to this discussion? > > I don't have any. I think we just said: "we'll figure out const later." > There may not have been a big discussion. Sounds good to me. scott From gstein@lyra.org Mon Dec 27 19:30:47 1999 From: gstein@lyra.org (Greg Stein) Date: Mon, 27 Dec 1999 11:30:47 -0800 (PST) Subject: [Types-sig] PyDL RFC 0.02 In-Reply-To: <19991227133937.A42463@chronis.pobox.com> Message-ID: On Mon, 27 Dec 1999, scott wrote: > On Mon, Dec 27, 1999 at 12:54:54PM -0500, Paul Prescod wrote: > > scott wrote: > > > > > > One thing to consider is that windows/dos users can't have a 4-char > > > suffix on a file name reliably. > > > > Well, DOS users....Windows 9x/NT users will have no problem. I'm not > > sure if I care enough about DOS to think that we should change this. Windows 9x people can very well have problems. The underlying filesystem is still 8.3. 
I continued to see issues with the name mapping between long and short. Mostly, it appears with certain APIs and the registry. Seriously: avoid more than .3 if possible. >... > > > I believe greg has a good point here. > > > > I think I've addressed it. The Python interpreter should not be looking > > at each namespace in turn. I would expect that in the future we will > > allow an infinite number of nested namespaces without any performance > > penalty. > > Perhaps, but when? I haven't seen any indication that this will > happen in the near future, and predicting such things in the longer > run seems to be asking for problems both in the meantime and in how > the long run might actually work out. Yah. What he said. "in the future" is a *long* ways off when there hasn't been any real discussion on if/how to deal with the multiple namespace issue. Relying on a solution to appear is asking for trouble (IMO). There are also a number of auxilliary things that would need to occur and changes to programs to realize that more namespaces exist in the standard lookup (a true partition of purpose would avoid this). However: I'm still against adding a whole new namespace. I haven't seen a good argument for why it is needed. Can somebody come up with a concise rationale? >... > > > Any pointers to this discussion? > > > > I don't have any. I think we just said: "we'll figure out const later." > > There may not have been a big discussion. > > Sounds good to me. Hrm. I'll try to dig it up. I thought I remembered somebody saying "and is why const isn't really needed." Happy Holidays, -g -- Greg Stein, http://www.lyra.org/ From scott@chronis.pobox.com Mon Dec 27 19:37:31 1999 From: scott@chronis.pobox.com (scott) Date: Mon, 27 Dec 1999 14:37:31 -0500 Subject: [Types-sig] PyDL RFC 0.02 In-Reply-To: References: <19991227133937.A42463@chronis.pobox.com> Message-ID: <19991227143731.A43112@chronis.pobox.com> On Mon, Dec 27, 1999 at 11:30:47AM -0800, Greg Stein wrote: > On Mon, 27 Dec 1999, scott wrote: > > > On Mon, Dec 27, 1999 at 12:54:54PM -0500, Paul Prescod wrote: > > > scott wrote: > > > > > > However: I'm still against adding a whole new namespace. I haven't seen a > good argument for why it is needed. Can somebody come up with a concise > rationale? > In my understanding of it, a separate namespace is needed for the generation of compile-time checking, simply because compile time checking can't know everything that happens in the run-time namespace. In other words, the static-type interpreter in the RFC needs it's own way of dealing with variable names. This perspective, however, is 100% independent of the idea of a separate namespace at run time. I don't see a need for a separate run time namespace at all, only for a modular, cleanly accessible way of accessing type information at run time. scott From gstein@lyra.org Mon Dec 27 19:41:20 1999 From: gstein@lyra.org (Greg Stein) Date: Mon, 27 Dec 1999 11:41:20 -0800 (PST) Subject: [Types-sig] PyDL RFC 0.02 In-Reply-To: <19991227113939.B41570@chronis.pobox.com> Message-ID: On Mon, 27 Dec 1999, scott wrote: > On Mon, Dec 27, 1999 at 04:55:08AM -0800, Greg Stein wrote: >... > > > The Python interpreter invokes the static interface interpreter and > > > optionally the interface checker on a Python file and its associated > > > PyDL file. Typically a PyDL file is associated with a Python file > > > through placement in the same path with the same base name and a > > > ".pydl" or ".gpydl" extension. 
If both are avaiable, the module'sj > > > interface is created by combining the declarations in the ".pydl" and > > > ".gpydl" files. > > > > The notion of two types of files just adds complexity. There is no reason > > that a generated file would be *any* different in form/syntax than a > > human's file. The human just gets to add funky comments, indentation, etc. > > > > In other words: design around a single file. > > Greg, are you suggesting a single file which gets generated type info > appened automatically? Nope. It sounded like Paul was suggesting different formats, suffixes, and purpose. I don't think we should go that route. It would seem best to have a .pyi file that a human can craft and maintain. It would be quite easy to have the type-check mode warn the user that they haven't declared some interface or something (so they can go and add it in). Heck, maybe the user did that on purpose, because the class isn't public. It would also be quite possible to invoke the type-checker with a mode that says "generate a .pyi file for me." The user can then edit the thing as needed. I also think that we'd want to avoid "combining the declarations" of two files. Again, the user may not want the second group of declarations. And the combination rules might be a bit hard to describe or handle (from the human's standpoint). Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Mon Dec 27 20:06:57 1999 From: gstein@lyra.org (Greg Stein) Date: Mon, 27 Dec 1999 12:06:57 -0800 (PST) Subject: [Types-sig] PyDL RFC 0.02 In-Reply-To: <38679F06.37B6DFDD@prescod.net> Message-ID: On Mon, 27 Dec 1999, Paul Prescod wrote: > PyDL is a pun. These files behave just like IDL in COM and CORBA. They > are a separate language, at least for now. Let's name it for its purpose, not a pun. Please. I know where the derivation came from, and the file is an interface file. The file is not a "python definition language." If it uses a separate language, then that still doesn't mean we're talking about something other than an interface file. > Greg Stein wrote: > > The notion of two types of files just adds complexity. There is no reason > > that a generated file would be *any* different in form/syntax than a > > human's file. The human just gets to add funky comments, indentation, etc. > > >From PyDL RFC 0.03: > > We have two different files so that hand-crafted PyDL files will not be > overwritten by generated ones. The syntax of the files is identical. In my reply to Scott: I think we should be choosing one or the other rather than taking two files and combining them. > > In other words: design around a single file. > > I did. The existence of two files affects the language semantics not one > whit but it makes the system much more safe and arguably much more > usable. Two more files for each module doesn't make it more usable. Having zero extra files (and inline declarations) is more usable/maintainable. How does it make it more safe? > > Search *another* namespace? Eek! We're already seeing people avoiding the > > time with things like: > > > > def foo(len=len): > > ... > > > > Adding another namespace will just exacerbate the situation. > > The whole point of static type checking is that the checks of the > namespaces should be done at *compile time*. Duh. > It isn't like a name > reference from six levels of lexical nesting in a C++ file requires more > time to look up than a "local" reference. That's just a temporary Python > bug that we're trying to fix. 
Per my other email: we should not be relying on vapor to solve our problems. We should avoid exacerbating the situation. Regardless, I haven't seen a good rationale for needing a new namespace yet. There has been conversation where people have thrown out "well, just move it into a separate namespace," but I haven't seen a clear/cogent description of the real need. I think these names should follow standard Python rules. It goes into the global, local, or class namespace depending upon the context. > > Wasn't the notion of "const" (successfully) argued against inclusion? > > Maybe. But I think that the situation changed when I moved from talking > about "lists", "tuples" and "dictionaries" to talking about "sequences", > "mappings" and "records" because we have no way of saying "read-only > record." Maybe that's a big deal for version 1. Maybe it isn't. I'm open > to opinions. Record? How is that different from a sequence? That is new terminology, and it seems it is just a bare cover for saying "tuple." Why don't we stick to "tuple" instead of introducing a new term. > > I don't think this is going to work as you expect. The Python interpreter > > can't work with "Undefined" unless it is an object (otherwise, you're > > talking about a near-impossible revamp). Therefore, Undefined is an object > > and you're going to have some *real* serious issues trying to keep that > > out of some kind of assignment or other usage. > > > > Pass it as a parameter? Shove it into a list or tuple? Check for Undefined > > on every name binding? What about indexed or slice assignment? > > I think that there are finite number of such issues and you've listed > most of them. In each of these parts of the interpreter we need to add > two or three lines of C code. From a performance perspective, we will > just be doing a pointer comparison and branch which is tiny compared to > the rest of the interpreter overhead. It is a lot more prevalent than what I just listed. I was just spouting off some examples Some more: tuple unpacking, in an "is" expression, in an "==" expression, or passed to the __cmp__() instance method. I bet that I could come up with more. That latter one should throw a nice screw into the machine. Next up: you suggested pre-assigning all names to the Undefined object. Now dir(some_instance) produces an incorrect list of valid names. Or some_instance.__dict__.items() (uh oh! how does the items() return a two-tuple with Undefined in there?!). But don't just look at class instances, we have the same issue at the global level. Right here in Lib/symbol.py, I see a globals().items() on line 73. And what does __builtins__['Undefined'] return? If I'm trying to establish a restricted mode of execution, how do I insert Undefined into the constructed builtins dictionary? When I'm writing a C extension, do I have to check for the Undefined object now? What happens if I see one? Raise an error? Can I return one? I would venture that the Undefined concept would require a pretty fundamental change to Python's (internal) object model. Given more thought, I might find additional issues, and that worries me. 
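To make that concrete, here is a tiny sketch of the leakage, using a plain Python sentinel to stand in for the proposed Undefined (purely hypothetical; there is no interpreter support behind any of this):

    class _UndefinedType:
        # stand-in for the proposed Undefined singleton
        def __repr__(self):
            return "Undefined"

    Undefined = _UndefinedType()

    class Tree:
        def __init__(self):
            # per the RFC, declared-but-unassigned attributes would start
            # out as Undefined rather than simply not existing
            self.left = Undefined
            self.right = Undefined

    t = Tree()
    print t.__dict__.items()           # Undefined escapes through items()
    print t.__dict__['left'] == None   # __cmp__ now sees Undefined
    for name, value in t.__dict__.items():
        pass                           # ...and tuple unpacking just bound it to a name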
Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Mon Dec 27 20:14:41 1999 From: gstein@lyra.org (Greg Stein) Date: Mon, 27 Dec 1999 12:14:41 -0800 (PST) Subject: [Types-sig] PyDL RFC 0.02 In-Reply-To: <19991227143731.A43112@chronis.pobox.com> Message-ID: On Mon, 27 Dec 1999, scott wrote: > On Mon, Dec 27, 1999 at 11:30:47AM -0800, Greg Stein wrote: > > However: I'm still against adding a whole new namespace. I haven't seen a > > good argument for why it is needed. Can somebody come up with a concise > > rationale? > > In my understanding of it, a separate namespace is needed for the > generation of compile-time checking, simply because compile time > checking can't know everything that happens in the run-time namespace. > In other words, the static-type interpreter in the RFC needs it's own > way of dealing with variable names. > > This perspective, however, is 100% independent of the idea of a > separate namespace at run time. I don't see a need for a separate run > time namespace at all, only for a modular, cleanly accessible way of > accessing type information at run time. Right -- a compile-time "namespace". But really: that is just an abbreviated form of the runtime namespaces rather than a separate compile-time namespace (so "... needed for the generation of compile-time checking, ..." doesn't hold). Regardless of how the compile-time namespace is viewed, Paul was suggesting a new runtime namespace in the RFC. Note: the compile-time checking *does* need to know everything that happens in the run-time namespaces. It must check the assignments and usage of values in the namespaces. Happy Holidays, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Mon Dec 27 20:34:18 1999 From: gstein@lyra.org (Greg Stein) Date: Mon, 27 Dec 1999 12:34:18 -0800 (PST) Subject: [Types-sig] const (was: PyDL RFC 0.02) In-Reply-To: <3867A7EE.5F7EB2D0@prescod.net> Message-ID: On Mon, 27 Dec 1999, Paul Prescod wrote: > scott wrote: >... > > Any pointers to this discussion? > > I don't have any. I think we just said: "we'll figure out const later." > There may not have been a big discussion. Paul's right, and I'm senile :-) The only discussion of "const" that I found is in Paul's own email at: http://www.python.org/pipermail/types-sig/1999-December/000599.html I must be thinking of another concept that was raised and subsequently dismissed... Cheers, -g p.s. I'd recommend assignment enforcement over the notion of const; the former seems to be more easily enforcable at runtime. -- Greg Stein, http://www.lyra.org/ From billtut@microsoft.com Mon Dec 27 20:54:06 1999 From: billtut@microsoft.com (Bill Tutt) Date: Mon, 27 Dec 1999 12:54:06 -0800 Subject: [Types-sig] Viper module compiler begun Message-ID: <4D0A23B3F74DD111ACCD00805F31D8101D8BCB8E@RED-MSG-50> > -----Original Message----- > From: skaller [mailto:skaller@maxtal.com.au] > > Greg Stein wrote: > > > skaller wrote: > > > > > > Last time I tried that, it crashed unceremoniously. > > > Has that been fixed? > > > > Not much of a bug report. Get serious. How the heck should > I know whether > > that particular bug has been fixed? > > > > "oh. it broke. fix it." *snort* > > > > As far as I know, P2C can successfully convert *any* module > into a Python > > extension model. > > Here's what I get with the latest version: > Did I do something wrong? > > [root@ruby] ~/py2c>python gencode.py gencode.py __gencode.c > _gencode.py > Traceback (innermost last): > File "gencode.py", line 35, in ? 
> genc.Generator(args[0], args[1], args[2]) > File "genc.py", line 91, in __init__ > tree = t.parsefile(input) > File "transformer.py", line 176, in parsefile > return self.parsesuite(file.read()) > File "transformer.py", line 166, in parsesuite > return self.transform(parser.suite(text)) > parser.ParserError: Could not parse string. Well, that indeed is a strange incident given that Python's internal parser liked thef ile and then subsequently didn't like the file. :) I've gone and stuck the current contents of CVS at: http://lima.mudlib.org/~rassilon/p2c/p2c-cvs.zip This should produce slightly happier output, the compiled transformer.py actually works and shaves one whole second off the execution time of translating genc.py. I haven't yet been able to test genc.py's compiled C code completly since MSVC has this annoying habit of stop emitting line #s in its debug info after the 64kth line. (Ugh.) Bill From billtut@microsoft.com Mon Dec 27 22:56:12 1999 From: billtut@microsoft.com (Bill Tutt) Date: Mon, 27 Dec 1999 14:56:12 -0800 Subject: [Types-sig] Viper module compiler begun Message-ID: <4D0A23B3F74DD111ACCD00805F31D8101D8BCB90@RED-MSG-50> > -----Original Message----- > From: Bill Tutt [mailto:billtut@microsoft.com] > I've gone and stuck the current contents of CVS at: > http://lima.mudlib.org/~rassilon/p2c/p2c-cvs.zip > Err.. skaller just reminded me that to use this particular release you need pyclbr.py from the python CVS repository. http://www.python.org/download/cvs.html Bill From scott@chronis.pobox.com Mon Dec 27 22:59:55 1999 From: scott@chronis.pobox.com (scott) Date: Mon, 27 Dec 1999 17:59:55 -0500 Subject: [Types-sig] PyDL RFC 0.02 In-Reply-To: References: <19991227143731.A43112@chronis.pobox.com> Message-ID: <19991227175955.B44344@chronis.pobox.com> On Mon, Dec 27, 1999 at 12:14:41PM -0800, Greg Stein wrote: > On Mon, 27 Dec 1999, scott wrote: > > On Mon, Dec 27, 1999 at 11:30:47AM -0800, Greg Stein wrote: > > > However: I'm still against adding a whole new namespace. I haven't seen a > > > good argument for why it is needed. Can somebody come up with a concise > > > rationale? > > > > In my understanding of it, a separate namespace is needed for the > > generation of compile-time checking, simply because compile time > > checking can't know everything that happens in the run-time namespace. > > In other words, the static-type interpreter in the RFC needs it's own > > way of dealing with variable names. > > > > This perspective, however, is 100% independent of the idea of a > > separate namespace at run time. I don't see a need for a separate run > > time namespace at all, only for a modular, cleanly accessible way of > > accessing type information at run time. > > Right -- a compile-time "namespace". But really: that is just an > abbreviated form of the runtime namespaces rather than a separate > compile-time namespace (so "... needed for the generation of compile-time > checking, ..." doesn't hold). We already had a big discussion about this that was never resolved. > > Regardless of how the compile-time namespace is viewed, Paul was > suggesting a new runtime namespace in the RFC. Yes. > > Note: the compile-time checking *does* need to know everything that > happens in the run-time namespaces. It must check the assignments and > usage of values in the namespaces. I don't see how compile-time checking can know much of anything about runtime-specific namespaces without running code. If it runs code, it is no longer compile-time checking. 
Furthermore, if the compile-time checker assumes that the running of code can do anything it can today, there's not much of anything that can be checked at compile time to begin with. This is why it seems to me that checks done at compile-time must be done based on a compile-time specific model of the namespaces, and that model must be more restrictive in naming and scoping usage than python currently is. Example restrictions that seem to help meet this end are: don't delete typed variables, don't use different types for variables at different times, unless that variable is pre-set as a union of both types, etc. scott From skaller@maxtal.com.au Mon Dec 27 23:37:52 1999 From: skaller@maxtal.com.au (skaller) Date: Tue, 28 Dec 1999 10:37:52 +1100 Subject: [Types-sig] PyDL RFC 0.02 References: <3867472C.7ECBAC55@prescod.net> Message-ID: <3867F850.14D60FCF@maxtal.com.au> Paul Prescod wrote: > > PyDL RFC 0.02 > > A PyDL file declares the interface for a Python module. PyDL files > declare interfaces, objects and the required interfaces of objects. Please stick to a syntax which can _also_ be embedded in .py files. In this case, an interface file is ordinary Python, except that it consists only of compile time directives. If I understand you correctly, interface files are used to provide module interfaces: there is no other sensible way to do that at present, since .py files define modules. IF there were a way to create modules like: module X: # stuff that normally goes in a module file in python, then there would be a corresponding interface module X: # stuff that normally goes in a module interface file In other words, interface files should be regarded as an _artefact_ of the existing 'lack of syntax for defining a module'. [Which Viper may correct :-] On this basis, some comments: > Interface definitions are similar to Python class definitions. They > use the keyword "interface" instead of the keyword "class". > > Sometimes an interface can be specialized for working with specific > other interfaces. For instance a list could be specialized for working > with integers. No. I think you have to make up your mind here. You must choose. Either 'List' is an interface, or, it is an interface generator, it cannot be both. [In your terminology, you can't use a parameterised interface where a fully resolved one is required; so List cannot be both partly unresolved and also fully resolved] > In addition to defining interfaces, it is possible to declare other > attributes of the module. Each declaration associates an interface > with the name of the attribute. Values associated with the name in the > module namespace must never violate the declaration. Furthermore, by > the time the module has been imported each name must have an > associated value. OK. This is the crux of the semantics: you are applying interfaces to names, rather than values/objects. > The interface interpreter reads the PyDL file and builds the > relevant interface objects. Furthermore, the Python compiler will do it too; that is, it will process embedded interface specifications. >If the PyDL file refers to other modules > then the interface interpreter can read the PyDL files associated > with those other modules. Yeah, but you would do well to get out of the habit of saying 'can' and 'may'. Use the word 'shall'. Meaning, that the damn thing is REQUIRED to do something :-) Dont give permission. Specify requirements. > The interface interpreter maintains its own > module dictionary so that it does not import the same module twice. 
That's better, but should be marked as 'commentary', since it has no semantic implications. > Interface expression language: > ============================== > > Interface expressions are used to declare that attributes must conform > to certain interfaces. In a interface expression you may: Do NOT say 'may'. Do not refer to 'you', the programmer, we're not interested in what the programmer does, we're interested in what the interface compiler does. And it SHALL interpret certain grammatical constructions in a particular way, no 'may' about it. -- Point 0: Paul, list the predefined names like Integer, or whatever. Say if they are keywords or plain identifiers. Use a grammar production like: basic_if_name ::= "Integer" | "Float" > 1. refer to a "dotted name" (local name or name in the PyDL of an > imported module ). This doesn't make any sense to me. > 2. make a union of two or more interfaces: > integer or float or complex Give the grammar. EG: if_alt ::= if_name "or" if_alt | if_name > 3. parameterize a interface: > > Array( Integer, 50 ) > Array( length=50, elements=Integer ) grammar? > Note that the arguments can be either interfaces or simple Python > expressions. A "simple" Python expression is an expression that does > not involve a function call. No. See above. List(Int) already involves a 'function call'. > 4. use a syntactic shortcut: > > [Foo] => Sequence( Foo ) # sequence of Foo's > {A:B} => Mapping( A, B ) # Mapping from A's to B's > (A,B,C) => Record( A, B, C ) # 3-element sequence of interface a, > followed > # by b followed by c Forget this, for the moment. Add syntact sugar later, when the core grammar and semantics are more settled. > 5. Declare un-modifiability: > > const [const Array( Integer )] > > (the semantics of un-modifiability need to be worked out) Again, forget it, for the moment. This one can be real nasty. > Declarations in a PyDL file: > ============================ > > (formal grammar to follow) > > 1. Imports > > An import statement in an interface file loads another interface file. > The import statement works just like Python's except that it loads the > PyDL file found with the referenced module, not the module itself. (of > course we will make this definition more formal in the future) No. Use a distinct keyword like 'include'. There is a good reason for this: consider embedded declarations. Then it is a) impossible to load an interface but not the module b) impossible to load a module, but not the interface A separate keyword resolves the ambiguity when embedded: import X # load the module include X # load the interface Note that importing a module implicitly loads the interface anyhow. However, it will do so in an appropriate namespace. It is necessary to load interfaces even when modules are not imported (by the client module). There are other ways to get at stuff from a module than import it. For example, a function call f() can return an object whose class is defined in a module X the calling module has not imported: we may want to type check the returned object, which requires importing the module X's interface -- without importing the module X itself. > 2. Basic attribute interface declarations: > > decl myint as Integer # basic > decl intarr as Array( Integer, 50 ) # parameterized > decl intarr2 as Array( size = 40, elements = Integer ) # using keyword > syntax > > Attribute declarations are not parameteriable. Furthermore, they must > resolve to fully parameterized (not parameterizable!) interfaces. grammar. 
Again, distinguish interfaces from interface generators, and the above ambiguity in the wording disappears. > 3. Callable object interface declarations: > > Functions are the most common sort of callable object but class > instances can also be callable. Callables may be runtime parameterized > and/or interface parameterized. For instance, there might be a method > "add" that takes two numbers of the same interface and returns a number > of > that interface. > > decl Add(_X: Number) as def( a: const _X, b: const _X )-> _X > > _X is the interface parameter. a and b are the runtime parameters. I am already using ! not : here, following Greg Stein. There are enough ":"'s in python already :-) OTOH, I am also using 'as' in another context :-( > 4. Class Declarations > > A class is a callable object that can be subclassed. Currently the > only way to make those (short of magic) is with a class declaration, They can be created in extension modules. > Here is the syntax for a class definition: > > decl TreeNode(_X: Number) as > class( a: _X, Right: TreeNode( _X ) or None, > Left: TreeNode( _X ) or None ) > -> ParentClasses, Interfaces Hmmm. Confusing me. Especially the newline after 'as'. Python requires brackets, or a colon, to start a newline, you can't do it in the middle of a statement. > What we are really defining is the constructor. The signature of the > created object can be described in an interface declaration. Not good enough. The semantics of class instance attributes would be 'when you assign to this attribute, it had better have this type'. This doesn't mean that you can be sure an access gives that type, the attribute might not exist. This defeats optimisation. You'd need to say something like: AFTER the constructor has run, and BEFORE the destructor has run the attribute exists and has the designated type. Enforcing that might be tricky :-) > 5. Interface declarations: > > interface (_X,_Y) spam( a, b ): > decl somemember as _X > decl someOtherMember as _Y > decl const someClassAttr as [ _X ] > > decl const someFunction as def( a: Integer, b: Float ) -> String Semantics? > The Undefined Object: > ===================== > > The Undefined object is used as the value of unassigned attributes and > the return value of functions that do not return a value. No. All Python functions return a value. If one is not returned explicitly, None is returned implicitly. People check for that: for example: def f(x): if x in [1,2,3]: return x if f(99): print 'Got it' else: print 'Not 1,2 or 3' Your spec would break this code. You can argue that your spec is a better spec -- but it isn't Python compatible. FYI: In Viper, uninitialised, statically declared variables are initialised with the special object PyInitial. Another special object, PyTerminal, also exists. These objects are useful in the internal workings of the implementation, for bounding things (i.e. as sentinels). For example, it makes calculating max( .... ) much easier. [PyInitial is less than all other objects] > Undefined also corrects a long-term unsafe issue with functions. Now, > functions that do not explicitly return a value return Undefined > instead of None. No. That would break compatibility. > Experimental syntax: > ==================== > > There is a backwards compatible syntax for embedding declarations in a > Python 1.5x file: > > "decl","myint as Integer" > "typedef","PositiveInteger as BoundedInt( 0, maxint )" Nice. 
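For what it's worth, as I read it each embedded declaration is just a pair of string literals at statement level -- a tuple that 1.5.x evaluates and throws away -- so an annotated module might look roughly like this (the declarations are lifted from the examples quoted above; none of this is settled):

    # spam.py -- runs unchanged under Python 1.5.x; a PyDL-aware tool
    # could harvest the string pairs below into a generated interface file.

    "typedef","PositiveInteger as BoundedInt( 0, maxint )"
    "decl","myint as Integer"
    "decl","intarr as Array( Integer, 50 )"

    myint = 0
    intarr = [0] * 50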
> Summary of Major Runtime Implications: > ===================== > > All of the named interfaces defined in a PyDL file are available in the > "interfaces" dictionary that is searched between the module dictionary > and > the built-in dictionary. I _think_ you mean that the interface dictionary is 'per module'? And you can refer to an interface in another module with other.interfx notation? > The runtime should not allow an assignment or function call to violate > the declarations in the PyDL file. In an "optimized speed mode" those > checks would be disabled. I think you have to think very carefully about what constitues an error here: see my posts about errors in python. It is not acceptable to specify that an exception be thrown. That would NOT permit an optimiser to elide checks, except when it could prove they were not needed. Much better, you deem a violating program is not valid, and then the language processor can do whatever it wants: it may raise an exception, or it may core dump, or it may reject the program early. -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From paul@prescod.net Mon Dec 27 23:49:01 1999 From: paul@prescod.net (Paul Prescod) Date: Mon, 27 Dec 1999 18:49:01 -0500 Subject: [Types-sig] PyDL RFC 0.02 References: Message-ID: <3867FAED.BAEC3B6@prescod.net> Greg Stein wrote: > > Windows 9x people can very well have problems. The underlying filesystem > is still 8.3. I continued to see issues with the name mapping between long > and short. Mostly, it appears with certain APIs and the registry. > > Seriously: avoid more than .3 if possible. Okay, .pyi will be the extension but I won't give up on the pun as the formal name for the language without more teeth pulling (and I've just had my wisdom's removed so my tolerance level is high). > "in the future" is a *long* ways off when there hasn't been any real > discussion on if/how to deal with the multiple namespace issue. Relying on > a solution to appear is asking for trouble (IMO). It seems to me that the simplest solution is to move the "types" namespace BEHIND the __builtin__ namespace. > However: I'm still against adding a whole new namespace. I haven't seen a > good argument for why it is needed. Can somebody come up with a concise > rationale? Well there are a few issues and I admit to having not thought all of them through completely yet: * importing modules are supposed to only see exported attributes. For instance dir() should only show exported attributes. * the two namespace arrangement is similar to the way that a class' namespace is segmented from that of instances. * Types are independent objects but variable declarations need to be somehow unified with the declared objects. * But we also need an API to query type information associated with a name (instead of the value bound to the name) * Type expressions can make forward references. So when they are embedded in Python code we still won't think of them as ordinary assignments. I have not put a lot of thought into this part of the system and am open to suggestions of how to get all of this to work. 
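To make the alternatives concrete, here is a toy model of the two search orders being discussed (pure illustration -- it ignores local scopes and is not a description of how the interpreter actually resolves names):

    def lookup_rfc(name, module_ns, interfaces_ns, builtin_ns):
        # RFC 0.02 order: module, then interfaces, then built-ins
        for ns in (module_ns, interfaces_ns, builtin_ns):
            if ns.has_key(name):
                return ns[name]
        raise NameError, name

    def lookup_behind_builtins(name, module_ns, interfaces_ns, builtin_ns):
        # the "move it behind __builtin__" idea: ordinary lookups that stop
        # in the module or built-in dictionaries never touch the interfaces
        # dictionary at all
        for ns in (module_ns, builtin_ns, interfaces_ns):
            if ns.has_key(name):
                return ns[name]
        raise NameError, name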
Paul Prescod From skaller@maxtal.com.au Tue Dec 28 00:14:18 1999 From: skaller@maxtal.com.au (skaller) Date: Tue, 28 Dec 1999 11:14:18 +1100 Subject: [Types-sig] PyDL RFC 0.02 References: <19991227143731.A43112@chronis.pobox.com> <19991227175955.B44344@chronis.pobox.com> Message-ID: <386800DA.7DBE1B7D@maxtal.com.au> scott wrote: > This is why it seems to me that checks done at compile-time must be > done based on a compile-time specific model of the namespaces, and > that model must be more restrictive in naming and scoping usage than > python currently is. No: the way I see it, the 'optional type checking' when added to the python language makes it more expressive, not more restrictive -- even though constraints on declared names are part of the extension, it is a genuine extension. > Example restrictions that seem to help meet this > end are: don't delete typed variables, don't use different types for > variables at different times, unless that variable is pre-set as a > union of both types, etc. This is correct. There must be a set of 'text files' which are not valid python programs. As you say, assigning the wrong type to a statically typed variable would render the program 'not python': but this situation cannot occur in python 1.5, because there are no static type declarations. What this means is that there are some files which technically have incompatible semantics to Python 1.5, technically, running these files currently requires a SyntaxError to be raised. Under the modified semantics, there are two cases: the program runs 'more or less as if the declarations were not there', which will happen if the execution of the program obeys the declared type constraints, or, 'the file is not valid python', which means we don't care what happens. In the latter case, a run time exception would be useful from a non-optimising interpreter, and a compile time diagnostic from a type-checking translator, but in those cases where, in particular, the type checker cannot ensure the constraints are met, an optimiser is entitled to ASSUME that they're met, optimise accordingly, and core dump if, in fact, they're not. It is ALSO possible to _require_ a diagnostic in some cases of an invalid program. But much care is need specifying them, to make sure it is possible to for _all_ language translators to detect these cases. [And usually, what happens after that is undefined anyhow] -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From paul@prescod.net Tue Dec 28 08:38:36 1999 From: paul@prescod.net (Paul Prescod) Date: Tue, 28 Dec 1999 03:38:36 -0500 Subject: [Types-sig] const (was: PyDL RFC 0.02) References: Message-ID: <3868770C.7B427BC@prescod.net> Greg Stein wrote: > > p.s. I'd recommend assignment enforcement over the notion of const; the > former seems to be more easily enforcable at runtime. I think we need both. We need to be able to enforce the TYPES of assignments and we need to sometimes say that an object is not modifiable, for all of the things we currently use tuples, files open for read and other read-only objects for. Paul Prescod From paul@prescod.net Tue Dec 28 08:38:42 1999 From: paul@prescod.net (Paul Prescod) Date: Tue, 28 Dec 1999 03:38:42 -0500 Subject: [Types-sig] PyDL RFC 0.02 References: Message-ID: <38687712.1F7E3714@prescod.net> Greg Stein wrote: > > Nope. It sounded like Paul was suggesting different formats, suffixes, and > purpose. I don't think we should go that route. 
One format. One purpose. Two suffixes. Two maintenance strategies. > It would seem best to have a .pyi file that a human can craft and > maintain. It would be quite easy to have the type-check mode warn the user > that they haven't declared some interface or something (so they can go > and add it in). Heck, maybe the > user did that on purpose, because the class isn't public. It would also be > quite possible to invoke the type-checker with a mode that says "generate > a .pyi file for me." The user can then edit the thing as needed. But the whole point is that we don't want to be forced to maintain the thing in a separate file. If you want to put some or all of the declarations in your source file then we need a place to extract those to. I could have just banished in-file declarations but it seemed that we could easily extract them so why not allow the convenience? > I also think that we'd want to avoid "combining the declarations" of two > files. Again, the user may not want the second group of declarations. Then they shouldn't put declarations in their Python file. > And > the combination rules might be a bit hard to describe or handle (from the > human's standpoint). It's just concatenation! There is nothing hard about it. Paul Prescod From gstein@lyra.org Tue Dec 28 09:08:29 1999 From: gstein@lyra.org (Greg Stein) Date: Tue, 28 Dec 1999 01:08:29 -0800 (PST) Subject: [Types-sig] const (was: PyDL RFC 0.02) In-Reply-To: <3868770C.7B427BC@prescod.net> Message-ID: On Tue, 28 Dec 1999, Paul Prescod wrote: > Greg Stein wrote: > > p.s. I'd recommend assignment enforcement over the notion of const; the > > former seems to be more easily enforcable at runtime. > > I think we need both. We need to be able to enforce the TYPES of > assignments and we need to sometimes say that an object is not > modifiable, for all of the things we currently use tuples, files open > for read and other read-only objects for. Um... Are you suggesting that we add a readonly flag to the list and dict types? Short of that, I'm not sure how you would do "const". IMO, adding a readonly flag to those types seems wrong. Cheers, -g -- Greg Stein, http://www.lyra.org/ From paul@prescod.net Tue Dec 28 09:14:30 1999 From: paul@prescod.net (Paul Prescod) Date: Tue, 28 Dec 1999 04:14:30 -0500 Subject: [Types-sig] Interface files References: <3867472C.7ECBAC55@prescod.net> <3867F850.14D60FCF@maxtal.com.au> Message-ID: <38687F76.47B415C7@prescod.net> skaller wrote: > > In other words, interface files should be regarded as an _artefact_ > of the existing 'lack of syntax for defining a module'. > [Which Viper may correct :-] If the normative spec. is in terms of interface files then we can deal with various situations through transformation TO interface files: * C modules * "read-only" Python modules (like library modules that you don't want to change) * modules (in any language) already defined by IDL * Python modules with embedded declarations * Python modules without embedded declarations that "use" non-conservative type inferencing Paul Prescod From gstein@lyra.org Tue Dec 28 09:22:26 1999 From: gstein@lyra.org (Greg Stein) Date: Tue, 28 Dec 1999 01:22:26 -0800 (PST) Subject: [Types-sig] PyDL RFC 0.02 In-Reply-To: <38687712.1F7E3714@prescod.net> Message-ID: On Tue, 28 Dec 1999, Paul Prescod wrote: > Greg Stein wrote: > > Nope. It sounded like Paul was suggesting different formats, suffixes, and > > purpose. I don't think we should go that route. > > One format. One purpose. Two suffixes. Two maintenance strategies. Fine. 
It just didn't sound like that in your proposal, so I was concerned. > > It would seem best to have a .pyi file that a human can craft and > > maintain. It would be quite easy to have the type-check mode warn the user > > that they haven't declared some interface or something (so they can go > > and add it in). Heck, maybe the > > user did that on purpose, because the class isn't public. It would also be > > quite possible to invoke the type-checker with a mode that says "generate > > a .pyi file for me." The user can then edit the thing as needed. > > But the whole point is that we don't want to be forced to maintain the > thing in a separate file. I totally agree here. > If you want to put some or all of the > declarations in your source file then we need a place to extract those > to. While true, I could just as easily argue that they should be stored as pickles in a central database. It might be nice to start very simple: there is one file that we look for (a .pyi). Whether that was hand-created or computer-created, we just don't care. The file would be used for accessing a module's interface without needing to actually load the module. In a type-check mode, it can be verified against declarations (if any) in the source module. > I could have just banished in-file declarations but it seemed that > we could easily extract them so why not allow the convenience? Euh... How could you have "just banished in-file declarations" ?? > > I also think that we'd want to avoid "combining the declarations" of two > > files. Again, the user may not want the second group of declarations. > > Then they shouldn't put declarations in their Python file. I do not believe this is a valid position. Specifically: I would put all the declarations inline. If I create a .pyi, it would simply be as an extract from the inline declarations *or* to create a public subset of the items in the source file. Your scheme would mean that I couldn't use the type stuff internally -- it would be exposed through the automagic generated portion. > > And > > the combination rules might be a bit hard to describe or handle (from the > > human's standpoint). > > It's just concatenation! There is nothing hard about it. It doesn't seem to be simple concatenation. How are conflicts handled? How are merges done? e.g. a method is not declared in the interface file, but it does have a declaration in the source. Cheers, -g -- Greg Stein, http://www.lyra.org/ From paul@prescod.net Tue Dec 28 10:42:48 1999 From: paul@prescod.net (Paul Prescod) Date: Tue, 28 Dec 1999 05:42:48 -0500 Subject: [Types-sig] RFC Comments Message-ID: <38689428.4953374D@prescod.net> > No. I think you have to make up your mind here. > You must choose. Either 'List' is an interface, > or, it is an interface generator, it cannot be both. > [In your terminology, you can't use a parameterised interface > where a fully resolved one is required; so List cannot > be both partly unresolved and also fully resolved] Okay, I will use your terminology. > Yeah, but you would do well to get out of the habit > of saying 'can' and 'may'. Use the word 'shall'. Meaning, > that the damn thing is REQUIRED to do something :-) > Dont give permission. Specify requirements. I expect to rewrite the specification from scratch (with grammar) before I am done. Consider this version a prototype. Once we have the design down I will generate the normative spec. > Point 0: Paul, list the predefined names like Integer, > or whatever. Say if they are keywords or plain identifiers. 
I've been putting this off because there are some tricky issues around file objects. > > Note that the arguments can be either interfaces or simple Python > > expressions. A "simple" Python expression is an expression that does > > not involve a function call. > > No. See above. List(Int) already involves a 'function call'. List is (according to your terminology) an interface generator, not a function. > > 5. Declare un-modifiability: > > > > const [const Array( Integer )] > > > > (the semantics of un-modifiability need to be worked out) > > Again, forget it, for the moment. Isn't that what I did? :) > This one can be real nasty. Agreed. > No. Use a distinct keyword like 'include'. > There is a good reason for this: consider embedded declarations. > Then it is > > a) impossible to load an interface but not the module > b) impossible to load a module, but not the interface > > A separate keyword resolves the ambiguity when embedded: > > import X # load the module > include X # load the interface > > Note that importing a module implicitly loads the interface anyhow. > However, it will do so in an appropriate namespace. I don't understand your model of namespaces and inclusions. I don't understand mine either so don't feel bad. > It is necessary to load interfaces even when modules > are not imported (by the client module). There are other > ways to get at stuff from a module than import it. > For example, a function call f() can return an object whose > class is defined in a module X the calling module has > not imported: we may want to type check the returned > object, which requires importing the module X's interface > -- without importing the module X itself. We can have an API like: load_interface("foo") I don't think that the needs of a very specific tool like a static type checker should drive syntax to that extent. The other 99% of code will never do an "include" and the keyword will be wasted. > I am already using ! not : here, following Greg Stein. I'm going to presume that that isn't a backwards-compatibility argument. :) > There are enough ":"'s in python already :-) Debatable. I would also be amenable to "as", "is" or "isa". "!" means not to me. > > What we are really defining is the constructor. The signature of the > > created object can be described in an interface declaration. > > Not good enough. The semantics of class instance > attributes would be 'when you assign to this attribute, > it had better have this type'. This doesn't mean that > you can be sure an access gives that type, > the attribute might not exist. This defeats optimisation. The attribute will either have the type or something like "undefined". Since undefined is not a "useful" value, you can optimize away. > Your spec would break this code. You can argue that your > spec is a better spec -- but it isn't Python compatible. Agreed. I will clarify that the behavior of "dropped off" functions is just a suggestion of how Python 2 might be improved using the features of the new object. > FYI: In Viper, uninitialised, statically > declared variables are initialised with the special object PyInitial. > Another special object, PyTerminal, also exists. These objects > are useful in the internal workings of the implementation, > for bounding things (i.e. as sentinels). For example, > it makes calculating max( .... ) much easier. [PyInitial > is less than all other objects] It sounds like None re-invented. My only reason for wanting a new object (not None) is because None is way too flexible. 
You could pass a None through ten thousand lines of code accidently. So I wouldn't want Undefined to be useful to "max" or anything else other than "is", "str" and "repr". > I _think_ you mean that the interface dictionary > is 'per module'? And you can refer to an interface > in another module with other.interfx notation? True. > > The runtime should not allow an assignment or function call to violate > > the declarations in the PyDL file. In an "optimized speed mode" those > > checks would be disabled. > > I think you have to think very carefully about what > constitues an error here: see my posts about errors in python. > It is not acceptable to specify that an exception be thrown. > That would NOT permit an optimiser to elide checks, except > when it could prove they were not needed. > > Much better, you deem a violating program > is not valid, and then the language processor can do whatever > it wants: it may raise an exception, or it may core dump, > or it may reject the program early. I will consider this. An alternate technique is to list allowed recovery strategies: "It is an error if this leaves more than one match. An XSLT processor may signal the error; if it does not signal the error, it must recover by choosing, from amongst the matches that are left, the one that occurs last in the stylesheet." "It is an error if instantiating the content of xsl:processing-instruction creates nodes other than text nodes. An XSLT processor may signal the error; if it does not signal the error, it must recover by ignoring the offending nodes together with their content." Paul Prescod From skip@mojam.com (Skip Montanaro) Tue Dec 28 14:53:52 1999 From: skip@mojam.com (Skip Montanaro) (Skip Montanaro) Date: Tue, 28 Dec 1999 08:53:52 -0600 (CST) Subject: [Types-sig] A plea for quoting consistency... In-Reply-To: <38689428.4953374D@prescod.net> References: <38689428.4953374D@prescod.net> Message-ID: <14440.52992.308594.392137@dolphin.mojam.com> >>>>> "Paul" == Paul Prescod writes: >> No. I think you have to make up your mind here. You must >> choose. Either 'List' is an interface, or, it is an interface >> generator, it cannot be both. [In your terminology, you can't use a >> parameterised interface where a fully resolved one is required; so >> List cannot be both partly unresolved and also fully resolved] Paul> Okay, I will use your terminology. ... Unfortunately, since this super-thread has grown so enormous, I wind up reading things a bit out of order and/or in multiple chunks, separated by significant time gaps. Please, if you don't CC the author on your response, at least list the author's name somewhere near the beginning of your response. Thx, Skip Montanaro | http://www.mojam.com/ skip@mojam.com | http://www.musi-cal.com/ 847-971-7098 | Python: Programming the way Guido indented... From skaller@maxtal.com.au Tue Dec 28 16:31:51 1999 From: skaller@maxtal.com.au (skaller) Date: Wed, 29 Dec 1999 03:31:51 +1100 Subject: [Types-sig] PyDL RFC 0.02 References: <38687712.1F7E3714@prescod.net> Message-ID: <3868E5F7.3B52D1ED@maxtal.com.au> Paul Prescod wrote: > > Greg Stein wrote: > > I also think that we'd want to avoid "combining the declarations" of two > > files. I don't think that this is possible. Separate interface files seem necessary, if only for C extensions. Embedded declarations seem important, at least to me. > Then they shouldn't put declarations in their Python file. I agree. > > And > > the combination rules might be a bit hard to describe or handle (from the > > human's standpoint). 
> > It's just concatenation! There is nothing hard about it. If that is so, please give the rules. In particular, you will need to cover the issue of duplicate declarations. In c and C++ in particular, these issues turned out to be non-trivial. -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From skaller@maxtal.com.au Tue Dec 28 16:39:29 1999 From: skaller@maxtal.com.au (skaller) Date: Wed, 29 Dec 1999 03:39:29 +1100 Subject: [Types-sig] Interface files References: <3867472C.7ECBAC55@prescod.net> <3867F850.14D60FCF@maxtal.com.au> <38687F76.47B415C7@prescod.net> Message-ID: <3868E7C1.665CD6AC@maxtal.com.au> Paul Prescod wrote: > > skaller wrote: > > > > In other words, interface files should be regarded as an _artefact_ > > of the existing 'lack of syntax for defining a module'. > > [Which Viper may correct :-] > > If the normative spec. is in terms of interface files then we can deal > with various situations through transformation TO interface files: Yes, this is possible to some extent, but not totally: first, the grammar needs to be compatible with existing python to permit embedding in the first place, and secondly _references_ to typedecls seem to require embedding, even if the decls themselves are not embedded. -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From skaller@maxtal.com.au Tue Dec 28 16:57:15 1999 From: skaller@maxtal.com.au (skaller) Date: Wed, 29 Dec 1999 03:57:15 +1100 Subject: [Types-sig] PyDL RFC 0.02 References: Message-ID: <3868EBEB.BDAFABE0@maxtal.com.au> Greg Stein wrote: > It doesn't seem to be simple concatenation. How are conflicts handled? How > are merges done? e.g. a method is not declared in the interface file, but > it does have a declaration in the source. Yes. Can we please assume the following position: 1) declarations can be embedded. 2) declarations can also be given in a separate file 3) Processing module X commences by loading the separate interface file 4) Next, the .py file is scanned for declarations 5) The results of (3) and (4) are merged somehow 6) The .py files is scanned again by the code generator We must make some decisions here. Question: what happens if a typedecl kind of name is declared more than once? Partial Answer 1: This must be permitted, because this is _exactly_ what will happen if the .pi file is generated from the .py file by scanning for declarations. It is not necessary to permit such declarations twice in the _same_ file though. One possible solution: require the declarations be identical, token for token: this is what C++ requires. Another solution: the declarations must be semanically equivalent. What this means is that the processor is free to chose either declaration as 'the' definition. Another solution: use the second declaration. Another solution: require _both_ apply (the product: combine constraints). Another: require _either_ apply: (the sum: take the 'union') I could do some analysis on these alternatives, but first, we need to agree there is an issue here. 
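To make those alternatives concrete, here is a minimal sketch, in today's Python, of the strictest policy -- a duplicate declaration is accepted only if it matches the earlier one token for token. The "decl" line format and the merge_decls helper are invented purely for illustration; they are not from any RFC:

    def merge_decls(interface_decls, embedded_decls):
        # Merge declarations scanned from a .pyi file with declarations
        # embedded in the .py source; duplicates must match token for token.
        merged = dict(interface_decls)
        for name, decl in embedded_decls.items():
            if name in merged and merged[name].split() != decl.split():
                raise ValueError("conflicting declarations for %r: %r / %r"
                                 % (name, merged[name], decl))
            merged[name] = decl
        return merged

    pyi_decls = {"f": "decl f: def(Int) -> Int"}
    py_decls  = {"f": "decl f: def(Int) -> String"}   # same name, different type
    try:
        merge_decls(pyi_decls, py_decls)
    except ValueError as exc:
        print(exc)

The same skeleton could implement any of the other resolutions (take the second, intersect, union) just by changing the branch that handles an already-declared name.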
I note there is plenty of existing practice -- with different resolutions :-) -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From skaller@maxtal.com.au Tue Dec 28 17:55:36 1999 From: skaller@maxtal.com.au (skaller) Date: Wed, 29 Dec 1999 04:55:36 +1100 Subject: [Types-sig] RFC Comments References: <38689428.4953374D@prescod.net> Message-ID: <3868F998.594C4EEF@maxtal.com.au> Paul Prescod wrote: > > Yeah, but you would do well to get out of the habit > > of saying 'can' and 'may'. Use the word 'shall'. Meaning, > > that the damn thing is REQUIRED to do something :-) > > Dont give permission. Specify requirements. > > I expect to rewrite the specification from scratch (with grammar) before > I am done. Consider this version a prototype. Once we have the design > down I will generate the normative spec. I do consider your RFC's prototypes -- but they're already quite good specifications, so it is already time to try to tighten them up. IMHO. Doing this will also help uncover ambiguities and problems, that 'loose' wording will cover up. > > Point 0: Paul, list the predefined names like Integer, > > or whatever. Say if they are keywords or plain identifiers. > > I've been putting this off because there are some tricky issues around > file objects. The leave them out. Temporarily. If your proposal is coherent and well principled, but doesn't quite cover all the territory, it should be possible to extend it. If you try to make it cover too much, it may be harder to get something concrete enough to extend. > > > 5. Declare un-modifiability: > > > > > > const [const Array( Integer )] > > > > > > (the semantics of un-modifiability need to be worked out) > > > > Again, forget it, for the moment. > > Isn't that what I did? :) No, you mentioned it in point 5. :-) > I don't understand your model of namespaces and inclusions. I don't > understand mine either so don't feel bad. I agree. I'll try again; perhaps an example: # file m.py import n include p Here, in the module m, we import n. This has to actually import the module n at run time. In pass 1, we read the interface file n.pyi. In pass 2, we generate code to actually load module n. Agree? But, for p, we ONLY read in the interface file p.pyi. We do not generate code to import p. Why would we do this? The answer is, we may gain access to classes and functions of the module p, even though we have not imported it. For example, consider a function def f(x): import p # local import return p.someclass() def f(x): # get at module p from module n return n.p.someclass() We cannot state the interface of f, in particular the return type, without the name of the interface of the class 'someclass' which is defined in the interface p. But p isn't imported into module m. So: we have to be able to load an interface, without that necessarily implying the module be imported. On the other hand, in an _interface_ file, we cannot import anything: importing implies run time code generation, to bind a name to a module object. So the correct way to load an interface, but not import anything, requires a separate keyword like 'include'. The semantics are distinct: import implies include the converse is not the case > We can have an API like: > > load_interface("foo") Yes, that would be possible but ugly. :-) > I don't think that the needs of a very specific tool like a static type > checker should drive syntax to that extent. 
The other 99% of code will > never do an "include" and the keyword will be wasted. but you cannot write that in an implementation file because it would be interpreted as a function call to be done at run time, whereas loading the interface must be done at compile time. > > I am already using ! not : here, following Greg Stein. > > I'm going to presume that that isn't a backwards-compatibility argument. > :) Sure it is. It is only a minor one though. The reason I chose "!" for argument declarations was that it was already being used in similar way for the _expression_: x ! t as in: y = x ! t and in this context, ":" cannot be used. > > There are enough ":"'s in python already :-) > > Debatable. I would also be amenable to "as", "is" or "isa". "!" means > not to me. OK. You should proceed with _some_ fixed syntax. Perhaps it makes sense to seek feedback from users on c.l.p? I'll implement whatever you decide [provided it fits with the grammar of course :-] > > > What we are really defining is the constructor. The signature of the > > > created object can be described in an interface declaration. > > > > Not good enough. The semantics of class instance > > attributes would be 'when you assign to this attribute, > > it had better have this type'. This doesn't mean that > > you can be sure an access gives that type, > > the attribute might not exist. This defeats optimisation. > > The attribute will either have the type or something like "undefined". > Since undefined is not a "useful" value, you can optimize away. I understand that this is your intent, but I am questioning it. My argument is something like this: a requirement that an attribute have type X IF it exists, is weaker than one that doesn't require anything at all, since the typing requirement is contingent on the existence requirement. What I mean is that, the purpose of the typing requirement can be stated as 'you can be sure when you access this name that the object it is bound to has the specified type', but that purpose is not met, if the name isn't bound to an object. you cannot safely optimise an access, because you don't know if the name is bound. Uggg. I'm not explaining this very well. What I'm saying is that type safe access isn't type safe at all unless the access is also safe, irrespective of whether it is typesafe: it has to be safe, before being typesafe is any use. > > Your spec would break this code. You can argue that your > > spec is a better spec -- but it isn't Python compatible. > > Agreed. I will clarify that the behavior of "dropped off" functions is > just a suggestion of how Python 2 might be improved using the features > of the new object. The new Undefined object is an implementation detail in this respect: It is not required, at all, to specify that Python functions be required to explictly return a value, and may not drop off the end, or, weaker, that IF a function drops off the end, the return value may not be used. [Yes, I know you added some extra semantics allowing the dropped of the end returns to be tested -- more debatable, I think] > > FYI: In Viper, uninitialised, statically > > declared variables are initialised with the special object PyInitial. > It sounds like None re-invented. It is, except that clients may refer to None explicitly, but NOT to PyInitial: x = None # valid Python x = PyInitial # NameError, no such thing > My only reason for wanting a new object > (not None) is because None is way too flexible. You could pass a None > through ten thousand lines of code accidently. 
So I wouldn't want > Undefined to be useful to "max" or anything else other than "is", "str" > and "repr". Perhaps you misunderstood: PyInitial is used in the IMPLEMENTATION of 'max', which is written in ocaml. It is not available to the client python programmer. > I will consider this. An alternate technique is to list allowed recovery > strategies: > > "It is an error if this leaves more than one match. An XSLT processor > may signal the error; if it does not signal the error, it must recover > by choosing, from amongst the matches that are left, the one that occurs > last in the stylesheet." Style sheets have different requirements: there is some kind of need for robustness: compilers should be fragile. [If it is at all possible to break the users program, do it!] It is, of course, possible to specify _anything_. But it is not a good idea, IMHO. For example, Greg Stein might argue that two options be allowed: a compile time diagnostic OR a run time diagnostic. This is dangerous: it limits the kind of processors to what Greg thinks is important today. The general rule of standards bodies is that if there is no consensus, leave it out -- don't define anything. This gives implementors maximum freedom, and restricts the programmer most. It also gives the standardisers the option of adding more constraints on implementors _later_: it is much harder to undo a rule, than to add a new one. Note that NO ONE likes 'undefined behaviour'. On the other hand, most of us prefer 'deterministic behaviour', that is, exactly one option is given the implementor, and the programmer can rely on it. But the next best thing is 'don't do it -- it is not defined'. Two or more choices is a very weak compromise (usually), because the programmer cannot rely on a particular behaviour, and will usually have to avoid it for this reason: meaning the implementor is constrained needlessly, providing a feature the programmer cannot use. -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From skaller@maxtal.com.au Tue Dec 28 18:30:48 1999 From: skaller@maxtal.com.au (skaller) Date: Wed, 29 Dec 1999 05:30:48 +1100 Subject: [Types-sig] const (was: PyDL RFC 0.02) References: Message-ID: <386901D8.9CD61B50@maxtal.com.au> Greg Stein wrote: > > On Tue, 28 Dec 1999, Paul Prescod wrote: > > Greg Stein wrote: > > > p.s. I'd recommend assignment enforcement over the notion of const; the > > > former seems to be more easily enforcable at runtime. > > > > I think we need both. We need to be able to enforce the TYPES of > > assignments and we need to sometimes say that an object is not > > modifiable, for all of the things we currently use tuples, files open > > for read and other read-only objects for. > > Um... Are you suggesting that we add a readonly flag to the list and dict > types? Short of that, I'm not sure how you would do "const". IMO, adding a > readonly flag to those types seems wrong. 
'const', IMHO, in Paul's name based model, means the name cannot be rebound: const x = 1 # x is always bound to 1 But: const x = [] x.append(1) # fine, x is still bound to the same list This does not require a readonly flag, it can be enforced at compile time (in the absence of 'exec' statements :-) In some sense, this kind of const is a _stronger_ constraint that a type constraint: x: int = y since any name which is not rebindable is necessarily bound to the same object, and therefore has an invariant type during its lifetime**: there is no need to give the type for the purpose of checking assignments, since any such asssignment is an error (because it violates the no-rebinding requirement). [** this is not true for raw objects in Viper, where the type object can be dynamically changed .. but that is another story :-] -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From skaller@maxtal.com.au Tue Dec 28 18:41:06 1999 From: skaller@maxtal.com.au (skaller) Date: Wed, 29 Dec 1999 05:41:06 +1100 Subject: [Types-sig] Help? References: Message-ID: <38690442.35EFCE57@maxtal.com.au> Um, I feel dumb asking this but .. I'm having some trouble figuring out how the C API works with functions and methods. Consider the script: def f(self,arg): pass class X: g = f x = X() X.g(x, 1) f(2,1) x.g(1) Here, there is only a single function object, f. A call to f requires two arguments: in C, the declaration PyObject *f(PyObject *self, PyObject *args) would have args be a two argument tuple, and self NULL, for the call: f(2,1) Now, when f is called by _either_ X.g(x,1) x.g(1) then x is the 'self' argument of the function, and the tuple 'args' contains only one element. Right? So HOW do I convert f to a C function? It does not seem possible. When used as 'f', there are two arguments in the 'args' tuple, but when used as g, the first arg is the self pointer. The python script indicates a _single_ function can be correctly used in both cases, but I cannot see how this is possible if a C function is used. Sorry to ask a dumb question. Can anyone correct my misconceptions? -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From gstein@lyra.org Tue Dec 28 21:54:24 1999 From: gstein@lyra.org (Greg Stein) Date: Tue, 28 Dec 1999 13:54:24 -0800 (PST) Subject: [Types-sig] Help? In-Reply-To: <38690442.35EFCE57@maxtal.com.au> Message-ID: On Wed, 29 Dec 1999, skaller wrote: >... > PyObject *f(PyObject *self, PyObject *args) In this form, the "self" argument is determined entirely by the C implementation. When the C function is a method on a C-based Type, then look in the module for a Py_FindMethod() call. The second parameter is "self". When the C function is a module-level function, then look at the InitModule call. If the call is Py_InitModule() or Py_InitModule3(), then self will always be NULL. If the call is Py_InitModule4(), then the fourth parameter will be passed as self. > would have args be a two argument tuple, and self NULL, > for the call: > > f(2,1) > > Now, when f is called by _either_ > > X.g(x,1) > x.g(1) > > then x is the 'self' argument of the function, > and the tuple 'args' contains only one element. > Right? See above. It is based on the C implementation, rather than the style of call. > So HOW do I convert f to a C function? > It does not seem possible. 
When used as 'f', > there are two arguments in the 'args' tuple, > but when used as g, the first arg is the self > pointer. The python script indicates a _single_ > function can be correctly used in both cases, > but I cannot see how this is possible if > a C function is used. The C function is called in only one way. And that is based on the C implementation and how you fetch the function (directly from a module or via an object [of some type]). > Sorry to ask a dumb question. Can anyone correct > my misconceptions? I hope that I have. Ask for more detail if I haven't been clear somewhere. Cheers, -g -- Greg Stein, http://www.lyra.org/ From paul@prescod.net Tue Dec 28 21:56:52 1999 From: paul@prescod.net (Paul Prescod) Date: Tue, 28 Dec 1999 16:56:52 -0500 Subject: [Types-sig] Type checks References: <3867472C.7ECBAC55@prescod.net> <3867F850.14D60FCF@maxtal.com.au> Message-ID: <38693224.490174D1@prescod.net> skaller wrote: > > I think you have to think very carefully about what > constitues an error here: see my posts about errors in python. > It is not acceptable to specify that an exception be thrown. > That would NOT permit an optimiser to elide checks, except > when it could prove they were not needed. If people use the static type check system extensively then it would OFTEN be able to elide the checks. If you use type declarations as aggressively (say) as you would in Java then you should get exactly as many type checks. So I am leaning toward throwing an exception. Paul Prescod From gstein@lyra.org Tue Dec 28 22:04:22 1999 From: gstein@lyra.org (Greg Stein) Date: Tue, 28 Dec 1999 14:04:22 -0800 (PST) Subject: [Types-sig] Type checks In-Reply-To: <38693224.490174D1@prescod.net> Message-ID: On Tue, 28 Dec 1999, Paul Prescod wrote: > skaller wrote: > > I think you have to think very carefully about what > > constitues an error here: see my posts about errors in python. > > It is not acceptable to specify that an exception be thrown. > > That would NOT permit an optimiser to elide checks, except > > when it could prove they were not needed. > > If people use the static type check system extensively then it would > OFTEN be able to elide the checks. If you use type declarations as > aggressively (say) as you would in Java then you should get exactly as > many type checks. So I am leaning toward throwing an exception. Python is also very deterministic. "Implementation-defined" really does not exist. Dunno Guido's policy or leanings on this matter, but I've been assuming that it would remain that way. And that CPython would generally be the reference platform/definition when the language manual is not clear enough. Errors in Python raise exceptions. That is how it is defined, and that is the general style/pattern for the language. Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Tue Dec 28 22:40:22 1999 From: gstein@lyra.org (Greg Stein) Date: Tue, 28 Dec 1999 14:40:22 -0800 (PST) Subject: [Types-sig] const (was: PyDL RFC 0.02) In-Reply-To: <386901D8.9CD61B50@maxtal.com.au> Message-ID: On Wed, 29 Dec 1999, skaller wrote: > Greg Stein wrote: > > On Tue, 28 Dec 1999, Paul Prescod wrote: > > > Greg Stein wrote: > > > > p.s. I'd recommend assignment enforcement over the notion of const; the > > > > former seems to be more easily enforcable at runtime. > > > > > > I think we need both. 
We need to be able to enforce the TYPES of > > > assignments and we need to sometimes say that an object is not > > > modifiable, for all of the things we currently use tuples, files open > > > for read and other read-only objects for. > > > > Um... Are you suggesting that we add a readonly flag to the list and dict > > types? Short of that, I'm not sure how you would do "const". IMO, adding a > > readonly flag to those types seems wrong. > > 'const', IMHO, in Paul's name based model, means the name > cannot be rebound: > > const x = 1 # x is always bound to 1 > > But: > > const x = [] > x.append(1) # fine, x is still bound to the same list > > This does not require a readonly flag, it can be > enforced at compile time (in the absence of 'exec' > statements :-) Please re-read Paul's posts. In the quoted section above, he says we need to say "that an object is not modifiable." In a previous post, he had the following example code: const [ const Array( Integer )] These two points said (to me) that he wanted to disable your second example. I disagree with the notion of add const-ness to objects. I could agree with preventing rebinding (more agreement on preventing external rebinding; less agreement on marking names as not rebindable at all). If Paul means something else, then I'd ask for clarification. Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Wed Dec 29 02:19:26 1999 From: gstein@lyra.org (Greg Stein) Date: Tue, 28 Dec 1999 18:19:26 -0800 (PST) Subject: [Types-sig] PyDL RFC 0.02 In-Reply-To: <19991227175955.B44344@chronis.pobox.com> Message-ID: On Mon, 27 Dec 1999, scott wrote: > On Mon, Dec 27, 1999 at 12:14:41PM -0800, Greg Stein wrote: >... > > Note: the compile-time checking *does* need to know everything that > > happens in the run-time namespaces. It must check the assignments and > > usage of values in the namespaces. > > I don't see how compile-time checking can know much of anything about > runtime-specific namespaces without running code. It doesn't have to run code. Try out the prototype that I posted to this list a few days ago. It can tell you a lot about what, when, and where values are stored into the different namespaces. And it doesn't run code -- it just walks the parse tree. > If it runs code, it > is no longer compile-time checking. Furthermore, if the compile-time > checker assumes that the running of code can do anything it can today, > there's not much of anything that can be checked at compile time to > begin with. You'd be surprised at what it can check :-) The checker can easily track type usage and find things that should not be allowed. The check.py (and friends) that I posted only does a couple things, but the framework is there for more. I just need to start filling stuff in. I went for breadth-first so that people could see what a type checker would look like. > This is why it seems to me that checks done at compile-time must be > done based on a compile-time specific model of the namespaces, and > that model must be more restrictive in naming and scoping usage than > python currently is. Nope. 
I posted an "existence proof" that I believe contradicts this :-) Cheers, -g -- Greg Stein, http://www.lyra.org/ From scott@chronis.pobox.com Wed Dec 29 06:05:07 1999 From: scott@chronis.pobox.com (scott) Date: Wed, 29 Dec 1999 01:05:07 -0500 Subject: [Types-sig] PyDL RFC 0.02 In-Reply-To: References: <19991227175955.B44344@chronis.pobox.com> Message-ID: <19991229010507.A53430@chronis.pobox.com> On Tue, Dec 28, 1999 at 06:19:26PM -0800, Greg Stein wrote: > On Mon, 27 Dec 1999, scott wrote: > > On Mon, Dec 27, 1999 at 12:14:41PM -0800, Greg Stein wrote: > >... > > > Note: the compile-time checking *does* need to know everything that > > > happens in the run-time namespaces. It must check the assignments and > > > usage of values in the namespaces. > > > > I don't see how compile-time checking can know much of anything about > > runtime-specific namespaces without running code. > > It doesn't have to run code. > > Try out the prototype that I posted to this list a few days ago. It can > tell you a lot about what, when, and where values are stored into the > different namespaces. And it doesn't run code -- it just walks the parse > tree. OK, done that. > > > If it runs code, it > > is no longer compile-time checking. Furthermore, if the compile-time > > checker assumes that the running of code can do anything it can today, > > there's not much of anything that can be checked at compile time to > > begin with. > > You'd be surprised at what it can check :-) While there may be a lot of value in walking the parse tree as your checker does, it doesn't seem to do much in terms of what I expect out of a type checker. What I want to be able to do: declare types, and have things which contradict the declarations reported nicely at compile time. A little searching through the code you posted didn't show any clear way to declare types, it just seems to spit out lots of attribute warnings when run it on itself, and it fails to detect anything wrong with the few simple cases I've thrown at it. for example: a = 1 b = "3" a + b yields no warnings, but is an error I'd expect a type checker to understand. def foo(x, y): return x + y foo(2) also yields no warnings, and is something I'd expect a type checker to understand. > > The checker can easily track type usage and find things that should not be > allowed. The check.py (and friends) that I posted only does a couple > things, but the framework is there for more. I just need to start filling > stuff in. I went for breadth-first so that people could see what a type > checker would look like. > > > This is why it seems to me that checks done at compile-time must be > > done based on a compile-time specific model of the namespaces, and > > that model must be more restrictive in naming and scoping usage than > > python currently is. > > Nope. I posted an "existence proof" that I believe contradicts this :-) The checker you posted either falls way short of being able to declare and check static types, or it's sufficiently unclear how to make it do that that I'd only accept existence proof as a series of examples of making it do that. For example, how do you make it check the two examples above properly? How can I declare variable 'a' to be an integer, and then have the checker report something remotely meaningful when I assign a string to the variable 'a' in the same namespace? in another namespace via ``global''? 
scott scott > > Cheers, > -g > > -- > Greg Stein, http://www.lyra.org/ > From gstein@lyra.org Wed Dec 29 06:52:56 1999 From: gstein@lyra.org (Greg Stein) Date: Tue, 28 Dec 1999 22:52:56 -0800 (PST) Subject: [Types-sig] check.py (was: PyDL RFC 0.02) In-Reply-To: <19991229010507.A53430@chronis.pobox.com> Message-ID: On Wed, 29 Dec 1999, scott wrote: >... > While there may be a lot of value in walking the parse tree as your > checker does, it doesn't seem to do much in terms of what I expect out > of a type checker. It isn't done. I thought that I made that very clear. It provides a framework for how to do this work. It tracks expression types, records the types that should be associated with variables, etc. The problem is that it does not yet have a way to declare types. Also, some of the type recording (e.g. for types for an interface) is not yet complete. > What I want to be able to do: declare types, and have things which > contradict the declarations reported nicely at compile time. A little > searching through the code you posted didn't show any clear way to > declare types, To do this, I would need to change the Python grammar, or suck in .pyi files. I plan to do the latter once some kind of formal grammar is specified. If that doesn't happen soon, then I'll be using the grammar that I posted in my type-proposal.html. It is complete and is sufficient (yet Paul seems to be starting from scratch... :-( ). > it just seems to spit out lots of attribute warnings > when run it on itself, and it fails to detect anything wrong with the > few simple cases I've thrown at it. for example: > > a = 1 > b = "3" > a + b > > yields no warnings, but is an error I'd expect a type checker to > understand. > > > def foo(x, y): return x + y > > foo(2) > > also yields no warnings, and is something I'd expect a type checker to > understand. Correct. It does not check these types of errors yet. Try this, however: a = { } a.append(1) b = [ ] b.append(1) You will get an error on that a.append. The attribute does not exist. But it allows the b.append. This demonstrates that it is tracking that "a" is a dictionary and that "b" is a list. Further, it understands that "append" is only defined on a list. The first problem you list "a + b" is because _arith_expr() is not filled in. It does not handle verification of the left/right operands as being compatible with the "+" operator. The second problem (with the foo(2)) is because _check_function_call() is not yet filled in. However, the code *does* know that foo() has two parameters named "x" and "y" (of type "Any" right now). This implies that _check_function_call() has enough information to check the number of arguments and to verify that if you use keywords, they must be "x" or "y". [ but I don't record defaults yet, handle varargs or keyword funcs, or deal with things like: def foo(x, (y, z)):. ] >... > The checker you posted either falls way short of being able to declare > and check static types, It does fall way short. It is a prototype/demo. It is *not* complete. It can be filled in to provide for this -- the necessary structure is there. > or it's sufficiently unclear how to make it do > that that I'd only accept existence proof as a series of examples of > making it do that. Fine. I'll accept that you don't see it as having the future capability to do this. Not a problem, as I'll just work on it some more until it reaches that point. 
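As a concrete illustration of the parse-tree-walking approach described above -- this is not code from check.py, just a toy re-creation in today's Python using the ast module, since the old parser-module interface is long gone -- the sketch records what each name was assigned and warns when an attribute is used that the recorded type does not provide:

    import ast

    class ToyChecker(ast.NodeVisitor):
        """Record the type of simple literal assignments, then warn when an
        attribute is used that the recorded type does not provide."""

        def __init__(self):
            self.types = {}                    # name -> Python type (or None)

        def visit_Assign(self, node):
            value_type = {ast.List: list, ast.Dict: dict}.get(type(node.value))
            if isinstance(node.value, ast.Constant):
                value_type = type(node.value.value)
            for target in node.targets:
                if isinstance(target, ast.Name):
                    self.types[target.id] = value_type
            self.generic_visit(node)

        def visit_Attribute(self, node):
            if isinstance(node.value, ast.Name):
                known = self.types.get(node.value.id)
                if known is not None and not hasattr(known, node.attr):
                    print("warning: %s is a %s; it has no attribute %r"
                          % (node.value.id, known.__name__, node.attr))
            self.generic_visit(node)

    source = "a = {}\na.append(1)\nb = []\nb.append(1)\n"
    ToyChecker().visit(ast.parse(source))      # warns about a.append only

Running it reproduces the dict/list example above: only a.append is flagged, because the tracked type of "a" is dict.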
I feel that it *does* show you can do full namespace tracking without running code (the original issue that stemmed this mini-thread). I believe it also provides a good structure for writing a type-checker (in fact, if somebody else were to write a type-checker, I think it would have so much of the same form that I would recommend against duplication of work; I'd rather see a couple people contributing to the same chunk o' code). > For example, how do you make it check the two > examples above properly? Described above. Fill in _arith_expr() and _check_function_call(). The type information is present, although I need to think of a way to have a TypeDeclarator object say "I can support addition" (at the moment, it can only say "I have attribute"). > How can I declare variable 'a' to be an > integer, We need an external file format and/or to change the grammar. It just isn't possible right now since it is using Python's internal parser. > and then have the checker report something remotely > meaningful when I assign a string to the variable 'a' in the same > namespace? Currently, the checker understands the difference between something being declared, and something having a specified type by virtue of an assignment. It will issue an error for the former case, and allow a redefinition in the latter case. But: since you can't declare something to have a given type, this functionality can't be exercised. But #2: it raises a namespace.TypeMismatchError (and stops) rather than printing an error; I simply need to add the appropriate try/except for that and print the right message. > in another namespace via ``global''? Dunno. I haven't thought about how to handle the "global" statement yet. I suspect that the Namespace class will simply understand that it must delegate certain names to a different namespace; that target namespace will then raise the appropriate error in case of a type mismatch. Cheers, -g -- Greg Stein, http://www.lyra.org/ From scott@chronis.pobox.com Wed Dec 29 11:42:52 1999 From: scott@chronis.pobox.com (scott) Date: Wed, 29 Dec 1999 06:42:52 -0500 Subject: [Types-sig] check.py (was: PyDL RFC 0.02) In-Reply-To: References: <19991229010507.A53430@chronis.pobox.com> Message-ID: <19991229064252.A55464@chronis.pobox.com> On Tue, Dec 28, 1999 at 10:52:56PM -0800, Greg Stein wrote: > On Wed, 29 Dec 1999, scott wrote: > >... > > While there may be a lot of value in walking the parse tree as your > > checker does, it doesn't seem to do much in terms of what I expect out > > of a type checker. > > It isn't done. I thought that I made that very clear. It provides a I was just working under the assumption that if it was a complete framework -- filled in or not, there'd be a way to do things like declare types. Then I threw a couple of off-the-cuff basic things at it, and it didn't do well, so I figured it wasn't done enough to warrant a basic framework to develop on. While I still sortof feel that way, your message has made a lot of what's going on in check.py more clear -- and shows some really cool things about one approach to it all. So we all know that exactly how .pyi info and embedded declarations maps runtime namespaces is a touchy issue -- we can't really account for exec, and there are lots of things which may act odd, such as get/set-attr hooks and global and del what not that can cause some real issues. 
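A few of the constructs being alluded to, in today's Python syntax; each one changes a namespace in a way that no parse-tree walk can see (illustrative only):

    class Config:
        def __getattr__(self, name):      # attributes invented at runtime
            return name.upper()

    print(Config().anything)              # works, yet appears in no declaration

    exec("limit = 10")                    # binds a module-level name invisibly
    print(limit)

    globals()["mode"] = "fast"            # rebinds a global through a plain dict
    del mode                              # ...and unbinds it again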
We also know that whatever inaccuries or mismatches there might be between the picture of runtime namespaces available at compile time and how they work at runtime will probably become a royal pain in the ass down the road. So check.py uses the parser module to gain a pretty darn accurate picture of runtime name spaces via the parse tree. That's a *very* good thing when compared to some half-assed namespace picture that sortof works like you expect, but is bound to blow up in way too many cases and create reams of new faqs down the road. It also has some drawbacks: it's a little awkward to have compile time activity depend so heavily on a module that is optional. Also, compile-time activity IMO is rightly done in C (or Java or whatever) and not in the language that is being interpretted, though prototyping can of course be anything. The module seems to be built primarily for availability from within python, and not so much from the interpreter itself; while the end product seems like it should be (mostly atleast) in the interpreter itself. The list of drawbacks goes on a bit, but all the points rest on one question that I'm not sure of: Does the framework presented in check.py actually depend on the parser module, or is this just a functional relationship that can be met by some reasonable alternative means in the interpreter itself? scott From gstein@lyra.org Wed Dec 29 12:14:15 1999 From: gstein@lyra.org (Greg Stein) Date: Wed, 29 Dec 1999 04:14:15 -0800 (PST) Subject: [Types-sig] check.py In-Reply-To: <19991229064252.A55464@chronis.pobox.com> Message-ID: On Wed, 29 Dec 1999, scott wrote: >... > I was just working under the assumption that if it was a complete > framework -- filled in or not, there'd be a way to do things like > declare types. Then I threw a couple of off-the-cuff basic things at > it, and it didn't do well, so I figured it wasn't done enough to > warrant a basic framework to develop on. If you want to hack some code, you can declare types. The framework is there, you just need to figure out how to do something like: self.ns.declare('x', typedecl.Int) (and then fix the code to handle the exception condition that will arise if you attempt to assign something other than Int to 'x') But your point still stands: it is missing some of the more interesting stuff that is being generated in this SIG lately. But hey... I only hacked on it for a day or two :-) >... > So we all know that exactly how .pyi info and embedded declarations > maps runtime namespaces is a touchy issue -- we can't really account > for exec, and there are lots of things which may act odd, such as > get/set-attr hooks and global and del what not that can cause some > real issues. Right. > We also know that whatever inaccuries or mismatches there might be > between the picture of runtime namespaces available at compile time > and how they work at runtime will probably become a royal pain in the > ass down the road. That would suck. But I'm pretty darn sure that we can figure out at compile-time what 99% of the software out there will do at runtime (in terms of storing values into namespaces). For that other 1%, I'm not sure if we just won't work right, or whether we can at least warn the person that we won't work as expected. [ in other words, there are times when we can detect a bad situation, but can't do anything about it. other times, we just outright fail :-) ] > So check.py uses the parser module to gain a pretty darn accurate > picture of runtime name spaces via the parse tree. 
That's a *very* > good thing Yup. And yah, I think so :-) > when compared to some half-assed namespace picture that > sortof works like you expect, but is bound to blow up in way too many > cases and create reams of new faqs down the road. Quite true. I wouldn't think of doing it some other way. > It also has some drawbacks: it's a little awkward to have compile > time activity depend so heavily on a module that is optional. Well, we actually use just a single function from the parser module. And the underlying C code is quite simplistic. Most of the parser module actually deals with building AST nodes from Python and passing the result to the Python bytecode compiler. You could really view the parser module as an interface to two things: to the parser output, and to the compiler input. We just want the parser output. Optional? The module may be, but the parser itself isn't :-). The parser is enabled by default in recent distributions. Some code shifting or other structural changes could ensure that we always have access to parser output. We could also just say "type checking not available unless the parser module is built." > Also, > compile-time activity IMO is rightly done in C (or Java or whatever) > and not in the language that is being interpretted, though prototyping > can of course be anything. 1) This isn't necessarily a compile-time activity. It could be an external tool that is occasionally run. We could also argue semantics and say that type-checking isn't part of compilation (since the output is not necessarily used/consumed by the compilation step). 2) I disagree that it is "rightly done in C", but recognize the "IMO" you inserted there :-). I see no issue whatsoever in using Python as part of the Python runtime environment. In fact, I would hope that Python 1.6 allows you to write its parser and compiler entirely in Python. The only C code would be the builtin types and the VM. > The module seems to be built primarily for > availability from within python, and not so much from the interpreter > itself; while the end product seems like it should be (mostly atleast) > in the interpreter itself. I'm not sure that I understand the basis of this perception. However, I don't really need to, I think... we can certainly restructure some of the interfaces to make it follow whatever requirements/pattern that you're thinking of. > The list of drawbacks goes on a bit, but all the points rest on one > question that I'm not sure of: Does the framework presented in > check.py actually depend on the parser module, or is this just a > functional relationship that can be met by some reasonable alternative > means in the interpreter itself? If you integrate the thing directly into the interpreter, then the need for the parser module doesn't exist. The parser module is just a Python API for the internal C API to the parser -- the interpreter definitely has that access. But again: I would disagree with the notion of integrating it tightly into the interpreter. check.py is currently sitting at 929 lines of code. My historical yardstick says this would expand to 9290 lines of C code -- for its CURRENT form. I believe that check.py is going to get bigger once all those missing expression handling checks are inserted. Maybe 2000 to 3000 lines of Python. Dropping that into C increases your bug count and reduces flexibility/maintenance. But hey... as I mentioned to Paul a week ago or so: feel free to code a type-checker in C. I won't stop you. 
But I can guarantee that a Python version will be ready before yours :-). And when the SIG comes up with additional, nifty rules to check for... the Python version will have them implemented much faster. In fact, people could very well present the new rules as patches to the type-checker. Cheers, -g -- Greg Stein, http://www.lyra.org/ From paul@prescod.net Wed Dec 29 13:14:14 1999 From: paul@prescod.net (Paul Prescod) Date: Wed, 29 Dec 1999 08:14:14 -0500 Subject: [Types-sig] PyDL RFC 0.02 References: <38687712.1F7E3714@prescod.net> <3868E5F7.3B52D1ED@maxtal.com.au> Message-ID: <386A0926.84689DCE@prescod.net> skaller wrote: > > > > > And > > > the combination rules might be a bit hard to describe or handle (from the > > > human's standpoint). > > > > It's just concatenation! There is nothing hard about it. > > If that is so, please give the rules. > In particular, you will need to cover the issue of duplicate > declarations. In c and C++ in particular, these issues > turned out to be non-trivial. The RFC says that the rules for duplicate and conflicting declarations between a .pyi and a .gpi are the same as those within a .pyi. The issue is simply orthogonal. Paul Prescod From paul@prescod.net Wed Dec 29 13:16:51 1999 From: paul@prescod.net (Paul Prescod) Date: Wed, 29 Dec 1999 08:16:51 -0500 Subject: [Types-sig] Interface files References: <3867472C.7ECBAC55@prescod.net> <3867F850.14D60FCF@maxtal.com.au> <38687F76.47B415C7@prescod.net> <3868E7C1.665CD6AC@maxtal.com.au> Message-ID: <386A09C3.8EF9153D@prescod.net> skaller wrote: > > > If the normative spec. is in terms of interface files then we can deal > > with various situations through transformation TO interface files: > > Yes, this is possible to some extent, but not totally: > first, the grammar needs to be compatible with existing > python to permit embedding in the first place, That's why we preced everything with "decl" or "typedef" and thus get our own sublanguage. > and secondly > _references_ to typedecls seem to require embedding, > even if the decls themselves are not embedded. True enough. The references are going to be dotted names which Python will look for at runtime. Paul Prescod From paul@prescod.net Wed Dec 29 13:48:46 1999 From: paul@prescod.net (Paul Prescod) Date: Wed, 29 Dec 1999 08:48:46 -0500 Subject: [Types-sig] PyDL RFC 0.02 References: <19991227175955.B44344@chronis.pobox.com> <19991229010507.A53430@chronis.pobox.com> Message-ID: <386A113E.2FF6EE79@prescod.net> scott wrote: > > What I want to be able to do: declare types, and have things which > contradict the declarations reported nicely at compile time. A little > searching through the code you posted didn't show any clear way to > declare types, it just seems to spit out lots of attribute warnings > when run it on itself, and it fails to detect anything wrong with the > few simple cases I've thrown at it. The important thing is that Greg's code (presumably!) knows how to propogate types around expressions and suites like: j = k or q(foo() and bar()) That's quite an accomplishment considering how quickly he coded it. 
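For what it is worth, the usual way a checker propagates types through "or"/"and" is to take the union of the operand types, since the expression may evaluate to either operand. A tiny sketch with invented type sets (this is not how check.py actually represents types):

    def bool_op_type(left_types, right_types):
        # 'left or right' (and 'left and right') may evaluate to either
        # operand, so the inferred type is the union of the operand types.
        return left_types | right_types

    k_type     = {"Int"}
    foo_result = {"String"}
    bar_result = {"None", "List"}
    q_result   = {"Int"}                  # pretend q() is declared to return Int

    # j = k or q(foo() and bar())
    q_argument = bool_op_type(foo_result, bar_result)   # String | None | List
    j_type     = bool_op_type(k_type, q_result)         # {'Int'}
    print(sorted(q_argument), sorted(j_type))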
Paul Prescod From skaller@maxtal.com.au Wed Dec 29 16:05:37 1999 From: skaller@maxtal.com.au (skaller) Date: Thu, 30 Dec 1999 03:05:37 +1100 Subject: [Types-sig] const (was: PyDL RFC 0.02) References: Message-ID: <386A3151.8F5AFCD7@maxtal.com.au> Greg Stein wrote: > > 'const', IMHO, in Paul's name based model, means the name > > cannot be rebound: > > > > const x = 1 # x is always bound to 1 > > > > But: > > > > const x = [] > > x.append(1) # fine, x is still bound to the same list > > > > This does not require a readonly flag, it can be > > enforced at compile time (in the absence of 'exec' > > statements :-) > > Please re-read Paul's posts. In the quoted section above, he says we need > to say "that an object is not modifiable." I know, but that is my point: it isn't consistent with a model in which checking is applied to _names_: we'd need to model access like in C/C++ with pointers. This is pervasive, and it doesn't seem to me to sit well with optional declarations. Declaring a name non-rebindable on the other hand fits well with current semantics (a function cannot rebind non-local names unless declared 'global') -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From skaller@maxtal.com.au Wed Dec 29 16:07:53 1999 From: skaller@maxtal.com.au (skaller) Date: Thu, 30 Dec 1999 03:07:53 +1100 Subject: [Types-sig] Type checks References: Message-ID: <386A31D9.A29EA65E@maxtal.com.au> Greg Stein wrote: > Python is also very deterministic. "Implementation-defined" really does > not exist. I agree, more or less. There is some indeterminism with bitwise operators (depends on the underlying C implementation, which sucks :-) > Errors in Python raise exceptions. That is how it is defined, and that is > the general style/pattern for the language. Not true for assertions. And type constraints are assertions. -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From skaller@maxtal.com.au Wed Dec 29 16:17:59 1999 From: skaller@maxtal.com.au (skaller) Date: Thu, 30 Dec 1999 03:17:59 +1100 Subject: [Types-sig] PyDL RFC 0.02 References: <19991227175955.B44344@chronis.pobox.com> <19991229010507.A53430@chronis.pobox.com> <386A113E.2FF6EE79@prescod.net> Message-ID: <386A3437.741BCC93@maxtal.com.au> Paul Prescod wrote: > The important thing is that Greg's code (presumably!) knows how to > propogate types around expressions and suites like: > > j = k or q(foo() and bar()) > > That's quite an accomplishment considering how quickly he coded it. I agree. More to the point, Greg says this is primarily a framework -- clearly, it is currently a pretty lousy checker, but there's scope to add rules to improve it. The same is true of the code I'm doing for the cgen_module function in Viper: it generates pretty lousy code at the moment -- but that can be fixed later. What's important at first is a working implementation that covers the territory. 
-- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From skip@mojam.com (Skip Montanaro) Wed Dec 29 17:11:02 1999 From: skip@mojam.com (Skip Montanaro) (Skip Montanaro) Date: Wed, 29 Dec 1999 11:11:02 -0600 (CST) Subject: [Types-sig] check.py (was: PyDL RFC 0.02) In-Reply-To: <19991229064252.A55464@chronis.pobox.com> References: <19991229010507.A53430@chronis.pobox.com> <19991229064252.A55464@chronis.pobox.com> Message-ID: <14442.16550.539070.199835@dolphin.mojam.com> scott> It also has some drawbacks: it's a little awkward to have compile scott> time activity depend so heavily on a module that is optional. scott> Also, compile-time activity IMO is rightly done in C (or Java or scott> whatever) and not in the language that is being interpretted, scott> though prototyping can of course be anything. The module seems scott> to be built primarily for availability from within python, and scott> not so much from the interpreter itself; while the end product scott> seems like it should be (mostly atleast) in the interpreter scott> itself. There's no reason the parser module needs to always be optional. Also, coding the thing in Python makes sense for programmer productivity reasons. Over time, as type information gets known more completely, tools like Greg's & Bill's Python2C will be able to convert that code into fairly efficient C. Skip Montanaro | http://www.mojam.com/ skip@mojam.com | http://www.musi-cal.com/ 847-971-7098 | Python: Programming the way Guido indented... From gstein@lyra.org Wed Dec 29 20:17:30 1999 From: gstein@lyra.org (Greg Stein) Date: Wed, 29 Dec 1999 12:17:30 -0800 (PST) Subject: [Types-sig] const (was: PyDL RFC 0.02) In-Reply-To: <386A3151.8F5AFCD7@maxtal.com.au> Message-ID: On Thu, 30 Dec 1999, skaller wrote: > Greg Stein wrote: > > > 'const', IMHO, in Paul's name based model, means the name > > > cannot be rebound: > > > > > > const x = 1 # x is always bound to 1 > > > > > > But: > > > > > > const x = [] > > > x.append(1) # fine, x is still bound to the same list > > > > > > This does not require a readonly flag, it can be > > > enforced at compile time (in the absence of 'exec' > > > statements :-) > > > > Please re-read Paul's posts. In the quoted section above, he says we need > > to say "that an object is not modifiable." > > I know, but that is my point: it isn't consistent > with a model in which checking is applied to _names_: > we'd need to model access like in C/C++ with pointers. > This is pervasive, and it doesn't seem to me to sit well > with optional declarations. Declaring a name non-rebindable > on the other hand fits well with current semantics > (a function cannot rebind non-local names unless declared 'global') Then we are in agreement. Paul say "readonly objects." I said no. You "explained" Paul's point and said no. I said the explanation wasn't necessary because I had agreed with you and said no. Fun, huh? :-) Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Wed Dec 29 20:21:32 1999 From: gstein@lyra.org (Greg Stein) Date: Wed, 29 Dec 1999 12:21:32 -0800 (PST) Subject: [Types-sig] Type checks In-Reply-To: <386A31D9.A29EA65E@maxtal.com.au> Message-ID: On Thu, 30 Dec 1999, skaller wrote: > Greg Stein wrote: > > Python is also very deterministic. "Implementation-defined" really does > > not exist. > > I agree, more or less. 
There is some indeterminism with > bitwise operators (depends on the underlying C implementation, > which sucks :-) If this is the case, then let Guido know. He has generally taken the pain to ensure that cases like this just don't exist. > > Errors in Python raise exceptions. That is how it is defined, and that is > > the general style/pattern for the language. > > Not true for assertions. > And type constraints are assertions. Stop being a nit-pick. But since you are, let me rephrase: [In general,] errors in Python raise exceptions. [This is the pattern used for all errors. One error, AssertionError, as raised by the "assert" statement will not be raised by the compiler in "debug" mode, or in code generated when optimization is enabled.] Essentially, even the assert statement is rigidly defined. I strongly believe that type assertions would follow the exact pattern of regular assertions. -g -- Greg Stein, http://www.lyra.org/ From paul@prescod.net Wed Dec 29 16:49:31 1999 From: paul@prescod.net (Paul Prescod) Date: Wed, 29 Dec 1999 11:49:31 -0500 Subject: [Types-sig] RFC Comments References: <38689428.4953374D@prescod.net> <3868F998.594C4EEF@maxtal.com.au> Message-ID: <386A3B9B.43E94DB9@prescod.net> skaller wrote: > > I do consider your RFC's prototypes -- but they're already > quite good specifications, so it is already time to try to tighten them > up. > IMHO. Doing this will also help uncover ambiguities and problems, > that 'loose' wording will cover up. I'm thinking that my current work will evolve into a tutorial and the spec will be separate. I'm a believer in short, formal specs and long, wordy, expanatory tutorials. > No, you mentioned it in point 5. :-) Okay, you win. Const is out for now. > So: we have to be able to load an interface, without > that necessarily implying the module be imported. I see why you would sometimes only care about interfaces and not about implementations but I do not see what it harms to do a "real import" of the module. In languages like Java and C++ you might import a package or header file only for interfaces. I guess I need to know the difference which module-import semantics you are trying to avoid. > On the other hand, in an _interface_ file, we cannot > import anything: importing implies run time code > generation, to bind a name to a module object. Okay, but if it can be shown that no code is ever executed from the module then you don't have to generate that code. > The reason I chose "!" for argument declarations was that it > was already being used in similar way for the _expression_: > > y = x ! t > > and in this context, ":" cannot be used. Right. My RFC uses a function call syntax inline. It seems more Pythonic and can cause no no precedence confusion. It is also compatible with the Python 1.5.x grammar. > OK. You should proceed with _some_ fixed syntax. I used "as" everywhere else. The colon was just a lapse. > My argument is something like this: I'm lost in this subthread. I never understood what change you were proposing. Can we start again? Paul Prescod From paul@prescod.net Wed Dec 29 16:49:27 1999 From: paul@prescod.net (Paul Prescod) Date: Wed, 29 Dec 1999 11:49:27 -0500 Subject: [Types-sig] check.py References: Message-ID: <386A3B97.42AEB426@prescod.net> Greg Stein wrote: > > ... > > 2) I disagree that it is "rightly done in C", but recognize the "IMO" you > inserted there :-). I see no issue whatsoever in using Python as part > of the Python runtime environment. 
In fact, I would hope that Python > 1.6 allows you to write its parser and compiler entirely in Python. The > only C code would be the builtin types and the VM. I agree with you Greg. By Python 2 we may have sufficient performance that the "standard" compiler can be written in Python. Paul Prescod From paul@prescod.net Wed Dec 29 16:49:29 1999 From: paul@prescod.net (Paul Prescod) Date: Wed, 29 Dec 1999 11:49:29 -0500 Subject: [Types-sig] const (was: PyDL RFC 0.02) References: Message-ID: <386A3B99.74EE9F6E@prescod.net> It is quite possible that at some point I used const inconsistently to mean both "non-rebindable name" and "immutable." Greg is right that I was thinking about the latter when I put it in. In the long run we really need both, but I will remove them from version 1 for now. For now, we just need a solid definition of what types of rebinding are legal. There are four kinds of names: * module -- we must always disallow rebinding these because we don't have a notion of two modules with the "same interface". Maybe in some future version we could. * class -- rebinding is fine as long as the new class has a signuture that will produce instances that conform to the declared interface(s). * functions and other objects -- rebinding is fine as long as the new function conforms to the declared interface. Paul Prescod From paul@prescod.net Wed Dec 29 16:49:33 1999 From: paul@prescod.net (Paul Prescod) Date: Wed, 29 Dec 1999 11:49:33 -0500 Subject: [Types-sig] Type checks References: Message-ID: <386A3B9D.228BD31B@prescod.net> Greg Stein wrote: > > Python is also very deterministic. "Implementation-defined" really does > not exist. > > Dunno Guido's policy or leanings on this matter, but I've been assuming > that it would remain that way. And that CPython would generally be the > reference platform/definition when the language manual is not clear > enough. To faithfully represent CPython, an optimizing compiler would need to silently compile: jfoieawjofij fewajofijeawofj fjowiaejfowei to: raise SyntaxError So yes, we do have to allow some flexibility to the implementor. I agree with Greg that as much as possible we should try to keep the undefined stuff at *compile time* instead of runtime. Paul Prescod From paul@prescod.net Wed Dec 29 16:49:30 1999 From: paul@prescod.net (Paul Prescod) Date: Wed, 29 Dec 1999 11:49:30 -0500 Subject: Duplicate declarations Re: [Types-sig] PyDL RFC 0.02 References: <3868EBEB.BDAFABE0@maxtal.com.au> Message-ID: <386A3B9A.C8DFE00D@prescod.net> skaller wrote: > > 1) declarations can be embedded. > 2) declarations can also be given in a separate file > 3) Processing module X commences by loading the > separate interface file > 4) Next, the .py file is scanned for declarations > 5) The results of (3) and (4) are merged somehow > 6) The .py files is scanned again by the code generator Agreed. > We must make some decisions here. > > Question: what happens if a typedecl kind of name is > declared more than once? > > Partial Answer 1: This must be permitted, because this > is _exactly_ what will happen if the .pi file is generated > from the .py file by scanning for declarations. In my model: Human creates .pi Human creates .py Type extractor scans .py and generates .gpi Type checker reads .pi and .gpi So we have no problem with the same declaration being read twice. Thus I would say that for version 1 we should ban duplicate declarations. 
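A minimal sketch of that merge step, for illustration only: the declaration format is just the RFC's plain "decl NAME as EXPR" lines, while the function names, the error handling, and the line-based parsing are invented here and are not part of the proposal.

    # Hypothetical sketch: combine hand-written (.pi) and generated (.gpi)
    # declarations, banning duplicate names as proposed for version 1.

    def read_decls(path):
        # Read "decl NAME as EXPR" lines into a {name: expr} mapping.
        decls = {}
        for line in open(path):
            line = line.strip()
            if line.startswith("decl "):
                parts = line[len("decl "):].split(" as ", 1)
                if len(parts) == 2:
                    decls[parts[0].strip()] = parts[1].strip()
        return decls

    def merge_interfaces(pi_path, gpi_path):
        pi = read_decls(pi_path)      # hand-written .pi declarations
        gpi = read_decls(gpi_path)    # declarations extracted from the .py
        duplicates = [name for name in gpi.keys() if name in pi]
        if duplicates:
            # version 1 rule: duplicate declarations are simply banned
            raise ValueError("duplicate declarations: %s" % ", ".join(duplicates))
        merged = {}
        merged.update(pi)
        merged.update(gpi)
        return merged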
Paul Prescod From paul@prescod.net Wed Dec 29 16:49:28 1999 From: paul@prescod.net (Paul Prescod) Date: Wed, 29 Dec 1999 11:49:28 -0500 Subject: [Types-sig] check.py (was: PyDL RFC 0.02) References: Message-ID: <386A3B98.35DF5735@prescod.net> Greg Stein wrote: > > ... > > To do this, I would need to change the Python grammar, or suck in .pyi > files. I plan to do the latter once some kind of formal grammar is > specified. If that doesn't happen soon, then I'll be using the grammar > that I posted in my type-proposal.html. It is complete and is sufficient > (yet Paul seems to be starting from scratch... :-( ). My syntax is mostly based on your web page. I switched "!" for "as" based on my belief that it isn't Pythonic to use random keyboard characters in ways that are not universally understood. And I put decl and typedecl at the front instead of making them operators because I agree with Tim Peters that we are designing a sub-language that needs to be understood as being separate by virtue of being evaluated BEFORE the code is executed. It is my personal opinion that the grammar should be the last thing you integrate into your system. In order to avoid maintaining a whole compiler while the grammar shifts, I would suggest you define classes like this: class ParameterizedInterface: .... class ConcreteInterface: .... class MethodSignature: and so forth. You need these classes regardless. Then your interface file becomes: Array = new ParamterizedInterface( parameters=["elements", "array"], attributes=[new MethodSignature( arguments=... )] ) We need this API anyhow so it would help alot if you could design it while you are writing your package. Paul Prescod From skaller@maxtal.com.au Wed Dec 29 16:17:59 1999 From: skaller@maxtal.com.au (skaller) Date: Thu, 30 Dec 1999 03:17:59 +1100 Subject: [Types-sig] PyDL RFC 0.02 References: <19991227175955.B44344@chronis.pobox.com> <19991229010507.A53430@chronis.pobox.com> <386A113E.2FF6EE79@prescod.net> Message-ID: <386A3437.741BCC93@maxtal.com.au> Paul Prescod wrote: > The important thing is that Greg's code (presumably!) knows how to > propogate types around expressions and suites like: > > j = k or q(foo() and bar()) > > That's quite an accomplishment considering how quickly he coded it. I agree. More to the point, Greg says this is primarily a framework -- clearly, it is currently a pretty lousy checker, but there's scope to add rules to improve it. The same is true of the code I'm doing for the cgen_module function in Viper: it generates pretty lousy code at the moment -- but that can be fixed later. What's important at first is a working implementation that covers the territory. -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From skaller@maxtal.com.au Wed Dec 29 23:51:24 1999 From: skaller@maxtal.com.au (skaller) Date: Thu, 30 Dec 1999 10:51:24 +1100 Subject: [Types-sig] Type checks References: Message-ID: <386A9E7C.C170AC5C@maxtal.com.au> Greg Stein wrote: > > On Thu, 30 Dec 1999, skaller wrote: > > Greg Stein wrote: > > > Python is also very deterministic. "Implementation-defined" really does > > > not exist. > > > > I agree, more or less. There is some indeterminism with > > bitwise operators (depends on the underlying C implementation, > > which sucks :-) > > If this is the case, then let Guido know. He has generally taken the pain > to ensure that cases like this just don't exist. I agree. 
[I gave up 'advising' Guido on most things some time ago] There's more 'indeterminism' or 'unspecified behaviour' in Python than you might think -- although it is hard to say, since the specification is not itself entirely precise, being worded somewhat informally. > > > Errors in Python raise exceptions. That is how it is defined, and that is > > > the general style/pattern for the language. > > > > Not true for assertions. > > And type constraints are assertions. > > Stop being a nit-pick. Why? Programs are executed by deterministic electro-mechanical automata. They picky. Worse, language specifications describe formal systems, which are also sensitive to nits. :-) >But since you are, let me rephrase: > > [In general,] errors in Python raise exceptions. [This is the pattern used > for all errors. One error, AssertionError, as raised by the "assert" > statement will not be raised by the compiler in "debug" mode, or in code > generated when optimization is enabled.] Nit: usually, there's no such thing as an 'error' in Python. > Essentially, even the assert statement is rigidly defined. I strongly > believe that type assertions would follow the exact pattern of regular > assertions. So do I. But I don't agree that the _formal_ semantics of the language define the behaviour of an assertion when it fails. The behaviour when it succeeds is 'none'. Of couse, since the 'reference manual' is not as formal as it might be, my belief is not backed up by a formal specification. -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From skaller@maxtal.com.au Thu Dec 30 00:06:21 1999 From: skaller@maxtal.com.au (skaller) Date: Thu, 30 Dec 1999 11:06:21 +1100 Subject: [Types-sig] check.py (was: PyDL RFC 0.02) References: <386A3B98.35DF5735@prescod.net> Message-ID: <386AA1FD.C726C130@maxtal.com.au> Paul Prescod wrote: > > My syntax is mostly based on your web page. I switched "!" for "as" > based on my belief that it isn't Pythonic to use random keyboard > characters in ways that are not universally understood. Then you had better think again. 'as' is an ENGLISH word. English is not 'universally' understood. > It is my personal opinion that the grammar should be the last thing you > integrate into your system. I don't agree. This viewpoint is equivalent to the old-fashioned notion of top down analysis, in which a design is completed in every detail before it is implemented. Object oriented programming is utterly contrary to this paradigm, being bottom up: it specifies bottom up development, with early coding of low level design parts. To be more concrete, what you are saying would require a vast amount of human brain power, instead of permitting early, partial implementations, which would allow machines to aid in our analysis. I have implemented 'x!t' in Viper, and then, later, I implemented 'def f(x!t)' -- the uses require grammar modifications in different places and are technically distinct. As I am now implementing a C code generator, I am noticing the effects of the optional typing on a compiler (although I'm not actually using the information yet). In particular, since my implementation is entirely dynamic, it fits well with cgen_module, which uses an already loaded module. 
I have not tried a static compiler which 'parses' text to generate code yet, but I suspect this will make my dynamic interpretation difficult to implement -- on the other hand, Greg Stein HAS tried this kind of tool -- and so I'd like to hear from him what the impact of the 'at run time' meaning would be, if he has looked at this. >In order to avoid maintaining a whole > compiler while the grammar shifts, I would suggest you define classes > like this: > > class ParameterizedInterface: > .... > > class ConcreteInterface: > .... > > class MethodSignature: > > and so forth. You need these classes regardless. I don't. My implementation is ML based, and the static compilations tools are likely to use native constructions not Python ones .. although I'm not sure. And in order to implement _anything_ I need a starting point, which is a formal grammar for the syntax extensions. -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From skaller@maxtal.com.au Thu Dec 30 00:11:46 1999 From: skaller@maxtal.com.au (skaller) Date: Thu, 30 Dec 1999 11:11:46 +1100 Subject: [Types-sig] check.py References: <386A3B97.42AEB426@prescod.net> Message-ID: <386AA342.A9AF80AA@maxtal.com.au> Paul Prescod wrote: > I agree with you Greg. By Python 2 we may have sufficient performance > that the "standard" compiler can be written in Python. Irrelevant. What is relevant is: a) the compiler uses efficient algorithms b) it is powerful enough to compile itself c) it generates efficient code In which case a Python written compiler can be used to generate fast code fast, for any Python code, including itself, by compiling itself. I personally don't believe Python 2 will have much better performance than 1.5, because I don't think Guido will add the features, and write the specification, in such a way that high performance is possible. [It would be too 'unpythonic' :-[ -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From skaller@maxtal.com.au Thu Dec 30 00:38:14 1999 From: skaller@maxtal.com.au (skaller) Date: Thu, 30 Dec 1999 11:38:14 +1100 Subject: [Types-sig] RFC Comments References: <38689428.4953374D@prescod.net> <3868F998.594C4EEF@maxtal.com.au> <386A3B9B.43E94DB9@prescod.net> Message-ID: <386AA976.F268268@maxtal.com.au> Paul Prescod wrote: > > > No, you mentioned it in point 5. :-) > > Okay, you win. Const is out for now. Make an appendix for 'to be considered' items. > > So: we have to be able to load an interface, without > > that necessarily implying the module be imported. > > I see why you would sometimes only care about interfaces and not about > implementations but I do not see what it harms to do a "real import" of > the module. I do, I will try to explain, but the reason is that it is inconsistent with the compilation module you specified, at least as I understand it. The way I undertand it, a new-fangled python language translator is required to behave 'as if' two passes are performed on script: the first pass gleans static type information, but generates no executable code, while the second generates executable code. In this 'two pass' model, it is inconsistent to 'import' a module in pass 1, since 'importing' a module requires a recursive tranlation pass involving TWO passes, and we know that the second pass can even involve recursive module execution. 
So it isn't _possible_ to import a module during pass 1. It won't work. It _is_ possible to import only the interface of a module, and this should be done when 'import X' is seen. In pass two, a full two pass importation is triggered, but the interface loading is skipped because the interface is already loaded. However, the import still requires TWO passes DURING PASS 2, because the implementation file may also include inline declarations. It follows FROM THE MODEL that these declarations are effectively private. You _could_ change the detail of the model which makes that so, to perform pass 1 on the implementation file during pass 1. I'm not sure what the impact is, or what your intent is. But one thing you cannot do is actually import a module during pass 1. Summmary: pass 1 processing only permits pass 1 processing to occur recursively, whereas pass 2 imports may invoke a full two phase translation. So because the semantics of importing a module (two passes) are quite distinct from only importing interfaces, and even that has two possible variants, it seems useful, if not essential, to permit a pass 1 only kind of importation -- 'include'. > > The reason I chose "!" for argument declarations was that it > > was already being used in similar way for the _expression_: > > > > y = x ! t > > > > and in this context, ":" cannot be used. > > Right. My RFC uses a function call syntax inline. It seems more Pythonic > and can cause no no precedence confusion. It is also compatible with the > Python 1.5.x grammar. Yes, but it cannot be used for function parameters. This IS an 'inline' use, even though it is distinct from a run time expression check (being applied to the parameter, syntactically, not the argument). > I used "as" everywhere else. The colon was just a lapse. OK. I like '!' because it is terse. 'as' requires four characters (two spaces are needed). This will clutter function definitions: def f(self as X, x as X, y as X) .. def f(self!X, x!X, y!X) but my taste is only a minor point here, I'll run with 'as' if that is the final choice. But your use of a function call like: interface_check(x,i) for a run time test is not as simple as reusing "!" for the same purpose: you could specify (x as i) be allowed in an expression instead -- I know this cannot be ambiguous, because 'as' is a keyword, and so is '!', and I have implemented the latter. -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From skaller@maxtal.com.au Thu Dec 30 01:03:55 1999 From: skaller@maxtal.com.au (skaller) Date: Thu, 30 Dec 1999 12:03:55 +1100 Subject: [Types-sig] Conformance model References: <386A3B9D.228BD31B@prescod.net> Message-ID: <386AAF7B.9EA4E3F3@maxtal.com.au> Paul Prescod wrote: > To faithfully represent CPython, an optimizing compiler would need to > silently compile: > > jfoieawjofij > fewajofijeawofj > fjowiaejfowei > > to: > > raise SyntaxError Precisely. In particular, assuming the name 'silly' for the above module: # module X try: import silly x = 0 except SyntaxError: x = 0 .. more code using x .. clearly indicates why this is necessary with the current specification, and why it is a bad specification to optimise. Indeed, it is lucky that 'SyntaxError' is a properly of a whole file so that: try: lkjhglkjhsdf lkhslkjhsdf except SyntaxError: .... raises a SyntaxError which is NOT trapped by the except clause (since SyntaxErrors are raised by the compiler, and apply to the WHOLE file). 
My argument is that this is what should be specified for some other 'errors' in some contexts. Given that python is dynamic, my argument is that, say, for a type error, it might make sense to ALLOW: try: 1 + "Hello" except TypeError: pass but mandate that def f(x): 1 + "Hello" is not a valid Python program -- a compiler can reject the program, rather than being forced to implement: def f(x): raise TypeError which is what is currently required. in particular, most people who use compilers RELY on the compiler, when it rejects code, NOT providing a .o file, which prevents the linker linking the code, which prevents the 'erroneous' code actually being run. C/C++ compilers are not required to do this, but they're not prevented from doing it either. A Python compiler would be, if we do not modify the semantics. At least as I see it, Greg in particular is not seeing this project the way I am -- I see that we are TRYING to make python LESS dynamic: Guido never intended it to be as dynamic as it is. That is, we DO NOT WANT to actually preserve the existing semantics. We WANT to break some naughty programs. It's desirable. It's clear Guido agrees. He has been the leader in suggesting changes to the specification supporting this, including backing up, if not actually first suggesting, the idea that module.value = x should be disallowed. You just can't sensibly talk about adding static typing to a language, without also saying that some text is not IN the language: it is ill-formed, invalid, an error, or just plain NOT PYTHON. In fact, the more stuff we ban the better, the condition we must observe is that we do not break TOO much sensible code. Deciding exactly what should be 'banned' is not easy, but it is not even possible if don't first agree that things like hgkjgaskhgad khgsdfkhgsdfkhg are not in fact valid Python programs. BTW: this tract belongs in a thread marked 'conformance model', rather than being mixed up with the static typing specification as given in Paul's RFC. There is a relation, but I'd not like the conformance issues to muddle, say, specification of the static typing grammar extensions. -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From skaller@maxtal.com.au Thu Dec 30 01:23:16 1999 From: skaller@maxtal.com.au (skaller) Date: Thu, 30 Dec 1999 12:23:16 +1100 Subject: [Types-sig] const (was: PyDL RFC 0.02) References: Message-ID: <386AB404.928FCFF7@maxtal.com.au> Greg Stein wrote: > Then we are in agreement. Oh dear, and I cannot think of any way out of this. I'm agreeing with someone. Someone is agreeing with me. For the moment :-> -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From skaller@maxtal.com.au Thu Dec 30 01:36:29 1999 From: skaller@maxtal.com.au (skaller) Date: Thu, 30 Dec 1999 12:36:29 +1100 Subject: [Types-sig] const (was: PyDL RFC 0.02) References: <386A3B99.74EE9F6E@prescod.net> Message-ID: <386AB71D.CFF4DE70@maxtal.com.au> Paul Prescod wrote: > It is quite possible that at some point I used const inconsistently to > mean both "non-rebindable name" and "immutable." Greg is right that I > was thinking about the latter when I put it in. In the long run we > really need both, but I will remove them from version 1 for now. Hang on! 
removing 'const' is not the same as adding a restriction preventing certain name rebindings -- as Guido pointed out in response to one of my posts, it is possible to detect many rebindings, and it is possible to ban some -- without needing a 'const' specification. > For now, we just need a solid definition of what types of rebinding are > legal. There are four kinds of names: No. There is only one kind of name. Perhaps you mean 'a name can be bound to one of four kinds of object'? > * module -- we must always disallow rebinding these because we don't > have a notion of two modules with the "same interface". Maybe in some > future version we could. I'm not so sure. Consider: import m module_x = m.x module_x = m.y Here, 'module_x' is a name bound to a module, namely m.x, which is rebound to another module, m.y. Module objects can be accessed just like any other python object. And in this case, you may not even KNOW that 'module_x' is bound to a module object. I'm thinking that your intent is that the name 'module_x' be declared as a module (with some properties), but then .. > * class -- rebinding is fine as long as the new class has a signuture > that will produce instances that conform to the declared interface(s). .. should be the same as for modules. Rebinding is allowed, provided the constraints implied by a declaration the name conforms to a particular interface are met. > * functions and other objects -- rebinding is fine as long as the new > function conforms to the declared interface. I think a better and simpler rule is to be found by simply taking this rule without the words 'functions and other' -- i.e. the rule applies uniformly to all objects. [not sure though] BUT: the is an important reason to do more, namely, caching of module functions and class methods. In this case, merely requiring interface conformance is not enough: we'd actually want to prevent rebinding. This can be done by banning: x.f = g where x is a module or class: it is still permitted where x is an instance. (Other cases??) -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From skaller@maxtal.com.au Thu Dec 30 01:48:20 1999 From: skaller@maxtal.com.au (skaller) Date: Thu, 30 Dec 1999 12:48:20 +1100 Subject: Duplicate declarations Re: [Types-sig] PyDL RFC 0.02 References: <3868EBEB.BDAFABE0@maxtal.com.au> <386A3B9A.C8DFE00D@prescod.net> Message-ID: <386AB9E4.74E286B0@maxtal.com.au> Paul Prescod wrote: > In my model: > > Human creates .pi > Human creates .py > Type extractor scans .py and generates .gpi > Type checker reads .pi and .gpi > > So we have no problem with the same declaration being read twice. Thus I > would say that for version 1 we should ban duplicate declarations. But you have not addressed the possibility that the .pi and .gpi contain a declaration for the same name: more precisely, at this point the above description does not describe exactly how the set of declarations (interfaces) is constructed. Presumably, the following axiom holds: name in names if name in (gpi xor pi) where we're talking about sets. What happens if: name in (gpi and pi) Here, there are TWO declarations of the same name. I don't think you can ban this, because it is not only likely to be a common case, it is likely to be almost EVERY case -- since many people will use a 'genpi' tool to extract embedded declarations into a separate interface file -- but won't remove the embedded declarations. 
A rule is required, but the one you mention (don't allow it), will not work in practice (IMHO). Or perhaps I misunderstand completely, and what you are saying is that the type checker's work is exactly to check that the gpi names do not conflict with the .pi names? [Hmmm: the more I think about it, the more this seems to be your intent ..??] -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From tim.hochberg@ieee.org Thu Dec 30 02:29:15 1999 From: tim.hochberg@ieee.org (Tim Hochberg) Date: Wed, 29 Dec 1999 19:29:15 -0700 Subject: [Types-sig] Conformance model References: <386A3B9D.228BD31B@prescod.net> <386AAF7B.9EA4E3F3@maxtal.com.au> Message-ID: <00c701bf526d$b5224c60$87740918@phnx3.az.home.com> John Skaller wrote: > My argument is that this is what should be > specified for some other 'errors' in some > contexts. Given that python is dynamic, > my argument is that, say, for a type error, > it might make sense to ALLOW: > > try: > 1 + "Hello" > except TypeError: pass > > but mandate that > > def f(x): > 1 + "Hello" > > is not a valid Python program -- a compiler > can reject the program, rather than being > forced to implement: > > def f(x): > raise TypeError > > which is what is currently required. > > in particular, most people who use > compilers RELY on the compiler, when it rejects > code, NOT providing a .o file, which prevents > the linker linking the code, which prevents > the 'erroneous' code actually being run. Let me stop lurking for a moment to comment: First off, the function 'f' is close enough to: def g(x): return g + "Hello" that it strikes me as somewhat strange to ban the former but not the later. I would like the compiler to catch, report, and reject a file containing the definition of f(x) when invoked directly (e.g., from the command line) . However, when invoked implicitly (e.g., by import) the compiler would go ahead and compile f(x) to raise a type error. In any event, I fail to see where you gain an efficiency advantage by outlawing f(x). Perhaps someone can elighten me here. Don't get me wrong, I definately see the advantage of having f(x) reported by a compiler. And I see the advantage of not generating .o files by default when invoked from the command line. This has finally made me appreicate where Paul Prescod was going with typesafe. I haven't gone back and reread it, so I apologize if I'm messsing this up. Anyway, it seems that both: typesafe def f(x): return 1 + "hello" typesafe def g(x): return g + "Hello" would both result in compile time errors similar to SyntaxError. (It probably should not be TypeError -- TypeError allready has a runtime meaning, perhaps InterfaceError or StaticTypeError). This, it seems, would allow more efficient code to be generated (at the very least, checks for thrown TypeErrors could be removed). In fact, I would argue that: def h(x) -> Int: return "spam" should also be legal Python (in some sense), although I would like the compiler to catch it by default. However: typesafe def h(x) -> Int: return "spam" would again raise a compile time error. -tim PS, I just realized that typesafe is equivalent to a " throws everythingButTypeError" clause in Java. Doubt it's too important, but I thought it interesting. 
From tim_one@email.msn.com Thu Dec 30 06:09:32 1999 From: tim_one@email.msn.com (Tim Peters) Date: Thu, 30 Dec 1999 01:09:32 -0500 Subject: [Types-sig] check.py (was: PyDL RFC 0.02) In-Reply-To: <386A3B98.35DF5735@prescod.net> Message-ID: <000601bf528c$70a62920$a02d153f@tim> [Paul Prescod] > My syntax is mostly based on your {GregS's] web page. I switched > "!" for "as" based on my belief that it isn't Pythonic to use > random keyboard characters in ways that are not universally > understood... FYI, in Common Lisp the name of this function is the delightful "the"; e.g., (the integer (somefunc i)) looks at the value returned by (somefunc i), passes it along if it's an integer, else raises an error. and-some-people-say-lisp-is-unreadable-ly y'rs - tim From tim_one@email.msn.com Thu Dec 30 06:09:36 1999 From: tim_one@email.msn.com (Tim Peters) Date: Thu, 30 Dec 1999 01:09:36 -0500 Subject: [Types-sig] Type checks In-Reply-To: <386A31D9.A29EA65E@maxtal.com.au> Message-ID: <000801bf528c$72f01920$a02d153f@tim> [John Skaller] > ... > There is some indeterminism with bitwise operators (depends > on the underlying C implementation, which sucks :-) If you know of a platform dependence in longs, it's a bug. Ditto for ints, unless it's one of the handful of shift cases that depends on the native C long size (but, e.g., that Python's right shift sign-extends is guaranteed regardless of what the platform C does with right shifts; ditto for mixing signs across / and %; etc). If you know of a bug, report it! >> Errors in Python raise exceptions. That is how it is defined, >> and that is the general style/pattern for the language. > Not true for assertions. > And type constraints are assertions. John, you're not getting anywhere with this approach -- drop it. This is not the ISO C++ committee, and we're not bound by the latter's conventions & conceits. The behavior of assertions in Python depends on the setting of a processor option. The notion that "you can't do that!!" is an arbitrary rule you're carrying in from C++ (*they* can't do that, because that's the rule *they* agreed to live by -- we did not, and all evidence says nobody else here is about to). The Python language does not define the means by which processor options are specified, but does define their effects. It is not required that a processor implement the processor option that we informally refer to as "-O mode" -- but if it does, its effect is defined. it's-not-hard-to-read-between-the-lines-when-they're-a- kilometer-apart-ly y'rs - tim From paul@prescod.net Wed Dec 29 22:12:41 1999 From: paul@prescod.net (Paul Prescod) Date: Wed, 29 Dec 1999 17:12:41 -0500 Subject: [Types-sig] const (was: PyDL RFC 0.02) References: <386A3B99.74EE9F6E@prescod.net> <386AB71D.CFF4DE70@maxtal.com.au> Message-ID: <386A8759.C2567113@prescod.net> skaller wrote: > > Hang on! removing 'const' is not the same as > adding a restriction preventing certain name rebindings -- > as Guido pointed out in response to one of my posts, > it is possible to detect many rebindings, and it is possible > to ban some -- without needing a 'const' specification. I don't follow you and anyhow I don't see why banning some name rebindings needs to be high on our list of things to do. The optimization and safety benefits are not as high as those of ordinary type checking. > > For now, we just need a solid definition of what types of rebinding are > > legal. There are four kinds of names: > > No. There is only one kind of name. 
> Perhaps you mean 'a name can be bound to one of four kinds of object'? Yes and no. We are now able to associate types with names so we need to be cognizant of both the type of the name and the type of the object. > > * module -- we must always disallow rebinding these because we don't > > have a notion of two modules with the "same interface". Maybe in some > > future version we could. > > I'm not so sure. Consider: > > import m > module_x = m.x > module_x = m.y > > Here, 'module_x' is a name bound to a module, namely m.x, > which is rebound to another module, m.y. Module objects > can be accessed just like any other python object. > > And in this case, you may not even KNOW that 'module_x' > is bound to a module object. > > I'm thinking that your intent is that the name 'module_x' > be declared as a module (with some properties), but then .. > > > * class -- rebinding is fine as long as the new class has a signuture > > that will produce instances that conform to the declared interface(s). > > .. should be the same as for modules. Rebinding > is allowed, provided the constraints implied by a declaration > the name conforms to a particular interface are met. My point was that we need to treat modules differently than classes because two modules cannot export the same interface whereas two classes can. interface Foo: decl bar: def( int ) -> int decl foo1: def( ) -> Foo decl foo2: def( ) -> Foo class foo1: __impements__=[Foo] def bar( num ): return 5 class foo2: __impements__=[Foo] def bar( num ): return 6 foo1 = foo2 # okay There is no equivalent code for modules because module interfaces are not named and modules do not claim to conform to particular interfaces. > BUT: the is an important reason to do more, > namely, caching of module functions and class methods. > In this case, merely requiring interface conformance > is not enough: I cannot see that there is much to be gained in caching class methods. The vast majority of time you will be handed an *instance* of the class and you will look up the method at runtime using vtables or whatever. Paul Prescod From paul@prescod.net Wed Dec 29 22:16:06 1999 From: paul@prescod.net (Paul Prescod) Date: Wed, 29 Dec 1999 17:16:06 -0500 Subject: [Types-sig] Re: Conformance model References: <386A3B9D.228BD31B@prescod.net> <386AAF7B.9EA4E3F3@maxtal.com.au> Message-ID: <386A8826.23694955@prescod.net> skaller wrote: > > ... > > Given that python is dynamic, > my argument is that, say, for a type error, > it might make sense to ... mandate that > > def f(x): > 1 + "Hello" > > is not a valid Python program -- a compiler > can reject the program, rather than being > forced to implement: > > def f(x): > raise TypeError > > which is what is currently required. My feeling could be summed up thus: "The following actions are illegal. A Python compiler may report them and refuse to compile the program or it may run the program and generate some form of Error exception." I would only willing to go further if you would describe overwhelming optimization benefits in allowing undefined behavior. You are swimming against the tide of history here. Java doesn't have much undefined behavior either. "Programmers these days" are more interested in determinism than performance. Programmers are generally more interested in optimizing coding efficiency rather than program efficiency. Java and Python do not even array-bounds checks to be elided even though there is no excuse for a valid program overwriting array bounds. 
Paul Prescod From paul@prescod.net Wed Dec 29 22:19:51 1999 From: paul@prescod.net (Paul Prescod) Date: Wed, 29 Dec 1999 17:19:51 -0500 Subject: [Types-sig] RFC Comments References: <38689428.4953374D@prescod.net> <3868F998.594C4EEF@maxtal.com.au> <386A3B9B.43E94DB9@prescod.net> <386AA976.F268268@maxtal.com.au> Message-ID: <386A8907.906DC149@prescod.net> skaller wrote: > > ... > It _is_ possible to import only the interface > of a module, and this should be done when 'import X' is > seen. In pass two, a full two pass importation is triggered, > but the interface loading is skipped because the interface > is already loaded. However, the import still requires TWO > passes DURING PASS 2, because the implementation file > may also include inline declarations. Do you mean old fashioned Python function and class declarations or newfangled decls and typedecls? > It follows FROM THE MODEL > that these declarations are effectively private. It was never my intent that decls and typedecls could be private. > ... > Summmary: pass 1 processing only permits > pass 1 processing to occur recursively, whereas > pass 2 imports may invoke a full two phase translation. My plan was: do everything relating to types, in ALL modules and then do everything relating to code generation in all modules. > So because the semantics of importing a module > (two passes) are quite distinct from only importing > interfaces, and even that has two possible variants, > it seems useful, if not essential, to permit a > pass 1 only kind of importation -- 'include'. Still not sold on "include". > but my taste is only a minor point here, I'll run with 'as' > if that is the final choice. We'll do a poll once other details are worked out. > But your use of a function > call like: > > interface_check(x,i) > > for a run time test is not as simple as reusing "!" > for the same purpose: I am mildly uncomfortable with new expression syntax but my arguments against it are not watertight so I will document the inline "as" unless someone else feels as I do. To be honest, I would prefer colons in function defs and "as" in other contexts if it came down to it. If terseness is important enough to sacrifice readability for, I would rather sacrifice it in favor of a mild inconsistency instead of a whole new meaning for a punctuation character. Paul Prescod From paul@prescod.net Thu Dec 30 09:13:14 1999 From: paul@prescod.net (Paul Prescod) Date: Thu, 30 Dec 1999 04:13:14 -0500 Subject: [Types-sig] check.py (was: PyDL RFC 0.02) References: <386A3B98.35DF5735@prescod.net> <386AA1FD.C726C130@maxtal.com.au> Message-ID: <386B222A.4AD0EC24@prescod.net> skaller wrote: > > > and so forth. You need these classes regardless. > > I don't. My implementation is ML based, > and the static compilations tools are likely > to use native constructions not Python ones .. although > I'm not sure. You need these classes because they are the Python equivalent of Java's reflection API. They are available to the Python programmer. I think that in the interests of determinism, you must keep this information around as Python objects unless you can demonstrate that the programmer does not "ask for" them. This will actually be pretty easy to demonstrate if our API is explicit enough. (e.g. 
if you don't import type_reflect then you can't get at the information so the compiler can throw it away) Paul Prescod From paul@prescod.net Thu Dec 30 09:44:49 1999 From: paul@prescod.net (Paul Prescod) Date: Thu, 30 Dec 1999 04:44:49 -0500 Subject: Duplicate declarations Re: [Types-sig] PyDL RFC 0.02 References: <3868EBEB.BDAFABE0@maxtal.com.au> <386A3B9A.C8DFE00D@prescod.net> <386AB9E4.74E286B0@maxtal.com.au> Message-ID: <386B2991.F91CC926@prescod.net> skaller wrote: > > ... > > Here, there are TWO declarations of the same name. > I don't think you can ban this, because it is not > only likely to be a common case, it is likely > to be almost EVERY case -- since many people > will use a 'genpi' tool to extract embedded declarations > into a separate interface file -- but won't remove > the embedded declarations. The virtue of the model that you described so succinctly is that there is no reason to run "genpi" explicitly. It is run during the first pass of the compilation FOR you. > Or perhaps I misunderstand completely, and what you are > saying is that the type checker's work is exactly to > check that the gpi names do not conflict with the .pi > names? That is also a reasonable argument and probably one I would tend toward in future versions of the spec. Paul Prescod From paul@prescod.net Thu Dec 30 09:44:36 1999 From: paul@prescod.net (Paul Prescod) Date: Thu, 30 Dec 1999 04:44:36 -0500 Subject: [Types-sig] const (was: PyDL RFC 0.02) References: <386A3151.8F5AFCD7@maxtal.com.au> Message-ID: <386B2984.DEE409B@prescod.net> skaller wrote: > > ... > I know, but that is my point: it isn't consistent > with a model in which checking is applied to _names_: Yes it is. Readonly-ness is part of an object's interface. We could make ReadOnlyMapping types, ReadOnlyFile types, ReadOnlyList types, ReadOnlyBankAccount types and so forth. It makes more sense to me, however, to separate out ReadOnly-ness because it is so pervasive. But not for version 1. Paul Prescod From skip@mojam.com (Skip Montanaro) Thu Dec 30 15:46:34 1999 From: skip@mojam.com (Skip Montanaro) (Skip Montanaro) Date: Thu, 30 Dec 1999 09:46:34 -0600 (CST) Subject: [Types-sig] Run time arg checking implemented In-Reply-To: <3862527B.99B783C8@maxtal.com.au> References: <386177A3.86F0D505@prescod.net> <3862527B.99B783C8@maxtal.com.au> Message-ID: <14443.32346.368043.41998@dolphin.mojam.com> skaller> I have implemented run time argument checking in Viper, using skaller> Greg's ! operator. The syntax (so far) is like: skaller> def f( p ! t = dflt): pass skaller> and the semantics are to check that an argument has the skaller> nominated type: skaller> f(a) skaller> checks like: skaller> if type(a) is not t: skaller> raise TypeError "messge" Any reason this isn't assert type(a) is t ? Skip Montanaro | http://www.mojam.com/ skip@mojam.com | http://www.musi-cal.com/ 847-971-7098 | Python: Programming the way Guido indented... From skip@mojam.com (Skip Montanaro) Thu Dec 30 15:50:35 1999 From: skip@mojam.com (Skip Montanaro) (Skip Montanaro) Date: Thu, 30 Dec 1999 09:50:35 -0600 (CST) Subject: [Types-sig] type declaration syntax In-Reply-To: <386165AF.F6E6BF81@maxtal.com.au> References: <385C1345.C21FF180@maxtal.com.au> <19991222224650088.AAA118.228@max41101.izone.net.au> <386165AF.F6E6BF81@maxtal.com.au> Message-ID: <14443.32587.574186.48706@dolphin.mojam.com> skaller> I.e. TWO bans fix most problems. The ban on module level skaller> rebindings is a significant restriction. I'll say. 
The common idiom for trapping stdout or stderr is to rebind sys.stdout/stderr to a file-like object. How would that be accomplished in such a straightforward way if module-level rebindings are disallowed? Skip Montanaro | http://www.mojam.com/ skip@mojam.com | http://www.musi-cal.com/ 847-971-7098 | Python: Programming the way Guido indented... From skaller@maxtal.com.au Thu Dec 30 16:38:12 1999 From: skaller@maxtal.com.au (skaller) Date: Fri, 31 Dec 1999 03:38:12 +1100 Subject: [Types-sig] type declaration syntax References: <385C1345.C21FF180@maxtal.com.au> <19991222224650088.AAA118.228@max41101.izone.net.au> <386165AF.F6E6BF81@maxtal.com.au> <14443.32587.574186.48706@dolphin.mojam.com> Message-ID: <386B8A74.3617D3E9@maxtal.com.au> Skip Montanaro wrote: > > skaller> I.e. TWO bans fix most problems. The ban on module level > skaller> rebindings is a significant restriction. > > I'll say. The common idiom for trapping stdout or stderr is to rebind > sys.stdout/stderr to a file-like object. How would that be accomplished in > such a straightforward way if module-level rebindings are disallowed? Special case the sys module. A bit messy -- but so is the sys module :-) -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From skaller@maxtal.com.au Thu Dec 30 16:44:06 1999 From: skaller@maxtal.com.au (skaller) Date: Fri, 31 Dec 1999 03:44:06 +1100 Subject: [Types-sig] Run time arg checking implemented References: <386177A3.86F0D505@prescod.net> <3862527B.99B783C8@maxtal.com.au> <14443.32346.368043.41998@dolphin.mojam.com> Message-ID: <386B8BD6.4E494370@maxtal.com.au> Skip Montanaro wrote: > > skaller> I have implemented run time argument checking in Viper, using > skaller> Greg's ! operator. The syntax (so far) is like: > > skaller> def f( p ! t = dflt): pass > > skaller> and the semantics are to check that an argument has the > skaller> nominated type: > > skaller> f(a) > > skaller> checks like: > > skaller> if type(a) is not t: > skaller> raise TypeError "messge" > > Any reason this isn't > > assert type(a) is t > > ? Different message. -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From skaller@maxtal.com.au Thu Dec 30 17:46:42 1999 From: skaller@maxtal.com.au (skaller) Date: Fri, 31 Dec 1999 04:46:42 +1100 Subject: [Types-sig] RFC Comments References: <38689428.4953374D@prescod.net> <3868F998.594C4EEF@maxtal.com.au> <386A3B9B.43E94DB9@prescod.net> <386AA976.F268268@maxtal.com.au> <386A8907.906DC149@prescod.net> Message-ID: <386B9A82.ED38623E@maxtal.com.au> Paul Prescod wrote: > > skaller wrote: > > > > ... > > It _is_ possible to import only the interface > > of a module, and this should be done when 'import X' is > > seen. In pass two, a full two pass importation is triggered, > > but the interface loading is skipped because the interface > > is already loaded. However, the import still requires TWO > > passes DURING PASS 2, because the implementation file > > may also include inline declarations. > > Do you mean old fashioned Python function and class declarations or > newfangled decls and typedecls? Sorry, I mean't 'decls': i.e. interface declarations. [which can also be embedded like 'def f(x as t)'] > > It follows FROM THE MODEL > > that these declarations are effectively private. > > It was never my intent that decls and typedecls could be private. OK. 
I think this is necessary for correct interface design. But I concede that the default could sensibly be public; and a 'private' keyword used? > > Summmary: pass 1 processing only permits > > pass 1 processing to occur recursively, whereas > > pass 2 imports may invoke a full two phase translation. > > My plan was: do everything relating to types, in ALL modules and then do > everything relating to code generation in all modules. I see. I'm not sure that can work. The reason is, interface loading is dynamic. Because module loading is dynamic. [This can't be changed, it would break Interscript, for example] My thought was the static processing would be done by making the compiler two pass. And the compiler currently runs when a module is first loaded -- at varying times during code execution when 'import' statements are executed. > Still not sold on "include". Leave it out, and see what happens. Write some code: invent a mini-language (easy to parse in python), and implement the model. I think something like: import decl y check y where check y prints either 'y is defined' or 'y is not defined': this code is to be executed at 'run time' to report on what names are visible. Write the py -> pi tool. [!] > I am mildly uncomfortable with new expression syntax Rightly so. I'm not insisting, just presenting a feeling. I _do_ feel reasonably comfortable with 'as'. [I write lots of code. I can't type well. I like code compact. Means I can fit more information on a screen. I like python string/sequence handling precisely because it is a terse notation] -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From paul@prescod.net Thu Dec 30 17:44:39 1999 From: paul@prescod.net (Paul Prescod) Date: Thu, 30 Dec 1999 12:44:39 -0500 Subject: [Types-sig] PyDL RFC 0.4 Message-ID: <386B9A07.57234970@prescod.net> PyDL RFC 0.03 A PyDL file declares the interface for a Python module. PyDL files declare interfaces, objects and the required interfaces of objects. This document (loosely, informally) describes the behavior of a class of software modules called "static interface interpreters" and "static interface checkers". Interface interpreters are run as part of the regular Python module interpetation process. They read PyDL files and make the interface objects available to the Python compiler. Interface checkers read PyDL files and Python code to verify conformance of the code to the interface. Once this design is done we will write a formal specification. PyDL Files: =========== A PyDL file can be either created by a programmer or auto-generated. The syntax and semantics of the two types are identical. An auto-generated file is created by scanning a Python module for inline declarations. Interfaces are the central concept in PyDL files. Interfaces are Python objects like anything else but they are created by the interface interpreter. They are made available to the static interface checker before Python compilation begins. In addition to defining interfaces, it is possible to declare other attributes of the module. Each declaration associates an interface with the name of the attribute. Values associated with the name in the module namespace must always conform to the declared interface. Furthermore, by the time the module has been imported each name must have an associated value. It is not necessary for the static interface checker to prove that these rules will not be violated. 
It is also acceptable to check at runtime. Grammar: ======== In the very short term, implementors are encouraged to use any grammar that allows every example in this document. Contributions of proposals for the grammar are solicited. Interfaces: =========== Interfaces are created through interface definitions and interface expressions. There may also be facilities for creating interfaces at runtime but they are neither available to nor relevant to the interface interpreter. Interface definitions are similar to Python class definitions. They use the keyword "interface" instead of the keyword "class". Interfaces are either complete or incomplete. An incomplete interface takes parameters and a complete interface does not. It is not possible to create Python objects that conform to incomplete interfaces. They are just a reuse mechanism analogous to functions in Python. An example of an incomplete interface would be "Sequence". It is incomplete because we need to define the interface of the contents of the sequence. In an interface expression the programmer can provide parameters to generate a new interface. Typedefs allow us to give names to complete or incomplete interfaces described by interface expressions. Typedefs are an interface expression re-use mechanism. Interfaces have an intuitive concept of equivalence which will be formalized later in the document. Behavior: ========= For our purposes, we will presume that every Python environment has some form of compilation phase. This is true of all existing Python environments. The Python compiler invokes the static interface interpreter and optionally the interface checker on a Python file and its associated PyDL file. Typically a PyDL file is associated with a Python file through placement in the same path with the same base name and a ".pydl" or ".gpydl" extension. If both are avaiable, the module's interface is created by combining the declarations in the ".pydl" and ".gpydl" files. "Non-standard" importer modules may find PyDL files using other mechanisms such as through a look-up in an relational database, just as they find modules themselves using non-standard mechanisms. The interface interpreter reads the PyDL file and builds the relevant interface objects. If the PyDL file refers to other modules then the interface interpreter can read the PyDL files associated with those other modules after generating them if necessary. It is acceptable to use date-stamps, CRCs and other heuristics to demonstrate that a generated PyDL file is not likely to be inconsistent with its module. The Python compiler may invoke the interface checker after the interface interpreter has built interface objects and before it interprets the Python module. Once it interprets the Python code, the interface objects are available to the runtime code through a special namespace called the "interface namespace". There is one such namespace per module. It is accessible from the module's namespace via the name "__interfaces__". This namespace is interposed in the name search order between the module's namespace and the built-in namespace. Built-in Interfaces: ==================== Any Number Integral Int Long Float Complex Sequence String Record Mapping Modules Callable Class Function Methods UnboundMethods BoundMethods Null File Certain interfaces may have only one implementation. These "primitive" types are Int, Long, Float, String, UnboundMethods, BoundMethods, Module, Function and Null. 
Over time this list may get shorter as the Python implementation is generalized to work mostly by interfaces. Note: In rare cases it may be necessary to create new primitive types with only a single implementation (such as "window handle" or "file handle"). This is the case when the object's actual bit-pattern is more important than its interface. Note: The Python interface graph may not always be a tree. For instance there might someday be a type that is both a mapping and a sequence. The details of each interface remain to be worked out. Volunteers are solicited. Interface expression language: ============================== Interface expressions are used to declare that attributes must conform to certain interfaces. In a interface expression you may: 1. refer to an interface by name. The name can either be simple or it may be of the form "module.interfacename" where "interfacename" is a name in one of two PyDL files for the named module. The expression evaluates to the referenced interface. Two expressions consisting only of names are equivalent if the referenced interface objects are equivalent. 2. make a union of two or more interfaces: integer or float integer or float or complex The expression evaluates to an interface object I such that a value V conforms to I iff it conforms to any interface in ths list. Two union expressions X and Y are equivalent if their lists are the same length and each element in X has an equivalent in Y and vice versa. 3. parameterize a interface: Array( Int, 50 ) Array( length=50, elements=Int ) Note that the arguments can be either interface expressions or simple Python expressions. A "simple" Python expression is an expression that does not involve a function call or variable reference. The expression evaluates to a complete instantiation of the referenced incomplete interface. Two parameterization expressions are equivalent if the parameterized interface is equivalent and each parameter is equivalent. 4. use a syntactic shortcut: [Foo] => Sequence( Foo ) # sequence of Foo's {String:Int} => Mapping( String, Int ) # Mapping from A's to B's (A,B,C) => Record( A, B, C ) # 3-element sequence of interface a, followed # by b followed by c The expression evaluates to the same thing as the expanded versions. Equivalence is identical to the situation for the expanded versions. 5. generate a callable interface: def( Arg1 as Type1, Arg2 as Type2 ) -> ReturnType The argument name may be elided: def( Int, String ) -> None Note: this is provided for compatibiity with libraries and tools that may not support named arguments. Python programmers are strongly encouraged to use argument names as they are good documentation and are useful for development environments and other reflective tools. It is possible to declare variable length argument lists. They must always be declared as sequences but the element interface may vary. def( Arg1 as String, * as [Int] ) -> Int # callable taking a String, and some un-named Int # arguments Finally, it is possible to declare keyword argument lists. They must always be declared as mappings from string to some interface. def( Arg1 as Int , ** as {String: Int}) - > Int Note that at this point in time, every Python callable returns something, even if it is None. The return value can be named, merely as documentation: def( Arg1 as Int , ** as {String: Int}) - > ReturnCode as Int The expression evaluates to a callable interface that takes the described arguments and returns the described value. 
Declarations in a PyDL file: ============================ 1. Imports An import statement in an interface file loads another interface file. The import statement works just like Python's except that it loads the PyDL file found with the referenced module, not the module itself. (of course we will make this definition more formal in the future) 2. Basic attribute interface declarations: decl myint as Int # basic decl intarr as Array( Int, 50 ) # parameterized decl intarr2 as Array( size = 40, elements = Int ) # using keyword syntax Attribute declarations are not parameteriable. Furthermore, they must resolve to complete interfaces. So this is allowed: class (_X,_Y) spam( A, B ): decl someInstanceMember as _X decl someOtherMember as Array( _X, 50 ) .... These are NOT allowed: decl someModuleMember(_X) as Array( _X, 50 ) class (_Y) spam( A, B ): decl someInstanceMember(_X) as Array( _X, 50 ) Because that would allow you to create a "spam" without getting around to saying what _X is for that spam's someInstanceMember. That would disallow static type checking. 3. Callable object interface declarations: Functions are the most common sort of callable object but class instances can also be callable. Callables may be runtime parameterized and/or interface parameterized. For instance, there might be a method "add" that takes two objects with the same interface and returns an object with that interface. decl DoSomething( _X ) as def( a as _X, b as _X )-> _X _X is the interface parameter. By convention these start with underscores. a and b are the runtime parameters. Note: it is usually possible to coerce a parameterized function into a fully polymorphic function where the arguments can vary from each other quite widely despite being declared to have the same parameter type. You can do this by instantiating the function with "Any" as the parametric type. It is possible to allow _X to vary to some extent but still require it to always be a Number: decl Add(_X as Number) as def( a as _X, b as _X )-> _X So this function could take two longs or two floats but not two strings. Note: as above, you could create a version that would take a float and a long by referring to a common base interface like Number itself. 4. Class Declarations A class is a callable object that can be subclassed. Currently the only way to make those (short of magic) is with a class declaration, but one could imagine that there might someday be an __subclass__ magic method that would allow any old object instance to also stand in as a class. The syntax for a class definition is identical to that for a function with the keyword "def" replaced by "class". What we are really defining is the constructor. The signature of the created object can be described in an interface declaration. decl TreeNode(_X) as class( a as _X, Right as TreeNode( _X ) or None, Left as TreeNode( _X ) or None ) -> ParentClasses, Interfaces When the initialization completes, every attribute in the declared interfaces should have a value. 5. Interface declarations: An interface decaration starts with the keyword "interface", optionally has interface parameters in parentheses and then continues with the interface name and the names of super-interfaces. This interface inherits and must not contradict the signature of the parent interfaces. The interface body is made up of attribute declarations. interface (_X,_Y) spam( a, b ): decl somemember as _X decl someOtherMember as _Y decl someClassAttr as [ _X ] decl someFunction as def( a as Int, b as Float ) -> String 6. 
Typedefs:

Typedefs allow interfaces to be renamed and for parameterized variations of interfaces to be given names.

    typedef PositiveInt as BoundedInt( 0, maxint )
    typedef NegativeInt as BoundedInt( max=-1, min=minint )
    typedef NullableInt as Int or None
    typedef Dictionary(_Y) as {String:_Y}

New Module Syntax:
==================

In a future version of Python, declarations will be allowed in Python code and will have the same meanings. They will be extracted to a generated PyDL file and evaluated there (along with hand-written declarations in the PyDL file). In the meantime, there is a backwards compatible syntax explained later.

"typesafe":
===========

In addition to decl and typedecl, the keyword "typesafe" can be used to indicate that a function or method uses types in such a way that each operation can be checked at compile time and demonstrated not to call any function or operation with the wrong types.

The keyword precedes the function definition:

    typesafe def foo( a, b ):
        ...

The typesafe keyword can also be used before a class definition. That means that every method in the class is declared to be type safe.

The typesafe keyword can be used with the "module" modifier before the first function or class definitions in a module to state that all of the functions and classes in the module are type safe:

    import spam
    import rabbit
    import orphanage

    typesafe module

An interface checker's job is to ensure that methods that claim to be typesafe actually are. It must report, and refuse to compile, modules that misuse the keyword, and it may not refuse to compile modules that do not. The interface checker may optionally warn the programmer about other suspect constructs in Python code.

Note: typesafe is the only change to class definition or module definition syntax.

"as"
====

The "as" operator takes an expression and an interface expression and verifies at runtime that the expression evaluates to an object that conforms to the interface described by the expression. It returns the expression's value if it succeeds and raises TypeAssertionError (a subtype of AssertionError) otherwise.

    foostr = foo as [String]  # verifies that foo is a sequence of Strings
                              # and re-assigns it.

This operation can be used in various ways. The most basic way to use it is as a test:

    >>> j = getData()
    >>> j as Int
    >>> j = j + 1

The "as" operator has the lowest precedence of the binary operators.

Interface objects
=================

Every interface object (remember, interfaces are just Python objects!) has the following method:

    __conforms__ : def (obj: Any ) -> boolean

This method can be used at runtime to determine whether an object conforms to the interface. It would check the signature for sure but might also check the actual values of particular attributes.

There is also a global function with this signature:

    class_conforms : def ( obj as Class, Obj as Interface ) -> boolean

This function can be used either at compile time (e.g. by an implementation of an interface checker) or at runtime to check that a class will generate objects that have the right signature to conform to the interface.

(the rest of the interface reflection API will be worked out later)

Experimental syntax:
====================

There is a backwards compatible syntax for embedding declarations in a Python 1.5.x file:

    "decl","myint as Integer"
    "typedef","PositiveInteger as BoundedInt( 0, maxint )"

    "typesafe"
    def ...( ... ):
        ...

    "typesafe module"

There will be a tool that extracts these declarations from a Python file to generate a .gpydl (for "generated PyDL") file.
These files are used alongside hand-crafted PyDL files. The "effective interface" of the module is evaluated by combining the declarations from the two files as if they were concatenated together (more or less... exact details to follow). The two files must not contradict each other, just as declarations within a single file must not contradict each other. This means that names that are declared twice must evaluate to equivalent types.

Over time the .gpydl generator will get more intelligent and may deduce type information based on code outside of explicit declarations (for instance function and class definitions, assignment statements and so forth).

The "as" keyword is replaced in the backwards-compatible syntax with

Summary of Major Runtime Implications:
======================================

All of the named interfaces defined in a PyDL file are available in the "__interfaces__" dictionary that is searched between the module dictionary and the built-in dictionary.

The runtime should not allow an assignment or function call to violate the declarations in the PyDL file. In an "optimized speed mode" those checks would be disabled. In non-optimized mode, these assignments would generate an IncompatibleAssignmentError.

The runtime should not allow a read from an unassigned attribute. It should raise NotAssignedError if it detects this at runtime instead of at compile time.

Several new object interfaces and functions are needed.

Future Directions:
==================

Inferencing/Deduction:
======================

At some point in the future, PyDL files will likely be generated from source code using a combination of declarations within Python code and some sorts of interface deduction and inferencing based on various kinds of assignment.

Const-ness/Readonly-ness:
=========================

We need to be able to say that some attributes cannot be re-bound and that some attributes and parameters are immutable.

Idea: The Undefined Object:
===========================

The Undefined object is used as the value of unassigned attributes and the return value of functions that do not return a value. It may not be bound to a name.

    a = Undefined  # raises UndefinedValueError
    a = b          # raises UndefinedValueError if b has not been assigned

Undefined can be thought of as a subtype of NameError. Undefined is needed because it is now possible to declare names at compile time but never get around to assigning to them. In ordinary Python this is not possible.

The only useful thing you can do with Undefined is check whether an object "is" Undefined:

    if a is Undefined:
        doSomethingWithA(a)
    else:
        doSomethingElse()

This is equivalent to:

    try:
        doSomethingWithA( a )
    except NameError:
        doSomethingElse()

It is debatable whether we still need NameError for anything other than backwards compatibility. We could say that any referenced variable is automatically initialized to "Undefined". Undefined is sufficiently restrictive that this will not lead to buggy programs.

Undefined also corrects a long-term unsafe issue with functions. Now, functions that do not explicitly return a value return Undefined instead of None. That means that this is no longer possible:

    a = list.sort()

With Undefined, it will blow up because it is not possible to assign the Undefined value. Before Undefined, the code did not blow up but it also did not do the "right thing." It assigned None to "a", which was seldom what was intended.
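To make the runtime behaviour of the "as" operator concrete, here is a minimal sketch of the check it performs, written as a plain function since today's Python has no such operator. The helper name "conform" is illustrative only; the proposal spells this as "value as InterfaceExpr":

    class TypeAssertionError(AssertionError):
        # specified above as a subtype of AssertionError
        pass

    def conform(value, interface):
        # return the value unchanged if it conforms, otherwise raise
        if interface.__conforms__(value):
            return value
        raise TypeAssertionError("value does not conform to interface")

    # usage corresponding to:  foostr = foo as [String]
    # foostr = conform(foo, SequenceOfString)   # SequenceOfString: some
    #                                           # interface object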
From gstein@lyra.org Thu Dec 30 18:06:12 1999 From: gstein@lyra.org (Greg Stein) Date: Thu, 30 Dec 1999 10:06:12 -0800 (PST) Subject: [Types-sig] RFC Comments In-Reply-To: <386A3B9B.43E94DB9@prescod.net> Message-ID: On Wed, 29 Dec 1999, Paul Prescod wrote: >... > > The reason I chose "!" for argument declarations was that it > > was already being used in similar way for the _expression_: > > > > y = x ! t > > > > and in this context, ":" cannot be used. > > Right. My RFC uses a function call syntax inline. It seems more Pythonic > and can cause no no precedence confusion. It is also compatible with the > Python 1.5.x grammar. I've discussed the notion of function call syntax before. It is Not Good. It causes problems with people believing that an actual function exists that can be referenced, passed around, etc. I think that I had a few other reasons. If you're going to introduce a new operator, then it should be a new operator. Not a function call. > > OK. You should proceed with _some_ fixed syntax. > > I used "as" everywhere else. The colon was just a lapse. "as" has the wrong semantic. x = y as Int That looks like you want to use y "as an integer". Of course, that isn't what is happening. You're asserting that y *is* an integer. Either use "!" or "isa". But definitely do not use "as". Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Thu Dec 30 18:18:51 1999 From: gstein@lyra.org (Greg Stein) Date: Thu, 30 Dec 1999 10:18:51 -0800 (PST) Subject: [Types-sig] rebinding (was: const) In-Reply-To: <386A3B99.74EE9F6E@prescod.net> Message-ID: On Wed, 29 Dec 1999, Paul Prescod wrote: >... > For now, we just need a solid definition of what types of rebinding are > legal. There are four kinds of names: Why not just make them all legal, and worry about this later? > * module -- we must always disallow rebinding these because we don't > have a notion of two modules with the "same interface". Maybe in some > future version we could. Untrue. Ever look at the "anydbm" module and its cohorts? How about the DBAPI modules? I've said before: modules and classes both have the notion of an interface. We ought to be able to associate an interface with a module! > * class -- rebinding is fine as long as the new class has a signuture > that will produce instances that conform to the declared interface(s). > > * functions and other objects -- rebinding is fine as long as the new > function conforms to the declared interface. These make sense. Note on functions: how is a function declared to have a particular signature? If the function itself is declaring the signature, then rebinding could be allowed. For example: interface foo: def bar(x: Int)->String: "doc string" class baz(foo): def bar(x: Int)->String: ... def func(x: String)->String: ... def func(x: Int)->String: ... In the above example, baz.bar must conform to foo.bar since the class is supposed to conform to the foo interface. func() is defining the module's interface, so the second func() is simply tweaking the "final" interface. Cheers, -g -- Greg Stein, http://www.lyra.org/ From skaller@maxtal.com.au Thu Dec 30 18:17:34 1999 From: skaller@maxtal.com.au (skaller) Date: Fri, 31 Dec 1999 05:17:34 +1100 Subject: [Types-sig] const (was: PyDL RFC 0.02) References: <386A3151.8F5AFCD7@maxtal.com.au> <386B2984.DEE409B@prescod.net> Message-ID: <386BA1BE.7F92DBC6@maxtal.com.au> Paul Prescod wrote: > > skaller wrote: > > > > ... 
> > I know, but that is my point: it isn't consistent > > with a model in which checking is applied to _names_: > > Yes it is. Readonly-ness is part of an object's interface. No. Objects don't HAVE interfaces. NAMES have interfaces. Access to the object is constrained by the interface associated with the name. A standard dictionary could, if we had some kind of const, be accessed via two names: the first, allowing only read access, and the second read and write. An object can be 'compatible' with many interfaces. It can only be _accessed_ via a name, through the declared interface. Two names, two interfaces. One object. your model! Example: an object is accessed as a Sequence. Same object is also accessed by a different name bound to a different interface, as List. Should work across function call boundaries to support protocol based polymorphism: decl x as List def f(y as Sequence): .. f(x) # should be OK: a List 'is a' Sequence So: 'const' in an interface is an access control. Perhaps object bound to the name with an interface is immutable, and perhaps not: immutability is a _runtime_ property. 'const' is a compile time one. Objects exist at run time. Names are bound to interfaces at compile time. We could make > ReadOnlyMapping types, ReadOnlyFile types, ReadOnlyList types, > ReadOnlyBankAccount types and so forth. It makes more sense to me, > however, to separate out ReadOnly-ness because it is so pervasive. But > not for version 1. No. If you want an 'immutable dictionary', you need a new type. I mean, an actual new Python run-time type object with different 'methods' defined. The properties of an object of a certain type are entirely a run time matter. The static type system doesn't enter into it. Interfaces do not change the type of an object. They restrict how the object can be accessed. If you 'see' something as an Immutable Mapping, you cannot insert a new key: the static type checker would not let you. This prevents a run time error; but if the object were a standard dictionary, the static type system will _still_ stop you accessing the object as a dictionary. 'Seeing is not being'-ly yours. -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From gstein@lyra.org Thu Dec 30 18:26:30 1999 From: gstein@lyra.org (Greg Stein) Date: Thu, 30 Dec 1999 10:26:30 -0800 (PST) Subject: [Types-sig] syntax (was: check.py) In-Reply-To: <386A3B98.35DF5735@prescod.net> Message-ID: On Wed, 29 Dec 1999, Paul Prescod wrote: > Greg Stein wrote: > > ... > > To do this, I would need to change the Python grammar, or suck in .pyi > > files. I plan to do the latter once some kind of formal grammar is > > specified. If that doesn't happen soon, then I'll be using the grammar > > that I posted in my type-proposal.html. It is complete and is sufficient > > (yet Paul seems to be starting from scratch... :-( ). > > My syntax is mostly based on your web page. I switched "!" for "as" > based on my belief that it isn't Pythonic to use random keyboard > characters in ways that are not universally understood. I covered this under a separate email. If you insist on using a word rather than '!', then at least use the right semantic. > And I put decl > and typedecl at the front instead of making them operators because I > agree with Tim Peters that we are designing a sub-language that needs to > be understood as being separate by virtue of being evaluated BEFORE the > code is executed. I disagree. 
Making a "sub-language" will simply serve to create something that is not integrated with Python. I see no reason to separate anything that is happening here -- that is a poor requirement/direction to take. The typedef unary operator allows a Python programmer to manipulate type declarator objects. That will be important for things such as an IDE, a debugger, or some more sophisticated analysis tools. The "decl" statement is fine, but we shouldn't look at it as an escape hatch for anything that we'd like to do. > It is my personal opinion that the grammar should be the last thing you > integrate into your system. I'd rather see the grammar *now* so that check.py can take advantage of it. That kind of defeats your claim :-) Otherwise, we're just coding in the dark, hoping that the app will work when the grammar finally gets implemented. > In order to avoid maintaining a whole > compiler while the grammar shifts, I would suggest you define classes > like this: > > class ParameterizedInterface: > .... > > class ConcreteInterface: > .... > > class MethodSignature: > > and so forth. You need these classes regardless. I've got classes like this. Go look at typedecl.py in my posting. > Then your interface file becomes: > > Array = new ParamterizedInterface( > parameters=["elements", "array"], > attributes=[new MethodSignature( arguments=... )] > ) > > We need this API anyhow so it would help alot if you could design it > while you are writing your package. Have you looked at check.py? -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Thu Dec 30 18:30:14 1999 From: gstein@lyra.org (Greg Stein) Date: Thu, 30 Dec 1999 10:30:14 -0800 (PST) Subject: [Types-sig] english words in Python (was: check.py) In-Reply-To: <386AA1FD.C726C130@maxtal.com.au> Message-ID: On Thu, 30 Dec 1999, skaller wrote: > Paul Prescod wrote: > > My syntax is mostly based on your web page. I switched "!" for "as" > > based on my belief that it isn't Pythonic to use random keyboard > > characters in ways that are not universally understood. > > Then you had better think again. 'as' is an ENGLISH word. > English is not 'universally' understood. So what? The Python language uses English words. "as" is the wrong semantic and should be rejected based on that. But not because it is English. Next, you'll say that we should replace "import", "class", and "assert" with funny little characters. Soon enough, we'll end up with APL or Perl. > I have implemented 'x!t' in Viper, and then, later, > I implemented 'def f(x!t)' -- the uses require grammar modifications > in different places and are technically distinct. Yes. I recommend '!' for the operator and ':' for the funcdefs. > As I am now implementing a C code generator, I am noticing > the effects of the optional typing on a compiler (although I'm > not actually using the information yet). > > In particular, since my implementation is entirely dynamic, > it fits well with cgen_module, which uses an already loaded module. > I have not tried a static compiler which 'parses' text to generate > code yet, but I suspect this will make my dynamic interpretation > difficult to implement -- on the other hand, Greg Stein HAS > tried this kind of tool -- and so I'd like to hear from him > what the impact of the 'at run time' meaning would be, > if he has looked at this. I don't follow your question here. I don't understand your distinctions between dynamic and runtime and static... 
Cheers,
-g

--
Greg Stein, http://www.lyra.org/

From gstein@lyra.org Thu Dec 30 18:38:59 1999
From: gstein@lyra.org (Greg Stein)
Date: Thu, 30 Dec 1999 10:38:59 -0800 (PST)
Subject: [Types-sig] import vs include (was: RFC Comments)
In-Reply-To: <386AA976.F268268@maxtal.com.au>
Message-ID: 

On Thu, 30 Dec 1999, skaller wrote:
>...
> In this 'two pass' model, it is inconsistent to 'import' a module in
> pass 1, since 'importing' a module requires a recursive tranlation pass
> involving TWO passes, and we know that the second pass can even involve
> recursive module execution. So it isn't _possible_ to import a module
> during pass 1. It won't work.

Python importing does *not* allow recursive module execution.

a.py:
    import b
    some_code()

b.py:
    import a
    more_code()

Let's say that you import a.py. b will be imported, the "import a" will establish a reference to the "a" module (which is incomplete at that point in time), and then more_code() is executed. The import of b completes and some_code() is then executed. After a.py completes, the module is then filled in and becomes available to other modules (such as b.py).

[ note that if you *run* a.py, it is named __main__ so b's import will grab something different ]

Given that recursive module execution cannot occur, there is no problem in doing a real import and acquiring interface information from the imported module. In other words: I agree with Paul that we do not need to separate the notions of import and include.

In check.py, however, I do not plan to perform a true import. Interfaces of other modules must be specified in a .pi file, or check.py must be allowed to parse and construct an interface from the target module.

[ I'd rather not open/read/parse/extract an interface from another module because it would definitely impact the performance too much. ]

Cheers,
-g

--
Greg Stein, http://www.lyra.org/

From skaller@maxtal.com.au Thu Dec 30 18:41:03 1999
From: skaller@maxtal.com.au (skaller)
Date: Fri, 31 Dec 1999 05:41:03 +1100
Subject: [Types-sig] const (was: PyDL RFC 0.02)
References: <386A3B99.74EE9F6E@prescod.net> <386AB71D.CFF4DE70@maxtal.com.au> <386A8759.C2567113@prescod.net>
Message-ID: <386BA73F.71C1F176@maxtal.com.au>

Paul Prescod wrote:
> I don't follow you and anyhow I don't see why banning some name
> rebindings needs to be high on our list of things to do. The
> optimization and safety benefits are not as high as those of ordinary
> type checking.

I do not agree. One of the most sought after optimisations in Python is caching. Preventing rebindings in loaded modules and/or defined classes permits function and method caching. It may be that the benefits are not as great in total as those typing would bring, but they are certainly significant, and they're very high on our priority list because the changes required to enforce such a rule are trivial (a one line change in Viper now prints a warning message for the module case). The change is also trivial to document and specify.

I contend there is a very large benefit for a very small effort here. A proposal is likely to be accepted, Guido has been heard to murmur in favour of it :-) And, it is likely to make it into Python 1.6, even if the more ambitious typing proposal produced does not. This should win Brownie points for the Sig. :-)

> Yes and no. We are now able to associate types with names so we need to
> be cognizant of both the type of the name and the type of the object.

Precisely, yes, I agree, except I do not agree that names have types associated with them.
They have _interfaces_ associated with them. Objects have _types_ associated with them. That is, actual TypeType objects.

Example of the difference: a module has a type. All modules have the _same_ type. Most modules have _different_ interfaces. An interface declaration not only asserts that a name is associated with an object of module type at run time, but also that the module has certain attributes of certain types.

> My point was that we need to treat modules differently than classes
> because two modules cannot export the same interface whereas two classes
> can.

Why can't two modules export the same interface? The _names_ of the interfaces may differ.

> There is no equivalent code for modules because module interfaces are
> not named and modules do not claim to conform to particular interfaces.

Why not? I think they do. They are named. The name is encoded in the interface file name (.pi file).

--
John Skaller, mailto:skaller@maxtal.com.au
10/1 Toxteth Rd Glebe NSW 2037 Australia
homepage: http://www.maxtal.com.au/~skaller
voice: 61-2-9660-0850

From gstein@lyra.org Thu Dec 30 18:46:03 1999
From: gstein@lyra.org (Greg Stein)
Date: Thu, 30 Dec 1999 10:46:03 -0800 (PST)
Subject: [Types-sig] class/module interfaces (was: const)
In-Reply-To: <386A8759.C2567113@prescod.net>
Message-ID: 

On Wed, 29 Dec 1999, Paul Prescod wrote:
>...
> My point was that we need to treat modules differently than classes
> because two modules cannot export the same interface whereas two classes
> can.

We should be treating them *exactly* the same. I've been saying this for a while now :-)

A module can export the same interface as another module. Heck, it can export the same interface as a class.

---- a.py ----
def foo(x: Int)->String:
    return "hi " + str(x)
bar = 5
--------------

import a
print a.foo(5), a.bar

class xyz:
    bar = 5
    def foo(self, x:Int)->String:
        return "hi " + str(x)

b = xyz()
print b.foo(5), b.bar
-------------

In the above code, "a" and "b" have the same interface. We could even go and declare an interface, specify that the module and the class conform to that interface, and then declare a and b to use the interface.

Cheers,
-g

--
Greg Stein, http://www.lyra.org/

From gstein@lyra.org Thu Dec 30 18:54:45 1999
From: gstein@lyra.org (Greg Stein)
Date: Thu, 30 Dec 1999 10:54:45 -0800 (PST)
Subject: [Types-sig] decls and typedefs (was: RFC Comments)
In-Reply-To: <386A8907.906DC149@prescod.net>
Message-ID: 

On Wed, 29 Dec 1999, Paul Prescod wrote:
> skaller wrote:
...
> > It follows FROM THE MODEL
> > that these declarations are effectively private.

I don't see how that follows. But the comment was in regards to the distinction between an import/include process. I believe that distinction is bunk (as I explained in the other note), so the notion of "follows" is probably moot.

> It was never my intent that decls and typedecls could be private.

A "decl" does not establish a name in my mind. It defines the type that a name *will* use, but nothing more. So far, we have decls to specify interface attributes. I think there is something in there for declaring parameterized types, but I'm still not clear on that syntax. In any case, they only declare type information -- you still need an assignment somewhere to establish a value.

To associate information with a name, I think that we want to use an assignment or the classdef/funcdef pattern (... name: suite).
This is where the "typedef" operator came from, allowing you to do:

    IntOrString = typedef Int or String

Have no fears -- the type checker can easily understand the above code.

Cheers,
-g

--
Greg Stein, http://www.lyra.org/

From skaller@maxtal.com.au Thu Dec 30 18:53:17 1999
From: skaller@maxtal.com.au (skaller)
Date: Fri, 31 Dec 1999 05:53:17 +1100
Subject: [Types-sig] Re: Conformance model
References: <386A3B9D.228BD31B@prescod.net> <386AAF7B.9EA4E3F3@maxtal.com.au> <386A8826.23694955@prescod.net>
Message-ID: <386BAA1D.BD332F0A@maxtal.com.au>

Paul Prescod wrote:
>
> skaller wrote:
> >
> > ...
> >
> > Given that python is dynamic, my argument is that, say, for a type
> > error, it might make sense to ... mandate that
> >
> > def f(x):
> >     1 + "Hello"
> >
> > is not a valid Python program -- a compiler can reject the program,
> > rather than being forced to implement:
> >
> > def f(x):
> >     raise TypeError
> >
> > which is what is currently required.
>
> My feeling could be summed up thus:
>
> "The following actions are illegal. A Python compiler may report them
> and refuse to compile the program or it may run the program and generate
> some form of Error exception."

Seems fair ..

> I would only willing to go further if you would describe overwhelming
> optimization benefits in allowing undefined behavior.

How would you account for 'assert'? Assert provides a run time test that can be optimised away. A compiler could perhaps be permitted to report an error IF it could detect one would occur -- but currently, there is no requirement to actually generate an error, either at compile time or run time: the current optimising compiler does neither.

> You are swimming against the tide of history here. Java doesn't have
> much undefined behavior either.

Java and Python are not ISO standardised. C and C++ are, and allow undefined behaviour in some places, because it is necessary for performance. Plenty of people use C and C++ to write code. :-) Plenty of people use Python and Java and wish they ran faster.

--
John Skaller, mailto:skaller@maxtal.com.au
10/1 Toxteth Rd Glebe NSW 2037 Australia
homepage: http://www.maxtal.com.au/~skaller
voice: 61-2-9660-0850

From skaller@maxtal.com.au Thu Dec 30 19:00:59 1999
From: skaller@maxtal.com.au (skaller)
Date: Fri, 31 Dec 1999 06:00:59 +1100
Subject: [Types-sig] Conformance model
References: <386A3B9D.228BD31B@prescod.net> <386AAF7B.9EA4E3F3@maxtal.com.au> <00c701bf526d$b5224c60$87740918@phnx3.az.home.com>
Message-ID: <386BABEB.586103A2@maxtal.com.au>

Tim Hochberg wrote:
> First off, the function 'f' is close enough to:
>
> def g(x):
>     return g + "Hello"
>
> that it strikes me as somewhat strange to ban the former but not the later.

I'd ban _anything_ that raised _any_ error, except an environment error, unless the error was caught in the function that raises the error. This implies NO system exceptions can be raised by a function. Only user defined exceptions. And they can ONLY be raised by an explicit 'raise' statement.

This means that, when calling a function that does NOT raise any errors, we don't have to check for them. Such checking is a huge overhead in typical C code and an obstacle to inlining, which reduces one of Python's biggest performance bottlenecks -- function calling.
-- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From skaller@maxtal.com.au Thu Dec 30 19:07:01 1999 From: skaller@maxtal.com.au (skaller) Date: Fri, 31 Dec 1999 06:07:01 +1100 Subject: [Types-sig] Type checks References: <000801bf528c$72f01920$a02d153f@tim> Message-ID: <386BAD55.25B3A05B@maxtal.com.au> Tim Peters wrote: > John, you're not getting anywhere with this approach -- drop it. > The Python language does not define the means by which processor options are > specified, but does define their effects. It is not required that a > processor implement the processor option that we informally refer to as "-O > mode" -- but if it does, its effect is defined. Ok, I'll give up. -- John Skaller, mailto:skaller@maxtal.com.au 10/1 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller voice: 61-2-9660-0850 From gstein@lyra.org Thu Dec 30 21:40:08 1999 From: gstein@lyra.org (Greg Stein) Date: Thu, 30 Dec 1999 13:40:08 -0800 (PST) Subject: [Types-sig] Re: Conformance model In-Reply-To: <386BAA1D.BD332F0A@maxtal.com.au> Message-ID: On Fri, 31 Dec 1999, skaller wrote: >... > > You are swimming against the tide of history here. Java doesn't have > > much undefined behavior either. > > java and python are not ISO standardised. > C and C++ are, and allow undefined behaviour in some places, > because it is necessary for performance. > Pleny of people use C and C++ to write code. :-) > Plenty of people use python and java and wish they ran > faster. That does not negate what Tim, Paul, and myself (among others) have been saying: you're going to be unsuccessful in trying to introduce undefined behavior in Python. Give it up already. I believe that you have a possibility to get Guido to define a language feature that says "it operates , but on it will operate like ." The assert statement and some JPython-related issues are examples. However, this is the wrong forum for this since it is *VERY* dependent upon Guido's thoughts. As I've said before: bringing up this issue is growing awfully tiresome. I wish you would stop. Happy Holidays, -g -- Greg Stein, http://www.lyra.org/ From sjmachin@lexicon.net Thu Dec 30 22:59:26 1999 From: sjmachin@lexicon.net (John Machin) Date: Fri, 31 Dec 1999 08:59:26 +1000 Subject: Anti-poking lobby (was:Re: [Types-sig] type declaration syntax) In-Reply-To: <14443.32587.574186.48706@dolphin.mojam.com> References: <386165AF.F6E6BF81@maxtal.com.au> Message-ID: <19991230215036009.AAA186.69@max41121.izone.net.au> Skip said: > > skaller> I.e. TWO bans fix most problems. The ban on module level > skaller> rebindings is a significant restriction. > > I'll say. The common idiom for trapping stdout or stderr is to rebind > sys.stdout/stderr to a file-like object. How would that be accomplished > in such a straightforward way if module-level rebindings are disallowed? > sys.stdout = "you lose" is a bit too straightforward for my liking. Once we have banned poking from outside a module, can't we fix the presumably-few cases of missing-but-required functionality by supplying functions? For example, previous_stdout = sys.divert_stdout(new_stdout_file-like_object) [default argument is original stdout] with maybe some checking as determined by the module's "owner" [is it truly file-like?], and maybe some extra functionality e.g. 
sys.divert_stdout("my_stdout.log") # interprets string as name of file; appends if existing, else creates From tim_one@email.msn.com Fri Dec 31 03:35:30 1999 From: tim_one@email.msn.com (Tim Peters) Date: Thu, 30 Dec 1999 22:35:30 -0500 Subject: [Types-sig] Type checks In-Reply-To: <386BAD55.25B3A05B@maxtal.com.au> Message-ID: <000301bf5340$1604f820$e12d153f@tim> [Tim] > John, you're not getting anywhere with this approach -- drop it. > [and a paragraph of pseudo-standardese] [skaller] > Ok, I'll give up. John! Are you feeling OK? I was prepared for any conceivable response -- except for that one . and-a-happy-new-millennium-to-all-ly y'rs - tim From paul@prescod.net Fri Dec 31 09:31:54 1999 From: paul@prescod.net (Paul Prescod) Date: Fri, 31 Dec 1999 04:31:54 -0500 Subject: [Types-sig] Re: rebinding (was: const) References: Message-ID: <386C780A.3BFEE183@prescod.net> Greg Stein wrote: > > ... > > * module -- we must always disallow rebinding these because we don't > > have a notion of two modules with the "same interface". Maybe in some > > future version we could. > > Untrue. Ever look at the "anydbm" module and its cohorts? How about the > DBAPI modules? I didn't say that there was no notion of modules with the same interface. I said that our type declaration sub-language does not have such a notion. > I've said before: modules and classes both have the notion of an > interface. We ought to be able to associate an interface with a module! There is some elegance in this model but it is also pretty weird the way Pythonista's use modules instead of classes for some types of polymorphism. I want to know where Guido wants to go in people's thinking on modules. > Note on functions: how is a function declared to have a particular > signature? Just through an attribute declaration. It can be either the top level or in an interface declaration. Paul Prescod From paul@prescod.net Fri Dec 31 09:38:47 1999 From: paul@prescod.net (Paul Prescod) Date: Fri, 31 Dec 1999 04:38:47 -0500 Subject: [Types-sig] Re: syntax (was: check.py) References: Message-ID: <386C79A7.CBB008E@prescod.net> Greg Stein wrote: > > ... > > > And I put decl > > and typedecl at the front instead of making them operators because I > > agree with Tim Peters that we are designing a sub-language that needs to > > be understood as being separate by virtue of being evaluated BEFORE the > > code is executed. > > I disagree. Making a "sub-language" will simply serve to create something > that is not integrated with Python. I see no reason to separate anything > that is happening here -- that is a poor requirement/direction to take. Compile time stuff is inherently separate because it is *compile time stuff*. It follows different import rules, it is evaluated in a different namespace, and so forth. This, for instance, is not legal: a = doSomething() b = typedef a Python programmers need to understand these sorts of things. The decl syntax and "gpydl" semantics makes it very clear that these declarations are *separate* and are evaluated in a different time in a different execution context with a different namespace. > The typedef unary operator allows a Python programmer to manipulate type > declarator objects. That will be important for things such as an IDE, a > debugger, or some more sophisticated analysis tools. This is a completely orthogonal issue. There is no syntax in Python for a traceback or frame object but IDEs can work with traceback and frame objects. 
Classes are not created by a unary operator and assignment but they are still runtime-available objects. Paul Prescod From da@ski.org Sun Dec 5 07:00:10 1999 From: da@ski.org (David Ascher) Date: Sat, 4 Dec 1999 23:00:10 -0800 (Pacific Standard Time) Subject: [Types-sig] Static typing considered HARD In-Reply-To: <384A00D6.C3015C9D@fourthought.com> Message-ID: On Sat, 4 Dec 1999, Uche Ogbuji wrote: > David Ascher wrote: > > > No language that I know of does even a tenth of the job of configuration > > > management, error-handling or testing for anybody. They are not matters > > > for a programming language to address. > > > > I guess we'll have to agree to disagree. > > > > I've been doing some playing with Swing using JPython. Because it's > > wicked slow to start, (due to Java mostly) the > > edit-run-traceback-edit-run-traceback cycle is significantly longer than > > with with CPython. That's when I curse the fact that the compile-time > > analysis didn't catch simple typos, trivial mistakes in signatures, etc. I > > *love* Python's dynamicity. But mostly I use its 'wicked cool' dynamic > > features, like modifying the type of a variable in a function call or > > changing the __class__ of an object once in a very blue moon. > > I can agree to disagree as well as anyone, but I'll confess I'm still > baffled at how you claim that any language automates configuration > management, error-handling or testing to any significant extent. I > guess we'll also have to agree to not understand each other. I'm unsure all that you mean by 'configuration management, error-handling and testing'. All I'm pointing out is that I, others doing large-scale systems (e.g. eGroups) and many of my students all complain that Python isn't doing 'as much as it could' in the area of compile-time type checking and signature verification. > Also, I don't think I've _ever_ done anything as off-the-wall as > "modifying the type of a variable in a function call or changing the > __class__ of an object". I hope this isn't anyone's benchmark of > Python's dynamicism. Just in case it wasn't clear, what I meant by 'modifying the type of a variable in a function call' is: def a(x): x = len(x) The point is: Python is extremely dynamic, and those are extreme examples of this dynamicity. When you expressed quite strong reactions to Paul's proposal to add static types, I wanted to point out that there are things which could be done which would alleviate some of the problems that many folks are having in doing programming-in-the-large (and in-the-small as well), while not hindering most programmers most of the time (what was it P.T. Barnum said? =). Let's try a different approach. What is your benchmark of Python's dynamicity? What aspects do you care about keeping? Not modifying the __class__, that's clear. What, then? Or is it the syntactic lemon (opposite of sugar?) which lurks behind some static typing proposals which got you worried? > I program in Python perhaps 40 hours a week, and have done so for a long > time. Most of what I work on are large-scale systems. Very strange > that my typos (and they are legion) are much less catastrophic than your > own. Ah, well, probably you're just better at it than I am. =) My programs are typically small and run for a long time. They also change ten times daily due to the changing nature of the requirements. There is no 'finished' program in my current line of work. Just a different way of doing business. Note that developing a test suite for this sort of code is unrealistic. 
I'm paid to do science, not to do regression tests, and the regression suite is likely to be longer and buggier than the actual code.

Perhaps it's best if we took this off-line though -- I think we're straying from the types-sig charter.

--david

From da@ski.org Sun Dec 5 05:36:53 1999
From: da@ski.org (David Ascher)
Date: Sat, 4 Dec 1999 21:36:53 -0800 (Pacific Standard Time)
Subject: [Types-sig] Static typing considered HARD
In-Reply-To: <3849AC89.1173B163@fourthought.com>
Message-ID: 

On Sat, 4 Dec 1999, Uche Ogbuji wrote:

> Is their problem performance or defect-management? Again, there is an
> important difference. I agree that typing can help the former: I am
> doubtful that it is a panacea for the latter.

The latter. The quote (paraphrased from memory) is "When someone changes a function interface, there's no way to know if we've caught all of the calls to that function in the tens of thousands of lines of code that we have except to run the code". Note that I don't think anyone is arguing 'panacea'. Just 'we could do better'.

> > I see two very distinct problems, though -- one is the use of 'statically
> > typed variables', which requires fundamental changes to Python's
> > typesystem. The other is 'compile-time type/signature/interface checking',
> > which could probably be done coarsely with add-on tools without changing
> > the syntax or type system one iota (ok, maybe one or two iotas).
> >
> > > see this "misspelling" problem. Proper configuration-management
> > > procedures and testing, along with intelligent error-recovery, prevent
> > > such problems, which can also occur in the most strongly-typed systems.
> >
> > Wouldn't you agree that enforcing these 'proper procedures' is much harder
> > in a language which doesn't do half the job for you?
>
> No language that I know of does even a tenth of the job of configuration
> management, error-handling or testing for anybody. They are not matters
> for a programming language to address.

I guess we'll have to agree to disagree.

I've been doing some playing with Swing using JPython. Because it's wicked slow to start (due to Java mostly), the edit-run-traceback-edit-run-traceback cycle is significantly longer than with CPython. That's when I curse the fact that the compile-time analysis didn't catch simple typos, trivial mistakes in signatures, etc. I *love* Python's dynamicity. But mostly I use its 'wicked cool' dynamic features, like modifying the type of a variable in a function call or changing the __class__ of an object once in a very blue moon.

IIRC, JimH mentioned in the early part of his talk (before it got heated) a system which allowed one to change whether a particular symbol could be considered 'static' or not, and suggested what seemed to me reasonable defaults, like the names of builtins being considered 'known' at compile-time. With two new syntactic mechanisms called e.g. 'freeze' and 'thaw', one could maintain exactly the same dynamicity, while allowing the user to 'tell the compiler' that some things could be trusted not to change in the lifetime of the program (and the runtime would enforce those, of course). And if you really wanted to redefine 'open', then you still could.

In other words, I'm just suggesting that given that (I'd guess) 95% of the code out there is such that variables maintain their type throughout the life of the program and that the builtins don't typically get overridden, it seems a shame not to play the numbers. And we don't have to cover all the cases.
Just the 80% which give the largest payoff. Another trivial example: I can never remember whether it's pickle.dump(object, file) or pickle.dump(file, object). I tend to remember that I don't remember after the simulation has run for two hours (if I'm lucky) and the saving of state fails... --david
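For what it's worth, the real signature puts the object first -- pickle.dump(obj, file) -- and getting it backwards is only caught once the program actually reaches that call. A minimal sketch (the file name and data are invented for illustration):

    import pickle

    data = {"trial": 1, "results": [0.2, 0.5, 0.9]}

    # correct order: the object to serialize comes first, then the open file
    f = open("state.pkl", "wb")
    pickle.dump(data, f)
    f.close()

    # swapping the arguments -- pickle.dump(f, data) -- is accepted by the
    # parser and fails only at runtime, when pickle tries to use data.write,
    # which does not exist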