From hpk at trillke.net Mon Jan 6 13:13:00 2003 From: hpk at trillke.net (holger krekel) Date: Mon, 6 Jan 2003 13:13:00 +0100 Subject: [pypy-dev] test Message-ID: <20030106131300.N1568@prim.han.de> initial posting to test mailing list. From python-kbutler at sabaydi.com Fri Jan 10 20:41:04 2003 From: python-kbutler at sabaydi.com (Kevin J. Butler) Date: Fri, 10 Jan 2003 12:41:04 -0700 Subject: [pypy-dev] Re: [ann] Minimal Python project Message-ID: <3E1F21D0.5060904@sabaydi.com> > > >From: holger krekel > >We announce a mailinglist dedicated to developing >a "Minimal Python" version. Minimal means that >we want to have a very small C-core and as much >as possible (re)implemented in python itself. This >includes (parts of) the VM-Code. > >From: Guido van Rossum >Way cool. > > +1 I've been thinking of proposing a very similar thing - though I was thinking "Python in Python" which suggest all sorts of interesting logo ideas. :-) >We are very interested in learning about and >integrating prior art. And in hearing any >doubtful or reinforcing opinions. Expertise >is welcomed in all areas. > The Squeak Smalltalk implementation is interesting & relevant: http://www.squeak.org/features/vm.html The Squeak VM is written in a subset of Smalltalk ("Slang", different from "S-lang") that can be translated directly to C. This core provides the interpreter for the rest of the language, allowing the entire system to be very portable, and it facilitates development in many ways - you get to work on the core while working in your favorite language, you get to use all your favorite tools, etc. Plus I expect the translatable subset provides a solid, simple basis for integrating external code. I think a similar approach would be very useful in Minimal Python (... in Python), probably adopting ideas from Psyco http://psyco.sourceforge.net/ and/or Pyrex http://www.cosc.canterbury.ac.nz/~greg/python/Pyrex/ as the foundation for the "compilable subset". This would also provide a nice basis for a Jython implementation... Can-I-play-with-it-yet?-ly y'rs, kb From pedronis at bluewin.ch Fri Jan 10 20:49:07 2003 From: pedronis at bluewin.ch (Samuele Pedroni) Date: Fri, 10 Jan 2003 20:49:07 +0100 Subject: [pypy-dev] Re: [ann] Minimal Python project References: <3E1F21D0.5060904@sabaydi.com> Message-ID: <064e01c2b8e1$5675ac60$6d94fea9@newmexico> From: "Kevin J. Butler" > This would also provide a nice basis for a Jython implementation... not automatically [not that I think this should be a goal btw], the core abstraction could be not mappable/with reasonable performance over Java. From pedronis at bluewin.ch Fri Jan 10 20:54:08 2003 From: pedronis at bluewin.ch (Samuele Pedroni) Date: Fri, 10 Jan 2003 20:54:08 +0100 Subject: [pypy-dev] Re: [ann] Minimal Python project References: <3E1F21D0.5060904@sabaydi.com> <064e01c2b8e1$5675ac60$6d94fea9@newmexico> Message-ID: <066a01c2b8e2$09f70ae0$6d94fea9@newmexico> From: "Samuele Pedroni" > From: "Kevin J. Butler" > > This would also provide a nice basis for a Jython implementation... > > not automatically [not that I think this should be a goal btw], > the core abstraction could be not mappable/with reasonable performance over > Java. that was abstractions IOW, a Jython in Jython would be nice a idea, but is far from automatic that Python in Python and Jython in Jython can share much of their Python impl code. 
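To make the "compilable subset" idea a little more concrete, here is a rough, purely illustrative sketch (not something proposed in the thread): a function written in a restricted, type-stable style, the kind of code a Slang- or Pyrex-like translator, or a specializer such as Psyco, could plausibly map onto straightforward C. The function and its name are invented for illustration only.

    # Illustrative only: every name can be given one concrete machine type.
    def checksum(s):
        # 's' is assumed to be a plain byte string; 'total' stays a small
        # machine-sized integer because of the modulus.
        total = 0
        for i in range(len(s)):
            total = (total + ord(s[i])) % 65521
        return total

The point is not the particular function but the restriction: if every variable keeps a single concrete type, a translatable subset has an easy job, which is exactly what makes the Slang approach attractive.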
From Nicolas.Chauvat at logilab.fr Fri Jan 10 22:19:09 2003 From: Nicolas.Chauvat at logilab.fr (Nicolas Chauvat) Date: Fri, 10 Jan 2003 22:19:09 +0100 Subject: [pypy-dev] Re: [ann] Minimal Python project In-Reply-To: <3E1F21D0.5060904@sabaydi.com> References: <3E1F21D0.5060904@sabaydi.com> Message-ID: <20030110211909.GA17151@logilab.fr> On Fri, Jan 10, 2003 at 12:41:04PM -0700, Kevin J. Butler wrote: > >From: holger krekel > > > >We announce a mailinglist dedicated to developing > >a "Minimal Python" version. Minimal means that > >we want to have a very small C-core and as much > >as possible (re)implemented in python itself. This > >includes (parts of) the VM-Code. > > > >From: Guido van Rossum > >Way cool. > +1 +1 > The Squeak Smalltalk implementation is interesting & relevant: > http://www.squeak.org/features/vm.html > > The Squeak VM is written in a subset of Smalltalk ("Slang", different IIRC, this is also the case for Mozart, an implementation of the Oz language. Cf http://www.mozart-oz.org/ > Can-I-play-with-it-yet?-ly y'rs, I want this and some time to play around with metaprogramming in Python !! -- Nicolas Chauvat http://www.logilab.com - "Mais où est donc Ornicar ?" - LOGILAB, Paris (France) From hpk at trillke.net Sat Jan 11 21:40:11 2003 From: hpk at trillke.net (holger krekel) Date: Sat, 11 Jan 2003 21:40:11 +0100 Subject: [pypy-dev] bootstrapping issues Message-ID: <20030111214011.P1568@prim.han.de> Welcome to the Minimal Python list, It appears (to me) that Armin and Christian are both busy for the weekend, so i start some bootstrapping. First, five paragraphs of background. The idea of actually starting to do "Minimal Python" arose from a thread in python-de (2003q1 first post). There soon was the idea of doing a sprint with the goal of releasing first bits. Armin, Christian and I realized early that opening up the discussion (instead of keeping it in private mails) is a very good idea, as there is lots of experience and code out there. Nevertheless, i'd like the attendees of the planned sprint to decide which route they eventually want to follow coding-wise. We will propose some dates by the end of the month, btw. I think that - if in doubt - we follow the lead from CPython, e.g. regarding coding style, language definition etc. Every deviation should be justified by the goals. I don't think we should follow a PEP process soon, though. During January i want to set up accounts and a cvs-repository. The involved policies will be liberal. ok, enough organizational things for now. Bootstrapping will be a hot topic for Minimal Python itself. We probably first want to use the CPython development environment to get going. We probably want to base it on the python-2.3 cvs tree. There is the idea of not using make/configure/automake but a simple, understandable, debuggable (read: python-based) build environment. IOW, the number of dependencies for building Minimal Python should also be minimal. To avoid the need for C-coded system modules there is interest in coding a generalization of the "struct" module which allows *calling* functions at the C-level. This general C-"Extension" will be system-dependent. The gateway code between the "machine" and python probably needs to be coded in assembler: You can't "construct" a C-function call within ANSI-C. Both Microsoft and Apple have proprietary solutions for Java. (these insights courtesy of Jens-Uwe Mager and Christian Tismer, who endorse the idea). When this succeeds we can start to code e.g. 
a POSIX-layer and wrap C-libraries (at runtime) from python. So much for now from my side. I am sure that, as so many people inherit the good culture from c.l.py and python-dev, we will have good discussions. Please note that there are some subscribers (e.g. Jens-Uwe Mager) who are experts but not necessarily in core CPython. I hope/suggest that even naive sounding questions about Python and its core are welcomed. It certainly helps if more people understand the internals and the involved issues. I reserve the right to ask naive questions, myself :-) regards, holger krekel From robin at reportlab.com Sat Jan 11 22:30:04 2003 From: robin at reportlab.com (Robin Becker) Date: Sat, 11 Jan 2003 21:30:04 +0000 Subject: [pypy-dev] bootstrapping issues In-Reply-To: <20030111214011.P1568@prim.han.de> References: <20030111214011.P1568@prim.han.de> Message-ID: In article <20030111214011.P1568 at prim.han.de>, holger krekel writes >Welcome to the Minimal Python list, ..... thanks for this > >I hope/suggest that even naive sounding questions about Python >and its core are welcomed. It certainly helps if more >people understand the internals and the involved issues. >I reserve the right to ask naive questions, myself :-) > >regards, > > holger krekel ..... A naive question: which platforms are developments first aimed at, i.e. gnu/linux, win32 etc.? Thomas Heller's ctypes module would seem to be a very good start at the generic C interface thing. It is pretty easy to use and, although I don't have a complete grasp of it, it was a bit easier to use than calldll. I used both to do anygui interfaces to the native windows api. Last time I discussed this with him, Thomas said he was considering using the libffi library to do the interfacing. I know that to be fairly portable. -- Robin Becker From bac at OCF.Berkeley.EDU Sat Jan 11 22:43:49 2003 From: bac at OCF.Berkeley.EDU (Brett Cannon) Date: Sat, 11 Jan 2003 13:43:49 -0800 (PST) Subject: [pypy-dev] bootstrapping issues In-Reply-To: <20030111214011.P1568@prim.han.de> References: <20030111214011.P1568@prim.han.de> Message-ID: [holger krekel] > I think that - if in doubt - we follow the lead > from CPython e.g. regarding coding style, language > definition etc. Every deviation should be justified > by the goals. I don't think we should follow a PEP > process soon, though. > All seems reasonable. No need to stop using something that works and that people are, in general, familiar with. It would also make it easier to apply any code from Mini-Py (or is there some already accepted abbreviation for this project?) back into CPython if so desired. > Bootstrapping will be a hot topic for Minimal Python > itself. We probably first want to use the CPython > development environment to get going. We probably > want to base it on the python-2.3 cvs tree. > Once again I agree. I assume the other option is the maint-2.2 branch. That is not a good idea, in my opinion, since generators require a __future__ statement in 2.2. Might as well try to get all the good features we can into this. And speaking of features, how is the project going to view features that we all know are on their way out at some point? I am specifically thinking of integer division and old-style classes. Both are destined to disappear once Py3K comes out (whenever that is), so should time be spent implementing these features? I am assuming backwards-compatibility takes precedence over code simplification, but I thought I would double-check. 
> There is the idea of not using make/configure/automake > but a simple understandable debuggable (read: python based) > build environment. IOW words the number of dependencies > for building Minimal Python should also be minimal. > What about A-A-P (http://www.a-a-p.org/index.html)? It's Python-based and seems like a well-designed tool (but I am a Vim fan so I am biased toward's the creator's work =). -Brett From DavidA at ActiveState.com Sat Jan 11 22:48:10 2003 From: DavidA at ActiveState.com (David Ascher) Date: Sat, 11 Jan 2003 13:48:10 -0800 Subject: [pypy-dev] bootstrapping issues References: <20030111214011.P1568@prim.han.de> Message-ID: <3E20911A.3080409@ActiveState.com> holger krekel wrote: >Bootstrapping will be a hot topic for Minimal Python >itself. We probably first want to use the CPython >development environemnt to get going. We probably >want to base it on the python-2.3 cvs tree. > >There is the idea of not using make/configure/automake >but a simple understandable debuggable (read: python based) >build environment. IOW words the number of dependencies >for building Minimal Python should also be minimal. > May I suggest scons (http://www.scons.org/)? It's what I would use in a new project today. >For avoiding the need of C-coded system modules >there is interest to code a generalization of >the "struct" module which allows *calling* functions at >the C-level. This general C-"Extension" will be >system-dependent. The gateway code between the "machine" >and python probably needs to be coded in assembler: >You can't "construct" a C-function call within >ANSI-C. Both Microsoft and Apple have proprietary >solutions for Java. (these insights courtesy of >Jens-Uwe Mager and Christian Tismer who endorse >the idea). > I will point people towards the code in Mozilla that does similar things, called xptcall: http://www.mozilla.org/scriptable/xptcall-faq.html It's XPCOM centered, but many of the ideas and the assembly code for mac, win32 and os2, and a bunch of unix variants (http://lxr.mozilla.org/mozilla/source/xpcom/reflect/xptcall/src/md/unix/) is there. Robin already mentioned libffi as well. Looking forward to participating in this very interesting project, -- david From marc at informatik.uni-bremen.de Sat Jan 11 23:12:00 2003 From: marc at informatik.uni-bremen.de (Marc Recht) Date: Sat, 11 Jan 2003 23:12:00 +0100 Subject: [pypy-dev] bootstrapping issues In-Reply-To: <3E20911A.3080409@ActiveState.com> References: <20030111214011.P1568@prim.han.de> <3E20911A.3080409@ActiveState.com> Message-ID: <96870000.1042323120@leeloo.intern.geht.de> >> There is the idea of not using make/configure/automake >> but a simple understandable debuggable (read: python based) >> build environment. IOW words the number of dependencies >> for building Minimal Python should also be minimal. >> > May I suggest scons (http://www.scons.org/)? It's what I would use in a > new project today. SCons is a really nice tool, but what about the good old makefile (+ config.mk). We're talking about a small core and Python itself uses POSIX conforming source it shouldn't be that problem. And, SCons requires an installed Python. If an installed Python should be a requirement we could also try to use distutils (to have one dependency less). Marc "Premature optimization is the root of all evil." -- Donald E. Knuth -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 186 bytes Desc: not available URL: From florian.proff.schulze at gmx.net Sun Jan 12 01:20:19 2003 From: florian.proff.schulze at gmx.net (Florian Schulze) Date: Sun, 12 Jan 2003 01:20:19 +0100 (Westeuropäische Normalzeit) Subject: [pypy-dev] Questions about the C core Message-ID: Hi! I would like to know if there are already some concrete plans what the C core needs to be able to do. Will there be anything like the old builtin module? Will dicts and the other types still be done in C or some mix of C and Python etc. Also will the Psyco part be optional to ease the initial porting to new platforms? Florian From hpk at trillke.net Sun Jan 12 01:23:42 2003 From: hpk at trillke.net (holger krekel) Date: Sun, 12 Jan 2003 01:23:42 +0100 Subject: [pypy-dev] bootstrapping issues In-Reply-To: <96870000.1042323120@leeloo.intern.geht.de>; from marc@informatik.uni-bremen.de on Sat, Jan 11, 2003 at 11:12:00PM +0100 References: <20030111214011.P1568@prim.han.de> <3E20911A.3080409@ActiveState.com> <96870000.1042323120@leeloo.intern.geht.de> Message-ID: <20030112012342.R1568@prim.han.de> [Marc Recht Sat, Jan 11, 2003 at 11:12:00PM +0100] > >> There is the idea of not using make/configure/automake > >> but a simple understandable debuggable (read: python based) > >> build environment. IOW words the number of dependencies > >> for building Minimal Python should also be minimal. > >> > > May I suggest scons (http://www.scons.org/)? It's what I would use in a > > new project today. > SCons is a really nice tool, but what about the good old makefile (+ > config.mk). We're talking about a small core and Python itself uses POSIX > conforming source it shouldn't be that problem. And, SCons requires an > installed Python. If an installed Python should be a requirement we could > also try to use distutils (to have one dependency less). Using CPython as a first means of bootstrapping should only be there for the time beeing. It's simply practical to start with. Eventually there should be a self-contained bootstrapping process. IMO this hints at doing a very smallish VM which is able to drive the build process to a point where the next-stage VM with more features and a generic C-level interfacing could take over. SCons is certainly worth checking out but it might already use too many features of python to be used for initial bootstrapping. I doubt that distutils would be of much use at early stages. Eventually the requirement for CPython should go away. And once Minimal Python is unwrapped we don't need a build environment anymore :-) regards, holger From pyth at devel.trillke.net Sun Jan 12 01:33:43 2003 From: pyth at devel.trillke.net (holger krekel) Date: Sun, 12 Jan 2003 01:33:43 +0100 Subject: [pypy-dev] Re: [Python-Dev] Re: [ann] Minimal Python project In-Reply-To: <3E1F21D0.5060904@sabaydi.com>; from python-kbutler@sabaydi.com on Fri, Jan 10, 2003 at 12:41:04PM -0700 References: <3E1F21D0.5060904@sabaydi.com> Message-ID: <20030112013343.W349@prim.han.de> Kevin J. Butler wrote: > > > > > >From: holger krekel > > > >We announce a mailinglist dedicated to developing > >a "Minimal Python" version. Minimal means that > >we want to have a very small C-core and as much > >as possible (re)implemented in python itself. This > >includes (parts of) the VM-Code. > > > >From: Guido van Rossum > >Way cool. 
> > > > > +1 > > I've been thinking of proposing a very similar thing - though I was > thinking "Python in Python" which suggest all sorts of interesting logo > ideas. :-) The logo idea is still applicable. I'll ask some graphically talented friends. what is or would have been your proposal like? Any key ideas you like to share? and what's the opposite of "Vulture Culture"? That's what i seem to remember from Alan Parson's project: a celtic snake bending to its tail. I like that. which holger From hpk at trillke.net Sun Jan 12 01:41:36 2003 From: hpk at trillke.net (holger krekel) Date: Sun, 12 Jan 2003 01:41:36 +0100 Subject: [pypy-dev] bootstrapping issues In-Reply-To: ; from robin@reportlab.com on Sat, Jan 11, 2003 at 09:30:04PM +0000 References: <20030111214011.P1568@prim.han.de> Message-ID: <20030112014136.S1568@prim.han.de> [Robin Becker Sat, Jan 11, 2003 at 09:30:04PM +0000] > In article <20030111214011.P1568 at prim.han.de>, holger krekel > writes > >Welcome to the Minimal Python list, > ..... thanks for this > > > >I hope/suggest that even naive sounding questions about Python > >and its core are welcomed. It certainly helps if more > >people understand the internals and the involved issues. > >I reserve the right to ask naive questions, myself :-) > > > >regards, > > > > holger krekel > ..... > A naive questions; hich platforms are developments first aimed at ie > gnu/linux win32 etc? I guess it's going to be win32 & POSIX (MacOSX, and some linux/unix variants). > Thomas Heller's ctypes module would seem to be a very good start at the > generic C interface thing. It is pretty easy to use and although I don't > have a complete grasp on it was a bit easier to use than calldll. I used > both to do anygui interfaces to native windows api. I heard good things about it but it is currently a windows only solution, or not? > Last time I discussed this with him Thomas said he was considering using > the libffi library to do interfacing. I know that to be fairly portable. And that is a non-windows and non-mac solution? Anyway, both are interesting but actually Christian and others can probably judge better. This is obviously an issue for people knowing many different platforms. holger From tismer at tismer.de Sun Jan 12 01:47:49 2003 From: tismer at tismer.de (Christian Tismer) Date: Sun, 12 Jan 2003 01:47:49 +0100 Subject: [pypy-dev] Questions about the C core In-Reply-To: References: Message-ID: <3E20BB35.1050803@tismer.de> Florian Schulze wrote: > Hi! > > I would like to know if there are already some concrete plans what the C > core needs to be able to do. Will there be anything like the old builtin > module? Will dicts and the other types still be done in C or some mix of C > and Python etc. I guess we will borrow most core objects in the first place, just to get started. A first idea is to re-implement almost everything in Python and to let Psyco generate code for that. At least, it will be a good case study to do this for the implementation of the bytecode interpreter. One good reason for that is, as soon as we have the bytecode interpreter running, we can use any python for cross-compiling. We can also re-implement dicts and lists using Python. This can be done by emulating the basic data structures by some primitive array-like objects that represent a piece of memory. The simplicity of these obejcts might be used by psyco to deduce the possible data types which can appear in them, and produce simple, efficient code. But maybe there are better ways. 
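As a toy illustration of that idea (a sketch invented here, not a design from the thread): a mapping whose only storage is one flat Python list standing in for a "piece of memory". A specializer could in principle observe that the flat list only ever holds alternating keys and values and generate simple code for it.

    class SimpleDict:
        """Toy mapping over one flat list: [key0, value0, key1, value1, ...]."""

        def __init__(self):
            self.storage = []                  # stands in for raw memory

        def __setitem__(self, key, value):
            for i in range(0, len(self.storage), 2):
                if self.storage[i] == key:     # overwrite an existing entry
                    self.storage[i + 1] = value
                    return
            self.storage.append(key)           # otherwise append a new pair
            self.storage.append(value)

        def __getitem__(self, key):
            for i in range(0, len(self.storage), 2):
                if self.storage[i] == key:
                    return self.storage[i + 1]
            raise KeyError(key)

A real replacement would of course need hashing, deletion and resizing; the linear scan is only there to keep the sketch readable.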
We have to play a lot before this can be decided. > Also will the Psyco part be optional to ease the initial porting to new > platforms? I believe, no. Instead, I think to supply a generic virtual machine which is easy to produce code for, tpgether with a very fast interpreter. That engine would run on any machine and would be enough to do the full bootstrap, when the specific machine is defined, later. Note that psyco is most probably becoming Python code as well. :-) :-)) ciao - chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Johannes-Niemeyer-Weg 9a : *Starship* http://starship.python.net/ 14109 Berlin : PGP key -> http://wwwkeys.pgp.net/ work +49 30 89 09 53 34 home +49 30 802 86 56 pager +49 173 24 18 776 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? http://www.stackless.com/ From andrew at indranet.co.nz Sun Jan 12 02:32:54 2003 From: andrew at indranet.co.nz (Andrew McGregor) Date: Sun, 12 Jan 2003 14:32:54 +1300 Subject: [pypy-dev] Questions about the C core In-Reply-To: <3E20BB35.1050803@tismer.de> References: <3E20BB35.1050803@tismer.de> Message-ID: <2179680000.1042335174@localhost.localdomain> Has anyone thought about features that Python does not have that should be in Minimal? My first candidate would be (ducks) macros, since that could drastically simplify constructing core Python features. I have some thoughts on pythonic ways to do that which I need to write up. Basically, if we use the parser module rather than a C parser, it gets to be fairly easy. So, what would a pythonic macro look like? The nearest match presently is a metaclass, but that only gets the class definition *after* compilation, which means that although we can modify the code at that point (even to the extent of retreiving the source, changing it and recompiling), it had to compile in the first place, so we can't add syntax. Therefore a Pythonic macro should be: A class which adds it's grammar (probably fairly constrained; limited to new keywords, perhaps) to the parser at parse time, by setting or appending to a magic attribute (__grammar__ perhaps). Then the macro provides code which is called with the parse trees resulting from that grammar, modifies them, and returns the parse tree that will actually be compiled. If you want to evaluate some bits at compile time, do that explicitly in the macro (opposite way around to lisp, where you have complicated rules). It may be necessary to provide a quote: keyword and quote() (or noeval()? should the names be different?) builtin that quote a block or an expression to prevent compile-time evaluation by macros. Advantages: It looks like a metaclass. It can be inherited from to use the macro in new classes (hence __grammar__ should be appended to rather than just set). If the kinds of grammar rules that can be added are constrained just right, editors won't mess up the indentation and code using macros will still look like Python. If, however, we can use unconstrained grammar mods in the core, then we can do things like build dictionaries inline. Hmm, must go now, I was dreaming up an example, but I've got to go sell a house (Hooray :-) Comments? 
Andrew From pedronis at bluewin.ch Sun Jan 12 02:50:59 2003 From: pedronis at bluewin.ch (Samuele Pedroni) Date: Sun, 12 Jan 2003 02:50:59 +0100 Subject: [pypy-dev] Questions about the C core References: <3E20BB35.1050803@tismer.de> <2179680000.1042335174@localhost.localdomain> Message-ID: <00aa01c2b9dd$0e4c7d40$6d94fea9@newmexico> > Has anyone thought about features that Python does not have that should be > in Minimal? My first candidate would be (ducks) macros, as a final user visible feature, I hope not. As an implementation then invisible tool maybe/why not. > since that could > drastically simplify constructing core Python features. I have some > thoughts on pythonic ways to do that which I need to write up. there is really no pythonic way to do macros. You can check this python-dev thread http://aspn.activestate.com/ASPN/Mail/Message/1076771 My personal opinion is that entering the realm of language (re)design vs. simply implementation is a risky move... From damien.morton at acm.org Sun Jan 12 03:51:29 2003 From: damien.morton at acm.org (Damien Morton) Date: Sat, 11 Jan 2003 21:51:29 -0500 Subject: [pypy-dev] Possible logo idea - ouroboros Message-ID: <002601c2b9e5$8aec8180$6401a8c0@damien> http://www.millenniumdesktop.co.uk/ouroboros.htm The snake biting its own tail image used in the opening title sequence of Millennium is called an Ouroboros. In the series it is the symbol of the Millennium Group. In mythology, the Ouroboros is any image of a snake, worm, serpent, or dragon biting its own tail. It was first seen as early as 1600 years BC in Egypt. The Greeks called it the Ouroboros, which means "Tail Eater." Generally taking on a circular form, the symbol is representative of many broad concepts. Time, life continuity, completion, the repetition of history, the self-sufficiency of nature and the rebirth of the earth can all be seen within the circular boundaries of the Ouroboros. Societies from throughout history have shaped the Ouroboros to fit their own beliefs and purposes. The image has been seen in Japan, India, utilized in Greek alchemic texts, European woodcuts, Native American Indian tribes and even by the Aztecs. It has, at times, been directly associated to such varying symbols as the Roman god Janus, the Chinese Ying Yang, and the Biblical serpent of the garden of Eden. http://abacus.best.vwh.net/oro/ouroboros2.html This symbol appears principally among the Gnostics and is depicted as a dragon, snake or serpent biting its own tail. In the broadest sense, it is symbolic of time and the continuity of life. It sometimes bears the caption Hen to pan - 'The One, the All', as in the Codex Marcianus, for instance, of the 2nd century A.D. It has also been explained as the union between the chthonian principle as represented by the serpent and the celestial principal as signified by the bird (a synthesis which can also be applied to the dragon). Ruland contends this proves that it is a variant of the symbol for Mercury - the duplex god. In some versions of the Ouroboros, the body is half light and half dark, alluding in this way to the successive counterbalancing of opposing principls as illustrated in the Chinese Yin-Yang symbol for instance. Evola asserts that it represents the dissoluotion of the body, or the universal serpent which (to quote the Gnostic saying) 'passes through all things'. Poison, the viper and the universal solvent are all symbols of the undifferentiated-of the 'unchanging law' which moves through all things, linking them by a common bond. 
Both the dragon and the bull are symbolic antagonists of the solar hero. The Ouroboros biting its own tail is symbolic of self-fecundation, or the primitive idea of a self-sufficient Nature - a Nature, that is which, ? la Nietzsche, continually returns, within a cyclic pattern, to its own beginning. There is a Venetian manuscript on alchemy which depicts the Ouroboros with its body half-black (symbolizing earth and night) and half-white (denoting heaven and light). From andrew at indranet.co.nz Sun Jan 12 04:25:57 2003 From: andrew at indranet.co.nz (Andrew McGregor) Date: Sun, 12 Jan 2003 16:25:57 +1300 Subject: [pypy-dev] Questions about the C core In-Reply-To: <00aa01c2b9dd$0e4c7d40$6d94fea9@newmexico> References: <3E20BB35.1050803@tismer.de> <2179680000.1042335174@localhost.localdomain> <00aa01c2b9dd$0e4c7d40$6d94fea9@newmexico> Message-ID: <2197800000.1042341957@localhost.localdomain> --On Sunday, January 12, 2003 02:50:59 +0100 Samuele Pedroni wrote: >> Has anyone thought about features that Python does not have that should >> be in Minimal? My first candidate would be (ducks) macros, > > as a final user visible feature, I hope not. As an implementation then > invisible tool maybe/why not. That's more or less how macros are used most places that have them. For instance, only three files in the non-compiled-in (i.e. non-core) parts of xemacs have any macros defined in them (and two of those are compatibility things for other versions of emacs, never even defined if running on an xemacs), but most packages use at least some of the macros defined in the core. From the outside, those macros just look like parts of the exported language. The one remaining user of defmacro is advice.el, which defines the generic before/after/around advice mechanism, which could also be regarded as part of the core, and probably will be in some future version. >> since that could >> drastically simplify constructing core Python features. I have some >> thoughts on pythonic ways to do that which I need to write up. > > there is really no pythonic way to do macros. You can check this > python-dev thread > > http://aspn.activestate.com/ASPN/Mail/Message/1076771 > > My personal opinion is that entering the realm of language (re)design vs. > simply implementation is a risky move... I have several things in mind for macros: 1) pyrex-like cdef as a means to implement interfaces to C modules, but instead of outputting C and compiling, having psyco's code generator build it. 2) More efficient struct and sstruct like modules to tie in with 1) 3) Means to (as another poster has suggested) build dictionaries, sets and list comprehensions including their syntax. 4) Advice, like the elisp version I mentioned above. This is probably the only one of these where I'd like to see a change in the language officially exported, and I'm not that fussed about it, frankly. 5) Possibly a hook to do templating language things without preprocessing and without horrible performance problems. In lisp dialects macros are mostly ways to build core syntax, and ways for the gurus to do scary things without killing performance, that usually result (so far as most users are concerned) in a special-purpose sublanguage. I think the latter already applies to python metaclasses, and would apply to macros too. My suggestion is really just a hook in a place where it doesn't already exist, on the before side of class definition where metaclasses come after. 
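For contrast, the hook that exists today: a metaclass (in its Python 2.2 spelling) sees the class only after its body has been compiled and executed into a namespace dictionary, so it can rewrite the result but cannot introduce new syntax. A minimal sketch, with invented names:

    class LoggingMeta(type):
        # Runs only after the class body has been compiled and executed
        # into the 'namespace' dict -- too late to influence parsing.
        def __new__(mcls, name, bases, namespace):
            print "creating class", name, "with attributes", namespace.keys()
            return type.__new__(mcls, name, bases, namespace)

    class Example(object):
        __metaclass__ = LoggingMeta
        def method(self):
            return 42

A parse-time macro hook in the sense sketched above would have to run before this point, which is exactly what the current machinery does not offer.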
In the context of Minimal Python, macros provide a clean and consistent mechanism to reduce the C code by providing the hooks to build parts of the language in python. These don't already exist, and I'd prefer to see something generic than random hackery. Presuming that the parser will move to python, it would be a fairly natural use of python's dynamicism to allow define-time additions to the parser. Andrew From DavidA at ActiveState.com Sun Jan 12 05:55:44 2003 From: DavidA at ActiveState.com (David Ascher) Date: Sat, 11 Jan 2003 20:55:44 -0800 Subject: [pypy-dev] Questions about the C core References: <3E20BB35.1050803@tismer.de> Message-ID: <3E20F550.7010202@ActiveState.com> Christian Tismer wrote: > I believe, no. Instead, I think to supply a generic > virtual machine which is easy to produce code for, > tpgether with a very fast interpreter. > That engine would run on any machine and would > be enough to do the full bootstrap, when the > specific machine is defined, later. Is anyone on this list familiar with the current state of Parrot? All I know is: http://cvs.perl.org/cvsweb/parrot/ChangeLog?rev=1.4&content-type=text/x-cvsweb-markup Additionally, is anyone here familiar with some of the work that's been done on AOS, related to SmallScript, a version of SmallTalk (see e.g. http://www.smallscript.org/SmallScriptWebsite.asp#AOS). I have the impression that those folks have done some interesting work, although their non-open-source nature makes it hard to evaluate. --david From DavidA at ActiveState.com Sun Jan 12 06:03:55 2003 From: DavidA at ActiveState.com (David Ascher) Date: Sat, 11 Jan 2003 21:03:55 -0800 Subject: [pypy-dev] Questions about the C core References: <3E20BB35.1050803@tismer.de> <2179680000.1042335174@localhost.localdomain> <00aa01c2b9dd$0e4c7d40$6d94fea9@newmexico> <2197800000.1042341957@localhost.localdomain> Message-ID: <3E20F73B.6000307@ActiveState.com> On the topic of macros et al: I think that delivering minimal python will be quite hard. If the mandate is to create a new implementation of Python, then I think that the syntax of current Python should be seen as a "minimal" requirement from a syntactic POV. New syntactic elements can clearly be defined as well, although naturally care should be taken to ensure that existing code still works. (so making 'spam' a reserved word probably wouldn't work). On the other hand, I'm going to lose interest in this project pretty fast if it turns into an _unsubstantiated_ argument about language design. If a new language construct is proposed as a fairly direct and well-supported way to get the implementation done better, faster, cheaper, then by all means. --david From DavidA at ActiveState.com Sun Jan 12 06:15:31 2003 From: DavidA at ActiveState.com (David Ascher) Date: Sat, 11 Jan 2003 21:15:31 -0800 Subject: [pypy-dev] Possible logo idea - ouroboros References: <002601c2b9e5$8aec8180$6401a8c0@damien> Message-ID: <3E20F9F3.3050608@ActiveState.com> Damien Morton wrote: >http://www.millenniumdesktop.co.uk/ouroboros.htm > > Or for more pix: http://images.google.com/images?q=ouroboros I like it. Mystical, kind of wacky. Seems apporpriate. 
=) From pedronis at bluewin.ch Sun Jan 12 06:11:50 2003 From: pedronis at bluewin.ch (Samuele Pedroni) Date: Sun, 12 Jan 2003 06:11:50 +0100 Subject: [pypy-dev] Questions about the C core References: <3E20BB35.1050803@tismer.de> <2179680000.1042335174@localhost.localdomain> <00aa01c2b9dd$0e4c7d40$6d94fea9@newmexico> <2197800000.1042341957@localhost.localdomain> <3E20F73B.6000307@ActiveState.com> Message-ID: <033001c2b9f9$1da2d0c0$6d94fea9@newmexico> From: "David Ascher" > On the topic of macros et al: > > I think that delivering minimal python will be quite hard. If the > mandate is to create a new implementation of Python, then I think that > the syntax of current Python should be seen as a "minimal" requirement > from a syntactic POV. New syntactic elements can clearly be defined as > well, although naturally care should be taken to ensure that existing > code still works. (so making 'spam' a reserved word probably wouldn't work). > > On the other hand, I'm going to lose interest in this project pretty > fast if it turns into an _unsubstantiated_ argument about language > design. If a new language construct is proposed as a fairly direct and > well-supported way to get the implementation done better, faster, > cheaper, then by all means. maybe I was not completely clear but that was my point too. I believe that an open parser architecture can do the trick (i.e. supporting necessary additional constructs) without the struggle of a debate on syntax/semantics of macros in python. I sense that toying with new language features at large risks to be rather non-constructive. From pedronis at bluewin.ch Sun Jan 12 06:21:37 2003 From: pedronis at bluewin.ch (Samuele Pedroni) Date: Sun, 12 Jan 2003 06:21:37 +0100 Subject: [pypy-dev] Questions about the C core References: <3E20BB35.1050803@tismer.de> <3E20F550.7010202@ActiveState.com> Message-ID: <03a101c2b9fa$7b2e7540$6d94fea9@newmexico> From: "David Ascher" > Is anyone on this list familiar with the current state of Parrot? > > All I know is: > http://cvs.perl.org/cvsweb/parrot/ChangeLog?rev=1.4&content-type=text/x-cvsweb- markup http://www.amk.ca/conceit/parrot.html Right now they are sketching object support design: http://archive.develooper.com/perl6-internals%40perl.org/msg14408.html They have also started some collaboration with the DotGNU project. regards. From dmorton at bitfurnace.com Sun Jan 12 06:24:44 2003 From: dmorton at bitfurnace.com (damien morton) Date: Sun, 12 Jan 2003 00:24:44 -0500 Subject: [pypy-dev] Possible logo idea - ouroboros In-Reply-To: <3E20F9F3.3050608@ActiveState.com> Message-ID: <002701c2b9fa$ebdc2170$6401a8c0@damien> In particular the sentence "The Ouroboros biting its own tail is symbolic of self-fecundation, or the primitive idea of a self-sufficient Nature" seems to nicely describe some of the goals of pypy. > -----Original Message----- > From: David Ascher [mailto:DavidA at ActiveState.com] > Sent: Sunday, 12 January 2003 00:16 > To: Damien Morton > Cc: pypy-dev at codespeak.net > Subject: Re: [pypy-dev] Possible logo idea - ouroboros > > > Damien Morton wrote: > > >http://www.millenniumdesktop.co.uk/ouroboros.htm > > > > > Or for more pix: http://images.google.com/images?q=ouroboros > > I like it. Mystical, kind of wacky. Seems apporpriate. =) > > From fred at securenym.net Sun Jan 12 07:32:15 2003 From: fred at securenym.net (Frederic Giacometti) Date: Sat, 11 Jan 2003 22:32:15 -0800 Subject: [pypy-dev] micropython, a 'minimal python'... 
Message-ID: <3E210BEF.4080309@securenym.net> About a year ago, I developed micropython within a few hours' worth of work. The idea was to bootstrap the python build process with python itself, and get rid of all the autoconf/config stuff. micropython is therefore a 'minimal python' in the sense that it only contains the minimum required to execute regular python code, including the base os functions. micropython is a python interpreter built as a single executable, using the smallest set of source files covering only the base built-in types (no complex numbers, cobjects built-in....) from the python distribution (modulo some #ifdef), and dropping the dynamic C module loading. As a result, micropython builds with no configuration at all, from a minimal makefile running under either VC++ (nmake) or posix. The idea is that, once built, micropython be used to execute python configuration scripts (in replacement of the config.in stuff) to generate the full-fledged (platform-specific) makefile. The procedure has been successful. I did not go further just because I changed jobs and got a newborn daughter requiring my attention, and I need to get decent money into the house (which is something I have not been able to reconcile with Python, lately)... I'm not really putting any development time into these things, but if anybody has questions, I'll answer them. The source code was branched from python 2.1.1, and is available under the 'micro' branch in the 'pythonx/' directory of the CVS distribution of JPE (jpe.sf.net). The bootstrapping Makefile is: http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/jpe/pythonx/Python/Micro/Attic/Makefile?rev=1.1.2.6&only_with_tag=micro&content-type=text/vnd.viewcvs-markup The thrust behind micropython (aside from the philosophical discussions concerning lean development) was to propose an alternative to autoconf/config for bootstrapping the python build process. [For the curious: the development of the shared library facility in pythonx required editing and fixing the autoconf/config files, something I swore I'd never do again. After developing it, I became convinced that micropython should be used in place of autoconf/config for generating the platform-specific makefiles for all platforms (including win32), using the same python config script.] Cheers, Frederic G. From marc at informatik.uni-bremen.de Sun Jan 12 09:22:00 2003 From: marc at informatik.uni-bremen.de (Marc Recht) Date: Sun, 12 Jan 2003 09:22:00 +0100 Subject: [pypy-dev] Questions about the C core In-Reply-To: <00aa01c2b9dd$0e4c7d40$6d94fea9@newmexico> References: <3E20BB35.1050803@tismer.de> <2179680000.1042335174@localhost.localdomain> <00aa01c2b9dd$0e4c7d40$6d94fea9@newmexico> Message-ID: <258850000.1042359720@leeloo.intern.geht.de> >> Has anyone thought about features that Python does not have that should >> be in Minimal? My first candidate would be (ducks) macros, > > as a final user visible feature, I hope not. As an implementation then > invisible tool maybe/why not. IMHO the user visible language part should be 100% Python compatible. At least this should be the goal. But, in the build process of the Minimal Python core we could (and should?) use features to simplify the implementation. Like generating source/classes from a formal description (e.g. XML). Marc "Premature optimization is the root of all evil." -- Donald E. Knuth -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 186 bytes Desc: not available URL: From marc at informatik.uni-bremen.de Sun Jan 12 09:31:58 2003 From: marc at informatik.uni-bremen.de (Marc Recht) Date: Sun, 12 Jan 2003 09:31:58 +0100 Subject: [pypy-dev] bootstrapping issues In-Reply-To: <20030111214011.P1568@prim.han.de> References: <20030111214011.P1568@prim.han.de> Message-ID: <263580000.1042360318@leeloo.intern.geht.de> > There is the idea of not using make/configure/automake Not using automake/autoconf is IMHO "The Right Thing". > but a simple understandable debuggable (read: python based) > build environment. IOW words the number of dependencies > for building Minimal Python should also be minimal. Why not using make ? It's installed on any system that has a c compiler, so it doesn't add another dependency. IMHO we could use it to bootstrap the core and then go on with something Python based (using the bootstrapped core). While traveling through NetBSD's pkgsrc I came across this http://buildtool.sourceforge.net/ . I haven't tried it yet, but the description sounds promising. Marc "Premature optimization is the root of all evil." -- Donald E. Knuth -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 186 bytes Desc: not available URL: From DavidA at ActiveState.com Sun Jan 12 09:47:08 2003 From: DavidA at ActiveState.com (David Ascher) Date: Sun, 12 Jan 2003 00:47:08 -0800 Subject: [pypy-dev] bootstrapping issues References: <20030111214011.P1568@prim.han.de> <263580000.1042360318@leeloo.intern.geht.de> Message-ID: <3E212B8C.9010409@ActiveState.com> Marc Recht wrote: >> There is the idea of not using make/configure/automake > > Not using automake/autoconf is IMHO "The Right Thing". > >> but a simple understandable debuggable (read: python based) >> build environment. IOW words the number of dependencies >> for building Minimal Python should also be minimal. > > Why not using make ? It's installed on any system that has a c > compiler, so it doesn't add another dependency. IMHO we could use it > to bootstrap the core and then go on with something Python based > (using the bootstrapped core). Make, Configure etc. aren't "really" portable to Windows. It's hard to know when you've written a portable makefile. Make doesn't scale. Make is error-prone, etc. There are lots of problems with Make, but this isn't really the place for that discussion. See pages like: http://sc-archive.codesourcery.com/sc_build http://sc-archive.codesourcery.com/entries/build/Machina/machina-appendix.html http://www.scons.org/doc/HTML/scons-python10/t1.html http://www.a-a-p.org/ --david From andrew at indranet.co.nz Sun Jan 12 11:59:51 2003 From: andrew at indranet.co.nz (Andrew McGregor) Date: Sun, 12 Jan 2003 23:59:51 +1300 Subject: [pypy-dev] Questions about the C core In-Reply-To: <3E20F73B.6000307@ActiveState.com> References: <3E20BB35.1050803@tismer.de> <2179680000.1042335174@localhost.localdomain> <00aa01c2b9dd$0e4c7d40$6d94fea9@newmexico> <2197800000.1042341957@localhost.localdomain> <3E20F73B.6000307@ActiveState.com> Message-ID: <10420000.1042369191@localhost.localdomain> --On Saturday, January 11, 2003 21:03:55 -0800 David Ascher wrote: > On the topic of macros et al: ... > On the other hand, I'm going to lose interest in this project pretty fast > if it turns into an _unsubstantiated_ argument about language design. 
If > a new language construct is proposed as a fairly direct and > well-supported way to get the implementation done better, faster, > cheaper, then by all means. Which is why I suggested it. I'm no expert on language design, but I can see that a standardised way to extend a language that starts out less than complete (as a Python core with major chunks removed will) is better than an ad-hoc mess. Someone else suggested a sufficiently extensible parser, if I follow their intent, and I agree entirely. In a sense, a macro system is just one possible interface to such an extensible parser. Another suggestion was to allow an in-python way to interface with c system calls and library functions. Scoped macros implementing a python derivative similar to pyrex have potential here, in the same way pyrex does for extending CPython. In lisp, I always felt macros were best provided by the language designer or library designers, as part of the language or library. Applications that made use of custom macros were generally hard to read, but if what they were doing was compiler-like (my case was symbolic math) usually made sense once understood. I suspect the same would be true of python. In other words, I think a macro system is a good way to get syntactical extensibility so that Minimal Python can acheive it's goals. I think users should be discouraged from writing them, because of the potential nightmares unusual macros cause, but certain standard packages could provide some macros at little cost to the pythonicality of the results. I think it would be pretty cool if 'from pyrex import *' led to following code having pyrex semantics; this would, I think, make it easier to get Minimal Python completed. I also think that if metaclasses can be pythonic, there's a pythonic way to do macros too, but if consensus is against that, then let them simply be an implementation detail of Minimal Python. Andrew From simonb at webone.com.au Sun Jan 12 13:41:03 2003 From: simonb at webone.com.au (Simon Burton) Date: Sun, 12 Jan 2003 23:41:03 +1100 Subject: [pypy-dev] Possible logo idea - ouroboros In-Reply-To: <002601c2b9e5$8aec8180$6401a8c0@damien> References: <002601c2b9e5$8aec8180$6401a8c0@damien> Message-ID: <20030112234103.2df65df5.simonb@webone.com.au> Ouroborus, just released in movie form (i'm serious!); w. Nicolas Cage, screenplay by Charlie Kaufman : "Adaptation" go and see it if you can, it's ***** and isn't this all about adaptation? oh wow, i'm off to praise God/Buddha/Jesus etc. (Guido?) Simon Burton. From hpk at trillke.net Sun Jan 12 14:30:12 2003 From: hpk at trillke.net (holger krekel) Date: Sun, 12 Jan 2003 14:30:12 +0100 Subject: [pypy-dev] Possible logo idea - ouroboros In-Reply-To: <3E20F9F3.3050608@ActiveState.com>; from DavidA@ActiveState.com on Sat, Jan 11, 2003 at 09:15:31PM -0800 References: <002601c2b9e5$8aec8180$6401a8c0@damien> <3E20F9F3.3050608@ActiveState.com> Message-ID: <20030112143012.T1568@prim.han.de> [David Ascher Sat, Jan 11, 2003 at 09:15:31PM -0800] > Damien Morton wrote: > > >http://www.millenniumdesktop.co.uk/ouroboros.htm > > > > > Or for more pix: http://images.google.com/images?q=ouroboros > > I like it. Mystical, kind of wacky. Seems apporpriate. =) yes. I'll hand these links to aforementioned artists. But i probably add that it needn't be as "serious". 
A little cuteness and fun while biting the tail is appropriate IMO :-) holger From pedronis at bluewin.ch Sun Jan 12 14:25:52 2003 From: pedronis at bluewin.ch (Samuele Pedroni) Date: Sun, 12 Jan 2003 14:25:52 +0100 Subject: [pypy-dev] Questions about the C core References: <3E20BB35.1050803@tismer.de> <2179680000.1042335174@localhost.localdomain> <00aa01c2b9dd$0e4c7d40$6d94fea9@newmexico> <2197800000.1042341957@localhost.localdomain> <3E20F73B.6000307@ActiveState.com> <10420000.1042369191@localhost.localdomain> Message-ID: <007501c2ba3e$21668220$6d94fea9@newmexico> From: "Andrew McGregor" > Someone else suggested a sufficiently extensible parser, if I follow their > intent, and I agree entirely. In a sense, a macro system is just one > possible interface to such an extensible parser. the problem is that getting that one interface right for user consumption is darn hard and citing Guido about macros in the thread I have referred to (that I hope you have read): "I've considered it, and rejected it. That doesn't mean you shouldn't bring it up, but I expect it would turn Python into an entirely different language." > I also think that if metaclasses can be pythonic, there's a pythonic way to > do macros too, but if consensus is against that, then let them simply be an > implementation detail of Minimal Python. The issue is that using macros as a tool risks to trigger a debate on how to get them right and while in Lisp adding some form of macros is a no-brainer, this is not case for Python. So if they do not show to be inavoidable better leave the idea alone. An extensible parser with only a programmable interface can do the trick if necessary. If you want a debate about the pythonicity of macros and possible implementations of them IMO comp.lang.python is probably a better place. Honestly I don't hold my position because I dislike macros, I like macros in Common Lisp, I have thought about adding macros to Python and have partecipated to some debates about the issue. regards From lalo at laranja.org Sun Jan 12 14:31:53 2003 From: lalo at laranja.org (Lalo Martins) Date: Sun, 12 Jan 2003 11:31:53 -0200 Subject: [pypy-dev] [almost-off] ouroboros variations In-Reply-To: <002601c2b9e5$8aec8180$6401a8c0@damien> References: <002601c2b9e5$8aec8180$6401a8c0@damien> Message-ID: <20030112133153.GA4556@laranja.org> There is also the variation with two serpents, one white and one black, which represents the balance of the opposites (kind of an european version of the Tao - "Yin-Yang" - symbol). Warn the artists that this variation is probably not good for this project... And, in some variations the snake (or the two snakes) is twisted around itself in the middle: 8 / \ / \ XXXXXXX \ / \ / V []s, |alo +---- -- Those who trade freedom for security lose both and deserve neither. -- http://www.laranja.org/ mailto:lalo at laranja.org pgp key: http://www.laranja.org/pessoal/pgp Eu jogo RPG! 
(I play RPG) http://www.eujogorpg.com.br/ GNU: never give up freedom http://www.gnu.org/ From lalo at laranja.org Sun Jan 12 14:39:06 2003 From: lalo at laranja.org (Lalo Martins) Date: Sun, 12 Jan 2003 11:39:06 -0200 Subject: [pypy-dev] bootstrapping issues In-Reply-To: <20030111214011.P1568@prim.han.de> References: <20030111214011.P1568@prim.han.de> Message-ID: <20030112133906.GB4556@laranja.org> On Sat, Jan 11, 2003 at 09:40:11PM +0100, holger krekel wrote: > For avoiding the need of C-coded system modules > there is interest to code a generalization of > the "struct" module which allows *calling* functions at > the C-level. This general C-"Extension" will be > system-dependent. Someone already mentioned Squeak, which I think is something everyone involved in this project should take a look at. (http://www.squeak.org/ ) I would also mention Gnu Smalltalk (http://www.gnu.org/software/smalltalk/); it has the best C-calling-from-interpreter interface I've ever seen. []s, |alo +---- -- Those who trade freedom for security lose both and deserve neither. -- http://www.laranja.org/ mailto:lalo at laranja.org pgp key: http://www.laranja.org/pessoal/pgp Eu jogo RPG! (I play RPG) http://www.eujogorpg.com.br/ GNU: never give up freedom http://www.gnu.org/ From hpk at trillke.net Sun Jan 12 15:05:12 2003 From: hpk at trillke.net (holger krekel) Date: Sun, 12 Jan 2003 15:05:12 +0100 Subject: [pypy-dev] Questions about the C core In-Reply-To: <3E20F73B.6000307@ActiveState.com>; from DavidA@ActiveState.com on Sat, Jan 11, 2003 at 09:03:55PM -0800 References: <3E20BB35.1050803@tismer.de> <2179680000.1042335174@localhost.localdomain> <00aa01c2b9dd$0e4c7d40$6d94fea9@newmexico> <2197800000.1042341957@localhost.localdomain> <3E20F73B.6000307@ActiveState.com> Message-ID: <20030112150512.U1568@prim.han.de> [David Ascher Sat, Jan 11, 2003 at 09:03:55PM -0800] > On the topic of macros et al: > > I think that delivering minimal python will be quite hard. If the > mandate is to create a new implementation of Python, then I think that > the syntax of current Python should be seen as a "minimal" requirement > from a syntactic POV. New syntactic elements can clearly be defined as > well, although naturally care should be taken to ensure that existing > code still works. (so making 'spam' a reserved word probably wouldn't work). > > On the other hand, I'm going to lose interest in this project pretty > fast if it turns into an _unsubstantiated_ argument about language > design. If a new language construct is proposed as a fairly direct and > well-supported way to get the implementation done better, faster, > cheaper, then by all means. To me http://www.python.org/dev/culture.html has become kind of a mantra. IMO especially 'readability counts' constrains Macro ideas alot. Anyway, i am all for sticking to the language definition. Though I guess it will get easier to try out new syntax/semantic ideas. I think that the decisions from the python developers have generally been very wise and publically extending the language should really be accepted by the usual authorities. Of course, there might be some special rules or constructs in the bootstrapping process if that really helps. But even then, i think that these will be restrictions rather than extensions. Let's not give up the common coding style and readability. What might seem a gain in the short term might not play out well in the end. IOW I trust e.g. Guido more than my own judgement on these matters. 
regards, holger From hpk at trillke.net Sun Jan 12 15:07:12 2003 From: hpk at trillke.net (holger krekel) Date: Sun, 12 Jan 2003 15:07:12 +0100 Subject: [pypy-dev] [almost-off] ouroboros variations In-Reply-To: <20030112133153.GA4556@laranja.org>; from lalo@laranja.org on Sun, Jan 12, 2003 at 11:31:53AM -0200 References: <002601c2b9e5$8aec8180$6401a8c0@damien> <20030112133153.GA4556@laranja.org> Message-ID: <20030112150712.V1568@prim.han.de> [Lalo Martins Sun, Jan 12, 2003 at 11:31:53AM -0200] > There is also the variation with two serpents, one white and one black, > which represents the balance of the opposites (kind of an european version > of the Tao - "Yin-Yang" - symbol). Warn the artists that this variation is > probably not good for this project... > > And, in some variations the snake (or the two snakes) is twisted around > itself in the middle: > > 8 > / \ > / \ > XXXXXXX > \ / > \ / > V There definitely should also be an ASCII version :-) holger From tismer at tismer.com Sun Jan 12 16:32:38 2003 From: tismer at tismer.com (Christian Tismer) Date: Sun, 12 Jan 2003 16:32:38 +0100 Subject: [pypy-dev] Questions about the C core In-Reply-To: <3E20F73B.6000307@ActiveState.com> References: <3E20BB35.1050803@tismer.de> <2179680000.1042335174@localhost.localdomain> <00aa01c2b9dd$0e4c7d40$6d94fea9@newmexico> <2197800000.1042341957@localhost.localdomain> <3E20F73B.6000307@ActiveState.com> Message-ID: <3E218A96.7040405@tismer.com> David Ascher wrote: > On the topic of macros et al: No macros! > I think that delivering minimal python will be quite hard. If the > mandate is to create a new implementation of Python, then I think that > the syntax of current Python should be seen as a "minimal" requirement > from a syntactic POV. New syntactic elements can clearly be defined as > well, although naturally care should be taken to ensure that existing > code still works. (so making 'spam' a reserved word probably wouldn't > work). The last thing Minimal Python is targetted to is changing the language! We might decide to just implement a subset of the language, but we should really avoid to change anything, unless we have to. Changing the internals so radically as we are planning to do creates a lot of problems. I don't want to add language changes as yet another problem. With the new flexibility, which a re-implementation of Python in Python hopefully rpovides, it will be much easier to experiment with new language features than today. But before doing that, if at all, we need to get the basic stuff running. Especially for macros, I don't see any necessity right now. If we need to implement a template system for parameterizing often repeated python code, we have the simple approach to generate Python source from formatted template strings. If that is used extensively and turns out to be insufficient, we can think of something better. But nothing should be added before trying it without. > On the other hand, I'm going to lose interest in this project pretty > fast if it turns into an _unsubstantiated_ argument about language > design. If a new language construct is proposed as a fairly direct and > well-supported way to get the implementation done better, faster, > cheaper, then by all means. I don't see any reason to add anything now. The purpose of MiniPy is to remove as much as possible, especially from the C code. Things can then be re-implemented in Python, they can be thought over and get a different design to try out. I also see no problem with the Python language as it is now. 
Most of it is available as Python code in the compiler module. Just a few internal things like the token parser and some others are only available as C code. We can either keep them as they are, or re-code them in Python. It is even thinkable to make certain existing language features pluggable and configurable. But again, that's a matter for the future. Right now we need to get all available Python code to run. That means implementing the minimum core that's needed, and not messing with the language. ciao - chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Johannes-Niemeyer-Weg 9a : *Starship* http://starship.python.net/ 14109 Berlin : PGP key -> http://wwwkeys.pgp.net/ work +49 30 89 09 53 34 home +49 30 802 86 56 pager +49 173 24 18 776 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? http://www.stackless.com/ From tismer at tismer.com Sun Jan 12 16:37:32 2003 From: tismer at tismer.com (Christian Tismer) Date: Sun, 12 Jan 2003 16:37:32 +0100 Subject: [pypy-dev] Questions about the C core In-Reply-To: <20030112150512.U1568@prim.han.de> References: <3E20BB35.1050803@tismer.de> <2179680000.1042335174@localhost.localdomain> <00aa01c2b9dd$0e4c7d40$6d94fea9@newmexico> <2197800000.1042341957@localhost.localdomain> <3E20F73B.6000307@ActiveState.com> <20030112150512.U1568@prim.han.de> Message-ID: <3E218BBC.4020502@tismer.com> holger krekel wrote: > [David Ascher Sat, Jan 11, 2003 at 09:03:55PM -0800] > >>On the topic of macros et al: >> >>I think that delivering minimal python will be quite hard. If the >>mandate is to create a new implementation of Python, then I think that >>the syntax of current Python should be seen as a "minimal" requirement >>from a syntactic POV. New syntactic elements can clearly be defined as >>well, although naturally care should be taken to ensure that existing >>code still works. (so making 'spam' a reserved word probably wouldn't work). >> >>On the other hand, I'm going to lose interest in this project pretty >>fast if it turns into an _unsubstantiated_ argument about language >>design. If a new language construct is proposed as a fairly direct and >>well-supported way to get the implementation done better, faster, >>cheaper, then by all means. > > > To me http://www.python.org/dev/culture.html has become kind of a mantra. > IMO especially 'readability counts' constrains Macro ideas alot. > > Anyway, i am all for sticking to the language definition. Though > I guess it will get easier to try out new syntax/semantic ideas. > > I think that the decisions from the python developers have generally > been very wise and publically extending the language should really be > accepted by the usual authorities. Of course, there might be > some special rules or constructs in the bootstrapping process > if that really helps. But even then, i think that these will > be restrictions rather than extensions. Let's not give up > the common coding style and readability. What might seem a > gain in the short term might not play out well in the end. > > IOW I trust e.g. Guido more than my own judgement on these matters. You are absolutely right! We are not here for language design. That's already done by Guido, and he is right about it. We just want to try a different implementation. There is a high risk that this will just be a waste of effort, and something that the core group cannot afford to try, due to lack of time. What we effectively are doing is a prototype of a new implementation based upon new techniques.
This is explorative programming, pioneer work, an experiment. We will see if it succeeds. It will help Python either way. ciao - chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Johannes-Niemeyer-Weg 9a : *Starship* http://starship.python.net/ 14109 Berlin : PGP key -> http://wwwkeys.pgp.net/ work +49 30 89 09 53 34 home +49 30 802 86 56 pager +49 173 24 18 776 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? http://www.stackless.com/ From tismer at tismer.com Sun Jan 12 16:47:08 2003 From: tismer at tismer.com (Christian Tismer) Date: Sun, 12 Jan 2003 16:47:08 +0100 Subject: [pypy-dev] Questions about the C core In-Reply-To: <258850000.1042359720@leeloo.intern.geht.de> References: <3E20BB35.1050803@tismer.de> <2179680000.1042335174@localhost.localdomain> <00aa01c2b9dd$0e4c7d40$6d94fea9@newmexico> <258850000.1042359720@leeloo.intern.geht.de> Message-ID: <3E218DFC.6010005@tismer.com> Marc Recht wrote: >>> Has anyone thought about features that Python does not have that should >>> be in Minimal? My first candidate would be (ducks) macros, >> >> >> as a final user visible feature, I hope not. As an implementation then >> invisible tool maybe/why not. > > IMHO the user visible language part should be 100% Python compatible. At > least this should be the goal. But, in the build process of the Minimal > Python core we could (and should?) use feature to simplify the > implementation. Like generating source/classes from a formal description > (eg. XML). I have the same goal. I also don't hesitate to use some abbreviative tools for repeated code, maybe. But only if it helps to simplify, not to change the language. Furthermore, thinking of macros again, they make less sense for me every time I think it over. Our final target is an intelligent code generator which generates optimized code from well-written Python. How could a macro system help here? We would instead loose structure and make optimization harder, since the macro generated code needs to be analysed and optimized, later. That is bullshit, IMHO. Abstractions, like metaclases, which specialize themselves into other concrete classes make more sense to me. This allows to optimize without loosing structural information. XML on the other hand, as an intermediate language or for pickling makes sense to me. At least it is easier to parse than Python. (But we *have* the parser...) -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Johannes-Niemeyer-Weg 9a : *Starship* http://starship.python.net/ 14109 Berlin : PGP key -> http://wwwkeys.pgp.net/ work +49 30 89 09 53 34 home +49 30 802 86 56 pager +49 173 24 18 776 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? 
http://www.stackless.com/ From pedronis at bluewin.ch Sun Jan 12 16:46:41 2003 From: pedronis at bluewin.ch (Samuele Pedroni) Date: Sun, 12 Jan 2003 16:46:41 +0100 Subject: [pypy-dev] Questions about the C core References: <3E20BB35.1050803@tismer.de> <2179680000.1042335174@localhost.localdomain><00aa01c2b9dd$0e4c7d40$6d94fea9@newmexico><2197800000.1042341957@localhost.localdomain><3E20F73B.6000307@ActiveState.com> <10420000.1042369191@localhost.localdomain> <007501c2ba3e$21668220$6d94fea9@newmexico> Message-ID: <02b901c2ba51$cd364b40$6d94fea9@newmexico> [me] > Honestly I don't hold my position because I dislike macros, I like macros in > Common Lisp, I have thought about adding macros to Python and have partecipated > to some debates about the issue. my current position being that macros in Python are out-of-place. From hpk at trillke.net Sun Jan 12 19:43:53 2003 From: hpk at trillke.net (holger krekel) Date: Sun, 12 Jan 2003 19:43:53 +0100 Subject: [pypy-dev] link documentation Message-ID: <20030112194353.B1568@prim.han.de> hi, it would be nice if people that recommend good links would also care to write a paragraph why a link might be significant. Maybe you know the theory that *everything* is just six links (or less) away :-) thanks, holger From DavidA at ActiveState.com Sun Jan 12 20:27:19 2003 From: DavidA at ActiveState.com (David Ascher) Date: Sun, 12 Jan 2003 11:27:19 -0800 Subject: [pypy-dev] Intellectual property Message-ID: <3E21C197.8010403@ActiveState.com> I'd like to suggest that the issue of intellectual property and license be tackled _very_ early. Anything else is inexcusable IMO =). Proposal: - all of the minipy IP is assigned to the PSF. - the license is either the PSF license or the Academic Free License http://www.opensource.org/licenses/academic.php. (the advangate of the AFL is the patent protection, which I suspect may be relevant in this particular implementation). --da From hpk at trillke.net Sun Jan 12 20:57:29 2003 From: hpk at trillke.net (holger krekel) Date: Sun, 12 Jan 2003 20:57:29 +0100 Subject: [pypy-dev] Intellectual property In-Reply-To: <3E21C197.8010403@ActiveState.com>; from DavidA@ActiveState.com on Sun, Jan 12, 2003 at 11:27:19AM -0800 References: <3E21C197.8010403@ActiveState.com> Message-ID: <20030112205729.E1568@prim.han.de> [David Ascher Sun, Jan 12, 2003 at 11:27:19AM -0800] > I'd like to suggest that the issue of intellectual property and license > be tackled _very_ early. Anything else is inexcusable IMO =). > > Proposal: > - all of the minipy IP is assigned to the PSF. > - the license is either the PSF license or the Academic Free License > http://www.opensource.org/licenses/academic.php. > (the advangate of the AFL is the patent protection, which I suspect > may be relevant in this particular implementation). Why start a flame war that early? :-) seriously, i don't think there is reason to deviate here from the usual PSF license. If we wanted to integrate a very cool tool which was GPLed i wouldn't mind too much, though. It's functionality might later get rewritten by someone who cares enough. 
Anyway, due to time manipulations the copyright has already been assigned to the PSF. From tismer at tismer.com Sun Jan 12 22:09:07 2003 From: tismer at tismer.com (Christian Tismer) Date: Sun, 12 Jan 2003 22:09:07 +0100 Subject: [pypy-dev] Intellectual property In-Reply-To: <3E21C197.8010403@ActiveState.com> References: <3E21C197.8010403@ActiveState.com> Message-ID: <3E21D973.50703@tismer.com> David Ascher wrote: > I'd like to suggest that the issue of intellectual property and license > be tackled _very_ early. Anything else is inexcusable IMO =). David? Whom are you afraid of? Not me, I hope! > Proposal: > - all of the minipy IP is assigned to the PSF. > - the license is either the PSF license or the Academic Free License > http://www.opensource.org/licenses/academic.php. > (the advangate of the AFL is the patent protection, which I suspect > may be relevant in this particular implementation). Hmm. Or do you believe this project can be so *very* successful that we should be afraid of larger companies, who might 'see sharp' upon us. ;-) -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Johannes-Niemeyer-Weg 9a : *Starship* http://starship.python.net/ 14109 Berlin : PGP key -> http://wwwkeys.pgp.net/ work +49 30 89 09 53 34 home +49 30 802 86 56 pager +49 173 24 18 776 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? http://www.stackless.com/ From marc at informatik.uni-bremen.de Sun Jan 12 23:31:07 2003 From: marc at informatik.uni-bremen.de (Marc Recht) Date: Sun, 12 Jan 2003 23:31:07 +0100 Subject: [pypy-dev] Intellectual property In-Reply-To: <3E21C197.8010403@ActiveState.com> References: <3E21C197.8010403@ActiveState.com> Message-ID: <563480000.1042410667@leeloo.intern.geht.de> > I'd like to suggest that the issue of intellectual property and license > be tackled _very_ early. Anything else is inexcusable IMO =). I thought about posting a mail like this just some minutes before I saw your mail.. :-) > Proposal: > - all of the minipy IP is assigned to the PSF. > - the license is either the PSF license or the Academic Free License > http://www.opensource.org/licenses/academic.php. (the advangate of > the AFL is the patent protection, which I suspect may be relevant in this > particular implementation). IMHO the PSF license would be the best choice. It's the easiest way for both sides (MiniPy and CPython) to exchange code. Marc "Premature optimization is the root of all evil." -- Donald E. Knuth From andrew at indranet.co.nz Mon Jan 13 00:56:25 2003 From: andrew at indranet.co.nz (Andrew McGregor) Date: Mon, 13 Jan 2003 12:56:25 +1300 Subject: [pypy-dev] Questions about the C core In-Reply-To: <3E218BBC.4020502@tismer.com> References: <3E20BB35.1050803@tismer.de> <2179680000.1042335174@localhost.localdomain> <00aa01c2b9dd$0e4c7d40$6d94fea9@newmexico> <2197800000.1042341957@localhost.localdomain> <3E20F73B.6000307@ActiveState.com> <20030112150512.U1568@prim.han.de> <3E218BBC.4020502@tismer.com> Message-ID: <2723120000.1042415785@localhost.localdomain> --On Sunday, January 12, 2003 16:37:32 +0100 Christian Tismer wrote: > holger krekel wrote: >> [David Ascher Sat, Jan 11, 2003 at 09:03:55PM -0800] >> >>> On the topic of macros et al: >>> >>> I think that delivering minimal python will be quite hard.
If the >>> mandate is to create a new implementation of Python, then I think that >>> the syntax of current Python should be seen as a "minimal" requirement >>> from a syntactic POV. New syntactic elements can clearly be defined as >>> well, although naturally care should be taken to ensure that existing >>> code still works. (so making 'spam' a reserved word probably wouldn't >>> work). >>> >>> On the other hand, I'm going to lose interest in this project pretty >>> fast if it turns into an _unsubstantiated_ argument about language >>> design. If a new language construct is proposed as a fairly direct and >>> well-supported way to get the implementation done better, faster, >>> cheaper, then by all means. >> >> >> To me http://www.python.org/dev/culture.html has become kind of a >> mantra. IMO especially 'readability counts' constrains Macro ideas >> alot. >> >> Anyway, i am all for sticking to the language definition. Though >> I guess it will get easier to try out new syntax/semantic ideas. >> >> I think that the decisions from the python developers have generally >> been very wise and publically extending the language should really be >> accepted by the usual authorities. Of course, there might be >> some special rules or constructs in the bootstrapping process >> if that really helps. But even then, i think that these will >> be restrictions rather than extensions. Let's not give up >> the common coding style and readability. What might seem a >> gain in the short term might not play out well in the end. >> >> IOW I trust e.g. Guido more than my own judgement on these matters. > > You are absolutely right! > > We are not here for language design. > That's already done by Guido, and he is > right about it. We just want to try a > different implementation. This is a high > risk to be just a waste and something that > the core group cannot afford to try, due > to lack of time. > > What we effectively are doing is a prototype > of a new implementation based upon new > techniques. This is explorative programming, > pioneer work, an experiment. > We will see if it succeeds. It will help Python > either way. > > ciao - chris Fair enough. I simply thought that macros were an *old* technique that could be useful, even if only as part of the implementation, but the consensus is otherwise. I don't quite understand, but I'll shut up now :-) I guess the operative part of the culture document is #5, 'Flat is better than nested'. Please don't continue the thread, I'm convinced. Andrew From bokr at oz.net Mon Jan 13 01:06:39 2003 From: bokr at oz.net (Bengt Richter) Date: Sun, 12 Jan 2003 16:06:39 -0800 Subject: [pypy-dev] Pypy Roadmap Message-ID: <5.0.2.1.1.20030112132955.00a6d800@mail.oz.net> Hi, this sounds interesting ;-) I was wondering how you were planning on identifying concrete tasks and assigning and tracking them, and keeping status visible to support cooperation. Do you already have CVS and bug/issue tracking methodology settled? I.e., such stuff is nuisance overhead or sanity preservation depending on the task, and the development phase, and scale/distribution of effort. Is there one single home page that leads to everything relevant, with some indication of historical vs current vs fire alarm links? Googling pypy got me (in two steps) to http://twistedmatrix.com/users/jh.twistd/moin/moin.cgi/PyPy Should that page be "claimed"? Here are some things that occur to me to think about in making a roadmap for your project. 
If you would make something like it to reflect your _actual_ project plans, it would help me (and maybe some other lurkers ;-) get an idea of what you are really up to ;-) 1.0 Historical background 1.1 Psyco 1.2 core group forms and evolves pypy idea 1.3 core group decides to go public, announces pypy-dev list 2.0 Initial pypy-dev discussions 2.1 Happening now ;-) 2.2 Preliminary expressions of interest, ideas 3.0 Straw-man roadmap 3.1 Decide hosting, bug/issue tracking methodology issues 3.1.1 Designation of primary info channels 3.1.1.1 Discussion - pypy-dev list, presumably 3.1.1.2 Current task/subproject/bugfix assignments/status 3.1.1.3 Single official page as comprehensive root of info tree 3.1.2 Sourceforge? 3.1.2.1 mirror/backup/file authentication issues? 3.1.3 pypy SDK? Recommended win32 & unix tool kits, compiler versions etc. 3.1.4 Windows vs unix issues 3.1.4.1 Maintain parallel MSVC++ project files like CPython's PCbuild directory? 3.1.4.2 Platform-independent builds? 3.2 Language issues 3.2.1 Decide on minimal core language bootstrap subset? 3.2.1.1 Identify needs for supporting 4.x below 3.2.1.2 Threading/locking/synchronizing issues? 3.2.2 Compiler hints/directives - what info & how? 3.2.2.1 command line opts? 3.2.2.2 config files? 3.2.2.3 source-embedded directives/pragmas/etc? 3.3.3.4 Preprocessing? 3.3 Other implementation issues? 3.3.1 Foreign function/C interface 3.3.2 New VM? Intermediate language representation? 3.3.3 Memory allocation/garbage collection/reference counting 3.3.4 Resource monitoring (guaranteed timely finalization?) 3.3.5 Security support 3.3.5.1 Sandboxing, restricted execution, resource quotas? 3.3.5.2 Special considerations for CGI/server contexts? 3.3.6 Checkpointable execution support, fastload images? 3.4 How to minimize platform C library dependencies for bootstrap core? 3.4.1 Strategy for weaning from temporary uses 3.5 Decide on initial build targets 3.5.1 Executables: Win32, unix, bootable images? 3.5.1 Libraries: DLLs, .so's 4.0 After the minimal core bootstrap language works 4.1 Re-implementation strategy using core boostrap language to write next level of language features. 4.1.1 How many levels of subset bootstrapping? Is it a hierarchy? 4.2 Special language requirements for writing what is now C in Cpython? Well, that's all the thoughts I have for now ;-) Regards, Bengt Richter From z3p at twistedmatrix.com Mon Jan 13 01:07:39 2003 From: z3p at twistedmatrix.com (Paul Swartz) Date: Sun, 12 Jan 2003 19:07:39 -0500 Subject: [pypy-dev] PyVM: Python Bytecode interpreter, in Python Message-ID: <3E21BCFB.18262.1F9C0D6@localhost> An idea I've been toying around with is a Python bytecode interpreter, written in Python. Currently, it works decently, supporting pretty much everything except generators. Might be useful to look at. The code is available at: http://www.twistedmatrix.com/users/z3p/files/pyvm2.py -p -- Paul Swartz (o_ http://twistedmatrix.com/users/z3p.twistd/ //\ z3p at twistedmatrix.com V_/_ AIM: Z3Penguin From DavidA at ActiveState.com Mon Jan 13 01:33:14 2003 From: DavidA at ActiveState.com (David Ascher) Date: Sun, 12 Jan 2003 16:33:14 -0800 Subject: [pypy-dev] Intellectual property References: <3E21C197.8010403@ActiveState.com> <3E21D973.50703@tismer.com> Message-ID: <3E22094A.9050507@ActiveState.com> Christian Tismer wrote: > David Ascher wrote: > >> I'd like to suggest that the issue of intellectual property and >> license be tackled _very_ early. Anything else is inexcusable IMO =). > > > David? Whom are you afraid of? 
Not me, I hope! I'm afraid of no one -- I'm afraid of lazyness which makes early non-decisions hard to reverse. Some projects would like to change their license, for example, but can't because the IP assignment is so complicated. >> Proposal: >> - all of the minipy IP is assigned to the PSF. >> - the license is either the PSF license or the Academic Free License >> http://www.opensource.org/licenses/academic.php. >> (the advangate of the AFL is the patent protection, which I >> suspect may be relevant in this particular implementation). > > > Hmm. Or do you believe this project can be so > *very* successful that we should be afraid > of larger companies, who might 'see sharp' > upon us. ;-) I just think defensive patent strategies should be thought about. It's ok, as long as the PSF has copyright, the PSF could always change the license on later releases if it saw fit. --david From rodrigobamboo at terra.com.br Mon Jan 13 02:08:53 2003 From: rodrigobamboo at terra.com.br (Rodrigo B. de Oliveira) Date: 12 Jan 2003 23:08:53 -0200 Subject: [pypy-dev] multithread support review? Message-ID: <1042420133.1698.19.camel@dhcppc0> Are you considering reviewing the current state of multi thread support in python? I mean something like removing the need for the GIT :-) Regards, Rodrigo From tismer at tismer.com Mon Jan 13 02:11:48 2003 From: tismer at tismer.com (Christian Tismer) Date: Mon, 13 Jan 2003 02:11:48 +0100 Subject: [pypy-dev] PyVM: Python Bytecode interpreter, in Python In-Reply-To: <3E21BCFB.18262.1F9C0D6@localhost> References: <3E21BCFB.18262.1F9C0D6@localhost> Message-ID: <3E221254.4040702@tismer.com> Paul Swartz wrote: > An idea I've been toying around with is a Python bytecode > interpreter, written in Python. Currently, it works > decently, supporting pretty much everything except > generators. Might be useful to look at. The code is > available at: > http://www.twistedmatrix.com/users/z3p/files/pyvm2.py Ho hoo! That's pretty nice to read, something like the very basic bytecode interpreter, based upon representations of the internal obejcts, emulated using Python's basic builtin types. This looks pretty much like something that Holger suggested, partially. It is not implementing the new class/type system, so I guess it is based upon Python 2.0? Yes, I see byte_CALL_FUNCTION, which has been split into some more for 2.2 and up. I see the individual byte codes being implemented as small functions. That is very much as I like it. You are not trying to implement the basic stuff, but you borrow it from the Python interpreter, like print. You are also borrowing lists, tuples and dicts, the interpreter is just a mapping of bytecodes to builtin actions. That's fine to start with! I saw a generator class, at least. This is just not working, right? I had only five minutes yet to read it. Nice thing, thanks for showing it to us! This is exactly the kind of input we need. sincerely -- chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Johannes-Niemeyer-Weg 9a : *Starship* http://starship.python.net/ 14109 Berlin : PGP key -> http://wwwkeys.pgp.net/ work +49 30 89 09 53 34 home +49 30 802 86 56 pager +49 173 24 18 776 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? http://www.stackless.com/ From tismer at tismer.com Mon Jan 13 02:22:39 2003 From: tismer at tismer.com (Christian Tismer) Date: Mon, 13 Jan 2003 02:22:39 +0100 Subject: [pypy-dev] multithread support review? 
In-Reply-To: <1042420133.1698.19.camel@dhcppc0> References: <1042420133.1698.19.camel@dhcppc0> Message-ID: <3E2214DF.5080206@tismer.com> Rodrigo B. de Oliveira wrote: > Are you considering reviewing the current state of multi thread support > in python? I mean something like removing the need for the GIT :-) The GIL? Well, speaking for myself, I never liked the necessity to have a GIL, nor did I like the idea of building free threading into everything. Personally, I would prefer not to support threads in that sense at all, but only microthreads/tasklets from the Stackless package, which is much cheaper to implement. In my model, real threads in that sense would not have locking problems in the first place, but they would much more look like disjoint processes. I would not let run multiple threads in one interpreter, but give every thread an extra interpreter, and have all threads strictly disjoint. Communication would then be an extra action, which has to be done explicitly, by passign read-only references or explicit copies for writing. That would remove the load to always lock threads against each other, or to make all vulnerable objects thread-safe. Well, that's how I had done it, probably. Without re-designing Python (which is one of our goals, avoid language changes), we have no chance. Either threads have to be left out for now, or we need the GIL, too. I don't think it should be a primary goal of MiniPy to solve the Python thread problem in the first place. I believe, after having a running MiniPy, it can be used to exercise different threading models, and this much easier than with the current C based version. I don't believe that Python's current thread model is the final word spoken. I would like to change it or to provide alternatives. This is a reason to have something like MiniPy, but not part of it. ciao - chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Johannes-Niemeyer-Weg 9a : *Starship* http://starship.python.net/ 14109 Berlin : PGP key -> http://wwwkeys.pgp.net/ work +49 30 89 09 53 34 home +49 30 802 86 56 pager +49 173 24 18 776 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? http://www.stackless.com/ From z3p at twistedmatrix.com Mon Jan 13 02:58:30 2003 From: z3p at twistedmatrix.com (Paul Swartz) Date: Sun, 12 Jan 2003 20:58:30 -0500 Subject: [pypy-dev] PyVM: Python Bytecode interpreter, in Python In-Reply-To: <3E221254.4040702@tismer.com> References: <3E21BCFB.18262.1F9C0D6@localhost> Message-ID: <3E21D6F6.27697.25F3C35@localhost> On 13 Jan 2003 at 2:11, Christian Tismer wrote: > You are not trying to implement the basic stuff, > but you borrow it from the Python interpreter, > like print. You are also borrowing lists, tuples > and dicts, the interpreter is just a mapping of > bytecodes to builtin actions. > That's fine to start with! Yes, ATM I borrow lists, tuples, and dicts, but once I work out new-style classes, I think I'll wrap them and everything in a generic PyObject class, to make them behave like Objects do now. > I saw a generator class, at least. This is just > not working, right? I had only five minutes yet > to read it. Yes, there's a Generator class, but I've had some trouble getting them implemented, so the test for them and the code that would create them is disabled. 
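As a concrete illustration of the pattern being discussed (a Python loop that dispatches on bytecodes and simply borrows the host interpreter's objects), here is a toy sketch. It is not the pyvm2.py code linked above; it is written against the modern dis module and handles only the few opcodes needed for a trivial expression, so treat it as a reading aid rather than an implementation.

# Toy bytecode interpreter: dispatch on opcode names and borrow Python's
# own objects for the stack values. Handles just enough opcodes for a
# simple arithmetic expression.
import dis

def run_expr(source, env):
    stack = []
    for ins in dis.get_instructions(compile(source, "<expr>", "eval")):
        if ins.opname == "LOAD_CONST":
            stack.append(ins.argval)
        elif ins.opname == "LOAD_NAME":
            stack.append(env[ins.argval])
        elif ins.opname == "BINARY_ADD" or (
            ins.opname == "BINARY_OP" and ins.argrepr == "+"
        ):
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)
        elif ins.opname == "RETURN_VALUE":
            return stack.pop()
        elif ins.opname == "RESUME":  # bookkeeping opcode on newer CPythons
            continue
        else:
            raise NotImplementedError(ins.opname)

print(run_expr("a + b", {"a": 40, "b": 2}))  # -> 42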
-p -- Paul Swartz (o_ http://twistedmatrix.com/users/z3p.twistd/ //\ z3p at twistedmatrix.com V_/_ AIM: Z3Penguin From tismer at tismer.com Mon Jan 13 03:10:30 2003 From: tismer at tismer.com (Christian Tismer) Date: Mon, 13 Jan 2003 03:10:30 +0100 Subject: [pypy-dev] pypy: static vs. dynmic (was: [ann] Minimal Python project) In-Reply-To: <7xk7havwod.fsf@ruckus.brouhaha.com> References: <3E2058FA.F676DCFC@engcorp.com> <7xsmvzw91d.fsf@ruckus.brouhaha.com> <7xk7havwod.fsf@ruckus.brouhaha.com> Message-ID: <3E222016.7000402@tismer.com> Paul Rubin wrote: > Christian Tismer writes: > >>>>... that can be >>>>achieved with a static system. >> >>With a dynamic system, you can have backwards compatibility as well, >>together with the flexibility to try alternatives on-the-fly rather >>easily. Changing some very internal things to try something out is >>still not easy, but with welded C code, this is almost impossible. > > > Can you explain what you mean by static vs. dynamic? Yes. Please see below. > Is PHP a static system or a dynamic one? I ask this because the PHP > interpreter has been benchmarked as quite a bit faster than Python. > Of course that must be partly because the PHP language doesn't use > abstract objects so extensively, so the interpreter doesn't have to do > as many dictionary lookups all the time. Sorry, I don't have enough insight into PHP to answer that question. And I doubt I will ever have that, since what I saw my son programming in PHP was much too perlish to invoke my interest :-) (although I'm happy to see him programming at all, hoping he will get back to Python in some future. At the moment, he's trying to distinguish form his father, which I can understand...) I will try to explain my current perception of static and dynamic in the context of building a Python compiler. Please excuse that I'm writing out of the top of my head. This is more a feeling than exact information. Somebody will augment this for sure, and I appreciate it. What is STATIC? --------------- Regardles of PHP (someone more enlighted may judge this), by static I mean something like a C compiler, which is fed by a number of C sources, which are fixed at some compile time, with no chance to get changed. There is exactly one build process, which leads to an executable, and that's it. From then on, this executable has to fit everything, for every application and for all time, until another version is compiled. The static executable is also almost always not optimized for the platform, the processor brand, the amount of physical memory available, and the applications to be run. Finally, the static build is also dependant of a C compiler created by somebody, where it has no real control about. This is especially true for closed source platforms like Windows, but also open sourced compilers like the gcc family are huge projects, very difficult to understand for average programmers like me, so they are black holes as well. That makes optimizations into something like a try and error game instead into a scientific project. I admit that the last argument is not specific to the static tag, it is just an observation that comes for free. Implementing one's own system, specifically designed towards the language to implement and not trying to solve the world's needs for a compiler surely creates an easier to understand and optimize tool. What is DYNAMIC? ---------------- Dynamic means to avoid compling lots of your system based upon a fixed set of C sources, with some typical C compiler. You try to avoid C at all. 
Instead, you try to express as much as possible with a language that is dynamic by nature (for the implicit definition of dynamicness every Python programmer should feel). You rely on the fact that your Python code will be fed into a specializing compiler which is able to do several things which a C compiler normally cannot do: - The executed code is analyzed at runtime. This gives the compiler the opportunity to optimize code driven by the needs of the actual application. This can lead to very different implementation of certain interpreter activities, in a very application dependant manner. - The machine code can be created with knowledge of the actual hardware platform, to any thinkable extent! If there is a machine specification available that matches the current hardware, every speciality of the hardware can be used for optimization. To my knowledge, there is no such extensive optimization available at this time, but it is possible and likely to appear. The compiler can take measures of cache lines, main memory size and access time, special register sets in the CPU, everything is possible. This is out of the scope of a static C compiler. - By avoiding the so-called C runtime system, or minimizing it to the absolutely necessary, the fence between Python code and library implementation of Python objects vanishes. This opens these things to optimization at runtime. There is no longer a Python interpreter, written in C, and a set of Python objects, implemented in C, with the interpreter calling methods of these objects. Instead, the optmization of the interpreter (written in Python) can dynamically decide, wheather it makes sense to call a method of certain objects in a pre-defined manner, or it can try a specialized version of that method (again written in Python), and it can implement this method inlined into the specialized version of the interpreter, that makes sense in just that current context. A C compiler has no chance to do something comparable, simply because it does not have the runtime info, but especially since it has no idea of Python at all! By writing almost everything in Python, we are able to generate every possible optimization, since we still have the full knowledge of the abstractions, of what we wanted to implement, of our objects and of our targets.* This is what I call 'dynamic', and this is what I hope to be the __future__ of Python dynamically y'rs -- chris P.s.: * This is btw. something that would be lost by using macros, unless we invent a macro language which does know what it is doing. Almost all macro language I've seen so far did a dumb textual or tuple replacement. This is what language designers do when they don't know how to continue. Macros are a powerful extension to weak languages. Powerful languages don't need macros by definiton. -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Johannes-Niemeyer-Weg 9a : *Starship* http://starship.python.net/ 14109 Berlin : PGP key -> http://wwwkeys.pgp.net/ work +49 30 89 09 53 34 home +49 30 802 86 56 pager +49 173 24 18 776 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? 
http://www.stackless.com/ From tismer at tismer.com Mon Jan 13 03:12:48 2003 From: tismer at tismer.com (Christian Tismer) Date: Mon, 13 Jan 2003 03:12:48 +0100 Subject: [pypy-dev] PyVM: Python Bytecode interpreter, in Python In-Reply-To: <3E21D6F6.27697.25F3C35@localhost> References: <3E21BCFB.18262.1F9C0D6@localhost> <3E21D6F6.27697.25F3C35@localhost> Message-ID: <3E2220A0.7040205@tismer.com> Paul Swartz wrote: > On 13 Jan 2003 at 2:11, Christian Tismer wrote: ... >>I saw a generator class, at least. This is just >>not working, right? I had only five minutes yet >>to read it. > > > Yes, there's a Generator class, but I've had some > trouble getting them implemented, so the test for > them and the code that would create them is > disabled. Can I count on you participating in our sprint! -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Johannes-Niemeyer-Weg 9a : *Starship* http://starship.python.net/ 14109 Berlin : PGP key -> http://wwwkeys.pgp.net/ work +49 30 89 09 53 34 home +49 30 802 86 56 pager +49 173 24 18 776 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? http://www.stackless.com/ From guido at python.org Mon Jan 13 03:37:23 2003 From: guido at python.org (Guido van Rossum) Date: Sun, 12 Jan 2003 21:37:23 -0500 Subject: [pypy-dev] Stuff that already exists in Python In-Reply-To: Your message of "Sun, 12 Jan 2003 16:06:39 PST." <5.0.2.1.1.20030112132955.00a6d800@mail.oz.net> References: <5.0.2.1.1.20030112132955.00a6d800@mail.oz.net> Message-ID: <200301130237.h0D2bNj08426@pcp02138704pcs.reston01.va.comcast.net> Off the top of my head, there's quite a bit of Python that has been reimplemented in Python available already: - Nearly all the import machinery (less the new PEP 302 stuff) is implemented in ihooks.py. - Command line reading is in code.py and codeop.py - A bytecode compiler is in the compiler package - A re-based tokenizer is in tokenize.py - I think that a Python implementation of marshal is (or was) part of Jython; I vaguely recall writing one when Jython was young The parser (needed by the bytecode compiler) is missing, but it's not particularly difficult to write. Not sure if this helps, but it's all I can contribute right now. --Guido van Rossum (home page: http://www.python.org/~guido/) From tismer at tismer.com Mon Jan 13 03:39:11 2003 From: tismer at tismer.com (Christian Tismer) Date: Mon, 13 Jan 2003 03:39:11 +0100 Subject: [pypy-dev] Re: [ann] Minimal Python project In-Reply-To: <7x8yxqduco.fsf@ruckus.brouhaha.com> References: <7xr8bjgz4x.fsf@ruckus.brouhaha.com> <7x4r8ej1r9.fsf@ruckus.brouhaha.com> <7x8yxqduco.fsf@ruckus.brouhaha.com> Message-ID: <3E2226CF.2050002@tismer.com> Paul Rubin wrote: > Christian Tismer writes: Now I'm trying to do my second answer to this. >>>I think that local declarations and type hints are useful language >>>improvements for more reasons than helping generate fast code... [me, arguing against introduction of new features, introducing many fruitless discussions...] > Yes, if you want to implement an extension of this type it's better to > just pick out a way to do it and code it up, than to spend weeks > posting netnews about different possible methods. There's also a nice > characteristic of experimental implementations, that if you don't like > how a feature works out, you can change it, unlike if it gets into a > real Python release and people start depending on it. 
However, I can > understand your approach of wanting to leave it out completely at > first and possibly add it later if it's needed. I guess that something will be needed sooner or later, also that we will implement some extensions raher soon, but not publishing them as a valid langugage extension. It is great to keep the flexibility as you mentioned, and this project will need several iterations of many constructs. I think I should have mentioned Extreme Programming, earlier. One necessary is to do without lots of fixed design decisions. It is urgent to be flexible and open to new ideas. In the sprint, we will probably go many different ways at the same time, and drop most of them, soon. Our way of examinining new ways of programming will be as extreme as the principles of extreme programming. There is for sure no other way. [conservative about changing language] > I think you should feel willing to take some liberties with the > language if it makes your implementation cleaner. A lot of the weird > corners of Python seem to me to be implementation hacks based on > CPython internals anyway. Plus, I've mentioned that coding in Python > gives me something like the joy that I imagine that the 1960's Lisp > hackers must have felt. The language itself is in similar shape to > 1960's Lisp, with just two implementations (CPython and Jython), both > of them interpreters. If the development of native-code Python > compilers results in some language evolution like it did for Lisp, > that's natural and not a bad thing. However, it all depends on what > your goals are. You know what my goals are. Smaller, more flexible, faster, easier to change, easier to maintain, easier to keep backwards compatible, more portable due to less C code, down-sizeable by features (which is most difficult), the full catastrophe... > I don't personally see a pure, faithful, exact > reimplementation of a static target whose existing implementation is > free and works perfectly well on a wide range of platforms as being > something I'd want to devote precious volunteer energy to. It's much > more interesting to be able to expand the boundaries of what's been > done before (as Stackless expanded boundaries). However, YMMV. We will try to implement Python as exact and clean as possible. The langage should be implemented completely. At the same time, as much as possible should become pluggable. It will be possible to have MiniPy without floats, without longs, without Unicode, without generators, without bool, without enums, it will be possible to have a Python that cannot generate any new types and classes, and so on. Modules which depend on these features will then not work. It will be a major amount of work to deduce the dependencies of features, and how to arrange them in a scalable shape. I do believe that the core group will help us with that. positively yours -- chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Johannes-Niemeyer-Weg 9a : *Starship* http://starship.python.net/ 14109 Berlin : PGP key -> http://wwwkeys.pgp.net/ work +49 30 89 09 53 34 home +49 30 802 86 56 pager +49 173 24 18 776 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? 
http://www.stackless.com/ From tismer at tismer.com Mon Jan 13 04:00:37 2003 From: tismer at tismer.com (Christian Tismer) Date: Mon, 13 Jan 2003 04:00:37 +0100 Subject: [pypy-dev] Stuff that already exists in Python In-Reply-To: <200301130237.h0D2bNj08426@pcp02138704pcs.reston01.va.comcast.net> References: <5.0.2.1.1.20030112132955.00a6d800@mail.oz.net> <200301130237.h0D2bNj08426@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <3E222BD5.8050601@tismer.com> Guido van Rossum wrote: > Off the top of my head, there's quite a bit of Python that has been > reimplemented in Python available already: Yes, I know, and I appreciate it very much! > - Nearly all the import machinery (less the new PEP 302 stuff) is > implemented in ihooks.py. Absolutely. The amount of Python-coded stuff makes me so confident that this project is doable at all. > - Command line reading is in code.py and codeop.py This is a pleasure, well known. > - A bytecode compiler is in the compiler package Very good, I know. Finally, it is calling back into the C part, for some AST stuff and the parsetok. Not too har to implement, despite the fact that character-wise operation in Python isn't very effective at all. But there's sre. > - A re-based tokenizer is in tokenize.py That's what I've overlooked. I thought we'd have to implement all of the parsetok stuff, since (as far as I analysed it) the compiler package calls back into the parser module, which still uses parsetok. This also touches one of my weak points: Is it good to replace C code by something which is depending on another special module like sre? This makes sre into something crucial to this re-implementation. But I'm not sure if this is a good way to go. I'm also not sure if it is good to re-implement certain modules using the common Python tricks and optimizations. To some extent, I have the impression that doing it the simple way, basically as done in C, would fit the optimizations of Psyco better. But that is an open question until we get some feedback from Armin Rigo. > - I think that a Python implementation of marshal is (or was) part of > Jython; I vaguely recall writing one when Jython was young Thanks a lot, I will try to look that up. > The parser (needed by the bytecode compiler) is missing, but it's not > particularly difficult to write. Yes, I've spent some hours figuring that out. It is sitting in the parser extension, slightly different from the builtin one. For a first implementation, I think to re-code anything necessary quite directly in Python. But then, re-coding again gives us ways to become more flexible concerning the parser, if we don't have the restrictions of C any longer. Well, this is a long way to Tipparary. > Not sure if this helps, but it's all I can contribute right now. This is very helpful, thanks a lot for supporting this project. I'm happy that you are following it, positively. This is how it is meant to be: We are trying a new approach, and we want to do this as support for Python's evolution, trying to find a way to simplify it and to make its development easier and faster. If we succeed, this may become a new way for Python. If not, then we are probably closing a dead end. Both possible results are able to help. All the best -- chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! 
Take a ride on Python's Johannes-Niemeyer-Weg 9a : *Starship* http://starship.python.net/ 14109 Berlin : PGP key -> http://wwwkeys.pgp.net/ work +49 30 89 09 53 34 home +49 30 802 86 56 pager +49 173 24 18 776 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? http://www.stackless.com/ From bokr at oz.net Mon Jan 13 04:07:43 2003 From: bokr at oz.net (Bengt Richter) Date: Sun, 12 Jan 2003 19:07:43 -0800 Subject: [pypy-dev] Re: [ann] Minimal Python project In-Reply-To: <3E2226CF.2050002@tismer.com> References: <7x8yxqduco.fsf@ruckus.brouhaha.com> <7xr8bjgz4x.fsf@ruckus.brouhaha.com> <7x4r8ej1r9.fsf@ruckus.brouhaha.com> <7x8yxqduco.fsf@ruckus.brouhaha.com> Message-ID: <5.0.2.1.1.20030112185911.00a7a6d0@mail.oz.net> At 03:39 2003-01-13 +0100, Christian Tismer wrote: [...] >At the same time, as much as possible should become >pluggable. It will be possible to have MiniPy [...] Is "MiniPy" the official name ? Regards, Bengt Richter From tismer at tismer.com Mon Jan 13 04:17:21 2003 From: tismer at tismer.com (Christian Tismer) Date: Mon, 13 Jan 2003 04:17:21 +0100 Subject: Seeking Minimal Python Project Name (was: [pypy-dev] Re: [ann] MinimalPython project) In-Reply-To: <5.0.2.1.1.20030112185911.00a7a6d0@mail.oz.net> References: <7x8yxqduco.fsf@ruckus.brouhaha.com> <7xr8bjgz4x.fsf@ruckus.brouhaha.com> <7x4r8ej1r9.fsf@ruckus.brouhaha.com> <7x8yxqduco.fsf@ruckus.brouhaha.com> <5.0.2.1.1.20030112185911.00a7a6d0@mail.oz.net> Message-ID: <3E222FC1.40908@tismer.com> Bengt Richter wrote: > At 03:39 2003-01-13 +0100, Christian Tismer wrote: > [...] > >>At the same time, as much as possible should become >>pluggable. It will be possible to have MiniPy > > [...] > > Is "MiniPy" the official name ? No. There is no official name, yet. For some time, I was thinking of "Lilipyt". "Ptn" was the first thought, while not sounding very sexy just expressing brevity. "Pippy" is already occupied. "Minipy"? Well, there are so many "py" projects, this makes it hard to come up with a good new name. While a name is most unimportant for me, I agree that it *is* important for the community to be able to spell it. Anyone having some good suggestions? Good night -- chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Johannes-Niemeyer-Weg 9a : *Starship* http://starship.python.net/ 14109 Berlin : PGP key -> http://wwwkeys.pgp.net/ work +49 30 89 09 53 34 home +49 30 802 86 56 pager +49 173 24 18 776 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? http://www.stackless.com/ From tdelaney at avaya.com Mon Jan 13 04:19:27 2003 From: tdelaney at avaya.com (Delaney, Timothy) Date: Mon, 13 Jan 2003 14:19:27 +1100 Subject: Seeking Minimal Python Project Name (was: [pypy-dev] Re: [ann ] MinimalPython project) Message-ID: > From: Christian Tismer [mailto:tismer at tismer.com] > > There is no official name, yet. > For some time, I was thinking of > "Lilipyt". > "Ptn" was the first thought, while > not sounding very sexy just expressing brevity. > "Pippy" is already occupied. > "Minipy"? Well, there are so many "py" projects, > this makes it hard to come up with a good new name. > > While a name is most unimportant for me, > I agree that it *is* important for the > community to be able to spell it. 
Mython Minython with an aim of course for it to eventually be Python :) Tim Delaney From z3p at twistedmatrix.com Mon Jan 13 04:35:31 2003 From: z3p at twistedmatrix.com (Paul Swartz) Date: Sun, 12 Jan 2003 22:35:31 -0500 Subject: Seeking Minimal Python Project Name (was: [pypy-dev] Re: [ann] MinimalPython project) In-Reply-To: <3E222FC1.40908@tismer.com> References: <5.0.2.1.1.20030112185911.00a7a6d0@mail.oz.net> Message-ID: <3E21EDB3.17687.2B81062@localhost> On 13 Jan 2003 at 4:17, Christian Tismer wrote: > Bengt Richter wrote: > > At 03:39 2003-01-13 +0100, Christian Tismer wrote: > > [...] > > > >>At the same time, as much as possible should become > >>pluggable. It will be possible to have MiniPy > > > > [...] > > > > Is "MiniPy" the official name ? > > No. > There is no official name, yet. > For some time, I was thinking of > "Lilipyt". > "Ptn" was the first thought, while > not sounding very sexy just expressing brevity. > "Pippy" is already occupied. > "Minipy"? Well, there are so many "py" projects, > this makes it hard to come up with a good new name. > > While a name is most unimportant for me, > I agree that it *is* important for the > community to be able to spell it. Well, 'Jython' is in Java, 'CPython' is C, why not, 'PyPython' or 'Pyython'? -p -- Paul Swartz (o_ http://twistedmatrix.com/users/z3p.twistd/ //\ z3p at twistedmatrix.com V_/_ AIM: Z3Penguin From dmorton at bitfurnace.com Mon Jan 13 05:15:53 2003 From: dmorton at bitfurnace.com (damien morton) Date: Sun, 12 Jan 2003 23:15:53 -0500 Subject: [pypy-dev] Seeking Minimal Python Project Name In-Reply-To: <3E20F9F3.3050608@ActiveState.com> Message-ID: <003201c2baba$77a9b810$6401a8c0@damien> I think Minimal is a misnomer. The goal I think, could rather be described as Self-Hosting Python. Some ideas: PyPy Python(Python) Python(self) From anthony at interlink.com.au Mon Jan 13 06:47:06 2003 From: anthony at interlink.com.au (Anthony Baxter) Date: Mon, 13 Jan 2003 16:47:06 +1100 Subject: Seeking Minimal Python Project Name (was: [pypy-dev] Re: [ann] Minimal Python project) In-Reply-To: <3E222FC1.40908@tismer.com> Message-ID: <200301130547.h0D5l6u16018@localhost.localdomain> >>> Christian Tismer wrote > While a name is most unimportant for me, > I agree that it *is* important for the > community to be able to spell it. > > Anyone having some good suggestions? According to http://www.zoomschool.com/subjects/animals/Animalbabies.shtml baby snakes are called: Snakelet, neonate (a newly-born snake), hatchling (a newly-hatched snake) More reading shows that neonates are babies that are born "live" (not really, but it seems like it - see http://double-d-reptiles.tripod.com/birth.html for more), while hatchlings are from egg-laying species (which includes Pythons). "Snakelet" has a cool sound to it, tho... Anthony From theller at python.net Mon Jan 13 10:04:27 2003 From: theller at python.net (Thomas Heller) Date: 13 Jan 2003 10:04:27 +0100 Subject: [pypy-dev] bootstrapping issues In-Reply-To: <20030112014136.S1568@prim.han.de> References: <20030111214011.P1568@prim.han.de> <20030112014136.S1568@prim.han.de> Message-ID: holger krekel writes: > [Robin Becker Sat, Jan 11, 2003 at 09:30:04PM +0000] > > A naive questions; hich platforms are developments first aimed at ie > > gnu/linux win32 etc? > > I guess it's going to be win32 & POSIX (MacOSX, and some linux/unix variants). > > > Thomas Heller's ctypes module would seem to be a very good start at the > > generic C interface thing. 
It is pretty easy to use and although I don't > > have a complete grasp on it was a bit easier to use than calldll. I used > > both to do anygui interfaces to native windows api. > > I heard good things about it but it is currently a windows only solution, > or not? > > > Last time I discussed this with him Thomas said he was considering using > > the libffi library to do interfacing. I know that to be fairly portable. I've now a running version which uses libffi (except on Windows). Tested on an oldish SuSE 7.1 x86 system. The docs and readme's still need updating, but for the *impatient* I've uploaded a snapshot: http://starship.python.net/crew/theller/ctypes-0.3.5.tar.gz Thomas From Nicolas.Chauvat at logilab.fr Mon Jan 13 10:55:05 2003 From: Nicolas.Chauvat at logilab.fr (Nicolas Chauvat) Date: Mon, 13 Jan 2003 10:55:05 +0100 Subject: [pypy-dev] Questions about the C core In-Reply-To: <2179680000.1042335174@localhost.localdomain> References: <3E20BB35.1050803@tismer.de> <2179680000.1042335174@localhost.localdomain> Message-ID: <20030113095505.GA19375@logilab.fr> On Sun, Jan 12, 2003 at 02:32:54PM +1300, Andrew McGregor wrote: > Therefore a Pythonic macro should be: > > A class which adds it's grammar >>> ... > Then the macro provides code which is called with the parse trees resulting > from that grammar, modifies them, and returns the parse tree that will > actually be compiled. If you want to evaluate some bits at compile time, > do that explicitly in the macro (opposite way around to lisp, where you > have complicated rules). I want that! Here is an interpreted language that does that : http://pliant.cx/ Fully GPL. Source available. Worth a look. -- Nicolas Chauvat http://www.logilab.com - "Mais o? est donc Ornicar ?" - LOGILAB, Paris (France) From Nicolas.Chauvat at logilab.fr Mon Jan 13 11:29:05 2003 From: Nicolas.Chauvat at logilab.fr (Nicolas Chauvat) Date: Mon, 13 Jan 2003 11:29:05 +0100 Subject: [pypy-dev] Questions about the C core In-Reply-To: <20030113095505.GA19375@logilab.fr> References: <3E20BB35.1050803@tismer.de> <2179680000.1042335174@localhost.localdomain> <20030113095505.GA19375@logilab.fr> Message-ID: <20030113102904.GB20799@logilab.fr> On Mon, Jan 13, 2003 at 10:55:05AM +0100, Nicolas Chauvat wrote: > On Sun, Jan 12, 2003 at 02:32:54PM +1300, Andrew McGregor wrote: > > Therefore a Pythonic macro should be: > > > > A class which adds it's grammar > >>> ... > > Then the macro provides code which is called with the parse trees resulting > > from that grammar, modifies them, and returns the parse tree that will > > actually be compiled. If you want to evaluate some bits at compile time, > > do that explicitly in the macro (opposite way around to lisp, where you > > have complicated rules). > > I want that! > > Here is an interpreted language that does that : http://pliant.cx/ > > Fully GPL. Source available. Worth a look. Note: I perfectly understand the goal of Minimal Python: do *not* change the language or parts of it (yet), just experiment with a minimal intepreter. That said, since Holger asked for desciptions of links here it goes: Both www.mozart-oz.org and pliant.cx (and others) are interesting for they rely on a minimal intepreter that lets one define part of the language itself (and basic ones like loop structure, for example). It can also be remarked that a good part of theoritically elegant languages (who said lisp?) rely on minimal interpreters. 
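To make the quoted "Pythonic macro" sketch from earlier in this message more tangible, here is a tiny example of code that receives a parse tree, rewrites it, and returns the tree that actually gets compiled. It uses today's ast module purely as an illustration of tree rewriting (nothing like it existed in this form in 2003), and it is not a proposal to add macros to the project.

# Illustration of parse-tree rewriting: every 'a + b' is turned into
# 'a * b' before the tree is compiled and executed.
import ast

class SwapAddToMult(ast.NodeTransformer):
    def visit_BinOp(self, node):
        self.generic_visit(node)      # rewrite children first
        if isinstance(node.op, ast.Add):
            node.op = ast.Mult()
        return node

tree = ast.parse("result = 6 + 7")
tree = ast.fix_missing_locations(SwapAddToMult().visit(tree))
namespace = {}
exec(compile(tree, "<rewritten>", "exec"), namespace)
print(namespace["result"])  # -> 42, because the + was rewritten to *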
Having Python move forward in that direction may guarantee a nicer future for the language and should be easier to do thanks to the support of its large community. [back-to-lurking] -- Nicolas Chauvat http://www.logilab.com - "Mais o? est donc Ornicar ?" - LOGILAB, Paris (France) From hpk at trillke.net Mon Jan 13 12:24:16 2003 From: hpk at trillke.net (holger krekel) Date: Mon, 13 Jan 2003 12:24:16 +0100 Subject: [pypy-dev] Intellectual property In-Reply-To: <3E22094A.9050507@ActiveState.com>; from DavidA@ActiveState.com on Sun, Jan 12, 2003 at 04:33:14PM -0800 References: <3E21C197.8010403@ActiveState.com> <3E21D973.50703@tismer.com> <3E22094A.9050507@ActiveState.com> Message-ID: <20030113122416.H1568@prim.han.de> [David Ascher Sun, Jan 12, 2003 at 04:33:14PM -0800] > Christian Tismer wrote: > > > David Ascher wrote: > > > >> I'd like to suggest that the issue of intellectual property and > >> license be tackled _very_ early. Anything else is inexcusable IMO =). > > > > > > David? Whom are you afraid of? Not me, I hope! > > I'm afraid of no one -- I'm afraid of lazyness which makes early > non-decisions hard to reverse. Some projects would like to change their > license, for example, but can't because the IP assignment is so complicated. > > >> Proposal: > >> - all of the minipy IP is assigned to the PSF. > >> - the license is either the PSF license or the Academic Free License > >> http://www.opensource.org/licenses/academic.php. > >> (the advangate of the AFL is the patent protection, which I > >> suspect may be relevant in this particular implementation). > > > > > > Hmm. Or do you believe this project can be so > > *very* successful that we should be afraid > > of larger companies, who might 'see sharp' > > upon us. ;-) > > I just think defensive patent strategies should be thought about. It's > ok, as long as the PSF has copyright, the PSF could always change the > license on later releases if it saw fit. Could you give a small example scenario? What could the PSF do if Microsoft and Sun each claim that the project makes illegal use of some 1000 patents? Isn't it a very bad idea to transfer "IP" to an american institution? Sorry, but the States seem to take the lead in ridicoulous Patent and Copyright law with Europe following closely. holger From marc at informatik.uni-bremen.de Mon Jan 13 18:39:22 2003 From: marc at informatik.uni-bremen.de (Marc Recht) Date: Mon, 13 Jan 2003 18:39:22 +0100 Subject: [pypy-dev] Intellectual property In-Reply-To: <20030113122416.H1568@prim.han.de> References: <3E21C197.8010403@ActiveState.com> <3E21D973.50703@tismer.com> <3E22094A.9050507@ActiveState.com> <20030113122416.H1568@prim.han.de> Message-ID: <11060000.1042479562@leeloo.intern.geht.de> > Could you give a small example scenario? What could the PSF do > if Microsoft and Sun each claim that the project makes illegal use > of some 1000 patents? > > Isn't it a very bad idea to transfer "IP" to an american > institution? Sorry, but the States seem to take the lead in > ridicoulous Patent and Copyright law with Europe following closely. Good point. America's IP rights suck. (Who saind DMCA..) And problems shouldn't be underestimated.. I don't know if the Europen IP rights will get that bad. Isn't there any organization outside of the USA who can get the IP ? May an institution on Aruba or in Southern America ? :) Marc "Premature optimization is the root of all evil." -- Donald E. Knuth -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 186 bytes Desc: not available URL: From DavidA at ActiveState.com Mon Jan 13 19:00:25 2003 From: DavidA at ActiveState.com (David Ascher) Date: Mon, 13 Jan 2003 10:00:25 -0800 Subject: [pypy-dev] Intellectual property In-Reply-To: <20030113122416.H1568@prim.han.de> References: <3E21C197.8010403@ActiveState.com> <3E21D973.50703@tismer.com> <3E22094A.9050507@ActiveState.com> <20030113122416.H1568@prim.han.de> Message-ID: <3E22FEB9.1070303@ActiveState.com> holger krekel wrote: > Could you give a small example scenario? What could the PSF do > if Microsoft and Sun each claim that the project makes illegal use > of some 1000 patents? The point of the patent clause in the ASF is that it's a mutual defense clause. Various projects can defend against each other, thereby (theoretically) making attacks against each project a tad less likely. The more key projects use it, the stronger everyone's defense is. It's better than nothing as a patent defense, even though it's hardly the end-all and be-all. > Isn't it a very bad idea to transfer "IP" to an american > institution? Sorry, but the States seem to take the lead in > ridicoulous Patent and Copyright law with Europe following closely. Do you have any better ideas? I don't disagree w/ the state of Patent and Copyright law, but I'm fresh out of alternatives. --david From mertz at gnosis.cx Mon Jan 13 19:11:15 2003 From: mertz at gnosis.cx (David Mertz, Ph.D.) Date: Mon, 13 Jan 2003 13:11:15 -0500 Subject: [pypy-dev] Intellectual property In-Reply-To: <11060000.1042479562@leeloo.intern.geht.de> Message-ID: |> What could the PSF do if Microsoft and Sun each claim that the |>project makes illegal use of some 1000 patents? |> the States seem to take the lead in ridicoulous Patent and Copyright |>law with Europe following closely. |Good point. America's IP rights suck. (Who saind DMCA..) And problems |shouldn't be underestimated.. I don't know if the Europen IP rights will |get that bad. Isn't there any organization outside of the USA who can get |the IP ? I doubt it would make any difference where the rights-holding organization is. I agree strongly that USAian IP laws are ghastly awful, and gratuitous IP lawsuits rampants. And Europe isn't as good as it should be either. But if big corporations decide they want to kill MiniPy with patents, they'll fish for a jurisdiction they like. Even if a Cuban organization officially holds the code[*], the project will be distributed to USAian and European users, mirrored or hosted on servers in those places, perhaps sold with books and CDs, and so on. Those connections are plenty for the bad guys to gain jurisdiction. Or at least to argue it to death. Yours, David... [*] Cuba, whatever negative press it gets in the USA, has an enormously sensible IP legal framework. No gene patents, disallowing drug patents to hinder treatment, broad fair use of copyright, etc. Probably better than just about any other country, if you like freedom. -- Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th. 
From bokr at oz.net Mon Jan 13 19:35:23 2003 From: bokr at oz.net (Bengt Richter) Date: Mon, 13 Jan 2003 10:35:23 -0800 Subject: [pypy-dev] Intellectual property In-Reply-To: <3E22FEB9.1070303@ActiveState.com> References: <20030113122416.H1568@prim.han.de> <3E21C197.8010403@ActiveState.com> <3E21D973.50703@tismer.com> <3E22094A.9050507@ActiveState.com> <20030113122416.H1568@prim.han.de> Message-ID: <5.0.2.1.1.20030113102247.00a7a0c0@mail.oz.net> At 10:00 2003-01-13 -0800, David Ascher wrote: >holger krekel wrote: > >>Could you give a small example scenario? What could the PSF do >>if Microsoft and Sun each claim that the project makes illegal use of some 1000 patents? > >The point of the patent clause in the ASF is that it's a mutual defense clause. Various projects can defend against each other, thereby (theoretically) making attacks against each project a tad less likely. The more key projects use it, the stronger everyone's defense is. It's better than nothing as a patent defense, even though it's hardly the end-all and be-all. > >>Isn't it a very bad idea to transfer "IP" to an american institution? Sorry, but the States seem to take the lead in >>ridicoulous Patent and Copyright law with Europe following closely. > >Do you have any better ideas? I don't disagree w/ the state of Patent and Copyright law, but I'm fresh out of alternatives. Can a copyright notice forbid patenting of the expressed ideas, assuming it's the first public expression? And if something is published today, is it prior art invalidating claims of a patent application filed tomorrow? Just wondering what defense can be exercised through copyright. IANAL ;-) Regards, Bengt Richter From DavidA at ActiveState.com Mon Jan 13 19:36:30 2003 From: DavidA at ActiveState.com (David Ascher) Date: Mon, 13 Jan 2003 10:36:30 -0800 Subject: [pypy-dev] Intellectual property In-Reply-To: <5.0.2.1.1.20030113102247.00a7a0c0@mail.oz.net> References: <20030113122416.H1568@prim.han.de> <3E21C197.8010403@ActiveState.com> <3E21D973.50703@tismer.com> <3E22094A.9050507@ActiveState.com> <20030113122416.H1568@prim.han.de> <5.0.2.1.1.20030113102247.00a7a0c0@mail.oz.net> Message-ID: <3E23072E.9060404@ActiveState.com> Bengt Richter wrote: > Can a copyright notice forbid patenting of the expressed ideas, assuming > it's the first public expression? I doubt it. > Just wondering what defense can be exercised through copyright. IANAL ;-) A _copyright_ is a statement of ownership. The _license_ is where you can put whatever terms you want. > And if something is published today, is > it prior art invalidating claims of a patent application filed tomorrow? Yes, in theory, but you may have to challenge a patent in court post-hoc to make that point. --david PS: I didn't mean to derail discussion off-topic. However, I think there's value to settling these issues early, when there is no existing IP at stake. If the three founders of the project agree on who owns the IP and what the license is, as soon as they say so in public we can move on as far as I'm concerned. 
From marc at informatik.uni-bremen.de Mon Jan 13 20:24:02 2003 From: marc at informatik.uni-bremen.de (Marc Recht) Date: Mon, 13 Jan 2003 20:24:02 +0100 Subject: [pypy-dev] Questions about the C core In-Reply-To: <3E218DFC.6010005@tismer.com> References: <3E20BB35.1050803@tismer.de> <2179680000.1042335174@localhost.localdomain> <00aa01c2b9dd$0e4c7d40$6d94fea9@newmexico> <258850000.1042359720@leeloo.intern.geht.de> <3E218DFC.6010005@tismer.com> Message-ID: <71330000.1042485842@leeloo.intern.geht.de> > I have the same goal. I also don't hesitate to use > some abbreviative tools for repeated code, maybe. > But only if it helps to simplify, not to change the > language. I was only thinking about the actual build process, not about the language itself. > Furthermore, thinking of macros again, they make [...] I 100% agree with you. Changing the language isn't a good idea. > XML on the other hand, as an intermediate language > or for pickling makes sense to me. At least it is easier > to parse than Python. (But we *have* the parser...) I thought about using XML as a formal description for generating stuff. For example CastorXML (a java xml databinding framework, http://castor.exolab.org) uses XML Schemata to generate simple data classes with XML marshaling/unmarshaling functionality.. Though I'm not sure at which part of the MiniPy build process generating code could be useful, I think it should be kept in mind. "Premature optimization is the root of all evil." -- Donald E. Knuth -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 186 bytes Desc: not available URL: From kendall at monkeyfist.com Mon Jan 13 21:26:23 2003 From: kendall at monkeyfist.com (Kendall Grant Clark) Date: Mon, 13 Jan 2003 14:26:23 -0600 Subject: [pypy-dev] Seeking Minimal Python Project Name In-Reply-To: <003201c2baba$77a9b810$6401a8c0@damien> References: <3E20F9F3.3050608@ActiveState.com> <003201c2baba$77a9b810$6401a8c0@damien> Message-ID: <15907.8431.665987.386722@rosa.monkeyfist.com> >>>>> "damien" == damien morton writes: damien> I think Minimal is a misnomer. The goal I think, could rather be damien> described as Self-Hosting Python. damien> Some ideas: damien> PyPy Python(Python) Python(self) I agree, re: Minimal -- many of the well-wishers have taken the project to be about building a small & embedded systems version of Python. My name suggestions, playing on the importance of "self" in Python and the self-hosting idea: SoiPy Soi Poi (i.e., Soi + Py) Best, Kendall Clark http://pyzine.com/log -- Jazz is only what you are. -- Louis Armstrong From andrew at indranet.co.nz Mon Jan 13 21:43:16 2003 From: andrew at indranet.co.nz (Andrew McGregor) Date: Tue, 14 Jan 2003 09:43:16 +1300 Subject: [pypy-dev] Intellectual property In-Reply-To: <3E23072E.9060404@ActiveState.com> References: <20030113122416.H1568@prim.han.de> <3E21C197.8010403@ActiveState.com> <3E21D973.50703@tismer.com> <3E22094A.9050507@ActiveState.com> <20030113122416.H1568@prim.han.de> <5.0.2.1.1.20030113102247.00a7a0c0@mail.oz.net> <3E23072E.9060404@ActiveState.com> Message-ID: <134170000.1042490596@localhost.localdomain> --On Monday, January 13, 2003 10:36:30 -0800 David Ascher wrote: > Bengt Richter wrote: > >> Can a copyright notice forbid patenting of the expressed ideas, assuming >> it's the first public expression? > > I doubt it. So long as you publish before the patent priority date, you have made the patent invalid just by publishing it at all. But see below. 
If the publication date was while the patent was being examined but before the patent was public, it's possible there may be a case for a challenge, but if after the patent was granted or published, whichever comes first, you're out of luck. >> Just wondering what defense can be exercised through copyright. IANAL ;-) > > A _copyright_ is a statement of ownership. The _license_ is where you > can put whatever terms you want. > > > And if something is published today, is > > it prior art invalidating claims of a patent application filed > tomorrow? > > Yes, in theory, but you may have to challenge a patent in court post-hoc > to make that point. You *will* have to challenge, but the process is first to apply to the patent office to have it struck off, which is far cheaper than a court case. The grounds for the challenge is that the patent applicant should have disclosed the prior expression of the idea, hence the date rules. Andrew From tismer at tismer.com Mon Jan 13 18:02:51 2003 From: tismer at tismer.com (Christian Tismer) Date: Mon, 13 Jan 2003 18:02:51 +0100 Subject: [pypy-dev] Re: [ann] Minimal Python project In-Reply-To: <7xof6l2003.fsf@ruckus.brouhaha.com> References: <7xr8bjgz4x.fsf@ruckus.brouhaha.com> <7x4r8ej1r9.fsf@ruckus.brouhaha.com> <7x8yxqduco.fsf@ruckus.brouhaha.com> <7xof6l2003.fsf@ruckus.brouhaha.com> Message-ID: <3E22F13B.3060708@tismer.com> (copying to pypy-dev since this is an incentive) Paul Rubin wrote: > Christian Tismer writes: > >>You know what my goals are. >>Smaller, more flexible, faster, easier to change, >>easier to maintain, easier to keep backwards >>compatible, more portable due to less C code, >>down-sizeable by features (which is most difficult), >>the full catastrophe... > > > Well ok, it wasn't clear before what the entire set of goals were. > > >>We will try to implement Python as exact and clean >>as possible. The langage should be implemented >>completely. Now Paul is nailing thing down :-) > Are you going to try to keep the C API backwards compatible? I wish to, but I cant guarantee it. This is a matter of experimentation. Fore sure, we will try to build a version that adheres to the C API, at least in the early bootstrap phase, we really need to do so, in order to use "borrowed" internals and extensions. It then depends on experience to be gathered, how much conformance to the C API costs. If it turns out to be much more efficient to use a different API, and maybe a radical change of internal data structures as well, then TiPy (randomly picked name from all the proposals :) is the first chance to try such new ways at all. This can lead to a compatibility layer (maybe hard to implement), and these insights may become proposals to change core Python as well. It is also thinkable to have options in TiPy which make even this configurable at boot time, and people can decide whether they want it compatible or faster. Some of the "internal" API like compiler related functions and stuff from ceval.c will most probably be changed, anyway, but the common API will be as it is now. This is where we start with. > Are you going to include exact implementations of currently > not-really-documented features such as metaclasses? Absolutely. I'm using metaclasses very much, and I also will chime a compatible patch in, which allows to use slots in metaclasses. I need that, and Stackless has it. > Will stuff like frame objects still work as documented? Do you expect > to be able to run the current version of pdb.py without changes? 
Yes, for the first round, frames should be rather like they are now. They will get more interfaces to be manipulated by Python code. pdb.py should run as it is now. I know there is an issue with Psyco, which doesn't create frames all the time. This is an issue that need thorough discussion and design. > I don't think you should necessarily hold yourself to any of the > above, but that's just me. If the above can be achieved for a reasonable price, then we really should try. If it hurts too much, then it may be easier to change the rest of the world :-) >>It will be a major amount of work to deduce the >>dependencies of features, and how to arrange them >>in a scalable shape. I do believe that the core >>group will help us with that. > > > I'm pretty excited by the project. I think you're going to push the > limits of the Python language harder than they've ever been pushed > against before. And you'll be in a unique position to actually remove > the limits when you hit them, rather than having to work around them. > I hope you'll trust yourself to use such opportunities when it's right > to do so. Thanks for the encouragement. I hope to to as well as possible. Not loosing chances to stay compatible, at the same time not loosing good new opportunities by sticking too much with old principles. This is not easy and a balancing act. Something that I could not do alone, and I'm happy that there are so many supporters and people going to help. I'd-love-to-have-*you*-in-that-group-too -- chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Johannes-Niemeyer-Weg 9a : *Starship* http://starship.python.net/ 14109 Berlin : PGP key -> http://wwwkeys.pgp.net/ work +49 30 89 09 53 34 home +49 30 802 86 56 pager +49 173 24 18 776 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? http://www.stackless.com/ From tismer at tismer.com Mon Jan 13 18:19:02 2003 From: tismer at tismer.com (Christian Tismer) Date: Mon, 13 Jan 2003 18:19:02 +0100 Subject: [pypy-dev] Pypy Roadmap In-Reply-To: <5.0.2.1.1.20030112132955.00a6d800@mail.oz.net> References: <5.0.2.1.1.20030112132955.00a6d800@mail.oz.net> Message-ID: <3E22F506.8060207@tismer.com> Bengt Richter wrote: > Hi, this sounds interesting ;-) > > I was wondering how you were planning on identifying concrete tasks > and assigning and tracking them, and keeping status visible to support > cooperation. Good question. > Do you already have CVS and bug/issue tracking methodology settled? > I.e., such stuff is nuisance overhead or sanity preservation depending > on the task, and the development phase, and scale/distribution of effort. Nothing yet but this mailing list. The whole "project" is about one week old, and it consists of a lot of emails with random ideas, and the insight that there is something in a couple of brains that really wants to materialize and creep out. > Is there one single home page that leads to everything relevant, with some > indication of historical vs current vs fire alarm links? That's the next thing to do, I guess. I'm a bit reluctant to use SourceForge. If I can manage, I'd liek to use my own new server for the CVS, install a Wiki and so on. This will hopefully be figured out during this week, as time permits. > Here are some things that occur to me to think about in making > a roadmap for your project. 
If you would make something like it > to reflect your _actual_ project plans, it would help me (and maybe > some other lurkers ;-) get an idea of what you are really up to ;-) [huuge list of heading lines to be filled] Oh well, this is really an awful lot of work to do, and I'm so happy that we are a group of people and can spread some of the work. > Well, that's all the thoughts I have for now ;-) Once we have some basic pages, your thoughts can be found there, again. all the best -- chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Johannes-Niemeyer-Weg 9a : *Starship* http://starship.python.net/ 14109 Berlin : PGP key -> http://wwwkeys.pgp.net/ work +49 30 89 09 53 34 home +49 30 802 86 56 pager +49 173 24 18 776 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? http://www.stackless.com/ From tismer at tismer.com Mon Jan 13 18:11:14 2003 From: tismer at tismer.com (Christian Tismer) Date: Mon, 13 Jan 2003 18:11:14 +0100 Subject: [pypy-dev] Questions about the C core In-Reply-To: <2723120000.1042415785@localhost.localdomain> References: <3E20BB35.1050803@tismer.de> <2179680000.1042335174@localhost.localdomain> <00aa01c2b9dd$0e4c7d40$6d94fea9@newmexico> <2197800000.1042341957@localhost.localdomain> <3E20F73B.6000307@ActiveState.com> <20030112150512.U1568@prim.han.de> <3E218BBC.4020502@tismer.com> <2723120000.1042415785@localhost.localdomain> Message-ID: <3E22F332.4070704@tismer.com> Andrew McGregor wrote: ... > Fair enough. I simply thought that macros were an *old* technique that > could be useful, even if only as part of the implementation, but the > consensus is otherwise. I don't quite understand, but I'll shut up now :-) You are not meant to shut up! :) I'm just saying that I'm -1 on adding macros to Python, and even more to use TiPy as a vehicle for that. It is better to leave general design questions out of this young project. But if we do need something macro-like to quickly build up the new system, this might be a different issue. I'm open to anything that makes the implementation easier, faster, better maintainable, and I do appreciate your input. > I guess the operative part of the culture document is #5, 'Flat is > better than nested'. > > Please don't continue the thread, I'm convinced. I don't continue the thread. But please don't feel stopped! cheers - chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Johannes-Niemeyer-Weg 9a : *Starship* http://starship.python.net/ 14109 Berlin : PGP key -> http://wwwkeys.pgp.net/ work +49 30 89 09 53 34 home +49 30 802 86 56 pager +49 173 24 18 776 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? http://www.stackless.com/ From tismer at tismer.com Tue Jan 14 01:22:13 2003 From: tismer at tismer.com (Christian Tismer) Date: Tue, 14 Jan 2003 01:22:13 +0100 Subject: [pypy-dev] Questions about the C core In-Reply-To: <71330000.1042485842@leeloo.intern.geht.de> References: <3E20BB35.1050803@tismer.de> <2179680000.1042335174@localhost.localdomain> <00aa01c2b9dd$0e4c7d40$6d94fea9@newmexico> <258850000.1042359720@leeloo.intern.geht.de> <3E218DFC.6010005@tismer.com> <71330000.1042485842@leeloo.intern.geht.de> Message-ID: <3E235835.5080000@tismer.com> Marc Recht wrote: > I have the same goal. I also don't hesitate to use > some abbreviative tools for repeated code, maybe. > But only if it helps to simplify, not to change the > language. 
> > I was only thinking about the actual build process, not about the > language itself. Agreed! [XML] > I thought about using XML as a formal description for generating stuff. > For example CastorXML (a java xml databinding framework, > http://castor.exolab.org) uses XML Schemata to generate simple data > classes with XML marshaling/unmarshaling functionality.. Though I'm not > sure at which part of the MiniPy build process generating code could be > useful, I think it should be kept in mind. Will be kept in mind. Canning data structures into XML schemata can be very helpful. I had to do this in a large project, and it gave me lots of insights! > "Premature optimization is the root of all evil." -- Donald E. Knu > th Can you please make that sentence not wrap any longer? :-) From tismer at tismer.com Tue Jan 14 01:23:52 2003 From: tismer at tismer.com (Christian Tismer) Date: Tue, 14 Jan 2003 01:23:52 +0100 Subject: [pypy-dev] Seeking Minimal Python Project Name In-Reply-To: <15907.8431.665987.386722@rosa.monkeyfist.com> References: <3E20F9F3.3050608@ActiveState.com> <003201c2baba$77a9b810$6401a8c0@damien> <15907.8431.665987.386722@rosa.monkeyfist.com> Message-ID: <3E235898.6060905@tismer.com> Kendall Grant Clark wrote: > "damien" == damien morton writes: > > > damien> I think Minimal is a misnomer. The goal I think, could rather be > damien> described as Self-Hosting Python. Yes. ... > My name suggestions, playing on the importance of "self" in Python and the > self-hosting idea: > > SoiPy > Soi > Poi (i.e., Soi + Py) Is PoiSon occupied? From logistix at zworg.com Tue Jan 14 03:30:54 2003 From: logistix at zworg.com (Grant Olson) Date: Mon, 13 Jan 2003 14:30:54 -1200 Subject: [pypy-dev] Concrete Syntax Tree Message-ID: <200301140230.h0E2UsPj000526@overload3.baremetal.com> I wrote some rough python code a few months ago that built Concrete Syntax Trees from the ?grammar? file in the Python source, and also turned token streams into Abstract Trees based on the Concrete Trees. The parser module successfully compiled the Abstract Trees. Is this something you guys would be interested in or am I off base here? --------------------------------------- Get your free e-mail address @zworg.com From tismer at tismer.com Tue Jan 14 05:15:16 2003 From: tismer at tismer.com (Christian Tismer) Date: Tue, 14 Jan 2003 05:15:16 +0100 Subject: [pypy-dev] Concrete Syntax Tree In-Reply-To: <200301140230.h0E2UsPj000526@overload3.baremetal.com> References: <200301140230.h0E2UsPj000526@overload3.baremetal.com> Message-ID: <3E238ED4.9080403@tismer.com> Grant Olson wrote: > I wrote some rough python code a few months ago that built Concrete > Syntax Trees from the ?grammar? file in the Python source, and also > turned token streams into Abstract Trees based on the Concrete Trees. > The parser module successfully compiled the Abstract Trees. Is this > something you guys would be interested in or am I off base here? I would love to read that, of course! 
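(For anyone who wants to poke at the same ground Grant describes: a minimal sketch of looking at the token stream and the syntax tree for a scrap of source. It leans on the tokenize and ast modules of a current CPython purely as stand-ins; Grant's code, which works from the Grammar file itself, is not reproduced here.)

    import ast, io
    from tokenize import generate_tokens
    from token import tok_name

    source = "def double(n):\n    return n + n\n"

    # The token stream, roughly what a grammar-driven parser consumes.
    for tok in generate_tokens(io.StringIO(source).readline):
        print(tok_name[tok.type], repr(tok.string))

    # The abstract tree that the compiler then works on.
    print(ast.dump(ast.parse(source)))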
yours - chris From kendall at monkeyfist.com Tue Jan 14 08:47:12 2003 From: kendall at monkeyfist.com (Kendall Clark) Date: Tue, 14 Jan 2003 01:47:12 -0600 Subject: [pypy-dev] Seeking Minimal Python Project Name In-Reply-To: <3E235898.6060905@tismer.com> References: <3E20F9F3.3050608@ActiveState.com> <003201c2baba$77a9b810$6401a8c0@damien> <15907.8431.665987.386722@rosa.monkeyfist.com> <3E235898.6060905@tismer.com> Message-ID: <20030114074712.GC30528@monkeyfist.com> On Tue, Jan 14, 2003 at 01:23:52AM +0100, Christian Tismer wrote: > > >My name suggestions, playing on the importance of "self" in Python and the > >self-hosting idea: > > > >SoiPy > >Soi > >Poi (i.e., Soi + Py) > > Is PoiSon occupied? Heh, that's even better; I like that very much; it's vaguely triple-entendric: "Poison" => (Python + Soi ("self") = Poi) + Son ("Son of Python, a descendant") Very clever, Chris. Kendall Clark -- Jazz is only what you are. -- Louis Armstrong From hpk at trillke.net Tue Jan 14 10:07:03 2003 From: hpk at trillke.net (holger krekel) Date: Tue, 14 Jan 2003 10:07:03 +0100 Subject: [pypy-dev] getting rid of compile.c? Message-ID: <20030114100703.Q1568@prim.han.de> Hi, i was wondering how/if to get rid of compile.c. ASFAIK it is currently needed for bootstrapping purposes. Wouldn't it be possible to deliver precompiled bytecode for the python compiler package and thus obsolete compile.c? The bytecode isn't platform dependent so this seems sensible to me. regards, holger From edream at tds.net Tue Jan 14 12:14:34 2003 From: edream at tds.net (Edward K. Ream) Date: Tue, 14 Jan 2003 05:14:34 -0600 Subject: [pypy-dev] Re: [ann] Minimal Python project Message-ID: <005c01c2bbbe$1e8b9640$bba4aad8@computer> > We announce a mailinglist dedicated to developing > a "Minimal Python" version. Minimal means that > we want to have a very small C-core and as much > as possible (re)implemented in python itself. This > includes (parts of) the VM-Code. > Building on the expected gains in flexibility > we also hope to use distribution techniques such > as PEP302 to ship only a minimal set of modules > and load needed modules on demand. > As Armin Rigo of PSYCO fame takes part in the effort, > we are confident that MinimalPython will eventually > run faster than today's CPython. > And because Christian Tismer takes part, we are > confident that we will find a radical enough > approach which also fits Stackless :-) It now appears that I have misunderstood what these paragraphs meant. I apologize for any confusion that I may have created. My initial reading of this was that MinimalPython would be faster _because_ parts of the VM code were merely recoded in Python. After studying psyco further it seems more likely that what was meant was that MinimalPython could be faster than the present CPython because the techniques of psyco might be applied to improve the speed of the present main interpreter loop. The former statement seemed (and still seems) most unlikely. The latter statement seems quite plausible. In short, I really don't understand the aims of this project very well. I can imagine that it has one or more of the following goals: 1. To code as much as possible of the Python distribution in Python. 2. To increase the speed of Python by integrating psyco into the interpreter. 3. To provide a test-bed for alternative implementations such as stackless. What happens if these goals conflict? For example, much of psyco is presently written in C. 
I don't suppose these goals imply that all of psyco will be recast in Python, or do they? If recasting code in Python conflicts with increasing code speed, which is most important? What happens if the goal of supporting stackless Python conflicts with the other goals? Any clarifications would be most welcome. Edward -------------------------------------------------------------------- Edward K. Ream email: edream at tds.net Leo: Literate Editor with Outlines Leo: http://personalpages.tds.net/~edream/front.html -------------------------------------------------------------------- From mwh at python.net Tue Jan 14 12:34:46 2003 From: mwh at python.net (Michael Hudson) Date: 14 Jan 2003 11:34:46 +0000 Subject: [pypy-dev] Re: getting rid of compile.c? References: <20030114100703.Q1568@prim.han.de> Message-ID: <2mptr0klrd.fsf@starship.python.net> holger krekel writes: > Hi, > > i was wondering how/if to get rid of compile.c. ASFAIK > it is currently needed for bootstrapping purposes. > > Wouldn't it be possible to deliver precompiled bytecode > for the python compiler package and thus obsolete compile.c? > The bytecode isn't platform dependent so this seems sensible > to me. That sounds possible, and you're always going to have issues like this, but as long as you're hosting your work inside of CPython 2.3, I wouldn't worry about it just yet... (Doing this would make bytecode changes, erm, hairy, it seems to me). Cheers, M. -- MARVIN: Oh dear, I think you'll find reality's on the blink again. -- The Hitch-Hikers Guide to the Galaxy, Episode 12 From edream at tds.net Tue Jan 14 13:09:50 2003 From: edream at tds.net (Edward K. Ream) Date: Tue, 14 Jan 2003 06:09:50 -0600 Subject: [pypy-dev] Introducing myself, again Message-ID: <008001c2bbc5$d6ed42e0$bba4aad8@computer> We didn't get off to the best start on the comp.lang.python list, so I'd like to reintroduce myself to you. In this posting I'll discuss my background and how I might contribute to the psyco project. Background I have been programming for 35+ years, first in assembly and C, then in C++ and Objective-C, and finally and most joyously in Python. I've used the NextStep/Yellow Box/Cocoa, Borland C++ Builder, wxWindows and tcl/tk class frameworks. I am familiar with the Smalltalk and Java class libraries. My primary interests have always been: 1) the programming process itself and 2) getting maximum speed out of programs. One of my earliest projects was improving the speed of a screen driver by about a factor of 10. Doing so changed qualitatively how people interacted with the screen. Sometime in the 80's Leor Zolman gave me the source code of the BDS-C compiler to study. For several years this was the fastest C compiler available on micros (as we called PC's then), and to this day is one of the fastest compilers ever built. Studying this compiler was a revelation for me: it broke all the rules about compilers I learned in graduate school, and broke them very well. I realized then that much of my career would be unlearning what I thought I knew :-) In the 90's I designed and built a commercial optimizing C compiler. This was a technical success: it produced very good m68000 code and did it faster than the Borland C compiler. It was not a commercial success as the company folded before the compiler was released. The compiler was not particularly elegant; I am not a compiler expert, and I _do_ have real experience. I am the BDFL of the Leo project. Leo is a major Open Source project written in Python. 
The Leo project takes up the bulk of my time, and I would gladly give some time to this project. I am also the creator of the pl68k programming language and compiler, the Sherlock tracing system, and the ancient RED text editor. With the exception of the C compiler (and related linker and file system), everything I have done has been Open Software. How I might help I have lots of experience with making code work fast and I have a number of ideas about generalizing and improving psycho... I would be willing to help improve the documentation of psyco. I think psyco promises a huge win for Python, and I would like to see it explained so many people can understand and improve it. I have created a Leo outline of an old version of psyco, and I would be happy to convert all or parts of this project to Leo. Leo is an excellent way of organizing, studying and presenting complex code. Whether the psyco project chooses to adopt Leo "officially" is not for me to say, and I would recommend this project experiment with using Leo. Oh yes, I have a Python script that helps with conversion of C code to Python. It simply hacks on the syntax, removing braces, semicolons, type declarations, converting argument lists to def statements, that kind of thing. The script is buried in the source code for Leo. Edward -------------------------------------------------------------------- Edward K. Ream email: edream at tds.net Leo: Literate Editor with Outlines Leo: http://personalpages.tds.net/~edream/front.html -------------------------------------------------------------------- From mwh at python.net Tue Jan 14 13:18:20 2003 From: mwh at python.net (Michael Hudson) Date: 14 Jan 2003 12:18:20 +0000 Subject: [pypy-dev] Re: [ann] Minimal Python project References: <005c01c2bbbe$1e8b9640$bba4aad8@computer> Message-ID: <2mn0m3lyb7.fsf@starship.python.net> "Edward K. Ream" writes: > What happens if these goals conflict? For example, much of psyco is > presently written in C. I don't suppose these goals imply that all > of psyco will be recast in Python, or do they? I believe that's the plan anyway, somewhat because coding in Python is less painful than coding in C. Have you *looked* at the source to psyco? It's, umm, challenging. Cheers, M. -- at any rate, I'm satisfied that not only do they know which end of the pointy thing to hold, but where to poke it for maximum effect. -- Eric The Read, asr, on google.com From florian.proff.schulze at gmx.net Tue Jan 14 15:42:39 2003 From: florian.proff.schulze at gmx.net (Florian Schulze) Date: Tue, 14 Jan 2003 15:42:39 +0100 (Westeuropäische Normalzeit) Subject: [pypy-dev] Pyrex Message-ID: Hi! I just thought that for some core functionality and interfacing to C functions something similar to Pyrex could be used. Just an extension to the python language to easily call C code from it. Not emmiting C code like Pyrex does, but doing it directly somehow. Something like a mix of calldll, ctypes and pyrex. Psyco then could be run over it to improve the code unlike in the current Pyrex implementation which just emmits C blindly and not very efficiently. Florian ps: Pyrex is really a great tool, it's very easy to write interfaces with it. From edream at tds.net Tue Jan 14 16:24:10 2003 From: edream at tds.net (Edward K. 
Ream) Date: Tue, 14 Jan 2003 09:24:10 -0600 Subject: [pypy-dev] psyco in Python was: Minimal Python project References: <005c01c2bbbe$1e8b9640$bba4aad8@computer> <2mn0m3lyb7.fsf@starship.python.net> Message-ID: <009d01c2bbe0$fd02e910$bba4aad8@computer> > > What happens if these goals conflict? For example, much of psyco is > > presently written in C. I don't suppose these goals imply that all > > of psyco will be recast in Python, or do they? > > I believe that's the plan anyway, somewhat because coding in Python is > less painful than coding in C. Have you *looked* at the source to > psyco? It's, umm, challenging. Yes, I've looked carefully at the code and converted it to a Leo outline. It is indeed challenging, as is the code for any compiler. However, the reason it is challenging is not particularly that it is written in C, but that neither the documentation nor the code comments explain completely _what_ the code is trying to accomplish and the _context_ in which the code operates. This is often true for other compilers, like gcc which I have also studied. The difference is that gcc uses completely standard compiling techniques while psyco does not. That's not a knock against psyco! Also, it is not clear to me exactly how psyco in Python would work in the production version. Clearly, it would be easy to do psyco in Python using the _present_ C implementation of Python, but how would psyco in Python be bootstrapped in an "all-Python" version? At present, all Python code eventually gets executed as C code: either in the interpreter itself or in the C libraries. In an "all-python version everything would get executed either as C code in the libraries, or assembly language output by Psycho. The question is, how does the Python code that represents psyco get "turned into" either C or assembly language? Do you see the problem? My guess is that for the production "all-python" version it would be easiest, in fact, to use something very similar to the present C version of psyco. Can anyone suggest an easier way? Doing psyco in Python for experimental purposes is a great idea. I just don't see how to implement psyco in Python in a production package without using the C interpreter as a bootstrap! And yes, I suspect that a C language version of psyco will be faster than psyco in Python. Edward -------------------------------------------------------------------- Edward K. Ream email: edream at tds.net Leo: Literate Editor with Outlines Leo: http://personalpages.tds.net/~edream/front.html -------------------------------------------------------------------- From edream at tds.net Tue Jan 14 16:37:45 2003 From: edream at tds.net (Edward K. Ream) Date: Tue, 14 Jan 2003 09:37:45 -0600 Subject: [pypy-dev] Pyrex References: Message-ID: <00b501c2bbe2$e25f61e0$bba4aad8@computer> > I just thought that for some core functionality and interfacing to C > functions something similar to Pyrex could be used. Hmm. Perhaps Pyrex could be used to bootstrap psyco in Python into C or assembly code. I suppose it will depend on what the Python version of the psyco ends up looking like... Edward -------------------------------------------------------------------- Edward K. 
Ream email: edream at tds.net Leo: Literate Editor with Outlines Leo: http://personalpages.tds.net/~edream/front.html -------------------------------------------------------------------- From mwh at python.net Tue Jan 14 16:54:04 2003 From: mwh at python.net (Michael Hudson) Date: 14 Jan 2003 15:54:04 +0000 Subject: [pypy-dev] Re: psyco in Python was: Minimal Python project References: <005c01c2bbbe$1e8b9640$bba4aad8@computer> <2mn0m3lyb7.fsf@starship.python.net> <009d01c2bbe0$fd02e910$bba4aad8@computer> Message-ID: <2mhecblobn.fsf@starship.python.net> "Edward K. Ream" writes: > > > What happens if these goals conflict? For example, much of psyco is > > > presently written in C. I don't suppose these goals imply that all > > > of psyco will be recast in Python, or do they? > > > > I believe that's the plan anyway, somewhat because coding in Python is > > less painful than coding in C. Have you *looked* at the source to > > psyco? It's, umm, challenging. > > Yes, I've looked carefully at the code and converted it to a Leo outline. > It is indeed challenging, as is the code for any compiler. I wasn't aware that most compilers had chunks of machine code stored in literals scattered about the source... > However, the reason it is challenging is not particularly that it is > written in C, but that neither the documentation nor the code > comments explain completely _what_ the code is trying to accomplish > and the _context_ in which the code operates. Hmmph. Have you read the slides for Armin's Europython talk? (Are they even online? Oh yeah, off sf). I didn't have a clue how psyco worked until I went to that talk. Matching that understanding to the current source is still highly non-trivial. [...] > Also, it is not clear to me exactly how psyco in Python would work in the > production version. Clearly, it would be easy to do psyco in Python using > the _present_ C implementation of Python, but how would psyco in Python be > bootstrapped in an "all-Python" version? Is this any different from Holger's "how do we get rid of Python/compile.c" question? To attempt an answer, once you have a bytecode interpreter you bytecode compile psyco and the new interpreter using CPython, then you use the bytecode interpreter to execute the compiled version of psyco. > At present, all Python code eventually gets executed as C code: > either in the interpreter itself or in the C libraries. In an > "all-python version everything would get executed either as C code > in the libraries, or assembly language output by Psycho. Nit: psyco doesn't deal in assembly... I guess you knew that. > The question is, how does the Python code that represents psyco get > "turned into" either C or assembly language? Do you see the > problem? Yes, but I don't see how the addition of pysco makes it any harder. > My guess is that for the production "all-python" version it would be > easiest, in fact, to use something very similar to the present C > version of psyco. Can anyone suggest an easier way? > > Doing psyco in Python for experimental purposes is a great idea. Do be fair, my comment that there's an intent to rewrite pysco in Python dates back to europython, long before the glimmerings of the present project. The main motivation was to increase maintainability and modularity (and so ease adding, say, a PowerPC backend), IIRC. Where's Armin when you need him? > I just don't see how to implement psyco in Python in a production > package without using the C interpreter as a bootstrap! 
And yes, I > suspect that a C language version of psyco will be faster than psyco > in Python. Even after you've run psyco over itself? Cheers, M. -- MARVIN: What a depressingly stupid machine. -- The Hitch-Hikers Guide to the Galaxy, Episode 7 From tismer at tismer.com Tue Jan 14 17:11:16 2003 From: tismer at tismer.com (Christian Tismer) Date: Tue, 14 Jan 2003 17:11:16 +0100 Subject: [pypy-dev] Re: [ann] Minimal Python project In-Reply-To: <7xbs2k3pwp.fsf@ruckus.brouhaha.com> References: <7xbs2k3pwp.fsf@ruckus.brouhaha.com> Message-ID: <3E2436A4.40507@tismer.com> Paul Rubin wrote: > "Edward K. Ream" writes: > >>1. The main reason for thinking C is particularly suited to >>producing fast code is that in C the types of all objects are known >>at compile time. ... I would expect that running psyco on Python >>code will at best run half as fast as the equivalent hand-written C >>code. The reasons are clear: psyco has neither the time nor the >>information to do as well as the best hand-written C code compiled >>by a descent compiler. > > > In fact some Lisp systems have been shown to produce compiled code > that runs as fast as compiled C code. I don't see that Python needs > to be different. As an addition: Pysco will have a much easier game with the re-implementation of Python in Python than with general Python code. I'm going to build the basic objects with Python, but based upon an object description which is similar to the new type objects with slots and properties. This concept can be expanded to not only describe derivatives of Python objects, but also for ordinary data structures which we know from C. This leads to high-level descriptions of low-level fields of all structures. These field descriptors will be the objects which are modified from the Python code. Example: An integer field in a structure describing a Python frame, for instance f_lineno, is not a regular integer object, but is accessed via a descriptor which tells that this is a fixed sized integer field which does not overflow. By using this kind of objects in Python expressions, Psyco can absolute easily deduce that it is dealing with primitive types. There is no experimentation necessary, and even without gathering any runtime experience, Psyco *can* emit code of the same quality as a C compiler does, provided that we make it into a smart enough compiler. The code dealing with these objects is ordinary Python code. The objects just happen to know what they are and how they should be rendered into assembly. So this leads to efficient implementations even of the innermost objects. Everything else can be bootsrapped upon this. This simple principle would even work without all of Psyco's sophistication, which makes me so confident that this project will have a great result, even when compiling statically. cheers - chris From edream at tds.net Tue Jan 14 17:07:25 2003 From: edream at tds.net (Edward K. Ream) Date: Tue, 14 Jan 2003 10:07:25 -0600 Subject: [pypy-dev] Thought experiments Message-ID: <00bd01c2bbe7$077a72e0$bba4aad8@computer> I'd like to throw out some thought experiments, just for fun. As I understand it, the essence of psyco is that it can generate efficient machine code corresponding to cases in the C interpreter based primarily on knowledge of the _runtime types_ of operands (variables of the _interpreter_ corresponding to Python operands and intermediate operands). pyrex generates C code based on the _declared_ types of objects. Is this summary basically correct? Might we not combine these two approaches? 
Are there any circumstances in which we might generate _optimized_ C/assembly code based on the runtime types of objects in functions? This suggests a multi-layered approach: use psyco to determine the types of objects, then _afterwards_ use something like a real C compiler to generate code. This can only work if the types of object to functions remains constant, which is usually true, but certainly not always true. This would be similar to other optimization techniques that gather statistics about the frequency that basic code blocks are executed. The difference is that here the first-stage optimizer (psyco) would gather statistics about the _types_ of variables. Yes, there are problems here, but they correspond, I think, to similar problems in psyco. Indeed, psyco must continually check (using psyco_compatible, if I am not mistaken) to see that the types presented to the function match the types used to create the assembly-language code. In other words, might we not consider a new kind of "assembled byte code"? My guess is that compiled functions/methods would have to start with a "preamble" that selects the proper compiled code to execute based on the types of the arguments passed to the function. Actually, I wonder whether psyco itself could generate such preambles as an alternative to using psyco_compatble. (Or maybe the two techniques are equivalent). Sure, there could be code explosion in general. But how likely is this in practice? And psyco must limit code bloat as well. It would be interesting to get some statistics. In Leo I suspect the vast majority of functions and methods use only a single type of each argument and return only a single type of result. The null class pattern could be used to deal with None values of arguments. We might limit this kind of technique to routines explicitly selected by the user, or the technique could limit itself to only those functions with non-varying types. Just as psyco does, or will do, we would need some kind of escape if unexpected types of arguments were discovered later. Any comments? Edward -------------------------------------------------------------------- Edward K. Ream email: edream at tds.net Leo: Literate Editor with Outlines Leo: http://personalpages.tds.net/~edream/front.html -------------------------------------------------------------------- From tismer at tismer.com Tue Jan 14 17:20:17 2003 From: tismer at tismer.com (Christian Tismer) Date: Tue, 14 Jan 2003 17:20:17 +0100 Subject: [pypy-dev] psyco in Python was: Minimal Python project In-Reply-To: <009d01c2bbe0$fd02e910$bba4aad8@computer> References: <005c01c2bbbe$1e8b9640$bba4aad8@computer> <2mn0m3lyb7.fsf@starship.python.net> <009d01c2bbe0$fd02e910$bba4aad8@computer> Message-ID: <3E2438C1.1010706@tismer.com> Edward K. Ream wrote: ... > Also, it is not clear to me exactly how psyco in Python would work in the > production version. Clearly, it would be easy to do psyco in Python using > the _present_ C implementation of Python, but how would psyco in Python be > bootstrapped in an "all-Python" version? At present, all Python code > eventually gets executed as C code: either in the interpreter itself or in > the C libraries. In an "all-python version everything would get executed > either as C code in the libraries, or assembly language output by Psycho. Bootstrap is the magic word. There will be a very tiny, portable little virtual machine written in C. It is not meant to be efficient, just enough to implement the bytecode interpreter. 
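(An aside on the typed field descriptors Christian describes a few messages up: a minimal sketch of what a self-describing field might look like at the Python level. The CField class and the "int32" tag are invented for the example and say nothing about how the real implementation will spell this.)

    class CField(object):
        """Descriptor for a struct field whose C-level type is fixed and known."""
        def __init__(self, ctype):
            self.ctype = ctype              # e.g. "int32": fixed size, no overflow
        def __set_name__(self, owner, name):
            self.slot = "_" + name
        def __get__(self, obj, owner=None):
            if obj is None:
                return self                 # class access exposes the description
            return getattr(obj, self.slot)
        def __set__(self, obj, value):
            # A real implementation might mask or range-check here; the point
            # is that a compiler can see the type without running the code.
            setattr(obj, self.slot, int(value))

    class Frame(object):
        f_lineno = CField("int32")
        f_lasti = CField("int32")

    f = Frame()
    f.f_lineno = 7
    print(f.f_lineno, Frame.f_lineno.ctype)   # 7 int32

Because the description is reachable from Python itself, a compiler could treat such a field as a primitive machine integer rather than as a general object.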
I'm right now tinkering with parts of such a beast. It is getting very small, just a few kilobytes executable. This thing will be able to interpret Python bytecode. So the only other thing we need is access to a file with the marshalled implementation of the bytecode interpreter. If we have bytecode running, then we have Psyco (in Python) running. Then, we already can produce machine code for the tiny virtual engine. Then we can load the real code generator for the physical platform, if it is already existant, and then Psyco generates code for that. ciao - chris From hpk at trillke.net Tue Jan 14 17:17:28 2003 From: hpk at trillke.net (holger krekel) Date: Tue, 14 Jan 2003 17:17:28 +0100 Subject: [pypy-dev] Re: getting rid of compile.c? In-Reply-To: <2mptr0klrd.fsf@starship.python.net>; from mwh@python.net on Tue, Jan 14, 2003 at 11:34:46AM +0000 References: <20030114100703.Q1568@prim.han.de> <2mptr0klrd.fsf@starship.python.net> Message-ID: <20030114171728.W1568@prim.han.de> Hi michael! so you figured out that we've gatewayed this list through gmane :-) [Michael Hudson Tue, Jan 14, 2003 at 11:34:46AM +0000] > holger krekel writes: > > > Hi, > > > > i was wondering how/if to get rid of compile.c. ASFAIK > > it is currently needed for bootstrapping purposes. > > > > Wouldn't it be possible to deliver precompiled bytecode > > for the python compiler package and thus obsolete compile.c? > > The bytecode isn't platform dependent so this seems sensible > > to me. > > That sounds possible, and you're always going to have issues like > this, but as long as you're hosting your work inside of CPython 2.3, > I wouldn't worry about it just yet... right. But i'd like to get a self-contained bootstrap process ASAP. > (Doing this would make bytecode changes, erm, hairy, it seems to me). maybe, but doing a multi-stage bootstrap should bring back enough flexibility. E.g. i think that something like the following could work: - the precompiled stage1-compiler compiles the stage2-compiler while running on smallish C-based stage1-VM - stage2-compiler (runing on stage1-VM) compiles python stage2-VM - stage2-compiler (running on stage1-VM) recompiles stage2-compiler (probably with different bytecodes) - stage2-vm takes over - new stage2-compiler (running on stage2-vm) is used subsequently One crucial point seems to be that the stage2 python compiler code maps well to the stage1-VM and stage2-VM. does that make sense? holger From tismer at tismer.com Tue Jan 14 17:27:03 2003 From: tismer at tismer.com (Christian Tismer) Date: Tue, 14 Jan 2003 17:27:03 +0100 Subject: [pypy-dev] Re: getting rid of compile.c? In-Reply-To: <20030114171728.W1568@prim.han.de> References: <20030114100703.Q1568@prim.han.de> <2mptr0klrd.fsf@starship.python.net> <20030114171728.W1568@prim.han.de> Message-ID: <3E243A57.8080402@tismer.com> holger krekel wrote: ... > maybe, but doing a multi-stage bootstrap should bring back > enough flexibility. E.g. i think that something like the > following could work: > > - the precompiled stage1-compiler compiles the stage2-compiler > while running on smallish C-based stage1-VM > > - stage2-compiler (runing on stage1-VM) compiles python stage2-VM > > - stage2-compiler (running on stage1-VM) recompiles stage2-compiler > (probably with different bytecodes) > > - stage2-vm takes over > > - new stage2-compiler (running on stage2-vm) is used subsequently > > One crucial point seems to be that the stage2 python compiler code > maps well to the stage1-VM and stage2-VM. > > does that make sense? 
Comparing out latest postings tells me that our brains match fairly well :-) ciao - chris (bootstrapping myself) From edream at tds.net Tue Jan 14 17:27:07 2003 From: edream at tds.net (Edward K. Ream) Date: Tue, 14 Jan 2003 10:27:07 -0600 Subject: [pypy-dev] Re: [ann] Minimal Python project References: <7xbs2k3pwp.fsf@ruckus.brouhaha.com> <3E2436A4.40507@tismer.com> Message-ID: <00e401c2bbe9$c84f27c0$bba4aad8@computer> > This leads to high-level descriptions of low-level fields of all structures. Thanks, Chris, for these details. The more I understand about this project the better it looks :-) I eagerly await developments... Edward -------------------------------------------------------------------- Edward K. Ream email: edream at tds.net Leo: Literate Editor with Outlines Leo: http://personalpages.tds.net/~edream/front.html -------------------------------------------------------------------- From pedronis at bluewin.ch Tue Jan 14 17:24:43 2003 From: pedronis at bluewin.ch (Samuele Pedroni) Date: Tue, 14 Jan 2003 17:24:43 +0100 Subject: [pypy-dev] reading suggestion Message-ID: <036701c2bbe9$72037b00$6d94fea9@newmexico> Mixed-mode Bytecode Execution Ole Agesen and David Detlefs Abstract Modern high-performance virtual machines use dynamic compilation. There is a tension between compilation speed and code quality. We argue that a highly-optimizing compiler is best deployed with both a fast, less-optimizing compiler and an interpreter. We present measurements showing that such a system can achieve the same peak performance as a system with just the optimizing compiler, and startup costs similar to a system with just the interpreter and fast compiler. http://research.sun.com/research/techrep/2000/abstract-87.html [Java HotSpot at the moment uses and interpreter + fast medium less optimizing compiler in the client version, and an interpreter + more higly optimizing compiler in the server version] regards From paul at boddie.org.uk Tue Jan 14 17:41:28 2003 From: paul at boddie.org.uk (Paul Boddie) Date: Tue, 14 Jan 2003 08:41:28 -0800 Subject: [pypy-dev] Minimal VM Message-ID: <200301140841.AA446234948@boddie.org.uk> Christian Tismer wrote: > >There will be a very tiny, portable little virtual >machine written in C. It is not meant to be efficient, >just enough to implement the bytecode interpreter. >I'm right now tinkering with parts of such a beast. >It is getting very small, just a few kilobytes executable. The most interesting parts, it seems to me, will be those which implement the more complicated bytecode semantics (like LOAD_NAME, for example, which seems to have the potential to be pretty deep). Although you are bound to know much more about this than me - I've never looked at the Python VM source code - there must surely be a huge chunk of C code implementing the name lookup semantics. >This thing will be able to interpret Python bytecode. Without getting carried away by the promise of massive performance increases, an interesting application of a simplified VM should be the increased potential for reimplementations of the platform. It would be most amusing to be able to reduce the scope of VM operations such that they could be more easily implemented on "really small" computing platforms. Of course, a side effect of having simpler VM operations (ie. simpler bytecode semantics) is that Psyco (or its successor) will have much more to play with, as more of the VM "magic" moves out of the VM and into bytecode routines which can then be specialised. Anyway, those are my thoughts. 
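(To make the "ship the compiler as marshalled bytecode" idea from the compile.c thread concrete: a minimal sketch using only compile() and the marshal module as they exist in CPython. The toy SOURCE string and the prebuilt.bin file name are made up; the real payload would be the compiler package itself, and the dump is portable across platforms but tied to the bytecode and marshal version of the interpreter that wrote it.)

    import marshal

    SOURCE = "def greet(name):\n    return 'hello, ' + name\n"

    # Build step, on a host CPython that still has a C-coded compiler:
    code = compile(SOURCE, "<prebuilt>", "exec")
    with open("prebuilt.bin", "wb") as f:
        marshal.dump(code, f)

    # Boot step, on a VM that can only *execute* bytecode:
    with open("prebuilt.bin", "rb") as f:
        code = marshal.load(f)
    ns = {}
    exec(code, ns)
    print(ns["greet"]("world"))               # hello, world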
Paul P.S. I wrote a similar analysis on Kendall Clark's Weblog at... http://www.pyzine.com/archives.phtml?a=000108&b=3#comments ...for those of you who might think they've read this before. ____________________________________________________________ Free 20MB Web Site Hosting and Personalized E-mail Service! Get It Now At Doteasy.com http://www.doteasy.com/et/ From hpk at trillke.net Tue Jan 14 18:01:28 2003 From: hpk at trillke.net (holger krekel) Date: Tue, 14 Jan 2003 18:01:28 +0100 Subject: [pypy-dev] Minimal VM In-Reply-To: <200301140841.AA446234948@boddie.org.uk>; from paul@boddie.org.uk on Tue, Jan 14, 2003 at 08:41:28AM -0800 References: <200301140841.AA446234948@boddie.org.uk> Message-ID: <20030114180128.X1568@prim.han.de> [Paul Boddie Tue, Jan 14, 2003 at 08:41:28AM -0800] > Christian Tismer wrote: > > > >There will be a very tiny, portable little virtual > >machine written in C. It is not meant to be efficient, > >just enough to implement the bytecode interpreter. > >I'm right now tinkering with parts of such a beast. > >It is getting very small, just a few kilobytes executable. > > The most interesting parts, it seems to me, will be those > which implement the more complicated bytecode semantics (like LOAD_NAME, > for example, which seems to have the potential to be pretty deep). that's actually pretty easy. it's basically (from eval_frame in compile.c): case LOAD_NAME: w = GETNAMEV(oparg); if ((x = f->f_locals) == NULL) { // exception ... } x = PyDict_GetItem(x, w); if (x == NULL) { x = PyDict_GetItem(f->f_globals, w); if (x == NULL) { x = PyDict_GetItem(f->f_builtins, w); if (x == NULL) { // exception ... } } } Py_INCREF(x); PUSH(x); break; so it looks into local, then global and then the builtin namespace. > Although you are bound to know much more about this than me - > I've never looked at the Python VM source code - btw, it's quite easy to read if you know C. It's pythonic C mostly :-) > there must surely be a huge chunk of C code implementing the > name lookup semantics. not that much if you don't count the Dict object in. some complexity drops in with optimizations and special cases: - local name-bindings are usually mapped to LOAD_FAST/STORE_FAST which don't go through a dictionary lookup but use integer indexes into an array. these are figured out at compile time. - nested scopes (LOAD_DEREF) which bind a name to an outer namespace they probably aren't needed for a Stage1-VM. Also a big part is exception (and block) handling. > >This thing will be able to interpret Python bytecode. > > Without getting carried away by the promise of massive performance > increases, an interesting application of a simplified VM should be > the increased potential for reimplementations of the platform. It > would be most amusing to be able to reduce the scope of VM > operations such that they could be more easily implemented > on "really small" computing platforms. There will be a memory-speed tradeoff, though. So if "small" means 1M memory the odds are that a CPython based approach is more effective. > Of course, a side effect of having simpler VM operations (ie. > simpler bytecode semantics) is that Psyco (or its successor) > will have much more to play with, as more of the VM "magic" > moves out of the VM and into bytecode routines which can > then be specialised. IMO the CPython VM is pretty straight forward. But the fact that Psyco hits the C-barrier too soon (with the frame object and the eval_frame loop) stops it from going/specializing deeper. 
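(A toy rendering in Python of the lookup order quoted above, with opcodes as entries in a table of functions rather than cases in a C switch. The MiniFrame class and the run() loop are invented for the sketch; the real LOAD_NAME also has to cope with a missing f_locals and with raising proper exceptions.)

    class MiniFrame(object):
        def __init__(self, f_locals, f_globals, f_builtins):
            self.f_locals = f_locals
            self.f_globals = f_globals
            self.f_builtins = f_builtins
            self.stack = []

    def load_name(frame, name):
        # locals, then globals, then builtins: the same order as the C code
        for space in (frame.f_locals, frame.f_globals, frame.f_builtins):
            if name in space:
                frame.stack.append(space[name])
                return
        raise NameError("name %r is not defined" % name)

    def store_name(frame, name):
        frame.f_locals[name] = frame.stack.pop()

    OPCODES = {"LOAD_NAME": load_name, "STORE_NAME": store_name}

    def run(frame, program):
        for op, arg in program:
            OPCODES[op](frame, arg)

    f = MiniFrame({}, {"x": 41}, {"len": len})
    run(f, [("LOAD_NAME", "x"), ("STORE_NAME", "y")])
    print(f.f_locals)                         # {'y': 41}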
cheers, holger From theller at python.net Tue Jan 14 21:58:38 2003 From: theller at python.net (Thomas Heller) Date: 14 Jan 2003 21:58:38 +0100 Subject: [pypy-dev] bootstrapping issues In-Reply-To: <20030112133906.GB4556@laranja.org> References: <20030111214011.P1568@prim.han.de> <20030112133906.GB4556@laranja.org> Message-ID: Lalo Martins writes: > On Sat, Jan 11, 2003 at 09:40:11PM +0100, holger krekel wrote: > > For avoiding the need of C-coded system modules > > there is interest to code a generalization of > > the "struct" module which allows *calling* functions at > > the C-level. This general C-"Extension" will be > > system-dependent. > > Someone already mentioned Squeak, which I think is something everyone > involved in this project should take a look at. (http://www.squeak.org/ ) > > I would also mention Gnu Smalltalk (http://www.gnu.org/software/smalltalk/); > it has the best C-calling-from-interpreter interface I've ever seen. > I took some minutes to look over the manual, expecially http://www.gnu.org/software/smalltalk/gst-manual/gst_21.html To be honest, in spirit it looks very similar to what I implemented in the Python ctypes module. Except, that you cannot store functions pointers into CObject instances in GNU smalltalk (at least it seems so from this web-page), but in ctypes. Thomas From roccomoretti at netscape.net Wed Jan 15 04:00:54 2003 From: roccomoretti at netscape.net (Rocco Moretti) Date: Tue, 14 Jan 2003 22:00:54 -0500 Subject: [pypy-dev] Lessons From (Limited) Experience Message-ID: <71A90F65.7DB3B5EB.9ADE5C6A@netscape.net> I like the idea. In fact, after reading the psyco website, I started something like it myself in October. The answer to the question that is on everybody's minds? PyPython. (I envisioned it as potentially being the springboard for other versions: AdaPython, LispPython, etc.) Additionally, the goal was to make a clear and readable implementation of Python which people could study and play with, without requiring them to know C. My plan was "simply" to take the CPython code and convert it into Python code, minimizing the use of "high level" features like lambda, generators, etc., so that a small core could easily be hand compiled into the language du jour. Eventually, a reduced dialect of Python might be employed for the core (interpreter loop and basic objects) such that a simple compiler could be built for it. My greatest ally was my biggest enemy. To make things easier in development, I used close association of my program with the host system. This way I could defer converting the compiler and core objects untill later. Unfortunately, it was *extremely* unclear where PyPython ended and the host system began. In my final incarnation, the PyPython objects with external visibility were referenced from a global definition file (a'la python.h), where they were simply imported wholesale from __builtins__. But the problem which stymied me was the issue of what to do with Exceptions. I discarded the concept of copying the "return Null" technique of CPython - having to check every return value for an error condition seems so unpythonic. But raising native exceptions has the complication of discriminating between exceptions in the host system (or rather the PyPython code) and the end user interpreted code. I might have been able to do something with it if I could have created my own traceback objects from within Python. (That was my greatest frustration - the inability to create and modify core objects [frame, thread, traceback, etc.] 
from within python itself.) That's where I left it in December when I stopped for Christmas. (Don't bother to ask for code: most of what I wrote was dead ends or "mindless conversion" from C.) I agree with using cross compilers in the bootstrap procedure and with being "stackless" - two features I was going to apply. One additional thing that came about as a direct result of Python's lack of a switch statement: I replaced the switch statement with an array of references to functions implementing the opcodes. (I was going for clarity over speed) One offshoot of this is the potential to have multiple opcode schemes in the same interpreter. (Like Python 1.5.2 and Python 2.2 - or better yet, Python and Java) How does that work with parameter/return value passing and potential psyco optimization? Don't know - didn't get that far. I might not make the sprint, but I'd be glad to help as I can, Rocco Moretti P.S. I would advise for some brave soul (more reliable than me) to start summaries of pypy-dev, as has been done recently for python-dev. It would be nice to have a record of all the major design considerations in a location which doesn't require several hours of sorting through fluff. From tdelaney at avaya.com Wed Jan 15 04:33:28 2003 From: tdelaney at avaya.com (Delaney, Timothy) Date: Wed, 15 Jan 2003 14:33:28 +1100 Subject: [pypy-dev] Seeking Minimal Python Project Name Message-ID: A radical proposal. Python implemented in Python becomes the reference version. As a result, it is called 'Python'. The C implementation officially becomes 'CPython'. :) Tim Delaney From tismer at tismer.com Wed Jan 15 04:42:53 2003 From: tismer at tismer.com (Christian Tismer) Date: Wed, 15 Jan 2003 04:42:53 +0100 Subject: [pypy-dev] Seeking Minimal Python Project Name In-Reply-To: References: Message-ID: <3E24D8BD.6040401@tismer.com> Delaney, Timothy wrote: > A radical proposal. > > Python implemented in Python becomes the reference version. As a result, it > is called 'Python'. > > The C implementation officially becomes 'CPython'. You are absolutely right! This is what I really want to achieve, and I do believe that python-dev is not against it, if it really turns out to be *the way* into the future. We can only know that by coding. (And you are invited hereby). So what this ridiculous name seeking is about (besides obvious entries into search engines, which is appreciated), is only a code name for the current project, provided that we will have *great* success. Sure I'd want this, but this goal is 100,000 lines of Python code away from us, believe me! Nobody should underestimate this project. I would nominate it for at least 7 man years. yours sincerely -- chris p.s.: I cannot do more than 3 in one From tdelaney at avaya.com Wed Jan 15 04:45:17 2003 From: tdelaney at avaya.com (Delaney, Timothy) Date: Wed, 15 Jan 2003 14:45:17 +1100 Subject: [pypy-dev] Seeking Minimal Python Project Name Message-ID: > From: Christian Tismer [mailto:tismer at tismer.com] > > So what this ridiculous name seeking is about > (besides obvious entries into search engines, > which is appreciated), is only a code name > for the current project, provided that we will Yeah - I know. I was being facetious.
I therefore submit 'Whython' ;) Short for 'Why are we waiting to call this Python?' ;) Tim Delaney From tismer at tismer.com Wed Jan 15 05:19:42 2003 From: tismer at tismer.com (Christian Tismer) Date: Wed, 15 Jan 2003 05:19:42 +0100 Subject: [pypy-dev] Seeking Minimal Python Project Name In-Reply-To: References: Message-ID: <3E24E15E.5030908@tismer.com> Delaney, Timothy wrote: > From: Christian Tismer [mailto:tismer at tismer.com] > > So what this ridiculous name seekig is about > (besides obvious entries into search engines, > which is appreciated), is only a code name > for the current project, provided that we will > > > Yeah - I know. I was being facetious. > > I therefore submit 'Whython' ;) Short for 'Why are we waiting to call this > Python?' ;) Haha! Why are you waiting to code for this Python! I've started, please come and adjust me -- chris From tim.jarman at lineone.net Wed Jan 15 13:46:43 2003 From: tim.jarman at lineone.net (Tim Jarman) Date: Wed, 15 Jan 2003 12:46:43 +0000 Subject: [pypy-dev] RE: Seeking Minimal Python Project Name References: <20030115110003.7D05A5AEE5@thoth.codespeak.net> Message-ID: <3E25582A.8DF1F1A6@lineone.net> How about: XPython as in: eXperimental Python or : eXtreme Python (but not ex-Python as in "This is an ex-parrot")? Helping-to-keep-this-thread-immortal-ly yrs, Tim J. From jum at anubis.han.de Wed Jan 15 14:38:22 2003 From: jum at anubis.han.de (Jens-Uwe Mager) Date: Wed, 15 Jan 2003 14:38:22 +0100 Subject: [pypy-dev] bootstrapping issues In-Reply-To: References: <20030111214011.P1568@prim.han.de> <20030112014136.S1568@prim.han.de> Message-ID: <20030115133822.GB2996@ANUBIS> On Mon, Jan 13, 2003 at 10:04:27AM +0100, Thomas Heller wrote: > I've now a running version which uses libffi (except on Windows). > Tested on an oldish SuSE 7.1 x86 system. > > The docs and readme's still need updating, but for the *impatient* > I've uploaded a snapshot: > > http://starship.python.net/crew/theller/ctypes-0.3.5.tar.gz I just downloaded this version and I am trying to find a matching ffi library, which appears to be rather difficult. I am building on a gentoo system which has libffi 1.20 and I get problems with ffi_closure not being defined. Do I really need to check out the complete gcc source tree to get the proper libffi? -- Jens-Uwe Mager From theller at python.net Wed Jan 15 15:25:34 2003 From: theller at python.net (Thomas Heller) Date: 15 Jan 2003 15:25:34 +0100 Subject: [pypy-dev] bootstrapping issues In-Reply-To: <20030115133822.GB2996@ANUBIS> References: <20030111214011.P1568@prim.han.de> <20030112014136.S1568@prim.han.de> <20030115133822.GB2996@ANUBIS> Message-ID: Jens-Uwe Mager writes: > On Mon, Jan 13, 2003 at 10:04:27AM +0100, Thomas Heller wrote: > > > I've now a running version which uses libffi (except on Windows). > > Tested on an oldish SuSE 7.1 x86 system. > > > > The docs and readme's still need updating, but for the *impatient* > > I've uploaded a snapshot: > > > > http://starship.python.net/crew/theller/ctypes-0.3.5.tar.gz > > I just downloaded this version and I am trying to find a matching ffi > library, which appears to be rather difficult. I am building on a gentoo > system which has libffi 1.20 and I get problems with ffi_closure not > being defined. Do I really need to check out the complete gcc source > tree to get the proper libffi? I have two: The first one checked out via anonymous CVS following the instructions on the webpage http://sources.redhat.com/libfii/ . 
At least they updated the pointers on the webpage to be valid, which was not the case when I first got it. I then did ./configure make make install The configure command reported an error at the end: ./config.status: ./../config-ml.in: No such file or directory which I'm unable to resolve with my poor linux skills, but it works. The second one I got from Robin Becker, it seems to be a patched version of the first one, which includes further updates for Windows (stdcall and cdecl calling convention). He got it from somewhere else... Both should work, although I'm using the first one IIRC. I'm still using my own stuff on Windows, and libffi on the linux system. Unfortunately both have the filename libffi-2.00-beta.tar.gz. Should I upload a working tar.gz, or are these instructions sufficient? Thomas BTW: Is yours an x86 system? From mwh at python.net Wed Jan 15 18:26:16 2003 From: mwh at python.net (Michael Hudson) Date: 15 Jan 2003 17:26:16 +0000 Subject: [pypy-dev] Re: getting rid of compile.c? References: <20030114100703.Q1568@prim.han.de> <2mptr0klrd.fsf@starship.python.net> <20030114171728.W1568@prim.han.de> Message-ID: <2madi2l3yf.fsf@starship.python.net> holger krekel writes: > Hi michael! > > so you figured out that we've gatewayed this list through gmane :-) I saw the message to gmane.announce, I think. [snip] > does that make sense? Probably :) I should shut up and let you get on with actually doing something... Cheers, M. -- ARTHUR: But which is probably incapable of drinking the coffee. -- The Hitch-Hikers Guide to the Galaxy, Episode 6 From robin at reportlab.com Wed Jan 15 18:56:11 2003 From: robin at reportlab.com (Robin Becker) Date: Wed, 15 Jan 2003 17:56:11 +0000 Subject: [pypy-dev] bootstrapping issues In-Reply-To: References: <20030111214011.P1568@prim.han.de> <20030112014136.S1568@prim.han.de> <20030115133822.GB2996@ANUBIS> Message-ID: In article , Thomas Heller writes >The second one I got from Robin Becker, it seems to be a patched version >of the first one, which includes further updates for Windows (stdcall and >cdecl calling convention). He got it from somewhere else... The version I sent to Thomas came from Roger E Critchlow Jr and was part of his Tcl ffidl package version 0.5. He was certainly into the foreign function interface thing in a big way and his package also included source for ffcall-1.6 http://www.gnu.org/directory/libs/c/ffcall.html and the platforms reported for that are Supported CPUs: (Put the GNU config.guess values here.) 
i386 i486-unknown-linux (gcc), i386-unknown-sysv4.0 (gcc, /usr/bin/cc, /usr/ucb/cc), i386-pc-solaris2.6 (gcc), i486-unknown-sco3.2v4.2 (gcc, cc -Of), i486-unknown-os2emx (gcc), i386-pc-cygwin32 (gcc), i386-pc-win32 (msvc4, msvc5) m68k m68k-next-nextstep3 (cc), m68k-sun-sunos4.0 (cc), m68k-unknown-linux (gcc) mips mips-sgi-irix4.0.5 (gcc, cc -ansi, cc -D__STDC__, cc -cckr), mips-sgi-irix5.2 (gcc, cc -ansi, cc -D__STDC__, cc -cckr), mips-sgi-irix5.3 (gcc, cc -ansi, cc -D__STDC__, cc -cckr), mips-sgi-irix6.2 (cc -32), mips-sgi-irix6.4 (cc -32, cc -n32, cc -64) sparc sparc-sun-sunos4.1.1 (gcc, cc), sparc-sun-solaris2.3 (gcc) sparc-sun-solaris2.4 (gcc, cc) alpha alpha-dec-osf3.0 (gcc, cc), alpha-dec-osf4.0 (gcc, cc) hppa hppa1.0-hp-hpux8.00 (gcc, cc), hppa1.1-hp-hpux9.05 (cc), hppa1.1-hp-hpux10.01 (cc), hppa2.0-hp-hpux10.20 (cc +DA1.1) arm -- untested rs6000 rs6000-ibm-aix3.2.5 (gcc, c89, xlc), powerpc-ibm-aix4.1.4.0 (cc) m88k -- untested convex -- untested -- Robin Becker From theller at python.net Wed Jan 15 21:10:52 2003 From: theller at python.net (Thomas Heller) Date: 15 Jan 2003 21:10:52 +0100 Subject: [pypy-dev] bootstrapping issues In-Reply-To: References: <20030111214011.P1568@prim.han.de> <20030112014136.S1568@prim.han.de> <20030115133822.GB2996@ANUBIS> Message-ID: <65sqcgxf.fsf@python.net> Robin Becker writes: > In article , Thomas Heller writes > >The second one I got from Robin Becker, it seems to be a patched version > >of the first one, which includes further updates for Windows (stdcall and > >cdecl calling convention). He got it from somewhere else... > The version I sent to Thomas came from Roger E Critchlow Jr > and was part of his Tcl ffidl package version 0.5. Thanks for looking this up again, Robin. > He was certainly into the foreign function interface thing in a > big way and his package also included source for ffcall-1.6 > http://www.gnu.org/directory/libs/c/ffcall.html and > the platforms reported for that are I'm not keen on supporting ffcall, since it's GPL and so I cannot use it for the stuff I have to redistribute. OTOH I'm quite sure it would not be too difficult to support that, also, if useful. Thomas From hpk at trillke.net Thu Jan 16 00:00:12 2003 From: hpk at trillke.net (holger krekel) Date: Thu, 16 Jan 2003 00:00:12 +0100 Subject: [pypy-dev] Re: getting rid of compile.c? In-Reply-To: <2madi2l3yf.fsf@starship.python.net>; from mwh@python.net on Wed, Jan 15, 2003 at 05:26:16PM +0000 References: <20030114100703.Q1568@prim.han.de> <2mptr0klrd.fsf@starship.python.net> <20030114171728.W1568@prim.han.de> <2madi2l3yf.fsf@starship.python.net> Message-ID: <20030116000012.Q1568@prim.han.de> [Michael Hudson Wed, Jan 15, 2003 at 05:26:16PM +0000] > holger krekel writes: > [snip] > > does that make sense? > > Probably :) I should shut up and let you get on with actually doing > something... your input is highly appreciated as long as you don't start to suggest project names :-) At least for my part, i am very thankful for knowledgable people like you sharing some thoughts on e.g. bootstrapping scenarios. cheers, holger From tismer at tismer.com Thu Jan 16 00:49:30 2003 From: tismer at tismer.com (Christian Tismer) Date: Thu, 16 Jan 2003 00:49:30 +0100 Subject: [pypy-dev] Re: getting rid of compile.c? 
In-Reply-To: <20030116000012.Q1568@prim.han.de> References: <20030114100703.Q1568@prim.han.de> <2mptr0klrd.fsf@starship.python.net> <20030114171728.W1568@prim.han.de> <2madi2l3yf.fsf@starship.python.net> <20030116000012.Q1568@prim.han.de> Message-ID: <3E25F38A.3070600@tismer.com> holger krekel wrote: > [Michael Hudson Wed, Jan 15, 2003 at 05:26:16PM +0000] > >>holger krekel writes: >>[snip] >> >>>does that make sense? >> >>Probably :) I should shut up and let you get on with actually doing >>something... > > > your input is highly appreciated as long as you don't start > to suggest project names :-) But I found the name game very funny. Enough for now, of course. > At least for my part, i am very thankful for knowledgable people > like you sharing some thoughts on e.g. bootstrapping scenarios. I found all the discussion very interesting, too, and it kept me thinking of the project all day, although I should wait until the sprint and do my daily work. There was also an awful lot of private emails to answer. Btw. Guido asked about a sprint at Europython, which I found a very good idea. The project is large enough for more than one sprint. (As somebody mentioned, it will be more like a marathon :-) One thing from a message from Rocco Moretti is worth bringing up again: """ P.S. I would advise for some brave soul (more reliable than me) to start summaries of pypy-dev, like has been done recently for python-dev. It would be nice to have a record of all the major design considerations in a location which doesn't require several hours of sorting through fluff. """ Should we try to find somebody who takes this wonderful task of summarizing, or should we collect info in a Wiki instead? all the best - chris From robin at reportlab.com Thu Jan 16 03:58:47 2003 From: robin at reportlab.com (Robin Becker) Date: Thu, 16 Jan 2003 02:58:47 +0000 Subject: [pypy-dev] bootstrapping issues In-Reply-To: <65sqcgxf.fsf@python.net> References: <20030111214011.P1568@prim.han.de> <20030112014136.S1568@prim.han.de> <20030115133822.GB2996@ANUBIS> <65sqcgxf.fsf@python.net> Message-ID: In article <65sqcgxf.fsf at python.net>, Thomas Heller writes >Robin Becker writes: > >> In article , Thomas Heller >writes >> >The second one I got from Robin Becker, it seems to be a patched version >> >of the first one, which includes further updates for Windows (stdcall and ...... >I'm not keen on supporting ffcall, since it's GPL and so I cannot >use it for the stuff I have to redistribute. OTOH I'm quite sure >it would not be too difficult to support that, also, if useful. > >Thomas > No problem on that, I just wanted to mention some alternatives. -- Robin Becker From tismer at tismer.com Thu Jan 16 04:01:35 2003 From: tismer at tismer.com (Christian Tismer) Date: Thu, 16 Jan 2003 04:01:35 +0100 Subject: ctypes impressions (was: [pypy-dev] bootstrapping issues) In-Reply-To: References: <20030111214011.P1568@prim.han.de> <20030112014136.S1568@prim.han.de> Message-ID: <3E26208F.4010808@tismer.com> Robin Becker: >>>Thomas Heller's ctypes module would seem to be a very good start at the >>>generic C interface thing. It is pretty easy to use and although I don't >>>have a complete grasp on it was a bit easier to use than calldll. I used >>>both to do anygui interfaces to native windows api. Thomas Heller: > The docs and readme's still need updating, but for the *impatient* > I've uploaded a snapshot: Well, I had a look at the current version of ctypes, in the backgound of this project, of course. I only looked into _ctypes.c right now. 
callbacks.c and callproc.c have to be examined as well. My impressions are a bit mixed. First of all, this looks like a very complete module to manipulate every and all flavors of C types. I haven't played with it yet, what I promise to do. As an extension to the current, C based Python implementation, it is without doubt great stuff. There are lots of new object types together with lots of methods to create and manipulate them. What I liked really much (and I would probably like to steal this trick) is the most elegant implementation of alignment. Interested readers should read and understand lines 3309 ff of _ctypes.c . Now to the flipside. While this module is for sure of great use for CPython, I'm quite reluctant to use it as-is for Minimal Python. Reason? It is *way* too big, with 101 KB source and 4330 lines. While this is very ok with CPython, I think such a big module in C is exactly what we don't want for this project, since getting rid of large C sources is one of the first objectives of the project. This is absolutely not meant negatively to Thomas. Your module was not designed for this project. In the context of CPython, this is the way to do an efficient module. Everything has to be written down, in the flat way that C requires. That creates many source lines, many type denotations, and lots of similar looking methods. For Minimal Python, we need only a small percentage of this. We should pick some essential ideas and re-write that part in Python. We also only need some basic types to boot up the Python engine. Access to real memory can be provided by some functions in C, which should be mapped to opcodes of the virtual micro-machine. I'm anyway very thankful for this great resource. The most valuable thing are the ideas, which we should take into account. I'm also sure that we need to learn from the interfaces into library calls, pretty soon. Thanks a lot for this impressive implementation. Minimal Python will surely need some of the code and many of the ideas. cheers - chris From theller at python.net Thu Jan 16 08:56:05 2003 From: theller at python.net (Thomas Heller) Date: 16 Jan 2003 08:56:05 +0100 Subject: ctypes impressions (was: [pypy-dev] bootstrapping issues) In-Reply-To: <3E26208F.4010808@tismer.com> References: <20030111214011.P1568@prim.han.de> <20030112014136.S1568@prim.han.de> <3E26208F.4010808@tismer.com> Message-ID: Christian Tismer writes: > Robin Becker: > > >>>Thomas Heller's ctypes module would seem to be a very good start at the > >>>generic C interface thing. It is pretty easy to use and although I don't > >>>have a complete grasp on it was a bit easier to use than calldll. I used > >>>both to do anygui interfaces to native windows api. > > Thomas Heller: > > > The docs and readme's still need updating, but for the *impatient* > > I've uploaded a snapshot: > > Well, I had a look at the current version of ctypes, > in the backgound of this project, of course. > I only looked into _ctypes.c right now. > callbacks.c and callproc.c have to be examined as well. > > My impressions are a bit mixed. > > First of all, this looks like a very complete module > to manipulate every and all flavors of C types. > I haven't played with it yet, what I promise to do. > > As an extension to the current, C based Python > implementation, it is without doubt great stuff. > There are lots of new object types together with > lots of methods to create and manipulate them. 
> > What I liked really much (and I would probably like > to steal this trick) is the most elegant implementation > of alignment. Interested readers should read and understand > lines 3309 ff of _ctypes.c . Ahem, this code is nearly copied from Python's structmodule.c. So, this idea was someone else's. > > Now to the flipside. > While this module is for sure of great use for CPython, > I'm quite reluctant to use it as-is for Minimal Python. > Reason? It is *way* too big, with 101 KB source and 4330 > lines. > > While this is very ok with CPython, I think such a big > module in C is exactly what we don't want for this project, > since getting rid of large C sources is one of the first > objectives of the project. > > This is absolutely not meant negatively to Thomas. Your > module was not designed for this project. In the context > of CPython, this is the way to do an efficient module. > Everything has to be written down, in the flat way that C > requires. That creates many source lines, many type > denotations, and lots of similar looking methods. > Yes, only counting the lines it is even larger than Jim Fulton's ExtensionClass.c - but the code is much less dense and easier to read, IMO. But don't take this wrong: I'm not going to argue with you that something else is needed for Minimal Python. > For Minimal Python, we need only a small percentage of this. > We should pick some essential ideas and re-write that part > in Python. We also only need some basic types to boot up > the Python engine. Access to real memory can be provided > by some functions in C, which should be mapped to opcodes > of the virtual micro-machine. > Thanks for this great review, chris. Thomas From tismer at tismer.com Thu Jan 16 11:48:56 2003 From: tismer at tismer.com (Christian Tismer) Date: Thu, 16 Jan 2003 11:48:56 +0100 Subject: ctypes impressions (was: [pypy-dev] bootstrapping issues) In-Reply-To: References: <20030111214011.P1568@prim.han.de> <20030112014136.S1568@prim.han.de> <3E26208F.4010808@tismer.com> Message-ID: <3E268E18.5090209@tismer.com> Thomas Heller wrote: ... >>What I liked really much (and I would probably like >>to steal this trick) is the most elegant implementation >>of alignment. Interested readers should read and understand >>lines 3309 ff of _ctypes.c . > > > Ahem, this code is nearly copied from Python's structmodule.c. > So, this idea was someone else's. *blush* I've never completely read the structmodule source. Sorry. [talking about size] > Yes, only counting the lines it is even larger than Jim Fulton's > ExtensionClass.c - but the code is much less dense and easier to read, > IMO. > But don't take this wrong: I'm not going to argue with you > that something else is needed for Minimal Python. Great! I was afraid you might take me wrong. >>For Minimal Python, we need only a small percentage of this. >>We should pick some essential ideas and re-write that part >>in Python. We also only need some basic types to boot up >>the Python engine. Access to real memory can be provided >>by some functions in C, which should be mapped to opcodes >>of the virtual micro-machine. I dislike the above paragraph now. I'm overloading the young project with demands and restrictions. > Thanks for this great review, chris. After some sleep, I found this being not such a great review. I too much pointed on things which are only relevant through the nature of this project, and I argued with the final result in mind. 
Instead, I should have mentioned that it is pretty fine to use ctypes right now as it is, in order to get things flying. All considerations about re-coding things in Python don't apply to the early bootstrap phase. I-shouldn't-write-at-4-o'clock-in-the-night - chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Johannes-Niemeyer-Weg 9a : *Starship* http://starship.python.net/ 14109 Berlin : PGP key -> http://wwwkeys.pgp.net/ work +49 30 89 09 53 34 home +49 30 802 86 56 pager +49 173 24 18 776 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? http://www.stackless.com/ From tismer at tismer.com Thu Jan 16 13:26:33 2003 From: tismer at tismer.com (Christian Tismer) Date: Thu, 16 Jan 2003 13:26:33 +0100 Subject: [pypy-dev] Re: [ann] Minimal Python project In-Reply-To: <7x1y3dehrr.fsf@ruckus.brouhaha.com> References: <7xbs2k3pwp.fsf@ruckus.brouhaha.com> <7x1y3dehrr.fsf@ruckus.brouhaha.com> Message-ID: <3E26A4F9.3010009@tismer.com> Paul Rubin wrote: > Christian Tismer writes: > >>This leads to high-level descriptions of low-level >>fields of all structures. [...] > Do you have some other examples that really get used often? I'd think > f_lineno's don't get used that often. This was just an example for a common field (in fact it is used quite often at runtime, but almost only in the SET_LINENO opcode), that is embedded into some other structure. Everything else would to. The idea is to describe fixed data structures in a way that allows Python to easily deduce what the data type is, without any trial. See for example Thomas Heller's ctypes module, which has such descriptions for everything. A drawback of this is that we always need to access target variables in dotted notation. Otherwise we loose the static type info. If Python had type declarations, this would be no problem. Here an example from intobject.c: static int int_compare(PyIntObject *v, PyIntObject *w) { register long i = v->ob_ival; register long j = w->ob_ival; return (i < j) ? -1 : (i > j) ? 1 : 0; } Assuming that we try to model this in Python, the resulting code might look like this def int_compare(v, w): i, j = v.ob_ival, w.ob_ival if i < j: return -1 elif i > j: return 1 return 0 The above code only occours in contexts where integer objects are passed in. There is no type info in advance, but at the first execution of this code, v and w are passed in with their descriptions of all fields, and it is now absolutely clear that i and j are values of fixed sized integers. Code for directly accessing the ob_ival fields and doing the comparison can immediately be emitted when running the code first time. A remaining problem with the lack of declarations are local variables which are not members of a structure and it is not clear from the beginning what the primitive type should be. One ugly way would be to construct a structure for the local variables and to use dotted notation again. I hope this can be avoided by propagation of type info via __coerce__. Another snipped from intobject: for (i = 0, p = &list->objects[0]; i < N_INTOBJECTS; i++, p++) { if (PyInt_CheckExact(p) && p->ob_refcnt != 0) irem++; } Given a type object named c_int, this might translate to i = c_integer(0) p = list.objects while i < N_INTOBJECTS: # body not implemented here i += 1 p += 1 Here I use a known class as initialization of i. The data type is therefore known. p as a pointer field in the list structure is also known. 
The __coerce__ method of these classes can be written in a way, that they always propagate their own class to other operands, and especially in this case, the right operand is a constant. Given a definition like this: class c_integer(c_datatypes): def __coerce__(self, other): if type(other) == int: return self, c_integer(other) elif .... What I tried to express is that with little or no help of the programmer, primitive data types can be deduced quite easily, and unique code can be emitted on first execution of the code. > Will you give some thought to going to a tagged representation of > object references, and maybe a more traditional GC? That way you > avoid a lot of memory traffic, and also don't have to adjust reference > counts every time you touch anything. > > I think these would give a big performance boost, and also would also > simplify the C API. It might be possible to supply a compatibility > layer to make the old C API still useable. Can you give me a hint how this would be done? I have no experience with tagged representations, but if this can save so much, I should learn more about it. cheers - chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Johannes-Niemeyer-Weg 9a : *Starship* http://starship.python.net/ 14109 Berlin : PGP key -> http://wwwkeys.pgp.net/ work +49 30 89 09 53 34 home +49 30 802 86 56 pager +49 173 24 18 776 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? http://www.stackless.com/ From pedronis at bluewin.ch Thu Jan 16 13:57:02 2003 From: pedronis at bluewin.ch (Samuele Pedroni) Date: Thu, 16 Jan 2003 13:57:02 +0100 Subject: [pypy-dev] Re: [ann] Minimal Python project References: <7xbs2k3pwp.fsf@ruckus.brouhaha.com><7x1y3dehrr.fsf@ruckus.brouhaha.com> <3E26A4F9.3010009@tismer.com> Message-ID: <00b501c2bd5e$c4078d80$6d94fea9@newmexico> From: "Christian Tismer" > > Will you give some thought to going to a tagged representation of > > object references, and maybe a more traditional GC? That way you > > avoid a lot of memory traffic, and also don't have to adjust reference > > counts every time you touch anything. > > > > I think these would give a big performance boost, and also would also > > simplify the C API. It might be possible to supply a compatibility > > layer to make the old C API still useable. > > Can you give me a hint how this would be done? > I have no experience with tagged representations, > but if this can save so much, I should learn more > about it. this is a relevant survey paper: http://citeseer.nj.nec.com/gudeman93representing.html Representing Type Information in Dynamically Typed Languages e.g. integer in the range [-2**30,2**30-1] would be represented by their bit patterns shifted left by 1 bit all other objects would be boxed, and the address left shifted by 1 and ORed with 1. The fact that Python is slowly removing the long/integer separation can make less disruptive such an approach. OTOH I think such an approach would make the life a bit more complicated for psyco when detecting that a datum has machine-size integer type, because such a datum would be either in the first form _or_ a boxed (long) integer in the larger range -2**31 2**31-1. Another approach is to carry data around as a struct where both the tag and the datum are machine words, this approach is AFAIK currently used in the Gwydion Dylan compiler (www.gwydiondylan.org) originally developed at CMU. regards. 
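To make the bit-level encoding concrete, here is a small illustrative Python sketch of the integer half of such a scheme (purely pedagogical: a real implementation would do this on raw machine words at the C level, and the pointer case is only hinted at by the tag bit):

    TAG_MASK = 1   # lowest bit: 0 = unboxed integer, 1 = boxed object address

    def tag_int(n):
        # covers the [-2**30, 2**30-1] range mentioned above
        assert -2**30 <= n < 2**30
        return n << 1                  # bit pattern shifted left by 1, low bit 0

    def is_tagged_int(word):
        return (word & TAG_MASK) == 0

    def untag_int(word):
        return word >> 1               # arithmetic right shift restores the value

    assert is_tagged_int(tag_int(7))
    assert untag_int(tag_int(-42)) == -42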
From mwh at python.net Thu Jan 16 15:06:10 2003 From: mwh at python.net (Michael Hudson) Date: 16 Jan 2003 14:06:10 +0000 Subject: [pypy-dev] Re: [ann] Minimal Python project References: <7xbs2k3pwp.fsf@ruckus.brouhaha.com> <7x1y3dehrr.fsf@ruckus.brouhaha.com> <3E26A4F9.3010009@tismer.com> <00b501c2bd5e$c4078d80$6d94fea9@newmexico> Message-ID: <2mptqxxk8d.fsf@starship.python.net> "Samuele Pedroni" writes: > integer in the range [-2**30,2**30-1] would be represented by their bit > patterns shifted left by 1 bit > > all other objects would be boxed, and the address left shifted by 1 and ORed > with 1. I seem to recall someone tried that with python quite recently and found it was a pessimization (search python-dev, I guess). I'm not sure that's what Paul meant, though. Cheers, M. -- ZAPHOD: Listen three eyes, don't try to outwierd me, I get stranger things than you free with my breakfast cereal. -- The Hitch-Hikers Guide to the Galaxy, Episode 7 From pedronis at bluewin.ch Thu Jan 16 15:35:17 2003 From: pedronis at bluewin.ch (Samuele Pedroni) Date: Thu, 16 Jan 2003 15:35:17 +0100 Subject: Fw: [pypy-dev] Re: [ann] Minimal Python project Message-ID: <00d101c2bd6c$7d422640$6d94fea9@newmexico> From: "Michael Hudson" > > > integer in the range [-2**30,2**30-1] would be represented by their bit > > patterns shifted left by 1 bit > > > > all other objects would be boxed, and the address left shifted by 1 and ORed > > with 1. > > I seem to recall someone tried that with python quite recently and > found it was a pessimization (search python-dev, I guess). > > I'm not sure that's what Paul meant, though. he wrote tagged representation, and there was an e.g. in front of the example above, I don't claim it does wonder, even less together with psyco or with reference counting. From arigo at tunes.org Thu Jan 16 16:46:45 2003 From: arigo at tunes.org (Armin Rigo) Date: Thu, 16 Jan 2003 16:46:45 +0100 (CET) Subject: [pypy-dev] From PyPy to Psyco Message-ID: Hello everybody, Ok, sorry about not posting earlier on this list. Sorry too for the long e-mail; I wanted to reply to individual e-mails, but it seems that the subjects always overlap. So I'm just dropping my thoughts here. Bootstrapping issues where quite widely discussed. There are several valid approaches in my opinion. I would say that we should currently stick to "Python in Python"'s first goal: to have a Python interpreter up and running, written entierely in Python. (1) write a Python interpreter in Python, keeping (2) in mind, using any recent version of CPython to test it. Include at least the bytecode interpreter and redefinition of the basic data structures (tuple, lists, integers, frames...) as classes. Optionally add a tokenizer-parser-compiler to generate bytecode from source (for the first tests, using the underlying compile() function is fine). Only when this is done can we invent nice ways to do interesting things on this interpreter. The approach I prefer is: (2) have a tool that can perform a static analysis of the core of (1). To make this possible this core has to be written using some restricted coding style, using simple constructs (and not full of Python tricks and optimizations; something like Rocco describes). (3) this tool compiles the core (via C code) into several different things: (3a) the classical approach (like e.g. in smalltalk) is to emit C code that is very similar to the original Python code, much like Pyrex does. We obtain a stand-alone minimal interpreter. 
By interpreting the rest of (1), we get an already nice result: a stand-alone complete Python interpreter with a small amount of C -- and, more interestingly, almost all this C code was itself automatically generated from Python code. To bootstrap the tokenizer-parser-compiler written in unrestricted Python, simply provide a bytecode-precompiled version of it along with the C file. Keep in mind that both the C files and the precompiled .pyc files are intermediate files, automatically regenerated as often as needed, and only distributed for bootstrapping purposes if we want to be independent from CPython. (3b) among other things that can be generated from (2) are a bytecode checker, which CPython currently lacks and Guido sometimes thinks would be a nice addition to prevent 'new.code()' from crashing the interpreter. (I already experimented with this, it can work.) (Note that we are not forced a priori to choose the same set of bytecodes as CPython, but doing so is probably a good idea at this stage.) (3c) Psyco stuff: given a few extra hints, the code generated from (2) can be Psyco itself. I mean, I am not thinking about introducing the current C-based Psyco into the play to execute (1) faster. This would at best give the same performance as CPython --- and I seriously doubt it can be as fast as that, given that Psyco's limited back-end cannot compete with serious C compilers. Instead, we can generate C code which constitutes a new Psyco for the language that (1) is an interpreter for. In short, we would have translated our interpreter into a specializing compiler. What is nice, of course, is that the same would also work if (1) were an interpreter for a different language. This is not magic; specialization is known to be a tool that can translate any interpreter into a compiler. What's new here is the dynamic nature of the choice of what is compile-time and what is run-time. For (3b) and (3c) I am thinking about emitting C code, but this is not a requirement: it would be possible to emit Python code too, for example to build a Python-based bytecode checker. But if we did the static type analysis in (2), we might just as well emit C code to keep the discovered static type declarations. Related points: * Platform-specific bootstrapping tools. Looks like I favour the idea that everything is managed by the tool in (3a) (which is itself in Python, of course). This tool could emit platform-specific C code when possible, or (for redistribution) a generic low-quality platform-independent version that would suffice to run (3a) again. As the C code is not really meant to be compiled more than once we can avoid 'make' tools --- for all I care it could even be a single large C file, as it is not manually managed anyway. * Representation of data structures. Use Python classes, e.g. integers are implemented using a "class PyIntObject" with an ob_ival attribute (or property). These classes are straightforwardly translated into a C struct by (3a). The structure can be made compatible with CPython's PyIntObject, or alternatively can be built for a GC-based interpreter with no reference counting, if we wanted to try such a thing. * Foreign Function Interface (a.k.a. calling external C functions). Two approches here. I believe that in a first stage it is sufficient to emit the real calls in our C code, by translating some static Python declaration with (3a). All such callable functions must be pre-declared to get compiled into the stand-alone core interpreter. 
This doesn't give us dynamic call features like those offered by some existing CPython C extensions. But a form of dynamic calls is necessary for Psyco --- it must at the very least be able to emit machine code that calls arbitrary C functions. So I'm trying to push this whole issue into the context of emitting machine code by Psyco: when this works it should not be a problem to expose some of the techniques in a built-in extension module to let end users call arbitrary C functions. Before that I don't think there is a need for enhanced 'struct', 'calldll' and 'ctypes' modules. * Psyco as a compiler. If no C compiler is available, Psyco can be used to emit statically-known code; I guess we can bootstrap a whole Python interpreter without even leaving CPython, just by having Psyco's back-end do the (3a) or even (3c). That's nice, although I cannot think of a real use for this :-) * CPython. The above plan only uses CPython for its ability to run Python programs and for the hopefully shrinking number of features that are not re-implemented in (1). The CPython source code is used for inspiration, not compilation. * Python Virtual Machine. In the above plan there is no need for a small Python VM in C for bootstrapping. * Assembly Virtual Machine. Something else that Christian mentioned was a low-level VM that would provide a cool target for emitted machine code. For all static stuff I see C as a nicer low-level target, but for Psyco it might be interesting to have a general platform-independent target. Another cool Psyco thing would be to never actually emit code in a first phase, but only gather statistics that would later let a specialized C version of parts of the program be written and compiled statically. There are so many cool things to experiment with; I can't wait to have (1) and (2) ready --- but I guess it's the same for all of us :-) A bientot, Armin. From arigo at tunes.org Thu Jan 16 16:58:33 2003 From: arigo at tunes.org (RIGO Armin) Date: Thu, 16 Jan 2003 16:58:33 +0100 Subject: [pypy-dev] Stuff that already exists in Python In-Reply-To: <3E222BD5.8050601@tismer.com> References: <5.0.2.1.1.20030112132955.00a6d800@mail.oz.net> <200301130237.h0D2bNj08426@pcp02138704pcs.reston01.va.comcast.net> <3E222BD5.8050601@tismer.com> Message-ID: <20030116155833.GA20476@magma.unil.ch> Hello Christian, On Mon, Jan 13, 2003 at 04:00:37AM +0100, Christian Tismer wrote: > This also touches one of my weak points: > Is it good to replace C code by something > which is depending on another special module > like sre? > This makes sre into something crucial to this > re-implementation. But I'm not sure if this > is a good way to go. As far as performance is concerned, it could be at some stage re-implemented in pure Python. Character handling may be slow but it is a good target for Psyco optimization. > I'm also not sure if it is good to re-implement > certain modules using the common Python tricks > and optimizations. To some extent, I have the > impression that doing it the simple way, basically > as done in C, would fit the optimizations of > Psyco better. But that is an open question > until we get some feedback from Armin Rigo. That's right; the "simple way" is what Psyco can best work with. It does not prevent the definition of abstract classes that encapsulate the common data structures; on the other hand, these classes (with suitable information like types and mutability) are quite useful for Psyco, as you already mentioned.
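As a tiny illustration of what such a class written in the "simple way" could look like (a sketch only; the class and attribute names follow the PyIntObject/ob_ival example used elsewhere in this thread, everything else is made up):

    class PyIntObject:
        # a plain attribute and no tricks, so a static analyzer (or Psyco)
        # can easily see that ob_ival always holds a machine-sized integer
        def __init__(self, ival):
            self.ob_ival = ival

    def int_add(v, w):
        # overflow handling deliberately left out of this sketch
        return PyIntObject(v.ob_ival + w.ob_ival)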
Armin From tismer at tismer.com Thu Jan 16 18:14:23 2003 From: tismer at tismer.com (Christian Tismer) Date: Thu, 16 Jan 2003 18:14:23 +0100 Subject: [pypy-dev] Re: [ann] Minimal Python project In-Reply-To: <00b501c2bd5e$c4078d80$6d94fea9@newmexico> References: <7xbs2k3pwp.fsf@ruckus.brouhaha.com><7x1y3dehrr.fsf@ruckus.brouhaha.com> <3E26A4F9.3010009@tismer.com> <00b501c2bd5e$c4078d80$6d94fea9@newmexico> Message-ID: <3E26E86F.2050002@tismer.com> Samuele Pedroni wrote: > From: "Christian Tismer" [tagged object references] >>Can you give me a hint how this would be done? >>I have no experience with tagged representations, >>but if this can save so much, I should learn more >>about it. > > this is a relevant survey paper: > > http://citeseer.nj.nec.com/gudeman93representing.html > > Representing Type Information in Dynamically Typed Languages Oh, well, sorry, now I remember. Sure, I also remember that I was thinking loud about boxing variables and tagging addresses a few years ago. Can't find it any longer, but as I remember, somebody was not convinced that tagging would be great for Python. Maybe TimBot? I know a couple fo interpreters which use tagging, and Guile for instance makes extensive use of it. This technique is kind of data compression by putting type info into an unused pattern of addresses. It can avoid lots of allocations, but also turns every object access into some macro processing. for a language that I have to code by hand, I don't like that so much. But when we have control over everything, like in this project, it makes at least sense to try an alternative object model, by the newly gathered flexibility. For the bootstrap phase, I'd say hands off from any changes of the model. First we redefine the world identically "in place". When that works, we can try changing worlds. > integer in the range [-2**30,2**30-1] would be represented by their bit > patterns shifted left by 1 bit > > all other objects would be boxed, and the address left shifted by 1 and ORed > with 1. Humm. I would use the left shifted integer, or'ed with 1. After testing that bit, Arithmetic right shift gives the integer. That allows to use the address variant without any operation and has the nice debugging facility that addresses are already addresses, and object inspection works directly. ... > Another approach is to carry data around as a struct where both the tag and the > datum are machine words, this approach is AFAIK currently used in the Gwydion > Dylan compiler (www.gwydiondylan.org) originally developed at CMU. This makes it impossible to get an object as a result value of a function. You need references all the time. Also, container objects will double their size. I think, both can be tried, later, by changing only a little Python code that is responsible for the representation of this. Getting completely rid of reference counting in favor of a classical GC is also a challenging path which cannot be tried with CPython at all. The best layout will win the game. I don't know in advance, so let's create the thing flexible enough to keep all paths open. cheers - chris p.s.: Paul, would you mind to participate in pypy-dev, then we can avoid crossposting. -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! 
Take a ride on Python's Johannes-Niemeyer-Weg 9a : *Starship* http://starship.python.net/ 14109 Berlin : PGP key -> http://wwwkeys.pgp.net/ work +49 30 89 09 53 34 home +49 30 802 86 56 pager +49 173 24 18 776 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? http://www.stackless.com/ From guido at python.org Thu Jan 16 18:17:00 2003 From: guido at python.org (Guido van Rossum) Date: Thu, 16 Jan 2003 12:17:00 -0500 Subject: Fw: [pypy-dev] Re: [ann] Minimal Python project In-Reply-To: Your message of "Thu, 16 Jan 2003 15:35:17 +0100." <00d101c2bd6c$7d422640$6d94fea9@newmexico> References: <00d101c2bd6c$7d422640$6d94fea9@newmexico> Message-ID: <200301161717.h0GHH0F12190@odiug.zope.com> > From: "Michael Hudson" > > > > > integer in the range [-2**30,2**30-1] would be represented by their bit > > > patterns shifted left by 1 bit > > > > > > all other objects would be boxed, and the address left shifted by 1 and ORed > > > with 1. > > > > I seem to recall someone tried that with python quite recently and > > found it was a pessimization (search python-dev, I guess). > > > > I'm not sure that's what Paul meant, though. > > he wrote tagged representation, and there was an e.g. in front of the example > above, I don't claim it does wonder, even less together with psyco or with > reference counting. We did something like this in ABC 20 years ago and IMO it was a recipe for disaster. We kept finding more places where we had to check pointers before dereferencing them. Admitted, the scheme proposed here may not have the same problem because all pointers are shifted too. But I seem to recall that the slight gain in memory usage wasn't worth the code complexity. Caching 100 small ints probably goes a long way for typical programs. --Guido van Rossum (home page: http://www.python.org/~guido/) From phr-2002 at nightsong.com Thu Jan 16 19:41:14 2003 From: phr-2002 at nightsong.com (Paul Rubin) Date: 16 Jan 2003 18:41:14 -0000 Subject: [pypy-dev] Re: [ann] Minimal Python project Message-ID: <20030116184114.7948.qmail@brouhaha.com> It can avoid lots of allocations, but also turns every object access into some macro processing. for a language that I have to code by hand, I don't like that so much. It's not really that bad. Most of the time, the macro doesn't do anything. In fact, wouldn't the pypy compiler take care of it automatically? The macro is only an issue in the low level C interpreter. For the bootstrap phase, I'd say hands off from any changes of the model. First we redefine the world identically "in place". When that works, we can try changing worlds. I'd think the object representation is pretty fundamental, so once you choose something, it will take a lot of rework to change it. Hopefully you can prepare for that by writing flexibly. > all other objects would be boxed, and the address left shifted by 1 > and ORed with 1. Humm. I would use the left shifted integer, or'ed with 1. After testing that bit, Arithmetic right shift gives the integer. That allows to use the address variant without any operation and has the nice debugging facility that addresses are already addresses, and object inspection works directly. Yes. Also, you can add the shifted integers together without having to unshift or mask them. You can also use 2 or 3 tag bits, and encode other types in a single word. For example, Python probably uses an awful lot of 1-character strings, so boxing those could do some good. 
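For instance, with the shift-left-by-one encoding discussed above, two tagged integers can be added directly, since (a << 1) + (b << 1) == (a + b) << 1. A quick illustrative check in Python (ignoring overflow):

    def tag(n):
        return n << 1      # low bit 0 marks an unboxed integer

    a, b = 1234, -567
    assert tag(a) + tag(b) == tag(a + b)   # no unshifting or masking needed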
Olin Shivers' article about T is pretty inspirational--you might look at it: . p.s.: Paul, would you mind to participate in pypy-dev, then we can avoid crossposting. I guess I can live with this, though I'd rather use the newsgroup. Is crossposting so hard? If you want to put me on a mailing list, please use the address phr-pypy at nightsong.com so I can auto-route it. (Btw I don't like the name pypy much. There are several others in that thread which I like a lot better). There's another guy I'd also like to invite, a Lisp expert, if that's ok with you. He's been interested in writing a Python compiler for a while. I'll ask him if he wants to join, but he might not. How much traffic is on the list? Paul From droper at lineone.net Thu Jan 16 20:00:42 2003 From: droper at lineone.net (David Roper) Date: Thu, 16 Jan 2003 19:00:42 -0000 Subject: [pypy-dev] Objective of minimal python Message-ID: I've lurked on the periphery of the discussions so far and must admit to being somewhat lost as to what exactly is being proposed. Am I correct in believing the following? 1. That the objective of the project is to create a new implementation of python that: i. Has a smaller memory footprint; and ii. runs applications faster than the current interpreter; and iii. is well suited to applications making significant use of lightweight threads and continuations. 2. To achieve this it is intended to rewrite the 'front-end' of the Python interpreter, specifically the language parser, lexical analyser and compiler / byte-code generator, in python itself. 3. The byte-code interpreter (is this the same as the virtual machine?) is a program that reads the byte-code stream as data and processes it to do 'real work'. In order to meet objective (iii), it is intended to write a new VM that employs structure and mechanisms seen in the 'stackless' implementation. There are three possible implementations of the VM: i. It could be written in 'C' and call the 'C' runtime library to execute functions implied by the byte-codes, e.g. allocation of space, arithmetic functions etc. Such an approach could draw heavily upon the code base of the 'execution engine' (I don't know what it's really called) of the existing VM. ii. It could be written in any language (including, but not limited to, 'C') and convert the byte-code stream into 'C' (or, indeed, any other statically compiled language), to be passed off to a compiler for compilation, linking and subsequent execution. The advantage of this approach is that it can exploit the considerable advances that compiler writers have made in code optimisation. I presume that the reason that the byte code, rather than python source code, is converted to 'C' is to be able to reuse existing compiled classes for which one does not have access to the source. iii. It could emit machine-specific assembly language instead of 'C'. If I understand correctly, this is what Psyco does. The advantage of this approach is that the code emitted can be modified as the execution of the program (byte-code stream) evolves, thus permitting optimisations to be performed that cannot be achieved in a statically compiled language. The disadvantage is that it would be a very considerable effort generally to produce assembly language that is as highly optimised as a good optimising compiler might achieve. I'd be most grateful for your comments.
Regards, David Roper mailto:droper at lineone.net From tismer at tismer.com Thu Jan 16 20:44:15 2003 From: tismer at tismer.com (Christian Tismer) Date: Thu, 16 Jan 2003 20:44:15 +0100 Subject: [pypy-dev] Re: [ann] Minimal Python project In-Reply-To: <20030116184114.7948.qmail@brouhaha.com> References: <20030116184114.7948.qmail@brouhaha.com> Message-ID: <3E270B8F.90205@tismer.com> Paul Rubin wrote: > It can avoid lots of allocations, but also turns every object access > into some macro processing. for a language that I have to code by > hand, I don't like that so much. > > It's not really that bad. Most of the time, the macro doesn't do > anything. In fact, wouldn't the pypy compiler take care of it > automatically? The macro is only an issue in the low level C > interpreter. Sure. But please see Guido's reply on the list. I think I've read a similar message long ago, so I won't touch this now. > For the bootstrap phase, I'd say hands off from any changes of the > model. First we redefine the world identically "in place". When > that works, we can try changing worlds. > > I'd think the object representation is pretty fundamental, so once you > choose something, it will take a lot of rework to change it. Hopefully > you can prepare for that by writing flexibly. I think to try this from the beginning. ... [more tags bits... I know :-)] > p.s.: Paul, would you mind to participate in pypy-dev, > then we can avoid crossposting. > > I guess I can live with this, though I'd rather use the newsgroup. > Is crossposting so hard? If you want to put me on a mailing list, > please use the address phr-pypy at nightsong.com so I can auto-route it. > (Btw I don't like the name pypy much. There are several others > in that thread which I like a lot better). Please subscribe at http://codespeak.net/mailman/listinfo/pypy-dev > There's another guy I'd also like to invite, a Lisp expert, if that's > ok with you. He's been interested in writing a Python compiler for a > while. I'll ask him if he wants to join, but he might not. Sure he's welcome. Being on the list does btw. not imply to have to join the sprint :-) > How much traffic is on the list? Well, 128 messages by now. cheers - chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Johannes-Niemeyer-Weg 9a : *Starship* http://starship.python.net/ 14109 Berlin : PGP key -> http://wwwkeys.pgp.net/ work +49 30 89 09 53 34 home +49 30 802 86 56 pager +49 173 24 18 776 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? http://www.stackless.com/ From vlindberg at verio.net Thu Jan 16 21:38:58 2003 From: vlindberg at verio.net (VanL) Date: Thu, 16 Jan 2003 13:38:58 -0700 Subject: [pypy-dev] Objective of minimal python References: Message-ID: <3E271862.40902@verio.net> 1. That the objective of the project is to create a new implementation python that: i. Has a smaller memory footprint; and ii. runs applications faster than the current interpreter; and iii. is well suited to applications making significant use of lightweight threads and continuations. 2. To achieve this it is intended to rewrite the 'front-end' of the Python interpreter, specifically the language parser, lexical analyser and compiler / byte-code generator, in python itself. 3. The byte-code interpreter (is this the same as the virtual machine?) is a program that reads the byte-code stream as data and processes it to do 'real work'. 
In order to meet objective (iii), it is intended to write a new VM that employs structure and mechanisms seen in the 'stackless' implementation. From what I have read so far, this summary is incorrect. (Any corrections to what I am saying are appreciated, tho!) Here is what I understand: 1. The objective of the project is to create a new implementation of python that: i. Has as small and simple a C core as possible ii. Uses python itself to provide as much of the functionality of the "standard" CPython implementation as possible. 2. This new python implementation will not necessarily be stackless, nor be particularly fast. The idea is -- at least at first -- to produce a complete python implementation *in python* that then can be optimized using various dynamic techniques (such as those in psyco) or easily extended with new language ideas (like stackless). 3. A possible side-effect of implementing python in python is that python may become impressively modular -- leading to its use in small and embedded platforms. However, this is a possible side-benefit, and is not the primary focus of this effort. 4. Pysco and pyrex are possible starting places for the small C core. In fact, this is much more of a microkernal-python than a minimal python. VanL From tismer at tismer.com Fri Jan 17 02:46:10 2003 From: tismer at tismer.com (Christian Tismer) Date: Fri, 17 Jan 2003 02:46:10 +0100 Subject: [pypy-dev] Lessons From (Limited) Experience In-Reply-To: <71A90F65.7DB3B5EB.9ADE5C6A@netscape.net> References: <71A90F65.7DB3B5EB.9ADE5C6A@netscape.net> Message-ID: <3E276062.2040605@tismer.com> Rocco Moretti wrote: ... > My plan was "simply" to take the CPython code and convert > it into Python code, minimizing the use of "high level" features > like lambda, generators, etc., so that a small core could easily > be hand compiled into the language du jour. Eventually, a reduced > dialect of Python might be employed for the core (interpreter loop > and basic objects) such that a simple compiler could be built for it. Very true. ... > Unfortunately, it was *extremely* unclear where PyPython ended and > the host system began. In my final incarnation, the PyPython objects > with external visibility were referenced from a global definition > file (a'la python.h), where they were simply imported wholesale > from __builtins__. It is one of my fears when rewriting Python in Python: How often will I fail to recognize in which system I am? There was a two-part TV film like "world at the wire" in Germany, late 70's, where the level of simulation also was not always clear :-) > But the problem which stymied me was the issue of what to do with > Exceptions. I discarded the concept of copying the "return Null" > technique of CPython - having to check every return value for an > error condition seems so unpythonic. It does seem so if you have to write that crap by hand. When generating code, this is no issue at all. I wasn't sure for myself about this problem. Then I read Armin's long reply to everything, and I found one thing of his generated-c-approach appealing: Everything is turned back into generated C, and there is no question about exceptions: Surely they would be built like before. Now, why do we have a problem with it in an interpiler that does not create C code? I think we do lack experience. ... > I agree with using cross compilers in the bootstrap proceedure > and with being "stackless" - two features I was going to apply. 
> One additional thing that came about as a direct result of Python's > lack of a switch statement: I replaced the switch statement with an > array of references to functions implementing the opcodes. This is most natural if you don't have a case construct. If I ever whished to add such a thing, then for this project :-) > (I was > going for clarity over speed) One offshoot of this is the potential > to have multiple opcode schemes in the same interpreter. (Like > Python 1.5.2 and Python 2.2 - or better yet, Python and Java) Absolutely, this can be done if it makes enough sense. > How does that work with parameter/return value passing and potential > psyco optimization? Don't know - didn't get that far. After Armin's recent post, it seems to be true that Psyco will get a complete rewrite, with a whole lot of new ideas. Armin seems to be eager to try different object layouts as well, with/without refcounts, with classical GC, ..., so I believe the new, Python-based Pysco will be very capable. > I might not make the sprint, but I'd be glad to help as I can, > Rocco Moretti There will be a second sprint, right before EuroPython. At least, Guido proposed that, probably enough to make it doubtlessly happen. ciao - chris From logistix at zworg.com Fri Jan 17 05:01:09 2003 From: logistix at zworg.com (Grant Olson) Date: Thu, 16 Jan 2003 16:01:09 -1200 Subject: [pypy-dev] Concrete Syntax Tree Message-ID: <200301170401.h0H419F9002072@overload3.baremetal.com> >Grant Olson wrote: >> I wrote some rough python code a few months ago that built Concrete >> Syntax Trees from the ?grammar? file in the Python source, and also >> turned token streams into Abstract Trees based on the Concrete Trees. >> The parser module successfully compiled the Abstract Trees. Is this >> something you guys would be interested in or am I off base here? > >I would love to read that, of course! > >yours - chris Available at: http://members.bellatlantic.net/~olsongt/concrete.zip A little bit slower than the builtin parser ;) --------------------------------------- Get your free e-mail address @zworg.com From arigo at tunes.org Fri Jan 17 13:59:20 2003 From: arigo at tunes.org (RIGO Armin) Date: Fri, 17 Jan 2003 13:59:20 +0100 Subject: [pypy-dev] Re: [ann] Minimal Python project In-Reply-To: <20030116184114.7948.qmail@brouhaha.com> References: <20030116184114.7948.qmail@brouhaha.com> Message-ID: <20030117125920.GA699@magma.unil.ch> Hello Paul, On Thu, Jan 16, 2003 at 06:41:14PM -0000, Paul Rubin wrote: > I'd think the object representation is pretty fundamental, so once you > choose something, it will take a lot of rework to change it. Only if you code directly in C. By keeping at the high level all the time and using tools to emit the low-level stuff, you can experiment with zillons of variants. Surely, among the things I'd like to try are various forms of boxing, the double-pointers representation (Christian, in GCC you can return a struct from a function and if its size is only two pointers then the function will just return them in two registers), all with or without reference counting. We might also try emitting code for another statically typed language which already has a good GC and some boxing, like OCaml. We could even target Java too, reusing the nice work on Jython but emitting our own interpreter core -- which could make it much easier to keep Jython in sync with CPython. I'm not trying to impress people, all these are things that can clearly be done with reasonably little work. A bientot, Armin. 
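One way to picture the freedom described above -- this is only an invented sketch, not Armin's design, and every class name in it is made up -- is an interpreter core written once against an abstract object representation, with the concrete strategy (boxed with reference counting, unboxed, tagged, ...) chosen only later, before any low-level code is emitted:

class Representation:
    """Abstract strategy for representing interpreter-level integers."""
    def box(self, value):
        raise NotImplementedError
    def unbox(self, obj):
        raise NotImplementedError

class RefcountedBox(Representation):
    """CPython-like: a heap box carrying a reference count."""
    def box(self, value):
        return {'refcount': 1, 'ob_ival': value}
    def unbox(self, obj):
        return obj['ob_ival']

class UnboxedValue(Representation):
    """No box at all: small integers are passed around directly."""
    def box(self, value):
        return value
    def unbox(self, obj):
        return obj

def interp_add(rep, a, b):
    # the interpreter core is written once, against the abstract interface;
    # swapping 'rep' changes the object layout without touching this code
    return rep.box(rep.unbox(a) + rep.unbox(b))

for rep in (RefcountedBox(), UnboxedValue()):
    three, four = rep.box(3), rep.box(4)
    assert rep.unbox(interp_add(rep, three, four)) == 7

The point is only that the representation stays a parameter until code-emission time, which is what makes trying many variants cheap.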
From arigo at tunes.org Fri Jan 17 14:28:59 2003 From: arigo at tunes.org (Armin Rigo) Date: Fri, 17 Jan 2003 14:28:59 +0100 Subject: [pypy-dev] Lessons From (Limited) Experience In-Reply-To: <3E276062.2040605@tismer.com> References: <71A90F65.7DB3B5EB.9ADE5C6A@netscape.net> <3E276062.2040605@tismer.com> Message-ID: <20030117132859.GA1060@magma.unil.ch> Hello Christian, hello Rocco, Nice to hear about your project! On Fri, Jan 17, 2003 at 02:46:10AM +0100, Christian Tismer wrote: > [...exceptions...] > and I found one thing of his generated-c-approach > appealing: Everything is turned back into generated > C, and there is no question about exceptions: > Surely they would be built like before. Yes. I think that a good way to know the limit between the levels is, at first, by closely following the original C code --- but in essence only; constructions like the "return NULL" thing must clearly be ruled out, and replaced by higher-level Python constructs. Just as we are thinking about a "PyObject" base class for all Python objects, we should define an "EPython" exception that we can throw to signal a Python-visible exception, with whatever Python-visible exception we want being specified as attributes in the EPython instance. The EPython exception is caught in the main loop of the interpreter, at the point where CPython catches the cascade of "return NULLs". We can then generate CPython-like C code by adding NULL tests everywhere, or try alternatives (e.g. use the ANSI C setjmp()/longjmp() functions). There are lots of other variations on this theme. It would be possible and probably easy to generate Continuation-Passing-Style C code for the whole interpreter core, i.e. recreate Christian's old Stackless implementation for free. I guess that's something he has already considered :-) This can be done without having to write MiniPy in any special style in the first place. A bientot, Armin. From edream at tds.net Fri Jan 17 15:45:34 2003 From: edream at tds.net (Edward K. Ream) Date: Fri, 17 Jan 2003 08:45:34 -0600 Subject: [pypy-dev] Leo outlines & psyco Message-ID: <000901c2be37$17e4e230$bba4aad8@computer> I have just uploaded to Leo's web site two outlines that may be of interest to this project. 1. pysco.leo (in pysco.leo.zip) is a Leo outline I have been using to study psyco. It contains the code for psyco, and the code for the main loop of the C interpreter. Notice how using sections makes the structure of both the C interpreter and the corresponding psyco compiler clear. It also contains a section allowing you to compare, on a case-by-case basis, the cases of the psyco "compiler" with the cases of the C interpreter's main loop. This shows just some of the power of clones...NOTE: This outline is based on code that is several months old. I include it mostly to show how Leo could be applied to this project. 2. CC2.leo (in CC2.leo.zip) contains most of the C source code for an optimizing C compiler, assembler, linker and loader written in the early 90's for the now-defunct Tuple corp. I am co-owner of this code (not that it has any commercial value). Again, I include this outline to show how useful Leo would be in a compiler project. Examples: a. The documentation section in CC2.leo contains all the documentation in a single place, and parts of the documentation are cloned and placed near the code to which the documentation pertains. b. If I were writing this code today I would use sections extensively to clarify the code. 
For the most part, this code is exactly as produced from Leo's import script, so it isn't nearly as clear as it could be. However, I have reworked the complex tokize routine (in the @file CCtokize.c tree in the "Tokens & preprocessor" section to show the kinds of things that can be done. c. CC2.leo contains a _huge_ amount of code and documentation. With Leo it is easy to use clones to focus on the task at hand and ignore everything else. You may find both pysco.leo.zip and CC2.leo.zip at: http://sourceforge.net/project/showfiles.php?group_id=3458 under the headings: Miscellaneous: CC2 & Psyco. In order to use these outlines (.leo files) you will need a copy of leo.py, which you can find at the url above under the headings: Leo2.py: 3.10. If you have Windows you can just download the single-click installer, leosetup.exe. Otherwise, please download leo-3.10.zip. I am no salesman, and I am always leery of touting Leo. However, I do think that Leo would really contribute to this project. I hope no one objects to this posting. If so, I apologize in advance :-) Edward P.S. I think using Leo for the primary source for a project like this makes sense. Leo is ideally suited to managing and clarifying complex code. Moreover, I would like to see Leo used (eventually, not now) to manage the entire Python project. However, at present there are issues with using Leo with CVS, so I think it makes sense to use Leo on a strictly experimental basis on an experimental project like this. It will be very easy to use Leo on this project. If there is any interest I shall explain the simple rules for doing so. I believe the cvs issues with Leo will largely be resolved in about two months when Leo 4.0 comes out. P.P.S For the best example of what Leo can do, look at Leo's own source code, LeoPy.leo, included in all distributions of Leo. In particular, look at the (Project Views) section of LeoPy.leo to see how to use Leo's clones to keep track of tasks (like bugs and other ongoing issues). EKR -------------------------------------------------------------------- Edward K. Ream email: edream at tds.net Leo: Literate Editor with Outlines Leo: http://personalpages.tds.net/~edream/front.html -------------------------------------------------------------------- From mwh at python.net Fri Jan 17 17:24:02 2003 From: mwh at python.net (Michael Hudson) Date: Fri, 17 Jan 2003 16:24:02 +0000 (GMT) Subject: [pypy-dev] Re: psyco in Python was: Minimal Python project In-Reply-To: <005801c2be3d$ac15af60$bba4aad8@computer> Message-ID: On Fri, 17 Jan 2003, Edward K. Ream wrote: > Hi Michael, > > I'm going to copy this to pypy-dev so that other people can see this > discussion. OK. > > > > > I just don't see how to implement psyco in Python in a production > > > > > package without using the C interpreter as a bootstrap! And yes, I > > > > > suspect that a C language version of psyco will be faster than psyco > > > > > in Python. > > > > > Even after you've run psyco over itself? > > > > > > Yes, maybe even then. C compilers should do better _on the same > > > algorithm_ on hand-written code than can any GIT. > > > Weeeeellll, this is the thing that gets me jumping-up-and-down-excited > > about things like psyco: pysco has MORE information than the C compiler to > > work with: while the C compiler only knows the algorithm to run, psyco > > knows both the algorithm *and the data it is to run on*. 
Obviously, this > > approach only wins if what you do next resembles what you did last, but I > > think CPU caches have proved there is some mileage in that idea :) > > I don't think CPU caches have much bearing on this discussion. Caches speed > up frequently used values. The problems here are not the same. That wasn't my point. CPU caches derive their benefit (in part) from the assumption that what a program did recently, it will do again. psyco's specializations derive their benefit from the same assumption. Also, I'm wasn't so much talking about psyco-as-is, more psyco-as-may-be, which is one of the reasons I'm trying to stop posting stuff like this... > The more I study psyco, the more I am convinced that there is _no way_ that > psyco will ever produce better code than a C compiler. In brief, my reasons > are as follows: > > 1. psyco has, in fact, no better information than does a C compiler. > Remember that the "values" that psyco specializes upon are in fact > _interpreter_ vars, so knowing those values is equivalent to knowing the > types of operands. This does indeed allow psyco to emit machine code that > is better than the equivalent C interp generic code. In fact, though, most > of what psyco is doing is quite routine: replacing bytecode by assembly > language. This will speed up the code a _lot_, but it is doing just like a > C compiler does. You've clearly spent more time looking at pysco lately than me: I don't understand the first half of this paragraph! Please *don't* feel obliged to educate me, I wouldn't have enough time to do anything with the new knowledge... > 2. I have looked again at the code generators of the optimizing C compiler I > wrote about 10 years ago. I can find no instance where knowing the value > of an operand, rather than just its type, would produce better code. ? Surely in compiling if (x < 10) { ... } else { ... } knowing that x is less than 10 only 5% of the time would help? I'm pretty sure there are compilers which allow you to hint the outcome of any given test, so there must be *some* use for such data. > Yes, my compiler did constant folding (which eliminates code rather than > improves it), but I do not believe doing constant folding at runtime is > going to improve matters much, and is in fact very likely to slow down > the running times of typical programs, for reasons given below. It seems constant folding isn't much of an issue in Python, any way. > 3. It's easy to get seduced by the "psyco has more knowledge" mantra. In > fact, one mustn't lose track of the obstacles psyco faces in outperforming C > compilers. I sincerely hope I never gave the idea that this stuff would be easy. > First, and very important, everything that psyco does must be done at > run time. Well, yeah. > This means that even very effective optimizations like peephole > optimizations become dubious for psyco. Second, and also very > important, is that fact that psyco (or rather the code that psyco > produces) must determine whether the code that psyco has already > produced is suitable for the task at hand. This is a more subtle issue. OTOH, the seem to be benefits in this kind of caching from algorithms that people implement IN SILICON. I'm thinking of things like the P4's trace cache here. It staggers me that that thing does enough to help, but I gather it does. 
> The "Structure of Psyco" paper says this: "Dispatching is the set of > techniques used to store a trace of the compiled code buffers and _find if > one already matches the current compiler state_" (emphasis mine) In other > words, it's not enough to specialize code! Psyco must determine whether the > types used to compile a function match the types presently given to psyco. > This searching is done in the "compatible" routine, and no matter how it is > done is _pure overhead_ compared to the code generated by a C compiler. Well, hopefully you can hoist these choices of specialization out of any core loops. 80/20 rules and so on. It seems fairly likely that psyco will be of most benefit when the dispatcher is invoked least often, i.e. psyco should work with "fairly" large lumps of code. Though of course there are trade-offs here, as everywhere. > As I see it, both these issues are "gotchas". Neither can be eliminated > completely, and each introduces overhead that will make the code emitted by > psyco clearly inferior to the code produced by any decent C compiler. In > particular, the more psyco specializes on runtime values of operands the > bigger the code bloat and the more severe the problem of finding out what > already-compiled code to use. Yes. Fun with the I-cache, too. > 4. I have seen _nothing_ in psyco's algorithms or the code emitted by psyco > that leads me to believe that psyco is "doing enough work" to outperform a C > compiler. Sorry, OF COURSE it's not doing enough work NOW. If it is from my postings that you have got the suggestion that anyone thinks that pysco can beat a decent C compiler today, I am sincerely sorry, and apologise to you and anyone else who might have gotten the same impression. > When I was a mathematics graduate student 30 years ago one of my professors > made the remark that a mathematician need to develop a feel for the level of Hey, I'm a mathematics graduate student *now*! > work required to do a new proof. Deep insights are not going to come from > elementary methods, Hmm. > and there is no use in using deep methods to prove elementary results. More hmm. Not sure my disagreements with those statements are relavent here, though. [...] > 5. Another way to see that psyco isn't really doing much is the following. > Basically, psyco emits calls to the same runtime helpers that the C > interpreter does. Yes, psyco may improve the calling sequences to these > helpers, but it does not eliminate calls to those helpers. I thought the goal of this project was to reduce the number of these helpers... > But a C compiler will never do worse that the code in the helpers, and > will often do _much_ better. > > I challenge this group to provide even one example of Python code, for which > psyco will emit better code than that produced by a C compiler on a > transliteration of that code into C. I am confident that none such exists. > I believe no such counter-example will ever be found. For today's psyco, yes. For tomorrow, who knows? It's not like C is an ideally designed language for writing high performance applications in (here I'm thinking of things like the need for the restrict keyword in C99). [...] > P.S. I trust that this group will interpret my remarks as constructive. I > believe it is important at the beginning of any project to have proper goals > and expectations. Make no mistake: I believe this project has the potential > to be extremely valuable, even if my arguments are correct. And I would > indeed be happy to be _proved_ wrong. 
I also hope that my comments don't produce undue enthusiasm. We don't have a single line of code yet! [...] Something else, to finish. Do you know of HP's dynamo project? http://www.arstechnica.com/reviews/1q00/dynamo/dynamo-1.html is *fascinating* reading. They were investigating issues of dynamic binary translation and started just by "translating" PA-RISC code to PA-RISC code. Some optimizations later, and they noticed that the "translated" code ran sometimes ran 20% faster than the original! Basically because of trace caches. Cheers, M. From edream at tds.net Fri Jan 17 17:34:40 2003 From: edream at tds.net (Edward K. Ream) Date: Fri, 17 Jan 2003 10:34:40 -0600 Subject: [pypy-dev] Re: psyco in Python was: Minimal Python project Message-ID: <008701c2be46$559bfeb0$bba4aad8@computer> Somehow it seems like Michaels reply got to the mail server before my original message. Here it is again. My apologies if this is redundant: Hi Michael, I'm going to copy this to pypy-dev so that other people can see this discussion. > > > > I just don't see how to implement psyco in Python in a production > > > > package without using the C interpreter as a bootstrap! And yes, I > > > > suspect that a C language version of psyco will be faster than psyco > > > > in Python. > > > Even after you've run psyco over itself? > > > > Yes, maybe even then. C compilers should do better _on the same > > algorithm_ on hand-written code than can any GIT. > Weeeeellll, this is the thing that gets me jumping-up-and-down-excited > about things like psyco: pysco has MORE information than the C compiler to > work with: while the C compiler only knows the algorithm to run, psyco > knows both the algorithm *and the data it is to run on*. Obviously, this > approach only wins if what you do next resembles what you did last, but I > think CPU caches have proved there is some mileage in that idea :) I don't think CPU caches have much bearing on this discussion. Caches speed up frequently used values. The problems here are not the same. The more I study psyco, the more I am convinced that there is _no way_ that psyco will ever produce better code than a C compiler. In brief, my reasons are as follows: 1. psyco has, in fact, no better information than does a C compiler. Remember that the "values" that psyco specializes upon are in fact _interpreter_ vars, so knowing those values is equivalent to knowing the types of operands. This does indeed allow psyco to emit machine code that is better than the equivalent C interp generic code. In fact, though, most of what psyco is doing is quite routine: replacing bytecode by assembly language. This will speed up the code a _lot_, but it is doing just like a C compiler does. 2. I have looked again at the code generators of the optimizing C compiler I wrote about 10 years ago. I can find no instance where knowing the value of an operand, rather than just its type, would produce better code. Yes, my compiler did constant folding (which eliminates code rather than improves it), but I do not believe doing constant folding at runtime is going to improve matters much, and is in fact very likely to slow down the running times of typical programs, for reasons given below. 3. It's easy to get seduced by the "psyco has more knowledge" mantra. In fact, one mustn't lose track of the obstacles psyco faces in outperforming C compilers. First, and very important, everything that psyco does must be done at run time. 
This means that even very effective optimizations like peephole optimizations become dubious for psyco. Second, and also very important, is the fact that psyco (or rather the code that psyco produces) must determine whether the code that psyco has already produced is suitable for the task at hand. The "Structure of Psyco" paper says this: "Dispatching is the set of techniques used to store a trace of the compiled code buffers and _find if one already matches the current compiler state_" (emphasis mine). In other words, it's not enough to specialize code! Psyco must determine whether the types used to compile a function match the types presently given to psyco. This searching is done in the "compatible" routine, and no matter how it is done, it is _pure overhead_ compared to the code generated by a C compiler.

As I see it, both these issues are "gotchas". Neither can be eliminated completely, and each introduces overhead that will make the code emitted by psyco clearly inferior to the code produced by any decent C compiler. In particular, the more psyco specializes on runtime values of operands the bigger the code bloat and the more severe the problem of finding out what already-compiled code to use.

4. I have seen _nothing_ in psyco's algorithms or the code emitted by psyco that leads me to believe that psyco is "doing enough work" to outperform a C compiler. When I was a mathematics graduate student 30 years ago one of my professors made the remark that a mathematician needs to develop a feel for the level of work required to do a new proof. Deep insights are not going to come from elementary methods, and there is no use in using deep methods to prove elementary results. I believe the same kind of analysis can (and should!) be done here. Yes, psyco is clever. In some ways it is _very_ clever, especially the idea of modeling the compiler on the structure of the original C interp. Yes, it is easy to see how psyco can improve the running speed of the present C interpreter: _any_ compiling will do so! But with respect, I dispute the assertion that knowing actual values of operands is going to produce faster code than that produced by a decent C compiler applied to an _equivalent_ algorithm done in C. Please note: it is _precisely_ this extremely strong assertion that must be true for there to be any speedup gained in translating Python's C libraries into Python! We already have C libraries that work very well. I would be _very_ leery of messing with them...

5. Another way to see that psyco isn't really doing much is the following. Basically, psyco emits calls to the same runtime helpers that the C interpreter does. Yes, psyco may improve the calling sequences to these helpers, but it does not eliminate calls to those helpers. But a C compiler will never do worse than the code in the helpers, and will often do _much_ better.

I challenge this group to provide even one example of Python code, for which psyco will emit better code than that produced by a C compiler on a transliteration of that code into C. I believe no such counter-example will ever be found. I am not much interested in anecdotal reports of Python being faster than C, and I would be happy to study in depth any purported counterexample. If you believe you have a real counterexample, please first make sure that the algorithms appear identical. Second, please try to make the counterexample as small as possible. If I am wrong, the counter-example should provide a _clear reason_ why all my arguments above are wrong.
And wouldn't that be great :-) I believe the arguments given should be given greater weight than the "psyco has more knowledge" mantra. You want to convince me otherwise? Show me the code, or provide a _detailed_ explanation for how my arguments do not apply. Edward P.S. I trust that this group will interpret my remarks as constructive. I believe it is important at the beginning of any project to have proper goals and expectations. Make no mistake: I believe this project has the potential to be extremely valuable, even if my arguments are correct. And I would indeed be happy to be _proved_ wrong. However, I think it a dubious policy to assume that Python can outperform C, for several reasons: 1. There is no need to burden this project with unrealistic expectations. Python doesn't have to beat C for Python to rule the world! :-) 2. I am concerned that assuming that we can do the impossible will skew our efforts. Yes, by all means, experiment using Python! Using C to do experiments would not be too swift :-) But lets not spend a lot of time worrying about bootstrapping until we are _sure_ that we won't be doing the final version in C! 3. There may be another way to speed up Python, namely replace (parts of) the byte code with compiled code. Armin mentioned in his papers that some ways of doing so might produce code bloat. But is that the end of the story? I think not. The same tricks that psyco uses in the "dispatcher" might well be applied to intermix machine code with byte code. I wouldn't want to focus _only_ on the "psyco way (tm)" in this project. Compile-time optimizations deserve at least some considerations, IMO. EKR -------------------------------------------------------------------- Edward K. Ream email: edream at tds.net Leo: Literate Editor with Outlines Leo: http://personalpages.tds.net/~edream/front.html -------------------------------------------------------------------- From boyd at strakt.com Fri Jan 17 17:50:48 2003 From: boyd at strakt.com (Boyd Roberts) Date: Fri, 17 Jan 2003 17:50:48 +0100 Subject: [pypy-dev] Re: psyco in Python was: Minimal Python project References: <008701c2be46$559bfeb0$bba4aad8@computer> Message-ID: <3E283468.5070905@strakt.com> Edward K. Ream wrote: >I don't think CPU caches have much bearing on this discussion. Caches speed >up frequently used values. The problems here are not the same. > I think what you mean here is frequently used code/data as CPU caches are address based, not value based; locality of reference. >2. I have looked again at the code generators of the optimizing C compiler I >wrote about 10 years ago. I can find no instance where knowing the value >of an operand, rather than just its type, would produce better code. > Zero. Many redundant instructions can be eliminated on a wide variety of architectures due to the way the value zero is handled by the CPU; results of the previous instruction may set condition codes, instructions that test for zero, ... Knowlege of values can better construct jump tables for switches. From edream at tds.net Fri Jan 17 17:51:15 2003 From: edream at tds.net (Edward K. Ream) Date: Fri, 17 Jan 2003 10:51:15 -0600 Subject: [pypy-dev] Re: psyco in Python was: Minimal Python project References: Message-ID: <00eb01c2be48$a643a910$bba4aad8@computer> I think we are in fairly complete agreement about all issues, including the need to move on. I thought these issues needed to be addressed at the start, so that the project starts off in the right direction. 
I've gotten everything off my chest: thanks for listening :-) If Python becomes faster than C, great. If not, the project will still be a success, _provided_ it doesn't waste too much time trying to do the impossible :-) Edward -------------------------------------------------------------------- Edward K. Ream email: edream at tds.net Leo: Literate Editor with Outlines Leo: http://personalpages.tds.net/~edream/front.html -------------------------------------------------------------------- From edream at tds.net Fri Jan 17 19:59:59 2003 From: edream at tds.net (Edward K. Ream) Date: Fri, 17 Jan 2003 12:59:59 -0600 Subject: [pypy-dev] Re: psyco in Python was: Minimal Python project References: <008701c2be46$559bfeb0$bba4aad8@computer> <3E283468.5070905@strakt.com> Message-ID: <000801c2be5a$a2f90cc0$bba4aad8@computer> > Knowlege of values can better construct jump tables for switches. Sure. You could get rid of the entire table. This kind of trick is why I mentioned constant folding. The key questions concerning "runtime constant folding" are: 1. how much work does it take to discover the particular value? 2. how much work does it take to generate code to use that particular value? and most importantly, 3. how much work does it take to _reuse_ the "specified" code? Edward -------------------------------------------------------------------- Edward K. Ream email: edream at tds.net Leo: Literate Editor with Outlines Leo: http://personalpages.tds.net/~edream/front.html -------------------------------------------------------------------- From edream at tds.net Fri Jan 17 20:08:28 2003 From: edream at tds.net (Edward K. Ream) Date: Fri, 17 Jan 2003 13:08:28 -0600 Subject: [pypy-dev] Questions for Armin Message-ID: <001001c2be5b$d1bf1ee0$bba4aad8@computer> My long posts were intended, in part, to expose my assumptions for correction if needed. Here are what I conceive to be the key questions about psyco: 1. How often and under what circumstances does psyco_compatible get called? My _guess_ is that it gets called once per every invocation of every "psycotic" function (function optimized by psyco). Is this correct? 2. True or false: the call to psyco_compatible would be equivalent to runtime code that discovers special values of certain particular variables. 3. True or false: adding more state information to psyco (in order to discover more runtime values) will slow down psyco_compatible. 4. Are these the most important questions to ask about psyco? If not, what _are_ the key questions? Thanks very much. Edward -------------------------------------------------------------------- Edward K. Ream email: edream at tds.net Leo: Literate Editor with Outlines Leo: http://personalpages.tds.net/~edream/front.html -------------------------------------------------------------------- From tfanslau at gmx.de Fri Jan 17 20:13:11 2003 From: tfanslau at gmx.de (Thomas Fanslau) Date: Fri, 17 Jan 2003 20:13:11 +0100 Subject: [pypy-dev] Re: psyco in Python was: Minimal Python project In-Reply-To: References: Message-ID: <3E2855C7.5010107@gmx.de> Michael Hudson wrote: >>2. I have looked again at the code generators of the optimizing C compiler I >>wrote about 10 years ago. I can find no instance where knowing the value >>of an operand, rather than just its type, would produce better code. >> >> > >? Surely in compiling > >if (x < 10) { > ... >} else { > ... 
>} > > And don't forget the obvious replacement of a multiply by some shifts, so multiply by 5 can be replaced by a copy, a right shift 2 places and a add instead of a costly multiply ... --tf From edream at tds.net Fri Jan 17 20:12:26 2003 From: edream at tds.net (Edward K. Ream) Date: Fri, 17 Jan 2003 13:12:26 -0600 Subject: [pypy-dev] How much do we know? Message-ID: <001601c2be5c$5f5cc860$bba4aad8@computer> Is there anyone out there (besides Armin) who believes he or she knows enough about psyco to: 1. Explain in detail how it works? 2. Modify the code? 3. Propose improvements? If so, would you consider sharing your view of psyco? What are the key ideas? If not, doesn't this suggest something that might forward the action of this project? Edward P.S. I've very much enjoyed the background links that various people have given out. EKR -------------------------------------------------------------------- Edward K. Ream email: edream at tds.net Leo: Literate Editor with Outlines Leo: http://personalpages.tds.net/~edream/front.html -------------------------------------------------------------------- From edream at tds.net Fri Jan 17 20:20:06 2003 From: edream at tds.net (Edward K. Ream) Date: Fri, 17 Jan 2003 13:20:06 -0600 Subject: [pypy-dev] What do we know for sure? Message-ID: <001c01c2be5d$749b5c90$bba4aad8@computer> At the start of any experimental project I think it is a good idea to catalog what it is that we know for sure. This is a "dangerous" list because mistakes here might close off promising avenues. Still, here is what I think we can say for sure about minimalPython/psyco: 1. It is clearly possible to generate code using a git that will execute faster than the equivalent code in the C interpreter. 2. Some kinds of code can be generated without knowing the types of _any_ operands. At least part of the flow of control code falls under this heading. Therefore, it _might_ be possible to move some of what psyco does to "compile" time. 3. Python is a dynamic language. Therefore, a big challenge is to know when to use already compiled code. Does anyone have anything they would like to add to this list? Does anyone dispute what I have said? Any corrections would be most helpful. Thanks, Edward -------------------------------------------------------------------- Edward K. Ream email: edream at tds.net Leo: Literate Editor with Outlines Leo: http://personalpages.tds.net/~edream/front.html -------------------------------------------------------------------- From pedronis at bluewin.ch Fri Jan 17 20:57:34 2003 From: pedronis at bluewin.ch (Samuele Pedroni) Date: Fri, 17 Jan 2003 20:57:34 +0100 Subject: [pypy-dev] relevant reading Message-ID: <011f01c2be62$ad9fd200$6d94fea9@newmexico> Optimizing Dynamically-Typed Object-Oriented Languages With Polymorphic Inline Caches (1991) Urs H?lzle, Craig Chambers, David Ungar http://citeseer.nj.nec.com/hlzle91optimizing.html regards. From bokr at oz.net Fri Jan 17 23:01:57 2003 From: bokr at oz.net (Bengt Richter) Date: Fri, 17 Jan 2003 14:01:57 -0800 Subject: [pypy-dev] Avoiding loving it to death In-Reply-To: <001c01c2be5d$749b5c90$bba4aad8@computer> Message-ID: <5.0.2.1.1.20030117121836.00a8d390@mail.oz.net> I have a fear that we are going to love this project to death if we don't settle on how not to overwhelm the core people with pet ideas, well meant advice, musings, and war stories, etc. 
(of which I have my share, that I have to restrain myself not to contribute ;-) I'm wondering if it might not be good to have a private list for the core developers that could be cc:'d with only stuff that actually supplies something they specifically asked for, so when time is short they don't have to sift through so much to find what might be directly helpful in what they are actually doing on the project. E.g., what I visualize is a core person asking for something specific, like help with a bug, or help googling to find something specific, or concrete proposals for improving an algorithm, or entity representation, or re-implementing a C module in restricted Python, etc, etc. The key thing would be to have a way for responders to say to themselves, "I don't have a real solution, but my similar experience might be helpful, so I'll post in the general list, but not the core list." Then discussion can churn and occasionally emit a gem, but the core list would only get cc:'d with the really useful stuff. If the core people's original request is cc:'d to the core list, it would become a good archive reflecting progress in a focused way, without foregoing the benefits of freer discussion and banter on the main list. I think it could work on the honor system, so long as the rules are clear, since there would still be the more open outlet, but you could block people if necessary. BTW, I am never sure who would like to be CC:'d personally and who is satisfied to read the list or newsgroup. I would like some guidelines on that, to avoid unnecessary redundancies. I don't think the project would die from mailing list dilution, but I could see core people deciding to withdraw in order to make progress, and that would seem a shame, if there can be a way for us all to help without being more bother than its worth. HTH ;-) Regards, Bengt Richter From tismer at tismer.com Sat Jan 18 00:18:29 2003 From: tismer at tismer.com (Christian Tismer) Date: Sat, 18 Jan 2003 00:18:29 +0100 Subject: [pypy-dev] Re: psyco in Python was: Minimal Python project In-Reply-To: <00eb01c2be48$a643a910$bba4aad8@computer> References: <00eb01c2be48$a643a910$bba4aad8@computer> Message-ID: <3E288F45.7070103@tismer.com> Edward K. Ream wrote: ... > If Python becomes faster than C, great. If not, the project will still be a > success, _provided_ it doesn't waste too much time trying to do the > impossible :-) Please note that many of the people on this list make their living from doing the impossible all day. These joint deranged minds will do something incredible. The only impossible candidate so far seems getting you healed from your deadlock. :-) I'm sorry to say that, but you are stuck with the current Psyco implementation, instead of getting the ideas of Armin's claim, which I believe is absolutely true. I carried some of them in my heart for years, but way less consequently than him. The more I'm supporting this, since I know it is true. You want a mathematical proof? I will try it once, and then continue my work. Your's sincerely, and expecting-a-*very*-strong-sprint - ly y'rs - chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Johannes-Niemeyer-Weg 9a : *Starship* http://starship.python.net/ 14109 Berlin : PGP key -> http://wwwkeys.pgp.net/ work +49 30 89 09 53 34 home +49 30 802 86 56 pager +49 173 24 18 776 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? 
http://www.stackless.com/ From tismer at tismer.com Sat Jan 18 00:22:14 2003 From: tismer at tismer.com (Christian Tismer) Date: Sat, 18 Jan 2003 00:22:14 +0100 Subject: [pypy-dev] Lessons From (Limited) Experience In-Reply-To: <20030117132859.GA1060@magma.unil.ch> References: <71A90F65.7DB3B5EB.9ADE5C6A@netscape.net> <3E276062.2040605@tismer.com> <20030117132859.GA1060@magma.unil.ch> Message-ID: <3E289026.7040606@tismer.com> Armin Rigo wrote: ... > There are lots of other variations on this theme. It would be possible and > probably easy to generate Continuation-Passing-Style C code for the whole > interpreter core, i.e. recreate Christian's old Stackless implementation for > free. I guess that's something he has already considered :-) This can be > done without having to write MiniPy in any special style in the first place. Guess what: The only thing I could love more than Stackless is to get rid of it, due to this project. A bientot, Christian -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Johannes-Niemeyer-Weg 9a : *Starship* http://starship.python.net/ 14109 Berlin : PGP key -> http://wwwkeys.pgp.net/ work +49 30 89 09 53 34 home +49 30 802 86 56 pager +49 173 24 18 776 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? http://www.stackless.com/ From tismer at tismer.com Sat Jan 18 00:40:36 2003 From: tismer at tismer.com (Christian Tismer) Date: Sat, 18 Jan 2003 00:40:36 +0100 Subject: [pypy-dev] Avoiding loving it to death In-Reply-To: <5.0.2.1.1.20030117121836.00a8d390@mail.oz.net> References: <5.0.2.1.1.20030117121836.00a8d390@mail.oz.net> Message-ID: <3E289474.4030702@tismer.com> Bengt Richter wrote: > I have a fear that we are going to love this project to death if > we don't settle on how not to overwhelm the core people with > pet ideas, well meant advice, musings, and war stories, etc. > (of which I have my share, that I have to restrain myself not > to contribute ;-) Oh, thank you for your concern, but I don't share it. The quality of submissions to this list is not much lower than python-dev. (and it will increase when I stop posting so much :-) python-dev never needed an extra outlet, so we are fine. > I'm wondering if it might not be good to have a private list for > the core developers that could be cc:'d with only stuff that actually > supplies something they specifically asked for, so when time is short > they don't have to sift through so much to find what might be directly > helpful in what they are actually doing on the project. Well, I use private emails to discuss certain things which I desparately need to know. The list gives very good input, and I appreciate it very much. But it will have a hard time to distract me from things which I'm confident about, since I'm not thinking of this stuff since yesterday. [...] > Then discussion can churn and occasionally emit a gem, but the core list > would only get cc:'d with the really useful stuff. If the core people's > original request is cc:'d to the core list, it would become a good archive > reflecting progress in a focused way, without foregoing the benefits of > freer discussion and banter on the main list. What I would like more is what Rocco suggested: Let's have a decent person who is willing to put together the relevant extract of this list and who posts this weekly. I found this very valuable for python-dev, since its posting frequency turned very high, recently. 
> BTW, I am never sure who would like to be CC:'d personally and who is > satisfied to read the list or newsgroup. I would like some guidelines > on that, to avoid unnecessary redundancies. I always do a "reply all", because I think it is not so hard to drop an unwanted duplicate message. Personally, I like to have the duplicate, as a not that I was included in a CC, meaning that I'm supposed to reply. One thing that I have to live with is, that some people still use SPAM obfuscated email addresses which is a PITA. > I don't think the project would die from mailing list dilution, but > I could see core people deciding to withdraw in order to make progress, > and that would seem a shame, if there can be a way for us all to help > without being more bother than its worth. While I don't see such a danger, I'm very happy that you are concerned that much about the progress of this project, and I assure you I will try to help it as much as I can, regardless of the SNR* of this list (which is good, IMHO). In other words: No way to get rid of me :-) ciao - chris (*) Signal to Noise Ratio -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Johannes-Niemeyer-Weg 9a : *Starship* http://starship.python.net/ 14109 Berlin : PGP key -> http://wwwkeys.pgp.net/ work +49 30 89 09 53 34 home +49 30 802 86 56 pager +49 173 24 18 776 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? http://www.stackless.com/ From tismer at tismer.com Sat Jan 18 01:07:20 2003 From: tismer at tismer.com (Christian Tismer) Date: Sat, 18 Jan 2003 01:07:20 +0100 Subject: [pypy-dev] Re: psyco in Python was: Minimal Python project In-Reply-To: <3E2855C7.5010107@gmx.de> References: <3E2855C7.5010107@gmx.de> Message-ID: <3E289AB8.105@tismer.com> Thomas Fanslau wrote: ... > And don't forget the obvious replacement of a multiply by some shifts, > so multiply by 5 can be replaced by a copy, a right shift 2 places and a > add instead of a costly multiply ... Ok, my 2 Eurocent: Also don't forget that this is not always true. There is hardware that recognized such common multipliers and does an optimization in hardware. I once had such a chip for the Forth language. This is not about to say you're wrong! What I try to express is instead, that we need to be aware of very different hardware, where optimization can have very different paths, and some assumptions may drive you very wrong. While your optimization was absolutely worthy on an 8086 processor, it becomes questionable for a Pentium 4, since multiplication by an immediate has been optimized like hell, and it is not sure in the first place whether to use a single operation that does the multiply, or by replacing it by three operations which you propose. There also can be considerations of occupation of execution units, which can make a multiply cheaper, since it can be done in parallel, while something other is still happening. This is all not relevant in the bootstrap phase. But we will finally write optimized compilers. (Dunno if Armin wants to, but I do). What we need is an abstract model of processors, caching, pipelines, parallel execution lines, prefetches, all of that. This is why I looked into MMIX. I no longer think to use MMIX as a first micro engine target, since I want something now, not something perfect. But that MMIX engine has everything that I mentioned and more, it is very close to current hardware development. --- About my position in this project, I'm not completely settled. 
But regardless of my actual tasks, I will provide two things, as my personal pet projects: - a) The fastest possible interpreted micro engine, optimized for X86 - b) A very good code generator for X86 which produces better code than gcc, MS-VC++ and IBM's C compiler. These will both, of course, completely be done in Python. There will not be a single line edited with a C editor. Instead, I'm writing high-level Python code, which is able to a) emit a suitable C source, and b) emit optimized X86 assembly. This just as a note what I'm ging to contribute, anyway. maybe-I-spent-5-Eurocents-ly y'rs -- chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Johannes-Niemeyer-Weg 9a : *Starship* http://starship.python.net/ 14109 Berlin : PGP key -> http://wwwkeys.pgp.net/ work +49 30 89 09 53 34 home +49 30 802 86 56 pager +49 173 24 18 776 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? http://www.stackless.com/
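To give a flavour of what "high-level Python code which is able to emit a suitable C source" can mean -- a minimal invented sketch, not Christian's actual generator -- here are a few lines of Python that turn a trivial stack-machine program into compilable C text:

def emit_c(func_name, ops):
    """Turn a tiny stack-machine program into the text of a C function."""
    lines = ["long %s(void)" % func_name,
             "{",
             "    long stack[16]; int sp = 0;"]
    for op, arg in ops:
        if op == 'push':
            lines.append("    stack[sp++] = %d;" % arg)
        elif op == 'add':
            lines.append("    sp--; stack[sp-1] += stack[sp];")
        elif op == 'ret':
            lines.append("    return stack[--sp];")
        else:
            raise ValueError("unknown op %r" % (op,))
    lines.append("}")
    return "\n".join(lines)

# prints a small, valid C function computing 40 + 2
print(emit_c("answer", [('push', 40), ('push', 2), ('add', None), ('ret', None)]))

An assembly backend would simply be a second emitter over the same high-level description.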
From logistix at zworg.com Sat Jan 18 18:50:25 2003 From: logistix at zworg.com (logistix) Date: Sat, 18 Jan 2003 12:50:25 -0500 Subject: [pypy-dev] Re: psyco in Python was: Minimal Python project Message-ID: <200301181750.h0IHoPtD000589@overload3.baremetal.com> > If I am wrong, the counter-example should provide a _clear reason_ why all > my arguments above are wrong. And wouldn't that be great :-) I believe the > arguments given should be given greater weight than the "psyco has more > knowledge" mantra. You want to convince me otherwise? Show me the code, or > provide a _detailed_ explanation for how my arguments do not apply. > > Edward > Firstly, C compilation is static. Secondly, python can do plenty of stuff that can't be easily transliterated into C. I doubt any current C compilers would be able to optimize a C version of the following code (regardless of the practicality of it)...
PythonWin 2.2.2 (#37, Oct 14 2002, 17:02:34) [MSC 32 bit (Intel)] on win32.
Portions Copyright 1994-2001 Mark Hammond (mhammond at skippinet.com.au) - see 'Help/About PythonWin' for further copyright information.
>>> def crazyRPC_getAge(username):
..     """
..     Pretend this is a real RPC call,
..     and a black box to the compiler.
..     """
..     if username == "Bob":
..         return 23 #int
..     elif username == "Doug":
..         return 12.5 #float
..     else:
..         raise Exception("Unknown user %s" % username)
..
>>> def drinkingAge(age):
..     def test():
..         return age >= 21
..     return test
..
>>> bobCanDrink = drinkingAge(crazyRPC_getAge("Bob")) #returns function optimized for ints
>>> bobCanDrink()
1
>>> dougCanDrink = drinkingAge(crazyRPC_getAge("Doug")) #returns function optimized for floats
>>> dougCanDrink()
0
>>>

From arigo at tunes.org  Sat Jan 18 20:00:53 2003
From: arigo at tunes.org (Armin Rigo)
Date: Sat, 18 Jan 2003 11:00:53 -0800 (PST)
Subject: [pypy-dev] Questions for Armin
In-Reply-To: <001001c2be5b$d1bf1ee0$bba4aad8@computer>; from edream@tds.net on Fri, Jan 17, 2003 at 01:08:28PM -0600
References: <001001c2be5b$d1bf1ee0$bba4aad8@computer>
Message-ID: <20030118190053.163AD1F8E@bespin.org>

Hello Edward,

On Fri, Jan 17, 2003 at 01:08:28PM -0600, Edward K. Ream wrote:
> 1. How often and under what circumstances does psyco_compatible get called?
>
> My _guess_ is that it gets called once per every invocation of every
> "psycotic" function (function optimized by psyco). Is this correct?

No: psyco_compatible() is only called at compile-time. When a "psycotic" function is called by regular Python code, we just jump to machine code that starts at the beginning of the function with no particular assumption about the arguments; it just receives PyObject* pointers. Only when something more about a given argument is needed (say its type) will this extra information be asked for. The corresponding machine code is very fast in the common case: it loads the type, compares it with the most common type found at this place, and if it matches, runs on. So in the common case, we only have one type check per needed argument.

Given

    def my_function(a,b,c):
        return a+b+c

the emitted machine code looks like what you would obtain by compiling this:

    PyObject* my_function(PyObject* a, PyObject* b, PyObject* c)
    {
        int r1, r2, r3;
        if (a->ob_type != &PyInt_Type) goto uncommon_case;
        if (b->ob_type != &PyInt_Type) goto uncommon_case;
        if (c->ob_type != &PyInt_Type) goto uncommon_case;
        r1 = ((PyIntObject*) a)->ob_ival;
        r2 = ((PyIntObject*) b)->ob_ival;
        r3 = ((PyIntObject*) c)->ob_ival;
        return PyInt_FromLong(r1+r2+r3);
    }

Only when a new, not-already-seen type appears does it follow the "uncommon_case" branch. This triggers more compilation, i.e. emission of more machine code. During this emission, we make numerous calls to psyco_compatible() to see if we have reached a state that we have already seen, and which subsequently corresponds to already-emitted machine code; if it does, we emit a jump to this old code. This is the purpose of psyco_compatible().

I must mention that in the above example, the nice-looking C version is only arrived at after several steps of execution mixed with further compilation.
The first version is:

    PyObject* my_function(PyObject* a, PyObject* b, PyObject* c)
    {
        goto uncommon_case;   /* need a->ob_type */
    }

Then when the function is first called with an integer in 'a', it becomes:

    PyObject* my_function(PyObject* a, PyObject* b, PyObject* c)
    {
        if (a->ob_type != &PyInt_Type) goto uncommon_case;
        goto uncommon_case;   /* need b->ob_type */
    }

and so on.

> 2. True or false: the call to psyco_compatible would be equivalent to
> runtime code that discovers special values of certain particular variables.

See above.

> 3. True or false: adding more state information to psyco (in order to
> discover more runtime values) will slow down psyco_compatible.

This is true. The more run-time values you want to "discover" (I say "promote to compile-time"), the more versions of the same code you will get, and the slower psyco_compatible() will be (further slowing down compilation, but not execution proper, as seen above).

> 4. Are these the most important questions to ask about psyco? If not, what
> _are_ the key questions?

Hard to say! I like to mention the "lazy" values ("virtual-time"). These are the key to high-level optimizations in Psyco. In the above example you might have noticed that the Python interpreter must build and free an intermediate integer object for "a+b" when computing "a+b+c", while the C version I showed does not. Psyco does this by considering the intermediate PyObject* pointer as lazy. As long as it is not needed, no call to PyInt_FromLong() is written; only the value "r1+r2" is computed.

Similarly, in "a+b", if both operands are strings, the result is a lazy string which is implemented as a lazy list "[a,b]". Concatenating more strings turns the list into a real Python list, but the resulting string itself is still lazy. This is how Psyco ends up automatically translating things like

    s = ''
    for t in xxx:
        s += t

into something like

    lst = []
    for t in xxx:
        lst.append(t)
    s = ''.join(lst)

I hope that these examples cast some light on Psyco. I realize that this could distract people from the current goals of this project, and I apologize for that. We should discuss e.g. "how restricted" the language we use for Python-in-Python should be...

A bientot, Armin.

From arigo at tunes.org  Sat Jan 18 20:00:54 2003
From: arigo at tunes.org (Armin Rigo)
Date: Sat, 18 Jan 2003 11:00:54 -0800 (PST)
Subject: [pypy-dev] Re: psyco in Python was: Minimal Python project
In-Reply-To: <008701c2be46$559bfeb0$bba4aad8@computer>; from edream@tds.net on Fri, Jan 17, 2003 at 10:34:40AM -0600
References: <008701c2be46$559bfeb0$bba4aad8@computer>
Message-ID: <20030118190054.AA9251F37@bespin.org>

Hello Edward,

All your comments about Psyco are founded, but you are focusing too much on the "back-end" part --- which I understand, given your impressive compiler technology background!

On Fri, Jan 17, 2003 at 10:34:40AM -0600, Edward K. Ream wrote:
> 1. psyco has, in fact, no better information than does a C compiler.

People answered that Psyco can have a bit more information, because it could do more constant folding or optimize multiplications by constants. Rarely a huge win. But that's not what I had in mind when saying that "Psyco has more information". It has a much higher-level view of the program.

If you have a specific algorithm which you completely coded in a couple of C functions, then after compiling it with a good C compiler the algorithm will run at a speed that cannot be beaten.
But C is not well suited for larger applications consisting mainly of management stuff --- precisely what Python is much better for (but you know this well, I'm sure). These are the cases that interest me. Python gives a higher-level view of the application. Psyco can, for example, measure that some data structure (say a list) is most used in this or that way, and choose a suited implementation. For example, a list in the middle of which numerous inserts and deletes are done could be implemented as a red-black tree. Of course, in a pure C implementation of the application we can also use red-black trees, but who does? Not many C application actually use the correct implementation of their data structures :-( More importantly, which implementation is the best one must be hard-wired in advance in C and makes it difficult to switch later. This is where I expect interesting gains from a sufficiently advanced Psyco. There is already one example of this. In Python, if you build a large string by successively concatenating a lot of small strings, you get bad results. You have to rewrite your algorithm in a less straightforward style, e.g. accumulating the strings in a list and only at the end using "''.join(list)". You have the same problem in C if you repeatedly use a simple two-strings concatenation function, but the C compiler cannot do anything to help here. Psyco (upcoming version 1.0) can already help: it compiles the Python function by choosing to implement the string as a Python list of strings. The "join()" is only done when the resulting string is needed outside the function. It is a case where higher-level programming languages let compiling tools select algorithms that actually decrease the complexity --- meaning that the result can run faster than the C version by more than a constant factor. Of course, you could have made the C version wiser in this case. But again you cannot do this when your application becomes very large and where you just don't know what implementation is better without doing sophisticated profiling on hopefully representative sample data. > 1. There is no need to burden this project with unrealistic expectations. > Python doesn't have to beat C for Python to rule the world! :-) Yes, exactly. I hope I have made it clear that Psyco is not the all-or-nothing way for this project to succeed. In my opinion it is essential to write the Python-in-Python interpreter with important restrictions, so that static tools can do a lot from this code. Compile-time optimizations of Python, if you like, althought I prefer to see it as translation from a well-defined Pythonic frame to any of several possible projects that could use the source (including a CPython-like interpreter, and including a Psyco-like project). The current goal should clearly be to have a Python interpreter in Python, written with restrictions that are well understood. A bientot, Armin. From pedronis at bluewin.ch Sat Jan 18 20:17:34 2003 From: pedronis at bluewin.ch (Samuele Pedroni) Date: Sat, 18 Jan 2003 20:17:34 +0100 Subject: [pypy-dev] Questions for Armin References: <001001c2be5b$d1bf1ee0$bba4aad8@computer> <20030118190053.163AD1F8E@bespin.org> Message-ID: <023a01c2bf26$41945da0$6d94fea9@newmexico> From: "Armin Rigo" > > 3. True or false: adding more state information to psyco (in order to > > discover more runtime values) will slow down psyco_compatible. > > This is true. 
The more you run-time values you want to "discover" (I say > "promote to compile-time"), the more versions of the same code you will get, > and the slower psyco_compatible() will be (further slowing down compilation, > but not execution proper, as seen above). at least for method dispatching/lookup some kind of sampling in the spirit of polymorphic inline caches can help distinguish whether there are a one to few relevant cases that are worth to specialize for, or e.g. implementing dispatch with simply a monomorphic inline cache has a more reasonable (especially in space) price. From arigo at tunes.org Sat Jan 18 22:07:27 2003 From: arigo at tunes.org (Armin Rigo) Date: Sat, 18 Jan 2003 13:07:27 -0800 (PST) Subject: [pypy-dev] Re: psyco in Python was: Minimal Python project In-Reply-To: <005801c2be3d$ac15af60$bba4aad8@computer>; from edream@tds.net on Fri, Jan 17, 2003 at 09:32:40AM -0600 References: <005801c2be3d$ac15af60$bba4aad8@computer> Message-ID: <20030118210727.18C561C44@bespin.org> Hello again, On Fri, Jan 17, 2003 at 09:32:40AM -0600, Edward K. Ream wrote: > I challenge this group to provide even one example of Python code, for which > psyco will emit better code than that produced by a C compiler on a > transliteration of that code into C. I believe no such counter-example will > ever be found. There are tons of other examples of this if one thinks "mini-language interpretation". For example, a program that asks the user to enter a simple mathematical function and plots its graph would be seriously faster in Python+Psyco than in ANSI C. And if you feel that using Python's compile() is cheating, then write a simple parser and interpreter for the user expressions (just as you would in C) and it could run faster than C if Psyco were advanced enough to specialize it correctly. Armin From tismer at tismer.com Sat Jan 18 22:43:17 2003 From: tismer at tismer.com (Christian Tismer) Date: Sat, 18 Jan 2003 22:43:17 +0100 Subject: [pypy-dev] Questions for Armin In-Reply-To: <20030118190053.163AD1F8E@bespin.org> References: <001001c2be5b$d1bf1ee0$bba4aad8@computer> <20030118190053.163AD1F8E@bespin.org> Message-ID: <3E29CA75.4060507@tismer.com> Armin Rigo wrote: ... > def my_function(a,b,c): > return a+b+c > > the emitted machine code looks like what you would obtain by compiling this: > > PyObject* my_function(PyObject* a, PyObject* b, PyObject* c) > { > int r1, r2, r3; > if (a->ob_type != &PyInt_Type) goto uncommon_case; > if (b->ob_type != &PyInt_Type) goto uncommon_case; > if (c->ob_type != &PyInt_Type) goto uncommon_case; > r1 = ((PyIntObject*) a)->ob_ival; > r2 = ((PyIntObject*) b)->ob_ival; > r3 = ((PyIntObject*) c)->ob_ival; > return PyInt_FromLong(r1+r2+r3); > } [snipped all he good rest] Just a little comment. The above is what I like so much about the Psyco ideas. Now consider the huge eval_code function, wih its specializations in order to make operations on integers very fast, for example. With Psyco, these are no longer necessary, since Psyco will find them by itself and create code like the above from alone. As another point, when re-implementing the Python core objects in Python, there are many internal functions which are called by the interpreter, only. The datatypes pssed to those functions will be almost always the same, and since the functions aren't exposed otherwise, the first time they are called will create their final version, and the uncommon_case can be dropped completely. 
We just need to "seed" them with appropriate primitive data types, and the whole rest can be deduced with ease. That's what I eagerly want to try and to see happen :-) > I hope that these examples cast some light on Psyco. I realize that this > could distract people from the current goals of this project, and I apologize > for that. We should discuss e.g. "how restricted" the language we use for > Python-in-Python should be... Sorry, I couldn't resist it. I will start to ask some questions in a different thread. ciao - chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Johannes-Niemeyer-Weg 9a : *Starship* http://starship.python.net/ 14109 Berlin : PGP key -> http://wwwkeys.pgp.net/ work +49 30 89 09 53 34 home +49 30 802 86 56 pager +49 173 24 18 776 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? http://www.stackless.com/ From tismer at tismer.com Sat Jan 18 23:03:34 2003 From: tismer at tismer.com (Christian Tismer) Date: Sat, 18 Jan 2003 23:03:34 +0100 Subject: [pypy-dev] Restricted language Message-ID: <3E29CF36.8020507@tismer.com> Hi Armin, in order to start considerations about how to implement Python-in-Python, here some initial questions. Yes, we should begin with the RISP processor :-) (Reduced Instruction Set Python). There are some issues where I have to navigate around in my little tests that I've done already. One thing is the lack of a switch statement in Python, which either leads to zillions of elifs, or to the use of function tables and indexing. We should come up with some "how to do this". Another thing is common for-loops in C. Almost all of them which I tried to translate into Python became while-loops. Is that ok? Data types. How do we model the data types which are used internally by Python? Somebody already showed a small Python interpreter for Python 2.0 (very sorry, I can't find who it was), which was implemented on top of basic Python objects. This was a nice attempt and makes sense to start with. But I gues, for a real Python in Python, we also need to re-build the data structures and cannot borrow lists, tuples and dicts and "Lift them up" into the new level. Instead, these need to be built from a minimum Python "object set" as well. How far should we go down? I was thinking of some basic classes which describe primitive data types, like signed/unsigned integers, chars, pointers to primitives, and arrays of primitives. Then I would build everything upon these. Is that already too low-level? Do you think this should be started using the builtin objects right now, and these should be replaced later, or from the beginning? How far should it go: Are we modelling reference counting as well? And since you said that whether to use reference counting at all, or a classical GC, or object/type tuples returned as register pairs, I'm asking how we should do the proper abstraction? This means clearly to me, that I should *not* repeat the Py_INCREF/Py_DECREF story from Python, but we need to do a more abstract formulation of that, which allows us to specify it in any desired way. So I guess we are not fine by just repeating the C implementation in Python, but we need some kind of "upsizing" the algorithms, away from a flat write-down in C, up to a more abstract defintion of what we want to do. I could even imagine that there are lots of other cases where the C source code already is much to verbose, over-specifying, and obfuscating what the code really is supposed to do. 
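The "basic classes for primitive data types" mentioned above might, as a first rough cut, look something like the following (purely a sketch of the idea, not settled project code; the class names and attributes are invented here):

    # Sketch: wrapper classes that carry low-level "hints" (bit width,
    # signedness, fixed length) while still behaving like Python values.
    class PrimitiveInt:
        signed = True
        bits = 32

        def __init__(self, value):
            self.value = int(value)

        def __add__(self, other):
            # wrap around like a C integer of the declared width
            mask = (1 << self.bits) - 1
            result = (self.value + int(other)) & mask
            if self.signed and result >= (1 << (self.bits - 1)):
                result -= (1 << self.bits)
            return self.__class__(result)

        def __int__(self):
            return self.value

    class PrimitiveArray:
        # fixed-length array of one primitive item type, no resizing
        def __init__(self, itemtype, length):
            self.itemtype = itemtype
            self.items = [itemtype(0) for i in range(length)]

        def __getitem__(self, index):
            return self.items[index]

        def __setitem__(self, index, value):
            self.items[index] = self.itemtype(value)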
Would you propose to "start dumb", literally like the C source, and add abstractions after the initial thing works, or does this have to happen in the first place?

Very many things are done in a specific way in C, just because of the fact that you have to use C. But when I know I have Python, I would most probably not use C string constants all around, but use Python strings. It is C that forces us to use PyString_FromString and friends. Do you think we should repeat this in the Python implementation?

This is just the beginning of a whole can of worms to be considered. Just to start it somehow :-)

all the best -- chris

--
Christian Tismer :^)
Mission Impossible 5oftware : Have a break! Take a ride on Python's
Johannes-Niemeyer-Weg 9a : *Starship* http://starship.python.net/
14109 Berlin : PGP key -> http://wwwkeys.pgp.net/
work +49 30 89 09 53 34 home +49 30 802 86 56 pager +49 173 24 18 776
PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04
whom do you want to sponsor today? http://www.stackless.com/

From edream at tds.net  Sun Jan 19 05:40:33 2003
From: edream at tds.net (Edward K. Ream)
Date: Sat, 18 Jan 2003 22:40:33 -0600
Subject: [pypy-dev] Questions for Armin
References: <001001c2be5b$d1bf1ee0$bba4aad8@computer> <20030118190053.163AD1F8E@bespin.org>
Message-ID: <002d01c2bf74$e7f83470$bba4aad8@computer>

Hello Armin,

> On Fri, Jan 17, 2003 at 01:08:28PM -0600, Edward K. Ream wrote:
> > 1. How often and under what circumstances does psyco_compatible get called?
> >
> > My _guess_ is that it gets called once per every invocation of every
> > "psycotic" function (function optimized by psyco). Is this correct?
>
> No: psyco_compatible() is only called at compile-time.
[snip]
> Only when a new, not-already-seen type appears does it follow the
> "uncommon_case" branch. This triggers more compilation, i.e. emission of more
> machine code...

Many thanks for this most interesting and informative reply. It clears up a lot of my questions. I feel much more free to focus on the big picture.

> All your comments about Psyco are founded, but you are focusing too much on
> the "back-end" part...

Yes. I have been focusing on an "accounting" question: how often does the compiler run? If the compiler starts from scratch every time a program is run, then I gather from your example that the compiler will be called once for every type of every argument for every executed function _every time the program runs_. Perhaps you are assuming that the gains from compiling will be so large that it doesn't matter how often the compiler runs.

Yesterday I realized that it _doesn't matter_ whether this assumption is true or not. Indeed, suppose that we expand the notion of what "byte code" is to include information generated by the compiler: compiled machine code, statistics, requests for further optimizations, whatever. The compiler could rewrite the byte code in order to avoid work the next time the program runs. Now the compiler runs less often: using exactly the same scheme as before, the compiler will run once for every type of every argument for every executed function _every time the source code changes_.

This means that no matter how slowly the compiler runs, the _amortized_ runtime cost of the compiler can be made to be asymptotically zero! This is an important theoretical result: the project can never fail due to the cost of compilation. This result might also allow us to expand our notion of what is possible in the implementation.
You are free to consider any kind of algorithm at all, no matter how expansive. For instance, there was some discussion on another thread of a minimal VM for portability. Maybe that vm could be the intermediate code list of gcc? If compilation speed isn't important, the "compiler" would simply be the front end for gcc. We would only need to modify the actual emitters in gcc to output to the "byte code". We get all the good work of the gcc code generators for free. Retargeting psyco would be trivial. These are not proposals for implementation, and certainly not requests that you modify what you are planning to do in any way. Rather, they are "safety proofs" that we need not be concerned about compilation speed _at all_, provided that you (or rather Guido) is willing to expand the notion of the "byte code". This could be done whenever convenient, or never. The point is that my worries about the cost of compilation were unfounded. Compilation cost can never be a "gotcha"; a pressure-relief value is always available. Perhaps this has always been obvious to you; it wasn't at all clear to me until yesterday. Edward From bokr at oz.net Sun Jan 19 08:25:52 2003 From: bokr at oz.net (Bengt Richter) Date: Sat, 18 Jan 2003 23:25:52 -0800 Subject: [pypy-dev] Questions for Armin In-Reply-To: <002d01c2bf74$e7f83470$bba4aad8@computer> References: <001001c2be5b$d1bf1ee0$bba4aad8@computer> <20030118190053.163AD1F8E@bespin.org> Message-ID: <5.0.2.1.1.20030118223231.00a66020@mail.oz.net> At 22:40 2003-01-18 -0600, Edward K. Ream wrote: >Hello Armin, > >> On Fri, Jan 17, 2003 at 01:08:28PM -0600, Edward K. Ream wrote: >> > 1. How often and under what circumstances does psyco_compatible get >called? >> > >> > My _guess_ is that it gets called once per every invocation of every >> > "psycotic" function (function optimized by psyco). Is this correct? >> >> No: psyco_compatible() is only called at compile-time. >[snip] >> Only when a new, not-already-seen type appears does it follow the >> "uncommon_case" branch. This triggers more compilation, i.e. emission of >more >> machine code... > >Many thanks for this most interesting and informative reply. It clears up a >lot of my questions. I feel much more free to focus on the big picture. > >> All your comments about Psyco are founded, but you are focusing too much >on >> the "back-end" part... > >Yes. I have been focusing on an "accounting" question: how often does the >compiler run? If the compiler starts from scratch every time a program is >run, then I gather from your example that the compiler will be called once >for every type of very argument for every executed function _every time the >program runs_. Perhaps you are assuming that the gains from compiling will >be so large that it doesn't matter how often the compiler runs. Well, if most everything is written in python, with all the libraries etc., I think there still is some accounting to do. I.e., library modules won't change much after having been exercised a bit. Will this keep updating cached info in .pyc (or maybe new .pyp) files? (BTW, IWT that makes for eventual permissions issues, if it's shared libraries. Do you just get per-use caches, etc.?) IMO there has to be some way of not rebuilding the world a lot, even if it's fast ;-) I'm also picking this place to re-introduce the related "checkpointing" idea, namely some call into a builtin that can act like a yield and save all the state that the compiler/psyco etc have worked up. 
Perhaps some kind of .pyk for (python checkpoint) file that could resume from where the checkpoint call was. I believe it can be done if the interpreter stack(s) is/are able to be encapsulated and a little restart info can be stored statically and the machine stack can unwind totally out of main and the C runtime exit, so that coming back into C main everyting can be picked up again. I don't want to belabor it, but just mention it as something to consider, in case it becomes easy when you are redesigning the VM and its environment. Obviously there has to be some restrictions on state, but I think if we could wind up with fast-load application images in the future because you kept this in mind in the beginning, it could be a benefit. Regards, Bengt Richter BTW[OT], sorry about the justification. Now I can't get it back without retyping or writing a re-spacing ragged wrapper. From hpk at trillke.net Sun Jan 19 13:45:10 2003 From: hpk at trillke.net (holger krekel) Date: Sun, 19 Jan 2003 13:45:10 +0100 Subject: [pypy-dev] Restricted language In-Reply-To: <3E29CF36.8020507@tismer.com>; from tismer@tismer.com on Sat, Jan 18, 2003 at 11:03:34PM +0100 References: <3E29CF36.8020507@tismer.com> Message-ID: <20030119134510.A2661@prim.han.de> Hi Christian, [Christian Tismer Sat, Jan 18, 2003 at 11:03:34PM +0100] > in order to start considerations about how to > implement Python-in-Python, here some initial > questions. > Yes, we should begin with the RISP processor :-) > (Reduced Instruction Set Python). cool, we are getting to implementation strategy! Hopefully you don't mind if i comment even i am not Armin :-) I'm curious about his oppinion on this, too. > There are some issues where I have to navigate > around in my little tests that I've done already. > > One thing is the lack of a switch statement in > Python, which either leads to zillions of elifs, > or to the use of function tables and indexing. > We should come up with some "how to do this". Maybe make a list with 256 entries and use the bytecode as an index to get to a frame-method (store_attr etc.)? A specialised compiler could later inline the method bodies and turn any attribute access on 'self' (the frame object) into a local name operation. So there would be the first restriction: always use 'self' to refer to the instance within a frame method. > Another thing is common for-loops in C. > Almost all of them which I tried to translate > into Python became while-loops. Is that ok? yes, why should it not? > Data types. > How do we model the data types which are used > internally by Python? Somebody already showed > a small Python interpreter for Python 2.0 > (very sorry, I can't find who it was), I think several people mentioned doing something like this. Look at http://codespeak.net/moin/moin.cgi/MinimalPython where i gathered two links about python-in-python implementations. If i forgot anyone: it's a wiki so just insert a paragraph about your stuff. Never forget to describe a link! > which > was implemented on top of basic Python objects. > This was a nice attempt and makes sense to start > with. But I gues, for a real Python in Python, > we also need to re-build the data structures > and cannot borrow lists, tuples and dicts and > "Lift them up" into the new level. I wouldn't do this for starters. > Instead, these need to be built from a minimum > Python "object set" as well. > How far should we go down? Redoing the basic types can be deferred IMO. Basically we need to do typeobject.c in python, right? 
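The 256-entry dispatch list suggested above could be sketched roughly as follows (an illustration only; the opcode numbers and method names are placeholders, not CPython's real ones):

    # Sketch: replace the C switch by indexing a list of bound frame
    # methods with the opcode byte.
    class Frame:
        def __init__(self):
            self.stack = []
            self.dispatch_table = [self.unknown_opcode] * 256
            self.dispatch_table[1] = self.load_const    # made-up numbering
            self.dispatch_table[2] = self.binary_add

        def unknown_opcode(self, oparg):
            raise RuntimeError("unimplemented opcode")

        def load_const(self, oparg):
            self.stack.append(oparg)

        def binary_add(self, oparg):
            b = self.stack.pop()
            a = self.stack.pop()
            self.stack.append(a + b)

        def run(self, code):
            # 'code' is a list of (opcode, oparg) pairs in this toy model
            for opcode, oparg in code:
                self.dispatch_table[opcode](oparg)
            return self.stack.pop()

    # Frame().run([(1, 40), (1, 2), (2, None)]) evaluates 40 + 2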
> I was thinking of some basic classes which describe > primitive data types, like signed/unsigned integers, > chars, pointers to primitives, and arrays of > primitives. Then I would build everything upon these. > Is that already too low-level? > Do you think this should be started using the builtin > objects right now, and these should be replaced later, > or from the beginning? replaced later IMO. First get a working interpreter and eliminate anything that stands in the way. > How far should it go: Are we modelling reference > counting as well? I'd try without. For the time beeing, CPython does it for us until we come up with a scheme. But the scheme is best experimented with when we have a working interpreter. > So I guess we are not fine by just repeating the > C implementation in Python, but we need some kind > of "upsizing" the algorithms, away from a flat > write-down in C, up to a more abstract defintion > of what we want to do. I regard a python-in-python interpreter as Pseudo-Code which can be turned into other representantions (in C or another bytecode machine) by an appropriate compiler. Using Python to describe the abstract descriptions makes sense to me. So we eliminate any C-isms (like switch statements, reference counting, signalling exceptions by NULL returns etc.). > Very many things are done in a specific way in C, just > because of the fact that you have to use C. But when > I know I have Python, I would most probably not use > C string constants all around, but use Python strings. > It is C that forces us to use PyString_FromString > and friends. Do you think we should repeat this in > the Python implementation? No. The goal is: get a working interpreter in python running on CPython. An interpreter where you can run the unittests against. Don't care about any low-level stuff. That is handled by CPython and later on will be handled by custom compilers (to C, MMIX, assembler, whatever). I am still thinking about Roccos issue of how to "differentiate" host system (CPython) exceptions from exception of the code we interprete in python. My first take is to catch any exception, rewrite the traceback information (using the python-frames) and reraise it. The exceptions must look the same as if i executed with CPython:ceval.c all the best, holger From edream at tds.net Sun Jan 19 13:47:05 2003 From: edream at tds.net (Edward K. Ream) Date: Sun, 19 Jan 2003 06:47:05 -0600 Subject: [pypy-dev] Questions for Armin References: <001001c2be5b$d1bf1ee0$bba4aad8@computer> <20030118190053.163AD1F8E@bespin.org> <5.0.2.1.1.20030118223231.00a66020@mail.oz.net> Message-ID: <000b01c2bfb8$dff4b840$bba4aad8@computer> > >Yes. I have been focusing on an "accounting" question: how often does the > >compiler run? If the compiler starts from scratch every time a program is > >run, then I gather from your example that the compiler will be called once > >for every type of very argument for every executed function _every time the > >program runs_. Perhaps you are assuming that the gains from compiling will > >be so large that it doesn't matter how often the compiler runs. > Well, if most everything is written in python, with all the > libraries etc., I think there still is some accounting to > do. I.e., library modules won't change much after having > been exercised a bit. Will this keep updating cached info in > pyc (or maybe new .pyp) files? Good question. Here is something I sent privately to Christian: [starts] To run a program: [using .pyp files] 1. 
Load the byte code, performing any queued requests for optimizations using stored data. In general, there will be no such requests after the first few runs. Obviously, changing the Python source code throws out some or all of this intermediate data. 2. Run the code, doing some optimizations immediately, possibly requesting other optimizations to be done later, and storing any useful data in the "extended byte code". At first the code will be intepreted/git'ed. After a few runs there will be nothing left but globally optimized machine code. It doesn't get any better than this. In short, the code gets faster the more it is executed. This approach should make bootstrapping easy. You could even start with an interp that just queues up optimization requests for the next load time! [ends] So I am thinking that the only time this machine code in the byte code (.pyp file) changes is if the jit/interpreter/psyco/whatever-you-call-it sees an object with a type that it has never seen before. After a very few runs (of any particular .pyp file) the cached data becomes nothing but machine code (with branches to uncommon_case that never get taken). I see the situation as being similar to a peephole optimizer: it takes only 2 or 3 iterations to perform all possible optimizations. Since the Python code of libraries and other "system" code never changes (or hardly ever changes), we should be ok. Edward -------------------------------------------------------------------- Edward K. Ream email: edream at tds.net Leo: Literate Editor with Outlines Leo: http://personalpages.tds.net/~edream/front.html -------------------------------------------------------------------- From stephan.diehl at gmx.net Sun Jan 19 13:50:25 2003 From: stephan.diehl at gmx.net (Stephan Diehl) Date: Sun, 19 Jan 2003 13:50:25 +0100 Subject: [pypy-dev] question about core implementation language Message-ID: <20030119124958.250375A2CF@thoth.codespeak.net> Hi all, while reading all these long mails about this exciting project, one basic question came to my mind: why has the core implementation to be in C? Years ago I did a little programming in Objective C and I just loved this language (I'm taking about gcc objc without the framework by NeXT/Apple). Here are some good things about Objective C: 1. Legal C code is already legal Objective C code 2. Objective C has a runtime engine build in. 3. objects classes don't need to be known at compile time (just their interfaces) Having said that, I could imagine that it's much easier to model python objects in Objective C than in C, thus making the needed core more compact and easier to maintain. Stephan From tismer at tismer.com Sun Jan 19 15:16:55 2003 From: tismer at tismer.com (Christian Tismer) Date: Sun, 19 Jan 2003 15:16:55 +0100 Subject: [pypy-dev] Restricted language In-Reply-To: <20030119134510.A2661@prim.han.de> References: <3E29CF36.8020507@tismer.com> <20030119134510.A2661@prim.han.de> Message-ID: <3E2AB357.5030503@tismer.com> Hi Holger, > [Christian Tismer Sat, Jan 18, 2003 at 11:03:34PM +0100] > >>in order to start considerations about how to >>implement Python-in-Python, here some initial >>questions. >>Yes, we should begin with the RISP processor :-) >>(Reduced Instruction Set Python). > > > cool, we are getting to implementation strategy! > > Hopefully you don't mind if i comment even i am > not Armin :-) I'm curious about his oppinion > on this, too. No, this is great. These things need to be discussed and we need to agree on a path to go. 
Interestingly, your opinion is quite different from Armin's, and the latter brought me to ask these questions.

[switch and such]

> Maybe make a list with 256 entries and use the bytecode
> as an index to get to a frame-method (store_attr etc.)?

Sure, I know all the possible variants, the question is whether we like it or not. If we are recoding Python in Python, then with the final idea in mind, that in some future, this might become the real implementation of Python. But then, it should not look worse than in C.

...

>>Another thing is common for-loops in C.
>>Almost all of them which I tried to translate
>>into Python became while-loops. Is that ok?
>
> yes, why should it not?

Try to re-code some C code of the interpreter and the builtin objects into Python. After the third work-around to some C construct, you begin to ask for more expressive constructs. Instead of upgrading to better abstraction, I find myself emulating C constructs. This is not what I wanted.

>>we also need to re-build the data structures
>>and cannot borrow lists, tuples and dicts and
>>"Lift them up" into the new level.
>
>
> I wouldn't do this for starters.

I understand your approach. But here we begin to see conflicting ideas. Citing Armin:

"""
(1) write a Python interpreter in Python, keeping (2) in mind, using any recent version of CPython to test it. Include at least the bytecode interpreter and redefinition of the basic data structures (tuple, lists, integers, frames...) as classes. Optionally add a tokenizer-parser-compiler to generate bytecode from source (for the first tests, using the underlying compile() function is fine).
"""

...

> Redoing the basic types can be deferred IMO.
> Basically we need to do typeobject.c in python, right?

I thought so, too, but I'm not sure if my overview of the whole thing is good enough to judge this early.

...

>>How far should it go: Are we modelling reference
>>counting as well?
>
>
> I'd try without. For the time beeing, CPython
> does it for us until we come up with a scheme.
> But the scheme is best experimented with when
> we have a working interpreter.

...

I can't decide here. Need Armin's input, whether it makes sense to postpone the detail problems in the beginning. This surely gets us a simple start, but I hope it will not hinder us from tackling the real problems. A Python interpreter in Python is not a real problem. But hopefully a good start to tackle them.

> No. The goal is: get a working interpreter in python
> running on CPython. An interpreter where you can run
> the unittests against. Don't care about any low-level
> stuff. That is handled by CPython and later on will
> be handled by custom compilers (to C, MMIX, assembler,
> whatever).

Ok. I'm eager to hear what the comments will be :-)

> I am still thinking about Roccos issue of how to
> "differentiate" host system (CPython) exceptions from
> exception of the code we interprete in python. My first
> take is to catch any exception, rewrite the traceback
> information (using the python-frames) and reraise it.

This is the "world on the wire" problem. In a direct mapping, back to C code, we could simply repeat the Null-check strategy as it is now. See "From PyPy to Psyco", (3a). In the case where we need to run interpreter Python from Python, we have to emulate exceptions in a way, and I think we cannot "borrow" them, but we need to interpret them ourselves. If you look at the eval_frame code, there we already have to handle the block stack, so we can do our own exception handling as well.
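Such block-stack-based handling might be sketched like this (only an illustration under invented names, not actual interpreter code):

    # Sketch: application-level exceptions unwind the frame's own block
    # stack, as eval_frame does in C, instead of relying on the host
    # CPython try/except machinery.
    class AppException(Exception):
        # carries the application-level exception value
        def __init__(self, w_value):
            Exception.__init__(self)
            self.w_value = w_value

    class ToyFrame:
        def __init__(self):
            self.block_stack = []   # entries are ("except", handler_position)

        def setup_except(self, handler_position):
            self.block_stack.append(("except", handler_position))

        def end_try(self):
            self.block_stack.pop()

        def unwind(self, app_exc):
            # pop our own blocks until a handler is found
            while self.block_stack:
                kind, handler_position = self.block_stack.pop()
                if kind == "except":
                    return handler_position   # where the handler's code starts
            raise app_exc   # nothing catches it: propagate to the caller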
There is one problem with "borrowed" objects: If you have a dictionary for instance, which is not implemented by our Python, but from the standard, then this might raise exceptions, which we have to catch and turn into our emulated exceptions. Again the big picture: ---------------------- I don't want to disappoint anybody, but please don't think the initial Python-in-Python interpreter is such a big deal. It is the tip of the iceberg. IMHO, the real hard work is to re-write the whole C library. It is a major subject of the project to turn that into something more flexible, and this is the real target. Whatever path we choose, we need to ensure that we find a connected path to that goal. So what I'm trying to do is to do tiny proof-of-concept things, and try to build a chain of refinements, which leads to a stepwise implementable final project. We need to break the C barrier as early as possible, or I can's see how Python can get rid of CPython. cheers - chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Johannes-Niemeyer-Weg 9a : *Starship* http://starship.python.net/ 14109 Berlin : PGP key -> http://wwwkeys.pgp.net/ work +49 30 89 09 53 34 home +49 30 802 86 56 pager +49 173 24 18 776 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? http://www.stackless.com/ From tismer at tismer.com Sun Jan 19 15:28:57 2003 From: tismer at tismer.com (Christian Tismer) Date: Sun, 19 Jan 2003 15:28:57 +0100 Subject: [pypy-dev] question about core implementation language In-Reply-To: <20030119124958.250375A2CF@thoth.codespeak.net> References: <20030119124958.250375A2CF@thoth.codespeak.net> Message-ID: <3E2AB629.2090700@tismer.com> Hi Stephan, > while reading all these long mails about this exciting project, one basic > question came to my mind: why has the core implementation to be in C? It does not have to. > Years ago I did a little programming in Objective C and I just loved this > language (I'm taking about gcc objc without the framework by NeXT/Apple). > Here are some good things about Objective C: > 1. Legal C code is already legal Objective C code > 2. Objective C has a runtime engine build in. > 3. objects classes don't need to be known at compile time (just their > interfaces) > > Having said that, I could imagine that it's much easier to model python > objects in Objective C than in C, thus making the needed core more compact > and easier to maintain. But the idea is to remove C alltogether, maybe keep it as a target "assembly" language for intermediate compiler output, and re-write everything in Python, in a way that a specializing compiler can choke upon. I have started some prototyping in Python, and I now remember some words of Guido, that Python isn't specially well suited for such a task. I admit, there is some truth in it. I'm sticking with the approach, but I have to fight the wish for a simpler mapping from C to Python. Some primitive sub-language with direct access to primitive objects by type declarations whould make life easier. But we don't want to change the language now, and I try to express everything by objects, which contain the necessary "hints". No idea if this is the concept that finally survives. My vision is that in a sprint, we do a Python interpreter rather quickly and then start to re-implement the basic objects. The resulting code should look improved, not worse. I have not much practice, yet. I will try a few approaches and post them for discussion. 
cheers - chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Johannes-Niemeyer-Weg 9a : *Starship* http://starship.python.net/ 14109 Berlin : PGP key -> http://wwwkeys.pgp.net/ work +49 30 89 09 53 34 home +49 30 802 86 56 pager +49 173 24 18 776 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? http://www.stackless.com/ From edream at tds.net Sun Jan 19 16:36:22 2003 From: edream at tds.net (Edward K. Ream) Date: Sun, 19 Jan 2003 09:36:22 -0600 Subject: [pypy-dev] From PyPy to Psyco References: Message-ID: <001c01c2bfd0$85e14130$bba4aad8@computer> > Bootstrapping issues where quite widely discussed. There are several > valid approaches in my opinion. I would say that we should currently > stick to "Python in Python"'s first goal: to have a Python interpreter up > and running, written entierely in Python. [snip] > There are so many cool things to experiment, I can't wait to have (1) and > (2) ready --- but I guess it's the same for all of us :-) Actually, I believe this _last_ paragraph is the heart of the matter, not the traditional bootstrapping issues. Cockpit warning sounds: Whoop, whoop: war story! Whoop, whoop: war story! :-) By far the most important moment in Leo's 7-year history was the moment that I saw how to begin to use Leo without actually having it. Leo is a combination of an outliner and traditional programming techiques. I had a vague notion that the combination was going to be effective, but I was stuck: building an outliner is a _big_ task, and I wasn't sure exactly what kind of outliner would work well with the programming constructs. I was talking to Rebecca on the way back from a one-day ski outing, vaguely mulling over the problems (she's not a programmer, and she is a great listener :-) when it suddenly struck me that I could use the MORE outliner as an "instant prototype". I would just embed my experimental code in the MORE outline. I would then copy the outline to the clibboard by hand using MORE's copy command. Finally, I would write a little program (M2C for More to C) to take the stuff off the clipboard and create proper C source code that I could then compile. Naturally, the first "outline-oriented" program I wrote in MORE was M2C. This took a few hours. I then simulated by hand the output of M2C on M2C. The result was the C code for M2C. Once I debugged M2C I was in business. It all took less than 2 days. The point is this: I could use MORE _immediately_, even without actually having M2C, and certainly without writing something as complex as the MORE outliner. As soon as I shifted my point of view I was able, within seconds, to experiment with the combination of outlines and literate programming. Within minutes all my doubts about the combination of the two techniques vanished. Within an hour I evolved a new kind of programming style that has remained remarkably constant for over 7 years. Within a few days I had a working prototyping system. (end of war story: transcript of cabin voice recorder ends) I believe something this good can be done with psyco. My ideas: 1. We now have some "safety proofs" in place that show that there is absolutely no need to worry about performance during the initial experimentation/prototyping phase of this project. 2. We already have a superb language tool, namely Python. We must exploit Python to the fullest. 3. 
We want a bootstrapping scheme that gets us (or rather Armin :-) going _now_: preferably within hours or days, and at most within a week. Putting these ideas together, I suggest the following: 1. Ignore all issues relating to the ultimate target language. In other words, use Python as the target language. 2. Ignore all issues relating to speed. Focus instead on the algorythms that psyco will use and all the nifty experiments that Armin wants to run yesturday. Many of these experiments will involve looking at the target code that gets produced from particular programs/byte codes. 3. Modify Python's logic (it may be possible to do with a simple patch written in Python) so that Python looks for .pyp files and loads them as needed before looking for .pyc files. I believe this can be done very quickly. 4. Put nothing but Python code and data into the .pyp files! The "bootstrap loader" is the code that loads .pyp files. It does one of the following: a. an import of the .pyp file (changing its type temporarily to .py presumably) b. an exec on the entire contents of the .pyp file. In either case, some cleverness will be needed so that the import or exec will execute psyco with the proper data. This cleverness is the province of the code emitters... 5. Modify psyco so it outputs Python code, not C or machine code. The "code emitters" write _whatever is useful_ to the .pyp file. The code emitters might use str(x) to dump psyco's x data structure. At worst (if str could not be used), the code emitters would be write the Python data structures used by psyco to the .pyp file _as python code and data_. As I said before, some cleverness may be needed so that the Python code in the .pyp file ends up executing psyco again, but this is "routine cleverness". Armin is free to dump whatever Python code he wants into the .pyp file. There is no need for formal specifications and no need for the Python code to have a consistent format. Just blast away. Presumably, Armin will design the .pyp file so that it is easy to see the results of his experiments. The advantages are these: - This can all be done within hours--days at the most. - There may be no need for further group design work. - This ignores everything that should be ignored, namely all implementation details. - We get the highest-level, most flexible framework for experimentation, namely the Python code and data in .pyp files. This Python code is the highest-level representation of the generated code, and it the clearest possible way to see the results of experimentation. - It is an immediate path to psyco in python. - There is little or no need to create an interp in Python. HTH :-) Edward P.S. Yes, the results of experimentation will be Python code. Yes, the experimental code will run slower (maybe much slower) than .pyc files given to the C interp. That doesn't matter. What _does_ matter is that Armin will be up and running quickly with an extremely clear, powerful and flexible experimental environment. 
For example, the code given in another thread: PyObject* my_function(PyObject* a, PyObject* b, PyObject* c) { int r1, r2, r3; if (a->ob_type != &PyInt_Type) goto uncommon_case; if (b->ob_type != &PyInt_Type) goto uncommon_case; if (c->ob_type != &PyInt_Type) goto uncommon_case; r1 = ((PyIntObject*) a)->ob_ival; r2 = ((PyIntObject*) b)->ob_ival; r3 = ((PyIntObject*) c)->ob_ival; return PyInt_FromLong(r1+r2+r3); } will appear in the .pyp file as something like this: def my_function__(a,b,c): if a.ob_type__ != PyInt_Type__: do_uncommon_case__() if b.ob_type__ != PyInt_Type__: do_uncommon_case__() if c.ob_type__ != PyInt_Type__: do_uncommon_case__() r1 = a.ob_ival__ r2 = b.ob_ival__ r3 = c.ob_ival__ return PyInt_FromLong__(r1+r2+r3) I've added trailing double underscores throughout just to indicate that I don't understand any of the implementation details of psyco in psyco. Presumably the generated prototype Python code will gather lots of statistics. The statistics _themselves_ can be written to the .pyp file as plain Python data structures. EKR -------------------------------------------------------------------- Edward K. Ream email: edream at tds.net Leo: Literate Editor with Outlines Leo: http://personalpages.tds.net/~edream/front.html -------------------------------------------------------------------- From arigo at tunes.org Sun Jan 19 19:50:22 2003 From: arigo at tunes.org (Armin Rigo) Date: Sun, 19 Jan 2003 10:50:22 -0800 (PST) Subject: [pypy-dev] Questions for Armin In-Reply-To: <002d01c2bf74$e7f83470$bba4aad8@computer>; from edream@tds.net on Sat, Jan 18, 2003 at 10:40:33PM -0600 References: <001001c2be5b$d1bf1ee0$bba4aad8@computer> <20030118190053.163AD1F8E@bespin.org> <002d01c2bf74$e7f83470$bba4aad8@computer> Message-ID: <20030119185022.6FCD495E@bespin.org> Hello Edward, On Sat, Jan 18, 2003 at 10:40:33PM -0600, Edward K. Ream wrote: > This is an important theoretical result: the project can never fail due to > the cost of compilation. Yes, with good accounting algorithms we can save and restore some of the already-done work. The current Psyco is far from being able to do this cleanly, so I never really thought about it in depth, but it is certainly a desirable feature for a cleaner Psyco (like what I want to do in this project). In all cases I still think that there is some use for a fast-and-dirty compiler mode; for example, when compiling dynamically-constructed code that will change all the time. > Maybe that vm could be the intermediate code list of gcc? If compilation > speed isn't important, the "compiler" would simply be the front end for gcc. I am a bit afraid of what would have to be done to interface the "core" of GCC with Psyco, but this is mainly because I never digged to deeply into GCC. I am sure it can be done, and it would certainly be a great thing. I am sure that your experience in this domain would be most profitable :-) > provided that you (or rather Guido) is willing to expand the notion of the > "byte code". Yes, I think that we should try to unify the idea of function object vs code object in Python with other similar ideas used internally in CPython. For example, built-in function objects have a pointer to a PyMethodDef structure --- we find again the distinction between the "callable front-end" object and the "implementation-description" structure. 
Ideally, we should have only one all-purpose "function" object type, which holds things like argument names and default values, and any number of "implementation" object types, of which Python code objects would be one example, and PyMethodDef-like objects another. This would let us add other ways to implement functions, like Psyco-emitted machine code objects. Sometimes I wonder whether I should raise the question in python-dev. It seems to me that it helps in various places, e.g. in the help() mecanism which currently cannot guess the argument list for built-in functions. Well, I cannot see how to make it 100% compatible with existing code... Armin From arigo at tunes.org Sun Jan 19 19:50:28 2003 From: arigo at tunes.org (Armin Rigo) Date: Sun, 19 Jan 2003 10:50:28 -0800 (PST) Subject: [pypy-dev] Questions for Armin In-Reply-To: <5.0.2.1.1.20030118223231.00a66020@mail.oz.net>; from bokr@oz.net on Sat, Jan 18, 2003 at 11:25:52PM -0800 References: <001001c2be5b$d1bf1ee0$bba4aad8@computer> <20030118190053.163AD1F8E@bespin.org> <002d01c2bf74$e7f83470$bba4aad8@computer> <5.0.2.1.1.20030118223231.00a66020@mail.oz.net> Message-ID: <20030119185028.5C7A1E06@bespin.org> Hello Bengt, On Sat, Jan 18, 2003 at 11:25:52PM -0800, Bengt Richter wrote: > I'm also picking this place to re-introduce the related > "checkpointing" idea, namely some call into a builtin that > can act like a yield and save all the state that the > compiler/psyco etc have worked up. This might probably be done without Psyco, and would certainly be a nice thing to have. Note that a good Psyco could remove any need for it: most initialization code could theoretically be specialized into something that just creates the necessary data structures without executing any code at all. Sometimes I like to point out that if our OS were written in a high-level language with built-in specializers, they would boot in no more than the time it takes to do the actual I/O that occurs when booting (mainly displaying the login screen and waiting for mouse, keyboard and network input) --- everything else is internal state and can be done lazily. A bientot, Armin. From arigo at tunes.org Sun Jan 19 19:50:30 2003 From: arigo at tunes.org (Armin Rigo) Date: Sun, 19 Jan 2003 10:50:30 -0800 (PST) Subject: [pypy-dev] Re: Restricted language In-Reply-To: <3E29CF36.8020507@tismer.com>; from tismer@tismer.com on Sat, Jan 18, 2003 at 11:03:34PM +0100 References: <3E29CF36.8020507@tismer.com> Message-ID: <20030119185030.7E7BC922@bespin.org> Hello Christian, On Sat, Jan 18, 2003 at 11:03:34PM +0100, Christian Tismer wrote: > Would you propose to "start dumb", literally like > the C source, and add abstractions after the initial > thing works, or does this have to happen in the first > place? I would suggest that wherever we feel that CPython is stuck with a "bad" way to express things, let's think a moment or two if there is a cleaner way to do it Pythonically. Only if no consensus is found, we stick with the CPython way. > One thing is the lack of a switch statement in > Python, which either leads to zillions of elifs, > or to the use of function tables and indexing. For the main loop in eval_frame(), I would say use a list or a dict of functions. It is more flexible because it would let us experiment with adding opcodes dynamically. With some specific support from the "static compiler" it can later be translated into a regular C switch. > Another thing is common for-loops in C. > Almost all of them which I tried to translate > into Python became while-loops. 
Is that ok? Here again I would say use "for i in range(...)" whenever it is clearly what the C code means. Compared to "while i < ...", it has the advantage that if "..." is a complex expression it tells that this expression can be computed only once. In C you would have to use workarounds to help the compiler in this case. > Data types. > How do we model the data types which are used > internally by Python? We need a common abstraction for all objects, like a base PyObject class with an ob_type property, maybe nothing more. In a first place we can implement objects staightforwardly with the corresponding basic Python objects. Then we can provide alternate object implementations which look like CPython's implementations. We must allow still other implementations to be added later. The first phase would be done with a single class which maps attribute manipulation and method calls to an internal, "real" Python object. This may only work for objects like lists and dicts which we have a great deal of control over from Python; it may not be sufficient for frame objects, for example. > I was thinking of some basic classes which describe > primitive data types, like signed/unsigned integers, > chars, pointers to primitives, and arrays of > primitives. Then I would build everything upon these. > Is that already too low-level? Something along these lines. Maybe a little bit higher-level with no explicit pointers, mutable/immutable flags, and arrays that know their length (althought out-of-bounds checks are not guaranteed, e.g. the no-debug C implementation would not have them). Required pointer indirections can often be deduced automatically from this info; e.g. tuples can store the array of items in-place because it is of immutable length. Lower-level hints may later be added or experimented with (e.g. in the current CPython implementation, dicts have a small cache area that is only used if it is small enough, while lists don't). I feel that a good way to find out which level of abstraction we should target would be to think that we may later emit not C code but OCaml code (for example). > This means clearly to me, that I should *not* > repeat the Py_INCREF/Py_DECREF story from Python, > but we need to do a more abstract formulation of > that, which allows us to specify it in any desired > way. I believe that the above data representation (with maybe the help of more flags) should be enough to deduce how to make a reference-counting interpreter. It may occasionally contain more Py_INCREF/Py_DECREF than the hand-tuned CPython, but I say never mind. If it is a real issue in a specific case then we can always fix it manually with more hints. > I know I have Python, I would most probably not use > C string constants all around, but use Python strings. When the interpreter uses Python strings internally just because handling C strings is more complex, then of course don't repeat the PyString_FromString() calls. We will see later how internal string handling may be translated to C. Reserve PyString_FromString() for places where we need a real object visible from the interpreted program. > This is just the beginning of a whole can of worms > to be considered. Just to start it somehow :-) Sure :-) By the way, I feel that if one routine deserves some special treatment (like refactoring) it is the main loop in eval_frame(). For example, using exceptions to signal "break" or "continue" statements. 
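In code, that example might look roughly like this (a sketch only; the helper names are invented):

    # Sketch: 'break' and 'continue' signalled by exceptions inside the
    # main loop, instead of C-style status codes.
    class BreakLoop(Exception):
        pass

    class ContinueLoop(Exception):
        pass

    def op_break_loop(frame):
        raise BreakLoop

    def op_continue_loop(frame):
        raise ContinueLoop

    def run_loop_body(frame, body):
        # 'body' is the list of operations making up one loop iteration;
        # the interpreted loop keeps going until some operation raises
        # BreakLoop.
        while True:
            try:
                for operation in body:
                    operation(frame)
            except ContinueLoop:
                continue
            except BreakLoop:
                break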
We already mentioned catching "EPython" exceptions raised by called functions to signal an exception visible in the program we interpret (instead of the "if result!=NULL" trick), and using a table of functions instead of a big switch. If I think about Psyco it is also the place where extra code must be added, like checking the code object for an already-compiled version or collecting statistics. I guess this is also where special treatment is required for a stackless-style CPS interpreter.

So I would say, as a general rule, let's be as Pythonic as we like for the main loop, but let's keep globally close to the original C code for everything else. This is also crucial for compatibility. (Yes, I think we will also be able to emit C code almost fully binary compatible with CPython and its extension modules.)

A bientot, Armin.

From arigo at tunes.org Sun Jan 19 20:51:55 2003 From: arigo at tunes.org (Armin Rigo) Date: Sun, 19 Jan 2003 11:51:55 -0800 (PST) Subject: [pypy-dev] From PyPy to Psyco In-Reply-To: <001c01c2bfd0$85e14130$bba4aad8@computer>; from edream@tds.net on Sun, Jan 19, 2003 at 09:36:22AM -0600 References: <001c01c2bfd0$85e14130$bba4aad8@computer> Message-ID: <20030119195155.DDE774B5D@bespin.org>

Hello Edward,

On Sun, Jan 19, 2003 at 09:36:22AM -0600, Edward K. Ream wrote: > 3. Modify Python's logic (it may be possible to do with a simple patch > written in Python) so that Python looks for .pyp files and loads them as > needed before looking for .pyc files. I believe this can be done very > quickly.

I am not convinced by the whole .pyp idea, but it can certainly be experimented with. I regard it as an optimization only --- which might be very worthwhile, but that's not the point. We should avoid any kind of optimization at all for the current Python-in-Python interpreter. Similarly, there is no need to support .pyc files to claim CPython compatibility.

Armin

From arigo at tunes.org Sun Jan 19 20:51:56 2003 From: arigo at tunes.org (Armin Rigo) Date: Sun, 19 Jan 2003 11:51:56 -0800 (PST) Subject: [pypy-dev] question about core implementation language In-Reply-To: <20030119124958.250375A2CF@thoth.codespeak.net>; from stephan.diehl@gmx.net on Sun, Jan 19, 2003 at 01:50:25PM +0100 References: <20030119124958.250375A2CF@thoth.codespeak.net> Message-ID: <20030119195156.6B9084B52@bespin.org>

Hello Stephan,

On Sun, Jan 19, 2003 at 01:50:25PM +0100, Stephan Diehl wrote: > Here are some good things about Objective C: > 1. Legal C code is already legal Objective C code > 2. Objective C has a runtime engine built in. > 3. object classes don't need to be known at compile time (just their > interfaces)

This makes it a good candidate to experiment with. Remember, the purpose of this project is not to write the interpreter in C but in Python, and use C as an intermediate assembly-level language. I talked about targeting other intermediate languages for their runtime engines, like OCaml which has a very good GC, but that will come later. Objective C and Java are also potential targets (though I must admit I prefer cleaner high-level languages).

A bientot, Armin.
From arigo at tunes.org Sun Jan 19 20:51:59 2003 From: arigo at tunes.org (Armin Rigo) Date: Sun, 19 Jan 2003 11:51:59 -0800 (PST) Subject: [pypy-dev] Restricted language In-Reply-To: <3E2AB357.5030503@tismer.com>; from tismer@tismer.com on Sun, Jan 19, 2003 at 03:16:55PM +0100 References: <3E29CF36.8020507@tismer.com> <20030119134510.A2661@prim.han.de> <3E2AB357.5030503@tismer.com> Message-ID: <20030119195159.CBE654B52@bespin.org>

Hello Christian,

On Sun, Jan 19, 2003 at 03:16:55PM +0100, Christian Tismer wrote: > > Redoing the basic types can be deferred IMO. > > Basically we need to do typeobject.c in python, right? > > I thought so, too, but I'm not sure if my > overview of the whole thing is good enough > to judge this early.

I'm not sure typeobject.c has a special role to play here. We might at first just rely on built-in Python objects to implement all our classes, including type objects.

> > I am still thinking about Rocco's issue of how to > > "differentiate" host system (CPython) exceptions from > > exceptions of the code we interpret in python. My first > > take is to catch any exception, rewrite the traceback > > information (using the python-frames) and reraise it.

Here we are confusing the two levels of Python. Let's call them the interpreter-level and the application-level. Each application-level object is emulated by an interpreter-level instance of a class like

    class PyObject:
        v_ob_type = property(...)
        ...

Let's compare this with a C compiler which must manage a complex data structure for every variable of the C program it compiles. The PyObject class is this data structure. If a variable of the C program is found to be a constant, then the C compiler will internally use a special structure which holds (among other management data) the constant immediate value as a C-compiler-level value. So let's call an "immediate" a C-program-level value which is directly implemented by a C-compiler-level value. Non-immediate values would be e.g. variables that are stored in the stack.

Similarly, if we want to implement, say, dictionaries using real Python dictionaries, it is an application-level object that must be implemented using a more complex structure which holds, among other things, a real interpreter-level dictionary. It's just the same as immediates in C programs:

    class ImmediateObject(PyObject):
        def __init__(self, ob):
            ...

An application-level dictionary (say created by the BUILD_DICT opcode) is constructed as "ImmediateObject({})". In a first phase we need only ImmediateObjects because (almost?) all application-level objects can be implemented this way.

Now let's talk about exceptions. There are also application-level exceptions (the ones that can be caught by except: in the interpreted application) and interpreter-level exceptions (e.g. a bug in the interpreter raising some IndexError). The former must be emulated, not used directly at the interpreter-level. It is just a coincidence that an application-level exception generally also means the stack of calls made by the interpreter must be popped up to the main loop's block handlers. To perform the latter we need a new exception:

    class EPython(Exception):
        pass

Then the immediate translation of the CPython code

    PyErr_SetString(PyExc_IndexError, "index out of bounds");
    return NULL;

is

    SetException(ImmediateObject(IndexError),
                 ImmediateObject("index out of bounds"))
    raise EPython

where the ImmediateObject()s are here because an application-level exception stores application-level values.
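Filling in the '...' above, a runnable toy version of these interpreter-level classes could look as follows; SetException, the interpreter_state dictionary and the exact spelling of the property are assumptions made for this sketch, not established code:

    class PyObject(object):
        """Base of all interpreter-level emulations of application-level objects."""

    class ImmediateObject(PyObject):
        """An application-level object implemented directly by a real CPython object."""
        def __init__(self, ob):
            self.ob = ob
        def _gettype(self):
            # always an application-level type object, never a bare CPython type
            return ImmediateObject(type(self.ob))
        v_ob_type = property(_gettype)

    class EPython(Exception):
        """Interpreter-level signal: an application-level exception is propagating."""

    interpreter_state = {}   # toy stand-in for something like CPython's PyThreadState

    def SetException(v_type, v_value):
        # both arguments are application-level values, i.e. PyObject instances
        interpreter_state['exc_type'] = v_type
        interpreter_state['exc_value'] = v_value

    # the translation of:  PyErr_SetString(PyExc_IndexError, "..."); return NULL;
    def subscript_out_of_bounds():
        SetException(ImmediateObject(IndexError),
                     ImmediateObject("index out of bounds"))
        raise EPython

    try:
        subscript_out_of_bounds()
    except EPython:
        assert interpreter_state['exc_value'].ob == "index out of bounds"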
The main loop catches EPython exceptions. If we want to follow the C code less closely but be more Pythonic, we can drop the above SetException() which would store information into something like CPython's PyThreadState. Instead, we can directly embed this information into the EPython class:

    class EPython(Exception):
        def __init__(self, v_type, v_value):
            ...

I prefix the argument names with 'v_' to remind us that an application-level object is expected (i.e. an instance of PyObject), not e.g. ValueError or a string. Then the code becomes

    raise EPython(ImmediateObject(IndexError),
                  ImmediateObject("index out of bounds"))

which I believe clearly shows the distinction between application-level and interpreter-level exceptions.

Oh, and if you got it right, you must see why the 'v_ob_type' property of 'class PyObject' earlier in this mail must never hold a real type like 'int' or 'list', but instead an application-level type object like 'ImmediateObject(int)' or 'ImmediateObject(list)'.

A bientot, Armin.

From edream at tds.net Sun Jan 19 23:16:55 2003 From: edream at tds.net (Edward K. Ream) Date: Sun, 19 Jan 2003 16:16:55 -0600 Subject: [pypy-dev] question about core implementation language References: <20030119124958.250375A2CF@thoth.codespeak.net> <3E2AB629.2090700@tismer.com> Message-ID: <004a01c2c008$7a48d4e0$bba4aad8@computer>

> I have started some prototyping in Python, and I now > remember some words of Guido, that Python isn't specially > well suited for such a task. I admit, there is some > truth in it.

I don't understand Guido's comment. I personally think Python is absolutely superb for any prototyping endeavor.

Perhaps I am being dense, but the fact that Python doesn't have feature x of language y seems totally a non-issue. You want a switch statement? Write a psyco_switch function (in Python, of course). Ditto for anything else your heart desires. Eventually _everything_ that generates code is going to become assembly code, but that is no problem at all for any compiler.

Perhaps I am confusing levels of discussion or implementation. However, the way I see it, the most important thing is to do experimentation with algorithms. To do that, the essential thing is to compare various ways of doing things, at as high a level as possible. In other words, a psyco_switch routine provides a good enough model for the task at hand. Ditto for any other construct you want.

Actually, for experimentation (perhaps leading to inspiration) I wouldn't necessarily confine myself to any particular level of implementation or design or whatever. I would simply blast away using whatever language tools appear to be most useful. Armin has talked about doing high-level work with discovering list optimizations, for example. I don't understand what the lack of a Python switch statement has to do with such matters.

In short, I would hardly bother at all with questions of the level at which Python is modeling some implementation. I would use whatever works in Python and worry about mapping that back to a final implementation only after it is clear that Python in Python will be a big win. At that time we will have lots more data, energy and incentive to do the needed grunt work. Until then, I wouldn't let implementation problems get in the way of invention.

Just my $0.02.
Edward

From logistix at zworg.com Mon Jan 20 00:23:03 2003 From: logistix at zworg.com (logistix) Date: Sun, 19 Jan 2003 18:23:03 -0500 Subject: [pypy-dev] question about core implementation language Message-ID: <200301192323.h0JNN39h021858@overload3.baremetal.com>

"Edward K. Ream" wrote:
> I don't understand Guido's comment. I personally think Python is absolutely
> superb for any prototyping endeavor.
> [...]
> Until then, I wouldn't let implementation problems get in the way of invention.

I think the point is that Python as a language provides absolutely no access to the machine internals. You can't examine memory contents, registers, I/O bus or call code directly. As part of the bootstrap process the following four functions should probably be added to the interpreter (either as builtin functions or a module):

    GetMemoryFromOS()
    ReturnMemoryToOS()
    Peek()
    Poke()

And probably:

    get/putRegister()
    callCode()

Other than that, everything could (eventually) be written in Python.
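None of these functions exist anywhere yet; purely as an illustration of the intended interface, here is a pure-Python stand-in that fakes the raw memory with ordinary lists (a real version would of course be a thin C or assembler module):

    class FakeMemory(object):
        """Pure-Python mock of the proposed raw-memory primitives."""
        def __init__(self):
            self.blocks = {}          # base address -> list of byte values
            self.next_addr = 0x1000

        def GetMemoryFromOS(self, size):
            addr = self.next_addr
            self.blocks[addr] = [0] * size
            self.next_addr = addr + size
            return addr

        def ReturnMemoryToOS(self, addr):
            del self.blocks[addr]

        def _find(self, addr):
            for base, block in self.blocks.items():
                if base <= addr < base + len(block):
                    return block, addr - base
            raise ValueError("address not mapped")

        def Peek(self, addr):
            block, offset = self._find(addr)
            return block[offset]

        def Poke(self, addr, value):
            block, offset = self._find(addr)
            block[offset] = value & 0xFF

    mem = FakeMemory()
    p = mem.GetMemoryFromOS(16)
    mem.Poke(p + 3, 0x41)
    assert mem.Peek(p + 3) == 0x41
    mem.ReturnMemoryToOS(p)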
From bokr at oz.net Mon Jan 20 01:02:05 2003 From: bokr at oz.net (Bengt Richter) Date: Sun, 19 Jan 2003 16:02:05 -0800 Subject: [pypy-dev] Questions for Armin In-Reply-To: <20030119185022.6FCD495E@bespin.org> References: <002d01c2bf74$e7f83470$bba4aad8@computer> <001001c2be5b$d1bf1ee0$bba4aad8@computer> <20030118190053.163AD1F8E@bespin.org> <002d01c2bf74$e7f83470$bba4aad8@computer> Message-ID: <5.0.2.1.1.20030119141754.00a6d030@mail.oz.net>

At 10:50 2003-01-19 -0800, Armin Rigo wrote:
>[...]
>Yes, I think that we should try to unify the idea of function object vs code
>object in Python with other similar ideas used internally in CPython. For
>example, built-in function objects have a pointer to a PyMethodDef
>structure --- we find again the distinction between the "callable
>front-end" object and the "implementation-description" structure.
>
>Ideally, we should have only one all-purpose "function" object type, which
>holds things like argument names and default values, and any number of
>"implementation" object types, of which Python code objects would be one
>example, and PyMethodDef-like objects another. This would let us add other
>ways to implement functions, like Psyco-emitted machine code objects.
>
>Sometimes I wonder whether I should raise the question in python-dev. It
>seems to me that it helps in various places, e.g. in the help() mechanism which
>currently cannot guess the argument list for built-in functions. Well, I
>cannot see how to make it 100% compatible with existing code...

ISTM there is a general concept of dynamic representation management (Python can give DRM a new meaning ;-) coming out of the mist. In C, type vs representation is almost 1:1 (i.e., type names identify memory layouts with bits and words etc), but with Python and psyco there are multiple ways of physically representing the same abstract entity. I'd like to push for separating the concepts of type and representation better in discussion.

What I'm getting at is separating "representation-type" from "abstraction-type". E.g., a Python object pointer in C may implicitly encode an abstract tuple of (type, id, value) or

    class PyPtr:
        __slots__ = ['oType', 'oId', 'oValue']

and we can discuss separately how to pack the info of the abstraction into a 32-bit word with Huffman tricks and addressing of type-implying allocation arenas etc., or whatever.

But maybe there's another level. I'm wondering whether the most primitive object representation should have a slot for an indication of what kind of representation is being used. E.g.,

    class Primo:
        __slots__ = [
            'abstraction_type',     # might say integer, but not int vs long vs bignum
            'entity_id',            # identifies abstract instance being represented
            'representation_type',  # implies a representation_interpreter, maybe CPU
            'representation_data'   # suitable for representation_interpreter to find it
        ]

In other words, multiple Primo instances with the same entity_id could be specifying multiple abstractly equivalent but concretely different representations of the same object, e.g., a particular integer being represented in various ways, maybe even as machine code to move a particular representation from one place to another. Entity_id might be encoded as a chain of pointers through sibling Primo instances representing the same abstract instance entity.
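A tiny sketch of that idea of sibling representations; the concrete field values and the consistency check are invented for illustration:

    class Primo(object):
        __slots__ = ['abstraction_type', 'entity_id',
                     'representation_type', 'representation_data']

        def __init__(self, abstraction_type, entity_id,
                     representation_type, representation_data):
            self.abstraction_type = abstraction_type
            self.entity_id = entity_id
            self.representation_type = representation_type
            self.representation_data = representation_data

    # two concretely different but abstractly equivalent representations
    # of the same integer entity
    five_as_int = Primo('integer', 42, 'cpython-int', 5)
    five_as_text = Primo('integer', 42, 'ascii-digits', '5')

    siblings = [five_as_int, five_as_text]
    # all siblings share one entity_id; updating one representation would
    # have to invalidate or lazily re-derive the others
    assert len(set([p.entity_id for p in siblings])) == 1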
I think there can also be representation_types that are partial representations, something like a database view, or a C++ pointer cast to refer to data members of a base class part of an instance representation. This brings up relationships of multiple representations when they diverge from 1:1 representations of the full abstract info. Any full and valid representation is abstractly equivalent to another, but if one representation is updated, siblings must be invalidated or re-validated (some "view" might not be affected, another representation might be easy and worthwhile to update, like a small part of a complex object, but others might be cheaper to mark for disposal or lazy update). ISTM Python involves multiple concrete representations of types while also trying to unify the abstract aspects, and Psyco only adds to the need for a clear way to speak of Dynamic Representation Management (not Digital Rights Management ;-) issues. I hope I have triggered some useful thoughts, even though so far I only know of Psyco indirectly from these discussions (will try to correct that sometime soon ;-) Regards, Bengt Richter From bokr at oz.net Mon Jan 20 01:26:24 2003 From: bokr at oz.net (Bengt Richter) Date: Sun, 19 Jan 2003 16:26:24 -0800 Subject: [pypy-dev] Questions for Armin In-Reply-To: <20030119185028.5C7A1E06@bespin.org> References: <5.0.2.1.1.20030118223231.00a66020@mail.oz.net> <001001c2be5b$d1bf1ee0$bba4aad8@computer> <20030118190053.163AD1F8E@bespin.org> <002d01c2bf74$e7f83470$bba4aad8@computer> <5.0.2.1.1.20030118223231.00a66020@mail.oz.net> Message-ID: <5.0.2.1.1.20030119161417.00a75370@mail.oz.net> Hello Armin, At 10:50 2003-01-19 -0800, Armin Rigo wrote: >Hello Bengt, > >On Sat, Jan 18, 2003 at 11:25:52PM -0800, Bengt Richter wrote: >> I'm also picking this place to re-introduce the related >> "checkpointing" idea, namely some call into a builtin that >> can act like a yield and save all the state that the >> compiler/psyco etc have worked up. > >This might probably be done without Psyco, and would certainly be a nice thing >to have. Note that a good Psyco could remove any need for it: most >initialization code could theoretically be specialized into something that >just creates the necessary data structures without executing any code at all. There seems to be something I missed. Could you clarify how such specialized versions persist so they don't have to be redone? I.e., how do you get from an original .py source-only representation to the specialized form, and how does the latter come to exist? I.e., is this a new form of incrementally updated .pyc? >Sometimes I like to point out that if our OS were written in a high-level >language with built-in specializers, they would boot in no more than the time >it takes to do the actual I/O that occurs when booting (mainly displaying the >login screen and waiting for mouse, keyboard and network input) --- everything >else is internal state and can be done lazily. If this means dynamic incremental revisions of system files, it must be a whole new class of security issues to nail down, or am I misconstruing? 
Regards,
Bengt Richter

From tismer at tismer.com Mon Jan 20 01:47:50 2003 From: tismer at tismer.com (Christian Tismer) Date: Mon, 20 Jan 2003 01:47:50 +0100 Subject: [pypy-dev] question about core implementation language In-Reply-To: <004a01c2c008$7a48d4e0$bba4aad8@computer> References: <20030119124958.250375A2CF@thoth.codespeak.net> <3E2AB629.2090700@tismer.com> <004a01c2c008$7a48d4e0$bba4aad8@computer> Message-ID: <3E2B4736.6070602@tismer.com>

Edward K. Ream wrote: > I have started some prototyping in Python, and I now > remember some words of Guido, that Python isn't specially > well suited for such a task. I admit, there is some > truth in it. > > > I don't understand Guido's comment. I personally think Python is absolutely > superb for any prototyping endeavor.

It is not superb if you want to directly map stuff from a completely different language, which C is. But as I said, this is only tempting. My conclusions are quite different, tho. ...

> Perhaps I am confusing levels of discussion or implementation. However, the > way I see it, the most important thing is to do experimentation with > algorithms. To do that, the essential thing is to compare various ways of > doing things, at as high a level as possible. In other words, a > psyco_switch routine provides a good enough model for the task at hand. > Ditto for any other construct you want.

Yes, I agree. That's not the problem. The problem is simply this: The algorithms written in C are written very very well. This is highly tested code, evolved for years, and it is as good as thinkable, given the known restrictions, like that it has to be C. My (our) problem is now, that we have to transliterate this quality code into Python, hopefully not destroying anything. Alongside, we have to invent lots of new objects on-the-fly, as there appear so many structs and things, while you try to write a pythonic Python implementation of a C module.

I tried this today with a module which I felt urgent to try to define for this project: frameobject.c . This module is less than 1000 lines, and it is one of the C modules which I know the best, due to my stackless work. I tried to map this module on a 3.5-hour journey from Kiel to Berlin, and I had one fourth done by an hour. Nevertheless, I got into trouble, just by comparing its implementation differences between 2.2.2 and 2.3a.

Then I dropped that and decided that this is the wrong way. It is not a simple task to do a good mapping into Python, although I know exactly how I would map this and that. But only the fact that the python-dev crew is working hard on all the C code makes them into our biggest enemies: They are changing the C code all the time, in all places, getting us into the role to program after them all the time.

Between the lines, I recognized that I still would know the mapping that I wanted to apply from C to Python. I just hate to write this by hand, for similar reasons why I abandoned Stackless 1.0 a long time ago, simply since keeping track of changes is a PITA, unless *your* code is in the core. So what I am thinking to start instead is quite different, with the same target, but one level higher. I will post that after first experiments.

> Actually, for experimentation (perhaps leading to inspiration) I wouldn't > necessarily confine myself to any particular level of implementation or > design or whatever. I would simply blast away using whatever language tools > appear to be most useful. Armin has talked about doing high-level work with > discovering list optimizations, for example.
I don't understand what the > lack of a Python switch statement has to do with such matters. Simply try to implement something. Yes, it is fine. Try to re-implement ceval.c in an afternoon. It is 3900 lines of code, well-written C code, a huge switch statement, and you find yourself writing all those case cases as tiny functions inside a class. Sure it is possible. What the matter is? I have to change something substantial, there is no trivial mapping, but a more complicated one. > In short, I would hardly bother at all with questions of the level at which > Python is modeling some implementation. I would use whatever works in > Python and worry about mapping that back to a final implementation only > after it is clear that Python in Python will be a big win. At that time we > will have lots more data, energy and incentive to do the needed grunt work. > Until then, I wouldn't let implementation problems get in the way of > invention. I had the same POV before I started to try coding. You are absolutely right, the implementation does not matter so much. In extent, if it doesn't matter too much, then there is no need to do this by hand. Please give me two days to try this, then I will hopefully show a way how to do it *not by hand*. all the best -- chris (too old to do this by hand) From Nicolas.Chauvat at logilab.fr Mon Jan 20 01:57:49 2003 From: Nicolas.Chauvat at logilab.fr (Nicolas Chauvat) Date: Mon, 20 Jan 2003 01:57:49 +0100 Subject: [pypy-dev] Restricted language In-Reply-To: <3E29CF36.8020507@tismer.com> References: <3E29CF36.8020507@tismer.com> Message-ID: <20030120005749.GA25015@logilab.fr> On Sat, Jan 18, 2003 at 11:03:34PM +0100, Christian Tismer wrote: > Hi Armin, Sorry, I'm not armin either. > Yes, we should begin with the RISP processor :-) > (Reduced Instruction Set Python). As already said in private, I like that name. Very descriptive :-) > Data types. > How do we model the data types which are used > internally by Python? Somebody already showed > ... > and cannot borrow lists, tuples and dicts and > "Lift them up" into the new level. > Instead, these need to be built from a minimum > Python "object set" as well. > How far should we go down? As already said on the list, I like Mozart/Oz but not its syntax. Here are two links that I hope useful: 1. oz kernel language http://www.mozart-oz.org/documentation/tutorial/node1.html#label2 It says that other language constructs are built on top of that. 2. hierarchy of primary types http://www.mozart-oz.org/documentation/tutorial/node3.html#label14 What about reusing the idea of record that serves as a basis for a lot of higher-level constructs (dictionnaries, tuples, classes and objects, etc.) See also http://www.mozart-oz.org/documentation/tutorial/node3.html#label19 IMHO, the goal we are trying to achieve can not be a new idea. Looking around for research papers and research languages is a good way not to completely reinvent the wheel. HTH. -- Nicolas Chauvat http://www.logilab.com - "Mais o? est donc Ornicar ?" - LOGILAB, Paris (France) From tismer at tismer.com Mon Jan 20 03:27:31 2003 From: tismer at tismer.com (Christian Tismer) Date: Mon, 20 Jan 2003 03:27:31 +0100 Subject: [pypy-dev] How to translate 300000 lines of C Message-ID: <3E2B5E93.8050803@tismer.com> Dear list, I already announced some concern in a recent message. [Edward, I need you for this, at least for advice!] 
Part One: Making you frightened about the code size
---------------------------------------------------

Running the following command over the current Python CVS src/dist directory:

    wc $(find . -name '*.c' -or -name '*.h')

gives this result today (January 20, 2003, 2:31 GMT+01:00):

    319282 1132750 9397985 total

Ok, this is about everything in the core distribution, whether it is needed for Minimal Python (whatever that is) or not. Let's roughly shrink it down to 150.000 lines. This is 150.000 lines of well-written, tested, evolved, really good C code.

Now, a crowd of maybe 5-10 people is going to meet in a sprint by the end of February, trying to translate a relevant amount of this mountain of code into Python? Really working Python? Won't they get bored?

I can do 1000 to 3000 lines per day, when re-coding into Python as a prototype. With debugging and code quality stuff, I'm down to 500 or less. Let's assume we have 10 people of comparable caliber. Given that we work for 10 days full-time, nobody being ill, not accounting for the parties we probably will have, everything being perfect, then we *might* have 50.000 lines of quality code done in that period. I don't really believe in such a great success, it will probably be much less, since programming in groups does not scale well, sorry. There will be lots of overhead, discussions, misunderstandings, personal problems, I will probably get shot, so let's expect 10.000 to 20.000 good Python program lines. Now, think of code like ceval.c which is alone 3900 lines of code, and not the most simple code.

Of course, we can create a serious new interpreter, with all "borrowed" objects wrapped in a proper way quite quickly, and I still think this is a good idea. But I think we can do much better. And most probably, people will not get bored to do the implementation, see part three.

Part Two: Making you frightened about the C code
------------------------------------------------

No offense to the python-dev people (also, since I do belong to this group a little bit), the C code base is absolutely great. As a code base written in C, of course.

But I would like to encourage everybody to pick some medium-sized C source file and try to translate it into Python. It is possible, and it isn't too difficult. But it makes you stumble and stumble and stumble. The more you look at it, the more you recognize that it is quite near to assembly language. Everything is written down, expanded in some rather efficient way, there is not much abstraction. There is no inheritance, but there are lots of repetitions of similar but not identical code. You are confronted with exceptions which need some mapping. You see all the primitive types being used all the time, and you'll wonder how to map them. (Yes, we can set up general directives for how to do that). You will also find lots of structures which need to be implemented. Finally, you find myriads of builtin Py...stuff...() runtime functions which you need to emulate somehow.

Then, looking at the frequency of python-checkins, you will find that your translation work will be voided in the near future. python-dev is improving things all the time, and you will be kept busy for a lifetime to adjust your Python version. This might come to an end if the core developers finally decide to drop the C implementation in favor of our new project. But this can only happen if we are fast enough!
Part Three: Proposing A Radical Consequence
-------------------------------------------

I see no point in wasting man-years of coding to re-invent the wheel by assembling piece-to-piece from C code to Python code. For sure, there are some very relevant modules which might need to be hand-coded.

But, and this is driven by the summary of what I thought to re-code by hand today: I believe that it is possible to automate this translation process! We can set up some default mappings for the most frequent C constructs. There are a number of free-ware C compilers around, and also some C interpreters. My vision since today is now to augment such a compiler to become a Python extension, and then run this compiler over all the C code. The Python extension should then try to provide a re-write of the C code in Python!

There are some simple rules to be obeyed, which come off the top of my head and can be changed as needed, just to give an example:

For every structure that appears in the source, emit an appropriate class definition, based upon a base class that is designed to handle structures.
For every switch statement, create an according number of local functions (indeed making use of the new scopes), and prepare a dispatcher table for all the functions.
For every simple for loop, create an xrange construct.
For every non-simple for loop, create a while loop with a break condition.
For every simple type instantiation, create a similar object that derives from a class that describes such simple types.
For every macro constant, use a constant notation.
For every macro function, provide a Python function.
Remove every Py_INCREF and every Py_DECREF. Instead, let's automate that, using more reference counts than necessary, since this can be deduced by a good code generator, later. The interpreter doesn't do it differently, anyway.

This list is by far not complete.

Addition: For every C module, provide an extra Python module that is able to override some of the automatic decisions of above.

Special example: For ceval.c, overwrite all the specialized opcode implementations which try to optimize integer operations. These should not be written by hand any longer, but they are the objective of Psyco's specializing features.

That's what I'm saying today: Make the move from C to Python automatic, by 95 percent. Let's modify a C compiler to do most of the tedious tasks for us. Try to use pattern matching to remove more of the specializations done for the sake of C. Remove C-specific optimizations and optimize for abstractions. Then, we can try to re-target, create C code or assembly from that.

My proposal right now is: Let's write (or change) such a compiler which emits fairly good scripts, and then let's add modifications which make these into really good scripts. With some luck, these will withstand the high frequency of python-dev's code changes, too.

Wow, this was a lot of storm in my brain for today. I hope it made some sense -- cheers - chris

From pedronis at bluewin.ch Mon Jan 20 04:15:44 2003 From: pedronis at bluewin.ch (Samuele Pedroni) Date: Mon, 20 Jan 2003 04:15:44 +0100 Subject: [pypy-dev] How to translate 300000 lines of C References: <3E2B5E93.8050803@tismer.com> Message-ID: <00c101c2c032$3865f920$6d94fea9@newmexico>

As a maybe relevant data point: Current Jython CVS (2.2 functionality minus at the moment missing type/class unification plus java specific integration code) is around 60000 lines of java.
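To make the mapping rules proposed a few paragraphs up a little more concrete, here is one hypothetical before-and-after pair; both the imaginary C fragment (shown as comments) and the Python a translator might emit are invented for illustration only:

    # hypothetical C input:
    #     typedef struct { long start; long stop; long step; } rangeinfo;
    #     switch (kind) {
    #     case KIND_LIST:  return build_list(ri);
    #     case KIND_TUPLE: return build_tuple(ri);
    #     }

    class CStruct(object):
        """Base class the translator could emit for every C struct."""
        _fields_ = ()
        def __init__(self, *values):
            for name, value in zip(self._fields_, values):
                setattr(self, name, value)

    class rangeinfo(CStruct):
        _fields_ = ('start', 'stop', 'step')

    KIND_LIST, KIND_TUPLE = 0, 1          # the former C macro constants

    def _case_KIND_LIST(ri):
        return list(range(ri.start, ri.stop, ri.step))

    def _case_KIND_TUPLE(ri):
        return tuple(range(ri.start, ri.stop, ri.step))

    _kind_dispatch = {KIND_LIST: _case_KIND_LIST,     # the former switch statement
                      KIND_TUPLE: _case_KIND_TUPLE}

    def build_range_object(kind, ri):
        return _kind_dispatch[kind](ri)

    assert build_range_object(KIND_LIST, rangeinfo(0, 10, 2)) == [0, 2, 4, 6, 8]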
From hpk at trillke.net Mon Jan 20 04:40:04 2003 From: hpk at trillke.net (holger krekel) Date: Mon, 20 Jan 2003 04:40:04 +0100 Subject: [pypy-dev] Restricted language In-Reply-To: <20030119195159.CBE654B52@bespin.org>; from arigo@tunes.org on Sun, Jan 19, 2003 at 11:51:59AM -0800 References: <3E29CF36.8020507@tismer.com> <20030119134510.A2661@prim.han.de> <3E2AB357.5030503@tismer.com> <20030119195159.CBE654B52@bespin.org> Message-ID: <20030120044004.E2661@prim.han.de>

[armin on using a simple PyObject emulation class ...]

> Let's compare this with a C compiler which must manage a complex data > structure for every variable of the C program it compiles. The PyObject class > is this data structure. If a variable of the C program is found to be a > constant, then the C compiler will internally use a special structure which > holds (among other management data) the constant immediate value as a > C-compiler-level value. So let's call an "immediate" a C-program-level value > which is directly implemented by a C-compiler-level value. Non-immediate > values would be e.g. variables that are stored in the stack.

using Bengt's suggestion we could say that an "immediate" C-program-level value is represented by a C-compiler-level value.

> Similarly, if we want to implement, say, dictionaries using real Python > dictionaries, it is an application-level object that must be implemented using > a more complex structure which holds, among other things, a real > interpreter-level dictionary. It's just the same as immediates in C programs:
>
>     class ImmediateObject(PyObject):
>         def __init__(self, ob):
>             ...

your class definition "tags" a CPython object as an "immediate" object, right?

> An application-level dictionary (say created by the BUILD_DICT opcode) is > constructed as "ImmediateObject({})". In a first phase we need only > ImmediateObjects because (almost?) all application-level objects can be > implemented this way.

ImmediateObjects would be tagged CPython objects. I don't see immediate use, though :-)

> Now let's talk about exceptions. There are also application-level exceptions > (the ones that can be caught by except: in the interpreted application) and > interpreter-level exceptions (e.g. a bug in the interpreter raising some > IndexError). The former must be emulated, not used directly at the > interpreter-level. It is just a coincidence that an application-level > exception generally also means the stack of calls made by the interpreter must > be popped up to the main loop's block handlers. To perform the latter we need > a new exception:
>
>     class EPython(Exception):
>         pass
>
> Then the immediate translation of the CPython code
>
>     PyErr_SetString(PyExc_IndexError, "index out of bounds");
>     return NULL;
>
> is
>
>     SetException(ImmediateObject(IndexError),
>                  ImmediateObject("index out of bounds"))
>     raise EPython

IMHO the python level interpreter should not deal with exceptions like this. E.g. CPython's BINARY_SUBSCR bytecode implementation raises the above exception as an optimization to avoid calling the generic PyObject_GetItem (which does lots of checks itself). But i'd like to implement BINARY_SUBSCR like this:

    def BINARY_SUBSCR(self):
        w = self.valuestack.pop()
        v = self.valuestack.pop()
        self.valuestack.push(w.__getitem__(v))

With CPython this could raise an exception if 'v' would be "out of bound" or 'w' doesn't implement the getitem-protocol or whatever.
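For context, a toy frame with just enough state to run that opcode; everything here is invented for illustration, and the operand order simply follows the sketch above:

    class ValueStack(object):
        def __init__(self):
            self.items = []
        def push(self, x):
            self.items.append(x)
        def pop(self):
            return self.items.pop()

    class ToyFrame(object):
        def __init__(self):
            self.valuestack = ValueStack()

        def BINARY_SUBSCR(self):
            w = self.valuestack.pop()
            v = self.valuestack.pop()
            self.valuestack.push(w.__getitem__(v))

    frame = ToyFrame()
    frame.valuestack.push(1)             # v: the index, pushed first
    frame.valuestack.push(['a', 'b'])    # w: the object being subscripted
    frame.BINARY_SUBSCR()
    assert frame.valuestack.pop() == 'b'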
The tricky part (to me) is catching the (CPython-) exception and executing the application-level except/finally code in the current bytecodestring. I have done some stuff with "exception/finally" handling wrapped into an object in this CPython-Hack: http://codespeak.net/moin/moin.cgi/IndentedExecution It introduces an object (and strange xml-ish syntax :-) which wraps except/finally handling so that the 'except:' or 'finally:' code is *not* in the current bytecode-string. If reading it, don't look at the namespace interaction bits, they are distracting for our context. I think i'd like to try out an approach where no bytecode needs more than a few lines of python implementation and exception handling is easy to understand. IMO it is especially important to me that all the optimization stuff (like in BINARY_SUBSCR) is removed. Hopefully only the main dispatching loop will have to deal with exceptions & their (re)presentation to the application-level ... cheers, holger From hpk at trillke.net Mon Jan 20 05:06:56 2003 From: hpk at trillke.net (holger krekel) Date: Mon, 20 Jan 2003 05:06:56 +0100 Subject: [pypy-dev] How to translate 300000 lines of C In-Reply-To: <3E2B5E93.8050803@tismer.com>; from tismer@tismer.com on Mon, Jan 20, 2003 at 03:27:31AM +0100 References: <3E2B5E93.8050803@tismer.com> Message-ID: <20030120050656.F2661@prim.han.de> [Christian Tismer Mon, Jan 20, 2003 at 03:27:31AM +0100] > > Running the following command over the current Python CVS > src/dist direcotry: > > wc $(find . -name '*.c' -or -name '*.h') > > gives this result today (Januaray 20, 2003, 2:31 (GMT+01.00) > > 319282 1132750 9397985 total > > Ok, this is about everything in the core distribution, may > it be needed for Minimal Python (whatever it is) or not. > Let's roughly shrink it down to 150.000 lines. I am not sure what this number (300.000) really means at all. For example, 100.000 lines of it are in the Modules directory (not counting their .h files). And getting a running Python-Python-Interpreter doesn't require rewriting all C-stuff. > Now, think of code like ceval.c which is alone 3900 lines > of code, and not the most simple code. If this wouldn't translate to less than 1000 lines of nice python code i would be surprised. > Of course, we can create a serious new interpreter, with > all "borrowed" objects wrapped in a proper way quite > quickly, and I still think this is a good idea. good :-) > But I think we can do much better. > And most probably, people will not get bored to do the > implementation, see part three. > > Part Two: Making you frightened about the C code > ------------------------------------------------ > > No offense to the python-dev people (also, since I do > belong to this group a little bit), the C code base > is absolutely great. As a code base written in C, of course. > > But I would like to encourage everybody to pick some > medium-sized C source file and try to translate it into > Python. It is possible, and it isn't too difficult. > But it makes you stumble and stumble and stumble. btw, I would think there is two orders of magnitude more python code out there than python C-extensions. > [more analysis of how much good C-code there is] > Part Three: Proposing A Radical Consequence > ------------------------------------------- > > I see no point in wasting manjears of coding to re-invent > the whell by assembling piece-to-piece from C code to > Python code. > For sure, there are some very relevant modules which might > need to be hand-coded. 
> But, and this is driven by the summary of what I thought > to re-code by hand today: > I believe that it is possible to automate this translation > process! > We can set up some default mappings for the most frequent C > constructs. > There are a number of free-ware C compilers around, and also > some C interpreters. > My vision since today is now to augment such a compiler > to become a Python extension, and then run this compiler > over all the C code. > The Python extension should then try to provide a re-write > of the C code in Python! > ... > That's what I'm saying today: > Make the move from C to Python automatic, by 95 percent.

Now *this* seems like a huge undertaking which requires dealing with C parsers for starters. Don't get me wrong but i don't believe in this route, yet. But i will do as you say and spend more time recoding some stuff in python and try getting it to work. Maybe you are right.

cheers, holger

From edream at tds.net Mon Jan 20 10:01:02 2003 From: edream at tds.net (Edward K. Ream) Date: Mon, 20 Jan 2003 03:01:02 -0600 Subject: [pypy-dev] Re: How to translate 300000 lines of C References: <3E2B5E93.8050803@tismer.com> Message-ID: <001f01c2c062$7572a1d0$bba4aad8@computer>

> I already announced some concern in a recent message. > > [Edward, I need you for this, at least for advice!]

Thanks for the vote of confidence. Whenever I am confronted with a problem that seems big, ugly and messy I devote myself to trying to find a way of avoiding it altogether. I suggest you do the same, focusing on what it is that psyco does at "runtime" to discover optimizations.

Sorry if this sounds a bit like the Delphic oracle. The only advice I can give is to change your point of view so that a) somehow the problem goes away or b) somehow you can use already existing tools to make it much easier. Actually, you've done both in your original post. I would simply encourage you to keep doing that :-)

Edward -------------------------------------------------------------------- Edward K. Ream email: edream at tds.net Leo: Literate Editor with Outlines Leo: http://personalpages.tds.net/~edream/front.html --------------------------------------------------------------------

From theller at python.net Mon Jan 20 10:09:16 2003 From: theller at python.net (Thomas Heller) Date: 20 Jan 2003 10:09:16 +0100 Subject: [pypy-dev] How to translate 300000 lines of C In-Reply-To: <3E2B5E93.8050803@tismer.com> References: <3E2B5E93.8050803@tismer.com> Message-ID:

Hm, maybe people could take responsibility for one or two modules and rewrite them in Python? I would try the structmodule, for example...

Thomas

From arigo at tunes.org Mon Jan 20 11:28:02 2003 From: arigo at tunes.org (Armin Rigo) Date: Mon, 20 Jan 2003 11:28:02 +0100 Subject: [pypy-dev] How to translate 300000 lines of C In-Reply-To: <3E2B5E93.8050803@tismer.com> References: <3E2B5E93.8050803@tismer.com> Message-ID: <20030120102802.GA8176@magma.unil.ch>

Hello Christian,

On Mon, Jan 20, 2003 at 03:27:31AM +0100, Christian Tismer wrote: > I believe that it is possible to automate this translation > process!

Yes! I think it is a very good idea. I would certainly be much more happy with keeping a reasonable-sized translator up-to-date than having to do so with the huge C code base.
Let's be clear, we cannot automate the whole translation process, and setting this up might take as long as manually translating most of CPython, but I am confident that it will be a big win afterwards (and I am sure you know it better than me, having discovered it the hard way). The point is not to blindly translate the C code into Python code that is guaranteed to do the same thing. Instead, we need to discover the high-level structure of the C code and map this to Python. It should be relatively easy given that the whole CPython code follows consistent style guidelines. All we need is a C parser; translation could be done from the resulting syntax tree.

> For every switch statement, create an according number of > local functions (indeed making use of the new scopes), and > prepare a dispatcher table for all the functions.

Maybe just write a chain of if:elif:. New scopes are not completely sufficient because they won't let us modify a variable from the parent scope.

> For every macro constant, use a constant notation. > For every macro function, provide a Python function.

Yes. In no case should we preprocess the C code to replace the macros by their definition. This would mean losing essential high-level information.

> Addition: > For every C module, provide an extra Python module that is > able to override some of the automatic decisions of above.

Yes. Never change the emitted Python code directly, it would prevent us from keeping up-to-date with CPython. We need some way to give hints to the translator (small hints or whole hand-tuned versions of some functions). Attaching a Python module to each C source file looks like a good way to do it, although we might also consider adding the hints directly into the C source at the point where they apply, as C comments (or #ifdef'ed-away lines). An advantage of this is that CVS will warn us in case of conflicts between our hints and CPython updates. Well, maybe there is a need for both inline hints and attached Python modules.

> For ceval.c, overwrite all the specialized opcode implementations > which try to optimize integer operations. These should not > be written by hand any longer, but they are the objective of > Psyco's specializing features.

Yes, although I would say that the main loop deserves some special treatment. There is no need, for example, to copy the code that calls Py_MakePendingCalls() every _Py_CheckInterval bytecode instructions. This is a parallel aspect that we might want to add or not later, like reference counting. The big switch should be special-cased into a bundle of frame methods with the dispatch table. The Python-in-Python interpreter main loop should be hand-written. Each opcode function is itself produced by the C-to-Python translator unless otherwise specified.

> My proposal right now is: Let's write (or change) such a > compiler which emits fairly good scripts, and then let's > add modifications which make these into really good scripts.

I believe you are absolutely right.

A bientot, Armin.
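A sketch of what such a hand-written main loop over a dispatch table might look like; the class layout and the tiny opcode set are invented for illustration, and EPython is the interpreter-level exception discussed earlier in this digest:

    class EPython(Exception):
        """Interpreter-level signal that an application-level exception is unwinding."""

    class ToyFrame(object):
        def __init__(self, code):
            self.code = code            # a list of (opcode_name, argument) pairs
            self.stack = []
            self.next_instr = 0
            self.running = False
            self.result = None

        # --- one method per former 'case' of the big C switch ---
        def LOAD_CONST(self, arg):
            self.stack.append(arg)

        def BINARY_ADD(self, arg):
            w = self.stack.pop()
            v = self.stack.pop()
            self.stack.append(v + w)

        def RETURN_VALUE(self, arg):
            self.result = self.stack.pop()
            self.running = False

        # --- the hand-written main loop, dispatching through a dict ---
        def run(self):
            dispatch = {'LOAD_CONST': self.LOAD_CONST,
                        'BINARY_ADD': self.BINARY_ADD,
                        'RETURN_VALUE': self.RETURN_VALUE}
            self.running = True
            while self.running:
                opname, arg = self.code[self.next_instr]
                self.next_instr += 1
                try:
                    dispatch[opname](arg)
                except EPython:
                    # an application-level exception: here the block handlers of
                    # the interpreted program would be searched; this toy version
                    # simply lets it escape
                    raise
            return self.result

    frame = ToyFrame([('LOAD_CONST', 2), ('LOAD_CONST', 3),
                      ('BINARY_ADD', None), ('RETURN_VALUE', None)])
    assert frame.run() == 5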
From arigo at tunes.org Mon Jan 20 11:41:01 2003 From: arigo at tunes.org (Armin Rigo) Date: Mon, 20 Jan 2003 11:41:01 +0100 Subject: [pypy-dev] question about core implementation language In-Reply-To: <200301192323.h0JNN39h021858@overload3.baremetal.com> References: <200301192323.h0JNN39h021858@overload3.baremetal.com> Message-ID: <20030120104101.GA8753@magma.unil.ch> Hello Logistix, On Sun, Jan 19, 2003 at 06:23:03PM -0500, logistix wrote: > I think the point is that Python as a language provides absolutely no > access to the machine internals. You can't examine memory contents, > registers, I/O bus or call code directly. It doesn't matter. It has to be built on the top of *some* basic abstractions, but these don't have to be as low-level as that. The interpreter proper can be written in pure Python, and only specific "non-borrowing" implementations need more (e.g. to implement a list as an array of PyObject* with a length field, or to implement the time.time() function by calling a Posix- or Windows-specific OS function). Ideally, we can provide various implementations of the same concept upon various lower-level concepts. We should not stick to one given low-level abstraction. We might for example give two "list" implementations, one based on "memory block" objects that correspond closely to malloc()ed blocks, and one as a Lisp-like chained list based on couples (2-tuples). In all implementations we should somehow express what other abstractions this implementation depends on. The final code emitters can then choose which implementation(s) they will use depending on what low-level abstractions are provided by the target language (e.g. targetting C, the "memory block" objects are ideal; for Java we might reuse its built-in concept of array). A bientot, Armin. From stephan.diehl at gmx.net Mon Jan 20 12:06:52 2003 From: stephan.diehl at gmx.net (Stephan Diehl) Date: Mon, 20 Jan 2003 12:06:52 +0100 Subject: [pypy-dev] Proposal Message-ID: <20030120110624.EB2E25A280@thoth.codespeak.net> Hi all, I'm a little bit confused at the moment about what you really try to achive and I'm not too sure if everybody is talking about the same thing. Anyway, since two of the founders of this pypy discussion are meeting in person this Friday I'd propose that I grill them a little bit and write up some paper about it (and post it here, of course). That's all Stephan From boyd at strakt.com Mon Jan 20 12:35:07 2003 From: boyd at strakt.com (Boyd Roberts) Date: Mon, 20 Jan 2003 12:35:07 +0100 Subject: [Fwd: Re: [pypy-dev] Re: psyco in Python was: Minimal Python project] Message-ID: <3E2BDEEB.7060506@strakt.com> Hey, can we get a Reply-to: on this list? -------- Original Message -------- Subject: Re: [pypy-dev] Re: psyco in Python was: Minimal Python project Date: Mon, 20 Jan 2003 11:13:12 +0100 From: Boyd Roberts To: Christian Tismer References: <3E2855C7.5010107 at gmx.de> <3E289AB8.105 at tismer.com> Christian Tismer wrote: > Thomas Fanslau wrote: > > ... > >> And don't forget the obvious replacement of a multiply by some >> shifts, so multiply by 5 can be replaced by a copy, a right shift 2 >> places and a add instead of a costly multiply ... > > > Ok, my 2 Eurocent: > > Also don't forget that this is not always true. Yes, on the VAX a _floating point_ multiply was faster than a shift. 
From boyd at strakt.com Mon Jan 20 12:47:32 2003 From: boyd at strakt.com (Boyd Roberts) Date: Mon, 20 Jan 2003 12:47:32 +0100 Subject: [pypy-dev] question about core implementation language References: <20030119124958.250375A2CF@thoth.codespeak.net> <20030119195156.6B9084B52@bespin.org> Message-ID: <3E2BE1D4.6030001@strakt.com>

> > > A bientot,
> >

A bientôt, please (or, for completeness, à bientôt).

From mwh at python.net Mon Jan 20 13:29:41 2003 From: mwh at python.net (Michael Hudson) Date: 20 Jan 2003 12:29:41 +0000 Subject: [pypy-dev] Re: [Fwd: Re: Re: psyco in Python was: Minimal Python project] References: <3E2BDEEB.7060506@strakt.com> Message-ID: <2m1y38t362.fsf@starship.python.net>

Boyd Roberts writes: > Hey, can we get a Reply-to: on this list?

Please God, no... oh hang on I'm reading this through gmane; I don't care :)

More seriously, you might want to consider pointing your newsreader at news.gmane.org and reading the list that way; newsreaders tend to cope with discussion lists rather better than mailers (for obvious reasons). Or "get a better MUA"; gnus is good for discussion lists, largely because it is a newsreader that also does mail.

Cheers, M.

From tismer at tismer.com Mon Jan 20 14:13:13 2003 From: tismer at tismer.com (Christian Tismer) Date: Mon, 20 Jan 2003 14:13:13 +0100 Subject: [pypy-dev] Proposal In-Reply-To: <20030120110624.EB2E25A280@thoth.codespeak.net> References: <20030120110624.EB2E25A280@thoth.codespeak.net> Message-ID: <3E2BF5E9.2090006@tismer.com>

Stephan Diehl wrote: > Hi all, > > I'm a little bit confused at the moment about what you really try to achive > and I'm not too sure if everybody is talking about the same thing. > Anyway, since two of the founders of this pypy discussion are meeting in > person this Friday I'd propose that I grill them a little bit and write up > some paper about it (and post it here, of course).

We are rewriting Python in Python. Stepwise. First, getting a minimal system up and running, replacing the core modules by hand. Finally, we want to lift all of the C code to Python, in order to do certain things with that, like trying different ways to compile it back into C, another VM, Psyco, trying different versions of the object model, all of that. And yes, there should be a way to downsize Python, reasonably, then.

Grilled Python in Perlchicken sauce :-)

-- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Johannes-Niemeyer-Weg 9a : *Starship* http://starship.python.net/ 14109 Berlin : PGP key -> http://wwwkeys.pgp.net/ work +49 30 89 09 53 34 home +49 30 802 86 56 pager +49 173 24 18 776 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? http://www.stackless.com/

From stephan.diehl at gmx.net Mon Jan 20 14:20:43 2003 From: stephan.diehl at gmx.net (Stephan Diehl) Date: Mon, 20 Jan 2003 14:20:43 +0100 Subject: [pypy-dev] Proposal In-Reply-To: <3E2BF5E9.2090006@tismer.com> References: <20030120110624.EB2E25A280@thoth.codespeak.net> <3E2BF5E9.2090006@tismer.com> Message-ID: <20030120132016.E3F7A5A280@thoth.codespeak.net>

On Monday 20 January 2003 14:13, you wrote: > Stephan Diehl wrote: > > Hi all, > > > > I'm a little bit confused at the moment about what you really try to > > achive and I'm not too sure if everybody is talking about the same thing.
> > Anyway, since two of the founders of this pypy discussion are meeting in > > person this Friday I'd propose that I grill them a little bit and write > > up some paper about it (and post it here, of course). > > We are rewriting Python in Python. > Stepwise. First, getting a minimal > system up and running, replacing the > core modules by hand. > Finally, we want to lift all of the C code > to Python, in order to do certain things > with that, like trying different ways to > compile it back into C, another VM, Psyco, > trying different versions of the object > model, all of that. > And yes, there should be a way to downsize > Python, reasonably, then.

Ahh, but don't think I won't ask you anything about this :-)

> > Grilled Python in Perlchicken sauce :-)

Sounds tasty. Is it available at Butter Lindner?

From edream at tds.net Mon Jan 20 14:32:22 2003 From: edream at tds.net (Edward K. Ream) Date: Mon, 20 Jan 2003 07:32:22 -0600 Subject: [pypy-dev] How to translate 300000 lines of C References: <3E2B5E93.8050803@tismer.com> <20030120102802.GA8176@magma.unil.ch> Message-ID: <001c01c2c088$73ca2cb0$bba4aad8@computer>

> On Mon, Jan 20, 2003 at 03:27:31AM +0100, Christian Tismer wrote: > > I believe that it is possible to automate this translation > > process! > > Yes! I think it is a very good idea. I would certainly be much more happy > > with keeping a reasonable-sized translator up-to-date than having to do so > > with the huge C code base.

In a private email to Christian I suggested making this whole problem go away by changing the name of this project from minimalPython to psycoticPython :-)

Whether automated or not, translating tested C code to Python seems extremely difficult and risky. It is risky because it implies one of two speculative assumptions:

1. The Python library will eventually outperform the C library, or:
2. Guido will at some point approve supporting _two_ versions of the same library.

I view assumption 2 as having almost zero probability, though of course I don't speak for Guido in any way. The reason is plain: it is odious to keep two sets of source code in synch.

That leaves assumption 1. No point in arguing over the probabilities of it now: let's assume it will be proved correct. I would be inclined to pick _one_ module to work on as a test bed. Translation can be done by hand. We can then test assumption 1.

The bigger translation problem becomes real only if assumption 1 is proved to be true. Even then, I would imagine a _lengthy_ probationary period for each translated module before it becomes accepted into the library. So it isn't so important how long translation takes; the translation process is much less important than the testing process.

My script c2py.py works only on translating C to Python syntax. It's complex enough. The history of machine translation of natural languages is littered with initial failure, in some cases with limited success after decades of work. Myself, I wouldn't invest any time at all in automatically translating C semantics to Python semantics. YMMV.

Edward -------------------------------------------------------------------- Edward K.
Ream email: edream at tds.net Leo: Literate Editor with Outlines Leo: http://personalpages.tds.net/~edream/front.html --------------------------------------------------------------------

From hpk at trillke.net Mon Jan 20 14:53:51 2003 From: hpk at trillke.net (holger krekel) Date: Mon, 20 Jan 2003 14:53:51 +0100 Subject: [pypy-dev] How to translate 300000 lines of C In-Reply-To: ; from theller@python.net on Mon, Jan 20, 2003 at 10:09:16AM +0100 References: <3E2B5E93.8050803@tismer.com> Message-ID: <20030120145351.J2661@prim.han.de>

[Thomas Heller Mon, Jan 20, 2003 at 10:09:16AM +0100] > Hm, maybe people could take responsibility for one or two modules > and rewrite them in Python?

I think this is a good idea. Christian's idea of automatically translating CPython source to python code sure makes sense. But there will be quite some modules which we have to manually code and the structmodule is certainly among them.

regarding the automatic translation: At the last FOSDEM conference i talked to Richard Dale who wrote a tool (IIRC "Koala") which automatically generates 5 or 6 bindings for the KDE/QT-library directly from the C++ code, for Java, Objective C and others. Although he was very experienced it took him a year or so to get it going. But maybe "translating" CPython code is considerably simpler than this.

Anyway, i think we can start from 'both sides'. Manually rewriting stuff in python and work on a C-to-Python translator.

cheers, holger

From Nicolas.Chauvat at logilab.fr Mon Jan 20 11:56:24 2003 From: Nicolas.Chauvat at logilab.fr (Nicolas Chauvat) Date: Mon, 20 Jan 2003 11:56:24 +0100 Subject: [pypy-dev] Restricted language In-Reply-To: <20030120005749.GA25015@logilab.fr> References: <3E29CF36.8020507@tismer.com> <20030120005749.GA25015@logilab.fr> Message-ID: <20030120105635.D7BD91BC43@licorne.logilab.fr>

[...] > > As already said on the list, I like Mozart/Oz but not its syntax. Here > are two links that I hope useful: > > 1. oz kernel language > http://www.mozart-oz.org/documentation/tutorial/node1.html#label2 > > It says that other language constructs are built on top of that. > > 2. hierarchy of primary types > http://www.mozart-oz.org/documentation/tutorial/node3.html#label14 > > What about reusing the idea of record that serves as a basis for a lot > of higher-level constructs (dictionnaries, tuples, classes and objects, > etc.) See also > http://www.mozart-oz.org/documentation/tutorial/node3.html#label19

This is really interesting and definitely goes in the direction Stackless might have taken in the future (that is a network transparent runtime environment).

> > IMHO, the goal we are trying to achieve can not be a new idea. Looking > around for research papers and research languages is a good way not to > completely reinvent the wheel.

I don't know if this helps (probably confuses more than anything else :-) The other day I had a look at a Smalltalk environment (Squeak. For the interested German reader, there is a Smalltalk article in the latest c't). Anyway, the interesting part is that the Smalltalk runtime engine is written in, guess what, Smalltalk. What you are really running is of course compiled C code, but the Smalltalk environment is able to compile the virtual machine Smalltalk code to C.

Stephan

> > HTH.
From tismer at tismer.com Mon Jan 20 15:16:00 2003 From: tismer at tismer.com (Christian Tismer) Date: Mon, 20 Jan 2003 15:16:00 +0100 Subject: [pypy-dev] How to translate 300000 lines of C In-Reply-To: <001c01c2c088$73ca2cb0$bba4aad8@computer> References: <3E2B5E93.8050803@tismer.com> <20030120102802.GA8176@magma.unil.ch> <001c01c2c088$73ca2cb0$bba4aad8@computer> Message-ID: <3E2C04A0.2050301@tismer.com> Edward K. Ream wrote: > On Mon, Jan 20, 2003 at 03:27:31AM +0100, Christian Tismer wrote: > >>I believe that it is possible to automate this translation >>process! > > Yes! I think it is a very good idea. I would certainly be much more > > happy > >>with keeping a reasonable-sized translator up-to-date than having to do so >>with the huge C code base. > > > In a private email to Christian I suggested making this whole problem go > away by changing the name of this project from minimalPython to > psycoticPython :-) Oh, I didn't get that until now. :-) > Whether automated or not, translating tested C code to Python seems > extremely difficult and risky. It is risky because it implies one of two > speculative assumptions: > > 1. The Python library will eventually outperform the C library or: > 2. Guido will at some point approve supporting _two_ versions of the same > library. > > I view assumption 2 as having almost zero probability, though of course I > don't speek for Guido in any way. The reason is plain: it is odious to keep > two sets of source code in synch. > > That leaves assumption 1. No point in arguing over the probabilities of it > now: let's assume it is will be proved correct. I would be inclined to pick > _one_ module to work on as a test bed. Translation can be done by hand. We > can then test assumption 1. Fine with me. > The bigger translation problem becomes real only if assumption 1 is proved > to be true. Even then, I would imagine a _lengthy_ probationary period for > each translated module before it becomes accepted into the library. So it > isn't so important how long translation takes; the translation process is > much less important than the testing process. That's very true. The testing process will probably take longer as one or two new Python versions. We have to run in parallel for a reasonable period. That's why we need a semi-automated process that is easy to use on a changed code base. > My script c2py.py works only on translating C to Python syntax. It's > complex enough. The hisory of machine translation of natural languages is > littered with initial failure, in some cases with limited success after > decades of work. Myself, I wouldn't invest any time at all in automatically > translating C semantics to Python semantics. YMMV. Well, it is not C, it is Pythonic C already. That's much simpler than C. (Which means, it doesn't use every and all possible trick in C, it has cleanly seperated statements, very little usage of macros, all ambiguous looking constructs are well-embraced) I also don't think to automatically translate the whole bunch without looking into the output. Instead, I think of a C parser which emits a series of tokens, or maybe AST objects, which is then fed into a Python code generator. This generator should only provide some common rules how to map certain constructs. It should stop in a situation it cannot handle. The porting work is to write configuration scripts for that, which control what to map how. 
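As a concrete illustration of the parser-plus-code-generator idea just described, here is a toy skeleton of such a pipeline in Python. Everything in it (the node shapes, the rule names, the Untranslatable exception) is invented for the example; the point is only that constructs without a mapping rule stop the generator instead of being guessed at, and that per-file configuration can supply extra rules.

    class Untranslatable(Exception):
        """Raised when no mapping rule covers a C construct."""

    class PyCodeGen:
        def __init__(self, extra_rules=None):
            # per-file "configuration scripts" would populate extra_rules
            self.rules = {"return": self.emit_return, "call": self.emit_call}
            self.rules.update(extra_rules or {})

        def emit(self, node):
            handler = self.rules.get(node["kind"])
            if handler is None:
                raise Untranslatable("no rule for %r" % node["kind"])
            return handler(node)

        def emit_return(self, node):
            return "return %s" % self.emit(node["value"])

        def emit_call(self, node):
            args = ", ".join(self.emit(a) for a in node["args"])
            return "%s(%s)" % (node["name"], args)

    # a tiny hand-made "AST" for the C statement:  return PyInt_FromLong(x);
    tree = {"kind": "return",
            "value": {"kind": "call", "name": "PyInt_FromLong",
                      "args": [{"kind": "name", "id": "x"}]}}

    gen = PyCodeGen(extra_rules={"name": lambda node: node["id"]})
    print(gen.emit(tree))    # prints: return PyInt_FromLong(x)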
I think this is quite an interactive process, but with the benefit that it is most probably repeatable for a slightly changed new Python version. There are also common patterns which should be replaced by some more abstract Python functions, which describe *what* is happening, instead of always telling *how* to do it, in an inlined way. This is what I call "uplifting". This is of course no quick process. The automated tool will help us to avoid tedious work, and to avoid errors by systematic mappings. And we can play with that and configure and fine tune, until the result looks as we like it. Not meantioning all the new ideas which we will have while we're at it. Right now, everything is an oracle. ciao - chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Johannes-Niemeyer-Weg 9a : *Starship* http://starship.python.net/ 14109 Berlin : PGP key -> http://wwwkeys.pgp.net/ work +49 30 89 09 53 34 home +49 30 802 86 56 pager +49 173 24 18 776 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? http://www.stackless.com/ From boyd at strakt.com Mon Jan 20 15:11:32 2003 From: boyd at strakt.com (Boyd Roberts) Date: Mon, 20 Jan 2003 15:11:32 +0100 Subject: [Fwd: Re: [pypy-dev] Re: psyco in Python was: Minimal Python project] References: <3E2BDEEB.7060506@strakt.com> <20030120135020.GA27487@logilab.fr> Message-ID: <3E2C0394.8040504@strakt.com> Nicolas Chauvat wrote: >Probably not a good idea. Ask google for "reply-to considered harmful". > I knew that response would turn up. From tismer at tismer.com Mon Jan 20 15:21:47 2003 From: tismer at tismer.com (Christian Tismer) Date: Mon, 20 Jan 2003 15:21:47 +0100 Subject: [pypy-dev] How to translate 300000 lines of C In-Reply-To: <20030120145351.J2661@prim.han.de> References: <3E2B5E93.8050803@tismer.com> <20030120145351.J2661@prim.han.de> Message-ID: <3E2C05FB.7000405@tismer.com> holger krekel wrote: ... > Anyway, i think we can start from 'both sides'. > Manually rewriting stuff in python and work on > a C-to-Python translator. Absolutely, everybody should work on the ends she sees fit. To get something quickly, we cannot rely on C2P right now. To get everything translated, we cannot wait for the manpower to do it by hand. Furthermore, for certain modules it makes very much sense to write them by hand, *and* we probably need some as a template, reference implementations for targetting the C2P processor. Plenty to do, trying all paths in parallel will move the project forward. It also cannot all be planned in advance, I'm thinking more of evolution and Extreme Programming style than of classical design. will-be-extreme-programming-fun -y y'rs chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Johannes-Niemeyer-Weg 9a : *Starship* http://starship.python.net/ 14109 Berlin : PGP key -> http://wwwkeys.pgp.net/ work +49 30 89 09 53 34 home +49 30 802 86 56 pager +49 173 24 18 776 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? 
http://www.stackless.com/ From arigo at tunes.org Mon Jan 20 17:38:52 2003 From: arigo at tunes.org (Armin Rigo) Date: Mon, 20 Jan 2003 17:38:52 +0100 Subject: [pypy-dev] How to translate 300000 lines of C In-Reply-To: <3E2C04A0.2050301@tismer.com> References: <3E2B5E93.8050803@tismer.com> <20030120102802.GA8176@magma.unil.ch> <001c01c2c088$73ca2cb0$bba4aad8@computer> <3E2C04A0.2050301@tismer.com> Message-ID: <20030120163852.GA13514@magma.unil.ch> Hello, On Mon, Jan 20, 2003 at 03:16:00PM +0100, Christian Tismer wrote: > >Whether automated or not, translating tested C code to Python seems > >extremely difficult and risky. It is risky because it implies one of two > >speculative assumptions: > > > >1. The Python library will eventually outperform the C library or: > >2. Guido will at some point approve supporting _two_ versions of the same > >library. I'm not sure these are the fundamental assumptions. The goal we have here is to write Python in Python. The translator we are debating about is only a tool to achieve this goal in a way that greatly helps keeping our source in sync with CPython's. In fact that's precisely because we don't want to support two versions of the same code that we need some help from such a tool. Again, it is out of the question to design a tool that reliably translates arbitrary C code to Python. The goal is to use simple rules and hand-made patterns to emit Python code, and then check *all* the emitted Python code and fine-tune it if needed --- with configuration scripts fed to the translator, not by directly changing the emitted Python code. What we can then do with such a Python-in-Python interpreter (e.g. emit good C code again) is another story. A bientôt, Armin. From chux at houston.rr.com Mon Jan 20 18:47:24 2003 From: chux at houston.rr.com (Charles Crain) Date: Mon, 20 Jan 2003 11:47:24 -0600 Subject: [pypy-dev] SCons build tool Message-ID: <000001c2c0ac$01a16ee0$1401a8c0@technobill> Hi. I have been lurking on the pypy list for a while, and I happened to see a message under the subject "bootstrapping issues" that caught my attention: There is the idea of not using make/configure/automake but a simple understandable debuggable (read: python based) build environment. It just so happens that I'm part of the team developing SCons, which is a 100% pure Python make replacement based on the winning proposal for the Software Carpentry Project build tool. It's very full-featured, and includes many features that have to be hand-rolled with the traditional make/configure/automake solution. For more information, see www.scons.org. Oh, and it's free as in both beer and speech, of course. -Charles P.S. I don't subscribe to the email list, I only check the archives every once in a while, so please copy any replies directly to my email. Thanks.
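For readers who have not seen it, an SCons build description is itself a Python script, run by the scons command rather than by Python directly. A minimal SConstruct might look roughly like this; the target and source file names below are placeholders, not actual Minimal Python files.

    # SConstruct
    env = Environment(CCFLAGS='-O2')
    env.Program(target='minipy', source=['main.c', 'bootstrap.c'])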
From daniels at dsl-only.net Mon Jan 20 19:33:45 2003 From: daniels at dsl-only.net (Scott David Daniels) Date: Mon, 20 Jan 2003 10:33:45 -0800 Subject: [pypy-dev] Re: pypy-dev Digest, Vol 9, Issue 3 In-Reply-To: <20030118214149.F07305A422@thoth.codespeak.net> References: <20030118214149.F07305A422@thoth.codespeak.net> Message-ID: <3E2C4109.4020108@dsl-only.net> pypy-dev-request at codespeak.net wrote: Armin Rigo wrote (among other things): > def my_function(a,b,c): > return a+b+c > > the emitted machine code looks like what you would obtain by compiling this: > > PyObject* my_function(PyObject* a, PyObject* b, PyObject* c) > { > int r1, r2, r3; > if (a->ob_type != &PyInt_Type) goto uncommon_case; > if (b->ob_type != &PyInt_Type) goto uncommon_case; > if (c->ob_type != &PyInt_Type) goto uncommon_case; > r1 = ((PyIntObject*) a)->ob_ival; > r2 = ((PyIntObject*) b)->ob_ival; > r3 = ((PyIntObject*) c)->ob_ival; > return PyInt_FromLong(r1+r2+r3); > } Here we need to be careful about boundary conditions and wierd cases. On a 16 bit machine (for easy reading), 0x7fff + 0x7fff should not go negative. While sign checks may work on two args, they certainly won't work for three: 0x70ff - 0x7fff + 0x0005. Also on Bengt Richter 's: > I'm also picking this place to re-introduce the related > "checkpointing" idea, namely some call into a builtin that > can act like a yield and save all the state that the > compiler/psyco etc have worked up. Perhaps some kind of .pyk > for (python checkpoint) file that could resume from where > the checkpoint call was. I believe it can be done if the > interpreter stack(s) is/are able to be encapsulated and a > little restart info can be stored statically and the machine > stack can unwind totally out of main and the C runtime exit, > so that coming back into C main everyting can be picked up > again. I did a checkpointing system for the SAIL language long ago and far away. The truly nasty part of a checkpoint is replicating the environment outside of the address space. Not only do you have exception state, but you also have * file state: Does the file even exist anymore? Do output files get re-opend in append mode, or re-created with copies of data? Is the input file really the same? Do you _know_ that? Do you simple seek and go? * system state: Which interrupts are armed, enabled, or suspended and pointed to what code in SAIL's case. I think Python's will be more about what callbacks are set and how to "passivate" things like Tkinter. * Hardware state I: Stop the program with a checkpoint after printing half a recipe. Two days later run the checkpoint. The printer is not longer ready to finish printing. This is not as nasty as: You have issued two bytes of a three-byte operation to an I/O device. Post-Checkpoint the second time could be quite entertaining. This is why I decided checkpoint would cause a stop and run the checkpoint: _slightly_ safer. * Hardware state II: Stop the program with a checkpoint after printing half a recipe. Install a new printer. Run the checkpoint. * Hardware state III: (Thank heaven for the old days). Stop the program with a checkpoint. Replace the CPU. Run the checkpoint. * Hardware state IV: Stop the program with a checkpoint. E-mail the checkpoint to a friend who runs it ont his zyglot-2000. Checkpoints are absolutely wonderful. You can build systems that had several development phases, each represented by a checkpoint. 
In fact IMSSS at Stanford built some large systems that way: We avoided elaborate macros to build data structures by building theim in the first checkpoint phase. We actually had three levels of authors, two of whom were used to starting at a particular checkpoint and building it to the next checkpoint. The final checkpoint was delivered as our product. In my experience, there is too much outside the language system's control to provide the nice straghtforward thing you (or really your users) want when they hear "checkpoint." Programs often only check environmental things once, assuming they can never change for the life of the process. The OS type, the display type, the current process id, .... Be careful to avoid someoner thinking of a checkpoint as a "secret sauce" that they can smear over their program to allow it to magically checkpoint itself every minute and hope to use any of the checkpoints multiple times. Chris probably has run into this with continuations, which can be thought of as a kind of "internal code checkpoint." -Scott David Daniels From nathanh at zu.com Mon Jan 20 19:51:00 2003 From: nathanh at zu.com (Nathan Heagy) Date: Mon, 20 Jan 2003 12:51:00 -0600 Subject: [pypy-dev] Objective of minimal python In-Reply-To: <3E271862.40902@verio.net> Message-ID: <1E0CB281-2CA8-11D7-A534-00039385F5E6@zu.com> This might be as good a point as any to jump in with my question: is there planned support for non-x86 platforms? I'm an OS X geek and was disappointed that psycho is only for x86. On Thursday, January 16, 2003, at 02:38 PM, VanL wrote: > 4. Pysco and pyrex are possible starting places for the small C core. -- Nathan Heagy phone:306.653.4747 fax:306.653.4774 http://www.zu.com From arigo at tunes.org Mon Jan 20 19:59:15 2003 From: arigo at tunes.org (Armin Rigo) Date: Mon, 20 Jan 2003 10:59:15 -0800 (PST) Subject: [pypy-dev] Questions for Armin In-Reply-To: <5.0.2.1.1.20030119161417.00a75370@mail.oz.net>; from bokr@oz.net on Sun, Jan 19, 2003 at 04:26:24PM -0800 References: <5.0.2.1.1.20030118223231.00a66020@mail.oz.net> <001001c2be5b$d1bf1ee0$bba4aad8@computer> <20030118190053.163AD1F8E@bespin.org> <002d01c2bf74$e7f83470$bba4aad8@computer> <5.0.2.1.1.20030118223231.00a66020@mail.oz.net> <20030119185028.5C7A1E06@bespin.org> <5.0.2.1.1.20030119161417.00a75370@mail.oz.net> Message-ID: <20030120185915.67BFD5FE@bespin.org> Hello Bengt, On Sun, Jan 19, 2003 at 04:26:24PM -0800, Bengt Richter wrote: > >This might probably be done without Psyco, and would certainly be a nice thing > >to have. Note that a good Psyco could remove any need for it: most > >initialization code could theoretically be specialized into something that > >just creates the necessary data structures without executing any code at all. > > There seems to be something I missed. Could you clarify how such specialized > versions persist so they don't have to be redone? I.e., how do you get from > an original .py source-only representation to the specialized form, and how > does the latter come to exist? I.e., is this a new form of incrementally > updated .pyc? Yes, you must have this data persist somewhere. In a .pyc-like file or any variant of the idea (like having one "global" database as Edward proposed, which would be user-specific to avoid security issues). 
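One crude way to picture the "persist it somewhere" idea, without claiming anything about how Psyco itself stores specializations: keep, per function name and per tuple of argument types, a small record in a user-specific database. The file location, the key format and the record contents below are all invented for the illustration.

    import os
    import shelve

    DB_PATH = os.path.expanduser("~/.minipy-specializations")  # hypothetical location

    def remember_specialization(func_name, arg_types, record):
        db = shelve.open(DB_PATH)
        try:
            key = "%s|%s" % (func_name, ",".join(t.__name__ for t in arg_types))
            db[key] = record
        finally:
            db.close()

    def lookup_specialization(func_name, arg_types):
        db = shelve.open(DB_PATH)
        try:
            key = "%s|%s" % (func_name, ",".join(t.__name__ for t in arg_types))
            return db.get(key)
        finally:
            db.close()

    # e.g. remember that my_function was specialized for three ints:
    remember_specialization("my_function", (int, int, int),
                            {"note": "all-int fast path compiled"})
    print(lookup_specialization("my_function", (int, int, int)))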
> >Sometimes I like to point out that if our OS were written in a high-level > >language with built-in specializers, they would boot in no more than the time > >it takes to do the actual I/O that occurs when booting (mainly displaying the > >login screen and waiting for mouse, keyboard and network input) --- everything > >else is internal state and can be done lazily. > > If this means dynamic incremental revisions of system files, it must be a whole > new class of security issues to nail down, or am I misconstruing? Yes and no. There are tons of issues that must be carefully planned for such a thing to be possible and secure, and it is probably not possible in a Unix-style OS (which is essentially C). I'll just drop the link http://tunes.org as an example of what I mean by re-planning an OS. A bient?t, Armin. From nathanh at zu.com Mon Jan 20 20:05:05 2003 From: nathanh at zu.com (Nathan Heagy) Date: Mon, 20 Jan 2003 13:05:05 -0600 Subject: [pypy-dev] Questions for Armin In-Reply-To: <3E29CA75.4060507@tismer.com> Message-ID: <1605C238-2CAA-11D7-A534-00039385F5E6@zu.com> Any interest in getting rid of eval() altogether? > Now consider the huge eval_code function, > wih its specializations in order to make operations > on integers very fast, for example. -- Nathan Heagy phone:306.653.4747 fax:306.653.4774 http://www.zu.com From hpk at trillke.net Mon Jan 20 20:29:09 2003 From: hpk at trillke.net (holger krekel) Date: Mon, 20 Jan 2003 20:29:09 +0100 Subject: [pypy-dev] Questions for Armin In-Reply-To: <1605C238-2CAA-11D7-A534-00039385F5E6@zu.com>; from nathanh@zu.com on Mon, Jan 20, 2003 at 01:05:05PM -0600 References: <3E29CA75.4060507@tismer.com> <1605C238-2CAA-11D7-A534-00039385F5E6@zu.com> Message-ID: <20030120202909.E12700@prim.han.de> > > Now consider the huge eval_code function, > > wih its specializations in order to make operations > > on integers very fast, for example. [Nathan Heagy Mon, Jan 20, 2003 at 01:05:05PM -0600] > Any interest in getting rid of eval() altogether? um, what do you mean? eval_frame (not eval_code, anyway) works at a completly different level than the python builtin 'eval'. holger From nathanh at zu.com Mon Jan 20 20:32:13 2003 From: nathanh at zu.com (Nathan Heagy) Date: Mon, 20 Jan 2003 13:32:13 -0600 Subject: [pypy-dev] Questions for Armin In-Reply-To: <20030120202909.E12700@prim.han.de> Message-ID: I mean the builtin. I didn't realise there was an eval() in C (I thought the function name(s) was different). On Monday, January 20, 2003, at 01:29 PM, holger krekel wrote: >>> Now consider the huge eval_code function, >>> wih its specializations in order to make operations >>> on integers very fast, for example. > > [Nathan Heagy Mon, Jan 20, 2003 at 01:05:05PM -0600] >> Any interest in getting rid of eval() altogether? > > um, what do you mean? eval_frame (not eval_code, anyway) > works at a completly different level than the python > builtin 'eval'. > > holger > > -- Nathan Heagy phone:306.653.4747 fax:306.653.4774 http://www.zu.com From hpk at trillke.net Mon Jan 20 21:03:22 2003 From: hpk at trillke.net (holger krekel) Date: Mon, 20 Jan 2003 21:03:22 +0100 Subject: [pypy-dev] Questions for Armin In-Reply-To: ; from nathanh@zu.com on Mon, Jan 20, 2003 at 01:32:13PM -0600 References: <20030120202909.E12700@prim.han.de> Message-ID: <20030120210322.F12700@prim.han.de> [Nathan Heagy Mon, Jan 20, 2003 at 01:32:13PM -0600] > I mean the builtin. I didn't realise there was an eval() in C (I > thought the function name(s) was different). 
no, we don't intend to change the language. holger From nathanh at zu.com Mon Jan 20 21:19:11 2003 From: nathanh at zu.com (Nathan Heagy) Date: Mon, 20 Jan 2003 14:19:11 -0600 Subject: [pypy-dev] Questions for Armin In-Reply-To: <20030120210322.F12700@prim.han.de> Message-ID: <700A0C9E-2CB4-11D7-A534-00039385F5E6@zu.com> The only reason I bring it up, and I'm not really in any position to be bringing things up, is that it seems to me that this dynamic metacompiling stuff could be a real pain for making things fast and optimized. Most fast languages don't have eval() and that may be part of the reason they are fast. I'm sure Guido could jump in with a wonderful reason why eval() is great but I think if it disappeared no one would miss it, especially if it let Python compile to machine code. In fact if that was the price of enabling Python to compile to machine code I *guarantee* no one would miss it. On Monday, January 20, 2003, at 02:03 PM, holger krekel wrote: > [Nathan Heagy Mon, Jan 20, 2003 at 01:32:13PM -0600] >> I mean the builtin. I didn't realise there was an eval() in C (I >> thought the function name(s) was different). > > no, we don't intend to change the language. > > holger > > -- Nathan Heagy phone:306.653.4747 fax:306.653.4774 http://www.zu.com From arigo at tunes.org Tue Jan 21 01:30:31 2003 From: arigo at tunes.org (Armin Rigo) Date: Mon, 20 Jan 2003 16:30:31 -0800 (PST) Subject: [pypy-dev] Restricted language In-Reply-To: <20030120044004.E2661@prim.han.de>; from hpk@trillke.net on Mon, Jan 20, 2003 at 04:40:04AM +0100 References: <3E29CF36.8020507@tismer.com> <20030119134510.A2661@prim.han.de> <3E2AB357.5030503@tismer.com> <20030119195159.CBE654B52@bespin.org> <20030120044004.E2661@prim.han.de> Message-ID: <20030121003031.97B944EF5@bespin.org> Hello Holger, On Mon, Jan 20, 2003 at 04:40:04AM +0100, holger krekel wrote: > using Bengt's suggestion we could say that an "immediate" > C-program-level value is represented by a C-compiler-level value. It fells like you made exactly the confusion that I was trying to prevent people from making. A C-program-level value is *not* *just* a C-compiler-level value. I mean, your "immediate value" in your C program might be in a variable 'x', but in the C compiler it is not in any variable. Instead, it is embedded within a complex structure, e.g. struct program_variable { bool is_constant; union { struct { // non-constant case int in_which_register; ... } struct { // constant case long immediate_value; ... } } }; > ImmediateObjects would be tagged CPython objects. > I don't see immediate use, though :-) You cannot use real CPython objects to represent application-level objects. If you do, you are confusing the two levels. It will lead you into tons of problems; for example, you can only use immediates for the application-level objects. It is similar to a C compiler in which the above struct program_variable would not exist; all program variables would be represented as a long. You can only represent immediates like this. For example, if you want later to design your own "class MyList" implementing lists, and use this instead of real list objects to represent application-level lists, it looks fine; you change the implementation of BUILD_LIST to create an instance of MyList, and work with that instead of a real list. But then, the class must work *exactly* like a list for this to work. The problem is that it cannot. If you want to give another implementation you cannot inherit from the built-in "list" type. You are stuck. 
At best you will change *all* your interpreter to contain tests like "if isinstance(x, MyList)" to know if the object 'x' is an immediate object or something that needs special care. Hence the class ImmediateObject is absolutely essential. If you get confused think about implementing a Python interpreter not in Python but in another similar language. You will be forced to define classes like lists, tuples, dicts and so on, and give direct mappings between what the interpreted application wants to do on instances of these classes and what the interpreter itself can do with the underlying implementing object. The case of Python-in-Python gets confusing because these mappings are trivial in the case of the above class ImmediateObject. But you cannot remove them. You have a similar problem with exceptions: you must not confuse the exceptions that the application wants to see and the exceptions that are used internally in the interpreter because it is a nice programming technique to use in your interpreter. Here again, think about Python-in-some-similar-language. You will see that you need a way to say which Python exception must be thrown in the application. Then you need a way for the interpreter to come back from a couple of nested calls to the main loop, and so you create an "EPython" exception to raise for this purpose. As above it is easy to get confused because ImmediateObjects have a trivial mapping between what the application wants to see and what the operation on the underlying implementing object actually raises. In other words, if you implement a list with a CPython list 'a', then when you try to do the PyObject_GetItem operation, you will end up doing this: try: return a[index] except Exception, e: SetException(ImmediateObject(e)) raise EPython Note how 'e' is embedded inside an ImmediateObject(). > But i'd like to implement BINARY_SUBSCR like this: > > def BINARY_SUBSCR(self): > w = self.valuestack.pop() > v = self.valuestack.pop() > self.valuestack.push(w.__getitem__(v)) Here we are using Python's __getitem__ protocol to implement Python's __getitem__ protocol. I see nothing wrong in that, but it is easy to get confused. To make things clearer I would way: self.valuestack.push(w.getitem(v)) where all non-abstract classes inheriting from PyObject should have a getitem() method; for example, in class ImmediateObject: def getitem(self, index): try: return self.ob[index.ob] except Exception, e: SetException(ImmediateObject(e)) raise EPython This level of indirection is quite necessary. Seen otherwise, you cannot store arbitrary CPython objects into the self.valuestack list, because otherwise you can only store CPython objects representing themselves, and you are stuck as soon as you want to represent things differently. Think about the type() function; it could not return the real type of the implementing object, because you couldn't implement lists or ints with a custom class. And it cannot call a new method __type__() of the object, because you cannot add such a new method to all already-existing built-in objects. You could hack something that calls __type__() if it exists and returns the real type otherwise, but you are running into trouble when interpreting programs that define __type__() methods for their own purpose. This is the kind of confusion we are bound to run into if we are not careful. With all correctly set up it is trivial to catch the EPython exception in the main loop and, in its exception handler, unwind the block stack just like CPython does in its main loop. 
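Pulling the fragments above together into one small runnable sketch: the EPython, ImmediateObject and getitem pieces follow the description just given, while the SetException helper and the miniature frame with its value stack are simplified stand-ins for whatever the real interpreter will use. The getitem result is wrapped again here so that only wrapper objects ever sit on the value stack.

    class EPython(Exception):
        """Internal signal: 'the *application* raised an exception'."""

    _current_exc = [None]

    def SetException(w_exc):
        _current_exc[0] = w_exc   # stand-in for real per-frame exception state

    class PyObject:
        """Base class for objects implementing application-level objects."""

    class ImmediateObject(PyObject):
        def __init__(self, ob):
            self.ob = ob          # the wrapped host (CPython) object

        def getitem(self, w_index):
            try:
                return ImmediateObject(self.ob[w_index.ob])
            except Exception as e:
                SetException(ImmediateObject(e))
                raise EPython

    class Frame:
        def __init__(self):
            self.valuestack = []

        def BINARY_SUBSCR(self):
            w_index = self.valuestack.pop()
            w_obj = self.valuestack.pop()
            self.valuestack.append(w_obj.getitem(w_index))

    # application-level  [10, 20, 30][1]  gives 20
    f = Frame()
    f.valuestack.append(ImmediateObject([10, 20, 30]))
    f.valuestack.append(ImmediateObject(1))
    f.BINARY_SUBSCR()
    print(f.valuestack[-1].ob)    # prints: 20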
If another exception (not EPython) is raised in the interpreter, it will not be caught by default; it is the normal behavior of Python programs and means there is a bug (in this case, a bug in the interpreter). A bientôt, Armin. From arigo at tunes.org Tue Jan 21 01:30:32 2003 From: arigo at tunes.org (Armin Rigo) Date: Mon, 20 Jan 2003 16:30:32 -0800 (PST) Subject: [pypy-dev] Questions for Armin In-Reply-To: <5.0.2.1.1.20030119141754.00a6d030@mail.oz.net>; from bokr@oz.net on Sun, Jan 19, 2003 at 04:02:05PM -0800 References: <002d01c2bf74$e7f83470$bba4aad8@computer> <001001c2be5b$d1bf1ee0$bba4aad8@computer> <20030118190053.163AD1F8E@bespin.org> <002d01c2bf74$e7f83470$bba4aad8@computer> <20030119185022.6FCD495E@bespin.org> <5.0.2.1.1.20030119141754.00a6d030@mail.oz.net> Message-ID: <20030121003032.49B504EB7@bespin.org> Hello Bengt, On Sun, Jan 19, 2003 at 04:02:05PM -0800, Bengt Richter wrote: > (...) In C, type vs representation > is almost 1:1 (i.e., type names identify memory layouts with bits and words etc), > but with Python and psyco there are multiple ways of physically representing the > same abstract entity. I'd like to push for separating the concepts of type and > representation better in discussion. Yes, I also think it is an important point. I'm not sure we should already tackle the issue of having multiple representations of the *same* object, though. I was rather thinking about having several available implementations, but each object only implemented with one of them at a time. Occasionally switching to another representation is the next step. Managing several concurrent representations is yet another, more difficult step I guess :-) A bientôt, Armin. From roccomoretti at netscape.net Tue Jan 21 04:39:04 2003 From: roccomoretti at netscape.net (Rocco Moretti) Date: Mon, 20 Jan 2003 22:39:04 -0500 Subject: [pypy-dev] Restricted language Message-ID: <7895C76A.186E8E0C.9ADE5C6A@netscape.net> Armin Rigo wrote: < excellent discussion of confusion between (host) Python and (interpreted) Python snipped > >Here we are using Python's __getitem__ protocol to implement Python's >__getitem__ protocol. I see nothing wrong in that, but it is easy to get >confused. To make things clearer I would way: > >    self.valuestack.push(w.getitem(v)) ... > >Think about >the type() function; it could not return the real type of the implementing >object, because you couldn't implement lists or ints with a custom class. And >it cannot call a new method __type__() of the object, because you cannot add >such a new method to all already-existing built-in objects. You could hack >something that calls __type__() if it exists and returns the real type >otherwise, but you are running into trouble when interpreting programs that >define __type__() methods for their own purpose. This is the kind of >confusion we are bound to run into if we are not careful. > So if I understand you correctly, we should not use (either explicitly or implicitly) any of the special methods of the objects we create for the interpreter. The only acceptable use is member dereferencing (object.attribute). Only host objects are allowed to be part of special method use. (e.g. in function calls, in expressions such as a+b, etc.) This is not strictly a technical limitation in most cases, but good practice so that we avoid confusion between the two levels. Is this an accurate interpretation? -Rocco P.S. In your implementation of EPython exceptions, how would associated traceback objects be handled?
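Rocco's reading can be illustrated in a few lines, with class and method names that are only examples and not settled project API: the interpreter touches wrapper objects through explicit, ordinarily named methods and plain attribute access, never through the host's special-method machinery.

    class IntObject:
        def __init__(self, intval):
            self.intval = intval
        def add(self, w_other):
            return IntObject(self.intval + w_other.intval)

    stack = [IntObject(40), IntObject(2)]
    w_right = stack.pop()
    w_left = stack.pop()
    # wrong would be:  stack.append(w_left + w_right)   # host-level __add__
    stack.append(w_left.add(w_right))                   # explicit interpreter-level method
    print(stack[-1].intval)    # prints: 42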
__________________________________________________________________ The NEW Netscape 7.0 browser is now available. Upgrade now! http://channels.netscape.com/ns/browsers/download.jsp Get your own FREE, personal Netscape Mail account today at http://webmail.netscape.com/ From roccomoretti at netscape.net Tue Jan 21 04:49:56 2003 From: roccomoretti at netscape.net (Rocco Moretti) Date: Mon, 20 Jan 2003 22:49:56 -0500 Subject: [pypy-dev] Questions for Armin Message-ID: <5A433638.4ABBCFC2.9ADE5C6A@netscape.net> On removing eval(), Nathan Heagy wrote: >The only reason I bring it up, and I'm not really in any position to be >bringing things up, is that it seems to me that this dynamic >metacompiling stuff could be a real pain for making things fast and >optimized. Most fast languages don't have eval() and that may be part >of the reason they are fast. I'm sure Guido could jump in with a >wonderful reason why eval() is great but I think if it disappeared no >one would miss it, especially if it let Python compile to machine code. >In fact if that was the price of enabling Python to compile to machine >code I *guarantee* no one would miss it. Having eval() is no impediment to compiling Python to machine code in the common case. What's the difference between a Python program which doesn't use eval() and one where eval() is prohibited? To deal with eval() in compiling to machine code, you could do one of two things: (1) Halt with an error saying that eval() is not supported (use the Python interpreter instead), or (2) Tack on an additional function which replicates the function of the Python interpreter to handle the argument to eval(). As I understand it, (2) is close to the equivalent of what Psyco currently does for the "uncommon case" anyway. -Rocco __________________________________________________________________ The NEW Netscape 7.0 browser is now available. Upgrade now! http://channels.netscape.com/ns/browsers/download.jsp Get your own FREE, personal Netscape Mail account today at http://webmail.netscape.com/ From roccomoretti at netscape.net Tue Jan 21 05:08:17 2003 From: roccomoretti at netscape.net (Rocco Moretti) Date: Mon, 20 Jan 2003 23:08:17 -0500 Subject: [pypy-dev] question about core implementation language Message-ID: <3AB0B79C.2B42328D.9ADE5C6A@netscape.net> logistix wrote: >I think the point is that Python as a language provides absolutely no >access to the machine internals. ?You can't examine memory contents, >registers, I/O bus or call code directly. ? Can you do this all in C? (without library functions coded in assembly) >As part of the bootstrap >process the following four functions should probably be added to the >interperter (either as builtin functions or a module): > > ? ?GetMemoryFromOS() > ? ?ReturnMemouryToOS() > ? ?Peek() > ? ?Poke() > >ANd probably: > ? ?get/putRegister() > ? ?callCode() > >Other than that, everything could (eventually) be written in Python. If we were writing an Operating System, I would agree with you. (We can put down a Python based OS as another goal of the project. ) Fortunately, we aren't quite so low level. The standard OS functions tend to be higher level. As I understand it, much of libc.so on Unix systems is simply a very thin wrapper around OS functions which behave much like the C functions that call them. Since we don't want to reinvent the wheel, it would probably be best to reuse the standard interface provided. This is where the minimal C core comes in. 
We can probably limit it to basic input and output (PyBIOS?), plus a generic mechanism for discovering and calling dll/so functions. Or am I missing your point entirely? -Rocco __________________________________________________________________ The NEW Netscape 7.0 browser is now available. Upgrade now! http://channels.netscape.com/ns/browsers/download.jsp Get your own FREE, personal Netscape Mail account today at http://webmail.netscape.com/ From roccomoretti at netscape.net Tue Jan 21 05:33:25 2003 From: roccomoretti at netscape.net (Rocco Moretti) Date: Mon, 20 Jan 2003 23:33:25 -0500 Subject: [pypy-dev] How to translate 300000 lines of C Message-ID: <506F64A6.6E20DBCF.9ADE5C6A@netscape.net> On another thread Christian Tismer wrote: >I tried to map [frameobject.c] on a 3.5 hour yourney from >Kiel to Berlin, and I had one fourth done by an hour. >Nevertheless, I got into trouble, just by comparing its >implementation differences between 2.2.2 and 2.3a. >Then I dropped that and decided that this is the wrong way. I'm interested to know what problems you were encountering. One issue I recall from frameobject.c is that it has a lot of optimizations regarding object caching and reuse. As a first approximation, we should probably ignore such optimizations. If we do add it in later, it should probably be as an optimization by psyco for *all* objects. That said, theoretically, I like the idea of of a C->Py converter. (It did get tedious when I was doing it.) However, I'm concerned if the time invested to make one would be worth it. Christian Tismer wrote: >There are a number of free-ware C compilers around, and also >some C interpreters. But coded in which language? How difficult would it be to get them to "play nice" with Python (which is, I am assuming, where you want to code the conversion logic)? - This question probably boils down to asking what compiler/interpreter you have in mind to co-opt. One caveat with me evaluating the benefits of this idea is that I don't have a feel how difficult tracking changes in CPython would be. We do have the changelog and Unit tests for CPython, so we wouldn't necessarily need to do a line-by-line comparison. We could approach the changes in more of a top-down level. Isn't Python supposed to be easier to maintain than C? How does Jython do it? Just my two atoms of copper-coated zinc. -Rocco __________________________________________________________________ The NEW Netscape 7.0 browser is now available. Upgrade now! http://channels.netscape.com/ns/browsers/download.jsp Get your own FREE, personal Netscape Mail account today at http://webmail.netscape.com/ From tanzer at swing.co.at Tue Jan 21 08:27:09 2003 From: tanzer at swing.co.at (Christian Tanzer) Date: Tue, 21 Jan 2003 08:27:09 +0100 Subject: [pypy-dev] Questions for Armin In-Reply-To: Your message of "Mon, 20 Jan 2003 14:19:11 CST." <700A0C9E-2CB4-11D7-A534-00039385F5E6@zu.com> Message-ID: Nathan Heagy wrote: > The only reason I bring it up, and I'm not really in any position to be > bringing things up, is that it seems to me that this dynamic > metacompiling stuff could be a real pain for making things fast and > optimized. Most fast languages don't have eval() and that may be part > of the reason they are fast. I'm sure Guido could jump in with a > wonderful reason why eval() is great but I think if it disappeared no > one would miss it, especially if it let Python compile to machine code. 
> In fact if that was the price of enabling Python to compile to machine > code I *guarantee* no one would miss it. I certainly would miss it. And your intuition about eval vs. speed is wrong anyway... -- Christian Tanzer tanzer at swing.co.at Glasauergasse 32 Tel: +43 1 876 62 36 A-1130 Vienna, Austria Fax: +43 1 877 66 92 From arigo at tunes.org Tue Jan 21 11:13:01 2003 From: arigo at tunes.org (Armin Rigo) Date: Tue, 21 Jan 2003 02:13:01 -0800 (PST) Subject: [pypy-dev] Objective of minimal python In-Reply-To: <1E0CB281-2CA8-11D7-A534-00039385F5E6@zu.com>; from nathanh@zu.com on Mon, Jan 20, 2003 at 12:51:00PM -0600 References: <3E271862.40902@verio.net> <1E0CB281-2CA8-11D7-A534-00039385F5E6@zu.com> Message-ID: <20030121101301.46BC44858@bespin.org> Hello Nathan, On Mon, Jan 20, 2003 at 12:51:00PM -0600, Nathan Heagy wrote: > This might be as good a point as any to jump in with my question: is > there planned support for non-x86 platforms? Yes, I expect the new Psyco-like code to be much more flexible and easy to port to other processors. We are also talking about mixed approaches involving gcc. A bient?t, Armin. From spierre at type-z.org Tue Jan 21 11:33:12 2003 From: spierre at type-z.org (=?ISO-8859-1?Q?S=E9bastien_Pierre?=) Date: Tue, 21 Jan 2003 11:33:12 +0100 Subject: [pypy-dev] MinimalPython newbie questions Message-ID: Hi all, I recently found MinimalPython and immediately subscribed to the list after seeing that it was trying to make a better, smaller CPython with ideas from Psycho and most notably Stackless. I found many interesting ideas in Stackless, most notably the notion of microthread, which seems particularly adapted to the implementation of "multi agent systems". Looking at the site, which is maintained by Armin, I found also many ideas that would ease the life of people who want to write "agent-based applications". My question is: will MinimalPython try to incorporate the microthreads from stackless and will try to incorporate ideas from tunes project (like migration) ? If so MinimalPython would really be an ideal language for developing agent-based applications. Cheers, -- S?bastien -- ?Si tous les autres acceptaient le mensonge impos? par le Parti - si tous les rapports racontaient la m?me chose - le mensonge passait dans l'histoire et devenait la v?rit?? -- George Orwell, 1984 From arigo at tunes.org Tue Jan 21 12:13:52 2003 From: arigo at tunes.org (Armin Rigo) Date: Tue, 21 Jan 2003 03:13:52 -0800 (PST) Subject: [pypy-dev] Restricted language In-Reply-To: <7895C76A.186E8E0C.9ADE5C6A@netscape.net>; from roccomoretti@netscape.net on Mon, Jan 20, 2003 at 10:39:04PM -0500 References: <7895C76A.186E8E0C.9ADE5C6A@netscape.net> Message-ID: <20030121111352.DE4744A56@bespin.org> Hello Rocco, On Mon, Jan 20, 2003 at 10:39:04PM -0500, Rocco Moretti wrote: > So if I understand you correctly, we should not use (either explicitly > or implicitly) any of the special methods of the objects we create for > the interpreter. The only acceptable use is member dereferencing > (object.attribute). Yes, indeed. To avoid confusion the host objects (CPython objects) should not define special methods. We should define our own set of methods, althought it can be as straightforward as using the canonical name and dropping "__". So custom list implementations would define a "getitem()" method and not a "__getitem__()" one. Another way to see this important distinction is by closely following the CPython core declarations. 
Wherever there is a "PyObject*", it means that the core is handling an application-level object (which we have to translate into an application-level object, like an ImmediateObject). Wherever there are other C data types, it is interpreter-internal data. The application-level "__getitem__()" method corresponds to: PyObject* PyObject_GetItem(PyObject* object, PyObject* index); This function is not called "__getitem__"; there is only a similarity in the name. The function does not take an integer index (i.e. an "int"), but an application-level "PyObject*". It is exactly the same in our case: we need a "getitem(self, v_index)" method whose name merely reminds us about "__getitem__", and taking as index argument not an integer, but another object implementing an application-level object (which might be an ImmediateObject() containing an integer). > P.S. In your implementation of EPython exceptions, how would associated > traceback objects be handled? If we decide to use attributes of the EPython instance to store the application-level exception type and value, then the traceback could also be there. In the main loop, when an EPython exception is caught, its traceback info would be updated. In fact this description is just what one would explain to describe what the PyTraceBack_Here() call in ceval.c does. A bient?t, Armin. From arigo at tunes.org Tue Jan 21 12:13:56 2003 From: arigo at tunes.org (Armin Rigo) Date: Tue, 21 Jan 2003 03:13:56 -0800 (PST) Subject: [pypy-dev] Restricted language In-Reply-To: <3E2CB761.3030700@tismer.com>; from tismer@tismer.com on Tue, Jan 21, 2003 at 03:58:41AM +0100 References: <3E29CF36.8020507@tismer.com> <20030119134510.A2661@prim.han.de> <3E2AB357.5030503@tismer.com> <20030119195159.CBE654B52@bespin.org> <20030120044004.E2661@prim.han.de> <20030121003031.97B944EF5@bespin.org> <3E2CB761.3030700@tismer.com> Message-ID: <20030121111356.57FB14A56@bespin.org> Hello Christian, On Tue, Jan 21, 2003 at 03:58:41AM +0100, Christian Tismer wrote: > Just to make sure I understand what you mean (private mail). > > Having hacked on CPython for a very long time now, I always > had the impression that CPython never does any distinction > withing exception, whether they are caused by a script, > or whether they are caused inside a builtin C object. > I the runtime decides to throw a memory error, it is the > same, as if there is a syntax error, or an error in a user > object. Yes, I think you are right. There is some confusion already in the C core which uses exceptions both for user-program-visible exceptions and internal handling. For example, the fact that a generator can signal it is exhausted by throwing a StopIteration exception (instead of finishing on a "return"): the StopIteration exception is *internally* used to signal the end of iterators and a user-level StopIteration should never have been able to interfere. Of course, given the way CPython's exceptions are coded, it seemed natural at the time to add in the definition of the Python language that you can raise a StopIteration to stop generators... For memory errors it is still a bit more fuzzy, because it is clearly an internal error, but there isn't much else that the core can do about it than turn it into a user-level visible error to cleanly interrupt the user program... In Python the RuntimeError exception is reserved for internal errors. Ideally it should have been more clearly separated. But it is tempting to re-use the existing exception mecanism for RuntimeErrors. Other interpreters do that, BTW (e.g. 
Java). In Python-in-Python we must internally raise an EPython exception to signal application-level exceptions, and use the normal exception mecanism for all other internal error conditions (e.g. use assert's or raise ValueErrors when internal calls are made with bad values). In a first phase these would just crash the interpreter with a traceback that is useful for debugging. In a second phase (and only then) we can make the interpreter more robust by catching *all* unexpected exceptions in the main loop and throwing them into the application, which would then see them as RuntimeErrors whatever the original internal exception class had been. I know this can be quite confusing; it took me some time to figure out this, but now I am confident that it is the correct point of view. Again, think about implementing a Python interpreter in another different language that also has its own (different) family of exceptions. If the interpreter internally raises an unexpected exception, you cannot show it as is in the Python application because it may have no immediate equivalent. So you let the application see a generic RuntimeError exception. Conversely, if a part of the interpreter wants to raise, say, a ValueError, there may be no directly-corresponding exception existing in the interpreter's language; you have to define a new kind of exception in the interpreter's language, and throw and catch that exception internally. This new kind of exception can be generic ("EPython") and embed information about the Python type and value of the application-level exception. A bient?t, Armin. From tismer at tismer.com Tue Jan 21 14:34:35 2003 From: tismer at tismer.com (Christian Tismer) Date: Tue, 21 Jan 2003 14:34:35 +0100 Subject: [pypy-dev] How to translate 300000 lines of C In-Reply-To: <506F64A6.6E20DBCF.9ADE5C6A@netscape.net> References: <506F64A6.6E20DBCF.9ADE5C6A@netscape.net> Message-ID: <3E2D4C6B.8020300@tismer.com> Rocco Moretti wrote: > On another thread Christian Tismer wrote: > > >>I tried to map [frameobject.c] on a 3.5 hour yourney from >>Kiel to Berlin, and I had one fourth done by an hour. >>Nevertheless, I got into trouble, just by comparing its >>implementation differences between 2.2.2 and 2.3a. >>Then I dropped that and decided that this is the wrong way. > > > I'm interested to know what problems you were encountering. Well, first of all, frameobject.c seemed to force me to invent every necessary supporting objects at once, or to drop them and maybe loose them. The block stack was one thing that I would have liked to describe with some struct construct, and there is pointer arithmetic... Well, this could have easily been replaced by a list of tuples. But well, what really got me stuck was the number of changes and additions which came with 2.3a. With a direct rewrite in Python, we get into a major problem: You cannot use diffs any longer. Diffing between two C files is fine. But what do you do if you have a hand-written version of a C file? You have nothing to diff *that* against, so you have to read the C diffs, repeat the mappings which you did in brain and try to figure out what to change in the target .py file. Huh! That really made me stop the otherwise not-so-hard transliteration and to think about how to avoid the upcoming nightmare. What I think is needed is a tool, that does the translation partially automatically, partially on my command, with some scripted rules. 
Given that, I'm able to produce a python file of a new C version, and diff the resulting Python files against each other. Let it even be that this diff contains C snippets which were'n automatically mapped, but they are in the right place, hopefully. > One issue I recall from frameobject.c is that it > has a lot of optimizations regarding object > caching and reuse. As a first approximation, > we should probably ignore such optimizations. > If we do add it in later, it should probably be > as an optimization by psyco for *all* objects. Yes, optimizations are spread all over the place, and they don't help the translation, at least :-) > That said, theoretically, I like the idea of > a C->Py converter. (It did get tedious when I was doing it.) > However, I'm concerned if the time invested to make one would be worth it. > > Christian Tismer wrote: > > >>There are a number of free-ware C compilers around, and also >>some C interpreters. > > > But coded in which language? How difficult would it > be to get them to "play nice" with Python (which is, > I am assuming, where you want to code the conversion > logic)? - This question probably boils down to asking > what compiler/interpreter you have in mind to co-opt. I'm reading through lcc right now, just to get an idea, and I've begun to code a small lexer and parser in Python. The advantage of our problem is that we may assume correct C code, so I don't have to do a validating parser. I have no idea yet, how the mapping should work, and on which abstraction level. There are lots of issues which can only emit an "untranslatable" message, like labels and gotos. After I have something useful, I will post the tiny parser for playing with ideas. > One caveat with me evaluating the benefits of this > idea is that I don't have a feel how difficult tracking > changes in CPython would be. We do have the changelog > and Unit tests for CPython, so we wouldn't necessarily > need to do a line-by-line comparison. We could approach > the changes in more of a top-down level. Isn't Python > supposed to be easier to maintain than C? Well, I hope comparing translated scripts does help here. We have to try it anyway, tho. > How does Jython do it? No idea. ciao - chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Johannes-Niemeyer-Weg 9a : *Starship* http://starship.python.net/ 14109 Berlin : PGP key -> http://wwwkeys.pgp.net/ work +49 30 89 09 53 34 home +49 30 802 86 56 pager +49 173 24 18 776 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? http://www.stackless.com/ From hpk at trillke.net Tue Jan 21 15:17:05 2003 From: hpk at trillke.net (holger krekel) Date: Tue, 21 Jan 2003 15:17:05 +0100 Subject: [pypy-dev] How to translate 300000 lines of C In-Reply-To: <3E2D4C6B.8020300@tismer.com>; from tismer@tismer.com on Tue, Jan 21, 2003 at 02:34:35PM +0100 References: <506F64A6.6E20DBCF.9ADE5C6A@netscape.net> <3E2D4C6B.8020300@tismer.com> Message-ID: <20030121151704.M12700@prim.han.de> [Christian Tismer Tue, Jan 21, 2003 at 02:34:35PM +0100] > Rocco Moretti wrote: > > On another thread Christian Tismer wrote: > > > > > >>I tried to map [frameobject.c] on a 3.5 hour yourney from > >>Kiel to Berlin, and I had one fourth done by an hour. > >>Nevertheless, I got into trouble, just by comparing its > >>implementation differences between 2.2.2 and 2.3a. > >>Then I dropped that and decided that this is the wrong way. 
> > > > > > I'm interested to know what problems you were encountering. > > Well, first of all, frameobject.c seemed to force > me to invent every necessary supporting objects at once, > or to drop them and maybe loose them. > The block stack was one thing that I would have > liked to describe with some struct construct, > and there is pointer arithmetic... > Well, this could have easily been replaced by > a list of tuples. > > But well, what really got me stuck was the number of > changes and additions which came with 2.3a. > With a direct rewrite in Python, we get into a major > problem: > You cannot use diffs any longer. Diffing between two > C files is fine. > But what do you do if you have a hand-written version > of a C file? You have nothing to diff *that* against, I see the problem. But i hope that if we start with a minimal and clean python version then a) we shouldn't have to change so often as lots of details are abtracted out anyway. hopefully we only have to change the python-to-c or python-to-assembler or python-to-objective-c compilers. If at all. b) there could later be modules whose primary source is the pythonic version. Similar to the compiler package but note: most modules are simpler. c) we might convince the python-dev folks later that testing out new PEPs or maintenance takes place in the python version (prototype level) for specific parts. Keeping the C-version up to date is the problem then :-) This needn't be an all-or-nothing deal and hopefully could be done step-by-step. > so you have to read the C diffs, repeat the mappings > which you did in brain and try to figure out what to > change in the target .py file. Huh! > > That really made me stop the otherwise not-so-hard > transliteration and to think about how to avoid > the upcoming nightmare. > What I think is needed is a tool, that does the > translation partially automatically, partially > on my command, with some scripted rules. *This* still seems like a nightmare to me. you have the complexity for both developments (CPython and PyPython) and the added complexity of syncing them. Unless your C-to-python translation is fully automatic. Note, that taking the 'pythonic code as primary ressource' route wasn't possible for the Jython people at the time. And they have a hard time catching up. have fun with translators, anyway :-) holger From arigo at tunes.org Tue Jan 21 15:34:33 2003 From: arigo at tunes.org (Armin Rigo) Date: Tue, 21 Jan 2003 15:34:33 +0100 Subject: [pypy-dev] MinimalPython newbie questions In-Reply-To: References: Message-ID: <20030121143433.GA26985@magma.unil.ch> Hello S?bastien, On Tue, Jan 21, 2003 at 11:33:12AM +0100, S?bastien Pierre wrote: > Looking at the site, which is maintained by Armin, I'm not the maintained of that site, althought I often cite it. > My question is: will MinimalPython try to incorporate the microthreads > from stackless and will try to incorporate ideas from tunes project > (like migration) ? If so MinimalPython would really be an ideal > language for developing agent-based applications. Maybe. There are quite a lot of exciting possibilities that we can dream of, given enough flexibility. Starting from a high-level description of the interpreter (the interpreter written in Python), Stackless microthreads can easily be added (or not). Migration is a whole harder project, althought if we know what we precisely want it will probably be possible to add it then without rewriting everything (which is the whole point of the MiniPython project). 
Migration is related to checkpoints, and saving and restoring program state. Scott pointed out numerous problems related to it. I personally believe that these problems araise from some kind of confusion between the various abstraction levels; when you want a checkpoint, you cannot make a copy of the state of the whole universe, so you must be clear about which abstraction levels you want to save and restore. For example, from the C point of view, a program can only be checkpointed between two expressions; from the processor point of view, it can be checkpointed between two instructions; but the level that probably interests you is higher. In a GUI application you may want that the checkpoint occurs between two elementary user actions. This is a whole complex problem, outside the scope of MiniPy (which could however become later a nice test bed). A bient?t, Armin. From arigo at tunes.org Tue Jan 21 15:41:29 2003 From: arigo at tunes.org (Armin Rigo) Date: Tue, 21 Jan 2003 15:41:29 +0100 Subject: [pypy-dev] How to translate 300000 lines of C In-Reply-To: <20030121151704.M12700@prim.han.de> References: <506F64A6.6E20DBCF.9ADE5C6A@netscape.net> <3E2D4C6B.8020300@tismer.com> <20030121151704.M12700@prim.han.de> Message-ID: <20030121144129.GB26985@magma.unil.ch> Hello Holger, On Tue, Jan 21, 2003 at 03:17:05PM +0100, holger krekel wrote: > > What I think is needed is a tool, that does the > > translation partially automatically, partially > > on my command, with some scripted rules. > > *This* still seems like a nightmare to me. you > have the complexity for both developments (CPython > and PyPython) and the added complexity of syncing > them. Unless your C-to-python translation is > fully automatic. I would say the goal is to completely automate this translation --- not in the sense that we have The Ultimate C-to-Python Translator(tm), but in the sense that all hints and special cases have to be described in custom .py files attached to the .c sources. "Regular" non-special-case'd changes to the CPython source come "for free" into MiniPy. > Note, that taking the 'pythonic code as primary > ressource' route wasn't possible for the Jython > people at the time. And they have a hard time > catching up. Both Jython and the old Stackless demonstrate badly enough that translating the CPython code manually might be quite possible, but if you do it, your project is doomed to become completely out-of-date after a couple of years. Let's not repeat that mistake. A bient?t, Armin. From pedronis at bluewin.ch Tue Jan 21 15:40:24 2003 From: pedronis at bluewin.ch (Samuele Pedroni) Date: Tue, 21 Jan 2003 15:40:24 +0100 Subject: [pypy-dev] How to translate 300000 lines of C References: <506F64A6.6E20DBCF.9ADE5C6A@netscape.net><3E2D4C6B.8020300@tismer.com> <20030121151704.M12700@prim.han.de> <20030121144129.GB26985@magma.unil.ch> Message-ID: <01c901c2c15b$233425c0$6d94fea9@newmexico> From: "Armin Rigo" > Both Jython and the old Stackless demonstrate badly enough that translating > the CPython code manually might be quite possible, but if you do it, your > project is doomed to become completely out-of-date after a couple of years. > Let's not repeat that mistake. Jython is not a manual translation of CPython C but what I do I know, it seems that here everybody is a Jython expert. Jython could use a more active community wrt development but that's a different kind of issue. OTOH automatically translating the CPython C sources to the right level of abstraction necessary e.g. 
for Jython or what not is for sure an interesting challenge. From hpk at trillke.net Tue Jan 21 16:05:04 2003 From: hpk at trillke.net (holger krekel) Date: Tue, 21 Jan 2003 16:05:04 +0100 Subject: [pypy-dev] How to translate 300000 lines of C In-Reply-To: <20030121144129.GB26985@magma.unil.ch>; from arigo@tunes.org on Tue, Jan 21, 2003 at 03:41:29PM +0100 References: <506F64A6.6E20DBCF.9ADE5C6A@netscape.net> <3E2D4C6B.8020300@tismer.com> <20030121151704.M12700@prim.han.de> <20030121144129.GB26985@magma.unil.ch> Message-ID: <20030121160504.N12700@prim.han.de> [Armin Rigo Tue, Jan 21, 2003 at 03:41:29PM +0100] > Hello Holger, > > On Tue, Jan 21, 2003 at 03:17:05PM +0100, holger krekel wrote: > > > What I think is needed is a tool, that does the > > > translation partially automatically, partially > > > on my command, with some scripted rules. > > > > *This* still seems like a nightmare to me. you > > have the complexity for both developments (CPython > > and PyPython) and the added complexity of syncing > > them. Unless your C-to-python translation is > > fully automatic. > > I would say the goal is to completely automate this translation --- not in the > sense that we have The Ultimate C-to-Python Translator(tm), but in the sense > that all hints and special cases have to be described in custom .py files > attached to the .c sources. "Regular" non-special-case'd changes to the > CPython source come "for free" into MiniPy. Sounds nice but i am not convinced this is doable in a simple way. But simplicity should be a major goal. It's to me what made and makes python a success. > > Note, that taking the 'pythonic code as primary > > ressource' route wasn't possible for the Jython > > people at the time. And they have a hard time > > catching up. > > Both Jython and the old Stackless demonstrate badly enough that translating > the CPython code manually might be quite possible, but if you do it, your > project is doomed to become completely out-of-date after a couple of years. > Let's not repeat that mistake. Wait a moment. My implicit point here was that it's hardly possible to convince CPython-dev people to prototype or implement at Java-level. But it is common python-dev practice already to prototype new stuff in Python and if the interface is stable and clean port it to CPython to make it faster. Now if PyPython helps with the speed problem then the incentive to have a CPython version is lower. Even more so, if you can generate CPython-sources from the python version. I wouldn't mind requiring hints for a special Python-to-CPython generator. cheers, holger From pedronis at bluewin.ch Tue Jan 21 16:04:34 2003 From: pedronis at bluewin.ch (Samuele Pedroni) Date: Tue, 21 Jan 2003 16:04:34 +0100 Subject: [pypy-dev] How to translate 300000 lines of C References: <506F64A6.6E20DBCF.9ADE5C6A@netscape.net><3E2D4C6B.8020300@tismer.com><20030121151704.M12700@prim.han.de> <20030121144129.GB26985@magma.unil.ch> <01c901c2c15b$233425c0$6d94fea9@newmexico> Message-ID: <031d01c2c15e$691ba060$6d94fea9@newmexico> From: "Samuele Pedroni" > From: "Armin Rigo" > > Both Jython and the old Stackless demonstrate badly enough that translating > > the CPython code manually might be quite possible, but if you do it, your > > project is doomed to become completely out-of-date after a couple of years. > > Let's not repeat that mistake. > > Jython is not a manual translation of CPython C but what I do I know, it seems > that here everybody is a Jython expert. > i.e. 
there are algorithms that are more or less direct translations, but that's the easy though boring stuff. If e.g. type/class unification is hard, it is exactly because any kind of direct translation does not cut it, and further java integration should be brought into the picture. From hpk at trillke.net Tue Jan 21 16:17:03 2003 From: hpk at trillke.net (holger krekel) Date: Tue, 21 Jan 2003 16:17:03 +0100 Subject: [pypy-dev] How to translate 300000 lines of C In-Reply-To: <01c901c2c15b$233425c0$6d94fea9@newmexico>; from pedronis@bluewin.ch on Tue, Jan 21, 2003 at 03:40:24PM +0100 References: <506F64A6.6E20DBCF.9ADE5C6A@netscape.net><3E2D4C6B.8020300@tismer.com> <20030121151704.M12700@prim.han.de> <20030121144129.GB26985@magma.unil.ch> <01c901c2c15b$233425c0$6d94fea9@newmexico> Message-ID: <20030121161703.O12700@prim.han.de> [Samuele Pedroni Tue, Jan 21, 2003 at 03:40:24PM +0100] > From: "Armin Rigo" > > Both Jython and the old Stackless demonstrate badly enough that translating > > the CPython code manually might be quite possible, but if you do it, your > > project is doomed to become completely out-of-date after a couple of years. > > Let's not repeat that mistake. > > Jython is not a manual translation of CPython C but what I do I know, it seems > that here everybody is a Jython expert. I'd appreciate it if somebody could share knowledge about Jython. I know that Jython works quite nicely and there are quite some people who are happy with Jython. AFAIK Jython compiles Python into Java-bytecode with no intermediate python interpreter. But i am happy to be corrected. > Jython could use a more active community wrt development > but that's a different kind of issue. FWIW, personally i don't join any Jython effort because i don't particularly like to code in Java unless paid for it. If i had to do a lot of stuff in Java and could use Jython this would be different. holger From pedronis at bluewin.ch Tue Jan 21 16:18:41 2003 From: pedronis at bluewin.ch (Samuele Pedroni) Date: Tue, 21 Jan 2003 16:18:41 +0100 Subject: [pypy-dev] How to translate 300000 lines of C References: <506F64A6.6E20DBCF.9ADE5C6A@netscape.net><3E2D4C6B.8020300@tismer.com> <20030121151704.M12700@prim.han.de> <20030121144129.GB26985@magma.unil.ch> <01c901c2c15b$233425c0$6d94fea9@newmexico> <20030121161703.O12700@prim.han.de> Message-ID: <034101c2c160$61cc5f00$6d94fea9@newmexico> From: "holger krekel" > > Jython could use a more active community wrt development > > but that's a different kind of issue. community = its community, I'm not here begging for workforce :). It was an exposition of matter of fact. From hpk at trillke.net Tue Jan 21 16:54:05 2003 From: hpk at trillke.net (holger krekel) Date: Tue, 21 Jan 2003 16:54:05 +0100 Subject: [pypy-dev] How to translate 300000 lines of C In-Reply-To: <034101c2c160$61cc5f00$6d94fea9@newmexico>; from pedronis@bluewin.ch on Tue, Jan 21, 2003 at 04:18:41PM +0100 References: <506F64A6.6E20DBCF.9ADE5C6A@netscape.net><3E2D4C6B.8020300@tismer.com> <20030121151704.M12700@prim.han.de> <20030121144129.GB26985@magma.unil.ch> <01c901c2c15b$233425c0$6d94fea9@newmexico> <20030121161703.O12700@prim.han.de> <034101c2c160$61cc5f00$6d94fea9@newmexico> Message-ID: <20030121165405.P12700@prim.han.de> [Samuele Pedroni Tue, Jan 21, 2003 at 04:18:41PM +0100] > > > Jython could use a more active community wrt development > > > but that's a different kind of issue. > > community = its community, I'm not here begging for workforce :). > It was an exposition of matter of fact.
And i was trying to discuss why Jython development maybe has different problems regarding keeping-up-to-date with CPython. And that we may actually have an opportunity of providing a system which would *help* CPython development. You don't need to convince anyone on python-dev that python is a worthwile language :-) And i posted because Armin said that we don't want to repeat Stackless and Jython major problems regarding maintainability. holger From daniels at dsl-only.net Tue Jan 21 19:44:11 2003 From: daniels at dsl-only.net (Scott David Daniels) Date: Tue, 21 Jan 2003 10:44:11 -0800 Subject: [pypy-dev] [pypydev] C to Python translation Message-ID: <3E2D94FB.6030101@dsl-only.net> As I read some of the plans here and do the standard "Oooh, Good Idea; Oh No, there's a swamp; Oh, Maybe" dance in my head, I had what might either be a flash of inspiration or maybe a deep lack of insight: Instead of having the C-to-Python compiler derived from lcc (or whatever) in charge of generating python, maybe we can make it track points corresponding to changes in the in C programs. Its job would be simpler: no code generation at all. Find identifed points in a C module which may have changes in it, and indicate the corresponding points "hand-translated" program. This is in part simpler because the "hand-translated" program can have comments (or something) indication what original source range they came from. By original source range I do not mean line, character, or token numbers, but locations that are somehow rediscoverable with a parser on altered source. At the roughest cut, this could be declarations and definitions, but what level of detail you need these points should be much clearer to Christian than to me. For concreteness: A : original CPython source (version 2.33) B : hand-translated version of A Z : possibly unnecessary file describing the A->B mapping and how identification points in A correspond to those in B A': New improved CPython source for A (version 2.34) The tool woud be responsible for taking as input: B Z A A', a diff of A and A' and producing change ranges in B in human readable form (and the corresponding ranges in A and/or A'. It would not automate any code semantics comparisons, but it could focus the maintainer's attention on where changes might have effect. This scheme might reduce the work of hand-tracking changes while avoiding the nightmare of Python-in-name-only code that comes from translating C code to python mechanically. On the other hand, we have five fingers. -Scott David Daniels From bokr at oz.net Tue Jan 21 21:30:22 2003 From: bokr at oz.net (Bengt Richter) Date: Tue, 21 Jan 2003 12:30:22 -0800 Subject: [pypy-dev] MinimalPython newbie questions In-Reply-To: <20030121143433.GA26985@magma.unil.ch> References: Message-ID: <5.0.2.1.1.20030121091127.00a7b4f0@mail.oz.net> At 15:34 2003-01-21 +0100, Armin Rigo wrote: >Hello S?bastien, > >On Tue, Jan 21, 2003 at 11:33:12AM +0100, S?bastien Pierre wrote: >> Looking at the site, which is maintained by Armin, > >I'm not the maintained of that site, althought I often cite it. > >> My question is: will MinimalPython try to incorporate the microthreads >> from stackless and will try to incorporate ideas from tunes project >> (like migration) ? If so MinimalPython would really be an ideal >> language for developing agent-based applications. > >Maybe. There are quite a lot of exciting possibilities that we can dream of, >given enough flexibility. 
Starting from a high-level description of the >interpreter (the interpreter written in Python), Stackless microthreads can >easily be added (or not). I'm wondering how you will get to the starting point of the "high-level description" you mention by any automated means of translating the interpreter from C sources. I.e., you could even say repr(file('/usr/bin/python','r').read()) gets you a quick "Python source representation" of a big chunk of Python, but it isn't high level. Reading in the C sources into Python strings gets you another representation held in Python's hands (or coils ?). But your goal is to transform this into another kind of representation of the same abstract thing, but at a "high level," so that the latter will be suitable for (a) being transformed again by Psyco to an efficient low level for a CPU, and (b) for humans to use as a new master representation of the language interpreter, suitable for human modifications and experimentation. (a) could be viewed as a black box super-optimization project accepting C as the ongoing master source representation, and it wouldn't really matter what the intermediate representations were in generating an efficient fast interpreter. That might be an interesting thing, but I am guessing that is not the key goal for pypy-dev. b) seems a difficult thing, because the C code does not _directly_ reflect the higher level abstractions you want to re-represent in Python. Stepping back for a moment, what would the ideal Python source for Python look like? I don't think it would look like a mechanical translation from C. IMO the concern of keeping up with CPython sources should be dropped in favor of keeping up with developing/running a comprehensive test suite for all versions purporting to implement the same Python language version. The C sources are really irrelevant as far as the language is concerned. Plus there will be times when CPython develops bugs that don't happen when a new PEP is implemented in PPython (hm, handy short name ;-), and vice versa (until PPython replaces CPython as the official master ;-) Except insofar as you want to keep C as the master representation of some functionality, and hope to gain by black-box super-optimization via Psyco without human intervention, or perhaps want to use C as source and not use an external C compiler for some reason, I suspect that automated translation of C should best be used only to generate a starting point for human editing. If C remains a master representation for something, it may be that an ffi approach is more practical until Python source is available. And an editing starting point doesn't even have to be compilable, though it doesn't hurt to make it so through quick-and-dirty comment passthroughs of C for human reference, and sticking to otherwise easy stuff to get some useful tool going fast. Hm, I have a general feeling that in the game of programs-as-data you are probably doomed to reinvent some lisp ;-) Set your time machine to 1960 for minimal language-in-language inspiration ;-) http://lib1.store.vip.sc5.yahoo.com/lib/paulgraham/jmc.lisp A really high level PPython source is probably nice for humans, but a somewhat lower level, using a Python language subset might be easier for Psyco to chew on? So are you planning on identifying stages of language features, and partitioning the PPython source, where the lowest level boostraps higher level capabilities to use for the next stage, or are there only two levels, so the entire language has to be implemented in the lower level? 
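To make the "Python language subset" idea above a bit more concrete: one way to picture such a lower bootstrap level is code that restricts itself to operations with obvious C equivalents, so that a translator or a Psyco-like specializer can map it nearly one-to-one onto machine-level code. A purely illustrative toy example (not any agreed-upon subset definition), a string-hashing loop written with nothing but integer arithmetic, indexing and a while loop:

def toy_string_hash(s):
    # only ints, string indexing and a while loop -- constructs
    # that have direct counterparts in C
    n = len(s)
    x = n
    i = 0
    while i < n:
        x = (x * 1000003) ^ ord(s[i])
        i = i + 1
    return x

Even a toy like this shows where a subset definition has to make choices: Python ints turn into longs on overflow while a C translation would wrap, so the subset would have to pin down which behaviour is meant.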
How are you dealing with the temptation to code before designing ;-) (I recognize you have to find some things out by doing, and "plan to throw one away" etc., but ... ;-) >Migration is a whole harder project, althought if we know what we precisely >want it will probably be possible to add it then without rewriting everything >(which is the whole point of the MiniPython project). > >Migration is related to checkpoints, and saving and restoring program state. >Scott pointed out numerous problems related to it. I personally believe that >these problems araise from some kind of confusion between the various >abstraction levels; when you want a checkpoint, you cannot make a copy of the >state of the whole universe, so you must be clear about which abstraction >levels you want to save and restore. For example, from the C point of view, a >program can only be checkpointed between two expressions; from the processor >point of view, it can be checkpointed between two instructions; but the level >that probably interests you is higher. In a GUI application you may want that >the checkpoint occurs between two elementary user actions. This is a whole >complex problem, outside the scope of MiniPy (which could however become later >a nice test bed). I agree with your point re abstraction levels, but OTOH, ISTM the design of MiniPy/PPython will affect whether you _can_ do larger-granularity checkpointing without resorting to instruction-level checkpointing. Also different kinds of checkpointing are feasible depending on when in the development of execution state you call for it (e.g., before opening any user files, or starting threads, etc.) You could prohibit a lot and still have a useful restricted capability. From arigo at tunes.org Tue Jan 21 22:25:43 2003 From: arigo at tunes.org (Armin Rigo) Date: Tue, 21 Jan 2003 13:25:43 -0800 (PST) Subject: [pypy-dev] Sprint dates Message-ID: <20030121212543.2C1BA1F58@bespin.org> Hello everybody, If we are running for a February sprint, we should fix the dates now. I've no preference myself in February, but cannot free myself during the first week of March. I remember Christian suggesting dates but I seem to have lost them somewhere... A bient?t, Armin. From logistix at zworg.com Tue Jan 21 23:38:58 2003 From: logistix at zworg.com (logistix) Date: Tue, 21 Jan 2003 17:38:58 -0500 Subject: [pypy-dev] (no subject) Message-ID: <200301212238.h0LMcwtu028829@overload3.baremetal.com> > -----Original Message----- > From: pypy-dev-bounces at codespeak.net [mailto:pypy-dev- > bounces at codespeak.net] On Behalf Of Rocco Moretti > Sent: Monday, January 20, 2003 11:08 PM > To: pypy-dev at codespeak.net > Subject: Re: [pypy-dev] question about core implementation language > > logistix wrote: > > >I think the point is that Python as a language provides absolutely no > >access to the machine internals. ?You can't examine memory contents, > >registers, I/O bus or call code directly. > > Can you do this all in C? (without library functions coded in > assembly) > > >As part of the bootstrap > >process the following four functions should probably be added to the > >interperter (either as builtin functions or a module): > > > > ? ?GetMemoryFromOS() > > ? ?ReturnMemouryToOS() > > ? ?Peek() > > ? ?Poke() > > > >ANd probably: > > ? ?get/putRegister() > > ? ?callCode() > > > >Other than that, everything could (eventually) be written in Python. > > If we were writing an Operating System, I would agree with you. (We > can put down a Python based OS as another goal of the project. 
1:wink()>) > > Fortunately, we aren't quite so low level. The standard OS functions > tend to be higher level. As I understand it, much of libc.so on Unix > systems is simply a very thin wrapper around OS functions which behave > much like the C functions that call them. > > Since we don't want to reinvent the wheel, it would probably be best > to reuse the standard interface provided. This is where the minimal C > core comes in. We can probably limit it to basic input and output > (PyBIOS?), plus a generic mechanism for discovering and calling dll/so > functions. > > Or am I missing your point entirely? > > -Rocco > Yeah, Armin also seemed to disagree with me. Here's where I'm coming from (and maybe I'm missing a very big part of what psycho does). Okay we've got a theoretical python interpreter written in python. It: Compiles to python bytecode Uses this to generate C code(or other backend) Uses VC++ or gcc to generate machine code This all works well and good until the interpreter calls eval() or exec (like it does via the code module anytime you do something in IDLE or PythonWin, or even importing an updated module) what happens in this case? The way I see things now, you'd have to: Compile to bytecode Generate C (or other backend) Create dll/so Load DLL/SO into python I would want some way to load code directly into memory (and preferably with minimal File I/O). Hence a few low level memory functions (coded in C or assembly) that are callable from the Python framework. Everything else would be written on a higher level. Is the above the intended workflow for interactive compilation or am I just being an idiot here? ;) From tdelaney at avaya.com Tue Jan 21 23:50:44 2003 From: tdelaney at avaya.com (Delaney, Timothy) Date: Wed, 22 Jan 2003 09:50:44 +1100 Subject: [pypy-dev] Objective of minimal python Message-ID: > From: Armin Rigo [mailto:arigo at tunes.org] > > On Mon, Jan 20, 2003 at 12:51:00PM -0600, Nathan Heagy wrote: > > This might be as good a point as any to jump in with my > question: is > > there planned support for non-x86 platforms? > > Yes, I expect the new Psyco-like code to be much more > flexible and easy to > port to other processors. We are also talking about mixed approaches > involving gcc. I would think that there should be multiple possible targets from psyco depending on the platform. 1. By default the output format is python bytecode. 2. Optionally, C source code could be output and compiled on-the-fly (this would really need a persistent cache to avoid the big hit). 3. For specific platforms, machine code could be output directly. 4. Finally, Python source code which matches the C code that *would* have been produced in (2) as a reference. I would think (1) and (2) would be the first to be implemented. At this point, I don't consider performance to be *any* kind of objective. Getting a working version on one platform, then a working version on all platforms are the first steps. Tim Delaney From hpk at trillke.net Wed Jan 22 01:19:08 2003 From: hpk at trillke.net (holger krekel) Date: Wed, 22 Jan 2003 01:19:08 +0100 Subject: [pypy-dev] Sprint dates In-Reply-To: <20030121212543.2C1BA1F58@bespin.org>; from arigo@tunes.org on Tue, Jan 21, 2003 at 01:25:43PM -0800 References: <20030121212543.2C1BA1F58@bespin.org> Message-ID: <20030122011907.R12700@prim.han.de> [Armin Rigo Tue, Jan 21, 2003 at 01:25:43PM -0800] > Hello everybody, > > If we are running for a February sprint, we should fix the dates now. 
I've no > preference myself in February, but cannot free myself during the first week of > March. > > I remember Christian suggesting dates but I seem to have lost them > somewhere... it was me and the time range was a week between 17th and 27th of February. Originally, we wanted to agree on the exact date at the Berlin Python Meeting next weekend. Also because there will be at least four people present (Christian Tismer, Jens-Uwe Mager, Dinu Gherman and me) who plan to attend the sprint. But we can try to fix it now. I'll propose 17th to 23rd of February. For this time we could have a big room (100 square meters), two beamers for evening presentations and reports, and a kitchen at our disposal. Not to forget a good internet connection. There is also some forest where you can walk out and one piano. For people who don't want or can't spend enough money for external rooms i can arrange something but it will not be luxurious. The sprint will take place in this house: http://www.trillke.net/images/trillke_schnee.png which is situated in Hildesheim, 30km away from Hannover (kind of central germany, 200km south of Hamburg, 250 west of Berlin). There is no company behind this offer of organizing the sprint. But it is related to the 'codespeak' effort which focuses on free software development [1]. There is no commercial intention whatsoever execept, of course, meeting people like the ones on this list (yes, you, dear reader :-) may bring up opportunities in the longer run. Armin, is this concrete enough and fine with you? regards, holger [1] please don't ask me yet what *exactly* codespeak is. For one, it's a site which Jens-Uwe and me collaboratively develop to support/host free software projects. Two, it is also a lot of fun to work with brilliant people who themselves perform "codespeak". It's not a "sourceforge" which is too sweaty, anyway :-) From teyc at cognoware.com Wed Jan 22 01:27:48 2003 From: teyc at cognoware.com (Chui Tey) Date: Wed, 22 Jan 2003 10:27:48 +1000 Subject: [pypy-dev] Keeping it simple Message-ID: <9DB60AFDAB1A5E4093762030DBB38978360BC9@WEBSERVER.advdata.com.au> Hi All, Minimal Python sounds cool. :) I think we should try to stick as closely to the original objectives of the Minimal Python project, namely to aid in the fast prototyping of the Python VM, and reducing the C core to a minimum. Work could be started on two ends: a) making the types in the C interpreter even more Python like. eg. making every attribute in the frameobject writable, and making it subclassable in Python itself. b) Providing a set of patches that hooks the C core into using alternate implementations particular bytecodes. For instance, we might have a Python implementation of BUILD_CLASS instead of the services provided by the C-Core. ... y) Having a python implementation of a bytecode interpreter z) making some of the services of the python implementation eg. frameobjects available to the C core. Eventually this may lead to a version of PyPy where the main interpreter loop lives in Python by using a very small set of core CPython services like lists, and dictionaries (circa Python 1.0). The aim of making PyPy run faster then CPython is noble. However, even if it is not achieved, PyPy will still have great utility. You see, CPython has been finely tuned for many years and there is no reason why people will continue to tune it for many years to come. The tuning has come at the cost of understandability in some cases. 
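Point b) above, a Python implementation of BUILD_CLASS, is a good illustration of how readable such pieces can be. A rough sketch, assuming the usual metaclass lookup order while glossing over classic classes and the module-level __metaclass__ fallback; the function name and the default of type are illustrative choices, not what ceval.c does verbatim:

def build_class(methods, bases, name):
    # determine the metaclass: an explicit __metaclass__ in the class
    # body wins, otherwise take the type of the first base,
    # otherwise fall back to a default
    if '__metaclass__' in methods:
        metaclass = methods['__metaclass__']
    elif bases:
        metaclass = type(bases[0])
    else:
        metaclass = type
    return metaclass(name, bases, methods)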
A Minimal Python would fill a void in providing a reference implementation which can be understood in a single sitting. Aligning PyPy closely with CPython will aid the longevity of the Python Python project and encourage wider adoption of PyPy from the core python developers. Chui -------------- next part -------------- An HTML attachment was scrubbed... URL: From hpk at trillke.net Wed Jan 22 01:45:13 2003 From: hpk at trillke.net (holger krekel) Date: Wed, 22 Jan 2003 01:45:13 +0100 Subject: [pypy-dev] focus: CPython <-> PyPython Message-ID: <20030122014513.S12700@prim.han.de> Hi folks, There have been quite some discussions about how to relate to the ever-progressing CPython development. To recap i try to formulate some agreements: a) the Python language is not to be modified b) if in doubt we follow the CPython lead wherever possible b) CPython core abstraction levels should be cleanly formulated in Python. c) Macros may be used at some levels if there really is no good way to avoid them. But a) and b) still hold. d) our Pythonic core specification is intended to be used for all kinds of generational steps (to C, Bytecode, Object-C, assembler, for PSYCO-Specilization whatever, etc). e) if in doubt i follow Armin's and Christian's lead regarding the right abstraction levels :-) e) there is all kinds of nice stuff to try f) there are a lot of wishes, thoughts, suggestions and ... names :-) There is some disagreement about how to cope with the CPython codebase and python-dev's ongoing C-development. Which isn't really problematic for the time beeing. We might have a model and some working code before PyCon (the python developers conference in Washington in March). It should be an ideal opportunity to discuss stuff. Any strong objections or agreements on this recap? regards, holger From tismer at tismer.com Wed Jan 22 02:10:17 2003 From: tismer at tismer.com (Christian Tismer) Date: Wed, 22 Jan 2003 02:10:17 +0100 Subject: [pypy-dev] MinimalPython newbie questions In-Reply-To: <5.0.2.1.1.20030121091127.00a7b4f0@mail.oz.net> References: <5.0.2.1.1.20030121091127.00a7b4f0@mail.oz.net> Message-ID: <3E2DEF79.5000903@tismer.com> Bengt Richter wrote: ... > I'm wondering how you will get to the starting point of the > "high-level description" you mention by any automated means > of translating the interpreter from C sources. [in-depth explainment why this cannot work] ... > Hm, I have a general feeling that in the game of programs-as-data > you are probably doomed to reinvent some lisp ;-) Fine. What's your problem with that? It does not need Lisp to use related techniques. Do you have a proposal? [snipped the rest] I agree that we are not trying to do something trivial, and that it is impossible to do an automatic translation including a suitable abstraction without interaction. It is more like extracting structure, and applying generalizations by hand, in a re-doable manner. I just re-read your message about "Loving it to death". In that sense, was this an offer to do some 10+ C to Python translations by hand? That would be great! going-back-to-prototyping -- chris From tismer at tismer.com Wed Jan 22 02:18:31 2003 From: tismer at tismer.com (Christian Tismer) Date: Wed, 22 Jan 2003 02:18:31 +0100 Subject: [pypy-dev] focus: CPython <-> PyPython In-Reply-To: <20030122014513.S12700@prim.han.de> References: <20030122014513.S12700@prim.han.de> Message-ID: <3E2DF167.9000803@tismer.com> holger krekel wrote: [not copying the recap] > Any strong objections or agreements on this recap? 
+1 Thanks a lot for this conclusion. I think it is good to get the feet back into the ground and to focus on what we want to get started with. sincerely - chris From tismer at tismer.com Wed Jan 22 02:30:36 2003 From: tismer at tismer.com (Christian Tismer) Date: Wed, 22 Jan 2003 02:30:36 +0100 Subject: [pypy-dev] Sprint dates In-Reply-To: <20030122011907.R12700@prim.han.de> References: <20030121212543.2C1BA1F58@bespin.org> <20030122011907.R12700@prim.han.de> Message-ID: <3E2DF43C.7020001@tismer.com> holger krekel wrote: ... > I'll propose 17th to 23rd of February. For this time we could > have a big room (100 square meters), two beamers for evening > presentations and reports, and a kitchen at our disposal. > Not to forget a good internet connection. There is also > some forest where you can walk out and one piano. Would you mind to announce this publically right now? I think the earlier the better, since the date is only a few weeks away, and people still might think that it will be March, as was said in some earlier post. Hey, a piano -- great! chris From hpk at trillke.net Wed Jan 22 02:32:52 2003 From: hpk at trillke.net (holger krekel) Date: Wed, 22 Jan 2003 02:32:52 +0100 Subject: [pypy-dev] MinimalPython newbie questions In-Reply-To: <5.0.2.1.1.20030121091127.00a7b4f0@mail.oz.net>; from bokr@oz.net on Tue, Jan 21, 2003 at 12:30:22PM -0800 References: <20030121143433.GA26985@magma.unil.ch> <5.0.2.1.1.20030121091127.00a7b4f0@mail.oz.net> Message-ID: <20030122023252.T12700@prim.han.de> Hi Bengt! [Bengt Richter Tue, Jan 21, 2003 at 12:30:22PM -0800] > At 15:34 2003-01-21 +0100, Armin Rigo wrote: > >Maybe. There are quite a lot of exciting possibilities that we can dream of, > >given enough flexibility. Starting from a high-level description of the > >interpreter (the interpreter written in Python), Stackless microthreads can > >easily be added (or not). > > I'm wondering how you will get to the starting point of the > "high-level description" you mention by any automated means > of translating the interpreter from C sources. I.e., you could > even say > repr(file('/usr/bin/python','r').read()) > gets you a quick "Python source representation" of a big chunk > of Python, but it isn't high level. Reading in the C sources into > Python strings gets you another representation [...] I agree with your general direction. However, Armin and Christian have both proven something by realizing amazing projects. Though i have a different oppinion (where we too agree somewhat) on the interaction with the CPython code base i wouldn't exclude even "fantastic" approaches. See my CPython<->PyPython recap posting in another thread. Btw, i am really eager to get to know some of your background. I have read a lot of good posts from you (mainly on c.l.py) but have no actual clue what your are doing. Mind to tell a bit? regards, holger From hpk at trillke.net Wed Jan 22 02:48:13 2003 From: hpk at trillke.net (holger krekel) Date: Wed, 22 Jan 2003 02:48:13 +0100 Subject: [pypy-dev] Sprint dates In-Reply-To: <3E2DF43C.7020001@tismer.com>; from tismer@tismer.com on Wed, Jan 22, 2003 at 02:30:36AM +0100 References: <20030121212543.2C1BA1F58@bespin.org> <20030122011907.R12700@prim.han.de> <3E2DF43C.7020001@tismer.com> Message-ID: <20030122024813.U12700@prim.han.de> [Christian Tismer Wed, Jan 22, 2003 at 02:30:36AM +0100] > holger krekel wrote: > ... > > > I'll propose 17th to 23rd of February. 
For this time we could > > have a big room (100 square meters), two beamers for evening > > presentations and reports, and a kitchen at our disposal. > > Not to forget a good internet connection. There is also > > some forest where you can walk out and one piano. > > Would you mind to announce this publically > right now? I think the earlier the better, > since the date is only a few weeks away, > and people still might think that it will > be March, as was said in some earlier post. I can announce it tommorow evening (CET) if i get some feedback at least from Armin but also from others who are interested. Can't do it earlier because i will be travelling home tommorow and need some sleep soonish :-) holger From bokr at oz.net Wed Jan 22 03:40:01 2003 From: bokr at oz.net (Bengt Richter) Date: Tue, 21 Jan 2003 18:40:01 -0800 Subject: [pypy-dev] MinimalPython newbie questions In-Reply-To: <3E2DEF79.5000903@tismer.com> References: <5.0.2.1.1.20030121091127.00a7b4f0@mail.oz.net> <5.0.2.1.1.20030121091127.00a7b4f0@mail.oz.net> Message-ID: <5.0.2.1.1.20030121174156.00a8a2b0@mail.oz.net> At 02:10 2003-01-22 +0100, Christian Tismer wrote: >Bengt Richter wrote: >... > >>I'm wondering how you will get to the starting point of the >>"high-level description" you mention by any automated means >>of translating the interpreter from C sources. > >[in-depth explainment why this cannot work] Ouch. I think you have an exciting project. I hope you don't take my attempt to get it clear in my head as negativism. I was trying to factor out different aspects as I saw them, to ask if I was looking at things the way you are re goals of a new high level PPython source for humans vs python-in- any-form-just-so-Pysco-can-get-its-teeth-into-it-and-do-its-magic. >... > >>Hm, I have a general feeling that in the game of programs-as-data >>you are probably doomed to reinvent some lisp ;-) > >Fine. What's your problem with that? No problem, it just flashed into my mind. And I remembered the little eval thing, which I though was kind of interesting. >It does not need Lisp to use related techniques. Sure, I didn't mean to imply that. >Do you have a proposal? Not at this time, sorry. But I am trying to think about the problem, FWIW. It is interesting ;-) >[snipped the rest] > >I agree that we are not trying to do something trivial, >and that it is impossible to do an automatic translation >including a suitable abstraction without interaction. >It is more like extracting structure, and applying >generalizations by hand, in a re-doable manner. Does that mean you intend to build a CPython->PPython translator (with whatever manually prepared hints and directives in files augmenting the C inputs) for continued use? I.e., anticipating treating CPython as master source code for the forseeable future, as opposed to a one-time translation that is validated by testing against test suites, and then maintained at the source/PEP level, rather than re-translating new C? >I just re-read your message about "Loving it to death". Do you think I fulfilled my own fears? I hope not ;-/ >In that sense, was this an offer to do some 10+ C to >Python translations by hand? That would be great! I would like to help. I'm not sure my best help would be to jump into coding, but it might give me a better appreciation for what you are dealing with, so if you want to mention some particular C to translate, and some example of patterns you'd like used, I'd be happy to look at it. 
Does it mean that you now have a project plan with particular coding tasks that can be delegated? BTW, I can promise not to speak until spoken to about whatever, if that's best at any given time ;-) >going-back-to-prototyping -- chris When will we get a peek? Or did I miss an URL somewhere ;-) Best wishes, Bengt From scott at fenton.baltimore.md.us Wed Jan 22 03:05:14 2003 From: scott at fenton.baltimore.md.us (Scott Fenton) Date: Tue, 21 Jan 2003 21:05:14 -0500 Subject: [pypy-dev] __builtin__ module Message-ID: <20030122020514.GA4079@debian.fenton.baltimore.md.us> Hello all. Due to the fact that there's no way in hell for me to get out of the US by March, and due to the fact that I love this concept, I've hacked up some basic replacements for various functions in __builtin__. The code resides at http://fenton.baltimore.md.us/pypy.py Take a look at it and tell me what you think. Currenty, this code implements everything BUT: * the exceptions (seperate concept) * builtin types (not sure how to handle them) * callable (not sure how to test) * classmethod (ditto) * coerce (tritto) * compile (needs an actual compiler, and the AST module scares me) * dir (not sure how to get current scope) * eval (not my job(TM)) * execfile (ditto(TM)) * hex, oct (lazyness, it'll be in version 2) * id (lower level than I can handle) * intern (didn't understand the docstring) * isinstance, issubclass (see classmethod) * globals, locals (not sure how to get ahold of them) * raw_input (really complex) * staticmethod (ugly hackery) * super (see classmethod) * type (need more info, dammit!) * unichr (given chr, unichr frightens me) * xrange (sorta builtin type) -Scott Fenton -- char m[9999],*n[99],*r=m,*p=m+5000,**s=n,d,c;main(){for(read(0,r,4000);c=*r; r++)c-']'||(d>1||(r=*p?*s:(--s,r)),!d||d--),c-'['||d++||(*++s=r),d||(*p+=c== '+',*p-=c=='-',p+=c=='>',p-=c=='<',c-'.'||write(1,p,1),c-','||read(2,p,1));} -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: not available URL: From hpk at trillke.net Wed Jan 22 04:02:04 2003 From: hpk at trillke.net (holger krekel) Date: Wed, 22 Jan 2003 04:02:04 +0100 Subject: [pypy-dev] __builtin__ module In-Reply-To: <20030122020514.GA4079@debian.fenton.baltimore.md.us>; from scott@fenton.baltimore.md.us on Tue, Jan 21, 2003 at 09:05:14PM -0500 References: <20030122020514.GA4079@debian.fenton.baltimore.md.us> Message-ID: <20030122040204.Y12700@prim.han.de> [Scott Fenton Tue, Jan 21, 2003 at 09:05:14PM -0500] > Hello all. Due to the fact that there's no way > in hell for me to get out of the US by March, > and due to the fact that I love this concept, > I've hacked up some basic replacements for > various functions in __builtin__. The code > resides at http://fenton.baltimore.md.us/pypy.py > Take a look at it and tell me what you think. Cool! A pity you can't take part. But latest at the next EuroPython we should do another sprint. I really want to get a repository going soon. You code should be checked in there. I'd love to have people porting various bits of C-coded stuff into pure Python. Though i prefer coding myself i would take some time to organize such an effort. Help with organizing this C to Python effort would be much appreciated. Think of testing (sic!), nightly builds, SCons etc.pp. I'd prefer to do this on a subversion repository which we (at codespeak) hopefully set up pretty soon. I guess i am not the only one eager to try this out. 
If there are big obstacles we resort to good old cvs. Sorry for (ab)using your contribution for this interwoven half-announcement. regards, holger From roccomoretti at netscape.net Wed Jan 22 04:46:12 2003 From: roccomoretti at netscape.net (Rocco Moretti) Date: Tue, 21 Jan 2003 22:46:12 -0500 Subject: [pypy-dev] How to translate 300000 lines of C Message-ID: <7B82C048.2F93F892.9ADE5C6A@netscape.net> One issue re: the automatic C to Python translator just occurred to me. It's been mentioned that only portions of the C code would be automatically translated, certain sections being translated by hand. How would the interface between the two work if the hand translated sections differ from the C version? Simplest example is basic types. A Python version would presumably be represented as a class with member functions. But the C code uses calls to accessor functions. How could we redirect the automatically generated code to call member functions as opposed to free floating functions? A more difficult problem lies when we start to reorganize the object/structure internals. A good case of this is the frame object. The C version has an array with multiple pointers to internal objects. Once you reorganize the internals, how do you tell the C->Python translator that f_stacktop isn't a PyObject pointer anymore, but an index off of another list? Not to dissuade the effort, but I'm concerned we could get a situation where we are hesitant to rearrange the internals for the better because it would break our tools. Cautiously Optimistic, -Rocco From theller at python.net Wed Jan 22 09:08:09 2003 From: theller at python.net (Thomas Heller) Date: 22 Jan 2003 09:08:09 +0100 Subject: [pypy-dev] __builtin__ module In-Reply-To: <20030122020514.GA4079@debian.fenton.baltimore.md.us> References: <20030122020514.GA4079@debian.fenton.baltimore.md.us> Message-ID: <3cnltxna.fsf@python.net> Scott Fenton writes: > Hello all. Due to the fact that there's no way > in hell for me to get out of the US by March, > and due to the fact that I love this concept, > I've hacked up some basic replacements for > various functions in __builtin__. The code > resides at http://fenton.baltimore.md.us/pypy.py > Take a look at it and tell me what you think. Your implementations of ord() and chr() are somewhat inefficient, because they rebuild the list/dictionary each time. Pass them to dis.dis() and you'll see what I mean. Otherwise - cool. > > Currenty, this code implements everything BUT: > > * the exceptions (seperate concept) > * builtin types (not sure how to handle them) > * callable (not sure how to test) > * classmethod (ditto) > * coerce (tritto) > * compile (needs an actual compiler, and the AST module scares me) > * dir (not sure how to get current scope) > * eval (not my job(TM)) > * execfile (ditto(TM)) > * hex, oct (lazyness, it'll be in version 2) > * id (lower level than I can handle) > * intern (didn't understand the docstring) > * isinstance, issubclass (see classmethod) > * globals, locals (not sure how to get ahold of them) > * raw_input (really complex) > * staticmethod (ugly hackery) > * super (see classmethod) > * type (need more info, dammit!)
> * unichr (given chr, unichr frightens me) > * xrange (sorta builtin type) Hm, it should be possible to find suitable tests in the lib/test subdir, IMO. Thomas From gsw at agere.com Wed Jan 22 14:53:07 2003 From: gsw at agere.com (Gerald S. Williams) Date: Wed, 22 Jan 2003 08:53:07 -0500 Subject: [pypy-dev] RE: pypy-dev Digest, Vol 13, Issue 2 In-Reply-To: <20030122110002.AD97E5ABCB@thoth.codespeak.net> Message-ID: Rocco Moretti wrote: > How would the interface between the two work if the hand translated sections differ > from the C version? I can give general answers from past experience. > Simplest example is basic types. A Python version would presumably be represented as a > class with member functions. But the C code uses calls to accessor functions. How could > we redirect the automatically generated code to call member functions as opposed to > free floating functions? One answer: provide such functions, which call the member functions as needed. > A more difficult problem lies when we we start to reorganize the object/structure > internals. Good case of this is the frame object. The C version has an array with > multiple pointers to internal objects. Once you reorganize the internals, how do > you tell the C->Python translator that f_stacktop isn't a PyObject pointer anymore, > but an index off of another list? If the class was providing public attributes, you may not want to change their meaning. f_stacktop could provide an accessor for the old representation. The new representation would be in _f_stacktop or something. -Jerry From arigo at tunes.org Wed Jan 22 15:58:19 2003 From: arigo at tunes.org (Armin Rigo) Date: Wed, 22 Jan 2003 15:58:19 +0100 Subject: [pypy-dev] How to translate 300000 lines of C In-Reply-To: <20030121160504.N12700@prim.han.de> References: <506F64A6.6E20DBCF.9ADE5C6A@netscape.net> <3E2D4C6B.8020300@tismer.com> <20030121151704.M12700@prim.han.de> <20030121144129.GB26985@magma.unil.ch> <20030121160504.N12700@prim.han.de> Message-ID: <20030122145819.GA10868@magma.unil.ch> Hello Holger, On Tue, Jan 21, 2003 at 04:05:04PM +0100, holger krekel wrote: > Sounds nice but i am not convinced this is doable in a simple > way. But simplicity should be a major goal. It's to me what > made and makes python a success. > > (...) > > My implicit point here was that it's hardly possible to > convince CPython-dev people to prototype or implement > at Java-level. I now feel that I was off-target. Mea culpa. Our goal is indeed not to keep as close as CPython as possible, but rather to suggest an alternate implementation that might be used as reference at some time. Let's forget all my arguments in favour of a fully automated translation. We may keep Christian's ideas of a C-to-Python helper, which we might use for specific parts of CPython; the resulting Python code would still be the reference (with maintainability acheivable using various tricks like diff'ing the outputs of our translator on the old and the new CPython version to know what we should update in our Python-only reference implementation). Armin. 
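The diff'ing trick mentioned above needs nothing exotic; a minimal sketch using the standard difflib module (the translate() helper named in the usage comment is hypothetical, standing in for whatever the C-to-Python tool ends up being called):

import difflib

def changed_regions(old_lines, new_lines):
    # compare the translator's output for the old and the new CPython
    # release and collect the regions that differ, i.e. the spots where
    # the hand-maintained reference implementation deserves a look
    matcher = difflib.SequenceMatcher(None, old_lines, new_lines)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != 'equal':
            changes.append((tag, old_lines[i1:i2], new_lines[j1:j2]))
    return changes

# usage sketch, with a hypothetical translate() helper:
# old_lines = translate('ceval.c', version='2.2').splitlines()
# new_lines = translate('ceval.c', version='2.3').splitlines()
# for tag, before, after in changed_regions(old_lines, new_lines):
#     print tag, before, after

Only the regions reported here would need a human look; unchanged parts of the translator output confirm that the corresponding Python code can stay as it is.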
From arigo at tunes.org Wed Jan 22 16:56:39 2003 From: arigo at tunes.org (Armin Rigo) Date: Wed, 22 Jan 2003 16:56:39 +0100 Subject: [pypy-dev] (no subject) In-Reply-To: <200301212238.h0LMcwtu028829@overload3.baremetal.com> References: <200301212238.h0LMcwtu028829@overload3.baremetal.com> Message-ID: <20030122155639.GC10868@magma.unil.ch> Hello Logistix, On Tue, Jan 21, 2003 at 05:38:58PM -0500, logistix wrote: > I would want some way to load code directly into memory (and preferably > with minimal File I/O). Hence a few low level memory functions (coded > in C or assembly) that are callable from the Python framework. Now I think I see what you meant. But there are two different C-emission processes that we are talking about: 1) given the Python interpreter written in Python, we want to statically translate it into some CPython-like C code, experimenting variants etc. but always having a "classical" (non-Psyco) interpreter. This should be regarded as the goal of the project right now, focusing on nice Python code, not specifically optimized. Most of the C code making this new interpreter should come from 100% automatic translation of the Python-in-Python; it could be completed with a few C modules that handle all issues specific to C (e.g. dynamic loading libraries or signal polling). 2) later, we can add a Psyco layer made of some (hopefully small) parts of the current Psyco, and some specific back-end code (which may or may not be written directly in C). As above the most important part of this project must be 100% automatically translated from the Python-in-Python. I think the issues you mention arise if we want to write absolutely everything in Python, which I think we don't: issues specific to the final C/ObjC/OCaml/wantever translation may be coded in C/ObjC/OCaml/whatever, as far as I'm concerned, and the same with Psyco. A bient?t, Armin. From arigo at tunes.org Wed Jan 22 17:02:41 2003 From: arigo at tunes.org (Armin Rigo) Date: Wed, 22 Jan 2003 17:02:41 +0100 Subject: [pypy-dev] focus: CPython <-> PyPython In-Reply-To: <20030122014513.S12700@prim.han.de> References: <20030122014513.S12700@prim.han.de> Message-ID: <20030122160241.GD10868@magma.unil.ch> Hello Holger, On Wed, Jan 22, 2003 at 01:45:13AM +0100, holger krekel wrote: > e) if in doubt i follow Armin's and Christian's > lead regarding the right abstraction levels :-) > > e) there is all kinds of nice stuff to try Interessingly enough, you used the same letter for the two points, probably meaning that trusting Christian and me is a nice stuff to try :-) > Any strong objections or agreements on this recap? +1 too. A good starting point for future discussion. A bient?t, Armin. From arigo at tunes.org Wed Jan 22 17:22:43 2003 From: arigo at tunes.org (Armin Rigo) Date: Wed, 22 Jan 2003 17:22:43 +0100 Subject: [pypy-dev] __builtin__ module In-Reply-To: <20030122020514.GA4079@debian.fenton.baltimore.md.us> References: <20030122020514.GA4079@debian.fenton.baltimore.md.us> Message-ID: <20030122162243.GE10868@magma.unil.ch> Hello Scott, On Tue, Jan 21, 2003 at 09:05:14PM -0500, Scott Fenton wrote: > I've hacked up some basic replacements for > various functions in __builtin__. Fine! The greatest benefits of your code is that it clearly shows what can be directly implemented in Python, what could be, and what cannot. A function like chr() is in the second category: it could be written in Python as you did, but it is not "how we feel a chr() implementation should be". 
It should just build a one-string character, putting 'i' somewhere as an ASCII code. But how? This problem, and the deeper problems with some other functions, come from the fact that you placed your code at the same level as the Python interpreter. The functions you wrote could be used in the current CPython interpreter, in place of the existing built-ins. In PyPython, these are functions that we will populate the emulated built-ins with. We still need two levels: the functions that operate at the same level as the interpreted programs (like yours), and the functions that operate at the level of the interpreter (like CPython's built-in functions). In other words, we still need the notion of built-in functions that will be as "magic" as CPython's in the sense that they do something that couldn't be done by user code. You see what I mean when you try to rewrite the type() builtin :-) Armin PS: just a comment about abs(), cmp() and len(). These should use the underlying __abs__(), __cmp__() and __len__() methods, as you have done for apply(), bool(), hash(), pow() and repr(). From theller at python.net Wed Jan 22 17:57:22 2003 From: theller at python.net (Thomas Heller) Date: 22 Jan 2003 17:57:22 +0100 Subject: [pypy-dev] __builtin__ module In-Reply-To: <20030122162243.GE10868@magma.unil.ch> References: <20030122020514.GA4079@debian.fenton.baltimore.md.us> <20030122162243.GE10868@magma.unil.ch> Message-ID: Armin Rigo writes: > Fine! The greatest benefits of your code is that it clearly shows what can be > directly implemented in Python, what could be, and what cannot. > > A function like chr() is in the second category: it could be written in Python > as you did, but it is not "how we feel a chr() implementation should be". It > should just build a one-string character, putting 'i' somewhere as an ASCII > code. But how? I don't think so. For me, this is a fine implementation of chr(): def chr(i): return "\x00\x01\x02x03...\xFF"[i] Maybe a check should be added to make sure 'i' is between 0 and 255 :-) But the resposibility to construct string objects is not chr()'s burdon, IMO. In a CPython extension and probably also in the core there are helper functions to build these strings, the implementor of chr() would use them. Thomas From boyd at strakt.com Wed Jan 22 18:06:55 2003 From: boyd at strakt.com (Boyd Roberts) Date: Wed, 22 Jan 2003 18:06:55 +0100 Subject: [pypy-dev] __builtin__ module References: <20030122020514.GA4079@debian.fenton.baltimore.md.us> <20030122162243.GE10868@magma.unil.ch> Message-ID: <3E2ECFAF.3010104@strakt.com> Thomas Heller wrote: >I don't think so. For me, this is a fine implementation of chr(): > >def chr(i): > return "\x00\x01\x02x03...\xFF"[i] > > That's dreadful. def chr(i): return'%c' % i From nathanh at zu.com Wed Jan 22 18:14:08 2003 From: nathanh at zu.com (Nathan Heagy) Date: Wed, 22 Jan 2003 11:14:08 -0600 Subject: [pypy-dev] __builtin__ module In-Reply-To: <20030122040204.Y12700@prim.han.de> Message-ID: > I really want to get a repository going soon. Let's get this started! I'm ready to write some code as well if someone could point out a good place to start. 
-- Nathan Heagy phone:306.653.4747 fax:306.653.4774 http://www.zu.com From theller at python.net Wed Jan 22 18:15:12 2003 From: theller at python.net (Thomas Heller) Date: 22 Jan 2003 18:15:12 +0100 Subject: [pypy-dev] __builtin__ module In-Reply-To: <3E2ECFAF.3010104@strakt.com> References: <20030122020514.GA4079@debian.fenton.baltimore.md.us> <20030122162243.GE10868@magma.unil.ch> <3E2ECFAF.3010104@strakt.com> Message-ID: <7kcxumvz.fsf@python.net> Boyd Roberts writes: > Thomas Heller wrote: > > >I don't think so. For me, this is a fine implementation of chr(): > > > >def chr(i): > > return "\x00\x01\x02x03...\xFF"[i] > > > > That's dreadful. > > def chr(i): > return'%c' % i > Maybe, but you probably got my point... Thomas From tfanslau at gmx.de Wed Jan 22 22:33:37 2003 From: tfanslau at gmx.de (Thomas Fanslau) Date: Wed, 22 Jan 2003 22:33:37 +0100 Subject: [pypy-dev] Porting Python C-Code to Python Message-ID: <3E2F0E31.40307@gmx.de> I'm reading the list for some time now ... and although I understand what you want to achieve I may not really understand what is involved in reaching that goal ... But lately I read about porting that C-Code to Python and I have a question/suggestion that may be a serious candidate for the most stupid idea :) Shouldn't it be possible to write a (small) backend to gcc to generate python-bytecode? So you can translate the C-Part of Python (or any other C-Code :) to python-bytecode and run it. That should be easier then all the other suggestions and allows you to replace the C-Code Module for Module ... --tf From nathanh at zu.com Wed Jan 22 23:25:50 2003 From: nathanh at zu.com (Nathan Heagy) Date: Wed, 22 Jan 2003 16:25:50 -0600 Subject: [pypy-dev] Porting Python C-Code to Python In-Reply-To: <3E2F0E31.40307@gmx.de> Message-ID: <760A5DD5-2E58-11D7-8410-00039385F5E6@zu.com> I've heard that this is not easy at all. iirc gcc can only really make code for register-based systems. the java back end is apparently a big hack that basically sidesteps all the gcc internals. On Wednesday, January 22, 2003, at 03:33 PM, Thomas Fanslau wrote: > I'm reading the list for some time now ... and although I understand > what you want to achieve I may not really understand what is involved > in reaching that goal ... > > But lately I read about porting that C-Code to Python and I have a > question/suggestion that may be a serious candidate for the most > stupid idea :) > > Shouldn't it be possible to write a (small) backend to gcc to generate > python-bytecode? So you can translate the C-Part of Python (or any > other C-Code :) to python-bytecode and run it. That should be easier > then all the other suggestions and allows you to replace the C-Code > Module for Module ... > > --tf > > _______________________________________________ > pypy-dev at codespeak.net > http://codespeak.net/mailman/listinfo/pypy-dev > > -- Nathan Heagy phone:306.653.4747 fax:306.653.4774 http://www.zu.com From bokr at oz.net Thu Jan 23 00:21:26 2003 From: bokr at oz.net (Bengt Richter) Date: Wed, 22 Jan 2003 15:21:26 -0800 Subject: [pypy-dev] __builtin__ module In-Reply-To: <7kcxumvz.fsf@python.net> References: <3E2ECFAF.3010104@strakt.com> <20030122020514.GA4079@debian.fenton.baltimore.md.us> <20030122162243.GE10868@magma.unil.ch> <3E2ECFAF.3010104@strakt.com> Message-ID: <5.0.2.1.1.20030122130959.00a9b800@mail.oz.net> At 18:15 2003-01-22 +0100, Thomas Heller wrote: >Boyd Roberts writes: > >> Thomas Heller wrote: >> >> >I don't think so. 
For me, this is a fine implementation of chr(): >> > >> >def chr(i): >> > return "\x00\x01\x02x03...\xFF"[i] >> > >> >> That's dreadful. >> >> def chr(i): >> return'%c' % i >> > >Maybe, but you probably got my point... I see four issues in the above that might be worth some guiding words from the leaders[a]: 1) premature optimization, 2)appropriate level of Python code for coding PyPython, 3) appropriate definition semantics, 4) dependencies in definitions. 1) Standard advice, presumably 2) What kinds of constructs will be best for Psyco to deal with? When coding the definition of something, should one ask: 2a) Is the thing being coded part of core minimal PyPython, so coding should be limited to a Python subset, or 2b) Is the thing being coded at the next level, so that the full language can be presumed to be available? 2c) How does one know whether a thing belongs to 2a or 2b, or is this distinction really necessary in the current thinking? 3) When is it good to define primitively, like Thomas's definition, and when is it good to define in terms of a composition implicitly delegating to existing function(s), like Boyd's? And should one be careful as to whether the functions one is depending on belong to 2a or 2b? 4) Should a dependency tree be documented as we go, e.g., to avoid hidden circular dependencies in delegated functionality, but also to make clear levels of primitiveness? (BTW, ISTM this would help in any future attempt to factor out the implementation of a primitive core functionality). The two definitions of chr() above are examples of primitive vs composite/delegating definitions, which is what brought this to mind (which is not to say that the "primitive" definition doesn't depend on anything, but it's composing with lower level primitives). Regards, Bengt [a] Should I have cc'd Armin, Chris, and Holger? I.e., would that have been courtesy or redundant annoyance? From logistix at zworg.com Thu Jan 23 01:21:39 2003 From: logistix at zworg.com (logistix) Date: Wed, 22 Jan 2003 19:21:39 -0500 Subject: [pypy-dev] Notes on compiler package Message-ID: <200301230021.h0N0Ld2p012793@overload3.baremetal.com> I'm finally starting to get my head wrapped around the compiler package, and thought I'd pass some notes on to the group since it's a beast: 1) Other than the original AST generation, it looks like it's all pure python (not tying into the C source). The original AST comes from the parser module, which is a pure wrapper module (although I've already posted python code to build ast's here) 2) Internal CPython ASTs are transformed into "compiler"'s own ASTs. These nodes are smarter than the internal ones; they attach appropriate info to "attributes". CPython AST nodes just have a list of children, this module identifies and breaks out important data (for example, the arguments in a function def become an "args" attribute instead of guessing it's the first child) 3) The CodeGenerators walk the ASTs and emit code into a code object. 4) Speed seems reasonable (which makes me think I'm missing some calls into the C side of things) 4) The generated bytecode looks reasonably good and execs fine, but there are still some diffences from what CPython generates. Here's what I've seen so far: SET_LINENO doesn't always work the same (this is how python knows what line in a file threw an exception) Doesn't have one of the few optimizations the CPython compiler does have. CPython will throw out misplaced "docstrings"... strings in the source that aren't assigned to anything. 
This throws off array indexes in the code objects co_const attribute. In general, the compiler package is in alot better shape than I expected. If anyone is interested in poking around, here's a quick script that does basic comparison and diff on the bytecode generated by the builtin compile and compiler's equivilent function. I'll probably expand this to compare the whole code objects in the near future. =============== CompilerTest.py =============== import sys import compiler import dis def opcodeTuples(source): """ Makes Opcode tuples in the form of: OFFSET, NAME, [OPTIONAL PARAM] """ retVal = [] a = iter(source) offset = 0 def getByte(next=a.next): return ord(a.next()) def getWord(next=a.next): return ord(a.next()) + ord(a.next()) * 256 # Little-endian while 1: try: opcode = getByte() opname = dis.opname[opcode] if opcode < 90: retVal.append( (offset, dis.opname[opcode]) ) offset += 1 else: retVal.append( (offset, dis.opname[opcode], getWord())) offset += 3 except StopIteration: break return retVal def opcodeDiff(ops1, ops2): """ Does a simple DIFF of two sets of opcodes. Can only check one skipped line. Ignores param for now since they don't match """ opcode1, opcode2 = opcodeTuples(ops1), opcodeTuples(ops2) a,b = 0,0 print "%30s%30s" % ("FIRST", "SECOND") print "%30s%30s" % ("====================" ,"====================") while 1: if opcode1[a][1:2] == opcode2[b][1:2]: print "%30s%30s" % (opcode1[a], opcode2[b] ), if opcode1[a][2:] != opcode2[b][2:]: print " ARG MISMATCH" else: print a += 1 b += 1 elif opcode1[a+1][1:2] == opcode2[b][1:2]: print "%30s%30s" % (opcode1[a], "") a += 1 elif opcode1[a][1:2] == opcode2[b+1][1:2]: print "%30s%30s" % ("", opcode2[b]) b += 1 else: print "NONTRIVIAL DIFF%25s%25s" % (opcode1[a], opcode2[b]) break if a >= len(opcode1) and b >= len(opcode2): break elif a >= len(opcode1) or b >= len(opcode2): print "UNEXPECTED END OF OPCODES" break def compareCompiles(filename, nativeCompile=compile, pythonCompile=compiler.compile): """ Compares a bytecode compile between the native python compiler and the one written in python """ source = file(filename).read() native = nativeCompile(source, filename, "exec") python = pythonCompile(source, filename, "exec") if native.co_code == python.co_code: print "compiles matched" else: print "compiles didn't match" opcodeDiff(native.co_code, python.co_code) if __name__ == "__main__": compareCompiles("c:\\python23\\lib\\code.py") =============== End CompilerTest.py =============== --------------------------------------- Get your free e-mail address @zworg.com From hpk at trillke.net Thu Jan 23 01:27:19 2003 From: hpk at trillke.net (holger krekel) Date: Thu, 23 Jan 2003 01:27:19 +0100 Subject: [pypy-dev] [ann] Sprint to Minimal Python Message-ID: <20030123012719.B16000@prim.han.de> Hi, please find enclosed the Minimal Python Sprint announcement. I post on pypy-dev first to get some feedback before sending it off to other lists... ------------------------------------------------------ Code Sprint towards Minimal Python 17th-23rd Feb. 2003 ------------------------------------------------------ Everybody is invited to join our first Minimal Python Marathon Or was it PyPython? Nevermind. We will have one or - if needed - two big rooms, beamers for presentations (and movies?), a kitchen, internet and a piano. There is a big park and some forest in case you need some fresh air. Short Ad-Hoc presentations about your area of interest, project or plain code will certainly be appreciated. 
------------------------------------------------------ Goals of the first Minimal PyPython Marathon ------------------------------------------------------ - codify some ideas that were recently discussed on the pypy-dev codespeak list. - port your favorite C-module to Python (and maintain it :-) - build & enhance infrastructure (python build system, webapps, email, subversion/cvs, automated testing, ...) - rebuild the CPython-Python interpreter in simple Python, running on a minimal "transitive closure" of/in CPython. - try to write platform dependent Assembler/C-code that enables doing C/Machine-level calls from Python without extra C-bindings. - settle on concepts - focus on having some usable results at the end - have a lot of fun meeting like minded people. ------------------------------------------------------ Current Basic Agreements ------------------------------------------------------ Please note, that we have reached some agreement on a number of basic points: a) the Python language is not to be modified. b) simplicity wins. especially against optimization. c) if in doubt we follow the CPython/PEP lead. d) CPython core abstractions are to be cleanly formulated in simple Python. e) Macro-techniques may be used at some levels if there really is no good way to avoid them. But a) to c) still hold. f) our Pythonic core "specification" is intended to be used for all kinds of generational steps e.g to C, Bytecode, Objective-C, assembler and last-but-not-least PSYCO-Specialization. g) if in doubt we follow Armins and Christians lead regarding the right abstraction levels. And any other python people taking responsibility :-) h) there are a lot of wishes, code, thoughts, suggestions and ... names :-) ------------------------------------------------------ How to proceed if you are interested ------------------------------------------------------ If you are interested to come - even parttime - then please subscribe at http://codespeak.net/mailman/listinfo/pypy-sprint where organisational stuff will be communicated. Code- related discussions still take place on the pypy-dev list. For people who don't want or can't spend the money for external rooms i can probably arrange something but it will not be luxurious. btw, there are already five people who will come among them - you guessed it - Christian Tismer and Armin Rigo. ------------------------------------------------------ Disclaimer ------------------------------------------------------ There is no explicit commercial intention behind the organisation of the sprint. On the implicit hand socializing with like-minded people tends to bring up future opportunities. Especially if it's free codespeak ... ups :-) cheers, holger From tismer at tismer.com Thu Jan 23 03:42:48 2003 From: tismer at tismer.com (Christian Tismer) Date: Thu, 23 Jan 2003 03:42:48 +0100 Subject: [pypy-dev] __builtin__ module In-Reply-To: <3cnltxna.fsf@python.net> References: <20030122020514.GA4079@debian.fenton.baltimore.md.us> <3cnltxna.fsf@python.net> Message-ID: <3E2F56A8.3080100@tismer.com> Thomas Heller wrote: > Scott Fenton writes: > > >>Hello all. Due to the fact that there's no way >>in hell for me to get out of the US by March, >>and due to the fact that I love this concept, >>I've hacked up some basic replacements for >>various functions in __builtin__. The code >>resides at http://fenton.baltimore.md.us/pypy.py >>Take a look at it and tell me what you think. 
> > > Your implementations of ord() and chr() are somewhat inefficient, > because they rebuild the list/dictionary each time. > Pass them to dis.dis() and you'll see what I mean. This is correct for pure Python. One would have computed the tables once and either used them as a global, or as a default value for a dummy function parameter. On the other hand, with the assumption that Psyco or its successor will be able to deduce constant local expressions and turn them into constants, this approach is absolutely fine; despite the fact that these tables will most probably not be used and replaced by their trivial implementation. I anyway do appreciate the effort very much: Trying to reduce stuff based upon a minimum! cheers - chris From tismer at tismer.com Thu Jan 23 03:56:27 2003 From: tismer at tismer.com (Christian Tismer) Date: Thu, 23 Jan 2003 03:56:27 +0100 Subject: [pypy-dev] Notes on compiler package In-Reply-To: <200301230021.h0N0Ld2p012793@overload3.baremetal.com> References: <200301230021.h0N0Ld2p012793@overload3.baremetal.com> Message-ID: <3E2F59DB.80205@tismer.com> logistix wrote: > I'm finally starting to get my head wrapped around the compiler package, > and thought I'd pass some notes on to the group since it's a beast: > > 1) Other than the original AST generation, it looks like it's all pure > python (not tying into the C source). The original AST comes from the > parser module, which is a pure wrapper module (although I've already > posted python code to build ast's here) Oh, is that true? Last time, when I analysed what it would take to build a compiler for Python in Python, I ended up in the parser module being called, finally, which called back into the internals stuff. This was for Python 2.2.2; maybe this changed now? ciao - chris From logistix at zworg.com Thu Jan 23 04:43:09 2003 From: logistix at zworg.com (logistix) Date: Wed, 22 Jan 2003 22:43:09 -0500 Subject: [pypy-dev] Notes on compiler package Message-ID: <200301230343.h0N3h9Kv000791@overload3.baremetal.com> Christian Tismer wrote: > > logistix wrote: > > I'm finally starting to get my head wrapped around the compiler package, > > and thought I'd pass some notes on to the group since it's a beast: > > > > 1) Other than the original AST generation, it looks like it's all pure > > python (not tying into the C source). The original AST comes from the > > parser module, which is a pure wrapper module (although I've already > > posted python code to build ast's here) > > Oh, is that true? > Last time, when I analysed what it would take to > build a compiler for Python in Python, I ended > up in the parser module being called, finally, > which called back into the internals stuff. > This was for Python 2.2.2; maybe this changed now? > > ciao - chris > > Here's the recursive decent parser I wrote to build ASTs about six months ago for 2.2 (just tested on 2.3a): http://members.bellatlantic.net/~olsongt/concrete.zip I haven't written an extensive test-suite yet (I was just screwing around when I originally wrote it), but it did successfully build a tree that could get my version of the old nines problem to compile. It definitely needs a some rewriting (stack hog from hell), but it's a proof of concept. It should replace the parser module functionality. Then if you tie that into compiler package, it looks like pure python compilation to bytecode! I was suprised to see how much the compiler module did when I finally looked at it. 
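
(As a concrete reference point for the above: the tree the parser module hands to the compiler package is just nested tuples of numeric node codes, and that is what a pure-Python parser has to reproduce. A small sketch using only stock 2.2/2.3 modules; the helper name is made up here:)

import parser, symbol, token, compiler

def named(tree):
    # replace the numeric node codes with their symbolic names
    kind = tree[0]
    name = symbol.sym_name.get(kind) or token.tok_name.get(kind)
    out = [name]
    for child in tree[1:]:
        if isinstance(child, tuple):
            out.append(named(child))
        else:
            out.append(child)
    return tuple(out)

source = "x = 1 + 2\n"
st = parser.suite(source)        # concrete syntax tree from the C parser
print named(st.totuple())        # nested (node-name, ...) tuples
print compiler.parse(source)     # the compiler package's richer AST, for comparison
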
To test, unzip the files somewhere, adjust the hardcoded reference to "c:\python22\lib\site-packages\concrete\misc\nines.py" as appropriate, and run "python llparser.py". You'll see the results of several trees I build getting compiled and exec'ed (using the CPython compiler) I too have trouble believing it could be this easy. A quick grep of compiler shows that it only uses parser to create ast Tuples in the early phases of generation. It doesn't use the astcompile functions on the back end. If anyone more familiar with the compiler internals would like to chime in on binary dependancies I'd appreciate it. It's getting a little late tonight, but tomorrow I'll just delete the parser pyd from my 2.3 install and see if I can plug my parser into compiler. I'm also noticing some more minor problems with the compiler bytecode generation, but the more I look the more I think the framework has the first 90% of the work done. -logistix P.S. If anyone can point me to some sort of definition of what the "Extended" in ELL(1) stands for I'd appreciate it. Python grammar is not LL(1) as written. --------------------------------------- Get your free e-mail address @zworg.com From logistix at zworg.com Thu Jan 23 05:44:55 2003 From: logistix at zworg.com (logistix) Date: Wed, 22 Jan 2003 23:44:55 -0500 Subject: [pypy-dev] Notes on compiler package Message-ID: <200301230444.h0N4it7R020460@overload3.baremetal.com> > > > P.S. If anyone can point me to some sort of definition of what the > > "Extended" in ELL(1) stands for I'd appreciate it. Python grammar is > > not LL(1) as written. > > Are you trying to use a parser generator? > > ciao - chris > Nah, that's what I already wrote one in the above stuff. But I had to backtrack to get it to work on parts of the grammar like: varargslist: (fpdef ['=' test] ',')* ('*' NAME [',' '**' NAME] | '**' NAME) | fpdef ['=' test] (',' fpdef ['=' test])* [','] The backtracking wasn't a big deal, but it shouldn't be necessary in an LL(1) grammar. The source claims that the grammar is ELL(1). I was wondering if the ELL definition addressed my backtracking issues, so that I could confirm that my assumptions about where I throw my Syntax Errors were valid. To use XML terminology, my current assumptions are only well-formed ;-) From theller at python.net Thu Jan 23 08:41:42 2003 From: theller at python.net (Thomas Heller) Date: 23 Jan 2003 08:41:42 +0100 Subject: [pypy-dev] [ann] Sprint to Minimal Python In-Reply-To: <20030123012719.B16000@prim.han.de> References: <20030123012719.B16000@prim.han.de> Message-ID: holger krekel writes: ------------------------------------------------------ > Goals of the first Minimal PyPython Marathon > ------------------------------------------------------ [...] > > - try to write platform dependent Assembler/C-code that > enables doing C/Machine-level calls from Python without > extra C-bindings. So I can expect people to improve the ctypes module even further ;-) ? Note that recent ctypes already works on Windows, Linux and MacOSX: http://sourceforge.net/projects/ctypes Not yet announced, because the docs are not up to date... Thomas PS: And no, I don't have time to come to the sprint it seems. 
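
(For list readers who have not tried ctypes yet, this is roughly the flavour of "calling a C function without writing a C binding". The spelling below follows later ctypes releases and may differ in detail from the January 2003 version; the library name assumes a Linux/glibc system, so adjust it elsewhere:)

from ctypes import CDLL, c_double

libm = CDLL("libm.so.6")          # assumption: Linux with glibc
libm.sqrt.restype = c_double      # declare the C prototype so that
libm.sqrt.argtypes = [c_double]   # arguments and result convert properly

assert libm.sqrt(2.0) == 2.0 ** 0.5
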
From hpk at trillke.net Thu Jan 23 10:19:23 2003 From: hpk at trillke.net (holger krekel) Date: Thu, 23 Jan 2003 10:19:23 +0100 Subject: [pypy-dev] [ann] Sprint to Minimal Python In-Reply-To: ; from theller@python.net on Thu, Jan 23, 2003 at 08:41:42AM +0100 References: <20030123012719.B16000@prim.han.de> Message-ID: <20030123101923.D16000@prim.han.de> [Thomas Heller Thu, Jan 23, 2003 at 08:41:42AM +0100] > holger krekel writes: > > ------------------------------------------------------ > > Goals of the first Minimal PyPython Marathon > > ------------------------------------------------------ > [...] > > > > - try to write platform dependent Assembler/C-code that > > enables doing C/Machine-level calls from Python without > > extra C-bindings. > > So I can expect people to improve the ctypes module even further ;-) ? Hello Thomas, sorry, should have mentioned ctypes already. of course nobody will mindlessly rewrite the functionality of your module. are you ok with rewriting the above into: - further explore the ctypes approach to perform C/Machine-level calls from Python without extra C-bindings ? > Note that recent ctypes already works on Windows, Linux and MacOSX: > > http://sourceforge.net/projects/ctypes > > Not yet announced, because the docs are not up to date... > > Thomas > > PS: And no, I don't have time to come to the sprint it seems. not even some days? :-) anyway, maybe you can help identify tasks around ctypes? I have to talk to Jens-Uwe to see what he is up to. regards, holger -- you can have it fast, cheap or high quality. pick two. From arigo at tunes.org Thu Jan 23 10:41:19 2003 From: arigo at tunes.org (Armin Rigo) Date: Thu, 23 Jan 2003 10:41:19 +0100 Subject: [pypy-dev] __builtin__ module In-Reply-To: <5.0.2.1.1.20030122130959.00a9b800@mail.oz.net> References: <3E2ECFAF.3010104@strakt.com> <20030122020514.GA4079@debian.fenton.baltimore.md.us> <20030122162243.GE10868@magma.unil.ch> <3E2ECFAF.3010104@strakt.com> <5.0.2.1.1.20030122130959.00a9b800@mail.oz.net> Message-ID: <20030123094119.GA22778@magma.unil.ch> Hello Bengt, On def chr(i): return "\x00\x01\x02x03...\xFF"[i] versus def chr(i): return'%c' % i I feel the second solution to be more to the point. The first one seems redundant somehow: "pick the ith character from this string whose ith character just happen to have ASCII code i". But that's probably a minor point. It shows that chr() may well be implemented in pure Python using lower "primitives" (let's call them built-in functions or methods). It may indeed be a good idea to draw a fluctuating but documented dependency graph between pure Python and built-in functions. It would let two teams work on these two halves of the work: (1) writing pure Python functions (as Scott did), and (2) writing built-in functions. I think that Scott's work drew the line for the built-in functions. Most of what he couldn't code must be done as built-in functions. Armin. 
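
(One possible way to keep Armin's "fluctuating but documented dependency graph" next to the code while the pure-Python builtins are being written: a plain table that both humans and a small checking script can read. The format and names below are invented purely for illustration, not an agreed-on convention:)

PRIMITIVES = ['chr', 'len', 'isinstance', 'range']    # assumed supplied by the core

DEPENDS = {
    # pure-Python builtin -> names it is allowed to lean on
    'ord': ['chr', 'range'],
    'hex': ['divmod', 'abs'],
    'cmp': [],
}

def undeclared(depends, primitives):
    """Report names that are neither declared primitives nor
    pure-Python builtins already listed in the table."""
    known = list(primitives) + list(depends.keys())
    problems = []
    for name, used in depends.items():
        for other in used:
            if other not in known:
                problems.append("%s relies on %s, which is not accounted for"
                                % (name, other))
    return problems

# e.g. this flags that hex() leans on divmod() and abs(), which are
# neither declared primitives nor pure-Python entries yet
for line in undeclared(DEPENDS, PRIMITIVES):
    print line
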
From hpk at trillke.net Thu Jan 23 10:50:54 2003 From: hpk at trillke.net (holger krekel) Date: Thu, 23 Jan 2003 10:50:54 +0100 Subject: [pypy-dev] __builtin__ module In-Reply-To: <5.0.2.1.1.20030122130959.00a9b800@mail.oz.net>; from bokr@oz.net on Wed, Jan 22, 2003 at 03:21:26PM -0800 References: <3E2ECFAF.3010104@strakt.com> <20030122020514.GA4079@debian.fenton.baltimore.md.us> <20030122162243.GE10868@magma.unil.ch> <3E2ECFAF.3010104@strakt.com> <7kcxumvz.fsf@python.net> <5.0.2.1.1.20030122130959.00a9b800@mail.oz.net> Message-ID: <20030123105054.E16000@prim.han.de> [Bengt Richter Wed, Jan 22, 2003 at 03:21:26PM -0800] > [a] Should I have cc'd Armin, Chris, and Holger? > I.e., would that have been courtesy or redundant annoyance? it would have been redundant. holger From boyd at strakt.com Thu Jan 23 11:09:05 2003 From: boyd at strakt.com (Boyd Roberts) Date: Thu, 23 Jan 2003 11:09:05 +0100 Subject: [pypy-dev] __builtin__ module References: <3E2ECFAF.3010104@strakt.com> <20030122020514.GA4079@debian.fenton.baltimore.md.us> <20030122162243.GE10868@magma.unil.ch> <3E2ECFAF.3010104@strakt.com> <5.0.2.1.1.20030122130959.00a9b800@mail.oz.net> <20030123094119.GA22778@magma.unil.ch> Message-ID: <3E2FBF41.2020001@strakt.com> Armin Rigo wrote: >def chr(i): > return "\x00\x01\x02x03...\xFF"[i] > > > The problem with this is that it's absurdly verbose as well as error prone to enumerate the string constant. It may be faster, but it's no good if it's _wrong_. From theller at python.net Thu Jan 23 11:18:00 2003 From: theller at python.net (Thomas Heller) Date: 23 Jan 2003 11:18:00 +0100 Subject: [pypy-dev] __builtin__ module In-Reply-To: <20030123094119.GA22778@magma.unil.ch> References: <3E2ECFAF.3010104@strakt.com> <20030122020514.GA4079@debian.fenton.baltimore.md.us> <20030122162243.GE10868@magma.unil.ch> <3E2ECFAF.3010104@strakt.com> <5.0.2.1.1.20030122130959.00a9b800@mail.oz.net> <20030123094119.GA22778@magma.unil.ch> Message-ID: Armin Rigo writes: > I think that Scott's work drew the line for the built-in functions. Most of > what he couldn't code must be done as built-in functions. Except that classmethod, staticmethod, and super could be done in pure Python. Aren't those even in Guido's descrintro?? Thomas From darius at accesscom.com Thu Jan 23 11:20:28 2003 From: darius at accesscom.com (Darius Bacon) Date: Thu, 23 Jan 2003 02:20:28 -0800 Subject: [pypy-dev] Miasma: x86 machine code generation in Python Message-ID: I've just written a Python module to emit x86 machine code using assembler-like function calls, plus another module in C to invoke the generated code: http://accesscom.com/~darius/software/miasma/ It comes with a little test program that compiles RPN expressions. I realize nobody's talking about compiling direct to native code yet, but this was just so easy to hack up using code already lying around that I couldn't resist. Perhaps some of the hackers on this list will enjoy playing with it. 
Darius From hpk at trillke.net Thu Jan 23 11:41:36 2003 From: hpk at trillke.net (holger krekel) Date: Thu, 23 Jan 2003 11:41:36 +0100 Subject: [pypy-dev] Miasma: x86 machine code generation in Python In-Reply-To: ; from darius@accesscom.com on Thu, Jan 23, 2003 at 02:20:28AM -0800 References: Message-ID: <20030123114136.A10805@prim.han.de> [Darius Bacon Thu, Jan 23, 2003 at 02:20:28AM -0800] > I've just written a Python module to emit x86 machine code using > assembler-like function calls, plus another module in C to invoke the > generated code: > > http://accesscom.com/~darius/software/miasma/ > > It comes with a little test program that compiles RPN expressions. I > realize nobody's talking about compiling direct to native code yet, > but this was just so easy to hack up using code already lying around > that I couldn't resist. Perhaps some of the hackers on this list will > enjoy playing with it. Sure interesting! Except i don't know enough Scheme. Could you (or somebody) else explain what the challenges were and how you solved them? greetings, holger -- you can have it fast, cheap or high quality. pick two. From darius at accesscom.com Thu Jan 23 12:22:01 2003 From: darius at accesscom.com (Darius Bacon) Date: Thu, 23 Jan 2003 03:22:01 -0800 Subject: [pypy-dev] Miasma: x86 machine code generation in Python In-Reply-To: <20030123114136.A10805@prim.han.de> (message from holger krekel on Thu, 23 Jan 2003 11:41:36 +0100) Message-ID: holger krekel wrote: [x86 code emission] > Sure interesting! Except i don't know enough Scheme. Could you > (or somebody) else explain what the challenges were and how > you solved them? The Scheme code generates Python code from a table of instruction descriptions; you can just use the pregenerated Python directly without bothering about the Scheme, as long as you're happy with the interface it gives you. I wouldn't have used Scheme if I were starting a Python project from scratch, but most of the code was already written. I don't expect this to be right for Psyco as is, because the code generator I originally built this for emitted instructions back to front, in the reverse order of execution -- like this: def prolog(): x86.push_gv(edi) x86.push_gv(esi) x86.push_gv(ebx) x86.mov_gv_ev(ebp, reg(esp)) x86.push_gv(ebp) which emits the conventional function prolog push ebp mov ebp, esp push ebx push esi push edi Changing it to work forwards instead, like Psyco, would require rebuilding the whole 2000 lines or so of generated code, so you would want to mess with the Scheme for that. It also needs more addressing modes and a way of doing backpatching. The challenges, well, the main one was just working through the Intel reference manual and discovering the occasional error in it -- checking the output against gas was a big help, though one of the differences turned out to be a bug in gas instead. Internally, there's a table with instruction descriptions in a little language modeled after Intel's documentation, and we generate code for each instruction emitter from that. This was all done years ago for a Lisp OS project. Later I made an attempt to factor out the general logic from language-specific code emission stuff, which is why the Python version only took a few hours hacking to get running. It's not very fancy but enough to start on. 
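
(To make concrete what "emitting x86 machine code from Python" boils down to, here is the same prolog hand-encoded into bytes. The opcode and ModRM encodings are standard IA-32; the tiny Emitter class and its method names are invented for illustration and are not Miasma's actual interface:)

EBP, ESP, EBX, ESI, EDI = 5, 4, 3, 6, 7      # IA-32 register numbers

class Emitter:
    def __init__(self):
        self.code = []                        # byte values, front to back
    def push_reg(self, reg):
        self.code.append(0x50 + reg)          # PUSH r32 is opcode 0x50+rd
    def mov_reg_reg(self, dst, src):
        self.code.append(0x8B)                # MOV r32, r/m32
        self.code.append(0xC0 | (dst << 3) | src)   # mod=11 register form

e = Emitter()
e.push_reg(EBP)
e.mov_reg_reg(EBP, ESP)
e.push_reg(EBX)
e.push_reg(ESI)
e.push_reg(EDI)
assert e.code == [0x55, 0x8B, 0xEC, 0x53, 0x56, 0x57]
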
Darius From mwh at python.net Thu Jan 23 12:22:58 2003 From: mwh at python.net (Michael Hudson) Date: 23 Jan 2003 11:22:58 +0000 Subject: [pypy-dev] Re: Notes on compiler package References: <200301230021.h0N0Ld2p012793@overload3.baremetal.com> Message-ID: <2miswgrtyl.fsf@starship.python.net> logistix writes: > I'm finally starting to get my head wrapped around the compiler package, > and thought I'd pass some notes on to the group since it's a beast: As the person who did some fixing up so it worked in 2.3a1, I second that. > 1) Other than the original AST generation, it looks like it's all pure > python (not tying into the C source). The original AST comes from the > parser module, which is a pure wrapper module (although I've already > posted python code to build ast's here) Yep. > 2) Internal CPython ASTs are transformed into "compiler"'s own ASTs. > These nodes are smarter than the internal ones; they attach appropriate > info to "attributes". CPython AST nodes just have a list of children, > this module identifies and breaks out important data (for example, the > arguments in a function def become an "args" attribute instead of > guessing it's the first child) Yep. This is compiler.transformer. It has to be said, if you were writing a parser for Python in Python, I'd shoot for generating the output of the transformer module rather than aping what the parser module produces. > 3) The CodeGenerators walk the ASTs and emit code into a code object. Yup. > 4) Speed seems reasonable (which makes me think I'm missing some calls > into the C side of things) You obviously haven't tried to compile anything big with it yet. It's slow. > 4) The generated bytecode looks reasonably good and execs fine, but > there are still some diffences from what CPython generates. Here's what > I've seen so far: > SET_LINENO doesn't always work the same (this is how python knows > what line in a file threw an exception) Hey, in 2.3 SET_LINENO doesn't even exist any more :-) > Doesn't have one of the few optimizations the CPython compiler does > have. CPython will throw out misplaced "docstrings"... strings in the > source that aren't assigned to anything. This throws off array indexes > in the code objects co_const attribute. Yeah, indices into co_consts are different (and not just in the case you mention above, I think). But it works -- you can compile the stdlib with in and run the test suite happily. > In general, the compiler package is in alot better shape than I expected. What were you expecting? Cheers, M. From tismer at tismer.com Thu Jan 23 05:13:49 2003 From: tismer at tismer.com (Christian Tismer) Date: Thu, 23 Jan 2003 05:13:49 +0100 Subject: [pypy-dev] Notes on compiler package In-Reply-To: <200301230343.h0N3h9Kv000791@overload3.baremetal.com> References: <200301230343.h0N3h9Kv000791@overload3.baremetal.com> Message-ID: <3E2F6BFD.6030401@tismer.com> logistix wrote: [core dependencies of compiler package] > It's getting a little late tonight, but tomorrow I'll just delete the > parser pyd from my 2.3 install and see if I can plug my parser into > compiler. That's exactly what I'd propose. > I'm also noticing some more minor problems with the compiler bytecode > generation, but the more I look the more I think the framework has the > first 90% of the work done. Great!!! > P.S. If anyone can point me to some sort of definition of what the > "Extended" in ELL(1) stands for I'd appreciate it. Python grammar is > not LL(1) as written. Are you trying to use a parser generator? 
ciao - chris From logistix at zworg.com Thu Jan 23 13:36:11 2003 From: logistix at zworg.com (logistix) Date: Thu, 23 Jan 2003 07:36:11 -0500 Subject: [pypy-dev] Re: Notes on compiler package Message-ID: <200301231236.h0NCaBWj031367@overload3.baremetal.com> Michael Hudson wrote: > > > In general, the compiler package is in alot better shape than I expected. > > What were you expecting? > > Cheers, > M. > I don't know. I guess in general I expect nothing, and this looks like something ;) BTW, there's a note in compiler.ast that says it's autogenerated. Is this still true? If so, how? From mwh at python.net Thu Jan 23 13:38:33 2003 From: mwh at python.net (Michael Hudson) Date: Thu, 23 Jan 2003 12:38:33 +0000 (GMT) Subject: [pypy-dev] Re: Notes on compiler package In-Reply-To: <200301231236.h0NCaBWj031367@overload3.baremetal.com> Message-ID: On Thu, 23 Jan 2003, logistix wrote: > Michael Hudson wrote: > > > > > > In general, the compiler package is in alot better shape than I expected. > > > > What were you expecting? > > I don't know. I guess in general I expect nothing, and this looks like > something ;) Oh I see. I got the impression that you expected the compiler package to be broken. > BTW, there's a note in compiler.ast that says it's autogenerated. I > this still true? If so, how? Errr... Tools/compiler/astgen.py, I think. Although the docstring of said script doesn't sound entirely encouraging """Generate ast module from specification This script generates the ast module from a simple specification, which makes it easy to accomodate changes in the grammar. This approach would be quite reasonable if the grammar changed often. Instead, it is rather complex to generate the appropriate code. And the Node interface has changed more often than the grammar. """ I *think* it's still used... I don't want to give the impression that I know the compiler module inside out. I just have enough of an idea about it that I can find the bit that's broken this week without *too* much effort (changes to Python/compile.c and other places routinely break the compiler module). Cheers, M. From pedronis at bluewin.ch Thu Jan 23 13:59:16 2003 From: pedronis at bluewin.ch (Samuele Pedroni) Date: Thu, 23 Jan 2003 13:59:16 +0100 Subject: [pypy-dev] Re: Notes on compiler package References: <200301230021.h0N0Ld2p012793@overload3.baremetal.com> <2miswgrtyl.fsf@starship.python.net> Message-ID: <006d01c2c2df$3cdce400$6d94fea9@newmexico> From: "Michael Hudson" > > 2) Internal CPython ASTs are transformed into "compiler"'s own ASTs. > > These nodes are smarter than the internal ones; they attach appropriate > > info to "attributes". CPython AST nodes just have a list of children, > > this module identifies and breaks out important data (for example, the > > arguments in a function def become an "args" attribute instead of > > guessing it's the first child) > > Yep. This is compiler.transformer. It has to be said, if you were > writing a parser for Python in Python, I'd shoot for generating the > output of the transformer module rather than aping what the parser > module produces. 
> yes, btw there have been undergoing work to substitute parser module and what is used internally by CPython with a (new) AST format: [Compiler-sig] compiler-sig project for Python 2.3: new AST http://mail.python.org/pipermail/compiler-sig/2002-March/000091.html http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/python/python/nondist/sandbox/as t/ there should be also a CVS branch Finn Bock has already incorparated this in Jython, OTOH I think work on CPython has stalled. The idea is that future versions of (e.g) PyChecker ought to use the (new) AST format. Jeremy Hylton should be asked on the status of this. regards. From mwh at python.net Thu Jan 23 14:16:22 2003 From: mwh at python.net (Michael Hudson) Date: 23 Jan 2003 13:16:22 +0000 Subject: [pypy-dev] Re: Notes on compiler package References: <200301230021.h0N0Ld2p012793@overload3.baremetal.com> <2miswgrtyl.fsf@starship.python.net> <006d01c2c2df$3cdce400$6d94fea9@newmexico> Message-ID: <2md6mo6m6x.fsf@starship.python.net> "Samuele Pedroni" writes: > From: "Michael Hudson" > > Yep. This is compiler.transformer. It has to be said, if you were > > writing a parser for Python in Python, I'd shoot for generating the > > output of the transformer module rather than aping what the parser > > module produces. > > > > yes, btw there have been undergoing work to substitute parser module and what > is used internally by CPython with a (new) AST format: > > [Compiler-sig] compiler-sig project for Python 2.3: new AST > http://mail.python.org/pipermail/compiler-sig/2002-March/000091.html > > http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/python/python/nondist/sandbox/as > t/ > > there should be also a CVS branch There is: ast-branch. I'd forgotten about that. Finishing that stuff up might make this project easier... > Finn Bock has already incorparated this in Jython, OTOH I think work > on CPython has stalled. Yes. > The idea is that future versions of (e.g) PyChecker ought to use > the (new) AST format. > > Jeremy Hylton should be asked on the status of this. I think he's stupidly busy. Python developers really shouldn't be allowed to have kids . Cheers, M. -- If trees could scream, would we be so cavalier about cutting them down? We might, if they screamed all the time, for no good reason. -- Jack Handey From hickey at oclc.org Thu Jan 23 15:35:57 2003 From: hickey at oclc.org (Hickey,Thom) Date: Thu, 23 Jan 2003 09:35:57 -0500 Subject: [pypy-dev] RE: 300,000 lines of C Message-ID: Does anyone else have experience with TeX? Knuth wrote it in Pascal (I was there when he was considering whether Pascal or C would be the right choice), but most implementations run as C. Someone wrote a set of support routines plus a translator for the very restricted Pascal that Knuth used. Local modifications to the code are almost all handled by substituting new Pascal code which is incorporated by a preprocessor before being translated into C. Really works pretty well, and some of the same techniques should carry over to pypy. I once started a project to translate TeX to Java, but found the translation to Java more difficult than that to C and ended up abandoning it. --Th -------------- next part -------------- An HTML attachment was scrubbed... URL: From boyd at strakt.com Thu Jan 23 15:46:55 2003 From: boyd at strakt.com (Boyd Roberts) Date: Thu, 23 Jan 2003 15:46:55 +0100 Subject: [pypy-dev] RE: 300,000 lines of C References: Message-ID: <3E30005F.4020904@strakt.com> Hickey,Thom wrote: > Does anyone else have experience with TeX? Knuth wrote it in Pascal ... 
Yes I remember installing it when it first came out. It consisted of a Pascal bootsrap and two programs called 'tangle' and 'weave'. IIRC they were huge chunks of Pascal, very ugly and unweildy. I have been thinking of an idea where there is a C implementation of a virtual machine code to machine code interpreter. The interpreter is then supplied in the virtual machine code and it is then compiled/interpreted to machine code. From there it would run native. I have been thinking about Inferno's Limbo: http://www.vitanuova.com/inferno/papers/descent.html From guido at python.org Thu Jan 23 15:49:26 2003 From: guido at python.org (Guido van Rossum) Date: Thu, 23 Jan 2003 09:49:26 -0500 Subject: [pypy-dev] Re: Notes on compiler package In-Reply-To: Your message of "23 Jan 2003 13:16:22 GMT." <2md6mo6m6x.fsf@starship.python.net> References: <200301230021.h0N0Ld2p012793@overload3.baremetal.com> <2miswgrtyl.fsf@starship.python.net> <006d01c2c2df$3cdce400$6d94fea9@newmexico> <2md6mo6m6x.fsf@starship.python.net> Message-ID: <200301231449.h0NEnQ506396@odiug.zope.com> > There is: ast-branch. I'd forgotten about that. Finishing that stuff > up might make this project easier... If someone could help Jeremy with that, that would be great! It's close to completion, but he's had no time to work on it. > > Jeremy Hylton should be asked on the status of this. > > I think he's stupidly busy. Python developers really shouldn't be > allowed to have kids . Actually, the kids have little to do with it. It's the Zope work that's killing us here at PL. --Guido van Rossum (home page: http://www.python.org/~guido/) From jeremy at zope.com Thu Jan 23 17:20:35 2003 From: jeremy at zope.com (Jeremy Hylton) Date: Thu, 23 Jan 2003 11:20:35 -0500 Subject: [pypy-dev] Re: Notes on compiler package In-Reply-To: <200301231449.h0NEnQ506396@odiug.zope.com> References: <200301230021.h0N0Ld2p012793@overload3.baremetal.com> <2miswgrtyl.fsf@starship.python.net> <006d01c2c2df$3cdce400$6d94fea9@newmexico> <2md6mo6m6x.fsf@starship.python.net> <200301231449.h0NEnQ506396@odiug.zope.com> Message-ID: <15920.5715.329448.372928@slothrop.zope.com> I've been intending to post some comments in this thread for quite a while, but both Zope and kids have been keeping my busy . The compiler package is pretty functional. In Tools/compiler there is a regrtest.py script that compiles the entire standard library with the compiler package and runs the test suite. There are a a failure related to a bug in the builtin compiler (improper handling of the unary negative optimization), but otherwise the tests all run. The package is more complex than I would like. In particular, the assembler phase that converts from abstract bytecode to concrete bytecode is rather baroque. But it is functional. I believe the stack depth computation is still pretty bogus, although Mark Hammond did a good job of getting it mostly correct. The difference between the two compilers is that the builtin compiler tracks stack depth at the same time it emits bytecode and the compiler package tries to determine the stack depth by scanning the bytecode in a later pass. The latter approach should probably compute stack depth for each basic block and then do simple flow analysis (I think already present) to determine what the max stack depth on any pass is. Even if the post-processing approach gets fixed, I'm not sure which approach I like better. As Samuele mentioned, there's an improved AST on the ast-branch and it's already being used by Jython. 
I don't recall the specific differences off the top of my head, but the new AST has slightly simpler data structures and is a better more regular. The ast-branch still requires a lot of work to finish, although it is functional enough to compile simple functions (definition and call). The symbol table pass is much cleaner. In general, there's less code because the AST is easier to work with. As a simplification, I decided not to do anything to change the parser and instead focused only on the backend. I had grand plans of replacing the parser in a future release, but we'll have to wait and see :-). To summarize, briefly, what remains to be done on that branch (although I'm not sure it's relevant to pypy-dev): - Conversion of concrete to abstract trees (90% done) - Error handling during conversion (basically not done) - Marshal API to pass AST between Python and C (30% done) - Basic bytecode generation (80% done) - Error checking (25% done) It's probably a couple of weeks effort to get it to alpha quality. The question I'd really like to ask, which I'll just throw out for now, is: Why would minimal python want to generate bytecode for the old interpreter? It seems the a simpler bytecode format, e.g one that didn't have a BINARY_ADD but generated approriate code to call __add__, would be a better target. There's a lot of complex C code hiding inside BINARY_ADD, which ought to get pushed back out to the Python code. A simpler and slimmer bytecode format seems like it would expose more opportunity for optimization. Jeremy From nathanh at zu.com Thu Jan 23 18:47:47 2003 From: nathanh at zu.com (Nathan Heagy) Date: Thu, 23 Jan 2003 11:47:47 -0600 Subject: [pypy-dev] Re: Notes on compiler package In-Reply-To: <15920.5715.329448.372928@slothrop.zope.com> Message-ID: > The question I'd really like to ask, which I'll just throw out for > now, is: Why would minimal python want to generate bytecode for the > old interpreter? Is byte-code compatibility a goal of minimalPython? I haven't seen it mentioned anywhere but everyone seems to be working under that assumption. As far as alternatives, why don't we target the .net CLI? j/k! -- Nathan Heagy phone:306.653.4747 fax:306.653.4774 http://www.zu.com From tismer at tismer.com Thu Jan 23 19:06:23 2003 From: tismer at tismer.com (Christian Tismer) Date: Thu, 23 Jan 2003 19:06:23 +0100 Subject: [pypy-dev] Re: Notes on compiler package In-Reply-To: References: Message-ID: <3E302F1F.7080904@tismer.com> Nathan Heagy wrote: > The question I'd really like to ask, which I'll just throw out for > now, is: Why would minimal python want to generate bytecode for the > old interpreter? > > > Is byte-code compatibility a goal of minimalPython? I haven't seen it > mentioned anywhere but everyone seems to be working under that assumption. This was an early consideration. We wanted to stay as compatible as possible, especially in the bootstrap phase. It would be nice to use CPython as cross compiler to bytecode, which the new thing can execute then, even before it is able to compile from alone. But there is so much flux in the project right now, that this requirement might become useless, I have no idea yet. > As far as alternatives, why don't we target the .net CLI? .net at all would deserve an extra thread. 
So let me answer it for anything different from .pyc code: Python byte code is well known, there is an interpreter called CPython, there is an almost complete compiler in Python, there is Psyco which works right now with it, Bytecodehacks, the dis module and much more. There is more common knowledge about byte code in the heads which are going to join than about any other intermediate language, I guess. If we want to let this sprint produce anything executable which has to deal with an interpreter engine at all, then we should not start from scratch. Other engines can be discussed when we have something that works. ciao - chris From hpk at trillke.net Thu Jan 23 19:16:44 2003 From: hpk at trillke.net (holger krekel) Date: Thu, 23 Jan 2003 19:16:44 +0100 Subject: [pypy-dev] Re: Notes on compiler package In-Reply-To: ; from nathanh@zu.com on Thu, Jan 23, 2003 at 11:47:47AM -0600 References: <15920.5715.329448.372928@slothrop.zope.com> Message-ID: <20030123191644.L10805@prim.han.de> [Jeremy Hilton] The question I'd really like to ask, which I'll just throw out for now, is: Why would minimal python want to generate bytecode for the old interpreter? [Nathan Heagy] Is byte-code compatibility a goal of minimalPython? I haven't seen it mentioned anywhere but everyone seems to be working under that assumption. We are working from the known towards the unknown. It is clear that we want to have a compiler package written in python and if feasible an appropriate parser. Later on we probably want to use it to generate *very* different stuff than the current Python bytecode. [Nathan Heagy] As far as alternatives, why don't we target the .net CLI? Because we don't aim at buzzword compliancy :-) From what i remember about .net/cli discussions it just doesn't fit well with python's dynamics. If you know otherwise then please post more details. regards, holger From pedronis at bluewin.ch Thu Jan 23 18:49:10 2003 From: pedronis at bluewin.ch (Samuele Pedroni) Date: Thu, 23 Jan 2003 18:49:10 +0100 Subject: [pypy-dev] Re: Notes on compiler package References: Message-ID: <027201c2c307$bc9405c0$6d94fea9@newmexico> > > As far as alternatives, why don't we target the .net CLI? > http://www.msdn.microsoft.com/library/default.asp?url=/library/en-us/cpguide/ht ml/cpconcompilationreuse.asp From scott at fenton.baltimore.md.us Thu Jan 23 21:32:06 2003 From: scott at fenton.baltimore.md.us (Scott Fenton) Date: Thu, 23 Jan 2003 15:32:06 -0500 Subject: [pypy-dev] __builtin__ module In-Reply-To: References: <3E2ECFAF.3010104@strakt.com> <20030122020514.GA4079@debian.fenton.baltimore.md.us> <20030122162243.GE10868@magma.unil.ch> <3E2ECFAF.3010104@strakt.com> <5.0.2.1.1.20030122130959.00a9b800@mail.oz.net> <20030123094119.GA22778@magma.unil.ch> Message-ID: <20030123203206.GA1986@debian.fenton.baltimore.md.us> On Thu, Jan 23, 2003 at 11:18:00AM +0100, Thomas Heller wrote: > Armin Rigo writes: > > > I think that Scott's work drew the line for the built-in functions. Most of > > what he couldn't code must be done as built-in functions. > > Except that classmethod, staticmethod, and super could be done > in pure Python. Aren't those even in Guido's descrintro?? They may well be. My point in writing some of __builtin__ was to get some inital code out there, not to draw a line in the sand. The next version will probably have some more stuff included in it. Remember, this was a one afternoon hack that got the inital version out. 
-Scott -- char m[9999],*n[99],*r=m,*p=m+5000,**s=n,d,c;main(){for(read(0,r,4000);c=*r; r++)c-']'||(d>1||(r=*p?*s:(--s,r)),!d||d--),c-'['||d++||(*++s=r),d||(*p+=c== '+',*p-=c=='-',p+=c=='>',p-=c=='<',c-'.'||write(1,p,1),c-','||read(2,p,1));} -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: not available URL: From scott at fenton.baltimore.md.us Thu Jan 23 23:17:23 2003 From: scott at fenton.baltimore.md.us (Scott Fenton) Date: Thu, 23 Jan 2003 17:17:23 -0500 Subject: [pypy-dev] Builtin types Message-ID: <20030123221722.GA2297@debian.fenton.baltimore.md.us> Hello all. While trying to mix up some more stuff for __builtin__, I came up with an interesting problem. The solution I found for classmethod (using google on c.l.python, since I couldn't figure it out myself) requires that classes that use it must derive from object. That made me wonder, could we just automatically derive everything from object in Minimal? The big problem with doing this in CPython, as I understand it, is extensions that need to be converted, but we don't have that problem here, obviously. So would it be alright if I went ahead and used this code as if everything derived from object? -Scott -- char m[9999],*n[99],*r=m,*p=m+5000,**s=n,d,c;main(){for(read(0,r,4000);c=*r; r++)c-']'||(d>1||(r=*p?*s:(--s,r)),!d||d--),c-'['||d++||(*++s=r),d||(*p+=c== '+',*p-=c=='-',p+=c=='>',p-=c=='<',c-'.'||write(1,p,1),c-','||read(2,p,1));} -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: not available URL: From tismer at tismer.com Fri Jan 24 01:43:26 2003 From: tismer at tismer.com (Christian Tismer) Date: Fri, 24 Jan 2003 01:43:26 +0100 Subject: [pypy-dev] Builtin types In-Reply-To: <20030123221722.GA2297@debian.fenton.baltimore.md.us> References: <20030123221722.GA2297@debian.fenton.baltimore.md.us> Message-ID: <3E308C2E.9020109@tismer.com> Scott Fenton wrote: > Hello all. While trying to mix up some more stuff for > __builtin__, I came up with an interesting problem. > The solution I found for classmethod (using google on > c.l.python, since I couldn't figure it out myself) > requires that classes that use it > must derive from object. No, this is not true. classmethod works for every kind of class, may it be a newstyle class, triggered by - deriving from object - using __slots__ - deriving from a builtin type or a descendant - did I forget something? or a "classic" class, it always works. For reference, see http://www.python.org/2.2/descrintro.html > That made me wonder, could > we just automatically derive everything from object > in Minimal? The big problem with doing this in CPython, > as I understand it, is extensions that need to be > converted, but we don't have that problem here, obviously. > So would it be alright if I went ahead and used this code > as if everything derived from object? No, sorry. object is just a special case for deriving a new-style class, see above. They can also be created by deriving from builtin types, or by using __slots__. Furthermore, I'm going to propose an extension to the new class system (at least for the MiniPy prototype) that goes a bit further: - __slots__ should get the ability to denote the type of a slot to be generated, especially ctypes types - it should be possible to derive a class from nothing, not even from object or classic, but I'd like to describe a plain C structure by classes. 
The latter will allow to define the object class and all builtin types with the same machinery. But please go ahead with your __builtins__ as it seems to fit. We can fix such stuff later. If you want to be perfect, try to define it for any class. cheers & thanks! -- chris From roccomoretti at netscape.net Fri Jan 24 03:14:49 2003 From: roccomoretti at netscape.net (Rocco Moretti) Date: Thu, 23 Jan 2003 21:14:49 -0500 Subject: [pypy-dev] RE: Interfacing Python Code with Autogenerated C code Message-ID: <7F55D380.2ACDEAA1.9ADE5C6A@netscape.net> "Gerald S. Williams" wrote: >Rocco Moretti wrote: >> How would the interface between the two work if the hand translated sections differ >> from the C version? > >I can give general answers from past experience. > >> Simplest example is basic types. A Python version would presumably be represented as a >> class with member functions. But the C code uses calls to accessor functions. How could >> we redirect the automatically generated code to call member functions as opposed to >> free floating functions? > >One answer: provide such functions, which call the member >functions as needed. > >> A more difficult problem lies when we we start to reorganize the object/structure >> internals. Good case of this is the frame object. The C version has an array with >> multiple pointers to internal objects. Once you reorganize the internals, how do >> you tell the C->Python translator that f_stacktop isn't a PyObject pointer anymore, >> but an index off of another list? > >If the class was providing public attributes, you may not >want to change their meaning. f_stacktop could provide an >accessor for the old representation. The new representation >would be in _f_stacktop or something. > >-Jerry In short, if we use C->Python autogeneration, we need to maintain legacy interfaces in any C code we hand translate to Python. - Sounds fair enough. One thought I had since posting is that it may be possible to build intellegence into the translator, such that it can massage calls from one function into another (perhaps multiple) call(s) to another function, reorganize functions into members, and map attribute accesses into member function calls, as the case may be. I'd anticipate you would need to feed the translator a file describing the mapping of the [C language interface of the hand translated code] onto the Python language interface. Something to the effect of: guido(a,b,c) --> n = guido_1(a,b); guido_2(n,c) tim(d,e) --> e._tim(d) F_bot.g = X --> F_bot.set_g(X) You may even be able to use this technique to map the C standard library functions onto their Python equivalents. -Rocco (Avoiding f**, b*r and b*z) __________________________________________________________________ The NEW Netscape 7.0 browser is now available. Upgrade now! http://channels.netscape.com/ns/browsers/download.jsp Get your own FREE, personal Netscape Mail account today at http://webmail.netscape.com/ From bokr at oz.net Fri Jan 24 06:07:14 2003 From: bokr at oz.net (Bengt Richter) Date: Thu, 23 Jan 2003 21:07:14 -0800 Subject: [pypy-dev] Builtin types In-Reply-To: <3E308C2E.9020109@tismer.com> References: <20030123221722.GA2297@debian.fenton.baltimore.md.us> <20030123221722.GA2297@debian.fenton.baltimore.md.us> Message-ID: <5.0.2.1.1.20030123185114.00a91780@mail.oz.net> At 01:43 2003-01-24 +0100, Christian Tismer wrote: >Scott Fenton wrote: >>Hello all. While trying to mix up some more stuff for >>__builtin__, I came up with an interesting problem. 
>>The solution I found for classmethod (using google on >>c.l.python, since I couldn't figure it out myself) requires that classes that use it >>must derive from object. > >No, this is not true. >classmethod works for every kind of class, >may it be a newstyle class, triggered by >- deriving from object >- using __slots__ >- deriving from a builtin type or a descendant >- did I forget something? >or a "classic" class, it always works. >For reference, see >http://www.python.org/2.2/descrintro.html > >>That made me wonder, could >>we just automatically derive everything from object >>in Minimal? The big problem with doing this in CPython, >>as I understand it, is extensions that need to be >>converted, but we don't have that problem here, obviously. >>So would it be alright if I went ahead and used this code >>as if everything derived from object? > >No, sorry. object is just a special case for deriving a >new-style class, see above. >They can also be created by deriving from builtin types, >or by using __slots__. > >Furthermore, I'm going to propose an extension to the new >class system (at least for the MiniPy prototype) that goes >a bit further: >- __slots__ should get the ability to denote the type of > a slot to be generated, especially ctypes types >- it should be possible to derive a class from nothing, > not even from object or classic, but I'd like to describe > a plain C structure by classes. > >The latter will allow to define the object class and all >builtin types with the same machinery. I have some thoughts for decribing C structures (and anything else digital) in a canonical abstract way. I'll call such a decription a meta-representation, and a function that produces it from a Python object is meta_repr(python_object) => meta_rep_obj. The reverse operation is meta_eval ;-) The basis for meta_repr is that it capture and meta-represent (type, id, value) for an object in a pure abstract form. The abstract basis for storing the information is a bitvector object with attribute names that specify slices of the bitvector. I.e., there is an effective dict like {'slicename':(lo,hi)} and __getattr__ uses it so that bvec.slicename refers to bvec[lo:hi] etc. This is pure and abstract data, but you can see that interpreted little-endianly you can very straightforwardly lay out a C struct with whatever field widths and alignments you want. I also want to make a way to create subclasses of the base bitvector class that can have specified size and named slices as class variable info shared by instances. This is very analogous to a C struct definition, and BTW also permits union by just defining overlapping named slices. BTW2, there is also a natural way to iterate based on a slice name -- i.e., just walk successive contiguous same-size slices to some limit, starting with the one defined by a name (which doesn't have to start at bit 0 of the whole, of course, so you can get embedded arrays of bit fields starting anywhere). Hence a giant bitvector with a slice defined by byte=(16,24) could iterate in 8-bit bytes starting at bit 16. meta_eval-ing such a byte sequence meta-representation together with type and id would produce a Python character or string. If Psyco could deal with these meta_representations in a fine-tuned way, it might be possible to have very abstract representations of things and yet generate efficient bit-twiddling assembler level stuff, and, as you said, one consistent mechanism. 
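
(A minimal sketch of the named-slice idea, to make the {'slicename': (lo, hi)} description concrete. The class layout and the ModRM example below are invented here for illustration and are not Bengt's bitvec.py:)

class BitVector(object):
    _slices = {}                    # 'name' -> (lo, hi) bit positions, little-endian

    def __init__(self, value=0):
        self._bits = value          # the bits live in a plain (long) integer

    def __getattr__(self, name):
        try:
            lo, hi = self._slices[name]
        except KeyError:
            raise AttributeError(name)
        return (self._bits >> lo) & ((1 << (hi - lo)) - 1)

    def __setattr__(self, name, value):
        if name.startswith('_'):
            object.__setattr__(self, name, value)
        else:
            lo, hi = self._slices[name]
            mask = ((1 << (hi - lo)) - 1) << lo
            self._bits = (self._bits & ~mask) | ((value << lo) & mask)

class ModRM(BitVector):
    # an x86 ModRM byte; 'byte' overlaps the three fields, union-style
    _slices = {'rm': (0, 3), 'reg': (3, 6), 'mod': (6, 8), 'byte': (0, 8)}

m = ModRM()
m.mod, m.reg, m.rm = 3, 5, 4        # 11 101 100
assert m.byte == 0xEC               # the ModRM byte of 'mov ebp, esp' (8B EC)
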
One nice thing would be to be able to define machine instructions in terms of a few classes for different basic formats, with particular bit fields named right out of the book. Same for floating point doubles, etc. Packing code would be appending bit vectors, etc. Fortunately intel is little-endian, so there is a simple mapping to the most common machine stuff, but even if that were not so, the correspondence between an abstract bit list and an abstract integer represented as an ordered set of binary coefficients for powers of 2 just naturally sums little- endianly by sum=0; for i in range(len(bitvec)): sum += bitvec[i]*2**i Though of course that's not the most efficient way to compute it. Anyway, this meta-repr/meta_eval idea is only half baked, but in case you wanted to consider it, I wanted to offer it before you are all finished ;-) There is a bitvec.py in an old demo directory that I think I will cannibalize for a good deal of bit vector stuff, though I think I would want to base the data in a final version on arrays of ints that would map more naturally to arrays of machine ints instead of using Python longs. But for prototyping it doesn't matter. I still have to mull the class factory or perhaps metaclass stuff for creating classes whose instances will share bit slice name definitions. Also the best way to capture nested composition and naming or whole and sub-parts, when it's done by composition instead of one flat definition. BTW, it seems pickling and marshalling are also quite related, but I haven't thought about that. Interestingly, meta_repr produces a python object, so what does, e.g., meta_repr(meta_repr(123)) produce? It has to make sense ;-) I am mulling the encoding of type and id along with value into some composite meta_repr representation to represent a full object... Also the meaning of mixed structures, like an ordinary Python list of objects produced by meta_repr. It should be legal, but you have to keep the tin foil in good repair ;-) Please use what might further the project, and ignore the rest. But besides the possible use PyPython, would named-slice abstract bit vectors have a chance as a PEP as a way to express fine grain abstract digital info? Best, Bengt From darius at accesscom.com Fri Jan 24 07:36:42 2003 From: darius at accesscom.com (Darius Bacon) Date: Thu, 23 Jan 2003 22:36:42 -0800 Subject: [pypy-dev] Builtin types In-Reply-To: <5.0.2.1.1.20030123185114.00a91780@mail.oz.net> (message from Bengt Richter on Thu, 23 Jan 2003 21:07:14 -0800) Message-ID: Bengt Richter writes: > I have some thoughts for decribing C structures (and anything else > digital) in a canonical abstract way. I'll call such a decription > a meta-representation, and a function that produces it from a Python > object is meta_repr(python_object) => meta_rep_obj. The reverse > operation is meta_eval ;-) > > The basis for meta_repr is that it capture and meta-represent > (type, id, value) for an object in a pure abstract form. > The abstract basis for storing the information is a bitvector > object with attribute names that specify slices of the bitvector. > I.e., there is an effective dict like {'slicename':(lo,hi)} and > __getattr__ uses it so that bvec.slicename refers to bvec[lo:hi] etc. > This is pure and abstract data, but you can see that interpreted > little-endianly you can very straightforwardly lay out a C struct > with whatever field widths and alignments you want. 
Sounds vaguely like this paper:

First-Class Data-type Representation in SchemeXerox
http://citeseer.nj.nec.com/4990.html

Darius

From theller at python.net  Fri Jan 24 08:46:57 2003
From: theller at python.net (Thomas Heller)
Date: 24 Jan 2003 08:46:57 +0100
Subject: [pypy-dev] Builtin types
In-Reply-To: <3E308C2E.9020109@tismer.com>
References: <20030123221722.GA2297@debian.fenton.baltimore.md.us> <3E308C2E.9020109@tismer.com>
Message-ID: <3cnjrnv2.fsf@python.net>

Christian Tismer writes:

> classmethod works for every kind of class,
> may it be a newstyle class, triggered by
> - deriving from object
> - using __slots__
> - deriving from a builtin type or a descendant
> - did I forget something?
> or a "classic" class, it always works.
> For reference, see
> http://www.python.org/2.2/descrintro.html
[...]
> No, sorry. object is just a special case for deriving a
> new-style class, see above.
> They can also be created by deriving from builtin types,
> or by using __slots__.

That's incorrect, IIUC.

__slots__ only has a special meaning for *new stype classes* only,
it doesn't trigger anything in classes classes.

New style classes always have object as *one* of its base classes,
and most builtin types are new style classes also.

>
> Furthermore, I'm going to propose an extension to the new
> class system (at least for the MiniPy prototype) that goes
> a bit further:
> - __slots__ should get the ability to denote the type of
>   a slot to be generated, especially ctypes types

I'm not really understanding what you're proposing here.

You could look at ctypes as implementing 'typed slots' with
C-compatible layout.

class A(object):
    __slots__ = ["x", "y", "z"]

class B(ctypes.Structure):
    _fields_ = [("x", "c"),
                ("y", "i"),
                ("z", "q")]
    __slots__ = []

Instances of both A and B can only have 'x', 'y', and 'z' instance
variables (or should I say slots); neither has a __dict__.

> - it should be possible to derive a class from nothing,
>   not even from object or classic, but I'd like to describe
>   a plain C structure by classes.

This is maybe also something that ctypes already does.
The B *class* above knows all about this C structure

struct B {
    char x;
    int y;
    long long z;
};

>>> ctypes.sizeof(B)
16
>>>

> The latter will allow to define the object class and all
> builtin types with the same machinery.
>

Thomas

From theller at python.net  Fri Jan 24 09:01:11 2003
From: theller at python.net (Thomas Heller)
Date: 24 Jan 2003 09:01:11 +0100
Subject: [pypy-dev] Builtin types
In-Reply-To: <5.0.2.1.1.20030123185114.00a91780@mail.oz.net>
References: <20030123221722.GA2297@debian.fenton.baltimore.md.us> <20030123221722.GA2297@debian.fenton.baltimore.md.us> <5.0.2.1.1.20030123185114.00a91780@mail.oz.net>
Message-ID:

Bengt Richter writes:

> I have some thoughts for decribing C structures (and anything else
> digital) in a canonical abstract way. I'll call such a decription
> a meta-representation, and a function that produces it from a Python
> object is meta_repr(python_object) => meta_rep_obj. The reverse
> operation is meta_eval ;-)
>
> The basis for meta_repr is that it capture and meta-represent
> (type, id, value) for an object in a pure abstract form.
> The abstract basis for storing the information is a bitvector
> object with attribute names that specify slices of the bitvector.
> I.e., there is an effective dict like {'slicename':(lo,hi)} and
> __getattr__ uses it so that bvec.slicename refers to bvec[lo:hi] etc.
> This is pure and abstract data, but you can see that interpreted > little-endianly you can very straightforwardly lay out a C struct > with whatever field widths and alignments you want. > > I also want to make a way to create subclasses of the base bitvector > class that can have specified size and named slices as class variable > info shared by instances. This is very analogous to a C struct > definition, and BTW also permits union by just defining overlapping > named slices. This all sounds very similar to what is already implemented in ctypes ;-) This is a the structure which stores to information for one field of a C structure or union: typedef struct { PyObject_HEAD int offset; int size; int index; /* Index into CDataObject's object array */ PyObject *proto; /* a type or NULL */ GETFUNC getfunc; /* getter function if proto is NULL */ SETFUNC setfunc; /* setter function if proto is NULL */ } CFieldObject; 'offset' is what you call 'lo', and 'offset + size' is your 'hi' attribute. 'proto' (I should probably have chosen a better name) is a Python object holding information about the field type (such as alignment requirements and storage size), and 'getfunc' and 'setfunc' are pointers to functions which are able to convert the data from Python to C and vice versa. Instances of these CFieldObjects populate the class dict of ctypes Structure and Union subclasses, they are created by their respective metaclass from the _fields_ attribute, and are used as attribute descriptors to expose the fields to Python. Thomas From theller at python.net Fri Jan 24 09:02:34 2003 From: theller at python.net (Thomas Heller) Date: 24 Jan 2003 09:02:34 +0100 Subject: [pypy-dev] Builtin types In-Reply-To: <3cnjrnv2.fsf@python.net> References: <20030123221722.GA2297@debian.fenton.baltimore.md.us> <3E308C2E.9020109@tismer.com> <3cnjrnv2.fsf@python.net> Message-ID: Thomas Heller writes: > __slots__ only has a special meaning for *new stype classes* only, Sorry, typo: *new style classes* Thomas From arigo at tunes.org Fri Jan 24 10:39:35 2003 From: arigo at tunes.org (Armin Rigo) Date: Fri, 24 Jan 2003 10:39:35 +0100 Subject: [pypy-dev] __builtin__ module In-Reply-To: <20030123203206.GA1986@debian.fenton.baltimore.md.us> References: <3E2ECFAF.3010104@strakt.com> <20030122020514.GA4079@debian.fenton.baltimore.md.us> <20030122162243.GE10868@magma.unil.ch> <3E2ECFAF.3010104@strakt.com> <5.0.2.1.1.20030122130959.00a9b800@mail.oz.net> <20030123094119.GA22778@magma.unil.ch> <20030123203206.GA1986@debian.fenton.baltimore.md.us> Message-ID: <20030124093935.GC6665@magma.unil.ch> Hello Scott, On Thu, Jan 23, 2003 at 03:32:06PM -0500, Scott Fenton wrote: > > Except that classmethod, staticmethod, and super could be done > > in pure Python. Aren't those even in Guido's descrintro?? > > They may well be. My point in writing some of __builtin__ was to > get some inital code out there, not to draw a line in the sand. > The next version will probably have some more stuff included in > it. Sure. I was rather talking about the comments you added in the e-mail, where you say that some functions don't seem to be implementable. About chr(i) vs. '%c'%i vs. '\x00\x01...\xFF'[i]: what I feel this shows is that one of these solutions must be thought as the primitive way to build a character, and the others should use it; and I definitely feel that chr() is the primitive way to build a character. Contrary to what I said in a previous e-mail I don't think that chr() should be implemented with '%c'%i. 
On the other hand, I guess that the strings' % operator could nicely be implemented in pure Python. It would then have to use chr() to implement the %c format code. It looks more reasonable than the other way around. A bient?t, Armin. From bokr at oz.net Fri Jan 24 11:02:43 2003 From: bokr at oz.net (Bengt Richter) Date: Fri, 24 Jan 2003 02:02:43 -0800 Subject: [pypy-dev] Builtin types In-Reply-To: References: <5.0.2.1.1.20030123185114.00a91780@mail.oz.net> <20030123221722.GA2297@debian.fenton.baltimore.md.us> <20030123221722.GA2297@debian.fenton.baltimore.md.us> <5.0.2.1.1.20030123185114.00a91780@mail.oz.net> Message-ID: <5.0.2.1.1.20030124003751.00a8c460@mail.oz.net> At 09:01 2003-01-24 +0100, Thomas Heller wrote: >Bengt Richter writes: > >> I have some thoughts for decribing C structures (and anything else >> digital) in a canonical abstract way. I'll call such a decription >> a meta-representation, and a function that produces it from a Python >> object is meta_repr(python_object) => meta_rep_obj. The reverse >> operation is meta_eval ;-) >> >> The basis for meta_repr is that it capture and meta-represent >> (type, id, value) for an object in a pure abstract form. >> The abstract basis for storing the information is a bitvector >> object with attribute names that specify slices of the bitvector. >> I.e., there is an effective dict like {'slicename':(lo,hi)} and >> __getattr__ uses it so that bvec.slicename refers to bvec[lo:hi] etc. >> This is pure and abstract data, but you can see that interpreted >> little-endianly you can very straightforwardly lay out a C struct >> with whatever field widths and alignments you want. >> >> I also want to make a way to create subclasses of the base bitvector >> class that can have specified size and named slices as class variable >> info shared by instances. This is very analogous to a C struct >> definition, and BTW also permits union by just defining overlapping >> named slices. > >This all sounds very similar to what is already implemented in ctypes ;-) > >This is a the structure which stores to information for one field of a >C structure or union: > >typedef struct { > PyObject_HEAD > int offset; > int size; > int index; /* Index into CDataObject's > object array */ > PyObject *proto; /* a type or NULL */ > GETFUNC getfunc; /* getter function if proto is NULL */ > SETFUNC setfunc; /* setter function if proto is NULL */ >} CFieldObject; > >'offset' is what you call 'lo', and 'offset + size' is your 'hi' It does sound like a lot of similar ground is covered. Are your offset and size values in bits or bytes? (I intended bits). >attribute. 'proto' (I should probably have chosen a better name) is a >Python object holding information about the field type (such as >alignment requirements and storage size), and 'getfunc' and 'setfunc' >are pointers to functions which are able to convert the data from >Python to C and vice versa. > >Instances of these CFieldObjects populate the class dict of ctypes >Structure and Union subclasses, they are created by their respective >metaclass from the _fields_ attribute, and are used as attribute >descriptors to expose the fields to Python. This part sounds very close to the same, except I was going to specify fields by parameters in keyword arg tuples, with the keywords as field names. And I was basing getfunc and setfunc on slice operations operating on regions of a bit list. And there are some things I haven't worked out yet -- but you are right, there are similarities. 
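Reading Thomas's description of CFieldObject next to Bengt's slice-based getfunc/setfunc, one way to picture the common ground is a pure-Python attribute descriptor whose offset and size are measured in bits. The names below (BitSliceField, Record, _bits) are made up for this sketch and are not ctypes internals.

class BitSliceField(object):
    """Descriptor exposing bits [lo, hi) of an instance's integer storage."""
    def __init__(self, lo, hi):
        self.lo, self.hi = lo, hi
        self.mask = (1 << (hi - lo)) - 1

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return (obj._bits >> self.lo) & self.mask

    def __set__(self, obj, value):
        cleared = obj._bits & ~(self.mask << self.lo)
        obj._bits = cleared | ((value & self.mask) << self.lo)

class Record(object):
    _bits = 0                   # plays the role of the raw C buffer
    x = BitSliceField(0, 8)     # like CFieldObject's offset/size, but in bits
    y = BitSliceField(8, 40)

r = Record()
r.x = 0x7f
r.y = 0x12345678
assert (r.x, r.y) == (0x7f, 0x12345678)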
There would have to be ;-) But I'm kind of partial to the notion of pure abstract bit sequences and being able to build purely abstract composites of those. I realize you can view the C entities as abstract data elements too, but I'd think the pure based abstractions could survive C going away, and getting retargeted to something else perhaps a little more readily. It is speculation at this point for me though. I will explore a little further see how it looks. The thing I think would be cool is if one could write python to build meta_repr objects in python and have Psyco compile Python code manipulating those representations and wind up with machine code effectively equivalent to what the C code in your ctypes module does when given the same implicit abstract info defining fields and accessors etc., and code using those. The thing about the latter situation is that Psyco would still see code accessing a foreign interface to getfunc/setfunc, whereas if it sees Python code actually accessing the data meta-representations behind getfunc/setfunc, it has a chance to bypass function calls and generate inline machine code instead of using your canned functions. Probably a loss at first, but eventually it could be a gain, depending on Psyco? Regards, Bengt From arigo at tunes.org Fri Jan 24 11:36:59 2003 From: arigo at tunes.org (Armin Rigo) Date: Fri, 24 Jan 2003 11:36:59 +0100 Subject: [pypy-dev] Re: Notes on compiler package In-Reply-To: <15920.5715.329448.372928@slothrop.zope.com> References: <200301230021.h0N0Ld2p012793@overload3.baremetal.com> <2miswgrtyl.fsf@starship.python.net> <006d01c2c2df$3cdce400$6d94fea9@newmexico> <2md6mo6m6x.fsf@starship.python.net> <200301231449.h0NEnQ506396@odiug.zope.com> <15920.5715.329448.372928@slothrop.zope.com> Message-ID: <20030124103659.GD6665@magma.unil.ch> Hello Jeremy, On Thu, Jan 23, 2003 at 11:20:35AM -0500, Jeremy Hylton wrote: > I believe the stack depth computation is still pretty bogus, although > Mark Hammond did a good job of getting it mostly correct. I suddenly realize that I've got a toy that can be modified in 10 minutes to compute the stack depth with very little chance of a complex bug sneaking in. (Sorry, I don't have it right under my hand.) The toy is the main loop of a Python interpreter in Python (which some of us already worked on for fun, with a Frame class and with a method for each of the opcodes). In other words, we (the PyPython project) don't need to worry abot computing the stack depth with data flow analysis or whatever because our project will automatically produce such a tool for free! That's "abstract interpretation". To explain what I have in mind, let me first describe the PyPython main loop. First let me insist again on the separation between interpreter-level objects and application-level objects. Imagine that the stack in the interpreter is maintained as a list, and that at some point in the interpretation the stack contains the integers 1, 2 and 3. Then the stack list must *not* be the list [1, 2, 3]. That's the point I already discussed: it must be a list of three *complex objects* that represent the application-level integers 1, 2 and 3. For example, it could be [PyInt(1), PyInt(2), PyInt(3)], where PyInt is one of our custom classes. To show the confusion that would araise from a stack that would really contain [1, 2, 3], try to see how the interpreter should implement type()! You cannot. But it is trivial to add an ob_type() method to the PyInt class. Now to the point. 
If you take the same Python code that implements the main loop and all the opcodes, but you *replace* PyInt, PyStr, etc. with a trivial empty class Placeholder, you get an abstract interpreter. Running it against a real code object, you will see the stack evolve with no real objects in it. For example, at the point where the stack could have hold [PyInt(1), PyInt(2), PyInt(3)], now it is just [Placeholder(), Placeholder(), Placeholder()]. So what we need to do to compute the stack depth is interpret the code object once. On conditional jumps, we "fork", i.e. we try both paths. When we jump back to an already-seen position, we "merge", i.e. we keep the longest of the previous and the current stack for that position. When this process finishes, we know exactly how long the stack can be at each position. Note that the same technique can be extended to make a bytecode checker that can prove that an arbitrary bytecode is correct, in the sense that it will not crash the interpreter. This is what I was trying to do in my toy code. What's interesting here is that there seems to be a general pattern about re-using the Python-in-Python main loop. Psyco also works like this, only merging is more subtle. In all the cases (regular interpretation, stack depth computation, bytecode checking, Psyco) the PyPython main loop is *exactly* the same; the things that change are: * the implementation of PyInt, PyStr, etc. For regular interpretation it is implemented with a real int, str, etc. object. For stack depth it is a dummy placeholder. For Psyco it is a more complex structure that can tell apart run-time and compile-time values. * how we do forking. The (generic) JUMP_IF_TRUE opcode could be written with something like: def JUMP_IF_TRUE(self, arg): # self=frame, arg=offset to jump target v = self.tos() # top-of-stack object if v.istrue(): self.next_instr += arg # jump forward by 'arg' The meaning is clear in the case of regular interpretation. For stack depth computation, the istrue() method does some magic: it forks the current frame and returns twice to its caller, once returning 'false' so that JUMP_IF_TRUE() will continue to inspect the 'false' branch, and later returning 'true' to inspect the other branch. Yes, returning twice from a call is what *continuations* are for. No, Python has no such notion. Yes, Stackless would be very handy there. No, I'm not suggesting that we are stuck to enhancing CPython or switching to an altogether different language that has continuations. PyPython core should at some place be translated to other languages automatically; this translation can write continuation-passing-style functions if needed (in C or again in Python!). * finally, merging is important in all but the regular interpretation case, to detect when we reached a point that we have already seen. That's some action that must be done at regular interval, between two opcodes. So if PyPython contains a "poll()" call where CPython contains the regular-interval checks, we can implement poll() to do whatever is required for the intended usage of PyPython, e.g. poll the Unix signals for regular interpretation, or merge for the stack depth computer. I hope you are convinced now that the interpreter stack must not be [1, 2, 3] :-) A bient?t, Armin. 
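A toy version of the stack-depth computation sketched above, for a made-up three-opcode machine rather than real CPython bytecode. Placeholder and the opcode names are inventions of this sketch, and the forking/merging described for JUMP_IF_TRUE is approximated with an explicit worklist instead of continuations.

class Placeholder(object):
    """Stands in for any application-level object during abstract interpretation."""
    pass

def max_stack_depth(code):
    # code: list of (opname, arg) pairs for a tiny invented stack machine
    seen = {}                      # position -> deepest abstract stack seen there
    pending = [(0, [])]            # worklist of (position, abstract stack)
    deepest = 0
    while pending:
        pos, stack = pending.pop()
        while pos < len(code):
            if seen.get(pos, -1) >= len(stack):
                break              # "merge": already visited with a stack at least as deep
            seen[pos] = len(stack)
            deepest = max(deepest, len(stack))
            op, arg = code[pos]
            if op == 'PUSH':
                stack.append(Placeholder())
            elif op == 'POP':
                stack.pop()
            elif op == 'JUMP_IF_TRUE':
                stack.pop()                          # the tested value is only a Placeholder
                pending.append((arg, list(stack)))   # "fork": also explore the jump target
            elif op == 'RETURN':
                break
            pos += 1
    return deepest

prog = [('PUSH', None), ('PUSH', None), ('JUMP_IF_TRUE', 5),
        ('PUSH', None), ('RETURN', None),
        ('PUSH', None), ('RETURN', None)]
assert max_stack_depth(prog) == 2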
From bokr at oz.net Fri Jan 24 11:59:44 2003 From: bokr at oz.net (Bengt Richter) Date: Fri, 24 Jan 2003 02:59:44 -0800 Subject: [pypy-dev] Builtin types In-Reply-To: References: <5.0.2.1.1.20030123185114.00a91780@mail.oz.net> Message-ID: <5.0.2.1.1.20030123232216.00a900a0@mail.oz.net> At 22:36 2003-01-23 -0800, Darius Bacon wrote: >Bengt Richter writes: > >> I have some thoughts for decribing C structures (and anything else >> digital) in a canonical abstract way. I'll call such a decription >> a meta-representation, and a function that produces it from a Python >> object is meta_repr(python_object) => meta_rep_obj. The reverse >> operation is meta_eval ;-) >> >> The basis for meta_repr is that it capture and meta-represent >> (type, id, value) for an object in a pure abstract form. >> The abstract basis for storing the information is a bitvector >> object with attribute names that specify slices of the bitvector. >> I.e., there is an effective dict like {'slicename':(lo,hi)} and >> __getattr__ uses it so that bvec.slicename refers to bvec[lo:hi] etc. >> This is pure and abstract data, but you can see that interpreted >> little-endianly you can very straightforwardly lay out a C struct >> with whatever field widths and alignments you want. > >Sounds vaguely like this paper: > >First-Class Data-type Representation in SchemeXerox >http://citeseer.nj.nec.com/4990.html > >Darius Thank you for the reference! I can see why you were reminded. Regards, Bengt From theller at python.net Fri Jan 24 12:02:25 2003 From: theller at python.net (Thomas Heller) Date: 24 Jan 2003 12:02:25 +0100 Subject: [pypy-dev] __builtin__ module In-Reply-To: <20030124093935.GC6665@magma.unil.ch> References: <3E2ECFAF.3010104@strakt.com> <20030122020514.GA4079@debian.fenton.baltimore.md.us> <20030122162243.GE10868@magma.unil.ch> <3E2ECFAF.3010104@strakt.com> <5.0.2.1.1.20030122130959.00a9b800@mail.oz.net> <20030123094119.GA22778@magma.unil.ch> <20030123203206.GA1986@debian.fenton.baltimore.md.us> <20030124093935.GC6665@magma.unil.ch> Message-ID: Armin Rigo writes: > About chr(i) vs. '%c'%i vs. '\x00\x01...\xFF'[i]: what I feel this > shows is that one of these solutions must be thought as the > primitive way to build a character, and the others should use it; > and I definitely feel that chr() is the primitive way to build a > character. Contrary to what I said in a previous e-mail I don't > think that chr() should be implemented with '%c'%i. On the other > hand, I guess that the strings' % operator could nicely be > implemented in pure Python. It would then have to use chr() to > implement the %c format code. It looks more reasonable than the > other way around. I don't want to beat this to death, but I've a different opinion. There is no 'character' data type in Python, only a 'string', which consists of 0 to n characters. This is also reflected in Python/bltinmodule.c, where the code is: s[0] = (char)x; return PyString_FromStringAndSize(s, 1); So even the C code creates a 'character array', and passes it to PyString_FromStringAndSize, which I would call to canonical way to build a string of whatever size. Of course there's a lot going on what this function does behind the scenes... 
Thomas From theller at python.net Fri Jan 24 12:10:22 2003 From: theller at python.net (Thomas Heller) Date: 24 Jan 2003 12:10:22 +0100 Subject: [pypy-dev] Builtin types In-Reply-To: <5.0.2.1.1.20030124003751.00a8c460@mail.oz.net> References: <5.0.2.1.1.20030123185114.00a91780@mail.oz.net> <20030123221722.GA2297@debian.fenton.baltimore.md.us> <20030123221722.GA2297@debian.fenton.baltimore.md.us> <5.0.2.1.1.20030123185114.00a91780@mail.oz.net> <5.0.2.1.1.20030124003751.00a8c460@mail.oz.net> Message-ID: <7kcureg1.fsf@python.net> Bengt Richter writes: [description of ctypes internal deleted] > > It does sound like a lot of similar ground is covered. Are your > offset and size values in bits or bytes? (I intended bits). > Currently they measure in bytes, but only because I didn't have a need for bit fields in structs or unions. > The thing I think would be cool is if one could write python to > build meta_repr objects in python and have Psyco compile Python code > manipulating those representations and wind up with machine code > effectively equivalent to what the C code in your ctypes module does > when given the same implicit abstract info defining fields and > accessors etc., and code using those. > > The thing about the latter situation is that Psyco would still see > code accessing a foreign interface to getfunc/setfunc, whereas if it > sees Python code actually accessing the data meta-representations > behind getfunc/setfunc, it has a chance to bypass function calls and > generate inline machine code instead of using your canned > functions. Probably a loss at first, but eventually it could be a > gain, depending on Psyco? Maybe. We'll see ;-) All great ideas in the air! Thomas From scott at fenton.baltimore.md.us Fri Jan 24 13:43:13 2003 From: scott at fenton.baltimore.md.us (Scott Fenton) Date: Fri, 24 Jan 2003 07:43:13 -0500 Subject: [pypy-dev] __builtin__ module In-Reply-To: <20030124093935.GC6665@magma.unil.ch> References: <3E2ECFAF.3010104@strakt.com> <20030122020514.GA4079@debian.fenton.baltimore.md.us> <20030122162243.GE10868@magma.unil.ch> <3E2ECFAF.3010104@strakt.com> <5.0.2.1.1.20030122130959.00a9b800@mail.oz.net> <20030123094119.GA22778@magma.unil.ch> <20030123203206.GA1986@debian.fenton.baltimore.md.us> <20030124093935.GC6665@magma.unil.ch> Message-ID: <20030124124313.GA3626@debian.fenton.baltimore.md.us> On Fri, Jan 24, 2003 at 10:39:35AM +0100, Armin Rigo wrote: > About chr(i) vs. '%c'%i vs. '\x00\x01...\xFF'[i]: what I feel this shows is > that one of these solutions must be thought as the primitive way to build a > character, and the others should use it; and I definitely feel that chr() is > the primitive way to build a character. Contrary to what I said in a previous > e-mail I don't think that chr() should be implemented with '%c'%i. On the > other hand, I guess that the strings' % operator could nicely be implemented > in pure Python. It would then have to use chr() to implement the %c format > code. It looks more reasonable than the other way around. > I disagree. My feeling about this is probably that everything that can be expressed as a function should be in pure python, and that which can't should probably be C (or Java, or a Python compiler, or....). I guess since builtin types fit below that level, we should probably make '%c'%i the builtin way of conversion, and, in fact, the new version of pypy.py I'm putting together has it that way. 
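For reference, the two pure-Python spellings of chr() being debated here can be put side by side like this. This is illustrative only, not Scott's pypy.py code, and at the interpreter level '%c' % i itself still has to bottom out in some primitive.

_CHAR_TABLE = ''.join(['%c' % i for i in range(256)])    # the '\x00\x01...\xFF' table

def chr_via_format(i):
    if i not in range(256):
        raise ValueError('chr() arg not in range(256)')
    return '%c' % i

def chr_via_table(i):
    if i not in range(256):
        raise ValueError('chr() arg not in range(256)')
    return _CHAR_TABLE[i]

def ord_via_table(c):
    return _CHAR_TABLE.index(c)      # ord as position in the same character table

assert chr_via_format(65) == chr_via_table(65) == 'A'
assert ord_via_table('A') == 65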
Another good, compelling argument is that it would be idiotic to try to implement unichr the other way, so for consistency we should probably delegate the task of character->number to C, where it can be done more gracefully anyway. -Scott -- char m[9999],*n[99],*r=m,*p=m+5000,**s=n,d,c;main(){for(read(0,r,4000);c=*r; r++)c-']'||(d>1||(r=*p?*s:(--s,r)),!d||d--),c-'['||d++||(*++s=r),d||(*p+=c== '+',*p-=c=='-',p+=c=='>',p-=c=='<',c-'.'||write(1,p,1),c-','||read(2,p,1));} -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: not available URL: From hpk at trillke.net Fri Jan 24 14:48:42 2003 From: hpk at trillke.net (holger krekel) Date: Fri, 24 Jan 2003 14:48:42 +0100 Subject: [pypy-dev] __builtin__ module In-Reply-To: <20030124124313.GA3626@debian.fenton.baltimore.md.us>; from scott@fenton.baltimore.md.us on Fri, Jan 24, 2003 at 07:43:13AM -0500 References: <20030122020514.GA4079@debian.fenton.baltimore.md.us> <20030122162243.GE10868@magma.unil.ch> <3E2ECFAF.3010104@strakt.com> <5.0.2.1.1.20030122130959.00a9b800@mail.oz.net> <20030123094119.GA22778@magma.unil.ch> <20030123203206.GA1986@debian.fenton.baltimore.md.us> <20030124093935.GC6665@magma.unil.ch> <20030124124313.GA3626@debian.fenton.baltimore.md.us> Message-ID: <20030124144842.W10805@prim.han.de> [Scott Fenton Fri, Jan 24, 2003 at 07:43:13AM -0500] > On Fri, Jan 24, 2003 at 10:39:35AM +0100, Armin Rigo wrote: > > > About chr(i) vs. '%c'%i vs. '\x00\x01...\xFF'[i]: what I feel this shows is > > that one of these solutions must be thought as the primitive way to build a > > character, and the others should use it; and I definitely feel that chr() is > > the primitive way to build a character. Contrary to what I said in a previous > > e-mail I don't think that chr() should be implemented with '%c'%i. On the > > other hand, I guess that the strings' % operator could nicely be implemented > > in pure Python. It would then have to use chr() to implement the %c format > > code. It looks more reasonable than the other way around. > > > > I disagree. My feeling about this is probably that everything that can > be expressed as a function should be in pure python, what do you mean by this? "everything" can always be expressed in a python function. It's a matter of time and space so could you be more specific? Anyway, I would try very hard to express all the builtins in python. And i see Thomas Heller's ctypes approach as a way to make this possible. Before coding anything in C there must be a real *need* to do so. greetings, holger From scott at fenton.baltimore.md.us Fri Jan 24 14:25:23 2003 From: scott at fenton.baltimore.md.us (Scott Fenton) Date: Fri, 24 Jan 2003 08:25:23 -0500 Subject: [pypy-dev] New __builtin__ Message-ID: <20030124132523.GA3767@debian.fenton.baltimore.md.us> Yes, everyone, it is the moment you have all dreaded. A new version of pure python __builtin__s. This version contains nudity, sexual references, chr implemented using '%c'%i, an optimized ord, multiarg map and zip, and the new functions unichr, oct, hex, staticmethod, *attr (for * in (get,has,set)), isinstance, issubclass, and slice. I did not include classmethod as this seems to warrent further discussion. Stuff I think can be easily done in python that haven't been yet: dir, raw_input, reload, round, and xrange. 
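Not Scott's actual pypy.py code, but to illustrate that the builtins he lists really are expressible in plain Python, here is one possible spelling of staticmethod (as a descriptor) and of a multi-argument zip; the my_* names are invented.

class my_staticmethod(object):
    def __init__(self, func):
        self.func = func
    def __get__(self, obj, objtype=None):
        return self.func             # no binding: hand back the plain function

def my_zip(*seqs):
    # only handles indexable sequences, which is enough for this sketch
    if not seqs:
        return []
    result, i = [], 0
    while True:
        try:
            result.append(tuple([seq[i] for seq in seqs]))
        except IndexError:
            return result            # stop at the shortest sequence
        i += 1

class C(object):
    def f(x):
        return x + 1
    f = my_staticmethod(f)

assert C.f(41) == 42 and C().f(41) == 42
assert my_zip('abc', [1, 2]) == [('a', 1), ('b', 2)]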
thinks-python-can-be-faster-than-c-ly yours, -Scott -- char m[9999],*n[99],*r=m,*p=m+5000,**s=n,d,c;main(){for(read(0,r,4000);c=*r; r++)c-']'||(d>1||(r=*p?*s:(--s,r)),!d||d--),c-'['||d++||(*++s=r),d||(*p+=c== '+',*p-=c=='-',p+=c=='>',p-=c=='<',c-'.'||write(1,p,1),c-','||read(2,p,1));} -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: not available URL: From scott at fenton.baltimore.md.us Fri Jan 24 14:32:14 2003 From: scott at fenton.baltimore.md.us (Scott Fenton) Date: Fri, 24 Jan 2003 08:32:14 -0500 Subject: [pypy-dev] __builtin__ module In-Reply-To: <20030124144842.W10805@prim.han.de> References: <20030122162243.GE10868@magma.unil.ch> <3E2ECFAF.3010104@strakt.com> <5.0.2.1.1.20030122130959.00a9b800@mail.oz.net> <20030123094119.GA22778@magma.unil.ch> <20030123203206.GA1986@debian.fenton.baltimore.md.us> <20030124093935.GC6665@magma.unil.ch> <20030124124313.GA3626@debian.fenton.baltimore.md.us> <20030124144842.W10805@prim.han.de> Message-ID: <20030124133214.GA3779@debian.fenton.baltimore.md.us> On Fri, Jan 24, 2003 at 02:48:42PM +0100, holger krekel wrote: > > what do you mean by this? "everything" can always be expressed in > a python function. It's a matter of time and space so could you > be more specific? Mostly, stuff that exists as "builtin" syntax, ie printf-style formats that can be implemented "under the hood" using sprintf should probably be C. "everything" can be expressed in term of my signature, I just wouldn't try it if my life depened on it. > > Anyway, I would try very hard to express all the builtins in python. > And i see Thomas Heller's ctypes approach as a way to make > this possible. Before coding anything in C there must be a > real *need* to do so. > I can make chr python either way. It's a question of how "deep" we want python to go. -Scott -- char m[9999],*n[99],*r=m,*p=m+5000,**s=n,d,c;main(){for(read(0,r,4000);c=*r; r++)c-']'||(d>1||(r=*p?*s:(--s,r)),!d||d--),c-'['||d++||(*++s=r),d||(*p+=c== '+',*p-=c=='-',p+=c=='>',p-=c=='<',c-'.'||write(1,p,1),c-','||read(2,p,1));} -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: not available URL: From theller at python.net Fri Jan 24 15:41:42 2003 From: theller at python.net (Thomas Heller) Date: 24 Jan 2003 15:41:42 +0100 Subject: ctypes news (was Re: [pypy-dev] __builtin__ module) In-Reply-To: <20030124144842.W10805@prim.han.de> References: <20030122020514.GA4079@debian.fenton.baltimore.md.us> <20030122162243.GE10868@magma.unil.ch> <3E2ECFAF.3010104@strakt.com> <5.0.2.1.1.20030122130959.00a9b800@mail.oz.net> <20030123094119.GA22778@magma.unil.ch> <20030123203206.GA1986@debian.fenton.baltimore.md.us> <20030124093935.GC6665@magma.unil.ch> <20030124124313.GA3626@debian.fenton.baltimore.md.us> <20030124144842.W10805@prim.han.de> Message-ID: holger krekel writes: > Anyway, I would try very hard to express all the builtins in python. > And i see Thomas Heller's ctypes approach as a way to make > this possible. So do I ;-) ;-), although I'm have no idea where to begin. Construct a new Python type by assembling the type structure completely as a ctypes type? ----- Here are some news on ctypes: Development has has moved to SF: http://sourceforge.net/projects/ctypes. 
New web pages (but not much new content) I'm actively working on it right now: writing tests to shake out all the little bugs (there are lots of them, although noone has complained about them), and fixing them on the fly. I think I'm halfway through the fixing. In case someone cares, I've already rewritten the argument conversion stuff, which was a rather large change. libffi is well integrated. I'm routinely running the tests under Linux and Windows 2000 always. Just van Rossum has submitted a patch to make it work under MacOS X also. If someone wants to read the code, best would be to retrieve it with anon CVS. There's a mailing list now, although I'm currently the only subscriber. I'm not sure it will be useful, I created it just in case. Thomas From tismer at tismer.com Fri Jan 24 17:08:56 2003 From: tismer at tismer.com (Christian Tismer) Date: Fri, 24 Jan 2003 17:08:56 +0100 Subject: [pypy-dev] Builtin types In-Reply-To: <3cnjrnv2.fsf@python.net> References: <20030123221722.GA2297@debian.fenton.baltimore.md.us> <3E308C2E.9020109@tismer.com> <3cnjrnv2.fsf@python.net> Message-ID: <3E316518.1050207@tismer.com> Thomas Heller wrote: > Christian Tismer writes: ... >>No, sorry. object is just a special case for deriving a >>new-style class, see above. >>They can also be created by deriving from builtin types, >>or by using __slots__. > > That's incorrect, IIUC. > > __slots__ only has a special meaning for *new stype classes* only, > it doesn't trigger anything in classes classes. Yes, I forgot about that, you are right. >>Furthermore, I'm going to propose an extension to the new >>class system (at least for the MiniPy prototype) that goes >>a bit further: >>- __slots__ should get the ability to denote the type of >> a slot to be generated, especially ctypes types > > > I'm not really understanding what you're proposing here. I am trying to extend the new-style classes that includes ctypes, somehow. > You could look at ctypes as implementing 'typed slots' with > C-compatible layout. How is that so different from my idea? ... >>- it should be possible to derive a class from nothing, >> not even from object or classic, but I'd like to describe >> a plain C structure by classes. > > > This is maybe also something that ctypes already does. > The B *class* above knows all about this C structure > > struct B { > char x; > int y; > long long z; > }; > > >>>>ctypes.sizeof(B) > > 16 However, we have the same intent. ciao - chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Johannes-Niemeyer-Weg 9a : *Starship* http://starship.python.net/ 14109 Berlin : PGP key -> http://wwwkeys.pgp.net/ work +49 30 89 09 53 34 home +49 30 802 86 56 pager +49 173 24 18 776 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? 
http://www.stackless.com/

From arigo at tunes.org  Fri Jan 24 17:10:08 2003
From: arigo at tunes.org (Armin Rigo)
Date: Fri, 24 Jan 2003 17:10:08 +0100
Subject: [pypy-dev] __builtin__ module
In-Reply-To: <20030124124313.GA3626@debian.fenton.baltimore.md.us>
References: <20030122020514.GA4079@debian.fenton.baltimore.md.us> <20030122162243.GE10868@magma.unil.ch> <3E2ECFAF.3010104@strakt.com> <5.0.2.1.1.20030122130959.00a9b800@mail.oz.net> <20030123094119.GA22778@magma.unil.ch> <20030123203206.GA1986@debian.fenton.baltimore.md.us> <20030124093935.GC6665@magma.unil.ch> <20030124124313.GA3626@debian.fenton.baltimore.md.us>
Message-ID: <20030124161008.GA11766@magma.unil.ch>

Hello Scott,

On Fri, Jan 24, 2003 at 07:43:13AM -0500, Scott Fenton wrote:
> I disagree. My feeling about this is probably that everything that can
> be expressed as a function should be in pure python, and that which
> can't should probably be C

No! I was not saying it should be in C. *Nothing* should be in C. I'm using
the word "built-in" to mean a function at the interpreter-level, as opposed
to a function at the application-level.

I still think that the complexity of "string modulo" is best written as a
(non-built-in) Python function. A thing like chr() on the other hand is
trivial to implement as a built-in, with code like this:

def builtin_chr(v):   # v is a PyInt class instance
    i = v.parse_long()
    if i not in range(256):
        raise EPython(PyExc(ValueError),
                      PyStr('chr() arg not in range(256)'))
    return PyStr(chr(i))

Compare this with bltinmodule.c:chr(). Yes, I know there is still a call to
'chr()' in my implementation. But I'm at the interpreter level. The above
function (including its chr() call) is easy to translate to C to make the
equivalent of bltinmodule.c:chr().

BTW the syntax "i not in range(256)" is more conceptual than "i<0 or i>=256"
and expresses better what we are testing for (as the error message suggests
too).

The above code is a typical example of how I see "built-in" functions could
be written. The function is "built-in" not because it is written in another
language but because it works at the interpreter level.

A bientôt,

Armin.

From tismer at tismer.com  Fri Jan 24 17:25:38 2003
From: tismer at tismer.com (Christian Tismer)
Date: Fri, 24 Jan 2003 17:25:38 +0100
Subject: [pypy-dev] Builtin types
In-Reply-To: <3E316518.1050207@tismer.com>
References: <20030123221722.GA2297@debian.fenton.baltimore.md.us> <3E308C2E.9020109@tismer.com> <3cnjrnv2.fsf@python.net> <3E316518.1050207@tismer.com>
Message-ID: <3E316902.1060605@tismer.com>

Christian Tismer wrote:
...

A small addition:

>> I'm not really understanding what you're proposing here.
>
> I am trying to extend the new-style classes
> that includes ctypes, somehow.
>
>> You could look at ctypes as implementing 'typed slots' with
>> C-compatible layout.
...
>> This is maybe also something that ctypes already does.
>> The B *class* above knows all about this C structure
>>
>> struct B {
>>     char x;
>>     int y;
>>     long long z;
>> };

Ok, what I was thinking of was to use ctypes
or something similar to describe structs,
and then to build all objects on top of this.
This means that details like type pointer and
reference counts go into this definition as
well, together with their behavior, and we
are able to try different approaches as well.

Probably this idea is trivial, and you thought
this way all the time.

ciao - chris
--
Christian Tismer             :^)   Mission Impossible 5oftware  :  Have a break!
Take a ride on Python's
Johannes-Niemeyer-Weg 9a     :    *Starship* http://starship.python.net/
14109 Berlin                 :     PGP key -> http://wwwkeys.pgp.net/
work +49 30 89 09 53 34  home +49 30 802 86 56  pager +49 173 24 18 776
PGP 0x57F3BF04       9064 F4E1 D754 C2FF 1619  305B C09C 5A3B 57F3 BF04
      whom do you want to sponsor today?   http://www.stackless.com/

From nathanh at zu.com  Fri Jan 24 17:23:46 2003
From: nathanh at zu.com (Nathan Heagy)
Date: Fri, 24 Jan 2003 10:23:46 -0600
Subject: [pypy-dev] __builtin__ module
In-Reply-To: <20030124161008.GA11766@magma.unil.ch>
Message-ID: <3675A4C8-2FB8-11D7-99B7-00039385F5E6@zu.com>

> BTW the syntax "i not in range(256)" is more conceptual than "i<0 or
> i>=256"
> and expresses better what we are testing for (as the error message
> suggests
> too).

Why not "if 0 < i < 256:" ?

--
Nathan Heagy
phone:306.653.4747 fax:306.653.4774
http://www.zu.com

From nathanh at zu.com  Fri Jan 24 17:34:20 2003
From: nathanh at zu.com (Nathan Heagy)
Date: Fri, 24 Jan 2003 10:34:20 -0600
Subject: [pypy-dev] __builtin__ module
In-Reply-To: <20030124093935.GC6665@magma.unil.ch>
Message-ID:

> chr(i) vs. '%c'%i vs. '\x00\x01...\xFF'[i]:

Isn't part of this decision is whether the string type will itself be
written in Python? If so then the chr(i) functionality will be a method
of the String class, and even if String class is written in C chr()
could probably still be a class method. Perhaps minimalPython does need
a char type so that strings can be written in python and not C?

--
Nathan Heagy
phone:306.653.4747 fax:306.653.4774
http://www.zu.com

From pedronis at bluewin.ch  Fri Jan 24 17:48:42 2003
From: pedronis at bluewin.ch (Samuele Pedroni)
Date: Fri, 24 Jan 2003 17:48:42 +0100
Subject: [pypy-dev] Builtin types
References: <20030123221722.GA2297@debian.fenton.baltimore.md.us><3E308C2E.9020109@tismer.com> <3cnjrnv2.fsf@python.net><3E316518.1050207@tismer.com> <3E316902.1060605@tismer.com>
Message-ID: <00fe01c2c3c8$74474ae0$6d94fea9@newmexico>

From: "Christian Tismer"
> Christian Tismer wrote:
> ...
>
> A small addition:
> >> I'm not really understanding what you're proposing here.
> >
> >
> > I am trying to extend the new-style classes
> > that includes ctypes, somehow.
> >
> >> You could look at ctypes as implementing 'typed slots' with
> >> C-compatible layout.
> ...
>
> >> This is maybe also something that ctypes already does.
> >> The B *class* above knows all about this C structure
> >>
> >> struct B {
> >>     char x;
> >>     int y;
> >>     long long z;
> >> };
>
> Ok, what I was thinking of was to use ctypes
> or something similar to describe structs,
> and then to build all objects on top of this.
> This means that details like type pointer and
> reference counts go into this definition as
> well, together with their behavior, and we
> are able to try different approaches as well.
>
> Probably this idea is trivial, and you thought
> this way all the time.

maybe I'm stating the obvious, I think all of this is useful and necessary to
get a running/working Python in Python and for targeting
C-like/machine-code-level backends.

OTOH I think a higher level of abstraction is necessary to target more
general backends. E.g. at such level what is relevant

- about an integer object is its value and that its type is integer

- semantics of non-control-flow and binding byte codes are important e.g. :

def binary_add(o1,o2):
    ...

but the fact that there's a bytecode eval loop is much less so.
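To spell out the kind of description Samuele is pointing at with binary_add(o1, o2), here is a rough sketch over wrapped interpreter-level objects; PyInt and PyStr echo the names in Armin's earlier mail, but the details are invented here.

class PyInt(object):
    def __init__(self, intval):
        self.intval = intval

class PyStr(object):
    def __init__(self, strval):
        self.strval = strval

def binary_add(o1, o2):
    # application-level "+" captured without reference to any bytecode eval loop
    if isinstance(o1, PyInt) and isinstance(o2, PyInt):
        return PyInt(o1.intval + o2.intval)
    if isinstance(o1, PyStr) and isinstance(o2, PyStr):
        return PyStr(o1.strval + o2.strval)
    raise TypeError('unsupported operand types for +')

assert binary_add(PyInt(2), PyInt(3)).intval == 5
assert binary_add(PyStr('ab'), PyStr('cd')).strval == 'abcd'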
(From my experience) a relevant issue is how to abstract over the PyTypeObject struct and the interpreter internal inheritance and lookup mechanisms (which do not correspond so directly to the language level semantics). regards. From bokr at oz.net Sat Jan 25 00:28:16 2003 From: bokr at oz.net (Bengt Richter) Date: Fri, 24 Jan 2003 15:28:16 -0800 Subject: [pypy-dev] __builtin__ module In-Reply-To: <20030124133214.GA3779@debian.fenton.baltimore.md.us> References: <20030124144842.W10805@prim.han.de> <20030122162243.GE10868@magma.unil.ch> <3E2ECFAF.3010104@strakt.com> <5.0.2.1.1.20030122130959.00a9b800@mail.oz.net> <20030123094119.GA22778@magma.unil.ch> <20030123203206.GA1986@debian.fenton.baltimore.md.us> <20030124093935.GC6665@magma.unil.ch> <20030124124313.GA3626@debian.fenton.baltimore.md.us> <20030124144842.W10805@prim.han.de> Message-ID: <5.0.2.1.1.20030124133513.00aa4d00@mail.oz.net> At 08:32 2003-01-24 -0500, Scott Fenton wrote: >On Fri, Jan 24, 2003 at 02:48:42PM +0100, holger krekel wrote: >> >> what do you mean by this? "everything" can always be expressed in >> a python function. It's a matter of time and space so could you >> be more specific? > >Mostly, stuff that exists as "builtin" syntax, ie printf-style formats >that can be implemented "under the hood" using sprintf should probably >be C. "everything" can be expressed in term of my signature, I just >wouldn't try it if my life depened on it. > >> >> Anyway, I would try very hard to express all the builtins in python. >> And i see Thomas Heller's ctypes approach as a way to make >> this possible. Before coding anything in C there must be a >> real *need* to do so. >> > >I can make chr python either way. It's a question of how "deep" we want >python to go. I'm thinking "depth" is a monotonic relationship among nodes along a path in an acyclic graph, and so far we are talking about two kinds of nodes: "interpreter level" and "Python level". I am getting an idea that maybe we should be thinking meta-levels instead of two kinds, and that in general there can concurrently exist many levels of nodes in the tree, all busily acting as "interpreter level" for their higher level parents, except for root and leaves. Not sure how useful this is for immediate goals, but I'm struggling to form a satisfying abstract view of the whole problem ;-) It's interesting that manipulation of representational elements in one level implements operations at another level, and manipulation is _defined_ by a pattern of elements that live somewhere too. It's a subtle soup ;-) BTW, discussion of chr(i) and ord(c) brings up the question of representing reinterpret_cast at an abstract level. (Whether a C reinterpret_cast correctly implements chr and ord semantics is a separate question. I just want to talk about casting a moment, for the light it may shed on type info). ISTM we need a (python level) type to (meta-) represent untyped bits. I.e., the 8 bits of a char don't carry the type info that makes them char vs uint8. The hypothetical meta_repr function I mentioned in a previous post does make use of a Python level object (Bits instance) to represent untyped bits. I.e., if c is a character, conceptually: meta_repr(c) => ('str', id(c), Bits(c)) where Bits(c) is a special bit vector at the Python level, but represents the untyped bits of c at the next meta-level ("interpreter level" here). 
An in-place reinterpret_cast from char to uint8 would amount to ('str', id(c), Bits(c)) => ('int', id(c), Bits(c)) (Assuming that in the abstract, 'int' doesn't care about the current number of bits in its representation, though I'm skipping a detail about sign, which is an interpretation of the msb of Bits(c), so it should probably be written ('str', id(c), Bits(c)) => ('int', id(c), Bits(0)+Bits(c)) where the '+' means bit vectors concatenate. Or you could have a distinct 'uint' type. That might be necessary for unsigned fixed-width representations. A _conversion_ would virtually allocate existence space by also providing a new distinct id and copying the representational Bits (or not copying, if we can tag the instance as immutable for some uses). Note that you can imagine corresponding operations in a different level where there's an "actual" space allocation somewhere in some array (Python level or malloc level ;-) and id indicates a location in that array, and 'int' is perhaps encoded in some Bits associated with the Bits of the newly allocated int representation, or maybe implicit in the indication of the space array, if that is dedicated to a single type. All this, too, could be represented abstractly at a level before machine language, so I think a two-meta-level model may be constraining. If you look at ('str', id(c), Bits(c)) as Python-level code, it is a Python level tuple with a Python level string, int, and class instance. The whole thing only has meta-meaning because of how it is interpreted in a context relating two meta-levels. I.e., the interpretation itself is expressed at the Python level, but being plain Python, there is a meta_repr(('str', id(c), Bits(c))) involved in the next level of what's happening, and so forth, until leaf representations of machine code are evolved and used, IWT. Hoping I'm helping factor and clarify concepts rather than tangle and muddy, Bengt From pedronis at bluewin.ch Sat Jan 25 01:01:39 2003 From: pedronis at bluewin.ch (Samuele Pedroni) Date: Sat, 25 Jan 2003 01:01:39 +0100 Subject: [pypy-dev] __builtin__ module References: <20030124144842.W10805@prim.han.de> <20030122162243.GE10868@magma.unil.ch> <3E2ECFAF.3010104@strakt.com> <5.0.2.1.1.20030122130959.00a9b800@mail.oz.net> <20030123094119.GA22778@magma.unil.ch> <20030123203206.GA1986@debian.fenton.baltimore.md.us> <20030124093935.GC6665@magma.unil.ch> <20030124124313.GA3626@debian.fenton.baltimore.md.us> <20030124144842.W10805@prim.han.de> <5.0.2.1.1.20030124133513.00aa4d00@mail.oz.net> Message-ID: <05ed01c2c404$efea05c0$6d94fea9@newmexico> From: "Bengt Richter" > At 08:32 2003-01-24 -0500, Scott Fenton wrote: > >On Fri, Jan 24, 2003 at 02:48:42PM +0100, holger krekel wrote: > >> > >> what do you mean by this? "everything" can always be expressed in > >> a python function. It's a matter of time and space so could you > >> be more specific? > > > >Mostly, stuff that exists as "builtin" syntax, ie printf-style formats > >that can be implemented "under the hood" using sprintf should probably > >be C. "everything" can be expressed in term of my signature, I just > >wouldn't try it if my life depened on it. > > > >> > >> Anyway, I would try very hard to express all the builtins in python. > >> And i see Thomas Heller's ctypes approach as a way to make > >> this possible. Before coding anything in C there must be a > >> real *need* to do so. > >> > > > >I can make chr python either way. It's a question of how "deep" we want > >python to go. 
> > I'm thinking "depth" is a monotonic relationship among nodes along a path > in an acyclic graph, and so far we are talking about two kinds of nodes: > "interpreter level" and "Python level". I am getting an idea that maybe > we should be thinking meta-levels instead of two kinds, and that in general > there can concurrently exist many levels of nodes in the tree, all busily > acting as "interpreter level" for their higher level parents, except for > root and leaves. Not sure how useful this is for immediate goals, but I'm > struggling to form a satisfying abstract view of the whole problem ;-) > It's interesting that manipulation of representational elements in one > level implements operations at another level, and manipulation is _defined_ > by a pattern of elements that live somewhere too. It's a subtle soup ;-) > > BTW, discussion of chr(i) and ord(c) brings up the question of representing > reinterpret_cast at an abstract level. (Whether a C reinterpret_cast correctly > implements chr and ord semantics is a separate question. I just want to talk > about casting a moment, for the light it may shed on type info). > > ISTM we need a (python level) type to (meta-) represent untyped bits. I.e., the > 8 bits of a char don't carry the type info that makes them char vs uint8. > The hypothetical meta_repr function I mentioned in a previous post does make > use of a Python level object (Bits instance) to represent untyped bits. I.e., > if c is a character, conceptually: meta_repr(c) => ('str', id(c), Bits(c)) > where Bits(c) is a special bit vector at the Python level, but represents the > untyped bits of c at the next meta-level ("interpreter level" here). > > An in-place reinterpret_cast from char to uint8 would amount to > ('str', id(c), Bits(c)) => ('int', id(c), Bits(c)) > (Assuming that in the abstract, 'int' doesn't care about the current number of > bits in its representation, though I'm skipping a detail about sign, which > is an interpretation of the msb of Bits(c), so it should probably be written > ('str', id(c), Bits(c)) => ('int', id(c), Bits(0)+Bits(c)) > where the '+' means bit vectors concatenate. Or you could have a distinct 'uint' > type. That might be necessary for unsigned fixed-width representations. > > A _conversion_ would virtually allocate existence space by also providing > a new distinct id and copying the representational Bits (or not copying, if > we can tag the instance as immutable for some uses). Note that you can imagine > corresponding operations in a different level where there's an "actual" space > allocation somewhere in some array (Python level or malloc level ;-) and > id indicates a location in that array, and 'int' is perhaps encoded in some > Bits associated with the Bits of the newly allocated int representation, or maybe > implicit in the indication of the space array, if that is dedicated to a single type. > All this, too, could be represented abstractly at a level before machine language, > so I think a two-meta-level model may be constraining. > > If you look at ('str', id(c), Bits(c)) as Python-level code, it is > a Python level tuple with a Python level string, int, and class instance. > The whole thing only has meta-meaning because of how it is interpreted > in a context relating two meta-levels. 
> > I.e., the interpretation itself is expressed at the Python level, but being plain > Python, there is a meta_repr(('str', id(c), Bits(c))) involved in the next > level of what's happening, and so forth, until leaf representations of machine > code are evolved and used, IWT. > > Hoping I'm helping factor and clarify concepts rather than tangle and muddy, > Bengt I think that the untyped bit vector thing is too low level. From scott at fenton.baltimore.md.us Sat Jan 25 00:46:31 2003 From: scott at fenton.baltimore.md.us (Scott Fenton) Date: Fri, 24 Jan 2003 18:46:31 -0500 Subject: [pypy-dev] __builtin__ module In-Reply-To: <5.0.2.1.1.20030124133513.00aa4d00@mail.oz.net> References: <3E2ECFAF.3010104@strakt.com> <5.0.2.1.1.20030122130959.00a9b800@mail.oz.net> <20030123094119.GA22778@magma.unil.ch> <20030123203206.GA1986@debian.fenton.baltimore.md.us> <20030124093935.GC6665@magma.unil.ch> <20030124124313.GA3626@debian.fenton.baltimore.md.us> <20030124144842.W10805@prim.han.de> <5.0.2.1.1.20030124133513.00aa4d00@mail.oz.net> Message-ID: <20030124234631.GA1447@debian.fenton.baltimore.md.us> On Fri, Jan 24, 2003 at 03:28:16PM -0800, Bengt Richter wrote: > [snip] > I'm thinking "depth" is a monotonic relationship among nodes along a path > in an acyclic graph, and so far we are talking about two kinds of nodes: > "interpreter level" and "Python level". I am getting an idea that maybe > we should be thinking meta-levels instead of two kinds, and that in general > there can concurrently exist many levels of nodes in the tree, all busily > acting as "interpreter level" for their higher level parents, except for > root and leaves. Not sure how useful this is for immediate goals, but I'm > struggling to form a satisfying abstract view of the whole problem ;-) > It's interesting that manipulation of representational elements in one > level implements operations at another level, and manipulation is _defined_ > by a pattern of elements that live somewhere too. It's a subtle soup ;-) Hmm.... meta-circular interpreters. Sounds suspicously like problems Lisp has dealt with for years. Perhaps we should take a look at how systems like Maclisp and Interlisp handled these problems, since they were themselves written in Lisp, as I recall it. BTW, for reference I've put a copy of Steele and Sussman's "Art of the Interpreter" on my site at http://fenton.baltimore.md.us/AIM-453.pdf parenthetically yours, -Scott -- char m[9999],*n[99],*r=m,*p=m+5000,**s=n,d,c;main(){for(read(0,r,4000);c=*r; r++)c-']'||(d>1||(r=*p?*s:(--s,r)),!d||d--),c-'['||d++||(*++s=r),d||(*p+=c== '+',*p-=c=='-',p+=c=='>',p-=c=='<',c-'.'||write(1,p,1),c-','||read(2,p,1));} -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 189 bytes Desc: not available URL: From scott at fenton.baltimore.md.us Sat Jan 25 00:52:30 2003 From: scott at fenton.baltimore.md.us (Scott Fenton) Date: Fri, 24 Jan 2003 18:52:30 -0500 Subject: [pypy-dev] __builtin__ module In-Reply-To: <05ed01c2c404$efea05c0$6d94fea9@newmexico> References: <3E2ECFAF.3010104@strakt.com> <5.0.2.1.1.20030122130959.00a9b800@mail.oz.net> <20030123094119.GA22778@magma.unil.ch> <20030123203206.GA1986@debian.fenton.baltimore.md.us> <20030124093935.GC6665@magma.unil.ch> <20030124124313.GA3626@debian.fenton.baltimore.md.us> <20030124144842.W10805@prim.han.de> <5.0.2.1.1.20030124133513.00aa4d00@mail.oz.net> <05ed01c2c404$efea05c0$6d94fea9@newmexico> Message-ID: <20030124235230.GA1525@debian.fenton.baltimore.md.us> On Sat, Jan 25, 2003 at 01:01:39AM +0100, Samuele Pedroni wrote: > [snip] > I think that the untyped bit vector thing is too low level. Not if we're talking about generating a compiler, it isn't. I think part of our problem is that, besides extending python FAR beyond its original problem domain, we look like we're heading towards the first python compiler, as opposed to the CPython and Jython interpreters (ok, there is freeze, but it uses the internals of an interpreter to run, so not really). What we need to do is figure out if we want a direct compiler in which to write an interpeter, or just an interpreter, which wouldn't ever be free standing. meta-circular-ly yours, -Scott -- char m[9999],*n[99],*r=m,*p=m+5000,**s=n,d,c;main(){for(read(0,r,4000);c=*r; r++)c-']'||(d>1||(r=*p?*s:(--s,r)),!d||d--),c-'['||d++||(*++s=r),d||(*p+=c== '+',*p-=c=='-',p+=c=='>',p-=c=='<',c-'.'||write(1,p,1),c-','||read(2,p,1));} -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: not available URL: From bokr at oz.net Sat Jan 25 01:52:34 2003 From: bokr at oz.net (Bengt Richter) Date: Fri, 24 Jan 2003 16:52:34 -0800 Subject: [pypy-dev] __builtin__ module In-Reply-To: <3675A4C8-2FB8-11D7-99B7-00039385F5E6@zu.com> References: <20030124161008.GA11766@magma.unil.ch> Message-ID: <5.0.2.1.1.20030124153543.00aa3960@mail.oz.net> At 10:23 2003-01-24 -0600, Nathan Heagy wrote: >>BTW the syntax "i not in range(256)" is more conceptual than "i<0 or i>=256" >>and expresses better what we are testing for (as the error message suggests >>too). > >Why not "if 0 < i < 256:" ? I think it's because your ineqality refers to an interval in the set of all integers (or reals for that matter), whereas the concept of chr derives from the ordinal place of a character in a specific finite _ordered_ set of characters. IOW if the char set were represented as a sequence named charset, then the index test really should be "i not in range(len(charset))". If you think of charset having a 1:1 corresponding indexset, then c = chr(i) can be expressed as c = charset[indexset.index(i)] and conventionally the index set for ord is range(len(charset)), because that ordered set naturally represents position in the set. Note that you could have a non-integer i and index set in this formulation, so Armin's test doesn't just test a numerical value within an interval, it tests whether the index is an allowable index object (i.e., member of the indexset). Of course if indexset is range(x) and i is in indexset, indexset.index(i) is a noop, so after Armin's test, you can safely write c = charset[i]. 
For better conceptual purity (and less-magical magic numbers ;-), I might write your inequality as if isinstance(i, int) and ord(charset[0]) <= i <= ord(charset[-1]) and let ord(c) be charset.index(c) Maybe Psyco would ultimately generate the same code ;-) Regards, Bengt From pedronis at bluewin.ch Sat Jan 25 01:55:04 2003 From: pedronis at bluewin.ch (Samuele Pedroni) Date: Sat, 25 Jan 2003 01:55:04 +0100 Subject: [pypy-dev] __builtin__ module References: <3E2ECFAF.3010104@strakt.com> <5.0.2.1.1.20030122130959.00a9b800@mail.oz.net> <20030123094119.GA22778@magma.unil.ch> <20030123203206.GA1986@debian.fenton.baltimore.md.us> <20030124093935.GC6665@magma.unil.ch> <20030124124313.GA3626@debian.fenton.baltimore.md.us> <20030124144842.W10805@prim.han.de> <5.0.2.1.1.20030124133513.00aa4d00@mail.oz.net> <05ed01c2c404$efea05c0$6d94fea9@newmexico> <20030124235230.GA1525@debian.fenton.baltimore.md.us> Message-ID: <070901c2c40c$65fc9960$6d94fea9@newmexico> >Not if we're talking about generating a compiler, it isn't. I think >part of our problem is that, besides extending python FAR beyond >its original problem domain, we look like we're heading towards the >first python compiler, as opposed to the CPython and Jython interpreters >(ok, there is freeze, but it uses the internals of an interpreter to >run, so not really). What we need to do is figure out if we want a >direct compiler in which to write an interpeter, or just an interpreter, >which wouldn't ever be free standing. I guess that what you mean is that (limited) untyped bit vectors can be used in an intermediate representation near to machine code generation in a to-machine-code compiler. My point is that wrt capturing python semantics in python in a way that is statically analyzable in order to produce widely different "backends" the notion of untyped bits vector is likely too low level. (Even if the backend set should encompass "compilers"). From scott at fenton.baltimore.md.us Sat Jan 25 01:34:20 2003 From: scott at fenton.baltimore.md.us (Scott Fenton) Date: Fri, 24 Jan 2003 19:34:20 -0500 Subject: [pypy-dev] __builtin__ module In-Reply-To: <070901c2c40c$65fc9960$6d94fea9@newmexico> References: <20030123094119.GA22778@magma.unil.ch> <20030123203206.GA1986@debian.fenton.baltimore.md.us> <20030124093935.GC6665@magma.unil.ch> <20030124124313.GA3626@debian.fenton.baltimore.md.us> <20030124144842.W10805@prim.han.de> <5.0.2.1.1.20030124133513.00aa4d00@mail.oz.net> <05ed01c2c404$efea05c0$6d94fea9@newmexico> <20030124235230.GA1525@debian.fenton.baltimore.md.us> <070901c2c40c$65fc9960$6d94fea9@newmexico> Message-ID: <20030125003420.GA2151@debian.fenton.baltimore.md.us> On Sat, Jan 25, 2003 at 01:55:04AM +0100, Samuele Pedroni wrote: > [snip] > I guess that what you mean is that (limited) untyped bit vectors can be used in > an intermediate representation near to machine code generation in a > to-machine-code compiler. > > My point is that wrt capturing python semantics in python in a way that is > statically analyzable in order to produce widely different "backends" the > notion of untyped bits vector is likely too low level. (Even if the backend > set should encompass "compilers"). > OK then, in that case, sure. Bit vectors do end up being too low-level. I think it would really help this project to have a clear statement of what we're aiming to do: ie., do we want a parser with multiple backends, two of which are interpet and compile, or do we want a simple python->machine code or python->execute as interpeted translator? 
-Scott -- char m[9999],*n[99],*r=m,*p=m+5000,**s=n,d,c;main(){for(read(0,r,4000);c=*r; r++)c-']'||(d>1||(r=*p?*s:(--s,r)),!d||d--),c-'['||d++||(*++s=r),d||(*p+=c== '+',*p-=c=='-',p+=c=='>',p-=c=='<',c-'.'||write(1,p,1),c-','||read(2,p,1));} -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: not available URL: From tismer at tismer.com Sat Jan 25 04:21:22 2003 From: tismer at tismer.com (Christian Tismer) Date: Sat, 25 Jan 2003 04:21:22 +0100 Subject: [pypy-dev] __builtin__ module In-Reply-To: <3675A4C8-2FB8-11D7-99B7-00039385F5E6@zu.com> References: <3675A4C8-2FB8-11D7-99B7-00039385F5E6@zu.com> Message-ID: <3E3202B2.90800@tismer.com> Nathan Heagy wrote: > BTW the syntax "i not in range(256)" is more conceptual than "i<0 or > i>=256" > and expresses better what we are testing for (as the error message > suggests > too). > > > Why not "if 0 < i < 256:" ? Well, the generated code should finally be the same. But "i not in range(256)" expresses that i should not be in the set of those 256 values. That's more expressive, since it does not imply that i needs to be an integer at all. It does not impose that i has a data type that is ordered, and it does not require that i has to know how to compare itself to be less or greater than anything else. It just says "do not be any of these". ciao - chris From tismer at tismer.com Sat Jan 25 04:24:37 2003 From: tismer at tismer.com (Christian Tismer) Date: Sat, 25 Jan 2003 04:24:37 +0100 Subject: [pypy-dev] __builtin__ module In-Reply-To: References: Message-ID: <3E320375.9000906@tismer.com> Nathan Heagy wrote: > chr(i) vs. '%c'%i vs. '\x00\x01...\xFF'[i]: > > > Isn't part of this decision is whether the string type will itself be > written in Python? If so then the chr(i) functionality will be a method > of the String class, and even if String class is written in C chr() > could probably still be a class method. Perhaps minimalPython does need > a char type so that strings can be written in python and not C? The implementation enginge of MinimalPython, not MinimalPythonitself necessarily, needs to have a way to express "this is a single char". This will most probably be expressed by using an instance of an according type like in ctypes. This is not necessary in the first iteration of the bootstrap, but later, when we have a running engine, and begin to make it efficient. cheers - chris From tismer at tismer.com Sat Jan 25 04:41:37 2003 From: tismer at tismer.com (Christian Tismer) Date: Sat, 25 Jan 2003 04:41:37 +0100 Subject: [pypy-dev] __builtin__ module In-Reply-To: <20030125003420.GA2151@debian.fenton.baltimore.md.us> References: <20030123094119.GA22778@magma.unil.ch> <20030123203206.GA1986@debian.fenton.baltimore.md.us> <20030124093935.GC6665@magma.unil.ch> <20030124124313.GA3626@debian.fenton.baltimore.md.us> <20030124144842.W10805@prim.han.de> <5.0.2.1.1.20030124133513.00aa4d00@mail.oz.net> <05ed01c2c404$efea05c0$6d94fea9@newmexico> <20030124235230.GA1525@debian.fenton.baltimore.md.us> <070901c2c40c$65fc9960$6d94fea9@newmexico> <20030125003420.GA2151@debian.fenton.baltimore.md.us> Message-ID: <3E320771.6000407@tismer.com> Scott Fenton wrote: > On Sat, Jan 25, 2003 at 01:55:04AM +0100, Samuele Pedroni wrote: > >>[snip] >>I guess that what you mean is that (limited) untyped bit vectors can be used in >>an intermediate representation near to machine code generation in a >>to-machine-code compiler. 
>> >>My point is that wrt capturing python semantics in python in a way that is >>statically analyzable in order to produce widely different "backends" the >>notion of untyped bits vector is likely too low level. (Even if the backend >>set should encompass "compilers"). >> > > > OK then, in that case, sure. Bit vectors do end up being too > low-level. I think it would really help this project to have > a clear statement of what we're aiming to do: ie., do we > want a parser with multiple backends, two of which are interpet > and compile, or do we want a simple python->machine code or > python->execute as interpeted translator? Yes. At the moment, we want any and all of that. Even the statement about bit vectors being too low-level is a bit ;-) early to be stated. Together with a good set of description objects for the bits in the vector, it does make sense. This is the really intersting structure, the vectors is much less relevant. -chris From arigo at tunes.org Sat Jan 25 12:32:16 2003 From: arigo at tunes.org (Armin Rigo) Date: Sat, 25 Jan 2003 03:32:16 -0800 (PST) Subject: [pypy-dev] __builtin__ module In-Reply-To: <3675A4C8-2FB8-11D7-99B7-00039385F5E6@zu.com>; from nathanh@zu.com on Fri, Jan 24, 2003 at 10:23:46AM -0600 References: <20030124161008.GA11766@magma.unil.ch> <3675A4C8-2FB8-11D7-99B7-00039385F5E6@zu.com> Message-ID: <20030125113216.A4F554A5F@bespin.org> Hello Nathan, On Fri, Jan 24, 2003 at 10:23:46AM -0600, Nathan Heagy wrote: > > "i not in range(256)" > > Why not "if 0 < i < 256:" ? Ultimately because "i in range()" is a single test. This is what we want to say: "i is in the acceptable range". A translator working on the "0 <= i < 256" version would have to use patterns to figure out that we are actually asking whether "i" is within a range, and not simply making two tests "0 <= i" and "i < 256", in case it allows to generate more natural code. Conversely, it is trivial to implement "in range()" as a double comparison if needed. Armin From pedronis at bluewin.ch Sat Jan 25 13:29:58 2003 From: pedronis at bluewin.ch (Samuele Pedroni) Date: Sat, 25 Jan 2003 13:29:58 +0100 Subject: [pypy-dev] __builtin__ module References: <20030123094119.GA22778@magma.unil.ch> <20030123203206.GA1986@debian.fenton.baltimore.md.us> <20030124093935.GC6665@magma.unil.ch> <20030124124313.GA3626@debian.fenton.baltimore.md.us> <20030124144842.W10805@prim.han.de> <5.0.2.1.1.20030124133513.00aa4d00@mail.oz.net> <05ed01c2c404$efea05c0$6d94fea9@newmexico> <20030124235230.GA1525@debian.fenton.baltimore.md.us> <070901c2c40c$65fc9960$6d94fea9@newmexico> <20030125003420.GA2151@debian.fenton.baltimore.md.us> <3E320771.6000407@tismer.com> Message-ID: <001201c2c46d$7a1a0480$6d94fea9@newmexico> From: "Christian Tismer" > > Yes. > At the moment, we want any and all of that. > Even the statement about bit vectors being > too low-level is a bit ;-) early to be stated. > Together with a good set of description objects > for the bits in the vector, it does make sense. > This is the really intersting structure, the > vectors is much less relevant. > it is worth to rember that some potential targerts have no notion of pointer, and of casting between integers and pointers, only opaque references. 
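As a side illustration of the lowering Armin describes above ("in range()" turned into a double comparison when a backend prefers it), here is a small AST rewrite. It uses today's ast module purely for brevity; that module and the LowerRangeTest name are assumptions of the sketch, not the 2003 tool set.

import ast

class LowerRangeTest(ast.NodeTransformer):
    # rewrite "<expr> in range(N)" into "0 <= <expr> < N"
    def visit_Compare(self, node):
        self.generic_visit(node)
        if (len(node.ops) == 1 and isinstance(node.ops[0], ast.In)
                and isinstance(node.comparators[0], ast.Call)
                and isinstance(node.comparators[0].func, ast.Name)
                and node.comparators[0].func.id == "range"
                and len(node.comparators[0].args) == 1):
            upper = node.comparators[0].args[0]
            lowered = ast.Compare(left=ast.Constant(0),
                                  ops=[ast.LtE(), ast.Lt()],
                                  comparators=[node.left, upper])
            return ast.copy_location(lowered, node)
        return node

tree = LowerRangeTest().visit(ast.parse("ok = i in range(256)"))
code = compile(ast.fix_missing_locations(tree), "<lowered>", "exec")
ns = {"i": 300}
exec(code, ns)
assert ns["ok"] is False    # 300 is outside 0 <= i < 256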
From hpk at trillke.net Sat Jan 25 21:35:50 2003 From: hpk at trillke.net (holger krekel) Date: Sat, 25 Jan 2003 21:35:50 +0100 Subject: ctypes news (was Re: [pypy-dev] __builtin__ module) In-Reply-To: ; from theller@python.net on Fri, Jan 24, 2003 at 03:41:42PM +0100 References: <3E2ECFAF.3010104@strakt.com> <5.0.2.1.1.20030122130959.00a9b800@mail.oz.net> <20030123094119.GA22778@magma.unil.ch> <20030123203206.GA1986@debian.fenton.baltimore.md.us> <20030124093935.GC6665@magma.unil.ch> <20030124124313.GA3626@debian.fenton.baltimore.md.us> <20030124144842.W10805@prim.han.de> Message-ID: <20030125213550.A10805@prim.han.de> [Thomas Heller Fri, Jan 24, 2003 at 03:41:42PM +0100] > holger krekel writes: > > > Anyway, I would try very hard to express all the builtins in python. > > And i see Thomas Heller's ctypes approach as a way to make > > this possible. > > So do I ;-) ;-), although I'm have no idea where to begin. > Construct a new Python type by assembling the type structure > completely as a ctypes type? Hmm, that's one of my black areas in current CPython: the type system. I admit it's a rather large one :-) But wouldn't using a ctypes represenation of Python types only be neccessary when interfacing with CPython C stuff? IMV the 'chr' function could be implemented at 'interpreter level' and have representation-optimized access methods. So if a PyString had a ctype as its 'array' (is that possible today?) then effectively an assembler intruction fetching one byte and using that as a representation for a PyInt object might do it. I am trying to use the terms from previous discussions. But unless i coded some of that myself i am sure that i am not completly making sense .... > Here are some news on ctypes: > > Development has has moved to SF: http://sourceforge.net/projects/ctypes. Thomas, can you imagine the forthcoming pypy-repository and your ctypes-repo to merge at some point? CTypes would of course still have its own directory and could be separately tagged etc. Here, I am already thinking about how we can actually manage our upcoming coding at the Sprint. I think that having "everything neccessary" in one repo would be quite convenient. regards, holger From hpk at trillke.net Sun Jan 26 17:28:03 2003 From: hpk at trillke.net (holger krekel) Date: Sun, 26 Jan 2003 17:28:03 +0100 Subject: [pypy-dev] New __builtin__ In-Reply-To: <20030124132523.GA3767@debian.fenton.baltimore.md.us>; from scott@fenton.baltimore.md.us on Fri, Jan 24, 2003 at 08:25:23AM -0500 References: <20030124132523.GA3767@debian.fenton.baltimore.md.us> Message-ID: <20030126172803.D22025@prim.han.de> [Scott Fenton Fri, Jan 24, 2003 at 08:25:23AM -0500] > Yes, everyone, it is the moment you have all dreaded. > A new version of pure python __builtin__s. This version > contains nudity, sexual references, chr implemented > using '%c'%i, an optimized ord, multiarg map and > zip, and the new functions unichr, oct, hex, staticmethod, > *attr (for * in (get,has,set)), isinstance, issubclass, > and slice. I did not include classmethod as this seems > to warrent further discussion. Stuff I think can be easily > done in python that haven't been yet: dir, raw_input, reload, > round, and xrange. aehm, did i miss anything or was your attachment lost? 
> thinks-python-can-be-faster-than-c-ly yours, depends-on-what-you-measure-anyway-ly y'rs, holger From scott at fenton.baltimore.md.us Sun Jan 26 17:33:52 2003 From: scott at fenton.baltimore.md.us (Scott Fenton) Date: Sun, 26 Jan 2003 11:33:52 -0500 Subject: [pypy-dev] New __builtin__ In-Reply-To: <20030126172803.D22025@prim.han.de> References: <20030124132523.GA3767@debian.fenton.baltimore.md.us> <20030126172803.D22025@prim.han.de> Message-ID: <20030126163352.GA21613@fenton.baltimore.md.us> Oops. Sorry about that. The code, as before, is at http://fenton.balitmore.md.us/pypy.py On Sun, Jan 26, 2003 at 05:28:03PM +0100, holger krekel wrote: > [Scott Fenton Fri, Jan 24, 2003 at 08:25:23AM -0500] > > Yes, everyone, it is the moment you have all dreaded. > > A new version of pure python __builtin__s. This version > > contains nudity, sexual references, chr implemented > > using '%c'%i, an optimized ord, multiarg map and > > zip, and the new functions unichr, oct, hex, staticmethod, > > *attr (for * in (get,has,set)), isinstance, issubclass, > > and slice. I did not include classmethod as this seems > > to warrent further discussion. Stuff I think can be easily > > done in python that haven't been yet: dir, raw_input, reload, > > round, and xrange. > > aehm, did i miss anything or was your attachment lost? > > > thinks-python-can-be-faster-than-c-ly yours, > > depends-on-what-you-measure-anyway-ly y'rs, > > holger -- char m[9999],*n[99],*r=m,*p=m+5000,**s=n,d,c;main(){for(read(0,r,4000);c=*r; r++)c-']'||(d>1||(r=*p?*s:(--s,r)),!d||d--),c-'['||d++||(*++s=r),d||(*p+=c== '+',*p-=c=='-',p+=c=='>',p-=c=='<',c-'.'||write(1,p,1),c-','||read(2,p,1));} From hpk at trillke.net Sun Jan 26 18:19:23 2003 From: hpk at trillke.net (holger krekel) Date: Sun, 26 Jan 2003 18:19:23 +0100 Subject: [pypy-dev] New __builtin__ In-Reply-To: <20030126163352.GA21613@fenton.baltimore.md.us>; from scott@fenton.baltimore.md.us on Sun, Jan 26, 2003 at 11:33:52AM -0500 References: <20030124132523.GA3767@debian.fenton.baltimore.md.us> <20030126172803.D22025@prim.han.de> <20030126163352.GA21613@fenton.baltimore.md.us> Message-ID: <20030126181923.F22025@prim.han.de> [Scott Fenton Sun, Jan 26, 2003 at 11:33:52AM -0500] > Oops. Sorry about that. The code, as before, is at > http://fenton.balitmore.md.us/pypy.py can't reach it from here. But that's probably just me. Somehow i get a lot of DNS-resolve problems though i can access *some* sites just fine. I don't yet see the pattern. holger From bokr at oz.net Sun Jan 26 20:17:31 2003 From: bokr at oz.net (Bengt Richter) Date: Sun, 26 Jan 2003 11:17:31 -0800 Subject: [pypy-dev] New __builtin__ In-Reply-To: <20030126181923.F22025@prim.han.de> References: <20030126163352.GA21613@fenton.baltimore.md.us> <20030124132523.GA3767@debian.fenton.baltimore.md.us> <20030126172803.D22025@prim.han.de> <20030126163352.GA21613@fenton.baltimore.md.us> Message-ID: <5.0.2.1.1.20030126111200.00a6be30@mail.oz.net> At 18:19 2003-01-26 +0100, holger krekel wrote: >[Scott Fenton Sun, Jan 26, 2003 at 11:33:52AM -0500] >> Oops. Sorry about that. The code, as before, is at >> http://fenton.balitmore.md.us/pypy.py ^^--transposed typo: Baltimore is a us city ;-) >can't reach it from here. But that's probably just >me. Somehow i get a lot of DNS-resolve problems though >i can access *some* sites just fine. I don't yet >see the pattern. 
This just worked for me: http://fenton.baltimore.md.us/pypy.py Cheers, Bengt From tismer at tismer.com Mon Jan 27 00:05:13 2003 From: tismer at tismer.com (Christian Tismer) Date: Mon, 27 Jan 2003 00:05:13 +0100 Subject: [pypy-dev] New __builtin__ In-Reply-To: <20030126163352.GA21613@fenton.baltimore.md.us> References: <20030124132523.GA3767@debian.fenton.baltimore.md.us> <20030126172803.D22025@prim.han.de> <20030126163352.GA21613@fenton.baltimore.md.us> Message-ID: <3E3469A9.1050806@tismer.com> Scott Fenton wrote: > Oops. Sorry about that. The code, as before, is at > http://fenton.balitmore.md.us/pypy.py http://fenton.baltimore.md.us/pypy.py Please, do us the favor and never type an URL in, but always browse to it and do a correct copy and paste. many thanks - chris From scott at fenton.baltimore.md.us Mon Jan 27 00:00:53 2003 From: scott at fenton.baltimore.md.us (Scott Fenton) Date: Sun, 26 Jan 2003 18:00:53 -0500 Subject: [pypy-dev] New __builtin__ In-Reply-To: <3E3469A9.1050806@tismer.com> References: <20030124132523.GA3767@debian.fenton.baltimore.md.us> <20030126172803.D22025@prim.han.de> <20030126163352.GA21613@fenton.baltimore.md.us> <3E3469A9.1050806@tismer.com> Message-ID: <20030126230053.GA23876@fenton.baltimore.md.us> My apologies. I was in a rush to dash off the email, and my fingers slipped. -Scott On Mon, Jan 27, 2003 at 12:05:13AM +0100, Christian Tismer wrote: > Scott Fenton wrote: > >Oops. Sorry about that. The code, as before, is at > >http://fenton.balitmore.md.us/pypy.py > > http://fenton.baltimore.md.us/pypy.py > > Please, do us the favor and never type an URL > in, but always browse to it and do a correct > copy and paste. > > many thanks - chris > -- char m[9999],*n[99],*r=m,*p=m+5000,**s=n,d,c;main(){for(read(0,r,4000);c=*r; r++)c-']'||(d>1||(r=*p?*s:(--s,r)),!d||d--),c-'['||d++||(*++s=r),d||(*p+=c== '+',*p-=c=='-',p+=c=='>',p-=c=='<',c-'.'||write(1,p,1),c-','||read(2,p,1));} -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: not available URL: From hpk at trillke.net Mon Jan 27 01:17:31 2003 From: hpk at trillke.net (holger krekel) Date: Mon, 27 Jan 2003 01:17:31 +0100 Subject: [pypy-dev] New __builtin__ In-Reply-To: <20030126230053.GA23876@fenton.baltimore.md.us>; from scott@fenton.baltimore.md.us on Sun, Jan 26, 2003 at 06:00:53PM -0500 References: <20030124132523.GA3767@debian.fenton.baltimore.md.us> <20030126172803.D22025@prim.han.de> <20030126163352.GA21613@fenton.baltimore.md.us> <3E3469A9.1050806@tismer.com> <20030126230053.GA23876@fenton.baltimore.md.us> Message-ID: <20030127011731.M22025@prim.han.de> [Scott Fenton Sun, Jan 26, 2003 at 06:00:53PM -0500] > My apologies. I was in a rush to dash off the > email, and my fingers slipped. i tested your code with the Python-2.2.2 test_builtin.py (found in Python-2.2.2/Lib/test, has some dependencies). 1) your code does not import cleanly (some typos here and there) 2) the tests fail right at 'abs', the first builtin because you don't throw a TypeError if the argument is not a number. I take it that you didn't seriously intend your code to be used as is. Nevertheless, it would be nice to have a python implementation of the builtin module that passes its CPython's tests. Also the docstrings should be copied from Python-2.3. This can be done automatically by writing a script and iterating over vars(__import__('__builtin__')) and using the inspect module. 
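A rough sketch of such a script, written only as an illustration: copy_docstrings is an invented name, target_module stands for the pure-Python reimplementation, and on Python 3 the source module is called builtins rather than __builtin__.

import inspect

try:
    import builtins as real_builtins      # Python 3 name
except ImportError:
    import __builtin__ as real_builtins   # Python 2 name, as in the thread

def copy_docstrings(target_module):
    # copy docstrings of the real builtins onto same-named callables
    # in target_module (the pure-Python reimplementation)
    copied = []
    for name, obj in vars(real_builtins).items():
        replacement = getattr(target_module, name, None)
        if replacement is None or not callable(replacement):
            continue
        doc = inspect.getdoc(obj)
        if doc and not getattr(replacement, '__doc__', None):
            try:
                replacement.__doc__ = doc
            except (AttributeError, TypeError):
                continue      # some objects have a read-only __doc__
            copied.append(name)
    return sorted(copied)

Called as copy_docstrings(pypy_builtin_module) it returns the names it touched, which makes it easy to see what is still missing.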
I hopefully can do this soon and check it into the upcoming repository. Or somebody else does it before me. Note, that you cannot automize extraction of signatures of the builtin callables in all cases. C-implemented builtins don't have signature-introspection methods but often have a conventional (and quite parseable) description in the first line of the docstring, e.g. >>> print abs.__doc__ abs(number) -> number cheers, holger From scott at fenton.baltimore.md.us Mon Jan 27 01:24:06 2003 From: scott at fenton.baltimore.md.us (Scott Fenton) Date: Sun, 26 Jan 2003 19:24:06 -0500 Subject: [pypy-dev] New __builtin__ In-Reply-To: <20030127011731.M22025@prim.han.de> References: <20030124132523.GA3767@debian.fenton.baltimore.md.us> <20030126172803.D22025@prim.han.de> <20030126163352.GA21613@fenton.baltimore.md.us> <3E3469A9.1050806@tismer.com> <20030126230053.GA23876@fenton.baltimore.md.us> <20030127011731.M22025@prim.han.de> Message-ID: <20030127002406.GA24054@fenton.baltimore.md.us> On Mon, Jan 27, 2003 at 01:17:31AM +0100, holger krekel wrote: > i tested your code with the Python-2.2.2 test_builtin.py > (found in Python-2.2.2/Lib/test, has some dependencies). > > 1) your code does not import cleanly (some typos here > and there) > > 2) the tests fail right at 'abs', the first builtin > because you don't throw a TypeError if the argument > is not a number. > > I take it that you didn't seriously intend your code > to be used as is. Them thar's fightin' words, pardner > > Nevertheless, it would be nice to have a python > implementation of the builtin module that passes > its CPython's tests. Also the docstrings should be > copied from Python-2.3. This can be done automatically > by writing a script and iterating over > vars(__import__('__builtin__')) and using the inspect > module. I hopefully can do this soon and check it into > the upcoming repository. Or somebody else does it before > me. > Don't worry. I'll take care of it. I just wanted to get a rough sketch out first > Note, that you cannot automize extraction > of signatures of the builtin callables in all cases. > C-implemented builtins don't have signature-introspection > methods but often have a conventional (and quite parseable) > description in the first line of the docstring, e.g. > > >>> print abs.__doc__ > abs(number) -> number > Or you do what I do and copy and paste them. > cheers, > > holger more cheers, -Scott -- char m[9999],*n[99],*r=m,*p=m+5000,**s=n,d,c;main(){for(read(0,r,4000);c=*r; r++)c-']'||(d>1||(r=*p?*s:(--s,r)),!d||d--),c-'['||d++||(*++s=r),d||(*p+=c== '+',*p-=c=='-',p+=c=='>',p-=c=='<',c-'.'||write(1,p,1),c-','||read(2,p,1));} From hpk at trillke.net Mon Jan 27 10:48:39 2003 From: hpk at trillke.net (holger krekel) Date: Mon, 27 Jan 2003 10:48:39 +0100 Subject: [pypy-dev] New __builtin__ In-Reply-To: <20030127002406.GA24054@fenton.baltimore.md.us>; from scott@fenton.baltimore.md.us on Sun, Jan 26, 2003 at 07:24:06PM -0500 References: <20030124132523.GA3767@debian.fenton.baltimore.md.us> <20030126172803.D22025@prim.han.de> <20030126163352.GA21613@fenton.baltimore.md.us> <3E3469A9.1050806@tismer.com> <20030126230053.GA23876@fenton.baltimore.md.us> <20030127011731.M22025@prim.han.de> <20030127002406.GA24054@fenton.baltimore.md.us> Message-ID: <20030127104839.N22025@prim.han.de> [Scott Fenton Sun, Jan 26, 2003 at 07:24:06PM -0500] > On Mon, Jan 27, 2003 at 01:17:31AM +0100, holger krekel wrote: > > i tested your code with the Python-2.2.2 test_builtin.py > > (found in Python-2.2.2/Lib/test, has some dependencies). 
> > > > 1) your code does not import cleanly (some typos here > > and there) > > > > 2) the tests fail right at 'abs', the first builtin > > because you don't throw a TypeError if the argument > > is not a number. > > > > I take it that you didn't seriously intend your code > > to be used as is. > > Them thar's fightin' words, pardner > > > > > Nevertheless, it would be nice to have a python > > implementation of the builtin module that passes > > its CPython's tests. Also the docstrings should be > > copied from Python-2.3. This can be done automatically > > by writing a script and iterating over > > vars(__import__('__builtin__')) and using the inspect > > module. I hopefully can do this soon and check it into > > the upcoming repository. Or somebody else does it before > > me. > > > > Don't worry. I'll take care of it. I just wanted to get > a rough sketch out first great. sorry if my posting sounded to harsh. holger From arigo at tunes.org Mon Jan 27 13:18:00 2003 From: arigo at tunes.org (Armin Rigo) Date: Mon, 27 Jan 2003 13:18:00 +0100 Subject: [pypy-dev] Builtin types In-Reply-To: <00fe01c2c3c8$74474ae0$6d94fea9@newmexico> References: <3E316902.1060605@tismer.com> <00fe01c2c3c8$74474ae0$6d94fea9@newmexico> Message-ID: <20030127121800.GA18470@magma.unil.ch> Hello, On Fri, Jan 24, 2003 at 05:48:42PM +0100, Samuele Pedroni wrote: > OTOH I think a higher level of abstraction is necessary to targert more general > backends. I agree with Samuele that we should not focus on ctypes or any other kind of structs right now. For all of ctypes' power I believe that it is not central to Python-in-Python. This will become important later, when we target C. > def binary_add(o1,o2): ... > > (From my experience) a relevant issue is how to abstract over the PyTypeObject > struct and the interpreter internal inheritance and lookup mechanisms (which > do not correspond so directly to the language level semantics). That's a point I would like to see discussed too. CPython has grown a quite complex set of routines to dispatch calls corresponding to the language operators. We could closely follow these algorithms, e.g. by translating PyNumber_Add() into a "number_add()" function testing the presence of the "nb_add" method on the arguments. This will be as messy as in CPython, but automatically gives exactly the same semantics. On the other hand, this dispatchers are heavily loaded with historical stuff and workarounds and could probably be better summarized in a higher-level implementation, taking into account the general direction that Python seems to evolve to. We could probably design something that still offers compatibility. For example, we might find out a general rule of multiple-dispatching that corresponds to what CPython conceptually does. In other words we could figure out a declarative way of saying which "nb_add" methods must be tried and in which order. Something that would find the set of all appliable "nb_add" methods, order them, and try them in that order until one of them succeeds. This is complicated by internal coercions (the nb_coerce method), which tend to disappear from Python. We will have to choose whether we keep it in the core, at the possible expense of conceptuality, or if we completely drop them from the core classes (and possibly re-implement them later, e.g. using wrapper classes around the so-called old-style-numbers classes). A bient?t, Armin. 
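As an illustration of the dispatch rule sketched in the mail above (collect the applicable "add" implementations, order them, try each until one succeeds), here is a toy version phrased with Python's __add__/__radd__ convention. binary_add and candidate_adds are invented names; this is not CPython's actual binop algorithm, only the shape of a declarative one.

# toy declarative dispatch for "+": gather candidates in order, then try
# them until one does not answer NotImplemented
def candidate_adds(o1, o2):
    if type(o2) is not type(o1) and isinstance(o2, type(o1)):
        # a subclass's reflected method gets the first chance
        yield lambda: type(o2).__radd__(o2, o1)
    yield lambda: type(o1).__add__(o1, o2)
    if type(o1) is not type(o2):
        yield lambda: type(o2).__radd__(o2, o1)

def binary_add(o1, o2):
    for attempt in candidate_adds(o1, o2):
        try:
            result = attempt()
        except AttributeError:
            continue                      # this operand has no such slot
        if result is not NotImplemented:
            return result
    raise TypeError("unsupported operand types for +: %s and %s"
                    % (type(o1).__name__, type(o2).__name__))

assert binary_add(2, 3) == 5
assert binary_add(2, 3.5) == 5.5          # falls through to float.__radd__
assert binary_add([1], [2]) == [1, 2]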
From mwh at python.net Mon Jan 27 17:07:17 2003 From: mwh at python.net (Michael Hudson) Date: 27 Jan 2003 16:07:17 +0000 Subject: [pypy-dev] Re: New __builtin__ References: <20030124132523.GA3767@debian.fenton.baltimore.md.us> Message-ID: <2mk7gq4lvu.fsf@starship.python.net> Scott Fenton writes: > Stuff I think can be easily done in python that haven't been yet: > dir, raw_input, reload, round, and xrange. dir is probably possible but a bit tedious. raw_input is impossible if you what full compatibility AFAICT (readline). round, xrange, reload should all be pretty easy. Cheers, M. From nathanh at zu.com Mon Jan 27 17:43:01 2003 From: nathanh at zu.com (Nathan Heagy) Date: Mon, 27 Jan 2003 10:43:01 -0600 Subject: [pypy-dev] Builtin types In-Reply-To: <20030127121800.GA18470@magma.unil.ch> Message-ID: <65F427AC-3216-11D7-99B7-00039385F5E6@zu.com> >> OTOH I think a higher level of abstraction is necessary to targert >> more general >> backends. > > I agree with Samuele that we should not focus on ctypes or any other > kind of > structs right now. For all of ctypes' power I believe that it is not > central > to Python-in-Python. This will become important later, when we target > C. So you're suggesting backends besides C? That's a good idea since it would allow us to in fact build a backend in C# which I brought up as a joke earlier (no one seemed to have noticed the j/k) but which might not be such a bad idea. A java backend would even be possible. If most of the code is Python anyways and a minimal amount is needed in another language to get things running then there is huge potential to target other platforms, etc. It would even aid PPC/OS X support which is something I greatly desire. -- Nathan Heagy phone:306.653.4747 fax:306.653.4774 http://www.zu.com From hpk at trillke.net Mon Jan 27 18:00:55 2003 From: hpk at trillke.net (holger krekel) Date: Mon, 27 Jan 2003 18:00:55 +0100 Subject: [pypy-dev] Builtin types In-Reply-To: <20030127121800.GA18470@magma.unil.ch>; from arigo@tunes.org on Mon, Jan 27, 2003 at 01:18:00PM +0100 References: <3E316902.1060605@tismer.com> <00fe01c2c3c8$74474ae0$6d94fea9@newmexico> <20030127121800.GA18470@magma.unil.ch> Message-ID: <20030127180055.V22025@prim.han.de> [Armin Rigo Mon, Jan 27, 2003 at 01:18:00PM +0100] > Hello, > > On Fri, Jan 24, 2003 at 05:48:42PM +0100, Samuele Pedroni wrote: > > OTOH I think a higher level of abstraction is necessary to targert more general > > backends. > > I agree with Samuele that we should not focus on ctypes or any other kind of > structs right now. For all of ctypes' power I believe that it is not central > to Python-in-Python. This will become important later, when we target C. how do you intend to use any of the existing C-libraries, then? Rely on CPython to provide the binding? I think that progressing in the ctypes direction can happen in parallel with Python-Core pythonifications. Beeing able to make C-library calls (like File-IO) without specialized intermediate C-code does seem like an important feature. 
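For readers who have not seen it, a minimal example of the kind of glue-free C call meant here. It is illustration only: it uses today's ctypes API (the 2003 API differed in details) and assumes a Unix-like libc.

import ctypes, ctypes.util

libc = ctypes.CDLL(ctypes.util.find_library("c"))

# describe the C signature:  size_t strlen(const char *s);
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t

assert libc.strlen(b"pypy") == 4      # no hand-written C wrapper involved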
holger From pedro_rodriguez at club-internet.fr Mon Jan 27 18:28:23 2003 From: pedro_rodriguez at club-internet.fr (Pedro Rodriguez) Date: Mon, 27 Jan 2003 18:28:23 +0100 Subject: [pypy-dev] Re: New __builtin__ References: <20030124132523.GA3767@debian.fenton.baltimore.md.us> <2mk7gq4lvu.fsf@starship.python.net> Message-ID: On Mon, 27 Jan 2003 17:07:17 +0100, Michael Hudson wrote: > Scott Fenton writes: > >> Stuff I think can be easily done in python that haven't been yet: dir, >> raw_input, reload, round, and xrange. > > dir is probably possible but a bit tedious. > Don't know if this suits you, but some time ago I posted a dir version I rewrote dir based on C code to emulate 2.2 behaviour in 1.5.2. Didn't check if 'dir' code evolved since then. OTH, Pedro # new_dir.py """ Restranscript of python 2.2 dir() builtin. """ import types import sys def merge_class_dict(dict, aclass): classdict = getattr(aclass, "__dict__", None) if classdict is not None: dict.update(classdict) bases = getattr(aclass, "__bases__", None) if bases is not None: for base in bases: merge_class_dict(dict, base) def merge_list_attr(dict, obj, attrname): list = getattr(obj, attrname, None) if list is not None: for item in list: if type(item) is types.StringType: dict[item] = None def new_dir(obj=None): objType = type(obj) if obj is None: result = locals().keys() elif objType is types.ModuleType: result = obj.__dict__.keys() elif objType is types.ClassType or objType is types.TypeType: dict = {} merge_class_dict(dict, obj) result = dict.keys() else: dict = getattr(obj, "__dict__", None) if dict is None: dict = {} elif type(dict) is not types.DictType: dict = {} else: dict = dict.copy() merge_list_attr(dict, obj, "__members__") merge_list_attr(dict, obj, "__methods__") itsclass = getattr(obj, "__class__", None) if itsclass is not None: merge_class_dict(dict, itsclass) result = dict.keys() return result try: major, minor, micro, releaselevel, serial = sys.version_info if (major, minor) >= (2, 2): new_dir = dir except: pass From DavidA at ActiveState.com Mon Jan 27 18:58:01 2003 From: DavidA at ActiveState.com (David Ascher) Date: Mon, 27 Jan 2003 09:58:01 -0800 Subject: [pypy-dev] Questions for Armin In-Reply-To: <20030119185022.6FCD495E@bespin.org> References: <001001c2be5b$d1bf1ee0$bba4aad8@computer> <20030118190053.163AD1F8E@bespin.org> <002d01c2bf74$e7f83470$bba4aad8@computer> <20030119185022.6FCD495E@bespin.org> Message-ID: <3E357329.6060900@ActiveState.com> Armin Rigo wrote: > I am a bit afraid of what would have to be done to interface the "core" of GCC > with Psyco, but this is mainly because I never digged to deeply into GCC. I > am sure it can be done, and it would certainly be a great thing. I am sure > that your experience in this domain would be most profitable :-) FYI, some of the key GCC maintainers also have an interest in Python, and might be interested in helping out. Let me know if and when I should make introductions. 
-- david From pedronis at bluewin.ch Mon Jan 27 20:03:40 2003 From: pedronis at bluewin.ch (Samuele Pedroni) Date: Mon, 27 Jan 2003 20:03:40 +0100 Subject: [pypy-dev] Builtin types References: <3E316902.1060605@tismer.com><00fe01c2c3c8$74474ae0$6d94fea9@newmexico><20030127121800.GA18470@magma.unil.ch> <20030127180055.V22025@prim.han.de> Message-ID: <02b001c2c636$ce65b1e0$6d94fea9@newmexico> ----- Original Message ----- From: "holger krekel" To: "Armin Rigo" ; Sent: Monday, January 27, 2003 6:00 PM Subject: Re: [pypy-dev] Builtin types > [Armin Rigo Mon, Jan 27, 2003 at 01:18:00PM +0100] > > Hello, > > > > On Fri, Jan 24, 2003 at 05:48:42PM +0100, Samuele Pedroni wrote: > > > OTOH I think a higher level of abstraction is necessary to targert more general > > > backends. > > > > I agree with Samuele that we should not focus on ctypes or any other kind of > > structs right now. For all of ctypes' power I believe that it is not central > > to Python-in-Python. This will become important later, when we target C. > > how do you intend to use any of the existing C-libraries, then? > Rely on CPython to provide the binding? > > I think that progressing in the ctypes direction can happen in > parallel with Python-Core pythonifications. Beeing able to make > C-library calls (like File-IO) without specialized intermediate C-code > does seem like an important feature. The point was whether you want your builtin types "abstractions" to be directly ctypes based. Or reformulated as a question: Is a goal to target some reasonable other virtual machines /languages/object models as execution substrate? (or do you want to limit yourself to implement some C/native code reedition/evolution of CPython) No is obviously a fine anwer. OTOH I think it is important to answer better sooner than later, because if one thinks that simply because this is (some) Python in Python the problem is automatically solved, he is having the wrong intuition. regards. From hpk at trillke.net Mon Jan 27 20:22:59 2003 From: hpk at trillke.net (holger krekel) Date: Mon, 27 Jan 2003 20:22:59 +0100 Subject: [pypy-dev] Builtin types In-Reply-To: <02b001c2c636$ce65b1e0$6d94fea9@newmexico>; from pedronis@bluewin.ch on Mon, Jan 27, 2003 at 08:03:40PM +0100 References: <3E316902.1060605@tismer.com><00fe01c2c3c8$74474ae0$6d94fea9@newmexico><20030127121800.GA18470@magma.unil.ch> <20030127180055.V22025@prim.han.de> <02b001c2c636$ce65b1e0$6d94fea9@newmexico> Message-ID: <20030127202259.X22025@prim.han.de> [Samuele Pedroni Mon, Jan 27, 2003 at 08:03:40PM +0100] > > ----- Original Message ----- > From: "holger krekel" > To: "Armin Rigo" ; > Sent: Monday, January 27, 2003 6:00 PM > Subject: Re: [pypy-dev] Builtin types > > > > [Armin Rigo Mon, Jan 27, 2003 at 01:18:00PM +0100] > > > Hello, > > > > > > On Fri, Jan 24, 2003 at 05:48:42PM +0100, Samuele Pedroni wrote: > > > > OTOH I think a higher level of abstraction is necessary to targert more > general > > > > backends. > > > > > > I agree with Samuele that we should not focus on ctypes or any other kind > of > > > structs right now. For all of ctypes' power I believe that it is not > central > > > to Python-in-Python. This will become important later, when we target C. > > > > how do you intend to use any of the existing C-libraries, then? > > Rely on CPython to provide the binding? > > > > I think that progressing in the ctypes direction can happen in > > parallel with Python-Core pythonifications. 
Beeing able to make > > C-library calls (like File-IO) without specialized intermediate C-code > > does seem like an important feature. > > The point was whether you want your builtin types "abstractions" to be directly > ctypes based. I took Armins posting more generally as not to do anything with ctypes at all. If we are only talking about how to represent types, then i agree that we shouldn't base this on ctypes now. holger From theller at python.net Mon Jan 27 21:56:12 2003 From: theller at python.net (Thomas Heller) Date: 27 Jan 2003 21:56:12 +0100 Subject: [pypy-dev] Builtin types In-Reply-To: <20030127121800.GA18470@magma.unil.ch> References: <3E316902.1060605@tismer.com> <00fe01c2c3c8$74474ae0$6d94fea9@newmexico> <20030127121800.GA18470@magma.unil.ch> Message-ID: Armin Rigo writes: > > def binary_add(o1,o2): ... > > > > (From my experience) a relevant issue is how to abstract over the PyTypeObject > > struct and the interpreter internal inheritance and lookup mechanisms (which > > do not correspond so directly to the language level semantics). > > That's a point I would like to see discussed too. CPython has grown a quite > complex set of routines to dispatch calls corresponding to the language > operators. We could closely follow these algorithms, e.g. by translating > PyNumber_Add() into a "number_add()" function testing the presence of the > "nb_add" method on the arguments. This will be as messy as in CPython, but > automatically gives exactly the same semantics. > > On the other hand, this dispatchers are heavily loaded with historical stuff > and workarounds and could probably be better summarized in a higher-level > implementation, taking into account the general direction that Python seems to > evolve to. We could probably design something that still offers > compatibility. For example, we might find out a general rule of > multiple-dispatching that corresponds to what CPython conceptually does. In > other words we could figure out a declarative way of saying which "nb_add" > methods must be tried and in which order. Something that would find the set > of all appliable "nb_add" methods, order them, and try them in that order > until one of them succeeds. > > This is complicated by internal coercions (the nb_coerce method), which tend > to disappear from Python. We will have to choose whether we keep it in the > core, at the possible expense of conceptuality, or if we completely drop them > from the core classes I would very much like to see aa easy to read and understand core which is free of this cruft (even if it is not 100% compatible with CPython). > (and possibly re-implement them later, e.g. using > wrapper classes around the so-called old-style-numbers classes). Even better if this would be possible. Maybe later we can remove the wrapper classes and call the result Python 3000 ;-). No, only joking... 
Thomas From theller at python.net Mon Jan 27 22:00:08 2003 From: theller at python.net (Thomas Heller) Date: 27 Jan 2003 22:00:08 +0100 Subject: ctypes news (was Re: [pypy-dev] __builtin__ module) In-Reply-To: <20030125213550.A10805@prim.han.de> References: <3E2ECFAF.3010104@strakt.com> <5.0.2.1.1.20030122130959.00a9b800@mail.oz.net> <20030123094119.GA22778@magma.unil.ch> <20030123203206.GA1986@debian.fenton.baltimore.md.us> <20030124093935.GC6665@magma.unil.ch> <20030124124313.GA3626@debian.fenton.baltimore.md.us> <20030124144842.W10805@prim.han.de> <20030125213550.A10805@prim.han.de> Message-ID: <4r7ufgvb.fsf@python.net> holger krekel writes: > > Here are some news on ctypes: > > > > Development has has moved to SF: http://sourceforge.net/projects/ctypes. > > Thomas, can you imagine the forthcoming pypy-repository and your ctypes-repo > to merge at some point? CTypes would of course still have its own directory > and could be separately tagged etc. > > Here, I am already thinking about how we can actually > manage our upcoming coding at the Sprint. I think that > having "everything neccessary" in one repo would be quite > convenient. Why not (if ctypes is really that important for pypy)? We can always merge later again, keep a forked ctypes, or whatever. OTOH, I only have a very vague impression how ctypes could help pypy, except for maybe bootstrapping, or accessing a small core implemented in C. And for this, ctypes should be fine already, and, if it's not I would like to merge your patches in. Thomas From bokr at oz.net Mon Jan 27 23:30:17 2003 From: bokr at oz.net (Bengt Richter) Date: Mon, 27 Jan 2003 14:30:17 -0800 Subject: [pypy-dev] Builtin types In-Reply-To: <02b001c2c636$ce65b1e0$6d94fea9@newmexico> References: <3E316902.1060605@tismer.com> <00fe01c2c3c8$74474ae0$6d94fea9@newmexico> <20030127121800.GA18470@magma.unil.ch> <20030127180055.V22025@prim.han.de> Message-ID: <5.0.2.1.1.20030127131058.00a68460@mail.oz.net> At 20:03 2003-01-27 +0100, Samuele Pedroni wrote: >----- Original Message ----- >From: "holger krekel" >To: "Armin Rigo" ; >Sent: Monday, January 27, 2003 6:00 PM >Subject: Re: [pypy-dev] Builtin types > > >> [Armin Rigo Mon, Jan 27, 2003 at 01:18:00PM +0100] >> > Hello, >> > >> > On Fri, Jan 24, 2003 at 05:48:42PM +0100, Samuele Pedroni wrote: >> > > OTOH I think a higher level of abstraction is necessary to targert more >general >> > > backends. >> > >> > I agree with Samuele that we should not focus on ctypes or any other kind >of >> > structs right now. For all of ctypes' power I believe that it is not >central >> > to Python-in-Python. This will become important later, when we target C. >> >> how do you intend to use any of the existing C-libraries, then? >> Rely on CPython to provide the binding? >> >> I think that progressing in the ctypes direction can happen in >> parallel with Python-Core pythonifications. Beeing able to make >> C-library calls (like File-IO) without specialized intermediate C-code >> does seem like an important feature. > >The point was whether you want your builtin types "abstractions" to be directly >ctypes based. IMO (FWIW) no, but OTOH I think the functionality is needed. So in order to get the "abstractions" right, perhaps a thin wrapper around ctypes would be a practical near-term step. Then the question becomes what the "abstractions" involved in calling on ctypes really are, and what that thin wrapper should look like. 
It is easy to draw a line and say crossing it is an OS API call, but I am thinking the PyPy situation is more complex than that, and instead of lines, a foam of nesting bubble boundaries may be needed ;-) >Or reformulated as a question: > >Is a goal to target some reasonable other virtual machines /languages/object >models as execution substrate? >(or do you want to limit yourself to implement some C/native code >reedition/evolution of CPython) I would think a goal would be to retain the options, if at all possible. ISTM there is also the possibility of source-source down-translation to a subset of the same language, as an end as well as a bootstrapping mechanism. E.g., a common subset of Python 1.5.2 and 2.2.2, viewing a Python 1.5.2 as a virtual machine operating on source byte streams, analogous to ceval.c operating on byte codes. All of the above are really abstract views of what a CPU is doing at any given moment in some context, since a CPU is always doing the work with raw bits in registers and on-chip cache, getting and disposing of bits by electrically signaling other devices (mostly memory chips). I am trying to get my head around this in terms of mega and nano views of CLLG -- i.e., compile-link-load-go and associated resource management and dispatching of control. Somehow I suspect there has to be abstractions for these things expressible in the language in a more fundamental way, that reflects cleaner and more comprehensive abstractions than a single-boundary interface of OS api and/or ctypes calls. But I'm not there yet, and not advising anyone to hold their breath. Just offering preliminary thoughts for the conceptual pot-luck ;-) >No is obviously a fine anwer. > >OTOH I think it is important to answer better sooner than later, because if one >thinks that simply because this is (some) Python in Python the problem is >automatically solved, he is having the wrong intuition. There is a lot to think about, but it is fun ;-) Best regards, Bengt From arigo at tunes.org Tue Jan 28 00:02:49 2003 From: arigo at tunes.org (Armin Rigo) Date: Mon, 27 Jan 2003 15:02:49 -0800 (PST) Subject: [pypy-dev] Builtin types In-Reply-To: <02b001c2c636$ce65b1e0$6d94fea9@newmexico>; from pedronis@bluewin.ch on Mon, Jan 27, 2003 at 08:03:40PM +0100 References: <3E316902.1060605@tismer.com><00fe01c2c3c8$74474ae0$6d94fea9@newmexico><20030127121800.GA18470@magma.unil.ch> <20030127180055.V22025@prim.han.de> <02b001c2c636$ce65b1e0$6d94fea9@newmexico> Message-ID: <20030127230249.CC9984A6B@bespin.org> Hello Holger, On Mon, Jan 27, 2003 at 08:03:40PM +0100, Samuele Pedroni wrote: > > how do you intend to use any of the existing C-libraries, then? > > Rely on CPython to provide the binding? > > The point was whether you want your builtin types "abstractions" to be directly > ctypes based. Yes, sorry. I was thinking about that, i.e. how we internally represent the built-in types. Being able to call external C code is another matter. > > I think that progressing in the ctypes direction can happen in > > parallel with Python-Core pythonifications. Beeing able to make > > C-library calls (like File-IO) without specialized intermediate C-code > > does seem like an important feature. Yes, although I have another potential objection here. It might not be a problem to have specialized intermediate C code if this code is generated, just like -- after all it's the goal -- most of the rest of the final C code of the interpreter. 
Then what we need is a way to *describe* calls to a C function rather than a way to actually *do* the calls. So ctypes is a good way to express such a description, but it is not necessary to rely on libffi-style machine-code hackery to actually perform the calls; all we need to do is statically emit the specialized C code into the produced interpreter and obtain something close to the original CPython source. Of course I'm not the best person to talk about not liking machine-code hackery :-) This would certainly be a great thing to have. It could make the core interpreter dynamically extensible, lower its memory footprint, and tons of other benefits. I was just pointing out that the original Python interpreter that we intend to write in Python should not use ctypes directly, but only higher-level abstractions -- ones that could in some cases be automatically translated to ctypes calls. > Is a goal to target some reasonable other virtual machines /languages/object > models as execution substrate? Yes. Armin From tismer at tismer.com Tue Jan 28 00:18:37 2003 From: tismer at tismer.com (Christian Tismer) Date: Tue, 28 Jan 2003 00:18:37 +0100 Subject: [pypy-dev] Builtin types In-Reply-To: <5.0.2.1.1.20030127131058.00a68460@mail.oz.net> References: <3E316902.1060605@tismer.com> <00fe01c2c3c8$74474ae0$6d94fea9@newmexico> <20030127121800.GA18470@magma.unil.ch> <20030127180055.V22025@prim.han.de> <5.0.2.1.1.20030127131058.00a68460@mail.oz.net> Message-ID: <3E35BE4D.1080303@tismer.com> Bengt Richter wrote: > At 20:03 2003-01-27 +0100, Samuele Pedroni wrote: > > >>----- Original Message ----- >>From: "holger krekel" >>To: "Armin Rigo" ; >>Sent: Monday, January 27, 2003 6:00 PM >>Subject: Re: [pypy-dev] Builtin types >> >> >> >>>[Armin Rigo Mon, Jan 27, 2003 at 01:18:00PM +0100] >>> >>>>Hello, >>>> >>>>On Fri, Jan 24, 2003 at 05:48:42PM +0100, Samuele Pedroni wrote: >>>> >>>>>OTOH I think a higher level of abstraction is necessary to targert more >> >>general >> >>>>>backends. >>>> >>>>I agree with Samuele that we should not focus on ctypes or any other kind >> >>of >> >>>>structs right now. For all of ctypes' power I believe that it is not >> >>central >> >>>>to Python-in-Python. This will become important later, when we target C. >>> >>>how do you intend to use any of the existing C-libraries, then? >>>Rely on CPython to provide the binding? >>> >>>I think that progressing in the ctypes direction can happen in >>>parallel with Python-Core pythonifications. Beeing able to make >>>C-library calls (like File-IO) without specialized intermediate C-code >>>does seem like an important feature. >> >>The point was whether you want your builtin types "abstractions" to be directly >>ctypes based. > > IMO (FWIW) no, but OTOH I think the functionality is needed. So in order to get the > "abstractions" right, perhaps a thin wrapper around ctypes would be a practical > near-term step. Then the question becomes what the "abstractions" involved in > calling on ctypes really are, and what that thin wrapper should look like. > It is easy to draw a line and say crossing it is an OS API call, but I am thinking > the PyPy situation is more complex than that, and instead of lines, a foam of nesting > bubble boundaries may be needed ;-) The reason why I thought we would need something like ctypes is this: Plain Python has no way to describe physical memory layouts and primitive types by nature. There is the struct module with its limitations, but this is insufficient. 
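For comparison only (not part of the mail): the struct module describes a layout as a positional format string, while a ctypes-style description names and types each field and also gives fixed-width cells their C wrap-around behaviour. The Record layout below is invented for the example.

import struct
import ctypes

# struct: the layout is a format string, decoded positionally
fmt = "<IIi"                          # two unsigned 32-bit ints, one signed
packed = struct.pack(fmt, 1, 0xFFFFFFFF, -1)
assert struct.unpack(fmt, packed) == (1, 0xFFFFFFFF, -1)

# ctypes: the layout is a named, typed structure
class Record(ctypes.LittleEndianStructure):
    _pack_ = 1
    _fields_ = [("refcount", ctypes.c_uint32),
                ("flags",    ctypes.c_uint32),
                ("delta",    ctypes.c_int32)]

rec = Record.from_buffer_copy(packed)
assert (rec.refcount, rec.flags, rec.delta) == (1, 0xFFFFFFFF, -1)

# a fixed-width cell keeps C unsigned semantics: it wraps instead of growing
cell = ctypes.c_uint32(0xFFFFFFFF)
cell.value += 1
assert cell.value == 0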
Plain Python also does not have a way to describe restricted types at all, since it has no type declarations. The minor point was to be able to re-build existing C structures. This may become interesting when we really try to build compatibility. More urgent to me is to be able to describe integer cells of fixed width and other primitive types. They have the known semantics of primitive C types. If we use Python integers all the time to describe the C implementation of builtin types, we end up with lots of hairy tricks to describe how the do not overflow but wrap around, how unsigned integers are right-shifted without sign extension, and all of that. The idea is to bind that semantics to ctypes instances. Rethinking this initial idea, I admit that it is equally possible to do that with custom classes, which can be defined to have these semantics. I believe that we need these primitive types, or the re-implementation of Python innards will differ much more from the original than we intended. There alre already enough differences due to the different nature of the Python language. In order to keep as much of the existing code for an initial bootstrap, I don't believe it is good to have to re-think every and all internal modules in terms of different data types. Instead, I think it is easier to just focus on the changed language layout, lack of certain constructs and different loop layouts, but leaving most of the data type behavior as it is. A small example: For some benchmarking tests, I once re-implemented Python's MD5 module in Python, the best way I could. It ended up as a source, very similar to the original, and only slightly *longer*! This is due to the fact that the algorithm all the time made use of unsigned integers and their shifting properties. For my implementation, that became quite a nightmare of castings to long integer, together with masking with &ffffffff in order to keep the longs short. This is quite nasty, almost totally prevended optimization by Psyco, and was disappointing. The alternative to re-write the whole program to only use integer operations would have lead to even much more lines of code, and to a whole set of new complications, since every statement would have to be tested for the signs of the arguments. For the curious, I'd be happy to post this code for studies, and I'd like to encourage everybody who doesn't believe me to try to implement MD5 without using a single long integer. Conclusion: My wish to use ctypes or some similar abstraction for primitive types comes from the observation that it is not always trivial to model primitive types with Python's, and I think trying this is counter-productive, since we finally will *have* to use primitive types to get to a useful implementation. cheers - chris From tismer at tismer.com Tue Jan 28 00:21:46 2003 From: tismer at tismer.com (Christian Tismer) Date: Tue, 28 Jan 2003 00:21:46 +0100 Subject: [pypy-dev] Builtin types In-Reply-To: <20030127230249.CC9984A6B@bespin.org> References: <3E316902.1060605@tismer.com><00fe01c2c3c8$74474ae0$6d94fea9@newmexico><20030127121800.GA18470@magma.unil.ch> <20030127180055.V22025@prim.han.de> <02b001c2c636$ce65b1e0$6d94fea9@newmexico> <20030127230249.CC9984A6B@bespin.org> Message-ID: <3E35BF0A.6090702@tismer.com> Armin Rigo wrote: > Hello Holger, > > On Mon, Jan 27, 2003 at 08:03:40PM +0100, Samuele Pedroni wrote: > >>>how do you intend to use any of the existing C-libraries, then? >>>Rely on CPython to provide the binding? 
>> >>The point was whether you want your builtin types "abstractions" to be directly >>ctypes based. > > > Yes, sorry. I was thinking about that, i.e. how we internally represent the > built-in types. Being able to call external C code is another matter. > > >>>I think that progressing in the ctypes direction can happen in >>>parallel with Python-Core pythonifications. Beeing able to make >>>C-library calls (like File-IO) without specialized intermediate C-code >>>does seem like an important feature. > > > Yes, although I have another potential objection here. It might not be a > problem to have specialized intermediate C code if this code is generated, > just like -- after all it's the goal -- most of the rest of the final C code > of the interpreter. Then what we need is a way to *describe* calls to a C > function rather than a way to actually *do* the calls. So ctypes is a good > way to express such a description, but it is not necessary to rely on > libffi-style machine-code hackery to actually perform the calls; all we need > to do is statically emit the specialized C code into the produced interpreter > and obtain something close to the original CPython source. Hmm! It seems that I could have saved my last longer post. We agree that we need to describe primitive types. It is much less urgent to actually implement them. cheers - chris From hpk at trillke.net Tue Jan 28 01:22:55 2003 From: hpk at trillke.net (holger krekel) Date: Tue, 28 Jan 2003 01:22:55 +0100 Subject: [pypy-dev] Builtin types In-Reply-To: <20030127230249.CC9984A6B@bespin.org>; from arigo@tunes.org on Mon, Jan 27, 2003 at 03:02:49PM -0800 References: <3E316902.1060605@tismer.com><00fe01c2c3c8$74474ae0$6d94fea9@newmexico><20030127121800.GA18470@magma.unil.ch> <20030127180055.V22025@prim.han.de> <02b001c2c636$ce65b1e0$6d94fea9@newmexico> <20030127230249.CC9984A6B@bespin.org> Message-ID: <20030128012255.B22025@prim.han.de> [Armin Rigo Mon, Jan 27, 2003 at 03:02:49PM -0800] > Hello Holger, > > On Mon, Jan 27, 2003 at 08:03:40PM +0100, Samuele Pedroni wrote: > > > how do you intend to use any of the existing C-libraries, then? > > > Rely on CPython to provide the binding? > > > > The point was whether you want your builtin types "abstractions" to be directly > > ctypes based. > > Yes, sorry. I was thinking about that, i.e. how we internally represent the > built-in types. Being able to call external C code is another matter. ok. > > > I think that progressing in the ctypes direction can happen in > > > parallel with Python-Core pythonifications. Beeing able to make > > > C-library calls (like File-IO) without specialized intermediate C-code > > > does seem like an important feature. > > Yes, although I have another potential objection here. It might not be a > problem to have specialized intermediate C code if this code is generated, > just like -- after all it's the goal -- most of the rest of the final C code > of the interpreter. I am all for doing as much as possible as runtime. Beeing able to get a python c-library binding dynamically (without relying on a C-interpreter) makes it usable on platforms where you don't have the right C-compiler ready - besides just beeing a cool feature. Generating some C source for the interpreter itself still makes sense, though. But i'd like any code generation to remain simple - including the generator code itself. Maybe it makes sense to compile to a 'nucleus' VM which only has very few byte codes and whose implementation can be generated. 
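Purely to make the "nucleus VM" idea concrete, an invented toy (not a proposed instruction set): a stack machine with four byte codes, small enough that its dispatch loop could itself be emitted by a generator.

# toy nucleus VM: four byte codes and a dozen-line dispatch loop
PUSH, ADD, MUL, RET = range(4)

def run(code, consts):
    stack = []
    pc = 0
    while True:
        op = code[pc]; pc += 1
        if op == PUSH:
            stack.append(consts[code[pc]]); pc += 1
        elif op == ADD:
            b, a = stack.pop(), stack.pop(); stack.append(a + b)
        elif op == MUL:
            b, a = stack.pop(), stack.pop(); stack.append(a * b)
        elif op == RET:
            return stack.pop()

program = [PUSH, 0, PUSH, 1, ADD, PUSH, 2, MUL, RET]   # (2 + 3) * 7
assert run(program, consts=[2, 3, 7]) == 35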
IMO the complexity of (and dependency on) C source generators could be reduced this way. greetings, holger From bokr at oz.net Tue Jan 28 09:15:20 2003 From: bokr at oz.net (Bengt Richter) Date: Tue, 28 Jan 2003 00:15:20 -0800 Subject: [pypy-dev] Builtin types Message-ID: <5.0.2.1.1.20030127205733.00a52ec0@mail.oz.net> At 00:21 2003-01-28 +0100, Christian Tismer wrote: >Armin Rigo wrote: >>Hello Holger, >>On Mon, Jan 27, 2003 at 08:03:40PM +0100, Samuele Pedroni wrote: >> >>>>how do you intend to use any of the existing C-libraries, then? >>>>Rely on CPython to provide the binding? >>> >>>The point was whether you want your builtin types "abstractions" to be directly >>>ctypes based. >> >>Yes, sorry. I was thinking about that, i.e. how we internally represent the >>built-in types. Being able to call external C code is another matter. >> >>>>I think that progressing in the ctypes direction can happen in >>>>parallel with Python-Core pythonifications. Beeing able to make >>>>C-library calls (like File-IO) without specialized intermediate C-code >>>>does seem like an important feature. >> >>Yes, although I have another potential objection here. It might not be a >>problem to have specialized intermediate C code if this code is generated, >>just like -- after all it's the goal -- most of the rest of the final C code >>of the interpreter. Then what we need is a way to *describe* calls to a C >>function rather than a way to actually *do* the calls. So ctypes is a good Warning: preliminary thoughts, not implemented ;-) This focuses here on calling into C, but it is really a generic abstract approach, if you look at it that way ;-) meta_eval(i386_call_thunk_representation_as_bitstruct) where meta_eval expects a single structured Bits instance as an argument, and expects that to have info for extracting a type header first of all, and various other structure depending on that. I.e., it needs structure info as well as data bits, meaning the whole arg is really an arg list in packed form, composed of two args: a type name, and the data itself. For meta_eval use, I think a single name signifying the type of data would be enough. Thus the encoding viewed as a Python string could be '\x02\x04i386\x??dddddddd' to signify a list of two bitstrings in byte- chunked format, the first having 4 chars and the second however many required for the machine code. Another way to spell that data as bits might be (sketching in the air here, don't take too literally, despite 386 code ;-) codebits = Bits(bytes=(0,8); argbits = Bits() # make space to pack C args representation retbits = Bits() # make space for C return value representation # ... prepare args by stuffing representations into argbits slices # ...
prepare return space as zeroes or whatever thunk=I386Code() thunk.append(I386Inst('push ebp')) thunk.append(I386Inst('mov ebp, esp')) thunk.append(I386Inst('mov eax, DWORD PTR %s' % argbits.addr())) thunk.append(I386Inst('push eax')) thunk.append(I386Inst('call %s' % some_env.getaddr('_foo'))) thunk.append(I386Inst('add esp, 4')) thunk.append(I386Inst('mov DWORD PTR %s, eax' % retbits.addr())) thunk.append(I386Inst('xor eax, eax')) thunk.append(I386Inst('pop ebp')) thunk.append(I386Inst('ret 0')) map(bits.bytes.append, [2, 4, 'i386', len(thunk), thunk.to_string()]) # the thunk is a complete subroutine that meta_eval's i386 interpreter # can call without arguments. meta_eval(bits) # ... (pick up results in retbits) I.e., meta_eval is a generalization of Python eval, which only knows how to eval Python strings and code objects. meta_eval here must get a single bits (instance of Bits) argument encoded with type name and following binary data (e.g., probably located at id(bits)+8) for CPython). Once seeing the type name, meta_eval can dispatch to i386 or make a system call or run an interpreter, but that's a matter of having appropriate callables registered for the type name giving the type of that bits image being passed. Note that it could be a relocatable object file or info for loading a dll and finding somthing therein, etc. The bits as above were actually a packed argument-by-value list with an arg count at the beginning, and expected length-prefixed string as the first arg, and length-prefixed bytes as the second arg (assuming lengths >=128 would get two or four byte length codes, but this is a representation detail). Note that this all seems very hardware-oriented, but it can be viewed as abstract bit sequences and structures, and it just happens that there is (some) hardware that matches some things nicely. There are details about injecting bytes into the current process virtual memory as instructions and data etc, but this work can be encapsulated by the 'i386'-code "evaluator" used by meta_eval for 'i386' type bits data. The solutions must already be part of psyco, IWT. They must also be part of low level debuggers. These things would be very platform-specific, but hopefully meta_eval could stay mostly generic. At the lowest level, IWT meta_eval would often be optimized away. I.e., if at the byte code level, there were a META_EVAL instruction, you can see the abstraction expressed at that level, etc. I think something like this could be used to express other foreign function and data interfaces too. Presumably the pattern would be to define an ordinary python function or method, and inside it use Bits and meta_eval etc. to build special interfaces. meta_eval('DOS_BIOS', bits_for_int16_representation) or meta_eval('LINUX_SYS_CALL', ...) might also call on special "evaluators" but meta_eval itself would stay generic. IOW, you could also generate linux system call ints or PC BIOS ints, since we're talking about arranging bits with Bits methods and arranging to have those bits seen by the CPU by way of meta_apply. Of course you could also get executable bytes by reading the output of special tools. I'm just trying to get concepts worked out re expressing creation of low level stuff and then getting it used by low level mechanisms, but all expressed in the higher level. meta_eval really ought to be viewed in abstract terms, though we are concentrating on CPU and machine language instructions and raw memory here. 
IOW, "meta_apply('byte_code_interp', bytecode_representation)" might be very close to the functionality of "eval(byte_code_representation)." Whereas "meta_eval('gcc', c_source)" also has the abstract idea of applying some function to interpret some input. What's the difference from just having a bunch of functions to do these things? Well, it seeks to unify the principle and identify an abstraction that can reasonably be implemented at each level in some form, whether it is a META_EVAL CPython byte code or in-lined machine code calling other machine code. I confess to a bit of handwaving, but if I hold back, there's no chance of making any contribution to early discussions. You can always ignore it if it doesn't fit into your views ;-) >>way to express such a description, but it is not necessary to rely on >>libffi-style machine-code hackery to actually perform the calls; all we need >>to do is statically emit the specialized C code into the produced interpreter >>and obtain something close to the original CPython source. > >Hmm! >It seems that I could have saved my last longer post. >We agree that we need to describe primitive types. >It is much less urgent to actually implement them. I was going to write a long answer to that, but this got kind of long, so maybe this will do. Just pick out anything useful and ignore the rest ;-) Cheers, Bengt From stephan.diehl at gmx.net Tue Jan 28 12:40:36 2003 From: stephan.diehl at gmx.net (Stephan Diehl) Date: Tue, 28 Jan 2003 12:40:36 +0100 Subject: [pypy-dev] smalltalk as a model? Message-ID: <20030128114006.9C9215ABD2@thoth.codespeak.net> Hello, during the last days I got a little bit interested in smalltalk (article in c't 2/2003). Since smalltalk is already about 30 years old and shares some similarities to Python, some aspects of smalltalk might ease the effort to implement Python in Python. Here are some smalltalk faclets that makes it interesting: - Everything is an object (even classes and code blocks for example) - All actions are performed by invoking an object method (even loops) - there are only 5 keywords and only a couple of "special" characters - the smalltalk VM is implemented in smalltalk - smalltalk comes with "batteries included" With smalltalk in mind, one could write a kind of ProtoPython that has much less overhead than the existing Python. Possible features of a ProtoPython (very incomplete): - smaller set of reserved words ("print","import",etc. shouldn't be reserved words) - no operators, '+','-',... will be methods - at least the for loop could be a method of a (to be defined) list class. Even if this approach is absolutely ridiculous, just looking at smalltalk might give some ideas/hints how to construct a self describing language. 
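As a toy illustration only (ProtoList and ProtoInt are made-up names, not a
design proposal), the "every action is a method invocation" idea could look
roughly like this when written in current Python:

class ProtoInt:
    def __init__(self, value):
        self.value = value

    def add(self, other):
        # what the surface syntax `a + b` would compile down to
        return ProtoInt(self.value + other.value)

class ProtoList:
    def __init__(self, items):
        self.items = list(items)

    def do(self, block):
        # Smalltalk-style iteration (a do: message): the for loop exists
        # only here in the core; user code just passes a block.
        for item in self.items:
            block(item)

nums = ProtoList([ProtoInt(1), ProtoInt(2), ProtoInt(3)])
seen = []
nums.do(lambda n: seen.append(n.value))
print(seen)                                   # [1, 2, 3]
print(ProtoInt(40).add(ProtoInt(2)).value)    # 42

Whether a core reduced that far would still feel like Python is exactly the
question the links below leave open.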
Some Links: Squeak (open source smalltalk runtime): http://www.squeak.org smalltalk 80 :The Language and Its Implementation (contains details about the VM): http://users.ipa.net/~dwighth/smalltalk/bluebook/bluebook_imp_toc.html [Python-Dev] Classes and Metaclasses in Smalltalk: Guido about a mail from Jim Althoff, inventor of smalltalk metaclasses http://mail.python.org/pipermail/python-dev/2001-May/014508.html smalltalkish python: http://squeak.cs.uiuc.edu/mail/squeak/msg04577.html Stephan From logistix at zworg.com Tue Jan 28 15:10:12 2003 From: logistix at zworg.com (logistix) Date: Tue, 28 Jan 2003 06:10:12 -0800 Subject: [pypy-dev] Builtin types Message-ID: <11609.1043763012@zworg.com> > > Whereas "meta_eval('gcc', c_source)" also has the abstract idea of > applying some function to interpret some input. What's the difference > from just having a bunch of functions to do these things? Well, it seeks > to unify the principle and identify an abstraction that can reasonably > be implemented at each level in some form, whether it is a META_EVAL > CPython byte code or in-lined machine code calling other machine code. > I confess to a bit of handwaving, but if I hold back, there's no chance > of making any contribution to early discussions. You can always ignore > it if it doesn't fit into your views ;-) > In general I agree but why not an ABC instead of a dispatch function? Something like: class evaluator: """Interface for all evaluators""" def loadData(self): pass def compile(self): pass def evaluate(self): pass And then have subclasses add more specific load_ methods: i386 = I386evaluator() i386.loadInstruction('push ebp') i386.loadInstruction('mov ebp, esp') ... i386.compile() i386.evaluate() Where exec and eval would perform "if not compiled then compile; eval;" on any evaluator. From arigo at tunes.org Tue Jan 28 15:05:52 2003 From: arigo at tunes.org (Armin Rigo) Date: Tue, 28 Jan 2003 15:05:52 +0100 Subject: [pypy-dev] Builtin types In-Reply-To: <20030128012255.B22025@prim.han.de> References: <20030127180055.V22025@prim.han.de> <02b001c2c636$ce65b1e0$6d94fea9@newmexico> <20030127230249.CC9984A6B@bespin.org> <20030128012255.B22025@prim.han.de> Message-ID: <20030128140552.GA13635@magma.unil.ch> Hello Holger, On Tue, Jan 28, 2003 at 01:22:55AM +0100, holger krekel wrote: > (...) IMO the complexity > of (and dependency on) C source generators could be reduced this way. Ok. But we must keep all doors open by expressing things abstractedly, like defining classes for C function description. By default, in the "no-op down-translation" obtained by running the Python-in-Python code over CPython, the actual calls are implemented with ctypes. Everything that forces a particular down-translation is bad, even if that particular down-translation seems good. I can think of other cases where we will need a description of C function signatures but not the code that actually call them, e.g. Psyco (for which it will be useful to have some other information as well, like "does the C function have side-effects"). A bient?t, Armin. 
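Continuing that thought, here is a rough sketch of such a C function
*description* class.  The CFunction class and its method names are invented
for illustration and nothing here is a settled design; the first backend
assumes the ctypes package is importable, and the pow/libm example works
as-is only on a typical Linux/glibc system.

import ctypes
import ctypes.util

class CFunction:
    # A *description* of an external C function: enough information to
    # either perform the call (via ctypes, when running over CPython) or
    # to emit C source for it in a translated interpreter.
    def __init__(self, libname, name, argtypes, restype, side_effects=True):
        self.libname = libname            # e.g. 'm' for the C math library
        self.name = name                  # e.g. 'pow'
        self.argtypes = argtypes          # ctypes types as the signature
        self.restype = restype
        self.side_effects = side_effects  # hint for optimizers such as Psyco

    def ctypes_call(self, *args):
        # Backend 1: actually perform the call at run time.
        lib = ctypes.CDLL(ctypes.util.find_library(self.libname))
        func = getattr(lib, self.name)
        func.argtypes = list(self.argtypes)
        func.restype = self.restype
        return func(*args)

    def c_declaration(self):
        # Backend 2: never call anything, just describe the call in C.
        args = ', '.join([t.__name__.replace('c_', '') for t in self.argtypes])
        return '%s %s(%s);' % (self.restype.__name__.replace('c_', ''),
                               self.name, args)

pow_desc = CFunction('m', 'pow', [ctypes.c_double, ctypes.c_double],
                     ctypes.c_double, side_effects=False)
print(pow_desc.c_declaration())           # double pow(double, double);
print(pow_desc.ctypes_call(2.0, 10.0))    # 1024.0

The evaluator-style interface sketched earlier and this kind of description
could obviously be merged; the important property in both cases is that
nothing in the interpreter source commits to how the call is eventually
performed.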
From arigo at tunes.org Tue Jan 28 15:24:42 2003 From: arigo at tunes.org (Armin Rigo) Date: Tue, 28 Jan 2003 15:24:42 +0100 Subject: [pypy-dev] Builtin types In-Reply-To: <5.0.2.1.1.20030127205733.00a52ec0@mail.oz.net> References: <5.0.2.1.1.20030127205733.00a52ec0@mail.oz.net> Message-ID: <20030128142442.GA14250@magma.unil.ch> Hello Bengt, On Tue, Jan 28, 2003 at 12:15:20AM -0800, Bengt Richter wrote: > > meta_eval(i386_call_thunk_representation_as_bitstruct) The abstract idea is great, but why is it so low-level? Generalizing the various forms of procedure invocations is something that I would certainly like to do (although it is maybe too soon right now). But why does the representation have to be bits? Even Python's code objects are more than a string of bytes. What would be useful in my opinion is the definition of a Python interface that can be implemented by various notions of "code objects", including CPython's code objects, CPython's built-in function objects, and anything else from Psyco- or GCC-produced machine code to Prolog rules compiled by PyLog. A bient?t, Armin. From hpk at trillke.net Tue Jan 28 15:33:09 2003 From: hpk at trillke.net (holger krekel) Date: Tue, 28 Jan 2003 15:33:09 +0100 Subject: [pypy-dev] Builtin types In-Reply-To: <20030128140552.GA13635@magma.unil.ch>; from arigo@tunes.org on Tue, Jan 28, 2003 at 03:05:52PM +0100 References: <20030127180055.V22025@prim.han.de> <02b001c2c636$ce65b1e0$6d94fea9@newmexico> <20030127230249.CC9984A6B@bespin.org> <20030128012255.B22025@prim.han.de> <20030128140552.GA13635@magma.unil.ch> Message-ID: <20030128153309.M22025@prim.han.de> [Armin Rigo Tue, Jan 28, 2003 at 03:05:52PM +0100] > Hello Holger, > > On Tue, Jan 28, 2003 at 01:22:55AM +0100, holger krekel wrote: > > (...) IMO the complexity > > of (and dependency on) C source generators could be reduced this way. > > Ok. But we must keep all doors open by expressing things abstractedly, like > defining classes for C function description. By default, in the "no-op > down-translation" obtained by running the Python-in-Python code over CPython, > the actual calls are implemented with ctypes. Maybe not even that for starters. > Everything that forces a particular down-translation is bad, even if that > particular down-translation seems good. I am not sure i understand what you mean here. What i am aiming at is something like the following set of restriction for implementing the pypy-python-interpreter: - no nested scopes - simple function calls preferably with no arguments - no list comprehension - no generators - no += *= and friends - global namespace contains only immutable objects - very explicit names: always do e.g. 'self.valuestack.pop()' instead of 'self.pop()' Of course the pypy-interpreter needs to provide all python features to its higher-level python code. But if we follow the above restrictions (and maybe some more) then we might - for example - easily inline the 'bytecode interpretation functions' by transforming instance attribute lookups ('self.valuestack.pop') to LOAD_FAST/STORE_FAST style lookups. IMO It's very comfortable to have a version which is verified to run on CPython (and Jython while we are at it) but can be used for the next (generational) steps. > I can think of other cases where we > will need a description of C function signatures but not the code that > actually call them, e.g. Psyco (for which it will be useful to have some other > information as well, like "does the C function have side-effects"). 
Sure, although the descriptions might not be accurate if they are not actually used. I explicititely don't think that the pypy-interpreter should require ctypes. So I don't think we are contradicting each other, are we? greetings, holger From arigo at tunes.org Tue Jan 28 17:22:25 2003 From: arigo at tunes.org (Armin Rigo) Date: Tue, 28 Jan 2003 17:22:25 +0100 Subject: [pypy-dev] Builtin types In-Reply-To: <20030128153309.M22025@prim.han.de> References: <20030127180055.V22025@prim.han.de> <02b001c2c636$ce65b1e0$6d94fea9@newmexico> <20030127230249.CC9984A6B@bespin.org> <20030128012255.B22025@prim.han.de> <20030128140552.GA13635@magma.unil.ch> <20030128153309.M22025@prim.han.de> Message-ID: <20030128162225.GB14619@magma.unil.ch> Hello Holger, On Tue, Jan 28, 2003 at 03:33:09PM +0100, holger krekel wrote: > > Ok. But we must keep all doors open by expressing things abstractedly, like > > defining classes for C function description. By default, in the "no-op > > down-translation" obtained by running the Python-in-Python code over CPython, > > the actual calls are implemented with ctypes. > > Maybe not even that for starters. > > > Everything that forces a particular down-translation is bad, even if that > > particular down-translation seems good. > > I am not sure i understand what you mean here. It seems we agree with each other. Sorry if I confused you. I was saying that the Python-in-Python interpreter itself should only rely on some custom descriptions for the external C functions. By "down-translation" I mean the same as your "next (generational) step", i.e. the statical analysis of the Python-in-Python source to produce lower-level code (e.g. C). Various down-translations will do various things from these C function descriptions. > IMO It's very comfortable to have a version which is verified to > run on CPython (and Jython while we are at it) but can be used for > the next (generational) steps. Yes, I was pointing out that the role of ctypes is particular in (only) this respect: it will be probably be needed to run this verification --- unless all calls are also available from built-in modules provided by CPython. > What i am aiming at is something like the following set of restriction > for implementing the pypy-python-interpreter: I generally agree with you, although I would like to keep high-level Python structures available. I think the exact list will depend on what we feel to be necessary in a nice implementation, balanced against the expected complexity of the static analysis. In general I'd tend to favor a nice implementation. > - no nested scopes We may even try to avoid nested functions altogether, and define more methods in our classes (unless it becomes confusing). > - simple function calls preferably with no arguments Why not? Arguments are an essential abstraction which allow for much more optimizations than side-effect-based operations like storing values into instance attributes. > - no list comprehension I've nothing against them. They are surely more conceptual than the corresponding "for" loop. We could reserve their use for particular cases, like when the expression to compute each item has no side-effects (so we would say "[x+2 for x in y]" but not "[f(x) for x in y]" if f() has side-effects). In other words we could use list comprehensions that would work even if the construction returned a generator instead of directly computing the list. > - no generators Ok. > - no += *= and friends I've nothing against them, but ok. 
> - global namespace contains only immutable objects Yes. > - very explicit names: > always do e.g. 'self.valuestack.pop()' instead of 'self.pop()' Yup. Other restrictions I would suggest: - don't rely on simple operations throwing IndexError or OverflowError unless explicitely caught, e.g. all list item accesses should either be sure to fall within range, or be syntactically enclosed in a try:except: clause. - don't use the same variable, function argument or instance attribute to hold values of possibly multiple types. Use an explicit 'assert isinstance...' here and there. Eventually a straightforward global program analysis should be able to know which variable holds which type, for the majority of variables. - we can make exceptions to this rule, e.g. to allow either a value or None. In general I favor explicit special-cases: I much prefer a variable to contain None to mark a special value, than some value of the same type as the variable normally contains, like -1. Even better, when possible, throw an exception or perform some other altogether different action instead. A bient?t, Armin. From vlindberg at verio.net Thu Jan 30 18:18:17 2003 From: vlindberg at verio.net (VanL) Date: Thu, 30 Jan 2003 10:18:17 -0700 Subject: [pypy-dev] Is the mailing list out of order? has it moved? References: <20030127180055.V22025@prim.han.de> <02b001c2c636$ce65b1e0$6d94fea9@newmexico> <20030127230249.CC9984A6B@bespin.org> <20030128012255.B22025@prim.han.de> <20030128140552.GA13635@magma.unil.ch> <20030128153309.M22025@prim.han.de> <20030128162225.GB14619@magma.unil.ch> Message-ID: <3E395E59.604@verio.net> Hello, I was following this list, until it quite suddenly sent duplicates of several messages, then stopped getting mail entirely. Has this list been moved elsewhere? VanL From tismer at tismer.com Thu Jan 30 18:24:03 2003 From: tismer at tismer.com (Christian Tismer) Date: Thu, 30 Jan 2003 18:24:03 +0100 Subject: [pypy-dev] Is the mailing list out of order? has it moved? In-Reply-To: <3E395E59.604@verio.net> References: <20030127180055.V22025@prim.han.de> <02b001c2c636$ce65b1e0$6d94fea9@newmexico> <20030127230249.CC9984A6B@bespin.org> <20030128012255.B22025@prim.han.de> <20030128140552.GA13635@magma.unil.ch> <20030128153309.M22025@prim.han.de> <20030128162225.GB14619@magma.unil.ch> <3E395E59.604@verio.net> Message-ID: <3E395FB3.1010307@tismer.com> VanL wrote: > Hello, > > I was following this list, until it quite suddenly sent duplicates of > several messages, then stopped getting mail entirely. Has this list > been moved elsewhere? I didn't see any problems yet. Anybody else? ciao - chris From hpk at trillke.net Thu Jan 30 18:26:20 2003 From: hpk at trillke.net (holger krekel) Date: Thu, 30 Jan 2003 18:26:20 +0100 Subject: [pypy-dev] Is the mailing list out of order? has it moved? In-Reply-To: <3E395E59.604@verio.net>; from vlindberg@verio.net on Thu, Jan 30, 2003 at 10:18:17AM -0700 References: <20030127180055.V22025@prim.han.de> <02b001c2c636$ce65b1e0$6d94fea9@newmexico> <20030127230249.CC9984A6B@bespin.org> <20030128012255.B22025@prim.han.de> <20030128140552.GA13635@magma.unil.ch> <20030128153309.M22025@prim.han.de> <20030128162225.GB14619@magma.unil.ch> <3E395E59.604@verio.net> Message-ID: <20030130182620.I22025@prim.han.de> [VanL Thu, Jan 30, 2003 at 10:18:17AM -0700] > Hello, > > I was following this list, until it quite suddenly sent duplicates of > several messages, then stopped getting mail entirely. Has this list > been moved elsewhere? Not that i now off. 
It has just been quiet for two days. If anyone thinks that there are problems then please let me know (in private mail). holger From vlindberg at verio.net Thu Jan 30 18:43:50 2003 From: vlindberg at verio.net (VanL) Date: Thu, 30 Jan 2003 10:43:50 -0700 Subject: [pypy-dev] Is the mailing list out of order? has it moved? References: <20030127180055.V22025@prim.han.de> <02b001c2c636$ce65b1e0$6d94fea9@newmexico> <20030127230249.CC9984A6B@bespin.org> <20030128012255.B22025@prim.han.de> <20030128140552.GA13635@magma.unil.ch> <20030128153309.M22025@prim.han.de> <20030128162225.GB14619@magma.unil.ch> <3E395E59.604@verio.net> <20030130182620.I22025@prim.han.de> Message-ID: <3E396456.2020605@verio.net> I guess I was just being alarmist... I was just used to 20+ messages a day. Dropping to 0 for a few days was unexpected. VanL From hpk at trillke.net Thu Jan 30 19:04:23 2003 From: hpk at trillke.net (holger krekel) Date: Thu, 30 Jan 2003 19:04:23 +0100 Subject: [pypy-dev] Is the mailing list out of order? has it moved? In-Reply-To: <3E396456.2020605@verio.net>; from vlindberg@verio.net on Thu, Jan 30, 2003 at 10:43:50AM -0700 References: <20030127180055.V22025@prim.han.de> <02b001c2c636$ce65b1e0$6d94fea9@newmexico> <20030127230249.CC9984A6B@bespin.org> <20030128012255.B22025@prim.han.de> <20030128140552.GA13635@magma.unil.ch> <20030128153309.M22025@prim.han.de> <20030128162225.GB14619@magma.unil.ch> <3E395E59.604@verio.net> <20030130182620.I22025@prim.han.de> <3E396456.2020605@verio.net> Message-ID: <20030130190423.B19341@prim.han.de> [VanL Thu, Jan 30, 2003 at 10:43:50AM -0700] > I guess I was just being alarmist... I was just used to 20+ messages a > day. Dropping to 0 for a few days was unexpected. I know this feeling. Just guessing but i think that the relative quietness is mainly because we are still floating a lot. We don't have a concise strategy and roadmap, yet. There is also no shared code, yet. Hopefully, I can concentrate on pypy-dev from next week on and setup some infrastructure and summary-kind of things. holger From tismer at tismer.com Thu Jan 30 19:22:55 2003 From: tismer at tismer.com (Christian Tismer) Date: Thu, 30 Jan 2003 19:22:55 +0100 Subject: [pypy-dev] Is the mailing list out of order? has it moved? In-Reply-To: <20030130190423.B19341@prim.han.de> References: <20030127180055.V22025@prim.han.de> <02b001c2c636$ce65b1e0$6d94fea9@newmexico> <20030127230249.CC9984A6B@bespin.org> <20030128012255.B22025@prim.han.de> <20030128140552.GA13635@magma.unil.ch> <20030128153309.M22025@prim.han.de> <20030128162225.GB14619@magma.unil.ch> <3E395E59.604@verio.net> <20030130182620.I22025@prim.han.de> <3E396456.2020605@verio.net> <20030130190423.B19341@prim.han.de> Message-ID: <3E396D7F.4030206@tismer.com> holger krekel wrote: > [VanL Thu, Jan 30, 2003 at 10:43:50AM -0700] > >>I guess I was just being alarmist... I was just used to 20+ messages a >>day. Dropping to 0 for a few days was unexpected. > > > I know this feeling. Just guessing but i think that > the relative quietness is mainly because we are still > floating a lot. We don't have a concise strategy and > roadmap, yet. There is also no shared code, yet. That's it. Many ideas have been spread on this list, much has been said, but we don't have a summary. What we need instead of the list is now a Wiki, where we can begin to build a project plan, summaries of (also different) concepts, and first existing demo code snippets. 
> Hopefully, I can concentrate on pypy-dev from next week > on and setup some infrastructure and summary-kind of things. That's what we need now. ciao - chris From roccomoretti at netscape.net Fri Jan 31 05:31:05 2003 From: roccomoretti at netscape.net (Rocco Moretti) Date: Thu, 30 Jan 2003 23:31:05 -0500 Subject: [pypy-dev] Is the mailing list out of order? has it moved? Message-ID: <7493E458.307D2699.9ADE5C6A@netscape.net> Christian Tismer wrote: >That's it. Many ideas have been spread on this >list, much has been said, but we don't have >a summary. What we need instead of the list >is now a Wiki, where we can begin to build a >project plan, summaries of (also different) >concepts, and first existing demo code snippets. There seemed to be a lot of support for a python-dev style mailing list summary. Was there any brave soul who actually agreed to write it? (As opposed to the lazy/busy among us [and I include myself here] who thought that "someone else" should do it.) - Rocco __________________________________________________________________ The NEW Netscape 7.0 browser is now available. Upgrade now! http://channels.netscape.com/ns/browsers/download.jsp Get your own FREE, personal Netscape Mail account today at http://webmail.netscape.com/ From tismer at tismer.com Fri Jan 31 05:39:21 2003 From: tismer at tismer.com (Christian Tismer) Date: Fri, 31 Jan 2003 05:39:21 +0100 Subject: [pypy-dev] Is the mailing list out of order? has it moved? In-Reply-To: <7493E458.307D2699.9ADE5C6A@netscape.net> References: <7493E458.307D2699.9ADE5C6A@netscape.net> Message-ID: <3E39FDF9.6080305@tismer.com> Rocco Moretti wrote: > Christian Tismer wrote: > > >>That's it. Many ideas have been spread on this >>list, much has been said, but we don't have >>a summary. What we need instead of the list >>is now a Wiki, where we can begin to build a >>project plan, summaries of (also different) >>concepts, and first existing demo code snippets. > > > There seemed to be a lot of support for a python-dev style mailing list summary. > > Was there any brave soul who actually agreed to write it? > (As opposed to the lazy/busy among us [and I include > myself here] who thought that "someone else" should do it.) No such brave soul did materialize, yet, although I repeated this whish in at least three of my messages, too. Given a Wiki, I'd spend some of my own time to do this (in a subjective way, supporting my own POV of course :) but not for the list, which is in a too much transient/limbo state, IMHO. We have probably said everything sayable without writing a single line of code. I don't want to continue that. ciao - chris From troy at gci.net Fri Jan 31 20:23:37 2003 From: troy at gci.net (Troy Melhase) Date: Fri, 31 Jan 2003 10:23:37 -0900 Subject: [pypy-dev] List Summary Volunteer Message-ID: <200301311023.37374.troy@gci.net> Hi Folks: I've been lurking on pypy-dev since the original announcement, and I'd like to help out in some way. I'm not a compiler writer, nor a C coder, but I think I can write a summary for the list traffic once a week. I assume that simply volunteering is enough to be given this responsibility. Sometime Saturday, I'll summarize the traffic from the last three weeks and send them to the list and to c.l.p for group review. Comments? -troy