From hpk at trillke.net Mon Dec 1 17:24:54 2003
From: hpk at trillke.net (holger krekel)
Date: Mon, 1 Dec 2003 17:24:54 +0100
Subject: [pypy-dev] request for sprint planning discussion
Message-ID: <20031201172454.M15957@prim.han.de>

Hello pypy and especially hello Amsterdam sprinters (apparently 13),

the Amsterdam sprint is only two weeks off and i think we need some more
discussion and overview about what we intend to do in Amsterdam. First let
me note that while Armin and I were hoping to make a first public release
in Amsterdam this is not of paramount importance. I think that it's more
important to have a sprint that everybody enjoys. Judging from IRC
discussions and feedback from some sprinters this may imply that we take a
somewhat different approach than announced before: revisiting various parts
of the current architecture, learning about and improving/documenting them.
It's probably more fun to hack on some documentation together than doing it
alone :-)

Please feel encouraged (especially sprint participants) to ask questions,
comment or suggest different/more ideas, goals and wishes for our sprint.
Let me present some ideas that are currently on the table:

- revisit/complete support for interpreter<->applevel interactions [1]
  This code (interpreter/gateway.py and friends) was mostly hacked by Armin
  and me and would enjoy wider involvement and improvement. For one,
  interpreter-level objects like frame and function objects need to be
  correctly exposed at application level (some bits and pieces missing).
  The "import dis; dis.dis(dis.dis)" goal finally needs to work! (Still
  only "import dis; dis.dis(dis.dis.func_code)" works.)

- currently you can mix interp-level code and app-level code (see e.g.
  module/builtin.py). While you can access app-level objects from
  interp-level the reverse is not true: there is no general way to access
  interp-level objects directly from app-level -- unless the interp-level
  objects provide specific hooks (e.g.
  pypy_getattr() in pyframe.py). Accessing interpreter-level objects from
  app-level would e.g. be useful if we want to *define* the complete python
  type hierarchy including __new__ methods in our standard "types.py" file.
  Note that e.g. StdObjSpace doesn't really care about this: most of it
  would work just fine without having such a type hierarchy. However,
  types.py would then *define* the actual types and their relations
  completely at app-level. The interp-level objects would of course need to
  correlate to this definition. In the course, it might be nice if we could
  access interp-level objects directly from app-level:

      class int(type):
          def __new__(cls, ...):
              from ... import W_IntObject  # our stdobjspace-implementation
              return W_IntObject(arg)      # gets invoked here
      ...

  this obviously needs more thought but i hope the idea is understandable.

- Of course, there is still a lot of work with Annotation/Translation but
  this has been mentioned in previous postings. Note that this part of the
  source tree is pretty independent from the rest of pypy. Recently Armin
  and I have started to refactor annotation code after we also implemented
  the "Berlin model" of the flowgraph-structures (objspace/flow/model.py).
  The idea is to have a rather general and somewhat efficient
  annotation/query engine. The beginnings are in the new pypy/annotation
  directory including a (non-complete) README.txt.

- frontends: i don't know how many people have experience hacking with
  pygame and/or game architectures. It probably doesn't make much sense if
  only Armin and me want to or can do it. Of course, Michael Hudson has
  done some stuff in that area, too, but he had to cancel his
  participation. However, we can start with writing tools that e.g. list
  all space operations for a given python function. There also is
  tool/methodChecker.py which tries to list the "implementedness" of
  app-visible functions/methods of types.
  Doing tools like this is helpful for understanding how pypy works -- both
  writing and using it. These tools could easily be reused from whatever
  frontend with the following approach: let PyPy run in a most flexible,
  insecure but simple 15-liner application-server that simply receives and
  executes remote python code/string objects. Thus you can send the
  "methodChecker/showspaceops" cmdline tools to a remote server and receive
  the results (e.g. over a redirected sys.stdout/err). (i might commit some
  simple code for this mechanism before the sprint to a src/pyappserver
  directory and notify the list).

- completing stdobjspace and builtins, there is a lot to do still. Rocco
  Moretti recently worked in the direction of (and suggested as a goal)
  getting 'regrtest.py' to pass on PyPy as much as possible. An important
  missing piece probably is getting a PyPy implementation of the cpython
  import mechanism (our current module/builtin module __import__
  implementation is just a simplistic hack).

Basically i suggest we try to plan the sprint so that everybody gets
educated enough to feel at home with code and concepts of PyPy. Ideally
everyone feels able to take initiatives of their own. It probably would
make sense to prepare some introductory talks on the interpreter (byte
code dispatching/implementation/exceptions ...), the stdobjspace
(multimethods, and type implementations ...) and annotation/translation
(flowgraphs/annotations via space-operations) on the first two days. At
least we can make a few question/answer screen-sessions on these topics.

At the moment, there are very few people who know most/all of the areas of
PyPy and can take initiatives to fix/improve stuff. This needs to change
and is more important than getting a release out (although we need not
give up this idea, yet).

please comment away,

    holger

[1] application-level objects are the usual objects/structures you see
in/from a python program. Under the hood, these objects have
interpreter-level implementations.
In interp-level source code you see lots of w_* names indicating that they
reference a 'wrapped object' aka an application-level object. These
wrapped objects are manipulated by object space operations like
objspace.getitem/getattr/type/add and are opaque to the interpreter. IOW,
an objectspace is usually in complete control of object layout, app-level
representation and other details. Only the interpreter
frame/function/code/... objects can control their app-level representation
themselves by defining certain hooks like 'pypy_getattr' which the
objectspace dispatches to if it encounters a getattr operation on an
internal object. The nice effect is that objectspaces don't need to
reimplement e.g. function/generator/code/module types with all the
app-level representation again and again.

From arigo at tunes.org Mon Dec 1 17:20:34 2003
From: arigo at tunes.org (Armin Rigo)
Date: Mon, 1 Dec 2003 16:20:34 +0000
Subject: [pypy-dev] Re: [pypy-svn] rev 2280 - in pypy/trunk/src/pypy: annotation appspace module module/test
In-Reply-To: <20031130175313.C6AA15BE97@thoth.codespeak.net>
References: <20031130175313.C6AA15BE97@thoth.codespeak.net>
Message-ID: <20031201162034.GA7130@vicky.ecs.soton.ac.uk>

Hello Rocco,

On Sun, Nov 30, 2003 at 06:53:13PM +0100, rocco at codespeak.net wrote:
> M module\__init__.py
> A module\itertoolsmodule.py
> A module\_sremodule.py
> A module\mathmodule.py
> A module\_randommodule.py
> A module\cStringIOmodule.py
>
> Add some missing builtin modules - (Note that these probably should get
> rewritten at some point when we get a functioning RPython implementation,
> but for now ...

Sounds nice, but I can't seem to import them in PyPy? "import math" or
"import _random" just fails with an ImportError. I saw that they are
actually just wrappers around the CPython modules, but I'd have thought
they were supposed to be *working* wrappers...

A bientôt,

Armin.
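[Editorial aside: the footnote above describes how interpreter-level code only manipulates opaque wrapped (w_*) objects through object space operations. The following toy sketch illustrates that convention; the class and method names are hypothetical simplifications invented for illustration — real object spaces implement many more operations (getitem, getattr, type, add, ...) and real wrapped objects are richer.]

```python
# Toy illustration (not actual PyPy code) of the wrapped-object idea:
# interpreter-level code never looks inside w_* objects, it only hands
# them back to object space operations.

class W_IntObject:
    """Interp-level implementation of an app-level integer ('wrapped')."""
    def __init__(self, intval):
        self.intval = intval

class ToyObjSpace:
    """A minimal object space: it alone knows the layout of its w_* objects."""
    def wrap(self, value):            # interp-level value -> wrapped object
        return W_IntObject(value)
    def unwrap(self, w_obj):          # wrapped object -> interp-level value
        return w_obj.intval
    def add(self, w_a, w_b):          # space operation, opaque to the interpreter
        return W_IntObject(w_a.intval + w_b.intval)

space = ToyObjSpace()
w_a, w_b = space.wrap(40), space.wrap(2)
w_sum = space.add(w_a, w_b)           # interpreter code treats w_sum as opaque
assert space.unwrap(w_sum) == 42      # only the space ever unwraps
```

Because the interpreter only calls space methods, a different space (trivial, standard, flow) can be substituted without touching the interpreter — which is the whole point of the abstraction.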
From stephan.diehl at gmx.net Mon Dec 1 17:35:48 2003
From: stephan.diehl at gmx.net (Stephan Diehl)
Date: Mon, 1 Dec 2003 17:35:48 +0100
Subject: [pypy-dev] request for sprint planning discussion
In-Reply-To: <20031201172454.M15957@prim.han.de>
References: <20031201172454.M15957@prim.han.de>
Message-ID: <200312011735.48637.stephan.diehl@gmx.net>

> - frontends: i don't know how many people have experience hacking with
>   pygame and/or game architectures. It probably doesn't make much sense
>   if only Armin and me want to or can do it. Of course, Michael Hudson
>   has done some stuff in that area, too, but he had to cancel his
>   participation.

I'm just hacking away with the 'cmd' module. Since this comes with the
standard Python distribution and doesn't need to have some other 'strange'
packages installed, you might want to use that. It is really easy to use
and understand and gives you documentation, command line completion and
help without major effort.

Just my 2 cents

Stephan

From arigo at tunes.org Mon Dec 1 18:02:07 2003
From: arigo at tunes.org (Armin Rigo)
Date: Mon, 1 Dec 2003 17:02:07 +0000
Subject: [pypy-dev] request for sprint planning discussion
In-Reply-To: <20031201172454.M15957@prim.han.de>
References: <20031201172454.M15957@prim.han.de>
Message-ID: <20031201170207.GA10543@vicky.ecs.soton.ac.uk>

Hello Holger,

A small clarification:

On Mon, Dec 01, 2003 at 05:24:54PM +0100, holger krekel wrote:
> Accessing interpreter level objects from app-level would e.g. be useful
> if we want to *define* the complete python type hierarchy including
> __new__ methods in our standard "types.py" file.

I think these issues are somewhat independent. And I also think that the
possibility of getting rid of type objects altogether at interpreter-level
looks like a really cool thing, which requires a bit more precision. There
are two different but related things we call "type" in Python: there is
the "behavior" of an object 'x', e.g.
how it is added with other objects; and there is the type object
'type(x)'. In CPython the two are pretty much the same thing because in C
an object is just a "PyObject*" pointer, so that we use the type object to
capture the behavior. In PyPy it is not necessarily so. For example, we
have a lot of interpreter-level classes like Function, Module, Code and
Frame that work very nicely on their own without any help, thank you: when
you are in PyPy, a function object is implemented by an instance of
Function, and Function defines all the behavior we need. So "Function" is
the "interpreter-level type", or "implementing class", of function
objects.

But now if you ask in PyPy 'type(f)' you don't get a reasonable answer
yet. What you should get is some wrapped object that is, well, a type
object, but none exists for functions. In other words we have fully
working functions but no FunctionType. Also, in the StdObjSpace, type
objects are a bit of a mess. They exist correctly but building them is
laborious.

Putting these observations together, and in the spirit of the original
"minimal Python" approach, we could wonder if type objects are needed at
all. We could actually have a fully working Python sub-language in which
there is no type object (and so no 'type' built-in, and only old-style
classes). The objects still have a strictly-typed behavior, but you just
can't extract this behavior information as an explicit type object. You
can implement almost anything without ever referring to concrete types --
some people argue that you should never use or check concrete types
anyway.

The summary is that we could just leave type objects out of the core of
PyPy. Then a cool place to add them (because I guess people will still
want them back) would be at app-level, e.g. in a custom version of the
'types.py' module, that would look like:

    class object: ...
    class type(object): ...
    class int(object): ...
    class bool(int): ...
    class function(object): ...
and then we need a way to correlate these classes with interpreter-level
classes. We could have a nice registry of available built-in
implementations for each type, but essentially a quick hack would be:

    class int(object):
        def __new__(cls, *args):
            return pypy_factories.W_IntObject(*args)

where 'pypy_factories' is a built-in module that exposes at app-level
factory functions to create instances of internal classes.

In Holger's text the "W_IntObject" above was a more direct
app-level-visible version of the W_IntObject class instead of just a
factory function; well maybe that's a good idea, but right now it is a
side point to the whole "let's get rid of type objects!" programme.

What do you think about it?

A bientôt,

Armin.

From hpk at trillke.net Mon Dec 1 19:47:11 2003
From: hpk at trillke.net (holger krekel)
Date: Mon, 1 Dec 2003 19:47:11 +0100
Subject: [pypy-dev] request for sprint planning discussion
In-Reply-To: <200312011735.48637.stephan.diehl@gmx.net>; from stephan.diehl@gmx.net on Mon, Dec 01, 2003 at 05:35:48PM +0100
References: <20031201172454.M15957@prim.han.de> <200312011735.48637.stephan.diehl@gmx.net>
Message-ID: <20031201194711.O15957@prim.han.de>

Hi Stephan,

[Stephan Diehl Mon, Dec 01, 2003 at 05:35:48PM +0100]
> > - frontends: i don't know how many people have experience hacking with
> >   pygame and/or game architectures. It probably doesn't make much sense
> >   if only Armin and me want to or can do it. Of course, Michael Hudson
> >   has done some stuff in that area, too, but he had to cancel his
> >   participation.
>
> I'm just hacking away with the 'cmd' module. Since this comes with the
> standard Python distribution and doesn't need to have some other
> 'strange' packages installed, you might want to use that.
> It is really easy to use and understand and gives you documentation,
> command line completion and help without major effort.
Yes, providing a line-based shell with some completion is a good thing
although interactive tab-completion usually requires readline which is not
natively available on all platforms. However, multi-line editing of a
function (whose flowgraph, space operations and translated pyrex/C-code
would automatically be created/updated and displayed for you) requires a
somewhat more involved approach. Pygame might be a nice, portable and
flexible platform for such tools.

Of course, the choice of frontend(s) is debatable and mostly depends on
the experiences of the sprinters, anyway. I guess we can agree that
starting with some cmdline-based or terminal-line-based tools to access
and display PyPy internals and intermediate representations is a good
idea.

cheers,

    holger

From roccomoretti at hotpop.com Tue Dec 2 17:20:53 2003
From: roccomoretti at hotpop.com (Rocco Moretti)
Date: Tue, 02 Dec 2003 10:20:53 -0600
Subject: [pypy-dev] Re: [pypy-svn] rev 2280 - in pypy/trunk/src/pypy: annotation appspace module module/test
In-Reply-To: <20031201162034.GA7130@vicky.ecs.soton.ac.uk>
References: <20031130175313.C6AA15BE97@thoth.codespeak.net> <20031201162034.GA7130@vicky.ecs.soton.ac.uk>
Message-ID: <3FCCBBE5.8000400@hotpop.com>

Armin Rigo wrote:

> Hello Rocco,
>
> On Sun, Nov 30, 2003 at 06:53:13PM +0100, rocco at codespeak.net wrote:
>
>> M module\__init__.py
>> A module\itertoolsmodule.py
>> A module\_sremodule.py
>> A module\mathmodule.py
>> A module\_randommodule.py
>> A module\cStringIOmodule.py
>>
>> Add some missing builtin modules - (Note that these probably should get
>> rewritten at some point when we get a functioning RPython
>> implementation, but for now ...
>
> Sounds nice, but I can't seem to import them in PyPy? "import math" or
> "import _random" just fails with an ImportError. I saw that they are
> actually just wrappers around the CPython modules, but I'd have thought
> they were supposed to be *working* wrappers...
>

They both work for me using the (equivalent of) stock 2280 PyPy under both
Trivial and StdObjSpaces.

My Python:
Python 2.3 (#46, Jul 29 2003, 18:54:32) [MSC v.1200 32 bit (Intel)] on
win32 [WinXP]

Can anyone else duplicate Armin's bug?

-Rocco

P.S. I haven't tried actually *using* either the math or _random module,
so there are probably latent bugs in there somewhere, but the import
statement should work at least.

From arigo at tunes.org Wed Dec 3 16:11:48 2003
From: arigo at tunes.org (Armin Rigo)
Date: Wed, 3 Dec 2003 15:11:48 +0000
Subject: [pypy-dev] Re: [pypy-svn] rev 2280 - in pypy/trunk/src/pypy: annotation appspace module module/test
In-Reply-To: <3FCCBBE5.8000400@hotpop.com>
References: <20031130175313.C6AA15BE97@thoth.codespeak.net> <20031201162034.GA7130@vicky.ecs.soton.ac.uk> <3FCCBBE5.8000400@hotpop.com>
Message-ID: <20031203151148.GE21194@vicky.ecs.soton.ac.uk>

Hello Rocco,

On Tue, Dec 02, 2003 at 10:20:53AM -0600, Rocco Moretti wrote:
> They both work for me using the (equivalent of) stock 2280 PyPy under
> both Trivial and StdObjSpaces.

I figured it out: 'math' is not in sys.builtin_module_names in my
underlying CPython interpreter, because it is built as a separate
extension module (which is the case by default, you must have tweaked
your CPython installation). So in mathmodule.py the following line:

    if 'math' in _names:

just returns False. You should probably just try to import math and if
you really want to be nice, catch the (possible but unlikely) ImportError.

A bientôt,

Armin.
From hpk at trillke.net Sun Dec 7 19:52:42 2003
From: hpk at trillke.net (holger krekel)
Date: Sun, 7 Dec 2003 19:52:42 +0100
Subject: [pypy-dev] request for sprint planning discussion
In-Reply-To: <20031201170207.GA10543@vicky.ecs.soton.ac.uk>; from arigo@tunes.org on Mon, Dec 01, 2003 at 05:02:07PM +0000
References: <20031201172454.M15957@prim.han.de> <20031201170207.GA10543@vicky.ecs.soton.ac.uk>
Message-ID: <20031207195242.K15957@prim.han.de>

Hi Armin,

[Armin Rigo Mon, Dec 01, 2003 at 05:02:07PM +0000]
> On Mon, Dec 01, 2003 at 05:24:54PM +0100, holger krekel wrote:
> > Accessing interpreter level objects from app-level would e.g. be
> > useful if we want to *define* the complete python type hierarchy
> > including __new__ methods in our standard "types.py" file.
>
> I think these issues are somewhat independent. And I also think that the
> possibility of getting rid of type objects altogether at
> interpreter-level looks like a really cool thing, which requires a bit
> more precision.

Good, i had the feeling my posting already was big enough so i didn't try
to give too many details.

> ...
> The summary is that we could just leave type objects out of the core of
> PyPy. Then a cool place to add them (because I guess people will still
> want them back) would be at app-level, e.g. in a custom version of the
> 'types.py' module, that would look like:
>
>     class object: ...
>     class type(object): ...
>     class int(object): ...
>     class bool(int): ...
>     class function(object): ...

I agree that this would be nice ...

> and then we need a way to correlate these classes with interpreter-level
> classes. We could have a nice registry of available built-in
> implementations for each type, but essentially a quick hack would be:
>
>     class int(object):
>         def __new__(cls, *args):
>             return pypy_factories.W_IntObject(*args)

When subclassing such type-objects how would the interp-level machinery
find the modified/added methods from the user-defined subclass?
Just looking at the interpreter-level implementation for
methods/attributes obviously isn't enough. We probably need to provide the
actual type object ('cls') to the interpreter-level factories. However, we
don't want to look into the app-level definitions every time we call some
operation on a W_DictObject, W_ListObject etc. I guess the factories need
to create e.g. a W_UserObject if 'cls' does not match an implementation's
default. Therefore we need to store some information regarding whether a
W_* instance is a userlevel-subclass-instance or a plain one.

Additionally, i wonder how the need to represent our more internal objects
(functions, modules) at app-level relates to the idea of defining python's
type system at app-level. When e.g. accessing co_* attributes from a
PyCode instance it might be nice to enumerate all attributes at app-level
to make introspection easier (currently you have to provide introspection
through pypy_getattr & friends which is tedious). It's possible though
that any issue of representing internal objects is or can be rather
orthogonal to the explicit types.py approach. At least we probably don't
need to care about subclassing those internal objects (subclassing PyCode
and creating user-level PyFrames might be fun, though :-)

Let me also note that the more involved interactions between
interpreter-level and application-level objects we have, the harder
debugging gets ... but it seems worth it in this case. Maybe we should try
to write a complete app-level "types.py" prototype to see if we have a
complete enough picture ...

cheers,

    holger

From arigo at tunes.org Tue Dec 9 15:50:19 2003
From: arigo at tunes.org (Armin Rigo)
Date: Tue, 9 Dec 2003 14:50:19 +0000
Subject: [pypy-dev] Sprint tracks
Message-ID: <20031209145019.GA18189@vicky.ecs.soton.ac.uk>

Hello PyPy,

we have done some more discussion on how to plan the upcoming sprint. Here
is our idea. Holger will do an overview presentation of PyPy's current
architecture.
Armin (or someone else, please stand up :-) will introduce the following
four possible tracks for the sprint:

* types.py
* annotations
* interactive/doc tools
* builtins and miscellanea

In a few words:

* types.py

  This is essentially moving type declarations to app-level. The
  interp-level code would only be concerned about providing
  implementations for objects (either internal objects like Code or
  Function, or StdObjSpace objects like W_IntObject). This involves (a)
  writing the app-level types as almost-usual class declarations; (b)
  writing a mechanism to link them with interp-level classes; (c) fixing
  the StdObjSpace to use this mechanism (which should be straightforward
  because it already works almost like that).

* annotations

  This is the part about translating PyPy into low-level code. We have the
  beginnings of a nice general annotation scheme which should be finished
  and linked with the type inference and code generation. The goal is then
  to have something that can really analyze and produce code for complex
  examples (up to PyPy itself).

* interactive/doc tools

  Introspecting and interacting with a running PyPy process on various
  levels. This should make debugging and understanding pypy structures
  easier, and provide more direct feed-back. Depending on interest and
  experience we could build various nice user interfaces to do that.
  Refactoring the current interactive.py / py.py module is also a good
  starting point.

* builtins and miscellanea

  There are still a few crucial builtins to be implemented to make typical
  python programs work. Running the CPython test suite is an obvious goal.
  Writing some more tests especially for the interpreter-level builtins
  also makes sense. And of course there is a larger number of
  less-essential built-in modules that need to be ported.

The two big introduction sessions should make it possible to get everyone
up to speed and start hacking in pairs. Note that we can still add/remove
tracks depending on interest.
Don't be scared by the above tracks, most of them can be further
subdivided into more independent jobs. We might want to list all the
"jobs" that need to be done on the board so that if you are fed up with
your current track you can pick something :-)

Furthermore at the end of the week we should all invest an afternoon into
documenting the various architecture bits and pieces. It probably is more
fun if we do it all at the same time.

cheers and see you all in Amsterdam (well not all, just the sprinters :-),

    Armin & Holger

From rxe at ukshells.co.uk Tue Dec 9 17:04:35 2003
From: rxe at ukshells.co.uk (Richard Emslie)
Date: Tue, 9 Dec 2003 16:04:35 +0000 (GMT)
Subject: [pypy-dev] Questions
Message-ID:

Hi,

I've been reading through the source code and the docs, and getting the
gist of what is going on. I guess I was expecting to see something more
like the CPython code but in python (like why do we have different object
spaces, although I see the errors of my ways now :-) ) and was failing to
understand the big picture.

So reading between the lines, does this sound anything like what we are
trying to achieve...

The abstraction of the object spaces is so we can perform abstract
interpretation with one set, a working interpreter with another, a minimal
interpreter with another, and goodness knows what else ;-)

So to create our initial interpreter, we take the interpreter code,
multi-method dispatcher and the standard object space and we can
abstractly interpret with the interpreter/flow object space/annotation.
That stage involves building up a set of basic blocks, building a flow
graph, type inference and then translating (sorry I get a bit lost here
with what happens where, ie when does the flow object space stop and
annotation start, but the answer to that one is to read more code ;-) ) to
pyrex/CL/other low level code.

Does that sound about right so far? Then do either of these make sense
(purely speculation...
and most likely nonsense)

Also if we write the flow object space and annotation in RPython we can
pipe that through itself, to generate low level code too. Now my main
question is - how do we combine the two object spaces such that we do
abstract interpretation and annotation in a running interpreter (also I
guess we would either need some very low level translation, ie machine
code, or some LLVM-like architecture to do this?)

OR

Once we have broken the interpreter / standard object space into a set of
blocks and a graph, and translated those blocks into low level code - we
could view any python bytecode operating on this as a traversal over the
blocks. Therefore we could create a new flow graph from this traversal,
and feed it into some LLVM-like architecture which does the low level
translation and optimisation phase for us??

Thanks for any feedback... :-)

Cheers,
Richard

From hpk at trillke.net Tue Dec 9 23:18:59 2003
From: hpk at trillke.net (holger krekel)
Date: Tue, 9 Dec 2003 23:18:59 +0100
Subject: [pypy-dev] Questions
In-Reply-To: ; from rxe@ukshells.co.uk on Tue, Dec 09, 2003 at 04:04:35PM +0000
References:
Message-ID: <20031209231859.X15957@prim.han.de>

Hi Richard,

[Richard Emslie Tue, Dec 09, 2003 at 04:04:35PM +0000]
> I've been reading through the source code and the docs, and getting the
> gist of what is going on. I guess I was expecting to see something more
> like the CPython code but in python (like why do we have different
> object spaces, although I see the errors of my ways now :-) ) and was
> failing to understand the big picture.

understandable. Reverse engineering documentation from plain code is not
always easy :-)

> So reading between the lines, does this sound anything like what we are
> trying to achieve...
> The abstraction of the object spaces is so we can perform abstract
> interpretation with one set, a working interpreter with another, a
> minimal interpreter with another, and goodness knows what else ;-)

right.

> So to create our initial interpreter, we take the interpreter code,
> multi-method dispatcher and the standard object space and we can
> abstractly interpret with the interpreter/flow object space/annotation.

yes, more precisely the interpreter/flowobjspace combination should be
able to perform abstract interpretation on any RPython program. RPython is
our acronym for "not quite as dynamic as python". But note that we
basically allow *full dynamism* including metaclasses and all the fancy
stuff during *initialization* of the interpreter and its object spaces.
Only when we actually interpret code objects from an app-level program do
we restrict the involved code to be RPythonic.

The interpreter/flowobjspace combination will start abstract
interpretation on some initial function object, say e.g. frame.run(). The
frame and the bytecode/opcode implementations it invokes will work with
e.g. the StdObjSpace. The flowobjspace doesn't care on which objspace the
frame/opcodes execute. The flowobjspace and its interpreter instance don't
care if they run on something other than pypy :-) Actually thinking in
more detail about this will probably lead us into the still muddy waters
of the whole bootstrapping process but let's not get distracted here :-)

> That stage involves building up a set of basic blocks, building a flow
> graph, type inference and then translating (sorry I get a bit lost here
> with what happens where, ie when does the flow object space stop and
> annotation start, but the answer to that one is to read more code ;-) )
> to pyrex/CL/other low level code.

exactly.

> Does that sound about right so far? Then do either of these make sense
> (purely speculation...
> and most likely nonsense)
>
> Also if we write the flow object space and annotation in RPython we can
> pipe that through itself, to generate low level code too. Now my main
> question is - how do we combine the two object spaces such that we do
> abstract interpretation and annotation in a running interpreter (also I
> guess we would either need some very low level translation, ie machine
> code, or some LLVM-like architecture to do this?)

(first: see my above reference of muddy waters :-)

In theory, we can annotate/translate flowobjspace itself, thus producing a
low-level (pyrex/lisp/c/llvm) representation of our abstract
interpretation code. When executing this lower-level representation on
ourself again we should produce the same representation we are currently
running. I think this is similar to the 3-stage gcc building process:
first it uses some external component to build itself (stage1). It uses
stage1 to compile itself again to stage2. It then uses stage2 to recompile
itself again to stage3 and sees if it still works. Thus the whole program
serves as a good testbed for whether everything works right.

> Once we have broken the interpreter / standard object space into a set
> of blocks and a graph, and translated those blocks into low level code -
> we could view any python bytecode operating on this as a traversal over
> the blocks.

Hmm, yes i think that's right although i would rephrase a bit: the
flowgraph obtained from abstract interpretation is just another
representation of a/our python program. Code objects (which contain the
bytecodes) are themselves a representation of python source text. The
flowgraph of course provides a lot of interesting information (like all
possible code paths and low-level identification of variable state) and
makes it explicitly available for annotation and translation. Btw, at the
moment annotation just *uses* the flowgraph but not the other way round.
(In the future we might want to drive them more in parallel in order to
allow the flowobjspace code to consult the annotation module. Then the
flowgraph code could possibly avoid producing representations where
annotation/type inference is no longer able to produce exact types.)

> Therefore we could create a new flow graph from this traversal, and feed
> it into some LLVM-like architecture which does the low level translation
> and optimisation phase for us??

There is no need to take this double-indirection. We can produce LLVM
bytecode directly from python-code with a specific translator (similar to
genpyrex/genclisp). We could translate ourself to make this faster, of
course. For merging Psyco techniques we will probably want to rely on
something like LLVM to do this dynamically. Generating C-code is usually a
pretty static thing and cannot easily be done at runtime.

> Thanks for any feedback... :-)

you are welcome. Feel free to follow up ...

cheers,

    holger

P.S.: please note that everything in pypy/annotation/* is just evolving
code which is not used anywhere. The in-use annotation stuff is currently
in translator ...

From lac at strakt.com Thu Dec 11 08:49:58 2003
From: lac at strakt.com (Laura Creighton)
Date: Thu, 11 Dec 2003 08:49:58 +0100
Subject: [pypy-dev] IronPython
Message-ID: <200312110749.hBB7nwWF019788@ratthing-b246.strakt.com>

Alex Martelli tells me that Jim Hugunin has ported Python to .NET and his
new Python port is 70% faster than CPython 2.3 on pystones. Anybody have
any more information, webpage, code?

Laura

From tanzer at swing.co.at Thu Dec 11 09:03:57 2003
From: tanzer at swing.co.at (Christian Tanzer)
Date: Thu, 11 Dec 2003 09:03:57 +0100
Subject: [pypy-dev] IronPython
In-Reply-To: Your message of "Thu, 11 Dec 2003 08:49:58 +0100." <200312110749.hBB7nwWF019788@ratthing-b246.strakt.com>
Message-ID:

> Alex Martelli tells me that Jim Hugunin has ported Python to .NET
> and his new Python port is 70% faster than CPython 2.3 on pystones.
> Anybody have any more information, webpage, code? You probably already know: http://www.python.org/~jeremy/weblog/031209a.html ? Doesn't contain a lot of information but points to a mail by Jim. -- Christian Tanzer http://www.c-tanzer.at/ From hpk at trillke.net Thu Dec 11 11:54:06 2003 From: hpk at trillke.net (holger krekel) Date: Thu, 11 Dec 2003 11:54:06 +0100 Subject: [pypy-dev] IronPython In-Reply-To: ; from tanzer@swing.co.at on Thu, Dec 11, 2003 at 09:03:57AM +0100 References: <200312110749.hBB7nwWF019788@ratthing-b246.strakt.com> Message-ID: <20031211115406.D15957@prim.han.de> [Christian Tanzer Thu, Dec 11, 2003 at 09:03:57AM +0100] > > > Alex Martelli tells me that Jim Hugunin has ported Python to .NET > > and his new Python port is 70% faster than CPython 2.3 on pystones. > > Anybody have any more information, webpage, code? > > You probably already know: > > http://www.python.org/~jeremy/weblog/031209a.html ? > > Doesn't contain a lot of information but points to a mail by Jim. However, it certainly sounds promising. Apart from generating LLVM-bytecode we should definitely consider generating IL (the intermediate language from .NET). Jim also mentions the following bit: The current version of IronPython doesn't use any type inference or partial evaluation to improve performance. I guess we could help there ... cheers, holger From rxe at ukshells.co.uk Fri Dec 12 00:11:30 2003 From: rxe at ukshells.co.uk (Richard Emslie) Date: Thu, 11 Dec 2003 23:11:30 +0000 (GMT) Subject: [pypy-dev] Questions In-Reply-To: <20031209231859.X15957@prim.han.de> References: <20031209231859.X15957@prim.han.de> Message-ID: Hi Holger, On Tue, 9 Dec 2003, holger krekel wrote: > Hi Richard, > > [Richard Emslie Tue, Dec 09, 2003 at 04:04:35PM +0000] > > I've been reading through the source code and the docs, and getting some > > gist of what is going on.
I guess I was expecting to see something > > more like the CPython code but in python (like why do we have different > > object spaces, although I see the errors of my ways now :-) ) and was > > failing to understand the big picture. > > understandable. Reverse engineering documentation from plain code > is not always easy :-) :-) Thanks Holger for great responses... it has certainly cleared up a few things. One thing that is really interesting in understanding PyPy thus far, is that the puzzle has two sides; how does it work and why is it done in such a way. For instance we can count 10 different types of frame object in the interpreter and stdobjspace. What would be a really nice part of the architecture introduction (although I imagine there are other, better ideas) - is to step through a few simple code examples running in an initialised stdobjspace "interactive.py" session, describing the various object creation/interactions on the way (ExecutionContext, Code, Frame, objects) and how method dispatching to the object spaces flows from Code/Frames. And then some idea of how the current bootstrapping is working for stdobjspace (see * below). It might serve as a nice basis for documenting too (yup I'm volunteering :-)) > > > So reading between the lines, does this sound anything quite like what we > > are trying to achieve... > > > > The abstraction of the object spaces is so we can perform abstract > > interpretation with one set, a working interpreter with another, a minimal > > interpreter with another, and goodness knows what else ;-) > > right. > > > So to create our initial interpreter, we take the interpreter code, multi-method > > dispatcher and the standard object space and we can abstractly interpret > > with the interpreter/flow object space/annotation. > > yes, more precisely the interpreter/flowobjspace combination should be > able to perform abstract interpretation on any RPython program.
RPython > is our acronym for "not quite as dynamic as python". But note that > we basically allow *full dynamism* including metaclasses and all the > fancy stuff during *initialization* of the interpreter and its object > spaces. Only when we actually interpret code objects from an > app-level program do we restrict the involved code to be RPythonic. > That explains a lot, I was ironically starting to think RPython is really very dynamic, but after the dust settles I guess that's it. I am assuming therefore on the call to initialize() [do europeans generally follow american spelling? ;-)] we are free to do all sorts of dynamic manipulation to our classes and objects - However, during the course of building the sys module & builtins (*) we seem to start interpreting some bytecodes!! How is that possible if we don't have any object spaces ready to act on? > The interpreter/flowobjspace combination will start abstract > interpretation on some initial function object, say e.g. frame.run(). > The frame and the bytecode/opcode implementations it invokes will work > with e.g. the StdObjSpace. The flowobjspace doesn't care which > objspace the frame/opcodes execute on. The flowobjspace and its interpreter > instance don't care if they run on something other than pypy :-) > > Actually thinking in more detail about this will probably lead us into the > still muddy waters of the whole bootstrapping process but let's not get > distracted here :-) Do you mean what was described above with the bytecode being interpreted before initialisation is complete - or are we talking about memory management, internal representation of basic object types in the object space (lists, ints, floats), system calls (blocking/nonblocking), system resources (file descriptors), garbage collection and whatnot. Ok, let's not get distracted...
:-) > > That stage involves > > building up a set of basic blocks, building a flow graph, type inference > > and then translating (sorry I get a bit lost here with what happens where, > > ie when does the flow object space stop and annotation start, but the > > answer to that one is to read more code ;-) ) to pyrex/CL/other low level > > code. > > exactly. > > > Does that sound about right so far? Then do either of these make sense > > (purely speculation... and most likely nonsense) > > > > Also if we write the flow object space and annotation in RPython we can > > pipe that through itself, to generate low level code too. Now my main > > question is - how do we combine the two object spaces such that we do > > abstract interpretation and annotation in a running interpreter (also I > > guess we would either need some very low level translation, ie machine > > code or some LLVM like architecture to do this?) > > (first: see my above reference of muddy waters :-) > > In theory, we can annotate/translate flowobjspace itself, thus producing > a low-level (pyrex/lisp/c/llvm) representation of our abstract > interpretation code. When executing this lower-level representation > on ourself again we should produce the same representation we are > currently running. Yes, I see now. For some reason I thought they would be different. > I think this is similar to the 3-stage gcc building > process: First it uses some external component to build itself > (stage1). It uses stage1 to compile itself again to stage2. It then uses > stage2 to recompile itself again to stage3 and sees if it still works. > Thus the whole program serves as a good testbed if everything works right. Funny I used to compile twice doing stage 1 and 2 manually back when redhat were producing buggy versions, if I only knew!
;-) > > > Once we have broken the interpreter - standard object space into a finite - > > into a set of blocks and graph, and translate those blocks into low level > > code - we could view any python bytecode operating on this as a traversal > > over the blocks. > > Hmm, yes i think that's right although i would rephrase a bit: the flowgraph > obtained from abstract interpretation is just another representation of a/our > python program. Code objects (which contain the bytecodes) are > themselves a representation of python source text. It does have other cool implications if we have a low enough translation language we could do away with stacks and frames for execution... :-) > > The flowgraph of course provides a lot of interesting information (like > all possible code paths and low-level identification of variable state) > and makes it explicitly available for annotation and translation. > Btw, at the moment annotation just *uses* the flowgraph but not the > other way round. (In the future we might want to drive them more in > parallel in order to allow the flowobjspace code to consult the > annotation module. Then the flowgraph code could possibly avoid > producing representations where annotation/type inference is no longer > able to produce exact types). > Can I ask the silly question of what does annotation actually mean? Is it separate from type inference? Don't really follow the parallel part. With RPython are we assuming that we can always produce exact types? Is the idea for non-deterministic points (ie nodes where we cannot infer the types) to be revealed and then propagated up the graph to the highest node where it first can be determined, and to create a new snapshot of nodes when any new type enters that point and translate, adding caching so we don't have to recreate the snapshot/translation each time (high chance it is going to be the same type)?
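Richard's question about what annotation means can be illustrated with a toy sketch: annotation attaches inferred facts (here simply: types) to flow-graph variables and propagates them until nothing changes. This is purely illustrative; PyPy's annotator is far more elaborate and works on real flow graphs:

```python
# Toy sketch of annotation as type propagation to a fixed point.
# Variables and operations are plain names/tuples for illustration.

def annotate(operations, types):
    """operations: (opname, argument names, result name) triples;
    types: initial {variable name: type} bindings, updated in place."""
    changed = True
    while changed:
        changed = False
        for opname, args, result in operations:
            # if both arguments of an 'add' are known ints, the
            # result is an int too
            if opname == 'add' and all(types.get(a) is int for a in args):
                if types.get(result) is not int:
                    types[result] = int
                    changed = True
    return types

# "v1 = x + y; v2 = v1 + x" with x and y known to be ints:
ops = [('add', ('x', 'y'), 'v1'),
       ('add', ('v1', 'x'), 'v2')]
inferred = annotate(ops, {'x': int, 'y': int})
print(inferred['v2'])
```

The fixed-point loop is what makes the propagation order irrelevant: facts keep flowing through the graph until no operation can refine any binding further.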
> > Therefore we could create a new flow graph from this > > traversal, and feed it into some LLVM like architecture which does the low > > level translation and optimisation phase for us?? > > There is no need to take this double-indirection. We can produce LLVM > bytecode directly from python-code with a specific translator (similar to > genpyrex/genclisp). We could translate ourselves to make this faster, of > course. For merging Psyco techniques we will probably want to rely on something > like LLVM to do this dynamically. Generating C-code is usually a pretty > static thing and cannot easily be done at runtime. :-) Yes not the best way with the double indirection. > > Thanks for any feedback... :-) > > you are welcome. Feel free to followup ... > Yes thanks again! Looking forward to next week... :-) Cheers, Richard From tismer at tismer.com Fri Dec 12 01:10:08 2003 From: tismer at tismer.com (Christian Tismer) Date: Fri, 12 Dec 2003 01:10:08 +0100 Subject: [pypy-dev] Questions In-Reply-To: References: <20031209231859.X15957@prim.han.de> Message-ID: <3FD90760.4020707@tismer.com> Richard Emslie wrote: ... > That explains a lot, I was ironically starting to think RPython is really > very dynamic, but after the dust settles I guess that's it. I am assuming > therefore on the call to initialize() [do europeans generally follow > american spelling? ;-)] we are free to do all sorts of dynamic > manipulation to our classes and objects - However, during the course of > building the sys module & builtins (*) we seem to start interpreting some > bytecodes!! How is that possible if we don't have any object spaces ready > to act on? It does not really matter, given that you crank the thing up *somehow*. During that phase, you can use all and everything from Python, you are just supposed to create the modules, objectspace and such, and produce the needed code objects. Once you are done, you want to say "this is now my Python, and it is running upon RPython, only". That's the point. At that moment, you need to forget all tricks and stuff that you used during the bootstrap phase. Now the RPython rules must be obeyed. Only code objects with those properties may now be visible. Reason? Well, you can run this interpreter now, but that's not the point. The point is, from the current compiled RPython bytecode, you can create a new source. The frozen source of this interpreter. And since it is RPython, you can generate efficient code, like the efficient code which we know: the C code. Well, and this is done by running the flow object space on top of that. It gives all the information to create efficient code. Dunno if that helps and if I was on track ;-) ciao - chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break!
Take a ride on Python's Johannes-Niemeyer-Weg 9a : *Starship* http://starship.python.net/ 14109 Berlin : PGP key -> http://wwwkeys.pgp.net/ work +49 30 89 09 53 34 home +49 30 802 86 56 mobile +49 173 24 18 776 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? http://www.stackless.com/ From tismer at tismer.com Tue Dec 16 03:25:57 2003 From: tismer at tismer.com (Christian Tismer) Date: Tue, 16 Dec 2003 03:25:57 +0100 Subject: [pypy-dev] Does anybody really use frame->f_tstate ? Message-ID: <3FDE6D35.3090100@tismer.com> Hi colleagues, this is my second attempt to get rid of the f_tstate field in frames. I need to find every user of this field. What am I talking about? Well, Python always has a thread state variable which is a unique handle to the current thread. This variable is accessed in many places, and there exists a fast macro to get at it. Every executing Python frame also gets a copy on creation. In some cases, this frame->f_tstate field is used, in other cases the current tstate variable is used. If this sounds foreign to you, please stop reading here. ------------------------------------------------------------- I would like to get rid of the frame->f_tstate, and I'm trying to find out if there is a need for it. I don't need it; for Stackless, it is the opposite: it gets in the way. There was a small thread about this in June this year, where Guido convinced me that it is possible to create a traceback on a frame that comes from a different thread. http://mail.python.org/pipermail/python-dev/2003-June/036254.html Ok, this is in fact possible, although I don't think anybody has a need for this. My question to all extension writers is this: If you use frame->f_tstate at all, do you use it just because it is handy, or do you want to use it for some other purpose? One purpose could be that you really want to create a traceback on a thread other than the current one.
I have never seen this, but who knows, so that's why I'm asking the Python world. In most cases, a traceback will be created on a frame that is currently being processed or has just been processed. Accessing a frame of a different thread that is being processed might make sense for special debugger cases. My proposal is -------------- a) change the semantics of PyTraceBack_Here to use the current tstate. b) if such a special purpose exists, create a new function for it. c) if urgent, different needs exist to keep f_tstate, then let's forget about this proposal. Especially for Stackless, I'd be keen on getting rid of this. thanks for input -- chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Johannes-Niemeyer-Weg 9a : *Starship* http://starship.python.net/ 14109 Berlin : PGP key -> http://wwwkeys.pgp.net/ work +49 30 89 09 53 34 home +49 30 802 86 56 mobile +49 173 24 18 776 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today?
http://www.stackless.com/ From mwh at python.net Tue Dec 16 12:37:37 2003 From: mwh at python.net (Michael Hudson) Date: Tue, 16 Dec 2003 11:37:37 +0000 Subject: [pypy-dev] Re: [pypy-svn] rev 2354 - pypy/trunk/src/pypy/objspace/std/test In-Reply-To: <20031216110649.54C505BFB1@thoth.codespeak.net> (alex@codespeak.net's message of "Tue, 16 Dec 2003 12:06:49 +0100 (MET)") References: <20031216110649.54C505BFB1@thoth.codespeak.net> Message-ID: <2m7k0xhu2m.fsf@starship.python.net> alex at codespeak.net writes: > Author: alex > Date: Tue Dec 16 12:06:48 2003 > New Revision: 2354 > > Modified: > pypy/trunk/src/pypy/objspace/std/test/test_dictobject.py > Log: > Added a test for dicts' str representation > > > Modified: pypy/trunk/src/pypy/objspace/std/test/test_dictobject.py > ============================================================================== > --- pypy/trunk/src/pypy/objspace/std/test/test_dictobject.py (original) > +++ pypy/trunk/src/pypy/objspace/std/test/test_dictobject.py Tue Dec 16 12:06:48 2003 > @@ -303,6 +303,12 @@ > self.assertEqual(bool, True) > bool = d1 < d4 > self.assertEqual(bool, False) > + > + def test_str_repr(self): > + self.assertEqual('{}', str({})) > + self.assertEqual('{1: 2}', str({1: 2})) > + self.assertEqual("{'ba': 'bo'}", str({'ba': 'bo'})) > + self.assertEqual("{1: 2, 'ba': 'bo'}", str({1: 2, 'ba': 'bo'})) Are these really good tests? They would seem to depend on hashtable ordering... Cheers, mwh -- I never realized it before, but having looked that over I'm certain I'd rather have my eyes burned out by zombies with flaming dung sticks than work on a conscientious Unicode regex engine. 
-- Tim Peters, 3 Dec 1998 From sschwarzer at sschwarzer.net Tue Dec 16 17:53:48 2003 From: sschwarzer at sschwarzer.net (Stefan Schwarzer) Date: Tue, 16 Dec 2003 17:53:48 +0100 Subject: [pypy-dev] Inactive code in pypy/tool/test.py Message-ID: <6AF13EE8-2FE8-11D8-A63B-000A95AAF5B8@sschwarzer.net> In pypy/tool/test.py, class MyTextTestResult, method interact, some code follows a return statement and thus is inactive. Was the return statement left in accidentally, or what's going on here? :-) Stefan From mwh at python.net Tue Dec 16 18:13:35 2003 From: mwh at python.net (Michael Hudson) Date: Tue, 16 Dec 2003 17:13:35 +0000 Subject: [pypy-dev] Re: Inactive code in pypy/tool/test.py References: <6AF13EE8-2FE8-11D8-A63B-000A95AAF5B8@sschwarzer.net> Message-ID: <2mu140heio.fsf@starship.python.net> Stefan Schwarzer writes: > In pypy/tool/test.py, class MyTextTestResult, method interact, some > code follows a return statement and thus is inactive. Was the return > statement left in accidentally or what's going on here? :-) I think the dead code is really dead. Cheers, mwh -- In many ways, it's a dull language, borrowing solid old concepts from many other languages & styles: boring syntax, unsurprising semantics, few automatic coercions, etc etc. But that's one of the things I like about it. -- Tim Peters, 16 Sep 93 From sschwarzer at sschwarzer.net Thu Dec 18 13:00:50 2003 From: sschwarzer at sschwarzer.net (Stefan Schwarzer) Date: Thu, 18 Dec 2003 13:00:50 +0100 Subject: [pypy-dev] Package directory test conflicts with module file test.py Message-ID: In src/pypy/tool there's a file test.py. As all test directories are called "test" (and I want to add one), this would conflict with the existing module. For example, if you say from pypy.tool.test import main the package test's __init__.py would be loaded, but that doesn't contain a function "main". ;-) My suggestion is to rename test.py to something else. Suggestions for the name?
Stefan From lac at strakt.com Thu Dec 18 14:15:03 2003 From: lac at strakt.com (Laura Creighton) Date: Thu, 18 Dec 2003 14:15:03 +0100 Subject: [pypy-dev] Package directory test conflicts with module file test.py In-Reply-To: Message from Stefan Schwarzer of "Thu, 18 Dec 2003 13:00:50 +0100." References: Message-ID: <200312181315.hBIDF33Q006488@ratthing-b246.strakt.com> In a message of Thu, 18 Dec 2003 13:00:50 +0100, Stefan Schwarzer writes: >In src/pypy/tool there's a file test.py. As all test directories >are called "test" (and I want to add one), this would conflict >with the existing module. > >For example, if you say > >from pypy.tool.test import main > >the package test's __init__.py would be loaded, but that doesn't >contain a function "main". ;-) > >My suggestion is to rename test.py to something else. Suggestions >for the name? testit.py? > >Stefan >_______________________________________________ >pypy-dev at codespeak.net >http://codespeak.net/mailman/listinfo/pypy-dev From arigo at tunes.org Thu Dec 18 21:34:16 2003 From: arigo at tunes.org (Armin Rigo) Date: Thu, 18 Dec 2003 20:34:16 +0000 Subject: [pypy-dev] Built-in modules at app-level Message-ID: <20031218203416.GA17841@vicky.ecs.soton.ac.uk> Hello world, Just a fast note about what we want to do at some point: write all built-in modules at application-level. Example: file module/__builtin__.py (or appspace/__builtin__.py):

    """ Built-in functions, exceptions, and other objects. """

    def sum(sequence, total=0):
        for item in sequence:
            total = total + item
        return total

    def _interp_pow(self, w_base, w_exponent, w_modulus=None):
        if w_modulus is None:
            w_modulus = self.space.w_None
        return self.space.pow(w_base, w_exponent, w_modulus)

So this is essentially the same idea as "def app_xxx" at interpreter-level, but the other way around.
It is just much more clear to write __builtin__ as above: everything is "normal", works as expected, with the exception of interpreter-level functions which serve as "hooks". Turning an "_interp_xxx" function into a built-in function is a hack. The cleaner and more powerful way to do it is quite funny:

    pow = escape("""
    def pow(self, w_base, w_exponent, w_modulus=None):
        if w_modulus is None:
            w_modulus = self.space.w_None
        return self.space.pow(w_base, w_exponent, w_modulus)
    """)

which is very nice because you can build the string with %s or whatever at application-level :-) The point is that the string is executed at interp-level, which means that escape() can only be used during initialization. After that, the translator will see a normal built-in function, and translate it into C code. A bientot, Armin. From tismer at tismer.com Fri Dec 19 10:32:37 2003 From: tismer at tismer.com (Christian Tismer) Date: Fri, 19 Dec 2003 10:32:37 +0100 Subject: [pypy-dev] Last chance! (was: Does anybody really use frame->f_tstate ?) In-Reply-To: <3FDE6D35.3090100@tismer.com> References: <3FDE6D35.3090100@tismer.com> Message-ID: <3FE2C5B5.8080208@tismer.com> Dear Python community, since I didn't get *any* reply to this request, either the request was bad or there is really nobody using f_tstate in a way that makes it urgent to keep. I will wait a few hours and then make the change to Stackless, and I'd like to propose to do the same to the Python core. Christian Tismer wrote: > Hi colleagues, > > this is my second attempt to get rid of the f_tstate field > in frames. I need to find every user of this field. > > What am I talking about? > Well, Python always has a thread state variable which > is a unique handle to the current thread. This variable > is accessed in many places, and there exists a fast macro > to get at it. > Every executing Python frame also gets a copy on creation.
> In some cases, this frame->f_tstate field is used, > in other cases the current tstate variable is used. > > If this sounds foreign to you, please stop reading here. > > ------------------------------------------------------------- > > I would like to get rid of the frame->f_tstate, and I'm trying > to find out if there is a need for it. I don't need it, > for Stackless, it is the opposite, it disturbs. > > There was a small thread about this in June this year, where > Guido convinced me that it is possible to create a traceback > on a frame that comes from a different thread. > > http://mail.python.org/pipermail/python-dev/2003-June/036254.html > > Ok, this is in fact possible, although I don't think > anybody has a need for this. > > My question to all extension writers is this: > If you use frame->f_tstate at all, do you use it just > because it is handy, or do you want to use it for > some other purpose? > > One purpose could be that you really want to create a traceback > on a different than the current thread. I have never seen this, > but who knows, so that's why I'm asking the Python world. > > In most cases, a traceback will be created on a frame > that is currently processd or just has been processed. > Accessing a frame of a different thread that is being processed > might make sense for special debugger cases. > > My proposal is > -------------- > > a) change semantics of PytraceBack_Here to use the current tstate. > > b) if such a special purpose exists, create a new function for it. > > c) if urgent, different needs exist to keep f_tstate, > then let's forget about this proposal. > > Especially for Stackless, I'd be keen of getting rid of this. > > thanks for input -- chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! 
Take a ride on Python's Johannes-Niemeyer-Weg 9a : *Starship* http://starship.python.net/ 14109 Berlin : PGP key -> http://wwwkeys.pgp.net/ work +49 30 89 09 53 34 home +49 30 802 86 56 mobile +49 173 24 18 776 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? http://www.stackless.com/ From lac at strakt.com Fri Dec 19 11:34:47 2003 From: lac at strakt.com (Laura Creighton) Date: Fri, 19 Dec 2003 11:34:47 +0100 Subject: [pypy-dev] Andy Robinson wants to pen in a day for the ACCU Python-UK conference Message-ID: <200312191034.hBJAYlTc008948@ratthing-b246.strakt.com> 14-17 April. Jacob and I cannot attend this, alas. Laura http://www.accu.org/conference/ ------- Forwarded Message Replied: "Andy Robinson" Return-Path: andy at reportlab.com From: "Andy Robinson" To: "Laura Creighton" Subject: RE: ACCU 2004 Date: Fri, 19 Dec 2003 09:59:46 -0000 We're trying to have a day on all the 'new implementations of python' at ACCU - Jython, .NET, PyPy, Psyco, Stackless etc. It occurs to me the relevant people are probably with you now. I'd LOVE to wrap it up with an overview talk on how they will all converge one day, if that's still the plan. Can you ask (a) if anyone could speak and (b) if they want a sprint in Oxford that week? I can look into sprint facilities in January if it helps... - - Andy ------- End of Forwarded Message From lac at strakt.com Sat Dec 20 11:30:43 2003 From: lac at strakt.com (Laura Creighton) Date: Sat, 20 Dec 2003 11:30:43 +0100 Subject: [pypy-dev] heard at the Sprint ... Message-ID: <200312201030.hBKAUhRh011870@ratthing-b246.strakt.com> Alex: What? assert 1 is 0 is True for sufficiently large values of 0 ?! (It was a bad test, not a problem with 0L conversion ....) 
From hpk at trillke.net Sun Dec 21 21:24:10 2003 From: hpk at trillke.net (holger krekel) Date: Sun, 21 Dec 2003 21:24:10 +0100 Subject: [pypy-dev] amsterdam sprint reports (20th december 2003) Message-ID: <20031221212410.H15957@prim.han.de> Hello PyPy, the Amsterdam sprint has just finished and here is a report and some surrounding and outlook information. As usual please comment/add/correct me - especially the sprinters. I also wouldn't mind some discussion of what and how we could do things better at the next sprint. First of all, big thanks to *Etienne Posthumus* who patiently helped organizing this sprint even though he had to deal with various other problems at the same time. Before i start with details i recommend reading through the new Architecture document at http://codespeak.net/pypy/index.cgi?doc/architecture.html in case you don't know what i am talking about regarding the Amsterdam sprint results :-) Originally, we intended to go rather directly for translation and thus for a first release of PyPy. But before the sprint we decided to go about it differently, not only because Michael Hudson and Christian Tismer had to cancel their participation but also because we wanted to give a smooth introduction to the new developers attending the sprint. Therefore we didn't press very hard on translation and type inference, and major surprises were awaiting us anyway ... fixing lots and lots of bugs, adding more builtins and more introspection ------------------------------------------------------------------------- On this front mainly Alex Martelli, Patrick Maupin, Laura Creighton and Jacob Hallen added and fixed a lot of builtins and modules and made it possible to run - among other modules - the pystone benchmark: on most machines we have more than one pystone with PyPy already :-) While trying to get 'long' objects working Armin and Samuele realized that the StdObjSpace multimethod mechanism now desperately needs refactoring.
Thus the current "long" support is just another hack (TM) which nevertheless allows executing more of CPython's regression tests. In a related effort, Samuele and yours truly made introspection of frames, functions and code objects compatible with CPython so that the "dis.dis(dis.dis)" goal finally works, i.e. it can be run through PyPy/StdObjSpace. This is done by the so-called "pypy_" protocol which an object space uses to delegate operations on core execution objects (functions, frames, code ...) back to the interpreter. redefining our standard type system at application level -------------------------------------------------------- Originally we thought that we could more or less easily redefine the python type objects at application level and let them access interpreter level objects and implementations via some hook. This turned out to be a bootstrapping nightmare (e.g. in order to instantiate classes you need type objects already, but actually we want our first classes to define exactly those). While each particular problem could be worked around somehow, Armin and Samuele realized they were opening a big can of worms ... and couldn't close it in due time. The good news is that after lots of discussions and tossing ideas around we managed to develop a new approach (see the end of the report) which raised our hopes that we can finally define the types at application level and thus get rid of the ugly and hard-to-understand interpreter-level implementation. Improving tracing and debugging of PyPy --------------------------------------- With PyPy you often get long tracebacks and other problems which sometimes make debugging hard. Richard Emslie, Tomek Meka and I implemented a new Object Space called "TraceObjSpace" which can wrap the Trivial and Standard Object Space and will trace all objectspace operations as well as frame creation into a long list of events.
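The wrapping idea behind such a trace space can be sketched generically: a proxy that intercepts every operation, delegates it to the wrapped space, and records an event. The names below (TraceSpace, ToySpace) are illustrative stand-ins for this sketch, not PyPy's actual classes:

```python
class TraceSpace:
    """Minimal sketch of an object-space wrapper that logs every
    delegated operation into an event list."""

    def __init__(self, wrapped):
        self._wrapped = wrapped
        self.events = []          # list of (opname, args, result) tuples

    def __getattr__(self, name):
        # Only called for attributes not found on TraceSpace itself,
        # so _wrapped and events are looked up normally.
        op = getattr(self._wrapped, name)
        if not callable(op):
            return op

        def traced(*args, **kwargs):
            result = op(*args, **kwargs)
            self.events.append((name, args, result))
            return result
        return traced


class ToySpace:
    """Trivial stand-in for a real object space."""
    def add(self, a, b):
        return a + b


space = TraceSpace(ToySpace())
assert space.add(2, 3) == 5
assert space.events == [('add', (2, 3), 5)]
```

The same event list could then feed a tool like traceinteractive.py, which replays the recorded operations in a readable form.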
Richard then in a nightly hotel session wrote "tool/traceinteractive.py" which will nicely reveal what is going on if you execute python statements: which frames are created, which bytecodes are executed and what object space operations are involved. Just execute traceinteractive.py (with python 2.3) and type some random function definition and see PyPy's internals at work ... It only works with python 2.3 because we had to rewrite Python's dis module to allow programmatic access to disassembling byte codes. And this module has considerably changed from python 2.2 to 2.3 (thanks, Michael :-) "finishing" the Annotation refactoring -------------------------------------- That should be easy, right? Actually Guido van Rossum and Armin had started doing type inference/annotation in Belgium just before EuroPython and we have since refactored it already at the Berlin sprint and in between the sprints several times. But it turned out that Annotations as we did them are *utterly broken* in that we tried to build too general a system (we had been talking about general inference engines and such), thus making "merging" of two annotations very hard to do in a meaningful way. But after being completely crushed on one afternoon, Samuele and Armin came up with a new *simpler* approach that ... worked and promises to not have the same flaws. It got integrated into the translator already and appears to work nicely. I think this is the fourth refactoring of just "three files" and, of course, we already have the 'XXX' virus spreading again :-) refactoring/rewriting the test framework ---------------------------------------- PyPy has an ever-increasing test-suite which requires a lot of flexibility that the standard unittest.py module just doesn't provide.
Currently, we have in tool/test.py and interpreter/unittest_w.py a collection of more or less weird hacks to make our (now over 500) tests run either at interpreter level or at application level, which means they are actually interpreted by PyPy. Tests from both levels can moreover be run with different object spaces. Thus Stefan Schwarzer and I came up with a rewrite of unittest.py which is currently in 'newtest.py'. During my train ride back to Germany i experimentally used our new approach, which made our tests run around 30% faster (!) as a side effect. More about this in separate mails as this is - as almost every other area of PyPy - refactoring-in-progress. Documentation afternoon ----------------------- We (apart from Richard who had a good excuse :-) also managed on Friday to devote a full afternoon to documentation. There is now an emerging "architecture" document, a "howtopypy" (getting started) and a "goals" document here: http://codespeak.net/pypy/index.cgi?doc Moreover, we deleted misleading or redundant wiki-pages. In case you miss one of them you can still access them through the web by following "Info on this page" and "revision history". We also had a small lecture from Theo de Ridder who dived into our flowgraph and came up with some traditional descriptions from compiler theory to describe what we are doing. He also inspired us with some other nice ideas and we certainly hope he revisits the project and continues to try to use it for his own purposes. There of course is no question that we still need more high-level documentation. Please don't use the wiki for serious documentation but make ReST-files in the doc-subtree. I guess that we will make the "documentation afternoon" a permanent event on our sprints. Towards more applevel code ... ------------------------------ As mentioned before, the approach of defining the python types at application level didn't work out as easily as hoped.
But luckily, we had - again in some pub - the rescuing idea: a general mechanism that lets us trigger "exec/eval" of arbitrary interpreter level code given as a string. Of course this by itself is far too dynamic to be translatable, but remember: we can perform *arbitrarily dynamic* pythonic tricks while still *initializing* object spaces and the interpreter. Translation will start with executing the initialized interpreter/objspace through another interpreter/flowobjspace instance. Some hacking on the last day showed that this new approach makes the definition of "builtin" modules a lot more pythonic: modules are not complicated class instances anymore but really look like a normal module with some "interpreter level escapes". It appears now that in combination with some other considerations we will finally get to "types.py" defining the python types, thus getting rid of the cumbersome 10-12 *type.py files in objspace/std. There are still some XXX's to fight, though. Participants ------------ Patrick Maupin Richard Emslie Stefan Schwarzer Theo de Ridder Alex Martelli Laura Creighton Jacob Hallen Tomek Meka Armin Rigo Guenter Jantzen Samuele Pedroni Holger Krekel and Etienne Posthumus who made our Amsterdam Sprint possible. outlook, what comes next? ------------------------- At the Amsterdam sprint we realized, maybe more than ever, how strongly refactoring is the key development activity in PyPy. Watch those "XXX" :-) Some would argue that we should instead think more about what we are doing, but then you wouldn't call that extreme programming, would you? However, we haven't fixed a site and date for the next sprint, yet. We would like to do it sometime in February in Switzerland on some nice mountain, but no nice facility has emerged yet. Later in the year we might be able to do a sprint in Dublin and of course one right before EuroPython in Sweden. Btw, if someone wants to offer help organizing a sprint, feel free to contact us.
Also there was some talk on IRC that we might do a "virtual sprint" so that our non-European developers can more easily participate. This would probably mean doing screen-sessions and using some Voice-over-IP technology ... we'll see what will eventually evolve. After all, we might also soon get information from the EU regarding our recent proposal, which should make sprint planning easier in the long run. We'll see. For now i wish everyone some nice last days of the year which has been a fun one regarding our ongoing pypy adventure ... cheers, holger (who hopes he hasn't forgotten someone or something ...) From aleaxit at yahoo.com Mon Dec 22 01:32:55 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Mon, 22 Dec 2003 01:32:55 +0100 Subject: [pypy-dev] help fully fixing generators...? Message-ID: <200312220132.55297.aleaxit@yahoo.com> enumerate currently does not work -- and it does not work because generators (which enumerate, implemented at app level, uses) don't behave correctly when a StopIteration is raised -- they propagate it (propagate the OperationError, at interpreter level) rather than turning it into a "clean return from the generator" (a SGeneratorReturn, at interpreter level). I easily fixed it for a simple "raise StopIteration" in the generator itself, and committed that change (as well as the unit-test showing the bug existed), but I'm not sure where to edit to fix it when the StopIteration is being PROPAGATED from a call to something.next() in the generator, which is what enumerate needs. I haven't checked in a unit test showing the problem, but, basically, the issue is something like:

    def agen():
        it = iter('ciao')
        while 1:
            yield it.next()

    print list(agen())

in CPython this works fine, but in pypy it currently doesn't (yet). Suggestions, anybody? Alex From arigo at tunes.org Mon Dec 22 15:49:12 2003 From: arigo at tunes.org (Armin Rigo) Date: Mon, 22 Dec 2003 14:49:12 +0000 Subject: [pypy-dev] help fully fixing generators...?
In-Reply-To: <200312220132.55297.aleaxit@yahoo.com> References: <200312220132.55297.aleaxit@yahoo.com> Message-ID: <20031222144912.GA15092@vicky.ecs.soton.ac.uk> Hello Alex, On Mon, Dec 22, 2003 at 01:32:55AM +0100, Alex Martelli wrote: > easily fixed it for a simple "raise StopIteration" in the generator itself, > and committed that change (as well as the unit-test showing the bug > existed), but I'm not sure where to edit to fix it when the StopIteration > is being PROPAGATED from a call to something.next() in the generator, which > is what enumerate needs. The whole business of StopIteration vs. the interp-level "NoValue" exception is unclean, but your hack to change the semantics of the "raise" statement is worse :-) Instead, you should catch the OperationError in pypy_next(), in the call to Frame.run(), and turn it into a NoValue. A bientot, Armin. From hpk at trillke.net Mon Dec 22 16:42:46 2003 From: hpk at trillke.net (holger krekel) Date: Mon, 22 Dec 2003 16:42:46 +0100 Subject: [pypy-dev] Re: [pypy-svn] rev 2674 - pypy/trunk/doc In-Reply-To: <20031222152621.56CBF5AAA7@thoth.codespeak.net>; from alex@codespeak.net on Mon, Dec 22, 2003 at 04:26:21PM +0100 References: <20031222152621.56CBF5AAA7@thoth.codespeak.net> Message-ID: <20031222164245.A29950@prim.han.de> Hi Alex and others, i think we should try to minimize the number of formatting changes not only in program code but also in ReST files, as it makes reading the diffs really hard because you basically have to reread the whole document in order to see the differences. As this has happened now several times i think it makes sense to think of some convention which allows the diffs to stay readable. Maybe (especially) ReST-changes should be done with a two-step approach: first the one where you can read the content diff and then the one (if necessary at all) which has purely formatting changes and says so in the commit message.
And we might also think about a common line width so that not everybody reformats the text again and again :-) cheers, holger [alex at codespeak.net Mon, Dec 22, 2003 at 04:26:21PM +0100] > Author: alex > Date: Mon Dec 22 16:26:20 2003 > New Revision: 2674 > ... From aleaxit at yahoo.com Mon Dec 22 17:14:17 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Mon, 22 Dec 2003 17:14:17 +0100 Subject: [pypy-dev] help fully fixing generators...? In-Reply-To: <20031222144912.GA15092@vicky.ecs.soton.ac.uk> References: <200312220132.55297.aleaxit@yahoo.com> <20031222144912.GA15092@vicky.ecs.soton.ac.uk> Message-ID: <200312221714.17838.aleaxit@yahoo.com> On Monday 22 December 2003 03:49 pm, Armin Rigo wrote: > Hello Alex, > > On Mon, Dec 22, 2003 at 01:32:55AM +0100, Alex Martelli wrote: > > easily fixed it for a simple "raise StopIteration" in the generator > > itself, and committed that change (as well as the unit-test showing the > > bug existed), but I'm not sure where to edit to fix it when the > > StopIteration is being PROPAGATED from a call to something.next() in the > > generator, which is what enumerate needs. > > The whole business of StopIteration vs. the interp-level "NoValue" > exception is unclean, but your hack to change the semantics of the "raise" > statement is worse :-) Instead, you should catch the OperationError in > pypy_next(), in the call to Frame.run(), and turn it into a NoValue. Aye aye, cap'n -- done, and removed the hook you disliked (now unneeded). test_all.py -S passes, including new unit tests regarding generators raising or propagating StopIteration exceptions. 
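The fix pattern the two mails converge on, catching the propagated interpreter-level exception at the single place where the generator frame is resumed and converting it into a clean "no more values" signal, can be sketched generically. NoValue, OperationError and pypy_next below are simplified stand-ins inspired by the names in the thread, not PyPy's actual code:

```python
class NoValue(Exception):
    """Interp-level signal: the generator is exhausted (clean return)."""

class OperationError(Exception):
    """Simplified stand-in for the interp-level wrapper around an
    app-level exception; here it just carries the app-level name."""
    def __init__(self, app_exc_name):
        self.app_exc_name = app_exc_name

def pypy_next(resume_frame):
    # Resume the generator frame exactly once.  An app-level
    # StopIteration, whether raised directly in the generator or merely
    # propagated from a something.next() call inside it, arrives here as
    # an OperationError and is translated into NoValue; every other
    # exception passes through untouched.
    try:
        return resume_frame()
    except OperationError as e:
        if e.app_exc_name == 'StopIteration':
            raise NoValue()
        raise
```

Translating at this one boundary covers both the direct and the propagated case, which is why it beats special-casing the "raise" statement.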
Alex From aleaxit at yahoo.com Mon Dec 22 17:52:40 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Mon, 22 Dec 2003 17:52:40 +0100 Subject: [pypy-dev] Re: [pypy-svn] rev 2674 - pypy/trunk/doc In-Reply-To: <20031222164245.A29950@prim.han.de> References: <20031222152621.56CBF5AAA7@thoth.codespeak.net> <20031222164245.A29950@prim.han.de> Message-ID: <200312221752.40026.aleaxit@yahoo.com> On Monday 22 December 2003 04:42 pm, holger krekel wrote: > Hi Alex and others, > > i think we should try to minimize the number of formatting changes not > only in program code but also in ReST files as it makes reading the diffs > really hard because you basically have to reread the whole document > in order to see the differences. As this has happened now several > times i think it makes sense to think of some convention which allows > the diffs to stay readable. > > Maybe (especially) ReST-changes should be done with a two-step approach: > first the one where you can read the content diff and then the one (if > necessary at all) which has purely formatting changes and says so in > the commit message. And we might also think about a common line width > so that not everybody reformats the text again and again :-) Yes, good points. I think line length of <80, indents of 4 spaces, no tabs, no trailing spaces allowed on any line under any pretext whatsoever, would be good conventions; I find it extremely hard to edit files (be they sources, ReST, or any other text) that fail to meet these conventions. Is there a way, with subversion, to have a simple conformance test for this kind of formatting parameters be run automatically, whenever a textfile is committed? If files breaking these conventions (or whatever conventions we can all agree on) were never committed, this would minimize the need for "changes that are purely related to formatting", I think.
Alex From hpk at trillke.net Mon Dec 22 18:07:58 2003 From: hpk at trillke.net (holger krekel) Date: Mon, 22 Dec 2003 18:07:58 +0100 Subject: [pypy-dev] Re: [pypy-svn] rev 2674 - pypy/trunk/doc In-Reply-To: <200312221752.40026.aleaxit@yahoo.com>; from aleaxit@yahoo.com on Mon, Dec 22, 2003 at 05:52:40PM +0100 References: <20031222152621.56CBF5AAA7@thoth.codespeak.net> <20031222164245.A29950@prim.han.de> <200312221752.40026.aleaxit@yahoo.com> Message-ID: <20031222180758.B29950@prim.han.de> Hi Alex, [Alex Martelli Mon, Dec 22, 2003 at 05:52:40PM +0100] > On Monday 22 December 2003 04:42 pm, holger krekel wrote: > > Hi Alex and others, > > > > i think we should try to minimize the number of formatting changes not > > only in program code but also in ReST files as it makes reading the diffs > > really hard because you basically have to reread the whole document > > in order to see the differences. As this has happened now several > > times i think it makes sense to think of some convention which allows > > the diffs to stay readable. > > > > Maybe (especially) ReST-changes should be done with a two-step approach: > > first the one where you can read the content diff and then the one (if > > necessary at all) which has purely formatting changes and says so in > > the commit message. And we might also think about a common line width > > so that not everybody reformats the text again and again :-) > > Yes, good points. I think line length of <80, indents of 4 spaces, no tabs, > no trailing spaces allowed on any line under any pretext whatsoever, > would be good conventions; I find it extremely hard to edit files (be they > sources, ReST, or any other text) that fail to meet these conventions. For python source files this is fine for me. I am not entirely sure about ReST, especially since you can have verbatim blocks (say, an excerpt from a mail).
> Is there a way, with subversion, to have a simple conformance test for this > kind of formatting parameters be run automatically, whenever a textfile is > committed? If files breaking these conventions (or whatever conventions > we can all agree on) were never committed, this would minimize the need > for "changes that are purely related to formatting", I think. Indeed. A pre-commit hook is able to perform checks like this, although the details might get hairy if you go for "indentation of four-spaces" enforcement (e.g. considering docstrings). I guess i'd go for simplicity and just enforce "line-length < 80" and svn:eol-style==native for all *.py and *.txt files (unless their svn:mime-type is explicitly non-text or some such). Non-conforming commits would simply be refused and nothing would be changed on the fly. Opinions? cheers, holger From aleaxit at yahoo.com Mon Dec 22 18:17:12 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Mon, 22 Dec 2003 18:17:12 +0100 Subject: [pypy-dev] Re: [pypy-svn] rev 2674 - pypy/trunk/doc In-Reply-To: <20031222180758.B29950@prim.han.de> References: <20031222152621.56CBF5AAA7@thoth.codespeak.net> <200312221752.40026.aleaxit@yahoo.com> <20031222180758.B29950@prim.han.de> Message-ID: <200312221817.12448.aleaxit@yahoo.com> On Monday 22 December 2003 06:07 pm, holger krekel wrote: ... > > > several times i think it makes sense to think of some convention which > > > allows the diffs to stay readable. > > > > > > Maybe (especially) ReST-changes should be done with a two-step > > > approach: first the one where you can read the content diff and then > > > the one (if necessary at all) which has purely formatting changes and > > > says so in the commit message. And we might also think about a common > > > line width so that not everybody reformats the text again and again :-) > > > > Yes, good points.
I think line length of <80, indents of 4 spaces, no > > tabs, no trailing spaces allowed on any line under any pretext whatsoever, > > would be good conventions; I find it extremely hard to edit files (be > > they sources, ReST, or any other text) that fail to meet these > > conventions. > > For python source files this is fine for me. I am not entirely sure for > ReST especially since you can have verbatim-blocks (say an excerpt from a > mail). In this case, the checker should be smarter -- know something about ReST syntax, if it's crucial to allow "verbatim blocks" to break what would otherwise be the conventions (particularly if we enforce them, see below). > > Is there a way, with subversion, to have a simple conformance test for > > this kind of formatting parameters be run automatically, whenever a > > textfile is committed? If files breaking these conventions (or whatever > > conventions we can all agree on) were never committed, this would > > minimize the need for "changes that are purely related to formatting", I > > think. > > Indeed. A pre-commit hook is able to perform checks like this although > the details might get hairy if you go for "indentation of four-spaces" > enforcement (e.g. considering docstrings) . I guess i'd go for simplicity I would agree, as I don't see the need for "verbatim blocks" in our docs, but you're the one who brought the subject up so I guess you must have something in mind about them? As for Python's docstrings, again the checker should be a bit smarter (use Python's tokenizer to check indent lengths but not within multi-line string literals). But as a first pass we can surely "not enforce this yet" (it will only mean a few more "formatting only" changes needed). > and just enforce "line-length < 80" and svn:eol-style==native for all *.py ...and no tabs and no trailing spaces on lines...? Doesn't seem any more complicated than line length and eol-style (or am I missing something?).
> and *.txt files (unless their svn:mime-type is explicitly non-text or some > such). Non-conforming commits would simply be refused and nothing would be > changed on the fly. Opinions? Full agreement on refusing nonconforming commits (ideally with a clear message). One issue is that the existing files are sure to contain a lot of violations. If we start enforcing the rules, then perhaps to commit a tiny change somewhere in a file with a lot of violations one would have also to undertake a massive reformatting-only exercise. Perhaps we could enforce the rules only when the file is new OR the existing file (i.e. the version before the commit) respects the rules...? Alex From aleaxit at yahoo.com Mon Dec 22 22:23:24 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Mon, 22 Dec 2003 22:23:24 +0100 Subject: [pypy-dev] bug with vars() in a nested function Message-ID: <200312222223.24683.aleaxit@yahoo.com> The implementation of vars() in builtin.py relies on a call to getdictscope() to return the caller's locals as a dict, but unfortunately that doesn't work when the caller of built-in vars() is a nested function -- nestedscope.py overrides fast2locals to ensure setting in self.w_locals the free variables in addition to the locals, and that w_locals is what getdictscope returns. Apparently this kind of bug must have bitten CPython at some point (or at least been suspected!) since the unit-test for vars() looks for it pretty exactly (that's how I found it, too -- still striving to run as much of that CPython unit-test as feasible!) -- basically, it has something like:

    def test_vars(self):
        def get_vars_f0():
            return vars()
        def get_vars_f2():
            get_vars_f0()
            a = 1
            b = 2
            return vars()
        ...
        self.assertEqual(get_vars_f2(), {'a': 1, 'b': 2})

where that 'useless' call to get_vars_f0 inside get_vars_f2 serves just the purpose of "infecting" the latter with the former's name -- and sure enough the assertEqual fails because 'get_vars_f0' is reported as being a key in the dict returned by get_vars_f2(). My instinct is to patch this with an optional switch to getdictscope to ask for "return the TRUE locals only please" -- but since a vaguely analogous patch I had tried to remedy the issue with generators raising StopIteration (i.e. "directly attacking the symptom rather than going for a more general solution") was criticized, I thought I'd better ask first rather than committing anything yet -- this one IS a rather marginal and obscure side issue after all, so for now I'll just mark it as 'XXX TODO' and comment it away in the builtin_functions_test.py file (what's one more...). Which reminds me: do we have a "known bugs" file/summary, or are the infos about known bugs, limitations &c spread around in comments and docstrings? 'cause the XXX TODO in the builtin_*_test.py files are quite a nifty list of known current bugs and limitations for built-ins, so I wondered if I should collect them in some documentation file... Alex From lac at strakt.com Mon Dec 22 23:16:58 2003 From: lac at strakt.com (Laura Creighton) Date: Mon, 22 Dec 2003 23:16:58 +0100 Subject: [pypy-dev] bug with vars() in a nested function In-Reply-To: Message from Alex Martelli of "Mon, 22 Dec 2003 22:23:24 +0100."
<200312222223.24683.aleaxit@yahoo.com> References: <200312222223.24683.aleaxit@yahoo.com> Message-ID: <200312222216.hBMMGwT5009123@theraft.strakt.com> In a message of Mon, 22 Dec 2003 22:23:24 +0100, Alex Martelli writes: >The implementation of vars() in builtin.py relies on a call to getdictscope() >to return the caller's locals as a dict, but unfortunately that doesn't work >when the caller of built-in vars() is a nested function -- nestedscope.py >overrides fast2locals to ensure setting in self.w_locals the free variables >in addition to the locals, and that w_locals is what getdictscope returns. > >Apparently this kind of bug must have bitten CPython at some point (or >at least been suspected!) since the unit-test for vars() looks for it pretty >exactly (that's how I found it, too -- still striving to run as much of that >CPython unit-test as feasible!) -- basically, it has something like: > > def test_vars(self): > def get_vars_f0(): > return vars() > def get_vars_f2(): > get_vars_f0() > a = 1 > b = 2 > return vars() > ... > self.assertEqual(get_vars_f2(), {'a': 1, 'b': 2}) > >where that 'useless' call to get_vars_f0 inside get_vars_f2 serves just >the purpose of "infecting" the latter with the former's name -- and sure >enough the assertEqual fails because 'get_vars_f0' is reported as being >a key in the dict returned by get_vars_f2(). > >My instinct is to patch this with an optional switch to getdictscope to >ask for "return the TRUE locals only please" -- but since a vaguely >analogous patch I had tried to remedy the issue with generators raising >StopIteration (i.e. "directly attacking the symptom rather than going for >a more general solution") was criticized, I thought I'd better ask first >rather than committing anything yet -- this one IS a rather marginal and >obscure side issue after all, so for now I'll just mark it as 'XXX TODO' >and comment it away in the builtin_functions_test.py file (what's one >more...).
> >Which reminds me: do we have a "known bugs" file/summary, or are >the infos about known bugs, limitations &c spread around in comments >and docstrings? 'cause the XXX TODO in the builtin_*_test.py files >are quite a nifty list of known current bugs and limitations for built-ins, >so I wondered if I should collect them in some documentation file... Please put them someplace and have goals refer to them. I wonder if this is what we should be using the issue tracker for.... As per your real question -- fast2locals strikes me as a hack, that needs to be more general, and our understanding of what a scope would be has been significantly mangled since getdictscope was written. Something in me says that we want to do scoping in some cleaner way, but I cannot quite envision what it will be after builtins changes _again_ to be more like a regular module. I am now looking at our current crop of builtins, and thinking ... 'the only reason you lot are there is because CPython had you there'. Now that we have hacked the architecture one more time again, grin, we have something a lot cleaner (for now at any rate) ... I am pushing for a more elegant definition of builtin, based on a pragmatic idea of 'you have to be built in because we cannot make you any other way' What of ours make it? Which cannot be made in app space and why, given that interp space is dead? I think that we have gloriously moved to a place where most of our builtins really do not have to be. Some sort of execfile sort of thing is all we need. This is probably wrong, and indicates that my vision is too idealised, and I miss very real practical problems. I await more enlightenment. But I still think that most of our 'builtins' are more 'reimplementations of CPython builtins, done because CPython doesn't have them so we couldn't have a working Python without them'. And we could implement them as a module.
When we have leftovers, real problems in converting one object space to another, then things like getdictscope -- or a more general thing that does this and more actually -- strikes me as a thing that belongs __there__. Scoping rules strike me as object space 'required things to leave a hook out for'. But perhaps I am just confused again, and oversimplifying in my mind. I await enlightenment. Laura > > >Alex > >_______________________________________________ >pypy-dev at codespeak.net >http://codespeak.net/mailman/listinfo/pypy-dev From bokr at oz.net Mon Dec 22 22:17:03 2003 From: bokr at oz.net (Bengt Richter) Date: Mon, 22 Dec 2003 13:17:03 -0800 Subject: [pypy-dev] Re: [pypy-svn] rev 2674 - pypy/trunk/doc In-Reply-To: <20031222180758.B29950@prim.han.de> References: <200312221752.40026.aleaxit@yahoo.com> <20031222152621.56CBF5AAA7@thoth.codespeak.net> <20031222164245.A29950@prim.han.de> <200312221752.40026.aleaxit@yahoo.com> Message-ID: <5.0.2.1.1.20031222130106.00a76450@mail.oz.net> At 18:07 2003-12-22 +0100, you (holger krekel) wrote: >Hi Alex, > >[Alex Martelli Mon, Dec 22, 2003 at 05:52:40PM +0100] >> On Monday 22 December 2003 04:42 pm, holger krekel wrote: >> > Hi Alex and others, >> > >> > i think we should try to minimize the number of formatting changes not >> > only in program code but also in ReST files as it makes reading the diffs >> > really hard because you basically have to reread the whole document >> > in order to see the differences. As this has happended now several >> > times i think it makes sense to think of some convention which allows >> > the diffs to stay readable. >> > >> > Maybe (especially) ReST-changes should be done with a two-step approach: >> > first the one where you can read the content diff and then the one (if >> > neccessary at all) which has purely formatting changes and says so in >> > the commit message. 
And we might also think about a common line width >> > so that not everybody reformats the text again and again :-) >> >> Yes, good points. I think line length of <80, indents of 4 spaces, no tabs, >> no trailing spaces allowed on any line under any pretext whatsoever, >> would be good conventions; I find it extremely hard to edit files (be they >> sources, ReST, or any other text) that fail to meet these conventions. > >For python source files this is fine for me. I am not entirely sure for >ReST especially since you can have verbatim-blocks (say an excerpt from a >mail). > >> Is there a way, with subversion, to have a simple conformance test for this >> kind of formatting parameters be run automatically, whenever a textfile is >> committed? If files breaking these conventions (or whatever conventions >> we can all agree on) were never committed, this would minimize the need >> for "changes that are purely related to formatting", I think. > >Indeed. A pre-commit hook is able to perform checks like this although >the details might get hairy if you go for "indentation of four-spaces" >enforcement (e.g. considering docstrings). I guess i'd go for simplicity >and just enforce "line-length < 80" and svn:eol-style==native for all *.py >and *.txt files (unless their svn:mime-type is explicitly non-text or some >such). Non-conforming commits would simply be refused and nothing would be >changed on the fly. Opinions? Comments from left field (since I wound up a mere lurker after all ;-) 1. I could see automated refusal of commits being a nuisance unless there was an override available. 2. What about a separate pypytidy.py tool that people could use like a standards-enforcer/beautifier/spell-checker _before_ committing? Seems like an enforcer-hook would contain much of the logic anyway, and done separately a useful tool could evolve as a side effect. 
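The core of such a pypytidy.py/pre-commit check needs very little logic; a minimal sketch of the checks discussed above (function name and the exact rules are illustrative, not an existing PyPy tool):

```python
# A minimal sketch of the kind of check a pypytidy.py tool or an SVN
# pre-commit hook could run: the 80-column limit, tabs and trailing
# whitespace come from the thread above; everything else is hypothetical.

def check_text(name, text, max_len=79):
    """Return a list of human-readable complaints about one file's text."""
    problems = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if len(line) > max_len:
            problems.append("%s:%d: line longer than %d chars"
                            % (name, lineno, max_len))
        if line != line.rstrip():
            problems.append("%s:%d: trailing whitespace" % (name, lineno))
        if "\t" in line:
            problems.append("%s:%d: tab character" % (name, lineno))
    return problems
```

A hook would refuse the commit when the returned list is non-empty; run standalone before committing, it is exactly the beautifier-as-side-effect Bengt describes.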
Regards & Happy Holidays, Bengt Richter From hpk at trillke.net Mon Dec 22 23:48:18 2003 From: hpk at trillke.net (holger krekel) Date: Mon, 22 Dec 2003 23:48:18 +0100 Subject: [pypy-dev] Re: [pypy-svn] rev 2674 - pypy/trunk/doc In-Reply-To: <5.0.2.1.1.20031222130106.00a76450@mail.oz.net>; from bokr@oz.net on Mon, Dec 22, 2003 at 01:17:03PM -0800 References: <200312221752.40026.aleaxit@yahoo.com> <20031222152621.56CBF5AAA7@thoth.codespeak.net> <20031222164245.A29950@prim.han.de> <200312221752.40026.aleaxit@yahoo.com> <20031222180758.B29950@prim.han.de> <5.0.2.1.1.20031222130106.00a76450@mail.oz.net> Message-ID: <20031222234818.F29950@prim.han.de> Hi Bengt, [Bengt Richter Mon, Dec 22, 2003 at 01:17:03PM -0800] > At 18:07 2003-12-22 +0100, you (holger krekel) wrote: > >Hi Alex, > > > >[Alex Martelli Mon, Dec 22, 2003 at 05:52:40PM +0100] > >> Is there a way, with subversion, to have a simple conformance test for this > >> kind of formatting parameters be run automatically, whenever a textfile is > >> committed? If files breaking these conventions (or whatever conventions > >> we can all agree on) were never committed, this would minimize the need > >> for "changes that are purely related to formatting", I think. > > > >Indeed. A pre-commit hook is able to perform checks like this although > >the details might get hairy if you go for "indentation of four-spaces" > >enforcement (e.g. considering docstrings). I guess i'd go for simplicity > >and just enforce "line-length < 80" and svn:eol-style==native for all *.py > >and *.txt files (unless their svn:mime-type is explicitly non-text or some > >such). Non-conforming commits would simply be refused and nothing would be > >changed on the fly. Opinions? > > Comments from left field (since I wound up a mere lurker after all ;-) You are welcome and that can change, anyway :-) > 1. I could see automated refusal of commits being a nuisance unless there > was an override available. 
If we choose transparent and consensual rules and can provide nice error messages i think they should help more than do harm. People have often checked in files just to find out they did something wrong and had to check in another time ... > 2. What about a separate pypytidy.py tool that people could use like > a standards-enforcer/beautifier/spell-checker _before_ committing? > Seems like an enforcer-hook would contain much of the logic anyway, > and done separately a useful tool could evolve as a side effect. we have pypy/tool/fixeol.py although this mainly deals with line endings. Our tools section is an ever-expanding branch of PyPy :-) cheers, holger From lac at strakt.com Mon Dec 22 23:54:32 2003 From: lac at strakt.com (Laura Creighton) Date: Mon, 22 Dec 2003 23:54:32 +0100 Subject: [pypy-dev] Re: [pypy-svn] rev 2674 - pypy/trunk/doc In-Reply-To: Message from holger krekel of "Mon, 22 Dec 2003 23:48:18 +0100." <20031222234818.F29950@prim.han.de> References: <200312221752.40026.aleaxit@yahoo.com> <20031222152621.56CBF5AAA7@thoth.codespeak.net> <20031222164245.A29950@prim.han.de> <200312221752.40026.aleaxit@yahoo.com> <20031222180758.B29950@prim.han.de> <5.0.2.1.1.20031222130106.00a76450@mail.oz.net> <20031222234818.F29950@prim.han.de> Message-ID: <200312222254.hBMMsWlv018462@ratthing-b246.strakt.com> In a message of Mon, 22 Dec 2003 23:48:18 +0100, holger krekel writes: >Hi Bengt, > >[Bengt Richter Mon, Dec 22, 2003 at 01:17:03PM -0800] >> At 18:07 2003-12-22 +0100, you (holger krekel) wrote: >> >Hi Alex, >> > >> >[Alex Martelli Mon, Dec 22, 2003 at 05:52:40PM +0100] >> >> Is there a way, with subversion, to have a simple conformance test for this >> >> kind of formatting parameters be run automatically, whenever a textfile is >> >> committed? If files breaking these conventions (or whatever conventions >> >> we can all agree on) were never committed, this would minimize the need >> >> for "changes that are purely related to formatting", I think. 
>> > >> >Indeed. A pre-commit hook is able to perform checks like this although >> >the details might get hairy if you go for "indentation of four-spaces" >> >enforcement (e.g. considering docstrings). I guess i'd go for simplicity >> >and just enforce "line-length < 80" and svn:eol-style==native for all *.py >> >and *.txt files (unless their svn:mime-type is explicitly non-text or some >> >such). Non-conforming commits would simply be refused and nothing would be >> >changed on the fly. Opinions? >> >> Comments from left field (since I wound up a mere lurker after all ;-) > >You are welcome and that can change, anyway :-) >> 1. I could see automated refusal of commits being a nuisance unless there >> was an override available. > >If we choose transparent and consensual rules and can provide nice error >messages i think they should help more than do harm. People have often >checked in files just to find out they did something wrong and had to check in >another time ... >> 2. What about a separate pypytidy.py tool that people could use like >> a standards-enforcer/beautifier/spell-checker _before_ committing? >> Seems like an enforcer-hook would contain much of the logic anyway, >> and done separately a useful tool could evolve as a side effect. > >we have pypy/tool/fixeol.py although this mainly deals with line >endings. Our tools section is an ever-expanding branch of PyPy :-) > >cheers, > > holger >_______________________________________________ >pypy-dev at codespeak.net >http://codespeak.net/mailman/listinfo/pypy-dev I am thinking that we need a way to say 'this line purposely longer'. 
Big URLs, and things like: 'Don't write code like this: ' Laura From roccomoretti at hotpop.com Tue Dec 23 00:16:00 2003 From: roccomoretti at hotpop.com (Rocco Moretti) Date: Mon, 22 Dec 2003 17:16:00 -0600 Subject: [pypy-dev] bug with vars() in a nested function In-Reply-To: <200312222216.hBMMGwT5009123@theraft.strakt.com> References: <200312222223.24683.aleaxit@yahoo.com> <200312222216.hBMMGwT5009123@theraft.strakt.com> Message-ID: <3FE77B30.3040804@hotpop.com> Laura Creighton wrote: To be honest, I'm not quite catching your entire meaning ... So I'll just babble and hope something of what I say strikes close to the mark. >I am now looking at our current crop of builtins, and thinking ... >'the only reason you lot are there is because CPython had you there'. >Now that we have hacked the architecture one more time again, grin, >we have something a lot cleaner (for now at any rate) ... I am >pushing for a more elegant definition of builtin, based on a >pragmatic idea of 'you have to be built in because we cannot make >you any other way' > Clarity is strained by the two connotations of "builtin" (i.e. 'always present') that can be meant. That is (for the CPython interpreter): * Written in C and statically linked to the interpreter. (Always present in the interpreter.) * Available in Python without having to import anything. (Always present in the language.) I'm under the impression that the __builtin__ module is so named for the second point, as there are a number of modules which meet the first point. (There happen to be 39 on my copy of CPython - len(sys.builtin_module_names)). PyPy's __builtin__ *has* to meet point #2 - otherwise it wouldn't be Python. But I agree with you on point #1 -- We should push to application level everything we can, and have the interpreter level be the absolute minimum needed in order to make it run. >What of ours make it? Which cannot be made in app space and why, >given that interp space is dead? 
I think that we have gloriously >moved to a place where most of our builtins really do not have to be. >Some sort of execfile sort of thing is all we need. > >This is probably wrong, and indicates that my vision is too idealised, >and I miss very real practical problems. I await more enlightenment. >But I still think that most of our 'builtins' are more 'reimplentations >of CPython builtins, done because CPython doesn't have them so we >couldn't have a working Python without them'. And we could >implement them as a module. When we have leftovers, real >problems in converting one object space to another, then things >like getdictscope -- or a more general thing that does this and more >actually -- strikes me as a thing that belongs __there__. Scoping >rules strike me as object space 'required things to leave a hook >out for'. > Let me clarify - are you just referring to the __builtin__ module, or are you advocating a more expansive redesign where ObjSpace and interpreter core gets uplifted to App level? > But perhaps I am just confused again, and oversimplifying in my mind. I definitely agree with Laura. We should strive to push as much as possible to application level, for no other reason than it will make doing annotations, etc. easier. But there needs to be a good mechanism to provide interpreter level hooks for the app level functions. Take __import__. There is currently a commented out application level version in the builtin module. The __import__ functionality would work fine at application level, except for a few minor issues. You can't get sys.modules from app level, as that would require you to 'import sys', which leads to obvious recursion. Same goes for accessing the filesystem tools in os and os.path. The way the app level function works now is that it defines a set of interpreter level helpers which are able to access the functionality and pass it back. 
The problem with the way they are implemented now is that all those helpers pollute the __builtin__ namespace. If there was a good way to define interpreter level helpers which were visible from *within* the module, but invisible from the outside, then I feel this approach would work well, and we can extend it to pare the interpreter level functionality down to the bare minimum. -Rocco From mwh at python.net Tue Dec 23 13:46:44 2003 From: mwh at python.net (Michael Hudson) Date: Tue, 23 Dec 2003 12:46:44 +0000 Subject: [pypy-dev] Re: amsterdam sprint reports (20th december 2003) References: <20031221212410.H15957@prim.han.de> Message-ID: <2mfzfbd7m3.fsf@starship.python.net> holger krekel writes: > "finishing" the Annotation refactoring > -------------------------------------- > > That should be easy, right? Actually Guido van Rossum and Armin had > started doing type inference/annotation in Belgium just before > EuroPython and we have since refactored it already at the Berlin > sprint and in between the sprints several times. But it turned out > that Annotations as we did them are *utterly broken* in that we try > to do a too general system (we had been talking about general > inference engines and such) thus making "merging" of two annotations > very hard to do in a meaningful way. But after beeing completly > crushed on one afternoon, Samuele and Armin came up with a new > *simpler* approach that ... worked and promises to not have the same > flaws. It got integrated into the translator already and appears to > work nicely. If this scheme is going to stand the test of time, it would be really nice if it got some high-level documentation. I've been trying to follow progress here, but as soon as you look at it, it changes... 
> I think this is the fourth refactoring of just "three files" and, of > course, we already have the 'XXX' virus spreading again :-) > > refactoring/rewriting the test framework > ---------------------------------------- > > PyPy has an ever increasing test-suite which requires a lot of > flexibility that the standard unittest.py module just doesn't > provide. Currently, we have in tool/test.py and > interpreter/unittest_w.py a collection of more or less weird hacks > to make our (now over 500) tests run either at interpreter level or > at application level which means they are actually interpreted by > PyPy. Tests from both levels can moreover be run with different > object spaces. Thus Stefan Schwarzer and me came up with a rewrite > of unittest.py which is currently in 'newtest.py'. During my train > ride back to germany i experimentally used our new approach which > let our tests run around 30% faster (!) as a side effect. More > about this in separate mails as this is - as almost every other area > of PyPy - refactoring-in-progress. I think this stuff looks really cool. > On the Amsterdam sprint maybe more than ever we realized how > strongly refactoring is the key development activity in PyPy. Watch > those "XXX" :-) Some would argue that we should instead think more > about what we are doing but then you wouldn't call that extreme > programming, would you? > > However, we haven't fixed a site and date for the next sprint, > yet. Here's hoping I'll be able to make that one... Cheers, mwh -- ZAPHOD: You know what I'm thinking? FORD: No. ZAPHOD: Neither do I. Frightening isn't it? 
-- The Hitch-Hikers Guide to the Galaxy, Episode 11 From hpk at trillke.net Tue Dec 23 15:09:43 2003 From: hpk at trillke.net (holger krekel) Date: Tue, 23 Dec 2003 15:09:43 +0100 Subject: [pypy-dev] bug with vars() in a nested function In-Reply-To: <3FE77B30.3040804@hotpop.com>; from roccomoretti@hotpop.com on Mon, Dec 22, 2003 at 05:16:00PM -0600 References: <200312222223.24683.aleaxit@yahoo.com> <200312222216.hBMMGwT5009123@theraft.strakt.com> <3FE77B30.3040804@hotpop.com> Message-ID: <20031223150943.H29950@prim.han.de> Hi Rocco, hi Laura, [Rocco Moretti Mon, Dec 22, 2003 at 05:16:00PM -0600] > Laura Creighton wrote: > >I am now looking at our current crop of builtins, and thinking ... > >'the only reason you lot are there is because CPython had you there'. > >Now that we have hacked the architecture one more time again, grin, > >we have something a lot cleaner (for now at any rate) ... I am > >pushing for a more elegant definition of builtin, based on a > >pragmatic idea of 'you have to be built in becauswe we cannot make > >you any other way' > > > Clarity is strained by the two connotations of "builtin" (i.e. 'always > present') that can be meant. That is (for the CPython interpreter): > > * Written in C and statically linked to the interpreter. > (Always present in the interpreter.) > * Available in Python without having to import anything. > (Always present in the language.) > > I'm under the impression that the __builtin__ module is so named for the > second point, as there are a number of modules which meet the first > point. (There happen to be 39 on my copy of CPython - > len(sys.builtin_module_names)). > > PyPy's __builtin__ *has* to meet point #2 - otherwise it wouldn't be > Python. But I agree with you on point #1 -- We should push to > application level everything we can, and have the interpreter level be > the absolute minimum needed in order to make it run. Sure, that has been our goal almost all of the time. 
However, code implemented at application level goes through the interpretation indirection and is not only slower now but will probably remain slower even after translation. Anyway, our new approaches to implementing builtin modules surely improve the simplicity of implementing app-level code and weaving it into interpreter level. > >What of ours make it? Which cannot be made in app space and why, > >given that interp space is dead? I think that we have gloriously > >moved to a place where most of our builtins really do not have to be. > >Some sort of execfile sort of thing is all we need. Well, interpreter level code is far from dead but we might be able to reduce it to a minimum level following our original "minimal python" idea. I think that we are not doing so badly as the number of interpreter-level builtins is not all that large. The problem so far has been that the builtin module concept was kind of complicated but this should be fixed soon, now: builtin modules are to be defined at application level but can access/interact very dynamically with interpreter level code at initialization time. I guess Armin will write a few more sentences when he gets to check in the new stuff. > >This is probably wrong, and indicates that my vision is too idealised, > >and I miss very real practical problems. I await more enlightenment. > >But I still think that most of our 'builtins' are more 'reimplementations > >of CPython builtins, done because CPython doesn't have them so we > >couldn't have a working Python without them'. And we could > >implement them as a module. When we have leftovers, real > >problems in converting one object space to another, then things > >like getdictscope -- or a more general thing that does this and more > >actually -- strikes me as a thing that belongs __there__. Scoping > >rules strike me as object space 'required things to leave a hook > >out for'. 
> > > Let me clarify - are you just referring to the __builtin__ module, or > are you advocating a more expansive redesign where ObjSpace and > interpreter core gets uplifted to App level? > > > But perhaps I am just confused again, and oversimplifying in my mind. > > I definitely agree with Laura. We should strive to push as much as > possible to application level, for no other reason than it will make > doing annotations, etc. easier. > > But there needs to be a good mechanism to provide interpreter level > hooks for the app level functions. yes, the main point here is that we probably want to avoid duplicate or redundant state, for example calling on interpreter-level space.builtin.execfile(...) and on app-level __builtin__.execfile(...) should do the same thing but what happens if someone overrides __builtin__.execfile from app level? Do we want the interpreter-level to go through this new implementation or should it keep the "real" reference? It seems tricky to decide this on a case-by-case basis. When doing our recent "implement builtin at app-level and invoke interp-level hooks" hack we had a similar consideration with "sys.modules" which in CPython can be overridden at applevel but it doesn't affect interpreter-level implementations. Otherwise you could get into a state that makes it impossible to import anything anymore (e.g. consider 'sys.modules = "no dict"'). So i am not sure what we want to do about this "duplicate state" issue as there apparently is a flexibility versus security tradeoff involved. I tend to lean towards "flexibility", though :-) > Take __import__. There is currently a commented-out application level > version in the builtin module. The __import__ functionality would work > fine at application level, except for a few minor issues. You can't get > sys.modules from app level, as that would require you to 'import sys', > which leads to obvious recursion. 
Hmmm, maybe exposing some general '_pypy_' builtin hook would allow defining __import__ at app-level because we could provide a '_pypy_.sys' attribute or maybe better "_pypy_.modules['sys']". I also thought about exposing parts of the objectspace directly, and some builtins could just be bound methods of the space, e.g. len = _pypy_.space.len delattr = _pypy_.space.delattr as the object space and their corresponding builtin implementations share the same signature. Some space-methods would probably be exposed as readonly-attributes unless we want to provide ways to seriously mess up your interpreter quickly :-) I think it's worth a try to see what can of worms suddenly opens if we did this ... cheers, holger From roccomoretti at hotpop.com Tue Dec 23 16:20:46 2003 From: roccomoretti at hotpop.com (Rocco Moretti) Date: Tue, 23 Dec 2003 09:20:46 -0600 Subject: [pypy-dev] bug with vars() in a nested function In-Reply-To: <20031223150943.H29950@prim.han.de> References: <200312222223.24683.aleaxit@yahoo.com> <200312222216.hBMMGwT5009123@theraft.strakt.com> <3FE77B30.3040804@hotpop.com><20031223150943.H29950@prim.han.de> Message-ID: <3FE85D4E.8050600@hotpop.com> holger krekel wrote: >yes, the main point here is that we probably want to avoid duplicate or >redundant state, for example calling on interpreter-level > > space.builtin.execfile(...) > >and on app-level > > __builtin__.execfile(...) > >should do the same thing but what happens if someone overrides >__builtin__.execfile from app level? Do we want the interpreter-level >to go through this new implementation or should it keep the "real" >reference? It seems tricky to decide this on a case-by-case basis. > >When doing our recent "implement builtin at app-level and invoke >interp-level hooks" hack we had a similar consideration with "sys.modules" >which in CPython can be overriden at applevel but it doesn't affect >interpreter-level implementations. 
Otherwise you could get into a >state that makes it impossible to import anything anymore (e.g. >consider 'sys.modules = "no dict"'). So i am not sure what we >want to do about this "duplicate state" issue as there apparently >is a flexibility versus security tradeoff involved. I tend to >lean towards "flexibility", though :-) > Fooling around with sys.modules and import on Python 2.3 I come away with the impression that Python's idea that variables are just names helps us in certain cases. I.e.:

>>> a = sys.modules
>>> print a.has_key('random')
False
>>> sys.modules = {}
>>> sys.modules
{}
>>> import random
>>> sys.modules
{}
>>> print a.has_key('random')
True
>>>

So it is the particular dictionary that sys.modules points to immediately after startup that is used by the CPython import mechanism, not the object that sys.modules is pointing to when the import is called. This is less help for cases like __import__(), where it's what the name is pointing to that matters. Although, I suppose we could possibly handle that through a property-like interface with transparent getters and setters. -Rocco From hpk at trillke.net Tue Dec 23 17:53:20 2003 From: hpk at trillke.net (holger krekel) Date: Tue, 23 Dec 2003 17:53:20 +0100 Subject: [pypy-dev] bug with vars() in a nested function In-Reply-To: <200312222223.24683.aleaxit@yahoo.com>; from aleaxit@yahoo.com on Mon, Dec 22, 2003 at 10:23:24PM +0100 References: <200312222223.24683.aleaxit@yahoo.com> Message-ID: <20031223175320.J29950@prim.han.de> [Alex Martelli Mon, Dec 22, 2003 at 10:23:24PM +0100] > The implementation of vars() in builtin.py relies on a call to getdictscope() > to return the caller's locals as a dict, but unfortunately that doesn't work > when the caller of built-in vars() is a nested function -- nestedscope.py > overrides fast2locals to ensure setting in self.w_locals the free variables > in addition to the locals, and that w_locals is what getdictscope returns. 
I think some further CPython analysis would be warranted here. It's not very easy to see in which code paths and situations PyFrame_FastToLocals and PyFrame_LocalsToFast in CPython get invoked regarding nested scopes. In some theoretically critical use cases like exec and import-star statements the compiler actually forbids it ... So I committed a fix which modifies fast2locals to not put bindings from outer scopes into the dict and added some tests (also the builtin_functions_test now passes). I have the feeling though that there might be some more dark corners ... cheers, holger From hpk at trillke.net Tue Dec 23 19:05:19 2003 From: hpk at trillke.net (holger krekel) Date: Tue, 23 Dec 2003 19:05:19 +0100 Subject: [pypy-dev] bug with vars() in a nested function In-Reply-To: <3FE85D4E.8050600@hotpop.com>; from roccomoretti@hotpop.com on Tue, Dec 23, 2003 at 09:20:46AM -0600 References: <200312222223.24683.aleaxit@yahoo.com> <200312222216.hBMMGwT5009123@theraft.strakt.com> <3FE77B30.3040804@hotpop.com> <20031223150943.H29950@prim.han.de> <3FE85D4E.8050600@hotpop.com> Message-ID: <20031223190519.K29950@prim.han.de> [Rocco Moretti Tue, Dec 23, 2003 at 09:20:46AM -0600] > holger krekel wrote: > > >yes, the main point here is that we probably want to avoid duplicate or > >redundant state, for example calling on interpreter-level > > > > space.builtin.execfile(...) > > > >and on app-level > > > > __builtin__.execfile(...) > > > >should do the same thing but what happens if someone overrides > >__builtin__.execfile from app level? Do we want the interpreter-level > >to go through this new implementation or should it keep the "real" > >reference? It seems tricky to decide this on a case-by-case basis. > > > >When doing our recent "implement builtin at app-level and invoke > >interp-level hooks" hack we had a similar consideration with "sys.modules" > >which in CPython can be overriden at applevel but it doesn't affect > >interpreter-level implementations. 
Otherwise you could get into a > >state that makes it impossible to import anything anymore (e.g. > >consider 'sys.modules = "no dict"'). So i am not sure what we > >want to do about this "duplicate state" issue as there apparently > >is a flexibility versus security tradeoff involved. I tend to > >lean towards "flexibility", though :-) > > > Fooling around with sys.modules and import on Python2.3 I come away > with the idea that Python's idea that variables are just names helps > us in certain cases. I.e.: > > >>> a = sys.modules > >>> print a.has_key('random') > False > >>> sys.modules = {} > >>> sys.modules > {} > >>> import random > >>> sys.modules > {} > >>> print a.has_key('random') > True > >>> > > So it is the particular dictionary that sys.modules points immediately > after startup that is used by the CPython import mechanism, not the > object that sys.modules is pointing to when the import is called. Yes, but is it what we want to mimic? Somehow i think the idea is that sys.modules is the one place where modulepath-moduleobject mappings should be kept and the interpreter level should consult this object. I guess that CPython's keeping reference to the original dict object is more a performance hack and also shields from stupid errors ... > This is less help for cases like __import__(), where it's what the > name is pointing to that matters. Although, I suppose we could possibly > handle that through a property-like interface with transparent getters > and setters. We can always special case but i'd prefer a general solution like "interp-level has to go through the applevel hooks/names" but maybe this is not feasible. 
holger From sschwarzer at sschwarzer.net Tue Dec 23 22:17:12 2003 From: sschwarzer at sschwarzer.net (Stefan Schwarzer) Date: Tue, 23 Dec 2003 22:17:12 +0100 Subject: [pypy-dev] Weird failure of test.py as testit.py Message-ID: <20031223211712.GA639@warpy.sschwarzer.net> To avoid a conflict between tool/test.py and a potential directory tool/test, I have renamed test.py to testit.py and made a corresponding change to test_all.py . Before I checked the change in, all tests, invoked from testit.py or test_all.py, ran. After checking in the changed files I noticed that the tests no longer ran. That is, testit.py would start but find no tests. Eventually I found out that everything worked as long as test.py or test.pyc remained in the tool directory. (So copying testit.py to test.py makes you again able to run the tests.) So I had checked my changes, finding everything working, checked in, and cleaned away test.py* . Thus, I didn't notice that something had gone astray before committing the changes. Now there's the question what's wrong with test.py/testit.py . I suspected an import of "test" from within the module but found no such import. I'm still searching for the cause of the problem, but if someone has a quick idea, it would be fine if he/she responded on the list or even fixed the problem. Stefan From sschwarzer at sschwarzer.net Tue Dec 23 22:33:38 2003 From: sschwarzer at sschwarzer.net (Stefan Schwarzer) Date: Tue, 23 Dec 2003 22:33:38 +0100 Subject: [pypy-dev] Re: Weird failure of test.py as testit.py In-Reply-To: <20031223211712.GA639@warpy.sschwarzer.net> References: <20031223211712.GA639@warpy.sschwarzer.net> Message-ID: <20031223213338.GB639@warpy.sschwarzer.net> On Tue, 2003-12-23 22:17:12 +0100, Stefan Schwarzer wrote: > Now there's the question what's wrong with test.py/testit.py . I > suspected an import of "test" from within the module but found no such > import. 
I'm still searching for the cause of the problem, but if > someone has a quick idea, it would be fine if he/she responded on the > list or even fixed the problem. One addition: I tracked down the problem to line 247 in test(it).py . If test.py* doesn't exist, the returned subsuite is empty. I'm trying to find out, why. (If it wasn't clear from my previous mail: you can reproduce everything without involving test_all.py .) Stefan From bokr at oz.net Tue Dec 23 23:30:31 2003 From: bokr at oz.net (Bengt Richter) Date: Tue, 23 Dec 2003 14:30:31 -0800 Subject: [pypy-dev] bug with vars() in a nested function In-Reply-To: <20031223190519.K29950@prim.han.de> References: <3FE85D4E.8050600@hotpop.com> <200312222223.24683.aleaxit@yahoo.com> <200312222216.hBMMGwT5009123@theraft.strakt.com> <3FE77B30.3040804@hotpop.com> <20031223150943.H29950@prim.han.de> <3FE85D4E.8050600@hotpop.com> Message-ID: <5.0.2.1.1.20031223135226.00a76cb0@mail.oz.net> At 19:05 2003-12-23 +0100, you (holger krekel) wrote: >[Rocco Moretti Tue, Dec 23, 2003 at 09:20:46AM -0600] >> holger krekel wrote: >> >> >yes, the main point here is that we probably want to avoid duplicate or >> >redundant state, for example calling on interpreter-level >> > >> > space.builtin.execfile(...) >> > >> >and on app-level >> > >> > __builtin__.execfile(...) >> > >> >should do the same thing but what happens if someone overrides >> >__builtin__.execfile from app level? Do we want the interpreter-level >> >to go through this new implementation or should it keep the "real" >> >reference? It seems tricky to decide this on a case-by-case basis. >> > >> >When doing our recent "implement builtin at app-level and invoke >> >interp-level hooks" hack we had a similar consideration with "sys.modules" >> >which in CPython can be overriden at applevel but it doesn't affect >> >interpreter-level implementations. Otherwise you could get into a >> >state that makes it impossible to import anything anymore (e.g. 
>> >consider 'sys.modules = "no dict"'). So i am not sure what we >> >want to do about this "duplicate state" issue as there apparently >> >is a flexibility versus security tradeoff involved. I tend to >> >lean towards "flexibility", though :-) >> > >> Fooling around with sys.modules and import on Python2.3 I come away >> with the idea that Python's idea that variables are just names helps >> us in certain cases. I.e.: >> >> >>> a = sys.modules >> >>> print a.has_key('random') >> False >> >>> sys.modules = {} >> >>> sys.modules >> {} >> >>> import random >> >>> sys.modules >> {} >> >>> print a.has_key('random') >> True >> >>> >> >> So it is the particular dictionary that sys.modules points immediately >> after startup that is used by the CPython import mechanism, not the >> object that sys.modules is pointing to when the import is called. > >Yes, but is it what we want to mimic? Somehow i think the idea is that >sys.modules is the one place where modulepath-moduleobject mappings >should be kept and the interpreter level should consult this object. Maybe the original binding could be preserved as sys.__modules__ analogously to sys.__stdout__ ? >I guess that CPython's keeping reference to the original dict object >is more a performance hack and also shields from stupid errors ... I don't know. Isn't it normal to get a binding through a name and then ignore the name? Mutating the referenced object is a different matter though, e.g., >>> import sys >>> a = sys.modules >>> print a.has_key('random') False >>> sys.modules.clear() Traceback (most recent call last): File "", line 1, in ? RuntimeError: lost __builtin__ Just a couple of thoughts. >> This is less help for cases like __import__(), where it's what the doesn't __import__ look for the name in the same original sys.modules? >> name is pointing to that matters. Although, I suppose we could possibly >> handle that through a property-like interface with transparent getters >> and setters. 
> >We can always special case but i'd prefer a general solution like >"interp-level has to go through the applevel hooks/names" but maybe >this is not feasible. If the interpreter has to maintain some objects to survive, maybe apps should only get access via readonly/proxy mechanisms of some kind? Disclaimer: I'm only reacting in the context of this one email, so please ignore if it doesn't make sense ;-) Bengt From hpk at trillke.net Wed Dec 24 00:08:12 2003 From: hpk at trillke.net (holger krekel) Date: Wed, 24 Dec 2003 00:08:12 +0100 Subject: [pypy-dev] Re: Weird failure of test.py as testit.py In-Reply-To: <20031223213338.GB639@warpy.sschwarzer.net>; from sschwarzer@sschwarzer.net on Tue, Dec 23, 2003 at 10:33:38PM +0100 References: <20031223211712.GA639@warpy.sschwarzer.net> <20031223213338.GB639@warpy.sschwarzer.net> Message-ID: <20031224000812.L29950@prim.han.de> [Stefan Schwarzer Tue, Dec 23, 2003 at 10:33:38PM +0100] > On Tue, 2003-12-23 22:17:12 +0100, Stefan Schwarzer wrote: > > Now there's the question what's wrong with test.py/testit.py . I > > suspected an import of "test" from within the module but found no such > > import. I'm still searching for the cause of the problem, but if > > someone has a quick idea, it would be fine if he/she responded on the > > list or even fixed the problem. > > One addition: I tracked down the problem to line 247 in test(it).py . > If test.py* doesn't exist, the returned subsuite is empty. I'm trying > to find out, why. Well, you forgot to change about 50 other files . See my fixes http://codespeak.net/pipermail/pypy-svn/2003-December/001851.html which shows that all test files (and even some others needing to parse options) use the test (now testit) module. 
Please be careful when changing the foundation of our development model :-) cheers, holger From sschwarzer at sschwarzer.net Wed Dec 24 11:43:06 2003 From: sschwarzer at sschwarzer.net (Stefan Schwarzer) Date: Wed, 24 Dec 2003 11:43:06 +0100 Subject: [pypy-dev] Re: Weird failure of test.py as testit.py In-Reply-To: <20031224000812.L29950@prim.han.de> References: <20031223211712.GA639@warpy.sschwarzer.net> <20031223213338.GB639@warpy.sschwarzer.net> <20031224000812.L29950@prim.han.de> Message-ID: <20031224104306.GB674@warpy.sschwarzer.net> On Wed, 2003-12-24 00:08:12 +0100, holger krekel wrote: > [Stefan Schwarzer Tue, Dec 23, 2003 at 10:33:38PM +0100] > > On Tue, 2003-12-23 22:17:12 +0100, Stefan Schwarzer wrote: > > > Now there's the question what's wrong with test.py/testit.py . I > > > suspected an import of "test" from within the module but found no such > > > import. I'm still searching for the cause of the problem, but if > > > someone has a quick idea, it would be fine if he/she responded on the > > > list or even fixed the problem. > > > > One addition: I tracked down the problem to line 247 in test(it).py . > > If test.py* doesn't exist, the returned subsuite is empty. I'm trying > > to find out, why. > > Well, you forgot to change about 50 other files . See my fixes > > http://codespeak.net/pipermail/pypy-svn/2003-December/001851.html > > which shows that all test files (and even some others needing to > parse options) use the test (now testit) module. Please be careful > when changing the foundation of our development model :-) *blush* I can't understand why I didn't note that simple reason. On the other hand, testit.py didn't output any error messages, so I expected something subtle going on and didn't get to that trivial cause first. Many thanks for the quick fix! 
Stefan From hpk at trillke.net Wed Dec 24 12:22:40 2003 From: hpk at trillke.net (holger krekel) Date: Wed, 24 Dec 2003 12:22:40 +0100 Subject: [pypy-dev] Re: Weird failure of test.py as testit.py In-Reply-To: <20031224104306.GB674@warpy.sschwarzer.net>; from sschwarzer@sschwarzer.net on Wed, Dec 24, 2003 at 11:43:06AM +0100 References: <20031223211712.GA639@warpy.sschwarzer.net> <20031223213338.GB639@warpy.sschwarzer.net> <20031224000812.L29950@prim.han.de> <20031224104306.GB674@warpy.sschwarzer.net> Message-ID: <20031224122240.O29950@prim.han.de> [Stefan Schwarzer Wed, Dec 24, 2003 at 11:43:06AM +0100] > On Wed, 2003-12-24 00:08:12 +0100, holger krekel wrote: > > [Stefan Schwarzer Tue, Dec 23, 2003 at 10:33:38PM +0100] > > > On Tue, 2003-12-23 22:17:12 +0100, Stefan Schwarzer wrote: > > > > Now there's the question what's wrong with test.py/testit.py . I > > > > suspected an import of "test" from within the module but found no such > > > > import. I'm still searching for the cause of the problem, but if > > > > someone has a quick idea, it would be fine if he/she responded on the > > > > list or even fixed the problem. > > > > > > One addition: I tracked down the problem to line 247 in test(it).py . > > > If test.py* doesn't exist, the returned subsuite is empty. I'm trying > > > to find out, why. > > > > Well, you forgot to change about 50 other files . See my fixes > > > > http://codespeak.net/pipermail/pypy-svn/2003-December/001851.html > > > > which shows that all test files (and even some others needing to > > parse options) use the test (now testit) module. Please be careful > > when changing the foundation of our development model :-) > > *blush* > > I can't understand why I didn't note that simple reason. On the other > hand, testit.py didn't output any error messages, so I expected > something subtle going on and didn't get to that trivial cause first. 
Yeah, such behaviour is really bad and one of the reasons we are rewriting the whole mess :-) cheers, holger From hpk at trillke.net Sat Dec 27 19:46:20 2003 From: hpk at trillke.net (holger krekel) Date: Sat, 27 Dec 2003 19:46:20 +0100 Subject: [pypy-dev] documentation progress! Message-ID: <20031227194620.W29950@prim.han.de> Hello PyPy, apart from the new "architecture" document the Amsterdam Sprint also produced a "goals" document which i now think should be rewritten to be (much) higher level. See here for reference http://codespeak.net/pypy/index.cgi?doc/goals As it stands, it's IMHO very hard to understand from this document what PyPy aims at. I guess the current 'goals' document is more like an extensive "todo list" and maybe should just be renamed. (Because from a todo document one a) doesn't expect to understand everything at once and b) expects more low-level details.) Apart from this little renaming we also concluded earlier that we want to have a "Status" document which should offer precise information about the implementation/concept status regarding our "components": interpreter, standard/flow/trace/trivial object spaces, annotation and translation. Each of these components should then have a list of its main pieces that are wrong, missing, incomplete or (yes, they exist!) complete. Btw, Armin has in the last few days added/modified a chapter about Multimethods and Annotations (type inference): http://codespeak.net/pypy/index.cgi?doc/objspace/multimethod http://codespeak.net/pypy/index.cgi?doc/translation/annotation but don't forget to read the architecture document first in case you e.g. 
stumble on the term "object space" and don't know what it means :-) http://codespeak.net/pypy/index.cgi?doc/architecture cheers and some happy last days of the year, holger From tinuviel at sparcs.kaist.ac.kr Wed Dec 31 21:39:23 2003 From: tinuviel at sparcs.kaist.ac.kr (Seo Sanghyeon) Date: Thu, 1 Jan 2004 05:39:23 +0900 Subject: [pypy-dev] Pie-thon benchmark Message-ID: <20031231203923.GA12900@sparcs.kaist.ac.kr> Happy new year, pypy-dev! Guido announced the Pie-thon benchmark. http://mail.python.org/pipermail/python-dev/2003-December/041527.html I tried to run it against PyPy revision 2706. First, here is Python 2.3 performance on my system, measured by `make times'. b0.py: real 0m2.720s b1.py: real 0m1.069s b2.py: real 0m0.343s b3.py: real 0m2.137s b4.py: real 0m0.709s b5.py: real 0m2.100s b6.py: real 0m1.903s For PyPy, I used `time python py.py -S b?.py'. To make relative import work I prepended `PYTHONPATH=.' for b4.py and b6.py. b0.py: error __new__() takes exactly 4 arguments (1 given) b1.py: error maximum recursion depth exceeded b2.py: real 1m25.725s (250x slower) b3.py: error __new__ multimethod got a non-wrapped argument b4.py: error (this one imports and uses b0.py) b5.py: error global name complex is not defined b6.py: error unbound method must be called with instance as first argument More comments on b6.py: it took 22 minutes(!) to crash. It printed 42, 1000041, 999999, 49999950000, and finally crashed at `d = dict.fromkeys(xrange(1000000))'. Will investigate more.
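[Editor's note: for readers unfamiliar with the `dict.fromkeys` call where b6.py finally crashed, it builds a dict mapping every key to the same default value (None). A small-scale illustration, using a tiny range instead of the benchmark's million keys and `range` instead of Python 2.3's `xrange`:]

```python
# dict.fromkeys(keys) maps every key to None unless a value is given.
d = dict.fromkeys(range(5))
assert d == {0: None, 1: None, 2: None, 3: None, 4: None}

# With an explicit second argument, every key shares that same value.
flags = dict.fromkeys(["b0", "b1", "b2"], False)
assert flags == {"b0": False, "b1": False, "b2": False}
```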