From fijall at gmail.com Tue Jun 1 06:00:17 2010
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Mon, 31 May 2010 22:00:17 -0600
Subject: [pypy-dev] [pypy-svn] r74976 - in pypy/branch/sys-prefix: lib/pypy1.2/lib_pypy/ctypes_config_cache pypy/interpreter/test pypy/rlib pypy/tool pypy/tool/test pypy/translator/goal pypy/translator/sandbox
In-Reply-To: <20100531150223.33D88282B90@codespeak.net>
References: <20100531150223.33D88282B90@codespeak.net>
Message-ID:

A bit about directory structure: Can you explain to me a bit?

What's in lib except pypy1.2?

Why pypy1.2? Do we have any reason? Do we ever need to keep more than one?

What's in pypy1.2 except lib_pypy?

On Mon, May 31, 2010 at 9:02 AM, wrote:
> Author: antocuni
> Date: Mon May 31 17:02:20 2010
> New Revision: 74976
>
> Modified:
>   pypy/branch/sys-prefix/lib/pypy1.2/lib_pypy/ctypes_config_cache/rebuild.py
>   pypy/branch/sys-prefix/pypy/interpreter/test/test_module.py
>   pypy/branch/sys-prefix/pypy/rlib/rmd5.py
>   pypy/branch/sys-prefix/pypy/rlib/rsha.py
>   pypy/branch/sys-prefix/pypy/rlib/rzipfile.py
>   pypy/branch/sys-prefix/pypy/tool/compat.py
>   pypy/branch/sys-prefix/pypy/tool/lib_pypy.py
>   pypy/branch/sys-prefix/pypy/tool/test/test_lib_pypy.py
>   pypy/branch/sys-prefix/pypy/translator/goal/targetpypystandalone.py
>   pypy/branch/sys-prefix/pypy/translator/sandbox/pypy_interact.py
>   pypy/branch/sys-prefix/pypy/translator/sandbox/sandlib.py
> Log:
> remove most of the remaining references to pypy/lib, and make them point
> to lib_pypy

From anto.cuni at gmail.com Tue Jun 1 10:05:22 2010
From: anto.cuni at gmail.com (Antonio Cuni)
Date: Tue, 01 Jun 2010 10:05:22 +0200
Subject: [pypy-dev] [pypy-svn] r74976 - in pypy/branch/sys-prefix: lib/pypy1.2/lib_pypy/ctypes_config_cache pypy/interpreter/test pypy/rlib pypy/tool pypy/tool/test pypy/translator/goal pypy/translator/sandbox
In-Reply-To:
References: <20100531150223.33D88282B90@codespeak.net>
Message-ID: <4C04BF42.5060304@gmail.com>

On 01/06/10 06:00, Maciej Fijalkowski wrote:
> A bit about directory structure:

I think I have explained everything in my original email to pypy-dev:
http://codespeak.net/pipermail/pypy-dev/2010q2/005854.html

> Can you explain to me a bit?
>
> What's in lib except pypy1.2?

nothing. It really plays the same role as /usr/lib. We need it because in
this way we can have sys.prefix == '/path/to/pypy-trunk' and still have
the lib in join(sys.prefix, 'lib', 'pypy%d.%d')

> Why pypy1.2? Do we have any reason?

yes. When we install pypy system wide, we really want to have a version
number in the directory that contains the stdlib; also, it is consistent
with cpython, which puts it into e.g. /usr/lib/python2.6

> Do we ever need to keep more than one?

no. We will rename it every time we do a new release.

> What's in pypy1.2 except lib_pypy?

there will also be lib-python, although I've not moved it yet.

ciao,
Anto

From andrewfr_ice at yahoo.com Tue Jun 1 19:57:46 2010
From: andrewfr_ice at yahoo.com (Andrew Francis)
Date: Tue, 1 Jun 2010 10:57:46 -0700 (PDT)
Subject: [pypy-dev] EuroPython Talk on Stackless.py and Questions
Message-ID: <324283.30017.qm@web111720.mail.gq1.yahoo.com>

Hi Folks:

I understand that stackless is not high on the pypy team priority list but
here goes....

My talk "Prototyping Go's Select for Stackless Python in Stackless.py" was
accepted. I have never been to Europe.

To date, I have been able to implement the Go Select-like capability in a
number of ways. Most recently I emulated the Plan9 approach, which is
pretty straightforward.
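As a rough illustration of the idea, here is a naive busy-waiting sketch
against stackless.py's public channel API (illustrative only, not the
actual Plan9-style implementation):

    import stackless

    def select(channels):
        # Spin until one of the channels has a sender blocked on it,
        # yielding to the scheduler in between; a real implementation
        # blocks the calling tasklet instead of polling.
        while True:
            for ch in channels:
                if ch.balance > 0:     # balance > 0: senders are waiting
                    return ch, ch.receive()
            stackless.schedule()       # let the other tasklets run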
I need to add the code for channel preferences and the channel callback so
it is on par with the old stackless.py.

Questions: Where should I put the new code? Are there regression tests for
stackless.py?

Right now, I am taking Stephan Diehl and Carl Bolz's good advice and using
stackless.py with greenlets. However, for completeness, I would like to
run through the exercise of using the translate tool chain.

Right now, select is a function. I figure implementing Select as a
language feature would be a good way to familiarise myself better with the
PyPy framework at a deeper level. However I am not sure where to start.
Where is the parser? Do I create new opcodes? Is this doable for a newbie
in a month?

Cheers,
Andrew

From fijall at gmail.com Wed Jun 2 06:51:30 2010
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Tue, 1 Jun 2010 22:51:30 -0600
Subject: [pypy-dev] EuroPython Talk on Stackless.py and Questions
In-Reply-To: <324283.30017.qm@web111720.mail.gq1.yahoo.com>
References: <324283.30017.qm@web111720.mail.gq1.yahoo.com>
Message-ID:

On Tue, Jun 1, 2010 at 11:57 AM, Andrew Francis wrote:
> Hi Folks:
>
> I understand that stackless is not high on the pypy team priority list
> but here goes....

Completely not answering your question (I don't know), but clarifying.
PyPy has no active developer working on stackless features. However, that
does not mean that pypy's priority for stackless features is low. We're
definitely going to support developments in that direction (at least
those that make sense in our opinion, of course :)

Cheers,
fijal

From andrewfr_ice at yahoo.com Wed Jun 2 15:49:18 2010
From: andrewfr_ice at yahoo.com (Andrew Francis)
Date: Wed, 2 Jun 2010 06:49:18 -0700 (PDT)
Subject: [pypy-dev] EuroPython Talk on Stackless.py and Questions
In-Reply-To:
Message-ID: <71820.75955.qm@web111725.mail.gq1.yahoo.com>

Hi Maciej:

--- On Tue, 6/1/10, Maciej Fijalkowski wrote:

> Completely not answering your question (I don't know), but
> clarifying. PyPy has no active developer working on stackless features.
> However, that does not mean that pypy's priority for stackless
> features is low. We're definitely going to support developments in that
> direction (at least those that make sense in our opinion of course :)

I don't know if this clarifies things, but my changes have been to
stackless.py, the API. I added the ability to monitor many channels at
once, a la Newsqueak/Limbo/Go. This should not break existing code (so
far it doesn't, but I need to write more tests). I also want to start
playing with join conditions
(http://en.wikipedia.org/wiki/Join-calculus) and pattern matching.

However I want to get deeper into PyPy. I would like to implement Select
as a language feature as an exercise rather than an actual change to the
language. I am looking at the Javascript and Smalltalk VMs but I don't
know where to start for Python itself. Also I wouldn't mind learning more
about the stackless transform.
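For reference, the stackless transform is enabled with a translate.py
switch; the invocation looked roughly like this at the time (flags from
memory, they may have changed):

    cd pypy/translator/goal
    python translate.py --stackless targetpypystandalone.py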
Cheers,
Andrew

From cfbolz at gmx.de Wed Jun 2 15:59:21 2010
From: cfbolz at gmx.de (Carl Friedrich Bolz)
Date: Wed, 02 Jun 2010 15:59:21 +0200
Subject: [pypy-dev] EuroPython Talk on Stackless.py and Questions
In-Reply-To: <71820.75955.qm@web111725.mail.gq1.yahoo.com>
References: <71820.75955.qm@web111725.mail.gq1.yahoo.com>
Message-ID: <4C0663B9.8080109@gmx.de>

On 06/02/2010 03:49 PM, Andrew Francis wrote:
> Hi Maciej:
>
> --- On Tue, 6/1/10, Maciej Fijalkowski wrote:
>
>> Completely not answering your question (I don't know), but
>> clarifying. PyPy has no active developer working on stackless
>> features. However, that does not mean that pypy's priority for
>> stackless features is low. We're definitely going to support
>> developments in that direction (at least those that make sense in
>> our opinion of course :)
>
> I don't know if this clarifies things, but my changes have been to
> stackless.py, the API. I added the ability to monitor many channels
> at once, a la Newsqueak/Limbo/Go. This should not break existing code
> (so far it doesn't, but I need to write more tests). I also want to
> start playing with join conditions
> (http://en.wikipedia.org/wiki/Join-calculus) and pattern matching.
>
> However I want to get deeper into PyPy. I would like to implement
> Select as a language feature as an exercise rather than an actual
> change to the language.

What exactly do you mean by "language feature"? I assume that select is
so far simply a function that lives in stackless.py, right? I don't
really see what other form select could take, so I also don't see in
what way you want to change the language.

> I am looking at the Javascript and Smalltalk VMs
> but I don't know where to start for Python itself.

The bytecode interpreter and the parser live in interpreter/, the object
implementations in objspace/std/ and the modules in modules/.

> Also I wouldn't
> mind learning more about the stackless transform.

Again, the stackless transformation doesn't really need to be touched to
implement select.

Cheers,

Carl Friedrich

From arigo at tunes.org Wed Jun 2 16:19:56 2010
From: arigo at tunes.org (Armin Rigo)
Date: Wed, 2 Jun 2010 16:19:56 +0200
Subject: [pypy-dev] EuroPython Talk on Stackless.py and Questions
In-Reply-To: <4C0663B9.8080109@gmx.de>
References: <71820.75955.qm@web111725.mail.gq1.yahoo.com> <4C0663B9.8080109@gmx.de>
Message-ID: <20100602141956.GA7829@code0.codespeak.net>

Hi Andrew,

On Wed, Jun 02, 2010 at 03:59:21PM +0200, Carl Friedrich Bolz wrote:
> > However I want to get deeper into PyPy. I would like to implement
> > Select as a language feature as an exercise rather than an actual
> > change to the language.
>
> What exactly do you mean by "language feature"?

More precisely, I should say that this kind of change to the syntax and
bytecode compiler is *really* uninteresting from our point of view. PyPy
is indeed about language implementations, but not about the syntactic
level. There are nice existing tools for generating parsers in various
languages; and as each of them has some issues with the Python language,
we just wrote the parser and compiler by hand and happily forgot about
it.

Instead, PyPy is about implementation features. Moreover, it focuses
more on features that can be meta-programmed, like the stackless
transformation (which works fine so far, and which does not need any
change to support e.g. various kinds of Select).

A bientot,

Armin.
From arigo at tunes.org Tue Jun 8 14:41:12 2010
From: arigo at tunes.org (Armin Rigo)
Date: Tue, 8 Jun 2010 14:41:12 +0200
Subject: [pypy-dev] Merging branch/blackhole-improvements
Message-ID: <20100608124112.GA20901@code0.codespeak.net>

Hi Fijal,

The branch/blackhole-improvements is ready to be merged as far as I can
tell. Do you have any issue (as release manager) if I try to merge it
now?

A bientot,

Armin.

From anto.cuni at gmail.com Thu Jun 10 00:26:07 2010
From: anto.cuni at gmail.com (Antonio Cuni)
Date: Thu, 10 Jun 2010 00:26:07 +0200
Subject: [pypy-dev] [pypy-svn] r75220 - in pypy/trunk/pypy: annotation annotation/test jit/backend jit/backend/llgraph jit/backend/llgraph/test jit/backend/llsupport jit/backend/llsupport/test jit/backend/test jit/backend/x86 jit/backend/x86/test jit/codewriter jit/codewriter/test jit/metainterp jit/metainterp/test jit/tl jit/tl/spli jit/tl/tla jit/tool jit/tool/test module/pypyjit module/pypyjit/test objspace/flow rpython rpython/lltypesystem rpython/lltypesystem/test rpython/memory/gctransform/test rpython/memory/test rpython/test tool/algo tool/algo/test translator/c translator/c/src translator/c/test translator/tool
In-Reply-To: <20100608214258.3E44E282BDE@codespeak.net>
References: <20100608214258.3E44E282BDE@codespeak.net>
Message-ID: <4C1014FF.4040902@gmail.com>

On 08/06/10 23:42, arigo at codespeak.net wrote:
> The number of changes is a bit huge though. The
> format of the static bytecodes used by the jit
> changed completely, and codewriter.py is now split
> among many files in the new directory pypy.jit.codewriter.
> There is also no longer ConstAddr, only ConstInt: prebuilt
> addresses are now turned into integers (which are
> symbolic, with the new class AddressAsInt, so they can
> be rendered by the C translation backend). Various
> related changes occurred here and there.

I didn't look at the branch deeply, but the last sentence looks
suspiciously hard/impossible to implement in ootype. Could you explain
why ConstAddr was bad, please?

ciao,
Anto

From arigo at tunes.org Sun Jun 13 08:46:41 2010
From: arigo at tunes.org (Armin Rigo)
Date: Sun, 13 Jun 2010 08:46:41 +0200
Subject: [pypy-dev] [pypy-svn] r75220 - in pypy/trunk/pypy: annotation annotation/test jit/backend jit/backend/llgraph jit/backend/llgraph/test jit/backend/llsupport jit/backend/llsupport/test jit/backend/test jit/backend/x86 jit/backend/x86/test jit/codewriter jit/codewriter/test jit/metainterp jit/metainterp/test jit/tl jit/tl/spli jit/tl/tla jit/tool jit/tool/test module/pypyjit module/pypyjit/test objspace/flow rpython rpython/lltypesystem rpython/lltypesystem/test rpython/memory/gctransform/test rpython/memory/test rpython/test tool/algo tool/algo/test translator/c translator/c/src translator/c/test translator/tool
In-Reply-To: <4C1014FF.4040902@gmail.com>
References: <20100608214258.3E44E282BDE@codespeak.net> <4C1014FF.4040902@gmail.com>
Message-ID: <20100613064641.GA7722@code0.codespeak.net>

Hi Anto,

On Thu, Jun 10, 2010 at 12:26:07AM +0200, Antonio Cuni wrote:
> I didn't look at the branch deeply, but the last sentence looks
> suspiciously hard/impossible to implement in ootype. Could you explain
> why ConstAddr was bad, please?

Because the blackhole interpreter doesn't use ConstXxx at all. The
constants are now encoded directly in the jitcode -- in three lists, a
list of integers, a list of references, and a list of floats. There is
almost no ConstXxx prebuilt any more.
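Schematically, that layout looks something like this (an illustrative
pseudo-structure, not the actual PyPy classes):

    class JitCode(object):
        def __init__(self, constants_i, constants_r, constants_f):
            # integer constants; prebuilt addresses are folded in here
            # as symbolic AddressAsInt values
            self.constants_i = constants_i
            self.constants_r = constants_r   # GC reference constants
            self.constants_f = constants_f   # float constants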
It seemed like a waste to need a fourth list just for addresses when they
would fit in the list of integer constants too, hence I did AddressAsInt.
It's a rather natural thing to do for lltype, given that we could already
convert between addresses and integers at runtime.

I'm a bit confused, btw: I thought that ootype did not need ConstAddr at
all, because it used ConstObj for all pointer-ish things.

A bientot,

Armin.

From anto.cuni at gmail.com Sun Jun 13 09:40:24 2010
From: anto.cuni at gmail.com (Antonio Cuni)
Date: Sun, 13 Jun 2010 09:40:24 +0200
Subject: [pypy-dev] [pypy-svn] r75220 - in pypy/trunk/pypy: annotation annotation/test jit/backend jit/backend/llgraph jit/backend/llgraph/test jit/backend/llsupport jit/backend/llsupport/test jit/backend/test jit/backend/x86 jit/backend/x86/test jit/codewriter jit/codewriter/test jit/metainterp jit/metainterp/test jit/tl jit/tl/spli jit/tl/tla jit/tool jit/tool/test module/pypyjit module/pypyjit/test objspace/flow rpython rpython/lltypesystem rpython/lltypesystem/test rpython/memory/gctransform/test rpython/memory/test rpython/test tool/algo tool/algo/test translator/c translator/c/src translator/c/test translator/tool
In-Reply-To: <20100613064641.GA7722@code0.codespeak.net>
References: <20100608214258.3E44E282BDE@codespeak.net> <4C1014FF.4040902@gmail.com> <20100613064641.GA7722@code0.codespeak.net>
Message-ID: <4C148B68.7030707@gmail.com>

On 13/06/10 08:46, Armin Rigo wrote:
> I'm a bit confused, btw: I thought that ootype did not need ConstAddr at
> all, because it used ConstObj for all pointer-ish things.

ah sorry. You wrote ConstAddr but I actually read ConstPtr (i.e.,
ConstObj for ootype). Indeed, ConstAddr is not used at all by ootype, so
there should be no problem (at least in this respect :-)).

ciao,
Anto

From holger at merlinux.eu Tue Jun 15 11:37:59 2010
From: holger at merlinux.eu (holger krekel)
Date: Tue, 15 Jun 2010 11:37:59 +0200
Subject: [pypy-dev] can somebody talk at DZUG/Dresden?
Message-ID: <20100615093759.GL17693@trillke.net>

Hi all,

I have been asked to talk at http://new.zope.de/tagung/Dresden_2010 about
PyPy, but I can't make it there. Anybody interested in giving a talk
about PyPy? Giving English talks is fine and happens a lot there. I think
we have good base material from past conferences, so if you are an
informed follower of the project and are interested - drop me a note.

best,
holger

From tom at tomlocke.com Thu Jun 17 10:21:53 2010
From: tom at tomlocke.com (Tom Locke)
Date: Thu, 17 Jun 2010 09:21:53 +0100
Subject: [pypy-dev] Seeking advice re. implementing an interpreter in RPython
Message-ID:

Hi Folks

I'm probably in the wrong place, so I'll make this quick : )

I am working on an experimental programming language, and am considering
building the next prototype of the interpreter in RPython. I just wanted
to ask which mailing-list is the best one to join for help and advice on
this topic?

Thanks

Tom

From arigo at tunes.org Thu Jun 17 10:34:58 2010
From: arigo at tunes.org (Armin Rigo)
Date: Thu, 17 Jun 2010 10:34:58 +0200
Subject: [pypy-dev] Seeking advice re. implementing an interpreter in RPython
In-Reply-To:
References:
Message-ID: <20100617083458.GA22461@code0.codespeak.net>

Hi Tom,

On Thu, Jun 17, 2010 at 09:21:53AM +0100, Tom Locke wrote:
> I'm probably in the wrong place, so I'll make this quick : )

You are not :-) Welcome. Feel free to ask here about RPython. You can
also join #pypy on irc.freenode.net.

A bientot,

Armin.
From tom at tomlocke.com Thu Jun 17 11:19:19 2010
From: tom at tomlocke.com (Tom Locke)
Date: Thu, 17 Jun 2010 10:19:19 +0100
Subject: [pypy-dev] Seeking advice re. implementing an interpreter in RPython
In-Reply-To: <20100617083458.GA22461@code0.codespeak.net>
References: <20100617083458.GA22461@code0.codespeak.net>
Message-ID:

Thanks for the welcome, and how nice it is to find a project on a
European time-zone, and not have to wait for those sleepy Americans to
wake up : )

By way of an introduction, did any of you guys notice "Logix" a few years
back? On-the-fly syntax extension and lisp-ish macros for Python. I'm the
guy that did that. Now sadly abandoned.

I am building what you might call a macro language or a template language
for code-generation. It is up and running in prototype form, but way too
slow.

I must confess to having jumped ship - I am mainly a Ruby guy these days,
and the prototype is in Ruby. But RPython is interesting enough to
perhaps bring me back - for this project at least - so congratulations
for that. Amazing project.

OK, to get down to business - I'll be starting with the parser. I notice
there is a packrat parser in the rlib directory. If that is in a working
state I'll be a happy man, as my existing grammar is for a Ruby packrat
parser (Treetop). I am guessing that the 'r' in 'rlib' means RPython?
Which I'm hoping means the packrat parser might be reasonably fast?

Any pointers to getting started with the packrat parser (or some other if
you don't advise that) would be much appreciated!

Tom

From william.leslie.ttg at gmail.com Thu Jun 17 10:29:24 2010
From: william.leslie.ttg at gmail.com (William Leslie)
Date: Thu, 17 Jun 2010 18:29:24 +1000
Subject: [pypy-dev] Seeking advice re. implementing an interpreter in RPython
In-Reply-To:
References:
Message-ID:

This is the right mailing list. You might like to join us on irc for more
involved questions, too. What are you working on?

On 17/06/2010 6:22 PM, "Tom Locke" wrote:

Hi Folks

I'm probably in the wrong place, so I'll make this quick : )

I am working on an experimental programming language, and am considering
building the next prototype of the interpreter in RPython. I just wanted
to ask which mailing-list is the best one to join for help and advice on
this topic?

Thanks

Tom

From ryan at rfk.id.au Thu Jun 17 12:02:55 2010
From: ryan at rfk.id.au (Ryan Kelly)
Date: Thu, 17 Jun 2010 20:02:55 +1000
Subject: [pypy-dev] using libffi from rpython - a better way?
Message-ID: <1276768975.2069.122.camel@durian>

Hi All,

First, let me say that PyPy just rocked my world. I develop an
auto-update framework for frozen python applications [1] and have started
using PyPy to compile parts of the app into stand-alone executables. I
got it up and running in under two days, and it shaves several MB off the
size of my frozen apps - so thanks for some awesome tech!

I'm currently using libffi to dynamically find and load a python DLL from
within an RPython program. It works well but the code seems very verbose
to someone who's used to working with ctypes. I'm hoping there's a better
way...
Here's a trimmed-down example of the sort of thing I'm doing::

    import ctypes.util
    libpython = ctypes.util.find_library("python2.6")

    from pypy.rlib.libffi import *
    from pypy.rpython.lltypesystem import rffi, lltype

    def target(*args):
        def entry_point(argv):
            # Find the python DLL and extract needed functions
            py = CDLL(libpython)
            Py_Initialize = py.getpointer("Py_Initialize", [], ffi_type_void)
            Py_Finalize = py.getpointer("Py_Finalize", [], ffi_type_void)
            PyRun_SimpleString = py.getpointer("PyRun_SimpleString",
                                               [ffi_type_pointer], ffi_type_sint)
            # Bootstrap into running a python program
            Py_Initialize.call(lltype.Void)
            buf = rffi.str2charp("print 'hello from python!'")
            PyRun_SimpleString.push_arg(buf)
            PyRun_SimpleString.call(rffi.INT)
            rffi.free_charp(buf)
            Py_Finalize.call(lltype.Void)
            return 0
        return entry_point, None

As you can imagine, the real application has a lot more of these
boilerplate declarations. Any suggestions on how to do this in
less/cleaner code?

Thanks,

Ryan

[1] http://pypi.python.org/pypi/esky/

--
Ryan Kelly
http://www.rfk.id.au | This message is digitally signed. Please visit
ryan at rfk.id.au     | http://www.rfk.id.au/ramblings/gpg/ for details

From amauryfa at gmail.com Thu Jun 17 13:35:03 2010
From: amauryfa at gmail.com (Amaury Forgeot d'Arc)
Date: Thu, 17 Jun 2010 13:35:03 +0200
Subject: [pypy-dev] using libffi from rpython - a better way?
In-Reply-To: <1276768975.2069.122.camel@durian>
References: <1276768975.2069.122.camel@durian>
Message-ID:

Hi,

2010/6/17 Ryan Kelly :
>
>             # Find the python DLL and extract needed functions
>             py = CDLL(libpython)
>             Py_Initialize = py.getpointer("Py_Initialize", [], ffi_type_void)
>             Py_Finalize = py.getpointer("Py_Finalize", [], ffi_type_void)
>             PyRun_SimpleString = py.getpointer("PyRun_SimpleString",
>                                                [ffi_type_pointer], ffi_type_sint)
>
> As you can imagine, the real application has a lot more of these
> boilerplate declarations. Any suggestions on how to do this in
> less/cleaner code?

For the cpyext module, we used the documentation to generate stubs for
every function of the API. It's easy to write a sphinx extension that
processes every "cfunction" directive in the documentation.

You could start with pypy/module/cpyext/stubgen.py and modify it for your
needs, which are obviously different.

It's poorly documented, but the comment for r72933 says:

The stub generator works as follows:
1. Go to Doc (in your CPython checkout).
2. Run make text and stop it after it downloaded the prerequisites.
3. Apply the patch Doc_stubgen_enable.patch in your CPython checkout.
4. Run make text again with PyPy in your Python path.
5. Voila, the stubs.py file will be updated.

--
Amaury Forgeot d'Arc

From cfbolz at gmx.de Thu Jun 17 13:40:10 2010
From: cfbolz at gmx.de (Carl Friedrich Bolz)
Date: Thu, 17 Jun 2010 13:40:10 +0200
Subject: [pypy-dev] Seeking advice re. implementing an interpreter in RPython
In-Reply-To:
References: <20100617083458.GA22461@code0.codespeak.net>
Message-ID: <4C1A099A.4040806@gmx.de>

Hi Tom,

On 06/17/2010 11:19 AM, Tom Locke wrote:
> Thanks for the welcome, and how nice it is to find a project on a
> European time-zone, and not have to wait for those sleepy Americans
> to wake up : ) By way of an introduction, did any of you guys notice
> "Logix" a few years back?
> On-the-fly syntax extension and lisp-ish macros for Python. I'm the
> guy that did that. Now sadly abandoned.
>
> I am building what you might call a macro language or a template
> language for code-generation. It is up and running in prototype form,
> but way too slow.
>
> I must confess to having jumped ship - I am mainly a Ruby guy these
> days, and the prototype is in Ruby. But RPython is interesting enough
> to perhaps bring me back - for this project at least - so
> congratulations for that. Amazing project.
>
> OK, to get down to business - I'll be starting with the parser.

In general the PyPy attitude is that parsers are totally uninteresting.

> I notice there is a packrat parser in the rlib directory. If that is
> in a working state I'll be a happy man, as my existing grammar is
> for a Ruby packrat parser (Treetop). I am guessing that the 'r' in
> 'rlib' means RPython?

Yes, that's correct.

> Which I'm hoping means the packrat parser might be reasonably fast?

As I wrote the stuff in the rlib/parsing directory, I guess I should
answer that. There are actually two different packrat-parsing approaches
in the parsing directory. Both of them are not particularly polished or
particularly fast. They might still be useful for you, but you have to
try and see.

> Any pointers to getting started with the packrat parser (or some
> other if you don't advise that) would be much appreciated!

There is this:

http://codespeak.net/pypy/dist/pypy/doc/rlib.html#parsing

Apart from that, you probably have to look at the code or the tests in
rlib/parsing.

Cheers,

Carl Friedrich

From holger at merlinux.eu Thu Jun 17 13:46:41 2010
From: holger at merlinux.eu (holger krekel)
Date: Thu, 17 Jun 2010 13:46:41 +0200
Subject: [pypy-dev] Seeking advice re. implementing an interpreter in RPython
In-Reply-To: <4C1A099A.4040806@gmx.de>
References: <20100617083458.GA22461@code0.codespeak.net> <4C1A099A.4040806@gmx.de>
Message-ID: <20100617114641.GW17693@trillke.net>

On Thu, Jun 17, 2010 at 13:40 +0200, Carl Friedrich Bolz wrote:
> On 06/17/2010 11:19 AM, Tom Locke wrote:
> > OK, to get down to business - I'll be starting with the parser.
>
> In general the PyPy attitude is that parsers are totally uninteresting.

Luckily from time to time we have people who care, though.

holger

From cfbolz at gmx.de Thu Jun 17 13:48:34 2010
From: cfbolz at gmx.de (Carl Friedrich Bolz)
Date: Thu, 17 Jun 2010 13:48:34 +0200
Subject: [pypy-dev] Seeking advice re. implementing an interpreter in RPython
In-Reply-To: <20100617114641.GW17693@trillke.net>
References: <20100617083458.GA22461@code0.codespeak.net> <4C1A099A.4040806@gmx.de> <20100617114641.GW17693@trillke.net>
Message-ID: <4C1A0B92.1060707@gmx.de>

On 06/17/2010 01:46 PM, holger krekel wrote:
> On Thu, Jun 17, 2010 at 13:40 +0200, Carl Friedrich Bolz wrote:
>> On 06/17/2010 11:19 AM, Tom Locke wrote:
>>> OK, to get down to business - I'll be starting with the parser.
>>
>> In general the PyPy attitude is that parsers are totally uninteresting.
>
> Luckily from time to time we have people who care, though.

Yes, luckily. Occasionally I am even one of them :-).
Carl Friedrich

From andrewfr_ice at yahoo.com Thu Jun 17 14:39:53 2010
From: andrewfr_ice at yahoo.com (Andrew Francis)
Date: Thu, 17 Jun 2010 05:39:53 -0700 (PDT)
Subject: [pypy-dev] EuroPython Talk on Stackless.py and Questions
In-Reply-To:
Message-ID: <508973.72046.qm@web120005.mail.ne1.yahoo.com>

Hi Carl:

Message: 2
Date: Wed, 02 Jun 2010 15:59:21 +0200
From: Carl Friedrich Bolz
Subject: Re: [pypy-dev] EuroPython Talk on Stackless.py and Questions
To: pypy-dev at codespeak.net
Message-ID: <4C0663B9.8080109 at gmx.de>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

> What exactly do you mean by "language feature"?

As in altering the syntax of Python.

> I assume that select is so far simply a function that lives in
> stackless.py, right?

Yes.

> I don't really see what other form select could take, so I also
> don't see in what way you want to change the language.

The Newsqueak/Limbo/Go family of languages shows that select and channels
can be implemented as language features. It is from Limbo that Stackless
Python gets channels.

> The bytecode interpreter and the parser live in interpreter/, the object
> implementations in objspace/std/ and the modules in modules/.

Thanks. I found it and started to look at it.

> > Also I wouldn't
> > mind learning more about the stackless transform.
>
> Again, the stackless transformation doesn't really need to be touched to
> implement select.

True. However I would like to learn how the stackless transform works.

Cheers,
Andrew

From andrewfr_ice at yahoo.com Thu Jun 17 15:13:26 2010
From: andrewfr_ice at yahoo.com (Andrew Francis)
Date: Thu, 17 Jun 2010 06:13:26 -0700 (PDT)
Subject: [pypy-dev] EuroPython Talk on Stackless.py and Questions
In-Reply-To:
Message-ID: <964940.80590.qm@web120017.mail.ne1.yahoo.com>

Hi Armin:

Message: 3
Date: Wed, 2 Jun 2010 16:19:56 +0200
From: Armin Rigo
Subject: Re: [pypy-dev] EuroPython Talk on Stackless.py and Questions
To: Carl Friedrich Bolz
Cc: pypy-dev at codespeak.net
Message-ID: <20100602141956.GA7829 at code0.codespeak.net>
Content-Type: text/plain; charset=us-ascii

> More precisely, I should say that this kind of change to the syntax and
> bytecode compiler is *really* uninteresting from our point of view.
> PyPy is indeed about language implementations, but not about the
> syntactic level. There are nice existing tools for generating parsers
> in various languages; and as each of them has some issues with the
> Python language, we just wrote the parser and compiler by hand and
> happily forgot about it.

Armin, I understand. However I am approaching this from another angle. I
view myself as an application programmer. I know the Stackless API
relatively well, and its algorithms. Enough to implement select (not
hard). Outside of selecting the right switches for translate.py or just
sticking to greenlets, there really isn't much PyPy to learn, unless I
decide to re-write parts specifically in RPython for speed. I may do this
so I can better acquaint myself with PyPy.

Implementing select is an important exercise because it is a stepping
stone for me (and hopefully others) to experiment with more exotic
features. For example, I am becoming interested in Complex Event
Processing that has pattern matching rules.
One may want language features to support those. So I am interested in
the low-level details of how to alter PyPy's implementation of Python
syntax. Maybe to help in the exercise, it would be nice if I could get
advice about exactly what I need to change.

Getting advice from Carl and Stephan about using greenlets *radically*
sped up my ability to implement stackless.py - I would never have figured
that out on my own.

Cheers,
Andrew

From arigo at tunes.org Thu Jun 17 15:36:35 2010
From: arigo at tunes.org (Armin Rigo)
Date: Thu, 17 Jun 2010 15:36:35 +0200
Subject: [pypy-dev] EuroPython Talk on Stackless.py and Questions
In-Reply-To: <964940.80590.qm@web120017.mail.ne1.yahoo.com>
References: <964940.80590.qm@web120017.mail.ne1.yahoo.com>
Message-ID: <20100617133635.GA11503@code0.codespeak.net>

Hi Andrew,

On Thu, Jun 17, 2010 at 06:13:26AM -0700, Andrew Francis wrote:
> (...) One may want language features to support those. So I am
> interested in the low-level details of how to alter PyPy's
> implementation of Python syntax.

Sure, feel free to ask, here or on the #pypy channel of
irc.freenode.net. I suppose that you have got the point "we are
generally not interested in syntax changes" by now, but we should still
be around to give you some help.

A bientot,

Armin.

From alexander.belopolsky at gmail.com Fri Jun 18 02:50:59 2010
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Thu, 17 Jun 2010 20:50:59 -0400
Subject: [pypy-dev] Adding pure python implementation of datetime to CPython 3.x
Message-ID:

Dear PyPy-Dev,

I have started porting the PyPy implementation of datetime.py to CPython
3.x with the goal of distributing it together with the C implementation.
The idea is for the C and Python modules to share most of the regression
test cases and to allow platforms that cannot use C modules to use the
python implementation.

The benefit for CPython developers is that this will allow rapid
prototyping and experimentation with new features. The benefit for
CPython users is that they will be able to consult the python
implementation as ultimately detailed documentation.

Since your project did an admirable job maintaining datetime.py in recent
years, your input will be very much appreciated. Please see CPython
tracker issue 7989 [1] for details and to leave comments. I already have
a patch there which updates PyPy's datetime.py to Python 2.7. You should
be able to apply it to your tree when you upgrade to 2.7.

--
[1] http://bugs.python.org/issue7989

From ademan555 at gmail.com Fri Jun 18 05:16:37 2010
From: ademan555 at gmail.com (Dan Roberts)
Date: Thu, 17 Jun 2010 20:16:37 -0700
Subject: [pypy-dev] lltype Questions
Message-ID: <1276830997.3153.8.camel@StormEagle>

Hey Everybody,

It seems no one's in #pypy at the moment, so I figured I'd post to the
mailing list for non-realtime questioning. I wrote my implementation of
PyUnicode_DecodeUTF16(), which is correct as best I can tell (and it
passes its test, but I wrote the test too :-) ).

The code is here: http://paste.pocoo.org/show/226753/ and its test is
here: http://paste.pocoo.org/show/226752/ . I have a few questions: like
I said, the test passes, but I'm unsure of some of the code I've written.
There are XXX and FIXME and other comments throughout the code; those
represent places where I'm unsure.

In the test code, line 6, I'm not sure if that malloc() is correct, and
even if it is, would it have been better (and also valid) to
lltype.malloc(lltype.Signed, ...) ? Then from line 7 to 12 I assign
"native" integers to the malloc-ed memory, is this done correctly?
ex. pendian[0] = -1, or should it be pendian[0] =
rffi.cast(lltype.Signed, -1) ?

In the implementation, I'm fairly certain I've marked everything with
FIXME or XXX, so there's not much more to say about that... I'd really
appreciate anyone who's willing to take a look.

Thanks,
Dan

From amauryfa at gmail.com Fri Jun 18 10:19:07 2010
From: amauryfa at gmail.com (Amaury Forgeot d'Arc)
Date: Fri, 18 Jun 2010 10:19:07 +0200
Subject: [pypy-dev] lltype Questions
In-Reply-To: <1276830997.3153.8.camel@StormEagle>
References: <1276830997.3153.8.camel@StormEagle>
Message-ID:

Hi,

2010/6/18 Dan Roberts :
> Hey Everybody,
>         It seems no one's in #pypy at the moment, so I figured I'd post
> to the mailing list for non-realtime questioning. I wrote my
> implementation of PyUnicode_DecodeUTF16() which is correct as best I can
> tell (and it passes its test, but I wrote the test too :-) ).
>         The code is here: http://paste.pocoo.org/show/226753/ and its
> test is here: http://paste.pocoo.org/show/226752/ . I have a few
> questions, like I said, the test passes, but I'm unsure of some of the
> code I've written. There are XXX and FIXME and other comments throughout
> the code, those represent places where I'm unsure.
>   In the test code, line 6 I'm not sure if that malloc() is correct,
> and even if it is, would it have been better (and also valid) to
> lltype.malloc(lltype.Signed, ...) ?  Then from line 7 to 12 I assign
> "native" integers to the malloc-ed memory, is this done correctly? ex.
> pendian[0] = -1 or should it be pendian[0] = rffi.cast(lltype.Signed,
> -1) ?
>   In the implementation, I'm fairly certain I've marked everything with
> FIXME or XXX, so there's not much more to say about that... I'd really
> appreciate anyone who's willing to take a look.

All this looks quite good, except for the second cast - it should be

    pbyteorder[0] = rffi.cast(rffi.INT, byteorder)

And I suggest to add another test: automatic detection of the byte order
when a Byte Order Mark is present.

--
Amaury Forgeot d'Arc

From fijall at gmail.com Fri Jun 18 19:01:44 2010
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Fri, 18 Jun 2010 11:01:44 -0600
Subject: [pypy-dev] virtualenv support and directory hierarchy
In-Reply-To: <4BFD364B.5030903@gmail.com>
References: <4BFD364B.5030903@gmail.com>
Message-ID:

Hey anto.

I expressed my concerns already, but I'm going to express them once
again, since amaury also said that on IRC.

I really dislike having the version number hardcoded in the source
checkout for a whole variety of reasons. I don't think the need to have
virtualenv working from a source checkout is enough to push that through.

How about an install script that does that, maybe?

Cheers,
fijal

On Wed, May 26, 2010 at 8:55 AM, Antonio Cuni wrote:
> Hi all,
>
> I am investigating how to make virtualenv work on pypy and I'm running
> into a couple of issues: the most important one is that virtualenv
> relies on sys.prefix (which does not exist in pypy) to find the standard
> library, and the other is that the standard library of pypy is supposed
> to be put in /usr/share instead of /usr/lib (or /usr/local/*).
>
> Currently, a pypy installation is supposed to have this structure:
>         /usr/bin/pypy-c
>         /usr/share/pypy-1.2/pypy/lib/
>         /usr/share/pypy-1.2/lib-python/modified-2.5.2
>         /usr/share/pypy-1.2/lib-python/2.5.2
>
> In such a situation, sys.pypy_prefix is set to '/usr/share/pypy-1.2'.
>
> I propose to change it in this way:
>         /usr/bin/pypy-c
>         /usr/lib/pypy1.2/lib-pypy/
>         /usr/lib/pypy1.2/lib-python/modified-2.5.2
>         /usr/lib/pypy1.2/lib-python/2.5.2
>
> where lib-pypy contains what is now in pypy/lib.
> In such a situation, sys.prefix would be set to '/usr', in a similar way
> as cpython. Also, we should add a sys.exec_prefix which is meant to be
> always equal to sys.prefix (at least for now).
> (I removed the dash in pypy-1.2 for consistency with cpython, which uses
> something like lib/python2.6).
>
> Moreover, I would also like virtualenv to work from an svn
> checkout/source tarball of pypy, without any need of installing it
> system-wide. To do so, we need to find a sensible value for sys.prefix,
> considering that tools like virtualenv expect to find the stdlib under
> sys.prefix+'lib/'+something_else.
>
> So, the proposed new structure is this:
>
>         /path/to/pypy-trunk/
>         /path/to/pypy-trunk/lib/pypy1.2/{lib-pypy,modified-2.5.2,2.5.2}
>         sys.prefix == '/path/to/pypy-trunk'
>
> The drawback is that before getting to the real files you have to walk
> a lot of empty directories, and that we should manually change the name
> of pypy1.2 each time we increase the version number (not that it happens
> very often :-)). One side-advantage is that in this way we would move
> pypy/lib outside the main pypy package, which is good because it's not
> really a part of it.
>
> Finally, we should probably think of where to put the include/ directory
> (plus others that might be needed to build extensions), but I'll let the
> cpyext experts say what's better :-)
>
> What do you think? Any comment/suggestion/problem that I overlooked?
>
> ciao,
> Anto

From anto.cuni at gmail.com Sat Jun 19 10:29:58 2010
From: anto.cuni at gmail.com (Antonio Cuni)
Date: Sat, 19 Jun 2010 10:29:58 +0200
Subject: [pypy-dev] virtualenv support and directory hierarchy
In-Reply-To:
References: <4BFD364B.5030903@gmail.com>
Message-ID: <4C1C8006.1030009@gmail.com>

Hi,

I also don't like too much to have a hardcoded version number, but when I
asked for alternatives nobody suggested anything.

On IRC, amaury suggested to install the whole pypy distribution in its
own self-contained directory, more or less as cpython does on windows. I
didn't think about this solution, but now I see that it might be a good
one, as it would allow us to have the same hierarchy for svn checkouts,
user-specific installations and system-wide installations.

The drawback is that it's a bit non-standard on unix; moreover, if we
install pypy in say /opt/pypy1.2, it would be hard to put a binary in
/usr/bin without hardcoding the path to pypy1.2 somewhere.

So, for me there are four possible solutions:

1) leave things as they are on branch/sys-prefix and have the version
   number hardcoded in svn

2) put the whole distribution in its own directory, as jython or cpython
   on windows. This has the open problem of determining where the
   directory is, as described above

3) don't hardcode the version number in svn, but add a special case to
   virtualenv to detect if we are inside a pypy checkout and handle it
   specially

4) don't care about running virtualenv inside a pypy checkout

Personally, I would exclude (4): I think that it would be very cool for
people to try pypy in a sandboxed virtualenv without having to install
it, and it would be useful to us too.

So, before I do more work, I'd like to hear what people think and which
of the alternatives they prefer.
ciao,
Anto

On 18/06/10 19:01, Maciej Fijalkowski wrote:
> Hey anto.
>
> I expressed my concerns already, but I'm going to express them once
> again, since amaury also said that on IRC.
[cut]

From arigo at tunes.org Sat Jun 19 10:36:31 2010
From: arigo at tunes.org (Armin Rigo)
Date: Sat, 19 Jun 2010 10:36:31 +0200
Subject: [pypy-dev] virtualenv support and directory hierarchy
In-Reply-To: <4C1C8006.1030009@gmail.com>
References: <4BFD364B.5030903@gmail.com> <4C1C8006.1030009@gmail.com>
Message-ID: <20100619083631.GA31070@code0.codespeak.net>

Hi Anto,

On Sat, Jun 19, 2010 at 10:29:58AM +0200, Antonio Cuni wrote:
> The drawback is that it's a bit non-standard on unix; moreover, if we
> install pypy in say /opt/pypy1.2, it would be hard to put a binary in
> /usr/bin without hardcoding the path to pypy1.2 somewhere.

No, we can put a symlink in /usr/bin going to /opt/pypy1.2/bin/pypy. I
have seen a few projects do that and it would just work (the current
logic to search for the pypy directory follows symlinks).

A bientot,

Armin.

From amauryfa at gmail.com Sat Jun 19 11:00:10 2010
From: amauryfa at gmail.com (Amaury Forgeot d'Arc)
Date: Sat, 19 Jun 2010 11:00:10 +0200
Subject: [pypy-dev] virtualenv support and directory hierarchy
In-Reply-To: <4C1C8006.1030009@gmail.com>
References: <4BFD364B.5030903@gmail.com> <4C1C8006.1030009@gmail.com>
Message-ID:

Hi,

2010/6/19 Antonio Cuni :
> I also don't like too much to have a hardcoded version number, but when
> I asked for alternatives nobody suggested anything.
[cut]
> The drawback is that it's a bit non-standard on unix; moreover, if we
> install pypy in say /opt/pypy1.2, it would be hard to put a binary in
> /usr/bin without hardcoding the path to pypy1.2 somewhere.

Is it possible to just put a symlink, or a small script in /usr/bin?

--
Amaury Forgeot d'Arc

From anto.cuni at gmail.com Sat Jun 19 11:07:29 2010
From: anto.cuni at gmail.com (Antonio Cuni)
Date: Sat, 19 Jun 2010 11:07:29 +0200
Subject: [pypy-dev] virtualenv support and directory hierarchy
In-Reply-To:
References: <4BFD364B.5030903@gmail.com> <4C1C8006.1030009@gmail.com>
Message-ID: <4C1C88D1.20308@gmail.com>

On 19/06/10 11:00, Amaury Forgeot d'Arc wrote:
>> The drawback is that it's a bit non-standard on unix; moreover, if we
>> install pypy in say /opt/pypy1.2, it would be hard to put a binary in
>> /usr/bin without hardcoding the path to pypy1.2 somewhere.
>
> Is it possible to just put a symlink, or a small script in /usr/bin?

yes, the symlink should be possible, as armin also points out. I already
thought about it, but I was not sure that distributions like ubuntu
allow putting a symlink in /usr/bin to something external. But indeed,
looking at my /usr/bin it seems that symlinks are used a lot, so it
should be fine.

So, does this mean that we are going for solution number 2?
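As an aside, a toy sketch of that symlink-following idea (hypothetical
code, not PyPy's actual startup logic):

    import os

    def find_prefix(executable):
        # /usr/bin/pypy -> /opt/pypy1.2/bin/pypy: realpath resolves the
        # symlink chain, then we walk up from bin/ to the real prefix.
        path = os.path.realpath(executable)
        bindir = os.path.dirname(path)     # e.g. /opt/pypy1.2/bin
        return os.path.dirname(bindir)     # e.g. /opt/pypy1.2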
From fijall at gmail.com Sat Jun 19 18:07:52 2010
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Sat, 19 Jun 2010 10:07:52 -0600
Subject: [pypy-dev] virtualenv support and directory hierarchy
In-Reply-To: <4C1C88D1.20308@gmail.com>
References: <4BFD364B.5030903@gmail.com> <4C1C8006.1030009@gmail.com> <4C1C88D1.20308@gmail.com>
Message-ID:

Personally, if given such a choice I would go with 4) (don't care about
virtualenv from a source checkout). I don't care myself and I don't think
many people would care (since they'll get pypy installed from say ubuntu
or fedora most likely).

Cheers,
fijal

On Sat, Jun 19, 2010 at 3:07 AM, Antonio Cuni wrote:
> On 19/06/10 11:00, Amaury Forgeot d'Arc wrote:
>
>> Is it possible to just put a symlink, or a small script in /usr/bin?
>
> yes, the symlink should be possible, as armin also points out.
[cut]

From tobami at googlemail.com Fri Jun 25 13:08:06 2010
From: tobami at googlemail.com (Miquel Torres)
Date: Fri, 25 Jun 2010 13:08:06 +0200
Subject: [pypy-dev] New speed.pypy.org version
Message-ID:

Hi all!,

I want to announce a new version of the benchmarks site speed.pypy.org.

After about 6 months, it finally shows the vision I had for such a
website: useful for pypy developers but also for the general public
following pypy's or even other python implementations' development. On to
the changes.

There are now three views: "Changes", "Timeline" and "Comparison":

The Overview was renamed to Changes, and its inline plot bars got removed
because you can get the exact same plot in the Comparison view now (and
then some).

The Timeline got a selectable baseline and "humanized" date labels for
the x axis.

The new Comparison view allows, well, comparing of "competing"
interpreters, which will also be of interest to the wider Python
community (especially if we can add unladen, ironpython and Jython
results).

Two examples of interesting comparisons are:

- relative bars
  (http://speed.pypy.org/comparison/?bas=2%2B35&chart=relative+bars):
  here we see that the jit is faster than psyco in all cases except
  spambayes and slowspitfire, where the jit cannot make up for pypy-c's
  abysmal performance. Interestingly, in the only other case where the
  jit is slower than cpython, the ai benchmark, psyco performs even
  worse.

- stacked bars horizontal
  (http://speed.pypy.org/comparison/?hor=true&bas=2%2B35&chart=stacked+bars):
  This is not meant to "demonstrate" that overall the jit is over two
  times faster than cpython. It is just another way for a developer to
  picture how long a programme would take to complete if it were composed
  of 21 such tasks. You can see that cpython's (the normalization chosen)
  benchmarks all take 1 "relative" second. pypy-c needs more or less the
  same time, some "tasks" being slower and some faster. Psyco shows an
  interesting picture: From meteor-contest downwards (fortuitously), all
  benchmarks are extremely "compressed", which means they are speeded up
  by psyco quite a lot.
But any further speed up wouldn't make overall time much shorter because
the first group of benchmarks now takes most of the time to complete.
pypy-c-jit is a more extreme case of this: if the jit accelerated all
"fast" benchmarks to 0 seconds (infinitely fast), it would only get about
twice as fast as now, because ai, slowspitfire, spambayes and twisted_tcp
now need half the entire execution time. A good demonstration of "you are
only as fast as your slowest part". Of course the aggregate of all
benchmarks is not a real app, but it is still fun.

I hope you find the new version useful, and as always any feedback is
welcome.

Cheers!
Miquel

From anto.cuni at gmail.com Fri Jun 25 13:43:25 2010
From: anto.cuni at gmail.com (Antonio Cuni)
Date: Fri, 25 Jun 2010 13:43:25 +0200
Subject: [pypy-dev] New speed.pypy.org version
In-Reply-To:
References:
Message-ID: <4C24965D.2030808@gmail.com>

Hi Miquel,

On 25/06/10 13:08, Miquel Torres wrote:
> Hi all!,
[cut]
> I hope you find the new version useful, and as always any feedback is
> welcome.

well... what to say? I simply like it *a lot*. Thank you :-)

I especially think that the "Changes" view is very useful for us
developers, in particular the fact that you can see the log for all the
revisions that affected the change: it is something that we did tons of
times manually, it's nice to see it automated :-).

ciao,
Anto

From p.giarrusso at gmail.com Fri Jun 25 14:07:44 2010
From: p.giarrusso at gmail.com (Paolo Giarrusso)
Date: Fri, 25 Jun 2010 14:07:44 +0200
Subject: [pypy-dev] New speed.pypy.org version
In-Reply-To:
References:
Message-ID:

Hi!

First, I want to restate the obvious, before pointing out what I think is
a mistake: your work on this website is great and very useful!

On Fri, Jun 25, 2010 at 13:08, Miquel Torres wrote:
> - stacked bars

Here you are summing up normalized times, which is more or less like
taking their arithmetic average. And that doesn't work at all: in many
cases you can "show" completely different results by normalizing relative
to another item. Even the simple question "who is faster?" can be
answered in different ways. So you should use the geometric mean, even if
this is not so widely known. Or better, it is known by benchmarking
experts, but it's difficult to become one.

Please, have a look at the short paper:

"How not to lie with statistics: the correct way to summarize benchmark
results"
http://scholar.google.com/scholar?cluster=1051144955483053492&hl=en&as_sdt=2000

I downloaded it from the ACM library, please tell me if you can't find
it.

> horizontal(http://speed.pypy.org/comparison/?hor=true&bas=2%2B35&chart=stacked+bars):
> This is not meant to "demonstrate" that overall the jit is over two
> times faster than cpython. It is just another way for a developer to
> picture how long a programme would take to complete if it were composed
> of 21 such tasks.

You are not summing up absolute times, so your claim is incorrect. And
the error is significant, given the above paper. A sum of absolute times
would provide what you claim.

> You can see that cpython's (the normalization chosen) benchmarks all
> take 1 "relative" second.

Here, for instance, I see that CPython and pypy-c take more or less the
same time, which surprises me (since the PyPy interpreter was known to be
slower than CPython). But given that the result is invalid, it may well
be an artifact of your statistics.
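To make the normalization trap concrete, here is a toy example (editorial
illustration with invented numbers, not taken from speed.pypy.org):

    import math

    a = [1.0, 10.0]   # absolute times of interpreter A on two benchmarks
    b = [2.0,  5.0]   # absolute times of interpreter B

    b_over_a = [x / y for x, y in zip(b, a)]   # B normalized to A: [2.0, 0.5]
    a_over_b = [y / x for x, y in zip(b, a)]   # A normalized to B: [0.5, 2.0]

    def arith(xs):
        return sum(xs) / len(xs)

    def geom(xs):
        return math.exp(sum(math.log(x) for x in xs) / len(xs))

    # Arithmetic mean of normalized times: each interpreter looks 25%
    # slower than the other, depending only on the baseline chosen.
    print arith(b_over_a), arith(a_over_b)   # 1.25 1.25

    # Geometric mean: baseline-independent, and correctly reports a tie.
    print geom(b_over_a), geom(a_over_b)     # 1.0 1.0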
> pypy-c needs more or less the same time, some "tasks" being slower and
> some faster. Psyco shows an interesting picture: From meteor-contest
> downwards (fortuitously), all benchmarks are extremely "compressed",
> which means they are speeded up by psyco quite a lot. But any further
> speed up wouldn't make overall time much shorter because the first group
> of benchmarks now takes most of the time to complete. pypy-c-jit is a
> more extreme case of this: if the jit accelerated all "fast" benchmarks
> to 0 seconds (infinitely fast), it would only get about twice as fast as
> now, because ai, slowspitfire, spambayes and twisted_tcp now need half
> the entire execution time. A good demonstration of "you are only as fast
> as your slowest part". Of course the aggregate of all benchmarks is not
> a real app, but it is still fun.

This could maybe still be true, at least in part, but you have to do this
reasoning on absolute times.

Best regards, and keep up the good work!
--
Paolo Giarrusso - Ph.D. Student
http://www.informatik.uni-marburg.de/~pgiarrusso/

From fijall at gmail.com Fri Jun 25 17:42:13 2010
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Fri, 25 Jun 2010 09:42:13 -0600
Subject: [pypy-dev] New speed.pypy.org version
In-Reply-To:
References:
Message-ID:

On Fri, Jun 25, 2010 at 5:08 AM, Miquel Torres wrote:
> Hi all!,
>
> I want to announce a new version of the benchmarks site speed.pypy.org.
[cut]
pypy-c-jit is a > more extreme case of this: If the jit accelerated all "fast" benchmarks to 0 > seconds (infinitely fast), it would only get about twice as fast as now > because ai, slowspitfire, spambayes and twisted_tcp now need half the entire > execution time. An good demonstration of "you are only as fast as your > slowest part". Of course the aggregate of all benchmarks is not a real app, > but it is still fun. > > I hope you find the new version useful, and as always any feedback is > welcome. > > Cheers! > Miquel > Wow, I really like it, great job. Can we see how we can use this features for branches? Cheers, fijal From fijall at gmail.com Fri Jun 25 17:53:45 2010 From: fijall at gmail.com (Maciej Fijalkowski) Date: Fri, 25 Jun 2010 09:53:45 -0600 Subject: [pypy-dev] New speed.pypy.org version In-Reply-To: References: Message-ID: Hey Paolo. While in general I agree with you, this is not exactly science, I still think it's giving somewhat an impression what's going on to outsiders. Inside, we still look mostly at particular benchmarks. I'm not sure having any convoluted (at least to normal people) metric while summarizing would help, maybe. Speaking a bit on Miguel's behalf, feel free to implement this as a feature on codespeed (it's an open source project after all), you can fork it on github http://github.com/tobami/codespeed. Cheers, fijal On Fri, Jun 25, 2010 at 6:07 AM, Paolo Giarrusso wrote: > Hi! > First, I want to restate the obvious, before pointing out what I think > is a mistake: your work on this website is great and very useful! > > On Fri, Jun 25, 2010 at 13:08, Miquel Torres wrote: >> - stacked bars > Here you are summing up normalized times, which is more or less like > taking their arithmetic average. And that doesn't work at all: in many > cases you can "show" completely different results by normalizing > relatively to another item. Even the simple question "who is faster?" > can be answered in different ways > So you should use the geometric mean, even if this is not so widely > known. Or better, it is known by benchmarking experts, but it's > difficult to become so. > > Please, have a look at the short paper: > "How not to lie with statistics: the correct way to summarize benchmark results" > http://scholar.google.com/scholar?cluster=1051144955483053492&hl=en&as_sdt=2000 > I downloaded it from the ACM library, please tell me if you can't find it. > >> horizontal(http://speed.pypy.org/comparison/?hor=true&bas=2%2B35&chart=stacked+bars): >> This is not meant to "demonstrate" that overall the jit is over two times >> faster than cpython. It is just another way for a developer to picture how >> long a programme would take to complete if it were composed of 21 such >> tasks. > > You are not summing up absolute times, so your claim is incorrect. And > the error is significant, given the above paper. > A sum of absolute times would provide what you claim. > >> You can see that cpython's (the normalization chosen) benchmarks all >> take 1"relative" second. > Here, for instance, I see that CPython and pypy-c take more or less > the same time, which surprises me (since the PyPy interpreter was > known to be slower than CPython). But given that the result is > invalid, it may well be an artifact of your statistics. > >> pypy-c needs more or less the same time, some >> "tasks" being slower and some faster. 
>> Psyco shows an interesting picture: from meteor-contest downwards (fortuitously), all benchmarks are extremely "compressed", which means they are sped up by psyco quite a lot. But any further speed-up wouldn't make the overall time much shorter, because the first group of benchmarks now takes most of the time to complete. pypy-c-jit is a more extreme case of this: if the jit accelerated all "fast" benchmarks to 0 seconds (infinitely fast), it would only get about twice as fast as now, because ai, slowspitfire, spambayes and twisted_tcp now need half the entire execution time. A good demonstration of "you are only as fast as your slowest part". Of course the aggregate of all benchmarks is not a real app, but it is still fun.
>
> This may still be true, at least in part, but you have to do this reasoning on absolute times.
>
> Best regards, and keep up the good work!
> --
> Paolo Giarrusso - Ph.D. Student
> http://www.informatik.uni-marburg.de/~pgiarrusso/
> _______________________________________________
> pypy-dev at codespeak.net
> http://codespeak.net/mailman/listinfo/pypy-dev

From fijall at gmail.com  Fri Jun 25 18:16:01 2010
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Fri, 25 Jun 2010 10:16:01 -0600
Subject: [pypy-dev] New speed.pypy.org version
In-Reply-To: 
References: 
Message-ID: 

Hey.

A more important problem: the results seem to be messed up. I think there is something wrong with the baselines. Look here:

http://buildbot.pypy.org/builders/jit-benchmark-linux-x86-32/builds/358/steps/shell_2/logs/stdio

on twisted_tcp

vs

http://speed.pypy.org/timeline/?exe=1,3&base=2%2B35&ben=twisted_tcp&env=tannit&revs=200

On Fri, Jun 25, 2010 at 5:08 AM, Miquel Torres  wrote:
> Hi all!,
>
> I want to announce a new version of the benchmarks site speed.pypy.org.
>
> After about 6 months, it finally shows the vision I had for such a website: useful for pypy developers, but also for the general public following pypy's or even other python implementations' development. On to the changes.
>
> There are now three views: "Changes", "Timeline" and "Comparison":
>
> The Overview was renamed to Changes, and its inline plot bars were removed because you can get the exact same plot in the Comparison view now (and then some).
>
> The Timeline got a selectable baseline and "humanized" date labels for the x axis.
>
> The new Comparison view allows, well, comparing of "competing" interpreters, which will also be of interest to the wider Python community (especially if we can add unladen, ironpython and Jython results).
>
> Two examples of interesting comparisons are:
>
> - relative bars (http://speed.pypy.org/comparison/?bas=2%2B35&chart=relative+bars): here we see that the jit is faster than psyco in all cases except spambayes and slowspitfire, where the jit cannot make up for pypy-c's abysmal performance. Interestingly, in the only other case where the jit is slower than cpython, the ai benchmark, psyco performs even worse.
>
> - stacked bars horizontal (http://speed.pypy.org/comparison/?hor=true&bas=2%2B35&chart=stacked+bars): this is not meant to "demonstrate" that overall the jit is over two times faster than cpython. It is just another way for a developer to picture how long a programme would take to complete if it were composed of 21 such tasks. You can see that cpython's (the normalization chosen) benchmarks all take 1 "relative" second. pypy-c needs more or less the same time, some "tasks" being slower and some faster.
> Psyco shows an interesting picture: from meteor-contest downwards (fortuitously), all benchmarks are extremely "compressed", which means they are sped up by psyco quite a lot. But any further speed-up wouldn't make the overall time much shorter, because the first group of benchmarks now takes most of the time to complete. pypy-c-jit is a more extreme case of this: if the jit accelerated all "fast" benchmarks to 0 seconds (infinitely fast), it would only get about twice as fast as now, because ai, slowspitfire, spambayes and twisted_tcp now need half the entire execution time. A good demonstration of "you are only as fast as your slowest part". Of course the aggregate of all benchmarks is not a real app, but it is still fun.
>
> I hope you find the new version useful, and as always any feedback is welcome.
>
> Cheers!
> Miquel

From tobami at googlemail.com  Fri Jun 25 19:08:23 2010
From: tobami at googlemail.com (Miquel Torres)
Date: Fri, 25 Jun 2010 19:08:23 +0200
Subject: [pypy-dev] New speed.pypy.org version
In-Reply-To: 
References: 
Message-ID: 

Hi Paolo,

I am aware of the problem with calculating benchmark means, but let me explain my point of view.

You are correct in that it would be preferable to have absolute times. Well, you actually can, but see what happens:
http://speed.pypy.org/comparison/?hor=true&bas=none&chart=stacked+bars

Absolute values would only work if we had carefully chosen benchmark runtimes to be very similar (for our cpython baseline). As it is, html5lib, spitfire and spitfire_cstringio completely dominate the cummulative time. And not because the interpreter is faster or slower, but because the benchmark was arbitrarily designed to run that long. Any improvement in the long-running benchmarks will carry much more weight than in the short-running ones.

What is more useful is to have comparable slices of time, so that the improvements can be seen relatively over time. Normalizing does that, I think. It just says: we have 21 tasks which take 1 second each to run on interpreter X (cpython in the default case). Then we see how other executables compare to that. What would the geometric mean achieve here, exactly, for the end user?

I am not really calculating any mean. You can see that I carefully avoided displaying any kind of total bar, which would indeed run into the problem you mention. That a stacked chart implicitly displays a total is something you can not avoid, and for that kind of chart I still think normalized results are visually the best option.

Still, I would very much like to read the paper you cite, but you need a login for it.

Cheers,
Miquel

2010/6/25 Paolo Giarrusso 
> Hi!
> First, I want to restate the obvious, before pointing out what I think is a mistake: your work on this website is great and very useful!
>
> On Fri, Jun 25, 2010 at 13:08, Miquel Torres  wrote:
> > - stacked bars
> Here you are summing up normalized times, which is more or less like taking their arithmetic average. And that doesn't work at all: in many cases you can "show" completely different results by normalizing relatively to another item. Even the simple question "who is faster?" can be answered in different ways.
> So you should use the geometric mean, even if this is not so widely known. Or better, it is known by benchmarking experts, but it's difficult to become one.
>
> Please, have a look at the short paper:
> "How not to lie with statistics: the correct way to summarize benchmark results"
> http://scholar.google.com/scholar?cluster=1051144955483053492&hl=en&as_sdt=2000
> I downloaded it from the ACM library, please tell me if you can't find it.
>
> > horizontal(http://speed.pypy.org/comparison/?hor=true&bas=2%2B35&chart=stacked+bars):
> > This is not meant to "demonstrate" that overall the jit is over two times faster than cpython. It is just another way for a developer to picture how long a programme would take to complete if it were composed of 21 such tasks.
>
> You are not summing up absolute times, so your claim is incorrect. And the error is significant, given the above paper.
> A sum of absolute times would provide what you claim.
>
> > You can see that cpython's (the normalization chosen) benchmarks all take 1 "relative" second.
> Here, for instance, I see that CPython and pypy-c take more or less the same time, which surprises me (since the PyPy interpreter was known to be slower than CPython). But given that the result is invalid, it may well be an artifact of your statistics.
>
> > pypy-c needs more or less the same time, some "tasks" being slower and some faster. Psyco shows an interesting picture: from meteor-contest downwards (fortuitously), all benchmarks are extremely "compressed", which means they are sped up by psyco quite a lot. But any further speed-up wouldn't make the overall time much shorter, because the first group of benchmarks now takes most of the time to complete. pypy-c-jit is a more extreme case of this: if the jit accelerated all "fast" benchmarks to 0 seconds (infinitely fast), it would only get about twice as fast as now, because ai, slowspitfire, spambayes and twisted_tcp now need half the entire execution time. A good demonstration of "you are only as fast as your slowest part". Of course the aggregate of all benchmarks is not a real app, but it is still fun.
>
> This may still be true, at least in part, but you have to do this reasoning on absolute times.
>
> Best regards, and keep up the good work!
> --
> Paolo Giarrusso - Ph.D. Student
> http://www.informatik.uni-marburg.de/~pgiarrusso/

From tobami at googlemail.com  Fri Jun 25 19:14:23 2010
From: tobami at googlemail.com (Miquel Torres)
Date: Fri, 25 Jun 2010 19:14:23 +0200
Subject: [pypy-dev] New speed.pypy.org version
In-Reply-To: 
References: 
Message-ID: 

Hey fijal,

the baseline problem you mention only happens with some benchmarks, so I will risk the guess that the cpython results currently present are not from the last run, and that in the case you point out (only for twisted_tcp) it changed quite a bit. If the next results overwrite cpython's results, we'll see whether that has been the case.

2010/6/25 Maciej Fijalkowski 
> Hey.
>
> A more important problem: the results seem to be messed up. I think there is something wrong with the baselines. Look here:
>
> http://buildbot.pypy.org/builders/jit-benchmark-linux-x86-32/builds/358/steps/shell_2/logs/stdio
>
> on twisted_tcp
>
> vs
>
> http://speed.pypy.org/timeline/?exe=1,3&base=2%2B35&ben=twisted_tcp&env=tannit&revs=200
>
> On Fri, Jun 25, 2010 at 5:08 AM, Miquel Torres  wrote:
> > Hi all!,
> >
> > I want to announce a new version of the benchmarks site speed.pypy.org.
> >
> > After about 6 months, it finally shows the vision I had for such a website: useful for pypy developers, but also for the general public following pypy's or even other python implementations' development. On to the changes.
> >
> > There are now three views: "Changes", "Timeline" and "Comparison":
> >
> > The Overview was renamed to Changes, and its inline plot bars were removed because you can get the exact same plot in the Comparison view now (and then some).
> >
> > The Timeline got a selectable baseline and "humanized" date labels for the x axis.
> >
> > The new Comparison view allows, well, comparing of "competing" interpreters, which will also be of interest to the wider Python community (especially if we can add unladen, ironpython and Jython results).
> >
> > Two examples of interesting comparisons are:
> >
> > - relative bars (http://speed.pypy.org/comparison/?bas=2%2B35&chart=relative+bars): here we see that the jit is faster than psyco in all cases except spambayes and slowspitfire, where the jit cannot make up for pypy-c's abysmal performance. Interestingly, in the only other case where the jit is slower than cpython, the ai benchmark, psyco performs even worse.
> >
> > - stacked bars horizontal (http://speed.pypy.org/comparison/?hor=true&bas=2%2B35&chart=stacked+bars): this is not meant to "demonstrate" that overall the jit is over two times faster than cpython. It is just another way for a developer to picture how long a programme would take to complete if it were composed of 21 such tasks. You can see that cpython's (the normalization chosen) benchmarks all take 1 "relative" second. pypy-c needs more or less the same time, some "tasks" being slower and some faster. Psyco shows an interesting picture: from meteor-contest downwards (fortuitously), all benchmarks are extremely "compressed", which means they are sped up by psyco quite a lot. But any further speed-up wouldn't make the overall time much shorter, because the first group of benchmarks now takes most of the time to complete. pypy-c-jit is a more extreme case of this: if the jit accelerated all "fast" benchmarks to 0 seconds (infinitely fast), it would only get about twice as fast as now, because ai, slowspitfire, spambayes and twisted_tcp now need half the entire execution time. A good demonstration of "you are only as fast as your slowest part". Of course the aggregate of all benchmarks is not a real app, but it is still fun.
> >
> > I hope you find the new version useful, and as always any feedback is welcome.
> >
> > Cheers!
> > Miquel
> >
> > _______________________________________________
> > pypy-dev at codespeak.net
> > http://codespeak.net/mailman/listinfo/pypy-dev

From tobami at googlemail.com  Fri Jun 25 19:23:13 2010
From: tobami at googlemail.com (Miquel Torres)
Date: Fri, 25 Jun 2010 19:23:13 +0200
Subject: [pypy-dev] New speed.pypy.org version
In-Reply-To: 
References: 
Message-ID: 

There is no problem in running tests for branches. What other branches or interpreters would you, for example, run?

2010/6/25 Maciej Fijalkowski 
> On Fri, Jun 25, 2010 at 5:08 AM, Miquel Torres  wrote:
> > Hi all!,
> >
> > I want to announce a new version of the benchmarks site speed.pypy.org.
> >
> > After about 6 months, it finally shows the vision I had for such a website: useful for pypy developers, but also for the general public following pypy's or even other python implementations' development. On to the changes.
> >
> > There are now three views: "Changes", "Timeline" and "Comparison":
> >
> > The Overview was renamed to Changes, and its inline plot bars were removed because you can get the exact same plot in the Comparison view now (and then some).
> >
> > The Timeline got a selectable baseline and "humanized" date labels for the x axis.
> >
> > The new Comparison view allows, well, comparing of "competing" interpreters, which will also be of interest to the wider Python community (especially if we can add unladen, ironpython and Jython results).
> >
> > Two examples of interesting comparisons are:
> >
> > - relative bars (http://speed.pypy.org/comparison/?bas=2%2B35&chart=relative+bars): here we see that the jit is faster than psyco in all cases except spambayes and slowspitfire, where the jit cannot make up for pypy-c's abysmal performance. Interestingly, in the only other case where the jit is slower than cpython, the ai benchmark, psyco performs even worse.
> >
> > - stacked bars horizontal (http://speed.pypy.org/comparison/?hor=true&bas=2%2B35&chart=stacked+bars): this is not meant to "demonstrate" that overall the jit is over two times faster than cpython. It is just another way for a developer to picture how long a programme would take to complete if it were composed of 21 such tasks. You can see that cpython's (the normalization chosen) benchmarks all take 1 "relative" second. pypy-c needs more or less the same time, some "tasks" being slower and some faster. Psyco shows an interesting picture: from meteor-contest downwards (fortuitously), all benchmarks are extremely "compressed", which means they are sped up by psyco quite a lot. But any further speed-up wouldn't make the overall time much shorter, because the first group of benchmarks now takes most of the time to complete. pypy-c-jit is a more extreme case of this: if the jit accelerated all "fast" benchmarks to 0 seconds (infinitely fast), it would only get about twice as fast as now, because ai, slowspitfire, spambayes and twisted_tcp now need half the entire execution time. A good demonstration of "you are only as fast as your slowest part". Of course the aggregate of all benchmarks is not a real app, but it is still fun.
> >
> > I hope you find the new version useful, and as always any feedback is welcome.
> >
> > Cheers!
> > Miquel
>
> Wow, I really like it, great job.
>
> Can we see how we can use these features for branches?
>
> Cheers,
> fijal

From fijall at gmail.com  Fri Jun 25 19:28:21 2010
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Fri, 25 Jun 2010 11:28:21 -0600
Subject: [pypy-dev] New speed.pypy.org version
In-Reply-To: 
References: 
Message-ID: 

PyPy branches, mostly (the "did this improve things or not" kind of question).

On Fri, Jun 25, 2010 at 11:23 AM, Miquel Torres  wrote:
> There is no problem in running tests for branches. What other branches or interpreters would you, for example, run?
>
> 2010/6/25 Maciej Fijalkowski 
>> On Fri, Jun 25, 2010 at 5:08 AM, Miquel Torres  wrote:
>> > Hi all!,
>> >
>> > I want to announce a new version of the benchmarks site speed.pypy.org.
>> >
>> > After about 6 months, it finally shows the vision I had for such a website: useful for pypy developers, but also for the general public following pypy's or even other python implementations' development. On to the changes.
>> >
>> > There are now three views: "Changes", "Timeline" and "Comparison":
>> >
>> > The Overview was renamed to Changes, and its inline plot bars were removed because you can get the exact same plot in the Comparison view now (and then some).
>> >
>> > The Timeline got a selectable baseline and "humanized" date labels for the x axis.
>> >
>> > The new Comparison view allows, well, comparing of "competing" interpreters, which will also be of interest to the wider Python community (especially if we can add unladen, ironpython and Jython results).
>> >
>> > Two examples of interesting comparisons are:
>> >
>> > - relative bars (http://speed.pypy.org/comparison/?bas=2%2B35&chart=relative+bars): here we see that the jit is faster than psyco in all cases except spambayes and slowspitfire, where the jit cannot make up for pypy-c's abysmal performance. Interestingly, in the only other case where the jit is slower than cpython, the ai benchmark, psyco performs even worse.
>> >
>> > - stacked bars horizontal (http://speed.pypy.org/comparison/?hor=true&bas=2%2B35&chart=stacked+bars): this is not meant to "demonstrate" that overall the jit is over two times faster than cpython. It is just another way for a developer to picture how long a programme would take to complete if it were composed of 21 such tasks. You can see that cpython's (the normalization chosen) benchmarks all take 1 "relative" second. pypy-c needs more or less the same time, some "tasks" being slower and some faster. Psyco shows an interesting picture: from meteor-contest downwards (fortuitously), all benchmarks are extremely "compressed", which means they are sped up by psyco quite a lot. But any further speed-up wouldn't make the overall time much shorter, because the first group of benchmarks now takes most of the time to complete. pypy-c-jit is a more extreme case of this: if the jit accelerated all "fast" benchmarks to 0 seconds (infinitely fast), it would only get about twice as fast as now, because ai, slowspitfire, spambayes and twisted_tcp now need half the entire execution time. A good demonstration of "you are only as fast as your slowest part". Of course the aggregate of all benchmarks is not a real app, but it is still fun.
>> >
>> > I hope you find the new version useful, and as always any feedback is welcome.
>> >
>> > Cheers!
>> > Miquel
>>
>> Wow, I really like it, great job.
>>
>> Can we see how we can use these features for branches?
>>
>> Cheers,
>> fijal

From p.giarrusso at gmail.com  Fri Jun 25 22:48:59 2010
From: p.giarrusso at gmail.com (Paolo Giarrusso)
Date: Fri, 25 Jun 2010 22:48:59 +0200
Subject: [pypy-dev] New speed.pypy.org version
In-Reply-To: 
References: 
Message-ID: 

On Fri, Jun 25, 2010 at 17:53, Maciej Fijalkowski  wrote:
> Hey Paolo.
>
> While in general I agree with you (this is not exactly science), I still think it gives outsiders some impression of what's going on.

As long as no important scholars look at that part, and it's just me, yes. When that happens, you probably lose some credibility. It doesn't matter that speed.pypy.org is made by people external to the team.

> Inside, we still look mostly at particular benchmarks.

But if a change improves one benchmark and worsens another, you need some summary.

> I'm not sure having a convoluted (at least to normal people) metric for summarizing would help; maybe.

You're talking to programmers, not to people on the street. Even before knowing why I should use the geomean here, I never felt too confused by people using it, even without explanation. The geometric mean is just a mean. And it's the only way to get an average performance ratio: how much faster is PyPy than CPython, on these benchmarks, considered all with equal weight? If you want, put just this on the graph, and the term "geometric mean" in some note.

> Speaking a bit on Miquel's behalf, feel free to implement this as a feature on codespeed (it's an open source project after all); you can fork it on github: http://github.com/tobami/codespeed.

Of course I can, so your answer is valid, but that's not my plan, sorry: the difference between the effort needed from me and from him is huge. If I had time to spend, I'd hack on PyPy itself instead.

Best regards
--
Paolo Giarrusso - Ph.D. Student
http://www.informatik.uni-marburg.de/~pgiarrusso/

From p.giarrusso at gmail.com  Fri Jun 25 22:50:39 2010
From: p.giarrusso at gmail.com (Paolo Giarrusso)
Date: Fri, 25 Jun 2010 22:50:39 +0200
Subject: [pypy-dev] New speed.pypy.org version
In-Reply-To: 
References: 
Message-ID: 

On Fri, Jun 25, 2010 at 19:08, Miquel Torres  wrote:
> Hi Paolo,
>
> I am aware of the problem with calculating benchmark means, but let me explain my point of view.
>
> You are correct in that it would be preferable to have absolute times. Well, you actually can, but see what happens:
> http://speed.pypy.org/comparison/?hor=true&bas=none&chart=stacked+bars

Ahah! I didn't notice that I could skip normalization! This does not fully invalidate my point, however.

> Absolute values would only work if we had carefully chosen benchmark runtimes to be very similar (for our cpython baseline). As it is, html5lib, spitfire and spitfire_cstringio completely dominate the cummulative time.

I acknowledge that (btw, it should be cumulative time, with one 'm', both here and on the website).

> And not because the interpreter is faster or slower, but because the benchmark was arbitrarily designed to run that long. Any improvement in the long-running benchmarks will carry much more weight than in the short-running ones.

> What is more useful is to have comparable slices of time, so that the improvements can be seen relatively over time.

If you want to sum up times (but at this point, I see no reason for it), you should rather have externally derived weights, as suggested by the paper (in Rule 3).
As soon as you take weights from the data, a lot of the maths that you need is not going to work any more; that's generally true in statistics.
And the only sensible way to get external weights is to gather them from real-world programs. Since that's not going to happen easily, just stick with the geometric mean. Or set an arbitrarily low weight, manually, without any math, so that the long-running benchmarks stop dominating the result.
It's no fraud, since the current graph is less valid anyway.

> Normalizing does that, I think.

Not really.

> It just says: we have 21 tasks which take 1 second each to run on interpreter X (cpython in the default case). Then we see how other executables compare to that. What would the geometric mean achieve here, exactly, for the end user?

You actually need the geomean to do that. Don't forget that the geomean is still a mean: it's a mean performance ratio which averages individual performance ratios.
If PyPy's geomean is 0.5, it means that PyPy is going to run that task in 10.5 seconds instead of 21. To me, this sounds exactly like what you want to achieve. Moreover, it actually works, unlike what you use.

For instance, ignore PyPy-JIT, and look only at CPython and pypy-c (no JIT). Then, change the normalization between the two:
http://speed.pypy.org/comparison/?exe=2%2B35,3%2BL&ben=1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21&env=1&hor=true&bas=2%2B35&chart=stacked+bars
http://speed.pypy.org/comparison/?exe=2%2B35,3%2BL&ben=1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21&env=1&hor=true&bas=3%2BL&chart=stacked+bars
With the current data, you get that in one case cpython is faster, and in the other pypy-c is faster.
That can't happen with the geomean. This is the point of the paper.

I could even construct a normalization baseline $base such that CPython seems faster than PyPy-JIT. Such a base should be very fast on, say, ai (where PyPy-JIT is slower), so that $cpython.ai/$base.ai becomes 100 and $pypyjit.ai/$base.ai becomes 200, and be very slow on the other benchmarks (so that they disappear in the sum).

So, the only difference I see is that the geomean works and the arithmetic mean doesn't. That's why Real Benchmarkers use the geomean.

Moreover, you are making a mistake quite common among non-physicists. What you say makes sense under the implicit assumption that dividing two times gives something you can use as a time. When you say "PyPy's runtime for a 1 second task", you actually want to talk about a performance ratio, not about the time. In the same way, when you say "this bird runs 3 meters in one second", a physicist would sum that up as "3 m/s" rather than "3 m".

> I am not really calculating any mean. You can see that I carefully avoided displaying any kind of total bar, which would indeed run into the problem you mention. That a stacked chart implicitly displays a total is something you can not avoid, and for that kind of chart I still think normalized results are visually the best option.

But on a stacked-bars graph, I'm not going to look at the individual bars at all, just at the total: it's actually less convenient than "normal bars" for looking at the result of a particular benchmark.

I hope I can find guidelines against stacked plots; I have a PhD colleague reading up on how to make graphs.

Best regards
--
Paolo Giarrusso - Ph.D. Student
http://www.informatik.uni-marburg.de/~pgiarrusso/

From fijall at gmail.com  Sat Jun 26 01:27:52 2010
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Fri, 25 Jun 2010 17:27:52 -0600
Subject: [pypy-dev] PyPy 1.3 released
Message-ID: 

=======================
PyPy 1.3: Stabilization
=======================

Hello.

We're pleased to announce the release of PyPy 1.3. This release has two major improvements. First of all, we stabilized the JIT compiler since the 1.2 release, answered user issues, fixed bugs, and generally improved speed.

We're also pleased to announce alpha support for loading CPython extension modules written in C.
While the main purpose of this release is increased stability, this feature is in alpha stage and is not yet suited for production environments.

Highlights of this release
==========================

* We introduced support for CPython extension modules written in C. As of now, this support is in alpha, and it's very unlikely that unaltered C extensions will work out of the box, due to missing functions or refcounting details. The support is disabled by default, so you have to do::

    import cpyext

  before trying to import any .so file. Also, libraries are source-compatible and not binary-compatible. That means you need to recompile binaries, using for example::

    python setup.py build

  Details may vary, depending on your build system. Make sure you include the above line at the beginning of setup.py or put it in your PYTHONSTARTUP.

  This is an alpha feature. It'll likely segfault. You have been warned!

* JIT bugfixes. A lot of bugs reported for the JIT have been fixed, and its stability has greatly improved since the 1.2 release.

* Various small improvements have been added to the JIT code, as well as a great speedup of compiling time.

Cheers,
Maciej Fijalkowski, Armin Rigo, Alex Gaynor, Amaury Forgeot d'Arc and the PyPy team

From tobami at googlemail.com  Sat Jun 26 09:16:52 2010
From: tobami at googlemail.com (Miquel Torres)
Date: Sat, 26 Jun 2010 09:16:52 +0200
Subject: [pypy-dev] New speed.pypy.org version
In-Reply-To: 
References: 
Message-ID: 

Hi Paolo,

well, you are right of course. I had forgotten about the real problem, which you actually demonstrate quite well with your CPython and pypy-c case: depending on the normalization, you can make any stacked series look faster than the others.

I will have a look at the literature and modify the normalized stacked plots accordingly.

Thanks for taking the time to explain things in such detail.

Regards,
Miquel

2010/6/25 Paolo Giarrusso 
> On Fri, Jun 25, 2010 at 19:08, Miquel Torres  wrote:
> > Hi Paolo,
> >
> > I am aware of the problem with calculating benchmark means, but let me explain my point of view.
> >
> > You are correct in that it would be preferable to have absolute times. Well, you actually can, but see what happens:
> > http://speed.pypy.org/comparison/?hor=true&bas=none&chart=stacked+bars
>
> Ahah! I didn't notice that I could skip normalization! This does not fully invalidate my point, however.
>
> > Absolute values would only work if we had carefully chosen benchmark runtimes to be very similar (for our cpython baseline). As it is, html5lib, spitfire and spitfire_cstringio completely dominate the cummulative time.
>
> I acknowledge that (btw, it should be cumulative time, with one 'm', both here and on the website).
>
> > And not because the interpreter is faster or slower, but because the benchmark was arbitrarily designed to run that long. Any improvement in the long-running benchmarks will carry much more weight than in the short-running ones.
>
> > What is more useful is to have comparable slices of time, so that the improvements can be seen relatively over time.
>
> If you want to sum up times (but at this point, I see no reason for it), you should rather have externally derived weights, as suggested by the paper (in Rule 3).
> As soon as you take weights from the data, a lot of the maths that you need is not going to work any more; that's generally true in statistics.
> And the only sensible way to get external weights is to gather them from real-world programs.
> Since that's not going to happen easily, just stick with the geometric mean. Or set an arbitrarily low weight, manually, without any math, so that the long-running benchmarks stop dominating the result. It's no fraud, since the current graph is less valid anyway.
>
> > Normalizing does that, I think.
>
> Not really.
>
> > It just says: we have 21 tasks which take 1 second each to run on interpreter X (cpython in the default case). Then we see how other executables compare to that. What would the geometric mean achieve here, exactly, for the end user?
>
> You actually need the geomean to do that. Don't forget that the geomean is still a mean: it's a mean performance ratio which averages individual performance ratios.
> If PyPy's geomean is 0.5, it means that PyPy is going to run that task in 10.5 seconds instead of 21. To me, this sounds exactly like what you want to achieve. Moreover, it actually works, unlike what you use.
>
> For instance, ignore PyPy-JIT, and look only at CPython and pypy-c (no JIT). Then, change the normalization between the two:
> http://speed.pypy.org/comparison/?exe=2%2B35,3%2BL&ben=1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21&env=1&hor=true&bas=2%2B35&chart=stacked+bars
> http://speed.pypy.org/comparison/?exe=2%2B35,3%2BL&ben=1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21&env=1&hor=true&bas=3%2BL&chart=stacked+bars
> With the current data, you get that in one case cpython is faster, and in the other pypy-c is faster.
> That can't happen with the geomean. This is the point of the paper.
>
> I could even construct a normalization baseline $base such that CPython seems faster than PyPy-JIT. Such a base should be very fast on, say, ai (where PyPy-JIT is slower), so that $cpython.ai/$base.ai becomes 100 and $pypyjit.ai/$base.ai becomes 200, and be very slow on the other benchmarks (so that they disappear in the sum).
>
> So, the only difference I see is that the geomean works and the arithmetic mean doesn't. That's why Real Benchmarkers use the geomean.
>
> Moreover, you are making a mistake quite common among non-physicists. What you say makes sense under the implicit assumption that dividing two times gives something you can use as a time. When you say "PyPy's runtime for a 1 second task", you actually want to talk about a performance ratio, not about the time. In the same way, when you say "this bird runs 3 meters in one second", a physicist would sum that up as "3 m/s" rather than "3 m".
>
> > I am not really calculating any mean. You can see that I carefully avoided displaying any kind of total bar, which would indeed run into the problem you mention. That a stacked chart implicitly displays a total is something you can not avoid, and for that kind of chart I still think normalized results are visually the best option.
>
> But on a stacked-bars graph, I'm not going to look at the individual bars at all, just at the total: it's actually less convenient than "normal bars" for looking at the result of a particular benchmark.
>
> I hope I can find guidelines against stacked plots; I have a PhD colleague reading up on how to make graphs.
>
> Best regards
> --
> Paolo Giarrusso - Ph.D. Student
> http://www.informatik.uni-marburg.de/~pgiarrusso/
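The baseline flip that Paolo describes and Miquel concedes is easy to reproduce. The following is a small illustrative Python sketch (an editorial addition, not codespeed code; the two benchmarks and their timings are invented for the example). It shows that totals of baseline-normalized times can favor either interpreter depending on the baseline, while the geometric mean cannot flip, which is the point of the paper Paolo cites::

    # Invented numbers: two interpreters, two benchmarks.
    import math

    times = {
        "cpython": {"ai": 1.0, "html5lib": 10.0},
        "pypy-c":  {"ai": 4.0, "html5lib":  5.0},
    }
    benchmarks = ["ai", "html5lib"]

    def normalized_sum(exe, base):
        # what the total of a normalized stacked-bar chart computes
        return sum(times[exe][b] / times[base][b] for b in benchmarks)

    def geomean(exe, base):
        # mean performance ratio relative to the baseline
        ratios = [times[exe][b] / times[base][b] for b in benchmarks]
        return math.exp(sum(map(math.log, ratios)) / len(ratios))

    for base in ("cpython", "pypy-c"):
        print("baseline=%s" % base)
        for exe in ("cpython", "pypy-c"):
            print("  %s: sum=%.2f geomean=%.2f"
                  % (exe, normalized_sum(exe, base), geomean(exe, base)))

    # With cpython as baseline the sums are 2.00 vs 4.50 (cpython "wins");
    # with pypy-c as baseline they are 2.25 vs 2.00 (pypy-c "wins").
    # The geomean ratio stays ~1.41 against pypy-c under either baseline.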
From arigo at tunes.org  Sat Jun 26 10:34:57 2010
From: arigo at tunes.org (Armin Rigo)
Date: Sat, 26 Jun 2010 10:34:57 +0200
Subject: [pypy-dev] PyPy 1.3 released
In-Reply-To: 
References: 
Message-ID: <20100626083457.GA14816@code0.codespeak.net>

Hi,

On Fri, Jun 25, 2010 at 05:27:52PM -0600, Maciej Fijalkowski wrote:
>    python setup.py build

As corrected on the blog (http://morepypy.blogspot.com/), this line should read:

    pypy setup.py build


Armin.
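Tying Armin's correction to the announcement together, a minimal setup.py for the alpha cpyext support might look like the sketch below. This is an illustration only: the module name "example" and the file example.c are placeholders, and it assumes distutils works under PyPy 1.3 as the announcement implies. It would be built with "pypy setup.py build", as corrected above::

    # Sketch of a setup.py for PyPy 1.3's alpha cpyext support;
    # "example" / example.c are placeholder names.
    import cpyext  # per the release note: load the C-API emulation first
    from distutils.core import setup, Extension

    setup(
        name="example",
        version="0.1",
        ext_modules=[Extension("example", sources=["example.c"])],
    )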
From cfbolz at gmx.de  Mon Jun 28 15:08:08 2010
From: cfbolz at gmx.de (Carl Friedrich Bolz)
Date: Mon, 28 Jun 2010 15:08:08 +0200
Subject: [pypy-dev] PyPy Master thesis sandboxing
Message-ID: <4C289EB8.4090702@gmx.de>

Hi all,

just wanted to point out this Master's thesis using PyPy:

http://www.diku.dk/english/Calendar/masters_defence_soeren/

Didn't know about this, but interesting anyway.

Cheers,

Carl Friedrich

From fijall at gmail.com  Mon Jun 28 17:51:37 2010
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Mon, 28 Jun 2010 09:51:37 -0600
Subject: [pypy-dev] PyPy Master thesis sandboxing
In-Reply-To: <4C289EB8.4090702@gmx.de>
References: <4C289EB8.4090702@gmx.de>
Message-ID: 

On Mon, Jun 28, 2010 at 7:08 AM, Carl Friedrich Bolz  wrote:
> Hi all,
>
> just wanted to point out this Master's thesis using PyPy:
>
> http://www.diku.dk/english/Calendar/masters_defence_soeren/
>
> Didn't know about this, but interesting anyway.
>
> Cheers,
>
> Carl Friedrich

According to the abstract it was a poor choice :)

From cfbolz at gmx.de  Mon Jun 28 17:56:55 2010
From: cfbolz at gmx.de (Carl Friedrich Bolz)
Date: Mon, 28 Jun 2010 17:56:55 +0200
Subject: [pypy-dev] PyPy Master thesis sandboxing
In-Reply-To: 
References: <4C289EB8.4090702@gmx.de>
Message-ID: <4C28C647.2000008@gmx.de>

On 06/28/2010 05:51 PM, Maciej Fijalkowski wrote:
> On Mon, Jun 28, 2010 at 7:08 AM, Carl Friedrich Bolz  wrote:
>> Hi all,
>>
>> just wanted to point out this Master's thesis using PyPy:
>>
>> http://www.diku.dk/english/Calendar/masters_defence_soeren/
>>
>> Didn't know about this, but interesting anyway.
>>
>> Cheers,
>>
>> Carl Friedrich
>
> According to the abstract it was a poor choice :)

So it says. He never wrote anything on the mailing list though; I assume he showed up on IRC?

Cheers,

Carl Friedrich

From fijall at gmail.com  Mon Jun 28 18:00:45 2010
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Mon, 28 Jun 2010 10:00:45 -0600
Subject: [pypy-dev] PyPy Master thesis sandboxing
In-Reply-To: <4C28C647.2000008@gmx.de>
References: <4C289EB8.4090702@gmx.de> <4C28C647.2000008@gmx.de>
Message-ID: 

On Mon, Jun 28, 2010 at 9:56 AM, Carl Friedrich Bolz  wrote:
> So it says. He never wrote anything on the mailing list though; I assume he showed up on IRC?

I don't honestly remember anyone showing up and speaking about that, but well. Anyone?

Cheers,
fijal

From sl at scrooge.dk  Mon Jun 28 19:50:57 2010
From: sl at scrooge.dk (Søren Laursen)
Date: Mon, 28 Jun 2010 19:50:57 +0200
Subject: [pypy-dev] PyPy Master thesis sandboxing
In-Reply-To: <4C28C647.2000008@gmx.de>
References: <4C289EB8.4090702@gmx.de> <4C28C647.2000008@gmx.de>
Message-ID: <7e69406314220d0a6655e8a6f2fe70b4@mail.gmail.com>

-----Oprindelig meddelelse-----
Fra: pypy-dev-bounces at codespeak.net [mailto:pypy-dev-bounces at codespeak.net] På vegne af Carl Friedrich Bolz
Sendt: 28. juni 2010 17:57
Til: Maciej Fijalkowski
Cc: PyPy Dev
Emne: Re: [pypy-dev] PyPy Master thesis sandboxing

On 06/28/2010 05:51 PM, Maciej Fijalkowski wrote:
> On Mon, Jun 28, 2010 at 7:08 AM, Carl Friedrich Bolz  wrote:
>> Hi all,
>>
>> just wanted to point out this Master's thesis using PyPy:
>>
>> http://www.diku.dk/english/Calendar/masters_defence_soeren/
>>
>> Didn't know about this, but interesting anyway.
>>
>> Cheers,
>>
>> Carl Friedrich
>
> According to the abstract it was a poor choice :)

>So it says. He never wrote anything on the mailing list though; I assume he showed up on IRC?

I did write on the mailing list:
http://permalink.gmane.org/gmane.comp.python.pypy/5987

This error was a show stopper, and there were some other strange errors using the pickle module.

Regards,

Søren

From fijall at gmail.com  Mon Jun 28 19:58:37 2010
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Mon, 28 Jun 2010 11:58:37 -0600
Subject: [pypy-dev] PyPy Master thesis sandboxing
In-Reply-To: <7e69406314220d0a6655e8a6f2fe70b4@mail.gmail.com>
References: <4C289EB8.4090702@gmx.de> <4C28C647.2000008@gmx.de> <7e69406314220d0a6655e8a6f2fe70b4@mail.gmail.com>
Message-ID: 

Bah, indeed you did.

Well, I can answer that now :)

The level of sandboxing is the level at which RPython operates; it does not necessarily relate 1:1 to os module functions. You would need to look at the implementation of os.listdir to see which C functions it calls. If you enable debug (is it on by default?) it'll show you which functions are called with what parameters. A pdb would also help there. I can look at those errors precisely if you want me to.

2010/6/28 Søren Laursen :
> -----Oprindelig meddelelse-----
> Fra: pypy-dev-bounces at codespeak.net [mailto:pypy-dev-bounces at codespeak.net] På vegne af Carl Friedrich Bolz
> Sendt: 28. juni 2010 17:57
> Til: Maciej Fijalkowski
> Cc: PyPy Dev
> Emne: Re: [pypy-dev] PyPy Master thesis sandboxing
>
> On 06/28/2010 05:51 PM, Maciej Fijalkowski wrote:
>> On Mon, Jun 28, 2010 at 7:08 AM, Carl Friedrich Bolz  wrote:
>>> Hi all,
>>>
>>> just wanted to point out this Master's thesis using PyPy:
>>>
>>> http://www.diku.dk/english/Calendar/masters_defence_soeren/
>>>
>>> Didn't know about this, but interesting anyway.
>>>
>>> Cheers,
>>>
>>> Carl Friedrich
>>
>> According to the abstract it was a poor choice :)
>
>> So it says. He never wrote anything on the mailing list though; I assume he showed up on IRC?
>
> I did write on the mailing list:
> http://permalink.gmane.org/gmane.comp.python.pypy/5987
>
> This error was a show stopper, and there were some other strange errors using the pickle module.
>
> Regards,
>
> Søren
> _______________________________________________
> pypy-dev at codespeak.net
> http://codespeak.net/mailman/listinfo/pypy-dev

From maloneymr at gmail.com  Mon Jun 28 19:59:00 2010
From: maloneymr at gmail.com (Michael Maloney)
Date: Mon, 28 Jun 2010 12:59:00 -0500
Subject: [pypy-dev] Unsubscribe
Message-ID: 

On Mon, Jun 28, 2010 at 12:51 PM,  wrote:
> [pypy-dev Digest, Vol 359, Issue 7 quoted in full; its eight messages and the digest footer duplicate the messages above verbatim]
From sl at scrooge.dk  Mon Jun 28 20:01:45 2010
From: sl at scrooge.dk (Søren Laursen)
Date: Mon, 28 Jun 2010 20:01:45 +0200
Subject: [pypy-dev] PyPy Master thesis sandboxing
In-Reply-To: 
References: <4C289EB8.4090702@gmx.de> <4C28C647.2000008@gmx.de> <7e69406314220d0a6655e8a6f2fe70b4@mail.gmail.com>
Message-ID: 

Thanks,

The project is not stopped, but the thesis had to be handed in.

Regards,

Søren

-----Oprindelig meddelelse-----
Fra: Maciej Fijalkowski [mailto:fijall at gmail.com]
Sendt: 28. juni 2010 19:59
Til: Søren Laursen
Cc: pypy-dev at codespeak.net
Emne: Re: [pypy-dev] PyPy Master thesis sandboxing

Bah, indeed you did.

Well, I can answer that now :)

The level of sandboxing is the level at which RPython operates; it does not necessarily relate 1:1 to os module functions. You would need to look at the implementation of os.listdir to see which C functions it calls. If you enable debug (is it on by default?) it'll show you which functions are called with what parameters. A pdb would also help there. I can look at those errors precisely if you want me to.

2010/6/28 Søren Laursen :
> -----Oprindelig meddelelse-----
> Fra: pypy-dev-bounces at codespeak.net [mailto:pypy-dev-bounces at codespeak.net] På vegne af Carl Friedrich Bolz
> Sendt: 28. juni 2010 17:57
> Til: Maciej Fijalkowski
> Cc: PyPy Dev
> Emne: Re: [pypy-dev] PyPy Master thesis sandboxing
>
> On 06/28/2010 05:51 PM, Maciej Fijalkowski wrote:
>> On Mon, Jun 28, 2010 at 7:08 AM, Carl Friedrich Bolz  wrote:
>>> Hi all,
>>>
>>> just wanted to point out this Master's thesis using PyPy:
>>>
>>> http://www.diku.dk/english/Calendar/masters_defence_soeren/
>>>
>>> Didn't know about this, but interesting anyway.
>>>
>>> Cheers,
>>>
>>> Carl Friedrich
>>
>> According to the abstract it was a poor choice :)
>
>> So it says. He never wrote anything on the mailing list though; I assume he showed up on IRC?
>
> I did write on the mailing list:
> http://permalink.gmane.org/gmane.comp.python.pypy/5987
>
> This error was a show stopper, and there were some other strange errors using the pickle module.

From fijall at gmail.com Mon Jun 28 20:03:22 2010
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Mon, 28 Jun 2010 12:03:22 -0600
Subject: [pypy-dev] PyPy Master thesis sandboxing
In-Reply-To:
References: <4C289EB8.4090702@gmx.de> <4C28C647.2000008@gmx.de>
	<7e69406314220d0a6655e8a6f2fe70b4@mail.gmail.com>
Message-ID:

Does it mean "yes" or "no"? :-)

2010/6/28 Søren Laursen:
> Thanks,
>
> The project is not stopped, but the thesis had to be handed in.
>
> Regards,
>
> søren

From sl at scrooge.dk Mon Jun 28 20:08:50 2010
From: sl at scrooge.dk (Søren Laursen)
Date: Mon, 28 Jun 2010 20:08:50 +0200
Subject: [pypy-dev] PyPy Master thesis sandboxing
In-Reply-To:
References: <4C289EB8.4090702@gmx.de> <4C28C647.2000008@gmx.de>
	<7e69406314220d0a6655e8a6f2fe70b4@mail.gmail.com>
Message-ID: <66fb452a52c4f0d736a63b961557676b@mail.gmail.com>

Yes! :-)

I would like some help.

Regards,

Søren

-----Original message-----
From: Maciej Fijalkowski [mailto:fijall at gmail.com]
Sent: 28 June 2010 20:03
To: Søren Laursen
Cc: pypy-dev at codespeak.net
Subject: Re: [pypy-dev] PyPy Master thesis sandboxing

Does it mean "yes" or "no"? :-)

From fijall at gmail.com Mon Jun 28 20:37:58 2010
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Mon, 28 Jun 2010 12:37:58 -0600
Subject: [pypy-dev] PyPy Master thesis sandboxing
In-Reply-To: <66fb452a52c4f0d736a63b961557676b@mail.gmail.com>
References: <4C289EB8.4090702@gmx.de> <4C28C647.2000008@gmx.de>
	<7e69406314220d0a6655e8a6f2fe70b4@mail.gmail.com>
	<66fb452a52c4f0d736a63b961557676b@mail.gmail.com>
Message-ID:

Cool, I will look at it as soon as I can (it will still take a bit; I am
moving around).

2010/6/28 Søren Laursen:
> Yes! :-)
>
> I would like some help.
>
> Regards,
>
> Søren
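
To make the point above concrete: PyPy's sandbox runs the untrusted
interpreter as a child process and marshals every low-level external call
to a controlling process, which can log, allow, fake, or deny it. The toy
sketch below is illustrative only (the function names, whitelist, and
handler are invented, not the real protocol from
pypy/translator/sandbox/sandlib.py), but it shows the level at which
interception happens and the kind of debug trace Maciej describes:

    # Toy "sandbox controller" sketch: trace low-level calls coming from a
    # sandboxed interpreter.  All names here are hypothetical.
    import logging

    logging.basicConfig(level=logging.DEBUG)
    log = logging.getLogger("sandbox")

    # Hypothetical whitelist of RPython-level external functions.  Note
    # the names: the controller sees RPython's low-level functions, not
    # Python's os.listdir() itself.
    ALLOWED = set(["ll_os.ll_os_open", "ll_os.ll_os_read"])

    def handle_call(name, args):
        # Log every call with its parameters: this is the kind of debug
        # trace Maciej suggests enabling to see what a sandboxed program
        # actually requests.
        log.debug("sandboxed call: %s%r", name, tuple(args))
        if name not in ALLOWED:
            raise OSError("denied by sandbox policy: %s" % (name,))
        # A real controller would virtualize the call (e.g. serve a
        # read-only in-memory filesystem); this toy one just refuses.
        raise OSError("not implemented in this toy controller: %s" % (name,))

    if __name__ == "__main__":
        try:
            handle_call("ll_os.ll_os_listdir", ["/tmp"])   # not whitelisted
        except OSError, e:
            log.debug("call rejected: %s", e)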

From phyo.arkarlwin at gmail.com Tue Jun 29 02:55:40 2010
From: phyo.arkarlwin at gmail.com (Phyo Arkar)
Date: Tue, 29 Jun 2010 00:55:40 +0000
Subject: [pypy-dev] Is MySQLdb Working?
Message-ID:

Hello PyPy team,

Thank you for the greatest project ever to happen to the programming world.

I am trying to run web2py - http://www.web2py.com - on PyPy. Everything
works fine except MySQLdb, which I installed via easy_install:

>>> import MySQLdb
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ZipImportError: No module named pkg_resources

I saw the patch for MySQLdb in the trunk, so do I have to apply it to
MySQLdb? Which version of MySQLdb does it apply to?

From glavoie at gmail.com Tue Jun 29 21:09:40 2010
From: glavoie at gmail.com (Gabriel Lavoie)
Date: Tue, 29 Jun 2010 15:09:40 -0400
Subject: [pypy-dev] Improving Stackless/Coroutines implementation
Message-ID:

Hello everyone,
     as a few people here know, I'm working heavily with PyPy's
"stackless" module for my Master's degree project, to make it more
distributed. Since I started to work full time on this project, I've
encountered a few bugs (mostly related to pickling of tasklets) and
missing implementation details in the module. The latest problem I've
encountered is being able to detect, within the tasklet being killed,
when tasklet.kill() is called. With Stackless CPython, TaskletExit is
raised and can be caught, but this part wasn't really implemented in
PyPy's stackless module. Since the module is implemented on top of
coroutines, and since coroutine.kill() is called within tasklet.kill(),
the exception thrown by the coroutine implementation needs to be caught.

Here's the problem:
http://codespeak.net/pypy/dist/pypy/doc/stackless.html#coroutines

- coro.kill()
    Kill coro by sending an exception to it. (At the moment, the
    exception is not visible to app-level, which means that you cannot
    catch it, and that try: finally: clauses are not honored. This will
    be fixed in the future.)

The exception is not thrown at app level, so a coroutine dies silently. I
took a look at the code and I've been able to expose a CoroutineExit
exception to app level, on which I intend to implement TaskletExit
correctly. I'm also able to catch the exception as expected, but the code
is not yet complete. Right now, I have a question on how to correctly
expose the CoroutineExit and TaskletExit exceptions to app level. Here's
what I did:

W_CoroutineExit = _new_exception('CoroutineExit', W_Exception,
                                 'Exit requested...')

class AppCoroutine(Coroutine):  # XXX, StacklessFlags):

    def __init__(self, space, state=None):
        # Some other code here

        # Exporting new exception to __builtins__ and "exceptions" modules
        self.w_CoroutineExit = space.gettypefor(W_CoroutineExit)
        space.setitem(space.exceptions_module.w_dict,
                      space.new_interned_str('CoroutineExit'),
                      self.w_CoroutineExit)
        space.setitem(space.builtin.w_dict,
                      space.new_interned_str('CoroutineExit'),
                      self.w_CoroutineExit)

I talked about this on #pypy (IRC) but people weren't sure about exporting
new names to __builtins__. On my side, I wanted to make it look as much as
possible like what Stackless CPython does with TaskletExit, which is
directly available in __builtins__. This would make code compatible with
both Stackless CPython and PyPy's stackless module. Also, exporting names
this way would only make them appear in __builtins__ when the "_stackless"
module is enabled (pypy-c built with --stackless).

What are your opinions about it? (Maciej, I already know about yours! ;)

Thank you very much,

Gabriel (WildChild)

-- 
Gabriel Lavoie
glavoie at gmail.com
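
Assuming CoroutineExit ends up in __builtins__ as in Gabriel's snippet
above (and a pypy-c built with --stackless), app-level code could then
catch a kill much as Stackless CPython code catches TaskletExit. A
hypothetical sketch, using the coroutine API (bind/switch/kill and
getcurrent) from the documentation page linked above:

    # Hypothetical app-level use of the proposed CoroutineExit exception.
    from _stackless import coroutine

    main = coroutine.getcurrent()      # handle on the main coroutine

    def worker():
        try:
            while True:
                main.switch()          # yield control back to main
        except CoroutineExit:          # raised inside the coroutine by kill()
            # cleanup can run here; try/finally blocks would be honored too
            raise                      # re-raise so the kill completes

    co = coroutine()
    co.bind(worker)
    co.switch()    # enter worker; it switches straight back to main
    co.kill()      # worker now sees CoroutineExit and unwinds cleanly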

From p.giarrusso at gmail.com Wed Jun 30 10:11:33 2010
From: p.giarrusso at gmail.com (Paolo Giarrusso)
Date: Wed, 30 Jun 2010 10:11:33 +0200
Subject: [pypy-dev] New speed.pypy.org version
In-Reply-To:
References:
Message-ID:

Hi Miquel,

I'm quite busy (paper deadline next Tuesday), sorry for not answering
earlier. I was just struck by an idea: there is a stacked bar plot whose
total bar is related to the geometric mean, so that it is
normalization-invariant. The graph _is_ more complicated, though: it is a
stacked plot of _logarithms_ of the performance ratios. This way the
complete stacked bar shows the logarithm of the product of the ratios
rather than their sum, i.e. log((geometric mean)^N) = N * log(geometric
mean), instead of an arithmetic mean. Some simple maths (I didn't write it
out, so please recheck!) seems to show that plotting a + b*log(ratio)
instead of log(ratio) still gives a fair comparison: the total becomes
N*a + b*N*log(geomean), an affine function of log(geomean), so rankings
are preserved. You need a and b because if the ratio is 1, log(1) is zero
and the bar would vanish (b is the representation scale, which is always
there anyway).

About your workaround: I would like a table with the geometric mean of the
ratios, where we get the real global performance ratio among the
interpreters. As long as the results of your solution do not contradict
that _real_ table, it should be a reasonable workaround (but I would embed
the check in the code - otherwise other projects _will be_ bitten by it).
Ideally, I would like the website to offer such a table to users, and I
would like a graph of the overall performance ratio over time (actually,
over revisions). Finally, the docs of your web application should at the
very least reference the paper and this conversation (if there's a public
archive of the ML, as I think there is), and ideally explain the issue.

Sorry for being too dense, maybe - if I was unclear, please tell me and
I'll answer next week.

Best regards,
Paolo

On Mon, Jun 28, 2010 at 11:21, Miquel Torres wrote:
> Hi Paolo,
>
> I read the paper, very interesting. It is perfectly clear that to
> calculate a normalized total, only the geometric mean makes sense.
>
> However, a stacked bar plot shows the individual benchmarks, so it
> implicitly is an arithmetic mean. The only solution (apart from removing
> the stacked charts and only offering total bars) is the weighted
> approach.
>
> External weights are not very practical, though. Codespeed is used by
> other projects, so an extra option would need to be added to the
> settings to allow introducing arbitrary weights for benchmarks. A bit
> cumbersome. I have an idea that may work: take the weights from a
> defined baseline so that the run times are equal, which is the same as
> normalizing to a baseline. It would be the same as now, only that you
> can't choose the normalization; it will be weighted (normalized)
> according to the default baseline (which you can already configure in
> the settings).
>
> You may say that it is still an arithmetic mean, but there won't be
> conflicting results because there is only a single normalization. For
> PyPy that would be cpython, and everything would make sense.
> I know it is a workaround, not a solution. If you think it is a bad
> idea, the only other possibility is not to have stacked bars (as in
> "showing individual benchmarks"). But I find them useful. Yes, you can
> see the individual benchmark results better in the normal bars chart,
> but there you don't see visually which benchmarks take the biggest part
> of the pie, which helps visualize what parts of your program need most
> improving.
>
> What do you think?
>
> Regards,
> Miquel
>
>
> 2010/6/25 Paolo Giarrusso:
>> On Fri, Jun 25, 2010 at 19:08, Miquel Torres wrote:
>>> Hi Paolo,
>>>
>>> I am aware of the problem with calculating benchmark means, but let me
>>> explain my point of view.
>>>
>>> You are correct in that it would be preferable to have absolute times.
>>> Well, you actually can, but see what happens:
>>> http://speed.pypy.org/comparison/?hor=true&bas=none&chart=stacked+bars
>>
>> Ahah! I didn't notice that I could skip normalization! This does not
>> fully invalidate my point, however.
>>
>>> Absolute values would only work if we had carefully chosen benchmark
>>> runtimes to be very similar (for our cpython baseline). As it is,
>>> html5lib, spitfire and spitfire_cstringio completely dominate the
>>> cummulative time.
>>
>> I acknowledge that (btw, it should be cumulative time, with one 'm',
>> both here and in the website).
>>
>>> And not because the interpreter is faster or slower, but because the
>>> benchmark was arbitrarily designed to run that long. Any improvement
>>> in the long-running benchmarks will carry much more weight than in
>>> the short-running ones.
>>>
>>> What is more useful is to have comparable slices of time so that the
>>> improvements can be seen relatively over time.
>>
>> If you want to sum up times (but at this point, I see no reason to),
>> you should rather have externally derived weights, as suggested by the
>> paper (in Rule 3). As soon as you take weights from the data, lots of
>> the maths that you need stops working - that's generally true in many
>> cases in statistics. And the only way external weights make sense is
>> to gather them from real-world programs. Since that's not going to
>> happen easily, just stick with the geometric mean. Or set an
>> arbitrarily low weight, manually, without any math, so that the
>> long-running benchmarks stop dominating the result. It's no fraud,
>> since the current graph is less valid anyway.
>>
>>> Normalizing does that, I think.
>>
>> Not really.
>>
>>> It just says: we have 21 tasks which take 1 second to run each on
>>> interpreter X (cpython in the default case). Then we see how other
>>> executables compare to that. What would the geometric mean achieve
>>> here, exactly, for the end user?
>>
>> You actually need the geomean to do that. Don't forget that the
>> geomean is still a mean: it's a mean performance ratio which averages
>> individual performance ratios. If PyPy's geomean is 0.5, it means that
>> PyPy is going to run that task in 10.5 seconds instead of 21. To me,
>> this sounds exactly like what you want to achieve. Moreover, it
>> actually works, unlike what you use.
>>
>> For instance, ignore PyPy-JIT and look only at CPython and pypy-c (no
>> JIT). Then change the normalization between the two:
>> http://speed.pypy.org/comparison/?exe=2%2B35,3%2BL&ben=1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21&env=1&hor=true&bas=2%2B35&chart=stacked+bars
>> http://speed.pypy.org/comparison/?exe=2%2B35,3%2BL&ben=1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21&env=1&hor=true&bas=3%2BL&chart=stacked+bars
>> With the current data, you get that in one case cpython is faster, and
>> in the other pypy-c is faster. That can't happen with the geomean.
>> This is the point of the paper.
>>
>> I could even construct a normalization baseline $base such that
>> CPython seems faster than PyPy-JIT. Such a base should be very fast
>> on, say, ai (where PyPy-JIT is slower), so that $cpython.ai/$base.ai
>> becomes 100 and $pypyjit.ai/$base.ai becomes 200, and be very slow on
>> the other benchmarks (so that they disappear in the sum).
>>
>> So, the only difference I see is that the geomean works and the
>> arithmetic mean doesn't. That's why Real Benchmarkers use the geomean.
>>
>> Moreover, you are making a mistake quite common among non-physicists.
>> What you say makes sense under the implicit assumption that dividing
>> two times gives something you can still use as a time. When you say
>> "PyPy's runtime for a 1-second task", you actually want to talk about
>> a performance ratio, not about the time. In the same way, when you say
>> "this bird runs 3 meters in one second", a physicist would sum that up
>> as "3 m/s" rather than "3 m".
>>
>>> I am not really calculating any mean. You can see that I carefully
>>> avoided displaying any kind of total bar, which would indeed run into
>>> the problem you mention. That a stacked chart implicitly displays a
>>> total is something you cannot avoid, and for that kind of chart I
>>> still think normalized results are visually the best option.
>>
>> But on a stacked bars graph, I'm not going to look at the individual
>> bars at all, just at the total: it's actually less convenient than
>> "normal bars" for looking at the result of a particular benchmark.
>>
>> I hope I can find guidelines against stacked plots; I have a PhD
>> colleague reading up on how to make graphs.
>>
>> Best regards
>> --
>> Paolo Giarrusso - Ph.D. Student
>> http://www.informatik.uni-marburg.de/~pgiarrusso/

-- 
Paolo Giarrusso - Ph.D. Student
http://www.informatik.uni-marburg.de/~pgiarrusso/
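
Paolo's argument is easy to check numerically. The following
self-contained sketch, with invented timings rather than speed.pypy.org
data, shows the normalized stacked-bar total (an arithmetic mean in
disguise) flipping its verdict with the choice of baseline, while the
geometric mean ranks the two interpreters the same either way:

    # Toy demonstration: geomean of ratios is baseline-invariant, the
    # arithmetic sum of normalized ratios is not.  Timings are invented.

    def geomean(xs):
        product = 1.0
        for x in xs:
            product *= x
        return product ** (1.0 / len(xs))

    # Hypothetical per-benchmark times in seconds, three benchmarks each.
    cpython = [1.0, 1.0, 10.0]
    pypy    = [8.0, 0.5,  1.0]

    for name, base in [("cpython", cpython), ("pypy", pypy)]:
        rc = [t / b for t, b in zip(cpython, base)]
        rp = [t / b for t, b in zip(pypy, base)]
        # Stacked-bar totals (sums of normalized ratios): with base=cpython
        # cpython "wins" (3.00 vs 8.60); with base=pypy, pypy "wins"
        # (12.12 vs 3.00).  The verdict flips with the baseline.
        print "base=%-7s  sums:     cpython %6.2f   pypy %6.2f" % (
            name, sum(rc), sum(rp))
        # Geometric means: pypy comes out ahead by the same factor (~0.74
        # relative to cpython) under either baseline, so no contradiction.
        print "             geomeans: cpython %6.2f   pypy %6.2f" % (
            geomean(rc), geomean(rp))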

From phyo.arkarlwin at gmail.com Wed Jun 30 23:24:20 2010
From: phyo.arkarlwin at gmail.com (Phyo Arkar)
Date: Wed, 30 Jun 2010 21:24:20 +0000
Subject: [pypy-dev] PyPy 1.3 released
In-Reply-To:
References:
Message-ID:

So far, python-mysql is still not working. Has anyone successfully gotten
it to work?

On Fri, Jun 25, 2010 at 11:27 PM, Maciej Fijalkowski wrote:
> =======================
> PyPy 1.3: Stabilization
> =======================
>
> Hello.
>
> We're pleased to announce the release of PyPy 1.3. This release has two
> major improvements. First of all, we stabilized the JIT compiler since
> the 1.2 release, answered user issues, fixed bugs, and generally
> improved speed.
>
> We're also pleased to announce alpha support for loading CPython
> extension modules written in C. While the main purpose of this release
> is increased stability, this feature is in the alpha stage and is not
> yet suited for production environments.
>
> Highlights of this release
> ==========================
>
> * We introduced support for CPython extension modules written in C. As
>   of now, this support is in alpha, and it's very unlikely that
>   unaltered C extensions will work out of the box, due to missing
>   functions or refcounting details. The support is disabled by default,
>   so you have to do::
>
>       import cpyext
>
>   before trying to import any .so file. Also, libraries are
>   source-compatible and not binary-compatible. That means you need to
>   recompile binaries, using for example::
>
>       python setup.py build
>
>   Details may vary, depending on your build system. Make sure you
>   include the above line at the beginning of setup.py or put it in your
>   PYTHONSTARTUP.
>
>   This is an alpha feature. It'll likely segfault. You have been warned!
>
> * JIT bugfixes. A lot of bugs reported for the JIT have been fixed, and
>   its stability has greatly improved since the 1.2 release.
>
> * Various small improvements have been added to the JIT code, as well
>   as a great speedup of compilation time.
>
> Cheers,
> Maciej Fijalkowski, Armin Rigo, Alex Gaynor, Amaury Forgeot d'Arc and
> the PyPy team
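
For reference, a minimal setup.py following the recipe in the
announcement quoted above might look like the sketch below; the module
and source file names are placeholders, not a real project:

    # Hypothetical minimal setup.py for a C extension under PyPy 1.3.
    # Per the release notes, cpyext must be imported before any C-API
    # machinery is touched; 'example' and example.c are placeholders.
    import cpyext                  # enable alpha-stage CPython C-API support

    from distutils.core import setup, Extension

    setup(
        name='example',
        version='0.1',
        ext_modules=[Extension('example', sources=['example.c'])],
    )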