From paul at paulgraydon.co.uk Wed Jan 18 08:50:22 2012
From: paul at paulgraydon.co.uk (Paul Graydon)
Date: Wed, 18 Jan 2012 07:50:22 +0000
Subject: [Speed] Volunteering
Message-ID: <20120118075022.GA11158@paulgraydon.co.uk>

Hi folks,

I'd like to volunteer my time to help with speed.python.org if I can. I'm a sysadmin by trade, with experience in a mix of Linux and *BSD based environments, from mid-sized ISP infrastructure down to consulting on single servers. I do a bunch of things with Python, but I'm far from a hotshot and am always looking for ways to build my skills. Is there anything specific that I might be able to help with?

Paul

From senger at rehfisch.de Wed Jan 18 12:09:53 2012
From: senger at rehfisch.de (Carsten Senger)
Date: Wed, 18 Jan 2012 12:09:53 +0100
Subject: [Speed] Volunteering
In-Reply-To: <20120118075022.GA11158@paulgraydon.co.uk>
References: <20120118075022.GA11158@paulgraydon.co.uk>
Message-ID: <4F16A881.8000603@rehfisch.de>

Hi all,

On 18.01.2012 08:50, Paul Graydon wrote:
> I'd like to volunteer my time to help with speed.python.org if I can. I'm a sysadmin by trade, with experience in a mix of Linux and *BSD based environments, from mid-sized ISP infrastructure down to consulting on single servers. I do a bunch of things with Python, but I'm far from a hotshot and am always looking for ways to build my skills. Is there anything specific that I might be able to help with?

I also want to help with this project. I'm a Python programmer with decent experience administering and deploying Python web applications on Linux. I could help set up the test runners or automate them.

..Carsten

--
Carsten Senger - Schumannstr. 38 - 65193 Wiesbaden
senger at rehfisch.de - (0611) 5324176
PGP: gpg --recv-keys --keyserver hkp://subkeys.pgp.net 0xE374C75A

From fijall at gmail.com Wed Jan 18 22:07:08 2012
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Wed, 18 Jan 2012 23:07:08 +0200
Subject: [Speed] Volunteering
In-Reply-To: <4F16A881.8000603@rehfisch.de>
References: <20120118075022.GA11158@paulgraydon.co.uk> <4F16A881.8000603@rehfisch.de>
Message-ID:

Hey,

It's great that you want to help, and there is a bit of work to be done. Can you find me on IRC (fijal on #pypy, for example) or on chat (at this mail) to coordinate?

Cheers,
fijal

From senger at rehfisch.de Thu Jan 19 11:20:42 2012
From: senger at rehfisch.de (Carsten Senger)
Date: Thu, 19 Jan 2012 11:20:42 +0100
Subject: [Speed] Volunteering
In-Reply-To:
References: <20120118075022.GA11158@paulgraydon.co.uk> <4F16A881.8000603@rehfisch.de>
Message-ID: <4F17EE7A.8080601@rehfisch.de>

Hi,

On 18.01.2012 22:07, Maciej Fijalkowski wrote:
> It's great that you want to help, and there is a bit of work to be done. Can you find me on IRC (fijal on #pypy, for example) or on chat (at this mail) to coordinate?

IRC is fine with me. Maybe you, Paul, and I can meet on IRC.
I will be there most of today (until 4:30pm UTC and after 8:30pm UTC) and most of tomorrow.

..Carsten

--
Carsten Senger - Schumannstr. 38 - 65193 Wiesbaden
senger at rehfisch.de - (0611) 5324176
PGP: gpg --recv-keys --keyserver hkp://subkeys.pgp.net 0xE374C75A

From senger at rehfisch.de Thu Jan 26 21:21:02 2012
From: senger at rehfisch.de (Carsten Senger)
Date: Thu, 26 Jan 2012 21:21:02 +0100
Subject: [Speed] Buildbot Status
Message-ID: <4F21B5AE.2080304@rehfisch.de>

Hi everybody,

With the help of Maciej I worked on the buildbot over the last few days. It can build CPython, run the benchmarks, and upload the results to one or more codespeed instances. Maciej will look at the changes, so we will hopefully have a working buildbot for Python 2.7 in the next few days.

This has a ticket in pypy's bugtracker: https://bugs.pypy.org/issue1015

I also have a script we can use to run the benchmarks for parts of the history and get data for a year or so into codespeed. The question is whether this data is interesting to anyone.

What are the plans for benchmarking Python 3? How much of the benchmark suite will work with Python 3, or can be made to work without much effort? Porting the runner and the support code is easy, but directly porting the benchmarks, including the libraries they use, seems unrealistic.

Can we replace them with newer versions that support Python 3 to get some benchmarks working? Or build a second set of Python 3 compatible benchmarks with these newer versions?

Are there other tasks for speed.python.org atm?

Cheers,
..Carsten

--
Carsten Senger - Schumannstr. 38 - 65193 Wiesbaden
senger at rehfisch.de - (0611) 5324176
PGP: gpg --recv-keys --keyserver hkp://subkeys.pgp.net 0xE374C75A

From brian at python.org Thu Jan 26 21:36:19 2012
From: brian at python.org (Brian Curtin)
Date: Thu, 26 Jan 2012 14:36:19 -0600
Subject: [Speed] Buildbot Status
In-Reply-To: <4F21B5AE.2080304@rehfisch.de>
References: <4F21B5AE.2080304@rehfisch.de>
Message-ID:

On Thu, Jan 26, 2012 at 14:21, Carsten Senger wrote:
> Hi everybody
>
> ...snip...
>
> Cheers,
>
> ..Carsten

I'm not the right person to answer any of your questions... but I offer an all caps THANK YOU for getting some movement on this.

From jnoller at gmail.com Thu Jan 26 21:37:49 2012
From: jnoller at gmail.com (Jesse Noller)
Date: Thu, 26 Jan 2012 15:37:49 -0500
Subject: [Speed] Buildbot Status
In-Reply-To:
References: <4F21B5AE.2080304@rehfisch.de>
Message-ID: <847F90CB24CA4DEDB220F872306FD75F@gmail.com>

+1 to what Brian said

On Thursday, January 26, 2012 at 3:36 PM, Brian Curtin wrote:
> On Thu, Jan 26, 2012 at 14:21, Carsten Senger wrote:
>> Hi everybody
>>
>> ...snip...
>>
>> Cheers,
>>
>> ..Carsten
>
> I'm not the right person to answer any of your questions... but I offer an all caps THANK YOU for getting some movement on this.
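For readers trying to picture what the history-replay setup Carsten describes involves, here is a minimal sketch, not his actual script: it walks a list of CPython revisions, builds each one, runs the perf.py comparison script from the unladen-derived benchmark suite against a fixed baseline, and POSTs one value per benchmark to a Codespeed instance. The revision list, paths, URL, environment name, and the parse_perf_output helper are all assumptions for illustration; the POST field names are modeled on Codespeed's result-submission API and should be checked against the instance being used.

    # Sketch only: hypothetical paths, revisions, and Codespeed URL.
    import subprocess
    import urllib
    import urllib2

    CODESPEED_URL = "http://speed.python.org/result/add/"  # hypothetical endpoint
    BASELINE = "/usr/bin/python2.7"
    REVISIONS = ["73000", "73500", "74000"]  # hg revisions to replay

    def build_cpython(rev):
        # Check out and build the requested revision in ./cpython.
        subprocess.check_call(["hg", "update", "-r", rev], cwd="cpython")
        subprocess.check_call(["./configure"], cwd="cpython")
        subprocess.check_call(["make", "-j4"], cwd="cpython")

    def parse_perf_output(output):
        # Placeholder: real code would extract (benchmark, value) pairs
        # from perf.py's report.
        raise NotImplementedError

    def run_benchmarks():
        # perf.py compares a baseline interpreter against the freshly built one.
        output = subprocess.check_output(
            ["python", "benchmarks/perf.py", BASELINE, "cpython/python"])
        return parse_perf_output(output)

    def post_result(rev, benchmark, value):
        # Field names modeled on Codespeed's result-submission API; the
        # environment has to exist in the Codespeed instance already.
        data = urllib.urlencode({
            "commitid": rev,
            "branch": "default",
            "project": "CPython",
            "executable": "CPython 2.7",
            "environment": "speed-python",
            "benchmark": benchmark,
            "result_value": value,
        })
        urllib2.urlopen(CODESPEED_URL, data)

    for rev in REVISIONS:
        build_cpython(rev)
        for benchmark, value in run_benchmarks():
            post_result(rev, benchmark, value)

Replaying revisions in forward chronological order keeps the Codespeed timeline consistent and makes it easy to stop and resume partway through a year's worth of history.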
From brett at python.org Mon Jan 30 18:56:01 2012
From: brett at python.org (Brett Cannon)
Date: Mon, 30 Jan 2012 12:56:01 -0500
Subject: [Speed] Buildbot Status
In-Reply-To: <4F21B5AE.2080304@rehfisch.de>
References: <4F21B5AE.2080304@rehfisch.de>
Message-ID:

On Thu, Jan 26, 2012 at 15:21, Carsten Senger wrote:
> Hi everybody
>
> With the help of Maciej I worked on the buildbot over the last few days. It can build CPython, run the benchmarks, and upload the results to one or more codespeed instances. Maciej will look at the changes, so we will hopefully have a working buildbot for Python 2.7 in the next few days.
>
> This has a ticket in pypy's bugtracker: https://bugs.pypy.org/issue1015
>
> I also have a script we can use to run the benchmarks for parts of the history and get data for a year or so into codespeed. The question is whether this data is interesting to anyone.

I would say "don't worry about it unless you have some personal motivation to want to bother". While trending data is interesting, it isn't critical and a year will eventually pass anyway. =)

> What are the plans for benchmarking Python 3? How much of the benchmark suite will work with Python 3, or can be made to work without much effort? Porting the runner and the support code is easy, but directly porting the benchmarks, including the libraries they use, seems unrealistic.
>
> Can we replace them with newer versions that support Python 3 to get some benchmarks working? Or build a second set of Python 3 compatible benchmarks with these newer versions?

That's an open question. Until the libraries the benchmarks rely on get ported officially, it's up in the air when the pre-existing benchmarks can move. We might have to look at pulling in a new set to start and then add back in the old ones (possibly) as they get ported.

> Are there other tasks for speed.python.org atm?

Beats me, but I appreciate everything being done!

From fijall at gmail.com Mon Jan 30 19:28:17 2012
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Mon, 30 Jan 2012 20:28:17 +0200
Subject: [Speed] Buildbot Status
In-Reply-To:
References: <4F21B5AE.2080304@rehfisch.de>
Message-ID:

On Mon, Jan 30, 2012 at 7:56 PM, Brett Cannon wrote:
> On Thu, Jan 26, 2012 at 15:21, Carsten Senger wrote:
>> ...snip...
>>
>> What are the plans for benchmarking Python 3? How much of the benchmark suite will work with Python 3, or can be made to work without much effort?
>> Porting the runner and the support code is easy, but directly porting the benchmarks, including the libraries they use, seems unrealistic.
>>
>> Can we replace them with newer versions that support Python 3 to get some benchmarks working? Or build a second set of Python 3 compatible benchmarks with these newer versions?
>
> That's an open question. Until the libraries the benchmarks rely on get ported officially, it's up in the air when the pre-existing benchmarks can move. We might have to look at pulling in a new set to start and then add back in the old ones (possibly) as they get ported.

Changing benchmarks is *never* a good idea. Note that we have quite some history of those benchmarks running on pypy, and I would strongly object to changing them in any way. Adding Python 3 versions next to them is much better. Also, porting the runner etc. is not a very good idea, I think.

The problem really is that most of the interesting benchmarks don't work on Python 3, only the uninteresting ones. What are we going to do about that?

PS. I pulled your changes

From stutzbach at google.com Mon Jan 30 20:03:42 2012
From: stutzbach at google.com (Daniel Stutzbach)
Date: Mon, 30 Jan 2012 11:03:42 -0800
Subject: [Speed] Buildbot Status
In-Reply-To:
References: <4F21B5AE.2080304@rehfisch.de>
Message-ID:

On Mon, Jan 30, 2012 at 9:56 AM, Brett Cannon wrote:
> That's an open question. Until the libraries the benchmarks rely on get ported officially, it's up in the air when the pre-existing benchmarks can move. We might have to look at pulling in a new set to start and then add back in the old ones (possibly) as they get ported.

+1

In particular, I don't think the Spitfire authors are working on a Python 3 port at all. If we have to wait for Spitfire, we may be waiting a very long time.

--
Daniel Stutzbach

From brett at python.org Tue Jan 31 17:21:58 2012
From: brett at python.org (Brett Cannon)
Date: Tue, 31 Jan 2012 11:21:58 -0500
Subject: [Speed] Buildbot Status
In-Reply-To:
References: <4F21B5AE.2080304@rehfisch.de>
Message-ID:

On Mon, Jan 30, 2012 at 13:28, Maciej Fijalkowski wrote:
> ...snip...
> The problem really is that most of the interesting benchmarks don't work on Python 3, only the uninteresting ones. What are we going to do about that?

And this is a fundamental issue with tying benchmarks to real applications and libraries; if the code the benchmark relies on is never ported to Python 3, then the benchmark is dead in the water. As Daniel pointed out, if Spitfire simply never converts then either we need to convert them ourselves *just* for the benchmark (yuck), live w/o the benchmark (ok, but if this happens to a bunch of benchmarks then we are not going to have a lot of data), or we look at making new benchmarks based on apps/libraries that _have_ made the switch to Python 3 (which means trying to agree on some new set of benchmarks to add to the current set).

BTW, which benchmark set are we talking about? speed.pypy.org runs a different set of benchmarks than the ones at http://hg.python.org/benchmarks. What set are we worrying about porting here? If it's the latter we will have to wait a while for at least 20% of the benchmarks, since they rely on Twisted (which is only 50% done according to http://twistedmatrix.com/trac/milestone/Python-3.x).

From mark at hotpy.org Tue Jan 31 17:46:58 2012
From: mark at hotpy.org (Mark Shannon)
Date: Tue, 31 Jan 2012 16:46:58 +0000
Subject: [Speed] Buildbot Status
In-Reply-To:
References: <4F21B5AE.2080304@rehfisch.de>
Message-ID: <4F281B02.3050306@hotpy.org>

Brett Cannon wrote:
[snip]
> BTW, which benchmark set are we talking about? speed.pypy.org runs a different set of benchmarks than the ones at http://hg.python.org/benchmarks. What set are we worrying about

I think we should aim for supporting the same set of benchmarks as PyPy, since they have a nice historical record. It also means we can have a meaningful comparison of the latest version of CPython with the latest version of PyPy (even if it is only 2.7 ;) )

> porting here? If it's the latter we will have to wait a while for at least 20% of the benchmarks, since they rely on Twisted (which is only 50% done according to http://twistedmatrix.com/trac/milestone/Python-3.x).

At least they are working on it. We may just have to be patient and add in benchmarks one by one.

Cheers,
Mark.

From paul at paulgraydon.co.uk Tue Jan 31 17:58:12 2012
From: paul at paulgraydon.co.uk (Paul Graydon)
Date: Tue, 31 Jan 2012 06:58:12 -1000
Subject: [Speed] Buildbot Status
In-Reply-To:
References: <4F21B5AE.2080304@rehfisch.de>
Message-ID: <4F281DA4.7010300@paulgraydon.co.uk>

> And this is a fundamental issue with tying benchmarks to real applications and libraries; if the code the benchmark relies on is never ported to Python 3, then the benchmark is dead in the water.
> As Daniel pointed out, if Spitfire simply never converts then either we need to convert them ourselves *just* for the benchmark (yuck), live w/o the benchmark (ok, but if this happens to a bunch of benchmarks then we are not going to have a lot of data), or we look at making new benchmarks based on apps/libraries that _have_ made the switch to Python 3 (which means trying to agree on some new set of benchmarks to add to the current set).

What were the criteria by which the original benchmark sets were chosen? I'm assuming it was because they're generally popular libraries amongst developers across a variety of purposes, so speed.pypy would show the speed of regular tasks? If so, presumably it shouldn't be too hard to find appropriate libraries for Python 3?

Paul

From brett at python.org Tue Jan 31 18:38:40 2012
From: brett at python.org (Brett Cannon)
Date: Tue, 31 Jan 2012 12:38:40 -0500
Subject: [Speed] Buildbot Status
In-Reply-To: <4F281B02.3050306@hotpy.org>
References: <4F21B5AE.2080304@rehfisch.de> <4F281B02.3050306@hotpy.org>
Message-ID:

On Tue, Jan 31, 2012 at 11:46, Mark Shannon wrote:
> Brett Cannon wrote:
> [snip]
>> BTW, which benchmark set are we talking about? speed.pypy.org runs a different set of benchmarks than the ones at http://hg.python.org/benchmarks. What set are we worrying about
>
> I think we should aim for supporting the same set of benchmarks as PyPy, since they have a nice historical record.

Depends who you ask. =) The benchmarks from hg.python.org are the ones from unladen, and then PyPy tweaked them. Originally we had said we would consolidate on the hg.python.org ones so that they would be developed in a central location. Plus, history should only come into play to show a benchmark was useful, not for historical data. speed.python.org is meant to compare the various VMs, not to compare which version of Python is the fastest; that just leads down the road of "I'm going to stay on Python M.N because it's slightly faster than Python M.O", which we don't want.

> It also means we can have a meaningful comparison of the latest version of CPython with the latest version of PyPy (even if it is only 2.7 ;) )

Those are already compared by either benchmark suite, so that isn't something to worry about.

>> porting here? If it's the latter we will have to wait a while for at least 20% of the benchmarks, since they rely on Twisted (which is only 50% done according to http://twistedmatrix.com/trac/milestone/Python-3.x).
>
> At least they are working on it. We may just have to be patient and add in benchmarks one by one.

There's patience, and then there's waiting years. If we want a decent set of benchmarks for when PyPy reaches Python 3 parity then we probably shouldn't sit around waiting for Twisted to release.

-Brett
From brett at python.org Tue Jan 31 18:40:09 2012
From: brett at python.org (Brett Cannon)
Date: Tue, 31 Jan 2012 12:40:09 -0500
Subject: [Speed] Buildbot Status
In-Reply-To: <4F281DA4.7010300@paulgraydon.co.uk>
References: <4F21B5AE.2080304@rehfisch.de> <4F281DA4.7010300@paulgraydon.co.uk>
Message-ID:

On Tue, Jan 31, 2012 at 11:58, Paul Graydon wrote:
>> And this is a fundamental issue with tying benchmarks to real applications and libraries; if the code the benchmark relies on is never ported to Python 3, then the benchmark is dead in the water. As Daniel pointed out, if Spitfire simply never converts then either we need to convert them ourselves *just* for the benchmark (yuck), live w/o the benchmark (ok, but if this happens to a bunch of benchmarks then we are not going to have a lot of data), or we look at making new benchmarks based on apps/libraries that _have_ made the switch to Python 3 (which means trying to agree on some new set of benchmarks to add to the current set).
>
> What were the criteria by which the original benchmark sets were chosen? I'm assuming it was because they're generally popular libraries amongst developers across a variety of purposes, so speed.pypy would show the speed of regular tasks?

That's the reason unladen swallow chose them, yes. PyPy then adopted them and added in the Twisted benchmarks.

> If so, presumably it shouldn't be too hard to find appropriate libraries for Python 3?

Perhaps, but someone has to put in the effort to find those benchmarks, code them up, show how they are a reasonable workload, and then get them accepted.
Everyone likes the current set because the unladen team put a lot of time and effort into selecting and creating those benchmarks.

From fijall at gmail.com Tue Jan 31 19:44:49 2012
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Tue, 31 Jan 2012 20:44:49 +0200
Subject: [Speed] Buildbot Status
In-Reply-To:
References: <4F21B5AE.2080304@rehfisch.de> <4F281DA4.7010300@paulgraydon.co.uk>
Message-ID:

On Tue, Jan 31, 2012 at 7:40 PM, Brett Cannon wrote:
> ...snip...
>
> Perhaps, but someone has to put in the effort to find those benchmarks, code them up, show how they are a reasonable workload, and then get them accepted. Everyone likes the current set because the unladen team put a lot of time and effort into selecting and creating those benchmarks.

I think we also spent a significant amount of time grabbing various benchmarks from various places (we = people who contributed to the speed.pypy.org benchmark suite, which is by far not a group consisting only of pypy devs).

You might be surprised, but the criteria we used were mostly "contributed benchmarks showing some sort of real workload". I don't think we ever *rejected* a benchmark, barring one case that was very variable and not very interesting (depending on the HD performance). Some benchmarks were developed from "we know pypy is slow on this" scenarios as well.

The important part is that we also want "interesting" benchmarks to be included. This mostly means "run by someone somewhere", which includes a very broad category of things but *excludes* fibonacci, richards, pystone and stuff like this. I think it's fine if we have a benchmark that runs the Python 3 version of whatever is there, but this requires work. Is there someone willing to do that work?

Cheers,
fijal

From brett at python.org Tue Jan 31 20:55:26 2012
From: brett at python.org (Brett Cannon)
Date: Tue, 31 Jan 2012 14:55:26 -0500
Subject: [Speed] Buildbot Status
In-Reply-To:
References: <4F21B5AE.2080304@rehfisch.de> <4F281DA4.7010300@paulgraydon.co.uk>
Message-ID:

On Tue, Jan 31, 2012 at 13:44, Maciej Fijalkowski wrote:
> ...snip...
>
> I think we also spent a significant amount of time grabbing various benchmarks from various places (we = people who contributed to the speed.pypy.org benchmark suite, which is by far not a group consisting only of pypy devs).

Where does the PyPy benchmark code live, anyway?

> You might be surprised, but the criteria we used were mostly "contributed benchmarks showing some sort of real workload".
> I don't think we ever *rejected* a benchmark, barring one case that was very variable and not very interesting (depending on the HD performance). Some benchmarks were developed from "we know pypy is slow on this" scenarios as well.

Yeah, you and Alex have told me that in person before.

> The important part is that we also want "interesting" benchmarks to be included. This mostly means "run by someone somewhere", which includes a very broad category of things but *excludes* fibonacci, richards, pystone and stuff like this. I think it's fine if we have a benchmark that runs the Python 3 version of whatever is there, but this requires work. Is there someone willing to do that work?

Right, I'm not suggesting something as silly as fibonacci.

I think we need to first decide which set of benchmarks we are using, since there is already divergence between what is on hg.python.org and what is measured at speed.pypy.org (e.g. hg.python.org tests 2to3 while pypy.org does not, and the reverse goes for Twisted). Once we know what set of benchmarks we care about (it can be a cross-section), then we need to take a hard look at where we are coming up short for Python 3. But from a python-dev perspective, benchmarks running against Python 2 are not interesting, since we are simply no longer developing performance improvements for Python 2.7.

From fijall at gmail.com Tue Jan 31 21:04:16 2012
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Tue, 31 Jan 2012 22:04:16 +0200
Subject: [Speed] Buildbot Status
In-Reply-To:
References: <4F21B5AE.2080304@rehfisch.de> <4F281DA4.7010300@paulgraydon.co.uk>
Message-ID:

On Tue, Jan 31, 2012 at 9:55 PM, Brett Cannon wrote:
> ...snip...
> Where does the PyPy benchmark code live, anyway?

http://bitbucket.org/pypy/benchmarks

> ...snip...
>
> I think we need to first decide which set of benchmarks we are using, since there is already divergence between what is on hg.python.org and what is measured at speed.pypy.org (e.g. hg.python.org tests 2to3 while pypy.org does not, and the reverse goes for Twisted). Once we know what set of benchmarks we care about (it can be a cross-section), then we need to take a hard look at where we are coming up short for Python 3. But from a python-dev perspective, benchmarks running against Python 2 are not interesting, since we are simply no longer developing performance improvements for Python 2.7.

2to3 is essentially an oversight on the pypy side, we'll integrate it back. Other than that I think the pypy benchmarks are mostly a superset (there is also pickle and a bunch of pointless microbenchmarks).

From brett at python.org Tue Jan 31 21:39:54 2012
From: brett at python.org (Brett Cannon)
Date: Tue, 31 Jan 2012 15:39:54 -0500
Subject: [Speed] Buildbot Status
In-Reply-To:
References: <4F21B5AE.2080304@rehfisch.de> <4F281DA4.7010300@paulgraydon.co.uk>
Message-ID:

On Tue, Jan 31, 2012 at 15:04, Maciej Fijalkowski wrote:
> ...snip...
> 2to3 is essentially an oversight on the pypy side, we'll integrate it back.
> Other than that I think the pypy benchmarks are mostly a superset (there is also pickle and a bunch of pointless microbenchmarks).

I think pickle was mostly for unladen's pickle performance patches (try saying that three times fast =), so I don't really care about that one.

Would it make sense to change the pypy repo to make the unladen_swallow directory an external repo from hg.python.org/benchmarks? Because as it stands right now there are two mako benchmarks that are not identical. Otherwise we should talk at PyCon and figure this all out before we end up with two divergent benchmark suites that are being independently maintained (since we are all going to be running the same benchmarks on speed.python.org).
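If the external-repo route were taken, Mercurial's subrepository support is one way to express it. The following is only a sketch under assumed paths and URLs, not an agreed layout; the existing unladen_swallow directory would have to be removed from the pypy benchmarks repository first.

    # Hypothetical commands, run inside a clone of the pypy benchmarks repo.
    $ hg rm unladen_swallow                                   # drop the copied sources
    $ hg clone http://hg.python.org/benchmarks unladen_swallow
    $ echo "unladen_swallow = http://hg.python.org/benchmarks" > .hgsub
    $ hg add .hgsub
    $ hg commit -m "Track hg.python.org/benchmarks as a subrepo instead of a copy"

However it is done, the practical goal is the same: one authoritative copy of each benchmark, so the two mako variants cannot drift apart again.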