From cfbolz at gmx.de Fri Feb 1 08:44:06 2013 From: cfbolz at gmx.de (Carl Friedrich Bolz) Date: Fri, 01 Feb 2013 08:44:06 +0100 Subject: [pypy-dev] [pypy-commit] pypy default: hidden frames are fairly rare, it's ok to unroll this In-Reply-To: <20130201015434.63E7E1C009B@cobra.cs.uni-duesseldorf.de> References: <20130201015434.63E7E1C009B@cobra.cs.uni-duesseldorf.de> Message-ID: <8e869c63-2c3a-4ed7-b4e2-05bd9eaccb88@email.android.com> Hi Alex, This needs a corresponding test in test_pypy_c.py Cheers, Carl Friedrich alex_gaynor wrote: >Author: Alex Gaynor >Branch: >Changeset: r60800:9aeefdb4841d >Date: 2013-01-31 17:54 -0800 >http://bitbucket.org/pypy/pypy/changeset/9aeefdb4841d/ > >Log: hidden frames are fairly rare, it's ok to unroll this > >diff --git a/pypy/interpreter/executioncontext.py >b/pypy/interpreter/executioncontext.py >--- a/pypy/interpreter/executioncontext.py >+++ b/pypy/interpreter/executioncontext.py >@@ -40,6 +40,7 @@ > def gettopframe(self): > return self.topframeref() > >+ @jit.unroll_safe > def gettopframe_nohidden(self): > frame = self.topframeref() > while frame and frame.hide(): >_______________________________________________ >pypy-commit mailing list >pypy-commit at python.org >http://mail.python.org/mailman/listinfo/pypy-commit -------------- next part -------------- An HTML attachment was scrubbed... URL: From alex.gaynor at gmail.com Sat Feb 2 23:53:07 2013 From: alex.gaynor at gmail.com (Alex Gaynor) Date: Sat, 2 Feb 2013 14:53:07 -0800 Subject: [pypy-dev] [pypy-commit] pypy default: hidden frames are fairly rare, it's ok to unroll this In-Reply-To: <8e869c63-2c3a-4ed7-b4e2-05bd9eaccb88@email.android.com> References: <20130201015434.63E7E1C009B@cobra.cs.uni-duesseldorf.de> <8e869c63-2c3a-4ed7-b4e2-05bd9eaccb88@email.android.com> Message-ID: Hi Carl, At the moment this shouldn't affect anything, this function is only called in other places with loops. 
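[Editor's note: the method touched by the diff above is just a short walk over the (rarely) hidden frames at the top of the frame chain. A rough pure-Python sketch of that shape follows; the Frame class and f_back link are hypothetical stand-ins for PyPy's interpreter-level frames, not actual PyPy code.]

```python
class Frame:
    """Hypothetical stand-in for an interpreter-level frame."""
    def __init__(self, hidden=False, back=None):
        self.hidden = hidden
        self.f_back = back  # next frame down the chain

    def hide(self):
        return self.hidden

def gettopframe_nohidden(top):
    # Same loop shape as the patched method: hidden frames are rare,
    # so this almost always runs zero or one iterations, which is why
    # telling the JIT to unroll it (@jit.unroll_safe) is reasonable.
    frame = top
    while frame is not None and frame.hide():
        frame = frame.f_back
    return frame
```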
However, this is based on the same logic we used in getnextframe_nohidden, and I believe it is a first step in making sys.exc_info() not explode stuff :) Alex

On Thu, Jan 31, 2013 at 11:44 PM, Carl Friedrich Bolz wrote: > Hi Alex, > > This needs a corresponding test in test_pypy_c.py > > Cheers, > > Carl Friedrich > > > alex_gaynor wrote: >> >> Author: Alex Gaynor >> Branch: >> Changeset: r60800:9aeefdb4841d >> Date: 2013-01-31 17:54 -0800 >> http://bitbucket.org/pypy/pypy/changeset/9aeefdb4841d/ >> >> Log: hidden frames are fairly rare, it's ok to unroll this >> >> diff --git a/pypy/interpreter/executioncontext.py b/pypy/interpreter/executioncontext.py >> --- a/pypy/interpreter/executioncontext.py >> +++ b/pypy/interpreter/executioncontext.py >> @@ -40,6 +40,7 @@ >> def gettopframe(self): >> return self.topframeref() >> >> + @jit.unroll_safe >> def gettopframe_nohidden(self): >> frame = self.topframeref() >> while frame and >> frame.hide(): >> ------------------------------ >> >> pypy-commit mailing list >> pypy-commit at python.org >> http://mail.python.org/mailman/listinfo/pypy-commit >> >> > > > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > http://mail.python.org/mailman/listinfo/pypy-dev > > -- "I disapprove of what you say, but I will defend to the death your right to say it." -- Evelyn Beatrice Hall (summarizing Voltaire) "The people's good is the highest law." -- Cicero

From john.m.camara at gmail.com Sun Feb 3 18:39:39 2013 From: john.m.camara at gmail.com (John Camara) Date: Sun, 3 Feb 2013 12:39:39 -0500 Subject: [pypy-dev] Should jitviewer come with a warning? Message-ID: I have been noticing a pattern where many who are writing Python code to run on PyPy are relying more and more on the jitviewer to help them write faster code.
Unfortunately, many of them who do so don't look at improving the design of their code as a way to improve the speed at which it will run under PyPy, but instead start writing obscure Python code that happens to run faster under PyPy.

I know that the PyPy core developers, at least, would like to see everyone just create good clean Python code, and that often code that has been made into obscure Python was done so to try to optimize it for CPython, which in many cases causes it to run slower on PyPy than it would run if the code just followed typical Python idioms.

I feel that a normal developer should be using tools like cProfile and runsnakerun, and cleaning up design issues, way before they should even consider using jitviewer.

In a recent case I saw someone using the jitviewer who likely doesn't need to use it, at least not considering the current design of the code, and I said the following:

"The jitviewer should be mainly used by PyPy core developers and those building PyPy VMs. A normal developer writing Python code to run on PyPy shouldn't have a need to use it. They can use it to point out an inefficiency that PyPy has to the core developers, but it should not be used as a way to get you to write Python code in a way that has a better chance of being optimized under PyPy, except for very rare occasions, and even then it should only be done by those who follow closely and understand PyPy's development."

Do others here share this same opinion, and should some warning be added to the jitviewer? John

From exarkun at twistedmatrix.com Sun Feb 3 20:25:48 2013 From: exarkun at twistedmatrix.com (exarkun at twistedmatrix.com) Date: Sun, 03 Feb 2013 19:25:48 -0000 Subject: [pypy-dev] Should jitviewer come with a warning?
In-Reply-To: References: Message-ID: <20130203192548.3816.313520268.divmod.xquotient.2@localhost6.localdomain6> On 05:39 pm, john.m.camara at gmail.com wrote: >I have been noticing a pattern where many who are writing Python code >to >run on PyPy are relying more and more on using the jitviewer to help >them >write faster code. Unfortunately, many of them who do so don't look at >improving the design of their code as a way to improve the speed at >which >it will run under PyPy but instead start writing obscure Python code >that >happens to run faster under PyPy. > >I know that at least with the PyPy core developers they would like to >see >every one just create good clean Python code and that often code that >has >been made into obscure Python was don so to try to optimize it for >CPython >which in many cases causes it to run slower on PyPy than it would run >it >the code just followed typical Python idioms. > >I feel that a normal developer should be using tools like cProfiler and >runsnakerun and cleaning up design issues way before they should even >consider using jitviewer. > >In a recent case where I saw someone using the jitviewer who likely >doesn't >need to use it. At least they don't need to use it considering the >current >design of the code I said the following > >"The jitviewer should be mainly used by PyPy core developers and those >building PyPy VMs. A normal developer writing Python code to run on >PyPy >shouldn?t have a need to use it. They can use it to point out an >inefficiency that PyPy has to the core developers but it should not be >used >as a way to get you to write Python code in a way that has a better >chance >of being optimized under PyPy except for very rare occasions and even >then >it should only be made by those who follow closely and understand >PyPy?s >development." > > >Do others here share this same opinion and should some warning be added >to >the jitviewer? 
What makes you think people will even read this warning, let alone prioritize it over their immediate desire to make their program run faster? (Not that I am objecting to adding the warning, but I think you might be fooling yourself if you think it will have any impact) Jean-Paul From fijall at gmail.com Sun Feb 3 20:39:38 2013 From: fijall at gmail.com (Maciej Fijalkowski) Date: Sun, 3 Feb 2013 21:39:38 +0200 Subject: [pypy-dev] Should jitviewer come with a warning? In-Reply-To: <20130203192548.3816.313520268.divmod.xquotient.2@localhost6.localdomain6> References: <20130203192548.3816.313520268.divmod.xquotient.2@localhost6.localdomain6> Message-ID: On Sun, Feb 3, 2013 at 9:25 PM, wrote: > On 05:39 pm, john.m.camara at gmail.com wrote: >> >> I have been noticing a pattern where many who are writing Python code to >> run on PyPy are relying more and more on using the jitviewer to help them >> write faster code. Unfortunately, many of them who do so don't look at >> improving the design of their code as a way to improve the speed at which >> it will run under PyPy but instead start writing obscure Python code that >> happens to run faster under PyPy. >> >> I know that at least with the PyPy core developers they would like to see >> every one just create good clean Python code and that often code that has >> been made into obscure Python was don so to try to optimize it for CPython >> which in many cases causes it to run slower on PyPy than it would run it >> the code just followed typical Python idioms. >> >> I feel that a normal developer should be using tools like cProfiler and >> runsnakerun and cleaning up design issues way before they should even >> consider using jitviewer. >> >> In a recent case where I saw someone using the jitviewer who likely >> doesn't >> need to use it. 
At least they don't need to use it considering the >> current >> design of the code I said the following >> >> "The jitviewer should be mainly used by PyPy core developers and those >> building PyPy VMs. A normal developer writing Python code to run on PyPy >> shouldn?t have a need to use it. They can use it to point out an >> inefficiency that PyPy has to the core developers but it should not be >> used >> as a way to get you to write Python code in a way that has a better chance >> of being optimized under PyPy except for very rare occasions and even then >> it should only be made by those who follow closely and understand PyPy?s >> development." >> >> >> Do others here share this same opinion and should some warning be added to >> the jitviewer? > > > What makes you think people will even read this warning, let alone > prioritize it over their immediate desire to make their program run faster? > > (Not that I am objecting to adding the warning, but I think you might be > fooling yourself if you think it will have any impact) > > Jean-Paul > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > http://mail.python.org/mailman/listinfo/pypy-dev Let me rephrase it. Where did you look for such a warning and you did not find it so you assumed it's ok? Cheers, fijal From john.m.camara at gmail.com Sun Feb 3 21:08:39 2013 From: john.m.camara at gmail.com (John Camara) Date: Sun, 3 Feb 2013 15:08:39 -0500 Subject: [pypy-dev] Should jitviewer come with a warning? In-Reply-To: References: Message-ID: > What makes you think people will even read this warning, let alone > prioritize it over their immediate desire to make their program run > faster? 
> (Not that I am objecting to adding the warning, but I think you might be > fooling yourself if you think it will have any impact) > Jean-Paul

I agree with you, and I was not being naive in thinking this alone would solve the problem, but it does give us something to point to when we see someone abusing the jitviewer.

Maybe a more effective approach is not to advertise the jitviewer to everyone who has performance issues, and only to tell those who are experienced programmers and have already done the obvious work of fixing any design issues that existed in their code. Having inexperienced developers use the normal profiling tools will still help them find the hot spots in their code, and help prevent them from picking up habits that lead them to writing un-Pythonic code.

I'm sure we all agree that code with a better design will run faster in PyPy than trying to add optimizations that work only for PyPy to help out a poor design.

I don't think we want to end up with a lot of Python code that looks like C code. This is what happens when the inexperienced start relying on the jitviewer.

For instance, take a look at this code [1] and blog [2], which led me to post this. This is not the first time I have come across this issue, and unfortunately it appears to be increasing at an alarming rate.

I guess I feel we have a responsibility to try to promote good programming practices when we can.

[1] - https://github.com/msgpack/msgpack-python/blob/master/msgpack/fallback.py

[2] - http://blog.affien.com/archives/2013/01/29/msgpack-for-pypy/

John

On Sun, Feb 3, 2013 at 12:39 PM, John Camara wrote: > I have been noticing a pattern where many who are writing Python code to > run on PyPy are relying more and more on using the jitviewer to help them > write faster code.
Unfortunately, many of them who do so don't look at > improving the design of their code as a way to improve the speed at which > it will run under PyPy but instead start writing obscure Python code that > happens to run faster under PyPy. > > I know that at least with the PyPy core developers they would like to see > every one just create good clean Python code and that often code that has > been made into obscure Python was don so to try to optimize it for CPython > which in many cases causes it to run slower on PyPy than it would run it > the code just followed typical Python idioms. > > I feel that a normal developer should be using tools like cProfiler and > runsnakerun and cleaning up design issues way before they should even > consider using jitviewer. > > In a recent case where I saw someone using the jitviewer who likely > doesn't need to use it. At least they don't need to use it considering the > current design of the code I said the following > > "The jitviewer should be mainly used by PyPy core developers and those > building PyPy VMs. A normal developer writing Python code to run on PyPy > shouldn?t have a need to use it. They can use it to point out an > inefficiency that PyPy has to the core developers but it should not be used > as a way to get you to write Python code in a way that has a better chance > of being optimized under PyPy except for very rare occasions and even then > it should only be made by those who follow closely and understand PyPy?s > development." > > > Do others here share this same opinion and should some warning be added to > the jitviewer? > > John > -------------- next part -------------- An HTML attachment was scrubbed... URL: From fijall at gmail.com Sun Feb 3 21:12:03 2013 From: fijall at gmail.com (Maciej Fijalkowski) Date: Sun, 3 Feb 2013 22:12:03 +0200 Subject: [pypy-dev] Should jitviewer come with a warning? 
In-Reply-To: References: Message-ID: On Sun, Feb 3, 2013 at 10:08 PM, John Camara wrote: >> What makes you think people will even read this warning, let alone >> prioritize it over their immediate desire to make their program run >> faster? > >> (Not that I am objecting to adding the warning, but I think you might be >> fooling yourself if you think it will have any impact) > >> Jean-Paul > > I agree with you and was not being naive and thinking this alone was going > to solve the problem but it does gives us something to point to when we see > someone abusing the jitviewer. > > Maybe, a more effective approach, is not to advertise about the jitviewer to > everyone who has performance issues and only tell those who are experience > programmers who have already done the obvious in fixing any design issues > that had existed in their code. Having inexperience developers use the > normal profiling tools will still help them find the hot spots in their code > and help prevent them from picking up habits that lead them to writing > un-Pythonic code. > > I'm sure we all agree that code with a better design will run faster in pypy > than trying to add optimizations that work only for pypy to help out a poor > design. > > I don't think we want to end up with a lot of Python code that looks like C > code. This is what happens when the inexperience start relying on the > jitviewer. > > For instance take a look at this code [1] and blog [2] which lead me to post > this. This is not the first example I have come across this issue and > unfortunately it appears to be increaseing at an alarming rate. > > I guess I feel we have a responsibility to try to promote good programming > practices when we can. 
> > [1] - > https://github.com/msgpack/msgpack-python/blob/master/msgpack/fallback.py > > [2] - http://blog.affien.com/archives/2013/01/29/msgpack-for-pypy/ > > John > > > > On Sun, Feb 3, 2013 at 12:39 PM, John Camara > wrote: >> >> I have been noticing a pattern where many who are writing Python code to >> run on PyPy are relying more and more on using the jitviewer to help them >> write faster code. Unfortunately, many of them who do so don't look at >> improving the design of their code as a way to improve the speed at which it >> will run under PyPy but instead start writing obscure Python code that >> happens to run faster under PyPy. >> >> I know that at least with the PyPy core developers they would like to see >> every one just create good clean Python code and that often code that has >> been made into obscure Python was don so to try to optimize it for CPython >> which in many cases causes it to run slower on PyPy than it would run it the >> code just followed typical Python idioms. >> >> I feel that a normal developer should be using tools like cProfiler and >> runsnakerun and cleaning up design issues way before they should even >> consider using jitviewer. >> >> In a recent case where I saw someone using the jitviewer who likely >> doesn't need to use it. At least they don't need to use it considering the >> current design of the code I said the following >> >> "The jitviewer should be mainly used by PyPy core developers and those >> building PyPy VMs. A normal developer writing Python code to run on PyPy >> shouldn?t have a need to use it. They can use it to point out an >> inefficiency that PyPy has to the core developers but it should not be used >> as a way to get you to write Python code in a way that has a better chance >> of being optimized under PyPy except for very rare occasions and even then >> it should only be made by those who follow closely and understand PyPy?s >> development." 
>> >> >> Do others here share this same opinion and should some warning be added to >> the jitviewer? >> >> John > > > > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > http://mail.python.org/mailman/listinfo/pypy-dev > Hi John. I don't believe jitviewer is advertised really in that many places. We tell people who come to IRC, yes, but that's about it (it's not prominently featured on pypy.org for example). It's hard enough to make people read docs. From john.m.camara at gmail.com Sun Feb 3 21:13:00 2013 From: john.m.camara at gmail.com (John Camara) Date: Sun, 3 Feb 2013 15:13:00 -0500 Subject: [pypy-dev] Should jitviewer come with a warning? In-Reply-To: References: Message-ID: > Let me rephrase it. Where did you look for such a warning and you did > not find it so you assumed it's ok? > Cheers, > fijal Having a warning on https://bitbucket.org/pypy/jitviewer would be good. On Sun, Feb 3, 2013 at 3:08 PM, John Camara wrote: > > What makes you think people will even read this warning, let alone > > prioritize it over their immediate desire to make their program run > > faster? > > > (Not that I am objecting to adding the warning, but I think you might be > > fooling yourself if you think it will have any impact) > > > Jean-Paul > > I agree with you and was not being naive and thinking this alone was going to solve the problem but it does gives us something to point to when we see someone abusing the jitviewer. > > Maybe, a more effective approach, is not to advertise about the jitviewer to everyone who has performance issues and only tell those who are experience programmers who have already done the obvious in fixing any design issues that had existed in their code. Having inexperience developers use the normal profiling tools will still help them find the hot spots in their code and help prevent them from picking up habits that lead them to writing un-Pythonic code. 
> > I'm sure we all agree that code with a better design will run faster in pypy than trying to add optimizations that work only for pypy to help out a poor design. > > I don't think we want to end up with a lot of Python code that looks like C code. This is what happens when the inexperience start relying on the jitviewer. > > For instance take a look at this code [1] and blog [2] which lead me to post this. This is not the first example I have come across this issue and unfortunately it appears to be increaseing at an alarming rate. > > I guess I feel we have a responsibility to try to promote good programming practices when we can. > > [1] - https://github.com/msgpack/msgpack-python/blob/master/msgpack/fallback.py > > [2] - http://blog.affien.com/archives/2013/01/29/msgpack-for-pypy/ > > John > > > > On Sun, Feb 3, 2013 at 12:39 PM, John Camara wrote: > >> I have been noticing a pattern where many who are writing Python code to >> run on PyPy are relying more and more on using the jitviewer to help them >> write faster code. Unfortunately, many of them who do so don't look at >> improving the design of their code as a way to improve the speed at which >> it will run under PyPy but instead start writing obscure Python code that >> happens to run faster under PyPy. >> >> I know that at least with the PyPy core developers they would like to see >> every one just create good clean Python code and that often code that has >> been made into obscure Python was don so to try to optimize it for CPython >> which in many cases causes it to run slower on PyPy than it would run it >> the code just followed typical Python idioms. >> >> I feel that a normal developer should be using tools like cProfiler and >> runsnakerun and cleaning up design issues way before they should even >> consider using jitviewer. >> >> In a recent case where I saw someone using the jitviewer who likely >> doesn't need to use it. 
At least they don't need to use it considering the >> current design of the code I said the following >> >> "The jitviewer should be mainly used by PyPy core developers and those >> building PyPy VMs. A normal developer writing Python code to run on PyPy >> shouldn?t have a need to use it. They can use it to point out an >> inefficiency that PyPy has to the core developers but it should not be used >> as a way to get you to write Python code in a way that has a better chance >> of being optimized under PyPy except for very rare occasions and even then >> it should only be made by those who follow closely and understand PyPy?s >> development." >> >> >> Do others here share this same opinion and should some warning be added >> to the jitviewer? >> >> John >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From fijall at gmail.com Sun Feb 3 21:13:07 2013 From: fijall at gmail.com (Maciej Fijalkowski) Date: Sun, 3 Feb 2013 22:13:07 +0200 Subject: [pypy-dev] Should jitviewer come with a warning? In-Reply-To: References: Message-ID: On Sun, Feb 3, 2013 at 10:12 PM, Maciej Fijalkowski wrote: > On Sun, Feb 3, 2013 at 10:08 PM, John Camara wrote: >>> What makes you think people will even read this warning, let alone >>> prioritize it over their immediate desire to make their program run >>> faster? >> >>> (Not that I am objecting to adding the warning, but I think you might be >>> fooling yourself if you think it will have any impact) >> >>> Jean-Paul >> >> I agree with you and was not being naive and thinking this alone was going >> to solve the problem but it does gives us something to point to when we see >> someone abusing the jitviewer. >> >> Maybe, a more effective approach, is not to advertise about the jitviewer to >> everyone who has performance issues and only tell those who are experience >> programmers who have already done the obvious in fixing any design issues >> that had existed in their code. 
Having inexperience developers use the >> normal profiling tools will still help them find the hot spots in their code >> and help prevent them from picking up habits that lead them to writing >> un-Pythonic code. >> >> I'm sure we all agree that code with a better design will run faster in pypy >> than trying to add optimizations that work only for pypy to help out a poor >> design. >> >> I don't think we want to end up with a lot of Python code that looks like C >> code. This is what happens when the inexperience start relying on the >> jitviewer. >> >> For instance take a look at this code [1] and blog [2] which lead me to post >> this. This is not the first example I have come across this issue and >> unfortunately it appears to be increaseing at an alarming rate. >> >> I guess I feel we have a responsibility to try to promote good programming >> practices when we can. >> >> [1] - >> https://github.com/msgpack/msgpack-python/blob/master/msgpack/fallback.py >> >> [2] - http://blog.affien.com/archives/2013/01/29/msgpack-for-pypy/ >> >> John >> >> >> >> On Sun, Feb 3, 2013 at 12:39 PM, John Camara >> wrote: >>> >>> I have been noticing a pattern where many who are writing Python code to >>> run on PyPy are relying more and more on using the jitviewer to help them >>> write faster code. Unfortunately, many of them who do so don't look at >>> improving the design of their code as a way to improve the speed at which it >>> will run under PyPy but instead start writing obscure Python code that >>> happens to run faster under PyPy. >>> >>> I know that at least with the PyPy core developers they would like to see >>> every one just create good clean Python code and that often code that has >>> been made into obscure Python was don so to try to optimize it for CPython >>> which in many cases causes it to run slower on PyPy than it would run it the >>> code just followed typical Python idioms. 
>>> >>> I feel that a normal developer should be using tools like cProfiler and >>> runsnakerun and cleaning up design issues way before they should even >>> consider using jitviewer. >>> >>> In a recent case where I saw someone using the jitviewer who likely >>> doesn't need to use it. At least they don't need to use it considering the >>> current design of the code I said the following >>> >>> "The jitviewer should be mainly used by PyPy core developers and those >>> building PyPy VMs. A normal developer writing Python code to run on PyPy >>> shouldn?t have a need to use it. They can use it to point out an >>> inefficiency that PyPy has to the core developers but it should not be used >>> as a way to get you to write Python code in a way that has a better chance >>> of being optimized under PyPy except for very rare occasions and even then >>> it should only be made by those who follow closely and understand PyPy?s >>> development." >>> >>> >>> Do others here share this same opinion and should some warning be added to >>> the jitviewer? >>> >>> John >> >> >> >> _______________________________________________ >> pypy-dev mailing list >> pypy-dev at python.org >> http://mail.python.org/mailman/listinfo/pypy-dev >> > > Hi John. > > I don't believe jitviewer is advertised really in that many places. We > tell people who come to IRC, yes, but that's about it (it's not > prominently featured on pypy.org for example). It's hard enough to > make people read docs. Also, looking at the msgpack - this code is maybe not ideal, but if you're dealing with buffer-level protocols, you end up with code looking like C a lot. 
From fijall at gmail.com Sun Feb 3 21:19:41 2013 From: fijall at gmail.com (Maciej Fijalkowski) Date: Sun, 3 Feb 2013 22:19:41 +0200 Subject: [pypy-dev] Parallella open hardware platform In-Reply-To: References: Message-ID: On Fri, Feb 1, 2013 at 12:01 AM, John Camara wrote: > A couple of days ago I heard about the Parallella [1] project which is an > open hardware platform similar to the Raspberry Pi but with much higher > capabilities. It has a Zynq Z-7010 which has both a dual core ARM A9 (800 > MHz) processor and a Artix-7 FPGA, a 16 core Epiphany multicore accelerator, > 1GB ram (see [2] for more info) and currently boots up in Ubuntu. > > The goal of the Parallella project is to develop an open parallel hardware > platform and development tools. Recently they announced support for Python > with Mark Dewing [3] leading the effort. I had asked Mark if he considered > PyPy but at this time he doesn't have time for this investigation and he > reposted my comment on the forum [4] with a couple of question. Maybe one of > you could answer them. > > Working with the Parallella project maybe a good opportunity for the PyPy > project from both a PR perspective and as well as the technical challenges > it would present. On the technical side it would give the opportunity to > test STM on a reasonable number of cores while also dealing with cores from > different architectures (ARM and Epiphany). I could see all the JITting > occurring on the ARM cores with it producing output for both architectures > based on which type of core STM decides to use for a chunk of work to > execute on. Of course there is also the challenge of bridging between the 2 > architectures. Maybe even some of the more expensive STM operations could > be offloaded to the FPGA or even a limited amount of very hot sections of > code could be JITted to the FPGA (although this might be more work than its > worth). 
> > From a PR perspective PyPy needs to excel at some niche market so that the > PyPy platform can take off. When PyPy started concentrating on the > scientific market with increasing support for Numpy I thought this would be > the niche market that would get PyPy to take off. But there have been a > couple of issue with this approach. There is a tremendous amount of work > that needs to be done so that PyPy can look attractive to this niche market. > It requires supporting both NumPy and SciPy and their was an expectation > that if PyPy supports NumPy others would come to help out with the SciPy > support. The problem is that there doesn't seam to be many who are eager to > pitch in for the SciPy effort and there also has not been a whole lot > willing to help will the ongoing NumPy work. I think in general the ratio > of people who use NumPy and SciPy to those willing to contribute is quite > small. So the idea of going after this market was a good idea and can > definitely have the opportunity to showing the strength of PyPy project it > hasn't done much to improve the image of the PyPy project. It also doesn't > help that there is some commercial interests that have popped up recently > that have decided to play hard ball against PyPy by spreading FUD. > > Unlike the Raspberry Pi hardware which can only support hobbyist the > Parallella hardware can support both hobbyists and commercial interests. > They cost $100 which is more than the $35 for Raspberry Pi but still within > reach of most hobbyists and they didn't cut out the many features that are > needed for commercial interests. The Parallella project raised nearly $0.9 > million on kickstarter [5] for the project with nearly 5000 backers. 
Since > many who will use the Parallella hardware also have experience on embedded > systems they and are more likely used to writing low level code in assembly, > FPGAs, and even lots of C code and I'm sure have hit many issues with > programming in parallel/multithreaded and would welcome a better developer > experience. I bet many of them would be willing to contribute both > financially and time to supporting such an effort. I believe the > Architecture of PyPy could lend it self to becoming the core of such a > development system and would allow Python to be used in this space. This > could provide a lot of good PR for the PyPy project. > > Now I'm not saying PyPy shouldn't devote any more time to supporting NumPy > as I'm sure when PyPy has very good support for both NumPy and SciPy it's > going to be a very good day for all Python supporters. I just think that > the PyPy team needs to think about a strategy that in the end will help its > PR and gain support from a much larger community. This project is doing a > lot of good things technically and now it just needs to get the attention of > the development community at large. Now I can't predict if working with the > Parallella project would be the break though in PR that PyPy needs but it's > at least an option that's out there. > > BTW I don't have any commercial interests in the Parallella project. If > some time in the future I use their hardware it would likely be as a > hobbyist and it would be nice to program it in Python. My real objective of > this post to see the PyPy project gain wider interest as it would be a good > thing for Python. 
> > [1] - http://www.parallella.org/ > [2] - http://www.parallella.org/board/ > [3] - http://forums.parallella.org/memberlist.php?mode=viewprofile&u=3344 > [4] - http://forums.parallella.org/viewtopic.php?f=26&t=139 > [5] - > http://www.kickstarter.com/projects/adapteva/parallella-a-supercomputer-for-everyone > > John > > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > http://mail.python.org/mailman/listinfo/pypy-dev > Hi John To answer the question from the forum - the JIT emits assembler (x86, ARM); it does not emit C code. As far as PR is concerned, there is no such thing as the PyPy team meeting and deciding where to go. Everyone works on what they feel like doing where volunteer time is concerned. Obviously things are a little different when there is a commercial interest in something. From my own perspective PyPy should excel at one thing - providing a kick-ass Python VM that's universally fast. We're missing quite a few things (like library support), but things have improved quite drastically, due to things like cffi. The startup time is another one on the list to consider, and it affects ARM even more (since it's slower in general). Cheers, fijal From john.m.camara at gmail.com Sun Feb 3 21:29:22 2013 From: john.m.camara at gmail.com (John Camara) Date: Sun, 3 Feb 2013 15:29:22 -0500 Subject: [pypy-dev] Should jitviewer come with a warning? In-Reply-To: References: Message-ID: > Also, looking at the msgpack - this code is maybe not ideal, but if > you're dealing with buffer-level protocols, you end up with code > looking like C a lot. I do agree that this type of code will likely end up looking like C, but it's not necessary for all of it to look like C. For instance, there shouldn't be a need to have long chains of if/elif statements. Using pack_into and unpack_from instead of the pack and unpack methods would let it deal directly with the buffer instead of making substrings. 
Even if pypy can optimize this away, why write Python code like this when it's not necessary? Plus I felt that initially the code should just use cffi and connect to the native C library. I believe this approach is likely to give very close to the best performance you could get on pypy for this type of library. I'm not sure how much of an increase in performance would be gained by writing the library completely in Python vs using cffi. Is there anything wrong with this line of thinking? Do you feel a pure Python approach could achieve better results than using cffi under pypy? John -------------- next part -------------- An HTML attachment was scrubbed... URL: From fijall at gmail.com Sun Feb 3 21:50:57 2013 From: fijall at gmail.com (Maciej Fijalkowski) Date: Sun, 3 Feb 2013 22:50:57 +0200 Subject: [pypy-dev] Should jitviewer come with a warning? In-Reply-To: References: Message-ID: On Sun, Feb 3, 2013 at 10:29 PM, John Camara wrote: >> Also, looking at the msgpack - this code is maybe not ideal, but if >> you're dealing with buffer-level protocols, you end up with code >> looking like C a lot. > > I do agree that this type a code will likely end up looking like C but it's > not necessary for all of it to look like c. Like there should be a need to > have long chains of if, elif statements. Using pack_into and unpack_from > instead of pack and unpack methods so that it directly deals with the buffer > instead of making sub strings. Even if pypy can optimize this away why > write Python code like this when its not necessary. er. strings are immutable in python. you can't pack into them. other kinds of buffers are kind of dodgy, because python never grew a correct buffer. > > Plus I felt, initially the code should just use cffi and connect to the > native c library. I believe this approach is likely to give very close to > the best performance you could get on pypy for this type of library. 
I'm > not sure how much of an increase in performance would be gain by writing the > library completely in Python vs using cffi. Is there anything wrong with > this line of thinking. Do you feel a pure Python approach could achieve > better results than using cffi under pypy. python is nicer. It does not segfault. Besides, how do you get a string out of a C library? if you do raw malloc it's prone to be bad. Etc. etc. > > John > > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > http://mail.python.org/mailman/listinfo/pypy-dev > From steve at pearwood.info Sun Feb 3 23:39:36 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Mon, 04 Feb 2013 09:39:36 +1100 Subject: [pypy-dev] Should jitviewer come with a warning? In-Reply-To: <20130203192548.3816.313520268.divmod.xquotient.2@localhost6.localdomain6> References: <20130203192548.3816.313520268.divmod.xquotient.2@localhost6.localdomain6> Message-ID: <510EE728.1000608@pearwood.info> On 04/02/13 06:25, exarkun at twistedmatrix.com wrote: > On 05:39 pm, john.m.camara at gmail.com wrote: >> I have been noticing a pattern where many who are writing Python code to >> run on PyPy are relying more and more on using the jitviewer to help them >> write faster code. Unfortunately, many of them who do so don't look at >> improving the design of their code as a way to improve the speed at which >> it will run under PyPy but instead start writing obscure Python code that >> happens to run faster under PyPy. [...] >> Do others here share this same opinion and should some warning be added to >> the jitviewer? > > What makes you think people will even read this warning, let alone prioritize < it over their immediate desire to make their program run faster? 
> > (Not that I am objecting to adding the warning, but I think you might be >fooling yourself if you think it will have any impact) I think that if the coder is actually using some sort of profiling tool, any sort of profiling tool, that makes them 1000 times more likely to read and pay attention to the warning than the average coder who optimizes code by guessing. Other than that observation, I don't have an opinion on whether jitviewer should come with a warning. (Oh, and another thing... I'm assuming you mean for jitviewer to print the warning as part of it's normal output.) -- Steven From fijall at gmail.com Sun Feb 3 23:51:03 2013 From: fijall at gmail.com (Maciej Fijalkowski) Date: Mon, 4 Feb 2013 00:51:03 +0200 Subject: [pypy-dev] Should jitviewer come with a warning? In-Reply-To: <510EE728.1000608@pearwood.info> References: <20130203192548.3816.313520268.divmod.xquotient.2@localhost6.localdomain6> <510EE728.1000608@pearwood.info> Message-ID: On Mon, Feb 4, 2013 at 12:39 AM, Steven D'Aprano wrote: > On 04/02/13 06:25, exarkun at twistedmatrix.com wrote: >> >> On 05:39 pm, john.m.camara at gmail.com wrote: >>> >>> I have been noticing a pattern where many who are writing Python code to >>> run on PyPy are relying more and more on using the jitviewer to help them >>> write faster code. Unfortunately, many of them who do so don't look at >>> improving the design of their code as a way to improve the speed at which >>> it will run under PyPy but instead start writing obscure Python code that >>> happens to run faster under PyPy. > > [...] > >>> Do others here share this same opinion and should some warning be added >>> to >>> the jitviewer? >> >> >> What makes you think people will even read this warning, let alone >> prioritize > > < it over their immediate desire to make their program run faster? 
>> >> >> (Not that I am objecting to adding the warning, but I think you might be >> fooling yourself if you think it will have any impact) > > > > I think that if the coder is actually using some sort of profiling tool, > any sort of profiling tool, that makes them 1000 times more likely to read > and pay attention to the warning than the average coder who optimizes code > by > guessing. > > Other than that observation, I don't have an opinion on whether jitviewer > should come with a warning. > > (Oh, and another thing... I'm assuming you mean for jitviewer to print the > warning as part of it's normal output.) that is definitely a no (my screen is too small to have some noise there, if for no other reason), it might have a warning in the documentation though, if it's any useful. But honestly, I doubt such a warning makes any sense. People who are capable of using jitviewer already "know better". From john.m.camara at gmail.com Mon Feb 4 01:12:01 2013 From: john.m.camara at gmail.com (John Camara) Date: Sun, 3 Feb 2013 19:12:01 -0500 Subject: [pypy-dev] Should jitviewer come with a warning? In-Reply-To: References: Message-ID: > that is definitely a no (my screen is too small to have some noise > there, if for no other reason), it might have a warning in the > documentation though, if it's any useful. But honestly, I doubt such a > warning makes any sense. People who are capable of using jitviewer > already "know better". I agree it should not be part of the normal output. I would say add it to the doc string in app.py and to the README file. As far as people using the jitviewer already "know better". If that's the case I wouldn't have started this thread. Like you said earlier the use of jitviewer is only promoted on irc and yet I have come across 3 people working on different projects who are using it for the wrong reasons over the last 2 weeks. It's like this is the new RPython where people start using it for the wrong reasons. 
-------------- next part -------------- An HTML attachment was scrubbed... URL: From fijall at gmail.com Mon Feb 4 09:42:47 2013 From: fijall at gmail.com (Maciej Fijalkowski) Date: Mon, 4 Feb 2013 10:42:47 +0200 Subject: [pypy-dev] Should jitviewer come with a warning? In-Reply-To: References: Message-ID: On Mon, Feb 4, 2013 at 2:12 AM, John Camara wrote: >> that is definitely a no (my screen is too small to have some noise >> there, if for no other reason), it might have a warning in the >> documentation though, if it's any useful. But honestly, I doubt such a >> warning makes any sense. People who are capable of using jitviewer >> already "know better". > > I agree it should not be part of the normal output. I would say add it to > the doc string in app.py and to the README file. As far as people using the > jitviewer already "know better". If that's the case I wouldn't have started > this thread. Like you said earlier the use of jitviewer is only promoted on > irc and yet I have come across 3 people working on different projects who > are using it for the wrong reasons over the last 2 weeks. It's like this is > the new RPython where people start using it for the wrong reasons. > > Seriously which ones? I think msgpack usage is absolutely legit. You seem to have different opinions about the design of that software, but you did not respond to my concerns even, not to mention the fact that it sounds like it's not "obfuscated by jitviewer". Cheers, fijal From john.m.camara at gmail.com Mon Feb 4 17:28:22 2013 From: john.m.camara at gmail.com (John Camara) Date: Mon, 4 Feb 2013 11:28:22 -0500 Subject: [pypy-dev] Should jitviewer come with a warning? In-Reply-To: References: Message-ID: On Mon, Feb 4, 2013 at 3:42 AM, Maciej Fijalkowski wrote: > Seriously which ones? I think msgpack usage is absolutely legit. 
You > seem to have different opinions about the design of that software, but > you did not respond to my concerns even, not to mention the fact that > it sounds like it's not "obfuscated by jitviewer". > > Cheers, > fijal > First I would have tried using cffi to the msgpack c library. If I wasn't happy with it I would do a Python port. So for now let's forget about cffi and just deal with the current design of this library. I had tried to minimize the discussion about this library on this forum as I had already written extensive comments on the original blog [1]. Now, I didn't do an extensive review of the code, as I only concentrated on a small portion of it, namely the area of unpacking the msgpack messages. I'll just highlight a couple of concerns I had. The first thing that shocked me was the use of the struct.pack and struct.unpack functions. Normally, when you need to pack and unpack often with the same format, you would create a struct.Struct object with the desired format and use this object with its pack and unpack methods. That way the format string is not parsed on every call but only once, when the Struct object is created. As Bas pointed out, pypy is able to optimize the parsing of the format, which is great, but why would you prefer to write code that would run with horrible performance under CPython when there is an alternative available? Now, toward the end of the comments on the blog, Bas stated he tried the struct object under pypy and found it ran slower. So there is likely an opportunity for pypy to add another optimization, since if pypy can optimize the struct functions it should be able to handle the struct objects, which I would think would be an easier case to handle, purely looking at it from a high level perspective. Another issue I had is that the msgpack spec is designed in a way to minimize the need to copy data. That is, you should be able to just use the data directly from the message buffers. 
The normal way to do this with the struct module is to use the unpack_from and pack_into methods instead of the pack and unpack methods. These methods take a buffer and an offset, as opposed to pack and unpack, which would require you to slice out a copy of the original buffer to pass into the unpack method. As Bas pointed out, again pypy is able to optimize away this copy created from slicing, which is great, but again why code it in a way that will be slow on CPython when there is an alternative? The other issue I mentioned on the blog was the large number of if, elif statements used to handle each type of msgpack message. I instead suggested creating essentially a list that holds references to struct objects so that the message type would be used as an index into this list. That way you remove all the if, elif statements and end up with something like struct_objects[message_type].unpack_from(). Now, I understand that pypy is able to optimize all these if and elif statements by creating bridges for the various paths through this code, but again why code it this way when it will be slow on CPython? I would also assume that using the if/elif statements would still have more overhead in pypy compared to using a list of references, although maybe there is not much of a difference. Anyway, these are just the issues I saw with this library, which by the way is nowhere near as bad as other code I have seen written as a result of users using the jitviewer. Unfortunately, I could not discuss these other projects as they are closed source. Anyway, to get to the other part of your reply, I assume not responding to your concerns is about the following: "python is nicer. It does not segfault. Besides, how do you get a string out of a C library? if you do raw malloc it's prone to be bad. Etc. etc." Sorry, that was an oversight. I feel the same way about Python, but what's the real issue with taking the practical approach of using a C library that is written well and is robust? 
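To make the suggestion concrete, the struct.Struct plus dispatch-table idea described above can be sketched roughly like this. Note that the formats and type codes here are invented purely for illustration; they are not the real msgpack type codes:

```python
import struct

# Precompiled Struct objects: the format string is parsed once, here,
# instead of on every pack/unpack call.  Formats are hypothetical:
# ">B" unsigned byte, ">I" big-endian uint32, ">d" big-endian float64.
UINT8 = struct.Struct(">B")
UINT32 = struct.Struct(">I")
FLOAT64 = struct.Struct(">d")

# A dispatch table replaces the if/elif chain: the message type is used
# directly as an index into a list of Struct objects.
UNPACKERS = [UINT8, UINT32, FLOAT64]

def read_value(buf, offset, message_type):
    s = UNPACKERS[message_type]
    # unpack_from reads straight out of the buffer at the given offset,
    # avoiding the intermediate copy that unpack(buf[offset:offset+n])
    # would make.
    (value,) = s.unpack_from(buf, offset)
    return value, offset + s.size

payload = UINT32.pack(1234)
value, next_offset = read_value(payload, 0, 1)
print(value, next_offset)  # -> 1234 4
```

This shape runs at a reasonable speed on CPython too, since neither the format parsing nor the buffer slice happens per message.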
I would love to see everything written in Python, but who has the time to port everything over? The msgpack C library would have the responsibility of maintaining the buffers. Its API supports creating and freeing these buffers. The msgpack library would be doing most of the work, and the only data that has to go back and forth between the Python code and the library are just basic types like int, float, double, strings, etc. To get a string out of the C library, just slice a cffi.buffer to create a copy of it in Python before calling the function to clear the msgpack buffer. With cffi, this slicing to create copies of strings in Python and the overhead of calling into the C functions do add extra work over code written purely in Python, assuming pypy has all the optimizations in place to match the performance of the msgpack C library. The question is how much overhead cffi really adds in this use case, and whether it is worth doing the Python port to remove that overhead. I don't know the answer to this question. It would require profiling both cases. [1] - http://blog.affien.com/archives/2013/01/29/msgpack-for-pypy/ John -------------- next part -------------- An HTML attachment was scrubbed... URL: From fijall at gmail.com Mon Feb 4 22:22:28 2013 From: fijall at gmail.com (Maciej Fijalkowski) Date: Mon, 4 Feb 2013 23:22:28 +0200 Subject: [pypy-dev] Should jitviewer come with a warning? In-Reply-To: References: Message-ID: On Mon, Feb 4, 2013 at 6:28 PM, John Camara wrote: > On Mon, Feb 4, 2013 at 3:42 AM, Maciej Fijalkowski wrote: > >> >> Seriously which ones? I think msgpack usage is absolutely legit. You >> seem to have different opinions about the design of that software, but >> you did not respond to my concerns even, not to mention the fact that >> it sounds like it's not "obfuscated by jitviewer". >> >> Cheers, >> fijal > > First I would have tried using cffi to the msgpack c library. 
If I wasn't > happy with it I would do a Python port. So for no lets forget about cffi > and just deal with the current design of this library. > > I had tried to minimize the discussion about this library on this forum as I > had already wrote extensive comments on the original blog [1]. Now I didn't > do an extensive review of the code as I only concentrated on a small portion > of it namely in the area of unpacking the msgpack messages. I'll just > highlight a couple of concerns I had. > > The first thing the shocked me was the use of the struct.pack and > struct.unpack functions. Normally when you need to pack and unpack often > with the same format you would create a struct object with the desired > format and use this object with its pack and unpack methods. That way the > format string is not always being parsed but instead once when the struct > object is created. > > As Bas pointed out pypy is able to optimize the parsing of the format which > is great but why would you prefer to write code that would run with horrible > performance under CPython when there is an alternative available. Now > toward the end of the comments on the blog, Bas stated he tried the struct > object under pypy and found it ran slower. So there is likely an > opportunity for pypy to add another optimization as if pypy can optimize the > struct functions it should be able to handle the struct objects which I > would think would be an easier case to handle purely looking at it from a > high level perspective. It's a fallback for PyPy, so CPython speed is irrelevant. Also CPython has tons of weird quirks and "faster for PyPy and slower for CPython" is not always a bad thing. Personally I don't care. This particular example however should be reported as a bug in PyPy - using Struct is *nicer*, so it should be as fast (and there is no good reason why not). > > Another issue I had was the msgpack spec is designed in a way to minimize > the need of copying data. 
That is you should be able to just use the data > directly from the message buffers. The normal way to do this with the > struct module is to use the unpack_from and pack_into methods instead of the > pack and unpack methods. These methods take a buffer and an offset as > opposed to the pack and unpack which would require you to slice out a copy > of the original buffer to pass it in the unpack method. As Bas pointed out > again pypy is able to optimize this copy created from slicing away which is > great but again why code it in a way that will be slow on CPython when there > is an alternative. Python buffer support sucks. For example you don't get a string out (because strings are immutable). PyPy buffer support double sucks, because buffer protocol is broken and we also didn't care. Fortunately we're able to optimize string slicing here (strings are nicer than buffers or bytearrays to play with), but we should fix buffers. Sorry about that. Again, the CPython speed does not apply. > > The other issue I mentioned on the blog was the large number of if, elif > statements used to handle each type of msgpack message. I instead suggested > creating essentialy a list that holds references to struct objects so that > the message type would be used as in index into this list. So that way you > remove all the if, elif statements and end up with something like > > struct_objects[message_type].unpack_from() Lack of constant propagation. Again, a potential bug in PyPy, but a hard one. > > Now I understand that pypy is able to optimize all these if and elif > statements by creating bridges for the various paths through this code but > again why code it this way when it will be slow on CPython. I would also > assume that using the if elif statements would still have more overhead in > pypy compared to using a list of references although maybe there is not much > of a difference. 
It's not about if/elif or references (all those things are incredibly cheap), but about constant propagation. Notably, determining that a format is constant. This would disappear if we fix Struct (it's an easy fix, a few hours of work for someone not experienced with PyPy). > > Any way this is just the issues I saw with this library which by the way is > no where near as bad as other code I have seen written as a result of users > using the jitviewer. Unfortunately, I could not discuss these other > projects as they are closed source. And we're unable to help you because of that. > > Any way to get to the other part of you reply I assume not responding to > your concerns is about the following > > "python is nicer. It does not segfault. Besides, how do you get a > string out of a C library? if you do raw malloc it's prone to be bad. > Etc. etc." > > Sorry that was an over sight. I feel the same way about Python but what's > the real issue of taking the practical approach of using a c library that is > written well and is robust. I would love to see everything written in > Python but who has the time to port everything over. If you're dealing with data coming from the outside, using Python over a C lib sounds like a very sensible idea security-wise. I can't blame anyone here. I would do the same (given that the protocol is simple enough as well). Cheers, fijal From arigo at tunes.org Tue Feb 5 15:47:28 2013 From: arigo at tunes.org (Armin Rigo) Date: Tue, 5 Feb 2013 15:47:28 +0100 Subject: [pypy-dev] Parallella open hardware platform In-Reply-To: References: Message-ID: Hi John, Sorry if I misread you, but you seem to be only saying "it would be nice if the PyPy team worked on the support for rather than ". While this might be true under some point of view, it is not constructive. What would be nice is if *you* seriously proposed to work on , or helped us raise commercial interest, or otherwise contributed towards . 
If you're not up to it, and nobody steps up, then it's the end of the story (but thanks anyway for the nice description of Parallella). À bientôt, Armin. From john.m.camara at gmail.com Tue Feb 5 19:25:16 2013 From: john.m.camara at gmail.com (John Camara) Date: Tue, 5 Feb 2013 13:25:16 -0500 Subject: [pypy-dev] Parallella open hardware platform In-Reply-To: References: Message-ID: Hi Armin, It's even worse: I'm asking you to support and I don't even need it. When I posted this thread it was getting rather long and unfortunately I didn't really make all the points I wanted to make. At this point, and even for some time now, PyPy has a great foundation but its use remains low. Every now and then it's good to step back a little bit, reflect on the current situation, and come up with a strategy that helps the project's popularity grow. I know that PyPy has done things to help with the growth, such as writing blog posts, being quick to fix bugs, helping others with their performance issues and even rapidly adding optimizations to PyPy, presenting at conferences, and often actively commenting on any posts or comments made about PyPy. So PyPy is doing a lot of things right to help its PR, but yet there is this issue of slow growth. Now, we know the main issue with its growth is the fact that the Python ecosystem relies on a lot of libraries that use the CPython API, and PyPy just doesn't have full support for this interface. I understand the reasons why PyPy is not going to support the full interface, and PyPy has come up with the cffi library as a way to bridge the gap. And of course I don't expect the PyPy project to take on the responsibility of porting all the popular 3rd party libraries that use the CPython API to cffi. It's going to have to be a community effort. One thing that could help would be more marketing of cffi, as very few Python developers know it exists. But that alone is not going to be enough. 
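For anyone who hasn't come across cffi yet, here is a minimal sketch of what its ABI mode looks like, calling into the standard C library on a POSIX system (strlen is just a stand-in for whatever third-party library you actually want to wrap):

```python
import cffi

ffi = cffi.FFI()
# Declarations are pasted from the C headers; cffi parses them at runtime.
ffi.cdef("size_t strlen(const char *s);")
libc = ffi.dlopen(None)  # None loads the standard C library (POSIX only)

print(libc.strlen(b"pypy"))  # -> 4

# Getting a string back out of C safely: ffi.string() copies the
# NUL-terminated data into a Python-owned bytes object, so Python code
# never holds a pointer into memory the C side might free.
buf = ffi.new("char[]", b"hello from C land")
print(ffi.string(buf))
```

The same pattern works against any shared library by passing its name to ffi.dlopen() instead of None, which is why it is pitched as the bridge for libraries that currently depend on the CPython API.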
History tells us that most successful products/projects that become popular do so by first supporting the needs of some niche market. As time goes by, that niche market starts providing PR that helps other markets discover the product/project, and the cycle can sometimes continue until there is mass adoption. Now, when PyPy started to place a focus on NumPy I had hoped that the market it serves would turn out to be the market that would help PyPy grow. But at this point in time it does not appear like that is going to happen. For a while I have been trying to think of a niche market that may be helpful. But to do so you have to consider the current state of PyPy, which means eliminating markets that heavily rely on libraries that use the CPython API; I'm also going to avoid the NumPy market as that's currently being worked on; there is the mobile market, but that's a tough one to get into; maybe the gaming market could be a good one; etc. It turns out that with the current state of PyPy many markets need to be eliminated if you're looking for one that is going to help with growth. The Parallella project, on the other hand, looks like it could be a promising one, and I'll share some thoughts a little later in this post as to why I feel this way. Right now you have been putting a lot of effort into STM, in which you're trying to solve what is likely the biggest challenge that the developer community is facing. That is, how to write software that effectively leverages many cores in a way that is straightforward and in the spirit of Python. When you solve this problem, and I have faith that you will, most would think that it would cause PyPy's popularity to skyrocket. What most likely will happen is that PyPy gets a temporary boost in popularity, as there is another lesson in history to be concerned about. Often the first to solve a problem does not become popular in the long run. 
Usually the first to solve the problem does so via a unique solution, but once people start using it, issues with the approach get discovered. Then often many others will use the original solution as a starting point and modify it to eliminate these new issues. Then one of the second-generation solutions ends up being the de facto standard. Now, PyPy is able to move fairly quickly in terms of implementing new approaches, so it may in fact be able to compete just fine against other 2nd generation solutions. But there may be some benefits to exposing STM to a smaller market, to help PyPy buy some additional time before releasing it as a solution for the general developer community. So why the Parallella project? Well, I think it can be helpful in a number of ways. First, I don't believe that this market is going to need much from the libraries that use the CPython APIs. Many who are in this market are used to having to program for embedded systems and are more likely to have the skills to help out the PyPy project in a number of areas, and would likely also have a financial incentive to contribute back to PyPy, such as helping keep various back ends up to date, such as ARM, PPC, and additional architectures. Some in this market are used to using a number of graphical languages to program their devices, but unfortunately for them some of the new products that need to enter the market can't be built fully with these graphical languages. Well, with the PyPy framework it's possible for them to implement a VM for that graphical language and be able to create products that contain elements programmed in both the graphical languages as well as text-based languages. Also, the VMs on many embedded systems are typically simple and don't have a JIT. PyPy can help with this, but I don't believe anyone who maintains these VMs is aware of the PyPy project. 
As far as STM is concerned, working with embedded systems will force finding solutions to the many issues that arise with various hardware architectures, which would help STM become a more general solution. Right now you're writing STM in a way that will support multiple cores on a single processor well. I know you have to start somewhere. But soon you will have to deal with issues that arise once you span multiple processors, such as dealing more often with the slower L3 cache and its sync issues, and local vs remote memory issues. But on the embedded side you have to deal with processors of multiple architectures on the same system, plus FPGAs, as well as having to consider the various issues that arise from the various buses involved, which makes the STM problem quite a bit harder in terms of how it gets optimized to handle all these variations. Of course, many of these same issues exist if you want to have STM support GPUs in a normal computing device. The embedded side just adds additional complications, as they come in more complex configurations. The Raspberry Pi has become popular, as many want to hack on these devices, and the Raspberry Pi happens to be the first device that is both cheap and allows programming at a high level. Previously, if you wanted cheap it meant you needed to program using a low-level approach, or you had to buy an expensive solution to program at a high level. Many who get interested in the Raspberry Pi soon find themselves in the position where they have an idea and want to create a product to sell. But they realize you can't use a Raspberry Pi for production, as it is missing many features that would be required; they also like the idea of programming at a high level, but the traditional embedded systems that support this may be too expensive for their product. That's where the Parallella project comes into play. They see there is a market for low-cost devices that can be programmed with higher-level tools to build production systems. 
This market values programming at a high level and would highly appreciate being able to program these devices in Python. They also have a need to support multiple cores and thus could use STM, and it would be incredibly useful if the STM approach could seamlessly support multiple architectures. There is a lot of value here for the companies that want to produce these devices, and PyPy should try to tap into it. This new market segment using these low-cost devices is going to have a large impact and also will play a role in the manufacturing revolution that is about to take place. This manufacturing revolution is likely to be on the same scale as the Internet revolution. Just think about the effect 3D printing is going to have. It will be huge. PyPy getting a foothold in this market before it takes off would be huge for PyPy, as well as for Python in general. Also, there are some big players who currently sell these more expensive embedded systems who are not going to be happy about these cheaper alternatives and are also going to want a piece of the action. I think many of them who may not be able to quickly change their development and run-time processes may decide it's much easier for them to port their VMs over to PyPy to get into the action. Hopefully this gives some better insight as to why I feel it may be a good strategy to consider supporting the Parallella project. The possibility of getting a foothold in a market that is about to take off doesn't come around too often. All I know is, if PyPy would like to support this market, right now is the best time to get started. This might be the ticket PyPy needs to get its growth up, which could then lead to additional markets taking notice and more of the Python ecosystem becoming compatible with PyPy. Of course this is just my opinion, and maybe someone else could come up with another strategy that can help PyPy grow faster. Even an Open Source project can use a strategy. 
John On Tue, Feb 5, 2013 at 9:47 AM, Armin Rigo wrote: > Hi John, > > Sorry if I misread you, but you seem to be only saying "it would be > nice if the PyPy team worked on the support for rather than ". > While this might be true under some point of view, it is not > constructive. What would be nice is if *you* seriously proposed to > work on , or helped us raise commercial interest, or otherwise > contributed towards . If you're not up to it, and nobody steps up, > then it's the end of the story (but thanks anyway for the nice > description of Parallella). > > > A bientôt, > > Armin. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From matti.picus at gmail.com Tue Feb 5 22:34:09 2013 From: matti.picus at gmail.com (Matti Picus) Date: Tue, 05 Feb 2013 23:34:09 +0200 Subject: [pypy-dev] win32 own test failures Message-ID: <51117AD1.7060609@gmail.com> Many of the jit.backend.x86 tests are failing. I am willing to put time into solving this but have little idea where to start. Can someone give me the end of a string to pull?
Here are the last few lines of test_float; the failure is an IndexError, so apparently deadframe.jf_values is empty. Matti

[llinterp:error] | File "c:\Users\matti\pypy_stuff\pypy\rpython\jit\metainterp\warmstate.py", line 322, in execute_assembler
[llinterp:error] | fail_descr.handle_fail(deadframe, metainterp_sd, jitdriver_sd)
[llinterp:error] | File "c:\Users\matti\pypy_stuff\pypy\rpython\jit\metainterp\compile.py", line 537, in handle_fail
[llinterp:error] | resume_in_blackhole(metainterp_sd, jitdriver_sd, self, deadframe)
[llinterp:error] | File "c:\Users\matti\pypy_stuff\pypy\rpython\jit\metainterp\blackhole.py", line 1558, in resume_in_blackhole
[llinterp:error] | all_virtuals)
[llinterp:error] | File "c:\Users\matti\pypy_stuff\pypy\rpython\jit\metainterp\resume.py", line 1041, in blackhole_from_resumedata
[llinterp:error] | resumereader.consume_one_section(curbh)
[llinterp:error] | File "c:\Users\matti\pypy_stuff\pypy\rpython\jit\metainterp\resume.py", line 1083, in consume_one_section
[llinterp:error] | self._prepare_next_section(info)
[llinterp:error] | File "c:\Users\matti\pypy_stuff\pypy\rpython\jit\metainterp\resume.py", line 759, in _prepare_next_section
[llinterp:error] | self.unique_id) # <-- annotation hack
[llinterp:error] | File "c:\Users\matti\pypy_stuff\pypy\rpython\jit\codewriter\jitcode.py", line 149, in enumerate_vars
[llinterp:error] | callback_f(index, self.get_register_index_f(i))
[llinterp:error] | File "c:\Users\matti\pypy_stuff\pypy\rpython\jit\metainterp\resume.py", line 771, in _callback_f
[llinterp:error] | value = self.decode_float(self.cur_numb.nums[index])
[llinterp:error] | File "c:\Users\matti\pypy_stuff\pypy\rpython\jit\metainterp\resume.py", line 1272, in decode_float
[llinterp:error] | return self.cpu.get_latest_value_float(self.deadframe, num)
[llinterp:error] | File "c:\Users\matti\pypy_stuff\pypy\rpython\jit\backend\llsupport\llmodel.py", line 290, in get_latest_value_float
[llinterp:error] | return deadframe.jf_values[index].float
[llinterp:error] | File "c:\Users\matti\pypy_stuff\pypy\rpython\rtyper\lltypesystem\lltype.py", line 1181, in __getitem__
[llinterp:error] | raise IndexError("array index out of bounds")
[llinterp:error] | IndexError: array index out of bounds
[llinterp:error] `------>
[llinterp:traceback] f() rpython.jit.metainterp.test.test_ajit
[llinterp:traceback] v4 = jit_marker(('can_enter_jit'), (), x_1, y_2, res_0)
[llinterp:traceback] E v5 = direct_call((<* fn ll_portal_runner>), x_1, y_2, res_0)
==================== short test summary info ====================
FAIL rpython/jit/backend/x86/test/test_basic.py::TestBasic::()::test_float
==================== 169 tests deselected by '-ktest_float' ====================
==================== 1 failed, 1 passed, 169 deselected in 4.32 seconds ====================

From matti.picus at gmail.com Tue Feb 5 23:51:50 2013 From: matti.picus at gmail.com (Matti Picus) Date: Wed, 06 Feb 2013 00:51:50 +0200 Subject: [pypy-dev] win32 own test failures In-Reply-To: <51117AD1.7060609@gmail.com> References: <51117AD1.7060609@gmail.com> Message-ID: <51118D06.6000908@gmail.com> An HTML attachment was scrubbed... URL: From fijall at gmail.com Wed Feb 6 00:26:31 2013 From: fijall at gmail.com (Maciej Fijalkowski) Date: Wed, 6 Feb 2013 01:26:31 +0200 Subject: [pypy-dev] win32 own test failures In-Reply-To: <51118D06.6000908@gmail.com> References: <51117AD1.7060609@gmail.com> <51118D06.6000908@gmail.com> Message-ID: On Wed, Feb 6, 2013 at 12:51 AM, Matti Picus wrote: > fwiw, this happened on the remove-globals-in-jit branch and began occurring > soon after the first commits on the branch, after changeset 8c87151e76f0 on > that branch the test passed on 64 bit linux.
> Matti it's probably already fixed on jitframe-on-heap, which we aim to merge > > On 5/02/2013 11:34 PM, Matti Picus wrote: > > many of the jit.backend.x86.test are failing, I am willing to put time into > solving this but have a bit of no idea where to start. > Can someone give me the end of a string to pull > > > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > http://mail.python.org/mailman/listinfo/pypy-dev > From fijall at gmail.com Wed Feb 6 12:11:04 2013 From: fijall at gmail.com (Maciej Fijalkowski) Date: Wed, 6 Feb 2013 13:11:04 +0200 Subject: [pypy-dev] Parallella open hardware platform In-Reply-To: References: Message-ID: Hi John. Let me summarize your long post as I understood it: "You guys should bet everything on a platform that both does not need PyPy and has expressed no real interest. The reason why is because PyPy is not growing fast enough and we need a niche market. On top of that we should answer a lot of unanswered questions, like memory and warmup requirements on embedded devices". So, I think you're wrong in very many regards here. I think we should try to excel at providing a kick-ass Python VM, but also I have seriously no say in what people work on (except me). We already have some niche markets, notably people who are willing to invest in R&D and need serious power (but are unable or unwilling to use C or C++ for that). You just don't know about it, because those are typically not people writing blog posts. Having a dedicated web stack is another good step and we'll eventually get there. I don't know why you think this particular niche market is better than any other, but it really does not matter all that much. There is no way you can convince people to do something else in their volunteer time than what they already feel like doing.
Things you can do if you're interested: * do the work yourself * work with the Parallella project to have first-class PyPy support if they care about performance * spark commercial interest however, trying to convince volunteers that they should do what you think they should do is not really one of the helpful things you can be doing. Cheers, fijal From amauryfa at gmail.com Wed Feb 6 12:36:36 2013 From: amauryfa at gmail.com (Amaury Forgeot d'Arc) Date: Wed, 6 Feb 2013 12:36:36 +0100 Subject: [pypy-dev] Parallella open hardware platform In-Reply-To: References: Message-ID: 2013/2/6 Maciej Fijalkowski > however, trying to convince volunteers that they should do what you > think they should do is not really one of the helpful things you can > be doing. > Except if this brings *new* volunteers to the project :-) -- Amaury Forgeot d'Arc -------------- next part -------------- An HTML attachment was scrubbed... URL: From arigo at tunes.org Wed Feb 6 13:13:44 2013 From: arigo at tunes.org (Armin Rigo) Date: Wed, 6 Feb 2013 13:13:44 +0100 Subject: [pypy-dev] Parallella open hardware platform In-Reply-To: References: Message-ID: Hi John, Thanks for your lengthy analysis. I'm sure that it can be interesting for some to read. Unfortunately, I'm personally an Open Source hobbyist that happens to come from a university background and I'm still attached to some ideas behind it. You say about my hacking STM: "Often the first to solve a problem does not become popular in the long run". That is true, and I have no problem with that. My guess is that in the end STM will end up being common in programming languages. So I would like to help along the way --- by showing that it works in complicated languages like Python, using the unlimited flexibility of Software TM rather than as an exercise to fit it around some Hardware TM. It would be nice if PyPy also becomes the de-facto 2nd-generation standard, but that's less realistic --- and not a problem for me.
My goal is *not* to write and sell the final product. What would also be nice is if this final product was Python, but unfortunately, it seems unlikely at this point that CPython will ever convert to STM. I guess that besides PyPy, Python as a whole will lag behind, and likely only end up using some HTM solution in 10-15 years when it's fully ready. (I consider the HTM that we have this year as preliminary at best.) That is my current analysis on the future of STM. It doesn't include huge monetary benefits for PyPy :-) but it doesn't change anything about my own research motivation: 1st-generation research, as you call it. Obviously, PyPy as a whole is such a 1st-generation project. What I would actually like a lot is to see the emergence of other 2nd-generation platforms that apply the same principles as PyPy --- for example, it would be a first step to see an efficient JavaScript JIT compiler not manually written from scratch. A bientôt, Armin. From skip at pobox.com Wed Feb 6 22:10:09 2013 From: skip at pobox.com (Skip Montanaro) Date: Wed, 6 Feb 2013 15:10:09 -0600 Subject: [pypy-dev] Cheetah on PyPy? Message-ID: I'm slowly working through some little tests which don't require any of our libraries at work. My current test is a script which uses the Cheetah template engine. I see that it is buried somewhere in PyPy's tests (inside genshi?), but my straightforward install of Cheetah 2.4.4 doesn't work because the str() of a compiled template doesn't do any of the required template expansion. It just returns what looks suspiciously like the repr() of the object. Is there a modified version of Cheetah somewhere (or patch) which coaxes it to work with PyPy? Failing that, where is the test suite code? I see no references to it as a standalone download, and am currently working with a binary build of 1.9.
Thx, Skip Montanaro From fijall at gmail.com Wed Feb 6 22:52:01 2013 From: fijall at gmail.com (Maciej Fijalkowski) Date: Wed, 6 Feb 2013 23:52:01 +0200 Subject: [pypy-dev] Cheetah on PyPy? In-Reply-To: References: Message-ID: On Wed, Feb 6, 2013 at 11:10 PM, Skip Montanaro wrote: > I'm slowly working through some little tests which don't require and > of our libraries at work. My current test is a script which uses the > Cheetah template engine. I see that it is buried somewhere in PyPy's > tests (inside genshi?), but my straightforward install of Cheetah > 2.4.4 doesn't work because the str() of a compiled template doesn't do > any of the required template expansion. It just returns what looks > suspiciously like the repr() of the object. > > Is there a modified version of Cheetah somewhere (or patch) which > coaxes it to work with PyPy? Failing that, where is the test suite > code? I see no references to it as a standalone download, and am > currently working with a binary build of 1.9. > > Thx, > > Skip Montanaro > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > http://mail.python.org/mailman/listinfo/pypy-dev It seems Cheetah is using a strange C extension called nameMapper. I can bet upfront that this is the part that does not work at all. It has a Python fallback, but it does not seem to work the same way. Cheers, fijal From amauryfa at gmail.com Wed Feb 6 22:54:42 2013 From: amauryfa at gmail.com (Amaury Forgeot d'Arc) Date: Wed, 6 Feb 2013 22:54:42 +0100 Subject: [pypy-dev] Cheetah on PyPy? In-Reply-To: References: Message-ID: 2013/2/6 Skip Montanaro > I'm slowly working through some little tests which don't require and > of our libraries at work. My current test is a script which uses the > Cheetah template engine.
I see that it is buried somewhere in PyPy's > tests (inside genshi?), but my straightforward install of Cheetah > 2.4.4 doesn't work because the str() of a compiled template doesn't do > any of the required template expansion. It just returns what looks > suspiciously like the repr() of the object. > > Is there a modified version of Cheetah somewhere (or patch) which > coaxes it to work with PyPy? Failing that, where is the test suite > code? I see no references to it as a standalone download, and am > currently working with a binary build of 1.9 > It's due to a small incompatibility between CPython and PyPy. You are right that __str__ is not correctly set on Cheetah templates; this is because of this statement in Cheetah/Template.py: concreteTemplateClass.__str__ is object.__str__ A fix is to replace the "is" operator with "==". This is correctly covered by the tests in Cheetah/Tests/Template.py (just run this file with PyPy). The explanation originates from a CPython oddity (IMHO): CPython has no "unbound built-in methods" for C types, and object.__str__ yields the same object every time. This is not the case for user-defined classes; for example, "type(help).__repr__ is type(help).__repr__" is False. PyPy is more regular here, and has real unbound built-in methods. On the other hand this means that the "is" comparison should not be used, even for built-in methods. A similar pattern existed in the stdlib pprint.py, and we chose to fix the module. -- Amaury Forgeot d'Arc -------------- next part -------------- An HTML attachment was scrubbed... URL: From skip at pobox.com Wed Feb 6 23:32:47 2013 From: skip at pobox.com (Skip Montanaro) Date: Wed, 6 Feb 2013 16:32:47 -0600 Subject: [pypy-dev] Cheetah on PyPy? In-Reply-To: References: Message-ID: > You are right that __str__ is not correctly set on Cheetah templates, this > is because of this statement in Cheetah/Template.py: > concreteTemplateClass.__str__ is object.__str__ > A fix is to replace the "is" operator by "==".
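The incompatibility Amaury describes can be sketched with a minimal, hypothetical class (this is not Cheetah's actual code; the class name is invented for illustration). On CPython, every lookup of object.__str__ yields the same object, so an identity test happens to pass; on PyPy, a lookup of a built-in method may produce a fresh method object, so only the equality comparison is reliable:

```python
# Hypothetical stand-in for a template class that does NOT override
# __str__; the name "ConcreteTemplate" is made up for this sketch.
class ConcreteTemplate(object):
    pass

# Cheetah's original check, roughly: "did the class leave __str__ alone?"
# Relying on identity here is CPython-specific behaviour.
identity_check = ConcreteTemplate.__str__ is object.__str__

# The portable form Amaury suggests: equality instead of identity.
equality_check = ConcreteTemplate.__str__ == object.__str__

# On CPython both are True; on PyPy only equality_check is guaranteed True.
print(identity_check, equality_check)
```

The same reasoning is behind the pprint.py change Amaury mentions: comparing built-in methods with "is" silently encodes a CPython implementation detail.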
> This is correcly covered by the tests in Cheetah/Tests/Template.py (just run > this file with pypy) Thanks. I'll make that change and send it back upstream to the Cheetah folks. Skip From john.m.camara at gmail.com Thu Feb 7 05:41:43 2013 From: john.m.camara at gmail.com (John Camara) Date: Wed, 6 Feb 2013 23:41:43 -0500 Subject: [pypy-dev] Parallella open hardware platform In-Reply-To: References: Message-ID: Fijal, In the past you have complained about it being hard to make money in open source. One way to make it easier for you is to grow the popularity of PyPy. So I would think you would at least have some interest in thinking of ways to accomplish that. I'm not trying to dictate what PyPy should do, but merely providing an opinion of mine that I see an opportunity that potentially could be a great thing for PyPy. A year ago if someone had asked me if PyPy should support embedded systems I would have given a firm no, but I see the market changing in ways I didn't expect. The people hacking on these devices are fairly similar to open source developers, and in some cases they even do open source development. They do things differently from the establishment, which has provided a new way to think about manufacturing. Their ways are so different from the establishment, and have become such a game changer, that they have ignited what is becoming a manufacturing revolution. Now, because many who are involved in hacking with this hardware have no prior experience with the established ways of doing this type of business, they are moving in directions that differ in how these devices get programmed. They are also in need of tools and new infrastructure, and I feel that what PyPy has to offer can give them a starting point. Now at the end of the day I don't believe many of their requirements are going to be much different from the requirements of other markets, and not likely too different from the direction PyPy will likely take. So why not go where all the big money is going to be.
OK, enough of that. Let's take a look at your example of a web stack. I believe right now PyPy is in a position to be used in this market. Sure, PyPy could use some additional optimizations to improve the situation, but I think in general it's already able to kick ass compared to CPython in terms of performance when a light web framework is used, which is becoming increasingly popular as web apps push the front ends to do most of the layout/presentation work. Also, with the web becoming more dynamic and the number of requests increasing at a substantial rate, it becomes more important to reduce latencies, which tends to give PyPy an advantage. This is all great while the web stacks are running on traditional servers, but servers are changing. There are some servers being sold today that have hundreds of small cores, and in the not too distant future there will be systems that have a number of full cores and a much larger number of smaller cores, which may or may not have similar architectures. For instance, servers with Phi coprocessors (8 GB of memory, 60 1 GHz cores with, I believe, 4 threads each, and a PCIe3 interface) have recently become available. How is PyPy going to handle this? Is this any different from the needs of the embedded systems? No. PyPy is going to have to start paying attention to how data is accessed and will have to make optimizations based on the access patterns. That is, you have to make sure computational loads can offset the data transfer overhead. Today PyPy does not take this overhead cost into account, which is not required when running on one core. For a web application it would be nice to run multiple sessions on a given core, save session-related data locally to that core so as to minimize data transfer to the smaller cores (which means directing all requests for a session to the same core), do any necessary encryption on these small cores, etc.
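The session-affinity idea above (send every request for a given session to the same small core, so the session data cached there stays local) can be sketched with a tiny routing helper. This is purely illustrative: the core count and the function name are made up, and nothing here is a real PyPy or Phi API.

```python
# Illustrative session-to-core routing: a stable hash of the session id
# picks one of the small cores, so repeated requests for the same session
# always land on the same worker and can reuse its locally cached data.
import hashlib

NUM_SMALL_CORES = 60  # made-up figure, e.g. one worker per coprocessor core

def core_for_session(session_id):
    # hashlib gives a digest that is stable across processes, unlike the
    # built-in hash(), which is randomized per interpreter run.
    digest = hashlib.sha1(session_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % NUM_SMALL_CORES

# The same session always maps to the same core:
assert core_for_session("sess-42") == core_for_session("sess-42")
print(core_for_session("sess-42"))
```

A front-end balancer using such a mapping gives the affinity John describes without any shared routing state, at the cost of uneven load when a few sessions are much hotter than the rest.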
But there may also be some work for a particular request which might not be appropriate to run on a small core and may have to run on the main core, maybe due to it requiring access to too much data. How is this going to work? Is PyPy going to do all the analysis itself, or will the programmer provide some hints to PyPy as to how to break up the work? Who is going to be responsible for the scheduling and for cleaning up the session data that is cached locally to the cores, and a boatload of other issues? I'm not sure; it's a tough problem, and one that is just around the corner. Another option would be to run an HTTP load balancer on the main cores, with PyPy web stacks running on, say, dedicated Phi cores, and the HTTP requests forwarded over the PCIe bus. That way each Phi core acts like an independent web server. But running 60-240 PyPy processes in 8GB of memory is quite the challenge. Maybe some sort of PyPy hypervisor that is able to run virtualized PyPy instances, so that each instance can share all the JITed code but have its own data. I'm sure many issues and questions exist, like who would do the JITting, the hypervisor or the virtualized PyPy instances? Now even if you feel right now is not the time to start worrying about these new server architectures, there are still other issues PyPy will start to run into in the web stack market. Typically, for a web application that is being accessed from the Internet, there is a certain amount of latency that is acceptable. But what happens when the same web stack technology is deployed in local environments (i.e. on a LAN) with heavy dynamic requests, some requiring near-real-time performance? When operating in a networked environment with low latencies, people are going to expect more from web servers (actually, not just people, but systems talking to other systems will require it). This ends up being a problem for Python in general, as the garbage collector is going to be an issue.
This is going to require a concurrent garbage collector. The concurrent garbage collector is also needed by the embedded market, as well as the gaming market, and many others. Anyway, this is just food for thought. I'm not going to keep on giving more examples in more replies. In the end this is where the world is headed, and it's going to take a lot of work and resources to get PyPy to handle these situations, and only strong growth can make it possible. If you want PyPy to get there, I hope you can see why a strategy for growth is necessary. On a side note, I'm not all that comfortable writing these posts when I know that at this particular time I don't have the spare time to contribute. Right now I work 7 days a week from the time I wake up until I go to sleep. But I wrote it anyway, as I do believe there is a good opportunity for PyPy. John On Wed, Feb 6, 2013 at 6:11 AM, Maciej Fijalkowski wrote: > Hi John. > > Let me summarize your long post how I understood it. "You guys should > bet everything on platform that both does not need PyPy and > expressed no real interest. The reason why is because PyPy is not > growing fast enough and we need a niche market. On top of that we > should answer a lot of unanswered questions, like memory and warmup > requirements on embedded devices". > > So, I think you're wrong in very many regards here. I think we should > try to excel at providing a kick ass Python VM, but also I have > seriously no say in what people work on (except me). We already have > some niche markets, notably people who are willing to invest R&D and > need serious power (but are unable or unwilling to use C or C++ for > that). You just don't know about it, because those are typically not > people writing blog posts. Having a dedicated web stack is another > good step and we'll eventuall get there. I don't know why you think > this particular niche market is better than any other, but it really > does not matter all that much.
There is no way you can convince people > to do something else in their volunteer time than what they already > feel like doing. Things you can do if you're interested: > > * do the work yourself > > * work with parallela project to have a first-class pypy support if > they care about performance > > * spark commercial interest > > however, trying to convince volunteers that they should do what you > think they should do is not really one of the helpful things you can > be doing. > > Cheers, > fijal > -------------- next part -------------- An HTML attachment was scrubbed... URL: From eginez at gmail.com Thu Feb 7 07:52:41 2013 From: eginez at gmail.com (=?ISO-8859-1?Q?Esteban_G=EDnez?=) Date: Wed, 6 Feb 2013 22:52:41 -0800 Subject: [pypy-dev] NumPyPy effort Message-ID: Hi there! I am currently looking to help out with PyPy, and it seems like a good place to put some effort is in NumPy. If someone can give pointers and resources on where/how to get started, I would appreciate it a ton. Thanks a bunch E. -------------- next part -------------- An HTML attachment was scrubbed... URL: From fijall at gmail.com Thu Feb 7 10:30:30 2013 From: fijall at gmail.com (Maciej Fijalkowski) Date: Thu, 7 Feb 2013 11:30:30 +0200 Subject: [pypy-dev] NumPyPy effort In-Reply-To: References: Message-ID: On Thu, Feb 7, 2013 at 8:52 AM, Esteban Gínez wrote: > Hi there! > I am currently looking to help out with PyPy and it seems like a good place > to put some effort is in the NumPy. > > If someone can give pointers and resource on where/how to get started I > would appreciated a ton. > > Thanks a bunch > E. Hi Esteban, you're welcome! We're very IRC-based; I suggest you show up on the PyPy IRC channel.
The general idea is that you take some numpy function/failing test that's implemented in C and implement it :) Cheers, fijal From fijall at gmail.com Thu Feb 7 10:33:19 2013 From: fijall at gmail.com (Maciej Fijalkowski) Date: Thu, 7 Feb 2013 11:33:19 +0200 Subject: [pypy-dev] Parallella open hardware platform In-Reply-To: References: Message-ID: On Thu, Feb 7, 2013 at 6:41 AM, John Camara wrote: > Fijal, > > In the past you have complained about it being hard to make money in open > source. One way to make it easier for you is grow the popularity of PyPy. > So I would think you would at least have some interest in thinking of ways > to accomplish that. Before even reading further - how is being popular making money? Being popular is being popular. Do you know any CPython developer working full time on CPython? CPython is definitely popular by my standards. From mtasic85 at gmail.com Thu Feb 7 12:55:42 2013 From: mtasic85 at gmail.com (Marko Tasic) Date: Thu, 7 Feb 2013 12:55:42 +0100 Subject: [pypy-dev] Great experience with PyPy Message-ID: Hi, I would like to share a short story with you about what we have accomplished with PyPy and its friends so far. The company that I have worked for for the last 7 months (intentionally unnamed) gave me absolute permission to pick the technologies on which we based our solution. What we do is: crawl for PDFs and newspaper articles, download them, translate them if needed, OCR them if needed, do extensive analysis of the downloaded PDFs and articles, store them in more organized structures for faster querying, search them, and generate a bunch of complex reports. From the very beginning I decided to go with PyPy no matter what. What we picked is the following: * Flask for the web framework, and a few of its extensions such as Flask-Login, Flask-Principal, Flask-WTF, Flask-Mail, etc. * Cassandra as the database because of its features and great experience with it. PyCassa is used as the client to talk to the Cassandra server.
* ElasticSearch as a distributed search engine, and its client library pyes. * Whoosh as a search engine, but with some modifications to support Cassandra as storage and distributed locking. * Redis, and its client library redis-py, for caching and to speed up common auto-completion patterns. * ZooKeeper, and its client library Kazoo, for distributed locking, which plays an essential role in the system for transaction-like behavior over many services at once. * Celery in conjunction with RabbitMQ for task distribution. * Sentry for error logging. What we have developed on our own are wrappers and clients for: * Moses, which is a language translator * Tesseract, which is an OCR engine * a Cassandra store for Whoosh * wkhtmltopdf and wkhtmltoimage, which are used for conversion of HTML to PDF/Image * etc. Now that the product is finished and in the final testing phase, I can say that we did not regret that we used PyPy and the stack around it. The typical speed improvement is 2x-3x over CPython in our case, but anyway we are mostly IO and memory bound, except for the Celery workers, where we do analyses, which are again many small CPU-intensive tasks that are exchanged via RabbitMQ. Another reason why we don't see a bigger speedup is that we are dependent on external software (servers) written in Erlang and Java. I'm already planning to do Cassandra (a distributed key/value-only database without index features), ZooKeeper, Redis and ElasticSearch ports in Python for the next projects, and hopefully open-source them. Regards, Marko Tasic From fijall at gmail.com Thu Feb 7 13:00:27 2013 From: fijall at gmail.com (Maciej Fijalkowski) Date: Thu, 7 Feb 2013 14:00:27 +0200 Subject: [pypy-dev] Great experience with PyPy In-Reply-To: References: Message-ID: On Thu, Feb 7, 2013 at 1:55 PM, Marko Tasic wrote: > Hi, > > I would like to share short story with you and share what we have > accomplished with PyPy and its friends so far.
> > Company that I have worked for last 7 months (intentionally unnamed) > gave me absolute permission to pick up technologies on which we based > our solution. What we do is: crawl for PDFs and newspapers articles, > download, translate them if needed, OCR if needed, do extensive > analysis of downloaded PDFs and articles, store them in more organized > structures for faster querying, search for them and generate bunch of > complex reports. > > From very beginning I decided to go with PyPy no matter what. What we > picked is following: > * Flask for web framework, and few of its extensions such as > Flask-Login, Flask-Principal, Flask-WTF, Flask-Mail, etc. > * Cassandra as database because of its features and great experience > with it. PyCassa is used as client to talk to Cassandra server. > * ElasticSearch as distributed search engine, and its client library pyes. > * Whoosh as search engine, but with some modifications to support > Cassandra as storage and distributed locking. > * Redis, and its client library redis-py, for caching and to speed up > common auto-completion patterns. > * ZooKeeper, and its client library Kazoo, for distributed locking > which plays essential role in system for transaction-like behavior > over many services at once. > * Celery in conjunction with RabbitMQ for task distribution. > * Sentry for error logging. > > What we have developed on our own are wrappers and clients for: > * Moses which is language translator > * Tesseract which is OCR engine > * Cassandra store for Whoosh > * wkhtmltopdf and wkhtmltoimage which are used for conversion of HTML > to PDF/Image > * etc > > Now when product is finished and in final testing phase, I can say > that we did not regret because we used PyPy and stack around it. 
> Typical speed improvement is 2x-3x over CPython in our case, but > anyway we are mostly IO and memory bound, expect for Celery workers > where we do analysis which are again many small CPU intensive tasks > that are exchanged via RabbitMQ. Another reason why we don't see > speedup us is that we are dependent on external software (servers) > written in Erlang and Java. > > I'm already planing to do Cassandra (distributed key/value only > database without index features), ZooKeeper, Redis and ElasticSearch > ports in Python for next projects, and hopefully opensource them. > > Regards, > Marko Tasic > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > http://mail.python.org/mailman/listinfo/pypy-dev Awesome! I'm glad people can make PyPy work for non-trivial tasks which require a lot of dependencies. We're trying to lower the bar; however, it takes time. Cheers, fijal From phyo.arkarlwin at gmail.com Thu Feb 7 15:11:16 2013 From: phyo.arkarlwin at gmail.com (Phyo Arkar) Date: Thu, 7 Feb 2013 20:41:16 +0630 Subject: [pypy-dev] Great experience with PyPy In-Reply-To: References: Message-ID: PyPy should have a page for "Success Stories!" Now, with this and Quora proving the power of PyPy, I am beginning to convert my projects to PyPy soon! I am only holding off right now because my projects use a lot of C libraries and NumPy/Matplotlib/scikit-learn. Thanks, Phyo. On Thursday, February 7, 2013, Maciej Fijalkowski wrote: > On Thu, Feb 7, 2013 at 1:55 PM, Marko Tasic > > wrote: > > Hi, > > > > I would like to share short story with you and share what we have > > accomplished with PyPy and its friends so far. > > > > Company that I have worked for last 7 months (intentionally unnamed) > > gave me absolute permission to pick up technologies on which we based > > our solution.
What we do is: crawl for PDFs and newspapers articles, > > download, translate them if needed, OCR if needed, do extensive > > analysis of downloaded PDFs and articles, store them in more organized > > structures for faster querying, search for them and generate bunch of > > complex reports. > > > > From very beginning I decided to go with PyPy no matter what. What we > > picked is following: > > * Flask for web framework, and few of its extensions such as > > Flask-Login, Flask-Principal, Flask-WTF, Flask-Mail, etc. > > * Cassandra as database because of its features and great experience > > with it. PyCassa is used as client to talk to Cassandra server. > > * ElasticSearch as distributed search engine, and its client library > pyes. > > * Whoosh as search engine, but with some modifications to support > > Cassandra as storage and distributed locking. > > * Redis, and its client library redis-py, for caching and to speed up > > common auto-completion patterns. > > * ZooKeeper, and its client library Kazoo, for distributed locking > > which plays essential role in system for transaction-like behavior > > over many services at once. > > * Celery in conjunction with RabbitMQ for task distribution. > > * Sentry for error logging. > > > > What we have developed on our own are wrappers and clients for: > > * Moses which is language translator > > * Tesseract which is OCR engine > > * Cassandra store for Whoosh > > * wkhtmltopdf and wkhtmltoimage which are used for conversion of HTML > > to PDF/Image > > * etc > > > > Now when product is finished and in final testing phase, I can say > > that we did not regret because we used PyPy and stack around it. > > Typical speed improvement is 2x-3x over CPython in our case, but > > anyway we are mostly IO and memory bound, expect for Celery workers > > where we do analysis which are again many small CPU intensive tasks > > that are exchanged via RabbitMQ. 
Another reason why we don't see > > speedup us is that we are dependent on external software (servers) > > written in Erlang and Java. > > > > I'm already planing to do Cassandra (distributed key/value only > > database without index features), ZooKeeper, Redis and ElasticSearch > > ports in Python for next projects, and hopefully opensource them. > > > > Regards, > > Marko Tasic > > _______________________________________________ > > pypy-dev mailing list > > pypy-dev at python.org > > http://mail.python.org/mailman/listinfo/pypy-dev > > Awesome! > > I'm glad people can make pypy work for non-trivial tasks which require > a lot of dependencies. We're trying to lower the bar, however it takes > time. > > Cheers, > fijal > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > http://mail.python.org/mailman/listinfo/pypy-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dynamicgl at gmail.com Thu Feb 7 16:12:53 2013 From: dynamicgl at gmail.com (Gelin Yan) Date: Thu, 7 Feb 2013 23:12:53 +0800 Subject: [pypy-dev] Great experience with PyPy In-Reply-To: References: Message-ID: On Thu, Feb 7, 2013 at 10:11 PM, Phyo Arkar wrote: > Pypy should have a page for "Success Stories!" > > Now with this and Quora proving Power of PyPy , i am beginning to start > converting my projects into PyPy soon! > I am only withholding right now because my projects uses a lot of C > Libraries and Numpy/Matplotlib/scilit-learn. > > Thanks > > Phyo. > > On Thursday, February 7, 2013, Maciej Fijalkowski wrote: > >> On Thu, Feb 7, 2013 at 1:55 PM, Marko Tasic wrote: >> > Hi, >> > >> > I would like to share short story with you and share what we have >> > accomplished with PyPy and its friends so far. >> > >> > Company that I have worked for last 7 months (intentionally unnamed) >> > gave me absolute permission to pick up technologies on which we based >> > our solution. 
What we do is: crawl for PDFs and newspapers articles, >> > download, translate them if needed, OCR if needed, do extensive >> > analysis of downloaded PDFs and articles, store them in more organized >> > structures for faster querying, search for them and generate bunch of >> > complex reports. >> > >> > From very beginning I decided to go with PyPy no matter what. What we >> > picked is following: >> > * Flask for web framework, and few of its extensions such as >> > Flask-Login, Flask-Principal, Flask-WTF, Flask-Mail, etc. >> > * Cassandra as database because of its features and great experience >> > with it. PyCassa is used as client to talk to Cassandra server. >> > * ElasticSearch as distributed search engine, and its client library >> pyes. >> > * Whoosh as search engine, but with some modifications to support >> > Cassandra as storage and distributed locking. >> > * Redis, and its client library redis-py, for caching and to speed up >> > common auto-completion patterns. >> > * ZooKeeper, and its client library Kazoo, for distributed locking >> > which plays essential role in system for transaction-like behavior >> > over many services at once. >> > * Celery in conjunction with RabbitMQ for task distribution. >> > * Sentry for error logging. >> > >> > What we have developed on our own are wrappers and clients for: >> > * Moses which is language translator >> > * Tesseract which is OCR engine >> > * Cassandra store for Whoosh >> > * wkhtmltopdf and wkhtmltoimage which are used for conversion of HTML >> > to PDF/Image >> > * etc >> > >> > Now when product is finished and in final testing phase, I can say >> > that we did not regret because we used PyPy and stack around it. >> > Typical speed improvement is 2x-3x over CPython in our case, but >> > anyway we are mostly IO and memory bound, expect for Celery workers >> > where we do analysis which are again many small CPU intensive tasks >> > that are exchanged via RabbitMQ. 
Another reason why we don't see >> > speedup us is that we are dependent on external software (servers) >> > written in Erlang and Java. >> > >> > I'm already planing to do Cassandra (distributed key/value only >> > database without index features), ZooKeeper, Redis and ElasticSearch >> > ports in Python for next projects, and hopefully opensource them. >> > >> > Regards, >> > Marko Tasic >> > _______________________________________________ >> > pypy-dev mailing list >> > pypy-dev at python.org >> > http://mail.python.org/mailman/listinfo/pypy-dev >> >> Awesome! >> >> I'm glad people can make pypy work for non-trivial tasks which require >> a lot of dependencies. We're trying to lower the bar, however it takes >> time. >> >> Cheers, >> fijal >> _______________________________________________ >> pypy-dev mailing list >> pypy-dev at python.org >> http://mail.python.org/mailman/listinfo/pypy-dev >> > > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > http://mail.python.org/mailman/listinfo/pypy-dev > > Hi, It might be off topic. I want to know whether pypy supports postgres. The last time I checked, the ctypes-based psycopg2 was still beta. I mainly use twisted & postgres. pypy supports twisted well, but psycopg2 not so well. Regards gelin yan -------------- next part -------------- An HTML attachment was scrubbed... URL: From konstantin.lopuhin at chtd.ru Thu Feb 7 16:28:38 2013 From: konstantin.lopuhin at chtd.ru (=?KOI8-R?B?68/T1NEg7M/Q1cjJzg==?=) Date: Thu, 7 Feb 2013 19:28:38 +0400 Subject: [pypy-dev] Great experience with PyPy In-Reply-To: References: Message-ID: PyPy supports postgres with either psycopg2cffi or psycopg2-ctypes bindings. We use psycopg2cffi in production (and maintain them), and here http://chtd.ru/blog/bystraya-rabota-s-postgres-pod-pypy/?lang=en are some benchmarks. And yes, PyPy is cool :) Typically giving 3x speedups, and some memory savings sometimes.
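For anyone wiring this up themselves, here is a minimal sketch of selecting the binding per interpreter. The `compat.register()` call is psycopg2cffi's documented way of installing itself under the name `psycopg2`; the helper functions are only an illustration of one way to structure this, not part of either library.

```python
import platform

def preferred_binding():
    """Name of the psycopg2 binding to try first on this interpreter."""
    if platform.python_implementation() == "PyPy":
        return "psycopg2cffi"   # cffi-based, plays well with PyPy's JIT
    return "psycopg2"           # the C extension is fine on CPython

def install_psycopg2():
    """Import the preferred binding and expose it as `psycopg2`."""
    if preferred_binding() == "psycopg2cffi":
        from psycopg2cffi import compat
        compat.register()       # documented hook: aliases psycopg2cffi as psycopg2
    import psycopg2             # now resolves to whichever binding was registered
    return psycopg2
```

After `install_psycopg2()` runs once at startup, existing code that does `import psycopg2` keeps working unchanged on both CPython and PyPy.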
2013/2/7 Gelin Yan : > > > On Thu, Feb 7, 2013 at 10:11 PM, Phyo Arkar > wrote: >> >> Pypy should have a page for "Success Stories!" >> >> Now with this and Quora proving Power of PyPy , i am beginning to start >> converting my projects into PyPy soon! >> I am only withholding right now because my projects uses a lot of C >> Libraries and Numpy/Matplotlib/scilit-learn. >> >> Thanks >> >> Phyo. >> >> On Thursday, February 7, 2013, Maciej Fijalkowski wrote: >>> >>> On Thu, Feb 7, 2013 at 1:55 PM, Marko Tasic wrote: >>> > Hi, >>> > >>> > I would like to share short story with you and share what we have >>> > accomplished with PyPy and its friends so far. >>> > >>> > Company that I have worked for last 7 months (intentionally unnamed) >>> > gave me absolute permission to pick up technologies on which we based >>> > our solution. What we do is: crawl for PDFs and newspapers articles, >>> > download, translate them if needed, OCR if needed, do extensive >>> > analysis of downloaded PDFs and articles, store them in more organized >>> > structures for faster querying, search for them and generate bunch of >>> > complex reports. >>> > >>> > From very beginning I decided to go with PyPy no matter what. What we >>> > picked is following: >>> > * Flask for web framework, and few of its extensions such as >>> > Flask-Login, Flask-Principal, Flask-WTF, Flask-Mail, etc. >>> > * Cassandra as database because of its features and great experience >>> > with it. PyCassa is used as client to talk to Cassandra server. >>> > * ElasticSearch as distributed search engine, and its client library >>> > pyes. >>> > * Whoosh as search engine, but with some modifications to support >>> > Cassandra as storage and distributed locking. >>> > * Redis, and its client library redis-py, for caching and to speed up >>> > common auto-completion patterns. 
>>> > * ZooKeeper, and its client library Kazoo, for distributed locking >>> > which plays essential role in system for transaction-like behavior >>> > over many services at once. >>> > * Celery in conjunction with RabbitMQ for task distribution. >>> > * Sentry for error logging. >>> > >>> > What we have developed on our own are wrappers and clients for: >>> > * Moses which is language translator >>> > * Tesseract which is OCR engine >>> > * Cassandra store for Whoosh >>> > * wkhtmltopdf and wkhtmltoimage which are used for conversion of HTML >>> > to PDF/Image >>> > * etc >>> > >>> > Now when product is finished and in final testing phase, I can say >>> > that we did not regret because we used PyPy and stack around it. >>> > Typical speed improvement is 2x-3x over CPython in our case, but >>> > anyway we are mostly IO and memory bound, expect for Celery workers >>> > where we do analysis which are again many small CPU intensive tasks >>> > that are exchanged via RabbitMQ. Another reason why we don't see >>> > speedup us is that we are dependent on external software (servers) >>> > written in Erlang and Java. >>> > >>> > I'm already planing to do Cassandra (distributed key/value only >>> > database without index features), ZooKeeper, Redis and ElasticSearch >>> > ports in Python for next projects, and hopefully opensource them. >>> > >>> > Regards, >>> > Marko Tasic >>> > _______________________________________________ >>> > pypy-dev mailing list >>> > pypy-dev at python.org >>> > http://mail.python.org/mailman/listinfo/pypy-dev >>> >>> Awesome! >>> >>> I'm glad people can make pypy work for non-trivial tasks which require >>> a lot of dependencies. We're trying to lower the bar, however it takes >>> time. 
>>> >>> Cheers, >>> fijal >>> _______________________________________________ >>> pypy-dev mailing list >>> pypy-dev at python.org >>> http://mail.python.org/mailman/listinfo/pypy-dev >> >> >> _______________________________________________ >> pypy-dev mailing list >> pypy-dev at python.org >> http://mail.python.org/mailman/listinfo/pypy-dev >> > > > Hi, It might be off topic. I want to know whether pypy support postgres. The > last time I noticed ctypes based psycopg2 was still beta. I mainly use > twisted & postgres. pypy supports twisted well but not good for psycopg2. > > Regards > > gelin yan > > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > http://mail.python.org/mailman/listinfo/pypy-dev > -- ?????????? ???????, ??????????? ???????? ??? -- http://chtd.ru +7 (495) 646-87-45, ?????????? 333 From skip at pobox.com Thu Feb 7 17:00:14 2013 From: skip at pobox.com (Skip Montanaro) Date: Thu, 7 Feb 2013 10:00:14 -0600 Subject: [pypy-dev] Cheetah on PyPy? In-Reply-To: References: Message-ID: > Thanks. I'll make that change and send it back upstream to the Cheetah folks. This worked as Amaury advertised. I send a note to the Cheetah mailing list (where they want bug reports apparently) with the one line unidiff. Hopefully this change will make it into a near-term release. Skip From dynamicgl at gmail.com Thu Feb 7 17:00:58 2013 From: dynamicgl at gmail.com (Gelin Yan) Date: Fri, 8 Feb 2013 00:00:58 +0800 Subject: [pypy-dev] Great experience with PyPy In-Reply-To: References: Message-ID: On Thu, Feb 7, 2013 at 11:28 PM, ????? ??????? wrote: > PyPy supports postgres with either psycopg2cffi or psycopg2-ctypes > bindings. We use psycopg2cffi in production (and maintain them), and > here http://chtd.ru/blog/bystraya-rabota-s-postgres-pod-pypy/?lang=en > are some benchmarks. > And yes, PyPy is cool :) Typically giving 3x speedups, and some memory > savings sometimes. 
> > 2013/2/7 Gelin Yan : > > > > > > On Thu, Feb 7, 2013 at 10:11 PM, Phyo Arkar > > wrote: > >> > >> Pypy should have a page for "Success Stories!" > >> > >> Now with this and Quora proving Power of PyPy , i am beginning to start > >> converting my projects into PyPy soon! > >> I am only withholding right now because my projects uses a lot of C > >> Libraries and Numpy/Matplotlib/scilit-learn. > >> > >> Thanks > >> > >> Phyo. > >> > >> On Thursday, February 7, 2013, Maciej Fijalkowski wrote: > >>> > >>> On Thu, Feb 7, 2013 at 1:55 PM, Marko Tasic > wrote: > >>> > Hi, > >>> > > >>> > I would like to share short story with you and share what we have > >>> > accomplished with PyPy and its friends so far. > >>> > > >>> > Company that I have worked for last 7 months (intentionally unnamed) > >>> > gave me absolute permission to pick up technologies on which we based > >>> > our solution. What we do is: crawl for PDFs and newspapers articles, > >>> > download, translate them if needed, OCR if needed, do extensive > >>> > analysis of downloaded PDFs and articles, store them in more > organized > >>> > structures for faster querying, search for them and generate bunch of > >>> > complex reports. > >>> > > >>> > From very beginning I decided to go with PyPy no matter what. What we > >>> > picked is following: > >>> > * Flask for web framework, and few of its extensions such as > >>> > Flask-Login, Flask-Principal, Flask-WTF, Flask-Mail, etc. > >>> > * Cassandra as database because of its features and great experience > >>> > with it. PyCassa is used as client to talk to Cassandra server. > >>> > * ElasticSearch as distributed search engine, and its client library > >>> > pyes. > >>> > * Whoosh as search engine, but with some modifications to support > >>> > Cassandra as storage and distributed locking. > >>> > * Redis, and its client library redis-py, for caching and to speed up > >>> > common auto-completion patterns. 
> >>> > * ZooKeeper, and its client library Kazoo, for distributed locking > >>> > which plays essential role in system for transaction-like behavior > >>> > over many services at once. > >>> > * Celery in conjunction with RabbitMQ for task distribution. > >>> > * Sentry for error logging. > >>> > > >>> > What we have developed on our own are wrappers and clients for: > >>> > * Moses which is language translator > >>> > * Tesseract which is OCR engine > >>> > * Cassandra store for Whoosh > >>> > * wkhtmltopdf and wkhtmltoimage which are used for conversion of HTML > >>> > to PDF/Image > >>> > * etc > >>> > > >>> > Now when product is finished and in final testing phase, I can say > >>> > that we did not regret because we used PyPy and stack around it. > >>> > Typical speed improvement is 2x-3x over CPython in our case, but > >>> > anyway we are mostly IO and memory bound, expect for Celery workers > >>> > where we do analysis which are again many small CPU intensive tasks > >>> > that are exchanged via RabbitMQ. Another reason why we don't see > >>> > speedup us is that we are dependent on external software (servers) > >>> > written in Erlang and Java. > >>> > > >>> > I'm already planing to do Cassandra (distributed key/value only > >>> > database without index features), ZooKeeper, Redis and ElasticSearch > >>> > ports in Python for next projects, and hopefully opensource them. > >>> > > >>> > Regards, > >>> > Marko Tasic > >>> > _______________________________________________ > >>> > pypy-dev mailing list > >>> > pypy-dev at python.org > >>> > http://mail.python.org/mailman/listinfo/pypy-dev > >>> > >>> Awesome! > >>> > >>> I'm glad people can make pypy work for non-trivial tasks which require > >>> a lot of dependencies. We're trying to lower the bar, however it takes > >>> time. 
> >>> > >>> Cheers, > >>> fijal > >>> _______________________________________________ > >>> pypy-dev mailing list > >>> pypy-dev at python.org > >>> http://mail.python.org/mailman/listinfo/pypy-dev > >> > >> > >> _______________________________________________ > >> pypy-dev mailing list > >> pypy-dev at python.org > >> http://mail.python.org/mailman/listinfo/pypy-dev > >> > > > > > > Hi, It might be off topic. I want to know whether pypy support postgres. > The > > last time I noticed ctypes based psycopg2 was still beta. I mainly use > > twisted & postgres. pypy supports twisted well but not good for psycopg2. > > > > Regards > > > > gelin yan > > > > _______________________________________________ > > pypy-dev mailing list > > pypy-dev at python.org > > http://mail.python.org/mailman/listinfo/pypy-dev > > > > > > -- > ?????????? ???????, ??????????? > ???????? ??? -- http://chtd.ru > +7 (495) 646-87-45, ?????????? 333 > Hi Glad to hear that. I will give it a try. By the way, Can i use it on windows? It looks like cffi support windows. Regards gelin yan -------------- next part -------------- An HTML attachment was scrubbed... URL: From kostia.lopuhin at gmail.com Thu Feb 7 17:08:51 2013 From: kostia.lopuhin at gmail.com (=?KOI8-R?B?68/T1NEg7M/Q1cjJzg==?=) Date: Thu, 7 Feb 2013 20:08:51 +0400 Subject: [pypy-dev] Great experience with PyPy In-Reply-To: References: Message-ID: Hi! I did not test it on Windows, there may be problems with installation (searching for postgres header files, the config is not very smart - https://github.com/chtd/psycopg2cffi/blob/master/psycopg2cffi/_impl/libpq.py#L209), but they should be solvable I hope - submit a bug if you have problems. 2013/2/7 Gelin Yan : > > > On Thu, Feb 7, 2013 at 11:28 PM, ????? ??????? > wrote: >> >> PyPy supports postgres with either psycopg2cffi or psycopg2-ctypes >> bindings. 
We use psycopg2cffi in production (and maintain them), and >> here http://chtd.ru/blog/bystraya-rabota-s-postgres-pod-pypy/?lang=en >> are some benchmarks. >> And yes, PyPy is cool :) Typically giving 3x speedups, and some memory >> savings sometimes. >> >> 2013/2/7 Gelin Yan : >> > >> > >> > On Thu, Feb 7, 2013 at 10:11 PM, Phyo Arkar >> > wrote: >> >> >> >> Pypy should have a page for "Success Stories!" >> >> >> >> Now with this and Quora proving Power of PyPy , i am beginning to start >> >> converting my projects into PyPy soon! >> >> I am only withholding right now because my projects uses a lot of C >> >> Libraries and Numpy/Matplotlib/scilit-learn. >> >> >> >> Thanks >> >> >> >> Phyo. >> >> >> >> On Thursday, February 7, 2013, Maciej Fijalkowski wrote: >> >>> >> >>> On Thu, Feb 7, 2013 at 1:55 PM, Marko Tasic >> >>> wrote: >> >>> > Hi, >> >>> > >> >>> > I would like to share short story with you and share what we have >> >>> > accomplished with PyPy and its friends so far. >> >>> > >> >>> > Company that I have worked for last 7 months (intentionally unnamed) >> >>> > gave me absolute permission to pick up technologies on which we >> >>> > based >> >>> > our solution. What we do is: crawl for PDFs and newspapers articles, >> >>> > download, translate them if needed, OCR if needed, do extensive >> >>> > analysis of downloaded PDFs and articles, store them in more >> >>> > organized >> >>> > structures for faster querying, search for them and generate bunch >> >>> > of >> >>> > complex reports. >> >>> > >> >>> > From very beginning I decided to go with PyPy no matter what. What >> >>> > we >> >>> > picked is following: >> >>> > * Flask for web framework, and few of its extensions such as >> >>> > Flask-Login, Flask-Principal, Flask-WTF, Flask-Mail, etc. >> >>> > * Cassandra as database because of its features and great experience >> >>> > with it. PyCassa is used as client to talk to Cassandra server. 
>> >>> > * ElasticSearch as distributed search engine, and its client library >> >>> > pyes. >> >>> > * Whoosh as search engine, but with some modifications to support >> >>> > Cassandra as storage and distributed locking. >> >>> > * Redis, and its client library redis-py, for caching and to speed >> >>> > up >> >>> > common auto-completion patterns. >> >>> > * ZooKeeper, and its client library Kazoo, for distributed locking >> >>> > which plays essential role in system for transaction-like behavior >> >>> > over many services at once. >> >>> > * Celery in conjunction with RabbitMQ for task distribution. >> >>> > * Sentry for error logging. >> >>> > >> >>> > What we have developed on our own are wrappers and clients for: >> >>> > * Moses which is language translator >> >>> > * Tesseract which is OCR engine >> >>> > * Cassandra store for Whoosh >> >>> > * wkhtmltopdf and wkhtmltoimage which are used for conversion of >> >>> > HTML >> >>> > to PDF/Image >> >>> > * etc >> >>> > >> >>> > Now when product is finished and in final testing phase, I can say >> >>> > that we did not regret because we used PyPy and stack around it. >> >>> > Typical speed improvement is 2x-3x over CPython in our case, but >> >>> > anyway we are mostly IO and memory bound, expect for Celery workers >> >>> > where we do analysis which are again many small CPU intensive tasks >> >>> > that are exchanged via RabbitMQ. Another reason why we don't see >> >>> > speedup us is that we are dependent on external software (servers) >> >>> > written in Erlang and Java. >> >>> > >> >>> > I'm already planing to do Cassandra (distributed key/value only >> >>> > database without index features), ZooKeeper, Redis and ElasticSearch >> >>> > ports in Python for next projects, and hopefully opensource them. 
>> >>> > >> >>> > Regards, >> >>> > Marko Tasic >> >>> > _______________________________________________ >> >>> > pypy-dev mailing list >> >>> > pypy-dev at python.org >> >>> > http://mail.python.org/mailman/listinfo/pypy-dev >> >>> >> >>> Awesome! >> >>> >> >>> I'm glad people can make pypy work for non-trivial tasks which require >> >>> a lot of dependencies. We're trying to lower the bar, however it takes >> >>> time. >> >>> >> >>> Cheers, >> >>> fijal >> >>> _______________________________________________ >> >>> pypy-dev mailing list >> >>> pypy-dev at python.org >> >>> http://mail.python.org/mailman/listinfo/pypy-dev >> >> >> >> >> >> _______________________________________________ >> >> pypy-dev mailing list >> >> pypy-dev at python.org >> >> http://mail.python.org/mailman/listinfo/pypy-dev >> >> >> > >> > >> > Hi, It might be off topic. I want to know whether pypy support postgres. >> > The >> > last time I noticed ctypes based psycopg2 was still beta. I mainly use >> > twisted & postgres. pypy supports twisted well but not good for >> > psycopg2. >> > >> > Regards >> > >> > gelin yan >> > >> > _______________________________________________ >> > pypy-dev mailing list >> > pypy-dev at python.org >> > http://mail.python.org/mailman/listinfo/pypy-dev >> > >> >> >> >> -- >> ?????????? ???????, ??????????? >> ???????? ??? -- http://chtd.ru >> +7 (495) 646-87-45, ?????????? 333 > > > Hi > > Glad to hear that. I will give it a try. By the way, Can i use it on > windows? It looks like cffi support windows. 
> > Regards > > gelin yan > > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > http://mail.python.org/mailman/listinfo/pypy-dev > From john.m.camara at gmail.com Thu Feb 7 20:00:19 2013 From: john.m.camara at gmail.com (John Camara) Date: Thu, 7 Feb 2013 14:00:19 -0500 Subject: [pypy-dev] Parallella open hardware platform In-Reply-To: References: Message-ID: Fijal, Whether someone works full time on a project is a separate issue. Being popular helps attract additional resources, and PyPy is a project that could use additional resources. How many additional optimizations could PyPy add to reach a level of optimization similar to, say, the JVM's? We are talking many man-years of work. How much additional work is it to develop and maintain backends for the various ARM, PPC, MIPS, etc. processors? How much work would it take to have PyPy support multi-cores? What if RPython needs to be significantly refactored or replaced? And we can go on and on. Typically every 10 years or so a new language becomes dominant, but that hasn't happened lately. Java had been in that role for quite some time, and for quite a few years it has been on the decline, yet no language has taken its place in terms of dominance. The main reason this hasn't happened so far is that no language has successfully dealt with the multi-core issue in a way that also keeps the other desirable features we currently have in popular languages. But at some point a language will prevail and become dominant, and when that happens there will be a mass migration to it. It doesn't mean that Python and other currently popular languages will just go away; it's just that their use will decline. If Python's popularity declines significantly, it will in turn impact PyPy. Also, many of the early adopters of PyPy are likely to move on to the new dominant language. So where does that leave you?
I expect you earn a living by doing PyPy consulting, and thus you need PyPy to be popular. Now, you don't have to believe that a new dominant language will emerge, but history says otherwise, and many have been fooled into thinking otherwise in the past. I feel PyPy is Python's best chance of surviving this change in language dominance, as it has the best chance of being able to do something about the multi-core situation. I'm glad you mentioned the web stack the other day; if you hadn't, I likely would not have thought of the PyPy hypervisor scenario. I'm starting to believe that approach may have some decent merit and could offer a way to kick the can down the road on the multi-core issues. I don't have the time to get into it right now, but I'll start a new thread on the topic, maybe within the next few days. John On Thu, Feb 7, 2013 at 4:33 AM, Maciej Fijalkowski wrote: > On Thu, Feb 7, 2013 at 6:41 AM, John Camara > wrote: > > Fijal, > > In the past you have complained about it being hard to make money in open > > source. One way to make it easier for you is grow the popularity of PyPy. > > So I would think you would at least have some interest in thinking of > ways > > to accomplish that. > Before even reading further - how is being popular making money? Being > popular is being popular. Do you know any CPython developer working > full time on CPython? CPython is definitely popular by my standards -------------- next part -------------- An HTML attachment was scrubbed... URL: From mtasic85 at gmail.com Fri Feb 8 12:22:25 2013 From: mtasic85 at gmail.com (Marko Tasic) Date: Fri, 8 Feb 2013 12:22:25 +0100 Subject: [pypy-dev] Great experience with PyPy In-Reply-To: References: Message-ID: Thanks everyone for the support. @fijal In one of my previous emails, I already told you that I'll be using pypy for real-life problems on medium to large scale projects.
It is also very hard convincing companies that they should invest money in open source, but as long as they contribute to open source projects, I'm also satisfied in some way. Anyway, I prefer money ;) @carl I'm very bad at writing blog posts, but I would like to explain in email what we have done, what obstacles we have faced, and how we solved them. @armin Because I don't care about speed (I already have plenty of CPU cores not used all the time), and I only care about correctness and maintainability of code, your STM will fit our requirements perfectly. As far as I know, every developer working on a serious large-scale project, after going over your STM descriptions (emails and blogs), gives me the same answer about it: that it is the perfect solution for "per machine" concurrent programming. As long as I have the freedom to pick technologies, I will definitely rely on it in one of the next projects. What is the status of it ATM, and what is the best way to test and deploy pypy with stm? Regards, Marko Tasic On Thu, Feb 7, 2013 at 5:08 PM, ????? ??????? wrote: > Hi! I did not test it on Windows, there may be problems with > installation (searching for postgres header files, the config is not > very smart - https://github.com/chtd/psycopg2cffi/blob/master/psycopg2cffi/_impl/libpq.py#L209), > but they should be solvable I hope - submit a bug if you have > problems. > > 2013/2/7 Gelin Yan : >> >> >> On Thu, Feb 7, 2013 at 11:28 PM, ????? ??????? >> wrote: >>> >>> PyPy supports postgres with either psycopg2cffi or psycopg2-ctypes >>> bindings. We use psycopg2cffi in production (and maintain them), and >>> here http://chtd.ru/blog/bystraya-rabota-s-postgres-pod-pypy/?lang=en >>> are some benchmarks. >>> And yes, PyPy is cool :) Typically giving 3x speedups, and some memory >>> savings sometimes. >>> >>> 2013/2/7 Gelin Yan : >>> > >>> > >>> > On Thu, Feb 7, 2013 at 10:11 PM, Phyo Arkar >>> > wrote: >>> >> >>> >> Pypy should have a page for "Success Stories!"
>>> >> >>> >> Now with this and Quora proving Power of PyPy , i am beginning to start >>> >> converting my projects into PyPy soon! >>> >> I am only withholding right now because my projects uses a lot of C >>> >> Libraries and Numpy/Matplotlib/scilit-learn. >>> >> >>> >> Thanks >>> >> >>> >> Phyo. >>> >> >>> >> On Thursday, February 7, 2013, Maciej Fijalkowski wrote: >>> >>> >>> >>> On Thu, Feb 7, 2013 at 1:55 PM, Marko Tasic >>> >>> wrote: >>> >>> > Hi, >>> >>> > >>> >>> > I would like to share short story with you and share what we have >>> >>> > accomplished with PyPy and its friends so far. >>> >>> > >>> >>> > Company that I have worked for last 7 months (intentionally unnamed) >>> >>> > gave me absolute permission to pick up technologies on which we >>> >>> > based >>> >>> > our solution. What we do is: crawl for PDFs and newspapers articles, >>> >>> > download, translate them if needed, OCR if needed, do extensive >>> >>> > analysis of downloaded PDFs and articles, store them in more >>> >>> > organized >>> >>> > structures for faster querying, search for them and generate bunch >>> >>> > of >>> >>> > complex reports. >>> >>> > >>> >>> > From very beginning I decided to go with PyPy no matter what. What >>> >>> > we >>> >>> > picked is following: >>> >>> > * Flask for web framework, and few of its extensions such as >>> >>> > Flask-Login, Flask-Principal, Flask-WTF, Flask-Mail, etc. >>> >>> > * Cassandra as database because of its features and great experience >>> >>> > with it. PyCassa is used as client to talk to Cassandra server. >>> >>> > * ElasticSearch as distributed search engine, and its client library >>> >>> > pyes. >>> >>> > * Whoosh as search engine, but with some modifications to support >>> >>> > Cassandra as storage and distributed locking. >>> >>> > * Redis, and its client library redis-py, for caching and to speed >>> >>> > up >>> >>> > common auto-completion patterns. 
>>> >>> > * ZooKeeper, and its client library Kazoo, for distributed locking >>> >>> > which plays essential role in system for transaction-like behavior >>> >>> > over many services at once. >>> >>> > * Celery in conjunction with RabbitMQ for task distribution. >>> >>> > * Sentry for error logging. >>> >>> > >>> >>> > What we have developed on our own are wrappers and clients for: >>> >>> > * Moses which is language translator >>> >>> > * Tesseract which is OCR engine >>> >>> > * Cassandra store for Whoosh >>> >>> > * wkhtmltopdf and wkhtmltoimage which are used for conversion of >>> >>> > HTML >>> >>> > to PDF/Image >>> >>> > * etc >>> >>> > >>> >>> > Now when product is finished and in final testing phase, I can say >>> >>> > that we did not regret because we used PyPy and stack around it. >>> >>> > Typical speed improvement is 2x-3x over CPython in our case, but >>> >>> > anyway we are mostly IO and memory bound, expect for Celery workers >>> >>> > where we do analysis which are again many small CPU intensive tasks >>> >>> > that are exchanged via RabbitMQ. Another reason why we don't see >>> >>> > speedup us is that we are dependent on external software (servers) >>> >>> > written in Erlang and Java. >>> >>> > >>> >>> > I'm already planing to do Cassandra (distributed key/value only >>> >>> > database without index features), ZooKeeper, Redis and ElasticSearch >>> >>> > ports in Python for next projects, and hopefully opensource them. >>> >>> > >>> >>> > Regards, >>> >>> > Marko Tasic >>> >>> > _______________________________________________ >>> >>> > pypy-dev mailing list >>> >>> > pypy-dev at python.org >>> >>> > http://mail.python.org/mailman/listinfo/pypy-dev >>> >>> >>> >>> Awesome! >>> >>> >>> >>> I'm glad people can make pypy work for non-trivial tasks which require >>> >>> a lot of dependencies. We're trying to lower the bar, however it takes >>> >>> time. 
>>> >>> >>> >>> Cheers, >>> >>> fijal >>> >>> _______________________________________________ >>> >>> pypy-dev mailing list >>> >>> pypy-dev at python.org >>> >>> http://mail.python.org/mailman/listinfo/pypy-dev >>> >> >>> >> >>> >> _______________________________________________ >>> >> pypy-dev mailing list >>> >> pypy-dev at python.org >>> >> http://mail.python.org/mailman/listinfo/pypy-dev >>> >> >>> > >>> > >>> > Hi, It might be off topic. I want to know whether pypy support postgres. >>> > The >>> > last time I noticed ctypes based psycopg2 was still beta. I mainly use >>> > twisted & postgres. pypy supports twisted well but not good for >>> > psycopg2. >>> > >>> > Regards >>> > >>> > gelin yan >>> > >>> > _______________________________________________ >>> > pypy-dev mailing list >>> > pypy-dev at python.org >>> > http://mail.python.org/mailman/listinfo/pypy-dev >>> > >>> >>> >>> >>> -- >>> ?????????? ???????, ??????????? >>> ???????? ??? -- http://chtd.ru >>> +7 (495) 646-87-45, ?????????? 333 >> >> >> Hi >> >> Glad to hear that. I will give it a try. By the way, Can i use it on >> windows? It looks like cffi support windows. >> >> Regards >> >> gelin yan >> >> _______________________________________________ >> pypy-dev mailing list >> pypy-dev at python.org >> http://mail.python.org/mailman/listinfo/pypy-dev >> > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > http://mail.python.org/mailman/listinfo/pypy-dev From yellowsq at hotmail.com Fri Feb 8 21:37:47 2013 From: yellowsq at hotmail.com (Yellow Sq) Date: Fri, 8 Feb 2013 20:37:47 +0000 Subject: [pypy-dev] Pypy's parser Message-ID: Hi. Short question: It says at http://doc.pypy.org/en/latest/parser.html that Pypy's parser is a recursive descent one. But following the content on that page it actually seems that the parser is a table-based LL. Is this perhaps out-dated and did Pypy had a different parser at some point? 
If so, what were the reasons that triggered the change? Is there a list of projects (either from the industry or the academy) using Pypy? Thx. -------------- next part -------------- An HTML attachment was scrubbed... URL: From alex.gaynor at gmail.com Fri Feb 8 21:41:16 2013 From: alex.gaynor at gmail.com (Alex Gaynor) Date: Fri, 8 Feb 2013 12:41:16 -0800 Subject: [pypy-dev] Pypy's parser In-Reply-To: References: Message-ID: Yes, that page is wrong, the parser is an LL table parser. I don't think we have an official list of projects using PyPy anywhere. Alex On Fri, Feb 8, 2013 at 12:37 PM, Yellow Sq wrote: > Hi. > > Short question: It says at http://doc.pypy.org/en/latest/parser.html that > Pypy's parser is a recursive descent one. But following the content on that > page it actually seems that the parser is a table-based LL. Is this perhaps > out-dated and did Pypy had a different parser at some point? If so, what > were the reasons that triggered the change? > > Is there a list of projects (either from the industry or the academy) > using Pypy? > > Thx. > > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > http://mail.python.org/mailman/listinfo/pypy-dev > > -- "I disapprove of what you say, but I will defend to the death your right to say it." -- Evelyn Beatrice Hall (summarizing Voltaire) "The people's good is the highest law." -- Cicero -------------- next part -------------- An HTML attachment was scrubbed... URL: From arigo at tunes.org Sun Feb 10 15:43:33 2013 From: arigo at tunes.org (Armin Rigo) Date: Sun, 10 Feb 2013 15:43:33 +0100 Subject: [pypy-dev] Great experience with PyPy In-Reply-To: References: Message-ID: Hi Marko, On Fri, Feb 8, 2013 at 12:22 PM, Marko Tasic wrote: > What is the status of it ATM, and what is best way to test > and deploy pypy with stm? The STM project progressed slowly during the last few months. 
The status right now is:

* Most importantly, missing major Garbage Collection cycles, which means
pypy-stm slowly but constantly leaks memory.
* The JIT integration is not finished; so far pypy-stm can only be
compiled without the JIT.
* There are also other places where the performance can be improved,
probably a lot.
* Finally, there are a number of usability concerns that we (or mostly
Remi) worked on recently. The main issues turn around the idea that, as a
user of pypy-stm, you should have a way to get feedback on the process.
For example, right now transactions that abort are completely transparent
--- to the point that you don't have any way to know that one occurred,
apart from "it runs too slowly" if it occurs a lot. You should have a way
to get Python tracebacks of aborts if you want to. A similar issue is
"inevitable" transactions.

A bientôt,

Armin.

From matti.picus at gmail.com Mon Feb 11 06:16:32 2013
From: matti.picus at gmail.com (Matti Picus)
Date: Mon, 11 Feb 2013 07:16:32 +0200
Subject: [pypy-dev] gcc warnings / errors in translation
Message-ID: <51187EB0.8060507@gmail.com>

Warning: the stdio output of a translate is a very large webpage.

I wondered why the jit-benchmark-linux-x86-64 tests were failing to
translate, whereas pypy-c-jit-linux-x86-64 passed. For instance, the
failed build on tannit64:
http://buildbot.pypy.org/builders/jit-benchmark-linux-x86-64/builds/581/steps/translate/logs/stdio
versus the successful build on allegro64:
http://buildbot.pypy.org/builders/pypy-c-jit-linux-x86-64/builds/1257/steps/translate/logs/stdio

Two things struck me from the failure of the make command:
- Could the difference be environmental gcc flags making warnings into
errors on tannit64?
- There are a lot of warnings, and some of them seem important. Looking
back into history, we seem to have gotten worse with warnings.
I went back a bit, say to one of the release-2.0-beta1 builds, there are
still tons of gcc warnings, but fewer:
http://buildbot.pypy.org/builders/pypy-c-jit-linux-x86-64/builds/1115/steps/translate/logs/stdio

Anyone feel like taking a look?

Matti

From arigo at tunes.org Mon Feb 11 09:39:51 2013
From: arigo at tunes.org (Armin Rigo)
Date: Mon, 11 Feb 2013 09:39:51 +0100
Subject: [pypy-dev] gcc warnings / errors in translation
In-Reply-To: <51187EB0.8060507@gmail.com>
References: <51187EB0.8060507@gmail.com>
Message-ID: 

Hi Matti,

On Mon, Feb 11, 2013 at 6:16 AM, Matti Picus wrote:
> http://buildbot.pypy.org/builders/jit-benchmark-linux-x86-64/builds/581/steps/translate/logs/stdio

Ah well, just Yet Another Intel assembler operation that asmgcc
doesn't know about. The fix is trivial (done).

It doesn't mean that looking at warnings is not a good idea; it should
be done at some point too.

A bientôt,

Armin.

From fijall at gmail.com Mon Feb 11 09:43:49 2013
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Mon, 11 Feb 2013 10:43:49 +0200
Subject: [pypy-dev] gcc warnings / errors in translation
In-Reply-To: 
References: <51187EB0.8060507@gmail.com>
Message-ID: 

This: warning: array subscript is above array bounds [-Warray-bounds]

Sounds like it's never correct. Should we pass -Wno-array-bounds?

On Mon, Feb 11, 2013 at 10:39 AM, Armin Rigo wrote:
> Hi Matti,
>
> On Mon, Feb 11, 2013 at 6:16 AM, Matti Picus wrote:
>> http://buildbot.pypy.org/builders/jit-benchmark-linux-x86-64/builds/581/steps/translate/logs/stdio
>
> Ah well, just Yet Another Intel assembler operation that asmgcc
> doesn't know about. The fix is trivial (done).
>
> It doesn't mean that looking at warnings is not a good idea; it should
> be done at some point too.
>
>
> A bientôt,
>
> Armin.
> _______________________________________________
> pypy-dev mailing list
> pypy-dev at python.org
> http://mail.python.org/mailman/listinfo/pypy-dev

From arigo at tunes.org Mon Feb 11 09:57:06 2013
From: arigo at tunes.org (Armin Rigo)
Date: Mon, 11 Feb 2013 09:57:06 +0100
Subject: [pypy-dev] win32 own test failures
In-Reply-To: 
References: <51117AD1.7060609@gmail.com> <51118D06.6000908@gmail.com>
Message-ID: 

Hi all,

On Wed, Feb 6, 2013 at 12:26 AM, Maciej Fijalkowski wrote:
> it's probably already fixed on jitframe-on-heap which we aim to merge

Just answering this mail for the records: yes, on windows these tests
pass on jitframe-on-heap.

Armin

From fijall at gmail.com Mon Feb 11 09:59:01 2013
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Mon, 11 Feb 2013 10:59:01 +0200
Subject: [pypy-dev] win32 own test failures
In-Reply-To: 
References: <51117AD1.7060609@gmail.com> <51118D06.6000908@gmail.com>
Message-ID: 

On Mon, Feb 11, 2013 at 10:57 AM, Armin Rigo wrote:
> Hi all,
>
> On Wed, Feb 6, 2013 at 12:26 AM, Maciej Fijalkowski wrote:
>> it's probably already fixed on jitframe-on-heap which we aim to merge
>
> Just answering this mail for the records: yes, on windows these tests
> pass on jitframe-on-heap.

How so? The 32bit support is by far not done

>
>
> Armin

From arigo at tunes.org Mon Feb 11 10:10:08 2013
From: arigo at tunes.org (Armin Rigo)
Date: Mon, 11 Feb 2013 10:10:08 +0100
Subject: [pypy-dev] gcc warnings / errors in translation
In-Reply-To: 
References: <51187EB0.8060507@gmail.com>
Message-ID: 

Hi,

On Mon, Feb 11, 2013 at 9:43 AM, Maciej Fijalkowski wrote:
> This: warning: array subscript is above array bounds [-Warray-bounds]
>
> Sounds like it's never correct. Should we pass -Wno-array-bounds?

Ah, indeed. We declare most arrays as "itemtype x[1]", so gcc complains
when it can prove we do accesses at an index > 0.

A bientôt,

Armin.
From arigo at tunes.org Mon Feb 11 10:11:51 2013 From: arigo at tunes.org (Armin Rigo) Date: Mon, 11 Feb 2013 10:11:51 +0100 Subject: [pypy-dev] win32 own test failures In-Reply-To: References: <51117AD1.7060609@gmail.com> <51118D06.6000908@gmail.com> Message-ID: Hi Fijal, On Mon, Feb 11, 2013 at 9:59 AM, Maciej Fijalkowski wrote: > How so? The 32bit support is by far not done Dunno? These two tests (from "test_basic.py -k test_float") also pass when running on linux32 fwiw. A bient?t, Armin. From estama at gmail.com Mon Feb 11 16:48:22 2013 From: estama at gmail.com (Eleytherios Stamatogiannakis) Date: Mon, 11 Feb 2013 17:48:22 +0200 Subject: [pypy-dev] Unicode encode/decode speed In-Reply-To: References: <51117AD1.7060609@gmail.com> <51118D06.6000908@gmail.com> Message-ID: <511912C6.5000201@gmail.com> Hi, We have been following the nightly builds of PyPy, with our testing workload (first described in the "CFFI speed results" thread). The news are very good. The performance of PyPy + CFFI has gone up considerably (~30% faster) since the last time we wrote about it! By adding on that speed up also our optimizations of the CFFI based SQLite3 wrapper (MSPW) that we are developing, the end result is that most of our test queries are at the same speed or faster than CPython + APSW now. Unfortunately, one of the queries where PyPy is slower [*] than CPython + APSW, is very central to all of our workflows, which means that we cannot fully convert to using PyPy. The main culprit of PyPy's slowness is the conversion (encoding, decoding) from PyPy's unicodes to UTF-8. It is the only thing, with a big percentage (~48%), remaining at the top of our performance profiles . Right now we are using PyPy's "codecs.utf_8_encode" and "codecs.utf_8_decode" to do this conversion. It there a faster way to do these conversions (encoding, decoding) in PyPy? Does CPython do something more clever than PyPY, like storing unicodes with full ASCII char content, in an ASCII representation? 
Thank you very much,

lefteris.

[*] For 1M rows:
CPython + APSW: 10.5 sec
PyPy + MSPW: 15.5 sec

From amauryfa at gmail.com Mon Feb 11 17:13:58 2013
From: amauryfa at gmail.com (Amaury Forgeot d'Arc)
Date: Mon, 11 Feb 2013 17:13:58 +0100
Subject: [pypy-dev] Unicode encode/decode speed
In-Reply-To: <511912C6.5000201@gmail.com>
References: <51117AD1.7060609@gmail.com> <51118D06.6000908@gmail.com> <511912C6.5000201@gmail.com>
Message-ID: 

2013/2/11 Eleytherios Stamatogiannakis 

> Right now we are using PyPy's "codecs.utf_8_encode" and
> "codecs.utf_8_decode" to do this conversion.
>
It's the most direct way to use the utf-8 conversion functions.
> > It there a faster way to do these conversions (encoding, decoding) > in PyPy? Does CPython do something more clever than PyPY, like > storing unicodes with full ASCII char content, in an ASCII > representation? > > > Over years, utf-8 conversions have been heavily optimized in CPython: > allocate short buffers on the stack, use aligned reads, quick check for > ascii-only content (data & 0x80808080)... > All things that pypy does not. > > But I tried some "timeit" runs, and pypy is often faster that CPython, > and never much slower. This is odd. Maybe APSW uses some other CPython conversion API? Because the conversion overhead is not visible on CPython + APSW profiles. > Do your strings have many non-ascii characters? > what's the len(utf8)/len(unicode) ratio? > Our current tests, are using plain ASCII input (imported into sqlite3) which: - Go from sqlite3 (UTF-8) -> PyPy (unicode) - PyPy (unicode) -> sqlite3 (UTF-8). So i guess the len(utf-8)/len(unicode) = 1/4 (assuming 1 byte per char for ASCII (UTF-8) and 4 bytes per char for PyPy's unicode storage) l. From amauryfa at gmail.com Mon Feb 11 18:14:20 2013 From: amauryfa at gmail.com (Amaury Forgeot d'Arc) Date: Mon, 11 Feb 2013 18:14:20 +0100 Subject: [pypy-dev] Unicode encode/decode speed In-Reply-To: <5119240C.2000209@gmail.com> References: <51117AD1.7060609@gmail.com> <51118D06.6000908@gmail.com> <511912C6.5000201@gmail.com> <5119240C.2000209@gmail.com> Message-ID: 2013/2/11 Eleytherios Stamatogiannakis > On 11/02/13 18:13, Amaury Forgeot d'Arc wrote: > >> >> 2013/2/11 Eleytherios Stamatogiannakis > > >> >> >> Right now we are using PyPy's "codecs.utf_8_encode" and >> "codecs.utf_8_decode" to do this conversion. >> >> >> It's the most direct way to use the utf-8 conversion functions. >> >> It there a faster way to do these conversions (encoding, decoding) >> in PyPy? Does CPython do something more clever than PyPY, like >> storing unicodes with full ASCII char content, in an ASCII >> representation? 
>> >> >> Over years, utf-8 conversions have been heavily optimized in CPython: >> allocate short buffers on the stack, use aligned reads, quick check for >> ascii-only content (data & 0x80808080)... >> All things that pypy does not. >> >> But I tried some "timeit" runs, and pypy is often faster that CPython, >> and never much slower. >> > > This is odd. Maybe APSW uses some other CPython conversion API? Because > the conversion overhead is not visible on CPython + APSW profiles. Which kind of profiler are you using? It possible that CPython builtin functions are not profiled the same way as PyPy's. > Do your strings have many non-ascii characters? >> what's the len(utf8)/len(unicode) ratio? >> >> > Our current tests, are using plain ASCII input (imported into sqlite3) > which: > > - Go from sqlite3 (UTF-8) -> PyPy (unicode) > - PyPy (unicode) -> sqlite3 (UTF-8). > > So i guess the len(utf-8)/len(unicode) = 1/4 > (assuming 1 byte per char for ASCII (UTF-8) and 4 bytes per char for > PyPy's unicode storage) > No, my question was about the number of non-ascii characters: s = u"SomeUnicodeString" 1.0 * len(s.encode('utf8')) / len(s) PyPy allocates the StringBuffer upfront, and must realloc to cope with multibytes characters. For English text, ratio is 1.0; for Greek, it will be close to 2.0. -- Amaury Forgeot d'Arc -------------- next part -------------- An HTML attachment was scrubbed... URL: From estama at gmail.com Mon Feb 11 18:36:23 2013 From: estama at gmail.com (Eleytherios Stamatogiannakis) Date: Mon, 11 Feb 2013 19:36:23 +0200 Subject: [pypy-dev] Unicode encode/decode speed In-Reply-To: References: <51117AD1.7060609@gmail.com> <51118D06.6000908@gmail.com> <511912C6.5000201@gmail.com> <5119240C.2000209@gmail.com> Message-ID: <51192C17.7060907@gmail.com> On 11/02/13 19:14, Amaury Forgeot d'Arc wrote: > > > 2013/2/11 Eleytherios Stamatogiannakis > > > On 11/02/13 18:13, Amaury Forgeot d'Arc wrote: >... > > Which kind of profiler are you using? 
It possible that CPython builtin > functions are not profiled the same way as PyPy's. lsprofcalltree.py . From APSW's source code, i think that it uses this API: (in cursor.c) PyUnicode_DecodeUTF8 Maybe lsprofcalltree doesn't profile it? > > No, my question was about the number of non-ascii characters: > s = u"SomeUnicodeString" > 1.0 * len(s.encode('utf8')) / len(s) > PyPy allocates the StringBuffer upfront, and must realloc to cope with > multibytes characters. > For English text, ratio is 1.0; for Greek, it will be close to 2.0. > All of our tests use only plain English ASCII chars (converted to unicode). So the ratio is 1.0 . l. From amauryfa at gmail.com Mon Feb 11 18:39:58 2013 From: amauryfa at gmail.com (Amaury Forgeot d'Arc) Date: Mon, 11 Feb 2013 18:39:58 +0100 Subject: [pypy-dev] Unicode encode/decode speed In-Reply-To: <51192C17.7060907@gmail.com> References: <51117AD1.7060609@gmail.com> <51118D06.6000908@gmail.com> <511912C6.5000201@gmail.com> <5119240C.2000209@gmail.com> <51192C17.7060907@gmail.com> Message-ID: 2013/2/11 Eleytherios Stamatogiannakis > > > > Which kind of profiler are you using? It possible that CPython builtin > > functions are not profiled the same way as PyPy's. > > lsprofcalltree.py . > > From APSW's source code, i think that it uses this API: > > (in cursor.c) > PyUnicode_DecodeUTF8 > > Maybe lsprofcalltree doesn't profile it? Indeed. CPU cost is hidden in the cursor method. -- Amaury Forgeot d'Arc -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From fijall at gmail.com Mon Feb 11 21:03:29 2013 From: fijall at gmail.com (Maciej Fijalkowski) Date: Mon, 11 Feb 2013 22:03:29 +0200 Subject: [pypy-dev] Unicode encode/decode speed In-Reply-To: References: <51117AD1.7060609@gmail.com> <51118D06.6000908@gmail.com> <511912C6.5000201@gmail.com> <5119240C.2000209@gmail.com> <51192C17.7060907@gmail.com> Message-ID: On Mon, Feb 11, 2013 at 7:39 PM, Amaury Forgeot d'Arc wrote: > 2013/2/11 Eleytherios Stamatogiannakis >> >> > >> > Which kind of profiler are you using? It possible that CPython builtin >> > functions are not profiled the same way as PyPy's. >> >> lsprofcalltree.py . >> >> From APSW's source code, i think that it uses this API: >> >> (in cursor.c) >> PyUnicode_DecodeUTF8 >> >> Maybe lsprofcalltree doesn't profile it? > > > Indeed. CPU cost is hidden in the cursor method. > > > -- > Amaury Forgeot d'Arc > > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > http://mail.python.org/mailman/listinfo/pypy-dev > I would suggest using valgrind. It's a very good (albeit very slow) tool for seeing C-level performance. I remember seeing it both for CPython and PyPy when trying. Can you try yourself? From alex.gaynor at gmail.com Mon Feb 11 21:25:01 2013 From: alex.gaynor at gmail.com (Alex Gaynor) Date: Mon, 11 Feb 2013 12:25:01 -0800 Subject: [pypy-dev] Unicode encode/decode speed In-Reply-To: References: <51117AD1.7060609@gmail.com> <51118D06.6000908@gmail.com> <511912C6.5000201@gmail.com> <5119240C.2000209@gmail.com> <51192C17.7060907@gmail.com> Message-ID: I've also heard great things about `perf` if you're on Linux. Alex On Mon, Feb 11, 2013 at 12:03 PM, Maciej Fijalkowski wrote: > On Mon, Feb 11, 2013 at 7:39 PM, Amaury Forgeot d'Arc > wrote: > > 2013/2/11 Eleytherios Stamatogiannakis > >> > >> > > >> > Which kind of profiler are you using? It possible that CPython builtin > >> > functions are not profiled the same way as PyPy's. > >> > >> lsprofcalltree.py . 
> >> > >> From APSW's source code, i think that it uses this API: > >> > >> (in cursor.c) > >> PyUnicode_DecodeUTF8 > >> > >> Maybe lsprofcalltree doesn't profile it? > > > > > > Indeed. CPU cost is hidden in the cursor method. > > > > > > -- > > Amaury Forgeot d'Arc > > > > _______________________________________________ > > pypy-dev mailing list > > pypy-dev at python.org > > http://mail.python.org/mailman/listinfo/pypy-dev > > > > I would suggest using valgrind. It's a very good (albeit very slow) > tool for seeing C-level performance. I remember seeing it both for > CPython and PyPy when trying. Can you try yourself? > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > http://mail.python.org/mailman/listinfo/pypy-dev > -- "I disapprove of what you say, but I will defend to the death your right to say it." -- Evelyn Beatrice Hall (summarizing Voltaire) "The people's good is the highest law." -- Cicero -------------- next part -------------- An HTML attachment was scrubbed... URL: From estama at gmail.com Tue Feb 12 00:24:03 2013 From: estama at gmail.com (Elefterios Stamatogiannakis) Date: Tue, 12 Feb 2013 01:24:03 +0200 Subject: [pypy-dev] Unicode encode/decode speed In-Reply-To: References: <51117AD1.7060609@gmail.com> <51118D06.6000908@gmail.com> <511912C6.5000201@gmail.com> <5119240C.2000209@gmail.com> <51192C17.7060907@gmail.com> Message-ID: <51197D93.2050201@gmail.com> On 11/2/2013 7:39 ??, Amaury Forgeot d'Arc wrote: > 2013/2/11 Eleytherios Stamatogiannakis > > > > > > Which kind of profiler are you using? It possible that CPython > builtin > > functions are not profiled the same way as PyPy's. > > lsprofcalltree.py . > > From APSW's source code, i think that it uses this API: > > (in cursor.c) > PyUnicode_DecodeUTF8 > > Maybe lsprofcalltree doesn't profile it? > > > Indeed. CPU cost is hidden in the cursor method. 
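The profiler blind spot discussed in this exchange is easy to probe for any given tool. A hedged sketch using the stdlib profiler (payload and call counts are made up for illustration; whether a C-implemented codec gets its own entry is exactly what varies between profilers):

```python
import cProfile
import codecs
import io
import pstats

data = b"some ascii payload " * 50000

def hot_path():
    # The conversion cost we want the profiler to attribute somewhere.
    for _ in range(20):
        codecs.utf_8_decode(data)

profiler = cProfile.Profile()
profiler.enable()
hot_path()
profiler.disable()

stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(10)
report = stream.getvalue()
print(report)
# If utf_8_decode has no entry of its own in the report, its cost is being
# folded into the caller -- the same effect described above for
# PyUnicode_DecodeUTF8 hiding inside the cursor method.
```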
Thanks Amaury for looking into this.

Assuming that PyPy's "codecs.utf_8_decode" is slower when used with CFFI
than PyUnicode_DecodeUTF8 is in CPython, is there anything that can be
done in CFFI that would have the same performance as PyUnicode_DecodeUTF8
(and the same for encode)?

l.

From mail at justinbogner.com Tue Feb 12 05:57:23 2013
From: mail at justinbogner.com (Justin Bogner)
Date: Mon, 11 Feb 2013 21:57:23 -0700
Subject: [pypy-dev] gcc warnings / errors in translation
In-Reply-To: (Armin Rigo's message of "Mon, 11 Feb 2013 10:10:08 +0100")
References: <51187EB0.8060507@gmail.com>
Message-ID: <87txpi164s.fsf@justinbogner.com>

Armin Rigo writes:
> Ah, indeed. We declare most arrays as "itemtype x[1]", so gcc complains
> when it can prove we do accesses at an index > 0.

Is there a good reason not to use the C99 "itemtype x[]" or even the old
GCC extension "itemtype x[0]"? These won't trigger this warning, which
means we could leave it on in case a legitimate case crops up.

As far as I know, the only noticeable difference between [], [0], and
[1] for flexible arrays is that sizeof has different semantics, but
that's usually not a big deal.

From amauryfa at gmail.com Tue Feb 12 08:13:35 2013
From: amauryfa at gmail.com (Amaury Forgeot d'Arc)
Date: Tue, 12 Feb 2013 08:13:35 +0100
Subject: [pypy-dev] gcc warnings / errors in translation
In-Reply-To: <87txpi164s.fsf@justinbogner.com>
References: <51187EB0.8060507@gmail.com> <87txpi164s.fsf@justinbogner.com>
Message-ID: 

2013/2/12 Justin Bogner 

> Armin Rigo writes:
> > Ah, indeed. We declare most arrays as "itemtype x[1]", so gcc complains
> > when it can prove we do accesses at an index > 0.
>
> Is there a good reason not to use the C99 "itemtype x[]" or even the old
> GCC extension "itemtype x[0]"? These won't trigger this warning, which
> means we could leave it on in case a legitimate case crops up.
It seems that Microsoft compilers also support this extension: http://msdn.microsoft.com/en-us/library/b6fae073(v=vs.71).aspx -- Amaury Forgeot d'Arc -------------- next part -------------- An HTML attachment was scrubbed... URL: From amauryfa at gmail.com Tue Feb 12 08:47:36 2013 From: amauryfa at gmail.com (Amaury Forgeot d'Arc) Date: Tue, 12 Feb 2013 08:47:36 +0100 Subject: [pypy-dev] Unicode encode/decode speed In-Reply-To: <51197D93.2050201@gmail.com> References: <51117AD1.7060609@gmail.com> <51118D06.6000908@gmail.com> <511912C6.5000201@gmail.com> <5119240C.2000209@gmail.com> <51192C17.7060907@gmail.com> <51197D93.2050201@gmail.com> Message-ID: 2013/2/12 Elefterios Stamatogiannakis > On 11/2/2013 7:39 ??, Amaury Forgeot d'Arc wrote: > >> 2013/2/11 Eleytherios Stamatogiannakis > > >> >> >> > >> > Which kind of profiler are you using? It possible that CPython >> builtin >> > functions are not profiled the same way as PyPy's. >> >> lsprofcalltree.py . >> >> From APSW's source code, i think that it uses this API: >> >> (in cursor.c) >> PyUnicode_DecodeUTF8 >> >> Maybe lsprofcalltree doesn't profile it? >> >> >> Indeed. CPU cost is hidden in the cursor method. >> > > Thanks Amaury for looking into this, > > Assuming that PyPy's "codecs.utf_8_decode" is slower when used with CFFI > than using PyUnicode_DecodeUTF8 in CPython. > > Is there anything that can be done in CFFI that would have the same > performance as PyUnicode_DecodeUTF8 (and the same for encode) > First, codecs.utf_8_decode has nothing to do with CFFI... Then, do we have evidence that the utf8 codec is enough to explain the different performance? Since your data is only ASCII, it would be interesting to use the ASCII encoding: try to replace PyUnicode_DecodeUTF8 by PyUnicode_DecodeASCII and codecs.utf_8_decode by codecs.ascii_decode -- Amaury Forgeot d'Arc -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From fijall at gmail.com Tue Feb 12 10:04:59 2013 From: fijall at gmail.com (Maciej Fijalkowski) Date: Tue, 12 Feb 2013 11:04:59 +0200 Subject: [pypy-dev] Unicode encode/decode speed In-Reply-To: <51197D93.2050201@gmail.com> References: <51117AD1.7060609@gmail.com> <51118D06.6000908@gmail.com> <511912C6.5000201@gmail.com> <5119240C.2000209@gmail.com> <51192C17.7060907@gmail.com> <51197D93.2050201@gmail.com> Message-ID: On Tue, Feb 12, 2013 at 1:24 AM, Elefterios Stamatogiannakis wrote: > On 11/2/2013 7:39 ??, Amaury Forgeot d'Arc wrote: >> >> 2013/2/11 Eleytherios Stamatogiannakis > > >> >> >> > >> > Which kind of profiler are you using? It possible that CPython >> builtin >> > functions are not profiled the same way as PyPy's. >> >> lsprofcalltree.py . >> >> From APSW's source code, i think that it uses this API: >> >> (in cursor.c) >> PyUnicode_DecodeUTF8 >> >> Maybe lsprofcalltree doesn't profile it? >> >> >> Indeed. CPU cost is hidden in the cursor method. > > > Thanks Amaury for looking into this, > > Assuming that PyPy's "codecs.utf_8_decode" is slower when used with CFFI > than using PyUnicode_DecodeUTF8 in CPython. > > Is there anything that can be done in CFFI that would have the same > performance as PyUnicode_DecodeUTF8 (and the same for encode)? > Hi I would like to see some evidence about it. Did you try valgrind? Cheers, fijal From fijall at gmail.com Tue Feb 12 10:07:00 2013 From: fijall at gmail.com (Maciej Fijalkowski) Date: Tue, 12 Feb 2013 11:07:00 +0200 Subject: [pypy-dev] gcc warnings / errors in translation In-Reply-To: <87txpi164s.fsf@justinbogner.com> References: <51187EB0.8060507@gmail.com> <87txpi164s.fsf@justinbogner.com> Message-ID: On Tue, Feb 12, 2013 at 6:57 AM, Justin Bogner wrote: > Armin Rigo writes: >> Ah, indeed. We declare most arrays as "itemtype x[1]", so gcc complains >> when it can prove we do accesses at an index > 0. 
> > Is there a good reason not to use the C99 "itemtype x[]" or even the old > GCC extension "itemtype x[0]"? These won't trigger this warning, which > means we could leave it on in case a legitimate case crops up. > > As far as I know, the only noticeable difference between [], [0], and > [1] for flexible arrays is that sizeof has different semantics, but > that's usually not a big deal. I think we use sizeof. How is it better than just turning off the warning actually? There are no legitimate cases From fijall at gmail.com Tue Feb 12 16:17:51 2013 From: fijall at gmail.com (Maciej Fijalkowski) Date: Tue, 12 Feb 2013 17:17:51 +0200 Subject: [pypy-dev] Generator leaks Message-ID: Hi pypy-dev (and hi armin :) Quick question - do we make https://bugs.pypy.org/issue1282 a release blocker? As far as I understand this is a chain-of-destructors scenario. Can we do better than wait N gc.collects? If not, can we fix generators? Cheers, fijal From estama at gmail.com Tue Feb 12 19:14:13 2013 From: estama at gmail.com (Eleytherios Stamatogiannakis) Date: Tue, 12 Feb 2013 20:14:13 +0200 Subject: [pypy-dev] Unicode encode/decode speed In-Reply-To: References: <51117AD1.7060609@gmail.com> <51118D06.6000908@gmail.com> <511912C6.5000201@gmail.com> <5119240C.2000209@gmail.com> <51192C17.7060907@gmail.com> <51197D93.2050201@gmail.com> Message-ID: <511A8675.3040001@gmail.com> On 12/02/13 11:04, Maciej Fijalkowski wrote: > > I would like to see some evidence about it. Did you try valgrind? > > Cheers, > fijal > Even better, we wanted to find a way for you to be able to test it by yourselves, so we tried to create a representative synthetic benchmark. Surprisingly when we retested the benchmark that we had previously posted here in this mailing list, we found that the performance profile is very similar to the one slow query that i've talked about in my recent emails. To make it easier i'll repeat the freshened instructions (from the old email) of how to run that benchmark. 
Also attached is the updated (and heavily optimized) MSPW: --repost-- To run it you'll need latest madIS. You can clone it using: hg clone https://code.google.com/p/madis/ For running the test with CPython you'll need: CPython 2.7 + APSW: https://code.google.com/p/apsw/ For PyPy you'll need MPSW renamed to "apsw.py" (the attached MPSW is already renamed to "apsw.py"). Move "apsw.py" to pypy's "site-packages" directory. For MSPW to work in PyPy, you'll also need CFFI and "libsqlite3" installed. To run the test with PyPy: pypy mterm.py < mspw_bench.sql or with CPython python mterm.py < mspw_bench.sql The timings of "mspw_bench" that we get are: CPython 2.7 + APSW: ~ 2.6sec PyPy + MSPW: ~ 4sec There are two ways to adjust the processing load of mspw_bench. One is to change the value in "range(xxxxx)". This will in essence create a bigger/smaller "synthetic text". This puts more pressure on CPython's/pypy's side. The other way is to adjust the window size of textwindow(t, xx, xx). This puts more pressure on the wrapper (APSW/MSPW) because it changes the number of columns that CPython/PyPy have to send to SQLite (they are send one value at a time). --/repost-- Attached you'll find our latest MSPW (renamed to "apsw.py") and "mspw_bench.sql" Also we are looking into adding a special ffi.string_decode_UTF8 in CFFI's backend to reduce the number of calls that are needed to go from utf8_char* to PyPy's unicode. Do you thing that such an addition would be worthwhile? Thank you, lefteris -------------- next part -------------- A non-text attachment was scrubbed... Name: mspw_bench.sql Type: text/x-sql Size: 124 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: apsw.py
Type: text/x-python
Size: 67124 bytes
Desc: not available
URL: 

From tismer at stackless.com Wed Feb 13 00:53:05 2013
From: tismer at stackless.com (Christian Tismer)
Date: Wed, 13 Feb 2013 00:53:05 +0100
Subject: [pypy-dev] efficient string concatenation (yep, from 2004)
Message-ID: <511AD5E1.1060209@stackless.com>

Hi friends,

_efficient string concatenation_ has been a topic in 2004. Armin Rigo
proposed a patch with the name of the subject, more precisely:

[Patches] [ python-Patches-980695 ] efficient string concatenation
on sourceforge.net, on 2004-06-28.

This patch was finally added to Python 2.4 on 2004-11-30.

Some people might remember the larger discussion about whether such a
patch should be accepted at all, because it changes the programming style
for many of us from "don't do that, stupid" to "well, you may do it in
CPython", which has quite some impact on other implementations (is it
fast on Jython, now?). It changed for instance my programming and
teaching style a lot, of course!

But I think nobody but people heavily involved in PyPy expected this:
Now, more than eight years after that patch appeared and made it into
2.4, PyPy (!) still does _not_ have it!

Obviously I was misled by other optimizations, and the fact that this
patch was from a/the major author of PyPy who invented the initial patch
for CPython. That this would be in PyPy as well sooner or later was
without question for me. Wrong... ;-)

Yes, I agree that for PyPy it is much harder to implement without the
refcounting trick, and probably even more difficult in case of the JIT.
But nevertheless, I tried to find any reference to this missing crucial
optimization, with no success after an hour (*). And I guess many other
people are stepping into the same trap.

So I can imagine that PyPy loses some of its speed in many programs,
because Armin's great hack did not make it into PyPy, and this is not
loudly declared somewhere.
I believe the efficiency of string concatenation is something that people assume by default, adding it to the vague CPython compatibility claim if not explicitly told otherwise.

----

Some silly proof, using Python 2.7.3 vs PyPy 1.9:

> $ cat strconc.py
> #!env python
>
> from timeit import default_timer as timer
>
> tim = timer()
>
> s = ''
> for i in xrange(100000):
>     s += 'X'
>
> tim = timer() - tim
>
> print 'time for {} concats = {:0.3f}'.format(len(s), tim)

> $ python strconc.py
> time for 100000 concats = 0.028
> $ pypy strconc.py
> time for 100000 concats = 0.804

Something is needed - a patch for PyPy or for the documentation, I guess. This is not just some unoptimized function in some module; it is used all over the place and became a very common pattern since it was introduced.

How ironic that a foreseen problem occurs _now_, and _there_ :-)

cheers -- chris

(*)
http://pypy.readthedocs.org/en/latest/cpython_differences.html
http://pypy.org/compat.html
http://pypy.org/performance.html

-- Christian Tismer :^) Software Consulting : Have a break! Take a ride on Python's Karl-Liebknecht-Str. 121 : *Starship* http://starship.python.net/ 14482 Potsdam : PGP key -> http://pgp.uni-mainz.de phone +49 173 24 18 776 fax +49 (30) 700143-0023 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? http://www.stackless.com/

From mail at justinbogner.com Wed Feb 13 07:07:14 2013 From: mail at justinbogner.com (Justin Bogner) Date: Tue, 12 Feb 2013 23:07:14 -0700 Subject: [pypy-dev] gcc warnings / errors in translation In-Reply-To: (Maciej Fijalkowski's message of "Tue, 12 Feb 2013 11:07:00 +0200") References: <51187EB0.8060507@gmail.com> <87txpi164s.fsf@justinbogner.com> Message-ID: <87k3qc21d9.fsf@justinbogner.com>

Maciej Fijalkowski writes:
> On Tue, Feb 12, 2013 at 6:57 AM, Justin Bogner wrote:
>> Armin Rigo writes:
>>> Ah, indeed.
We declare most arrays as "itemtype x[1]", so gcc complains
>>> when it can prove we do accesses at an index > 0.
>>
>> Is there a good reason not to use the C99 "itemtype x[]" or even the old
>> GCC extension "itemtype x[0]"? These won't trigger this warning, which
>> means we could leave it on in case a legitimate case crops up.
>>
>> As far as I know, the only noticeable difference between [], [0], and
>> [1] for flexible arrays is that sizeof has different semantics, but
>> that's usually not a big deal.
>
> I think we use sizeof. How is it better than just turning off the
> warning actually? There are no legitimate cases

I'm not sure what you mean when you say there are no legitimate cases. In the cases where we define an array of length 1 it's true that these warnings aren't meaningful, but that's just because an array of length 1 (almost) always means we want a flexible array. On the other hand, for arrays with a meaningful length, I've only ever seen this warning point out legitimate bugs.

Using flexible arrays instead of relying on, as Dennis Ritchie has referred to it, "unwarranted chumminess with the C implementation" makes the intention clearer. It avoids the issue without having to fight with the compiler or turn the warning off. I don't really see a downside, personally. Maybe I'm missing something here.

From fijall at gmail.com Wed Feb 13 08:35:28 2013 From: fijall at gmail.com (Maciej Fijalkowski) Date: Wed, 13 Feb 2013 09:35:28 +0200 Subject: [pypy-dev] efficient string concatenation (yep, from 2004) In-Reply-To: <511AD5E1.1060209@stackless.com> References: <511AD5E1.1060209@stackless.com> Message-ID:

Hi Christian.

We have it, just not enabled by default. --objspace-with-strbuf I think

On Wed, Feb 13, 2013 at 1:53 AM, Christian Tismer wrote:
> Hi friends,
>
> efficient string concatenation has been a topic in 2004.
> Armin Rigo proposed a patch with the name of the subject, > more precisely: > > [Patches] [ python-Patches-980695 ] efficient string concatenation > on sourceforge.net, on 2004-06-28. > > This patch was finally added to Python 2.4 on 2004-11-30. > > Some people might remember the larger discussion if such a patch should be > accepted at all, because it changes the programming style for many of us > from "don't do that, stupid" to "well, you may do it in CPython", which has > quite > some impact on other implementations (is it fast on Jython, now?). > > It changed for instance my programming and teaching style a lot, of course! > > But I think nobody but people heavily involved in PyPy expected this: > > Now, more than eight years after that patch appeared and made it into 2.4, > PyPy (!) still does _not_ have it! > > Obviously I was mislead by other optimizations, and the fact that > this patch was from a/the major author of PyPy who invented the initial > patch for CPython. That this would be in PyPy as well sooner or later was > without question for me. Wrong... ;-) > > Yes, I agree that for PyPy it is much harder to implement without the > refcounting trick, and probably even more difficult in case of the JIT. > > But nevertheless, I tried to find any reference to this missing crucial > optimization, > with no success after an hour (*). > > And I guess many other people are stepping in the same trap. > > So I can imagine that PyPy looses some of its speed in many programs, > because > Armin's great hack did not make it into PyPy, and this is not loudly > declared > somewhere. I believe the efficiency of string concatenation is something > that people assume by default and add it to the vague CPython compatibility > claim, if not explicitly told otherwise. 
> > ---- > > Some silly proof, using python 2.7.3 vs PyPy 1.9: > > $ cat strconc.py > #!env python > > from timeit import default_timer as timer > > tim = timer() > > s = '' > for i in xrange(100000): > s += 'X' > > tim = timer() - tim > > print 'time for {} concats = {:0.3f}'.format(len(s), tim) > > > $ python strconc.py > time for 100000 concats = 0.028 > $ pypy strconc.py > time for 100000 concats = 0.804 > > > Something is needed - a patch for PyPy or for the documentation I guess. > > This is not just some unoptimized function in some module, but it is used > all over the place and became a very common pattern since introduced. > > How ironic that a foreseen problem occurs _now_, and _there_ :-) > > cheers -- chris > > > (*) > http://pypy.readthedocs.org/en/latest/cpython_differences.html > http://pypy.org/compat.html > http://pypy.org/performance.html > > -- > Christian Tismer :^) > Software Consulting : Have a break! Take a ride on Python's > Karl-Liebknecht-Str. 121 : *Starship* http://starship.python.net/ > 14482 Potsdam : PGP key -> http://pgp.uni-mainz.de > phone +49 173 24 18 776 fax +49 (30) 700143-0023 > PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 > whom do you want to sponsor today? http://www.stackless.com/ > > > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > http://mail.python.org/mailman/listinfo/pypy-dev > From regebro at gmail.com Wed Feb 13 08:42:17 2013 From: regebro at gmail.com (Lennart Regebro) Date: Wed, 13 Feb 2013 08:42:17 +0100 Subject: [pypy-dev] [Python-Dev] efficient string concatenation (yep, from 2004) In-Reply-To: <511AD5E1.1060209@stackless.com> References: <511AD5E1.1060209@stackless.com> Message-ID: > Something is needed - a patch for PyPy or for the documentation I guess. Not arguing that it wouldn't be good, but I disagree that it is needed. This is only an issue when you, as in your proof, have a loop that does concatenation. 
This is usually when looping over a list of strings that should be concatenated together. Doing so in a loop with concatenation may be the natural way for people new to Python, but the "natural" way to do it in Python is with a ''.join() call. This:

s = ''.join(('X' for x in xrange(x)))

is more than twice as fast in Python 2.7 as your example. It is in fact also slower in PyPy 1.9 than in Python 2.7, but only by a factor of two:

Python 2.7: time for 10000000 concats = 0.887
PyPy 1.9: time for 10000000 concats = 1.600

(And of course s = 'X' * x takes only about a hundredth of the time, but that's cheating. ;-)

//Lennart

From davidf at sjsoft.com Wed Feb 13 08:53:10 2013 From: davidf at sjsoft.com (David Fraser) Date: Wed, 13 Feb 2013 01:53:10 -0600 (CST) Subject: [pypy-dev] Great experience with PyPy In-Reply-To: Message-ID: <24890242.381.1360741988037.JavaMail.davidf@jackdaw.local>

You may also want to try pg8000; this is a pure-Python driver that works on Windows.

On Thursday, February 7, 2013 at 6:08:51 PM, "????? ???????" wrote:
> Hi! I did not test it on Windows, there may be problems with
> installation (searching for postgres header files, the config is not
> very smart -
> https://github.com/chtd/psycopg2cffi/blob/master/psycopg2cffi/_impl/libpq.py#L209),
> but they should be solvable I hope - submit a bug if you have
> problems.
>
> 2013/2/7 Gelin Yan :
> >
> > On Thu, Feb 7, 2013 at 11:28 PM, ????? ??????? wrote:
> >>
> >> PyPy supports postgres with either psycopg2cffi or psycopg2-ctypes
> >> bindings. We use psycopg2cffi in production (and maintain them), and here
> >> http://chtd.ru/blog/bystraya-rabota-s-postgres-pod-pypy/?lang=en
> >> are some benchmarks.
> >> And yes, PyPy is cool :) Typically giving 3x speedups, and some
> >> memory savings sometimes.
> >> > >> 2013/2/7 Gelin Yan : > >> > > >> > > >> > On Thu, Feb 7, 2013 at 10:11 PM, Phyo Arkar > >> > > >> > wrote: > >> >> > >> >> Pypy should have a page for "Success Stories!" > >> >> > >> >> Now with this and Quora proving Power of PyPy , i am beginning > >> >> to start > >> >> converting my projects into PyPy soon! > >> >> I am only withholding right now because my projects uses a lot > >> >> of C > >> >> Libraries and Numpy/Matplotlib/scilit-learn. > >> >> > >> >> Thanks > >> >> > >> >> Phyo. > >> >> > >> >> On Thursday, February 7, 2013, Maciej Fijalkowski wrote: > >> >>> > >> >>> On Thu, Feb 7, 2013 at 1:55 PM, Marko Tasic > >> >>> > >> >>> wrote: > >> >>> > Hi, > >> >>> > > >> >>> > I would like to share short story with you and share what we > >> >>> > have > >> >>> > accomplished with PyPy and its friends so far. > >> >>> > > >> >>> > Company that I have worked for last 7 months (intentionally > >> >>> > unnamed) > >> >>> > gave me absolute permission to pick up technologies on which > >> >>> > we > >> >>> > based > >> >>> > our solution. What we do is: crawl for PDFs and newspapers > >> >>> > articles, > >> >>> > download, translate them if needed, OCR if needed, do > >> >>> > extensive > >> >>> > analysis of downloaded PDFs and articles, store them in more > >> >>> > organized > >> >>> > structures for faster querying, search for them and generate > >> >>> > bunch > >> >>> > of > >> >>> > complex reports. > >> >>> > > >> >>> > From very beginning I decided to go with PyPy no matter > >> >>> > what. What > >> >>> > we > >> >>> > picked is following: > >> >>> > * Flask for web framework, and few of its extensions such as > >> >>> > Flask-Login, Flask-Principal, Flask-WTF, Flask-Mail, etc. > >> >>> > * Cassandra as database because of its features and great > >> >>> > experience > >> >>> > with it. PyCassa is used as client to talk to Cassandra > >> >>> > server. 
> >> >>> > * ElasticSearch as distributed search engine, and its client > >> >>> > library > >> >>> > pyes. > >> >>> > * Whoosh as search engine, but with some modifications to > >> >>> > support > >> >>> > Cassandra as storage and distributed locking. > >> >>> > * Redis, and its client library redis-py, for caching and to > >> >>> > speed > >> >>> > up > >> >>> > common auto-completion patterns. > >> >>> > * ZooKeeper, and its client library Kazoo, for distributed > >> >>> > locking > >> >>> > which plays essential role in system for transaction-like > >> >>> > behavior > >> >>> > over many services at once. > >> >>> > * Celery in conjunction with RabbitMQ for task distribution. > >> >>> > * Sentry for error logging. > >> >>> > > >> >>> > What we have developed on our own are wrappers and clients > >> >>> > for: > >> >>> > * Moses which is language translator > >> >>> > * Tesseract which is OCR engine > >> >>> > * Cassandra store for Whoosh > >> >>> > * wkhtmltopdf and wkhtmltoimage which are used for > >> >>> > conversion of > >> >>> > HTML > >> >>> > to PDF/Image > >> >>> > * etc > >> >>> > > >> >>> > Now when product is finished and in final testing phase, I > >> >>> > can say > >> >>> > that we did not regret because we used PyPy and stack around > >> >>> > it. > >> >>> > Typical speed improvement is 2x-3x over CPython in our case, > >> >>> > but > >> >>> > anyway we are mostly IO and memory bound, expect for Celery > >> >>> > workers > >> >>> > where we do analysis which are again many small CPU > >> >>> > intensive tasks > >> >>> > that are exchanged via RabbitMQ. Another reason why we don't > >> >>> > see > >> >>> > speedup us is that we are dependent on external software > >> >>> > (servers) > >> >>> > written in Erlang and Java. 
> >> >>> > > >> >>> > I'm already planing to do Cassandra (distributed key/value > >> >>> > only > >> >>> > database without index features), ZooKeeper, Redis and > >> >>> > ElasticSearch > >> >>> > ports in Python for next projects, and hopefully opensource > >> >>> > them. > >> >>> > > >> >>> > Regards, > >> >>> > Marko Tasic > >> >>> > _______________________________________________ > >> >>> > pypy-dev mailing list > >> >>> > pypy-dev at python.org > >> >>> > http://mail.python.org/mailman/listinfo/pypy-dev > >> >>> > >> >>> Awesome! > >> >>> > >> >>> I'm glad people can make pypy work for non-trivial tasks which > >> >>> require > >> >>> a lot of dependencies. We're trying to lower the bar, however > >> >>> it takes > >> >>> time. > >> >>> > >> >>> Cheers, > >> >>> fijal > >> >>> _______________________________________________ > >> >>> pypy-dev mailing list > >> >>> pypy-dev at python.org > >> >>> http://mail.python.org/mailman/listinfo/pypy-dev > >> >> > >> >> > >> >> _______________________________________________ > >> >> pypy-dev mailing list > >> >> pypy-dev at python.org > >> >> http://mail.python.org/mailman/listinfo/pypy-dev > >> >> > >> > > >> > > >> > Hi, It might be off topic. I want to know whether pypy support > >> > postgres. > >> > The > >> > last time I noticed ctypes based psycopg2 was still beta. I > >> > mainly use > >> > twisted & postgres. pypy supports twisted well but not good for > >> > psycopg2. > >> > > >> > Regards > >> > > >> > gelin yan > >> > > >> > _______________________________________________ > >> > pypy-dev mailing list > >> > pypy-dev at python.org > >> > http://mail.python.org/mailman/listinfo/pypy-dev > >> > > >> > >> > >> > >> -- > >> ?????????? ???????, ??????????? > >> ???????? ??? -- http://chtd.ru > >> +7 (495) 646-87-45, ?????????? 333 > > > > > > Hi > > > > Glad to hear that. I will give it a try. By the way, Can i use > > it on > > windows? It looks like cffi support windows. 
> > > > Regards > > > > gelin yan > > > > _______________________________________________ > > pypy-dev mailing list > > pypy-dev at python.org > > http://mail.python.org/mailman/listinfo/pypy-dev > > > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > http://mail.python.org/mailman/listinfo/pypy-dev > From fijall at gmail.com Wed Feb 13 09:08:11 2013 From: fijall at gmail.com (Maciej Fijalkowski) Date: Wed, 13 Feb 2013 10:08:11 +0200 Subject: [pypy-dev] Unicode encode/decode speed In-Reply-To: <511A8675.3040001@gmail.com> References: <51117AD1.7060609@gmail.com> <51118D06.6000908@gmail.com> <511912C6.5000201@gmail.com> <5119240C.2000209@gmail.com> <51192C17.7060907@gmail.com> <51197D93.2050201@gmail.com> <511A8675.3040001@gmail.com> Message-ID: On Tue, Feb 12, 2013 at 8:14 PM, Eleytherios Stamatogiannakis wrote: > On 12/02/13 11:04, Maciej Fijalkowski wrote: >> >> >> I would like to see some evidence about it. Did you try valgrind? >> >> Cheers, >> fijal >> > > Even better, we wanted to find a way for you to be able to test it by > yourselves, so we tried to create a representative synthetic benchmark. > > Surprisingly when we retested the benchmark that we had previously posted > here in this mailing list, we found that the performance profile is very > similar to the one slow query that i've talked about in my recent emails. > > To make it easier i'll repeat the freshened instructions (from the old > email) of how to run that benchmark. Also attached is the updated (and > heavily optimized) MSPW: > > --repost-- > > To run it you'll need latest madIS. You can clone it using: > > hg clone https://code.google.com/p/madis/ > > For running the test with CPython you'll need: > > CPython 2.7 + APSW: > > https://code.google.com/p/apsw/ > > For PyPy you'll need MPSW renamed to "apsw.py" (the attached MPSW is already > renamed to "apsw.py"). > Move "apsw.py" to pypy's "site-packages" directory. 
For MSPW to work in > PyPy, you'll also need CFFI and "libsqlite3" installed. > > To run the test with PyPy: > > pypy mterm.py < mspw_bench.sql > > or with CPython > > python mterm.py < mspw_bench.sql > > The timings of "mspw_bench" that we get are: > > CPython 2.7 + APSW: ~ 2.6sec > PyPy + MSPW: ~ 4sec > > There are two ways to adjust the processing load of mspw_bench. > > One is to change the value in "range(xxxxx)". This will in essence create a > bigger/smaller "synthetic text". This puts more pressure on CPython's/pypy's > side. > > The other way is to adjust the window size of textwindow(t, xx, xx). This > puts more pressure on the wrapper (APSW/MSPW) because it changes the number > of columns that CPython/PyPy have to send to SQLite (they are send one value > at a time). > > --/repost-- > > Attached you'll find our latest MSPW (renamed to "apsw.py") and > "mspw_bench.sql" > > Also we are looking into adding a special ffi.string_decode_UTF8 in CFFI's > backend to reduce the number of calls that are needed to go from utf8_char* > to PyPy's unicode. > > Do you thing that such an addition would be worthwhile? > > Thank you, > > lefteris > > > > > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > http://mail.python.org/mailman/listinfo/pypy-dev > Hey I have serious trouble running apsw. 
Message I got so far: /home/fijal/.virtualenvs/cffi/local/lib/python2.7/site-packages/apsw.so: undefined symbol: sqlite3_uri_boolean From estama at gmail.com Wed Feb 13 12:39:03 2013 From: estama at gmail.com (Eleytherios Stamatogiannakis) Date: Wed, 13 Feb 2013 13:39:03 +0200 Subject: [pypy-dev] Unicode encode/decode speed In-Reply-To: References: <51117AD1.7060609@gmail.com> <51118D06.6000908@gmail.com> <511912C6.5000201@gmail.com> <5119240C.2000209@gmail.com> <51192C17.7060907@gmail.com> <51197D93.2050201@gmail.com> <511A8675.3040001@gmail.com> Message-ID: <511B7B57.7060102@gmail.com> On 13/02/13 10:08, Maciej Fijalkowski wrote: > Hey > > I have serious trouble running apsw. Message I got so far: > > /home/fijal/.virtualenvs/cffi/local/lib/python2.7/site-packages/apsw.so: > undefined symbol: sqlite3_uri_boolean > Thanks Maciej for looking into it, Which version of APSW have you tried to install and how? Debian/Ubuntu based systems, include APSW in package: python-apsw You can also try to install APSW yourself. Instructions: http://apidoc.apsw.googlecode.com/hg/build.html#recommended and (some more details from madIS' docs) http://doc.madis.googlecode.com/hg/install.html l. From tismer at stackless.com Wed Feb 13 13:06:27 2013 From: tismer at stackless.com (Christian Tismer) Date: Wed, 13 Feb 2013 13:06:27 +0100 Subject: [pypy-dev] [Python-Dev] efficient string concatenation (yep, from 2004) In-Reply-To: References: <511AD5E1.1060209@stackless.com> Message-ID: <511B81C3.9030905@stackless.com> On 13.02.13 08:42, Lennart Regebro wrote: >> Something is needed - a patch for PyPy or for the documentation I guess. > Not arguing that it wouldn't be good, but I disagree that it is needed. > > This is only an issue when you, as in your proof, have a loop that > does concatenation. This is usually when looping over a list of > strings that should be concatenated together. 
Doing so in a loop with
> concatenation may be the natural way for people new to Python, but the
> "natural" way to do it in Python is with a ''.join() call.
>
> This:
>
> s = ''.join(('X' for x in xrange(x)))
>
> Is more than twice as fast in Python 2.7 than your example. It is in
> fact also slower in PyPy 1.9 than Python 2.7, but only with a factor
> of two:
>
> Python 2.7:
> time for 10000000 concats = 0.887
> Pypy 1.9:
> time for 10000000 concats = 1.600
>
> (And of course s = 'X'* x takes only a bout a hundredth of the time,
> but that's cheating. ;-)

This is not about how to write efficient concatenation, and it is not a problem for me personally. It is also not about a constant factor, which I only really care about in situations where speed matters.

This is about a possible algorithmic trap: code written for CPython may behave well, with roughly O(n) behavior, and by switching to PyPy you get a surprise when the same code suddenly shows O(n**2) behavior. Such runtime explosions can damage trust in PyPy, especially with code sitting in some module which you did not even write yourself but "pip install"-ed. So this is important to know, especially for newcomers, and for the people who give advice to them.

For algorithmic compatibility, there should no longer be a feature with such a drastic side effect if it cannot be supported by all other dialects. To avoid such hidden traps in larger code bases, documentation is needed that clearly gives a warning saying "don't do that", just as CS students learn for most other languages.

cheers - chris

-- Christian Tismer :^) Software Consulting : Have a break! Take a ride on Python's Karl-Liebknecht-Str. 121 : *Starship* http://starship.python.net/ 14482 Potsdam : PGP key -> http://pgp.uni-mainz.de phone +49 173 24 18 776 fax +49 (30) 700143-0023 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today?
http://www.stackless.com/ From steve at pearwood.info Wed Feb 13 13:10:26 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Wed, 13 Feb 2013 23:10:26 +1100 Subject: [pypy-dev] [Python-Dev] efficient string concatenation (yep, from 2004) In-Reply-To: <511AD5E1.1060209@stackless.com> References: <511AD5E1.1060209@stackless.com> Message-ID: <511B82B2.2000505@pearwood.info> On 13/02/13 10:53, Christian Tismer wrote: > Hi friends, > > _efficient string concatenation_ has been a topic in 2004. > Armin Rigo proposed a patch with the name of the subject, > more precisely: > > /[Patches] [ python-Patches-980695 ] efficient string concatenation// > //on sourceforge.net, on 2004-06-28.// > / > This patch was finally added to Python 2.4 on 2004-11-30. > > Some people might remember the larger discussion if such a patch should be > accepted at all, because it changes the programming style for many of us > from "don't do that, stupid" to "well, you may do it in CPython", which has quite > some impact on other implementations (is it fast on Jython, now?). I disagree. If you look at the archives on the python-list@ and tutor at python.org mailing lists, you will see that whenever string concatenation comes up, the common advice given is to use join. The documentation for strings is also clear that you should not rely on this optimization: http://docs.python.org/2/library/stdtypes.html#typesseq And quadratic performance for repeated concatenation is not unique to Python: it applies to pretty much any language with immutable strings, including Java, C++, Lua and Javascript. > It changed for instance my programming and teaching style a lot, of course! Why do you say, "Of course"? It should not have changed anything. 
Best practice remains the same: - we should still use join for repeated concatenations; - we should still avoid + except for small cases which are not performance critical; - we should still teach beginners to use join; - while this optimization is nice to have, we cannot rely on it being there when it matters. It's not just Jython and IronPython that can't make use of this optimization. It can, and does, fail on CPython as well, as it is sensitive to memory allocation details. See for example: http://utcc.utoronto.ca/~cks/space/blog/python/ExaminingStringConcatOpt and here for a cautionary tale about what can happen when the optimization fails under CPython: http://mail.python.org/pipermail/python-dev/2009-August/091125.html -- Steven From ncoghlan at gmail.com Wed Feb 13 15:44:09 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 14 Feb 2013 00:44:09 +1000 Subject: [pypy-dev] [Python-Dev] efficient string concatenation (yep, from 2004) In-Reply-To: <511B81C3.9030905@stackless.com> References: <511AD5E1.1060209@stackless.com> <511B81C3.9030905@stackless.com> Message-ID: On Wed, Feb 13, 2013 at 10:06 PM, Christian Tismer wrote: > To avoid such hidden traps in larger code bases, documentation is > needed that clearly gives a warning saying "don't do that", like CS > students learn for most other languages. How much more explicit do you want us to be? """6. CPython implementation detail: If s and t are both strings, some Python implementations such as CPython can usually perform an in-place optimization for assignments of the form s = s + t or s += t. When applicable, this optimization makes quadratic run-time much less likely. This optimization is both version and implementation dependent. 
For performance sensitive code, it is preferable to use the str.join() method which assures consistent linear concatenation performance across versions and implementations.""" from http://docs.python.org/2/library/stdtypes.html#typesseq So please don't blame us for people not reading a warning that is already there. Since my rewrite of the sequence docs, Python 3 doesn't even acknowledge the hack's existence and is quite explicit about what you need to do to get reliably linear behaviour: """6. Concatenating immutable sequences always results in a new object. This means that building up a sequence by repeated concatenation will have a quadratic runtime cost in the total sequence length. To get a linear runtime cost, you must switch to one of the alternatives below: if concatenating str objects, you can build a list and use str.join() at the end or else write to a io.StringIO instance and retrieve its value when complete if concatenating bytes objects, you can similarly use bytes.join() or io.BytesIO, or you can do in-place concatenation with a bytearray object. bytearray objects are mutable and have an efficient overallocation mechanism if concatenating tuple objects, extend a list instead for other types, investigate the relevant class documentation""" from http://docs.python.org/3/library/stdtypes.html#common-sequence-operations Deliberately *relying* on the += hack to avoid quadratic runtime is just plain wrong, and our documentation already says so. If anyone really thinks it will help, I can add a CPython implementation note back in to the Python 3 docs as well, pointing out that CPython performance measurements may hide broken algorithmic complexity related to string concatenation, but the corresponding note in Python 2 doesn't seem to have done much good :P Regards, Nick. 
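The alternatives listed in the docs excerpt Nick quotes can be sketched side by side. This is an editorial illustration rather than code from the thread, written in Python 3 style for brevity (the thread's own examples use Python 2's xrange): all four functions build the same string, but only the first relies on the CPython in-place hack to avoid quadratic copying, while the other three have roughly linear cost on any implementation.

```python
import io

def concat_plus(n):
    # relies on the CPython refcount hack to stay fast;
    # quadratic on implementations without it
    s = ''
    for _ in range(n):
        s += 'X'
    return s

def concat_join(n):
    # linear everywhere: collect the pieces, join once at the end
    return ''.join('X' for _ in range(n))

def concat_stringio(n):
    # linear everywhere: write into a growing buffer, read out once
    buf = io.StringIO()
    for _ in range(n):
        buf.write('X')
    return buf.getvalue()

def concat_bytearray(n):
    # linear everywhere: bytearray overallocates like a list
    buf = bytearray()
    for _ in range(n):
        buf += b'X'
    return buf.decode('ascii')

assert (concat_plus(1000) == concat_join(1000)
        == concat_stringio(1000) == concat_bytearray(1000))
```

On an implementation without the hack, each += in the first variant copies the whole string built so far, which is the quadratic behaviour Christian's 100000-concat demo runs into on PyPy 1.9.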
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From marius at pov.lt Wed Feb 13 16:22:00 2013 From: marius at pov.lt (Marius Gedminas) Date: Wed, 13 Feb 2013 17:22:00 +0200 Subject: [pypy-dev] [Python-Dev] efficient string concatenation (yep, from 2004) In-Reply-To: <511B82B2.2000505@pearwood.info> References: <511AD5E1.1060209@stackless.com> <511B82B2.2000505@pearwood.info> Message-ID: <20130213152200.GA19918@fridge.pov.lt> On Wed, Feb 13, 2013 at 11:10:26PM +1100, Steven D'Aprano wrote: > Best practice remains the same: > > - we should still use join for repeated concatenations; > > - we should still avoid + except for small cases which are not performance critical; > > - we should still teach beginners to use join; > > - while this optimization is nice to have, we cannot rely on it being there > when it matters. > > It's not just Jython and IronPython that can't make use of this optimization. It > can, and does, fail on CPython as well, as it is sensitive to memory > allocation details. See for example: > > http://utcc.utoronto.ca/~cks/space/blog/python/ExaminingStringConcatOpt > > and here for a cautionary tale about what can happen when the optimization fails > under CPython: > > http://mail.python.org/pipermail/python-dev/2009-August/091125.html Is that the right link? This thread doesn't mention +=, the bug mentioned in the first email doesn't mention +=, and the fix mentioned for that bug appears to be "let's not pass 0 as the buffer size of sock.makefile()". Did I skim too much? Marius Gedminas -- And yes, you'd be insane to consider C++ for a new project in 2007. -- Thomas Ptacek -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 190 bytes Desc: Digital signature URL: From taavi.burns at gmail.com Wed Feb 13 16:56:49 2013 From: taavi.burns at gmail.com (Taavi Burns) Date: Wed, 13 Feb 2013 10:56:49 -0500 Subject: [pypy-dev] Helping with STM at the PyCon 2013 (Santa Clara) sprints Message-ID: >From a recent email of Armin's to the list: > The STM project progressed slowly during the last few months. The > status right now is: > > * Most importantly, missing major Garbage Collection cycles, which > means pypy-stm slowly but constantly leaks memory. > > * The JIT integration is not finished; so far pypy-stm can only be > compiled without the JIT. > > * There are also other places where the performance can be improved, > probably a lot. > > * Finally there are a number of usability concerns that we (or mostly > Remi) worked on recently. The main issues turn around the idea that, > as a user of pypy-stm, you should have a way to get freeback on the > process. For example right now, transactions that abort are > completely transparent --- to the point that you don't have any way to > know that it occurred, apart from "it runs too slowly" if it occurs a > lot. You should have a way to get Python tracebacks of aborts if you > want to. A similar issue is "inevitable" transactions. I'm interested in helping with STM: 1) I think STM is really interesting, particularly Armin's take on it 2) I need a "Personal Development Goal" for %(dayjob)s. Last year it was just "Contribute to PyPy", which I did at the sprints (a bit). This year, I'd like to try something a bit more ambitious. ;) >From the list above, are there any particular areas (tickets?) that would be a good starting place for me to look at? I expect that to get the most out of the sprints, I should do a bit of pre-work (reading at least, if not poking). Thanks! 
-- taa /*eof*/

From arigo at tunes.org Wed Feb 13 17:05:55 2013 From: arigo at tunes.org (Armin Rigo) Date: Wed, 13 Feb 2013 17:05:55 +0100 Subject: [pypy-dev] [Python-Dev] efficient string concatenation (yep, from 2004) In-Reply-To: <20130213152200.GA19918@fridge.pov.lt> References: <511AD5E1.1060209@stackless.com> <511B82B2.2000505@pearwood.info> <20130213152200.GA19918@fridge.pov.lt> Message-ID:

Hi,

This was already discussed on pypy-dev a few times, like in 2011 (http://mail.python.org/pipermail//pypy-dev/2011-August/008068.html). Google finds more (site:mail.python.org pypy-dev string concatenation).

A bientôt,

Armin.

From arigo at tunes.org Wed Feb 13 17:16:36 2013 From: arigo at tunes.org (Armin Rigo) Date: Wed, 13 Feb 2013 17:16:36 +0100 Subject: [pypy-dev] Helping with STM at the PyCon 2013 (Santa Clara) sprints In-Reply-To: References: Message-ID:

Hi Taavi,

On Wed, Feb 13, 2013 at 4:56 PM, Taavi Burns wrote:
> From the list above, are there any particular areas (tickets?) that
> would be a good starting place for me to look at?

I can't just give you a specific task to do, but you can try to understand what is here so far. Look at the branch "stm-thread-2" on the pypy repository; e.g. try to translate with "rpython -O2 --stm targetpypystandalone". This gives you a kind-of-GIL-less PyPy. Try to use the transaction module ("import transaction") on some demo programs. Then I suppose you should dive into the mess that is multithreaded programming by looking in depth at lib_pypy/transaction.py. And this is all before diving into the PyPy sources themselves...

You may also look at the work done by Remi Meier on his own separate repository (https://bitbucket.org/Raemi/pypy-stm-logging). It mostly contains playing around with various ideas that have not (or not yet) been integrated back.

A bientôt,

Armin.
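Armin's pointer to "try to use the transaction module on some demo programs" can be fleshed out into a minimal sketch. The add()/run() interface below reflects what lib_pypy/transaction.py exposed around this time (transaction.add(f, *args) schedules a call, transaction.run() executes all pending calls as transactions), but treat that interface as an assumption to verify against the stm-thread-2 branch; the fallback block only mimics it sequentially so the sketch runs on any Python.

```python
results = []

try:
    from transaction import add, run  # the real module on a pypy-stm build
except ImportError:
    # Sequential stand-in with the same two-function interface, so the
    # sketch also runs on plain CPython.  On pypy-stm, run() executes the
    # pending calls as transactions, possibly in parallel, with results
    # equivalent to running them in *some* sequential order.
    _pending = []

    def add(f, *args):
        _pending.append((f, args))

    def run():
        while _pending:
            f, args = _pending.pop(0)
            f(*args)

def work(i):
    # each scheduled call behaves as one atomic transaction
    results.append(i * i)

for i in range(5):
    add(work, i)
run()

# the commit order may vary under pypy-stm, hence the sort
assert sorted(results) == [0, 1, 4, 9, 16]
```

The point of the API is that correctness does not depend on which interleaving actually happens: the program may only assert things that hold for every sequential ordering of the scheduled transactions.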
From fijall at gmail.com Wed Feb 13 17:18:18 2013 From: fijall at gmail.com (Maciej Fijalkowski) Date: Wed, 13 Feb 2013 18:18:18 +0200 Subject: [pypy-dev] Unicode encode/decode speed In-Reply-To: <511B7B57.7060102@gmail.com> References: <51117AD1.7060609@gmail.com> <51118D06.6000908@gmail.com> <511912C6.5000201@gmail.com> <5119240C.2000209@gmail.com> <51192C17.7060907@gmail.com> <51197D93.2050201@gmail.com> <511A8675.3040001@gmail.com> <511B7B57.7060102@gmail.com> Message-ID: On Wed, Feb 13, 2013 at 1:39 PM, Eleytherios Stamatogiannakis wrote: > On 13/02/13 10:08, Maciej Fijalkowski wrote: >> >> Hey >> >> I have serious trouble running apsw. Message I got so far: >> >> /home/fijal/.virtualenvs/cffi/local/lib/python2.7/site-packages/apsw.so: >> undefined symbol: sqlite3_uri_boolean >> > > Thanks Maciej for looking into it, > > Which version of APSW have you tried to install and how? the recent version (exactly the command you pasted) > > Debian/Ubuntu based systems, include APSW in package: > > python-apsw > > You can also try to install APSW yourself. Instructions: > > http://apidoc.apsw.googlecode.com/hg/build.html#recommended > > and (some more details from madIS' docs) > > http://doc.madis.googlecode.com/hg/install.html > > l. From taavi.burns at gmail.com Thu Feb 14 00:20:47 2013 From: taavi.burns at gmail.com (Taavi Burns) Date: Wed, 13 Feb 2013 18:20:47 -0500 Subject: [pypy-dev] Helping with STM at the PyCon 2013 (Santa Clara) sprints In-Reply-To: References: Message-ID: That sounds like a reasonable place to start, thanks! I tried running the translation, and immediately hit what looks like a failure from merging in default, due to the pypy/rpython move. I've got a patch currently pushing to bitbucket, but it'll be a few minutes (pushing ~10 months of pypy dev effort). It'll be at https://bitbucket.org/taavi_burns/pypy/commits/0378c78cc316 when the push finishes. 
:) The translation still eventually fails, though: [translation:ERROR] File "../../rpython/translator/stm/jitdriver.py", line 86, in check_jitdriver [translation:ERROR] assert not jitdriver.autoreds # XXX [translation:ERROR] AssertionError Full stack and software versions: https://gist.github.com/taavi/4949322 Any ideas? Thanks! On Wed, Feb 13, 2013 at 11:16 AM, Armin Rigo wrote: > Hi Taavi, > > On Wed, Feb 13, 2013 at 4:56 PM, Taavi Burns wrote: >> From the list above, are there any particular areas (tickets?) that >> would be a good starting place for me to look at? > > I can't just give you a specific task to do, but you can try to > understand what is here so far. Look at the branch "stm-thread-2" on > the pypy repository; e.g. try to translate with "rpython -O2 --stm > targetpypystandalone". This gives you a kind-of-GIL-less PyPy. Try > to use the transaction module ("import transaction") on some demo > programs. Then I suppose you should dive into the mess that is > multithreaded programming by looking in depth at > lib_pypy/transaction.py. And this is all before diving into the PyPy > sources themselves... > > You may also look at the work done by Remi Meier on his own separate > repository (https://bitbucket.org/Raemi/pypy-stm-logging). It > contains mostly playing around with various ideas that haven't been > integrated back, or not yet. > > > A bient?t, > > Armin. -- taa /*eof*/ From tismer at stackless.com Thu Feb 14 00:49:19 2013 From: tismer at stackless.com (Christian Tismer) Date: Thu, 14 Feb 2013 00:49:19 +0100 Subject: [pypy-dev] [Python-Dev] efficient string concatenation (yep, from 2004) In-Reply-To: References: <511AD5E1.1060209@stackless.com> Message-ID: <0936978B-D261-4C90-ABD5-13F7BD5668DE@stackless.com> Hi Lennart, Sent from my Ei4Steve On Feb 13, 2013, at 8:42, Lennart Regebro wrote: >> Something is needed - a patch for PyPy or for the documentation I guess. > > Not arguing that it wouldn't be good, but I disagree that it is needed. 
> > This is only an issue when you, as in your proof, have a loop that > does concatenation. This is usually when looping over a list of > strings that should be concatenated together. Doing so in a loop with > concatenation may be the natural way for people new to Python, but the > "natural" way to do it in Python is with a ''.join() call. > > This: > > s = ''.join(('X' for x in xrange(x))) > > Is more than twice as fast in Python 2.7 than your example. It is in > fact also slower in PyPy 1.9 than Python 2.7, but only with a factor > of two: > > Python 2.7: > time for 10000000 concats = 0.887 > Pypy 1.9: > time for 10000000 concats = 1.600 > > (And of course s = 'X'* x takes only about a hundredth of the time, > but that's cheating. ;-) > > //Lennart This all does not really concern me, as long as it roughly has the same order of magnitude, or better the same big Oh. I'm not concerned by a constant factor. I'm concerned by a freezing machine that suddenly gets 10000 times slower because the algorithms never explicitly state their algorithmic complexity. (I think I said this too often today?) As a side note: Something similar happened to me when somebody used "range" in Python 3.3. He ran the same code on Python 2.7, with a crazy effect of having to re-boot: range() on 2.7 with arguments from some arbitrary input file. A newbie error that was hard to understand, because he was taught to think 'xrange' when writing 'range'. Hard for me to understand because I am no longer able to make these errors at all, or even expect them.
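The range-vs-xrange trap described above is a memory-complexity problem: Python 2's range(n) builds the whole list up front, while xrange(n) — like Python 3's range — is a constant-size lazy object. A quick way to quantify the difference, written here for Python 3:

```python
import sys

n = 10**6
lazy = range(n)           # Python 3 range (or Python 2 xrange): a small object
eager = list(range(n))    # what Python 2's range(n) materialized eagerly

print(sys.getsizeof(lazy))            # a few dozen bytes, independent of n
print(sys.getsizeof(eager) > 10**6)   # True: the list alone is megabytes
```

With n read from "some arbitrary input file", the eager version's memory use is unbounded — which is exactly the freezing-machine failure mode, while the lazy version stays O(1) no matter what n is.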
Cheers - Chris From steve at pearwood.info Thu Feb 14 00:59:37 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 14 Feb 2013 10:59:37 +1100 Subject: [pypy-dev] [Python-Dev] efficient string concatenation (yep, from 2004) In-Reply-To: <20130213152200.GA19918@fridge.pov.lt> References: <511AD5E1.1060209@stackless.com> <511B82B2.2000505@pearwood.info> <20130213152200.GA19918@fridge.pov.lt> Message-ID: <511C28E9.9030707@pearwood.info> On 14/02/13 02:22, Marius Gedminas wrote: > On Wed, Feb 13, 2013 at 11:10:26PM +1100, Steven D'Aprano wrote: >> Best practice remains the same: >> >> - we should still use join for repeated concatenations; >> >> - we should still avoid + except for small cases which are not performance critical; >> >> - we should still teach beginners to use join; >> >> - while this optimization is nice to have, we cannot rely on it being there >> when it matters. >> >> It's not just Jython and IronPython that can't make use of this optimization. It >> can, and does, fail on CPython as well, as it is sensitive to memory >> allocation details. See for example: >> >> http://utcc.utoronto.ca/~cks/space/blog/python/ExaminingStringConcatOpt >> >> and here for a cautionary tale about what can happen when the optimization fails >> under CPython: >> >> http://mail.python.org/pipermail/python-dev/2009-August/091125.html > > Is that the right link? This thread doesn't mention +=, the bug > mentioned in the first email doesn't mention +=, and the fix mentioned > for that bug appears to be "let's not pass 0 as the buffer size of > sock.makefile()". > > Did I skim too much? Yes you skimmed too much :-) The point is that use of += can cause *platform specific* O(N**2) performance in things which don't obviously involve string concatenation. It would be nice if real world bugs were as simple as "I concatenate a lot of strings with += and it's slow, how do I fix it?" 
but in this case the bug report was that httplib was an order of magnitude or more slower on Windows than Linux. The thread continues into the following month. Here's the first pointer to the problem: http://mail.python.org/pipermail/python-dev/2009-September/091582.html and a note that platform differences in realloc matter: http://mail.python.org/pipermail/python-dev/2009-September/091583.html and solution: http://mail.python.org/pipermail/python-dev/2009-September/091588.html -- Steven From steve at pearwood.info Thu Feb 14 01:35:34 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 14 Feb 2013 11:35:34 +1100 Subject: [pypy-dev] [Python-Dev] efficient string concatenation (yep, from 2004) In-Reply-To: References: <511AD5E1.1060209@stackless.com> <511B81C3.9030905@stackless.com> Message-ID: <511C3156.8060902@pearwood.info> On 14/02/13 01:44, Nick Coghlan wrote: > Deliberately *relying* on the += hack to avoid quadratic runtime is > just plain wrong, and our documentation already says so. +1 I'm not sure that there's any evidence that people in general are *relying* on the += hack. More likely they write the first code they think of, which is +=, and never considered the consequences or test it thoroughly. Even if they test it, they only test it on one version of one implementation on one platform, and likely only with small N. Besides, if you know that N will always be small, then using += is not wrong. I think we have a tendency to sometimes overreact in cases like this. I don't think we need to do any more than we are already doing: the tutor@ and python-list@ mailing lists already try to educate users to use join, the docs recommend to use join, the Internet is filled with code that correctly uses join. What more can we do? I see no evidence that the Python community is awash with coders who write code with atrocious performance characteristics, or at least no more than any other language. 
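The quadratic behaviour the whole thread turns on can be seen without timing anything: when the in-place optimization does not apply, each s = s + 'X' copies the entire accumulated string, so the total work grows like N**2/2, whereas ''.join copies every character exactly once. A small counting model of those copy costs (not a benchmark, and ignoring constant factors):

```python
# Count the characters copied by N repeated concatenations versus one join.
def copied_chars_concat(n):
    copied = 0
    length = 0
    for _ in range(n):
        length += 1
        copied += length   # s = s + 'X' copies the len(s)+1 result chars
    return copied

def copied_chars_join(n):
    return n               # ''.join writes each character once

print(copied_chars_concat(1000))  # 500500 -- grows like n**2/2
print(copied_chars_join(1000))    # 1000   -- grows linearly
```

This is why the advice in the thread is implementation-independent: the += hack can hide the quadratic term on one build of one interpreter, but join's linear cost holds everywhere.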
-- Steven From fijall at gmail.com Thu Feb 14 17:04:46 2013 From: fijall at gmail.com (Maciej Fijalkowski) Date: Thu, 14 Feb 2013 18:04:46 +0200 Subject: [pypy-dev] [pypy-commit] pypy jitframe-on-heap: start fixing call_assembler for ARM In-Reply-To: <20130214154412.6EE461C00BD@cobra.cs.uni-duesseldorf.de> References: <20130214154412.6EE461C00BD@cobra.cs.uni-duesseldorf.de> Message-ID: Hi David. I started working on this, but also this is again a direct copy of x86 code. I understand where it comes from but please refrain from doing it in the future. I will work out the call_assembler and friends, but I'll backout your commit. On Thu, Feb 14, 2013 at 5:44 PM, bivab wrote: > Author: David Schneider > Branch: jitframe-on-heap > Changeset: r61238:1dd0aa6c631a > Date: 2013-02-14 16:43 +0100 > http://bitbucket.org/pypy/pypy/changeset/1dd0aa6c631a/ > > Log: start fixing call_assembler for ARM > > floats do not work correctly yet > > diff --git a/rpython/jit/backend/arm/opassembler.py b/rpython/jit/backend/arm/opassembler.py > --- a/rpython/jit/backend/arm/opassembler.py > +++ b/rpython/jit/backend/arm/opassembler.py > @@ -1104,59 +1104,52 @@ > # XXX Split into some helper methods > def emit_guard_call_assembler(self, op, guard_op, arglocs, regalloc, > fcond): > - tmploc = arglocs[1] > - resloc = arglocs[2] > - callargs = arglocs[3:] > - > self._store_force_index(guard_op) > descr = op.getdescr() > assert isinstance(descr, JitCellToken) > - # check value > - assert tmploc is r.r0 > - self._emit_call(imm(descr._ll_function_addr), > - callargs, fcond, resloc=tmploc) > + if len(arglocs) == 4: > + [frame_loc, vloc, result_loc, argloc] = arglocs > + else: > + [frame_loc, result_loc, argloc] = arglocs > + vloc = imm(0) > + > + # > + # Write a call to the target assembler > + # we need to allocate the frame, keep in sync with runner's > + # execute_token > + jd = descr.outermost_jitdriver_sd > + base_ofs = self.cpu.get_baseofs_of_frame_field() > + 
self._emit_call(imm(descr._ll_function_addr), [argloc], fcond) > if op.result is None: > - value = self.cpu.done_with_this_frame_void_v > + assert result_loc is None > + value = self.cpu.done_with_this_frame_descr_void > else: > kind = op.result.type > if kind == INT: > - value = self.cpu.done_with_this_frame_int_v > + assert result_loc is r.r0 > + value = self.cpu.done_with_this_frame_descr_int > elif kind == REF: > - value = self.cpu.done_with_this_frame_ref_v > + assert result_loc is r.r0 > + value = self.cpu.done_with_this_frame_descr_ref > elif kind == FLOAT: > - value = self.cpu.done_with_this_frame_float_v > + value = self.cpu.done_with_this_frame_descr_float > else: > raise AssertionError(kind) > - from rpython.jit.backend.llsupport.descr import unpack_fielddescr > - from rpython.jit.backend.llsupport.descr import unpack_interiorfielddescr > - descrs = self.cpu.gc_ll_descr.getframedescrs(self.cpu) > - _offset, _size, _ = unpack_fielddescr(descrs.jf_descr) > - fail_descr = self.cpu.get_fail_descr_from_number(value) > - value = fail_descr.hide(self.cpu) > - rgc._make_sure_does_not_move(value) > - value = rffi.cast(lltype.Signed, value) > > - if check_imm_arg(_offset): > - self.mc.LDR_ri(r.ip.value, tmploc.value, imm=_offset) > - else: > - self.mc.gen_load_int(r.ip.value, _offset) > - self.mc.LDR_rr(r.ip.value, tmploc.value, r.ip.value) > + > + gcref = cast_instance_to_gcref(value) > + rgc._make_sure_does_not_move(gcref) > + value = rffi.cast(lltype.Signed, gcref) > + ofs = self.cpu.get_ofs_of_frame_field('jf_descr') > + assert check_imm_arg(ofs) > + self.mc.LDR_ri(r.ip.value, r.r0.value, imm=ofs) > + > if check_imm_arg(value): > self.mc.CMP_ri(r.ip.value, imm=value) > else: > self.mc.gen_load_int(r.lr.value, value) > self.mc.CMP_rr(r.lr.value, r.ip.value) > - > - > - #if values are equal we take the fast path > - # Slow path, calling helper > - # jump to merge point > - > - jd = descr.outermost_jitdriver_sd > - assert jd is not None > - > - # Path A: load 
return value and reset token > - # Fast Path using result boxes > + # Path 1: Fast Path > > fast_path_cond = c.EQ > # Reset the vable token --- XXX really too much special logic here:-( > @@ -1164,45 +1157,40 @@ > from rpython.jit.backend.llsupport.descr import FieldDescr > fielddescr = jd.vable_token_descr > assert isinstance(fielddescr, FieldDescr) > - ofs = fielddescr.offset > - tmploc = regalloc.get_scratch_reg(INT) > - self.mov_loc_loc(arglocs[0], r.ip, cond=fast_path_cond) > - self.mc.MOV_ri(tmploc.value, 0, cond=fast_path_cond) > - self.mc.STR_ri(tmploc.value, r.ip.value, ofs, cond=fast_path_cond) > + assert isinstance(fielddescr, FieldDescr) > + vtoken_ofs = fielddescr.offset > + assert check_imm_arg(vtoken_ofs) > + self.mov_loc_loc(vloc, r.ip, cond=fast_path_cond) > + self.mc.MOV_ri(r.lr.value, 0, cond=fast_path_cond) > + self.mc.STR_ri(tmploc.value, r.ip.value, vtoken_ofs, cond=fast_path_cond) > + # in the line above, TOKEN_NONE = 0 > > if op.result is not None: > - # load the return value from fail_boxes_xxx[0] > + # load the return value from the dead frame's value index 0 > kind = op.result.type > if kind == FLOAT: > - t = unpack_interiorfielddescr(descrs.as_float)[0] > - if not check_imm_arg(t): > - self.mc.gen_load_int(r.ip.value, t, cond=fast_path_cond) > + descr = self.cpu.getarraydescr_for_frame(kind) > + ofs = self.cpu.unpack_arraydescr(descr) > + if not check_imm_arg(ofs): > + self.mc.gen_load_int(r.ip.value, ofs, cond=fast_path_cond) > self.mc.ADD_rr(r.ip.value, r.r0.value, r.ip.value, > cond=fast_path_cond) > - t = 0 > + ofs = 0 > base = r.ip > else: > base = r.r0 > - self.mc.VLDR(resloc.value, base.value, imm=t, > - cond=fast_path_cond) > + self.mc.VLDR(result_loc.value, base.value, imm=ofs, > + cond=fast_path_cond) > else: > - assert resloc is r.r0 > - if kind == INT: > - t = unpack_interiorfielddescr(descrs.as_int)[0] > - else: > - t = unpack_interiorfielddescr(descrs.as_ref)[0] > - if not check_imm_arg(t): > - 
self.mc.gen_load_int(r.ip.value, t, cond=fast_path_cond) > - self.mc.LDR_rr(resloc.value, resloc.value, r.ip.value, > - cond=fast_path_cond) > - else: > - self.mc.LDR_ri(resloc.value, resloc.value, imm=t, > - cond=fast_path_cond) > + assert result_loc is r.r0 > + descr = self.cpu.getarraydescr_for_frame(kind) > + ofs = self.cpu.unpack_arraydescr(descr) > + self.load_reg(self.mc, r.r0, r.r0, ofs, cond=fast_path_cond) > # jump to merge point > jmp_pos = self.mc.currpos() > self.mc.BKPT() > > - # Path B: use assembler helper > + # Path 2: use assembler helper > asm_helper_adr = self.cpu.cast_adr_to_int(jd.assembler_helper_adr) > if self.cpu.supports_floats: > floats = r.caller_vfp_resp > @@ -1213,28 +1201,24 @@ > # the result > core = r.caller_resp > if op.result: > - if resloc.is_vfp_reg(): > + if result_loc.is_vfp_reg(): > floats = r.caller_vfp_resp[1:] > else: > core = r.caller_resp[1:] + [r.ip] # keep alignment > with saved_registers(self.mc, core, floats): > # result of previous call is in r0 > - self.mov_loc_loc(arglocs[0], r.r1) > + self.mov_loc_loc(vloc, r.r1) > self.mc.BL(asm_helper_adr) > - if not self.cpu.use_hf_abi and op.result and resloc.is_vfp_reg(): > + if not self.cpu.use_hf_abi and op.result and result_loc.is_vfp_reg(): > # move result to the allocated register > - self.mov_to_vfp_loc(r.r0, r.r1, resloc) > + self.mov_to_vfp_loc(r.r0, r.r1, result_loc) > > # merge point > currpos = self.mc.currpos() > pmc = OverwritingBuilder(self.mc, jmp_pos, WORD) > pmc.B_offs(currpos, fast_path_cond) > > - self.mc.LDR_ri(r.ip.value, r.fp.value) > - self.mc.CMP_ri(r.ip.value, 0) > - > - self._emit_guard(guard_op, regalloc._prepare_guard(guard_op), > - c.GE, save_exc=True) > + self._emit_guard_may_force(guard_op, arglocs, op.numargs()) > return fcond > > # ../x86/assembler.py:668 > diff --git a/rpython/jit/backend/arm/regalloc.py b/rpython/jit/backend/arm/regalloc.py > --- a/rpython/jit/backend/arm/regalloc.py > +++ b/rpython/jit/backend/arm/regalloc.py > @@ -1101,20 
+1101,18 @@ > prepare_guard_call_release_gil = prepare_guard_call_may_force > > def prepare_guard_call_assembler(self, op, guard_op, fcond): > + > descr = op.getdescr() > assert isinstance(descr, JitCellToken) > - jd = descr.outermost_jitdriver_sd > - assert jd is not None > - vable_index = jd.index_of_virtualizable > - if vable_index >= 0: > - self._sync_var(op.getarg(vable_index)) > - vable = self.frame_manager.loc(op.getarg(vable_index)) > + arglist = op.getarglist() > + self.rm._sync_var(arglist[0]) > + frame_loc = self.fm.loc(op.getarg(0)) > + if len(arglist) == 2: > + self.rm._sync_var(arglist[1]) > + locs = [frame_loc, self.fm.loc(arglist[1])] > else: > - vable = imm(0) > - # make sure the call result location is free > - tmploc = self.get_scratch_reg(INT, selected_reg=r.r0) > - self.possibly_free_vars(guard_op.getfailargs()) > - return [vable, tmploc] + self._prepare_call(op, save_all_regs=True) > + locs = [frame_loc] > + return locs + self._prepare_call(op, save_all_regs=True) > > def _prepare_args_for_new_op(self, new_args): > gc_ll_descr = self.cpu.gc_ll_descr > _______________________________________________ > pypy-commit mailing list > pypy-commit at python.org > http://mail.python.org/mailman/listinfo/pypy-commit From wizzat at gmail.com Thu Feb 14 22:33:02 2013 From: wizzat at gmail.com (Mark Roberts) Date: Thu, 14 Feb 2013 13:33:02 -0800 Subject: [pypy-dev] Generator leaks Message-ID: On Tue, Feb 12, 2013 at 10:14 AM, wrote: > Hi pypy-dev (and hi armin :) > > Quick question - do we make https://bugs.pypy.org/issue1282 a release > blocker? As far as I understand this is a chain-of-destructors > scenario. Can we do better than wait N gc.collects? If not, can we fix > generators? > > Cheers, > fijal Making sure this is fixed would greatly help some of my code. Is it possible to flag these as a chain for collection in a single pass? 
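The issue being discussed (1282) concerns generator finalizers that only run when the GC eventually collects the suspended frame, and a chain of such generators can need one collection per link. Code that cannot afford to wait has a workaround: close generators explicitly. A small sketch of both paths — on CPython the del path runs the finalizer promptly via reference counting, while on PyPy it may take one or more gc.collect() calls:

```python
import gc

log = []

def gen():
    try:
        yield 1
    finally:
        log.append('closed')   # runs only when the frame is finalized

g = gen()
next(g)          # suspend inside the try block
del g            # finalizer timing now depends on the GC...
for _ in range(5):
    gc.collect() # ...several passes, to cover chained-finalizer GCs
assert log == ['closed']

# Explicit close() sidesteps the GC entirely:
log2 = []

def gen2():
    try:
        yield 1
    finally:
        log2.append('closed')

g2 = gen2()
next(g2)
g2.close()       # deterministic cleanup, no collection needed
assert log2 == ['closed']
```

Wrapping the consumer in a try/finally (or contextlib.closing) that calls close() is the portable way to get deterministic cleanup regardless of how many collections the chain would otherwise require.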
-Mark From jameslan at gmail.com Fri Feb 15 08:14:07 2013 From: jameslan at gmail.com (James Lan) Date: Thu, 14 Feb 2013 23:14:07 -0800 Subject: [pypy-dev] test_all.py can't recognize has multi inherited classes Message-ID: Hi All, I tried to run test cases for the jvm backend by the following command, pypy pypy/test_all.py rpython/translator/jvm/test but what I got was 10 tests defined in rpython/translator/jvm/test/test_backendopt.py although there are two dozen test files in that directory. After experimenting, I found that it failed to collect tests within a multi-inherited class. Take test_constant.py as an example, it defines a class, class TestConstant(BaseTestConstant, JvmTest): pass and JvmTest is defined as, class JvmTest(BaseRtypingTest, OORtypeMixin): ...... When I executed the following, pypy pypy/test_all.py rpython/translator/jvm/test/test_constant.py --collectonly it reported 0 items were collected. If I remove BaseTestConstant from TestConstant's super classes, it still reports 0 items collected; but if I remove JvmTest from its super classes, it reports 18 items are collected. I'm wondering how the daily tests solve this problem? Thanks, James Lan -------------- next part -------------- An HTML attachment was scrubbed...
URL: From Ronny.Pfannschmidt at gmx.de Fri Feb 15 08:34:25 2013 From: Ronny.Pfannschmidt at gmx.de (Ronny Pfannschmidt) Date: Fri, 15 Feb 2013 08:34:25 +0100 Subject: [pypy-dev] test_all.py can't recognize has multi inherited classes In-Reply-To: References: Message-ID: <511DE501.4080707@gmx.de> Hi James, Interesting find, i can replicate and i don't yet see how it happens, because the code pytest uses does walk the mro to find tests in parent classes i will report back once i figured whats wrong -- Ronny On 02/15/2013 08:14 AM, James Lan wrote: > Hi All, > > I tried to run test cases for jvm backend by the following command, > > > pypy pypy/test_all.py rpython/translator/jvm/test > > > but what I got was 10 tests defined in > rpython/translator/jvm/test/test_backendopt.py although there are two > dozens of test files in that directory. > > After experiment, I found that it failed to collect test within a multi > inherited class. Take test_constant.py as an example, it defines a class, > > > class TestConstant(BaseTestConstant, JvmTest): > pass > > > and JvmTest is defined as, > > > class JvmTest(BaseRtypingTest, OORtypeMixin): > ...... > > > When I executed the folloing, > > > ypy pypy/test_all.py rpython/translator/jvm/test/test_constant.py > --collectonly > > > it reported 0 item was collected. If I remove BaseTestConstant from > TestConstant's super classes, it still reports 0 item collected; but if > I remove JvmTest from its super classes, it reports 18 items are collected. > > > I'm wondering how the daily test solve this problem? 
> > Thanks, > James Lan > > > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > http://mail.python.org/mailman/listinfo/pypy-dev From Ronny.Pfannschmidt at gmx.de Fri Feb 15 08:45:44 2013 From: Ronny.Pfannschmidt at gmx.de (Ronny Pfannschmidt) Date: Fri, 15 Feb 2013 08:45:44 +0100 Subject: [pypy-dev] test_all.py can't recognize has multi inherited classes In-Reply-To: <511DE501.4080707@gmx.de> References: <511DE501.4080707@gmx.de> Message-ID: <511DE7A8.7090803@gmx.de> i confirmed the issue - its in the included pytest and its fixed upstream its going to be fixed in the pytest branch of pypy later On 02/15/2013 08:34 AM, Ronny Pfannschmidt wrote: > Hi James, > > > Interesting find, > > i can replicate and i don't yet see how it happens, > because the code pytest uses does walk the mro to find tests in parent > classes > > i will report back once i figured whats wrong > > -- Ronny > > > On 02/15/2013 08:14 AM, James Lan wrote: >> Hi All, >> >> I tried to run test cases for jvm backend by the following command, >> >> >> pypy pypy/test_all.py rpython/translator/jvm/test >> >> >> but what I got was 10 tests defined in >> rpython/translator/jvm/test/test_backendopt.py although there are two >> dozens of test files in that directory. >> >> After experiment, I found that it failed to collect test within a multi >> inherited class. Take test_constant.py as an example, it defines a class, >> >> >> class TestConstant(BaseTestConstant, JvmTest): >> pass >> >> >> and JvmTest is defined as, >> >> >> class JvmTest(BaseRtypingTest, OORtypeMixin): >> ...... >> >> >> When I executed the folloing, >> >> >> ypy pypy/test_all.py rpython/translator/jvm/test/test_constant.py >> --collectonly >> >> >> it reported 0 item was collected. If I remove BaseTestConstant from >> TestConstant's super classes, it still reports 0 item collected; but if >> I remove JvmTest from its super classes, it reports 18 items are >> collected. 
>> >> >> I'm wondering how the daily test solve this problem? >> >> Thanks, >> James Lan >> >> >> _______________________________________________ >> pypy-dev mailing list >> pypy-dev at python.org >> http://mail.python.org/mailman/listinfo/pypy-dev > > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > http://mail.python.org/mailman/listinfo/pypy-dev From fijall at gmail.com Fri Feb 15 08:55:03 2013 From: fijall at gmail.com (Maciej Fijalkowski) Date: Fri, 15 Feb 2013 09:55:03 +0200 Subject: [pypy-dev] Generator leaks In-Reply-To: References: Message-ID: On Thu, Feb 14, 2013 at 11:33 PM, Mark Roberts wrote: > On Tue, Feb 12, 2013 at 10:14 AM, wrote: >> Hi pypy-dev (and hi armin :) >> >> Quick question - do we make https://bugs.pypy.org/issue1282 a release >> blocker? As far as I understand this is a chain-of-destructors >> scenario. Can we do better than wait N gc.collects? If not, can we fix >> generators? >> >> Cheers, >> fijal > > Making sure this is fixed would greatly help some of my code. Is it > possible to flag these as a chain for collection in a single pass? > > -Mark We're working on it (or at least actively thinking) From fijall at gmail.com Fri Feb 15 08:55:41 2013 From: fijall at gmail.com (Maciej Fijalkowski) Date: Fri, 15 Feb 2013 09:55:41 +0200 Subject: [pypy-dev] test_all.py can't recognize has multi inherited classes In-Reply-To: <511DE7A8.7090803@gmx.de> References: <511DE501.4080707@gmx.de> <511DE7A8.7090803@gmx.de> Message-ID: On Fri, Feb 15, 2013 at 9:45 AM, Ronny Pfannschmidt wrote: > i confirmed the issue - its in the included pytest and its fixed upstream > > its going to be fixed in the pytest branch of pypy later Can't you fix it on default? 
> > > > > On 02/15/2013 08:34 AM, Ronny Pfannschmidt wrote: >> >> Hi James, >> >> >> Interesting find, >> >> i can replicate and i don't yet see how it happens, >> because the code pytest uses does walk the mro to find tests in parent >> classes >> >> i will report back once i figured whats wrong >> >> -- Ronny >> >> >> On 02/15/2013 08:14 AM, James Lan wrote: >>> >>> Hi All, >>> >>> I tried to run test cases for jvm backend by the following command, >>> >>> >>> pypy pypy/test_all.py rpython/translator/jvm/test >>> >>> >>> but what I got was 10 tests defined in >>> rpython/translator/jvm/test/test_backendopt.py although there are two >>> dozens of test files in that directory. >>> >>> After experiment, I found that it failed to collect test within a multi >>> inherited class. Take test_constant.py as an example, it defines a class, >>> >>> >>> class TestConstant(BaseTestConstant, JvmTest): >>> pass >>> >>> >>> and JvmTest is defined as, >>> >>> >>> class JvmTest(BaseRtypingTest, OORtypeMixin): >>> ...... >>> >>> >>> When I executed the folloing, >>> >>> >>> ypy pypy/test_all.py rpython/translator/jvm/test/test_constant.py >>> --collectonly >>> >>> >>> it reported 0 item was collected. If I remove BaseTestConstant from >>> TestConstant's super classes, it still reports 0 item collected; but if >>> I remove JvmTest from its super classes, it reports 18 items are >>> collected. >>> >>> >>> I'm wondering how the daily test solve this problem? 
>>> >>> Thanks, >>> James Lan >>> >>> >>> _______________________________________________ >>> pypy-dev mailing list >>> pypy-dev at python.org >>> http://mail.python.org/mailman/listinfo/pypy-dev >> >> >> _______________________________________________ >> pypy-dev mailing list >> pypy-dev at python.org >> http://mail.python.org/mailman/listinfo/pypy-dev > > > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > http://mail.python.org/mailman/listinfo/pypy-dev From Ronny.Pfannschmidt at gmx.de Fri Feb 15 09:01:55 2013 From: Ronny.Pfannschmidt at gmx.de (Ronny Pfannschmidt) Date: Fri, 15 Feb 2013 09:01:55 +0100 Subject: [pypy-dev] test_all.py can't recognize has multi inherited classes In-Reply-To: References: <511DE501.4080707@gmx.de> <511DE7A8.7090803@gmx.de> Message-ID: <511DEB73.2040602@gmx.de> On 02/15/2013 08:55 AM, Maciej Fijalkowski wrote: > On Fri, Feb 15, 2013 at 9:45 AM, Ronny Pfannschmidt > wrote: >> i confirmed the issue - its in the included pytest and its fixed upstream >> >> its going to be fixed in the pytest branch of pypy later > > Can't you fix it on default? i need to investigate the effects of the update first > >> >> >> >> >> On 02/15/2013 08:34 AM, Ronny Pfannschmidt wrote: >>> >>> Hi James, >>> >>> >>> Interesting find, >>> >>> i can replicate and i don't yet see how it happens, >>> because the code pytest uses does walk the mro to find tests in parent >>> classes >>> >>> i will report back once i figured whats wrong >>> >>> -- Ronny >>> >>> >>> On 02/15/2013 08:14 AM, James Lan wrote: >>>> >>>> Hi All, >>>> >>>> I tried to run test cases for jvm backend by the following command, >>>> >>>> >>>> pypy pypy/test_all.py rpython/translator/jvm/test >>>> >>>> >>>> but what I got was 10 tests defined in >>>> rpython/translator/jvm/test/test_backendopt.py although there are two >>>> dozens of test files in that directory. 
>>>> >>>> After experiment, I found that it failed to collect test within a multi >>>> inherited class. Take test_constant.py as an example, it defines a class, >>>> >>>> >>>> class TestConstant(BaseTestConstant, JvmTest): >>>> pass >>>> >>>> >>>> and JvmTest is defined as, >>>> >>>> >>>> class JvmTest(BaseRtypingTest, OORtypeMixin): >>>> ...... >>>> >>>> >>>> When I executed the folloing, >>>> >>>> >>>> ypy pypy/test_all.py rpython/translator/jvm/test/test_constant.py >>>> --collectonly >>>> >>>> >>>> it reported 0 item was collected. If I remove BaseTestConstant from >>>> TestConstant's super classes, it still reports 0 item collected; but if >>>> I remove JvmTest from its super classes, it reports 18 items are >>>> collected. >>>> >>>> >>>> I'm wondering how the daily test solve this problem? >>>> >>>> Thanks, >>>> James Lan >>>> From gherman at darwin.in-berlin.de Sat Feb 16 22:48:18 2013 From: gherman at darwin.in-berlin.de (Dinu Gherman) Date: Sat, 16 Feb 2013 22:48:18 +0100 Subject: [pypy-dev] Release date for PyPy 2.0 beta 2? Message-ID: <72089605-4607-4C44-8AA9-75D1B277A1C1@darwin.in-berlin.de> Hi, I'm trying to make some performance comparisons using various tools like CPython, Cython, PyPy and Numba as described in an exercise I've put up here for a presentation (a tiny function generating digits of Pi): https://gist.github.com/deeplook/4947835 For this code PyPy 1.9 shows around 50 % of the performance of CPython. Christian Tismer tells me 2.0 beta 1 was much better, but I'm running into a bug for PyPy 2.0 beta 1 already described here two months ago: https://bugs.pypy.org/issue1350 So... is there any estimate for the release date of 2.0 beta 2? Thanks, Dinu From fijall at gmail.com Sat Feb 16 23:39:30 2013 From: fijall at gmail.com (Maciej Fijalkowski) Date: Sun, 17 Feb 2013 00:39:30 +0200 Subject: [pypy-dev] Release date for PyPy 2.0 beta 2? 
In-Reply-To: <72089605-4607-4C44-8AA9-75D1B277A1C1@darwin.in-berlin.de> References: <72089605-4607-4C44-8AA9-75D1B277A1C1@darwin.in-berlin.de> Message-ID: On Sat, Feb 16, 2013 at 11:48 PM, Dinu Gherman wrote: > Hi, > > I'm trying to make some performance comparisons using various tools > like CPython, Cython, PyPy and Numba as described in an exercise I've > put up here for a presentation (a tiny function generating digits of > Pi): https://gist.github.com/deeplook/4947835 > For this code PyPy 1.9 shows around 50 % of the performance of CPython. > > Christian Tismer tells me 2.0 beta 1 was much better, but I'm running > into a bug for PyPy 2.0 beta 1 already described here two months ago: > https://bugs.pypy.org/issue1350 > > So... is there any estimate for the release date of 2.0 beta 2? > > Thanks, > > Dinu > > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > http://mail.python.org/mailman/listinfo/pypy-dev Hi Dinu, just run nightly http://buildbot.pypy.org/nightly/trunk/ ideally, also don't benchmark on OS X, it's a system that has lots of strange problems. For what is worth, you picked up a very terrible program - this program exercises long implementation, not "how fast you run python programs". A new pypy is ~30% slower than cpython, which we find reasonable (because it's runtime time), if you want faster pick gmpy. GMP however has no means of recovering from a MemoryError. How do you want to benchmark python compilers on this? Can anyone do something more sensible than just call operations on longs? Cheers, fijal From arigo at tunes.org Sun Feb 17 09:06:50 2013 From: arigo at tunes.org (Armin Rigo) Date: Sun, 17 Feb 2013 09:06:50 +0100 Subject: [pypy-dev] Release date for PyPy 2.0 beta 2? In-Reply-To: References: <72089605-4607-4C44-8AA9-75D1B277A1C1@darwin.in-berlin.de> Message-ID: Hi Fijal, On Sat, Feb 16, 2013 at 11:39 PM, Maciej Fijalkowski wrote: > How do you want to benchmark python compilers on this? 
Can anyone do > something more sensible than just call operations on longs? In theory it would be possible to queue up common sequences of operations, a bit like you did with numpy's lazy evaluation; e.g. an expression like "z = x * 3 + y" could be processed in only one walk through the digits. Just saying. This is very unlikely to give any performance gain unless the numbers are very large. A bient?t, Armin. From arigo at tunes.org Sun Feb 17 09:58:42 2013 From: arigo at tunes.org (Armin Rigo) Date: Sun, 17 Feb 2013 09:58:42 +0100 Subject: [pypy-dev] Helping with STM at the PyCon 2013 (Santa Clara) sprints In-Reply-To: References: Message-ID: Hi Taavi, I finally fixed pypy-stm with signals. Now I'm getting again results that scale with the number of processors. Note that it stops scaling up at some point, around 4 or 6 threads, on machines I tried it on. I suspect it's related to the fact that physical processors have 4 or 6 cores internally, but the results are still a bit inconsistent. Using the "taskset" command to force the threads to run on particular physical sockets seems to help a little bit with some numbers. Fwiw, I got the maximum throughput on a 24-cores machine by really running 24 threads, but that seems wasteful, as it is only 25% better than running 6 threads on one physical socket. The next step will be trying to reduce the overhead, currently considerable (about 10x slower than CPython, too much to ever have any net benefit). Also high on the list is fixing the constant memory leak (i.e. implementing major garbage collection steps). A bient?t, Armin. From gherman at darwin.in-berlin.de Sun Feb 17 10:30:32 2013 From: gherman at darwin.in-berlin.de (Dinu Gherman) Date: Sun, 17 Feb 2013 10:30:32 +0100 Subject: [pypy-dev] Release date for PyPy 2.0 beta 2? 
In-Reply-To: References: <72089605-4607-4C44-8AA9-75D1B277A1C1@darwin.in-berlin.de> Message-ID: <072669F8-0D5D-4840-A25B-1C34FC48A81F@darwin.in-berlin.de> Maciej Fijalkowski: > http://buildbot.pypy.org/nightly/trunk/ Thanks. > For what is worth, you picked up a very terrible program - this > program exercises long implementation, not "how fast you run python > programs". A new pypy is ~30% slower than cpython, which we find > reasonable (because it's runtime time), if you want faster pick gmpy. > GMP however has no means of recovering from a MemoryError. It was an example from a given context. Clearly, it doesn't show many different features to optimize. > How do you want to benchmark python compilers on this? Can anyone do > something more sensible than just call operations on longs? I compared it also with a version with serialized tuple assignments which shows an improvement of around 2.5 % on CPython, but no real change on PyPy, which is kind of what I hoped. Regards, Dinu From arigo at tunes.org Sun Feb 17 10:43:45 2013 From: arigo at tunes.org (Armin Rigo) Date: Sun, 17 Feb 2013 10:43:45 +0100 Subject: [pypy-dev] Unicode encode/decode speed In-Reply-To: <511A8675.3040001@gmail.com> References: <51117AD1.7060609@gmail.com> <51118D06.6000908@gmail.com> <511912C6.5000201@gmail.com> <5119240C.2000209@gmail.com> <51192C17.7060907@gmail.com> <51197D93.2050201@gmail.com> <511A8675.3040001@gmail.com> Message-ID: Hi, On Tue, Feb 12, 2013 at 7:14 PM, Eleytherios Stamatogiannakis wrote: > Also we are looking into adding a special ffi.string_decode_UTF8 in CFFI's > backend to reduce the number of calls that are needed to go from utf8_char* > to PyPy's unicode. A first note: I'm wondering why you need to convert from utf-8-that-contains-only-ascii, to unicode, and back. What is the point of having unicode strings in the first place? Can't you just pass around your complete program plain non-unicode strings? 
If not, then indeed, it would make (a bit of) sense to have ways to convert directly between "char *" and unicode strings, in both directions, assuming utf-8. This could be done with an API like: ffi.encode_utf8(unicode_string) -> new_char*_cdata ffi.encode_utf8(unicode_string, target_char*_cdata, maximum_length) ffi.decode_utf8(char*_cdata, [length]) -> unicode_string Alternatively, we could accept unicode strings whenever a "char*" is expected and encode it to utf-8, but that sounds a bit too magical. A bient?t, Armin. From fijall at gmail.com Sun Feb 17 10:55:07 2013 From: fijall at gmail.com (Maciej Fijalkowski) Date: Sun, 17 Feb 2013 11:55:07 +0200 Subject: [pypy-dev] Unicode encode/decode speed In-Reply-To: References: <51117AD1.7060609@gmail.com> <51118D06.6000908@gmail.com> <511912C6.5000201@gmail.com> <5119240C.2000209@gmail.com> <51192C17.7060907@gmail.com> <51197D93.2050201@gmail.com> <511A8675.3040001@gmail.com> Message-ID: On Sun, Feb 17, 2013 at 11:43 AM, Armin Rigo wrote: > Hi, > > On Tue, Feb 12, 2013 at 7:14 PM, Eleytherios Stamatogiannakis > wrote: >> Also we are looking into adding a special ffi.string_decode_UTF8 in CFFI's >> backend to reduce the number of calls that are needed to go from utf8_char* >> to PyPy's unicode. > > A first note: I'm wondering why you need to convert from > utf-8-that-contains-only-ascii, to unicode, and back. What is the > point of having unicode strings in the first place? Can't you just > pass around your complete program plain non-unicode strings? > > If not, then indeed, it would make (a bit of) sense to have ways to > convert directly between "char *" and unicode strings, in both > directions, assuming utf-8. 
This could be done with an API like: > > ffi.encode_utf8(unicode_string) -> new_char*_cdata > ffi.encode_utf8(unicode_string, target_char*_cdata, maximum_length) > ffi.decode_utf8(char*_cdata, [length]) -> unicode_string > > Alternatively, we could accept unicode strings whenever a "char*" is > expected and encode it to utf-8, but that sounds a bit too magical. > > > A bient?t, > > Armin. > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > http://mail.python.org/mailman/listinfo/pypy-dev We should add rffi.charp2unicode too From fijall at gmail.com Sun Feb 17 11:00:35 2013 From: fijall at gmail.com (Maciej Fijalkowski) Date: Sun, 17 Feb 2013 12:00:35 +0200 Subject: [pypy-dev] [pypy-commit] pypy jitframe-on-heap: fix for call_assembler with floats. The offset calculation for the arguments to call_assembler in rewrite.py assumes a double-word aligned JITFRAME In-Reply-To: <20130217020025.1B8A11C062C@cobra.cs.uni-duesseldorf.de> References: <20130217020025.1B8A11C062C@cobra.cs.uni-duesseldorf.de> Message-ID: On Sun, Feb 17, 2013 at 4:00 AM, bivab wrote: > Author: David Schneider > Branch: jitframe-on-heap > Changeset: r61341:017892f48c74 > Date: 2013-02-17 02:59 +0100 > http://bitbucket.org/pypy/pypy/changeset/017892f48c74/ > > Log: fix for call_assembler with floats. 
The offset calculation for the > arguments to call_assembler in rewrite.py assumes a double-word > aligned JITFRAME > > diff --git a/rpython/jit/backend/arm/arch.py b/rpython/jit/backend/arm/arch.py > --- a/rpython/jit/backend/arm/arch.py > +++ b/rpython/jit/backend/arm/arch.py > @@ -17,4 +17,4 @@ > # A jitframe is a jit.backend.llsupport.llmodel.jitframe.JITFRAME > # Stack frame fixed area > # Currently only the force_index > -JITFRAME_FIXED_SIZE = 11 + 16 * 2 # 11 GPR + 16 VFP Regs (64bit) > +JITFRAME_FIXED_SIZE = 12 + 16 * 2 # 11 GPR + one word to keep alignment + 16 VFP Regs (64bit) > diff --git a/rpython/jit/backend/llsupport/rewrite.py b/rpython/jit/backend/llsupport/rewrite.py > --- a/rpython/jit/backend/llsupport/rewrite.py > +++ b/rpython/jit/backend/llsupport/rewrite.py > @@ -179,6 +179,9 @@ > for i, arg in enumerate(arglist): > descr = self.cpu.getarraydescr_for_frame(arg.type) > _, itemsize, _ = self.cpu.unpack_arraydescr_size(descr) > + # XXX > + # this calculation breaks for floats on 32 bit if > + # base_ofs of JITFRAME + index * 8 is not double-word aligned > index = index_list[i] // itemsize # index is in bytes > self.newops.append(ResOperation(rop.SETARRAYITEM_GC, > [frame, ConstInt(index), > _______________________________________________ > pypy-commit mailing list > pypy-commit at python.org > http://mail.python.org/mailman/listinfo/pypy-commit This fix is incorrect, I'll do the correct one later today (on GC you have extra word for GC header, so you have to account for that) From taavi.burns at gmail.com Mon Feb 18 00:38:02 2013 From: taavi.burns at gmail.com (Taavi Burns) Date: Sun, 17 Feb 2013 18:38:02 -0500 Subject: [pypy-dev] Helping with STM at the PyCon 2013 (Santa Clara) sprints In-Reply-To: References: Message-ID: That's great, thanks! I did get it to work when you wrote earlier, but it's definitely faster now. 
I tried a ridiculously simple and no-conflict parallel program and came up with this, which gave me some questionable performance numbers from a build of 65ec96e15463: taavi at pypy:~/pypy/pypy/goal$ ./pypy-c -m timeit -s 'import transaction; transaction.set_num_threads(1)' ' def foo(): x = 0 for y in range(100000): x += y transaction.add(foo) transaction.add(foo) transaction.run()' 10 loops, best of 3: 198 msec per loop taavi at pypy:~/pypy/pypy/goal$ ./pypy-c -m timeit -s 'import transaction; transaction.set_num_threads(2)' ' def foo(): x = 0 for y in range(100000): x += y transaction.add(foo) transaction.add(foo) transaction.run()' 10 loops, best of 3: 415 msec per loop It's entirely possible that this is an effect of running inside a VMWare guest (set to use 2 cores) running on my Core2Duo laptop. If this is the case, I'll refrain from trying to do anything remotely like benchmarking in this environment in the future. :) Would it be more helpful (if I want to contribute to STM) to use something like a high-CPU EC2 instance, or should I look at obtaining something like an 8-real-core AMD X8? (my venerable X2 has started to disagree with its RAM, so it's prime for retirement) Thanks! On Sun, Feb 17, 2013 at 3:58 AM, Armin Rigo wrote: > Hi Taavi, > > I finally fixed pypy-stm with signals. Now I'm getting again results > that scale with the number of processors. > > Note that it stops scaling up at some point, around 4 or 6 threads, on > machines I tried it on. I suspect it's related to the fact that > physical processors have 4 or 6 cores internally, but the results are > still a bit inconsistent. Using the "taskset" command to force the > threads to run on particular physical sockets seems to help a little > bit with some numbers. Fwiw, I got the maximum throughput on a > 24-cores machine by really running 24 threads, but that seems > wasteful, as it is only 25% better than running 6 threads on one > physical socket. 
> > The next step will be trying to reduce the overhead, currently > considerable (about 10x slower than CPython, too much to ever have any > net benefit). Also high on the list is fixing the constant memory > leak (i.e. implementing major garbage collection steps). > > > A bient?t, > > Armin. -- taa /*eof*/ From estama at gmail.com Mon Feb 18 12:37:21 2013 From: estama at gmail.com (Eleytherios Stamatogiannakis) Date: Mon, 18 Feb 2013 13:37:21 +0200 Subject: [pypy-dev] Unicode encode/decode speed In-Reply-To: References: <51117AD1.7060609@gmail.com> <51118D06.6000908@gmail.com> <511912C6.5000201@gmail.com> <5119240C.2000209@gmail.com> <51192C17.7060907@gmail.com> <51197D93.2050201@gmail.com> <511A8675.3040001@gmail.com> Message-ID: <51221271.9060907@gmail.com> On 17/02/13 11:43, Armin Rigo wrote: > Hi, > > On Tue, Feb 12, 2013 at 7:14 PM, Eleytherios Stamatogiannakis > wrote: >> Also we are looking into adding a special ffi.string_decode_UTF8 in CFFI's >> backend to reduce the number of calls that are needed to go from utf8_char* >> to PyPy's unicode. > > A first note: I'm wondering why you need to convert from > utf-8-that-contains-only-ascii, to unicode, and back. What is the > point of having unicode strings in the first place? Can't you just > pass around your complete program plain non-unicode strings? > The problem is that SQlite internally uses UTF-8. So you cannot know in advance if the char* that you get from it is plain ASCII or a UTF-8 encoded Unicode. So we end up always converting to Unicode from the char* that SQlite returns. When sending to it, we have different code paths for Python's str() and unicode() string representations. Unfortunately, due to the nature of our data (its multilingual), and to make our life easier when we code our relational operators (written in Python), we always convert to Unicode inside our operators. So the str() path inside the MSPW SQLite wrapper, mostly sits unused. 
> If not, then indeed, it would make (a bit of) sense to have ways to > convert directly between "char *" and unicode strings, in both > directions, assuming utf-8. This could be done with an API like: > > ffi.encode_utf8(unicode_string) -> new_char*_cdata > ffi.encode_utf8(unicode_string, target_char*_cdata, maximum_length) > ffi.decode_utf8(char*_cdata, [length]) -> unicode_string > > Alternatively, we could accept unicode strings whenever a "char*" is > expected and encode it to utf-8, but that sounds a bit too magical. > An API like the one you propose would be very nice, and IMHO would give a substantial speedup. May i suggest, that for generality purposes, the same API functions should also be added for UTF-16, UTF-32 ? Thanks Armin and Maciej for looking into this, l. From taavi.burns at gmail.com Mon Feb 18 14:27:11 2013 From: taavi.burns at gmail.com (Taavi Burns) Date: Mon, 18 Feb 2013 08:27:11 -0500 Subject: [pypy-dev] Helping with STM at the PyCon 2013 (Santa Clara) sprints In-Reply-To: References: Message-ID: I got frustrated with my (actually dying now) local box and signed up for AWS. Using an m1.medium instance to build pypy (~100 minutes), and then upgrading it to a c1.xlarge (claims to be 8 virtual cores of 2.5 ECU each). With the same sample program, I see the expected kinds of speedups! :D So using VMWare is right out. Hopefully that info is useful for someone else in the future. :) On Sun, Feb 17, 2013 at 6:38 PM, Taavi Burns wrote: > That's great, thanks! I did get it to work when you wrote earlier, but > it's definitely faster now. 
> > I tried a ridiculously simple and no-conflict parallel program and > came up with this, which gave me some questionable performance numbers > from a build of 65ec96e15463: > > taavi at pypy:~/pypy/pypy/goal$ ./pypy-c -m timeit -s 'import > transaction; transaction.set_num_threads(1)' ' > def foo(): > x = 0 > for y in range(100000): > x += y > transaction.add(foo) > transaction.add(foo) > transaction.run()' > 10 loops, best of 3: 198 msec per loop > > taavi at pypy:~/pypy/pypy/goal$ ./pypy-c -m timeit -s 'import > transaction; transaction.set_num_threads(2)' ' > def foo(): > x = 0 > for y in range(100000): > x += y > transaction.add(foo) > transaction.add(foo) > transaction.run()' > 10 loops, best of 3: 415 msec per loop > > > It's entirely possible that this is an effect of running inside a > VMWare guest (set to use 2 cores) running on my Core2Duo laptop. If > this is the case, I'll refrain from trying to do anything remotely > like benchmarking in this environment in the future. :) > > Would it be more helpful (if I want to contribute to STM) to use > something like a high-CPU EC2 instance, or should I look at obtaining > something like an 8-real-core AMD X8? > > (my venerable X2 has started to disagree with its RAM, so it's prime > for retirement) > > Thanks! > > On Sun, Feb 17, 2013 at 3:58 AM, Armin Rigo wrote: >> Hi Taavi, >> >> I finally fixed pypy-stm with signals. Now I'm getting again results >> that scale with the number of processors. >> >> Note that it stops scaling up at some point, around 4 or 6 threads, on >> machines I tried it on. I suspect it's related to the fact that >> physical processors have 4 or 6 cores internally, but the results are >> still a bit inconsistent. Using the "taskset" command to force the >> threads to run on particular physical sockets seems to help a little >> bit with some numbers. 
Fwiw, I got the maximum throughput on a >> 24-cores machine by really running 24 threads, but that seems >> wasteful, as it is only 25% better than running 6 threads on one >> physical socket. >> >> The next step will be trying to reduce the overhead, currently >> considerable (about 10x slower than CPython, too much to ever have any >> net benefit). Also high on the list is fixing the constant memory >> leak (i.e. implementing major garbage collection steps). >> >> >> A bientôt, >> >> Armin. > > > > -- > taa > /*eof*/ -- taa /*eof*/ From estama at gmail.com Mon Feb 18 17:20:30 2013 From: estama at gmail.com (Eleytherios Stamatogiannakis) Date: Mon, 18 Feb 2013 18:20:30 +0200 Subject: [pypy-dev] Unicode encode/decode speed (cont) In-Reply-To: References: <51117AD1.7060609@gmail.com> <51118D06.6000908@gmail.com> <511912C6.5000201@gmail.com> <5119240C.2000209@gmail.com> <51192C17.7060907@gmail.com> <51197D93.2050201@gmail.com> <511A8675.3040001@gmail.com> Message-ID: <512254CE.6070501@gmail.com> We have found another (very simple) madIS query where PyPy is around 250x slower than CPython: CPython: 314 msec PyPy: 1 min 16 sec The query, if you would like to test it yourself, is the following: select count(*) from (file 'some_big_text_file.txt' limit 100000); To run it you'll need a big text file containing at least 100000 text lines (we have run the above query with a very big XML file). You can also run the above query with a lower limit (the behaviour will be the same), as such: select count(*) from (file 'some_big_text_file.txt' limit 10000); Be careful that the file does not have a csv, tsv, json, db or gz ending, because a different code path inside the "file" operator will be taken than the one for simple text files. l.
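A stand-alone reduction of the hot path behind that query (plain line-by-line iteration over a text file, which is the code path the "file" operator takes for simple text files) could look like the following sketch; the file name and line count below are made up for illustration:

```python
import os
import tempfile
import time

# Build a throwaway text file standing in for 'some_big_text_file.txt'.
path = os.path.join(tempfile.mkdtemp(), 'some_big_text_file.txt')
with open(path, 'w') as f:
    for i in range(100000):
        f.write('line %d\n' % i)

# The hot loop: count lines, which is all that
# `select count(*) from (file '...' limit 100000)` has to do.
start = time.time()
with open(path) as f:
    count = sum(1 for _line in f)
print('%d lines in %.3f sec' % (count, time.time() - start))
```

Running the same script under CPython and PyPy (and with mode 'rU', which comes up later in the thread) isolates the iteration cost from SQLite and CFFI.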
From fijall at gmail.com Mon Feb 18 17:44:42 2013 From: fijall at gmail.com (Maciej Fijalkowski) Date: Mon, 18 Feb 2013 18:44:42 +0200 Subject: [pypy-dev] Unicode encode/decode speed (cont) In-Reply-To: <512254CE.6070501@gmail.com> References: <51117AD1.7060609@gmail.com> <51118D06.6000908@gmail.com> <511912C6.5000201@gmail.com> <5119240C.2000209@gmail.com> <51192C17.7060907@gmail.com> <51197D93.2050201@gmail.com> <511A8675.3040001@gmail.com> <512254CE.6070501@gmail.com> Message-ID: On Mon, Feb 18, 2013 at 6:20 PM, Eleytherios Stamatogiannakis wrote: > We have found another (very simple) madIS query where PyPy is around 250x > slower that CPython: > > CPython: 314msec > PyPy: 1min 16sec > > The query if you would like to test it yourself is the following: > > select count(*) from (file 'some_big_text_file.txt' limit 100000); > > To run it you'll need some big text file containing at least 100000 text > lines (we have run above query with a very big XML file). You can also run > above query with a lower limit (the behaviour will be the same) as such: > > select count(*) from (file 'some_big_text_file.txt' limit 10000); > > Be careful for the file to not have a csv, tsv, json, db or gz ending > because a different code path inside the "file" operator will be taken than > the one for simple text files. > > l. 
> > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > http://mail.python.org/mailman/listinfo/pypy-dev Hey, it would be incredibly convenient if you could change it into a standalone benchmark (say, reading a large string from a file and decoding it as a whole or in pieces). From arigo at tunes.org Mon Feb 18 18:11:05 2013 From: arigo at tunes.org (Armin Rigo) Date: Mon, 18 Feb 2013 18:11:05 +0100 Subject: [pypy-dev] Unicode encode/decode speed In-Reply-To: <51221271.9060907@gmail.com> References: <51117AD1.7060609@gmail.com> <51118D06.6000908@gmail.com> <511912C6.5000201@gmail.com> <5119240C.2000209@gmail.com> <51192C17.7060907@gmail.com> <51197D93.2050201@gmail.com> <511A8675.3040001@gmail.com> <51221271.9060907@gmail.com> Message-ID: Hi, On Mon, Feb 18, 2013 at 12:37 PM, Eleytherios Stamatogiannakis wrote: > An API like the one you propose would be very nice, and IMHO would give a > substantial speedup. https://bitbucket.org/cffi/cffi/issue/57/shortcuts-to-encode-decode-between-unicode > May i suggest, that for generality purposes, the same API functions should > also be added for UTF-16, UTF-32 ? Well, I'd rather wait for someone to clearly show the purpose of that. As I said in the above issue, modern programs tend to use UTF-8 systematically, unless they are on an OS with a precise notion of wider unicodes (like Windows), in which case Python's own unicode representation matches already and can be used directly in "wchar_t*". A bientôt, Armin.
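The shortcuts under discussion can be pinned down with plain codec calls. Note that `ffi.encode_utf8` and `ffi.decode_utf8` are only names proposed in this thread and the linked issue, not an existing cffi API, so the sketch below merely mirrors what they would compute on the bytes behind a `char*`:

```python
# Sketch of the proposed shortcuts, expressed as plain codec calls.
# The helper names follow the proposal in the thread; they are not
# part of any released cffi.

def decode_utf8(buf, length=None):
    # 'buf' stands in for the bytes behind a char* cdata.
    data = buf if length is None else buf[:length]
    return data.decode('utf-8')

def encode_utf8(text):
    # The result plays the role of the new char* cdata.
    return text.encode('utf-8')

greek = u'\u03b1\u03b2c'       # two Greek letters plus ASCII 'c'
raw = encode_utf8(greek)       # 5 bytes: 2 per Greek letter, 1 for 'c'
assert decode_utf8(raw) == greek
assert decode_utf8(raw, 2) == u'\u03b1'  # explicit length, cut at a byte boundary
```

The point of fusing these into one call, as proposed, would be to avoid materializing the intermediate byte string between the `char*` and the unicode object.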
From amauryfa at gmail.com Mon Feb 18 19:21:17 2013 From: amauryfa at gmail.com (Amaury Forgeot d'Arc) Date: Mon, 18 Feb 2013 19:21:17 +0100 Subject: [pypy-dev] Unicode encode/decode speed (cont) In-Reply-To: <512254CE.6070501@gmail.com> References: <51117AD1.7060609@gmail.com> <51118D06.6000908@gmail.com> <511912C6.5000201@gmail.com> <5119240C.2000209@gmail.com> <51192C17.7060907@gmail.com> <51197D93.2050201@gmail.com> <511A8675.3040001@gmail.com> <512254CE.6070501@gmail.com> Message-ID: 2013/2/18 Eleytherios Stamatogiannakis > We have found another (very simple) madIS query where PyPy is around 250x > slower that CPython: > > CPython: 314msec > PyPy: 1min 16sec > > The query if you would like to test it yourself is the following: > > select count(*) from (file 'some_big_text_file.txt' limit 100000); > Are you really running with mpsw.py? For me, the C (=cpyext based) version of apsw works (slowly), but mpsw gives me: >From callback : Traceback (most recent call last): File "/home/amauryfa/python/madis/apsw.py", line 924, in xOpen self._vtcursorcolumn.append(instance.Column) AttributeError: Cursor instance has no attribute 'Column' -- Amaury Forgeot d'Arc -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From estama at gmail.com Mon Feb 18 19:26:33 2013 From: estama at gmail.com (Eleytherios Stamatogiannakis) Date: Mon, 18 Feb 2013 20:26:33 +0200 Subject: [pypy-dev] Unicode encode/decode speed (cont) In-Reply-To: References: <51117AD1.7060609@gmail.com> <51118D06.6000908@gmail.com> <511912C6.5000201@gmail.com> <5119240C.2000209@gmail.com> <51192C17.7060907@gmail.com> <51197D93.2050201@gmail.com> <511A8675.3040001@gmail.com> <512254CE.6070501@gmail.com> Message-ID: <51227259.5050509@gmail.com> On 18/02/13 20:21, Amaury Forgeot d'Arc wrote: > > 2013/2/18 Eleytherios Stamatogiannakis > > > We have found another (very simple) madIS query where PyPy is around > 250x slower that CPython: > > CPython: 314msec > PyPy: 1min 16sec > > The query if you would like to test it yourself is the following: > > select count(*) from (file 'some_big_text_file.txt' limit 100000); > > > Are you really running with mpsw.py? > For me, the C (=cpyext based) version of apsw works (slowly), > but mpsw gives me: > > From callback : > Traceback (most recent call last): > File "/home/amauryfa/python/madis/apsw.py", line 924, in xOpen > self._vtcursorcolumn.append(instance.Column) > AttributeError: Cursor instance has no attribute 'Column' > Most probably you are using the ZIP distribution of madIS, which doesn't contain the changes for MSPW. For MSPW to work, you'll need the head version of madIS from Hg. Clone it with: hg clone https://code.google.com/p/madis/ l. 
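For context on the `xOpen` traceback above: apsw's virtual-table protocol expects `xOpen` to return a cursor object with roughly the following methods, `Column` being the one reported missing. This is a minimal illustrative sketch, not the madIS implementation:

```python
class Cursor(object):
    """Sketch of the cursor shape apsw's virtual-table protocol expects.

    Illustrative only; the absent Column method is exactly what the
    'Cursor instance has no attribute Column' traceback complains about.
    """

    def __init__(self, rows):
        self.rows = rows
        self.pos = 0

    def Filter(self, indexnum, indexname, constraintargs):
        self.pos = 0                       # (re)start the scan

    def Eof(self):
        return self.pos >= len(self.rows)

    def Rowid(self):
        return self.pos + 1

    def Column(self, number):
        return self.rows[self.pos][number]

    def Next(self):
        self.pos += 1

    def Close(self):
        pass

# Drive it the way sqlite would:
cur = Cursor([(u'a', 1), (u'b', 2)])
cur.Filter(0, None, ())
seen = []
while not cur.Eof():
    seen.append(cur.Column(0))
    cur.Next()
cur.Close()
```

A wrapper like mspw has to expose this same per-row, per-column calling pattern, which is why each cell crosses the CFFI boundary once.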
From estama at gmail.com Mon Feb 18 19:41:25 2013 From: estama at gmail.com (Eleytherios Stamatogiannakis) Date: Mon, 18 Feb 2013 20:41:25 +0200 Subject: [pypy-dev] Unicode encode/decode speed (cont) In-Reply-To: References: <51117AD1.7060609@gmail.com> <51118D06.6000908@gmail.com> <511912C6.5000201@gmail.com> <5119240C.2000209@gmail.com> <51192C17.7060907@gmail.com> <51197D93.2050201@gmail.com> <511A8675.3040001@gmail.com> <512254CE.6070501@gmail.com> Message-ID: <512275D5.1010609@gmail.com> On 18/02/13 18:44, Maciej Fijalkowski wrote: > On Mon, Feb 18, 2013 at 6:20 PM, Eleytherios Stamatogiannakis > wrote: >> We have found another (very simple) madIS query where PyPy is around 250x >> slower that CPython: >> >> CPython: 314msec >> PyPy: 1min 16sec >> >> The query if you would like to test it yourself is the following: >> >> select count(*) from (file 'some_big_text_file.txt' limit 100000); >> >> To run it you'll need some big text file containing at least 100000 text >> lines (we have run above query with a very big XML file). You can also run >> above query with a lower limit (the behaviour will be the same) as such: >> >> select count(*) from (file 'some_big_text_file.txt' limit 10000); >> >> Be careful for the file to not have a csv, tsv, json, db or gz ending >> because a different code path inside the "file" operator will be taken than >> the one for simple text files. >> >> l. >> >> >> _______________________________________________ >> pypy-dev mailing list >> pypy-dev at python.org >> http://mail.python.org/mailman/listinfo/pypy-dev > > Hey > > I would be incredibly convinient if you can change it to be a > standalone benchmark (say reading large string from a file and > decoding it in a whole or in pieces); > As it involves SQLite, CFFI and Python, it is very hard to extract the full execution path that madIS goes through even in a simple query like this. 
Nevertheless we extracted a part of the pure Python execution path, and PyPy is around 50% slower than CPython: CPython: 21 sec PyPy: 33 sec The full madIS execution path involves additional CFFI calls and callbacks (from SQLite) to pass the data to SQLite. To run the test.py: test.py big_text_file l. -------------- next part -------------- A non-text attachment was scrubbed... Name: test.py Type: text/x-python Size: 463 bytes Desc: not available URL: From amauryfa at gmail.com Mon Feb 18 19:51:02 2013 From: amauryfa at gmail.com (Amaury Forgeot d'Arc) Date: Mon, 18 Feb 2013 19:51:02 +0100 Subject: [pypy-dev] Unicode encode/decode speed (cont) In-Reply-To: <512275D5.1010609@gmail.com> References: <51117AD1.7060609@gmail.com> <51118D06.6000908@gmail.com> <511912C6.5000201@gmail.com> <5119240C.2000209@gmail.com> <51192C17.7060907@gmail.com> <51197D93.2050201@gmail.com> <511A8675.3040001@gmail.com> <512254CE.6070501@gmail.com> <512275D5.1010609@gmail.com> Message-ID: 2013/2/18 Eleytherios Stamatogiannakis > On 18/02/13 18:44, Maciej Fijalkowski wrote: > >> On Mon, Feb 18, 2013 at 6:20 PM, Eleytherios Stamatogiannakis >> wrote: >> >>> We have found another (very simple) madIS query where PyPy is around 250x >>> slower that CPython: >>> >>> CPython: 314msec >>> PyPy: 1min 16sec >>> >>> The query if you would like to test it yourself is the following: >>> >>> select count(*) from (file 'some_big_text_file.txt' limit 100000); >>> >>> To run it you'll need some big text file containing at least 100000 text >>> lines (we have run above query with a very big XML file). You can also >>> run >>> above query with a lower limit (the behaviour will be the same) as such: >>> >>> select count(*) from (file 'some_big_text_file.txt' limit 10000); >>> >>> Be careful for the file to not have a csv, tsv, json, db or gz ending >>> because a different code path inside the "file" operator will be taken >>> than >>> the one for simple text files. >>> >>> l. 
>>> >>> >>> ______________________________**_________________ >>> pypy-dev mailing list >>> pypy-dev at python.org >>> http://mail.python.org/**mailman/listinfo/pypy-dev >>> >> >> Hey >> >> I would be incredibly convinient if you can change it to be a >> standalone benchmark (say reading large string from a file and >> decoding it in a whole or in pieces); >> >> > As it involves SQLite, CFFI and Python, it is very hard to extract the > full execution path that madIS goes through even in a simple query like > this. > > Nevertheless we extracted a part of the pure Python execution path, and > PyPy is around 50% slower than CPython: > > CPython: 21 sec > PyPy: 33 sec > > The full madIS execution path involves additional CFFI calls and callbacks > (from SQLite) to pass the data to SQLite. > > To run the test.py: > > test.py big_text_file > Most of the time is spent in file iteration. I added f = f.read().splitlines() and the query is almost instant. -- Amaury Forgeot d'Arc -------------- next part -------------- An HTML attachment was scrubbed... URL: From alex.gaynor at gmail.com Mon Feb 18 19:59:10 2013 From: alex.gaynor at gmail.com (Alex Gaynor) Date: Mon, 18 Feb 2013 10:59:10 -0800 Subject: [pypy-dev] Unicode encode/decode speed (cont) In-Reply-To: References: <51117AD1.7060609@gmail.com> <51118D06.6000908@gmail.com> <511912C6.5000201@gmail.com> <5119240C.2000209@gmail.com> <51192C17.7060907@gmail.com> <51197D93.2050201@gmail.com> <511A8675.3040001@gmail.com> <512254CE.6070501@gmail.com> <512275D5.1010609@gmail.com> Message-ID: So, iter(file).next() is slow? 
Alex On Mon, Feb 18, 2013 at 10:51 AM, Amaury Forgeot d'Arc wrote: > 2013/2/18 Eleytherios Stamatogiannakis > >> On 18/02/13 18:44, Maciej Fijalkowski wrote: >> >>> On Mon, Feb 18, 2013 at 6:20 PM, Eleytherios Stamatogiannakis >>> wrote: >>> >>>> We have found another (very simple) madIS query where PyPy is around >>>> 250x >>>> slower that CPython: >>>> >>>> CPython: 314msec >>>> PyPy: 1min 16sec >>>> >>>> The query if you would like to test it yourself is the following: >>>> >>>> select count(*) from (file 'some_big_text_file.txt' limit 100000); >>>> >>>> To run it you'll need some big text file containing at least 100000 text >>>> lines (we have run above query with a very big XML file). You can also >>>> run >>>> above query with a lower limit (the behaviour will be the same) as such: >>>> >>>> select count(*) from (file 'some_big_text_file.txt' limit 10000); >>>> >>>> Be careful for the file to not have a csv, tsv, json, db or gz ending >>>> because a different code path inside the "file" operator will be taken >>>> than >>>> the one for simple text files. >>>> >>>> l. >>>> >>>> >>>> ______________________________**_________________ >>>> pypy-dev mailing list >>>> pypy-dev at python.org >>>> http://mail.python.org/**mailman/listinfo/pypy-dev >>>> >>> >>> Hey >>> >>> I would be incredibly convinient if you can change it to be a >>> standalone benchmark (say reading large string from a file and >>> decoding it in a whole or in pieces); >>> >>> >> As it involves SQLite, CFFI and Python, it is very hard to extract the >> full execution path that madIS goes through even in a simple query like >> this. >> >> Nevertheless we extracted a part of the pure Python execution path, and >> PyPy is around 50% slower than CPython: >> >> CPython: 21 sec >> PyPy: 33 sec >> >> The full madIS execution path involves additional CFFI calls and >> callbacks (from SQLite) to pass the data to SQLite. 
>> >> To run the test.py: >> >> test.py big_text_file >> > > Most of the time is spent in file iteration. > I added > f = f.read().splitlines() > and the query is almost instant. > > > -- > Amaury Forgeot d'Arc > > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > http://mail.python.org/mailman/listinfo/pypy-dev > > -- "I disapprove of what you say, but I will defend to the death your right to say it." -- Evelyn Beatrice Hall (summarizing Voltaire) "The people's good is the highest law." -- Cicero -------------- next part -------------- An HTML attachment was scrubbed... URL: From amauryfa at gmail.com Mon Feb 18 20:15:50 2013 From: amauryfa at gmail.com (Amaury Forgeot d'Arc) Date: Mon, 18 Feb 2013 20:15:50 +0100 Subject: [pypy-dev] Unicode encode/decode speed (cont) In-Reply-To: References: <51117AD1.7060609@gmail.com> <51118D06.6000908@gmail.com> <511912C6.5000201@gmail.com> <5119240C.2000209@gmail.com> <51192C17.7060907@gmail.com> <51197D93.2050201@gmail.com> <511A8675.3040001@gmail.com> <512254CE.6070501@gmail.com> <512275D5.1010609@gmail.com> Message-ID: 2013/2/18 Alex Gaynor > So, iter(file).next() is slow? Yes, but only with "rU" mode. My benchmark with yesterday's build: $ ~/pypy/pypy-c-jit-60005-0f1e91da6cb2-linux64/bin/pypy -m timeit "fp = open('/tmp/large-text-file'); list(fp)" 10 loops, best of 3: 43.5 msec per loop $ ~/pypy/pypy-c-jit-60005-0f1e91da6cb2-linux64/bin/pypy -m timeit "fp = open('/tmp/large-text-file', 'rU'); list(fp)" 10 loops, best of 3: 638 msec per loop 15 times slower... -- Amaury Forgeot d'Arc -------------- next part -------------- An HTML attachment was scrubbed... URL: From jiaxinrun at tp-link.com.cn Tue Feb 19 08:43:53 2013 From: jiaxinrun at tp-link.com.cn (jiaxinrun) Date: Tue, 19 Feb 2013 15:43:53 +0800 Subject: [pypy-dev] =?gb2312?b?ob5QeXB5IFF1ZXN0aW9uc6G/?= Message-ID: <201302191543527181604@tp-link.com.cn> Hi,all? I am a fresh man in pypy world! 
I have a question when I am using pypy for developing. When I import win32process.pyd(something in sit-packets\win32) with pypy.exe , it prompts "ImportError: No module named win32process". I am sure it's not a path problem. But when I import the same thing with python.exe?it works well. So, Can you give me some help? Thanks very much!! Additionally, I have finished a set of program. Now I want to use pypy to fast it. How could I do? Best Regards -------------- next part -------------- An HTML attachment was scrubbed... URL: From hyarion at iinet.net.au Tue Feb 19 10:06:26 2013 From: hyarion at iinet.net.au (Ben) Date: Tue, 19 Feb 2013 20:06:26 +1100 Subject: [pypy-dev] =?utf-8?b?44CQUHlweSBRdWVzdGlvbnPjgJE=?= In-Reply-To: <201302191543527181604@tp-link.com.cn> References: <201302191543527181604@tp-link.com.cn> Message-ID: <51234092.8030706@iinet.net.au> My list is still: - Laura - Cumber - Hespa - Darren And I haven't booked yet, because I'm slack. :) Did you want in, Adrian? On Tue 19 Feb 2013 18:43:53 EST, jiaxinrun wrote: > Hi,all? > I am a fresh man in pypy world! > I have a question when I am using pypy for developing. > When I import win32process.pyd(something in sit-packets\win32) with > pypy.exe , it prompts "ImportError: No module named win32process". I > am sure it's not a path problem. > But when I import the same thing with python.exe?it works well. > So, Can you give me some help? Thanks very much!! > Additionally, I have finished a set of program. Now I want to use pypy > to fast it. How could I do? 
> Best Regards > > > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > http://mail.python.org/mailman/listinfo/pypy-dev From jiaxinrun at tp-link.com.cn Tue Feb 19 10:11:59 2013 From: jiaxinrun at tp-link.com.cn (jiaxinrun) Date: Tue, 19 Feb 2013 17:11:59 +0800 Subject: [pypy-dev] =?utf-8?b?44CQUHlweSBRdWVzdGlvbnPjgJE=?= References: <201302191543527181604@tp-link.com.cn>, <51234092.8030706@iinet.net.au> Message-ID: <201302191711580466445@tp-link.com.cn> Hi,Ben? Yes, I want in now! And tell me how ???? Ben ????? 2013-02-19 17:06 ???? jiaxinrun ??? pypy-dev ??? Re: [pypy-dev]?Pypy Questions? My list is still: - Laura - Cumber - Hespa - Darren And I haven't booked yet, because I'm slack. :) Did you want in, Adrian? On Tue 19 Feb 2013 18:43:53 EST, jiaxinrun wrote: > Hi,all? > I am a fresh man in pypy world! > I have a question when I am using pypy for developing. > When I import win32process.pyd(something in sit-packets\win32) with > pypy.exe , it prompts "ImportError: No module named win32process". I > am sure it's not a path problem. > But when I import the same thing with python.exe?it works well. > So, Can you give me some help? Thanks very much!! > Additionally, I have finished a set of program. Now I want to use pypy > to fast it. How could I do? > Best Regards > > > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > http://mail.python.org/mailman/listinfo/pypy-dev -------------- next part -------------- An HTML attachment was scrubbed... URL: From hyarion at iinet.net.au Tue Feb 19 10:17:03 2013 From: hyarion at iinet.net.au (Ben) Date: Tue, 19 Feb 2013 20:17:03 +1100 Subject: [pypy-dev] =?utf-8?b?44CQUHlweSBRdWVzdGlvbnPjgJE=?= In-Reply-To: <51234092.8030706@iinet.net.au> References: <201302191543527181604@tp-link.com.cn> <51234092.8030706@iinet.net.au> Message-ID: <5123430F.8010204@iinet.net.au> Sorry all! 
I somehow sent a reply to a completely unrelated email as a reply to this thread. :o On 19/02/13 20:06, Ben wrote: > My list is still: > > - Laura > - Cumber > - Hespa > - Darren > > And I haven't booked yet, because I'm slack. :) > > Did you want in, Adrian? > > > On Tue 19 Feb 2013 18:43:53 EST, jiaxinrun wrote: >> Hi,all? >> I am a fresh man in pypy world! >> I have a question when I am using pypy for developing. >> When I import win32process.pyd(something in sit-packets\win32) with >> pypy.exe , it prompts "ImportError: No module named win32process". I >> am sure it's not a path problem. >> But when I import the same thing with python.exe?it works well. >> So, Can you give me some help? Thanks very much!! >> Additionally, I have finished a set of program. Now I want to use pypy >> to fast it. How could I do? >> Best Regards >> >> >> _______________________________________________ >> pypy-dev mailing list >> pypy-dev at python.org >> http://mail.python.org/mailman/listinfo/pypy-dev > > From amauryfa at gmail.com Tue Feb 19 11:06:46 2013 From: amauryfa at gmail.com (Amaury Forgeot d'Arc) Date: Tue, 19 Feb 2013 11:06:46 +0100 Subject: [pypy-dev] =?gb2312?b?ob5QeXB5IFF1ZXN0aW9uc6G/?= In-Reply-To: <201302191543527181604@tp-link.com.cn> References: <201302191543527181604@tp-link.com.cn> Message-ID: Hi, 2013/2/19 jiaxinrun > ** **** > Hi,all? > > I am a fresh man in pypy world! > > I have a question when I am using pypy for developing. > > When I import win32process.pyd(something in sit-packets\win32) with > pypy.exe , it prompts "ImportError: No module named win32process". I am > sure it's not a path problem. > > But when I import the same thing with python.exe?it works well. So, Can > you give me some help? Thanks very much!! > > Additionally, I have finished a set of program. Now I want to use pypy to > fast it. How could I do? > PyPy cannot import extension modules as is, You need to recompile the pywin32 project with PyPy. 
And the main distribution won't even work. A long time ago I made the necessary changes for pywin32 to work with PyPy, the code is here: https://bitbucket.org/amauryfa/pywin32-pypy Unfortunately I don't have access to a windows machine anymore, so someone else should continue the project. -- Amaury Forgeot d'Arc -------------- next part -------------- An HTML attachment was scrubbed... URL: From estama at gmail.com Tue Feb 19 13:09:45 2013 From: estama at gmail.com (Eleytherios Stamatogiannakis) Date: Tue, 19 Feb 2013 14:09:45 +0200 Subject: [pypy-dev] Unicode encode/decode speed (cont) In-Reply-To: References: <51117AD1.7060609@gmail.com> <511912C6.5000201@gmail.com> <5119240C.2000209@gmail.com> <51192C17.7060907@gmail.com> <51197D93.2050201@gmail.com> <511A8675.3040001@gmail.com> <512254CE.6070501@gmail.com> <512275D5.1010609@gmail.com> Message-ID: <51236B89.1070407@gmail.com> On 18/02/13 21:15, Amaury Forgeot d'Arc wrote: > 2013/2/18 Alex Gaynor > > > So, iter(file).next() is slow? > > > Yes, but only with "rU" mode. > My benchmark with yesterday's build: > > $ ~/pypy/pypy-c-jit-60005-0f1e91da6cb2-linux64/bin/pypy -m timeit "fp = > open('/tmp/large-text-file'); list(fp)" > 10 loops, best of 3: 43.5 msec per loop > $ ~/pypy/pypy-c-jit-60005-0f1e91da6cb2-linux64/bin/pypy -m timeit "fp = > open('/tmp/large-text-file', 'rU'); list(fp)" > 10 loops, best of 3: 638 msec per loop > > 15 times slower... > Yes you are right. We rerun the query without the 'rU' and the result is: CPython: 328 msec PyPy: 443 msec PyPy (with 'rU'): 1 min 17 sec So the main culprit of PyPy's slowdown is 'rU' option in open. Thanks for looking into it. l. 
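For anyone hitting the same slowdown: a workaround in the spirit of Amaury's earlier read().splitlines() suggestion is to skip 'rU' entirely and normalize newlines by hand. This is only a sketch — it does one bulk read (so the file must fit in memory) and returns bytes lines:

```python
import os
import tempfile

def read_universal(path):
    # One bulk read, then splitlines(), which understands '\n',
    # '\r\n' and '\r' endings -- the same newlines 'rU' normalizes,
    # without going through the slow per-line 'rU' code path.
    with open(path, 'rb') as f:
        return f.read().splitlines()

# Self-check with deliberately mixed line endings.
fd, path = tempfile.mkstemp()
try:
    with os.fdopen(fd, 'wb') as f:
        f.write(b'one\ntwo\r\nthree\rfour')
    assert read_universal(path) == [b'one', b'two', b'three', b'four']
finally:
    os.remove(path)
```

Note the trade-off: the whole file is materialized at once, so this is only suitable when the input comfortably fits in RAM.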
From fijall at gmail.com Tue Feb 19 13:13:19 2013 From: fijall at gmail.com (Maciej Fijalkowski) Date: Tue, 19 Feb 2013 14:13:19 +0200 Subject: [pypy-dev] Unicode encode/decode speed (cont) In-Reply-To: <51236B89.1070407@gmail.com> References: <51117AD1.7060609@gmail.com> <511912C6.5000201@gmail.com> <5119240C.2000209@gmail.com> <51192C17.7060907@gmail.com> <51197D93.2050201@gmail.com> <511A8675.3040001@gmail.com> <512254CE.6070501@gmail.com> <512275D5.1010609@gmail.com> <51236B89.1070407@gmail.com> Message-ID: On Tue, Feb 19, 2013 at 2:09 PM, Eleytherios Stamatogiannakis wrote: > On 18/02/13 21:15, Amaury Forgeot d'Arc wrote: >> >> 2013/2/18 Alex Gaynor > > >> >> >> So, iter(file).next() is slow? >> >> >> Yes, but only with "rU" mode. >> My benchmark with yesterday's build: >> >> $ ~/pypy/pypy-c-jit-60005-0f1e91da6cb2-linux64/bin/pypy -m timeit "fp = >> open('/tmp/large-text-file'); list(fp)" >> 10 loops, best of 3: 43.5 msec per loop >> $ ~/pypy/pypy-c-jit-60005-0f1e91da6cb2-linux64/bin/pypy -m timeit "fp = >> open('/tmp/large-text-file', 'rU'); list(fp)" >> 10 loops, best of 3: 638 msec per loop >> >> 15 times slower... >> > > Yes you are right. We rerun the query without the 'rU' and the result is: > > CPython: 328 msec > PyPy: 443 msec > PyPy (with 'rU'): 1 min 17 sec > > > So the main culprit of PyPy's slowdown is 'rU' option in open. > > Thanks for looking into it. > > > l. > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > http://mail.python.org/mailman/listinfo/pypy-dev Is this yet-another-fault-of-streamio? 
From amauryfa at gmail.com Tue Feb 19 14:39:06 2013 From: amauryfa at gmail.com (Amaury Forgeot d'Arc) Date: Tue, 19 Feb 2013 14:39:06 +0100 Subject: [pypy-dev] Unicode encode/decode speed (cont) In-Reply-To: References: <51117AD1.7060609@gmail.com> <511912C6.5000201@gmail.com> <5119240C.2000209@gmail.com> <51192C17.7060907@gmail.com> <51197D93.2050201@gmail.com> <511A8675.3040001@gmail.com> <512254CE.6070501@gmail.com> <512275D5.1010609@gmail.com> <51236B89.1070407@gmail.com> Message-ID: 2013/2/19 Maciej Fijalkowski > On Tue, Feb 19, 2013 at 2:09 PM, Eleytherios Stamatogiannakis > wrote: > > > > So the main culprit of PyPy's slowdown is 'rU' option in open. > > > > Thanks for looking into it. > > Is this yet-another-fault-of-streamio? Not quite. Even a implementation based on fread() would need to care of these universal newlines and tune the usage of the various buffers. BTW, I tried io.open, and surprisingly the "rb" mode is twice slower as "rU". I guess that's because our io.Buffered is missing a dedicated "readline" method. -- Amaury Forgeot d'Arc -------------- next part -------------- An HTML attachment was scrubbed... URL: From joehillen at gmail.com Wed Feb 20 18:50:35 2013 From: joehillen at gmail.com (Joe Hillenbrand) Date: Wed, 20 Feb 2013 09:50:35 -0800 Subject: [pypy-dev] HTML Parser? Message-ID: What is the recommended HTML parser to run in PyPy? The typical goto for Python is lxml, but of course that doesn't work with PyPy. Has anyone tested any other libraries? Are there any benchmarks? Thanks, -Joe -------------- next part -------------- An HTML attachment was scrubbed... URL: From amauryfa at gmail.com Wed Feb 20 19:02:14 2013 From: amauryfa at gmail.com (Amaury Forgeot d'Arc) Date: Wed, 20 Feb 2013 19:02:14 +0100 Subject: [pypy-dev] HTML Parser? In-Reply-To: References: Message-ID: 2013/2/20 Joe Hillenbrand > What is the recommended HTML parser to run in PyPy? > > The typical goto for Python is lxml, but of course that doesn't work with > PyPy. 
> This is not true anymore. There has been a lot of work on both sides to make lxml work with PyPy. You should try with latest versions. In addition, there is a port of lxml that does not use Cython nor the C API: https://github.com/amauryfa/lxml/tree/lxml-cffi most of the tests are passing (except objectify), but "setup.py install" does not work yet. It works from the source tree, though. -- Amaury Forgeot d'Arc -------------- next part -------------- An HTML attachment was scrubbed... URL: From fijall at gmail.com Wed Feb 20 19:07:23 2013 From: fijall at gmail.com (Maciej Fijalkowski) Date: Wed, 20 Feb 2013 20:07:23 +0200 Subject: [pypy-dev] HTML Parser? In-Reply-To: References: Message-ID: On Wed, Feb 20, 2013 at 8:02 PM, Amaury Forgeot d'Arc wrote: > 2013/2/20 Joe Hillenbrand >> >> What is the recommended HTML parser to run in PyPy? >> >> The typical goto for Python is lxml, but of course that doesn't work with >> PyPy. > > > This is not true anymore. There has been a lot of work on both sides to make > lxml work with PyPy. > You should try with latest versions. > > In addition, there is a port of lxml that does not use Cython nor the C API: > https://github.com/amauryfa/lxml/tree/lxml-cffi > most of the tests are passing (except objectify), but "setup.py install" > does not work yet. > It works from the source tree, though. > > -- > Amaury Forgeot d'Arc > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > http://mail.python.org/mailman/listinfo/pypy-dev > Is it working on released cffi or on cffi that's in-development or you need patches? From amauryfa at gmail.com Wed Feb 20 20:28:14 2013 From: amauryfa at gmail.com (Amaury Forgeot d'Arc) Date: Wed, 20 Feb 2013 20:28:14 +0100 Subject: [pypy-dev] HTML Parser? In-Reply-To: References: Message-ID: 2013/2/20 Maciej Fijalkowski > Is it working on released cffi or on cffi that's in-development or you > need patches? 
> It developed it with a nightly build from mid-January, and the cffi library that was available at the time. It's now released as cffi 0.5 I think. I did not test with CPython at all. At the time cffi used to return enum values as strings, but I just tested with the last version of cffi and pypy nightly build, and tests still pass! Ran 1006 tests in 34.730s FAILED (failures=1) and the only failure is:: self.assertTrue(hasattr(self.etree, '_import_c_api')) :-) -- Amaury Forgeot d'Arc -------------- next part -------------- An HTML attachment was scrubbed... URL: From arigo at tunes.org Wed Feb 20 20:29:27 2013 From: arigo at tunes.org (Armin Rigo) Date: Wed, 20 Feb 2013 20:29:27 +0100 Subject: [pypy-dev] HTML Parser? In-Reply-To: References: Message-ID: Hi all, Just so everybody knows, the plan is to release CFFI 0.6 latest when we do the PyPy 2.0 release, and include it fully inside PyPy too. (The idea is to avoid "pip install cffi", which would get a potentially incompatible version: PyPy includes the "_cffi_backend" module, which only works with a specific version of CFFI). A bient?t, Armin. From alex.gaynor at gmail.com Wed Feb 20 20:39:36 2013 From: alex.gaynor at gmail.com (Alex Gaynor) Date: Wed, 20 Feb 2013 11:39:36 -0800 Subject: [pypy-dev] HTML Parser? In-Reply-To: References: Message-ID: Are we also planning to bundle ply and cparser? Alex On Wed, Feb 20, 2013 at 11:29 AM, Armin Rigo wrote: > Hi all, > > Just so everybody knows, the plan is to release CFFI 0.6 latest when > we do the PyPy 2.0 release, and include it fully inside PyPy too. > (The idea is to avoid "pip install cffi", which would get a > potentially incompatible version: PyPy includes the "_cffi_backend" > module, which only works with a specific version of CFFI). > > > A bient?t, > > Armin. 
> _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > http://mail.python.org/mailman/listinfo/pypy-dev > -- "I disapprove of what you say, but I will defend to the death your right to say it." -- Evelyn Beatrice Hall (summarizing Voltaire) "The people's good is the highest law." -- Cicero -------------- next part -------------- An HTML attachment was scrubbed... URL: From fijall at gmail.com Wed Feb 20 21:03:58 2013 From: fijall at gmail.com (Maciej Fijalkowski) Date: Wed, 20 Feb 2013 22:03:58 +0200 Subject: [pypy-dev] HTML Parser? In-Reply-To: References: Message-ID: On Wed, Feb 20, 2013 at 9:29 PM, Armin Rigo wrote: > Hi all, > > Just so everybody knows, the plan is to release CFFI 0.6 latest when > we do the PyPy 2.0 release, and include it fully inside PyPy too. > (The idea is to avoid "pip install cffi", which would get a > potentially incompatible version: PyPy includes the "_cffi_backend" > module, which only works with a specific version of CFFI). > > > A bient?t, > > Armin. > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > http://mail.python.org/mailman/listinfo/pypy-dev One thing we have to consider is how do you write setup.py (or requirements.txt) in case you need to install cffi on cpython but not pypy From flyflybutters at hotmail.com Thu Feb 21 19:01:04 2013 From: flyflybutters at hotmail.com (stone node) Date: Thu, 21 Feb 2013 13:01:04 -0500 Subject: [pypy-dev] Can I install pypy on Mac osx 32 bit system Message-ID: Hi all, I want to install pypy on Mac 32bit system, but I didn't see a binary version for this configuration. Could anyone tell me how to do that? Thanks very much, Hang -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ddvento at ucar.edu Thu Feb 21 20:43:58 2013 From: ddvento at ucar.edu (Davide Del Vento) Date: Thu, 21 Feb 2013 12:43:58 -0700 Subject: [pypy-dev] pypy and PYTHONPATH Message-ID: <512678FE.7060206@ucar.edu> Folks, I've just installed pypy on a production machine which makes use of PYTHONPATH to let the user pick and chose python libraries installed in non-standard directories. I am wondering if I should wrap pypy with a shell script forcing the -E setting, to prevent picking libraries that it should not (such as numpy and scipy). Thanks! Davide Del Vento, NCAR Computational & Information Services Laboratory Consulting Services Software Engineer http://www2.cisl.ucar.edu/uss/csg/ SEA Chair http://sea.ucar.edu/ From kennylevinsen at gmail.com Thu Feb 21 22:55:04 2013 From: kennylevinsen at gmail.com (Kenny Lasse Hoff Levinsen) Date: Thu, 21 Feb 2013 22:55:04 +0100 Subject: [pypy-dev] Can I install pypy on Mac osx 32 bit system In-Reply-To: References: Message-ID: <94A0BA18-4487-41D0-875D-FAA3D623587F@gmail.com> Hi Hang, You are indeed correct in the lack of binaries - I'm not sure if anyone have tested that configuration. I'm curious as to why you need a 32-bit binary - Unless I'm having a major brainfart here, you'd need to have a pre-Core2 Mac in order for 64-bit to be a problem. Starting with Core2 (and whatever version of 10.4 Tiger was available at the time), 64-bit has been supported on all Macs? (They even dropped support for 32-bit kernel in 10.8) Oh well. Long story short, you're most likely going to end up translating PyPy yourself, in which case I will refer you to the wiki [link], as well as the IRC channel [#pypy at irc.freenode.net] for further assistance. Good luck! Kenny Levinsen // joushou On Feb 21, 2013, at 7:01 PM, stone node wrote: > Hi all, > > I want to install pypy on Mac 32bit system, but I didn't see a binary version for this configuration. > > Could anyone tell me how to do that? 
> > Thanks very much, > > Hang > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > http://mail.python.org/mailman/listinfo/pypy-dev -------------- next part -------------- An HTML attachment was scrubbed... URL: From joehillen at gmail.com Fri Feb 22 07:39:21 2013 From: joehillen at gmail.com (Joe Hillenbrand) Date: Thu, 21 Feb 2013 22:39:21 -0800 Subject: [pypy-dev] HTML Parser? In-Reply-To: References: Message-ID: Great to hear! I just got it working with scrapy. Unfortunately there wasn't any speedup. A normal crawl in CPython takes: real 1m32.238s user 0m56.576s sys 0m1.208s In PyPy: real 1m54.098s user 1m18.105s sys 0m1.372s Thanks for all your hard work. -Joe On Wed, Feb 20, 2013 at 11:28 AM, Amaury Forgeot d'Arc wrote: > > 2013/2/20 Maciej Fijalkowski > >> Is it working on released cffi or on cffi that's in-development or you >> need patches? >> > > It developed it with a nightly build from mid-January, > and the cffi library that was available at the time. > It's now released as cffi 0.5 I think. > > I did not test with CPython at all. > > At the time cffi used to return enum values as strings, > but I just tested with the last version of cffi and pypy nightly build, > and tests still pass! > > Ran 1006 tests in 34.730s > FAILED (failures=1) > and the only failure is:: > self.assertTrue(hasattr(self.etree, '_import_c_api')) > :-) > > -- > Amaury Forgeot d'Arc > -------------- next part -------------- An HTML attachment was scrubbed... URL: From kostia.lopuhin at gmail.com Fri Feb 22 07:49:53 2013 From: kostia.lopuhin at gmail.com (=?KOI8-R?B?68/T1NEg7M/Q1cjJzg==?=) Date: Fri, 22 Feb 2013 10:49:53 +0400 Subject: [pypy-dev] [rpython] What might prevent a function call from beeing invlined by the JIT? Message-ID: In what cases does the jit decide not to inline a function call, but place "call_may_force" instead? 
The context is that I have a simple interpreter, like an expanded kermit, and I am testing how the jit helps - it perfectly unboxes wrapped objects in a loop, but does not inline function calls - I wonder what am I missing here. I found a place where PyPy interpreter makes a call and did not see anything special there. From fijall at gmail.com Fri Feb 22 11:19:34 2013 From: fijall at gmail.com (Maciej Fijalkowski) Date: Fri, 22 Feb 2013 12:19:34 +0200 Subject: [pypy-dev] HTML Parser? In-Reply-To: References: Message-ID: On Fri, Feb 22, 2013 at 8:39 AM, Joe Hillenbrand wrote: > Great to hear! I just got it working with scrapy. Unfortunately there wasn't > any speedup. > > A normal crawl in CPython takes: > real 1m32.238s > user 0m56.576s > sys 0m1.208s > > In PyPy: > real 1m54.098s > user 1m18.105s > sys 0m1.372s > > Thanks for all your hard work. > > -Joe lxml-cffi is known to be slower than normal lxml. You'll get speedups if you start doing non-trivial logic in python, probably. For what is worth, cffi is missing a lot of trivial optimizations (and one non-trivial), so there is a lot of room for improvement. From fijall at gmail.com Fri Feb 22 11:27:45 2013 From: fijall at gmail.com (Maciej Fijalkowski) Date: Fri, 22 Feb 2013 12:27:45 +0200 Subject: [pypy-dev] [rpython] What might prevent a function call from beeing invlined by the JIT? In-Reply-To: References: Message-ID: On Fri, Feb 22, 2013 at 8:49 AM, ????? ??????? wrote: > In what cases does the jit decide not to inline a function call, but > place "call_may_force" instead? > The context is that I have a simple interpreter, like an expanded > kermit, and I am testing how the jit helps - it perfectly unboxes > wrapped objects in a loop, but does not inline function calls - I > wonder what am I missing here. I found a place where PyPy interpreter > makes a call and did not see anything special there. call_may_force means you call a function that has a loop that calls back to the intepreter. 
Probably argument handling or so. You need to annotate the function with @jit.unroll_safe (that means that each time the number of iteration is different, you'll get slightly different assembler, so beware) From kostia.lopuhin at gmail.com Fri Feb 22 12:56:33 2013 From: kostia.lopuhin at gmail.com (=?KOI8-R?B?68/T1NEg7M/Q1cjJzg==?=) Date: Fri, 22 Feb 2013 15:56:33 +0400 Subject: [pypy-dev] [rpython] What might prevent a function call from beeing invlined by the JIT? In-Reply-To: References: Message-ID: Yes, it had a little loop that was setting arguments for a function call, now the call is inlined, thank you! But in the inlined code there is immidately "call_assembler", followed by a "keepalive" of the frame. Does it have an equally simple answer? Or it has to do with the function arguments? 2013/2/22 Maciej Fijalkowski : > On Fri, Feb 22, 2013 at 8:49 AM, ????? ??????? wrote: >> In what cases does the jit decide not to inline a function call, but >> place "call_may_force" instead? >> The context is that I have a simple interpreter, like an expanded >> kermit, and I am testing how the jit helps - it perfectly unboxes >> wrapped objects in a loop, but does not inline function calls - I >> wonder what am I missing here. I found a place where PyPy interpreter >> makes a call and did not see anything special there. > > call_may_force means you call a function that has a loop that calls > back to the intepreter. Probably argument handling or so. You need to > annotate the function with @jit.unroll_safe (that means that each time > the number of iteration is different, you'll get slightly different > assembler, so beware) 2013/2/22 Maciej Fijalkowski : > On Fri, Feb 22, 2013 at 8:49 AM, ????? ??????? wrote: >> In what cases does the jit decide not to inline a function call, but >> place "call_may_force" instead? 
>> The context is that I have a simple interpreter, like an expanded >> kermit, and I am testing how the jit helps - it perfectly unboxes >> wrapped objects in a loop, but does not inline function calls - I >> wonder what am I missing here. I found a place where PyPy interpreter >> makes a call and did not see anything special there. > > call_may_force means you call a function that has a loop that calls > back to the intepreter. Probably argument handling or so. You need to > annotate the function with @jit.unroll_safe (that means that each time > the number of iteration is different, you'll get slightly different > assembler, so beware) 2013/2/22 Maciej Fijalkowski : > On Fri, Feb 22, 2013 at 8:49 AM, ????? ??????? wrote: >> In what cases does the jit decide not to inline a function call, but >> place "call_may_force" instead? >> The context is that I have a simple interpreter, like an expanded >> kermit, and I am testing how the jit helps - it perfectly unboxes >> wrapped objects in a loop, but does not inline function calls - I >> wonder what am I missing here. I found a place where PyPy interpreter >> makes a call and did not see anything special there. > > call_may_force means you call a function that has a loop that calls > back to the intepreter. Probably argument handling or so. You need to > annotate the function with @jit.unroll_safe (that means that each time > the number of iteration is different, you'll get slightly different > assembler, so beware) From fijall at gmail.com Fri Feb 22 12:58:01 2013 From: fijall at gmail.com (Maciej Fijalkowski) Date: Fri, 22 Feb 2013 13:58:01 +0200 Subject: [pypy-dev] [rpython] What might prevent a function call from beeing invlined by the JIT? In-Reply-To: References: Message-ID: On Fri, Feb 22, 2013 at 1:56 PM, ????? ??????? wrote: > Yes, it had a little loop that was setting arguments for a function > call, now the call is inlined, thank you! 
But in the inlined code > there is immidately "call_assembler", followed by a "keepalive" of the > frame. Does it have an equally simple answer? Or it has to do with the > function arguments? call_assembler means that the function you call has a loop (such functions are not inlined, only a part of it), so you end up with a call. IF you call a simpler function with no loops, the call will disappear > > 2013/2/22 Maciej Fijalkowski : >> On Fri, Feb 22, 2013 at 8:49 AM, ????? ??????? wrote: >>> In what cases does the jit decide not to inline a function call, but >>> place "call_may_force" instead? >>> The context is that I have a simple interpreter, like an expanded >>> kermit, and I am testing how the jit helps - it perfectly unboxes >>> wrapped objects in a loop, but does not inline function calls - I >>> wonder what am I missing here. I found a place where PyPy interpreter >>> makes a call and did not see anything special there. >> >> call_may_force means you call a function that has a loop that calls >> back to the intepreter. Probably argument handling or so. You need to >> annotate the function with @jit.unroll_safe (that means that each time >> the number of iteration is different, you'll get slightly different >> assembler, so beware) > > > 2013/2/22 Maciej Fijalkowski : >> On Fri, Feb 22, 2013 at 8:49 AM, ????? ??????? wrote: >>> In what cases does the jit decide not to inline a function call, but >>> place "call_may_force" instead? >>> The context is that I have a simple interpreter, like an expanded >>> kermit, and I am testing how the jit helps - it perfectly unboxes >>> wrapped objects in a loop, but does not inline function calls - I >>> wonder what am I missing here. I found a place where PyPy interpreter >>> makes a call and did not see anything special there. >> >> call_may_force means you call a function that has a loop that calls >> back to the intepreter. Probably argument handling or so. 
You need to >> annotate the function with @jit.unroll_safe (that means that each time >> the number of iteration is different, you'll get slightly different >> assembler, so beware) > > > 2013/2/22 Maciej Fijalkowski : >> On Fri, Feb 22, 2013 at 8:49 AM, ????? ??????? wrote: >>> In what cases does the jit decide not to inline a function call, but >>> place "call_may_force" instead? >>> The context is that I have a simple interpreter, like an expanded >>> kermit, and I am testing how the jit helps - it perfectly unboxes >>> wrapped objects in a loop, but does not inline function calls - I >>> wonder what am I missing here. I found a place where PyPy interpreter >>> makes a call and did not see anything special there. >> >> call_may_force means you call a function that has a loop that calls >> back to the intepreter. Probably argument handling or so. You need to >> annotate the function with @jit.unroll_safe (that means that each time >> the number of iteration is different, you'll get slightly different >> assembler, so beware) From ddvento at ucar.edu Fri Feb 22 21:22:34 2013 From: ddvento at ucar.edu (Davide Del Vento) Date: Fri, 22 Feb 2013 13:22:34 -0700 Subject: [pypy-dev] how to check if jit is available in my build Message-ID: <5127D38A.4040202@ucar.edu> Folks, I compiled pypy 1.9 and 2.0-beta1 from source, and the few small tests I ran were slower than expected. I am wondering if I did everything "right" and if there is a runtime check that would give me a definitive answer to the question "is jit available in this build"? Google seems to not know the answer. Thanks Davide From alex.gaynor at gmail.com Fri Feb 22 21:54:51 2013 From: alex.gaynor at gmail.com (Alex Gaynor) Date: Fri, 22 Feb 2013 12:54:51 -0800 Subject: [pypy-dev] how to check if jit is available in my build In-Reply-To: <5127D38A.4040202@ucar.edu> References: <5127D38A.4040202@ucar.edu> Message-ID: sys.pypy_translation_info["translation.jit"] will tell you definitely. 
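Wrapped up as a small helper that also runs on CPython (where the attribute simply does not exist), the check could look like this sketch:

```python
import sys

def jit_available():
    # sys.pypy_translation_info exists only on PyPy; its
    # "translation.jit" key records whether the build was
    # translated with the JIT. On CPython the attribute is
    # absent, so we report False.
    info = getattr(sys, 'pypy_translation_info', {})
    return bool(info.get('translation.jit', False))

print(jit_available())
```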
Alex On Fri, Feb 22, 2013 at 12:22 PM, Davide Del Vento wrote: > Folks, > > I compiled pypy 1.9 and 2.0-beta1 from source, and the few small tests I > ran were slower than expected. I am wondering if I did everything "right" > and if there is a runtime check that would give me a definitive answer to > the question "is jit available in this build"? > > Google seems to not know the answer. > > Thanks > Davide > ______________________________**_________________ > pypy-dev mailing list > pypy-dev at python.org > http://mail.python.org/**mailman/listinfo/pypy-dev > -- "I disapprove of what you say, but I will defend to the death your right to say it." -- Evelyn Beatrice Hall (summarizing Voltaire) "The people's good is the highest law." -- Cicero -------------- next part -------------- An HTML attachment was scrubbed... URL: From ddvento at ucar.edu Fri Feb 22 22:01:05 2013 From: ddvento at ucar.edu (Davide Del Vento) Date: Fri, 22 Feb 2013 14:01:05 -0700 Subject: [pypy-dev] how to check if jit is available in my build In-Reply-To: References: <5127D38A.4040202@ucar.edu> Message-ID: <5127DC91.2080407@ucar.edu> Thanks to both. I can import pypyjit and sys.pypy_translation_info["translation.jit"] is True (for both 2.9 and 2.0-beta1). Have a nice weekend, Davide On 02/22/2013 01:54 PM, Alex Gaynor wrote: > sys.pypy_translation_info["translation.jit"] > > will tell you definitely. > > Alex > > > On Fri, Feb 22, 2013 at 12:22 PM, Davide Del Vento > wrote: > > Folks, > > I compiled pypy 1.9 and 2.0-beta1 from source, and the few small > tests I ran were slower than expected. I am wondering if I did > everything "right" and if there is a runtime check that would give > me a definitive answer to the question "is jit available in this build"? > > Google seems to not know the answer. 
> > Thanks > Davide > _________________________________________________ > pypy-dev mailing list > pypy-dev at python.org > http://mail.python.org/__mailman/listinfo/pypy-dev > > > > > > -- > "I disapprove of what you say, but I will defend to the death your right > to say it." -- Evelyn Beatrice Hall (summarizing Voltaire) > "The people's good is the highest law." -- Cicero From razvan.ghitulete at gmail.com Sun Feb 24 17:48:58 2013 From: razvan.ghitulete at gmail.com (Ghitulete Razvan) Date: Sun, 24 Feb 2013 18:48:58 +0200 Subject: [pypy-dev] translating pypy benchmarks to C Message-ID: Hi, I've been trying for some time now to translate python code into C. After playing around with the pypy interactive translator shell and worked just fine. But the I tried to translate the pypy benchmarks from speed.pypy.orginto C code, but I seem to be running into all kinds of trouble. So far I've tried with bm_ai.py which seems to fail because it uses closures(or so i'm told by the translator), and bm_threading.py seems to fail while processing threading.py. Is there something I'm doing wrong? P.S.: I'm simply running the translator.py script with -s option on slightly modified versions of the above mentioned files(adding an entry point). -- Sincerely, Razvan Ghitulete Universitatea Politehnica Bucuresti -------------- next part -------------- An HTML attachment was scrubbed... URL: From alex.gaynor at gmail.com Sun Feb 24 17:58:39 2013 From: alex.gaynor at gmail.com (Alex Gaynor) Date: Sun, 24 Feb 2013 08:58:39 -0800 Subject: [pypy-dev] translating pypy benchmarks to C In-Reply-To: References: Message-ID: Hi, Why are you trying to do this? The translator doesn't handle random Python, only RPython. Alex On Sun, Feb 24, 2013 at 8:48 AM, Ghitulete Razvan < razvan.ghitulete at gmail.com> wrote: > Hi, > > I've been trying for some time now to translate python code into C. After > playing around with the pypy interactive translator shell and worked just > fine. 
But then I tried to translate the pypy benchmarks from speed.pypy.org into C code, but I seem to be running into all kinds of trouble. > > So far I've tried with bm_ai.py which seems to fail because it uses > closures(or so i'm told by the translator), and bm_threading.py seems to > fail while processing threading.py. Is there something I'm doing wrong? > > P.S.: I'm simply running the translator.py script with -s option on > slightly modified versions of the above mentioned files(adding an entry > point). > > -- > Sincerely, > Razvan Ghitulete > Universitatea Politehnica Bucuresti > > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > http://mail.python.org/mailman/listinfo/pypy-dev > > -- "I disapprove of what you say, but I will defend to the death your right to say it." -- Evelyn Beatrice Hall (summarizing Voltaire) "The people's good is the highest law." -- Cicero -------------- next part -------------- An HTML attachment was scrubbed... URL: From razvan.ghitulete at gmail.com Sun Feb 24 18:29:24 2013 From: razvan.ghitulete at gmail.com (Ghitulete Razvan) Date: Sun, 24 Feb 2013 19:29:24 +0200 Subject: [pypy-dev] translating pypy benchmarks to C In-Reply-To: References: Message-ID: On Sun, Feb 24, 2013 at 6:58 PM, Alex Gaynor wrote: > Hi, > > Why are you trying to do this? The translator doesn't handle random > Python, only RPython. > > Alex > > I am working on a research project to run python on a baremetal system. So I basically need a way of translating python code into something that can run on baremetal, hence C. After that I want to see whether it is worth it or not, and this is why I am trying to translate the benchmarks(as to have a common denominator). Is there any way I can get threading to work with the translator. That is, are there any threading implementations available in RPython code?
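As background to the P.S. above about "adding an entry point": the translator does not start from `if __name__ == '__main__'` but from a fixed hook it looks up in the target file. A minimal skeleton of that convention (a sketch — the names follow the usual RPython targets, not the modified benchmark files themselves, and untranslated it simply runs as ordinary Python):

```python
def entry_point(argv):
    # The compiled binary starts here; argv is the process argument list
    # and the return value becomes the exit code.  Everything reachable
    # from this function must stay inside the RPython subset.
    print("hello from %s" % argv[0])
    return 0

def target(driver, args):
    # Hook the RPython translation driver looks for: it returns the
    # entry point of the future binary.
    return entry_point, None

if __name__ == "__main__":
    # Untranslated, the same skeleton runs as plain Python:
    import sys
    entry_point(sys.argv)
```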
-- Sincerely, Razvan Ghitulete Universitatea Politehnica Bucuresti -------------- next part -------------- An HTML attachment was scrubbed... URL: From fijall at gmail.com Sun Feb 24 18:37:30 2013 From: fijall at gmail.com (Maciej Fijalkowski) Date: Sun, 24 Feb 2013 19:37:30 +0200 Subject: [pypy-dev] translating pypy benchmarks to C In-Reply-To: References: Message-ID: On Sun, Feb 24, 2013 at 7:29 PM, Ghitulete Razvan wrote: > On Sun, Feb 24, 2013 at 6:58 PM, Alex Gaynor wrote: >> >> Hi, >> >> Why are you trying to do this? The translator doesn't handle random >> Python, only RPython. >> >> Alex >> > > I am working on a research project to run python on a baremetal system. So I > basically need a way of translating python code into something that can run > on baremetal, hence C. After that I want to see whether it is worth it or > not, and this is why I am trying to translate the benchmarks(as to have a > common denominator). > > Is there any way I can get threading work with the translator. That is, are > there any threading implementations available in RPython code? > > -- > Sincerely, > Razvan Ghitulete > Universitatea Politehnica Bucuresti Hi There is some support for threading in RPython, see rlib.rthread. RPython might not be your good buddy here. The executables built are rather large and the GC (at least on default settings) will not be that interesting for bare metal. What sort of parameters are you looking at? Also, using RPython is really tedious, you have been warned (it's *not* Python). From mail at justinbogner.com Sun Feb 24 18:46:44 2013 From: mail at justinbogner.com (Justin Bogner) Date: Sun, 24 Feb 2013 10:46:44 -0700 Subject: [pypy-dev] Updates to hgignore for rpython split Message-ID: <87fw0lzjsr.fsf@justinbogner.com> The .hgignore file got missed when we moved rpython/ out of pypy/. Can someone take a look at this patch and commit it if it looks good? Thanks. -------------- next part -------------- A non-text attachment was scrubbed... 
Name: hgignore.patch Type: text/x-diff Size: 2768 bytes Desc: not available URL: From razvan.ghitulete at gmail.com Sun Feb 24 18:48:44 2013 From: razvan.ghitulete at gmail.com (Ghitulete Razvan) Date: Sun, 24 Feb 2013 19:48:44 +0200 Subject: [pypy-dev] translating pypy benchmarks to C In-Reply-To: References: Message-ID: On Sun, Feb 24, 2013 at 7:37 PM, Maciej Fijalkowski wrote: > > Hi > > There is some support for threading in RPython, see rlib.rthread. > > RPython might not be your good buddy here. The executables built are > rather large and the GC (at least on default settings) will not be > that interesting for bare metal. What sort of parameters are you > looking at? > > Also, using RPython is really tedious, you have been warned (it's *not* > Python). > Well, I don't seem to have much of a choice as I basically need source code out of python and not a binary. Also from what I see there is no translator that successfully translates full Python code into C/C++. As for parameters I don't care that much about the binary as I am not running in a resource restricted environment. I am actually running the baremetal binary on a x86_64 workstation. -- Sincerely, Razvan Ghitulete Universitatea Politehnica Bucuresti -------------- next part -------------- An HTML attachment was scrubbed... URL: From douwe at triposo.com Sun Feb 24 18:54:05 2013 From: douwe at triposo.com (Douwe Osinga) Date: Sun, 24 Feb 2013 18:54:05 +0100 Subject: [pypy-dev] translating pypy benchmarks to C In-Reply-To: References: Message-ID: Have you tried cython? douwe at triposo.com | +49-(0)-1573-4469916 | @dosinga On Sun, Feb 24, 2013 at 6:48 PM, Ghitulete Razvan wrote: > On Sun, Feb 24, 2013 at 7:37 PM, Maciej Fijalkowski > wrote: >> >> >> Hi >> >> There is some support for threading in RPython, see rlib.rthread. >> >> RPython might not be your good buddy here. The executables built are >> rather large and the GC (at least on default settings) will not be >> that interesting for bare metal. 
What sort of parameters are you >> looking at? >> >> Also, using RPython is really tedious, you have been warned (it's *not* >> Python). > > > Well, I don't seem to have much of a choice as I basically need source code > out of python and not a binary. Also from what I see there is no translator > that successfully translates full Python code into C/C++. As for parameters > I don't care that much about the binary as I am not running in a resource > restricted environment. I am actually running the baremetal binary on a > x86_64 workstation. > > > -- > Sincerely, > Razvan Ghitulete > Universitatea Politehnica Bucuresti > > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > http://mail.python.org/mailman/listinfo/pypy-dev > From Ronny.Pfannschmidt at gmx.de Sun Feb 24 18:58:25 2013 From: Ronny.Pfannschmidt at gmx.de (Ronny Pfannschmidt) Date: Sun, 24 Feb 2013 18:58:25 +0100 Subject: [pypy-dev] translating pypy benchmarks to C In-Reply-To: References: Message-ID: <512A54C1.1020506@gmx.de> are you trying to write an operating system in "python"? On 02/24/2013 06:48 PM, Ghitulete Razvan wrote: > On Sun, Feb 24, 2013 at 7:37 PM, Maciej Fijalkowski > wrote: > > > Hi > > There is some support for threading in RPython, see rlib.rthread. > > RPython might not be your good buddy here. The executables built are > rather large and the GC (at least on default settings) will not be > that interesting for bare metal. What sort of parameters are you > looking at? > > Also, using RPython is really tedious, you have been warned (it's > *not* Python). > > > Well, I don't seem to have much of a choice as I basically need source > code out of python and not a binary. Also from what I see there is no > translator that successfully translates full Python code into C/C++. As > for parameters I don't care that much about the binary as I am not > running in a resource restricted environment. 
I am actually running the > baremetal binary on a x86_64 workstation. > > > -- > Sincerely, > Razvan Ghitulete > Universitatea Politehnica Bucuresti > > > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > http://mail.python.org/mailman/listinfo/pypy-dev From razvan.ghitulete at gmail.com Sun Feb 24 19:09:20 2013 From: razvan.ghitulete at gmail.com (Ghitulete Razvan) Date: Sun, 24 Feb 2013 20:09:20 +0200 Subject: [pypy-dev] translating pypy benchmarks to C In-Reply-To: <512A54C1.1020506@gmx.de> References: <512A54C1.1020506@gmx.de> Message-ID: On Sun, Feb 24, 2013 at 7:58 PM, Ronny Pfannschmidt < Ronny.Pfannschmidt at gmx.de> wrote: > are you trying to write an operating system in "python"? > > No, I actually want to see how fast can python code go. Writing an operating system in python would be quite crazy, and I am still rather sane. Maybe in the future though. On Sun, Feb 24, 2013 at 7:54 PM, Douwe Osinga wrote: > Have you tried cython? > Actually I haven't. I remember of reading their front page, but it doesn't say anything explicitly there about converting to C code. Now that I've taken a better look, it seems interesting enough. I'll try and check it out. -- Sincerely, Razvan Ghitulete Universitatea Politehnica Bucuresti -------------- next part -------------- An HTML attachment was scrubbed... URL: From Ronny.Pfannschmidt at gmx.de Sun Feb 24 19:22:27 2013 From: Ronny.Pfannschmidt at gmx.de (Ronny Pfannschmidt) Date: Sun, 24 Feb 2013 19:22:27 +0100 Subject: [pypy-dev] translating pypy benchmarks to C In-Reply-To: References: <512A54C1.1020506@gmx.de> Message-ID: <512A5A63.9060505@gmx.de> On 02/24/2013 07:09 PM, Ghitulete Razvan wrote: > On Sun, Feb 24, 2013 at 7:58 PM, Ronny Pfannschmidt > > wrote: > > are you trying to write an operating system in "python"? > > No, I actually want to see how fast can python code go. Writing an > operating system in python would be quite crazy, and I am still rather > sane. 
Maybe in the future though. then i dont quite get why you want to use rpython - pypy+jit should do > > On Sun, Feb 24, 2013 at 7:54 PM, Douwe Osinga > wrote: > > Have you tried cython? > > Actually I haven't. I remember of reading their front page, but it > doesn't say anything explicitly there about converting to C code. Now > that I've taken a better look, it seems interesting enough. I'll try and > check it out. > > > -- > Sincerely, > Razvan Ghitulete > Universitatea Politehnica Bucuresti From razvan.ghitulete at gmail.com Sun Feb 24 19:44:19 2013 From: razvan.ghitulete at gmail.com (Ghitulete Razvan) Date: Sun, 24 Feb 2013 20:44:19 +0200 Subject: [pypy-dev] translating pypy benchmarks to C In-Reply-To: <512A5A63.9060505@gmx.de> References: <512A54C1.1020506@gmx.de> <512A5A63.9060505@gmx.de> Message-ID: On Sun, Feb 24, 2013 at 8:22 PM, Ronny Pfannschmidt < Ronny.Pfannschmidt at gmx.de> wrote: > then i dont quite get why you want to use rpython - pypy+jit should do > >> >> Ok let me rephrase that, because I fear it might not have been clear. By saying that I do not plan to write an operating system I mean that the resulted binary will not offer facilities to other programs(the common meaning of an operating system). On the other hand, by running on baremetal I mean that there will not actually be any operating system around to offer support and all code needs to be in binary form so that it can run. So yes, you can say that the resulting binary will be an operating system that will be aimed of doing a single task(in this case running various python benchmarks). -- Sincerely, Razvan Ghitulete Universitatea Politehnica Bucuresti -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From alex.gaynor at gmail.com Sun Feb 24 19:47:03 2013 From: alex.gaynor at gmail.com (Alex Gaynor) Date: Sun, 24 Feb 2013 10:47:03 -0800 Subject: [pypy-dev] translating pypy benchmarks to C In-Reply-To: References: <512A54C1.1020506@gmx.de> <512A5A63.9060505@gmx.de> Message-ID: What you are doing will not generate any information about how fast Python can be. It will show you the speed of RPython or Cython on baremetal, these are *NOT* python. Alex On Sun, Feb 24, 2013 at 10:44 AM, Ghitulete Razvan < razvan.ghitulete at gmail.com> wrote: > On Sun, Feb 24, 2013 at 8:22 PM, Ronny Pfannschmidt < > Ronny.Pfannschmidt at gmx.de> wrote: > >> then i dont quite get why you want to use rpython - pypy+jit should do >> >>> >>> > Ok let me rephrase that, because I fear it might not have been clear. By > saying that I do not plan to write an operating system I mean that the > resulted binary will not offer facilities to other programs(the common > meaning of an operating system). On the other hand, by running on baremetal > I mean that there will not actually be any operating system around to offer > support and all code needs to be in binary form so that it can run. So yes, > you can say that the resulting binary will be an operating system that will > be aimed of doing a single task(in this case running various python > benchmarks). > > -- > Sincerely, > Razvan Ghitulete > Universitatea Politehnica Bucuresti > > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > http://mail.python.org/mailman/listinfo/pypy-dev > > -- "I disapprove of what you say, but I will defend to the death your right to say it." -- Evelyn Beatrice Hall (summarizing Voltaire) "The people's good is the highest law." -- Cicero -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From santagada at gmail.com Sun Feb 24 20:17:07 2013 From: santagada at gmail.com (Leonardo Santagada) Date: Sun, 24 Feb 2013 16:17:07 -0300 Subject: [pypy-dev] translating pypy benchmarks to C In-Reply-To: References: <512A54C1.1020506@gmx.de> <512A5A63.9060505@gmx.de> Message-ID: On Sun, Feb 24, 2013 at 3:44 PM, Ghitulete Razvan < razvan.ghitulete at gmail.com> wrote: > On Sun, Feb 24, 2013 at 8:22 PM, Ronny Pfannschmidt < > Ronny.Pfannschmidt at gmx.de> wrote: > >> then i dont quite get why you want to use rpython - pypy+jit should do >> >>> >>> > Ok let me rephrase that, because I fear it might not have been clear. By > saying that I do not plan to write an operating system I mean that the > resulted binary will not offer facilities to other programs(the common > meaning of an operating system). On the other hand, by running on baremetal > I mean that there will not actually be any operating system around to offer > support and all code needs to be in binary form so that it can run. So yes, > you can say that the resulting binary will be an operating system that will > be aimed of doing a single task(in this case running various python > benchmarks). > So what I think you need is a pypy binary that can run without an os... the pypy binary needs a libc to access stuff, if you have one that you are using with other C software in your project maybe you can port pypy to it... probably a pthreads library will also be needed. What you need is to define a new platform and port the whole pypy to it... probably cross compiling from linux. I think that is how the arm port works and should be doable. -- Leonardo Santagada -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From davidmenhur at gmail.com Sun Feb 24 21:10:00 2013 From: davidmenhur at gmail.com (=?UTF-8?B?RGHPgGlk?=) Date: Sun, 24 Feb 2013 21:10:00 +0100 Subject: [pypy-dev] translating pypy benchmarks to C In-Reply-To: References: Message-ID: On 24 February 2013 18:54, Douwe Osinga wrote: > Have you tried cython? > Another possibility is Shedskin: "an *experimental* compiler, that can translate pure, but *implicitly statically typed* Python (2.4-2.6) programs into optimized C++. It can generate stand-alone programs or *extension modules* that can be imported and used in larger Python programs." https://code.google.com/p/shedskin/ It (intentionally) does not support type annotation, because they want to know how far they can go without it. David. -------------- next part -------------- An HTML attachment was scrubbed... URL: From fijall at gmail.com Sun Feb 24 21:33:21 2013 From: fijall at gmail.com (Maciej Fijalkowski) Date: Sun, 24 Feb 2013 22:33:21 +0200 Subject: [pypy-dev] translating pypy benchmarks to C In-Reply-To: References: Message-ID: On Sun, Feb 24, 2013 at 10:10 PM, Da?id wrote: > On 24 February 2013 18:54, Douwe Osinga wrote: >> >> Have you tried cython? > > > Another possibility is Shedskin: "an experimental compiler, that can > translate pure, but implicitly statically typed Python (2.4-2.6) programs > into optimized C++. It can generate stand-alone programs or extension > modules that can be imported and used in larger Python programs." > > https://code.google.com/p/shedskin/ > > It (intentionally) does not support type annotation, because they want to > know how far they can go without it. Shedskin is again, just as bad as RPython (and maybe worse) > > > David. 
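To make "implicitly statically typed" concrete: compilers in this family infer one fixed type per name instead of reading annotations, so a function like the following qualifies, while reassigning `total` to a string later would not (an illustrative sketch, not code from the Shedskin distribution):

```python
def harmonic(n):
    # Each name keeps a single inferable type throughout: n is always an
    # int, i always an int, total always a float - no annotations needed.
    total = 0.0
    for i in range(1, n + 1):
        total += 1.0 / i
    return total

print(harmonic(10))
```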
> > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > http://mail.python.org/mailman/listinfo/pypy-dev > From arigo at tunes.org Sun Feb 24 22:14:48 2013 From: arigo at tunes.org (Armin Rigo) Date: Sun, 24 Feb 2013 22:14:48 +0100 Subject: [pypy-dev] translating pypy benchmarks to C In-Reply-To: References: <512A54C1.1020506@gmx.de> <512A5A63.9060505@gmx.de> Message-ID: Hi Ghitulete, So are you saying that you don't want to use CPython because it's C, and you want to try C-less alternatives, or at least things that don't use libc? Then look elsewhere. An RPython program (which is definitely something different than a Python program) is translated to C code that uses libc. Changing this would be possible, but certainly not less work than, say, changing CPython to not use the libc. Which, I seem to recall, has been done long ago in an experiment of "booting CPython". Either way, I'm rather sure that this has nothing to do with seeing how fast Python runs. Using a regular PyPy is more or less the fastest known way to run full pure Python code, so far. If you're rather interested in restricted subsets of Python or other Python-ish languages, then yes, RPython is one of them, and others have been mentioned in this thread. A bient?t, Armin. From razvan.ghitulete at gmail.com Mon Feb 25 09:13:33 2013 From: razvan.ghitulete at gmail.com (Ghitulete Razvan) Date: Mon, 25 Feb 2013 10:13:33 +0200 Subject: [pypy-dev] translating pypy benchmarks to C In-Reply-To: References: <512A54C1.1020506@gmx.de> <512A5A63.9060505@gmx.de> Message-ID: On Sun, Feb 24, 2013 at 8:47 PM, Alex Gaynor wrote: > What you are doing will not generate any information about how fast Python > can be. It will show you the speed of RPython or Cython on baremetal, these > are *NOT* python. > > I really disapprove of this language purity stuff. If it compiles, it works. If it runs it's perfect. 
The idea behind this attempt is to see what can be done if one removes all possible overhead. So I would not like to go down that rabbit hole. On Sun, Feb 24, 2013 at 11:14 PM, Armin Rigo wrote: > Hi Ghitulete, > > So are you saying that you don't want to use CPython because it's C, > and you want to try C-less alternatives, or at least things that don't > use libc? Then look elsewhere. An RPython program (which is > definitely something different than a Python program) is translated to > C code that uses libc. Changing this would be possible, but certainly > not less work than, say, changing CPython to not use the libc. Which, > I seem to recall, has been done long ago in an experiment of "booting > CPython". > I have never said I want to try C-less alternatives, but as to my knowledge the only common ground between CPython and C, is that part of CPython is written in C, as opposed to generating C code. What I need is to get the equivalent C code of a python program. CPython on the other hand would need to have a VM to run the bytecode in, which I do not plan on doing. On Sun, Feb 24, 2013 at 9:17 PM, Leonardo Santagada wrote: > > So what I think you need is a pypy binary that can run without an os... > the pypy binary needs a libc to access stuff, if you have one that you are > using with other C software in your project maybe you can port pypy to > it... probably a pthreads library will also be needed. What you need is to > define a new platform and port the whole pypy to it... probably cross > compiling from linux. I think that is how the arm port works and should be > doable. > > I have pondered on doing that, but even though it is doable, it would require quite an effort as it would need a more complete environment than what I already have. Also, by porting pypy I would yet again get another layer between python code and hardware.
-- Sincerely, Razvan Ghitulete Universitatea Politehnica Bucuresti -------------- next part -------------- An HTML attachment was scrubbed... URL: From arigo at tunes.org Mon Feb 25 09:43:29 2013 From: arigo at tunes.org (Armin Rigo) Date: Mon, 25 Feb 2013 09:43:29 +0100 Subject: [pypy-dev] translating pypy benchmarks to C In-Reply-To: References: <512A54C1.1020506@gmx.de> <512A5A63.9060505@gmx.de> Message-ID: Hi, On Mon, Feb 25, 2013 at 9:13 AM, Ghitulete Razvan wrote: > On Sun, Feb 24, 2013 at 8:47 PM, Alex Gaynor wrote: >> What you are doing will not generate any information about how fast Python >> can be. It will show you the speed of RPython or Cython on baremetal, these >> are *NOT* python. > > I really disapprove of this language purity stuff. If it compiles, it works. The point that Alex tried to make is that if you take a random medium-sized Python program, like any of the benchmarks you wanted to use, then it is very unlikely that just by chance they happen to also be valid RPython or Cython code. In order to take a Python program that was not meant to be RPython, and turn it into RPython, for example, then you need to review and fix it completely --- it is quite an endeavour. I wouldn't call it "language purity stuff" at all. If you don't think what I'm saying here makes sense, just try. A bient?t, Armin. From razvan.ghitulete at gmail.com Mon Feb 25 09:58:51 2013 From: razvan.ghitulete at gmail.com (Ghitulete Razvan) Date: Mon, 25 Feb 2013 10:58:51 +0200 Subject: [pypy-dev] translating pypy benchmarks to C In-Reply-To: References: <512A54C1.1020506@gmx.de> <512A5A63.9060505@gmx.de> Message-ID: Hi, On Mon, Feb 25, 2013 at 10:43 AM, Armin Rigo wrote: > Hi, > > The point that Alex tried to make is that if you take a random > medium-sized Python program, like any of the benchmarks you wanted to > use, then it is very unlikely that just by chance they happen to also > be valid RPython or Cython code. 
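Armin's point fits in a few lines: both functions below are ordinary Python, yet the RPython annotator would reject them — the first because `x` has no single static type, the second because it closes over mutable state (the same closure issue bm_ai.py hit earlier in the thread). An illustrative sketch, not code from the benchmarks:

```python
def mixed(flag):
    # Fine in Python, rejected by RPython's annotator: x is an int on
    # one path and a string on the other, so no single static type fits.
    if flag:
        x = 42
    else:
        x = "forty-two"
    return x

def make_counter():
    # Fine in Python, outside the RPython subset: the inner function
    # closes over and mutates state from the enclosing scope.
    count = [0]
    def bump():
        count[0] += 1
        return count[0]
    return bump

bump = make_counter()
print(mixed(True), bump(), bump())
```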
In order to take a Python program > that was not meant to be RPython, and turn it into RPython, for > example, then you need to review and fix it completely --- it is quite > an endeavour. I wouldn't call it "language purity stuff" at all. If > you don't think what I'm saying here makes sense, just try. > > It's not that I don't think it makes sense and I am pretty sure it will prove to be an ordeal. Also, the reason I said that I don't want to go into language purity arguments is that it is actually pretty obvious that unless you get an actual VM you cannot say that the end result is pure python as some features of the language are really hard to get when you are using a direct translation. Though I am curious as to why there isn't any `full` python code translator. -- Sincerely, Razvan Ghitulete Universitatea Politehnica Bucuresti -------------- next part -------------- An HTML attachment was scrubbed... URL: From Ronny.Pfannschmidt at gmx.de Tue Feb 26 15:29:59 2013 From: Ronny.Pfannschmidt at gmx.de (Ronny Pfannschmidt) Date: Tue, 26 Feb 2013 15:29:59 +0100 Subject: [pypy-dev] continulet stacks on the heap and other schemes for io/threading Message-ID: <512CC6E7.1040209@gmx.de> Hi, over the last few weeks few ideas have been brooding in the back of my head, in particular after seeing how rust creates its stacks and handles io. The basis is allocating all stacks as non-movable structures on the heap. This would remove the need to copy the c/rpython level stack for continulets and enable to move them between native threads which is essentially enabling a M:N threading scheme. On top of that i would like to introduce transformation similar to the sandbox that would defer all IO to a io loop in a separate thread. 
Additionally it should change the threading abstractions to use said continuation instead of os level threads the result in case of a success would be a python that defers all blocking operations to a io loop in a separate thread im currently investigating libuv, since it does well for async io and also has utilities to defer blocking calls to c code to a pool of native threads -- Ronny From fijall at gmail.com Tue Feb 26 16:08:14 2013 From: fijall at gmail.com (Maciej Fijalkowski) Date: Tue, 26 Feb 2013 17:08:14 +0200 Subject: [pypy-dev] continulet stacks on the heap and other schemes for io/threading In-Reply-To: <512CC6E7.1040209@gmx.de> References: <512CC6E7.1040209@gmx.de> Message-ID: On Tue, Feb 26, 2013 at 4:29 PM, Ronny Pfannschmidt wrote: > Hi, > > over the last few weeks few ideas > have been brooding in the back of my head, > in particular after seeing how rust creates its stacks and handles io. > > The basis is allocating all stacks as non-movable structures on the heap. This is essentially done on jitframe-on-heap (as the name suggests) for the JIT. From fijall at gmail.com Tue Feb 26 16:11:13 2013 From: fijall at gmail.com (Maciej Fijalkowski) Date: Tue, 26 Feb 2013 17:11:13 +0200 Subject: [pypy-dev] cffi in stdlib Message-ID: Hello. I would like to discuss on the language summit a potential inclusion of cffi[1] into stdlib. This is a project Armin Rigo has been working for a while, with some input from other developers. It seems that the main reason why people would prefer ctypes over cffi these days is "because it's included in stdlib", which is not generally the reason I would like to hear. Our calls to not use C extensions and to use an FFI instead has seen very limited success with ctypes and quite a lot more since cffi got released. The API is fairly stable right now with minor changes going in and it'll definitely stablize until Python 3.4 release. 
Notable projects using it: * pypycore - gevent main loop ported to cffi * pgsql2cffi * sdl-cffi bindings * tls-cffi bindings * lxml-cffi port * cairo-cffi * pyzmq * a bunch of others So relatively a lot given that the project is not even a year old (it got 0.1 release in June). As per documentation, the advantages over ctypes: * The goal is to call C code from Python. You should be able to do so without learning a 3rd language: every alternative requires you to learn their own language (Cython, SWIG) or API (ctypes). So we tried to assume that you know Python and C and minimize the extra bits of API that you need to learn. * Keep all the Python-related logic in Python so that you don?t need to write much C code (unlike CPython native C extensions). * Work either at the level of the ABI (Application Binary Interface) or the API (Application Programming Interface). Usually, C libraries have a specified C API but often not an ABI (e.g. they may document a ?struct? as having at least these fields, but maybe more). (ctypes works at the ABI level, whereas Cython and native C extensions work at the API level.) * We try to be complete. For now some C99 constructs are not supported, but all C89 should be, including macros (and including macro ?abuses?, which you can manually wrap in saner-looking C functions). * We attempt to support both PyPy and CPython, with a reasonable path for other Python implementations like IronPython and Jython. * Note that this project is not about embedding executable C code in Python, unlike Weave. This is about calling existing C libraries from Python. so among other things, making a cffi extension gives you the same level of security as writing C (and unlike ctypes) and brings quite a bit more flexibility (API vs ABI issue) that let's you wrap arbitrary libraries, even those full of macros. Cheers, fijal .. 
[1]: http://cffi.readthedocs.org/en/release-0.5/ From fijall at gmail.com Tue Feb 26 16:13:05 2013 From: fijall at gmail.com (Maciej Fijalkowski) Date: Tue, 26 Feb 2013 17:13:05 +0200 Subject: [pypy-dev] cffi in stdlib In-Reply-To: References: Message-ID: Eh, I'm a moron, this was supposed to go to python-dev, not here. please ignore On Tue, Feb 26, 2013 at 5:11 PM, Maciej Fijalkowski wrote: > Hello. > > I would like to discuss on the language summit a potential inclusion > of cffi[1] into stdlib. This is a project Armin Rigo has been working > for a while, with some input from other developers. It seems that the > main reason why people would prefer ctypes over cffi these days is > "because it's included in stdlib", which is not generally the reason I > would like to hear. Our calls to not use C extensions and to use an > FFI instead has seen very limited success with ctypes and quite a lot > more since cffi got released. The API is fairly stable right now with > minor changes going in and it'll definitely stablize until Python 3.4 > release. Notable projects using it: > > * pypycore - gevent main loop ported to cffi > * pgsql2cffi > * sdl-cffi bindings > * tls-cffi bindings > * lxml-cffi port > * cairo-cffi > * pyzmq > * a bunch of others > > So relatively a lot given that the project is not even a year old (it > got 0.1 release in June). As per documentation, the advantages over > ctypes: > > * The goal is to call C code from Python. You should be able to do so > without learning a 3rd language: every alternative requires you to > learn their own language (Cython, SWIG) or API (ctypes). So we tried > to assume that you know Python and C and minimize the extra bits of > API that you need to learn. > > * Keep all the Python-related logic in Python so that you don?t need > to write much C code (unlike CPython native C extensions). > > * Work either at the level of the ABI (Application Binary Interface) > or the API (Application Programming Interface). 
Usually, C libraries > have a specified C API but often not an ABI (e.g. they may document a > ?struct? as having at least these fields, but maybe more). (ctypes > works at the ABI level, whereas Cython and native C extensions work at > the API level.) > > * We try to be complete. For now some C99 constructs are not > supported, but all C89 should be, including macros (and including > macro ?abuses?, which you can manually wrap in saner-looking C > functions). > > * We attempt to support both PyPy and CPython, with a reasonable path > for other Python implementations like IronPython and Jython. > > * Note that this project is not about embedding executable C code in > Python, unlike Weave. This is about calling existing C libraries from > Python. > > so among other things, making a cffi extension gives you the same > level of security as writing C (and unlike ctypes) and brings quite a > bit more flexibility (API vs ABI issue) that let's you wrap arbitrary > libraries, even those full of macros. > > Cheers, > fijal > > .. [1]: http://cffi.readthedocs.org/en/release-0.5/ From ddvento at ucar.edu Tue Feb 26 16:34:52 2013 From: ddvento at ucar.edu (Davide Del Vento) Date: Tue, 26 Feb 2013 08:34:52 -0700 Subject: [pypy-dev] cffi in stdlib In-Reply-To: References: Message-ID: <512CD61C.7040006@ucar.edu> Well, not so fast :-) I'm glad you posted it here since I don't follow python-dev (too many mailing lists) and I'm happy to hear about this proposal, even if there isn't much to discuss about it from the pypy side. Cheers. Davide Del Vento, On 02/26/2013 08:13 AM, Maciej Fijalkowski wrote: > Eh, I'm a moron, this was supposed to go to python-dev, not here. please ignore > > On Tue, Feb 26, 2013 at 5:11 PM, Maciej Fijalkowski wrote: >> Hello. >> >> I would like to discuss on the language summit a potential inclusion >> of cffi[1] into stdlib. This is a project Armin Rigo has been working >> for a while, with some input from other developers. 
It seems that the >> main reason why people would prefer ctypes over cffi these days is >> "because it's included in stdlib", which is not generally the reason I >> would like to hear. Our calls to not use C extensions and to use an >> FFI instead have seen very limited success with ctypes and quite a lot >> more since cffi got released. The API is fairly stable right now with >> minor changes going in and it'll definitely stabilize before the Python 3.4 >> release. Notable projects using it: >> >> * pypycore - gevent main loop ported to cffi >> * pgsql2cffi >> * sdl-cffi bindings >> * tls-cffi bindings >> * lxml-cffi port >> * cairo-cffi >> * pyzmq >> * a bunch of others >> >> So relatively a lot given that the project is not even a year old (it >> got 0.1 release in June). As per documentation, the advantages over >> ctypes: >> >> * The goal is to call C code from Python. You should be able to do so >> without learning a 3rd language: every alternative requires you to >> learn their own language (Cython, SWIG) or API (ctypes). So we tried >> to assume that you know Python and C and minimize the extra bits of >> API that you need to learn. >> >> * Keep all the Python-related logic in Python so that you don't need >> to write much C code (unlike CPython native C extensions). >> >> * Work either at the level of the ABI (Application Binary Interface) >> or the API (Application Programming Interface). Usually, C libraries >> have a specified C API but often not an ABI (e.g. they may document a >> "struct" as having at least these fields, but maybe more). (ctypes >> works at the ABI level, whereas Cython and native C extensions work at >> the API level.) >> >> * We try to be complete. For now some C99 constructs are not >> supported, but all C89 should be, including macros (and including >> macro "abuses", which you can manually wrap in saner-looking C >> functions).
>> >> * We attempt to support both PyPy and CPython, with a reasonable path >> for other Python implementations like IronPython and Jython. >> >> * Note that this project is not about embedding executable C code in >> Python, unlike Weave. This is about calling existing C libraries from >> Python. >> >> so among other things, making a cffi extension gives you the same >> level of security as writing C (and unlike ctypes) and brings quite a >> bit more flexibility (API vs ABI issue) that lets you wrap arbitrary >> libraries, even those full of macros. >> >> Cheers, >> fijal >> >> .. [1]: http://cffi.readthedocs.org/en/release-0.5/ > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > http://mail.python.org/mailman/listinfo/pypy-dev > From fijall at gmail.com Tue Feb 26 17:24:10 2013 From: fijall at gmail.com (Maciej Fijalkowski) Date: Tue, 26 Feb 2013 18:24:10 +0200 Subject: [pypy-dev] cffi in stdlib In-Reply-To: <512CD61C.7040006@ucar.edu> References: <512CD61C.7040006@ucar.edu> Message-ID: On Tue, Feb 26, 2013 at 5:34 PM, Davide Del Vento wrote: > Well, not so fast :-) > I'm glad you posted it here since I don't follow python-dev (too many > mailing lists) and I'm happy to hear about this proposal, even if there > isn't much to discuss about it from the pypy side. There is not much more than I described in the mail "put it in" :) > > Cheers. > Davide Del Vento, > > > On 02/26/2013 08:13 AM, Maciej Fijalkowski wrote: >> >> Eh, I'm a moron, this was supposed to go to python-dev, not here. please >> ignore >> >> On Tue, Feb 26, 2013 at 5:11 PM, Maciej Fijalkowski >> wrote: >>> >>> Hello. >>> >>> I would like to discuss on the language summit a potential inclusion >>> of cffi[1] into stdlib. This is a project Armin Rigo has been working >>> for a while, with some input from other developers.
It seems that the >>> main reason why people would prefer ctypes over cffi these days is >>> "because it's included in stdlib", which is not generally the reason I >>> would like to hear. Our calls to not use C extensions and to use an >>> FFI instead have seen very limited success with ctypes and quite a lot >>> more since cffi got released. The API is fairly stable right now with >>> minor changes going in and it'll definitely stabilize before the Python 3.4 >>> release. Notable projects using it: >>> >>> * pypycore - gevent main loop ported to cffi >>> * pgsql2cffi >>> * sdl-cffi bindings >>> * tls-cffi bindings >>> * lxml-cffi port >>> * cairo-cffi >>> * pyzmq >>> * a bunch of others >>> >>> So relatively a lot given that the project is not even a year old (it >>> got 0.1 release in June). As per documentation, the advantages over >>> ctypes: >>> >>> * The goal is to call C code from Python. You should be able to do so >>> without learning a 3rd language: every alternative requires you to >>> learn their own language (Cython, SWIG) or API (ctypes). So we tried >>> to assume that you know Python and C and minimize the extra bits of >>> API that you need to learn. >>> >>> * Keep all the Python-related logic in Python so that you don't need >>> to write much C code (unlike CPython native C extensions). >>> >>> * Work either at the level of the ABI (Application Binary Interface) >>> or the API (Application Programming Interface). Usually, C libraries >>> have a specified C API but often not an ABI (e.g. they may document a >>> "struct" as having at least these fields, but maybe more). (ctypes >>> works at the ABI level, whereas Cython and native C extensions work at >>> the API level.) >>> >>> * We try to be complete. For now some C99 constructs are not >>> supported, but all C89 should be, including macros (and including >>> macro "abuses", which you can manually wrap in saner-looking C >>> functions).
>>> >>> * We attempt to support both PyPy and CPython, with a reasonable path >>> for other Python implementations like IronPython and Jython. >>> >>> * Note that this project is not about embedding executable C code in >>> Python, unlike Weave. This is about calling existing C libraries from >>> Python. >>> >>> so among other things, making a cffi extension gives you the same >>> level of security as writing C (and unlike ctypes) and brings quite a >>> bit more flexibility (API vs ABI issue) that lets you wrap arbitrary >>> libraries, even those full of macros. >>> >>> Cheers, >>> fijal >>> >>> .. [1]: http://cffi.readthedocs.org/en/release-0.5/ >> >> _______________________________________________ >> pypy-dev mailing list >> pypy-dev at python.org >> http://mail.python.org/mailman/listinfo/pypy-dev >> > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > http://mail.python.org/mailman/listinfo/pypy-dev From arigo at tunes.org Tue Feb 26 19:41:27 2013 From: arigo at tunes.org (Armin Rigo) Date: Tue, 26 Feb 2013 19:41:27 +0100 Subject: [pypy-dev] continulet stacks on the heap and other schemes for io/threading In-Reply-To: References: <512CC6E7.1040209@gmx.de> Message-ID: Hi, On Tue, Feb 26, 2013 at 4:08 PM, Maciej Fijalkowski wrote: >> The basis is allocating all stacks as non-movable structures on the heap. > > This is essentially done on jitframe-on-heap (as the name suggests) for the JIT. I think that what Ronny has in mind is different. Unless I'm mistaken, Rust's "tasks" are green threads: still thread-like structures, but fully managed by the process. They each have their own C-level stack. That's why you can run 100'000 of them in maybe 1 GB of RAM (rough order of magnitude), but not 1 or 10 million of them. CPython's Stackless and PyPy's stacklets allow basically one order of magnitude more. For PyPy as well as "hard-switching" Stackless this comes at the cost of needing to do copies around.
I suppose it's a trade-off between this cost and the extra memory of green threads, so I cannot judge a priori which solution is the best --- it probably depends on the use case. The jitframe-on-heap branch "just" enables, finally, PyPy's existing coroutines to be fully JITted. A bientôt, Armin. From aidembb at yahoo.com Wed Feb 27 21:10:04 2013 From: aidembb at yahoo.com (Roger Flores) Date: Wed, 27 Feb 2013 12:10:04 -0800 (PST) Subject: [pypy-dev] Slow int code Message-ID: <1361995804.94140.YahooMailNeo@web162205.mail.bf1.yahoo.com> Hi guys. I've been looking at two simple routines using jitviewer to figure out why they're so much slower than expected. I've also noticed that http://pypy.org/performance.html has the line "Bad examples include doing computations with large longs - which is performed by unoptimizable support code.". I'm worried that my 32 bit int code is falling into this, and I'm wondering what I can do to avoid it? Trivial code like if (self.low ^ self.high) & 0x80000000 == 0: is expanding into several dozen asm instructions. I'm suspecting that lines like self.low = (self.low << 1) & 0xffffffff with its shift left are convincing the jit to consider the int to need 64 bits (large long?) instead of 32. Ideas? The asm is clearly operating on QWORDs and calling routines to do the bit arithmetic instead of single instructions. Is this what that line in performance.html is warning about? -Roger BTW Fijal's jitviewer is a *must see* for anyone interested in how pypy makes their code fast!
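The bit-twiddling Roger describes boils down to a pattern like the following. This is an illustrative sketch only, not the actual diz.py code; the class and method names are invented for the example:

```python
# Minimal sketch of the masked 32-bit arithmetic discussed above.
# Illustrative only: this is NOT the actual diz.py code; the class and
# method names here are made up.

MASK32 = 0xffffffff
TOP_BIT = 0x80000000

class RangeState(object):
    """A 32-bit [low, high] interval, as in a range coder."""

    def __init__(self):
        self.low = 0
        self.high = MASK32

    def output_ready(self):
        # True when low and high agree on the top bit, i.e. one more
        # output bit is fully determined.
        return (self.low ^ self.high) & TOP_BIT == 0

    def shift(self):
        # Renormalize: shift one bit out, masking back to 32 bits.
        self.low = (self.low << 1) & MASK32
        self.high = ((self.high << 1) | 1) & MASK32

s = RangeState()
s.low, s.high = 0x80000000, 0xc0000000
print(s.output_ready())          # both top bits are 1 -> True
s.shift()
print(hex(s.low), hex(s.high))   # 0x0 0x80000001
```

Every intermediate value here is masked back into 32 bits, so on a 64-bit build each of them fits comfortably in a single machine word.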
From alex.gaynor at gmail.com Wed Feb 27 21:23:33 2013 From: alex.gaynor at gmail.com (Alex Gaynor) Date: Wed, 27 Feb 2013 12:23:33 -0800 Subject: [pypy-dev] Slow int code In-Reply-To: <1361995804.94140.YahooMailNeo@web162205.mail.bf1.yahoo.com> References: <1361995804.94140.YahooMailNeo@web162205.mail.bf1.yahoo.com> Message-ID: In that context large longs means HUNDREDS or THOUSANDS of bits, not 64 :) Can you show us a full runnable example that illustrates this? Alex On Wed, Feb 27, 2013 at 12:10 PM, Roger Flores wrote: > Hi guys. I've been looking at two simple routines using jitviewer to > figure out why they're so much slower than expected. > > > I've also noticed that http://pypy.org/performance.html has the line "Bad > examples include doing computations with > large longs - which is performed by unoptimizable support code.". I'm > worried that my 32 bit int code is falling into this, and I'm wondering > what I can do to avoid it? > > Trivial code like > > if (self.low ^ self.high) & 0x80000000 == 0: > > > is expanding into several dozen asm instructions. I'm suspecting that > lines like > > self.low = (self.low << 1) & 0xffffffff > > > with its shift left are convincing the jit to consider the int to need 64 > bits (large long?) instead of 32. > > > Ideas? The asm is clearly operating on QWORDs and calling routines to do > the bit arithmetic instead of single instructions. Is this what that line > in performance.html is warning about? > > > > -Roger > > BTW Fijal's jitviewer is a *must see* for anyone interested in how pypy > makes their code fast! > > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > http://mail.python.org/mailman/listinfo/pypy-dev > -- "I disapprove of what you say, but I will defend to the death your right to say it." -- Evelyn Beatrice Hall (summarizing Voltaire) "The people's good is the highest law." -- Cicero -------------- next part -------------- An HTML attachment was scrubbed...
URL: From alex.gaynor at gmail.com Wed Feb 27 21:35:18 2013 From: alex.gaynor at gmail.com (Alex Gaynor) Date: Wed, 27 Feb 2013 12:35:18 -0800 Subject: [pypy-dev] Slow int code In-Reply-To: <1361997160.20274.YahooMailNeo@web162202.mail.bf1.yahoo.com> References: <1361995804.94140.YahooMailNeo@web162205.mail.bf1.yahoo.com> <1361997160.20274.YahooMailNeo@web162202.mail.bf1.yahoo.com> Message-ID: The original source code would be best! Thanks, Alex On Wed, Feb 27, 2013 at 12:32 PM, Roger Flores wrote: > Would you like a paste from jitviewer or the source code to run and > examine with jitviewer? > > -Roger > > > ------------------------------ > *From:* Alex Gaynor > *To:* Roger Flores > *Cc:* "pypy-dev at python.org" > *Sent:* Wednesday, February 27, 2013 12:23 PM > *Subject:* Re: [pypy-dev] Slow int code > > In that context large longs means HUNDREDS or THOUSANDS of bits, not 64 :) > Can you show us a full runnable example that illustrates this? > > Alex > > > On Wed, Feb 27, 2013 at 12:10 PM, Roger Flores wrote: > > Hi guys. I've been looking at two simple routines using jitviewer to > figure out why they're so much slower than expected. > > > I've also noticed that http://pypy.org/performance.html has the line "Bad > examples include doing computations with > large longs - which is performed by unoptimizable support code.". I'm > worried that my 32 bit int code is falling into this, and I'm wondering > what I can do to avoid it? > > Trivial code like > > if (self.low ^ self.high) & 0x80000000 == 0: > > > is expanding into several dozen asm instructions. I'm suspecting that > lines like > > self.low = (self.low << 1) & 0xffffffff > > > with its shift left are convincing the jit to consider the int to need 64 > bits (large long?) instead of 32. > > > Ideas? The asm is clearly operating on QWORDs and calling routines to do > the bit arithmetic instead of single instructions. Is this what that line > in performance.html is warning about?
> > > -Roger > > BTW Fijal's jitviewer is a *must see* for anyone interested in how pypy > makes their code fast! > > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > http://mail.python.org/mailman/listinfo/pypy-dev > -- "I disapprove of what you say, but I will defend to the death your right to say it." -- Evelyn Beatrice Hall (summarizing Voltaire) "The people's good is the highest law." -- Cicero -------------- next part -------------- An HTML attachment was scrubbed... URL: From aidembb at yahoo.com Wed Feb 27 21:32:40 2013 From: aidembb at yahoo.com (Roger Flores) Date: Wed, 27 Feb 2013 12:32:40 -0800 (PST) Subject: [pypy-dev] Slow int code In-Reply-To: References: <1361995804.94140.YahooMailNeo@web162205.mail.bf1.yahoo.com> Message-ID: <1361997160.20274.YahooMailNeo@web162202.mail.bf1.yahoo.com> Would you like a paste from jitviewer or the source code to run and examine with jitviewer? -Roger ________________________________ From: Alex Gaynor To: Roger Flores Cc: "pypy-dev at python.org" Sent: Wednesday, February 27, 2013 12:23 PM Subject: Re: [pypy-dev] Slow int code In that context large longs means HUNDREDS or THOUSANDS of bits, not 64 :) Can you show us a full runnable example that illustrates this? Alex On Wed, Feb 27, 2013 at 12:10 PM, Roger Flores wrote: Hi guys. I've been looking at two simple routines using jitviewer to figure out why they're so much slower than expected. > > >I've also noticed that http://pypy.org/performance.html has the line "Bad examples include doing computations with >large longs - which is performed by unoptimizable support code.". I'm worried that my 32 bit int code is falling into this, and I'm wondering what I can do to avoid it?
>Trivial code like > >if (self.low ^ self.high) & 0x80000000 == 0: > > >is expanding into several dozen asm instructions. I'm suspecting that lines like > >self.low = (self.low << 1) & 0xffffffff > > >with its shift left are convincing the jit to consider the int to need 64 bits (large long?) instead of 32. > > >Ideas? The asm is clearly operating on QWORDs and calling routines to do the bit arithmetic instead of single instructions. Is this what that line in performance.html is warning about? > > > >-Roger > >BTW Fijal's jitviewer is a *must see* for anyone interested in how pypy makes their code fast! > >_______________________________________________ >pypy-dev mailing list >pypy-dev at python.org >http://mail.python.org/mailman/listinfo/pypy-dev > -- "I disapprove of what you say, but I will defend to the death your right to say it." -- Evelyn Beatrice Hall (summarizing Voltaire) "The people's good is the highest law." -- Cicero -------------- next part -------------- An HTML attachment was scrubbed... URL: From aidembb at yahoo.com Wed Feb 27 21:55:48 2013 From: aidembb at yahoo.com (Roger Flores) Date: Wed, 27 Feb 2013 12:55:48 -0800 (PST) Subject: [pypy-dev] Slow int code In-Reply-To: References: <1361995804.94140.YahooMailNeo@web162205.mail.bf1.yahoo.com> <1361997160.20274.YahooMailNeo@web162202.mail.bf1.yahoo.com> Message-ID: <1361998548.58023.YahooMailNeo@web162202.mail.bf1.yahoo.com> I'll email the code separately because I'm not sure everyone wants a tiny zip. Anyone is welcome to it. It's a newer version of the compressor I entered into the Large Text Compression Benchmark. I'm running it as: PYPYLOG=jit-log-opt,jit-backend:dizlog.pypylog pypy diz.py -p -t frank.txt You can get frank.txt from http://www.gutenberg.org/ebooks/84 (and rename it) or substitute a similar file. Examine the second line in output():     if (self.low ^ self.high) & 0x80000000 == 0: The remaining lines are similar.
Also, the routine encode() listed one line above in jitviewer has the same issues. If I comment out the two calls to encode(), I save a huge percentage of time (up to 40% in some configurations). -Roger ________________________________ From: Alex Gaynor To: Roger Flores Cc: "pypy-dev at python.org" Sent: Wednesday, February 27, 2013 12:35 PM Subject: Re: [pypy-dev] Slow int code The original source code would be best! Thanks, Alex On Wed, Feb 27, 2013 at 12:32 PM, Roger Flores wrote: Would you like a paste from jitviewer or the source code to run and examine with jitviewer? > >-Roger > > > > > > >________________________________ > From: Alex Gaynor >To: Roger Flores >Cc: "pypy-dev at python.org" >Sent: Wednesday, February 27, 2013 12:23 PM >Subject: Re: [pypy-dev] Slow int code > > > >In that context large longs means HUNDREDS or THOUSANDS of bits, not 64 :) Can you show us a full runnable example that illustrates this? > > >Alex > > > >On Wed, Feb 27, 2013 at 12:10 PM, Roger Flores wrote: > >Hi guys. I've been looking at two simple routines using jitviewer to figure out why they're so much slower than expected. >> >> >>I've also noticed that http://pypy.org/performance.html has the line "Bad examples include doing computations with >>large longs - which is performed by unoptimizable support code.". I'm worried that my 32 bit int code is falling into this, and I'm wondering what I can do to avoid it? >> >>Trivial code like >> >>if (self.low ^ self.high) & 0x80000000 == 0: >> >> >>is expanding into several dozen asm instructions. I'm suspecting that lines like >> >>self.low = (self.low << 1) & 0xffffffff >> >> >>with its shift left are convincing the jit to consider the int to need 64 bits (large long?) instead of 32. >> >> >>Ideas? The asm is clearly operating on QWORDs and calling routines to do the bit arithmetic instead of single instructions. Is this what that line in performance.html is warning about?
>> >> >> >>-Roger >> >>BTW Fijal's jitviewer is a *must see* for anyone interested in how pypy makes their code fast! >> >>_______________________________________________ >>pypy-dev mailing list >>pypy-dev at python.org >>http://mail.python.org/mailman/listinfo/pypy-dev >> > > > >-- >"I disapprove of what you say, but I will defend to the death your right to say it." -- Evelyn Beatrice Hall (summarizing Voltaire) >"The people's good is the highest law." -- Cicero > > > -- "I disapprove of what you say, but I will defend to the death your right to say it." -- Evelyn Beatrice Hall (summarizing Voltaire) "The people's good is the highest law." -- Cicero -------------- next part -------------- An HTML attachment was scrubbed... URL:
From arigo at tunes.org Thu Feb 28 16:28:24 2013 From: arigo at tunes.org (Armin Rigo) Date: Thu, 28 Feb 2013 16:28:24 +0100 Subject: [pypy-dev] Slow int code In-Reply-To: <1361998548.58023.YahooMailNeo@web162202.mail.bf1.yahoo.com> References: <1361995804.94140.YahooMailNeo@web162205.mail.bf1.yahoo.com> <1361997160.20274.YahooMailNeo@web162202.mail.bf1.yahoo.com> <1361998548.58023.YahooMailNeo@web162202.mail.bf1.yahoo.com> Message-ID: Hi Roger, On Wed, Feb 27, 2013 at 9:55 PM, Roger Flores wrote: > I'll email the code separately because I'm not sure everyone wants a tiny > zip. I'm sure no-one would mind receiving a tiny zip; or just use a paste site if your program is a single module. (http://bpaste.net/) A bientôt, Armin. From aidembb at yahoo.com Thu Feb 28 18:00:15 2013 From: aidembb at yahoo.com (Roger Flores) Date: Thu, 28 Feb 2013 09:00:15 -0800 (PST) Subject: [pypy-dev] Slow int code In-Reply-To: References: <1361995804.94140.YahooMailNeo@web162205.mail.bf1.yahoo.com> <1361997160.20274.YahooMailNeo@web162202.mail.bf1.yahoo.com> <1361998548.58023.YahooMailNeo@web162202.mail.bf1.yahoo.com> Message-ID: <1362070815.93192.YahooMailNeo@web162206.mail.bf1.yahoo.com> >I'm sure no-one would mind receiving a tiny zip; OK then. Unzip it, grab a text file large enough to warm up the jit, and run the line to generate the log for jitviewer. The issue is the codegen for the output() function, and is there anything about my python code that's unintentionally confusing for pypy? Say, there isn't by chance a command that will show the types that pypy annotates for the classes in a program? That way I could easily check that important data structures are easily and correctly understood, type-wise. Thanks, -Roger -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: diz-3.zip Type: application/zip Size: 37058 bytes Desc: not available URL:
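Alex's earlier clarification in this thread, that "large longs" means integers of hundreds or thousands of bits rather than 64, can be seen directly with `int.bit_length()`. This is plain Python, nothing PyPy-specific:

```python
# "Large longs" means hundreds or thousands of bits.  A value masked to
# 32 bits always fits in one machine word on a 64-bit build; only
# genuinely huge integers require arbitrary-precision support code.
small = (0x7fffffff << 1) & 0xffffffff   # masked back into 32 bits
big = 1 << 1000                          # a genuinely "large long"
print(small.bit_length())   # 32
print(big.bit_length())     # 1001
```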