From cfbolz at gmx.de Fri Feb 1 08:44:06 2013 From: cfbolz at gmx.de (Carl Friedrich Bolz) Date: Fri, 01 Feb 2013 08:44:06 +0100 Subject: [pypy-dev] [pypy-commit] pypy default: hidden frames are fairly rare, it's ok to unroll this In-Reply-To: <20130201015434.63E7E1C009B@cobra.cs.uni-duesseldorf.de> References: <20130201015434.63E7E1C009B@cobra.cs.uni-duesseldorf.de> Message-ID: <8e869c63-2c3a-4ed7-b4e2-05bd9eaccb88@email.android.com> Hi Alex, This needs a corresponding test in test_pypy_c.py Cheers, Carl Friedrich alex_gaynor wrote: >Author: Alex Gaynor >Branch: >Changeset: r60800:9aeefdb4841d >Date: 2013-01-31 17:54 -0800 >http://bitbucket.org/pypy/pypy/changeset/9aeefdb4841d/ > >Log: hidden frames are fairly rare, it's ok to unroll this > >diff --git a/pypy/interpreter/executioncontext.py >b/pypy/interpreter/executioncontext.py >--- a/pypy/interpreter/executioncontext.py >+++ b/pypy/interpreter/executioncontext.py >@@ -40,6 +40,7 @@ > def gettopframe(self): > return self.topframeref() > >+ @jit.unroll_safe > def gettopframe_nohidden(self): > frame = self.topframeref() > while frame and frame.hide(): >_______________________________________________ >pypy-commit mailing list >pypy-commit at python.org >http://mail.python.org/mailman/listinfo/pypy-commit -------------- next part -------------- An HTML attachment was scrubbed... URL: From alex.gaynor at gmail.com Sat Feb 2 23:53:07 2013 From: alex.gaynor at gmail.com (Alex Gaynor) Date: Sat, 2 Feb 2013 14:53:07 -0800 Subject: [pypy-dev] [pypy-commit] pypy default: hidden frames are fairly rare, it's ok to unroll this In-Reply-To: <8e869c63-2c3a-4ed7-b4e2-05bd9eaccb88@email.android.com> References: <20130201015434.63E7E1C009B@cobra.cs.uni-duesseldorf.de> <8e869c63-2c3a-4ed7-b4e2-05bd9eaccb88@email.android.com> Message-ID: Hi Carl, At the moment this shouldn't affect anything, this function is only called in other places with loops. 
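[Editor's note: the method touched by the diff above is just a short walk over the (rarely) hidden frames at the top of the frame chain. A rough pure-Python sketch of that shape follows; the Frame class and f_back link are hypothetical stand-ins for PyPy's interpreter-level frames, not actual PyPy code.]

```python
class Frame:
    """Hypothetical stand-in for an interpreter-level frame."""
    def __init__(self, hidden=False, back=None):
        self.hidden = hidden
        self.f_back = back  # next frame down the chain

    def hide(self):
        return self.hidden

def gettopframe_nohidden(top):
    # Same loop shape as the patched method: hidden frames are rare,
    # so this almost always runs zero or one iterations, which is why
    # telling the JIT to unroll it (@jit.unroll_safe) is reasonable.
    frame = top
    while frame is not None and frame.hide():
        frame = frame.f_back
    return frame
```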
However, this is based on the same logic we used in getnextframe_nohidden, and I believe it is a first step in making sys.exc_info() not explode stuff :) Alex

On Thu, Jan 31, 2013 at 11:44 PM, Carl Friedrich Bolz wrote: > Hi Alex, > > This needs a corresponding test in test_pypy_c.py > > Cheers, > > Carl Friedrich > > > alex_gaynor wrote: >> >> Author: Alex Gaynor >> Branch: >> Changeset: r60800:9aeefdb4841d >> Date: 2013-01-31 17:54 -0800 >> http://bitbucket.org/pypy/pypy/changeset/9aeefdb4841d/ >> >> Log: hidden frames are fairly rare, it's ok to unroll this >> >> diff --git a/pypy/interpreter/executioncontext.py b/pypy/interpreter/executioncontext.py >> --- a/pypy/interpreter/executioncontext.py >> +++ b/pypy/interpreter/executioncontext.py >> @@ -40,6 +40,7 @@ >> def gettopframe(self): >> return self.topframeref() >> >> + @jit.unroll_safe >> def gettopframe_nohidden(self): >> frame = self.topframeref() >> while frame and >> frame.hide(): >> ------------------------------ >> >> pypy-commit mailing list >> pypy-commit at python.org >> http://mail.python.org/mailman/listinfo/pypy-commit >> >> > > > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > http://mail.python.org/mailman/listinfo/pypy-dev > > -- "I disapprove of what you say, but I will defend to the death your right to say it." -- Evelyn Beatrice Hall (summarizing Voltaire) "The people's good is the highest law." -- Cicero

From john.m.camara at gmail.com Sun Feb 3 18:39:39 2013 From: john.m.camara at gmail.com (John Camara) Date: Sun, 3 Feb 2013 12:39:39 -0500 Subject: [pypy-dev] Should jitviewer come with a warning? Message-ID: I have been noticing a pattern where many who are writing Python code to run on PyPy are relying more and more on the jitviewer to help them write faster code.
Unfortunately, many of them who do so don't look at improving the design of their code as a way to improve the speed at which it will run under PyPy, but instead start writing obscure Python code that happens to run faster under PyPy.

I know that the PyPy core developers, at least, would like to see everyone just create good clean Python code, and that often code that has been made into obscure Python was done so to try to optimize it for CPython, which in many cases causes it to run slower on PyPy than it would run if the code just followed typical Python idioms.

I feel that a normal developer should be using tools like cProfile and runsnakerun, and cleaning up design issues, way before they should even consider using jitviewer.

In a recent case I saw someone using the jitviewer who likely doesn't need to use it, at least not considering the current design of the code, and I said the following:

"The jitviewer should be mainly used by PyPy core developers and those building PyPy VMs. A normal developer writing Python code to run on PyPy shouldn't have a need to use it. They can use it to point out an inefficiency that PyPy has to the core developers, but it should not be used as a way to get you to write Python code in a way that has a better chance of being optimized under PyPy, except for very rare occasions, and even then it should only be done by those who follow closely and understand PyPy's development."

Do others here share this same opinion, and should some warning be added to the jitviewer? John

From exarkun at twistedmatrix.com Sun Feb 3 20:25:48 2013 From: exarkun at twistedmatrix.com (exarkun at twistedmatrix.com) Date: Sun, 03 Feb 2013 19:25:48 -0000 Subject: [pypy-dev] Should jitviewer come with a warning?
In-Reply-To: References: Message-ID: <20130203192548.3816.313520268.divmod.xquotient.2@localhost6.localdomain6> On 05:39 pm, john.m.camara at gmail.com wrote: >I have been noticing a pattern where many who are writing Python code >to >run on PyPy are relying more and more on using the jitviewer to help >them >write faster code. Unfortunately, many of them who do so don't look at >improving the design of their code as a way to improve the speed at >which >it will run under PyPy but instead start writing obscure Python code >that >happens to run faster under PyPy. > >I know that at least with the PyPy core developers they would like to >see >every one just create good clean Python code and that often code that >has >been made into obscure Python was don so to try to optimize it for >CPython >which in many cases causes it to run slower on PyPy than it would run >it >the code just followed typical Python idioms. > >I feel that a normal developer should be using tools like cProfiler and >runsnakerun and cleaning up design issues way before they should even >consider using jitviewer. > >In a recent case where I saw someone using the jitviewer who likely >doesn't >need to use it. At least they don't need to use it considering the >current >design of the code I said the following > >"The jitviewer should be mainly used by PyPy core developers and those >building PyPy VMs. A normal developer writing Python code to run on >PyPy >shouldn?t have a need to use it. They can use it to point out an >inefficiency that PyPy has to the core developers but it should not be >used >as a way to get you to write Python code in a way that has a better >chance >of being optimized under PyPy except for very rare occasions and even >then >it should only be made by those who follow closely and understand >PyPy?s >development." > > >Do others here share this same opinion and should some warning be added >to >the jitviewer? 
What makes you think people will even read this warning, let alone prioritize it over their immediate desire to make their program run faster? (Not that I am objecting to adding the warning, but I think you might be fooling yourself if you think it will have any impact) Jean-Paul From fijall at gmail.com Sun Feb 3 20:39:38 2013 From: fijall at gmail.com (Maciej Fijalkowski) Date: Sun, 3 Feb 2013 21:39:38 +0200 Subject: [pypy-dev] Should jitviewer come with a warning? In-Reply-To: <20130203192548.3816.313520268.divmod.xquotient.2@localhost6.localdomain6> References: <20130203192548.3816.313520268.divmod.xquotient.2@localhost6.localdomain6> Message-ID: On Sun, Feb 3, 2013 at 9:25 PM, wrote: > On 05:39 pm, john.m.camara at gmail.com wrote: >> >> I have been noticing a pattern where many who are writing Python code to >> run on PyPy are relying more and more on using the jitviewer to help them >> write faster code. Unfortunately, many of them who do so don't look at >> improving the design of their code as a way to improve the speed at which >> it will run under PyPy but instead start writing obscure Python code that >> happens to run faster under PyPy. >> >> I know that at least with the PyPy core developers they would like to see >> every one just create good clean Python code and that often code that has >> been made into obscure Python was don so to try to optimize it for CPython >> which in many cases causes it to run slower on PyPy than it would run it >> the code just followed typical Python idioms. >> >> I feel that a normal developer should be using tools like cProfiler and >> runsnakerun and cleaning up design issues way before they should even >> consider using jitviewer. >> >> In a recent case where I saw someone using the jitviewer who likely >> doesn't >> need to use it. 
At least they don't need to use it considering the >> current >> design of the code I said the following >> >> "The jitviewer should be mainly used by PyPy core developers and those >> building PyPy VMs. A normal developer writing Python code to run on PyPy >> shouldn?t have a need to use it. They can use it to point out an >> inefficiency that PyPy has to the core developers but it should not be >> used >> as a way to get you to write Python code in a way that has a better chance >> of being optimized under PyPy except for very rare occasions and even then >> it should only be made by those who follow closely and understand PyPy?s >> development." >> >> >> Do others here share this same opinion and should some warning be added to >> the jitviewer? > > > What makes you think people will even read this warning, let alone > prioritize it over their immediate desire to make their program run faster? > > (Not that I am objecting to adding the warning, but I think you might be > fooling yourself if you think it will have any impact) > > Jean-Paul > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > http://mail.python.org/mailman/listinfo/pypy-dev Let me rephrase it. Where did you look for such a warning and you did not find it so you assumed it's ok? Cheers, fijal From john.m.camara at gmail.com Sun Feb 3 21:08:39 2013 From: john.m.camara at gmail.com (John Camara) Date: Sun, 3 Feb 2013 15:08:39 -0500 Subject: [pypy-dev] Should jitviewer come with a warning? In-Reply-To: References: Message-ID: > What makes you think people will even read this warning, let alone > prioritize it over their immediate desire to make their program run > faster? 
> (Not that I am objecting to adding the warning, but I think you might be > fooling yourself if you think it will have any impact) > Jean-Paul

I agree with you, and I was not being naive in thinking this alone would solve the problem, but it does give us something to point to when we see someone abusing the jitviewer.

Maybe a more effective approach is not to advertise the jitviewer to everyone who has performance issues, and only to tell those who are experienced programmers and have already done the obvious work of fixing any design issues that existed in their code. Having inexperienced developers use the normal profiling tools will still help them find the hot spots in their code, and help prevent them from picking up habits that lead them to writing un-Pythonic code.

I'm sure we all agree that code with a better design will run faster in PyPy than trying to add optimizations that work only for PyPy to help out a poor design.

I don't think we want to end up with a lot of Python code that looks like C code. This is what happens when the inexperienced start relying on the jitviewer.

For instance, take a look at this code [1] and blog [2], which led me to post this. This is not the first time I have come across this issue, and unfortunately it appears to be increasing at an alarming rate.

I guess I feel we have a responsibility to try to promote good programming practices when we can.

[1] - https://github.com/msgpack/msgpack-python/blob/master/msgpack/fallback.py

[2] - http://blog.affien.com/archives/2013/01/29/msgpack-for-pypy/

John

On Sun, Feb 3, 2013 at 12:39 PM, John Camara wrote: > I have been noticing a pattern where many who are writing Python code to > run on PyPy are relying more and more on using the jitviewer to help them > write faster code.
Unfortunately, many of them who do so don't look at > improving the design of their code as a way to improve the speed at which > it will run under PyPy but instead start writing obscure Python code that > happens to run faster under PyPy. > > I know that at least with the PyPy core developers they would like to see > every one just create good clean Python code and that often code that has > been made into obscure Python was don so to try to optimize it for CPython > which in many cases causes it to run slower on PyPy than it would run it > the code just followed typical Python idioms. > > I feel that a normal developer should be using tools like cProfiler and > runsnakerun and cleaning up design issues way before they should even > consider using jitviewer. > > In a recent case where I saw someone using the jitviewer who likely > doesn't need to use it. At least they don't need to use it considering the > current design of the code I said the following > > "The jitviewer should be mainly used by PyPy core developers and those > building PyPy VMs. A normal developer writing Python code to run on PyPy > shouldn?t have a need to use it. They can use it to point out an > inefficiency that PyPy has to the core developers but it should not be used > as a way to get you to write Python code in a way that has a better chance > of being optimized under PyPy except for very rare occasions and even then > it should only be made by those who follow closely and understand PyPy?s > development." > > > Do others here share this same opinion and should some warning be added to > the jitviewer? > > John > -------------- next part -------------- An HTML attachment was scrubbed... URL: From fijall at gmail.com Sun Feb 3 21:12:03 2013 From: fijall at gmail.com (Maciej Fijalkowski) Date: Sun, 3 Feb 2013 22:12:03 +0200 Subject: [pypy-dev] Should jitviewer come with a warning? 
In-Reply-To: References: Message-ID: On Sun, Feb 3, 2013 at 10:08 PM, John Camara wrote: >> What makes you think people will even read this warning, let alone >> prioritize it over their immediate desire to make their program run >> faster? > >> (Not that I am objecting to adding the warning, but I think you might be >> fooling yourself if you think it will have any impact) > >> Jean-Paul > > I agree with you and was not being naive and thinking this alone was going > to solve the problem but it does gives us something to point to when we see > someone abusing the jitviewer. > > Maybe, a more effective approach, is not to advertise about the jitviewer to > everyone who has performance issues and only tell those who are experience > programmers who have already done the obvious in fixing any design issues > that had existed in their code. Having inexperience developers use the > normal profiling tools will still help them find the hot spots in their code > and help prevent them from picking up habits that lead them to writing > un-Pythonic code. > > I'm sure we all agree that code with a better design will run faster in pypy > than trying to add optimizations that work only for pypy to help out a poor > design. > > I don't think we want to end up with a lot of Python code that looks like C > code. This is what happens when the inexperience start relying on the > jitviewer. > > For instance take a look at this code [1] and blog [2] which lead me to post > this. This is not the first example I have come across this issue and > unfortunately it appears to be increaseing at an alarming rate. > > I guess I feel we have a responsibility to try to promote good programming > practices when we can. 
> > [1] - > https://github.com/msgpack/msgpack-python/blob/master/msgpack/fallback.py > > [2] - http://blog.affien.com/archives/2013/01/29/msgpack-for-pypy/ > > John > > > > On Sun, Feb 3, 2013 at 12:39 PM, John Camara > wrote: >> >> I have been noticing a pattern where many who are writing Python code to >> run on PyPy are relying more and more on using the jitviewer to help them >> write faster code. Unfortunately, many of them who do so don't look at >> improving the design of their code as a way to improve the speed at which it >> will run under PyPy but instead start writing obscure Python code that >> happens to run faster under PyPy. >> >> I know that at least with the PyPy core developers they would like to see >> every one just create good clean Python code and that often code that has >> been made into obscure Python was don so to try to optimize it for CPython >> which in many cases causes it to run slower on PyPy than it would run it the >> code just followed typical Python idioms. >> >> I feel that a normal developer should be using tools like cProfiler and >> runsnakerun and cleaning up design issues way before they should even >> consider using jitviewer. >> >> In a recent case where I saw someone using the jitviewer who likely >> doesn't need to use it. At least they don't need to use it considering the >> current design of the code I said the following >> >> "The jitviewer should be mainly used by PyPy core developers and those >> building PyPy VMs. A normal developer writing Python code to run on PyPy >> shouldn?t have a need to use it. They can use it to point out an >> inefficiency that PyPy has to the core developers but it should not be used >> as a way to get you to write Python code in a way that has a better chance >> of being optimized under PyPy except for very rare occasions and even then >> it should only be made by those who follow closely and understand PyPy?s >> development." 
>> >> >> Do others here share this same opinion and should some warning be added to >> the jitviewer? >> >> John > > > > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > http://mail.python.org/mailman/listinfo/pypy-dev > Hi John. I don't believe jitviewer is advertised really in that many places. We tell people who come to IRC, yes, but that's about it (it's not prominently featured on pypy.org for example). It's hard enough to make people read docs. From john.m.camara at gmail.com Sun Feb 3 21:13:00 2013 From: john.m.camara at gmail.com (John Camara) Date: Sun, 3 Feb 2013 15:13:00 -0500 Subject: [pypy-dev] Should jitviewer come with a warning? In-Reply-To: References: Message-ID: > Let me rephrase it. Where did you look for such a warning and you did > not find it so you assumed it's ok? > Cheers, > fijal Having a warning on https://bitbucket.org/pypy/jitviewer would be good. On Sun, Feb 3, 2013 at 3:08 PM, John Camara wrote: > > What makes you think people will even read this warning, let alone > > prioritize it over their immediate desire to make their program run > > faster? > > > (Not that I am objecting to adding the warning, but I think you might be > > fooling yourself if you think it will have any impact) > > > Jean-Paul > > I agree with you and was not being naive and thinking this alone was going to solve the problem but it does gives us something to point to when we see someone abusing the jitviewer. > > Maybe, a more effective approach, is not to advertise about the jitviewer to everyone who has performance issues and only tell those who are experience programmers who have already done the obvious in fixing any design issues that had existed in their code. Having inexperience developers use the normal profiling tools will still help them find the hot spots in their code and help prevent them from picking up habits that lead them to writing un-Pythonic code. 
> > I'm sure we all agree that code with a better design will run faster in pypy than trying to add optimizations that work only for pypy to help out a poor design. > > I don't think we want to end up with a lot of Python code that looks like C code. This is what happens when the inexperience start relying on the jitviewer. > > For instance take a look at this code [1] and blog [2] which lead me to post this. This is not the first example I have come across this issue and unfortunately it appears to be increaseing at an alarming rate. > > I guess I feel we have a responsibility to try to promote good programming practices when we can. > > [1] - https://github.com/msgpack/msgpack-python/blob/master/msgpack/fallback.py > > [2] - http://blog.affien.com/archives/2013/01/29/msgpack-for-pypy/ > > John > > > > On Sun, Feb 3, 2013 at 12:39 PM, John Camara wrote: > >> I have been noticing a pattern where many who are writing Python code to >> run on PyPy are relying more and more on using the jitviewer to help them >> write faster code. Unfortunately, many of them who do so don't look at >> improving the design of their code as a way to improve the speed at which >> it will run under PyPy but instead start writing obscure Python code that >> happens to run faster under PyPy. >> >> I know that at least with the PyPy core developers they would like to see >> every one just create good clean Python code and that often code that has >> been made into obscure Python was don so to try to optimize it for CPython >> which in many cases causes it to run slower on PyPy than it would run it >> the code just followed typical Python idioms. >> >> I feel that a normal developer should be using tools like cProfiler and >> runsnakerun and cleaning up design issues way before they should even >> consider using jitviewer. >> >> In a recent case where I saw someone using the jitviewer who likely >> doesn't need to use it. 
At least they don't need to use it considering the >> current design of the code I said the following >> >> "The jitviewer should be mainly used by PyPy core developers and those >> building PyPy VMs. A normal developer writing Python code to run on PyPy >> shouldn?t have a need to use it. They can use it to point out an >> inefficiency that PyPy has to the core developers but it should not be used >> as a way to get you to write Python code in a way that has a better chance >> of being optimized under PyPy except for very rare occasions and even then >> it should only be made by those who follow closely and understand PyPy?s >> development." >> >> >> Do others here share this same opinion and should some warning be added >> to the jitviewer? >> >> John >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From fijall at gmail.com Sun Feb 3 21:13:07 2013 From: fijall at gmail.com (Maciej Fijalkowski) Date: Sun, 3 Feb 2013 22:13:07 +0200 Subject: [pypy-dev] Should jitviewer come with a warning? In-Reply-To: References: Message-ID: On Sun, Feb 3, 2013 at 10:12 PM, Maciej Fijalkowski wrote: > On Sun, Feb 3, 2013 at 10:08 PM, John Camara wrote: >>> What makes you think people will even read this warning, let alone >>> prioritize it over their immediate desire to make their program run >>> faster? >> >>> (Not that I am objecting to adding the warning, but I think you might be >>> fooling yourself if you think it will have any impact) >> >>> Jean-Paul >> >> I agree with you and was not being naive and thinking this alone was going >> to solve the problem but it does gives us something to point to when we see >> someone abusing the jitviewer. >> >> Maybe, a more effective approach, is not to advertise about the jitviewer to >> everyone who has performance issues and only tell those who are experience >> programmers who have already done the obvious in fixing any design issues >> that had existed in their code. 
Having inexperience developers use the >> normal profiling tools will still help them find the hot spots in their code >> and help prevent them from picking up habits that lead them to writing >> un-Pythonic code. >> >> I'm sure we all agree that code with a better design will run faster in pypy >> than trying to add optimizations that work only for pypy to help out a poor >> design. >> >> I don't think we want to end up with a lot of Python code that looks like C >> code. This is what happens when the inexperience start relying on the >> jitviewer. >> >> For instance take a look at this code [1] and blog [2] which lead me to post >> this. This is not the first example I have come across this issue and >> unfortunately it appears to be increaseing at an alarming rate. >> >> I guess I feel we have a responsibility to try to promote good programming >> practices when we can. >> >> [1] - >> https://github.com/msgpack/msgpack-python/blob/master/msgpack/fallback.py >> >> [2] - http://blog.affien.com/archives/2013/01/29/msgpack-for-pypy/ >> >> John >> >> >> >> On Sun, Feb 3, 2013 at 12:39 PM, John Camara >> wrote: >>> >>> I have been noticing a pattern where many who are writing Python code to >>> run on PyPy are relying more and more on using the jitviewer to help them >>> write faster code. Unfortunately, many of them who do so don't look at >>> improving the design of their code as a way to improve the speed at which it >>> will run under PyPy but instead start writing obscure Python code that >>> happens to run faster under PyPy. >>> >>> I know that at least with the PyPy core developers they would like to see >>> every one just create good clean Python code and that often code that has >>> been made into obscure Python was don so to try to optimize it for CPython >>> which in many cases causes it to run slower on PyPy than it would run it the >>> code just followed typical Python idioms. 
>>> >>> I feel that a normal developer should be using tools like cProfiler and >>> runsnakerun and cleaning up design issues way before they should even >>> consider using jitviewer. >>> >>> In a recent case where I saw someone using the jitviewer who likely >>> doesn't need to use it. At least they don't need to use it considering the >>> current design of the code I said the following >>> >>> "The jitviewer should be mainly used by PyPy core developers and those >>> building PyPy VMs. A normal developer writing Python code to run on PyPy >>> shouldn?t have a need to use it. They can use it to point out an >>> inefficiency that PyPy has to the core developers but it should not be used >>> as a way to get you to write Python code in a way that has a better chance >>> of being optimized under PyPy except for very rare occasions and even then >>> it should only be made by those who follow closely and understand PyPy?s >>> development." >>> >>> >>> Do others here share this same opinion and should some warning be added to >>> the jitviewer? >>> >>> John >> >> >> >> _______________________________________________ >> pypy-dev mailing list >> pypy-dev at python.org >> http://mail.python.org/mailman/listinfo/pypy-dev >> > > Hi John. > > I don't believe jitviewer is advertised really in that many places. We > tell people who come to IRC, yes, but that's about it (it's not > prominently featured on pypy.org for example). It's hard enough to > make people read docs. Also, looking at the msgpack - this code is maybe not ideal, but if you're dealing with buffer-level protocols, you end up with code looking like C a lot. 
From fijall at gmail.com Sun Feb 3 21:19:41 2013 From: fijall at gmail.com (Maciej Fijalkowski) Date: Sun, 3 Feb 2013 22:19:41 +0200 Subject: [pypy-dev] Parallella open hardware platform In-Reply-To: References: Message-ID: On Fri, Feb 1, 2013 at 12:01 AM, John Camara wrote: > A couple of days ago I heard about the Parallella [1] project which is an > open hardware platform similar to the Raspberry Pi but with much higher > capabilities. It has a Zynq Z-7010 which has both a dual core ARM A9 (800 > MHz) processor and a Artix-7 FPGA, a 16 core Epiphany multicore accelerator, > 1GB ram (see [2] for more info) and currently boots up in Ubuntu. > > The goal of the Parallella project is to develop an open parallel hardware > platform and development tools. Recently they announced support for Python > with Mark Dewing [3] leading the effort. I had asked Mark if he considered > PyPy but at this time he doesn't have time for this investigation and he > reposted my comment on the forum [4] with a couple of question. Maybe one of > you could answer them. > > Working with the Parallella project maybe a good opportunity for the PyPy > project from both a PR perspective and as well as the technical challenges > it would present. On the technical side it would give the opportunity to > test STM on a reasonable number of cores while also dealing with cores from > different architectures (ARM and Epiphany). I could see all the JITting > occurring on the ARM cores with it producing output for both architectures > based on which type of core STM decides to use for a chunk of work to > execute on. Of course there is also the challenge of bridging between the 2 > architectures. Maybe even some of the more expensive STM operations could > be offloaded to the FPGA or even a limited amount of very hot sections of > code could be JITted to the FPGA (although this might be more work than its > worth). 
> > From a PR perspective PyPy needs to excel at some niche market so that the > PyPy platform can take off. When PyPy started concentrating on the > scientific market with increasing support for Numpy I thought this would be > the niche market that would get PyPy to take off. But there have been a > couple of issue with this approach. There is a tremendous amount of work > that needs to be done so that PyPy can look attractive to this niche market. > It requires supporting both NumPy and SciPy and their was an expectation > that if PyPy supports NumPy others would come to help out with the SciPy > support. The problem is that there doesn't seam to be many who are eager to > pitch in for the SciPy effort and there also has not been a whole lot > willing to help will the ongoing NumPy work. I think in general the ratio > of people who use NumPy and SciPy to those willing to contribute is quite > small. So the idea of going after this market was a good idea and can > definitely have the opportunity to showing the strength of PyPy project it > hasn't done much to improve the image of the PyPy project. It also doesn't > help that there is some commercial interests that have popped up recently > that have decided to play hard ball against PyPy by spreading FUD. > > Unlike the Raspberry Pi hardware which can only support hobbyist the > Parallella hardware can support both hobbyists and commercial interests. > They cost $100 which is more than the $35 for Raspberry Pi but still within > reach of most hobbyists and they didn't cut out the many features that are > needed for commercial interests. The Parallella project raised nearly $0.9 > million on kickstarter [5] for the project with nearly 5000 backers. 
Since > many who will use the Parallella hardware also have experience on embedded > systems they and are more likely used to writing low level code in assembly, > FPGAs, and even lots of C code and I'm sure have hit many issues with > programming in parallel/multithreaded and would welcome a better developer > experience. I bet many of them would be willing to contribute both > financially and time to supporting such an effort. I believe the > Architecture of PyPy could lend it self to becoming the core of such a > development system and would allow Python to be used in this space. This > could provide a lot of good PR for the PyPy project. > > Now I'm not saying PyPy shouldn't devote any more time to supporting NumPy > as I'm sure when PyPy has very good support for both NumPy and SciPy it's > going to be a very good day for all Python supporters. I just think that > the PyPy team needs to think about a strategy that in the end will help its > PR and gain support from a much larger community. This project is doing a > lot of good things technically and now it just needs to get the attention of > the development community at large. Now I can't predict if working with the > Parallella project would be the break though in PR that PyPy needs but it's > at least an option that's out there. > > BTW I don't have any commercial interests in the Parallella project. If > some time in the future I use their hardware it would likely be as a > hobbyist and it would be nice to program it in Python. My real objective of > this post to see the PyPy project gain wider interest as it would be a good > thing for Python. 
> > [1] - http://www.parallella.org/ > [2] - http://www.parallella.org/board/ > [3] - http://forums.parallella.org/memberlist.php?mode=viewprofile&u=3344 > [4] - http://forums.parallella.org/viewtopic.php?f=26&t=139 > [5] - > http://www.kickstarter.com/projects/adapteva/parallella-a-supercomputer-for-everyone > > John > > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > http://mail.python.org/mailman/listinfo/pypy-dev > Hi John To answer the question from the forum - the JIT emits assembler (x86, ARM); it does not emit C code. As far as PR is concerned, there is no such thing as the PyPy team meeting and deciding where to go. Everyone works on what they feel like doing where volunteer time is concerned. Obviously things are a little different when there is a commercial interest in something. From my own perspective PyPy should excel at one thing - providing a kick-ass Python VM that's universally fast. We're missing quite a few things (like library support), but things have improved quite drastically, due to things like cffi. The startup time is another one on the list to consider, and it affects ARM even more (since it's slower in general). Cheers, fijal From john.m.camara at gmail.com Sun Feb 3 21:29:22 2013 From: john.m.camara at gmail.com (John Camara) Date: Sun, 3 Feb 2013 15:29:22 -0500 Subject: [pypy-dev] Should jitviewer come with a warning? In-Reply-To: References: Message-ID: > Also, looking at the msgpack - this code is maybe not ideal, but if > you're dealing with buffer-level protocols, you end up with code > looking like C a lot. I do agree that this type of code will likely end up looking like C, but it's not necessary for all of it to look like C. For instance, there shouldn't be a need to have long chains of if/elif statements. Using pack_into and unpack_from instead of the pack and unpack methods would let it deal directly with the buffer instead of making substrings. 
Even if pypy can optimize this away, why write Python code like this when it's not necessary? Plus I felt that initially the code should just use cffi and connect to the native C library. I believe this approach is likely to give very close to the best performance you could get on pypy for this type of library. I'm not sure how much of an increase in performance would be gained by writing the library completely in Python vs using cffi. Is there anything wrong with this line of thinking? Do you feel a pure Python approach could achieve better results than using cffi under pypy? John -------------- next part -------------- An HTML attachment was scrubbed... URL: From fijall at gmail.com Sun Feb 3 21:50:57 2013 From: fijall at gmail.com (Maciej Fijalkowski) Date: Sun, 3 Feb 2013 22:50:57 +0200 Subject: [pypy-dev] Should jitviewer come with a warning? In-Reply-To: References: Message-ID: On Sun, Feb 3, 2013 at 10:29 PM, John Camara wrote: >> Also, looking at the msgpack - this code is maybe not ideal, but if >> you're dealing with buffer-level protocols, you end up with code >> looking like C a lot. > > I do agree that this type a code will likely end up looking like C but it's > not necessary for all of it to look like c. Like there should be a need to > have long chains of if, elif statements. Using pack_into and unpack_from > instead of pack and unpack methods so that it directly deals with the buffer > instead of making sub strings. Even if pypy can optimize this away why > write Python code like this when its not necessary. er. strings are immutable in python. you can't pack into them. other kinds of buffers are kind of dodgy, because python never grew a correct buffer. > > Plus I felt, initially the code should just use cffi and connect to the > native c library. I believe this approach is likely to give very close to > the best performance you could get on pypy for this type of library. 
I'm > not sure how much of an increase in performance would be gain by writing the > library completely in Python vs using cffi. Is there anything wrong with > this line of thinking. Do you feel a pure Python approach could achieve > better results than using cffi under pypy. python is nicer. It does not segfault. Besides, how do you get a string out of a C library? if you do raw malloc it's prone to be bad. Etc. etc. > > John > > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > http://mail.python.org/mailman/listinfo/pypy-dev > From steve at pearwood.info Sun Feb 3 23:39:36 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Mon, 04 Feb 2013 09:39:36 +1100 Subject: [pypy-dev] Should jitviewer come with a warning? In-Reply-To: <20130203192548.3816.313520268.divmod.xquotient.2@localhost6.localdomain6> References: <20130203192548.3816.313520268.divmod.xquotient.2@localhost6.localdomain6> Message-ID: <510EE728.1000608@pearwood.info> On 04/02/13 06:25, exarkun at twistedmatrix.com wrote: > On 05:39 pm, john.m.camara at gmail.com wrote: >> I have been noticing a pattern where many who are writing Python code to >> run on PyPy are relying more and more on using the jitviewer to help them >> write faster code. Unfortunately, many of them who do so don't look at >> improving the design of their code as a way to improve the speed at which >> it will run under PyPy but instead start writing obscure Python code that >> happens to run faster under PyPy. [...] >> Do others here share this same opinion and should some warning be added to >> the jitviewer? > > What makes you think people will even read this warning, let alone prioritize < it over their immediate desire to make their program run faster? 
> > (Not that I am objecting to adding the warning, but I think you might be >fooling yourself if you think it will have any impact) I think that if the coder is actually using some sort of profiling tool, any sort of profiling tool, that makes them 1000 times more likely to read and pay attention to the warning than the average coder who optimizes code by guessing. Other than that observation, I don't have an opinion on whether jitviewer should come with a warning. (Oh, and another thing... I'm assuming you mean for jitviewer to print the warning as part of it's normal output.) -- Steven From fijall at gmail.com Sun Feb 3 23:51:03 2013 From: fijall at gmail.com (Maciej Fijalkowski) Date: Mon, 4 Feb 2013 00:51:03 +0200 Subject: [pypy-dev] Should jitviewer come with a warning? In-Reply-To: <510EE728.1000608@pearwood.info> References: <20130203192548.3816.313520268.divmod.xquotient.2@localhost6.localdomain6> <510EE728.1000608@pearwood.info> Message-ID: On Mon, Feb 4, 2013 at 12:39 AM, Steven D'Aprano wrote: > On 04/02/13 06:25, exarkun at twistedmatrix.com wrote: >> >> On 05:39 pm, john.m.camara at gmail.com wrote: >>> >>> I have been noticing a pattern where many who are writing Python code to >>> run on PyPy are relying more and more on using the jitviewer to help them >>> write faster code. Unfortunately, many of them who do so don't look at >>> improving the design of their code as a way to improve the speed at which >>> it will run under PyPy but instead start writing obscure Python code that >>> happens to run faster under PyPy. > > [...] > >>> Do others here share this same opinion and should some warning be added >>> to >>> the jitviewer? >> >> >> What makes you think people will even read this warning, let alone >> prioritize > > < it over their immediate desire to make their program run faster? 
>> >> >> (Not that I am objecting to adding the warning, but I think you might be >> fooling yourself if you think it will have any impact) > > > > I think that if the coder is actually using some sort of profiling tool, > any sort of profiling tool, that makes them 1000 times more likely to read > and pay attention to the warning than the average coder who optimizes code > by > guessing. > > Other than that observation, I don't have an opinion on whether jitviewer > should come with a warning. > > (Oh, and another thing... I'm assuming you mean for jitviewer to print the > warning as part of it's normal output.) that is definitely a no (my screen is too small to have some noise there, if for no other reason), it might have a warning in the documentation though, if it's any useful. But honestly, I doubt such a warning makes any sense. People who are capable of using jitviewer already "know better". From john.m.camara at gmail.com Mon Feb 4 01:12:01 2013 From: john.m.camara at gmail.com (John Camara) Date: Sun, 3 Feb 2013 19:12:01 -0500 Subject: [pypy-dev] Should jitviewer come with a warning? In-Reply-To: References: Message-ID: > that is definitely a no (my screen is too small to have some noise > there, if for no other reason), it might have a warning in the > documentation though, if it's any useful. But honestly, I doubt such a > warning makes any sense. People who are capable of using jitviewer > already "know better". I agree it should not be part of the normal output. I would say add it to the doc string in app.py and to the README file. As far as people using the jitviewer already "know better". If that's the case I wouldn't have started this thread. Like you said earlier the use of jitviewer is only promoted on irc and yet I have come across 3 people working on different projects who are using it for the wrong reasons over the last 2 weeks. It's like this is the new RPython where people start using it for the wrong reasons. 
-------------- next part -------------- An HTML attachment was scrubbed... URL: From fijall at gmail.com Mon Feb 4 09:42:47 2013 From: fijall at gmail.com (Maciej Fijalkowski) Date: Mon, 4 Feb 2013 10:42:47 +0200 Subject: [pypy-dev] Should jitviewer come with a warning? In-Reply-To: References: Message-ID: On Mon, Feb 4, 2013 at 2:12 AM, John Camara wrote: >> that is definitely a no (my screen is too small to have some noise >> there, if for no other reason), it might have a warning in the >> documentation though, if it's any useful. But honestly, I doubt such a >> warning makes any sense. People who are capable of using jitviewer >> already "know better". > > I agree it should not be part of the normal output. I would say add it to > the doc string in app.py and to the README file. As far as people using the > jitviewer already "know better". If that's the case I wouldn't have started > this thread. Like you said earlier the use of jitviewer is only promoted on > irc and yet I have come across 3 people working on different projects who > are using it for the wrong reasons over the last 2 weeks. It's like this is > the new RPython where people start using it for the wrong reasons. > > Seriously which ones? I think msgpack usage is absolutely legit. You seem to have different opinions about the design of that software, but you did not respond to my concerns even, not to mention the fact that it sounds like it's not "obfuscated by jitviewer". Cheers, fijal From john.m.camara at gmail.com Mon Feb 4 17:28:22 2013 From: john.m.camara at gmail.com (John Camara) Date: Mon, 4 Feb 2013 11:28:22 -0500 Subject: [pypy-dev] Should jitviewer come with a warning? In-Reply-To: References: Message-ID: On Mon, Feb 4, 2013 at 3:42 AM, Maciej Fijalkowski wrote: > Seriously which ones? I think msgpack usage is absolutely legit. 
You > seem to have different opinions about the design of that software, but > you did not respond to my concerns even, not to mention the fact that > it sounds like it's not "obfuscated by jitviewer". > > Cheers, > fijal > First I would have tried using cffi to the msgpack c library. If I wasn't happy with it I would do a Python port. So for now let's forget about cffi and just deal with the current design of this library. I had tried to minimize the discussion about this library on this forum as I had already written extensive comments on the original blog [1]. Now, I didn't do an extensive review of the code, as I only concentrated on a small portion of it, namely the area of unpacking the msgpack messages. I'll just highlight a couple of concerns I had. The first thing that shocked me was the use of the struct.pack and struct.unpack functions. Normally, when you need to pack and unpack often with the same format, you would create a struct.Struct object with the desired format and use this object with its pack and unpack methods. That way the format string is not parsed on every call but only once, when the Struct object is created. As Bas pointed out, pypy is able to optimize the parsing of the format, which is great, but why would you prefer to write code that would run with horrible performance under CPython when there is an alternative available? Now, toward the end of the comments on the blog, Bas stated he tried the struct object under pypy and found it ran slower. So there is likely an opportunity for pypy to add another optimization, since if pypy can optimize the struct functions it should be able to handle the struct objects, which I would think would be an easier case to handle, purely looking at it from a high level perspective. Another issue I had is that the msgpack spec is designed in a way to minimize the need to copy data. That is, you should be able to just use the data directly from the message buffers. 
The normal way to do this with the struct module is to use the unpack_from and pack_into methods instead of the pack and unpack methods. These methods take a buffer and an offset, as opposed to pack and unpack, which would require you to slice out a copy of the original buffer to pass into the unpack method. As Bas pointed out, again pypy is able to optimize away this copy created from slicing, which is great, but again why code it in a way that will be slow on CPython when there is an alternative? The other issue I mentioned on the blog was the large number of if, elif statements used to handle each type of msgpack message. I instead suggested creating essentially a list that holds references to struct objects so that the message type would be used as an index into this list. That way you remove all the if, elif statements and end up with something like struct_objects[message_type].unpack_from(). Now, I understand that pypy is able to optimize all these if and elif statements by creating bridges for the various paths through this code, but again why code it this way when it will be slow on CPython? I would also assume that using the if/elif statements would still have more overhead in pypy compared to using a list of references, although maybe there is not much of a difference. Anyway, these are just the issues I saw with this library, which by the way is nowhere near as bad as other code I have seen written as a result of users using the jitviewer. Unfortunately, I could not discuss these other projects as they are closed source. Anyway, to get to the other part of your reply, I assume not responding to your concerns is about the following: "python is nicer. It does not segfault. Besides, how do you get a string out of a C library? if you do raw malloc it's prone to be bad. Etc. etc." Sorry, that was an oversight. I feel the same way about Python, but what's the real issue with taking the practical approach of using a C library that is written well and is robust? 
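To make the suggestion concrete, the struct.Struct plus dispatch-table idea described above can be sketched roughly like this. Note that the formats and type codes here are invented purely for illustration; they are not the real msgpack type codes:

```python
import struct

# Precompiled Struct objects: the format string is parsed once, here,
# instead of on every pack/unpack call.  Formats are hypothetical:
# ">B" unsigned byte, ">I" big-endian uint32, ">d" big-endian float64.
UINT8 = struct.Struct(">B")
UINT32 = struct.Struct(">I")
FLOAT64 = struct.Struct(">d")

# A dispatch table replaces the if/elif chain: the message type is used
# directly as an index into a list of Struct objects.
UNPACKERS = [UINT8, UINT32, FLOAT64]

def read_value(buf, offset, message_type):
    s = UNPACKERS[message_type]
    # unpack_from reads straight out of the buffer at the given offset,
    # avoiding the intermediate copy that unpack(buf[offset:offset+n])
    # would make.
    (value,) = s.unpack_from(buf, offset)
    return value, offset + s.size

payload = UINT32.pack(1234)
value, next_offset = read_value(payload, 0, 1)
print(value, next_offset)  # -> 1234 4
```

This shape runs at a reasonable speed on CPython too, since neither the format parsing nor the buffer slice happens per message.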
I would love to see everything written in Python, but who has the time to port everything over? The msgpack C library would have the responsibility of maintaining the buffers. Its API supports creating and freeing these buffers. The msgpack library would be doing most of the work, and the only data that has to go back and forth between the Python code and the library are just basic types like int, float, double, strings, etc. To get a string out of the C library, just slice a cffi.buffer to create a copy of it in Python before calling the function to clear the msgpack buffer. With cffi, this slicing to create copies of strings in Python and the overhead of calling into the C functions do add extra work over code written purely in Python, assuming pypy has all the optimizations in place to match the performance of the msgpack C library. The question is how much overhead cffi really adds in this use case, and whether it is worth doing the Python port to remove that overhead. I don't know the answer to this question. It would require profiling both cases. [1] - http://blog.affien.com/archives/2013/01/29/msgpack-for-pypy/ John -------------- next part -------------- An HTML attachment was scrubbed... URL: From fijall at gmail.com Mon Feb 4 22:22:28 2013 From: fijall at gmail.com (Maciej Fijalkowski) Date: Mon, 4 Feb 2013 23:22:28 +0200 Subject: [pypy-dev] Should jitviewer come with a warning? In-Reply-To: References: Message-ID: On Mon, Feb 4, 2013 at 6:28 PM, John Camara wrote: > On Mon, Feb 4, 2013 at 3:42 AM, Maciej Fijalkowski wrote: > >> >> Seriously which ones? I think msgpack usage is absolutely legit. You >> seem to have different opinions about the design of that software, but >> you did not respond to my concerns even, not to mention the fact that >> it sounds like it's not "obfuscated by jitviewer". >> >> Cheers, >> fijal > > First I would have tried using cffi to the msgpack c library. 
If I wasn't > happy with it I would do a Python port. So for no lets forget about cffi > and just deal with the current design of this library. > > I had tried to minimize the discussion about this library on this forum as I > had already wrote extensive comments on the original blog [1]. Now I didn't > do an extensive review of the code as I only concentrated on a small portion > of it namely in the area of unpacking the msgpack messages. I'll just > highlight a couple of concerns I had. > > The first thing the shocked me was the use of the struct.pack and > struct.unpack functions. Normally when you need to pack and unpack often > with the same format you would create a struct object with the desired > format and use this object with its pack and unpack methods. That way the > format string is not always being parsed but instead once when the struct > object is created. > > As Bas pointed out pypy is able to optimize the parsing of the format which > is great but why would you prefer to write code that would run with horrible > performance under CPython when there is an alternative available. Now > toward the end of the comments on the blog, Bas stated he tried the struct > object under pypy and found it ran slower. So there is likely an > opportunity for pypy to add another optimization as if pypy can optimize the > struct functions it should be able to handle the struct objects which I > would think would be an easier case to handle purely looking at it from a > high level perspective. It's a fallback for PyPy, so CPython speed is irrelevant. Also CPython has tons of weird quirks and "faster for PyPy and slower for CPython" is not always a bad thing. Personally I don't care. This particular example however should be reported as a bug in PyPy - using Struct is *nicer*, so it should be as fast (and there is no good reason why not). > > Another issue I had was the msgpack spec is designed in a way to minimize > the need of copying data. 
That is you should be able to just use the data > directly from the message buffers. The normal way to do this with the > struct module is to use the unpack_from and pack_into methods instead of the > pack and unpack methods. These methods take a buffer and an offset as > opposed to the pack and unpack which would require you to slice out a copy > of the original buffer to pass it in the unpack method. As Bas pointed out > again pypy is able to optimize this copy created from slicing away which is > great but again why code it in a way that will be slow on CPython when there > is an alternative. Python buffer support sucks. For example you don't get a string out (because strings are immutable). PyPy buffer support double sucks, because buffer protocol is broken and we also didn't care. Fortunately we're able to optimize string slicing here (strings are nicer than buffers or bytearrays to play with), but we should fix buffers. Sorry about that. Again, the CPython speed does not apply. > > The other issue I mentioned on the blog was the large number of if, elif > statements used to handle each type of msgpack message. I instead suggested > creating essentialy a list that holds references to struct objects so that > the message type would be used as in index into this list. So that way you > remove all the if, elif statements and end up with something like > > struct_objects[message_type].unpack_from() Lack of constant propagation. Again, a potential bug in PyPy, but a hard one. > > Now I understand that pypy is able to optimize all these if and elif > statements by creating bridges for the various paths through this code but > again why code it this way when it will be slow on CPython. I would also > assume that using the if elif statements would still have more overhead in > pypy compared to using a list of references although maybe there is not much > of a difference. 
It's not about if/elif or references (all those things are incredibly cheap), but about constant propagation. Notably, determining that a format is constant. This would disappear if we fix Struct (it's an easy fix, a few hours of work for someone not experienced with PyPy). > > Any way this is just the issues I saw with this library which by the way is > no where near as bad as other code I have seen written as a result of users > using the jitviewer. Unfortunately, I could not discuss these other > projects as they are closed source. And we're unable to help you because of that. > > Any way to get to the other part of you reply I assume not responding to > your concerns is about the following > > "python is nicer. It does not segfault. Besides, how do you get a > string out of a C library? if you do raw malloc it's prone to be bad. > Etc. etc." > > Sorry that was an over sight. I feel the same way about Python but what's > the real issue of taking the practical approach of using a c library that is > written well and is robust. I would love to see everything written in > Python but who has the time to port everything over. If you're dealing with data coming from the outside, using Python over a C lib sounds like a very sensible idea security-wise. I can't blame anyone here. I would do the same (given that the protocol is simple enough as well). Cheers, fijal From arigo at tunes.org Tue Feb 5 15:47:28 2013 From: arigo at tunes.org (Armin Rigo) Date: Tue, 5 Feb 2013 15:47:28 +0100 Subject: [pypy-dev] Parallella open hardware platform In-Reply-To: References: Message-ID: Hi John, Sorry if I misread you, but you seem to be only saying "it would be nice if the PyPy team worked on the support for rather than ". While this might be true under some point of view, it is not constructive. What would be nice is if *you* seriously proposed to work on , or helped us raise commercial interest, or otherwise contributed towards . 
If you're not up to it, and nobody steps up, then it's the end of the story (but thanks anyway for the nice description of Parallella). À bientôt, Armin. From john.m.camara at gmail.com Tue Feb 5 19:25:16 2013 From: john.m.camara at gmail.com (John Camara) Date: Tue, 5 Feb 2013 13:25:16 -0500 Subject: [pypy-dev] Parallella open hardware platform In-Reply-To: References: Message-ID: Hi Armin, It's even worse: I'm asking you to support and I don't even need it. When I posted this thread it was getting rather long and unfortunately I didn't really make all the points I wanted to make. At this point, and even for some time now, PyPy has a great foundation but its use remains low. Every now and then it's good to step back a little bit, reflect on the current situation, and come up with a strategy that helps the project's popularity grow. I know that PyPy has done things to help with the growth, such as writing blog posts, being quick to fix bugs, helping others with their performance issues and even rapidly adding optimizations to PyPy, presenting at conferences, and often actively commenting on any posts or comments made about PyPy. So PyPy is doing a lot of things right to help its PR, but yet there is this issue of slow growth. Now, we know the main issue with its growth is the fact that the Python ecosystem relies on a lot of libraries that use the CPython API, and PyPy just doesn't have full support for this interface. I understand the reasons why PyPy is not going to support the full interface, and PyPy has come up with the cffi library as a way to bridge the gap. And of course I don't expect the PyPy project to take on the responsibility of porting all the popular 3rd party libraries that use the CPython API to cffi. It's going to have to be a community effort. One thing that could help would be more marketing of cffi, as very few Python developers know it exists. But that alone is not going to be enough. 
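For anyone who hasn't come across cffi yet, here is a minimal sketch of what its ABI mode looks like, calling into the standard C library on a POSIX system (strlen is just a stand-in for whatever third-party library you actually want to wrap):

```python
import cffi

ffi = cffi.FFI()
# Declarations are pasted from the C headers; cffi parses them at runtime.
ffi.cdef("size_t strlen(const char *s);")
libc = ffi.dlopen(None)  # None loads the standard C library (POSIX only)

print(libc.strlen(b"pypy"))  # -> 4

# Getting a string back out of C safely: ffi.string() copies the
# NUL-terminated data into a Python-owned bytes object, so Python code
# never holds a pointer into memory the C side might free.
buf = ffi.new("char[]", b"hello from C land")
print(ffi.string(buf))
```

The same pattern works against any shared library by passing its name to ffi.dlopen() instead of None, which is why it is pitched as the bridge for libraries that currently depend on the CPython API.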
History tells us that most successful products/projects that become popular do so by first supporting the needs of some niche market. As time goes by, that niche market starts providing PR that helps other markets discover the product/project, and the cycle can sometimes continue until there is mass adoption. Now, when PyPy started to place a focus on NumPy I had hoped that the market it serves would turn out to be the market that would help PyPy grow. But at this point in time it does not appear like that is going to happen. For a while I have been trying to think of a niche market that may be helpful. But to do so you have to consider the current state of PyPy, which means eliminating markets that heavily rely on libraries that use the CPython API; I'm also going to avoid the NumPy market as that's currently being worked on; there is the mobile market, but that's a tough one to get into; maybe the gaming market could be a good one; etc. It turns out that with the current state of PyPy many markets need to be eliminated if you're looking for one that is going to help with growth. The Parallella project, on the other hand, looks like it could be a promising one, and I'll share some thoughts a little later in this post as to why I feel this way. Right now you have been putting a lot of effort into STM, in which you're trying to solve what is likely the biggest challenge that the developer community is facing. That is, how to write software that effectively leverages many cores in a way that is straightforward and in the spirit of Python. When you solve this problem, and I have faith that you will, most would think that it would cause PyPy's popularity to skyrocket. What most likely will happen is that PyPy gets a temporary boost in popularity, as there is another lesson in history to be concerned about. Often the first to solve a problem does not become popular in the long run. 
Usually the first to solve the problem does so via a unique solution, but once people start using it, issues with the approach get discovered. Then often many others will use the original solution as a starting point and modify it to eliminate these new issues. Then one of the second-generation solutions ends up being the de facto standard. Now, PyPy is able to move fairly quickly in terms of implementing new approaches, so it may in fact be able to compete just fine against other 2nd generation solutions. But there may be some benefits to exposing STM to a smaller market, to help PyPy buy some additional time before releasing it as a solution for the general developer community. So why the Parallella project? Well, I think it can be helpful in a number of ways. First, I don't believe that this market is going to need much from the libraries that use the CPython APIs. Many who are in this market are used to having to program for embedded systems and are more likely to have the skills to help out the PyPy project in a number of areas, and would likely also have a financial incentive to contribute back to PyPy, such as helping keep various back ends up to date, such as ARM, PPC, and additional architectures. Some in this market are used to using a number of graphical languages to program their devices, but unfortunately for them some of the new products that need to enter the market can't be built fully with these graphical languages. Well, with the PyPy framework it's possible for them to implement a VM for that graphical language and be able to create products that contain elements programmed in both the graphical languages as well as text-based languages. Also, the VMs on many embedded systems are typically simple and don't have a JIT. PyPy can help with this, but I don't believe anyone who maintains these VMs is aware of the PyPy project. 
As far as STM is concerned, working with embedded systems will force finding solutions to the many issues that arise with various hardware architectures, which would help STM become a more general solution. Right now you're writing STM in a way that will support multiple cores on a single processor well. I know you have to start somewhere. But soon you will have to deal with issues that arise once you span multiple processors, such as dealing more often with the slower L3 cache and its sync issues, and local vs remote memory issues. But on the embedded side you have to deal with processors of multiple architectures on the same system, plus FPGAs, as well as having to consider the various issues that arise from the various buses involved, which makes the STM problem quite a bit harder in terms of how it gets optimized to handle all these variations. Of course, many of these same issues exist if you want to have STM support GPUs in a normal computing device. The embedded side just adds additional complications, as they come in more complex configurations. The Raspberry Pi has become popular, as many want to hack on these devices, and the Raspberry Pi happens to be the first device that is both cheap and allows programming at a high level. Previously, if you wanted cheap it meant you needed to program using a low-level approach, or you had to buy an expensive solution to program at a high level. Many who get interested in the Raspberry Pi soon find themselves in the position where they have an idea and want to create a product to sell. But they realize you can't use a Raspberry Pi for production, as it is missing many features that would be required; they also like the idea of programming at a high level, but the traditional embedded systems that support this may be too expensive for their product. That's where the Parallella project comes into play. They see there is a market for low-cost devices that can be programmed with higher-level tools to build production systems. 
This market values programming at a high level and would highly appreciate being able to program these devices in Python. They also have a need to support multiple cores and thus could use STM, and it would be incredibly useful if the STM approach could seamlessly support multiple architectures. There is a lot of value here for the companies that want to produce these devices, and PyPy should try to tap into it. This new market segment using these low-cost devices is going to have a large impact and also will play a role in the manufacturing revolution that is about to take place. This manufacturing revolution is likely to be on the same scale as the Internet revolution. Just think about the effect 3D printing is going to have. It will be huge. PyPy getting a foothold in this market before it takes off would be huge for PyPy, as well as for Python in general. Also, there are some big players who currently sell these more expensive embedded systems who are not going to be happy about these cheaper alternatives and are also going to want a piece of the action. I think many of them who may not be able to quickly change their development and run-time processes may decide it's much easier for them to port their VMs over to PyPy to get into the action. Hopefully this gives some better insight as to why I feel it may be a good strategy to consider supporting the Parallella project. The possibility of getting a foothold in a market that is about to take off doesn't come around too often. All I know is, if PyPy would like to support this market, right now is the best time to get started. This might be the ticket PyPy needs to get its growth up, which could then lead to additional markets taking notice and more of the Python ecosystem becoming compatible with PyPy. Of course this is just my opinion, and maybe someone else could come up with another strategy that can help PyPy grow faster. Even an Open Source project can use a strategy. 
John On Tue, Feb 5, 2013 at 9:47 AM, Armin Rigo wrote: > Hi John, > > Sorry if I misread you, but you seem to be only saying "it would be > nice if the PyPy team worked on the support for rather than ". > While this might be true under some point of view, it is not > constructive. What would be nice is if *you* seriously proposed to > work on , or helped us raise commercial interest, or otherwise > contributed towards . If you're not up to it, and nobody steps up, > then it's the end of the story (but thanks anyway for the nice > description of Parallella). > > > A bientôt, > > Armin. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From matti.picus at gmail.com Tue Feb 5 22:34:09 2013 From: matti.picus at gmail.com (Matti Picus) Date: Tue, 05 Feb 2013 23:34:09 +0200 Subject: [pypy-dev] win32 own test failures Message-ID: <51117AD1.7060609@gmail.com> Many of the jit.backend.x86 tests are failing. I am willing to put time into solving this but have little idea where to start. Can someone give me the end of a string to pull?
Here are the last few lines of test_float; the failure is an IndexError, so apparently deadframe.jf_values is empty. Matti

[llinterp:error] | File "c:\Users\matti\pypy_stuff\pypy\rpython\jit\metainterp\warmstate.py", line 322, in execute_assembler
[llinterp:error] | fail_descr.handle_fail(deadframe, metainterp_sd, jitdriver_sd)
[llinterp:error] | File "c:\Users\matti\pypy_stuff\pypy\rpython\jit\metainterp\compile.py", line 537, in handle_fail
[llinterp:error] | resume_in_blackhole(metainterp_sd, jitdriver_sd, self, deadframe)
[llinterp:error] | File "c:\Users\matti\pypy_stuff\pypy\rpython\jit\metainterp\blackhole.py", line 1558, in resume_in_blackhole
[llinterp:error] | all_virtuals)
[llinterp:error] | File "c:\Users\matti\pypy_stuff\pypy\rpython\jit\metainterp\resume.py", line 1041, in blackhole_from_resumedata
[llinterp:error] | resumereader.consume_one_section(curbh)
[llinterp:error] | File "c:\Users\matti\pypy_stuff\pypy\rpython\jit\metainterp\resume.py", line 1083, in consume_one_section
[llinterp:error] | self._prepare_next_section(info)
[llinterp:error] | File "c:\Users\matti\pypy_stuff\pypy\rpython\jit\metainterp\resume.py", line 759, in _prepare_next_section
[llinterp:error] | self.unique_id) # <-- annotation hack
[llinterp:error] | File "c:\Users\matti\pypy_stuff\pypy\rpython\jit\codewriter\jitcode.py", line 149, in enumerate_vars
[llinterp:error] | callback_f(index, self.get_register_index_f(i))
[llinterp:error] | File "c:\Users\matti\pypy_stuff\pypy\rpython\jit\metainterp\resume.py", line 771, in _callback_f
[llinterp:error] | value = self.decode_float(self.cur_numb.nums[index])
[llinterp:error] | File "c:\Users\matti\pypy_stuff\pypy\rpython\jit\metainterp\resume.py", line 1272, in decode_float
[llinterp:error] | return self.cpu.get_latest_value_float(self.deadframe, num)
[llinterp:error] | File "c:\Users\matti\pypy_stuff\pypy\rpython\jit\backend\llsupport\llmodel.py", line 290, in get_latest_value_float
[llinterp:error] | return deadframe.jf_values[index].float
[llinterp:error] | File "c:\Users\matti\pypy_stuff\pypy\rpython\rtyper\lltypesystem\lltype.py", line 1181, in __getitem__
[llinterp:error] | raise IndexError("array index out of bounds")
[llinterp:error] | IndexError: array index out of bounds
[llinterp:error] `------>
[llinterp:traceback] f() rpython.jit.metainterp.test.test_ajit
[llinterp:traceback] v4 = jit_marker(('can_enter_jit'), (), x_1, y_2, res_0)
[llinterp:traceback] E v5 = direct_call((<* fn ll_portal_runner>), x_1, y_2, res_0)
==================== short test summary info ====================
FAIL rpython/jit/backend/x86/test/test_basic.py::TestBasic::()::test_float
==================== 169 tests deselected by '-ktest_float' ====================
==================== 1 failed, 1 passed, 169 deselected in 4.32 seconds ====================

From matti.picus at gmail.com Tue Feb 5 23:51:50 2013 From: matti.picus at gmail.com (Matti Picus) Date: Wed, 06 Feb 2013 00:51:50 +0200 Subject: [pypy-dev] win32 own test failures In-Reply-To: <51117AD1.7060609@gmail.com> References: <51117AD1.7060609@gmail.com> Message-ID: <51118D06.6000908@gmail.com> An HTML attachment was scrubbed... URL: From fijall at gmail.com Wed Feb 6 00:26:31 2013 From: fijall at gmail.com (Maciej Fijalkowski) Date: Wed, 6 Feb 2013 01:26:31 +0200 Subject: [pypy-dev] win32 own test failures In-Reply-To: <51118D06.6000908@gmail.com> References: <51117AD1.7060609@gmail.com> <51118D06.6000908@gmail.com> Message-ID: On Wed, Feb 6, 2013 at 12:51 AM, Matti Picus wrote: > fwiw, this happened on the remove-globals-in-jit branch and began occurring > soon after the first commits on the branch, after changeset 8c87151e76f0 on > that branch the test passed on 64 bit linux.
> Matti it's probably already fixed on jitframe-on-heap, which we aim to merge > > On 5/02/2013 11:34 PM, Matti Picus wrote: > > many of the jit.backend.x86.test are failing, I am willing to put time into > solving this but have a bit of no idea where to start. > Can someone give me the end of a string to pull > > > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > http://mail.python.org/mailman/listinfo/pypy-dev > From fijall at gmail.com Wed Feb 6 12:11:04 2013 From: fijall at gmail.com (Maciej Fijalkowski) Date: Wed, 6 Feb 2013 13:11:04 +0200 Subject: [pypy-dev] Parallella open hardware platform In-Reply-To: References: Message-ID: Hi John. Let me summarize your long post as I understood it: "You guys should bet everything on a platform that both does not need PyPy and has expressed no real interest. The reason why is because PyPy is not growing fast enough and we need a niche market. On top of that we should answer a lot of unanswered questions, like memory and warmup requirements on embedded devices". So, I think you're wrong in very many regards here. I think we should try to excel at providing a kick-ass Python VM, but also I have seriously no say in what people work on (except me). We already have some niche markets, notably people who are willing to invest in R&D and need serious power (but are unable or unwilling to use C or C++ for that). You just don't know about it, because those are typically not people writing blog posts. Having a dedicated web stack is another good step and we'll eventually get there. I don't know why you think this particular niche market is better than any other, but it really does not matter all that much. There is no way you can convince people to do something else in their volunteer time than what they already feel like doing.
Things you can do if you're interested: * do the work yourself * work with the Parallella project to have first-class PyPy support if they care about performance * spark commercial interest however, trying to convince volunteers that they should do what you think they should do is not really one of the helpful things you can be doing. Cheers, fijal From amauryfa at gmail.com Wed Feb 6 12:36:36 2013 From: amauryfa at gmail.com (Amaury Forgeot d'Arc) Date: Wed, 6 Feb 2013 12:36:36 +0100 Subject: [pypy-dev] Parallella open hardware platform In-Reply-To: References: Message-ID: 2013/2/6 Maciej Fijalkowski > however, trying to convince volunteers that they should do what you > think they should do is not really one of the helpful things you can > be doing. > Except if this brings *new* volunteers to the project :-) -- Amaury Forgeot d'Arc -------------- next part -------------- An HTML attachment was scrubbed... URL: From arigo at tunes.org Wed Feb 6 13:13:44 2013 From: arigo at tunes.org (Armin Rigo) Date: Wed, 6 Feb 2013 13:13:44 +0100 Subject: [pypy-dev] Parallella open hardware platform In-Reply-To: References: Message-ID: Hi John, Thanks for your lengthy analysis. I'm sure that it can be interesting for some to read. Unfortunately, I'm personally an Open Source hobbyist that happens to come from a university background and I'm still attached to some ideas behind it. You say about my hacking STM: "Often the first to solve a problem does not become popular in the long run". That is true, and I have no problem with that. My guess is that in the end STM will end up being common in programming languages. So I would like to help along the way --- by showing that it works in complicated languages like Python, using the unlimited flexibility of Software TM rather than as an exercise to fit it around some Hardware TM. It would be nice if PyPy also becomes the de-facto 2nd-generation standard, but that's less realistic --- and not a problem for me.
My goal is *not* to write and sell the final product. What would also be nice is if this final product was Python, but unfortunately, it seems unlikely at this point that CPython will ever convert to STM. I guess that besides PyPy, Python as a whole will lag behind, and likely only end up using some HTM solution in 10-15 years when it's fully ready. (I consider the HTM that we have this year as preliminary at best.) That is my current analysis on the future of STM. It doesn't include huge monetary benefits for PyPy :-) but it doesn't change anything about my own research motivation: 1st-generation research, as you call it. Obviously, PyPy as a whole is such a 1st-generation project. What I would actually like a lot is to see the emergence of other 2nd-generation platforms that apply the same principles as PyPy --- for example, it would be a first step to see an efficient JavaScript JIT compiler not manually written from scratch. A bientôt, Armin. From skip at pobox.com Wed Feb 6 22:10:09 2013 From: skip at pobox.com (Skip Montanaro) Date: Wed, 6 Feb 2013 15:10:09 -0600 Subject: [pypy-dev] Cheetah on PyPy? Message-ID: I'm slowly working through some little tests which don't require any of our libraries at work. My current test is a script which uses the Cheetah template engine. I see that it is buried somewhere in PyPy's tests (inside genshi?), but my straightforward install of Cheetah 2.4.4 doesn't work because the str() of a compiled template doesn't do any of the required template expansion. It just returns what looks suspiciously like the repr() of the object. Is there a modified version of Cheetah somewhere (or patch) which coaxes it to work with PyPy? Failing that, where is the test suite code? I see no references to it as a standalone download, and am currently working with a binary build of 1.9.
Thx, Skip Montanaro From fijall at gmail.com Wed Feb 6 22:52:01 2013 From: fijall at gmail.com (Maciej Fijalkowski) Date: Wed, 6 Feb 2013 23:52:01 +0200 Subject: [pypy-dev] Cheetah on PyPy? In-Reply-To: References: Message-ID: On Wed, Feb 6, 2013 at 11:10 PM, Skip Montanaro wrote: > I'm slowly working through some little tests which don't require and > of our libraries at work. My current test is a script which uses the > Cheetah template engine. I see that it is buried somewhere in PyPy's > tests (inside genshi?), but my straightforward install of Cheetah > 2.4.4 doesn't work because the str() of a compiled template doesn't do > any of the required template expansion. It just returns what looks > suspiciously like the repr() of the object. > > Is there a modified version of Cheetah somewhere (or patch) which > coaxes it to work with PyPy? Failing that, where is the test suite > code? I see no references to it as a standalone download, and am > currently working with a binary build of 1.9. > > Thx, > > Skip Montanaro > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > http://mail.python.org/mailman/listinfo/pypy-dev It seems Cheetah is using a strange C extension called nameMapper. I can bet upfront that this is the part that does not work at all. It has a Python fallback, but it does not seem to work the same way. Cheers, fijal From amauryfa at gmail.com Wed Feb 6 22:54:42 2013 From: amauryfa at gmail.com (Amaury Forgeot d'Arc) Date: Wed, 6 Feb 2013 22:54:42 +0100 Subject: [pypy-dev] Cheetah on PyPy? In-Reply-To: References: Message-ID: 2013/2/6 Skip Montanaro > I'm slowly working through some little tests which don't require and > of our libraries at work. My current test is a script which uses the > Cheetah template engine.
I see that it is buried somewhere in PyPy's > tests (inside genshi?), but my straightforward install of Cheetah > 2.4.4 doesn't work because the str() of a compiled template doesn't do > any of the required template expansion. It just returns what looks > suspiciously like the repr() of the object. > > Is there a modified version of Cheetah somewhere (or patch) which > coaxes it to work with PyPy? Failing that, where is the test suite > code? I see no references to it as a standalone download, and am > currently working with a binary build of 1.9 > It's due to a small incompatibility between CPython and PyPy. You are right that __str__ is not correctly set on Cheetah templates; this is because of this statement in Cheetah/Template.py: concreteTemplateClass.__str__ is object.__str__ A fix is to replace the "is" operator with "==". This is correctly covered by the tests in Cheetah/Tests/Template.py (just run this file with PyPy). The explanation originates from a CPython oddity (IMHO): CPython has no "unbound built-in methods" for C types, and object.__str__ yields the same object every time. This is not the case for user-defined classes; for example, "type(help).__repr__ is type(help).__repr__" is False. PyPy is more regular here, and has real unbound built-in methods. On the other hand this means that the "is" comparison should not be used, even for built-in methods. A similar pattern existed in the stdlib pprint.py, and we chose to fix the module. -- Amaury Forgeot d'Arc -------------- next part -------------- An HTML attachment was scrubbed... URL: From skip at pobox.com Wed Feb 6 23:32:47 2013 From: skip at pobox.com (Skip Montanaro) Date: Wed, 6 Feb 2013 16:32:47 -0600 Subject: [pypy-dev] Cheetah on PyPy? In-Reply-To: References: Message-ID: > You are right that __str__ is not correctly set on Cheetah templates, this > is because of this statement in Cheetah/Template.py: > concreteTemplateClass.__str__ is object.__str__ > A fix is to replace the "is" operator by "==".
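The incompatibility Amaury describes can be sketched with a minimal, hypothetical class (this is not Cheetah's actual code; the class name is invented for illustration). On CPython, every lookup of object.__str__ yields the same object, so an identity test happens to pass; on PyPy, a lookup of a built-in method may produce a fresh method object, so only the equality comparison is reliable:

```python
# Hypothetical stand-in for a template class that does NOT override
# __str__; the name "ConcreteTemplate" is made up for this sketch.
class ConcreteTemplate(object):
    pass

# Cheetah's original check, roughly: "did the class leave __str__ alone?"
# Relying on identity here is CPython-specific behaviour.
identity_check = ConcreteTemplate.__str__ is object.__str__

# The portable form Amaury suggests: equality instead of identity.
equality_check = ConcreteTemplate.__str__ == object.__str__

# On CPython both are True; on PyPy only equality_check is guaranteed True.
print(identity_check, equality_check)
```

The same reasoning is behind the pprint.py change Amaury mentions: comparing built-in methods with "is" silently encodes a CPython implementation detail.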
> This is correcly covered by the tests in Cheetah/Tests/Template.py (just run > this file with pypy) Thanks. I'll make that change and send it back upstream to the Cheetah folks. Skip From john.m.camara at gmail.com Thu Feb 7 05:41:43 2013 From: john.m.camara at gmail.com (John Camara) Date: Wed, 6 Feb 2013 23:41:43 -0500 Subject: [pypy-dev] Parallella open hardware platform In-Reply-To: References: Message-ID: Fijal, In the past you have complained about it being hard to make money in open source. One way to make it easier for you is to grow the popularity of PyPy. So I would think you would at least have some interest in thinking of ways to accomplish that. I'm not trying to dictate what PyPy should do, but merely providing an opinion of mine that I see an opportunity that potentially could be a great thing for PyPy. A year ago if someone had asked me if PyPy should support embedded systems I would have given a firm no, but I see the market changing in ways I didn't expect. The people hacking on these devices are fairly similar to open source developers, and in some cases they even do open source development. They do things differently from the establishment, which has provided a new way to think about manufacturing. Their ways are so different from the establishment, and have become such a game changer, that they have ignited what is becoming a manufacturing revolution. Now, because many who are involved in hacking with this hardware have no prior experience with the established ways of doing this type of business, they are moving in directions that differ in how these devices get programmed. They are also in need of tools and new infrastructure, and I feel that what PyPy has to offer can give them a starting point. Now at the end of the day I don't believe many of their requirements are going to be much different from the requirements of other markets, and not likely too different from the direction PyPy will likely take. So why not go where all the big money is going to be.
OK, enough of that. Let's take a look at your example of a web stack. I believe right now PyPy is in a position to be used in this market. Sure, PyPy could use some additional optimizations to improve the situation, but I think in general it's already able to kick ass compared to CPython in terms of performance when a light web framework is used, which is becoming increasingly popular as web apps push the front ends to do most of the layout/presentation work. Also, with the web becoming more dynamic and the number of requests increasing at a substantial rate, it becomes more important to reduce latencies, which tends to give PyPy an advantage. This is all great while the web stacks are running on traditional servers, but servers are changing. There are some servers being sold today that have hundreds of small cores, and in the not too distant future there will be systems that have a number of full cores and a much larger number of smaller cores, which may or may not have similar architectures. For instance, servers with Phi coprocessors (8 GB of memory, 60 1 GHz cores with, I believe, 4 threads each, and a PCIe3 interface) have recently become available. How is PyPy going to handle this? Is this any different from the needs of the embedded systems? No. PyPy is going to have to start paying attention to how data is accessed and will have to make optimizations based on the access patterns. That is, you have to make sure computational loads can offset the data transfer overhead. Today PyPy does not take this overhead cost into account, which is not required when running on one core. For a web application it would be nice to run multiple sessions on a given core, save session-related data locally to that core so as to minimize data transfer to the smaller cores (which means directing all requests for a session to the same core), do any necessary encryption on these small cores, etc.
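The session-affinity idea above (send every request for a given session to the same small core, so the session data cached there stays local) can be sketched with a tiny routing helper. This is purely illustrative: the core count and the function name are made up, and nothing here is a real PyPy or Phi API.

```python
# Illustrative session-to-core routing: a stable hash of the session id
# picks one of the small cores, so repeated requests for the same session
# always land on the same worker and can reuse its locally cached data.
import hashlib

NUM_SMALL_CORES = 60  # made-up figure, e.g. one worker per coprocessor core

def core_for_session(session_id):
    # hashlib gives a digest that is stable across processes, unlike the
    # built-in hash(), which is randomized per interpreter run.
    digest = hashlib.sha1(session_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % NUM_SMALL_CORES

# The same session always maps to the same core:
assert core_for_session("sess-42") == core_for_session("sess-42")
print(core_for_session("sess-42"))
```

A front-end balancer using such a mapping gives the affinity John describes without any shared routing state, at the cost of uneven load when a few sessions are much hotter than the rest.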
But there may also be some work for a particular request which might not be appropriate to run on a small core and may have to run on the main core, maybe due to it requiring access to too much data. How is this going to work? Is PyPy going to do all the analysis itself, or will the programmer provide some hints to PyPy as to how to break up the work? Who is going to be responsible for the scheduling and for cleaning up the session data that is cached locally to the cores, and a boatload of other issues? I'm not sure; it's a tough problem, and one that is just around the corner. Another option would be to run an HTTP load balancer on the main cores, with PyPy web stacks running on, say, dedicated Phi cores, and the HTTP requests forwarded over the PCIe bus. That way each Phi core acts like an independent web server. But running 60-240 PyPy processes in 8GB of memory is quite the challenge. Maybe some sort of PyPy hypervisor that is able to run virtualized PyPy instances, so that each instance can share all the JITed code but have its own data. I'm sure many issues and questions exist, like who would do the JITting, the hypervisor or the virtualized PyPy instances? Now even if you feel right now is not the time to start worrying about these new server architectures, there are still other issues PyPy will start to run into in the web stack market. Typically, for a web application that is being accessed from the Internet, there is a certain amount of latency that is acceptable. But what happens when the same web stack technology is deployed in local environments (i.e. on a LAN) with heavy dynamic requests, some requiring near-real-time performance? When operating in a networked environment with low latencies, people are going to expect more from web servers (actually, not just people, but systems talking to other systems will require it). This ends up being a problem for Python in general, as the garbage collector is going to be an issue.
This is going to require a concurrent garbage collector. The concurrent garbage collector is also needed by the embedded market, as well as the gaming market, and many others. Anyway, this is just food for thought. I'm not going to keep on giving more examples in more replies. In the end this is where the world is headed, and it's going to take a lot of work and resources to get PyPy to handle these situations, and only strong growth can make it possible. If you want PyPy to get there, I hope you can see why a strategy for growth is necessary. On a side note, I'm not all that comfortable writing these posts when I know that at this particular time I don't have the spare time to contribute. Right now I work 7 days a week from the time I wake up until I go to sleep. But I wrote it anyway, as I do believe there is a good opportunity for PyPy. John On Wed, Feb 6, 2013 at 6:11 AM, Maciej Fijalkowski wrote: > Hi John. > > Let me summarize your long post how I understood it. "You guys should > bet everything on platform that both does not need PyPy and > expressed no real interest. The reason why is because PyPy is not > growing fast enough and we need a niche market. On top of that we > should answer a lot of unanswered questions, like memory and warmup > requirements on embedded devices". > > So, I think you're wrong in very many regards here. I think we should > try to excel at providing a kick ass Python VM, but also I have > seriously no say in what people work on (except me). We already have > some niche markets, notably people who are willing to invest R&D and > need serious power (but are unable or unwilling to use C or C++ for > that). You just don't know about it, because those are typically not > people writing blog posts. Having a dedicated web stack is another > good step and we'll eventuall get there. I don't know why you think > this particular niche market is better than any other, but it really > does not matter all that much.
There is no way you can convince people > to do something else in their volunteer time than what they already > feel like doing. Things you can do if you're interested: > > * do the work yourself > > * work with parallela project to have a first-class pypy support if > they care about performance > > * spark commercial interest > > however, trying to convince volunteers that they should do what you > think they should do is not really one of the helpful things you can > be doing. > > Cheers, > fijal > -------------- next part -------------- An HTML attachment was scrubbed... URL: From eginez at gmail.com Thu Feb 7 07:52:41 2013 From: eginez at gmail.com (=?ISO-8859-1?Q?Esteban_G=EDnez?=) Date: Wed, 6 Feb 2013 22:52:41 -0800 Subject: [pypy-dev] NumPyPy effort Message-ID: Hi there! I am currently looking to help out with PyPy, and it seems like a good place to put some effort is in NumPy. If someone can give pointers and resources on where/how to get started, I would appreciate it a ton. Thanks a bunch E. -------------- next part -------------- An HTML attachment was scrubbed... URL: From fijall at gmail.com Thu Feb 7 10:30:30 2013 From: fijall at gmail.com (Maciej Fijalkowski) Date: Thu, 7 Feb 2013 11:30:30 +0200 Subject: [pypy-dev] NumPyPy effort In-Reply-To: References: Message-ID: On Thu, Feb 7, 2013 at 8:52 AM, Esteban Gínez wrote: > Hi there! > I am currently looking to help out with PyPy and it seems like a good place > to put some effort is in the NumPy. > > If someone can give pointers and resource on where/how to get started I > would appreciated a ton. > > Thanks a bunch > E. Hi Esteban, you're welcome! We're very IRC-based; I suggest you show up on the PyPy IRC channel.
The general idea is that you take some numpy function/failing test that's implemented in C and implement it :) Cheers, fijal From fijall at gmail.com Thu Feb 7 10:33:19 2013 From: fijall at gmail.com (Maciej Fijalkowski) Date: Thu, 7 Feb 2013 11:33:19 +0200 Subject: [pypy-dev] Parallella open hardware platform In-Reply-To: References: Message-ID: On Thu, Feb 7, 2013 at 6:41 AM, John Camara wrote: > Fijal, > > In the past you have complained about it being hard to make money in open > source. One way to make it easier for you is grow the popularity of PyPy. > So I would think you would at least have some interest in thinking of ways > to accomplish that. Before even reading further - how is being popular making money? Being popular is being popular. Do you know any CPython developer working full time on CPython? CPython is definitely popular by my standards. From mtasic85 at gmail.com Thu Feb 7 12:55:42 2013 From: mtasic85 at gmail.com (Marko Tasic) Date: Thu, 7 Feb 2013 12:55:42 +0100 Subject: [pypy-dev] Great experience with PyPy Message-ID: Hi, I would like to share a short story with you about what we have accomplished with PyPy and its friends so far. The company that I have worked for for the last 7 months (intentionally unnamed) gave me absolute permission to pick the technologies on which we based our solution. What we do is: crawl for PDFs and newspaper articles, download them, translate them if needed, OCR them if needed, do extensive analysis of the downloaded PDFs and articles, store them in more organized structures for faster querying, search them, and generate a bunch of complex reports. From the very beginning I decided to go with PyPy no matter what. What we picked is the following: * Flask for the web framework, and a few of its extensions such as Flask-Login, Flask-Principal, Flask-WTF, Flask-Mail, etc. * Cassandra as the database because of its features and great experience with it. PyCassa is used as the client to talk to the Cassandra server.
* ElasticSearch as a distributed search engine, and its client library pyes. * Whoosh as a search engine, but with some modifications to support Cassandra as storage and distributed locking. * Redis, and its client library redis-py, for caching and to speed up common auto-completion patterns. * ZooKeeper, and its client library Kazoo, for distributed locking, which plays an essential role in the system for transaction-like behavior over many services at once. * Celery in conjunction with RabbitMQ for task distribution. * Sentry for error logging. What we have developed on our own are wrappers and clients for: * Moses, which is a language translator * Tesseract, which is an OCR engine * a Cassandra store for Whoosh * wkhtmltopdf and wkhtmltoimage, which are used for conversion of HTML to PDF/Image * etc. Now that the product is finished and in the final testing phase, I can say that we did not regret that we used PyPy and the stack around it. The typical speed improvement is 2x-3x over CPython in our case, but anyway we are mostly IO and memory bound, except for the Celery workers, where we do analyses, which are again many small CPU-intensive tasks that are exchanged via RabbitMQ. Another reason why we don't see a bigger speedup is that we are dependent on external software (servers) written in Erlang and Java. I'm already planning to do Cassandra (a distributed key/value-only database without index features), ZooKeeper, Redis and ElasticSearch ports in Python for the next projects, and hopefully open-source them. Regards, Marko Tasic From fijall at gmail.com Thu Feb 7 13:00:27 2013 From: fijall at gmail.com (Maciej Fijalkowski) Date: Thu, 7 Feb 2013 14:00:27 +0200 Subject: [pypy-dev] Great experience with PyPy In-Reply-To: References: Message-ID: On Thu, Feb 7, 2013 at 1:55 PM, Marko Tasic wrote: > Hi, > > I would like to share short story with you and share what we have > accomplished with PyPy and its friends so far.
> > Company that I have worked for last 7 months (intentionally unnamed) > gave me absolute permission to pick up technologies on which we based > our solution. What we do is: crawl for PDFs and newspapers articles, > download, translate them if needed, OCR if needed, do extensive > analysis of downloaded PDFs and articles, store them in more organized > structures for faster querying, search for them and generate bunch of > complex reports. > > From very beginning I decided to go with PyPy no matter what. What we > picked is following: > * Flask for web framework, and few of its extensions such as > Flask-Login, Flask-Principal, Flask-WTF, Flask-Mail, etc. > * Cassandra as database because of its features and great experience > with it. PyCassa is used as client to talk to Cassandra server. > * ElasticSearch as distributed search engine, and its client library pyes. > * Whoosh as search engine, but with some modifications to support > Cassandra as storage and distributed locking. > * Redis, and its client library redis-py, for caching and to speed up > common auto-completion patterns. > * ZooKeeper, and its client library Kazoo, for distributed locking > which plays essential role in system for transaction-like behavior > over many services at once. > * Celery in conjunction with RabbitMQ for task distribution. > * Sentry for error logging. > > What we have developed on our own are wrappers and clients for: > * Moses which is language translator > * Tesseract which is OCR engine > * Cassandra store for Whoosh > * wkhtmltopdf and wkhtmltoimage which are used for conversion of HTML > to PDF/Image > * etc > > Now when product is finished and in final testing phase, I can say > that we did not regret because we used PyPy and stack around it. 
> Typical speed improvement is 2x-3x over CPython in our case, but > anyway we are mostly IO and memory bound, expect for Celery workers > where we do analysis which are again many small CPU intensive tasks > that are exchanged via RabbitMQ. Another reason why we don't see > speedup us is that we are dependent on external software (servers) > written in Erlang and Java. > > I'm already planing to do Cassandra (distributed key/value only > database without index features), ZooKeeper, Redis and ElasticSearch > ports in Python for next projects, and hopefully opensource them. > > Regards, > Marko Tasic > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > http://mail.python.org/mailman/listinfo/pypy-dev Awesome! I'm glad people can make PyPy work for non-trivial tasks which require a lot of dependencies. We're trying to lower the bar; however, it takes time. Cheers, fijal From phyo.arkarlwin at gmail.com Thu Feb 7 15:11:16 2013 From: phyo.arkarlwin at gmail.com (Phyo Arkar) Date: Thu, 7 Feb 2013 20:41:16 +0630 Subject: [pypy-dev] Great experience with PyPy In-Reply-To: References: Message-ID: PyPy should have a page for "Success Stories!" Now, with this and Quora proving the power of PyPy, I am beginning to convert my projects to PyPy soon! I am only holding off right now because my projects use a lot of C libraries and NumPy/Matplotlib/scikit-learn. Thanks, Phyo. On Thursday, February 7, 2013, Maciej Fijalkowski wrote: > On Thu, Feb 7, 2013 at 1:55 PM, Marko Tasic > > wrote: > > Hi, > > > > I would like to share short story with you and share what we have > > accomplished with PyPy and its friends so far. > > > > Company that I have worked for last 7 months (intentionally unnamed) > > gave me absolute permission to pick up technologies on which we based > > our solution.
What we do is: crawl for PDFs and newspapers articles, > > download, translate them if needed, OCR if needed, do extensive > > analysis of downloaded PDFs and articles, store them in more organized > > structures for faster querying, search for them and generate bunch of > > complex reports. > > > > From very beginning I decided to go with PyPy no matter what. What we > > picked is following: > > * Flask for web framework, and few of its extensions such as > > Flask-Login, Flask-Principal, Flask-WTF, Flask-Mail, etc. > > * Cassandra as database because of its features and great experience > > with it. PyCassa is used as client to talk to Cassandra server. > > * ElasticSearch as distributed search engine, and its client library > pyes. > > * Whoosh as search engine, but with some modifications to support > > Cassandra as storage and distributed locking. > > * Redis, and its client library redis-py, for caching and to speed up > > common auto-completion patterns. > > * ZooKeeper, and its client library Kazoo, for distributed locking > > which plays essential role in system for transaction-like behavior > > over many services at once. > > * Celery in conjunction with RabbitMQ for task distribution. > > * Sentry for error logging. > > > > What we have developed on our own are wrappers and clients for: > > * Moses which is language translator > > * Tesseract which is OCR engine > > * Cassandra store for Whoosh > > * wkhtmltopdf and wkhtmltoimage which are used for conversion of HTML > > to PDF/Image > > * etc > > > > Now when product is finished and in final testing phase, I can say > > that we did not regret because we used PyPy and stack around it. > > Typical speed improvement is 2x-3x over CPython in our case, but > > anyway we are mostly IO and memory bound, expect for Celery workers > > where we do analysis which are again many small CPU intensive tasks > > that are exchanged via RabbitMQ. 
Another reason why we don't see > > speedup us is that we are dependent on external software (servers) > > written in Erlang and Java. > > > > I'm already planing to do Cassandra (distributed key/value only > > database without index features), ZooKeeper, Redis and ElasticSearch > > ports in Python for next projects, and hopefully opensource them. > > > > Regards, > > Marko Tasic > > _______________________________________________ > > pypy-dev mailing list > > pypy-dev at python.org > > http://mail.python.org/mailman/listinfo/pypy-dev > > Awesome! > > I'm glad people can make pypy work for non-trivial tasks which require > a lot of dependencies. We're trying to lower the bar, however it takes > time. > > Cheers, > fijal > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > http://mail.python.org/mailman/listinfo/pypy-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dynamicgl at gmail.com Thu Feb 7 16:12:53 2013 From: dynamicgl at gmail.com (Gelin Yan) Date: Thu, 7 Feb 2013 23:12:53 +0800 Subject: [pypy-dev] Great experience with PyPy In-Reply-To: References: Message-ID: On Thu, Feb 7, 2013 at 10:11 PM, Phyo Arkar wrote: > Pypy should have a page for "Success Stories!" > > Now with this and Quora proving Power of PyPy , i am beginning to start > converting my projects into PyPy soon! > I am only withholding right now because my projects uses a lot of C > Libraries and Numpy/Matplotlib/scilit-learn. > > Thanks > > Phyo. > > On Thursday, February 7, 2013, Maciej Fijalkowski wrote: > >> On Thu, Feb 7, 2013 at 1:55 PM, Marko Tasic wrote: >> > Hi, >> > >> > I would like to share short story with you and share what we have >> > accomplished with PyPy and its friends so far. >> > >> > Company that I have worked for last 7 months (intentionally unnamed) >> > gave me absolute permission to pick up technologies on which we based >> > our solution. 
What we do is: crawl for PDFs and newspapers articles, >> > download, translate them if needed, OCR if needed, do extensive >> > analysis of downloaded PDFs and articles, store them in more organized >> > structures for faster querying, search for them and generate bunch of >> > complex reports. >> > >> > From very beginning I decided to go with PyPy no matter what. What we >> > picked is following: >> > * Flask for web framework, and few of its extensions such as >> > Flask-Login, Flask-Principal, Flask-WTF, Flask-Mail, etc. >> > * Cassandra as database because of its features and great experience >> > with it. PyCassa is used as client to talk to Cassandra server. >> > * ElasticSearch as distributed search engine, and its client library >> pyes. >> > * Whoosh as search engine, but with some modifications to support >> > Cassandra as storage and distributed locking. >> > * Redis, and its client library redis-py, for caching and to speed up >> > common auto-completion patterns. >> > * ZooKeeper, and its client library Kazoo, for distributed locking >> > which plays essential role in system for transaction-like behavior >> > over many services at once. >> > * Celery in conjunction with RabbitMQ for task distribution. >> > * Sentry for error logging. >> > >> > What we have developed on our own are wrappers and clients for: >> > * Moses which is language translator >> > * Tesseract which is OCR engine >> > * Cassandra store for Whoosh >> > * wkhtmltopdf and wkhtmltoimage which are used for conversion of HTML >> > to PDF/Image >> > * etc >> > >> > Now when product is finished and in final testing phase, I can say >> > that we did not regret because we used PyPy and stack around it. >> > Typical speed improvement is 2x-3x over CPython in our case, but >> > anyway we are mostly IO and memory bound, expect for Celery workers >> > where we do analysis which are again many small CPU intensive tasks >> > that are exchanged via RabbitMQ. 
Another reason why we don't see >> > speedup us is that we are dependent on external software (servers) >> > written in Erlang and Java. >> > >> > I'm already planing to do Cassandra (distributed key/value only >> > database without index features), ZooKeeper, Redis and ElasticSearch >> > ports in Python for next projects, and hopefully opensource them. >> > >> > Regards, >> > Marko Tasic >> > _______________________________________________ >> > pypy-dev mailing list >> > pypy-dev at python.org >> > http://mail.python.org/mailman/listinfo/pypy-dev >> >> Awesome! >> >> I'm glad people can make pypy work for non-trivial tasks which require >> a lot of dependencies. We're trying to lower the bar, however it takes >> time. >> >> Cheers, >> fijal >> _______________________________________________ >> pypy-dev mailing list >> pypy-dev at python.org >> http://mail.python.org/mailman/listinfo/pypy-dev >> > > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > http://mail.python.org/mailman/listinfo/pypy-dev > > Hi, It might be off topic. I want to know whether pypy supports postgres. The last time I checked, the ctypes-based psycopg2 was still beta. I mainly use twisted & postgres. pypy supports twisted well, but psycopg2 not so well. Regards gelin yan -------------- next part -------------- An HTML attachment was scrubbed... URL: From konstantin.lopuhin at chtd.ru Thu Feb 7 16:28:38 2013 From: konstantin.lopuhin at chtd.ru (=?KOI8-R?B?68/T1NEg7M/Q1cjJzg==?=) Date: Thu, 7 Feb 2013 19:28:38 +0400 Subject: [pypy-dev] Great experience with PyPy In-Reply-To: References: Message-ID: PyPy supports postgres with either psycopg2cffi or psycopg2-ctypes bindings. We use psycopg2cffi in production (and maintain them), and here http://chtd.ru/blog/bystraya-rabota-s-postgres-pod-pypy/?lang=en are some benchmarks. And yes, PyPy is cool :) Typically giving 3x speedups, and some memory savings sometimes.
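For anyone wiring this up themselves, here is a minimal sketch of selecting the binding per interpreter. The `compat.register()` call is psycopg2cffi's documented way of installing itself under the name `psycopg2`; the helper functions are only an illustration of one way to structure this, not part of either library.

```python
import platform

def preferred_binding():
    """Name of the psycopg2 binding to try first on this interpreter."""
    if platform.python_implementation() == "PyPy":
        return "psycopg2cffi"   # cffi-based, plays well with PyPy's JIT
    return "psycopg2"           # the C extension is fine on CPython

def install_psycopg2():
    """Import the preferred binding and expose it as `psycopg2`."""
    if preferred_binding() == "psycopg2cffi":
        from psycopg2cffi import compat
        compat.register()       # documented hook: aliases psycopg2cffi as psycopg2
    import psycopg2             # now resolves to whichever binding was registered
    return psycopg2
```

After `install_psycopg2()` runs once at startup, existing code that does `import psycopg2` keeps working unchanged on both CPython and PyPy.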
2013/2/7 Gelin Yan : > > > On Thu, Feb 7, 2013 at 10:11 PM, Phyo Arkar > wrote: >> >> Pypy should have a page for "Success Stories!" >> >> Now with this and Quora proving Power of PyPy , i am beginning to start >> converting my projects into PyPy soon! >> I am only withholding right now because my projects uses a lot of C >> Libraries and Numpy/Matplotlib/scilit-learn. >> >> Thanks >> >> Phyo. >> >> On Thursday, February 7, 2013, Maciej Fijalkowski wrote: >>> >>> On Thu, Feb 7, 2013 at 1:55 PM, Marko Tasic wrote: >>> > Hi, >>> > >>> > I would like to share short story with you and share what we have >>> > accomplished with PyPy and its friends so far. >>> > >>> > Company that I have worked for last 7 months (intentionally unnamed) >>> > gave me absolute permission to pick up technologies on which we based >>> > our solution. What we do is: crawl for PDFs and newspapers articles, >>> > download, translate them if needed, OCR if needed, do extensive >>> > analysis of downloaded PDFs and articles, store them in more organized >>> > structures for faster querying, search for them and generate bunch of >>> > complex reports. >>> > >>> > From very beginning I decided to go with PyPy no matter what. What we >>> > picked is following: >>> > * Flask for web framework, and few of its extensions such as >>> > Flask-Login, Flask-Principal, Flask-WTF, Flask-Mail, etc. >>> > * Cassandra as database because of its features and great experience >>> > with it. PyCassa is used as client to talk to Cassandra server. >>> > * ElasticSearch as distributed search engine, and its client library >>> > pyes. >>> > * Whoosh as search engine, but with some modifications to support >>> > Cassandra as storage and distributed locking. >>> > * Redis, and its client library redis-py, for caching and to speed up >>> > common auto-completion patterns. 
>>> > * ZooKeeper, and its client library Kazoo, for distributed locking >>> > which plays essential role in system for transaction-like behavior >>> > over many services at once. >>> > * Celery in conjunction with RabbitMQ for task distribution. >>> > * Sentry for error logging. >>> > >>> > What we have developed on our own are wrappers and clients for: >>> > * Moses which is language translator >>> > * Tesseract which is OCR engine >>> > * Cassandra store for Whoosh >>> > * wkhtmltopdf and wkhtmltoimage which are used for conversion of HTML >>> > to PDF/Image >>> > * etc >>> > >>> > Now when product is finished and in final testing phase, I can say >>> > that we did not regret because we used PyPy and stack around it. >>> > Typical speed improvement is 2x-3x over CPython in our case, but >>> > anyway we are mostly IO and memory bound, expect for Celery workers >>> > where we do analysis which are again many small CPU intensive tasks >>> > that are exchanged via RabbitMQ. Another reason why we don't see >>> > speedup us is that we are dependent on external software (servers) >>> > written in Erlang and Java. >>> > >>> > I'm already planing to do Cassandra (distributed key/value only >>> > database without index features), ZooKeeper, Redis and ElasticSearch >>> > ports in Python for next projects, and hopefully opensource them. >>> > >>> > Regards, >>> > Marko Tasic >>> > _______________________________________________ >>> > pypy-dev mailing list >>> > pypy-dev at python.org >>> > http://mail.python.org/mailman/listinfo/pypy-dev >>> >>> Awesome! >>> >>> I'm glad people can make pypy work for non-trivial tasks which require >>> a lot of dependencies. We're trying to lower the bar, however it takes >>> time. 
>>> >>> Cheers, >>> fijal >>> _______________________________________________ >>> pypy-dev mailing list >>> pypy-dev at python.org >>> http://mail.python.org/mailman/listinfo/pypy-dev >> >> >> _______________________________________________ >> pypy-dev mailing list >> pypy-dev at python.org >> http://mail.python.org/mailman/listinfo/pypy-dev >> > > > Hi, It might be off topic. I want to know whether pypy support postgres. The > last time I noticed ctypes based psycopg2 was still beta. I mainly use > twisted & postgres. pypy supports twisted well but not good for psycopg2. > > Regards > > gelin yan > > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > http://mail.python.org/mailman/listinfo/pypy-dev > -- ?????????? ???????, ??????????? ???????? ??? -- http://chtd.ru +7 (495) 646-87-45, ?????????? 333 From skip at pobox.com Thu Feb 7 17:00:14 2013 From: skip at pobox.com (Skip Montanaro) Date: Thu, 7 Feb 2013 10:00:14 -0600 Subject: [pypy-dev] Cheetah on PyPy? In-Reply-To: References: Message-ID: > Thanks. I'll make that change and send it back upstream to the Cheetah folks. This worked as Amaury advertised. I send a note to the Cheetah mailing list (where they want bug reports apparently) with the one line unidiff. Hopefully this change will make it into a near-term release. Skip From dynamicgl at gmail.com Thu Feb 7 17:00:58 2013 From: dynamicgl at gmail.com (Gelin Yan) Date: Fri, 8 Feb 2013 00:00:58 +0800 Subject: [pypy-dev] Great experience with PyPy In-Reply-To: References: Message-ID: On Thu, Feb 7, 2013 at 11:28 PM, ????? ??????? wrote: > PyPy supports postgres with either psycopg2cffi or psycopg2-ctypes > bindings. We use psycopg2cffi in production (and maintain them), and > here http://chtd.ru/blog/bystraya-rabota-s-postgres-pod-pypy/?lang=en > are some benchmarks. > And yes, PyPy is cool :) Typically giving 3x speedups, and some memory > savings sometimes. 
> > 2013/2/7 Gelin Yan : > > > > > > On Thu, Feb 7, 2013 at 10:11 PM, Phyo Arkar > > wrote: > >> > >> Pypy should have a page for "Success Stories!" > >> > >> Now with this and Quora proving Power of PyPy , i am beginning to start > >> converting my projects into PyPy soon! > >> I am only withholding right now because my projects uses a lot of C > >> Libraries and Numpy/Matplotlib/scilit-learn. > >> > >> Thanks > >> > >> Phyo. > >> > >> On Thursday, February 7, 2013, Maciej Fijalkowski wrote: > >>> > >>> On Thu, Feb 7, 2013 at 1:55 PM, Marko Tasic > wrote: > >>> > Hi, > >>> > > >>> > I would like to share short story with you and share what we have > >>> > accomplished with PyPy and its friends so far. > >>> > > >>> > Company that I have worked for last 7 months (intentionally unnamed) > >>> > gave me absolute permission to pick up technologies on which we based > >>> > our solution. What we do is: crawl for PDFs and newspapers articles, > >>> > download, translate them if needed, OCR if needed, do extensive > >>> > analysis of downloaded PDFs and articles, store them in more > organized > >>> > structures for faster querying, search for them and generate bunch of > >>> > complex reports. > >>> > > >>> > From very beginning I decided to go with PyPy no matter what. What we > >>> > picked is following: > >>> > * Flask for web framework, and few of its extensions such as > >>> > Flask-Login, Flask-Principal, Flask-WTF, Flask-Mail, etc. > >>> > * Cassandra as database because of its features and great experience > >>> > with it. PyCassa is used as client to talk to Cassandra server. > >>> > * ElasticSearch as distributed search engine, and its client library > >>> > pyes. > >>> > * Whoosh as search engine, but with some modifications to support > >>> > Cassandra as storage and distributed locking. > >>> > * Redis, and its client library redis-py, for caching and to speed up > >>> > common auto-completion patterns. 
> >>> > * ZooKeeper, and its client library Kazoo, for distributed locking > >>> > which plays essential role in system for transaction-like behavior > >>> > over many services at once. > >>> > * Celery in conjunction with RabbitMQ for task distribution. > >>> > * Sentry for error logging. > >>> > > >>> > What we have developed on our own are wrappers and clients for: > >>> > * Moses which is language translator > >>> > * Tesseract which is OCR engine > >>> > * Cassandra store for Whoosh > >>> > * wkhtmltopdf and wkhtmltoimage which are used for conversion of HTML > >>> > to PDF/Image > >>> > * etc > >>> > > >>> > Now when product is finished and in final testing phase, I can say > >>> > that we did not regret because we used PyPy and stack around it. > >>> > Typical speed improvement is 2x-3x over CPython in our case, but > >>> > anyway we are mostly IO and memory bound, expect for Celery workers > >>> > where we do analysis which are again many small CPU intensive tasks > >>> > that are exchanged via RabbitMQ. Another reason why we don't see > >>> > speedup us is that we are dependent on external software (servers) > >>> > written in Erlang and Java. > >>> > > >>> > I'm already planing to do Cassandra (distributed key/value only > >>> > database without index features), ZooKeeper, Redis and ElasticSearch > >>> > ports in Python for next projects, and hopefully opensource them. > >>> > > >>> > Regards, > >>> > Marko Tasic > >>> > _______________________________________________ > >>> > pypy-dev mailing list > >>> > pypy-dev at python.org > >>> > http://mail.python.org/mailman/listinfo/pypy-dev > >>> > >>> Awesome! > >>> > >>> I'm glad people can make pypy work for non-trivial tasks which require > >>> a lot of dependencies. We're trying to lower the bar, however it takes > >>> time. 
> >>> > >>> Cheers, > >>> fijal > >>> _______________________________________________ > >>> pypy-dev mailing list > >>> pypy-dev at python.org > >>> http://mail.python.org/mailman/listinfo/pypy-dev > >> > >> > >> _______________________________________________ > >> pypy-dev mailing list > >> pypy-dev at python.org > >> http://mail.python.org/mailman/listinfo/pypy-dev > >> > > > > > > Hi, It might be off topic. I want to know whether pypy support postgres. > The > > last time I noticed ctypes based psycopg2 was still beta. I mainly use > > twisted & postgres. pypy supports twisted well but not good for psycopg2. > > > > Regards > > > > gelin yan > > > > _______________________________________________ > > pypy-dev mailing list > > pypy-dev at python.org > > http://mail.python.org/mailman/listinfo/pypy-dev > > > > > > -- > ?????????? ???????, ??????????? > ???????? ??? -- http://chtd.ru > +7 (495) 646-87-45, ?????????? 333 > Hi Glad to hear that. I will give it a try. By the way, Can i use it on windows? It looks like cffi support windows. Regards gelin yan -------------- next part -------------- An HTML attachment was scrubbed... URL: From kostia.lopuhin at gmail.com Thu Feb 7 17:08:51 2013 From: kostia.lopuhin at gmail.com (=?KOI8-R?B?68/T1NEg7M/Q1cjJzg==?=) Date: Thu, 7 Feb 2013 20:08:51 +0400 Subject: [pypy-dev] Great experience with PyPy In-Reply-To: References: Message-ID: Hi! I did not test it on Windows, there may be problems with installation (searching for postgres header files, the config is not very smart - https://github.com/chtd/psycopg2cffi/blob/master/psycopg2cffi/_impl/libpq.py#L209), but they should be solvable I hope - submit a bug if you have problems. 2013/2/7 Gelin Yan : > > > On Thu, Feb 7, 2013 at 11:28 PM, ????? ??????? > wrote: >> >> PyPy supports postgres with either psycopg2cffi or psycopg2-ctypes >> bindings. 
We use psycopg2cffi in production (and maintain them), and >> here http://chtd.ru/blog/bystraya-rabota-s-postgres-pod-pypy/?lang=en >> are some benchmarks. >> And yes, PyPy is cool :) Typically giving 3x speedups, and some memory >> savings sometimes. >> >> 2013/2/7 Gelin Yan : >> > >> > >> > On Thu, Feb 7, 2013 at 10:11 PM, Phyo Arkar >> > wrote: >> >> >> >> Pypy should have a page for "Success Stories!" >> >> >> >> Now with this and Quora proving Power of PyPy , i am beginning to start >> >> converting my projects into PyPy soon! >> >> I am only withholding right now because my projects uses a lot of C >> >> Libraries and Numpy/Matplotlib/scilit-learn. >> >> >> >> Thanks >> >> >> >> Phyo. >> >> >> >> On Thursday, February 7, 2013, Maciej Fijalkowski wrote: >> >>> >> >>> On Thu, Feb 7, 2013 at 1:55 PM, Marko Tasic >> >>> wrote: >> >>> > Hi, >> >>> > >> >>> > I would like to share short story with you and share what we have >> >>> > accomplished with PyPy and its friends so far. >> >>> > >> >>> > Company that I have worked for last 7 months (intentionally unnamed) >> >>> > gave me absolute permission to pick up technologies on which we >> >>> > based >> >>> > our solution. What we do is: crawl for PDFs and newspapers articles, >> >>> > download, translate them if needed, OCR if needed, do extensive >> >>> > analysis of downloaded PDFs and articles, store them in more >> >>> > organized >> >>> > structures for faster querying, search for them and generate bunch >> >>> > of >> >>> > complex reports. >> >>> > >> >>> > From very beginning I decided to go with PyPy no matter what. What >> >>> > we >> >>> > picked is following: >> >>> > * Flask for web framework, and few of its extensions such as >> >>> > Flask-Login, Flask-Principal, Flask-WTF, Flask-Mail, etc. >> >>> > * Cassandra as database because of its features and great experience >> >>> > with it. PyCassa is used as client to talk to Cassandra server. 
>> >>> > * ElasticSearch as distributed search engine, and its client library >> >>> > pyes. >> >>> > * Whoosh as search engine, but with some modifications to support >> >>> > Cassandra as storage and distributed locking. >> >>> > * Redis, and its client library redis-py, for caching and to speed >> >>> > up >> >>> > common auto-completion patterns. >> >>> > * ZooKeeper, and its client library Kazoo, for distributed locking >> >>> > which plays essential role in system for transaction-like behavior >> >>> > over many services at once. >> >>> > * Celery in conjunction with RabbitMQ for task distribution. >> >>> > * Sentry for error logging. >> >>> > >> >>> > What we have developed on our own are wrappers and clients for: >> >>> > * Moses which is language translator >> >>> > * Tesseract which is OCR engine >> >>> > * Cassandra store for Whoosh >> >>> > * wkhtmltopdf and wkhtmltoimage which are used for conversion of >> >>> > HTML >> >>> > to PDF/Image >> >>> > * etc >> >>> > >> >>> > Now when product is finished and in final testing phase, I can say >> >>> > that we did not regret because we used PyPy and stack around it. >> >>> > Typical speed improvement is 2x-3x over CPython in our case, but >> >>> > anyway we are mostly IO and memory bound, expect for Celery workers >> >>> > where we do analysis which are again many small CPU intensive tasks >> >>> > that are exchanged via RabbitMQ. Another reason why we don't see >> >>> > speedup us is that we are dependent on external software (servers) >> >>> > written in Erlang and Java. >> >>> > >> >>> > I'm already planing to do Cassandra (distributed key/value only >> >>> > database without index features), ZooKeeper, Redis and ElasticSearch >> >>> > ports in Python for next projects, and hopefully opensource them. 
>> >>> > >> >>> > Regards, >> >>> > Marko Tasic >> >>> > _______________________________________________ >> >>> > pypy-dev mailing list >> >>> > pypy-dev at python.org >> >>> > http://mail.python.org/mailman/listinfo/pypy-dev >> >>> >> >>> Awesome! >> >>> >> >>> I'm glad people can make pypy work for non-trivial tasks which require >> >>> a lot of dependencies. We're trying to lower the bar, however it takes >> >>> time. >> >>> >> >>> Cheers, >> >>> fijal >> >>> _______________________________________________ >> >>> pypy-dev mailing list >> >>> pypy-dev at python.org >> >>> http://mail.python.org/mailman/listinfo/pypy-dev >> >> >> >> >> >> _______________________________________________ >> >> pypy-dev mailing list >> >> pypy-dev at python.org >> >> http://mail.python.org/mailman/listinfo/pypy-dev >> >> >> > >> > >> > Hi, It might be off topic. I want to know whether pypy support postgres. >> > The >> > last time I noticed ctypes based psycopg2 was still beta. I mainly use >> > twisted & postgres. pypy supports twisted well but not good for >> > psycopg2. >> > >> > Regards >> > >> > gelin yan >> > >> > _______________________________________________ >> > pypy-dev mailing list >> > pypy-dev at python.org >> > http://mail.python.org/mailman/listinfo/pypy-dev >> > >> >> >> >> -- >> ?????????? ???????, ??????????? >> ???????? ??? -- http://chtd.ru >> +7 (495) 646-87-45, ?????????? 333 > > > Hi > > Glad to hear that. I will give it a try. By the way, Can i use it on > windows? It looks like cffi support windows. 
> > Regards > > gelin yan > > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > http://mail.python.org/mailman/listinfo/pypy-dev > From john.m.camara at gmail.com Thu Feb 7 20:00:19 2013 From: john.m.camara at gmail.com (John Camara) Date: Thu, 7 Feb 2013 14:00:19 -0500 Subject: [pypy-dev] Parallella open hardware platform In-Reply-To: References: Message-ID: Fijal, Whether someone works full time on a project is a separate issue. Being popular helps attract additional resources, and PyPy is a project that could use additional resources. How many additional optimizations could PyPy add to reach a level of optimization similar to, say, the JVM's? We are talking many man-years of work. How much additional work is it to develop and maintain backends for the various ARM, PPC, MIPS, etc. processors? How much work would it take to have PyPy support multi-cores? What if RPython needs to be significantly refactored or replaced? And we can go on and on. Typically every 10 years or so a new language becomes dominant, but that hasn't happened lately. Java had been in that role for quite some time, and for quite a few years it has been on the decline, yet no language has taken its place in terms of dominance. The main reason this hasn't happened so far is that no language has successfully dealt with the multi-core issue in a way that also keeps the other desirable features we currently have in popular languages. But at some point a language will prevail and become dominant, and when that happens there will be a mass migration to it. It doesn't mean that Python and other currently popular languages will just go away; it's just that their use will decline. If Python's popularity declines significantly, it will in turn impact PyPy. Also, many of the early adopters of PyPy are likely to move on to the new dominant language. So where does that leave you?
I expect you earn a living by doing PyPy consulting, and thus you need PyPy to be popular. Now, you don't have to believe that a new dominant language will emerge, but history says otherwise, and many have been fooled into thinking otherwise in the past. I feel PyPy is Python's best chance of surviving this change in language dominance, as it has the best chance of being able to do something about the multi-core situation. I'm glad you mentioned the web stack the other day; if you hadn't, I likely would not have thought of the PyPy hypervisor scenario. I'm starting to believe that approach may have some decent merit and could offer a way to kick the can down the road on the multi-core issues. I don't have the time to get into it right now, but I'll start a new thread on the topic, maybe within the next few days. John On Thu, Feb 7, 2013 at 4:33 AM, Maciej Fijalkowski wrote: > On Thu, Feb 7, 2013 at 6:41 AM, John Camara > wrote: > > Fijal, > > In the past you have complained about it being hard to make money in open > > source. One way to make it easier for you is grow the popularity of PyPy. > > So I would think you would at least have some interest in thinking of > ways > > to accomplish that. > Before even reading further - how is being popular making money? Being > popular is being popular. Do you know any CPython developer working > full time on CPython? CPython is definitely popular by my standards -------------- next part -------------- An HTML attachment was scrubbed... URL: From mtasic85 at gmail.com Fri Feb 8 12:22:25 2013 From: mtasic85 at gmail.com (Marko Tasic) Date: Fri, 8 Feb 2013 12:22:25 +0100 Subject: [pypy-dev] Great experience with PyPy In-Reply-To: References: Message-ID: Thanks everyone for the support. @fijal In one of my previous emails, I already told you that I'll be using pypy for real-life problems on medium to large scale projects.
It is also very hard convincing companies that they should invest money in open source, but as long as they contribute to open source projects, I'm also satisfied in some way. Anyway, I prefer money ;) @carl I'm very bad at writing blog posts, but I would like to explain in email what we have done, what obstacles we have faced, and how we solved them. @armin Because I don't care about speed (I already have plenty of CPU cores not used all the time), and I only care about correctness and maintainability of code, your STM will fit our requirements perfectly. As far as I know, every developer working on a serious large-scale project, after going over your STM descriptions (emails and blogs), gives me the same answer about it: that it is the perfect solution for "per machine" concurrent programming. As long as I have the freedom to pick technologies, I will definitely rely on it in one of the next projects. What is the status of it ATM, and what is the best way to test and deploy pypy with stm? Regards, Marko Tasic On Thu, Feb 7, 2013 at 5:08 PM, ????? ??????? wrote: > Hi! I did not test it on Windows, there may be problems with > installation (searching for postgres header files, the config is not > very smart - https://github.com/chtd/psycopg2cffi/blob/master/psycopg2cffi/_impl/libpq.py#L209), > but they should be solvable I hope - submit a bug if you have > problems. > > 2013/2/7 Gelin Yan : >> >> >> On Thu, Feb 7, 2013 at 11:28 PM, ????? ??????? >> wrote: >>> >>> PyPy supports postgres with either psycopg2cffi or psycopg2-ctypes >>> bindings. We use psycopg2cffi in production (and maintain them), and >>> here http://chtd.ru/blog/bystraya-rabota-s-postgres-pod-pypy/?lang=en >>> are some benchmarks. >>> And yes, PyPy is cool :) Typically giving 3x speedups, and some memory >>> savings sometimes. >>> >>> 2013/2/7 Gelin Yan : >>> > >>> > >>> > On Thu, Feb 7, 2013 at 10:11 PM, Phyo Arkar >>> > wrote: >>> >> >>> >> Pypy should have a page for "Success Stories!"
>>> >> >>> >> Now with this and Quora proving Power of PyPy , i am beginning to start >>> >> converting my projects into PyPy soon! >>> >> I am only withholding right now because my projects uses a lot of C >>> >> Libraries and Numpy/Matplotlib/scilit-learn. >>> >> >>> >> Thanks >>> >> >>> >> Phyo. >>> >> >>> >> On Thursday, February 7, 2013, Maciej Fijalkowski wrote: >>> >>> >>> >>> On Thu, Feb 7, 2013 at 1:55 PM, Marko Tasic >>> >>> wrote: >>> >>> > Hi, >>> >>> > >>> >>> > I would like to share short story with you and share what we have >>> >>> > accomplished with PyPy and its friends so far. >>> >>> > >>> >>> > Company that I have worked for last 7 months (intentionally unnamed) >>> >>> > gave me absolute permission to pick up technologies on which we >>> >>> > based >>> >>> > our solution. What we do is: crawl for PDFs and newspapers articles, >>> >>> > download, translate them if needed, OCR if needed, do extensive >>> >>> > analysis of downloaded PDFs and articles, store them in more >>> >>> > organized >>> >>> > structures for faster querying, search for them and generate bunch >>> >>> > of >>> >>> > complex reports. >>> >>> > >>> >>> > From very beginning I decided to go with PyPy no matter what. What >>> >>> > we >>> >>> > picked is following: >>> >>> > * Flask for web framework, and few of its extensions such as >>> >>> > Flask-Login, Flask-Principal, Flask-WTF, Flask-Mail, etc. >>> >>> > * Cassandra as database because of its features and great experience >>> >>> > with it. PyCassa is used as client to talk to Cassandra server. >>> >>> > * ElasticSearch as distributed search engine, and its client library >>> >>> > pyes. >>> >>> > * Whoosh as search engine, but with some modifications to support >>> >>> > Cassandra as storage and distributed locking. >>> >>> > * Redis, and its client library redis-py, for caching and to speed >>> >>> > up >>> >>> > common auto-completion patterns. 
>>> >>> > * ZooKeeper, and its client library Kazoo, for distributed locking >>> >>> > which plays essential role in system for transaction-like behavior >>> >>> > over many services at once. >>> >>> > * Celery in conjunction with RabbitMQ for task distribution. >>> >>> > * Sentry for error logging. >>> >>> > >>> >>> > What we have developed on our own are wrappers and clients for: >>> >>> > * Moses which is language translator >>> >>> > * Tesseract which is OCR engine >>> >>> > * Cassandra store for Whoosh >>> >>> > * wkhtmltopdf and wkhtmltoimage which are used for conversion of >>> >>> > HTML >>> >>> > to PDF/Image >>> >>> > * etc >>> >>> > >>> >>> > Now when product is finished and in final testing phase, I can say >>> >>> > that we did not regret because we used PyPy and stack around it. >>> >>> > Typical speed improvement is 2x-3x over CPython in our case, but >>> >>> > anyway we are mostly IO and memory bound, expect for Celery workers >>> >>> > where we do analysis which are again many small CPU intensive tasks >>> >>> > that are exchanged via RabbitMQ. Another reason why we don't see >>> >>> > speedup us is that we are dependent on external software (servers) >>> >>> > written in Erlang and Java. >>> >>> > >>> >>> > I'm already planing to do Cassandra (distributed key/value only >>> >>> > database without index features), ZooKeeper, Redis and ElasticSearch >>> >>> > ports in Python for next projects, and hopefully opensource them. >>> >>> > >>> >>> > Regards, >>> >>> > Marko Tasic >>> >>> > _______________________________________________ >>> >>> > pypy-dev mailing list >>> >>> > pypy-dev at python.org >>> >>> > http://mail.python.org/mailman/listinfo/pypy-dev >>> >>> >>> >>> Awesome! >>> >>> >>> >>> I'm glad people can make pypy work for non-trivial tasks which require >>> >>> a lot of dependencies. We're trying to lower the bar, however it takes >>> >>> time. 
>>> >>> >>> >>> Cheers, >>> >>> fijal >>> >>> _______________________________________________ >>> >>> pypy-dev mailing list >>> >>> pypy-dev at python.org >>> >>> http://mail.python.org/mailman/listinfo/pypy-dev >>> >> >>> >> >>> >> _______________________________________________ >>> >> pypy-dev mailing list >>> >> pypy-dev at python.org >>> >> http://mail.python.org/mailman/listinfo/pypy-dev >>> >> >>> > >>> > >>> > Hi, It might be off topic. I want to know whether pypy support postgres. >>> > The >>> > last time I noticed ctypes based psycopg2 was still beta. I mainly use >>> > twisted & postgres. pypy supports twisted well but not good for >>> > psycopg2. >>> > >>> > Regards >>> > >>> > gelin yan >>> > >>> > _______________________________________________ >>> > pypy-dev mailing list >>> > pypy-dev at python.org >>> > http://mail.python.org/mailman/listinfo/pypy-dev >>> > >>> >>> >>> >>> -- >>> ?????????? ???????, ??????????? >>> ???????? ??? -- http://chtd.ru >>> +7 (495) 646-87-45, ?????????? 333 >> >> >> Hi >> >> Glad to hear that. I will give it a try. By the way, Can i use it on >> windows? It looks like cffi support windows. >> >> Regards >> >> gelin yan >> >> _______________________________________________ >> pypy-dev mailing list >> pypy-dev at python.org >> http://mail.python.org/mailman/listinfo/pypy-dev >> > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > http://mail.python.org/mailman/listinfo/pypy-dev From yellowsq at hotmail.com Fri Feb 8 21:37:47 2013 From: yellowsq at hotmail.com (Yellow Sq) Date: Fri, 8 Feb 2013 20:37:47 +0000 Subject: [pypy-dev] Pypy's parser Message-ID: Hi. Short question: It says at http://doc.pypy.org/en/latest/parser.html that Pypy's parser is a recursive descent one. But following the content on that page it actually seems that the parser is a table-based LL. Is this perhaps out-dated and did Pypy had a different parser at some point? 
If so, what were the reasons that triggered the change? Is there a list of projects (either from the industry or the academy) using Pypy? Thx. -------------- next part -------------- An HTML attachment was scrubbed... URL: From alex.gaynor at gmail.com Fri Feb 8 21:41:16 2013 From: alex.gaynor at gmail.com (Alex Gaynor) Date: Fri, 8 Feb 2013 12:41:16 -0800 Subject: [pypy-dev] Pypy's parser In-Reply-To: References: Message-ID: Yes, that page is wrong, the parser is an LL table parser. I don't think we have an official list of projects using PyPy anywhere. Alex On Fri, Feb 8, 2013 at 12:37 PM, Yellow Sq wrote: > Hi. > > Short question: It says at http://doc.pypy.org/en/latest/parser.html that > Pypy's parser is a recursive descent one. But following the content on that > page it actually seems that the parser is a table-based LL. Is this perhaps > out-dated and did Pypy had a different parser at some point? If so, what > were the reasons that triggered the change? > > Is there a list of projects (either from the industry or the academy) > using Pypy? > > Thx. > > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > http://mail.python.org/mailman/listinfo/pypy-dev > > -- "I disapprove of what you say, but I will defend to the death your right to say it." -- Evelyn Beatrice Hall (summarizing Voltaire) "The people's good is the highest law." -- Cicero -------------- next part -------------- An HTML attachment was scrubbed... URL: From arigo at tunes.org Sun Feb 10 15:43:33 2013 From: arigo at tunes.org (Armin Rigo) Date: Sun, 10 Feb 2013 15:43:33 +0100 Subject: [pypy-dev] Great experience with PyPy In-Reply-To: References: Message-ID: Hi Marko, On Fri, Feb 8, 2013 at 12:22 PM, Marko Tasic wrote: > What is the status of it ATM, and what is best way to test > and deploy pypy with stm? The STM project progressed slowly during the last few months. 
The status right now is:

* Most importantly, missing major Garbage Collection cycles, which means
pypy-stm slowly but constantly leaks memory.
* The JIT integration is not finished; so far pypy-stm can only be
compiled without the JIT.
* There are also other places where the performance can be improved,
probably a lot.
* Finally, there are a number of usability concerns that we (or mostly
Remi) worked on recently. The main issues turn around the idea that, as a
user of pypy-stm, you should have a way to get feedback on the process.
For example, right now transactions that abort are completely transparent
--- to the point that you don't have any way to know that one occurred,
apart from "it runs too slowly" if it occurs a lot. You should have a way
to get Python tracebacks of aborts if you want to. A similar issue is
"inevitable" transactions.

A bientôt,

Armin.

From matti.picus at gmail.com Mon Feb 11 06:16:32 2013
From: matti.picus at gmail.com (Matti Picus)
Date: Mon, 11 Feb 2013 07:16:32 +0200
Subject: [pypy-dev] gcc warnings / errors in translation
Message-ID: <51187EB0.8060507@gmail.com>

Warning: the stdio output of a translate is a very large webpage.

I wondered why the jit-benchmark-linux-x86-64 tests were failing to
translate, whereas pypy-c-jit-linux-x86-64 passed. For instance, the
failed build on tannit64:
http://buildbot.pypy.org/builders/jit-benchmark-linux-x86-64/builds/581/steps/translate/logs/stdio
versus the successful build on allegro64:
http://buildbot.pypy.org/builders/pypy-c-jit-linux-x86-64/builds/1257/steps/translate/logs/stdio

Two things struck me from the failure of the make command:
- Could the difference be environmental gcc flags making warnings into
errors on tannit64?
- There are a lot of warnings, and some of them seem important. Looking
back into history, we seem to have gotten worse with warnings.
I went back a bit, say to one of the release-2.0-beta1 builds, there are
still tons of gcc warnings, but fewer:
http://buildbot.pypy.org/builders/pypy-c-jit-linux-x86-64/builds/1115/steps/translate/logs/stdio

Anyone feel like taking a look?

Matti

From arigo at tunes.org Mon Feb 11 09:39:51 2013
From: arigo at tunes.org (Armin Rigo)
Date: Mon, 11 Feb 2013 09:39:51 +0100
Subject: [pypy-dev] gcc warnings / errors in translation
In-Reply-To: <51187EB0.8060507@gmail.com>
References: <51187EB0.8060507@gmail.com>
Message-ID: 

Hi Matti,

On Mon, Feb 11, 2013 at 6:16 AM, Matti Picus wrote:
> http://buildbot.pypy.org/builders/jit-benchmark-linux-x86-64/builds/581/steps/translate/logs/stdio

Ah well, just Yet Another Intel assembler operation that asmgcc
doesn't know about. The fix is trivial (done).

It doesn't mean that looking at warnings is not a good idea; it should
be done at some point too.

A bientôt,

Armin.

From fijall at gmail.com Mon Feb 11 09:43:49 2013
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Mon, 11 Feb 2013 10:43:49 +0200
Subject: [pypy-dev] gcc warnings / errors in translation
In-Reply-To: 
References: <51187EB0.8060507@gmail.com>
Message-ID: 

This: warning: array subscript is above array bounds [-Warray-bounds]

Sounds like it's never correct. Should we pass -Wno-array-bounds?

On Mon, Feb 11, 2013 at 10:39 AM, Armin Rigo wrote:
> Hi Matti,
>
> On Mon, Feb 11, 2013 at 6:16 AM, Matti Picus wrote:
>> http://buildbot.pypy.org/builders/jit-benchmark-linux-x86-64/builds/581/steps/translate/logs/stdio
>
> Ah well, just Yet Another Intel assembler operation that asmgcc
> doesn't know about. The fix is trivial (done).
>
> It doesn't mean that looking at warnings is not a good idea; it should
> be done at some point too.
>
>
> A bientôt,
>
> Armin.
> _______________________________________________
> pypy-dev mailing list
> pypy-dev at python.org
> http://mail.python.org/mailman/listinfo/pypy-dev

From arigo at tunes.org Mon Feb 11 09:57:06 2013
From: arigo at tunes.org (Armin Rigo)
Date: Mon, 11 Feb 2013 09:57:06 +0100
Subject: [pypy-dev] win32 own test failures
In-Reply-To: 
References: <51117AD1.7060609@gmail.com> <51118D06.6000908@gmail.com>
Message-ID: 

Hi all,

On Wed, Feb 6, 2013 at 12:26 AM, Maciej Fijalkowski wrote:
> it's probably already fixed on jitframe-on-heap which we aim to merge

Just answering this mail for the records: yes, on windows these tests
pass on jitframe-on-heap.

Armin

From fijall at gmail.com Mon Feb 11 09:59:01 2013
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Mon, 11 Feb 2013 10:59:01 +0200
Subject: [pypy-dev] win32 own test failures
In-Reply-To: 
References: <51117AD1.7060609@gmail.com> <51118D06.6000908@gmail.com>
Message-ID: 

On Mon, Feb 11, 2013 at 10:57 AM, Armin Rigo wrote:
> Hi all,
>
> On Wed, Feb 6, 2013 at 12:26 AM, Maciej Fijalkowski wrote:
>> it's probably already fixed on jitframe-on-heap which we aim to merge
>
> Just answering this mail for the records: yes, on windows these tests
> pass on jitframe-on-heap.

How so? The 32bit support is by far not done

>
>
> Armin

From arigo at tunes.org Mon Feb 11 10:10:08 2013
From: arigo at tunes.org (Armin Rigo)
Date: Mon, 11 Feb 2013 10:10:08 +0100
Subject: [pypy-dev] gcc warnings / errors in translation
In-Reply-To: 
References: <51187EB0.8060507@gmail.com>
Message-ID: 

Hi,

On Mon, Feb 11, 2013 at 9:43 AM, Maciej Fijalkowski wrote:
> This: warning: array subscript is above array bounds [-Warray-bounds]
>
> Sounds like it's never correct. Should we pass -Wno-array-bounds?

Ah, indeed. We declare most arrays as "itemtype x[1]", so gcc complains
when it can prove we do accesses at an index > 0.

A bientôt,

Armin.
From arigo at tunes.org Mon Feb 11 10:11:51 2013 From: arigo at tunes.org (Armin Rigo) Date: Mon, 11 Feb 2013 10:11:51 +0100 Subject: [pypy-dev] win32 own test failures In-Reply-To: References: <51117AD1.7060609@gmail.com> <51118D06.6000908@gmail.com> Message-ID: Hi Fijal, On Mon, Feb 11, 2013 at 9:59 AM, Maciej Fijalkowski wrote: > How so? The 32bit support is by far not done Dunno? These two tests (from "test_basic.py -k test_float") also pass when running on linux32 fwiw. A bient?t, Armin. From estama at gmail.com Mon Feb 11 16:48:22 2013 From: estama at gmail.com (Eleytherios Stamatogiannakis) Date: Mon, 11 Feb 2013 17:48:22 +0200 Subject: [pypy-dev] Unicode encode/decode speed In-Reply-To: References: <51117AD1.7060609@gmail.com> <51118D06.6000908@gmail.com> Message-ID: <511912C6.5000201@gmail.com> Hi, We have been following the nightly builds of PyPy, with our testing workload (first described in the "CFFI speed results" thread). The news are very good. The performance of PyPy + CFFI has gone up considerably (~30% faster) since the last time we wrote about it! By adding on that speed up also our optimizations of the CFFI based SQLite3 wrapper (MSPW) that we are developing, the end result is that most of our test queries are at the same speed or faster than CPython + APSW now. Unfortunately, one of the queries where PyPy is slower [*] than CPython + APSW, is very central to all of our workflows, which means that we cannot fully convert to using PyPy. The main culprit of PyPy's slowness is the conversion (encoding, decoding) from PyPy's unicodes to UTF-8. It is the only thing, with a big percentage (~48%), remaining at the top of our performance profiles . Right now we are using PyPy's "codecs.utf_8_encode" and "codecs.utf_8_decode" to do this conversion. It there a faster way to do these conversions (encoding, decoding) in PyPy? Does CPython do something more clever than PyPY, like storing unicodes with full ASCII char content, in an ASCII representation? 
Thank you very much,

lefteris.

[*] For 1M rows:
CPython + APSW: 10.5 sec
PyPy + MSPW: 15.5 sec

From amauryfa at gmail.com Mon Feb 11 17:13:58 2013
From: amauryfa at gmail.com (Amaury Forgeot d'Arc)
Date: Mon, 11 Feb 2013 17:13:58 +0100
Subject: [pypy-dev] Unicode encode/decode speed
In-Reply-To: <511912C6.5000201@gmail.com>
References: <51117AD1.7060609@gmail.com> <51118D06.6000908@gmail.com> <511912C6.5000201@gmail.com>
Message-ID: 

2013/2/11 Eleytherios Stamatogiannakis 

> Right now we are using PyPy's "codecs.utf_8_encode" and
> "codecs.utf_8_decode" to do this conversion.
>
It's the most direct way to use the utf-8 conversion functions.
> > It there a faster way to do these conversions (encoding, decoding) > in PyPy? Does CPython do something more clever than PyPY, like > storing unicodes with full ASCII char content, in an ASCII > representation? > > > Over years, utf-8 conversions have been heavily optimized in CPython: > allocate short buffers on the stack, use aligned reads, quick check for > ascii-only content (data & 0x80808080)... > All things that pypy does not. > > But I tried some "timeit" runs, and pypy is often faster that CPython, > and never much slower. This is odd. Maybe APSW uses some other CPython conversion API? Because the conversion overhead is not visible on CPython + APSW profiles. > Do your strings have many non-ascii characters? > what's the len(utf8)/len(unicode) ratio? > Our current tests, are using plain ASCII input (imported into sqlite3) which: - Go from sqlite3 (UTF-8) -> PyPy (unicode) - PyPy (unicode) -> sqlite3 (UTF-8). So i guess the len(utf-8)/len(unicode) = 1/4 (assuming 1 byte per char for ASCII (UTF-8) and 4 bytes per char for PyPy's unicode storage) l. From amauryfa at gmail.com Mon Feb 11 18:14:20 2013 From: amauryfa at gmail.com (Amaury Forgeot d'Arc) Date: Mon, 11 Feb 2013 18:14:20 +0100 Subject: [pypy-dev] Unicode encode/decode speed In-Reply-To: <5119240C.2000209@gmail.com> References: <51117AD1.7060609@gmail.com> <51118D06.6000908@gmail.com> <511912C6.5000201@gmail.com> <5119240C.2000209@gmail.com> Message-ID: 2013/2/11 Eleytherios Stamatogiannakis > On 11/02/13 18:13, Amaury Forgeot d'Arc wrote: > >> >> 2013/2/11 Eleytherios Stamatogiannakis > > >> >> >> Right now we are using PyPy's "codecs.utf_8_encode" and >> "codecs.utf_8_decode" to do this conversion. >> >> >> It's the most direct way to use the utf-8 conversion functions. >> >> It there a faster way to do these conversions (encoding, decoding) >> in PyPy? Does CPython do something more clever than PyPY, like >> storing unicodes with full ASCII char content, in an ASCII >> representation? 
>> >> >> Over years, utf-8 conversions have been heavily optimized in CPython: >> allocate short buffers on the stack, use aligned reads, quick check for >> ascii-only content (data & 0x80808080)... >> All things that pypy does not. >> >> But I tried some "timeit" runs, and pypy is often faster that CPython, >> and never much slower. >> > > This is odd. Maybe APSW uses some other CPython conversion API? Because > the conversion overhead is not visible on CPython + APSW profiles. Which kind of profiler are you using? It possible that CPython builtin functions are not profiled the same way as PyPy's. > Do your strings have many non-ascii characters? >> what's the len(utf8)/len(unicode) ratio? >> >> > Our current tests, are using plain ASCII input (imported into sqlite3) > which: > > - Go from sqlite3 (UTF-8) -> PyPy (unicode) > - PyPy (unicode) -> sqlite3 (UTF-8). > > So i guess the len(utf-8)/len(unicode) = 1/4 > (assuming 1 byte per char for ASCII (UTF-8) and 4 bytes per char for > PyPy's unicode storage) > No, my question was about the number of non-ascii characters: s = u"SomeUnicodeString" 1.0 * len(s.encode('utf8')) / len(s) PyPy allocates the StringBuffer upfront, and must realloc to cope with multibytes characters. For English text, ratio is 1.0; for Greek, it will be close to 2.0. -- Amaury Forgeot d'Arc -------------- next part -------------- An HTML attachment was scrubbed... URL: From estama at gmail.com Mon Feb 11 18:36:23 2013 From: estama at gmail.com (Eleytherios Stamatogiannakis) Date: Mon, 11 Feb 2013 19:36:23 +0200 Subject: [pypy-dev] Unicode encode/decode speed In-Reply-To: References: <51117AD1.7060609@gmail.com> <51118D06.6000908@gmail.com> <511912C6.5000201@gmail.com> <5119240C.2000209@gmail.com> Message-ID: <51192C17.7060907@gmail.com> On 11/02/13 19:14, Amaury Forgeot d'Arc wrote: > > > 2013/2/11 Eleytherios Stamatogiannakis > > > On 11/02/13 18:13, Amaury Forgeot d'Arc wrote: >... > > Which kind of profiler are you using? 
It possible that CPython builtin > functions are not profiled the same way as PyPy's. lsprofcalltree.py . From APSW's source code, i think that it uses this API: (in cursor.c) PyUnicode_DecodeUTF8 Maybe lsprofcalltree doesn't profile it? > > No, my question was about the number of non-ascii characters: > s = u"SomeUnicodeString" > 1.0 * len(s.encode('utf8')) / len(s) > PyPy allocates the StringBuffer upfront, and must realloc to cope with > multibytes characters. > For English text, ratio is 1.0; for Greek, it will be close to 2.0. > All of our tests use only plain English ASCII chars (converted to unicode). So the ratio is 1.0 . l. From amauryfa at gmail.com Mon Feb 11 18:39:58 2013 From: amauryfa at gmail.com (Amaury Forgeot d'Arc) Date: Mon, 11 Feb 2013 18:39:58 +0100 Subject: [pypy-dev] Unicode encode/decode speed In-Reply-To: <51192C17.7060907@gmail.com> References: <51117AD1.7060609@gmail.com> <51118D06.6000908@gmail.com> <511912C6.5000201@gmail.com> <5119240C.2000209@gmail.com> <51192C17.7060907@gmail.com> Message-ID: 2013/2/11 Eleytherios Stamatogiannakis > > > > Which kind of profiler are you using? It possible that CPython builtin > > functions are not profiled the same way as PyPy's. > > lsprofcalltree.py . > > From APSW's source code, i think that it uses this API: > > (in cursor.c) > PyUnicode_DecodeUTF8 > > Maybe lsprofcalltree doesn't profile it? Indeed. CPU cost is hidden in the cursor method. -- Amaury Forgeot d'Arc -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From fijall at gmail.com Mon Feb 11 21:03:29 2013 From: fijall at gmail.com (Maciej Fijalkowski) Date: Mon, 11 Feb 2013 22:03:29 +0200 Subject: [pypy-dev] Unicode encode/decode speed In-Reply-To: References: <51117AD1.7060609@gmail.com> <51118D06.6000908@gmail.com> <511912C6.5000201@gmail.com> <5119240C.2000209@gmail.com> <51192C17.7060907@gmail.com> Message-ID: On Mon, Feb 11, 2013 at 7:39 PM, Amaury Forgeot d'Arc wrote: > 2013/2/11 Eleytherios Stamatogiannakis >> >> > >> > Which kind of profiler are you using? It possible that CPython builtin >> > functions are not profiled the same way as PyPy's. >> >> lsprofcalltree.py . >> >> From APSW's source code, i think that it uses this API: >> >> (in cursor.c) >> PyUnicode_DecodeUTF8 >> >> Maybe lsprofcalltree doesn't profile it? > > > Indeed. CPU cost is hidden in the cursor method. > > > -- > Amaury Forgeot d'Arc > > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > http://mail.python.org/mailman/listinfo/pypy-dev > I would suggest using valgrind. It's a very good (albeit very slow) tool for seeing C-level performance. I remember seeing it both for CPython and PyPy when trying. Can you try yourself? From alex.gaynor at gmail.com Mon Feb 11 21:25:01 2013 From: alex.gaynor at gmail.com (Alex Gaynor) Date: Mon, 11 Feb 2013 12:25:01 -0800 Subject: [pypy-dev] Unicode encode/decode speed In-Reply-To: References: <51117AD1.7060609@gmail.com> <51118D06.6000908@gmail.com> <511912C6.5000201@gmail.com> <5119240C.2000209@gmail.com> <51192C17.7060907@gmail.com> Message-ID: I've also heard great things about `perf` if you're on Linux. Alex On Mon, Feb 11, 2013 at 12:03 PM, Maciej Fijalkowski wrote: > On Mon, Feb 11, 2013 at 7:39 PM, Amaury Forgeot d'Arc > wrote: > > 2013/2/11 Eleytherios Stamatogiannakis > >> > >> > > >> > Which kind of profiler are you using? It possible that CPython builtin > >> > functions are not profiled the same way as PyPy's. > >> > >> lsprofcalltree.py . 
> >> > >> From APSW's source code, i think that it uses this API: > >> > >> (in cursor.c) > >> PyUnicode_DecodeUTF8 > >> > >> Maybe lsprofcalltree doesn't profile it? > > > > > > Indeed. CPU cost is hidden in the cursor method. > > > > > > -- > > Amaury Forgeot d'Arc > > > > _______________________________________________ > > pypy-dev mailing list > > pypy-dev at python.org > > http://mail.python.org/mailman/listinfo/pypy-dev > > > > I would suggest using valgrind. It's a very good (albeit very slow) > tool for seeing C-level performance. I remember seeing it both for > CPython and PyPy when trying. Can you try yourself? > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > http://mail.python.org/mailman/listinfo/pypy-dev > -- "I disapprove of what you say, but I will defend to the death your right to say it." -- Evelyn Beatrice Hall (summarizing Voltaire) "The people's good is the highest law." -- Cicero -------------- next part -------------- An HTML attachment was scrubbed... URL: From estama at gmail.com Tue Feb 12 00:24:03 2013 From: estama at gmail.com (Elefterios Stamatogiannakis) Date: Tue, 12 Feb 2013 01:24:03 +0200 Subject: [pypy-dev] Unicode encode/decode speed In-Reply-To: References: <51117AD1.7060609@gmail.com> <51118D06.6000908@gmail.com> <511912C6.5000201@gmail.com> <5119240C.2000209@gmail.com> <51192C17.7060907@gmail.com> Message-ID: <51197D93.2050201@gmail.com> On 11/2/2013 7:39 ??, Amaury Forgeot d'Arc wrote: > 2013/2/11 Eleytherios Stamatogiannakis > > > > > > Which kind of profiler are you using? It possible that CPython > builtin > > functions are not profiled the same way as PyPy's. > > lsprofcalltree.py . > > From APSW's source code, i think that it uses this API: > > (in cursor.c) > PyUnicode_DecodeUTF8 > > Maybe lsprofcalltree doesn't profile it? > > > Indeed. CPU cost is hidden in the cursor method. 
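The profiler blind spot discussed in this exchange is easy to probe for any given tool. A hedged sketch using the stdlib profiler (payload and call counts are made up for illustration; whether a C-implemented codec gets its own entry is exactly what varies between profilers):

```python
import cProfile
import codecs
import io
import pstats

data = b"some ascii payload " * 50000

def hot_path():
    # The conversion cost we want the profiler to attribute somewhere.
    for _ in range(20):
        codecs.utf_8_decode(data)

profiler = cProfile.Profile()
profiler.enable()
hot_path()
profiler.disable()

stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(10)
report = stream.getvalue()
print(report)
# If utf_8_decode has no entry of its own in the report, its cost is being
# folded into the caller -- the same effect described above for
# PyUnicode_DecodeUTF8 hiding inside the cursor method.
```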
Thanks Amaury for looking into this.

Assuming that PyPy's "codecs.utf_8_decode" is slower when used with CFFI
than PyUnicode_DecodeUTF8 is in CPython, is there anything that can be
done in CFFI that would have the same performance as PyUnicode_DecodeUTF8
(and the same for encode)?

l.

From mail at justinbogner.com Tue Feb 12 05:57:23 2013
From: mail at justinbogner.com (Justin Bogner)
Date: Mon, 11 Feb 2013 21:57:23 -0700
Subject: [pypy-dev] gcc warnings / errors in translation
In-Reply-To: (Armin Rigo's message of "Mon, 11 Feb 2013 10:10:08 +0100")
References: <51187EB0.8060507@gmail.com>
Message-ID: <87txpi164s.fsf@justinbogner.com>

Armin Rigo writes:
> Ah, indeed. We declare most arrays as "itemtype x[1]", so gcc complains
> when it can prove we do accesses at an index > 0.

Is there a good reason not to use the C99 "itemtype x[]" or even the old
GCC extension "itemtype x[0]"? These won't trigger this warning, which
means we could leave it on in case a legitimate case crops up.

As far as I know, the only noticeable difference between [], [0], and
[1] for flexible arrays is that sizeof has different semantics, but
that's usually not a big deal.

From amauryfa at gmail.com Tue Feb 12 08:13:35 2013
From: amauryfa at gmail.com (Amaury Forgeot d'Arc)
Date: Tue, 12 Feb 2013 08:13:35 +0100
Subject: [pypy-dev] gcc warnings / errors in translation
In-Reply-To: <87txpi164s.fsf@justinbogner.com>
References: <51187EB0.8060507@gmail.com> <87txpi164s.fsf@justinbogner.com>
Message-ID: 

2013/2/12 Justin Bogner 

> Armin Rigo writes:
> > Ah, indeed. We declare most arrays as "itemtype x[1]", so gcc complains
> > when it can prove we do accesses at an index > 0.
>
> Is there a good reason not to use the C99 "itemtype x[]" or even the old
> GCC extension "itemtype x[0]"? These won't trigger this warning, which
> means we could leave it on in case a legitimate case crops up.
It seems that Microsoft compilers also support this extension: http://msdn.microsoft.com/en-us/library/b6fae073(v=vs.71).aspx -- Amaury Forgeot d'Arc -------------- next part -------------- An HTML attachment was scrubbed... URL: From amauryfa at gmail.com Tue Feb 12 08:47:36 2013 From: amauryfa at gmail.com (Amaury Forgeot d'Arc) Date: Tue, 12 Feb 2013 08:47:36 +0100 Subject: [pypy-dev] Unicode encode/decode speed In-Reply-To: <51197D93.2050201@gmail.com> References: <51117AD1.7060609@gmail.com> <51118D06.6000908@gmail.com> <511912C6.5000201@gmail.com> <5119240C.2000209@gmail.com> <51192C17.7060907@gmail.com> <51197D93.2050201@gmail.com> Message-ID: 2013/2/12 Elefterios Stamatogiannakis > On 11/2/2013 7:39 ??, Amaury Forgeot d'Arc wrote: > >> 2013/2/11 Eleytherios Stamatogiannakis > > >> >> >> > >> > Which kind of profiler are you using? It possible that CPython >> builtin >> > functions are not profiled the same way as PyPy's. >> >> lsprofcalltree.py . >> >> From APSW's source code, i think that it uses this API: >> >> (in cursor.c) >> PyUnicode_DecodeUTF8 >> >> Maybe lsprofcalltree doesn't profile it? >> >> >> Indeed. CPU cost is hidden in the cursor method. >> > > Thanks Amaury for looking into this, > > Assuming that PyPy's "codecs.utf_8_decode" is slower when used with CFFI > than using PyUnicode_DecodeUTF8 in CPython. > > Is there anything that can be done in CFFI that would have the same > performance as PyUnicode_DecodeUTF8 (and the same for encode) > First, codecs.utf_8_decode has nothing to do with CFFI... Then, do we have evidence that the utf8 codec is enough to explain the different performance? Since your data is only ASCII, it would be interesting to use the ASCII encoding: try to replace PyUnicode_DecodeUTF8 by PyUnicode_DecodeASCII and codecs.utf_8_decode by codecs.ascii_decode -- Amaury Forgeot d'Arc -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From fijall at gmail.com Tue Feb 12 10:04:59 2013 From: fijall at gmail.com (Maciej Fijalkowski) Date: Tue, 12 Feb 2013 11:04:59 +0200 Subject: [pypy-dev] Unicode encode/decode speed In-Reply-To: <51197D93.2050201@gmail.com> References: <51117AD1.7060609@gmail.com> <51118D06.6000908@gmail.com> <511912C6.5000201@gmail.com> <5119240C.2000209@gmail.com> <51192C17.7060907@gmail.com> <51197D93.2050201@gmail.com> Message-ID: On Tue, Feb 12, 2013 at 1:24 AM, Elefterios Stamatogiannakis wrote: > On 11/2/2013 7:39 ??, Amaury Forgeot d'Arc wrote: >> >> 2013/2/11 Eleytherios Stamatogiannakis > > >> >> >> > >> > Which kind of profiler are you using? It possible that CPython >> builtin >> > functions are not profiled the same way as PyPy's. >> >> lsprofcalltree.py . >> >> From APSW's source code, i think that it uses this API: >> >> (in cursor.c) >> PyUnicode_DecodeUTF8 >> >> Maybe lsprofcalltree doesn't profile it? >> >> >> Indeed. CPU cost is hidden in the cursor method. > > > Thanks Amaury for looking into this, > > Assuming that PyPy's "codecs.utf_8_decode" is slower when used with CFFI > than using PyUnicode_DecodeUTF8 in CPython. > > Is there anything that can be done in CFFI that would have the same > performance as PyUnicode_DecodeUTF8 (and the same for encode)? > Hi I would like to see some evidence about it. Did you try valgrind? Cheers, fijal From fijall at gmail.com Tue Feb 12 10:07:00 2013 From: fijall at gmail.com (Maciej Fijalkowski) Date: Tue, 12 Feb 2013 11:07:00 +0200 Subject: [pypy-dev] gcc warnings / errors in translation In-Reply-To: <87txpi164s.fsf@justinbogner.com> References: <51187EB0.8060507@gmail.com> <87txpi164s.fsf@justinbogner.com> Message-ID: On Tue, Feb 12, 2013 at 6:57 AM, Justin Bogner wrote: > Armin Rigo writes: >> Ah, indeed. We declare most arrays as "itemtype x[1]", so gcc complains >> when it can prove we do accesses at an index > 0. 
> > Is there a good reason not to use the C99 "itemtype x[]" or even the old > GCC extension "itemtype x[0]"? These won't trigger this warning, which > means we could leave it on in case a legitimate case crops up. > > As far as I know, the only noticeable difference between [], [0], and > [1] for flexible arrays is that sizeof has different semantics, but > that's usually not a big deal. I think we use sizeof. How is it better than just turning off the warning actually? There are no legitimate cases From fijall at gmail.com Tue Feb 12 16:17:51 2013 From: fijall at gmail.com (Maciej Fijalkowski) Date: Tue, 12 Feb 2013 17:17:51 +0200 Subject: [pypy-dev] Generator leaks Message-ID: Hi pypy-dev (and hi armin :) Quick question - do we make https://bugs.pypy.org/issue1282 a release blocker? As far as I understand this is a chain-of-destructors scenario. Can we do better than wait N gc.collects? If not, can we fix generators? Cheers, fijal From estama at gmail.com Tue Feb 12 19:14:13 2013 From: estama at gmail.com (Eleytherios Stamatogiannakis) Date: Tue, 12 Feb 2013 20:14:13 +0200 Subject: [pypy-dev] Unicode encode/decode speed In-Reply-To: References: <51117AD1.7060609@gmail.com> <51118D06.6000908@gmail.com> <511912C6.5000201@gmail.com> <5119240C.2000209@gmail.com> <51192C17.7060907@gmail.com> <51197D93.2050201@gmail.com> Message-ID: <511A8675.3040001@gmail.com> On 12/02/13 11:04, Maciej Fijalkowski wrote: > > I would like to see some evidence about it. Did you try valgrind? > > Cheers, > fijal > Even better, we wanted to find a way for you to be able to test it by yourselves, so we tried to create a representative synthetic benchmark. Surprisingly when we retested the benchmark that we had previously posted here in this mailing list, we found that the performance profile is very similar to the one slow query that i've talked about in my recent emails. To make it easier i'll repeat the freshened instructions (from the old email) of how to run that benchmark. 
Also attached is the updated (and heavily optimized) MSPW: --repost-- To run it you'll need latest madIS. You can clone it using: hg clone https://code.google.com/p/madis/ For running the test with CPython you'll need: CPython 2.7 + APSW: https://code.google.com/p/apsw/ For PyPy you'll need MPSW renamed to "apsw.py" (the attached MPSW is already renamed to "apsw.py"). Move "apsw.py" to pypy's "site-packages" directory. For MSPW to work in PyPy, you'll also need CFFI and "libsqlite3" installed. To run the test with PyPy: pypy mterm.py < mspw_bench.sql or with CPython python mterm.py < mspw_bench.sql The timings of "mspw_bench" that we get are: CPython 2.7 + APSW: ~ 2.6sec PyPy + MSPW: ~ 4sec There are two ways to adjust the processing load of mspw_bench. One is to change the value in "range(xxxxx)". This will in essence create a bigger/smaller "synthetic text". This puts more pressure on CPython's/pypy's side. The other way is to adjust the window size of textwindow(t, xx, xx). This puts more pressure on the wrapper (APSW/MSPW) because it changes the number of columns that CPython/PyPy have to send to SQLite (they are send one value at a time). --/repost-- Attached you'll find our latest MSPW (renamed to "apsw.py") and "mspw_bench.sql" Also we are looking into adding a special ffi.string_decode_UTF8 in CFFI's backend to reduce the number of calls that are needed to go from utf8_char* to PyPy's unicode. Do you thing that such an addition would be worthwhile? Thank you, lefteris -------------- next part -------------- A non-text attachment was scrubbed... Name: mspw_bench.sql Type: text/x-sql Size: 124 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: apsw.py
Type: text/x-python
Size: 67124 bytes
Desc: not available
URL: 

From tismer at stackless.com Wed Feb 13 00:53:05 2013
From: tismer at stackless.com (Christian Tismer)
Date: Wed, 13 Feb 2013 00:53:05 +0100
Subject: [pypy-dev] efficient string concatenation (yep, from 2004)
Message-ID: <511AD5E1.1060209@stackless.com>

Hi friends,

_efficient string concatenation_ has been a topic in 2004. Armin Rigo
proposed a patch with the name of the subject, more precisely:

[Patches] [ python-Patches-980695 ] efficient string concatenation
on sourceforge.net, on 2004-06-28.

This patch was finally added to Python 2.4 on 2004-11-30.

Some people might remember the larger discussion about whether such a
patch should be accepted at all, because it changes the programming style
for many of us from "don't do that, stupid" to "well, you may do it in
CPython", which has quite some impact on other implementations (is it
fast on Jython, now?). It changed for instance my programming and
teaching style a lot, of course!

But I think nobody but people heavily involved in PyPy expected this:
Now, more than eight years after that patch appeared and made it into
2.4, PyPy (!) still does _not_ have it!

Obviously I was misled by other optimizations, and the fact that this
patch was from a/the major author of PyPy who invented the initial patch
for CPython. That this would be in PyPy as well sooner or later was
without question for me. Wrong... ;-)

Yes, I agree that for PyPy it is much harder to implement without the
refcounting trick, and probably even more difficult in case of the JIT.
But nevertheless, I tried to find any reference to this missing crucial
optimization, with no success after an hour (*). And I guess many other
people are stepping into the same trap.

So I can imagine that PyPy loses some of its speed in many programs,
because Armin's great hack did not make it into PyPy, and this is not
loudly declared somewhere.
I believe the efficiency of string concatenation is something that people assume by default, adding it to the vague CPython compatibility claim if not explicitly told otherwise.

----

Some silly proof, using Python 2.7.3 vs PyPy 1.9:

> $ cat strconc.py
> #!env python
>
> from timeit import default_timer as timer
>
> tim = timer()
>
> s = ''
> for i in xrange(100000):
>     s += 'X'
>
> tim = timer() - tim
>
> print 'time for {} concats = {:0.3f}'.format(len(s), tim)

> $ python strconc.py
> time for 100000 concats = 0.028
> $ pypy strconc.py
> time for 100000 concats = 0.804

Something is needed - a patch for PyPy or for the documentation, I guess. This is not just some unoptimized function in some module; it is used all over the place and became a very common pattern since it was introduced.

How ironic that a foreseen problem occurs _now_, and _there_ :-)

cheers -- chris

(*)
http://pypy.readthedocs.org/en/latest/cpython_differences.html
http://pypy.org/compat.html
http://pypy.org/performance.html

-- Christian Tismer :^) Software Consulting : Have a break! Take a ride on Python's Karl-Liebknecht-Str. 121 : *Starship* http://starship.python.net/ 14482 Potsdam : PGP key -> http://pgp.uni-mainz.de phone +49 173 24 18 776 fax +49 (30) 700143-0023 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? http://www.stackless.com/

From mail at justinbogner.com Wed Feb 13 07:07:14 2013 From: mail at justinbogner.com (Justin Bogner) Date: Tue, 12 Feb 2013 23:07:14 -0700 Subject: [pypy-dev] gcc warnings / errors in translation In-Reply-To: (Maciej Fijalkowski's message of "Tue, 12 Feb 2013 11:07:00 +0200") References: <51187EB0.8060507@gmail.com> <87txpi164s.fsf@justinbogner.com> Message-ID: <87k3qc21d9.fsf@justinbogner.com>

Maciej Fijalkowski writes:
> On Tue, Feb 12, 2013 at 6:57 AM, Justin Bogner wrote:
>> Armin Rigo writes:
>>> Ah, indeed.
We declare most arrays as "itemtype x[1]", so gcc complains
>>> when it can prove we do accesses at an index > 0.
>>
>> Is there a good reason not to use the C99 "itemtype x[]" or even the old
>> GCC extension "itemtype x[0]"? These won't trigger this warning, which
>> means we could leave it on in case a legitimate case crops up.
>>
>> As far as I know, the only noticeable difference between [], [0], and
>> [1] for flexible arrays is that sizeof has different semantics, but
>> that's usually not a big deal.
>
> I think we use sizeof. How is it better than just turning off the
> warning actually? There are no legitimate cases

I'm not sure what you mean when you say there are no legitimate cases. In the cases where we define an array of length 1 it's true that these warnings aren't meaningful, but that's just because an array of length 1 (almost) always means we want a flexible array. On the other hand, for arrays with a meaningful length, I've only ever seen this warning point out legitimate bugs.

Using flexible arrays instead of relying on, as Dennis Ritchie has referred to it, "unwarranted chumminess with the C implementation" makes the intention clearer. It avoids the issue without having to fight with the compiler or turn the warning off. I don't really see a downside, personally. Maybe I'm missing something here.

From fijall at gmail.com Wed Feb 13 08:35:28 2013 From: fijall at gmail.com (Maciej Fijalkowski) Date: Wed, 13 Feb 2013 09:35:28 +0200 Subject: [pypy-dev] efficient string concatenation (yep, from 2004) In-Reply-To: <511AD5E1.1060209@stackless.com> References: <511AD5E1.1060209@stackless.com> Message-ID:

Hi Christian.

We have it, just not enabled by default. --objspace-with-strbuf I think

On Wed, Feb 13, 2013 at 1:53 AM, Christian Tismer wrote:
> Hi friends,
>
> efficient string concatenation has been a topic in 2004.
> Armin Rigo proposed a patch with the name of the subject, > more precisely: > > [Patches] [ python-Patches-980695 ] efficient string concatenation > on sourceforge.net, on 2004-06-28. > > This patch was finally added to Python 2.4 on 2004-11-30. > > Some people might remember the larger discussion if such a patch should be > accepted at all, because it changes the programming style for many of us > from "don't do that, stupid" to "well, you may do it in CPython", which has > quite > some impact on other implementations (is it fast on Jython, now?). > > It changed for instance my programming and teaching style a lot, of course! > > But I think nobody but people heavily involved in PyPy expected this: > > Now, more than eight years after that patch appeared and made it into 2.4, > PyPy (!) still does _not_ have it! > > Obviously I was mislead by other optimizations, and the fact that > this patch was from a/the major author of PyPy who invented the initial > patch for CPython. That this would be in PyPy as well sooner or later was > without question for me. Wrong... ;-) > > Yes, I agree that for PyPy it is much harder to implement without the > refcounting trick, and probably even more difficult in case of the JIT. > > But nevertheless, I tried to find any reference to this missing crucial > optimization, > with no success after an hour (*). > > And I guess many other people are stepping in the same trap. > > So I can imagine that PyPy looses some of its speed in many programs, > because > Armin's great hack did not make it into PyPy, and this is not loudly > declared > somewhere. I believe the efficiency of string concatenation is something > that people assume by default and add it to the vague CPython compatibility > claim, if not explicitly told otherwise. 
> > ---- > > Some silly proof, using python 2.7.3 vs PyPy 1.9: > > $ cat strconc.py > #!env python > > from timeit import default_timer as timer > > tim = timer() > > s = '' > for i in xrange(100000): > s += 'X' > > tim = timer() - tim > > print 'time for {} concats = {:0.3f}'.format(len(s), tim) > > > $ python strconc.py > time for 100000 concats = 0.028 > $ pypy strconc.py > time for 100000 concats = 0.804 > > > Something is needed - a patch for PyPy or for the documentation I guess. > > This is not just some unoptimized function in some module, but it is used > all over the place and became a very common pattern since introduced. > > How ironic that a foreseen problem occurs _now_, and _there_ :-) > > cheers -- chris > > > (*) > http://pypy.readthedocs.org/en/latest/cpython_differences.html > http://pypy.org/compat.html > http://pypy.org/performance.html > > -- > Christian Tismer :^) > Software Consulting : Have a break! Take a ride on Python's > Karl-Liebknecht-Str. 121 : *Starship* http://starship.python.net/ > 14482 Potsdam : PGP key -> http://pgp.uni-mainz.de > phone +49 173 24 18 776 fax +49 (30) 700143-0023 > PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 > whom do you want to sponsor today? http://www.stackless.com/ > > > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > http://mail.python.org/mailman/listinfo/pypy-dev > From regebro at gmail.com Wed Feb 13 08:42:17 2013 From: regebro at gmail.com (Lennart Regebro) Date: Wed, 13 Feb 2013 08:42:17 +0100 Subject: [pypy-dev] [Python-Dev] efficient string concatenation (yep, from 2004) In-Reply-To: <511AD5E1.1060209@stackless.com> References: <511AD5E1.1060209@stackless.com> Message-ID: > Something is needed - a patch for PyPy or for the documentation I guess. Not arguing that it wouldn't be good, but I disagree that it is needed. This is only an issue when you, as in your proof, have a loop that does concatenation. 
This is usually when looping over a list of strings that should be concatenated together. Doing so in a loop with concatenation may be the natural way for people new to Python, but the "natural" way to do it in Python is with a ''.join() call. This:

s = ''.join(('X' for x in xrange(x)))

is more than twice as fast in Python 2.7 as your example. It is in fact also slower in PyPy 1.9 than in Python 2.7, but only by a factor of two:

Python 2.7: time for 10000000 concats = 0.887
PyPy 1.9: time for 10000000 concats = 1.600

(And of course s = 'X' * x takes only about a hundredth of the time, but that's cheating. ;-)

//Lennart

From davidf at sjsoft.com Wed Feb 13 08:53:10 2013 From: davidf at sjsoft.com (David Fraser) Date: Wed, 13 Feb 2013 01:53:10 -0600 (CST) Subject: [pypy-dev] Great experience with PyPy In-Reply-To: Message-ID: <24890242.381.1360741988037.JavaMail.davidf@jackdaw.local>

You may also want to try pg8000; this is a pure-Python driver that works on Windows.

On Thursday, February 7, 2013 at 6:08:51 PM, "????? ???????" wrote:
> Hi! I did not test it on Windows, there may be problems with
> installation (searching for postgres header files, the config is not
> very smart -
> https://github.com/chtd/psycopg2cffi/blob/master/psycopg2cffi/_impl/libpq.py#L209),
> but they should be solvable I hope - submit a bug if you have
> problems.
>
> 2013/2/7 Gelin Yan :
> >
> > On Thu, Feb 7, 2013 at 11:28 PM, ????? ??????? wrote:
> >>
> >> PyPy supports postgres with either psycopg2cffi or psycopg2-ctypes
> >> bindings. We use psycopg2cffi in production (and maintain them), and here
> >> http://chtd.ru/blog/bystraya-rabota-s-postgres-pod-pypy/?lang=en
> >> are some benchmarks.
> >> And yes, PyPy is cool :) Typically giving 3x speedups, and some
> >> memory savings sometimes.
> >> > >> 2013/2/7 Gelin Yan : > >> > > >> > > >> > On Thu, Feb 7, 2013 at 10:11 PM, Phyo Arkar > >> > > >> > wrote: > >> >> > >> >> Pypy should have a page for "Success Stories!" > >> >> > >> >> Now with this and Quora proving Power of PyPy , i am beginning > >> >> to start > >> >> converting my projects into PyPy soon! > >> >> I am only withholding right now because my projects uses a lot > >> >> of C > >> >> Libraries and Numpy/Matplotlib/scilit-learn. > >> >> > >> >> Thanks > >> >> > >> >> Phyo. > >> >> > >> >> On Thursday, February 7, 2013, Maciej Fijalkowski wrote: > >> >>> > >> >>> On Thu, Feb 7, 2013 at 1:55 PM, Marko Tasic > >> >>> > >> >>> wrote: > >> >>> > Hi, > >> >>> > > >> >>> > I would like to share short story with you and share what we > >> >>> > have > >> >>> > accomplished with PyPy and its friends so far. > >> >>> > > >> >>> > Company that I have worked for last 7 months (intentionally > >> >>> > unnamed) > >> >>> > gave me absolute permission to pick up technologies on which > >> >>> > we > >> >>> > based > >> >>> > our solution. What we do is: crawl for PDFs and newspapers > >> >>> > articles, > >> >>> > download, translate them if needed, OCR if needed, do > >> >>> > extensive > >> >>> > analysis of downloaded PDFs and articles, store them in more > >> >>> > organized > >> >>> > structures for faster querying, search for them and generate > >> >>> > bunch > >> >>> > of > >> >>> > complex reports. > >> >>> > > >> >>> > From very beginning I decided to go with PyPy no matter > >> >>> > what. What > >> >>> > we > >> >>> > picked is following: > >> >>> > * Flask for web framework, and few of its extensions such as > >> >>> > Flask-Login, Flask-Principal, Flask-WTF, Flask-Mail, etc. > >> >>> > * Cassandra as database because of its features and great > >> >>> > experience > >> >>> > with it. PyCassa is used as client to talk to Cassandra > >> >>> > server. 
> >> >>> > * ElasticSearch as distributed search engine, and its client > >> >>> > library > >> >>> > pyes. > >> >>> > * Whoosh as search engine, but with some modifications to > >> >>> > support > >> >>> > Cassandra as storage and distributed locking. > >> >>> > * Redis, and its client library redis-py, for caching and to > >> >>> > speed > >> >>> > up > >> >>> > common auto-completion patterns. > >> >>> > * ZooKeeper, and its client library Kazoo, for distributed > >> >>> > locking > >> >>> > which plays essential role in system for transaction-like > >> >>> > behavior > >> >>> > over many services at once. > >> >>> > * Celery in conjunction with RabbitMQ for task distribution. > >> >>> > * Sentry for error logging. > >> >>> > > >> >>> > What we have developed on our own are wrappers and clients > >> >>> > for: > >> >>> > * Moses which is language translator > >> >>> > * Tesseract which is OCR engine > >> >>> > * Cassandra store for Whoosh > >> >>> > * wkhtmltopdf and wkhtmltoimage which are used for > >> >>> > conversion of > >> >>> > HTML > >> >>> > to PDF/Image > >> >>> > * etc > >> >>> > > >> >>> > Now when product is finished and in final testing phase, I > >> >>> > can say > >> >>> > that we did not regret because we used PyPy and stack around > >> >>> > it. > >> >>> > Typical speed improvement is 2x-3x over CPython in our case, > >> >>> > but > >> >>> > anyway we are mostly IO and memory bound, expect for Celery > >> >>> > workers > >> >>> > where we do analysis which are again many small CPU > >> >>> > intensive tasks > >> >>> > that are exchanged via RabbitMQ. Another reason why we don't > >> >>> > see > >> >>> > speedup us is that we are dependent on external software > >> >>> > (servers) > >> >>> > written in Erlang and Java. 
> >> >>> > > >> >>> > I'm already planing to do Cassandra (distributed key/value > >> >>> > only > >> >>> > database without index features), ZooKeeper, Redis and > >> >>> > ElasticSearch > >> >>> > ports in Python for next projects, and hopefully opensource > >> >>> > them. > >> >>> > > >> >>> > Regards, > >> >>> > Marko Tasic > >> >>> > _______________________________________________ > >> >>> > pypy-dev mailing list > >> >>> > pypy-dev at python.org > >> >>> > http://mail.python.org/mailman/listinfo/pypy-dev > >> >>> > >> >>> Awesome! > >> >>> > >> >>> I'm glad people can make pypy work for non-trivial tasks which > >> >>> require > >> >>> a lot of dependencies. We're trying to lower the bar, however > >> >>> it takes > >> >>> time. > >> >>> > >> >>> Cheers, > >> >>> fijal > >> >>> _______________________________________________ > >> >>> pypy-dev mailing list > >> >>> pypy-dev at python.org > >> >>> http://mail.python.org/mailman/listinfo/pypy-dev > >> >> > >> >> > >> >> _______________________________________________ > >> >> pypy-dev mailing list > >> >> pypy-dev at python.org > >> >> http://mail.python.org/mailman/listinfo/pypy-dev > >> >> > >> > > >> > > >> > Hi, It might be off topic. I want to know whether pypy support > >> > postgres. > >> > The > >> > last time I noticed ctypes based psycopg2 was still beta. I > >> > mainly use > >> > twisted & postgres. pypy supports twisted well but not good for > >> > psycopg2. > >> > > >> > Regards > >> > > >> > gelin yan > >> > > >> > _______________________________________________ > >> > pypy-dev mailing list > >> > pypy-dev at python.org > >> > http://mail.python.org/mailman/listinfo/pypy-dev > >> > > >> > >> > >> > >> -- > >> ?????????? ???????, ??????????? > >> ???????? ??? -- http://chtd.ru > >> +7 (495) 646-87-45, ?????????? 333 > > > > > > Hi > > > > Glad to hear that. I will give it a try. By the way, Can i use > > it on > > windows? It looks like cffi support windows. 
> > > > Regards > > > > gelin yan > > > > _______________________________________________ > > pypy-dev mailing list > > pypy-dev at python.org > > http://mail.python.org/mailman/listinfo/pypy-dev > > > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > http://mail.python.org/mailman/listinfo/pypy-dev > From fijall at gmail.com Wed Feb 13 09:08:11 2013 From: fijall at gmail.com (Maciej Fijalkowski) Date: Wed, 13 Feb 2013 10:08:11 +0200 Subject: [pypy-dev] Unicode encode/decode speed In-Reply-To: <511A8675.3040001@gmail.com> References: <51117AD1.7060609@gmail.com> <51118D06.6000908@gmail.com> <511912C6.5000201@gmail.com> <5119240C.2000209@gmail.com> <51192C17.7060907@gmail.com> <51197D93.2050201@gmail.com> <511A8675.3040001@gmail.com> Message-ID: On Tue, Feb 12, 2013 at 8:14 PM, Eleytherios Stamatogiannakis wrote: > On 12/02/13 11:04, Maciej Fijalkowski wrote: >> >> >> I would like to see some evidence about it. Did you try valgrind? >> >> Cheers, >> fijal >> > > Even better, we wanted to find a way for you to be able to test it by > yourselves, so we tried to create a representative synthetic benchmark. > > Surprisingly when we retested the benchmark that we had previously posted > here in this mailing list, we found that the performance profile is very > similar to the one slow query that i've talked about in my recent emails. > > To make it easier i'll repeat the freshened instructions (from the old > email) of how to run that benchmark. Also attached is the updated (and > heavily optimized) MSPW: > > --repost-- > > To run it you'll need latest madIS. You can clone it using: > > hg clone https://code.google.com/p/madis/ > > For running the test with CPython you'll need: > > CPython 2.7 + APSW: > > https://code.google.com/p/apsw/ > > For PyPy you'll need MPSW renamed to "apsw.py" (the attached MPSW is already > renamed to "apsw.py"). > Move "apsw.py" to pypy's "site-packages" directory. 
For MSPW to work in > PyPy, you'll also need CFFI and "libsqlite3" installed. > > To run the test with PyPy: > > pypy mterm.py < mspw_bench.sql > > or with CPython > > python mterm.py < mspw_bench.sql > > The timings of "mspw_bench" that we get are: > > CPython 2.7 + APSW: ~ 2.6sec > PyPy + MSPW: ~ 4sec > > There are two ways to adjust the processing load of mspw_bench. > > One is to change the value in "range(xxxxx)". This will in essence create a > bigger/smaller "synthetic text". This puts more pressure on CPython's/pypy's > side. > > The other way is to adjust the window size of textwindow(t, xx, xx). This > puts more pressure on the wrapper (APSW/MSPW) because it changes the number > of columns that CPython/PyPy have to send to SQLite (they are send one value > at a time). > > --/repost-- > > Attached you'll find our latest MSPW (renamed to "apsw.py") and > "mspw_bench.sql" > > Also we are looking into adding a special ffi.string_decode_UTF8 in CFFI's > backend to reduce the number of calls that are needed to go from utf8_char* > to PyPy's unicode. > > Do you thing that such an addition would be worthwhile? > > Thank you, > > lefteris > > > > > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > http://mail.python.org/mailman/listinfo/pypy-dev > Hey I have serious trouble running apsw. 
Message I got so far: /home/fijal/.virtualenvs/cffi/local/lib/python2.7/site-packages/apsw.so: undefined symbol: sqlite3_uri_boolean From estama at gmail.com Wed Feb 13 12:39:03 2013 From: estama at gmail.com (Eleytherios Stamatogiannakis) Date: Wed, 13 Feb 2013 13:39:03 +0200 Subject: [pypy-dev] Unicode encode/decode speed In-Reply-To: References: <51117AD1.7060609@gmail.com> <51118D06.6000908@gmail.com> <511912C6.5000201@gmail.com> <5119240C.2000209@gmail.com> <51192C17.7060907@gmail.com> <51197D93.2050201@gmail.com> <511A8675.3040001@gmail.com> Message-ID: <511B7B57.7060102@gmail.com> On 13/02/13 10:08, Maciej Fijalkowski wrote: > Hey > > I have serious trouble running apsw. Message I got so far: > > /home/fijal/.virtualenvs/cffi/local/lib/python2.7/site-packages/apsw.so: > undefined symbol: sqlite3_uri_boolean > Thanks Maciej for looking into it, Which version of APSW have you tried to install and how? Debian/Ubuntu based systems, include APSW in package: python-apsw You can also try to install APSW yourself. Instructions: http://apidoc.apsw.googlecode.com/hg/build.html#recommended and (some more details from madIS' docs) http://doc.madis.googlecode.com/hg/install.html l. From tismer at stackless.com Wed Feb 13 13:06:27 2013 From: tismer at stackless.com (Christian Tismer) Date: Wed, 13 Feb 2013 13:06:27 +0100 Subject: [pypy-dev] [Python-Dev] efficient string concatenation (yep, from 2004) In-Reply-To: References: <511AD5E1.1060209@stackless.com> Message-ID: <511B81C3.9030905@stackless.com> On 13.02.13 08:42, Lennart Regebro wrote: >> Something is needed - a patch for PyPy or for the documentation I guess. > Not arguing that it wouldn't be good, but I disagree that it is needed. > > This is only an issue when you, as in your proof, have a loop that > does concatenation. This is usually when looping over a list of > strings that should be concatenated together. 
Doing so in a loop with
> concatenation may be the natural way for people new to Python, but the
> "natural" way to do it in Python is with a ''.join() call.
>
> This:
>
> s = ''.join(('X' for x in xrange(x)))
>
> Is more than twice as fast in Python 2.7 than your example. It is in
> fact also slower in PyPy 1.9 than Python 2.7, but only with a factor
> of two:
>
> Python 2.7:
> time for 10000000 concats = 0.887
> Pypy 1.9:
> time for 10000000 concats = 1.600
>
> (And of course s = 'X'* x takes only a bout a hundredth of the time,
> but that's cheating. ;-)

This is not about how to write efficient concatenation, and it is not a problem for me personally. It is also not about a constant factor, which I only really care about in situations where speed matters.

This is about a possible algorithmic trap: code written for CPython may behave well, with roughly O(n) behavior, and by switching to PyPy you get a surprise when the same code suddenly shows O(n**2) behavior. Such runtime explosions can damage trust in PyPy, especially with code sitting in some module which you did not even write yourself but "pip install"-ed. So this is important to know, especially for newcomers, and for the people who give advice to them.

For algorithmic compatibility, there should no longer be a feature with such a drastic side effect if it cannot be supported by all other dialects. To avoid such hidden traps in larger code bases, documentation is needed that clearly gives a warning saying "don't do that", just as CS students learn for most other languages.

cheers - chris

-- Christian Tismer :^) Software Consulting : Have a break! Take a ride on Python's Karl-Liebknecht-Str. 121 : *Starship* http://starship.python.net/ 14482 Potsdam : PGP key -> http://pgp.uni-mainz.de phone +49 173 24 18 776 fax +49 (30) 700143-0023 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today?
http://www.stackless.com/ From steve at pearwood.info Wed Feb 13 13:10:26 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Wed, 13 Feb 2013 23:10:26 +1100 Subject: [pypy-dev] [Python-Dev] efficient string concatenation (yep, from 2004) In-Reply-To: <511AD5E1.1060209@stackless.com> References: <511AD5E1.1060209@stackless.com> Message-ID: <511B82B2.2000505@pearwood.info> On 13/02/13 10:53, Christian Tismer wrote: > Hi friends, > > _efficient string concatenation_ has been a topic in 2004. > Armin Rigo proposed a patch with the name of the subject, > more precisely: > > /[Patches] [ python-Patches-980695 ] efficient string concatenation// > //on sourceforge.net, on 2004-06-28.// > / > This patch was finally added to Python 2.4 on 2004-11-30. > > Some people might remember the larger discussion if such a patch should be > accepted at all, because it changes the programming style for many of us > from "don't do that, stupid" to "well, you may do it in CPython", which has quite > some impact on other implementations (is it fast on Jython, now?). I disagree. If you look at the archives on the python-list@ and tutor at python.org mailing lists, you will see that whenever string concatenation comes up, the common advice given is to use join. The documentation for strings is also clear that you should not rely on this optimization: http://docs.python.org/2/library/stdtypes.html#typesseq And quadratic performance for repeated concatenation is not unique to Python: it applies to pretty much any language with immutable strings, including Java, C++, Lua and Javascript. > It changed for instance my programming and teaching style a lot, of course! Why do you say, "Of course"? It should not have changed anything. 
Best practice remains the same: - we should still use join for repeated concatenations; - we should still avoid + except for small cases which are not performance critical; - we should still teach beginners to use join; - while this optimization is nice to have, we cannot rely on it being there when it matters. It's not just Jython and IronPython that can't make use of this optimization. It can, and does, fail on CPython as well, as it is sensitive to memory allocation details. See for example: http://utcc.utoronto.ca/~cks/space/blog/python/ExaminingStringConcatOpt and here for a cautionary tale about what can happen when the optimization fails under CPython: http://mail.python.org/pipermail/python-dev/2009-August/091125.html -- Steven From ncoghlan at gmail.com Wed Feb 13 15:44:09 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 14 Feb 2013 00:44:09 +1000 Subject: [pypy-dev] [Python-Dev] efficient string concatenation (yep, from 2004) In-Reply-To: <511B81C3.9030905@stackless.com> References: <511AD5E1.1060209@stackless.com> <511B81C3.9030905@stackless.com> Message-ID: On Wed, Feb 13, 2013 at 10:06 PM, Christian Tismer wrote: > To avoid such hidden traps in larger code bases, documentation is > needed that clearly gives a warning saying "don't do that", like CS > students learn for most other languages. How much more explicit do you want us to be? """6. CPython implementation detail: If s and t are both strings, some Python implementations such as CPython can usually perform an in-place optimization for assignments of the form s = s + t or s += t. When applicable, this optimization makes quadratic run-time much less likely. This optimization is both version and implementation dependent. 
For performance sensitive code, it is preferable to use the str.join() method which assures consistent linear concatenation performance across versions and implementations.""" from http://docs.python.org/2/library/stdtypes.html#typesseq So please don't blame us for people not reading a warning that is already there. Since my rewrite of the sequence docs, Python 3 doesn't even acknowledge the hack's existence and is quite explicit about what you need to do to get reliably linear behaviour: """6. Concatenating immutable sequences always results in a new object. This means that building up a sequence by repeated concatenation will have a quadratic runtime cost in the total sequence length. To get a linear runtime cost, you must switch to one of the alternatives below: if concatenating str objects, you can build a list and use str.join() at the end or else write to a io.StringIO instance and retrieve its value when complete if concatenating bytes objects, you can similarly use bytes.join() or io.BytesIO, or you can do in-place concatenation with a bytearray object. bytearray objects are mutable and have an efficient overallocation mechanism if concatenating tuple objects, extend a list instead for other types, investigate the relevant class documentation""" from http://docs.python.org/3/library/stdtypes.html#common-sequence-operations Deliberately *relying* on the += hack to avoid quadratic runtime is just plain wrong, and our documentation already says so. If anyone really thinks it will help, I can add a CPython implementation note back in to the Python 3 docs as well, pointing out that CPython performance measurements may hide broken algorithmic complexity related to string concatenation, but the corresponding note in Python 2 doesn't seem to have done much good :P Regards, Nick. 
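The alternatives listed in the docs excerpt Nick quotes can be sketched side by side. This is an editorial illustration rather than code from the thread, written in Python 3 style for brevity (the thread's own examples use Python 2's xrange): all four functions build the same string, but only the first relies on the CPython in-place hack to avoid quadratic copying, while the other three have roughly linear cost on any implementation.

```python
import io

def concat_plus(n):
    # relies on the CPython refcount hack to stay fast;
    # quadratic on implementations without it
    s = ''
    for _ in range(n):
        s += 'X'
    return s

def concat_join(n):
    # linear everywhere: collect the pieces, join once at the end
    return ''.join('X' for _ in range(n))

def concat_stringio(n):
    # linear everywhere: write into a growing buffer, read out once
    buf = io.StringIO()
    for _ in range(n):
        buf.write('X')
    return buf.getvalue()

def concat_bytearray(n):
    # linear everywhere: bytearray overallocates like a list
    buf = bytearray()
    for _ in range(n):
        buf += b'X'
    return buf.decode('ascii')

assert (concat_plus(1000) == concat_join(1000)
        == concat_stringio(1000) == concat_bytearray(1000))
```

On an implementation without the hack, each += in the first variant copies the whole string built so far, which is the quadratic behaviour Christian's 100000-concat demo runs into on PyPy 1.9.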
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From marius at pov.lt Wed Feb 13 16:22:00 2013 From: marius at pov.lt (Marius Gedminas) Date: Wed, 13 Feb 2013 17:22:00 +0200 Subject: [pypy-dev] [Python-Dev] efficient string concatenation (yep, from 2004) In-Reply-To: <511B82B2.2000505@pearwood.info> References: <511AD5E1.1060209@stackless.com> <511B82B2.2000505@pearwood.info> Message-ID: <20130213152200.GA19918@fridge.pov.lt> On Wed, Feb 13, 2013 at 11:10:26PM +1100, Steven D'Aprano wrote: > Best practice remains the same: > > - we should still use join for repeated concatenations; > > - we should still avoid + except for small cases which are not performance critical; > > - we should still teach beginners to use join; > > - while this optimization is nice to have, we cannot rely on it being there > when it matters. > > It's not just Jython and IronPython that can't make use of this optimization. It > can, and does, fail on CPython as well, as it is sensitive to memory > allocation details. See for example: > > http://utcc.utoronto.ca/~cks/space/blog/python/ExaminingStringConcatOpt > > and here for a cautionary tale about what can happen when the optimization fails > under CPython: > > http://mail.python.org/pipermail/python-dev/2009-August/091125.html Is that the right link? This thread doesn't mention +=, the bug mentioned in the first email doesn't mention +=, and the fix mentioned for that bug appears to be "let's not pass 0 as the buffer size of sock.makefile()". Did I skim too much? Marius Gedminas -- And yes, you'd be insane to consider C++ for a new project in 2007. -- Thomas Ptacek -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 190 bytes Desc: Digital signature URL: From taavi.burns at gmail.com Wed Feb 13 16:56:49 2013 From: taavi.burns at gmail.com (Taavi Burns) Date: Wed, 13 Feb 2013 10:56:49 -0500 Subject: [pypy-dev] Helping with STM at the PyCon 2013 (Santa Clara) sprints Message-ID: >From a recent email of Armin's to the list: > The STM project progressed slowly during the last few months. The > status right now is: > > * Most importantly, missing major Garbage Collection cycles, which > means pypy-stm slowly but constantly leaks memory. > > * The JIT integration is not finished; so far pypy-stm can only be > compiled without the JIT. > > * There are also other places where the performance can be improved, > probably a lot. > > * Finally there are a number of usability concerns that we (or mostly > Remi) worked on recently. The main issues turn around the idea that, > as a user of pypy-stm, you should have a way to get freeback on the > process. For example right now, transactions that abort are > completely transparent --- to the point that you don't have any way to > know that it occurred, apart from "it runs too slowly" if it occurs a > lot. You should have a way to get Python tracebacks of aborts if you > want to. A similar issue is "inevitable" transactions. I'm interested in helping with STM: 1) I think STM is really interesting, particularly Armin's take on it 2) I need a "Personal Development Goal" for %(dayjob)s. Last year it was just "Contribute to PyPy", which I did at the sprints (a bit). This year, I'd like to try something a bit more ambitious. ;) >From the list above, are there any particular areas (tickets?) that would be a good starting place for me to look at? I expect that to get the most out of the sprints, I should do a bit of pre-work (reading at least, if not poking). Thanks! 
-- taa /*eof*/

From arigo at tunes.org Wed Feb 13 17:05:55 2013 From: arigo at tunes.org (Armin Rigo) Date: Wed, 13 Feb 2013 17:05:55 +0100 Subject: [pypy-dev] [Python-Dev] efficient string concatenation (yep, from 2004) In-Reply-To: <20130213152200.GA19918@fridge.pov.lt> References: <511AD5E1.1060209@stackless.com> <511B82B2.2000505@pearwood.info> <20130213152200.GA19918@fridge.pov.lt> Message-ID:

Hi,

This was already discussed on pypy-dev a few times, like in 2011 (http://mail.python.org/pipermail//pypy-dev/2011-August/008068.html). Google finds more (site:mail.python.org pypy-dev string concatenation).

A bientôt,

Armin.

From arigo at tunes.org Wed Feb 13 17:16:36 2013 From: arigo at tunes.org (Armin Rigo) Date: Wed, 13 Feb 2013 17:16:36 +0100 Subject: [pypy-dev] Helping with STM at the PyCon 2013 (Santa Clara) sprints In-Reply-To: References: Message-ID:

Hi Taavi,

On Wed, Feb 13, 2013 at 4:56 PM, Taavi Burns wrote:
> From the list above, are there any particular areas (tickets?) that
> would be a good starting place for me to look at?

I can't just give you a specific task to do, but you can try to understand what is here so far. Look at the branch "stm-thread-2" on the pypy repository; e.g. try to translate with "rpython -O2 --stm targetpypystandalone". This gives you a kind-of-GIL-less PyPy. Try to use the transaction module ("import transaction") on some demo programs. Then I suppose you should dive into the mess that is multithreaded programming by looking in depth at lib_pypy/transaction.py. And this is all before diving into the PyPy sources themselves...

You may also look at the work done by Remi Meier on his own separate repository (https://bitbucket.org/Raemi/pypy-stm-logging). It mostly contains playing around with various ideas that have not (or not yet) been integrated back.

A bientôt,

Armin.
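Armin's pointer to "try to use the transaction module on some demo programs" can be fleshed out into a minimal sketch. The add()/run() interface below reflects what lib_pypy/transaction.py exposed around this time (transaction.add(f, *args) schedules a call, transaction.run() executes all pending calls as transactions), but treat that interface as an assumption to verify against the stm-thread-2 branch; the fallback block only mimics it sequentially so the sketch runs on any Python.

```python
results = []

try:
    from transaction import add, run  # the real module on a pypy-stm build
except ImportError:
    # Sequential stand-in with the same two-function interface, so the
    # sketch also runs on plain CPython.  On pypy-stm, run() executes the
    # pending calls as transactions, possibly in parallel, with results
    # equivalent to running them in *some* sequential order.
    _pending = []

    def add(f, *args):
        _pending.append((f, args))

    def run():
        while _pending:
            f, args = _pending.pop(0)
            f(*args)

def work(i):
    # each scheduled call behaves as one atomic transaction
    results.append(i * i)

for i in range(5):
    add(work, i)
run()

# the commit order may vary under pypy-stm, hence the sort
assert sorted(results) == [0, 1, 4, 9, 16]
```

The point of the API is that correctness does not depend on which interleaving actually happens: the program may only assert things that hold for every sequential ordering of the scheduled transactions.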
From fijall at gmail.com Wed Feb 13 17:18:18 2013 From: fijall at gmail.com (Maciej Fijalkowski) Date: Wed, 13 Feb 2013 18:18:18 +0200 Subject: [pypy-dev] Unicode encode/decode speed In-Reply-To: <511B7B57.7060102@gmail.com> References: <51117AD1.7060609@gmail.com> <51118D06.6000908@gmail.com> <511912C6.5000201@gmail.com> <5119240C.2000209@gmail.com> <51192C17.7060907@gmail.com> <51197D93.2050201@gmail.com> <511A8675.3040001@gmail.com> <511B7B57.7060102@gmail.com> Message-ID: On Wed, Feb 13, 2013 at 1:39 PM, Eleytherios Stamatogiannakis wrote: > On 13/02/13 10:08, Maciej Fijalkowski wrote: >> >> Hey >> >> I have serious trouble running apsw. Message I got so far: >> >> /home/fijal/.virtualenvs/cffi/local/lib/python2.7/site-packages/apsw.so: >> undefined symbol: sqlite3_uri_boolean >> > > Thanks Maciej for looking into it, > > Which version of APSW have you tried to install and how? the recent version (exactly the command you pasted) > > Debian/Ubuntu based systems, include APSW in package: > > python-apsw > > You can also try to install APSW yourself. Instructions: > > http://apidoc.apsw.googlecode.com/hg/build.html#recommended > > and (some more details from madIS' docs) > > http://doc.madis.googlecode.com/hg/install.html > > l. From taavi.burns at gmail.com Thu Feb 14 00:20:47 2013 From: taavi.burns at gmail.com (Taavi Burns) Date: Wed, 13 Feb 2013 18:20:47 -0500 Subject: [pypy-dev] Helping with STM at the PyCon 2013 (Santa Clara) sprints In-Reply-To: References: Message-ID: That sounds like a reasonable place to start, thanks! I tried running the translation, and immediately hit what looks like a failure from merging in default, due to the pypy/rpython move. I've got a patch currently pushing to bitbucket, but it'll be a few minutes (pushing ~10 months of pypy dev effort). It'll be at https://bitbucket.org/taavi_burns/pypy/commits/0378c78cc316 when the push finishes. 
:) The translation still eventually fails, though: [translation:ERROR] File "../../rpython/translator/stm/jitdriver.py", line 86, in check_jitdriver [translation:ERROR] assert not jitdriver.autoreds # XXX [translation:ERROR] AssertionError Full stack and software versions: https://gist.github.com/taavi/4949322 Any ideas? Thanks! On Wed, Feb 13, 2013 at 11:16 AM, Armin Rigo wrote: > Hi Taavi, > > On Wed, Feb 13, 2013 at 4:56 PM, Taavi Burns wrote: >> From the list above, are there any particular areas (tickets?) that >> would be a good starting place for me to look at? > > I can't just give you a specific task to do, but you can try to > understand what is here so far. Look at the branch "stm-thread-2" on > the pypy repository; e.g. try to translate with "rpython -O2 --stm > targetpypystandalone". This gives you a kind-of-GIL-less PyPy. Try > to use the transaction module ("import transaction") on some demo > programs. Then I suppose you should dive into the mess that is > multithreaded programming by looking in depth at > lib_pypy/transaction.py. And this is all before diving into the PyPy > sources themselves... > > You may also look at the work done by Remi Meier on his own separate > repository (https://bitbucket.org/Raemi/pypy-stm-logging). It > contains mostly playing around with various ideas that haven't been > integrated back, or not yet. > > > A bient?t, > > Armin. -- taa /*eof*/ From tismer at stackless.com Thu Feb 14 00:49:19 2013 From: tismer at stackless.com (Christian Tismer) Date: Thu, 14 Feb 2013 00:49:19 +0100 Subject: [pypy-dev] [Python-Dev] efficient string concatenation (yep, from 2004) In-Reply-To: References: <511AD5E1.1060209@stackless.com> Message-ID: <0936978B-D261-4C90-ABD5-13F7BD5668DE@stackless.com> Hi Lennart, Sent from my Ei4Steve On Feb 13, 2013, at 8:42, Lennart Regebro wrote: >> Something is needed - a patch for PyPy or for the documentation I guess. > > Not arguing that it wouldn't be good, but I disagree that it is needed. 
> > This is only an issue when you, as in your proof, have a loop that > does concatenation. This is usually when looping over a list of > strings that should be concatenated together. Doing so in a loop with > concatenation may be the natural way for people new to Python, but the > "natural" way to do it in Python is with a ''.join() call. > > This: > > s = ''.join(('X' for x in xrange(x))) > > Is more than twice as fast in Python 2.7 than your example. It is in > fact also slower in PyPy 1.9 than Python 2.7, but only with a factor > of two: > > Python 2.7: > time for 10000000 concats = 0.887 > Pypy 1.9: > time for 10000000 concats = 1.600 > > (And of course s = 'X'* x takes only about a hundredth of the time, > but that's cheating. ;-) > > //Lennart This all does not really concern me, as long as it roughly has the same order of magnitude, or better the same big Oh. I'm not concerned by a constant factor. I'm concerned by a freezing machine that suddenly gets 10000 times slower because the algorithms never explicitly state their algorithmic complexity. (I think I said this too often today?) As a side note: Something similar happened to me when somebody used "range" in Python 3.3. He ran the same code on Python 2.7, with a crazy effect of having to re-boot: range() on 2.7 with arguments from some arbitrary input file. A newbie error that was hard to understand, because he was taught to think 'xrange' when writing 'range'. Hard for me to understand because I am no longer able to make these errors at all, or even expect them.
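The range-vs-xrange trap described above is a memory-complexity problem: Python 2's range(n) builds the whole list up front, while xrange(n) — like Python 3's range — is a constant-size lazy object. A quick way to quantify the difference, written here for Python 3:

```python
import sys

n = 10**6
lazy = range(n)           # Python 3 range (or Python 2 xrange): a small object
eager = list(range(n))    # what Python 2's range(n) materialized eagerly

print(sys.getsizeof(lazy))            # a few dozen bytes, independent of n
print(sys.getsizeof(eager) > 10**6)   # True: the list alone is megabytes
```

With n read from "some arbitrary input file", the eager version's memory use is unbounded — which is exactly the freezing-machine failure mode, while the lazy version stays O(1) no matter what n is.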
Cheers - Chris From steve at pearwood.info Thu Feb 14 00:59:37 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 14 Feb 2013 10:59:37 +1100 Subject: [pypy-dev] [Python-Dev] efficient string concatenation (yep, from 2004) In-Reply-To: <20130213152200.GA19918@fridge.pov.lt> References: <511AD5E1.1060209@stackless.com> <511B82B2.2000505@pearwood.info> <20130213152200.GA19918@fridge.pov.lt> Message-ID: <511C28E9.9030707@pearwood.info> On 14/02/13 02:22, Marius Gedminas wrote: > On Wed, Feb 13, 2013 at 11:10:26PM +1100, Steven D'Aprano wrote: >> Best practice remains the same: >> >> - we should still use join for repeated concatenations; >> >> - we should still avoid + except for small cases which are not performance critical; >> >> - we should still teach beginners to use join; >> >> - while this optimization is nice to have, we cannot rely on it being there >> when it matters. >> >> It's not just Jython and IronPython that can't make use of this optimization. It >> can, and does, fail on CPython as well, as it is sensitive to memory >> allocation details. See for example: >> >> http://utcc.utoronto.ca/~cks/space/blog/python/ExaminingStringConcatOpt >> >> and here for a cautionary tale about what can happen when the optimization fails >> under CPython: >> >> http://mail.python.org/pipermail/python-dev/2009-August/091125.html > > Is that the right link? This thread doesn't mention +=, the bug > mentioned in the first email doesn't mention +=, and the fix mentioned > for that bug appears to be "let's not pass 0 as the buffer size of > sock.makefile()". > > Did I skim too much? Yes you skimmed too much :-) The point is that use of += can cause *platform specific* O(N**2) performance in things which don't obviously involve string concatenation. It would be nice if real world bugs were as simple as "I concatenate a lot of strings with += and it's slow, how do I fix it?" 
but in this case the bug report was that httplib was an order of magnitude or more slower on Windows than Linux. The thread continues into the following month. Here's the first pointer to the problem: http://mail.python.org/pipermail/python-dev/2009-September/091582.html and a note that platform differences in realloc matter: http://mail.python.org/pipermail/python-dev/2009-September/091583.html and solution: http://mail.python.org/pipermail/python-dev/2009-September/091588.html -- Steven From steve at pearwood.info Thu Feb 14 01:35:34 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 14 Feb 2013 11:35:34 +1100 Subject: [pypy-dev] [Python-Dev] efficient string concatenation (yep, from 2004) In-Reply-To: References: <511AD5E1.1060209@stackless.com> <511B81C3.9030905@stackless.com> Message-ID: <511C3156.8060902@pearwood.info> On 14/02/13 01:44, Nick Coghlan wrote: > Deliberately *relying* on the += hack to avoid quadratic runtime is > just plain wrong, and our documentation already says so. +1 I'm not sure that there's any evidence that people in general are *relying* on the += hack. More likely they write the first code they think of, which is +=, and never considered the consequences or test it thoroughly. Even if they test it, they only test it on one version of one implementation on one platform, and likely only with small N. Besides, if you know that N will always be small, then using += is not wrong. I think we have a tendency to sometimes overreact in cases like this. I don't think we need to do any more than we are already doing: the tutor@ and python-list@ mailing lists already try to educate users to use join, the docs recommend to use join, the Internet is filled with code that correctly uses join. What more can we do? I see no evidence that the Python community is awash with coders who write code with atrocious performance characteristics, or at least no more than any other language. 
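The quadratic behaviour the whole thread turns on can be seen without timing anything: when the in-place optimization does not apply, each s = s + 'X' copies the entire accumulated string, so the total work grows like N**2/2, whereas ''.join copies every character exactly once. A small counting model of those copy costs (not a benchmark, and ignoring constant factors):

```python
# Count the characters copied by N repeated concatenations versus one join.
def copied_chars_concat(n):
    copied = 0
    length = 0
    for _ in range(n):
        length += 1
        copied += length   # s = s + 'X' copies the len(s)+1 result chars
    return copied

def copied_chars_join(n):
    return n               # ''.join writes each character once

print(copied_chars_concat(1000))  # 500500 -- grows like n**2/2
print(copied_chars_join(1000))    # 1000   -- grows linearly
```

This is why the advice in the thread is implementation-independent: the += hack can hide the quadratic term on one build of one interpreter, but join's linear cost holds everywhere.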
-- Steven From fijall at gmail.com Thu Feb 14 17:04:46 2013 From: fijall at gmail.com (Maciej Fijalkowski) Date: Thu, 14 Feb 2013 18:04:46 +0200 Subject: [pypy-dev] [pypy-commit] pypy jitframe-on-heap: start fixing call_assembler for ARM In-Reply-To: <20130214154412.6EE461C00BD@cobra.cs.uni-duesseldorf.de> References: <20130214154412.6EE461C00BD@cobra.cs.uni-duesseldorf.de> Message-ID: Hi David. I started working on this, but also this is again a direct copy of x86 code. I understand where it comes from but please refrain from doing it in the future. I will work out the call_assembler and friends, but I'll backout your commit. On Thu, Feb 14, 2013 at 5:44 PM, bivab wrote: > Author: David Schneider > Branch: jitframe-on-heap > Changeset: r61238:1dd0aa6c631a > Date: 2013-02-14 16:43 +0100 > http://bitbucket.org/pypy/pypy/changeset/1dd0aa6c631a/ > > Log: start fixing call_assembler for ARM > > floats do not work correctly yet > > diff --git a/rpython/jit/backend/arm/opassembler.py b/rpython/jit/backend/arm/opassembler.py > --- a/rpython/jit/backend/arm/opassembler.py > +++ b/rpython/jit/backend/arm/opassembler.py > @@ -1104,59 +1104,52 @@ > # XXX Split into some helper methods > def emit_guard_call_assembler(self, op, guard_op, arglocs, regalloc, > fcond): > - tmploc = arglocs[1] > - resloc = arglocs[2] > - callargs = arglocs[3:] > - > self._store_force_index(guard_op) > descr = op.getdescr() > assert isinstance(descr, JitCellToken) > - # check value > - assert tmploc is r.r0 > - self._emit_call(imm(descr._ll_function_addr), > - callargs, fcond, resloc=tmploc) > + if len(arglocs) == 4: > + [frame_loc, vloc, result_loc, argloc] = arglocs > + else: > + [frame_loc, result_loc, argloc] = arglocs > + vloc = imm(0) > + > + # > + # Write a call to the target assembler > + # we need to allocate the frame, keep in sync with runner's > + # execute_token > + jd = descr.outermost_jitdriver_sd > + base_ofs = self.cpu.get_baseofs_of_frame_field() > + 
self._emit_call(imm(descr._ll_function_addr), [argloc], fcond) > if op.result is None: > - value = self.cpu.done_with_this_frame_void_v > + assert result_loc is None > + value = self.cpu.done_with_this_frame_descr_void > else: > kind = op.result.type > if kind == INT: > - value = self.cpu.done_with_this_frame_int_v > + assert result_loc is r.r0 > + value = self.cpu.done_with_this_frame_descr_int > elif kind == REF: > - value = self.cpu.done_with_this_frame_ref_v > + assert result_loc is r.r0 > + value = self.cpu.done_with_this_frame_descr_ref > elif kind == FLOAT: > - value = self.cpu.done_with_this_frame_float_v > + value = self.cpu.done_with_this_frame_descr_float > else: > raise AssertionError(kind) > - from rpython.jit.backend.llsupport.descr import unpack_fielddescr > - from rpython.jit.backend.llsupport.descr import unpack_interiorfielddescr > - descrs = self.cpu.gc_ll_descr.getframedescrs(self.cpu) > - _offset, _size, _ = unpack_fielddescr(descrs.jf_descr) > - fail_descr = self.cpu.get_fail_descr_from_number(value) > - value = fail_descr.hide(self.cpu) > - rgc._make_sure_does_not_move(value) > - value = rffi.cast(lltype.Signed, value) > > - if check_imm_arg(_offset): > - self.mc.LDR_ri(r.ip.value, tmploc.value, imm=_offset) > - else: > - self.mc.gen_load_int(r.ip.value, _offset) > - self.mc.LDR_rr(r.ip.value, tmploc.value, r.ip.value) > + > + gcref = cast_instance_to_gcref(value) > + rgc._make_sure_does_not_move(gcref) > + value = rffi.cast(lltype.Signed, gcref) > + ofs = self.cpu.get_ofs_of_frame_field('jf_descr') > + assert check_imm_arg(ofs) > + self.mc.LDR_ri(r.ip.value, r.r0.value, imm=ofs) > + > if check_imm_arg(value): > self.mc.CMP_ri(r.ip.value, imm=value) > else: > self.mc.gen_load_int(r.lr.value, value) > self.mc.CMP_rr(r.lr.value, r.ip.value) > - > - > - #if values are equal we take the fast path > - # Slow path, calling helper > - # jump to merge point > - > - jd = descr.outermost_jitdriver_sd > - assert jd is not None > - > - # Path A: load 
return value and reset token > - # Fast Path using result boxes > + # Path 1: Fast Path > > fast_path_cond = c.EQ > # Reset the vable token --- XXX really too much special logic here:-( > @@ -1164,45 +1157,40 @@ > from rpython.jit.backend.llsupport.descr import FieldDescr > fielddescr = jd.vable_token_descr > assert isinstance(fielddescr, FieldDescr) > - ofs = fielddescr.offset > - tmploc = regalloc.get_scratch_reg(INT) > - self.mov_loc_loc(arglocs[0], r.ip, cond=fast_path_cond) > - self.mc.MOV_ri(tmploc.value, 0, cond=fast_path_cond) > - self.mc.STR_ri(tmploc.value, r.ip.value, ofs, cond=fast_path_cond) > + assert isinstance(fielddescr, FieldDescr) > + vtoken_ofs = fielddescr.offset > + assert check_imm_arg(vtoken_ofs) > + self.mov_loc_loc(vloc, r.ip, cond=fast_path_cond) > + self.mc.MOV_ri(r.lr.value, 0, cond=fast_path_cond) > + self.mc.STR_ri(tmploc.value, r.ip.value, vtoken_ofs, cond=fast_path_cond) > + # in the line above, TOKEN_NONE = 0 > > if op.result is not None: > - # load the return value from fail_boxes_xxx[0] > + # load the return value from the dead frame's value index 0 > kind = op.result.type > if kind == FLOAT: > - t = unpack_interiorfielddescr(descrs.as_float)[0] > - if not check_imm_arg(t): > - self.mc.gen_load_int(r.ip.value, t, cond=fast_path_cond) > + descr = self.cpu.getarraydescr_for_frame(kind) > + ofs = self.cpu.unpack_arraydescr(descr) > + if not check_imm_arg(ofs): > + self.mc.gen_load_int(r.ip.value, ofs, cond=fast_path_cond) > self.mc.ADD_rr(r.ip.value, r.r0.value, r.ip.value, > cond=fast_path_cond) > - t = 0 > + ofs = 0 > base = r.ip > else: > base = r.r0 > - self.mc.VLDR(resloc.value, base.value, imm=t, > - cond=fast_path_cond) > + self.mc.VLDR(result_loc.value, base.value, imm=ofs, > + cond=fast_path_cond) > else: > - assert resloc is r.r0 > - if kind == INT: > - t = unpack_interiorfielddescr(descrs.as_int)[0] > - else: > - t = unpack_interiorfielddescr(descrs.as_ref)[0] > - if not check_imm_arg(t): > - 
self.mc.gen_load_int(r.ip.value, t, cond=fast_path_cond) > - self.mc.LDR_rr(resloc.value, resloc.value, r.ip.value, > - cond=fast_path_cond) > - else: > - self.mc.LDR_ri(resloc.value, resloc.value, imm=t, > - cond=fast_path_cond) > + assert result_loc is r.r0 > + descr = self.cpu.getarraydescr_for_frame(kind) > + ofs = self.cpu.unpack_arraydescr(descr) > + self.load_reg(self.mc, r.r0, r.r0, ofs, cond=fast_path_cond) > # jump to merge point > jmp_pos = self.mc.currpos() > self.mc.BKPT() > > - # Path B: use assembler helper > + # Path 2: use assembler helper > asm_helper_adr = self.cpu.cast_adr_to_int(jd.assembler_helper_adr) > if self.cpu.supports_floats: > floats = r.caller_vfp_resp > @@ -1213,28 +1201,24 @@ > # the result > core = r.caller_resp > if op.result: > - if resloc.is_vfp_reg(): > + if result_loc.is_vfp_reg(): > floats = r.caller_vfp_resp[1:] > else: > core = r.caller_resp[1:] + [r.ip] # keep alignment > with saved_registers(self.mc, core, floats): > # result of previous call is in r0 > - self.mov_loc_loc(arglocs[0], r.r1) > + self.mov_loc_loc(vloc, r.r1) > self.mc.BL(asm_helper_adr) > - if not self.cpu.use_hf_abi and op.result and resloc.is_vfp_reg(): > + if not self.cpu.use_hf_abi and op.result and result_loc.is_vfp_reg(): > # move result to the allocated register > - self.mov_to_vfp_loc(r.r0, r.r1, resloc) > + self.mov_to_vfp_loc(r.r0, r.r1, result_loc) > > # merge point > currpos = self.mc.currpos() > pmc = OverwritingBuilder(self.mc, jmp_pos, WORD) > pmc.B_offs(currpos, fast_path_cond) > > - self.mc.LDR_ri(r.ip.value, r.fp.value) > - self.mc.CMP_ri(r.ip.value, 0) > - > - self._emit_guard(guard_op, regalloc._prepare_guard(guard_op), > - c.GE, save_exc=True) > + self._emit_guard_may_force(guard_op, arglocs, op.numargs()) > return fcond > > # ../x86/assembler.py:668 > diff --git a/rpython/jit/backend/arm/regalloc.py b/rpython/jit/backend/arm/regalloc.py > --- a/rpython/jit/backend/arm/regalloc.py > +++ b/rpython/jit/backend/arm/regalloc.py > @@ -1101,20 
+1101,18 @@ > prepare_guard_call_release_gil = prepare_guard_call_may_force > > def prepare_guard_call_assembler(self, op, guard_op, fcond): > + > descr = op.getdescr() > assert isinstance(descr, JitCellToken) > - jd = descr.outermost_jitdriver_sd > - assert jd is not None > - vable_index = jd.index_of_virtualizable > - if vable_index >= 0: > - self._sync_var(op.getarg(vable_index)) > - vable = self.frame_manager.loc(op.getarg(vable_index)) > + arglist = op.getarglist() > + self.rm._sync_var(arglist[0]) > + frame_loc = self.fm.loc(op.getarg(0)) > + if len(arglist) == 2: > + self.rm._sync_var(arglist[1]) > + locs = [frame_loc, self.fm.loc(arglist[1])] > else: > - vable = imm(0) > - # make sure the call result location is free > - tmploc = self.get_scratch_reg(INT, selected_reg=r.r0) > - self.possibly_free_vars(guard_op.getfailargs()) > - return [vable, tmploc] + self._prepare_call(op, save_all_regs=True) > + locs = [frame_loc] > + return locs + self._prepare_call(op, save_all_regs=True) > > def _prepare_args_for_new_op(self, new_args): > gc_ll_descr = self.cpu.gc_ll_descr > _______________________________________________ > pypy-commit mailing list > pypy-commit at python.org > http://mail.python.org/mailman/listinfo/pypy-commit From wizzat at gmail.com Thu Feb 14 22:33:02 2013 From: wizzat at gmail.com (Mark Roberts) Date: Thu, 14 Feb 2013 13:33:02 -0800 Subject: [pypy-dev] Generator leaks Message-ID: On Tue, Feb 12, 2013 at 10:14 AM, wrote: > Hi pypy-dev (and hi armin :) > > Quick question - do we make https://bugs.pypy.org/issue1282 a release > blocker? As far as I understand this is a chain-of-destructors > scenario. Can we do better than wait N gc.collects? If not, can we fix > generators? > > Cheers, > fijal Making sure this is fixed would greatly help some of my code. Is it possible to flag these as a chain for collection in a single pass? 
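The issue being discussed (1282) concerns generator finalizers that only run when the GC eventually collects the suspended frame, and a chain of such generators can need one collection per link. Code that cannot afford to wait has a workaround: close generators explicitly. A small sketch of both paths — on CPython the del path runs the finalizer promptly via reference counting, while on PyPy it may take one or more gc.collect() calls:

```python
import gc

log = []

def gen():
    try:
        yield 1
    finally:
        log.append('closed')   # runs only when the frame is finalized

g = gen()
next(g)          # suspend inside the try block
del g            # finalizer timing now depends on the GC...
for _ in range(5):
    gc.collect() # ...several passes, to cover chained-finalizer GCs
assert log == ['closed']

# Explicit close() sidesteps the GC entirely:
log2 = []

def gen2():
    try:
        yield 1
    finally:
        log2.append('closed')

g2 = gen2()
next(g2)
g2.close()       # deterministic cleanup, no collection needed
assert log2 == ['closed']
```

Wrapping the consumer in a try/finally (or contextlib.closing) that calls close() is the portable way to get deterministic cleanup regardless of how many collections the chain would otherwise require.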
-Mark From jameslan at gmail.com Fri Feb 15 08:14:07 2013 From: jameslan at gmail.com (James Lan) Date: Thu, 14 Feb 2013 23:14:07 -0800 Subject: [pypy-dev] test_all.py can't recognize has multi inherited classes Message-ID: Hi All, I tried to run test cases for the jvm backend by the following command, pypy pypy/test_all.py rpython/translator/jvm/test but what I got was 10 tests defined in rpython/translator/jvm/test/test_backendopt.py although there are two dozen test files in that directory. After experimenting, I found that it failed to collect tests within a multi-inherited class. Take test_constant.py as an example, it defines a class, class TestConstant(BaseTestConstant, JvmTest): pass and JvmTest is defined as, class JvmTest(BaseRtypingTest, OORtypeMixin): ...... When I executed the following, pypy pypy/test_all.py rpython/translator/jvm/test/test_constant.py --collectonly it reported 0 items were collected. If I remove BaseTestConstant from TestConstant's super classes, it still reports 0 items collected; but if I remove JvmTest from its super classes, it reports 18 items are collected. I'm wondering how the daily tests solve this problem? Thanks, James Lan -------------- next part -------------- An HTML attachment was scrubbed...
URL: From Ronny.Pfannschmidt at gmx.de Fri Feb 15 08:34:25 2013 From: Ronny.Pfannschmidt at gmx.de (Ronny Pfannschmidt) Date: Fri, 15 Feb 2013 08:34:25 +0100 Subject: [pypy-dev] test_all.py can't recognize has multi inherited classes In-Reply-To: References: Message-ID: <511DE501.4080707@gmx.de> Hi James, Interesting find, i can replicate and i don't yet see how it happens, because the code pytest uses does walk the mro to find tests in parent classes i will report back once i figured whats wrong -- Ronny On 02/15/2013 08:14 AM, James Lan wrote: > Hi All, > > I tried to run test cases for jvm backend by the following command, > > > pypy pypy/test_all.py rpython/translator/jvm/test > > > but what I got was 10 tests defined in > rpython/translator/jvm/test/test_backendopt.py although there are two > dozens of test files in that directory. > > After experiment, I found that it failed to collect test within a multi > inherited class. Take test_constant.py as an example, it defines a class, > > > class TestConstant(BaseTestConstant, JvmTest): > pass > > > and JvmTest is defined as, > > > class JvmTest(BaseRtypingTest, OORtypeMixin): > ...... > > > When I executed the folloing, > > > ypy pypy/test_all.py rpython/translator/jvm/test/test_constant.py > --collectonly > > > it reported 0 item was collected. If I remove BaseTestConstant from > TestConstant's super classes, it still reports 0 item collected; but if > I remove JvmTest from its super classes, it reports 18 items are collected. > > > I'm wondering how the daily test solve this problem? 
> > Thanks, > James Lan > > > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > http://mail.python.org/mailman/listinfo/pypy-dev From Ronny.Pfannschmidt at gmx.de Fri Feb 15 08:45:44 2013 From: Ronny.Pfannschmidt at gmx.de (Ronny Pfannschmidt) Date: Fri, 15 Feb 2013 08:45:44 +0100 Subject: [pypy-dev] test_all.py can't recognize has multi inherited classes In-Reply-To: <511DE501.4080707@gmx.de> References: <511DE501.4080707@gmx.de> Message-ID: <511DE7A8.7090803@gmx.de> i confirmed the issue - its in the included pytest and its fixed upstream its going to be fixed in the pytest branch of pypy later On 02/15/2013 08:34 AM, Ronny Pfannschmidt wrote: > Hi James, > > > Interesting find, > > i can replicate and i don't yet see how it happens, > because the code pytest uses does walk the mro to find tests in parent > classes > > i will report back once i figured whats wrong > > -- Ronny > > > On 02/15/2013 08:14 AM, James Lan wrote: >> Hi All, >> >> I tried to run test cases for jvm backend by the following command, >> >> >> pypy pypy/test_all.py rpython/translator/jvm/test >> >> >> but what I got was 10 tests defined in >> rpython/translator/jvm/test/test_backendopt.py although there are two >> dozens of test files in that directory. >> >> After experiment, I found that it failed to collect test within a multi >> inherited class. Take test_constant.py as an example, it defines a class, >> >> >> class TestConstant(BaseTestConstant, JvmTest): >> pass >> >> >> and JvmTest is defined as, >> >> >> class JvmTest(BaseRtypingTest, OORtypeMixin): >> ...... >> >> >> When I executed the folloing, >> >> >> ypy pypy/test_all.py rpython/translator/jvm/test/test_constant.py >> --collectonly >> >> >> it reported 0 item was collected. If I remove BaseTestConstant from >> TestConstant's super classes, it still reports 0 item collected; but if >> I remove JvmTest from its super classes, it reports 18 items are >> collected. 
>> >> >> I'm wondering how the daily test solve this problem? >> >> Thanks, >> James Lan >> >> >> _______________________________________________ >> pypy-dev mailing list >> pypy-dev at python.org >> http://mail.python.org/mailman/listinfo/pypy-dev > > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > http://mail.python.org/mailman/listinfo/pypy-dev From fijall at gmail.com Fri Feb 15 08:55:03 2013 From: fijall at gmail.com (Maciej Fijalkowski) Date: Fri, 15 Feb 2013 09:55:03 +0200 Subject: [pypy-dev] Generator leaks In-Reply-To: References: Message-ID: On Thu, Feb 14, 2013 at 11:33 PM, Mark Roberts wrote: > On Tue, Feb 12, 2013 at 10:14 AM, wrote: >> Hi pypy-dev (and hi armin :) >> >> Quick question - do we make https://bugs.pypy.org/issue1282 a release >> blocker? As far as I understand this is a chain-of-destructors >> scenario. Can we do better than wait N gc.collects? If not, can we fix >> generators? >> >> Cheers, >> fijal > > Making sure this is fixed would greatly help some of my code. Is it > possible to flag these as a chain for collection in a single pass? > > -Mark We're working on it (or at least actively thinking) From fijall at gmail.com Fri Feb 15 08:55:41 2013 From: fijall at gmail.com (Maciej Fijalkowski) Date: Fri, 15 Feb 2013 09:55:41 +0200 Subject: [pypy-dev] test_all.py can't recognize has multi inherited classes In-Reply-To: <511DE7A8.7090803@gmx.de> References: <511DE501.4080707@gmx.de> <511DE7A8.7090803@gmx.de> Message-ID: On Fri, Feb 15, 2013 at 9:45 AM, Ronny Pfannschmidt wrote: > i confirmed the issue - its in the included pytest and its fixed upstream > > its going to be fixed in the pytest branch of pypy later Can't you fix it on default? 
> > > > > On 02/15/2013 08:34 AM, Ronny Pfannschmidt wrote: >> >> Hi James, >> >> >> Interesting find, >> >> i can replicate and i don't yet see how it happens, >> because the code pytest uses does walk the mro to find tests in parent >> classes >> >> i will report back once i figured whats wrong >> >> -- Ronny >> >> >> On 02/15/2013 08:14 AM, James Lan wrote: >>> >>> Hi All, >>> >>> I tried to run test cases for jvm backend by the following command, >>> >>> >>> pypy pypy/test_all.py rpython/translator/jvm/test >>> >>> >>> but what I got was 10 tests defined in >>> rpython/translator/jvm/test/test_backendopt.py although there are two >>> dozens of test files in that directory. >>> >>> After experiment, I found that it failed to collect test within a multi >>> inherited class. Take test_constant.py as an example, it defines a class, >>> >>> >>> class TestConstant(BaseTestConstant, JvmTest): >>> pass >>> >>> >>> and JvmTest is defined as, >>> >>> >>> class JvmTest(BaseRtypingTest, OORtypeMixin): >>> ...... >>> >>> >>> When I executed the folloing, >>> >>> >>> ypy pypy/test_all.py rpython/translator/jvm/test/test_constant.py >>> --collectonly >>> >>> >>> it reported 0 item was collected. If I remove BaseTestConstant from >>> TestConstant's super classes, it still reports 0 item collected; but if >>> I remove JvmTest from its super classes, it reports 18 items are >>> collected. >>> >>> >>> I'm wondering how the daily test solve this problem? 
>>> >>> Thanks, >>> James Lan >>> >>> >>> _______________________________________________ >>> pypy-dev mailing list >>> pypy-dev at python.org >>> http://mail.python.org/mailman/listinfo/pypy-dev >> >> >> _______________________________________________ >> pypy-dev mailing list >> pypy-dev at python.org >> http://mail.python.org/mailman/listinfo/pypy-dev > > > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > http://mail.python.org/mailman/listinfo/pypy-dev From Ronny.Pfannschmidt at gmx.de Fri Feb 15 09:01:55 2013 From: Ronny.Pfannschmidt at gmx.de (Ronny Pfannschmidt) Date: Fri, 15 Feb 2013 09:01:55 +0100 Subject: [pypy-dev] test_all.py can't recognize has multi inherited classes In-Reply-To: References: <511DE501.4080707@gmx.de> <511DE7A8.7090803@gmx.de> Message-ID: <511DEB73.2040602@gmx.de> On 02/15/2013 08:55 AM, Maciej Fijalkowski wrote: > On Fri, Feb 15, 2013 at 9:45 AM, Ronny Pfannschmidt > wrote: >> i confirmed the issue - its in the included pytest and its fixed upstream >> >> its going to be fixed in the pytest branch of pypy later > > Can't you fix it on default? i need to investigate the effects of the update first > >> >> >> >> >> On 02/15/2013 08:34 AM, Ronny Pfannschmidt wrote: >>> >>> Hi James, >>> >>> >>> Interesting find, >>> >>> i can replicate and i don't yet see how it happens, >>> because the code pytest uses does walk the mro to find tests in parent >>> classes >>> >>> i will report back once i figured whats wrong >>> >>> -- Ronny >>> >>> >>> On 02/15/2013 08:14 AM, James Lan wrote: >>>> >>>> Hi All, >>>> >>>> I tried to run test cases for jvm backend by the following command, >>>> >>>> >>>> pypy pypy/test_all.py rpython/translator/jvm/test >>>> >>>> >>>> but what I got was 10 tests defined in >>>> rpython/translator/jvm/test/test_backendopt.py although there are two >>>> dozens of test files in that directory. 
>>>> >>>> After experiment, I found that it failed to collect test within a multi >>>> inherited class. Take test_constant.py as an example, it defines a class, >>>> >>>> >>>> class TestConstant(BaseTestConstant, JvmTest): >>>> pass >>>> >>>> >>>> and JvmTest is defined as, >>>> >>>> >>>> class JvmTest(BaseRtypingTest, OORtypeMixin): >>>> ...... >>>> >>>> >>>> When I executed the folloing, >>>> >>>> >>>> ypy pypy/test_all.py rpython/translator/jvm/test/test_constant.py >>>> --collectonly >>>> >>>> >>>> it reported 0 item was collected. If I remove BaseTestConstant from >>>> TestConstant's super classes, it still reports 0 item collected; but if >>>> I remove JvmTest from its super classes, it reports 18 items are >>>> collected. >>>> >>>> >>>> I'm wondering how the daily test solve this problem? >>>> >>>> Thanks, >>>> James Lan >>>> From gherman at darwin.in-berlin.de Sat Feb 16 22:48:18 2013 From: gherman at darwin.in-berlin.de (Dinu Gherman) Date: Sat, 16 Feb 2013 22:48:18 +0100 Subject: [pypy-dev] Release date for PyPy 2.0 beta 2? Message-ID: <72089605-4607-4C44-8AA9-75D1B277A1C1@darwin.in-berlin.de> Hi, I'm trying to make some performance comparisons using various tools like CPython, Cython, PyPy and Numba as described in an exercise I've put up here for a presentation (a tiny function generating digits of Pi): https://gist.github.com/deeplook/4947835 For this code PyPy 1.9 shows around 50 % of the performance of CPython. Christian Tismer tells me 2.0 beta 1 was much better, but I'm running into a bug for PyPy 2.0 beta 1 already described here two months ago: https://bugs.pypy.org/issue1350 So... is there any estimate for the release date of 2.0 beta 2? Thanks, Dinu From fijall at gmail.com Sat Feb 16 23:39:30 2013 From: fijall at gmail.com (Maciej Fijalkowski) Date: Sun, 17 Feb 2013 00:39:30 +0200 Subject: [pypy-dev] Release date for PyPy 2.0 beta 2? 
In-Reply-To: <72089605-4607-4C44-8AA9-75D1B277A1C1@darwin.in-berlin.de> References: <72089605-4607-4C44-8AA9-75D1B277A1C1@darwin.in-berlin.de> Message-ID: On Sat, Feb 16, 2013 at 11:48 PM, Dinu Gherman wrote: > Hi, > > I'm trying to make some performance comparisons using various tools > like CPython, Cython, PyPy and Numba as described in an exercise I've > put up here for a presentation (a tiny function generating digits of > Pi): https://gist.github.com/deeplook/4947835 > For this code PyPy 1.9 shows around 50 % of the performance of CPython. > > Christian Tismer tells me 2.0 beta 1 was much better, but I'm running > into a bug for PyPy 2.0 beta 1 already described here two months ago: > https://bugs.pypy.org/issue1350 > > So... is there any estimate for the release date of 2.0 beta 2? > > Thanks, > > Dinu > > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > http://mail.python.org/mailman/listinfo/pypy-dev Hi Dinu, just run nightly http://buildbot.pypy.org/nightly/trunk/ ideally, also don't benchmark on OS X, it's a system that has lots of strange problems. For what is worth, you picked up a very terrible program - this program exercises long implementation, not "how fast you run python programs". A new pypy is ~30% slower than cpython, which we find reasonable (because it's runtime time), if you want faster pick gmpy. GMP however has no means of recovering from a MemoryError. How do you want to benchmark python compilers on this? Can anyone do something more sensible than just call operations on longs? Cheers, fijal From arigo at tunes.org Sun Feb 17 09:06:50 2013 From: arigo at tunes.org (Armin Rigo) Date: Sun, 17 Feb 2013 09:06:50 +0100 Subject: [pypy-dev] Release date for PyPy 2.0 beta 2? In-Reply-To: References: <72089605-4607-4C44-8AA9-75D1B277A1C1@darwin.in-berlin.de> Message-ID: Hi Fijal, On Sat, Feb 16, 2013 at 11:39 PM, Maciej Fijalkowski wrote: > How do you want to benchmark python compilers on this? 
Can anyone do > something more sensible than just call operations on longs? In theory it would be possible to queue up common sequences of operations, a bit like you did with numpy's lazy evaluation; e.g. an expression like "z = x * 3 + y" could be processed in only one walk through the digits. Just saying. This is very unlikely to give any performance gain unless the numbers are very large. A bient?t, Armin. From arigo at tunes.org Sun Feb 17 09:58:42 2013 From: arigo at tunes.org (Armin Rigo) Date: Sun, 17 Feb 2013 09:58:42 +0100 Subject: [pypy-dev] Helping with STM at the PyCon 2013 (Santa Clara) sprints In-Reply-To: References: Message-ID: Hi Taavi, I finally fixed pypy-stm with signals. Now I'm getting again results that scale with the number of processors. Note that it stops scaling up at some point, around 4 or 6 threads, on machines I tried it on. I suspect it's related to the fact that physical processors have 4 or 6 cores internally, but the results are still a bit inconsistent. Using the "taskset" command to force the threads to run on particular physical sockets seems to help a little bit with some numbers. Fwiw, I got the maximum throughput on a 24-cores machine by really running 24 threads, but that seems wasteful, as it is only 25% better than running 6 threads on one physical socket. The next step will be trying to reduce the overhead, currently considerable (about 10x slower than CPython, too much to ever have any net benefit). Also high on the list is fixing the constant memory leak (i.e. implementing major garbage collection steps). A bient?t, Armin. From gherman at darwin.in-berlin.de Sun Feb 17 10:30:32 2013 From: gherman at darwin.in-berlin.de (Dinu Gherman) Date: Sun, 17 Feb 2013 10:30:32 +0100 Subject: [pypy-dev] Release date for PyPy 2.0 beta 2? 
In-Reply-To: References: <72089605-4607-4C44-8AA9-75D1B277A1C1@darwin.in-berlin.de> Message-ID: <072669F8-0D5D-4840-A25B-1C34FC48A81F@darwin.in-berlin.de> Maciej Fijalkowski: > http://buildbot.pypy.org/nightly/trunk/ Thanks. > For what is worth, you picked up a very terrible program - this > program exercises long implementation, not "how fast you run python > programs". A new pypy is ~30% slower than cpython, which we find > reasonable (because it's runtime time), if you want faster pick gmpy. > GMP however has no means of recovering from a MemoryError. It was an example from a given context. Clearly, it doesn't show many different features to optimize. > How do you want to benchmark python compilers on this? Can anyone do > something more sensible than just call operations on longs? I compared it also with a version with serialized tuple assignments which shows an improvement of around 2.5 % on CPython, but no real change on PyPy, which is kind of what I hoped. Regards, Dinu From arigo at tunes.org Sun Feb 17 10:43:45 2013 From: arigo at tunes.org (Armin Rigo) Date: Sun, 17 Feb 2013 10:43:45 +0100 Subject: [pypy-dev] Unicode encode/decode speed In-Reply-To: <511A8675.3040001@gmail.com> References: <51117AD1.7060609@gmail.com> <51118D06.6000908@gmail.com> <511912C6.5000201@gmail.com> <5119240C.2000209@gmail.com> <51192C17.7060907@gmail.com> <51197D93.2050201@gmail.com> <511A8675.3040001@gmail.com> Message-ID: Hi, On Tue, Feb 12, 2013 at 7:14 PM, Eleytherios Stamatogiannakis wrote: > Also we are looking into adding a special ffi.string_decode_UTF8 in CFFI's > backend to reduce the number of calls that are needed to go from utf8_char* > to PyPy's unicode. A first note: I'm wondering why you need to convert from utf-8-that-contains-only-ascii, to unicode, and back. What is the point of having unicode strings in the first place? Can't you just pass around your complete program plain non-unicode strings? 
If not, then indeed, it would make (a bit of) sense to have ways to convert directly between "char *" and unicode strings, in both directions, assuming utf-8. This could be done with an API like: ffi.encode_utf8(unicode_string) -> new_char*_cdata ffi.encode_utf8(unicode_string, target_char*_cdata, maximum_length) ffi.decode_utf8(char*_cdata, [length]) -> unicode_string Alternatively, we could accept unicode strings whenever a "char*" is expected and encode it to utf-8, but that sounds a bit too magical. A bient?t, Armin. From fijall at gmail.com Sun Feb 17 10:55:07 2013 From: fijall at gmail.com (Maciej Fijalkowski) Date: Sun, 17 Feb 2013 11:55:07 +0200 Subject: [pypy-dev] Unicode encode/decode speed In-Reply-To: References: <51117AD1.7060609@gmail.com> <51118D06.6000908@gmail.com> <511912C6.5000201@gmail.com> <5119240C.2000209@gmail.com> <51192C17.7060907@gmail.com> <51197D93.2050201@gmail.com> <511A8675.3040001@gmail.com> Message-ID: On Sun, Feb 17, 2013 at 11:43 AM, Armin Rigo wrote: > Hi, > > On Tue, Feb 12, 2013 at 7:14 PM, Eleytherios Stamatogiannakis > wrote: >> Also we are looking into adding a special ffi.string_decode_UTF8 in CFFI's >> backend to reduce the number of calls that are needed to go from utf8_char* >> to PyPy's unicode. > > A first note: I'm wondering why you need to convert from > utf-8-that-contains-only-ascii, to unicode, and back. What is the > point of having unicode strings in the first place? Can't you just > pass around your complete program plain non-unicode strings? > > If not, then indeed, it would make (a bit of) sense to have ways to > convert directly between "char *" and unicode strings, in both > directions, assuming utf-8. 
This could be done with an API like: > > ffi.encode_utf8(unicode_string) -> new_char*_cdata > ffi.encode_utf8(unicode_string, target_char*_cdata, maximum_length) > ffi.decode_utf8(char*_cdata, [length]) -> unicode_string > > Alternatively, we could accept unicode strings whenever a "char*" is > expected and encode it to utf-8, but that sounds a bit too magical. > > > A bient?t, > > Armin. > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > http://mail.python.org/mailman/listinfo/pypy-dev We should add rffi.charp2unicode too From fijall at gmail.com Sun Feb 17 11:00:35 2013 From: fijall at gmail.com (Maciej Fijalkowski) Date: Sun, 17 Feb 2013 12:00:35 +0200 Subject: [pypy-dev] [pypy-commit] pypy jitframe-on-heap: fix for call_assembler with floats. The offset calculation for the arguments to call_assembler in rewrite.py assumes a double-word aligned JITFRAME In-Reply-To: <20130217020025.1B8A11C062C@cobra.cs.uni-duesseldorf.de> References: <20130217020025.1B8A11C062C@cobra.cs.uni-duesseldorf.de> Message-ID: On Sun, Feb 17, 2013 at 4:00 AM, bivab wrote: > Author: David Schneider > Branch: jitframe-on-heap > Changeset: r61341:017892f48c74 > Date: 2013-02-17 02:59 +0100 > http://bitbucket.org/pypy/pypy/changeset/017892f48c74/ > > Log: fix for call_assembler with floats. 
The offset calculation for the > arguments to call_assembler in rewrite.py assumes a double-word > aligned JITFRAME > > diff --git a/rpython/jit/backend/arm/arch.py b/rpython/jit/backend/arm/arch.py > --- a/rpython/jit/backend/arm/arch.py > +++ b/rpython/jit/backend/arm/arch.py > @@ -17,4 +17,4 @@ > # A jitframe is a jit.backend.llsupport.llmodel.jitframe.JITFRAME > # Stack frame fixed area > # Currently only the force_index > -JITFRAME_FIXED_SIZE = 11 + 16 * 2 # 11 GPR + 16 VFP Regs (64bit) > +JITFRAME_FIXED_SIZE = 12 + 16 * 2 # 11 GPR + one word to keep alignment + 16 VFP Regs (64bit) > diff --git a/rpython/jit/backend/llsupport/rewrite.py b/rpython/jit/backend/llsupport/rewrite.py > --- a/rpython/jit/backend/llsupport/rewrite.py > +++ b/rpython/jit/backend/llsupport/rewrite.py > @@ -179,6 +179,9 @@ > for i, arg in enumerate(arglist): > descr = self.cpu.getarraydescr_for_frame(arg.type) > _, itemsize, _ = self.cpu.unpack_arraydescr_size(descr) > + # XXX > + # this calculation breaks for floats on 32 bit if > + # base_ofs of JITFRAME + index * 8 is not double-word aligned > index = index_list[i] // itemsize # index is in bytes > self.newops.append(ResOperation(rop.SETARRAYITEM_GC, > [frame, ConstInt(index), > _______________________________________________ > pypy-commit mailing list > pypy-commit at python.org > http://mail.python.org/mailman/listinfo/pypy-commit This fix is incorrect, I'll do the correct one later today (on GC you have extra word for GC header, so you have to account for that) From taavi.burns at gmail.com Mon Feb 18 00:38:02 2013 From: taavi.burns at gmail.com (Taavi Burns) Date: Sun, 17 Feb 2013 18:38:02 -0500 Subject: [pypy-dev] Helping with STM at the PyCon 2013 (Santa Clara) sprints In-Reply-To: References: Message-ID: That's great, thanks! I did get it to work when you wrote earlier, but it's definitely faster now. 
I tried a ridiculously simple and no-conflict parallel program and came up with this, which gave me some questionable performance numbers from a build of 65ec96e15463: taavi at pypy:~/pypy/pypy/goal$ ./pypy-c -m timeit -s 'import transaction; transaction.set_num_threads(1)' ' def foo(): x = 0 for y in range(100000): x += y transaction.add(foo) transaction.add(foo) transaction.run()' 10 loops, best of 3: 198 msec per loop taavi at pypy:~/pypy/pypy/goal$ ./pypy-c -m timeit -s 'import transaction; transaction.set_num_threads(2)' ' def foo(): x = 0 for y in range(100000): x += y transaction.add(foo) transaction.add(foo) transaction.run()' 10 loops, best of 3: 415 msec per loop It's entirely possible that this is an effect of running inside a VMWare guest (set to use 2 cores) running on my Core2Duo laptop. If this is the case, I'll refrain from trying to do anything remotely like benchmarking in this environment in the future. :) Would it be more helpful (if I want to contribute to STM) to use something like a high-CPU EC2 instance, or should I look at obtaining something like an 8-real-core AMD X8? (my venerable X2 has started to disagree with its RAM, so it's prime for retirement) Thanks! On Sun, Feb 17, 2013 at 3:58 AM, Armin Rigo wrote: > Hi Taavi, > > I finally fixed pypy-stm with signals. Now I'm getting again results > that scale with the number of processors. > > Note that it stops scaling up at some point, around 4 or 6 threads, on > machines I tried it on. I suspect it's related to the fact that > physical processors have 4 or 6 cores internally, but the results are > still a bit inconsistent. Using the "taskset" command to force the > threads to run on particular physical sockets seems to help a little > bit with some numbers. Fwiw, I got the maximum throughput on a > 24-cores machine by really running 24 threads, but that seems > wasteful, as it is only 25% better than running 6 threads on one > physical socket. 
> > The next step will be trying to reduce the overhead, currently > considerable (about 10x slower than CPython, too much to ever have any > net benefit). Also high on the list is fixing the constant memory > leak (i.e. implementing major garbage collection steps). > > > A bient?t, > > Armin. -- taa /*eof*/ From estama at gmail.com Mon Feb 18 12:37:21 2013 From: estama at gmail.com (Eleytherios Stamatogiannakis) Date: Mon, 18 Feb 2013 13:37:21 +0200 Subject: [pypy-dev] Unicode encode/decode speed In-Reply-To: References: <51117AD1.7060609@gmail.com> <51118D06.6000908@gmail.com> <511912C6.5000201@gmail.com> <5119240C.2000209@gmail.com> <51192C17.7060907@gmail.com> <51197D93.2050201@gmail.com> <511A8675.3040001@gmail.com> Message-ID: <51221271.9060907@gmail.com> On 17/02/13 11:43, Armin Rigo wrote: > Hi, > > On Tue, Feb 12, 2013 at 7:14 PM, Eleytherios Stamatogiannakis > wrote: >> Also we are looking into adding a special ffi.string_decode_UTF8 in CFFI's >> backend to reduce the number of calls that are needed to go from utf8_char* >> to PyPy's unicode. > > A first note: I'm wondering why you need to convert from > utf-8-that-contains-only-ascii, to unicode, and back. What is the > point of having unicode strings in the first place? Can't you just > pass around your complete program plain non-unicode strings? > The problem is that SQlite internally uses UTF-8. So you cannot know in advance if the char* that you get from it is plain ASCII or a UTF-8 encoded Unicode. So we end up always converting to Unicode from the char* that SQlite returns. When sending to it, we have different code paths for Python's str() and unicode() string representations. Unfortunately, due to the nature of our data (its multilingual), and to make our life easier when we code our relational operators (written in Python), we always convert to Unicode inside our operators. So the str() path inside the MSPW SQLite wrapper, mostly sits unused. 
> If not, then indeed, it would make (a bit of) sense to have ways to > convert directly between "char *" and unicode strings, in both > directions, assuming utf-8. This could be done with an API like: > > ffi.encode_utf8(unicode_string) -> new_char*_cdata > ffi.encode_utf8(unicode_string, target_char*_cdata, maximum_length) > ffi.decode_utf8(char*_cdata, [length]) -> unicode_string > > Alternatively, we could accept unicode strings whenever a "char*" is > expected and encode it to utf-8, but that sounds a bit too magical. > An API like the one you propose would be very nice, and IMHO would give a substantial speedup. May i suggest, that for generality purposes, the same API functions should also be added for UTF-16, UTF-32 ? Thanks Armin and Maciej for looking into this, l. From taavi.burns at gmail.com Mon Feb 18 14:27:11 2013 From: taavi.burns at gmail.com (Taavi Burns) Date: Mon, 18 Feb 2013 08:27:11 -0500 Subject: [pypy-dev] Helping with STM at the PyCon 2013 (Santa Clara) sprints In-Reply-To: References: Message-ID: I got frustrated with my (actually dying now) local box and signed up for AWS. Using an m1.medium instance to build pypy (~100 minutes), and then upgrading it to a c1.xlarge (claims to be 8 virtual cores of 2.5 ECU each). With the same sample program, I see the expected kinds of speedups! :D So using VMWare is right out. Hopefully that info is useful for someone else in the future. :) On Sun, Feb 17, 2013 at 6:38 PM, Taavi Burns wrote: > That's great, thanks! I did get it to work when you wrote earlier, but > it's definitely faster now. 
> > I tried a ridiculously simple and no-conflict parallel program and > came up with this, which gave me some questionable performance numbers > from a build of 65ec96e15463: > > taavi at pypy:~/pypy/pypy/goal$ ./pypy-c -m timeit -s 'import > transaction; transaction.set_num_threads(1)' ' > def foo(): > x = 0 > for y in range(100000): > x += y > transaction.add(foo) > transaction.add(foo) > transaction.run()' > 10 loops, best of 3: 198 msec per loop > > taavi at pypy:~/pypy/pypy/goal$ ./pypy-c -m timeit -s 'import > transaction; transaction.set_num_threads(2)' ' > def foo(): > x = 0 > for y in range(100000): > x += y > transaction.add(foo) > transaction.add(foo) > transaction.run()' > 10 loops, best of 3: 415 msec per loop > > > It's entirely possible that this is an effect of running inside a > VMWare guest (set to use 2 cores) running on my Core2Duo laptop. If > this is the case, I'll refrain from trying to do anything remotely > like benchmarking in this environment in the future. :) > > Would it be more helpful (if I want to contribute to STM) to use > something like a high-CPU EC2 instance, or should I look at obtaining > something like an 8-real-core AMD X8? > > (my venerable X2 has started to disagree with its RAM, so it's prime > for retirement) > > Thanks! > > On Sun, Feb 17, 2013 at 3:58 AM, Armin Rigo wrote: >> Hi Taavi, >> >> I finally fixed pypy-stm with signals. Now I'm getting again results >> that scale with the number of processors. >> >> Note that it stops scaling up at some point, around 4 or 6 threads, on >> machines I tried it on. I suspect it's related to the fact that >> physical processors have 4 or 6 cores internally, but the results are >> still a bit inconsistent. Using the "taskset" command to force the >> threads to run on particular physical sockets seems to help a little >> bit with some numbers. 
Fwiw, I got the maximum throughput on a >> 24-cores machine by really running 24 threads, but that seems >> wasteful, as it is only 25% better than running 6 threads on one >> physical socket. >> >> The next step will be trying to reduce the overhead, currently >> considerable (about 10x slower than CPython, too much to ever have any >> net benefit). Also high on the list is fixing the constant memory >> leak (i.e. implementing major garbage collection steps). >> >> >> A bientôt, >> >> Armin. > > > > -- > taa > /*eof*/ -- taa /*eof*/ From estama at gmail.com Mon Feb 18 17:20:30 2013 From: estama at gmail.com (Eleytherios Stamatogiannakis) Date: Mon, 18 Feb 2013 18:20:30 +0200 Subject: [pypy-dev] Unicode encode/decode speed (cont) In-Reply-To: References: <51117AD1.7060609@gmail.com> <51118D06.6000908@gmail.com> <511912C6.5000201@gmail.com> <5119240C.2000209@gmail.com> <51192C17.7060907@gmail.com> <51197D93.2050201@gmail.com> <511A8675.3040001@gmail.com> Message-ID: <512254CE.6070501@gmail.com> We have found another (very simple) madIS query where PyPy is around 250x slower than CPython: CPython: 314 msec PyPy: 1 min 16 sec The query, if you would like to test it yourself, is the following: select count(*) from (file 'some_big_text_file.txt' limit 100000); To run it you'll need a big text file containing at least 100000 text lines (we have run the above query with a very big XML file). You can also run the above query with a lower limit (the behaviour will be the same), as such: select count(*) from (file 'some_big_text_file.txt' limit 10000); Be careful that the file does not have a csv, tsv, json, db or gz ending, because a different code path inside the "file" operator will be taken than the one for simple text files. l.
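A stand-alone reduction of the hot path behind that query (plain line-by-line iteration over a text file, which is the code path the "file" operator takes for simple text files) could look like the following sketch; the file name and line count below are made up for illustration:

```python
import os
import tempfile
import time

# Build a throwaway text file standing in for 'some_big_text_file.txt'.
path = os.path.join(tempfile.mkdtemp(), 'some_big_text_file.txt')
with open(path, 'w') as f:
    for i in range(100000):
        f.write('line %d\n' % i)

# The hot loop: count lines, which is all that
# `select count(*) from (file '...' limit 100000)` has to do.
start = time.time()
with open(path) as f:
    count = sum(1 for _line in f)
print('%d lines in %.3f sec' % (count, time.time() - start))
```

Running the same script under CPython and PyPy (and with mode 'rU', which comes up later in the thread) isolates the iteration cost from SQLite and CFFI.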
From fijall at gmail.com Mon Feb 18 17:44:42 2013 From: fijall at gmail.com (Maciej Fijalkowski) Date: Mon, 18 Feb 2013 18:44:42 +0200 Subject: [pypy-dev] Unicode encode/decode speed (cont) In-Reply-To: <512254CE.6070501@gmail.com> References: <51117AD1.7060609@gmail.com> <51118D06.6000908@gmail.com> <511912C6.5000201@gmail.com> <5119240C.2000209@gmail.com> <51192C17.7060907@gmail.com> <51197D93.2050201@gmail.com> <511A8675.3040001@gmail.com> <512254CE.6070501@gmail.com> Message-ID: On Mon, Feb 18, 2013 at 6:20 PM, Eleytherios Stamatogiannakis wrote: > We have found another (very simple) madIS query where PyPy is around 250x > slower that CPython: > > CPython: 314msec > PyPy: 1min 16sec > > The query if you would like to test it yourself is the following: > > select count(*) from (file 'some_big_text_file.txt' limit 100000); > > To run it you'll need some big text file containing at least 100000 text > lines (we have run above query with a very big XML file). You can also run > above query with a lower limit (the behaviour will be the same) as such: > > select count(*) from (file 'some_big_text_file.txt' limit 10000); > > Be careful for the file to not have a csv, tsv, json, db or gz ending > because a different code path inside the "file" operator will be taken than > the one for simple text files. > > l. 
> > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > http://mail.python.org/mailman/listinfo/pypy-dev Hey, it would be incredibly convenient if you could change it into a standalone benchmark (say, reading a large string from a file and decoding it as a whole or in pieces). From arigo at tunes.org Mon Feb 18 18:11:05 2013 From: arigo at tunes.org (Armin Rigo) Date: Mon, 18 Feb 2013 18:11:05 +0100 Subject: [pypy-dev] Unicode encode/decode speed In-Reply-To: <51221271.9060907@gmail.com> References: <51117AD1.7060609@gmail.com> <51118D06.6000908@gmail.com> <511912C6.5000201@gmail.com> <5119240C.2000209@gmail.com> <51192C17.7060907@gmail.com> <51197D93.2050201@gmail.com> <511A8675.3040001@gmail.com> <51221271.9060907@gmail.com> Message-ID: Hi, On Mon, Feb 18, 2013 at 12:37 PM, Eleytherios Stamatogiannakis wrote: > An API like the one you propose would be very nice, and IMHO would give a > substantial speedup. https://bitbucket.org/cffi/cffi/issue/57/shortcuts-to-encode-decode-between-unicode > May i suggest, that for generality purposes, the same API functions should > also be added for UTF-16, UTF-32 ? Well, I'd rather wait for someone to clearly show the purpose of that. As I said in the above issue, modern programs tend to use UTF-8 systematically, unless they are on an OS with a precise notion of wider unicodes (like Windows), in which case Python's own unicode representation matches already and can be used directly in "wchar_t*". A bientôt, Armin.
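The shortcuts under discussion can be pinned down with plain codec calls. Note that `ffi.encode_utf8` and `ffi.decode_utf8` are only names proposed in this thread and the linked issue, not an existing cffi API, so the sketch below merely mirrors what they would compute on the bytes behind a `char*`:

```python
# Sketch of the proposed shortcuts, expressed as plain codec calls.
# The helper names follow the proposal in the thread; they are not
# part of any released cffi.

def decode_utf8(buf, length=None):
    # 'buf' stands in for the bytes behind a char* cdata.
    data = buf if length is None else buf[:length]
    return data.decode('utf-8')

def encode_utf8(text):
    # The result plays the role of the new char* cdata.
    return text.encode('utf-8')

greek = u'\u03b1\u03b2c'       # two Greek letters plus ASCII 'c'
raw = encode_utf8(greek)       # 5 bytes: 2 per Greek letter, 1 for 'c'
assert decode_utf8(raw) == greek
assert decode_utf8(raw, 2) == u'\u03b1'  # explicit length, cut at a byte boundary
```

The point of fusing these into one call, as proposed, would be to avoid materializing the intermediate byte string between the `char*` and the unicode object.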
From amauryfa at gmail.com Mon Feb 18 19:21:17 2013 From: amauryfa at gmail.com (Amaury Forgeot d'Arc) Date: Mon, 18 Feb 2013 19:21:17 +0100 Subject: [pypy-dev] Unicode encode/decode speed (cont) In-Reply-To: <512254CE.6070501@gmail.com> References: <51117AD1.7060609@gmail.com> <51118D06.6000908@gmail.com> <511912C6.5000201@gmail.com> <5119240C.2000209@gmail.com> <51192C17.7060907@gmail.com> <51197D93.2050201@gmail.com> <511A8675.3040001@gmail.com> <512254CE.6070501@gmail.com> Message-ID: 2013/2/18 Eleytherios Stamatogiannakis > We have found another (very simple) madIS query where PyPy is around 250x > slower that CPython: > > CPython: 314msec > PyPy: 1min 16sec > > The query if you would like to test it yourself is the following: > > select count(*) from (file 'some_big_text_file.txt' limit 100000); > Are you really running with mpsw.py? For me, the C (=cpyext based) version of apsw works (slowly), but mpsw gives me: >From callback : Traceback (most recent call last): File "/home/amauryfa/python/madis/apsw.py", line 924, in xOpen self._vtcursorcolumn.append(instance.Column) AttributeError: Cursor instance has no attribute 'Column' -- Amaury Forgeot d'Arc -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From estama at gmail.com Mon Feb 18 19:26:33 2013 From: estama at gmail.com (Eleytherios Stamatogiannakis) Date: Mon, 18 Feb 2013 20:26:33 +0200 Subject: [pypy-dev] Unicode encode/decode speed (cont) In-Reply-To: References: <51117AD1.7060609@gmail.com> <51118D06.6000908@gmail.com> <511912C6.5000201@gmail.com> <5119240C.2000209@gmail.com> <51192C17.7060907@gmail.com> <51197D93.2050201@gmail.com> <511A8675.3040001@gmail.com> <512254CE.6070501@gmail.com> Message-ID: <51227259.5050509@gmail.com> On 18/02/13 20:21, Amaury Forgeot d'Arc wrote: > > 2013/2/18 Eleytherios Stamatogiannakis > > > We have found another (very simple) madIS query where PyPy is around > 250x slower that CPython: > > CPython: 314msec > PyPy: 1min 16sec > > The query if you would like to test it yourself is the following: > > select count(*) from (file 'some_big_text_file.txt' limit 100000); > > > Are you really running with mpsw.py? > For me, the C (=cpyext based) version of apsw works (slowly), > but mpsw gives me: > > From callback : > Traceback (most recent call last): > File "/home/amauryfa/python/madis/apsw.py", line 924, in xOpen > self._vtcursorcolumn.append(instance.Column) > AttributeError: Cursor instance has no attribute 'Column' > Most probably you are using the ZIP distribution of madIS, which doesn't contain the changes for MSPW. For MSPW to work, you'll need the head version of madIS from Hg. Clone it with: hg clone https://code.google.com/p/madis/ l. 
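For context on the `xOpen` traceback above: apsw's virtual-table protocol expects `xOpen` to return a cursor object with roughly the following methods, `Column` being the one reported missing. This is a minimal illustrative sketch, not the madIS implementation:

```python
class Cursor(object):
    """Sketch of the cursor shape apsw's virtual-table protocol expects.

    Illustrative only; the absent Column method is exactly what the
    'Cursor instance has no attribute Column' traceback complains about.
    """

    def __init__(self, rows):
        self.rows = rows
        self.pos = 0

    def Filter(self, indexnum, indexname, constraintargs):
        self.pos = 0                       # (re)start the scan

    def Eof(self):
        return self.pos >= len(self.rows)

    def Rowid(self):
        return self.pos + 1

    def Column(self, number):
        return self.rows[self.pos][number]

    def Next(self):
        self.pos += 1

    def Close(self):
        pass

# Drive it the way sqlite would:
cur = Cursor([(u'a', 1), (u'b', 2)])
cur.Filter(0, None, ())
seen = []
while not cur.Eof():
    seen.append(cur.Column(0))
    cur.Next()
cur.Close()
```

A wrapper like mspw has to expose this same per-row, per-column calling pattern, which is why each cell crosses the CFFI boundary once.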
From estama at gmail.com Mon Feb 18 19:41:25 2013 From: estama at gmail.com (Eleytherios Stamatogiannakis) Date: Mon, 18 Feb 2013 20:41:25 +0200 Subject: [pypy-dev] Unicode encode/decode speed (cont) In-Reply-To: References: <51117AD1.7060609@gmail.com> <51118D06.6000908@gmail.com> <511912C6.5000201@gmail.com> <5119240C.2000209@gmail.com> <51192C17.7060907@gmail.com> <51197D93.2050201@gmail.com> <511A8675.3040001@gmail.com> <512254CE.6070501@gmail.com> Message-ID: <512275D5.1010609@gmail.com> On 18/02/13 18:44, Maciej Fijalkowski wrote: > On Mon, Feb 18, 2013 at 6:20 PM, Eleytherios Stamatogiannakis > wrote: >> We have found another (very simple) madIS query where PyPy is around 250x >> slower that CPython: >> >> CPython: 314msec >> PyPy: 1min 16sec >> >> The query if you would like to test it yourself is the following: >> >> select count(*) from (file 'some_big_text_file.txt' limit 100000); >> >> To run it you'll need some big text file containing at least 100000 text >> lines (we have run above query with a very big XML file). You can also run >> above query with a lower limit (the behaviour will be the same) as such: >> >> select count(*) from (file 'some_big_text_file.txt' limit 10000); >> >> Be careful for the file to not have a csv, tsv, json, db or gz ending >> because a different code path inside the "file" operator will be taken than >> the one for simple text files. >> >> l. >> >> >> _______________________________________________ >> pypy-dev mailing list >> pypy-dev at python.org >> http://mail.python.org/mailman/listinfo/pypy-dev > > Hey > > I would be incredibly convinient if you can change it to be a > standalone benchmark (say reading large string from a file and > decoding it in a whole or in pieces); > As it involves SQLite, CFFI and Python, it is very hard to extract the full execution path that madIS goes through even in a simple query like this. 
Nevertheless we extracted a part of the pure Python execution path, and PyPy is around 50% slower than CPython: CPython: 21 sec PyPy: 33 sec The full madIS execution path involves additional CFFI calls and callbacks (from SQLite) to pass the data to SQLite. To run the test.py: test.py big_text_file l. -------------- next part -------------- A non-text attachment was scrubbed... Name: test.py Type: text/x-python Size: 463 bytes Desc: not available URL: From amauryfa at gmail.com Mon Feb 18 19:51:02 2013 From: amauryfa at gmail.com (Amaury Forgeot d'Arc) Date: Mon, 18 Feb 2013 19:51:02 +0100 Subject: [pypy-dev] Unicode encode/decode speed (cont) In-Reply-To: <512275D5.1010609@gmail.com> References: <51117AD1.7060609@gmail.com> <51118D06.6000908@gmail.com> <511912C6.5000201@gmail.com> <5119240C.2000209@gmail.com> <51192C17.7060907@gmail.com> <51197D93.2050201@gmail.com> <511A8675.3040001@gmail.com> <512254CE.6070501@gmail.com> <512275D5.1010609@gmail.com> Message-ID: 2013/2/18 Eleytherios Stamatogiannakis > On 18/02/13 18:44, Maciej Fijalkowski wrote: > >> On Mon, Feb 18, 2013 at 6:20 PM, Eleytherios Stamatogiannakis >> wrote: >> >>> We have found another (very simple) madIS query where PyPy is around 250x >>> slower that CPython: >>> >>> CPython: 314msec >>> PyPy: 1min 16sec >>> >>> The query if you would like to test it yourself is the following: >>> >>> select count(*) from (file 'some_big_text_file.txt' limit 100000); >>> >>> To run it you'll need some big text file containing at least 100000 text >>> lines (we have run above query with a very big XML file). You can also >>> run >>> above query with a lower limit (the behaviour will be the same) as such: >>> >>> select count(*) from (file 'some_big_text_file.txt' limit 10000); >>> >>> Be careful for the file to not have a csv, tsv, json, db or gz ending >>> because a different code path inside the "file" operator will be taken >>> than >>> the one for simple text files. >>> >>> l. 
>>> >>> >>> ______________________________**_________________ >>> pypy-dev mailing list >>> pypy-dev at python.org >>> http://mail.python.org/**mailman/listinfo/pypy-dev >>> >> >> Hey >> >> I would be incredibly convinient if you can change it to be a >> standalone benchmark (say reading large string from a file and >> decoding it in a whole or in pieces); >> >> > As it involves SQLite, CFFI and Python, it is very hard to extract the > full execution path that madIS goes through even in a simple query like > this. > > Nevertheless we extracted a part of the pure Python execution path, and > PyPy is around 50% slower than CPython: > > CPython: 21 sec > PyPy: 33 sec > > The full madIS execution path involves additional CFFI calls and callbacks > (from SQLite) to pass the data to SQLite. > > To run the test.py: > > test.py big_text_file > Most of the time is spent in file iteration. I added f = f.read().splitlines() and the query is almost instant. -- Amaury Forgeot d'Arc -------------- next part -------------- An HTML attachment was scrubbed... URL: From alex.gaynor at gmail.com Mon Feb 18 19:59:10 2013 From: alex.gaynor at gmail.com (Alex Gaynor) Date: Mon, 18 Feb 2013 10:59:10 -0800 Subject: [pypy-dev] Unicode encode/decode speed (cont) In-Reply-To: References: <51117AD1.7060609@gmail.com> <51118D06.6000908@gmail.com> <511912C6.5000201@gmail.com> <5119240C.2000209@gmail.com> <51192C17.7060907@gmail.com> <51197D93.2050201@gmail.com> <511A8675.3040001@gmail.com> <512254CE.6070501@gmail.com> <512275D5.1010609@gmail.com> Message-ID: So, iter(file).next() is slow? 
Alex On Mon, Feb 18, 2013 at 10:51 AM, Amaury Forgeot d'Arc wrote: > 2013/2/18 Eleytherios Stamatogiannakis > >> On 18/02/13 18:44, Maciej Fijalkowski wrote: >> >>> On Mon, Feb 18, 2013 at 6:20 PM, Eleytherios Stamatogiannakis >>> wrote: >>> >>>> We have found another (very simple) madIS query where PyPy is around >>>> 250x >>>> slower that CPython: >>>> >>>> CPython: 314msec >>>> PyPy: 1min 16sec >>>> >>>> The query if you would like to test it yourself is the following: >>>> >>>> select count(*) from (file 'some_big_text_file.txt' limit 100000); >>>> >>>> To run it you'll need some big text file containing at least 100000 text >>>> lines (we have run above query with a very big XML file). You can also >>>> run >>>> above query with a lower limit (the behaviour will be the same) as such: >>>> >>>> select count(*) from (file 'some_big_text_file.txt' limit 10000); >>>> >>>> Be careful for the file to not have a csv, tsv, json, db or gz ending >>>> because a different code path inside the "file" operator will be taken >>>> than >>>> the one for simple text files. >>>> >>>> l. >>>> >>>> >>>> ______________________________**_________________ >>>> pypy-dev mailing list >>>> pypy-dev at python.org >>>> http://mail.python.org/**mailman/listinfo/pypy-dev >>>> >>> >>> Hey >>> >>> I would be incredibly convinient if you can change it to be a >>> standalone benchmark (say reading large string from a file and >>> decoding it in a whole or in pieces); >>> >>> >> As it involves SQLite, CFFI and Python, it is very hard to extract the >> full execution path that madIS goes through even in a simple query like >> this. >> >> Nevertheless we extracted a part of the pure Python execution path, and >> PyPy is around 50% slower than CPython: >> >> CPython: 21 sec >> PyPy: 33 sec >> >> The full madIS execution path involves additional CFFI calls and >> callbacks (from SQLite) to pass the data to SQLite. 
>> >> To run the test.py: >> >> test.py big_text_file >> > > Most of the time is spent in file iteration. > I added > f = f.read().splitlines() > and the query is almost instant. > > > -- > Amaury Forgeot d'Arc > > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > http://mail.python.org/mailman/listinfo/pypy-dev > > -- "I disapprove of what you say, but I will defend to the death your right to say it." -- Evelyn Beatrice Hall (summarizing Voltaire) "The people's good is the highest law." -- Cicero -------------- next part -------------- An HTML attachment was scrubbed... URL: From amauryfa at gmail.com Mon Feb 18 20:15:50 2013 From: amauryfa at gmail.com (Amaury Forgeot d'Arc) Date: Mon, 18 Feb 2013 20:15:50 +0100 Subject: [pypy-dev] Unicode encode/decode speed (cont) In-Reply-To: References: <51117AD1.7060609@gmail.com> <51118D06.6000908@gmail.com> <511912C6.5000201@gmail.com> <5119240C.2000209@gmail.com> <51192C17.7060907@gmail.com> <51197D93.2050201@gmail.com> <511A8675.3040001@gmail.com> <512254CE.6070501@gmail.com> <512275D5.1010609@gmail.com> Message-ID: 2013/2/18 Alex Gaynor > So, iter(file).next() is slow? Yes, but only with "rU" mode. My benchmark with yesterday's build: $ ~/pypy/pypy-c-jit-60005-0f1e91da6cb2-linux64/bin/pypy -m timeit "fp = open('/tmp/large-text-file'); list(fp)" 10 loops, best of 3: 43.5 msec per loop $ ~/pypy/pypy-c-jit-60005-0f1e91da6cb2-linux64/bin/pypy -m timeit "fp = open('/tmp/large-text-file', 'rU'); list(fp)" 10 loops, best of 3: 638 msec per loop 15 times slower... -- Amaury Forgeot d'Arc -------------- next part -------------- An HTML attachment was scrubbed... URL: From jiaxinrun at tp-link.com.cn Tue Feb 19 08:43:53 2013 From: jiaxinrun at tp-link.com.cn (jiaxinrun) Date: Tue, 19 Feb 2013 15:43:53 +0800 Subject: [pypy-dev] =?gb2312?b?ob5QeXB5IFF1ZXN0aW9uc6G/?= Message-ID: <201302191543527181604@tp-link.com.cn> Hi,all? I am a fresh man in pypy world! 
I have a question when I am using pypy for developing. When I import win32process.pyd(something in sit-packets\win32) with pypy.exe , it prompts "ImportError: No module named win32process". I am sure it's not a path problem. But when I import the same thing with python.exe?it works well. So, Can you give me some help? Thanks very much!! Additionally, I have finished a set of program. Now I want to use pypy to fast it. How could I do? Best Regards -------------- next part -------------- An HTML attachment was scrubbed... URL: From hyarion at iinet.net.au Tue Feb 19 10:06:26 2013 From: hyarion at iinet.net.au (Ben) Date: Tue, 19 Feb 2013 20:06:26 +1100 Subject: [pypy-dev] =?utf-8?b?44CQUHlweSBRdWVzdGlvbnPjgJE=?= In-Reply-To: <201302191543527181604@tp-link.com.cn> References: <201302191543527181604@tp-link.com.cn> Message-ID: <51234092.8030706@iinet.net.au> My list is still: - Laura - Cumber - Hespa - Darren And I haven't booked yet, because I'm slack. :) Did you want in, Adrian? On Tue 19 Feb 2013 18:43:53 EST, jiaxinrun wrote: > Hi,all? > I am a fresh man in pypy world! > I have a question when I am using pypy for developing. > When I import win32process.pyd(something in sit-packets\win32) with > pypy.exe , it prompts "ImportError: No module named win32process". I > am sure it's not a path problem. > But when I import the same thing with python.exe?it works well. > So, Can you give me some help? Thanks very much!! > Additionally, I have finished a set of program. Now I want to use pypy > to fast it. How could I do? 
> Best Regards > > > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > http://mail.python.org/mailman/listinfo/pypy-dev From jiaxinrun at tp-link.com.cn Tue Feb 19 10:11:59 2013 From: jiaxinrun at tp-link.com.cn (jiaxinrun) Date: Tue, 19 Feb 2013 17:11:59 +0800 Subject: [pypy-dev] =?utf-8?b?44CQUHlweSBRdWVzdGlvbnPjgJE=?= References: <201302191543527181604@tp-link.com.cn>, <51234092.8030706@iinet.net.au> Message-ID: <201302191711580466445@tp-link.com.cn> Hi,Ben? Yes, I want in now! And tell me how ???? Ben ????? 2013-02-19 17:06 ???? jiaxinrun ??? pypy-dev ??? Re: [pypy-dev]?Pypy Questions? My list is still: - Laura - Cumber - Hespa - Darren And I haven't booked yet, because I'm slack. :) Did you want in, Adrian? On Tue 19 Feb 2013 18:43:53 EST, jiaxinrun wrote: > Hi,all? > I am a fresh man in pypy world! > I have a question when I am using pypy for developing. > When I import win32process.pyd(something in sit-packets\win32) with > pypy.exe , it prompts "ImportError: No module named win32process". I > am sure it's not a path problem. > But when I import the same thing with python.exe?it works well. > So, Can you give me some help? Thanks very much!! > Additionally, I have finished a set of program. Now I want to use pypy > to fast it. How could I do? > Best Regards > > > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > http://mail.python.org/mailman/listinfo/pypy-dev -------------- next part -------------- An HTML attachment was scrubbed... URL: From hyarion at iinet.net.au Tue Feb 19 10:17:03 2013 From: hyarion at iinet.net.au (Ben) Date: Tue, 19 Feb 2013 20:17:03 +1100 Subject: [pypy-dev] =?utf-8?b?44CQUHlweSBRdWVzdGlvbnPjgJE=?= In-Reply-To: <51234092.8030706@iinet.net.au> References: <201302191543527181604@tp-link.com.cn> <51234092.8030706@iinet.net.au> Message-ID: <5123430F.8010204@iinet.net.au> Sorry all! 
I somehow sent a reply to a completely unrelated email as a reply to this thread. :o On 19/02/13 20:06, Ben wrote: > My list is still: > > - Laura > - Cumber > - Hespa > - Darren > > And I haven't booked yet, because I'm slack. :) > > Did you want in, Adrian? > > > On Tue 19 Feb 2013 18:43:53 EST, jiaxinrun wrote: >> Hi,all? >> I am a fresh man in pypy world! >> I have a question when I am using pypy for developing. >> When I import win32process.pyd(something in sit-packets\win32) with >> pypy.exe , it prompts "ImportError: No module named win32process". I >> am sure it's not a path problem. >> But when I import the same thing with python.exe?it works well. >> So, Can you give me some help? Thanks very much!! >> Additionally, I have finished a set of program. Now I want to use pypy >> to fast it. How could I do? >> Best Regards >> >> >> _______________________________________________ >> pypy-dev mailing list >> pypy-dev at python.org >> http://mail.python.org/mailman/listinfo/pypy-dev > > From amauryfa at gmail.com Tue Feb 19 11:06:46 2013 From: amauryfa at gmail.com (Amaury Forgeot d'Arc) Date: Tue, 19 Feb 2013 11:06:46 +0100 Subject: [pypy-dev] =?gb2312?b?ob5QeXB5IFF1ZXN0aW9uc6G/?= In-Reply-To: <201302191543527181604@tp-link.com.cn> References: <201302191543527181604@tp-link.com.cn> Message-ID: Hi, 2013/2/19 jiaxinrun > ** **** > Hi,all? > > I am a fresh man in pypy world! > > I have a question when I am using pypy for developing. > > When I import win32process.pyd(something in sit-packets\win32) with > pypy.exe , it prompts "ImportError: No module named win32process". I am > sure it's not a path problem. > > But when I import the same thing with python.exe?it works well. So, Can > you give me some help? Thanks very much!! > > Additionally, I have finished a set of program. Now I want to use pypy to > fast it. How could I do? > PyPy cannot import extension modules as is, You need to recompile the pywin32 project with PyPy. 
And the main distribution won't even work. A long time ago I made the necessary changes for pywin32 to work with PyPy, the code is here: https://bitbucket.org/amauryfa/pywin32-pypy Unfortunately I don't have access to a windows machine anymore, so someone else should continue the project. -- Amaury Forgeot d'Arc -------------- next part -------------- An HTML attachment was scrubbed... URL: From estama at gmail.com Tue Feb 19 13:09:45 2013 From: estama at gmail.com (Eleytherios Stamatogiannakis) Date: Tue, 19 Feb 2013 14:09:45 +0200 Subject: [pypy-dev] Unicode encode/decode speed (cont) In-Reply-To: References: <51117AD1.7060609@gmail.com> <511912C6.5000201@gmail.com> <5119240C.2000209@gmail.com> <51192C17.7060907@gmail.com> <51197D93.2050201@gmail.com> <511A8675.3040001@gmail.com> <512254CE.6070501@gmail.com> <512275D5.1010609@gmail.com> Message-ID: <51236B89.1070407@gmail.com> On 18/02/13 21:15, Amaury Forgeot d'Arc wrote: > 2013/2/18 Alex Gaynor > > > So, iter(file).next() is slow? > > > Yes, but only with "rU" mode. > My benchmark with yesterday's build: > > $ ~/pypy/pypy-c-jit-60005-0f1e91da6cb2-linux64/bin/pypy -m timeit "fp = > open('/tmp/large-text-file'); list(fp)" > 10 loops, best of 3: 43.5 msec per loop > $ ~/pypy/pypy-c-jit-60005-0f1e91da6cb2-linux64/bin/pypy -m timeit "fp = > open('/tmp/large-text-file', 'rU'); list(fp)" > 10 loops, best of 3: 638 msec per loop > > 15 times slower... > Yes you are right. We rerun the query without the 'rU' and the result is: CPython: 328 msec PyPy: 443 msec PyPy (with 'rU'): 1 min 17 sec So the main culprit of PyPy's slowdown is 'rU' option in open. Thanks for looking into it. l. 
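For anyone hitting the same slowdown: a workaround in the spirit of Amaury's earlier read().splitlines() suggestion is to skip 'rU' entirely and normalize newlines by hand. This is only a sketch — it does one bulk read (so the file must fit in memory) and returns bytes lines:

```python
import os
import tempfile

def read_universal(path):
    # One bulk read, then splitlines(), which understands '\n',
    # '\r\n' and '\r' endings -- the same newlines 'rU' normalizes,
    # without going through the slow per-line 'rU' code path.
    with open(path, 'rb') as f:
        return f.read().splitlines()

# Self-check with deliberately mixed line endings.
fd, path = tempfile.mkstemp()
try:
    with os.fdopen(fd, 'wb') as f:
        f.write(b'one\ntwo\r\nthree\rfour')
    assert read_universal(path) == [b'one', b'two', b'three', b'four']
finally:
    os.remove(path)
```

Note the trade-off: the whole file is materialized at once, so this is only suitable when the input comfortably fits in RAM.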
From fijall at gmail.com Tue Feb 19 13:13:19 2013 From: fijall at gmail.com (Maciej Fijalkowski) Date: Tue, 19 Feb 2013 14:13:19 +0200 Subject: [pypy-dev] Unicode encode/decode speed (cont) In-Reply-To: <51236B89.1070407@gmail.com> References: <51117AD1.7060609@gmail.com> <511912C6.5000201@gmail.com> <5119240C.2000209@gmail.com> <51192C17.7060907@gmail.com> <51197D93.2050201@gmail.com> <511A8675.3040001@gmail.com> <512254CE.6070501@gmail.com> <512275D5.1010609@gmail.com> <51236B89.1070407@gmail.com> Message-ID: On Tue, Feb 19, 2013 at 2:09 PM, Eleytherios Stamatogiannakis wrote: > On 18/02/13 21:15, Amaury Forgeot d'Arc wrote: >> >> 2013/2/18 Alex Gaynor > > >> >> >> So, iter(file).next() is slow? >> >> >> Yes, but only with "rU" mode. >> My benchmark with yesterday's build: >> >> $ ~/pypy/pypy-c-jit-60005-0f1e91da6cb2-linux64/bin/pypy -m timeit "fp = >> open('/tmp/large-text-file'); list(fp)" >> 10 loops, best of 3: 43.5 msec per loop >> $ ~/pypy/pypy-c-jit-60005-0f1e91da6cb2-linux64/bin/pypy -m timeit "fp = >> open('/tmp/large-text-file', 'rU'); list(fp)" >> 10 loops, best of 3: 638 msec per loop >> >> 15 times slower... >> > > Yes you are right. We rerun the query without the 'rU' and the result is: > > CPython: 328 msec > PyPy: 443 msec > PyPy (with 'rU'): 1 min 17 sec > > > So the main culprit of PyPy's slowdown is 'rU' option in open. > > Thanks for looking into it. > > > l. > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > http://mail.python.org/mailman/listinfo/pypy-dev Is this yet-another-fault-of-streamio? 
From amauryfa at gmail.com Tue Feb 19 14:39:06 2013 From: amauryfa at gmail.com (Amaury Forgeot d'Arc) Date: Tue, 19 Feb 2013 14:39:06 +0100 Subject: [pypy-dev] Unicode encode/decode speed (cont) In-Reply-To: References: <51117AD1.7060609@gmail.com> <511912C6.5000201@gmail.com> <5119240C.2000209@gmail.com> <51192C17.7060907@gmail.com> <51197D93.2050201@gmail.com> <511A8675.3040001@gmail.com> <512254CE.6070501@gmail.com> <512275D5.1010609@gmail.com> <51236B89.1070407@gmail.com> Message-ID: 2013/2/19 Maciej Fijalkowski > On Tue, Feb 19, 2013 at 2:09 PM, Eleytherios Stamatogiannakis > wrote: > > > > So the main culprit of PyPy's slowdown is 'rU' option in open. > > > > Thanks for looking into it. > > Is this yet-another-fault-of-streamio? Not quite. Even a implementation based on fread() would need to care of these universal newlines and tune the usage of the various buffers. BTW, I tried io.open, and surprisingly the "rb" mode is twice slower as "rU". I guess that's because our io.Buffered is missing a dedicated "readline" method. -- Amaury Forgeot d'Arc -------------- next part -------------- An HTML attachment was scrubbed... URL: From joehillen at gmail.com Wed Feb 20 18:50:35 2013 From: joehillen at gmail.com (Joe Hillenbrand) Date: Wed, 20 Feb 2013 09:50:35 -0800 Subject: [pypy-dev] HTML Parser? Message-ID: What is the recommended HTML parser to run in PyPy? The typical goto for Python is lxml, but of course that doesn't work with PyPy. Has anyone tested any other libraries? Are there any benchmarks? Thanks, -Joe -------------- next part -------------- An HTML attachment was scrubbed... URL: From amauryfa at gmail.com Wed Feb 20 19:02:14 2013 From: amauryfa at gmail.com (Amaury Forgeot d'Arc) Date: Wed, 20 Feb 2013 19:02:14 +0100 Subject: [pypy-dev] HTML Parser? In-Reply-To: References: Message-ID: 2013/2/20 Joe Hillenbrand > What is the recommended HTML parser to run in PyPy? > > The typical goto for Python is lxml, but of course that doesn't work with > PyPy. 
> This is not true anymore. There has been a lot of work on both sides to make lxml work with PyPy. You should try with latest versions. In addition, there is a port of lxml that does not use Cython nor the C API: https://github.com/amauryfa/lxml/tree/lxml-cffi most of the tests are passing (except objectify), but "setup.py install" does not work yet. It works from the source tree, though. -- Amaury Forgeot d'Arc -------------- next part -------------- An HTML attachment was scrubbed... URL: From fijall at gmail.com Wed Feb 20 19:07:23 2013 From: fijall at gmail.com (Maciej Fijalkowski) Date: Wed, 20 Feb 2013 20:07:23 +0200 Subject: [pypy-dev] HTML Parser? In-Reply-To: References: Message-ID: On Wed, Feb 20, 2013 at 8:02 PM, Amaury Forgeot d'Arc wrote: > 2013/2/20 Joe Hillenbrand >> >> What is the recommended HTML parser to run in PyPy? >> >> The typical goto for Python is lxml, but of course that doesn't work with >> PyPy. > > > This is not true anymore. There has been a lot of work on both sides to make > lxml work with PyPy. > You should try with latest versions. > > In addition, there is a port of lxml that does not use Cython nor the C API: > https://github.com/amauryfa/lxml/tree/lxml-cffi > most of the tests are passing (except objectify), but "setup.py install" > does not work yet. > It works from the source tree, though. > > -- > Amaury Forgeot d'Arc > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > http://mail.python.org/mailman/listinfo/pypy-dev > Is it working on released cffi or on cffi that's in-development or you need patches? From amauryfa at gmail.com Wed Feb 20 20:28:14 2013 From: amauryfa at gmail.com (Amaury Forgeot d'Arc) Date: Wed, 20 Feb 2013 20:28:14 +0100 Subject: [pypy-dev] HTML Parser? In-Reply-To: References: Message-ID: 2013/2/20 Maciej Fijalkowski > Is it working on released cffi or on cffi that's in-development or you > need patches? 
> It developed it with a nightly build from mid-January, and the cffi library that was available at the time. It's now released as cffi 0.5 I think. I did not test with CPython at all. At the time cffi used to return enum values as strings, but I just tested with the last version of cffi and pypy nightly build, and tests still pass! Ran 1006 tests in 34.730s FAILED (failures=1) and the only failure is:: self.assertTrue(hasattr(self.etree, '_import_c_api')) :-) -- Amaury Forgeot d'Arc -------------- next part -------------- An HTML attachment was scrubbed... URL: From arigo at tunes.org Wed Feb 20 20:29:27 2013 From: arigo at tunes.org (Armin Rigo) Date: Wed, 20 Feb 2013 20:29:27 +0100 Subject: [pypy-dev] HTML Parser? In-Reply-To: References: Message-ID: Hi all, Just so everybody knows, the plan is to release CFFI 0.6 latest when we do the PyPy 2.0 release, and include it fully inside PyPy too. (The idea is to avoid "pip install cffi", which would get a potentially incompatible version: PyPy includes the "_cffi_backend" module, which only works with a specific version of CFFI). A bient?t, Armin. From alex.gaynor at gmail.com Wed Feb 20 20:39:36 2013 From: alex.gaynor at gmail.com (Alex Gaynor) Date: Wed, 20 Feb 2013 11:39:36 -0800 Subject: [pypy-dev] HTML Parser? In-Reply-To: References: Message-ID: Are we also planning to bundle ply and cparser? Alex On Wed, Feb 20, 2013 at 11:29 AM, Armin Rigo wrote: > Hi all, > > Just so everybody knows, the plan is to release CFFI 0.6 latest when > we do the PyPy 2.0 release, and include it fully inside PyPy too. > (The idea is to avoid "pip install cffi", which would get a > potentially incompatible version: PyPy includes the "_cffi_backend" > module, which only works with a specific version of CFFI). > > > A bient?t, > > Armin. 
> _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > http://mail.python.org/mailman/listinfo/pypy-dev > -- "I disapprove of what you say, but I will defend to the death your right to say it." -- Evelyn Beatrice Hall (summarizing Voltaire) "The people's good is the highest law." -- Cicero -------------- next part -------------- An HTML attachment was scrubbed... URL: From fijall at gmail.com Wed Feb 20 21:03:58 2013 From: fijall at gmail.com (Maciej Fijalkowski) Date: Wed, 20 Feb 2013 22:03:58 +0200 Subject: [pypy-dev] HTML Parser? In-Reply-To: References: Message-ID: On Wed, Feb 20, 2013 at 9:29 PM, Armin Rigo wrote: > Hi all, > > Just so everybody knows, the plan is to release CFFI 0.6 latest when > we do the PyPy 2.0 release, and include it fully inside PyPy too. > (The idea is to avoid "pip install cffi", which would get a > potentially incompatible version: PyPy includes the "_cffi_backend" > module, which only works with a specific version of CFFI). > > > A bient?t, > > Armin. > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > http://mail.python.org/mailman/listinfo/pypy-dev One thing we have to consider is how do you write setup.py (or requirements.txt) in case you need to install cffi on cpython but not pypy From flyflybutters at hotmail.com Thu Feb 21 19:01:04 2013 From: flyflybutters at hotmail.com (stone node) Date: Thu, 21 Feb 2013 13:01:04 -0500 Subject: [pypy-dev] Can I install pypy on Mac osx 32 bit system Message-ID: Hi all, I want to install pypy on Mac 32bit system, but I didn't see a binary version for this configuration. Could anyone tell me how to do that? Thanks very much, Hang -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ddvento at ucar.edu Thu Feb 21 20:43:58 2013 From: ddvento at ucar.edu (Davide Del Vento) Date: Thu, 21 Feb 2013 12:43:58 -0700 Subject: [pypy-dev] pypy and PYTHONPATH Message-ID: <512678FE.7060206@ucar.edu> Folks, I've just installed pypy on a production machine which makes use of PYTHONPATH to let the user pick and chose python libraries installed in non-standard directories. I am wondering if I should wrap pypy with a shell script forcing the -E setting, to prevent picking libraries that it should not (such as numpy and scipy). Thanks! Davide Del Vento, NCAR Computational & Information Services Laboratory Consulting Services Software Engineer http://www2.cisl.ucar.edu/uss/csg/ SEA Chair http://sea.ucar.edu/ From kennylevinsen at gmail.com Thu Feb 21 22:55:04 2013 From: kennylevinsen at gmail.com (Kenny Lasse Hoff Levinsen) Date: Thu, 21 Feb 2013 22:55:04 +0100 Subject: [pypy-dev] Can I install pypy on Mac osx 32 bit system In-Reply-To: References: Message-ID: <94A0BA18-4487-41D0-875D-FAA3D623587F@gmail.com> Hi Hang, You are indeed correct in the lack of binaries - I'm not sure if anyone have tested that configuration. I'm curious as to why you need a 32-bit binary - Unless I'm having a major brainfart here, you'd need to have a pre-Core2 Mac in order for 64-bit to be a problem. Starting with Core2 (and whatever version of 10.4 Tiger was available at the time), 64-bit has been supported on all Macs? (They even dropped support for 32-bit kernel in 10.8) Oh well. Long story short, you're most likely going to end up translating PyPy yourself, in which case I will refer you to the wiki [link], as well as the IRC channel [#pypy at irc.freenode.net] for further assistance. Good luck! Kenny Levinsen // joushou On Feb 21, 2013, at 7:01 PM, stone node wrote: > Hi all, > > I want to install pypy on Mac 32bit system, but I didn't see a binary version for this configuration. > > Could anyone tell me how to do that? 
> > Thanks very much, > > Hang > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > http://mail.python.org/mailman/listinfo/pypy-dev -------------- next part -------------- An HTML attachment was scrubbed... URL: From joehillen at gmail.com Fri Feb 22 07:39:21 2013 From: joehillen at gmail.com (Joe Hillenbrand) Date: Thu, 21 Feb 2013 22:39:21 -0800 Subject: [pypy-dev] HTML Parser? In-Reply-To: References: Message-ID: Great to hear! I just got it working with scrapy. Unfortunately there wasn't any speedup. A normal crawl in CPython takes: real 1m32.238s user 0m56.576s sys 0m1.208s In PyPy: real 1m54.098s user 1m18.105s sys 0m1.372s Thanks for all your hard work. -Joe On Wed, Feb 20, 2013 at 11:28 AM, Amaury Forgeot d'Arc wrote: > > 2013/2/20 Maciej Fijalkowski > >> Is it working on released cffi or on cffi that's in-development or you >> need patches? >> > > It developed it with a nightly build from mid-January, > and the cffi library that was available at the time. > It's now released as cffi 0.5 I think. > > I did not test with CPython at all. > > At the time cffi used to return enum values as strings, > but I just tested with the last version of cffi and pypy nightly build, > and tests still pass! > > Ran 1006 tests in 34.730s > FAILED (failures=1) > and the only failure is:: > self.assertTrue(hasattr(self.etree, '_import_c_api')) > :-) > > -- > Amaury Forgeot d'Arc > -------------- next part -------------- An HTML attachment was scrubbed... URL: From kostia.lopuhin at gmail.com Fri Feb 22 07:49:53 2013 From: kostia.lopuhin at gmail.com (=?KOI8-R?B?68/T1NEg7M/Q1cjJzg==?=) Date: Fri, 22 Feb 2013 10:49:53 +0400 Subject: [pypy-dev] [rpython] What might prevent a function call from beeing invlined by the JIT? Message-ID: In what cases does the jit decide not to inline a function call, but place "call_may_force" instead? 
The context is that I have a simple interpreter, like an expanded kermit, and I am testing how the jit helps - it perfectly unboxes wrapped objects in a loop, but does not inline function calls - I wonder what am I missing here. I found a place where PyPy interpreter makes a call and did not see anything special there. From fijall at gmail.com Fri Feb 22 11:19:34 2013 From: fijall at gmail.com (Maciej Fijalkowski) Date: Fri, 22 Feb 2013 12:19:34 +0200 Subject: [pypy-dev] HTML Parser? In-Reply-To: References: Message-ID: On Fri, Feb 22, 2013 at 8:39 AM, Joe Hillenbrand wrote: > Great to hear! I just got it working with scrapy. Unfortunately there wasn't > any speedup. > > A normal crawl in CPython takes: > real 1m32.238s > user 0m56.576s > sys 0m1.208s > > In PyPy: > real 1m54.098s > user 1m18.105s > sys 0m1.372s > > Thanks for all your hard work. > > -Joe lxml-cffi is known to be slower than normal lxml. You'll get speedups if you start doing non-trivial logic in python, probably. For what is worth, cffi is missing a lot of trivial optimizations (and one non-trivial), so there is a lot of room for improvement. From fijall at gmail.com Fri Feb 22 11:27:45 2013 From: fijall at gmail.com (Maciej Fijalkowski) Date: Fri, 22 Feb 2013 12:27:45 +0200 Subject: [pypy-dev] [rpython] What might prevent a function call from beeing invlined by the JIT? In-Reply-To: References: Message-ID: On Fri, Feb 22, 2013 at 8:49 AM, ????? ??????? wrote: > In what cases does the jit decide not to inline a function call, but > place "call_may_force" instead? > The context is that I have a simple interpreter, like an expanded > kermit, and I am testing how the jit helps - it perfectly unboxes > wrapped objects in a loop, but does not inline function calls - I > wonder what am I missing here. I found a place where PyPy interpreter > makes a call and did not see anything special there. call_may_force means you call a function that has a loop that calls back to the intepreter. 
Probably argument handling or so. You need to annotate the function with @jit.unroll_safe (that means that each time the number of iteration is different, you'll get slightly different assembler, so beware) From kostia.lopuhin at gmail.com Fri Feb 22 12:56:33 2013 From: kostia.lopuhin at gmail.com (=?KOI8-R?B?68/T1NEg7M/Q1cjJzg==?=) Date: Fri, 22 Feb 2013 15:56:33 +0400 Subject: [pypy-dev] [rpython] What might prevent a function call from beeing invlined by the JIT? In-Reply-To: References: Message-ID: Yes, it had a little loop that was setting arguments for a function call, now the call is inlined, thank you! But in the inlined code there is immidately "call_assembler", followed by a "keepalive" of the frame. Does it have an equally simple answer? Or it has to do with the function arguments? 2013/2/22 Maciej Fijalkowski : > On Fri, Feb 22, 2013 at 8:49 AM, ????? ??????? wrote: >> In what cases does the jit decide not to inline a function call, but >> place "call_may_force" instead? >> The context is that I have a simple interpreter, like an expanded >> kermit, and I am testing how the jit helps - it perfectly unboxes >> wrapped objects in a loop, but does not inline function calls - I >> wonder what am I missing here. I found a place where PyPy interpreter >> makes a call and did not see anything special there. > > call_may_force means you call a function that has a loop that calls > back to the intepreter. Probably argument handling or so. You need to > annotate the function with @jit.unroll_safe (that means that each time > the number of iteration is different, you'll get slightly different > assembler, so beware) 2013/2/22 Maciej Fijalkowski : > On Fri, Feb 22, 2013 at 8:49 AM, ????? ??????? wrote: >> In what cases does the jit decide not to inline a function call, but >> place "call_may_force" instead? 
>> The context is that I have a simple interpreter, like an expanded >> kermit, and I am testing how the jit helps - it perfectly unboxes >> wrapped objects in a loop, but does not inline function calls - I >> wonder what am I missing here. I found a place where PyPy interpreter >> makes a call and did not see anything special there. > > call_may_force means you call a function that has a loop that calls > back to the intepreter. Probably argument handling or so. You need to > annotate the function with @jit.unroll_safe (that means that each time > the number of iteration is different, you'll get slightly different > assembler, so beware) 2013/2/22 Maciej Fijalkowski : > On Fri, Feb 22, 2013 at 8:49 AM, ????? ??????? wrote: >> In what cases does the jit decide not to inline a function call, but >> place "call_may_force" instead? >> The context is that I have a simple interpreter, like an expanded >> kermit, and I am testing how the jit helps - it perfectly unboxes >> wrapped objects in a loop, but does not inline function calls - I >> wonder what am I missing here. I found a place where PyPy interpreter >> makes a call and did not see anything special there. > > call_may_force means you call a function that has a loop that calls > back to the intepreter. Probably argument handling or so. You need to > annotate the function with @jit.unroll_safe (that means that each time > the number of iteration is different, you'll get slightly different > assembler, so beware) From fijall at gmail.com Fri Feb 22 12:58:01 2013 From: fijall at gmail.com (Maciej Fijalkowski) Date: Fri, 22 Feb 2013 13:58:01 +0200 Subject: [pypy-dev] [rpython] What might prevent a function call from beeing invlined by the JIT? In-Reply-To: References: Message-ID: On Fri, Feb 22, 2013 at 1:56 PM, ????? ??????? wrote: > Yes, it had a little loop that was setting arguments for a function > call, now the call is inlined, thank you! 
But in the inlined code > there is immidately "call_assembler", followed by a "keepalive" of the > frame. Does it have an equally simple answer? Or it has to do with the > function arguments? call_assembler means that the function you call has a loop (such functions are not inlined, only a part of it), so you end up with a call. IF you call a simpler function with no loops, the call will disappear > > 2013/2/22 Maciej Fijalkowski : >> On Fri, Feb 22, 2013 at 8:49 AM, ????? ??????? wrote: >>> In what cases does the jit decide not to inline a function call, but >>> place "call_may_force" instead? >>> The context is that I have a simple interpreter, like an expanded >>> kermit, and I am testing how the jit helps - it perfectly unboxes >>> wrapped objects in a loop, but does not inline function calls - I >>> wonder what am I missing here. I found a place where PyPy interpreter >>> makes a call and did not see anything special there. >> >> call_may_force means you call a function that has a loop that calls >> back to the intepreter. Probably argument handling or so. You need to >> annotate the function with @jit.unroll_safe (that means that each time >> the number of iteration is different, you'll get slightly different >> assembler, so beware) > > > 2013/2/22 Maciej Fijalkowski : >> On Fri, Feb 22, 2013 at 8:49 AM, ????? ??????? wrote: >>> In what cases does the jit decide not to inline a function call, but >>> place "call_may_force" instead? >>> The context is that I have a simple interpreter, like an expanded >>> kermit, and I am testing how the jit helps - it perfectly unboxes >>> wrapped objects in a loop, but does not inline function calls - I >>> wonder what am I missing here. I found a place where PyPy interpreter >>> makes a call and did not see anything special there. >> >> call_may_force means you call a function that has a loop that calls >> back to the intepreter. Probably argument handling or so. 
You need to >> annotate the function with @jit.unroll_safe (that means that each time >> the number of iteration is different, you'll get slightly different >> assembler, so beware) > > > 2013/2/22 Maciej Fijalkowski : >> On Fri, Feb 22, 2013 at 8:49 AM, ????? ??????? wrote: >>> In what cases does the jit decide not to inline a function call, but >>> place "call_may_force" instead? >>> The context is that I have a simple interpreter, like an expanded >>> kermit, and I am testing how the jit helps - it perfectly unboxes >>> wrapped objects in a loop, but does not inline function calls - I >>> wonder what am I missing here. I found a place where PyPy interpreter >>> makes a call and did not see anything special there. >> >> call_may_force means you call a function that has a loop that calls >> back to the intepreter. Probably argument handling or so. You need to >> annotate the function with @jit.unroll_safe (that means that each time >> the number of iteration is different, you'll get slightly different >> assembler, so beware) From ddvento at ucar.edu Fri Feb 22 21:22:34 2013 From: ddvento at ucar.edu (Davide Del Vento) Date: Fri, 22 Feb 2013 13:22:34 -0700 Subject: [pypy-dev] how to check if jit is available in my build Message-ID: <5127D38A.4040202@ucar.edu> Folks, I compiled pypy 1.9 and 2.0-beta1 from source, and the few small tests I ran were slower than expected. I am wondering if I did everything "right" and if there is a runtime check that would give me a definitive answer to the question "is jit available in this build"? Google seems to not know the answer. Thanks Davide From alex.gaynor at gmail.com Fri Feb 22 21:54:51 2013 From: alex.gaynor at gmail.com (Alex Gaynor) Date: Fri, 22 Feb 2013 12:54:51 -0800 Subject: [pypy-dev] how to check if jit is available in my build In-Reply-To: <5127D38A.4040202@ucar.edu> References: <5127D38A.4040202@ucar.edu> Message-ID: sys.pypy_translation_info["translation.jit"] will tell you definitely. 
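Wrapped up as a small helper that also runs on CPython (where the attribute simply does not exist), the check could look like this sketch:

```python
import sys

def jit_available():
    # sys.pypy_translation_info exists only on PyPy; its
    # "translation.jit" key records whether the build was
    # translated with the JIT. On CPython the attribute is
    # absent, so we report False.
    info = getattr(sys, 'pypy_translation_info', {})
    return bool(info.get('translation.jit', False))

print(jit_available())
```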
Alex On Fri, Feb 22, 2013 at 12:22 PM, Davide Del Vento wrote: > Folks, > > I compiled pypy 1.9 and 2.0-beta1 from source, and the few small tests I > ran were slower than expected. I am wondering if I did everything "right" > and if there is a runtime check that would give me a definitive answer to > the question "is jit available in this build"? > > Google seems to not know the answer. > > Thanks > Davide > ______________________________**_________________ > pypy-dev mailing list > pypy-dev at python.org > http://mail.python.org/**mailman/listinfo/pypy-dev > -- "I disapprove of what you say, but I will defend to the death your right to say it." -- Evelyn Beatrice Hall (summarizing Voltaire) "The people's good is the highest law." -- Cicero -------------- next part -------------- An HTML attachment was scrubbed... URL: From ddvento at ucar.edu Fri Feb 22 22:01:05 2013 From: ddvento at ucar.edu (Davide Del Vento) Date: Fri, 22 Feb 2013 14:01:05 -0700 Subject: [pypy-dev] how to check if jit is available in my build In-Reply-To: References: <5127D38A.4040202@ucar.edu> Message-ID: <5127DC91.2080407@ucar.edu> Thanks to both. I can import pypyjit and sys.pypy_translation_info["translation.jit"] is True (for both 2.9 and 2.0-beta1). Have a nice weekend, Davide On 02/22/2013 01:54 PM, Alex Gaynor wrote: > sys.pypy_translation_info["translation.jit"] > > will tell you definitely. > > Alex > > > On Fri, Feb 22, 2013 at 12:22 PM, Davide Del Vento > wrote: > > Folks, > > I compiled pypy 1.9 and 2.0-beta1 from source, and the few small > tests I ran were slower than expected. I am wondering if I did > everything "right" and if there is a runtime check that would give > me a definitive answer to the question "is jit available in this build"? > > Google seems to not know the answer. 
> > Thanks > Davide > _________________________________________________ > pypy-dev mailing list > pypy-dev at python.org > http://mail.python.org/__mailman/listinfo/pypy-dev > > > > > > -- > "I disapprove of what you say, but I will defend to the death your right > to say it." -- Evelyn Beatrice Hall (summarizing Voltaire) > "The people's good is the highest law." -- Cicero From razvan.ghitulete at gmail.com Sun Feb 24 17:48:58 2013 From: razvan.ghitulete at gmail.com (Ghitulete Razvan) Date: Sun, 24 Feb 2013 18:48:58 +0200 Subject: [pypy-dev] translating pypy benchmarks to C Message-ID: Hi, I've been trying for some time now to translate python code into C. After playing around with the pypy interactive translator shell and worked just fine. But the I tried to translate the pypy benchmarks from speed.pypy.orginto C code, but I seem to be running into all kinds of trouble. So far I've tried with bm_ai.py which seems to fail because it uses closures(or so i'm told by the translator), and bm_threading.py seems to fail while processing threading.py. Is there something I'm doing wrong? P.S.: I'm simply running the translator.py script with -s option on slightly modified versions of the above mentioned files(adding an entry point). -- Sincerely, Razvan Ghitulete Universitatea Politehnica Bucuresti -------------- next part -------------- An HTML attachment was scrubbed... URL: From alex.gaynor at gmail.com Sun Feb 24 17:58:39 2013 From: alex.gaynor at gmail.com (Alex Gaynor) Date: Sun, 24 Feb 2013 08:58:39 -0800 Subject: [pypy-dev] translating pypy benchmarks to C In-Reply-To: References: Message-ID: Hi, Why are you trying to do this? The translator doesn't handle random Python, only RPython. Alex On Sun, Feb 24, 2013 at 8:48 AM, Ghitulete Razvan < razvan.ghitulete at gmail.com> wrote: > Hi, > > I've been trying for some time now to translate python code into C. After > playing around with the pypy interactive translator shell and worked just > fine. 
But then I tried to translate the pypy benchmarks from speed.pypy.org into C code, but I seem to be running into all kinds of trouble. > > So far I've tried with bm_ai.py which seems to fail because it uses > closures(or so i'm told by the translator), and bm_threading.py seems to > fail while processing threading.py. Is there something I'm doing wrong? > > P.S.: I'm simply running the translator.py script with -s option on > slightly modified versions of the above mentioned files(adding an entry > point). > > -- > Sincerely, > Razvan Ghitulete > Universitatea Politehnica Bucuresti > > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > http://mail.python.org/mailman/listinfo/pypy-dev > > -- "I disapprove of what you say, but I will defend to the death your right to say it." -- Evelyn Beatrice Hall (summarizing Voltaire) "The people's good is the highest law." -- Cicero -------------- next part -------------- An HTML attachment was scrubbed... URL: From razvan.ghitulete at gmail.com Sun Feb 24 18:29:24 2013 From: razvan.ghitulete at gmail.com (Ghitulete Razvan) Date: Sun, 24 Feb 2013 19:29:24 +0200 Subject: [pypy-dev] translating pypy benchmarks to C In-Reply-To: References: Message-ID: On Sun, Feb 24, 2013 at 6:58 PM, Alex Gaynor wrote: > Hi, > > Why are you trying to do this? The translator doesn't handle random > Python, only RPython. > > Alex > > I am working on a research project to run python on a baremetal system. So I basically need a way of translating python code into something that can run on baremetal, hence C. After that I want to see whether it is worth it or not, and this is why I am trying to translate the benchmarks(as to have a common denominator). Is there any way I can get threading to work with the translator. That is, are there any threading implementations available in RPython code?
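As background to the P.S. above about "adding an entry point": the translator does not start from `if __name__ == '__main__'` but from a fixed hook it looks up in the target file. A minimal skeleton of that convention (a sketch — the names follow the usual RPython targets, not the modified benchmark files themselves, and untranslated it simply runs as ordinary Python):

```python
def entry_point(argv):
    # The compiled binary starts here; argv is the process argument list
    # and the return value becomes the exit code.  Everything reachable
    # from this function must stay inside the RPython subset.
    print("hello from %s" % argv[0])
    return 0

def target(driver, args):
    # Hook the RPython translation driver looks for: it returns the
    # entry point of the future binary.
    return entry_point, None

if __name__ == "__main__":
    # Untranslated, the same skeleton runs as plain Python:
    import sys
    entry_point(sys.argv)
```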
-- Sincerely, Razvan Ghitulete Universitatea Politehnica Bucuresti -------------- next part -------------- An HTML attachment was scrubbed... URL: From fijall at gmail.com Sun Feb 24 18:37:30 2013 From: fijall at gmail.com (Maciej Fijalkowski) Date: Sun, 24 Feb 2013 19:37:30 +0200 Subject: [pypy-dev] translating pypy benchmarks to C In-Reply-To: References: Message-ID: On Sun, Feb 24, 2013 at 7:29 PM, Ghitulete Razvan wrote: > On Sun, Feb 24, 2013 at 6:58 PM, Alex Gaynor wrote: >> >> Hi, >> >> Why are you trying to do this? The translator doesn't handle random >> Python, only RPython. >> >> Alex >> > > I am working on a research project to run python on a baremetal system. So I > basically need a way of translating python code into something that can run > on baremetal, hence C. After that I want to see whether it is worth it or > not, and this is why I am trying to translate the benchmarks(as to have a > common denominator). > > Is there any way I can get threading work with the translator. That is, are > there any threading implementations available in RPython code? > > -- > Sincerely, > Razvan Ghitulete > Universitatea Politehnica Bucuresti Hi There is some support for threading in RPython, see rlib.rthread. RPython might not be your good buddy here. The executables built are rather large and the GC (at least on default settings) will not be that interesting for bare metal. What sort of parameters are you looking at? Also, using RPython is really tedious, you have been warned (it's *not* Python). From mail at justinbogner.com Sun Feb 24 18:46:44 2013 From: mail at justinbogner.com (Justin Bogner) Date: Sun, 24 Feb 2013 10:46:44 -0700 Subject: [pypy-dev] Updates to hgignore for rpython split Message-ID: <87fw0lzjsr.fsf@justinbogner.com> The .hgignore file got missed when we moved rpython/ out of pypy/. Can someone take a look at this patch and commit it if it looks good? Thanks. -------------- next part -------------- A non-text attachment was scrubbed... 
Name: hgignore.patch Type: text/x-diff Size: 2768 bytes Desc: not available URL: From razvan.ghitulete at gmail.com Sun Feb 24 18:48:44 2013 From: razvan.ghitulete at gmail.com (Ghitulete Razvan) Date: Sun, 24 Feb 2013 19:48:44 +0200 Subject: [pypy-dev] translating pypy benchmarks to C In-Reply-To: References: Message-ID: On Sun, Feb 24, 2013 at 7:37 PM, Maciej Fijalkowski wrote: > > Hi > > There is some support for threading in RPython, see rlib.rthread. > > RPython might not be your good buddy here. The executables built are > rather large and the GC (at least on default settings) will not be > that interesting for bare metal. What sort of parameters are you > looking at? > > Also, using RPython is really tedious, you have been warned (it's *not* > Python). > Well, I don't seem to have much of a choice as I basically need source code out of python and not a binary. Also from what I see there is no translator that successfully translates full Python code into C/C++. As for parameters I don't care that much about the binary as I am not running in a resource restricted environment. I am actually running the baremetal binary on a x86_64 workstation. -- Sincerely, Razvan Ghitulete Universitatea Politehnica Bucuresti -------------- next part -------------- An HTML attachment was scrubbed... URL: From douwe at triposo.com Sun Feb 24 18:54:05 2013 From: douwe at triposo.com (Douwe Osinga) Date: Sun, 24 Feb 2013 18:54:05 +0100 Subject: [pypy-dev] translating pypy benchmarks to C In-Reply-To: References: Message-ID: Have you tried cython? douwe at triposo.com | +49-(0)-1573-4469916 | @dosinga On Sun, Feb 24, 2013 at 6:48 PM, Ghitulete Razvan wrote: > On Sun, Feb 24, 2013 at 7:37 PM, Maciej Fijalkowski > wrote: >> >> >> Hi >> >> There is some support for threading in RPython, see rlib.rthread. >> >> RPython might not be your good buddy here. The executables built are >> rather large and the GC (at least on default settings) will not be >> that interesting for bare metal. 
What sort of parameters are you >> looking at? >> >> Also, using RPython is really tedious, you have been warned (it's *not* >> Python). > > > Well, I don't seem to have much of a choice as I basically need source code > out of python and not a binary. Also from what I see there is no translator > that successfully translates full Python code into C/C++. As for parameters > I don't care that much about the binary as I am not running in a resource > restricted environment. I am actually running the baremetal binary on a > x86_64 workstation. > > > -- > Sincerely, > Razvan Ghitulete > Universitatea Politehnica Bucuresti > > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > http://mail.python.org/mailman/listinfo/pypy-dev > From Ronny.Pfannschmidt at gmx.de Sun Feb 24 18:58:25 2013 From: Ronny.Pfannschmidt at gmx.de (Ronny Pfannschmidt) Date: Sun, 24 Feb 2013 18:58:25 +0100 Subject: [pypy-dev] translating pypy benchmarks to C In-Reply-To: References: Message-ID: <512A54C1.1020506@gmx.de> are you trying to write an operating system in "python"? On 02/24/2013 06:48 PM, Ghitulete Razvan wrote: > On Sun, Feb 24, 2013 at 7:37 PM, Maciej Fijalkowski > wrote: > > > Hi > > There is some support for threading in RPython, see rlib.rthread. > > RPython might not be your good buddy here. The executables built are > rather large and the GC (at least on default settings) will not be > that interesting for bare metal. What sort of parameters are you > looking at? > > Also, using RPython is really tedious, you have been warned (it's > *not* Python). > > > Well, I don't seem to have much of a choice as I basically need source > code out of python and not a binary. Also from what I see there is no > translator that successfully translates full Python code into C/C++. As > for parameters I don't care that much about the binary as I am not > running in a resource restricted environment. 
I am actually running the > baremetal binary on a x86_64 workstation. > > > -- > Sincerely, > Razvan Ghitulete > Universitatea Politehnica Bucuresti > > > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > http://mail.python.org/mailman/listinfo/pypy-dev From razvan.ghitulete at gmail.com Sun Feb 24 19:09:20 2013 From: razvan.ghitulete at gmail.com (Ghitulete Razvan) Date: Sun, 24 Feb 2013 20:09:20 +0200 Subject: [pypy-dev] translating pypy benchmarks to C In-Reply-To: <512A54C1.1020506@gmx.de> References: <512A54C1.1020506@gmx.de> Message-ID: On Sun, Feb 24, 2013 at 7:58 PM, Ronny Pfannschmidt < Ronny.Pfannschmidt at gmx.de> wrote: > are you trying to write an operating system in "python"? > > No, I actually want to see how fast can python code go. Writing an operating system in python would be quite crazy, and I am still rather sane. Maybe in the future though. On Sun, Feb 24, 2013 at 7:54 PM, Douwe Osinga wrote: > Have you tried cython? > Actually I haven't. I remember of reading their front page, but it doesn't say anything explicitly there about converting to C code. Now that I've taken a better look, it seems interesting enough. I'll try and check it out. -- Sincerely, Razvan Ghitulete Universitatea Politehnica Bucuresti -------------- next part -------------- An HTML attachment was scrubbed... URL: From Ronny.Pfannschmidt at gmx.de Sun Feb 24 19:22:27 2013 From: Ronny.Pfannschmidt at gmx.de (Ronny Pfannschmidt) Date: Sun, 24 Feb 2013 19:22:27 +0100 Subject: [pypy-dev] translating pypy benchmarks to C In-Reply-To: References: <512A54C1.1020506@gmx.de> Message-ID: <512A5A63.9060505@gmx.de> On 02/24/2013 07:09 PM, Ghitulete Razvan wrote: > On Sun, Feb 24, 2013 at 7:58 PM, Ronny Pfannschmidt > > wrote: > > are you trying to write an operating system in "python"? > > No, I actually want to see how fast can python code go. Writing an > operating system in python would be quite crazy, and I am still rather > sane. 
Maybe in the future though. then i dont quite get why you want to use rpython - pypy+jit should do > > On Sun, Feb 24, 2013 at 7:54 PM, Douwe Osinga > wrote: > > Have you tried cython? > > Actually I haven't. I remember of reading their front page, but it > doesn't say anything explicitly there about converting to C code. Now > that I've taken a better look, it seems interesting enough. I'll try and > check it out. > > > -- > Sincerely, > Razvan Ghitulete > Universitatea Politehnica Bucuresti From razvan.ghitulete at gmail.com Sun Feb 24 19:44:19 2013 From: razvan.ghitulete at gmail.com (Ghitulete Razvan) Date: Sun, 24 Feb 2013 20:44:19 +0200 Subject: [pypy-dev] translating pypy benchmarks to C In-Reply-To: <512A5A63.9060505@gmx.de> References: <512A54C1.1020506@gmx.de> <512A5A63.9060505@gmx.de> Message-ID: On Sun, Feb 24, 2013 at 8:22 PM, Ronny Pfannschmidt < Ronny.Pfannschmidt at gmx.de> wrote: > then i dont quite get why you want to use rpython - pypy+jit should do > >> >> Ok let me rephrase that, because I fear it might not have been clear. By saying that I do not plan to write an operating system I mean that the resulted binary will not offer facilities to other programs(the common meaning of an operating system). On the other hand, by running on baremetal I mean that there will not actually be any operating system around to offer support and all code needs to be in binary form so that it can run. So yes, you can say that the resulting binary will be an operating system that will be aimed of doing a single task(in this case running various python benchmarks). -- Sincerely, Razvan Ghitulete Universitatea Politehnica Bucuresti -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From alex.gaynor at gmail.com Sun Feb 24 19:47:03 2013 From: alex.gaynor at gmail.com (Alex Gaynor) Date: Sun, 24 Feb 2013 10:47:03 -0800 Subject: [pypy-dev] translating pypy benchmarks to C In-Reply-To: References: <512A54C1.1020506@gmx.de> <512A5A63.9060505@gmx.de> Message-ID: What you are doing will not generate any information about how fast Python can be. It will show you the speed of RPython or Cython on baremetal, these are *NOT* python. Alex On Sun, Feb 24, 2013 at 10:44 AM, Ghitulete Razvan < razvan.ghitulete at gmail.com> wrote: > On Sun, Feb 24, 2013 at 8:22 PM, Ronny Pfannschmidt < > Ronny.Pfannschmidt at gmx.de> wrote: > >> then i dont quite get why you want to use rpython - pypy+jit should do >> >>> >>> > Ok let me rephrase that, because I fear it might not have been clear. By > saying that I do not plan to write an operating system I mean that the > resulted binary will not offer facilities to other programs(the common > meaning of an operating system). On the other hand, by running on baremetal > I mean that there will not actually be any operating system around to offer > support and all code needs to be in binary form so that it can run. So yes, > you can say that the resulting binary will be an operating system that will > be aimed of doing a single task(in this case running various python > benchmarks). > > -- > Sincerely, > Razvan Ghitulete > Universitatea Politehnica Bucuresti > > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > http://mail.python.org/mailman/listinfo/pypy-dev > > -- "I disapprove of what you say, but I will defend to the death your right to say it." -- Evelyn Beatrice Hall (summarizing Voltaire) "The people's good is the highest law." -- Cicero -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From santagada at gmail.com Sun Feb 24 20:17:07 2013 From: santagada at gmail.com (Leonardo Santagada) Date: Sun, 24 Feb 2013 16:17:07 -0300 Subject: [pypy-dev] translating pypy benchmarks to C In-Reply-To: References: <512A54C1.1020506@gmx.de> <512A5A63.9060505@gmx.de> Message-ID: On Sun, Feb 24, 2013 at 3:44 PM, Ghitulete Razvan < razvan.ghitulete at gmail.com> wrote: > On Sun, Feb 24, 2013 at 8:22 PM, Ronny Pfannschmidt < > Ronny.Pfannschmidt at gmx.de> wrote: > >> then i dont quite get why you want to use rpython - pypy+jit should do >> >>> >>> > Ok let me rephrase that, because I fear it might not have been clear. By > saying that I do not plan to write an operating system I mean that the > resulted binary will not offer facilities to other programs(the common > meaning of an operating system). On the other hand, by running on baremetal > I mean that there will not actually be any operating system around to offer > support and all code needs to be in binary form so that it can run. So yes, > you can say that the resulting binary will be an operating system that will > be aimed of doing a single task(in this case running various python > benchmarks). > So what I think you need is a pypy binary that can run without an os... the pypy binary needs a libc to access stuff, if you have one that you are using with other C software in your project maybe you can port pypy to it... probably a pthreads library will also be needed. What you need is to define a new platform and port the whole pypy to it... probably cross compiling from linux. I think that is how the arm port works and should be doable. -- Leonardo Santagada -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From davidmenhur at gmail.com Sun Feb 24 21:10:00 2013 From: davidmenhur at gmail.com (=?UTF-8?B?RGHPgGlk?=) Date: Sun, 24 Feb 2013 21:10:00 +0100 Subject: [pypy-dev] translating pypy benchmarks to C In-Reply-To: References: Message-ID: On 24 February 2013 18:54, Douwe Osinga wrote: > Have you tried cython? > Another possibility is Shedskin: "an *experimental* compiler, that can translate pure, but *implicitly statically typed* Python (2.4-2.6) programs into optimized C++. It can generate stand-alone programs or *extension modules* that can be imported and used in larger Python programs." https://code.google.com/p/shedskin/ It (intentionally) does not support type annotation, because they want to know how far they can go without it. David. -------------- next part -------------- An HTML attachment was scrubbed... URL: From fijall at gmail.com Sun Feb 24 21:33:21 2013 From: fijall at gmail.com (Maciej Fijalkowski) Date: Sun, 24 Feb 2013 22:33:21 +0200 Subject: [pypy-dev] translating pypy benchmarks to C In-Reply-To: References: Message-ID: On Sun, Feb 24, 2013 at 10:10 PM, Da?id wrote: > On 24 February 2013 18:54, Douwe Osinga wrote: >> >> Have you tried cython? > > > Another possibility is Shedskin: "an experimental compiler, that can > translate pure, but implicitly statically typed Python (2.4-2.6) programs > into optimized C++. It can generate stand-alone programs or extension > modules that can be imported and used in larger Python programs." > > https://code.google.com/p/shedskin/ > > It (intentionally) does not support type annotation, because they want to > know how far they can go without it. Shedskin is again, just as bad as RPython (and maybe worse) > > > David. 
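To make "implicitly statically typed" concrete: compilers in this family infer one fixed type per name instead of reading annotations, so a function like the following qualifies, while reassigning `total` to a string later would not (an illustrative sketch, not code from the Shedskin distribution):

```python
def harmonic(n):
    # Each name keeps a single inferable type throughout: n is always an
    # int, i always an int, total always a float - no annotations needed.
    total = 0.0
    for i in range(1, n + 1):
        total += 1.0 / i
    return total

print(harmonic(10))
```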
> > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > http://mail.python.org/mailman/listinfo/pypy-dev > From arigo at tunes.org Sun Feb 24 22:14:48 2013 From: arigo at tunes.org (Armin Rigo) Date: Sun, 24 Feb 2013 22:14:48 +0100 Subject: [pypy-dev] translating pypy benchmarks to C In-Reply-To: References: <512A54C1.1020506@gmx.de> <512A5A63.9060505@gmx.de> Message-ID: Hi Ghitulete, So are you saying that you don't want to use CPython because it's C, and you want to try C-less alternatives, or at least things that don't use libc? Then look elsewhere. An RPython program (which is definitely something different than a Python program) is translated to C code that uses libc. Changing this would be possible, but certainly not less work than, say, changing CPython to not use the libc. Which, I seem to recall, has been done long ago in an experiment of "booting CPython". Either way, I'm rather sure that this has nothing to do with seeing how fast Python runs. Using a regular PyPy is more or less the fastest known way to run full pure Python code, so far. If you're rather interested in restricted subsets of Python or other Python-ish languages, then yes, RPython is one of them, and others have been mentioned in this thread. A bient?t, Armin. From razvan.ghitulete at gmail.com Mon Feb 25 09:13:33 2013 From: razvan.ghitulete at gmail.com (Ghitulete Razvan) Date: Mon, 25 Feb 2013 10:13:33 +0200 Subject: [pypy-dev] translating pypy benchmarks to C In-Reply-To: References: <512A54C1.1020506@gmx.de> <512A5A63.9060505@gmx.de> Message-ID: On Sun, Feb 24, 2013 at 8:47 PM, Alex Gaynor wrote: > What you are doing will not generate any information about how fast Python > can be. It will show you the speed of RPython or Cython on baremetal, these > are *NOT* python. > > I really disapprove of this language purity stuff. If it compiles, it works. If it runs it's perfect. 
The idea behind this attempt is to see what can be done if one removes all possible overhead. So I would not like to go down that rabbit hole. On Sun, Feb 24, 2013 at 11:14 PM, Armin Rigo wrote: > Hi Ghitulete, > > So are you saying that you don't want to use CPython because it's C, > and you want to try C-less alternatives, or at least things that don't > use libc? Then look elsewhere. An RPython program (which is > definitely something different than a Python program) is translated to > C code that uses libc. Changing this would be possible, but certainly > not less work than, say, changing CPython to not use the libc. Which, > I seem to recall, has been done long ago in an experiment of "booting > CPython". > I have never said I want to try C-less alternatives, but as to my knowledge the only common ground between CPython and C, is that part of CPython is written in C, as opposed to generating C code. What I need is to get the equivalent C code of a python program. CPython on the other hand would need to have a VM to run the bytecode in, which I do not plan on doing. On Sun, Feb 24, 2013 at 9:17 PM, Leonardo Santagada wrote: > > So what I think you need is a pypy binary that can run without an os... > the pypy binary needs a libc to access stuff, if you have one that you are > using with other C software in your project maybe you can port pypy to > it... probably a pthreads library will also be needed. What you need is to > define a new platform and port the whole pypy to it... probably cross > compiling from linux. I think that is how the arm port works and should be > doable. > > I have pondered on doing that, but even though it is doable, it would require quite an effort as it would need a more complete environment than what I already have. Also, by porting pypy I would yet again get another layer between python code and hardware.
-- Sincerely, Razvan Ghitulete Universitatea Politehnica Bucuresti -------------- next part -------------- An HTML attachment was scrubbed... URL: From arigo at tunes.org Mon Feb 25 09:43:29 2013 From: arigo at tunes.org (Armin Rigo) Date: Mon, 25 Feb 2013 09:43:29 +0100 Subject: [pypy-dev] translating pypy benchmarks to C In-Reply-To: References: <512A54C1.1020506@gmx.de> <512A5A63.9060505@gmx.de> Message-ID: Hi, On Mon, Feb 25, 2013 at 9:13 AM, Ghitulete Razvan wrote: > On Sun, Feb 24, 2013 at 8:47 PM, Alex Gaynor wrote: >> What you are doing will not generate any information about how fast Python >> can be. It will show you the speed of RPython or Cython on baremetal, these >> are *NOT* python. > > I really disapprove of this language purity stuff. If it compiles, it works. The point that Alex tried to make is that if you take a random medium-sized Python program, like any of the benchmarks you wanted to use, then it is very unlikely that just by chance they happen to also be valid RPython or Cython code. In order to take a Python program that was not meant to be RPython, and turn it into RPython, for example, then you need to review and fix it completely --- it is quite an endeavour. I wouldn't call it "language purity stuff" at all. If you don't think what I'm saying here makes sense, just try. A bient?t, Armin. From razvan.ghitulete at gmail.com Mon Feb 25 09:58:51 2013 From: razvan.ghitulete at gmail.com (Ghitulete Razvan) Date: Mon, 25 Feb 2013 10:58:51 +0200 Subject: [pypy-dev] translating pypy benchmarks to C In-Reply-To: References: <512A54C1.1020506@gmx.de> <512A5A63.9060505@gmx.de> Message-ID: Hi, On Mon, Feb 25, 2013 at 10:43 AM, Armin Rigo wrote: > Hi, > > The point that Alex tried to make is that if you take a random > medium-sized Python program, like any of the benchmarks you wanted to > use, then it is very unlikely that just by chance they happen to also > be valid RPython or Cython code. 
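Armin's point fits in a few lines: both functions below are ordinary Python, yet the RPython annotator would reject them — the first because `x` has no single static type, the second because it closes over mutable state (the same closure issue bm_ai.py hit earlier in the thread). An illustrative sketch, not code from the benchmarks:

```python
def mixed(flag):
    # Fine in Python, rejected by RPython's annotator: x is an int on
    # one path and a string on the other, so no single static type fits.
    if flag:
        x = 42
    else:
        x = "forty-two"
    return x

def make_counter():
    # Fine in Python, outside the RPython subset: the inner function
    # closes over and mutates state from the enclosing scope.
    count = [0]
    def bump():
        count[0] += 1
        return count[0]
    return bump

bump = make_counter()
print(mixed(True), bump(), bump())
```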
In order to take a Python program > that was not meant to be RPython, and turn it into RPython, for > example, then you need to review and fix it completely --- it is quite > an endeavour. I wouldn't call it "language purity stuff" at all. If > you don't think what I'm saying here makes sense, just try. > > It's not that I don't think it makes sense and I am pretty sure it will prove to be an ordeal. Also, the reason I said that I don't want to go into language purity arguments is that it is actually pretty obvious that unless you get an actual VM you cannot say that the end result is pure python as some features of the language are really hard to get when you are using a direct translation. Though I am curious as to why there isn't any `full` python code translator. -- Sincerely, Razvan Ghitulete Universitatea Politehnica Bucuresti -------------- next part -------------- An HTML attachment was scrubbed... URL: From Ronny.Pfannschmidt at gmx.de Tue Feb 26 15:29:59 2013 From: Ronny.Pfannschmidt at gmx.de (Ronny Pfannschmidt) Date: Tue, 26 Feb 2013 15:29:59 +0100 Subject: [pypy-dev] continulet stacks on the heap and other schemes for io/threading Message-ID: <512CC6E7.1040209@gmx.de> Hi, over the last few weeks few ideas have been brooding in the back of my head, in particular after seeing how rust creates its stacks and handles io. The basis is allocating all stacks as non-movable structures on the heap. This would remove the need to copy the c/rpython level stack for continulets and enable to move them between native threads which is essentially enabling a M:N threading scheme. On top of that i would like to introduce transformation similar to the sandbox that would defer all IO to a io loop in a separate thread. 
Additionally it should change the threading abstractions to use said continuation instead of os level threads the result in case of a success would be a python that defers all blocking operations to a io loop in a separate thread im currently investigating libuv, since it does well for async io and also has utilities to defer blocking calls to c code to a pool of native threads -- Ronny From fijall at gmail.com Tue Feb 26 16:08:14 2013 From: fijall at gmail.com (Maciej Fijalkowski) Date: Tue, 26 Feb 2013 17:08:14 +0200 Subject: [pypy-dev] continulet stacks on the heap and other schemes for io/threading In-Reply-To: <512CC6E7.1040209@gmx.de> References: <512CC6E7.1040209@gmx.de> Message-ID: On Tue, Feb 26, 2013 at 4:29 PM, Ronny Pfannschmidt wrote: > Hi, > > over the last few weeks few ideas > have been brooding in the back of my head, > in particular after seeing how rust creates its stacks and handles io. > > The basis is allocating all stacks as non-movable structures on the heap. This is essentially done on jitframe-on-heap (as the name suggests) for the JIT. From fijall at gmail.com Tue Feb 26 16:11:13 2013 From: fijall at gmail.com (Maciej Fijalkowski) Date: Tue, 26 Feb 2013 17:11:13 +0200 Subject: [pypy-dev] cffi in stdlib Message-ID: Hello. I would like to discuss on the language summit a potential inclusion of cffi[1] into stdlib. This is a project Armin Rigo has been working for a while, with some input from other developers. It seems that the main reason why people would prefer ctypes over cffi these days is "because it's included in stdlib", which is not generally the reason I would like to hear. Our calls to not use C extensions and to use an FFI instead has seen very limited success with ctypes and quite a lot more since cffi got released. The API is fairly stable right now with minor changes going in and it'll definitely stablize until Python 3.4 release. 
Notable projects using it: * pypycore - gevent main loop ported to cffi * pgsql2cffi * sdl-cffi bindings * tls-cffi bindings * lxml-cffi port * cairo-cffi * pyzmq * a bunch of others So relatively a lot given that the project is not even a year old (it got 0.1 release in June). As per documentation, the advantages over ctypes: * The goal is to call C code from Python. You should be able to do so without learning a 3rd language: every alternative requires you to learn their own language (Cython, SWIG) or API (ctypes). So we tried to assume that you know Python and C and minimize the extra bits of API that you need to learn. * Keep all the Python-related logic in Python so that you don?t need to write much C code (unlike CPython native C extensions). * Work either at the level of the ABI (Application Binary Interface) or the API (Application Programming Interface). Usually, C libraries have a specified C API but often not an ABI (e.g. they may document a ?struct? as having at least these fields, but maybe more). (ctypes works at the ABI level, whereas Cython and native C extensions work at the API level.) * We try to be complete. For now some C99 constructs are not supported, but all C89 should be, including macros (and including macro ?abuses?, which you can manually wrap in saner-looking C functions). * We attempt to support both PyPy and CPython, with a reasonable path for other Python implementations like IronPython and Jython. * Note that this project is not about embedding executable C code in Python, unlike Weave. This is about calling existing C libraries from Python. so among other things, making a cffi extension gives you the same level of security as writing C (and unlike ctypes) and brings quite a bit more flexibility (API vs ABI issue) that let's you wrap arbitrary libraries, even those full of macros. Cheers, fijal .. 
[1]: http://cffi.readthedocs.org/en/release-0.5/ From fijall at gmail.com Tue Feb 26 16:13:05 2013 From: fijall at gmail.com (Maciej Fijalkowski) Date: Tue, 26 Feb 2013 17:13:05 +0200 Subject: [pypy-dev] cffi in stdlib In-Reply-To: References: Message-ID: Eh, I'm a moron, this was supposed to go to python-dev, not here. please ignore On Tue, Feb 26, 2013 at 5:11 PM, Maciej Fijalkowski wrote: > Hello. > > I would like to discuss on the language summit a potential inclusion > of cffi[1] into stdlib. This is a project Armin Rigo has been working > for a while, with some input from other developers. It seems that the > main reason why people would prefer ctypes over cffi these days is > "because it's included in stdlib", which is not generally the reason I > would like to hear. Our calls to not use C extensions and to use an > FFI instead has seen very limited success with ctypes and quite a lot > more since cffi got released. The API is fairly stable right now with > minor changes going in and it'll definitely stablize until Python 3.4 > release. Notable projects using it: > > * pypycore - gevent main loop ported to cffi > * pgsql2cffi > * sdl-cffi bindings > * tls-cffi bindings > * lxml-cffi port > * cairo-cffi > * pyzmq > * a bunch of others > > So relatively a lot given that the project is not even a year old (it > got 0.1 release in June). As per documentation, the advantages over > ctypes: > > * The goal is to call C code from Python. You should be able to do so > without learning a 3rd language: every alternative requires you to > learn their own language (Cython, SWIG) or API (ctypes). So we tried > to assume that you know Python and C and minimize the extra bits of > API that you need to learn. > > * Keep all the Python-related logic in Python so that you don?t need > to write much C code (unlike CPython native C extensions). > > * Work either at the level of the ABI (Application Binary Interface) > or the API (Application Programming Interface). 
Usually, C libraries > have a specified C API but often not an ABI (e.g. they may document a > ?struct? as having at least these fields, but maybe more). (ctypes > works at the ABI level, whereas Cython and native C extensions work at > the API level.) > > * We try to be complete. For now some C99 constructs are not > supported, but all C89 should be, including macros (and including > macro ?abuses?, which you can manually wrap in saner-looking C > functions). > > * We attempt to support both PyPy and CPython, with a reasonable path > for other Python implementations like IronPython and Jython. > > * Note that this project is not about embedding executable C code in > Python, unlike Weave. This is about calling existing C libraries from > Python. > > so among other things, making a cffi extension gives you the same > level of security as writing C (and unlike ctypes) and brings quite a > bit more flexibility (API vs ABI issue) that let's you wrap arbitrary > libraries, even those full of macros. > > Cheers, > fijal > > .. [1]: http://cffi.readthedocs.org/en/release-0.5/ From ddvento at ucar.edu Tue Feb 26 16:34:52 2013 From: ddvento at ucar.edu (Davide Del Vento) Date: Tue, 26 Feb 2013 08:34:52 -0700 Subject: [pypy-dev] cffi in stdlib In-Reply-To: References: Message-ID: <512CD61C.7040006@ucar.edu> Well, not so fast :-) I'm glad you posted it here since I don't follow python-dev (too many mailing lists) and I'm happy to hear about this proposal, even if there isn't much to discuss about it from the pypy side. Cheers. Davide Del Vento, On 02/26/2013 08:13 AM, Maciej Fijalkowski wrote: > Eh, I'm a moron, this was supposed to go to python-dev, not here. please ignore > > On Tue, Feb 26, 2013 at 5:11 PM, Maciej Fijalkowski wrote: >> Hello. >> >> I would like to discuss on the language summit a potential inclusion >> of cffi[1] into stdlib. This is a project Armin Rigo has been working >> for a while, with some input from other developers. 
It seems that the >> main reason why people would prefer ctypes over cffi these days is >> "because it's included in stdlib", which is not generally the reason I >> would like to hear. Our calls to not use C extensions and to use an >> FFI instead have seen very limited success with ctypes and quite a lot >> more since cffi got released. The API is fairly stable right now with >> minor changes going in and it'll definitely stabilize before the Python 3.4 >> release. Notable projects using it: >> >> * pypycore - gevent main loop ported to cffi >> * pgsql2cffi >> * sdl-cffi bindings >> * tls-cffi bindings >> * lxml-cffi port >> * cairo-cffi >> * pyzmq >> * a bunch of others >> >> So relatively a lot given that the project is not even a year old (it >> got 0.1 release in June). As per documentation, the advantages over >> ctypes: >> >> * The goal is to call C code from Python. You should be able to do so >> without learning a 3rd language: every alternative requires you to >> learn their own language (Cython, SWIG) or API (ctypes). So we tried >> to assume that you know Python and C and minimize the extra bits of >> API that you need to learn. >> >> * Keep all the Python-related logic in Python so that you don't need >> to write much C code (unlike CPython native C extensions). >> >> * Work either at the level of the ABI (Application Binary Interface) >> or the API (Application Programming Interface). Usually, C libraries >> have a specified C API but often not an ABI (e.g. they may document a >> "struct" as having at least these fields, but maybe more). (ctypes >> works at the ABI level, whereas Cython and native C extensions work at >> the API level.) >> >> * We try to be complete. For now some C99 constructs are not >> supported, but all C89 should be, including macros (and including >> macro "abuses", which you can manually wrap in saner-looking C >> functions).
>> >> * We attempt to support both PyPy and CPython, with a reasonable path >> for other Python implementations like IronPython and Jython. >> >> * Note that this project is not about embedding executable C code in >> Python, unlike Weave. This is about calling existing C libraries from >> Python. >> >> so among other things, making a cffi extension gives you the same >> level of security as writing C (and unlike ctypes) and brings quite a >> bit more flexibility (API vs ABI issue) that lets you wrap arbitrary >> libraries, even those full of macros. >> >> Cheers, >> fijal >> >> .. [1]: http://cffi.readthedocs.org/en/release-0.5/ > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > http://mail.python.org/mailman/listinfo/pypy-dev > From fijall at gmail.com Tue Feb 26 17:24:10 2013 From: fijall at gmail.com (Maciej Fijalkowski) Date: Tue, 26 Feb 2013 18:24:10 +0200 Subject: [pypy-dev] cffi in stdlib In-Reply-To: <512CD61C.7040006@ucar.edu> References: <512CD61C.7040006@ucar.edu> Message-ID: On Tue, Feb 26, 2013 at 5:34 PM, Davide Del Vento wrote: > Well, not so fast :-) > I'm glad you posted it here since I don't follow python-dev (too many > mailing lists) and I'm happy to hear about this proposal, even if there > isn't much to discuss about it from the pypy side. There is not much more than I described in the mail "put it in" :) > > Cheers. > Davide Del Vento, > > > On 02/26/2013 08:13 AM, Maciej Fijalkowski wrote: >> >> Eh, I'm a moron, this was supposed to go to python-dev, not here. please >> ignore >> >> On Tue, Feb 26, 2013 at 5:11 PM, Maciej Fijalkowski >> wrote: >>> >>> Hello. >>> >>> I would like to discuss on the language summit a potential inclusion >>> of cffi[1] into stdlib. This is a project Armin Rigo has been working >>> for a while, with some input from other developers.
It seems that the >>> main reason why people would prefer ctypes over cffi these days is >>> "because it's included in stdlib", which is not generally the reason I >>> would like to hear. Our calls to not use C extensions and to use an >>> FFI instead have seen very limited success with ctypes and quite a lot >>> more since cffi got released. The API is fairly stable right now with >>> minor changes going in and it'll definitely stabilize before the Python 3.4 >>> release. Notable projects using it: >>> >>> * pypycore - gevent main loop ported to cffi >>> * pgsql2cffi >>> * sdl-cffi bindings >>> * tls-cffi bindings >>> * lxml-cffi port >>> * cairo-cffi >>> * pyzmq >>> * a bunch of others >>> >>> So relatively a lot given that the project is not even a year old (it >>> got 0.1 release in June). As per documentation, the advantages over >>> ctypes: >>> >>> * The goal is to call C code from Python. You should be able to do so >>> without learning a 3rd language: every alternative requires you to >>> learn their own language (Cython, SWIG) or API (ctypes). So we tried >>> to assume that you know Python and C and minimize the extra bits of >>> API that you need to learn. >>> >>> * Keep all the Python-related logic in Python so that you don't need >>> to write much C code (unlike CPython native C extensions). >>> >>> * Work either at the level of the ABI (Application Binary Interface) >>> or the API (Application Programming Interface). Usually, C libraries >>> have a specified C API but often not an ABI (e.g. they may document a >>> "struct" as having at least these fields, but maybe more). (ctypes >>> works at the ABI level, whereas Cython and native C extensions work at >>> the API level.) >>> >>> * We try to be complete. For now some C99 constructs are not >>> supported, but all C89 should be, including macros (and including >>> macro "abuses", which you can manually wrap in saner-looking C >>> functions).
>>> >>> * We attempt to support both PyPy and CPython, with a reasonable path >>> for other Python implementations like IronPython and Jython. >>> >>> * Note that this project is not about embedding executable C code in >>> Python, unlike Weave. This is about calling existing C libraries from >>> Python. >>> >>> so among other things, making a cffi extension gives you the same >>> level of security as writing C (and unlike ctypes) and brings quite a >>> bit more flexibility (API vs ABI issue) that lets you wrap arbitrary >>> libraries, even those full of macros. >>> >>> Cheers, >>> fijal >>> >>> .. [1]: http://cffi.readthedocs.org/en/release-0.5/ >> >> _______________________________________________ >> pypy-dev mailing list >> pypy-dev at python.org >> http://mail.python.org/mailman/listinfo/pypy-dev >> > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > http://mail.python.org/mailman/listinfo/pypy-dev From arigo at tunes.org Tue Feb 26 19:41:27 2013 From: arigo at tunes.org (Armin Rigo) Date: Tue, 26 Feb 2013 19:41:27 +0100 Subject: [pypy-dev] continulet stacks on the heap and other schemes for io/threading In-Reply-To: References: <512CC6E7.1040209@gmx.de> Message-ID: Hi, On Tue, Feb 26, 2013 at 4:08 PM, Maciej Fijalkowski wrote: >> The basis is allocating all stacks as non-movable structures on the heap. > > This is essentially done on jitframe-on-heap (as the name suggests) for the JIT. I think that what Ronny has in mind is different. Unless I'm mistaken, Rust's "tasks" are green threads: still thread-like structures, but fully managed by the process. They each have their own C-level stack. That's why you can run 100'000 of them in maybe 1 GB of RAM (rough order of magnitude), but not 1 or 10 million of them. CPython's Stackless and PyPy's stacklets allow basically one order of magnitude more. For PyPy as well as "hard-switching" Stackless this comes at the cost of needing to do copies around.
I suppose it's a trade-off between this cost and the extra memory of green threads, so I cannot judge a priori which solution is the best --- it probably depends on the use case. The jitframe-on-heap branch "just" enables, finally, PyPy's existing coroutines to be fully JITted. A bientôt, Armin. From aidembb at yahoo.com Wed Feb 27 21:10:04 2013 From: aidembb at yahoo.com (Roger Flores) Date: Wed, 27 Feb 2013 12:10:04 -0800 (PST) Subject: [pypy-dev] Slow int code Message-ID: <1361995804.94140.YahooMailNeo@web162205.mail.bf1.yahoo.com> Hi guys. I've been looking at two simple routines using jitviewer to figure out why they're so much slower than expected. I've also noticed that http://pypy.org/performance.html has the line "Bad examples include doing computations with large longs - which is performed by unoptimizable support code.". I'm worried that my 32 bit int code is falling into this, and I'm wondering what I can do to avoid it? Trivial code like if (self.low ^ self.high) & 0x80000000 == 0: is expanding into several dozen asm instructions. I'm suspecting that lines like self.low = (self.low << 1) & 0xffffffff with its shift left are convincing the jit to consider the int to need 64 bits (large long?) instead of 32. Ideas? The asm is clearly operating on QWORDs and calling routines to do the bit arithmetic instead of single instructions. Is this what that line in performance.html is warning about? -Roger BTW Fijal's jitviewer is a *must see* for anyone interested in how pypy makes their code fast!
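The bit-twiddling Roger describes boils down to a pattern like the following. This is an illustrative sketch only, not the actual diz.py code; the class and method names are invented for the example:

```python
# Minimal sketch of the masked 32-bit arithmetic discussed above.
# Illustrative only: this is NOT the actual diz.py code; the class and
# method names here are made up.

MASK32 = 0xffffffff
TOP_BIT = 0x80000000

class RangeState(object):
    """A 32-bit [low, high] interval, as in a range coder."""

    def __init__(self):
        self.low = 0
        self.high = MASK32

    def output_ready(self):
        # True when low and high agree on the top bit, i.e. one more
        # output bit is fully determined.
        return (self.low ^ self.high) & TOP_BIT == 0

    def shift(self):
        # Renormalize: shift one bit out, masking back to 32 bits.
        self.low = (self.low << 1) & MASK32
        self.high = ((self.high << 1) | 1) & MASK32

s = RangeState()
s.low, s.high = 0x80000000, 0xc0000000
print(s.output_ready())          # both top bits are 1 -> True
s.shift()
print(hex(s.low), hex(s.high))   # 0x0 0x80000001
```

Every intermediate value here is masked back into 32 bits, so on a 64-bit build each of them fits comfortably in a single machine word.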
From alex.gaynor at gmail.com Wed Feb 27 21:23:33 2013 From: alex.gaynor at gmail.com (Alex Gaynor) Date: Wed, 27 Feb 2013 12:23:33 -0800 Subject: [pypy-dev] Slow int code In-Reply-To: <1361995804.94140.YahooMailNeo@web162205.mail.bf1.yahoo.com> References: <1361995804.94140.YahooMailNeo@web162205.mail.bf1.yahoo.com> Message-ID: In that context large longs means HUNDREDS or THOUSANDS of bits, not 64 :) Can you show us a full runnable example that illustrates this? Alex On Wed, Feb 27, 2013 at 12:10 PM, Roger Flores wrote: > Hi guys. I've been looking at two simple routines using jitviewer to > figure out why they're so much slower than expected. > > > I've also noticed that http://pypy.org/performance.html has the line "Bad > examples include doing computations with > large longs - which is performed by unoptimizable support code.". I'm > worried that my 32 bit int code is falling into this, and I'm wondering > what I can do to avoid it? > > Trivial code like > > if (self.low ^ self.high) & 0x80000000 == 0: > > > is expanding into several dozen asm instructions. I'm suspecting that > lines like > > self.low = (self.low << 1) & 0xffffffff > > > with its shift left are convincing the jit to consider the int to need 64 > bits (large long?) instead of 32. > > > Ideas? The asm is clearly operating on QWORDs and calling routines to do > the bit arithmetic instead of single instructions. Is this what that line > in performance.html is warning about? > > > > -Roger > > BTW Fijal's jitviewer is a *must see* for anyone interested in how pypy > makes their code fast! > > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > http://mail.python.org/mailman/listinfo/pypy-dev > -- "I disapprove of what you say, but I will defend to the death your right to say it." -- Evelyn Beatrice Hall (summarizing Voltaire) "The people's good is the highest law." -- Cicero -------------- next part -------------- An HTML attachment was scrubbed...
URL: From alex.gaynor at gmail.com Wed Feb 27 21:35:18 2013 From: alex.gaynor at gmail.com (Alex Gaynor) Date: Wed, 27 Feb 2013 12:35:18 -0800 Subject: [pypy-dev] Slow int code In-Reply-To: <1361997160.20274.YahooMailNeo@web162202.mail.bf1.yahoo.com> References: <1361995804.94140.YahooMailNeo@web162205.mail.bf1.yahoo.com> <1361997160.20274.YahooMailNeo@web162202.mail.bf1.yahoo.com> Message-ID: The original source code would be best! Thanks, Alex On Wed, Feb 27, 2013 at 12:32 PM, Roger Flores wrote: > Would you like a paste from jitviewer or the source code to run and > examine with jitviewer? > > -Roger > > > ------------------------------ > *From:* Alex Gaynor > *To:* Roger Flores > *Cc:* "pypy-dev at python.org" > *Sent:* Wednesday, February 27, 2013 12:23 PM > *Subject:* Re: [pypy-dev] Slow int code > > In that context large longs means HUNDREDS or THOUSANDS of bits, not 64 :) > Can you show us a full runnable example that illustrates this? > > Alex > > > On Wed, Feb 27, 2013 at 12:10 PM, Roger Flores wrote: > > Hi guys. I've been looking at two simple routines using jitviewer to > figure out why they're so much slower than expected. > > > I've also noticed that http://pypy.org/performance.html has the line "Bad > examples include doing computations with > large longs - which is performed by unoptimizable support code.". I'm > worried that my 32 bit int code is falling into this, and I'm wondering > what I can do to avoid it? > > Trivial code like > > if (self.low ^ self.high) & 0x80000000 == 0: > > > is expanding into several dozen asm instructions. I'm suspecting that > lines like > > self.low = (self.low << 1) & 0xffffffff > > > with its shift left are convincing the jit to consider the int to need 64 > bits (large long?) instead of 32. > > > Ideas? The asm is clearly operating on QWORDs and calling routines to do > the bit arithmetic instead of single instructions. Is this what that line > in performance.html is warning about?
> > > -Roger > > BTW Fijal's jitviewer is a *must see* for anyone interested in how pypy > makes their code fast! > > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > http://mail.python.org/mailman/listinfo/pypy-dev > -- "I disapprove of what you say, but I will defend to the death your right to say it." -- Evelyn Beatrice Hall (summarizing Voltaire) "The people's good is the highest law." -- Cicero -------------- next part -------------- An HTML attachment was scrubbed... URL: From aidembb at yahoo.com Wed Feb 27 21:32:40 2013 From: aidembb at yahoo.com (Roger Flores) Date: Wed, 27 Feb 2013 12:32:40 -0800 (PST) Subject: [pypy-dev] Slow int code In-Reply-To: References: <1361995804.94140.YahooMailNeo@web162205.mail.bf1.yahoo.com> Message-ID: <1361997160.20274.YahooMailNeo@web162202.mail.bf1.yahoo.com> Would you like a paste from jitviewer or the source code to run and examine with jitviewer? -Roger ________________________________ From: Alex Gaynor To: Roger Flores Cc: "pypy-dev at python.org" Sent: Wednesday, February 27, 2013 12:23 PM Subject: Re: [pypy-dev] Slow int code In that context large longs means HUNDREDS or THOUSANDS of bits, not 64 :) Can you show us a full runnable example that illustrates this? Alex On Wed, Feb 27, 2013 at 12:10 PM, Roger Flores wrote: Hi guys. I've been looking at two simple routines using jitviewer to figure out why they're so much slower than expected. > > >I've also noticed that http://pypy.org/performance.html has the line "Bad examples include doing computations with >large longs - which is performed by unoptimizable support code.". I'm worried that my 32 bit int code is falling into this, and I'm wondering what I can do to avoid it?
>Trivial code like > >if (self.low ^ self.high) & 0x80000000 == 0: > > >is expanding into several dozen asm instructions. I'm suspecting that lines like > >self.low = (self.low << 1) & 0xffffffff > > >with its shift left are convincing the jit to consider the int to need 64 bits (large long?) instead of 32. > > >Ideas? The asm is clearly operating on QWORDs and calling routines to do the bit arithmetic instead of single instructions. Is this what that line in performance.html is warning about? > > > >-Roger > >BTW Fijal's jitviewer is a *must see* for anyone interested in how pypy makes their code fast! > >_______________________________________________ >pypy-dev mailing list >pypy-dev at python.org >http://mail.python.org/mailman/listinfo/pypy-dev > -- "I disapprove of what you say, but I will defend to the death your right to say it." -- Evelyn Beatrice Hall (summarizing Voltaire) "The people's good is the highest law." -- Cicero -------------- next part -------------- An HTML attachment was scrubbed... URL: From aidembb at yahoo.com Wed Feb 27 21:55:48 2013 From: aidembb at yahoo.com (Roger Flores) Date: Wed, 27 Feb 2013 12:55:48 -0800 (PST) Subject: [pypy-dev] Slow int code In-Reply-To: References: <1361995804.94140.YahooMailNeo@web162205.mail.bf1.yahoo.com> <1361997160.20274.YahooMailNeo@web162202.mail.bf1.yahoo.com> Message-ID: <1361998548.58023.YahooMailNeo@web162202.mail.bf1.yahoo.com> I'll email the code separately because I'm not sure everyone wants a tiny zip. Anyone is welcome to it. It's a newer version of the compressor I entered into the Large Text Compression Benchmark. I'm running it as: PYPYLOG=jit-log-opt,jit-backend:dizlog.pypylog pypy diz.py -p -t frank.txt You can get frank.txt from http://www.gutenberg.org/ebooks/84 (and rename it) or substitute a similar file. Examine the second line in output():     if (self.low ^ self.high) & 0x80000000 == 0: The remaining lines are similar.
Also, the routine encode() listed one line above in jitviewer has the same issues. If I comment out the two calls to encode(), I save a huge percentage of time (up to 40% in some configurations). -Roger ________________________________ From: Alex Gaynor To: Roger Flores Cc: "pypy-dev at python.org" Sent: Wednesday, February 27, 2013 12:35 PM Subject: Re: [pypy-dev] Slow int code The original source code would be best! Thanks, Alex On Wed, Feb 27, 2013 at 12:32 PM, Roger Flores wrote: Would you like a paste from jitviewer or the source code to run and examine with jitviewer? > >-Roger > > > > > > >________________________________ > From: Alex Gaynor >To: Roger Flores >Cc: "pypy-dev at python.org" >Sent: Wednesday, February 27, 2013 12:23 PM >Subject: Re: [pypy-dev] Slow int code > > > >In that context large longs means HUNDREDS or THOUSANDS of bits, not 64 :) Can you show us a full runnable example that illustrates this? > > >Alex > > > >On Wed, Feb 27, 2013 at 12:10 PM, Roger Flores wrote: > >Hi guys. I've been looking at two simple routines using jitviewer to figure out why they're so much slower than expected. >> >> >>I've also noticed that http://pypy.org/performance.html has the line "Bad examples include doing computations with >>large longs - which is performed by unoptimizable support code.". I'm worried that my 32 bit int code is falling into this, and I'm wondering what I can do to avoid it? >> >>Trivial code like >> >>if (self.low ^ self.high) & 0x80000000 == 0: >> >> >>is expanding into several dozen asm instructions. I'm suspecting that lines like >> >>self.low = (self.low << 1) & 0xffffffff >> >> >>with its shift left are convincing the jit to consider the int to need 64 bits (large long?) instead of 32. >> >> >>Ideas? The asm is clearly operating on QWORDs and calling routines to do the bit arithmetic instead of single instructions. Is this what that line in performance.html is warning about?
>> >> >> >>-Roger >> >>BTW Fijal's jitviewer is a *must see* for anyone interested in how pypy makes their code fast! >> >>_______________________________________________ >>pypy-dev mailing list >>pypy-dev at python.org >>http://mail.python.org/mailman/listinfo/pypy-dev >> > > > >-- >"I disapprove of what you say, but I will defend to the death your right to say it." -- Evelyn Beatrice Hall (summarizing Voltaire) >"The people's good is the highest law." -- Cicero > > > -- "I disapprove of what you say, but I will defend to the death your right to say it." -- Evelyn Beatrice Hall (summarizing Voltaire) "The people's good is the highest law." -- Cicero -------------- next part -------------- An HTML attachment was scrubbed... URL:
From arigo at tunes.org Thu Feb 28 16:28:24 2013 From: arigo at tunes.org (Armin Rigo) Date: Thu, 28 Feb 2013 16:28:24 +0100 Subject: [pypy-dev] Slow int code In-Reply-To: <1361998548.58023.YahooMailNeo@web162202.mail.bf1.yahoo.com> References: <1361995804.94140.YahooMailNeo@web162205.mail.bf1.yahoo.com> <1361997160.20274.YahooMailNeo@web162202.mail.bf1.yahoo.com> <1361998548.58023.YahooMailNeo@web162202.mail.bf1.yahoo.com> Message-ID: Hi Roger, On Wed, Feb 27, 2013 at 9:55 PM, Roger Flores wrote: > I'll email the code separately because I'm not sure everyone wants a tiny > zip. I'm sure no-one would mind receiving a tiny zip; or just use a paste site if your program is a single module. (http://bpaste.net/) A bientôt, Armin. From aidembb at yahoo.com Thu Feb 28 18:00:15 2013 From: aidembb at yahoo.com (Roger Flores) Date: Thu, 28 Feb 2013 09:00:15 -0800 (PST) Subject: [pypy-dev] Slow int code In-Reply-To: References: <1361995804.94140.YahooMailNeo@web162205.mail.bf1.yahoo.com> <1361997160.20274.YahooMailNeo@web162202.mail.bf1.yahoo.com> <1361998548.58023.YahooMailNeo@web162202.mail.bf1.yahoo.com> Message-ID: <1362070815.93192.YahooMailNeo@web162206.mail.bf1.yahoo.com> >I'm sure no-one would mind receiving a tiny zip; OK then. Unzip it, grab a text file large enough to warm up the jit, and run the line to generate the log for jitviewer. The issue is the codegen for the output() function, and is there anything about my python code that's unintentionally confusing for pypy? Say, there isn't by chance a command that will show the types that pypy annotates for the classes in a program? That way I could easily check that important data structures are easily and correctly understood, type-wise. Thanks, -Roger -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: diz-3.zip Type: application/zip Size: 37058 bytes Desc: not available URL:
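Alex's earlier clarification in this thread, that "large longs" means integers of hundreds or thousands of bits rather than 64, can be seen directly with `int.bit_length()`. This is plain Python, nothing PyPy-specific:

```python
# "Large longs" means hundreds or thousands of bits.  A value masked to
# 32 bits always fits in one machine word on a 64-bit build; only
# genuinely huge integers require arbitrary-precision support code.
small = (0x7fffffff << 1) & 0xffffffff   # masked back into 32 bits
big = 1 << 1000                          # a genuinely "large long"
print(small.bit_length())   # 32
print(big.bit_length())     # 1001
```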