From bra at fsn.hu Fri Jun 3 03:45:33 2016
From: bra at fsn.hu (Nagy, Attila)
Date: Fri, 3 Jun 2016 09:45:33 +0200
Subject: [pypy-dev] Twisted's deferToThread is much more slower with pypy than on cpython
Message-ID: <5751359D.5020901@fsn.hu>

Hi,

Consider this example program:
https://gist.github.com/bra-fsn/1fd481b44590a939e849cb9073ba1a41

# pypy /tmp/deft.py 1
deferToThread: avg 1310.90 us, sync: avg 1.13 us, 1155.26x increase
deferToThread: avg 248.42 us, sync: avg 0.19 us, 1288.78x increase
deferToThread: avg 288.20 us, sync: avg 0.20 us, 1450.80x increase
deferToThread: avg 366.34 us, sync: avg 0.20 us, 1860.96x increase
deferToThread: avg 300.38 us, sync: avg 0.20 us, 1510.52x increase
deferToThread: avg 341.94 us, sync: avg 0.20 us, 1751.63x increase
deferToThread: avg 282.96 us, sync: avg 0.20 us, 1432.18x increase
deferToThread: avg 232.87 us, sync: avg 0.20 us, 1185.64x increase
deferToThread: avg 350.51 us, sync: avg 0.20 us, 1751.06x increase
deferToThread: avg 379.09 us, sync: avg 0.20 us, 1903.13x increase

While this runs, it uses only around 15% CPU. Running this on more threads yields even worse results:

# pypy /tmp/deft.py 8
deferToThread: avg 8236.71 us, sync: avg 0.28 us, 29359.01x increase
deferToThread: avg 8286.90 us, sync: avg 0.32 us, 26029.29x increase
deferToThread: avg 8345.67 us, sync: avg 0.67 us, 12436.32x increase
deferToThread: avg 8685.53 us, sync: avg 0.33 us, 26617.55x increase
deferToThread: avg 8689.54 us, sync: avg 0.48 us, 18093.90x increase
deferToThread: avg 8743.51 us, sync: avg 0.31 us, 28124.92x increase
deferToThread: avg 8760.89 us, sync: avg 0.32 us, 27270.95x increase
deferToThread: avg 8757.82 us, sync: avg 0.32 us, 27791.52x increase
deferToThread: avg 5884.83 us, sync: avg 0.22 us, 26213.80x increase
deferToThread: avg 5927.90 us, sync: avg 0.23 us, 25773.31x increase
deferToThread: avg 6410.40 us, sync: avg 0.23 us, 27943.53x increase
deferToThread: avg 6133.28 us, sync: avg 0.23 us, 26417.04x increase
deferToThread: avg 6182.15 us, sync: avg 0.23 us, 26529.44x increase
deferToThread: avg 6679.73 us, sync: avg 0.22 us, 30022.54x increase
deferToThread: avg 6390.30 us, sync: avg 0.22 us, 28647.95x increase
deferToThread: avg 6509.35 us, sync: avg 0.23 us, 28201.91x increase
deferToThread: avg 6640.61 us, sync: avg 0.23 us, 28499.73x increase
deferToThread: avg 6508.86 us, sync: avg 0.23 us, 28293.32x increase
deferToThread: avg 6394.18 us, sync: avg 0.23 us, 27980.45x increase
deferToThread: avg 6584.74 us, sync: avg 0.23 us, 28983.66x increase
deferToThread: avg 6806.19 us, sync: avg 0.24 us, 28535.81x increase

The process now only uses around 2% CPU.
Running on even more threads:

# pypy /tmp/deft.py 16
deferToThread: avg 27107.54 us, sync: avg 0.29 us, 94991.56x increase
deferToThread: avg 28288.76 us, sync: avg 0.29 us, 96298.54x increase
deferToThread: avg 28365.01 us, sync: avg 3.86 us, 7346.72x increase
deferToThread: avg 28927.80 us, sync: avg 0.33 us, 86955.17x increase
deferToThread: avg 28958.93 us, sync: avg 0.29 us, 101428.44x increase
deferToThread: avg 29037.54 us, sync: avg 0.30 us, 98424.06x increase
deferToThread: avg 29223.17 us, sync: avg 0.29 us, 99730.33x increase
deferToThread: avg 29270.21 us, sync: avg 0.37 us, 78388.17x increase
deferToThread: avg 29312.24 us, sync: avg 0.32 us, 90785.14x increase
deferToThread: avg 29462.58 us, sync: avg 0.29 us, 101798.22x increase
deferToThread: avg 29475.06 us, sync: avg 0.32 us, 91309.64x increase
deferToThread: avg 29618.17 us, sync: avg 0.28 us, 105142.35x increase
deferToThread: avg 29781.02 us, sync: avg 0.32 us, 93284.17x increase
deferToThread: avg 29986.18 us, sync: avg 3.21 us, 9327.75x increase
deferToThread: avg 30258.60 us, sync: avg 0.42 us, 71364.92x increase
deferToThread: avg 30525.20 us, sync: avg 0.28 us, 107570.08x increase

This eats between 0.2-1% CPU.

And now on cpython:

# python /tmp/deft.py 1
deferToThread: avg 316.17 us, sync: avg 1.38 us, 228.71x increase
deferToThread: avg 312.92 us, sync: avg 1.38 us, 226.96x increase
deferToThread: avg 320.22 us, sync: avg 1.39 us, 230.37x increase
deferToThread: avg 317.33 us, sync: avg 1.35 us, 235.24x increase

# python /tmp/deft.py 8
deferToThread: avg 2542.90 us, sync: avg 1.37 us, 1854.14x increase
deferToThread: avg 2544.50 us, sync: avg 1.35 us, 1878.13x increase
deferToThread: avg 2544.47 us, sync: avg 1.36 us, 1864.52x increase
deferToThread: avg 2544.52 us, sync: avg 1.38 us, 1839.01x increase
deferToThread: avg 2544.92 us, sync: avg 1.36 us, 1871.81x increase
deferToThread: avg 2546.71 us, sync: avg 1.39 us, 1830.35x increase
deferToThread: avg 2552.38 us, sync: avg 1.35 us, 1893.17x increase
deferToThread: avg 2552.40 us, sync: avg 1.36 us, 1870.20x increase

# python /tmp/deft.py 16
deferToThread: avg 4745.76 us, sync: avg 1.26 us, 3770.11x increase
deferToThread: avg 4748.67 us, sync: avg 1.24 us, 3817.03x increase
deferToThread: avg 4749.81 us, sync: avg 1.26 us, 3756.39x increase
deferToThread: avg 4749.72 us, sync: avg 1.24 us, 3839.88x increase
deferToThread: avg 4749.87 us, sync: avg 1.28 us, 3709.99x increase
deferToThread: avg 4752.63 us, sync: avg 1.24 us, 3842.90x increase
deferToThread: avg 4752.53 us, sync: avg 1.23 us, 3866.08x increase
deferToThread: avg 4752.55 us, sync: avg 1.23 us, 3855.40x increase
deferToThread: avg 4754.03 us, sync: avg 1.29 us, 3678.09x increase
deferToThread: avg 4754.97 us, sync: avg 1.25 us, 3817.19x increase
deferToThread: avg 4755.45 us, sync: avg 1.32 us, 3593.28x increase
deferToThread: avg 4756.35 us, sync: avg 1.25 us, 3804.18x increase
deferToThread: avg 4756.19 us, sync: avg 1.29 us, 3687.73x increase
deferToThread: avg 4757.19 us, sync: avg 1.23 us, 3860.74x increase
deferToThread: avg 4758.02 us, sync: avg 1.24 us, 3824.33x increase
deferToThread: avg 4759.63 us, sync: avg 1.24 us, 3830.40x increase

The interpreter uses around 100% CPU.

Python 2.7.11
PyPy 5.1.1
FreeBSD 10/amd64

Why deferToThread is so slow is another question, but here I'm interested in pypy's abysmal performance.
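(The gist above is not reproduced in the archive. What follows is only a minimal sketch of what such a measurement might look like; the payload, the iteration count and the way the concurrent loops are started are assumptions, not the original /tmp/deft.py.)

# Sketch of a deferToThread-vs-synchronous timing loop; the payload and
# counts are assumptions, not the original gist.
import sys
import time

from twisted.internet import defer, reactor
from twisted.internet.threads import deferToThread


def work():
    # Trivial payload so the timing is dominated by thread-pool dispatch.
    return sum(range(10))


@defer.inlineCallbacks
def benchmark(iterations=1000):
    start = time.time()
    for _ in range(iterations):
        yield deferToThread(work)
    deferred_us = (time.time() - start) / iterations * 1e6

    start = time.time()
    for _ in range(iterations):
        work()
    sync_us = (time.time() - start) / iterations * 1e6

    print("deferToThread: avg %.2f us, sync: avg %.2f us, %.2fx increase"
          % (deferred_us, sync_us, deferred_us / sync_us))


def main():
    workers = int(sys.argv[1]) if len(sys.argv) > 1 else 1
    reactor.suggestThreadPoolSize(workers)

    def start():
        # One benchmark loop per requested worker, all running concurrently.
        done = defer.gatherResults([benchmark() for _ in range(workers)])
        done.addBoth(lambda _: reactor.stop())

    reactor.callWhenRunning(start)
    reactor.run()


if __name__ == "__main__":
    main()

Invoked as "pypy deft_sketch.py 8" (a hypothetical file name), it starts eight concurrent loops against a thread pool of the same size and prints one average line per loop.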
From phyo.arkarlwin at gmail.com Sun Jun 5 19:50:07 2016 From: phyo.arkarlwin at gmail.com (Phyo Arkar) Date: Sun, 05 Jun 2016 23:50:07 +0000 Subject: [pypy-dev] pypy3.3-v5.2-alpha1 release In-Reply-To: References: <574CAE11.7080504@gmail.com> Message-ID: Thanks a lot for continuing development of pypy 3.3. can't wait for pypy 3.5 On Tue, May 31, 2016 at 2:10 PM Konstantin Lopuhin wrote: > Ah, sorry, my bad: I did not know about ensurepip module, checked if > after reading the blog post, and it works: it installs pip and "python > -m pip" works as expected (although "pip" still points to the system > one). > > 2016-05-31 10:08 GMT+03:00 Konstantin Lopuhin : > > Thanks for doing the release, this is awesome! > > > > I tried the OS X version > > ( > https://bitbucket.org/pypy/pypy/downloads/pypy3.3-v5.2.0-alpha1-osx64.tar.bz2 > ), > > it runs, but it does not seem to have pip (or perhaps I did something > > wrong): https://bpaste.net/show/f5e10b16bc22 > > > > 2016-05-31 0:18 GMT+03:00 Matti Picus : > >> We are almost ready to release pypy3.3-v5.2-alpha1, but need some help > to > >> make sure all is OK. > >> Please try the download packages here > >> https://bitbucket.org/pypy/pypy/downloads > >> also, if someone could check that the links related to pypy3.3 all work > on > >> this page that would be great > >> http://pypy.org/download.html > >> If the page is still showing PyPy3 2.4.0, wait 30 minutes or so from > now and > >> then hit refresh on your browser. > >> > >> Thanks, and kudos to the pypy3 team. > >> Matti > >> _______________________________________________ > >> pypy-dev mailing list > >> pypy-dev at python.org > >> https://mail.python.org/mailman/listinfo/pypy-dev > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > https://mail.python.org/mailman/listinfo/pypy-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: From matti.picus at gmail.com Tue Jun 7 18:11:34 2016 From: matti.picus at gmail.com (Matti Picus) Date: Wed, 8 Jun 2016 01:11:34 +0300 Subject: [pypy-dev] PyPy2.7 v5.3 release candidates available for download Message-ID: <57574696.3040508@gmail.com> Please check the binaries and source packages available at https://bitbucket.org/pypy/pypy/downloads I uses a slightly different naming scheme to differentiate from pypy3.3, so the files are now https://bitbucket.org/pypy/pypy/downloads/pypy2-v5.3.0-* Matti From matti.picus at gmail.com Wed Jun 8 15:53:06 2016 From: matti.picus at gmail.com (Matti Picus) Date: Wed, 8 Jun 2016 22:53:06 +0300 Subject: [pypy-dev] PyPy2 v5.3.0 released, help us get the word out Message-ID: <575877A2.3090104@gmail.com> An HTML attachment was scrubbed... URL: From matti.picus at gmail.com Wed Jun 8 15:55:12 2016 From: matti.picus at gmail.com (Matti Picus) Date: Wed, 8 Jun 2016 22:55:12 +0300 Subject: [pypy-dev] PyPy2 v5.3.0 released, help us get the word out In-Reply-To: <575877A2.3090104@gmail.com> References: <575877A2.3090104@gmail.com> Message-ID: <57587820.3090800@gmail.com> And this time in plain text not html https://morepypy.blogspot.co.il/2016/06/pypy2-v53-released-major-c-extension.html This is the culmination of ~6 months of teamwork to get cpyext to a state where it can run lxml, numpy, scipy, matplotlib, It runs, but is slow and maybe buggy. 
We need to create some new momentum around PyPy, to encourage uptake, to get more sponsorships, and to get some new contributors, so now is the time to try this version out, and show it off to your friends, coworkers, and managers. Matti From phyo.arkarlwin at gmail.com Wed Jun 8 15:57:37 2016 From: phyo.arkarlwin at gmail.com (Phyo Arkar) Date: Wed, 08 Jun 2016 19:57:37 +0000 Subject: [pypy-dev] PyPy2 v5.3.0 released, help us get the word out In-Reply-To: <57587820.3090800@gmail.com> References: <575877A2.3090104@gmail.com> <57587820.3090800@gmail.com> Message-ID: WOW!! Thank you veryy veryy much on keeping working on cpyext!! On Thu, Jun 9, 2016 at 2:25 AM Matti Picus wrote: > And this time in plain text not html > > > https://morepypy.blogspot.co.il/2016/06/pypy2-v53-released-major-c-extension.html > This is the culmination of ~6 months of teamwork to get cpyext to a > state where it can run lxml, numpy, scipy, matplotlib, > It runs, but is slow and maybe buggy. > We need to create some new momentum around PyPy, to encourage uptake, to > get more sponsorships, and to get some new contributors, so now is the > time to try this version out, and show it off to your friends, > coworkers, and managers. > > Matti > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > https://mail.python.org/mailman/listinfo/pypy-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rymg19 at gmail.com Wed Jun 8 16:38:48 2016 From: rymg19 at gmail.com (Ryan Gonzalez) Date: Wed, 8 Jun 2016 15:38:48 -0500 Subject: [pypy-dev] PyPy2 v5.3.0 released, help us get the word out In-Reply-To: <575877A2.3090104@gmail.com> References: <575877A2.3090104@gmail.com> Message-ID: Posted to /r/programming! -- Ryan [ERROR]: Your autotools build scripts are 200 lines longer than your program. Something?s wrong. http://kirbyfan64.github.io/ On Jun 8, 2016 2:53 PM, "Matti Picus" wrote: > > https://morepypy.blogspot.co.il/2016/06/pypy2-v53-released-major-c-extension.html > This is the culmination of ~6 months of teamwork to get cpyext to a state > where it can run lxml, numpy, scipy, matplotlib, > It runs, but is slow and maybe buggy. > We need to create some new momentum around PyPy, to encourage uptake, to > get more sponsorships, and to get some new contributors, so now is the time > to try this version out, and show it off to your friends, coworkers, and > managers. > > Matti > > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > https://mail.python.org/mailman/listinfo/pypy-dev > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From phyo.arkarlwin at gmail.com Wed Jun 8 17:13:43 2016 From: phyo.arkarlwin at gmail.com (Phyo Arkar) Date: Wed, 08 Jun 2016 21:13:43 +0000 Subject: [pypy-dev] PyPy2 v5.3.0 released, help us get the word out In-Reply-To: References: <575877A2.3090104@gmail.com> Message-ID: Nice to see nim guys on PyPy too , Kirbyfan! On Thu, Jun 9, 2016 at 3:09 AM Ryan Gonzalez wrote: > Posted to /r/programming! > > -- > Ryan > [ERROR]: Your autotools build scripts are 200 lines longer than your > program. Something?s wrong. > http://kirbyfan64.github.io/ > On Jun 8, 2016 2:53 PM, "Matti Picus" wrote: > >> >> https://morepypy.blogspot.co.il/2016/06/pypy2-v53-released-major-c-extension.html >> This is the culmination of ~6 months of teamwork to get cpyext to a state >> where it can run lxml, numpy, scipy, matplotlib, >> It runs, but is slow and maybe buggy. 
>> We need to create some new momentum around PyPy, to encourage uptake, to >> get more sponsorships, and to get some new contributors, so now is the time >> to try this version out, and show it off to your friends, coworkers, and >> managers. >> >> Matti >> >> _______________________________________________ >> pypy-dev mailing list >> pypy-dev at python.org >> https://mail.python.org/mailman/listinfo/pypy-dev >> >> _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > https://mail.python.org/mailman/listinfo/pypy-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: From wickedgrey at gmail.com Wed Jun 8 17:39:26 2016 From: wickedgrey at gmail.com (Eli Stevens (Gmail)) Date: Wed, 8 Jun 2016 14:39:26 -0700 Subject: [pypy-dev] v5.3.0 numpy support (specifically numpy/core/src/multiarray/multiarray.c.src) Message-ID: I asked about the following about a month ago: ====================================================================== ERROR: Failure: ImportError (No module named numpy.core.multiarray_tests) ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/elis/venv/ppdb/site-packages/nose/loader.py", line 418, in loadTestsFromName addr.filename, addr.module) File "/home/elis/venv/ppdb/site-packages/nose/importer.py", line 47, in importFromPath return self.importFromDir(dir_path, fqname) File "/home/elis/venv/ppdb/site-packages/nose/importer.py", line 94, in importFromDir mod = load_module(part_fqname, fh, filename, desc) File "/home/elis/venv/ppdb/site-packages/numpy/core/tests/test_multiarray.py", line 22, in from numpy.core.multiarray_tests import ( ImportError: No module named numpy.core.multiarray_tests And was told by Matti: > multiarray_tests comes from numpy/core/src/multiarray/multiarray.c.src which is compiled to a C-API module. > We skip building it as we have quite a way to go before we can support that level of C-API compatibility. Is there any change in behavior expected with the new release regarding these kinds of errors? Thanks, Eli From matti.picus at gmail.com Thu Jun 9 01:15:51 2016 From: matti.picus at gmail.com (Matti Picus) Date: Thu, 9 Jun 2016 08:15:51 +0300 Subject: [pypy-dev] v5.3.0 numpy support (specifically numpy/core/src/multiarray/multiarray.c.src) In-Reply-To: References: Message-ID: <5758FB87.4000101@gmail.com> No, the latest release does not solve this issue. And in a bit more detail: The latest release supports upstream numpy (mostly) using a C-API compatibility layer we call cpyext. This is a totally different approach than using our internal _numpypy module and the python-level code from bitbucket/pypy/numpy. While the new approach allows you to use upstream (i.e. github.com/numpy/numpy) numpy over the latest release of PyPY, the code will run much slower, and the JIT will not be able to help since it cannot look inside C code. We have some ideas on how to make this fast, but they are just ideas right now. 
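(A small illustration of why that boundary hurts, not code from this thread: the per-element loop below makes one C-API crossing per item, which the JIT cannot look inside, while the pure-Python loop stays JIT-visible and the vectorised call keeps the whole loop inside numpy's C code. It assumes upstream numpy is installed under the interpreter being tested; the sizes are arbitrary.)

# Illustrative micro-benchmark only: contrasts JIT-visible pure-Python work
# with per-element calls that cross into a C-extension module (numpy via
# cpyext on PyPy).
import time

def timed(label, fn):
    start = time.time()
    result = fn()
    print("%s: %.3f s (result=%s)" % (label, time.time() - start, result))

N = 10 ** 6

data = list(range(N))
timed("pure-Python list sum", lambda: sum(x * x for x in data))

try:
    import numpy as np
except ImportError:
    print("numpy not installed; skipping the cpyext half")
else:
    arr = np.arange(N)
    # Looping element by element forces one C-API crossing per item.
    timed("per-element numpy loop", lambda: sum(int(x) * int(x) for x in arr))
    # A single vectorised call keeps the loop inside numpy's C code instead.
    timed("vectorised numpy dot", lambda: int(np.dot(arr, arr)))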
Matti On 09/06/16 00:39, Eli Stevens (Gmail) wrote: > I asked about the following about a month ago: > > ====================================================================== > ERROR: Failure: ImportError (No module named numpy.core.multiarray_tests) > ---------------------------------------------------------------------- > Traceback (most recent call last): > File "/home/elis/venv/ppdb/site-packages/nose/loader.py", line 418, > in loadTestsFromName > addr.filename, addr.module) > File "/home/elis/venv/ppdb/site-packages/nose/importer.py", line 47, > in importFromPath > return self.importFromDir(dir_path, fqname) > File "/home/elis/venv/ppdb/site-packages/nose/importer.py", line 94, > in importFromDir > mod = load_module(part_fqname, fh, filename, desc) > File "/home/elis/venv/ppdb/site-packages/numpy/core/tests/test_multiarray.py", > line 22, in > from numpy.core.multiarray_tests import ( > ImportError: No module named numpy.core.multiarray_tests > > And was told by Matti: > >> multiarray_tests comes from numpy/core/src/multiarray/multiarray.c.src which is compiled to a C-API module. >> We skip building it as we have quite a way to go before we can support that level of C-API compatibility. > Is there any change in behavior expected with the new release > regarding these kinds of errors? > > Thanks, > Eli > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > https://mail.python.org/mailman/listinfo/pypy-dev From stuaxo2 at yahoo.com Thu Jun 9 11:05:00 2016 From: stuaxo2 at yahoo.com (Stuart Axon) Date: Thu, 9 Jun 2016 15:05:00 +0000 (UTC) Subject: [pypy-dev] PyPy2 v5.3.0 released, help us get the word out In-Reply-To: References: <575877A2.3090104@gmail.com> Message-ID: <908649286.198068.1465484700614.JavaMail.yahoo@mail.yahoo.com> Awesome work, I ask this every time (on one site or another) - is cpyext complete enough to use Gtk3 (gi.repository) yet. ? ?S++ On Wednesday, June 8, 2016 10:14 PM, Phyo Arkar wrote: Nice to see nim guys on PyPy too , Kirbyfan! On Thu, Jun 9, 2016 at 3:09 AM Ryan Gonzalez wrote: Posted to /r/programming!-- Ryan [ERROR]: Your autotools build scripts are 200 lines longer than your program. Something?s wrong. http://kirbyfan64.github.io/On Jun 8, 2016 2:53 PM, "Matti Picus" wrote: https://morepypy.blogspot.co.il/2016/06/pypy2-v53-released-major-c-extension.html This is the culmination of ~6 months of teamwork to get cpyext to a state where it can run lxml, numpy, scipy, matplotlib, It runs, but is slow and maybe buggy. We need to create some new momentum around PyPy, to encourage uptake, to get more sponsorships, and to get some new contributors, so now is the time to try this version out, and show it off to your friends, coworkers, and managers. Matti _______________________________________________ pypy-dev mailing list pypy-dev at python.org https://mail.python.org/mailman/listinfo/pypy-dev _______________________________________________ pypy-dev mailing list pypy-dev at python.org https://mail.python.org/mailman/listinfo/pypy-dev _______________________________________________ pypy-dev mailing list pypy-dev at python.org https://mail.python.org/mailman/listinfo/pypy-dev -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From matti.picus at gmail.com Thu Jun 9 11:13:11 2016 From: matti.picus at gmail.com (Matti Picus) Date: Thu, 9 Jun 2016 18:13:11 +0300 Subject: [pypy-dev] PyPy2 v5.3.0 released, help us get the word out In-Reply-To: <908649286.198068.1465484700614.JavaMail.yahoo@mail.yahoo.com> References: <575877A2.3090104@gmail.com> <908649286.198068.1465484700614.JavaMail.yahoo@mail.yahoo.com> Message-ID: <57598787.5060706@gmail.com> cpyext should be complete enough, give it a try and let us know what's missing. As mentioned elsewhere, there may be some incompatibilites and it certainly will be slower than cpython. If it doesn't work, please file an issue with exact steps to replicate. Matti On 09/06/16 18:05, Stuart Axon wrote: > Awesome work, I ask this every time (on one site or another) - is > cpyext complete enough to use Gtk3 (gi.repository) yet. ? > S++ > > > From edd at theunixzoo.co.uk Thu Jun 9 11:19:02 2016 From: edd at theunixzoo.co.uk (Edd Barrett) Date: Thu, 9 Jun 2016 16:19:02 +0100 Subject: [pypy-dev] W^X in the PyPy JIT? Message-ID: <20160609151902.GD93981@wilfred.home> Hi! Recently OpenBSD has started enforcing W^X [1] for all programs, unless the program is on a file system with a special 'wxallowed' flag. Regardless of the flag, a message is emitted to the dmesg buffer. Starting PyPy will emit such a message: pypy(62301): mmap W^X violation I suppose the memory the JIT is using for traces has been mapped W+X? Assuming this is the case, the security of PyPy (on all platforms) could be improved by mapping the memory W during trace compilation, and then re-mapping the memory X once compilation is complete. Of course, it might not be the JIT at all... Cheers [1] https://en.wikipedia.org/wiki/W%5EX -- Best Regards Edd Barrett http://www.theunixzoo.co.uk From arigo at tunes.org Thu Jun 9 19:44:52 2016 From: arigo at tunes.org (Armin Rigo) Date: Fri, 10 Jun 2016 01:44:52 +0200 Subject: [pypy-dev] W^X in the PyPy JIT? In-Reply-To: <20160609151902.GD93981@wilfred.home> References: <20160609151902.GD93981@wilfred.home> Message-ID: Hi Edd, On 9 June 2016 at 17:19, Edd Barrett wrote: > Recently OpenBSD has started enforcing W^X [1] for all programs, unless > the program is on a file system with a special 'wxallowed' flag. We already saw the "W^X" issue. A possible solution would be indeed for the JIT to map W when it compiles and then X "when it's done", except that it is never done, as there are various places that can be patched at various point (including jump targets, and now also some young GC pointers that are replaced with their old equivalent during the next minor collection). It's not completely easy to do. It's not impossible, though. Patches welcome. A bient?t, Armin. From arigo at tunes.org Thu Jun 9 19:49:50 2016 From: arigo at tunes.org (Armin Rigo) Date: Fri, 10 Jun 2016 01:49:50 +0200 Subject: [pypy-dev] PyPy2 v5.3.0 released, help us get the word out In-Reply-To: <57598787.5060706@gmail.com> References: <575877A2.3090104@gmail.com> <908649286.198068.1465484700614.JavaMail.yahoo@mail.yahoo.com> <57598787.5060706@gmail.com> Message-ID: Hi Stuart, hi all, On 9 June 2016 at 17:13, Matti Picus wrote: > If it doesn't work, please file an issue with exact steps to replicate. I should add that by now, with a cpyext that seems to start working on "hard" cases like numpy, chances are high that if you file an issue for a particular package we'll quickly fix the problem. 
:-) Armin From raffael.tfirst at gmail.com Tue Jun 14 15:52:59 2016 From: raffael.tfirst at gmail.com (Raffael Tfirst) Date: Tue, 14 Jun 2016 21:52:59 +0200 Subject: [pypy-dev] Implementation of some Python 3.5 features for more PyPy 3.5 support Message-ID: Hi, I am Raffael Tfirst, currently working on PyPy 3.5 for the Google Summer of Code. Since it might be of interest to some, I want to give you a short summary of what I do. The Python 3.5 features I implement in PyPy are the @ operator for matrix multiplications (PEP 465), additional unpacking generalizations (PEP 448) and coroutines with async and await syntax (PEP 492). A short description of each feature can be found here (at New Syntax Features): https://docs.python.org/3/whatsnew/3.5.html It would be cool to have those features implemented in PyPy because they will most likely be used a lot in future Python 3.5 programs, that is why I want to close this gap and push PyPy a bit closer to Python 3.5 functionality. I started working on this about a month ago. As things stand now, the @ operator already works. The additional unpacking feature will be ready soon, too, since a great part is already implemented. There's just some nasty errors I need to fix before it can be tested. In about two weeks progress will speed up a lot, because then I'm going to have way more time for this project. For those who are interested, I work on the py3.5 branch. You can also check out my progress on my blog: http://pypy35syntax.blogspot.co.at/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From songofacandy at gmail.com Wed Jun 15 03:47:02 2016 From: songofacandy at gmail.com (INADA Naoki) Date: Wed, 15 Jun 2016 16:47:02 +0900 Subject: [pypy-dev] Implementation of some Python 3.5 features for more PyPy 3.5 support In-Reply-To: References: Message-ID: I'm happy to hear that someone working on PyPy 3.5. Thank you for your effort! On Wed, Jun 15, 2016 at 4:52 AM, Raffael Tfirst wrote: > Hi, > I am Raffael Tfirst, currently working on PyPy 3.5 for the Google Summer > of Code. > > Since it might be of interest to some, I want to give you a short summary > of what I do. > > The Python 3.5 features I implement in PyPy are the @ operator for matrix > multiplications (PEP 465), additional unpacking generalizations (PEP 448) > and coroutines with async and await syntax (PEP 492). A short description > of each feature can be found here (at New Syntax Features): > https://docs.python.org/3/whatsnew/3.5.html > It would be cool to have those features implemented in PyPy because they > will most likely be used a lot in future Python 3.5 programs, that is why I > want to close this gap and push PyPy a bit closer to Python 3.5 > functionality. > > I started working on this about a month ago. As things stand now, the @ > operator already works. The additional unpacking feature will be ready > soon, too, since a great part is already implemented. There's just some > nasty errors I need to fix before it can be tested. > In about two weeks progress will speed up a lot, because then I'm going to > have way more time for this project. > > For those who are interested, I work on the py3.5 branch. > > You can also check out my progress on my blog: > http://pypy35syntax.blogspot.co.at/ > > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > https://mail.python.org/mailman/listinfo/pypy-dev > > -- INADA Naoki -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From matti.picus at gmail.com Thu Jun 16 16:37:18 2016 From: matti.picus at gmail.com (Matti Picus) Date: Thu, 16 Jun 2016 23:37:18 +0300 Subject: [pypy-dev] bugfix PyPy 2.7 v5.3.1 ready for release Message-ID: <57630DFE.3080303@gmail.com> We fixed a few bugs and would like to release, please help us make sure everything is ready to go. The downloads are here https://bitbucket.org/pypy/pypy/downloads the donwload page has been updated http://pypy.org/download.html and the release notice is ready to be posted to the blog http://doc.pypy.org/en/latest/release-pypy2.7-v5.3.1.html http://doc.pypy.org/en/latest/whatsnew-pypy2-5.3.1.html Let me know if something needs fixing... Thanks Matti From omer.drow at gmail.com Mon Jun 20 02:51:51 2016 From: omer.drow at gmail.com (Omer Katz) Date: Mon, 20 Jun 2016 06:51:51 +0000 Subject: [pypy-dev] PyParallel-style threads Message-ID: Hi all, There was an experiment based on CPython's code called PyParallel that allows running threads in parallel without STM and modifying source code of both Python and C extensions. The only limitation is that they disallow mutation of global state in parallel context. I briefly mentioned it before on PyPy's freenode channel. I'd like to discuss why the approach is useful, how it can benefit PyPy users and how can it be implemented. Allowing to run in parallel without mutating global state can help servers use each thread to handle a request. It can also allow to log in parallel or send an HTTP request (or an AMQP message) without sharing the response with the main thread. This is useful in some cases and since PyParallel managed to keep the same semantics it (shouldn't) break CPyExt. If we keep to the following rules: 1. No global state mutation is allowed 2. No new keywords or code modifications required 3. No CPyExt code is allowed (for now) I believe that users can somewhat benefit from this implementation if done correctly. As for implementation, if we can trace the code running in the thread and ensure it's not mutating global state and that CPyExt is never used during the thread's course we can simply release the GIL when such a thread is run. That requires less knowledge than using STM and less code modifications. However I think that attempting to do so will introduce the same issue with caching traces (Armin am I correct here?). As for CPyExt, we could copy the same code modifications that PyParallels did but I suspect that it will be so slow that the benefit of running in parallel will be completely lost for all cases but very long threads. Is what I'm suggesting even possible? How challenging will it be? Thanks, Omer Katz. -------------- next part -------------- An HTML attachment was scrubbed... URL: From fijall at gmail.com Mon Jun 20 10:47:25 2016 From: fijall at gmail.com (Maciej Fijalkowski) Date: Mon, 20 Jun 2016 16:47:25 +0200 Subject: [pypy-dev] PyParallel-style threads In-Reply-To: References: Message-ID: so quick question - what's the win compared to multiple processes? On Mon, Jun 20, 2016 at 8:51 AM, Omer Katz wrote: > Hi all, > There was an experiment based on CPython's code called PyParallel that > allows running threads in parallel without STM and modifying source code of > both Python and C extensions. The only limitation is that they disallow > mutation of global state in parallel context. > I briefly mentioned it before on PyPy's freenode channel. > I'd like to discuss why the approach is useful, how it can benefit PyPy > users and how can it be implemented. 
> Allowing to run in parallel without mutating global state can help servers > use each thread to handle a request. It can also allow to log in parallel or > send an HTTP request (or an AMQP message) without sharing the response with > the main thread. This is useful in some cases and since PyParallel managed > to keep the same semantics it (shouldn't) break CPyExt. > If we keep to the following rules: > > No global state mutation is allowed > No new keywords or code modifications required > No CPyExt code is allowed (for now) > > I believe that users can somewhat benefit from this implementation if done > correctly. > As for implementation, if we can trace the code running in the thread and > ensure it's not mutating global state and that CPyExt is never used during > the thread's course we can simply release the GIL when such a thread is run. > That requires less knowledge than using STM and less code modifications. > However I think that attempting to do so will introduce the same issue with > caching traces (Armin am I correct here?). > > As for CPyExt, we could copy the same code modifications that PyParallels > did but I suspect that it will be so slow that the benefit of running in > parallel will be completely lost for all cases but very long threads. > > Is what I'm suggesting even possible? How challenging will it be? > > Thanks, > Omer Katz. > > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > https://mail.python.org/mailman/listinfo/pypy-dev > From arigo at tunes.org Mon Jun 20 10:53:11 2016 From: arigo at tunes.org (Armin Rigo) Date: Mon, 20 Jun 2016 16:53:11 +0200 Subject: [pypy-dev] PyParallel-style threads In-Reply-To: References: Message-ID: Hi Omer, On 20 June 2016 at 08:51, Omer Katz wrote: > As for implementation, if we can trace the code running in the thread and > ensure it's not mutating global state and that CPyExt is never used during > the thread's course we can simply release the GIL when such a thread is run. That's a very hand-wavy and vague description. To start with, how do you define exactly "not mutating global state"? We are not allowed to write to any of the objects that existed before we started the thread? It may be possible to have such an implementation, yes. Actually, that's probably easy: tweak the STM code to crash instead of doing something more complicated when we write to an old object. I'm not sure how useful that would be---or how useful PyParallel is on CPython. Maybe if you can point us to real usages of PyParallel it would be a start. A bient?t, Armin. From mdomans at gmail.com Mon Jun 20 11:02:46 2016 From: mdomans at gmail.com (=?UTF-8?B?TWljaGHFgiBEb21hxYRza2k=?=) Date: Mon, 20 Jun 2016 17:02:46 +0200 Subject: [pypy-dev] PyParallel-style threads In-Reply-To: References: Message-ID: So I actually thought about similar approach. I was curious what do you think about approach to concurrency similar to what Apple did with C blocks and GCD. That is: enable threading but instead of the STM approach have fully explicit mutations within atomic blocks 2016-06-20 16:53 GMT+02:00 Armin Rigo : > Hi Omer, > > On 20 June 2016 at 08:51, Omer Katz wrote: > > As for implementation, if we can trace the code running in the thread and > > ensure it's not mutating global state and that CPyExt is never used > during > > the thread's course we can simply release the GIL when such a thread is > run. > > That's a very hand-wavy and vague description. 
To start with, how do > you define exactly "not mutating global state"? We are not allowed to > write to any of the objects that existed before we started the thread? > It may be possible to have such an implementation, yes. Actually, > that's probably easy: tweak the STM code to crash instead of doing > something more complicated when we write to an old object. > > I'm not sure how useful that would be---or how useful PyParallel is on > CPython. Maybe if you can point us to real usages of PyParallel it > would be a start. > > > A bient?t, > > Armin. > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > https://mail.python.org/mailman/listinfo/pypy-dev > -- --------------------------- Micha? Doma?ski -------------- next part -------------- An HTML attachment was scrubbed... URL: From omer.drow at gmail.com Mon Jun 20 12:42:09 2016 From: omer.drow at gmail.com (Omer Katz) Date: Mon, 20 Jun 2016 16:42:09 +0000 Subject: [pypy-dev] PyParallel-style threads In-Reply-To: References: Message-ID: Let's review what forking does in Python from a 10,000ft view: 1) It pickles the current state of the process. 2) Starts a new Python process 3) Unpickles the current state of the process There are a lot more memory allocations when forking comparing to starting a new thread. That makes forking unsuitable for small workloads. I'm guessing that PyPy does not save the trace/optimized ASM of the forked process in the parent process so each time you start a new process you have to trace again which makes small workloads even less suitable and even large processing batches will need to be traced again. In case of pre-forking servers, each PyPy instance has to trace and optimize the same code when there is no reason. Threads would allow us to reduce warmup time for this case. It will also consume less memory. ??????? ??? ??, 20 ????? 2016 ?-17:47 ??? ?Maciej Fijalkowski?? :? > so quick question - what's the win compared to multiple processes? > > On Mon, Jun 20, 2016 at 8:51 AM, Omer Katz wrote: > > Hi all, > > There was an experiment based on CPython's code called PyParallel that > > allows running threads in parallel without STM and modifying source code > of > > both Python and C extensions. The only limitation is that they disallow > > mutation of global state in parallel context. > > I briefly mentioned it before on PyPy's freenode channel. > > I'd like to discuss why the approach is useful, how it can benefit PyPy > > users and how can it be implemented. > > Allowing to run in parallel without mutating global state can help > servers > > use each thread to handle a request. It can also allow to log in > parallel or > > send an HTTP request (or an AMQP message) without sharing the response > with > > the main thread. This is useful in some cases and since PyParallel > managed > > to keep the same semantics it (shouldn't) break CPyExt. > > If we keep to the following rules: > > > > No global state mutation is allowed > > No new keywords or code modifications required > > No CPyExt code is allowed (for now) > > > > I believe that users can somewhat benefit from this implementation if > done > > correctly. > > As for implementation, if we can trace the code running in the thread and > > ensure it's not mutating global state and that CPyExt is never used > during > > the thread's course we can simply release the GIL when such a thread is > run. > > That requires less knowledge than using STM and less code modifications. 
> > However I think that attempting to do so will introduce the same issue > with > > caching traces (Armin am I correct here?). > > > > As for CPyExt, we could copy the same code modifications that PyParallels > > did but I suspect that it will be so slow that the benefit of running in > > parallel will be completely lost for all cases but very long threads. > > > > Is what I'm suggesting even possible? How challenging will it be? > > > > Thanks, > > Omer Katz. > > > > _______________________________________________ > > pypy-dev mailing list > > pypy-dev at python.org > > https://mail.python.org/mailman/listinfo/pypy-dev > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From fijall at gmail.com Mon Jun 20 13:16:34 2016 From: fijall at gmail.com (Maciej Fijalkowski) Date: Mon, 20 Jun 2016 19:16:34 +0200 Subject: [pypy-dev] PyParallel-style threads In-Reply-To: References: Message-ID: no, you misunderstood me: if you want to use multiple processes, you not gonna start a new one per thing to do. You'll have a process pool and use that. Also, if you don't use multiprocessing, you don't use pickling, you use something sane for communication. The PyParallels essentially allows read-only access to the global state, but read-only is ill defined and ill enforced (especially in the case of cpy extensions) in Python. So what do you get as opposed to multiple processing? On Mon, Jun 20, 2016 at 6:42 PM, Omer Katz wrote: > Let's review what forking does in Python from a 10,000ft view: > 1) It pickles the current state of the process. > 2) Starts a new Python process > 3) Unpickles the current state of the process > There are a lot more memory allocations when forking comparing to starting a > new thread. That makes forking unsuitable for small workloads. > I'm guessing that PyPy does not save the trace/optimized ASM of the forked > process in the parent process so each time you start a new process you have > to trace again which makes small workloads even less suitable and even large > processing batches will need to be traced again. > > In case of pre-forking servers, each PyPy instance has to trace and optimize > the same code when there is no reason. Threads would allow us to reduce > warmup time for this case. It will also consume less memory. > > ??????? ??? ??, 20 ????? 2016 ?-17:47 ??? ?Maciej Fijalkowski?? > :? >> >> so quick question - what's the win compared to multiple processes? >> >> On Mon, Jun 20, 2016 at 8:51 AM, Omer Katz wrote: >> > Hi all, >> > There was an experiment based on CPython's code called PyParallel that >> > allows running threads in parallel without STM and modifying source code >> > of >> > both Python and C extensions. The only limitation is that they disallow >> > mutation of global state in parallel context. >> > I briefly mentioned it before on PyPy's freenode channel. >> > I'd like to discuss why the approach is useful, how it can benefit PyPy >> > users and how can it be implemented. >> > Allowing to run in parallel without mutating global state can help >> > servers >> > use each thread to handle a request. It can also allow to log in >> > parallel or >> > send an HTTP request (or an AMQP message) without sharing the response >> > with >> > the main thread. This is useful in some cases and since PyParallel >> > managed >> > to keep the same semantics it (shouldn't) break CPyExt. 
>> > If we keep to the following rules: >> > >> > No global state mutation is allowed >> > No new keywords or code modifications required >> > No CPyExt code is allowed (for now) >> > >> > I believe that users can somewhat benefit from this implementation if >> > done >> > correctly. >> > As for implementation, if we can trace the code running in the thread >> > and >> > ensure it's not mutating global state and that CPyExt is never used >> > during >> > the thread's course we can simply release the GIL when such a thread is >> > run. >> > That requires less knowledge than using STM and less code modifications. >> > However I think that attempting to do so will introduce the same issue >> > with >> > caching traces (Armin am I correct here?). >> > >> > As for CPyExt, we could copy the same code modifications that >> > PyParallels >> > did but I suspect that it will be so slow that the benefit of running in >> > parallel will be completely lost for all cases but very long threads. >> > >> > Is what I'm suggesting even possible? How challenging will it be? >> > >> > Thanks, >> > Omer Katz. >> > >> > _______________________________________________ >> > pypy-dev mailing list >> > pypy-dev at python.org >> > https://mail.python.org/mailman/listinfo/pypy-dev >> > From omer.drow at gmail.com Mon Jun 20 13:28:02 2016 From: omer.drow at gmail.com (Omer Katz) Date: Mon, 20 Jun 2016 17:28:02 +0000 Subject: [pypy-dev] PyParallel-style threads In-Reply-To: References: Message-ID: In preforking (as in the case of a process pool), you use less memory and reduce the tracing/optimization time on the code since the same PyPy instance already traced and optimized that part of the code. ??????? ??? ??, 20 ????? 2016 ?-20:16 ??? ?Maciej Fijalkowski?? :? > no, you misunderstood me: > > if you want to use multiple processes, you not gonna start a new one > per thing to do. You'll have a process pool and use that. Also, if you > don't use multiprocessing, you don't use pickling, you use something > sane for communication. The PyParallels essentially allows read-only > access to the global state, but read-only is ill defined and ill > enforced (especially in the case of cpy extensions) in Python. So what > do you get as opposed to multiple processing? > > On Mon, Jun 20, 2016 at 6:42 PM, Omer Katz wrote: > > Let's review what forking does in Python from a 10,000ft view: > > 1) It pickles the current state of the process. > > 2) Starts a new Python process > > 3) Unpickles the current state of the process > > There are a lot more memory allocations when forking comparing to > starting a > > new thread. That makes forking unsuitable for small workloads. > > I'm guessing that PyPy does not save the trace/optimized ASM of the > forked > > process in the parent process so each time you start a new process you > have > > to trace again which makes small workloads even less suitable and even > large > > processing batches will need to be traced again. > > > > In case of pre-forking servers, each PyPy instance has to trace and > optimize > > the same code when there is no reason. Threads would allow us to reduce > > warmup time for this case. It will also consume less memory. > > > > ??????? ??? ??, 20 ????? 2016 ?-17:47 ??? ?Maciej Fijalkowski?? > > :? > >> > >> so quick question - what's the win compared to multiple processes? 
> >> > >> On Mon, Jun 20, 2016 at 8:51 AM, Omer Katz wrote: > >> > Hi all, > >> > There was an experiment based on CPython's code called PyParallel that > >> > allows running threads in parallel without STM and modifying source > code > >> > of > >> > both Python and C extensions. The only limitation is that they > disallow > >> > mutation of global state in parallel context. > >> > I briefly mentioned it before on PyPy's freenode channel. > >> > I'd like to discuss why the approach is useful, how it can benefit > PyPy > >> > users and how can it be implemented. > >> > Allowing to run in parallel without mutating global state can help > >> > servers > >> > use each thread to handle a request. It can also allow to log in > >> > parallel or > >> > send an HTTP request (or an AMQP message) without sharing the response > >> > with > >> > the main thread. This is useful in some cases and since PyParallel > >> > managed > >> > to keep the same semantics it (shouldn't) break CPyExt. > >> > If we keep to the following rules: > >> > > >> > No global state mutation is allowed > >> > No new keywords or code modifications required > >> > No CPyExt code is allowed (for now) > >> > > >> > I believe that users can somewhat benefit from this implementation if > >> > done > >> > correctly. > >> > As for implementation, if we can trace the code running in the thread > >> > and > >> > ensure it's not mutating global state and that CPyExt is never used > >> > during > >> > the thread's course we can simply release the GIL when such a thread > is > >> > run. > >> > That requires less knowledge than using STM and less code > modifications. > >> > However I think that attempting to do so will introduce the same issue > >> > with > >> > caching traces (Armin am I correct here?). > >> > > >> > As for CPyExt, we could copy the same code modifications that > >> > PyParallels > >> > did but I suspect that it will be so slow that the benefit of running > in > >> > parallel will be completely lost for all cases but very long threads. > >> > > >> > Is what I'm suggesting even possible? How challenging will it be? > >> > > >> > Thanks, > >> > Omer Katz. > >> > > >> > _______________________________________________ > >> > pypy-dev mailing list > >> > pypy-dev at python.org > >> > https://mail.python.org/mailman/listinfo/pypy-dev > >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From omer.drow at gmail.com Mon Jun 20 13:53:09 2016 From: omer.drow at gmail.com (Omer Katz) Date: Mon, 20 Jun 2016 17:53:09 +0000 Subject: [pypy-dev] PyParallel-style threads In-Reply-To: References: Message-ID: PyParallel defines "not mutating global state" as *"avoiding mutation of Python objects that were allocated from the main thread; don't append to a main thread list or assign to a main thread dict from a parallel thread"*. The PyParallel approach provides different tradeoffs from STM. You can't parallelize desrialization of a dictionary to a Python object instance e.g. a Django model but you can run a threaded server that performs parallel I/O since in STM performing I/O turns the transaction to be inevitable. There can only be one inevitable transaction at any given point of time according to the documentation found here http://doc.pypy.org/en/latest/stm.html#transaction-transactionqueue. Also, I'm not sure how allowing to perform a single I/O operation in PyPy STM will affect gevent/eventlet or asyncio if more than one thread is involved (which is supported in both gevent and asyncio. 
I haven't used eventlet so I don't really know). The PyParallel approach offers the same semantics as CPython when it comes to gevent/asyncio/eventlet. Each thread has it's own event loop and you are allowed to switch execution in the middle since you're not changing anything from other threads. You can also report errors to Sentry using raven while handling other requests normally. Raven collects stack information which is never mutated (See https://github.com/getsentry/raven-python/blob/master/raven/utils/stacks.py#L246) and then sends it to Sentry's servers. There's no reason (that I can see at least) to block another request from being processed while collecting that information and sending the data to Sentry's servers. The usecase described by PyParallel is also valid: "...This is significant when you factor in how Python's scoping works at a language level: Python code executing in a parallel thread can freely access any non-local variables created by the "main thread". That is, it has the exact same scoping and variable name resolution rules as any other Python code. This facilitates loading large data structures from the main thread and then freely accessing them from parallel callbacks. We demonstrate this with our simple Wikipedia "instant search" server , which loads a trie with 27 million entries, each one mapping a title to a 64-bit byte offset within a 60GB XML file. We then load a sorted NumPy array of all 64-bit offsets, which allows us to extract the exact byte range a given title's content appears within the XML file, allowing a client to issue a ranged request for those bytes to get the exact content via a single call to TransmitFile. This call returns immediately, but sets up the necessary structures for the kernel to send that byte range directly to the client without further interaction from us. The working set size of the python.exe process is about 11GB when the trie and NumPy array are loaded. Thus,multiprocessing would not be feasible, as you'd have 8 separate processes of 11GB if you had 8 cores and started 8 workers, requiring 88GB just for the processes. The number of allocated objects is around 27.1 million; the datrie library can efficiently store values if they're a 32-bit integer, however, our offsets are 64-bit, so an 80-something byte PyObjectneeds to be allocated to represent each one. This is significant because it demonstrates the competitive advantage PyParallel has against other languages when dealing with large heap sizes and object counts, whilst simultaneously avoiding the need for continual GC-motivated heap traversal, a product of memory allocation pressure (which is an inevitable side-effect of high-end network load, where incoming links are saturated at line rate)." STM currently requires code modifications in order to avoid conflicts, at least when collections are involved. PyParallel doesn't allow these kinds of mutations so it makes the implementation much easier in PyPy. PyParallel also requires a specific API to be used in order to utilize their parallel threads. There is a way to eliminate code modifications in PyParallel's case. We initially run with the GIL acquired as in with any other thread and then the trace for CPyExt calls or non-thread locals mutations and if there are none we can eliminate the call to acquire the GIL. Further optimizations can be performed if only a branch of the code requires CPyExt/non-thread locals mutations. 
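(To make that discipline concrete, here is a small sketch using ordinary threads rather than PyParallel's actual API; the names and sizes are invented. The main thread builds a large mapping once, and the parallel callbacks only read it and allocate fresh objects of their own. Under today's CPython and PyPy the GIL still serializes the Python-level work, so this shows the restriction rather than the speed-up; concurrent.futures is standard library on Python 3 and available for 2.7 via the "futures" backport.)

# Sketch of the "read main-thread data, never mutate it" rule discussed
# above.  Plain threads, not PyParallel's API; names and sizes are
# invented for illustration.
from concurrent.futures import ThreadPoolExecutor

# Built once by the main thread; parallel callbacks treat it as read-only.
OFFSETS = dict(("Title %d" % i, i * 4096) for i in range(100000))

def handle_request(title):
    # Allowed: read main-thread data and allocate new, thread-local objects.
    offset = OFFSETS.get(title)
    if offset is None:
        return {"status": 404, "title": title}
    return {"status": 200, "title": title, "range": (offset, offset + 4096)}
    # Disallowed under the PyParallel rules: OFFSETS[title] = ... or
    # appending to a list that the main thread created.

if __name__ == "__main__":
    requests = ["Title 42", "Title 99", "No such page"]
    with ThreadPoolExecutor(max_workers=8) as pool:
        for response in pool.map(handle_request, requests):
            print(response)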
I don't know if it's any easier than scanning the trace for lists/sets/dictionaries and replacing them with their equivalent STM implementations which Armin has already mentioned is not trivial. In the future when STM will be production ready we can "downgrade" a thread to an STM thread when it is required instead of acquiring the GIL and blocking the execution of other threads if we want to. STM also currently makes it harder to reason on how the program behaves. Especially when you have conflicts. With my suggestion you can easily say if the GIL is released or not. ??????? ??? ??, 20 ????? 2016 ?-17:53 ??? ?Armin Rigo?? :? > Hi Omer, > > On 20 June 2016 at 08:51, Omer Katz wrote: > > As for implementation, if we can trace the code running in the thread and > > ensure it's not mutating global state and that CPyExt is never used > during > > the thread's course we can simply release the GIL when such a thread is > run. > > That's a very hand-wavy and vague description. To start with, how do > you define exactly "not mutating global state"? We are not allowed to > write to any of the objects that existed before we started the thread? > It may be possible to have such an implementation, yes. Actually, > that's probably easy: tweak the STM code to crash instead of doing > something more complicated when we write to an old object. > > I'm not sure how useful that would be---or how useful PyParallel is on > CPython. Maybe if you can point us to real usages of PyParallel it > would be a start. > > > A bient?t, > > Armin. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mdomans at gmail.com Mon Jun 20 15:56:12 2016 From: mdomans at gmail.com (=?UTF-8?B?TWljaGHFgiBEb21hxYRza2k=?=) Date: Mon, 20 Jun 2016 21:56:12 +0200 Subject: [pypy-dev] PyParallel-style threads In-Reply-To: References: Message-ID: With all due respect - wouldn't it make more sense to agree on API that in fact would use threads? The patterns used by Go and Erlang and to some degree by ObjC and Swift seem more promising than "hidden multiprocessing" I may be a bore, but if what I'm getting is just "a nice syntax with restrictions" - it's not worth working on. I'd like to see actual benefits to people that want multithreading to work. It may seem like a blasphemy but with PyPy we could agree on a new APIs 2016-06-20 19:53 GMT+02:00 Omer Katz : > PyParallel defines "not mutating global state" as *"avoiding mutation of > Python objects that were allocated from the main thread; don't append to a > main thread list or assign to a main thread dict from a parallel thread"*. > > The PyParallel approach provides different tradeoffs from STM. > You can't parallelize desrialization of a dictionary to a Python object > instance e.g. a Django model but you can run a threaded server that > performs parallel I/O since in STM performing I/O turns the transaction to > be inevitable. There can only be one inevitable transaction at any given > point of time according to the documentation found here > http://doc.pypy.org/en/latest/stm.html#transaction-transactionqueue. > Also, I'm not sure how allowing to perform a single I/O operation in PyPy > STM will affect gevent/eventlet or asyncio if more than one thread is > involved (which is supported in both gevent and asyncio. I haven't used > eventlet so I don't really know). > The PyParallel approach offers the same semantics as CPython when it comes > to gevent/asyncio/eventlet. 
Each thread has it's own event loop and you are > allowed to switch execution in the middle since you're not changing > anything from other threads. > > You can also report errors to Sentry using raven while handling other > requests normally. Raven collects stack information which is never mutated > (See > https://github.com/getsentry/raven-python/blob/master/raven/utils/stacks.py#L246) > and then sends it to Sentry's servers. There's no reason (that I can see at > least) to block another request from being processed while collecting that > information and sending the data to Sentry's servers. > > The usecase described by PyParallel is also valid: > > "...This is significant when you factor in how Python's scoping works at a > language level: Python code executing in a parallel thread can freely > access any non-local variables created by the "main thread". That is, it > has the exact same scoping and variable name resolution rules as any other > Python code. This facilitates loading large data structures from the main > thread and then freely accessing them from parallel callbacks. > > We demonstrate this with our simple Wikipedia "instant search" server > , > which loads a trie with 27 million entries, each one mapping a title to a > 64-bit byte offset within a 60GB XML file. We then load a sorted NumPy > array of all 64-bit offsets, which allows us to extract the exact byte > range a given title's content appears within the XML file, allowing a > client to issue a ranged request for those bytes to get the exact content > via a single call to TransmitFile. This call returns immediately, but > sets up the necessary structures for the kernel to send that byte range > directly to the client without further interaction from us. > > The working set size of the python.exe process is about 11GB when the > trie and NumPy array are loaded. Thus,multiprocessing would not be > feasible, as you'd have 8 separate processes of 11GB if you had 8 cores and > started 8 workers, requiring 88GB just for the processes. The number of > allocated objects is around 27.1 million; the datrie library can > efficiently store values if they're a 32-bit integer, however, our offsets > are 64-bit, so an 80-something byte PyObjectneeds to be allocated to > represent each one. > > This is significant because it demonstrates the competitive advantage > PyParallel has against other languages when dealing with large heap sizes > and object counts, whilst simultaneously avoiding the need for continual > GC-motivated heap traversal, a product of memory allocation pressure (which > is an inevitable side-effect of high-end network load, where incoming links > are saturated at line rate)." > > STM currently requires code modifications in order to avoid conflicts, at > least when collections are involved. PyParallel doesn't allow these kinds > of mutations so it makes the implementation much easier in PyPy. > PyParallel also requires a specific API to be used in order to utilize > their parallel threads. There is a way to eliminate code modifications in > PyParallel's case. > We initially run with the GIL acquired as in with any other thread and > then the trace for CPyExt calls or non-thread locals mutations and if there > are none we can eliminate the call to acquire the GIL. Further > optimizations can be performed if only a branch of the code requires > CPyExt/non-thread locals mutations. 
> I don't know if it's any easier than scanning the trace for > lists/sets/dictionaries and replacing them with their equivalent STM > implementations which Armin has already mentioned is not trivial. > In the future when STM will be production ready we can "downgrade" a > thread to an STM thread when it is required instead of acquiring the GIL > and blocking the execution of other threads if we want to. > > STM also currently makes it harder to reason on how the program behaves. > Especially when you have conflicts. > With my suggestion you can easily say if the GIL is released or not. > ??????? ??? ??, 20 ????? 2016 ?-17:53 ??? ?Armin Rigo?? ??>:? > >> Hi Omer, >> >> On 20 June 2016 at 08:51, Omer Katz wrote: >> > As for implementation, if we can trace the code running in the thread >> and >> > ensure it's not mutating global state and that CPyExt is never used >> during >> > the thread's course we can simply release the GIL when such a thread is >> run. >> >> That's a very hand-wavy and vague description. To start with, how do >> you define exactly "not mutating global state"? We are not allowed to >> write to any of the objects that existed before we started the thread? >> It may be possible to have such an implementation, yes. Actually, >> that's probably easy: tweak the STM code to crash instead of doing >> something more complicated when we write to an old object. >> >> I'm not sure how useful that would be---or how useful PyParallel is on >> CPython. Maybe if you can point us to real usages of PyParallel it >> would be a start. >> >> >> A bient?t, >> >> Armin. >> > > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > https://mail.python.org/mailman/listinfo/pypy-dev > > -- --------------------------- Micha? Doma?ski -------------- next part -------------- An HTML attachment was scrubbed... URL: From omer.drow at gmail.com Tue Jun 21 01:34:13 2016 From: omer.drow at gmail.com (Omer Katz) Date: Tue, 21 Jun 2016 05:34:13 +0000 Subject: [pypy-dev] PyParallel-style threads In-Reply-To: References: Message-ID: CSP is something you can already implement with Python. In fact, that's exactly what happens when one uses Python threads with coroutines (such as gevent). I'm not sure what your suggesting and how would they keep Python semantics. The limitations of the GIL will prevent CSP style concurrency from actually performing as well as Go. What I'm suggesting will relex the limitations of the GIL without changing semantics or requiring a battle tested STM implementation for the time being. The use cases I described will benefit from having threads working. Applications like Salt use a lot of threads. If we can run some of them in parallel without changing code, that's a huge win in my book. On Mon, Jun 20, 2016, 22:56 Micha? Doma?ski wrote: > With all due respect - wouldn't it make more sense to agree on API that in > fact would use threads? The patterns used by Go and Erlang and to some > degree by ObjC and Swift seem more promising than "hidden multiprocessing" > > I may be a bore, but if what I'm getting is just "a nice syntax with > restrictions" - it's not worth working on. I'd like to see actual benefits > to people that want multithreading to work. 
It may seem like a blasphemy > but with PyPy we could agree on a new APIs > > 2016-06-20 19:53 GMT+02:00 Omer Katz : > >> PyParallel defines "not mutating global state" as *"avoiding mutation of >> Python objects that were allocated from the main thread; don't append to a >> main thread list or assign to a main thread dict from a parallel thread"* >> . >> >> The PyParallel approach provides different tradeoffs from STM. >> You can't parallelize desrialization of a dictionary to a Python object >> instance e.g. a Django model but you can run a threaded server that >> performs parallel I/O since in STM performing I/O turns the transaction to >> be inevitable. There can only be one inevitable transaction at any given >> point of time according to the documentation found here >> http://doc.pypy.org/en/latest/stm.html#transaction-transactionqueue. >> Also, I'm not sure how allowing to perform a single I/O operation in PyPy >> STM will affect gevent/eventlet or asyncio if more than one thread is >> involved (which is supported in both gevent and asyncio. I haven't used >> eventlet so I don't really know). >> The PyParallel approach offers the same semantics as CPython when it >> comes to gevent/asyncio/eventlet. Each thread has it's own event loop and >> you are allowed to switch execution in the middle since you're not changing >> anything from other threads. >> >> You can also report errors to Sentry using raven while handling other >> requests normally. Raven collects stack information which is never mutated >> (See >> https://github.com/getsentry/raven-python/blob/master/raven/utils/stacks.py#L246) >> and then sends it to Sentry's servers. There's no reason (that I can see at >> least) to block another request from being processed while collecting that >> information and sending the data to Sentry's servers. >> >> The usecase described by PyParallel is also valid: >> >> "...This is significant when you factor in how Python's scoping works at >> a language level: Python code executing in a parallel thread can freely >> access any non-local variables created by the "main thread". That is, it >> has the exact same scoping and variable name resolution rules as any other >> Python code. This facilitates loading large data structures from the main >> thread and then freely accessing them from parallel callbacks. >> >> We demonstrate this with our simple Wikipedia "instant search" server >> , >> which loads a trie with 27 million entries, each one mapping a title to a >> 64-bit byte offset within a 60GB XML file. We then load a sorted NumPy >> array of all 64-bit offsets, which allows us to extract the exact byte >> range a given title's content appears within the XML file, allowing a >> client to issue a ranged request for those bytes to get the exact content >> via a single call to TransmitFile. This call returns immediately, but >> sets up the necessary structures for the kernel to send that byte range >> directly to the client without further interaction from us. >> >> The working set size of the python.exe process is about 11GB when the >> trie and NumPy array are loaded. Thus,multiprocessing would not be >> feasible, as you'd have 8 separate processes of 11GB if you had 8 cores and >> started 8 workers, requiring 88GB just for the processes. The number of >> allocated objects is around 27.1 million; the datrie library can >> efficiently store values if they're a 32-bit integer, however, our offsets >> are 64-bit, so an 80-something byte PyObjectneeds to be allocated to >> represent each one. 
>> >> This is significant because it demonstrates the competitive advantage >> PyParallel has against other languages when dealing with large heap sizes >> and object counts, whilst simultaneously avoiding the need for continual >> GC-motivated heap traversal, a product of memory allocation pressure (which >> is an inevitable side-effect of high-end network load, where incoming links >> are saturated at line rate)." >> >> STM currently requires code modifications in order to avoid conflicts, at >> least when collections are involved. PyParallel doesn't allow these kinds >> of mutations so it makes the implementation much easier in PyPy. >> PyParallel also requires a specific API to be used in order to utilize >> their parallel threads. There is a way to eliminate code modifications in >> PyParallel's case. >> We initially run with the GIL acquired as in with any other thread and >> then the trace for CPyExt calls or non-thread locals mutations and if there >> are none we can eliminate the call to acquire the GIL. Further >> optimizations can be performed if only a branch of the code requires >> CPyExt/non-thread locals mutations. >> I don't know if it's any easier than scanning the trace for >> lists/sets/dictionaries and replacing them with their equivalent STM >> implementations which Armin has already mentioned is not trivial. >> In the future when STM will be production ready we can "downgrade" a >> thread to an STM thread when it is required instead of acquiring the GIL >> and blocking the execution of other threads if we want to. >> >> STM also currently makes it harder to reason on how the program behaves. >> Especially when you have conflicts. >> With my suggestion you can easily say if the GIL is released or not. >> ??????? ??? ??, 20 ????? 2016 ?-17:53 ??? ?Armin Rigo?? > ??>:? >> >>> Hi Omer, >>> >>> On 20 June 2016 at 08:51, Omer Katz wrote: >>> > As for implementation, if we can trace the code running in the thread >>> and >>> > ensure it's not mutating global state and that CPyExt is never used >>> during >>> > the thread's course we can simply release the GIL when such a thread >>> is run. >>> >>> That's a very hand-wavy and vague description. To start with, how do >>> you define exactly "not mutating global state"? We are not allowed to >>> write to any of the objects that existed before we started the thread? >>> It may be possible to have such an implementation, yes. Actually, >>> that's probably easy: tweak the STM code to crash instead of doing >>> something more complicated when we write to an old object. >>> >>> I'm not sure how useful that would be---or how useful PyParallel is on >>> CPython. Maybe if you can point us to real usages of PyParallel it >>> would be a start. >>> >>> >>> A bient?t, >>> >>> Armin. >>> >> >> _______________________________________________ >> pypy-dev mailing list >> pypy-dev at python.org >> https://mail.python.org/mailman/listinfo/pypy-dev >> >> > > > -- > --------------------------- > Micha? Doma?ski > -------------- next part -------------- An HTML attachment was scrubbed... URL: From omer.drow at gmail.com Tue Jun 21 10:17:42 2016 From: omer.drow at gmail.com (Omer Katz) Date: Tue, 21 Jun 2016 17:17:42 +0300 Subject: [pypy-dev] PyParallel-style threads In-Reply-To: References: Message-ID: <681D6525-315B-43D9-8963-8FB22A52514A@gmail.com> I?m not familiar with C blocks and GCD. How would Python code look with that approach? > On 20 Jun 2016, at 6:02 PM, Micha? Doma?ski wrote: > > So I actually thought about similar approach. 
I was curious what do you think about approach to concurrency similar to
what Apple did with C blocks and GCD. That is: enable threading but instead
of the STM approach have fully explicit mutations within atomic blocks

2016-06-20 16:53 GMT+02:00 Armin Rigo :

> Hi Omer,
>
> On 20 June 2016 at 08:51, Omer Katz wrote:
> > As for implementation, if we can trace the code running in the thread and
> > ensure it's not mutating global state and that CPyExt is never used during
> > the thread's course we can simply release the GIL when such a thread is run.
>
> That's a very hand-wavy and vague description. To start with, how do
> you define exactly "not mutating global state"? We are not allowed to
> write to any of the objects that existed before we started the thread?
> It may be possible to have such an implementation, yes. Actually,
> that's probably easy: tweak the STM code to crash instead of doing
> something more complicated when we write to an old object.
>
> I'm not sure how useful that would be---or how useful PyParallel is on
> CPython. Maybe if you can point us to real usages of PyParallel it
> would be a start.
>
>
> A bientôt,
>
> Armin.
> _______________________________________________
> pypy-dev mailing list
> pypy-dev at python.org
> https://mail.python.org/mailman/listinfo/pypy-dev
>
>
> --
> ---------------------------
> Michał Domański
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From mdomans at gmail.com  Tue Jun 21 11:20:25 2016
From: mdomans at gmail.com (=?UTF-8?B?TWljaGHFgiBEb21hxYRza2k=?=)
Date: Tue, 21 Jun 2016 17:20:25 +0200
Subject: [pypy-dev] PyParallel-style threads
In-Reply-To: <681D6525-315B-43D9-8963-8FB22A52514A@gmail.com>
References: <681D6525-315B-43D9-8963-8FB22A52514A@gmail.com>
Message-ID: 

C blocks are very similar to Go function literals or ES6 arrow functions. I was thinking of either using {} or just having it through indentation with ():

So instead of:

def function():
    def inner():
        # something
    loop.call(inner)

You'd do:

def function():
    loop.call(
        lambda (arg1, arg2, **kwargs) :
            # something
    )

Also, we don't have real CSP - since we have the GIL, all implementations are really just toying with the ideas. The pain point I'm talking about here is this: we introduce a lot of ways to structure code, but they don't give the benefits present in other languages. I'm starting to think we have too much cruft in .py

2016-06-21 16:17 GMT+02:00 Omer Katz :

> I'm not familiar with C blocks and GCD.
> How would Python code look with that approach?
>
> On 20 Jun 2016, at 6:02 PM, Michał Domański wrote:
>
> So I actually thought about similar approach. I was curious what do you
> think about approach to concurrency similar to what Apple did with C blocks
> and GCD. That is: enable threading but instead of the STM approach have
> fully explicit mutations within atomic blocks
>
> 2016-06-20 16:53 GMT+02:00 Armin Rigo :
>
>> Hi Omer,
>>
>> On 20 June 2016 at 08:51, Omer Katz wrote:
>> > As for implementation, if we can trace the code running in the thread and
>> > ensure it's not mutating global state and that CPyExt is never used during
>> > the thread's course we can simply release the GIL when such a thread is run.
>>
>> That's a very hand-wavy and vague description. To start with, how do
>> you define exactly "not mutating global state"? We are not allowed to
>> write to any of the objects that existed before we started the thread?
>> It may be possible to have such an implementation, yes.
Actually, >> that's probably easy: tweak the STM code to crash instead of doing >> something more complicated when we write to an old object. >> >> I'm not sure how useful that would be---or how useful PyParallel is on >> CPython. Maybe if you can point us to real usages of PyParallel it >> would be a start. >> >> >> A bient?t, >> >> Armin. >> _______________________________________________ >> pypy-dev mailing list >> pypy-dev at python.org >> https://mail.python.org/mailman/listinfo/pypy-dev >> > > > > -- > --------------------------- > Micha? Doma?ski > > > -- --------------------------- Micha? Doma?ski -------------- next part -------------- An HTML attachment was scrubbed... URL: From deprekate at gmail.com Tue Jun 21 21:08:20 2016 From: deprekate at gmail.com (Katelyn McNair) Date: Tue, 21 Jun 2016 18:08:20 -0700 Subject: [pypy-dev] fix libpypy-c.so needing certein versions Message-ID: I have been trying to install PyPy locally for the last 3 days. I didn't have Openssl, which install fine. But then PyPy started not liking the new version (libssl.so.1.0.1), so I got the archive for 1.0.0, and installed. But then PyPy didn't like that the .so files didn't have to correct version name in them. [katelyn at anthill bin]$ ./pypy ./pypy: /home/katelyn/lib/libssl.so.1.0.0: version `OPENSSL_1.0.1' not found (required by /home/katelyn/opt/pypy2-v5.3.0-linux64/bin/libpypy-c.so) ./pypy: /home/katelyn/lib/libssl.so.1.0.0: version `OPENSSL_1.0.0' not found (required by /home/katelyn/opt/pypy2-v5.3.0-linux64/bin/libpypy-c.so) ./pypy: /home/katelyn/lib/libcrypto.so.1.0.0: version `OPENSSL_1.0.0' not found (required by /home/katelyn/opt/pypy2-v5.3.0-linux64/bin/libpypy-c.so) ./pypy: /lib64/libc.so.6: version `GLIBC_2.14' not found (required by /home/katelyn/opt/pypy2-v5.3.0-linux64/bin/libpypy-c.so) ./pypy: /lib64/libc.so.6: version `GLIBC_2.15' not found (required by /home/katelyn/opt/pypy2-v5.3.0-linux64/bin/libpypy-c.so) It took me a day to find the hack here: http://stackoverflow.com/questions/18390833/no-version-information-available Although I had to put both OPENSSL_1.0.0 and OPENSSL_1.0.1in the file. Now I have to go build my own libc.so since libpypy.so doesn't like the one I have. [katelyn at anthill openssl-1.0.0s]$ pypy /home3/katelyn/opt/pypy2-v5.3.0-linux64/bin/pypy: /lib64/libc.so.6: version `GLIBC_2.14' not found (required by /home/katelyn/opt/pypy2-v5.3.0-linux64/bin/libpypy-c.so) /home3/katelyn/opt/pypy2-v5.3.0-linux64/bin/pypy: /lib64/libc.so.6: version `GLIBC_2.15' not found (required by /home/katelyn/opt/pypy2-v5.3.0-linux64/bin/libpypy-c.so) -- -katelyn -------------- next part -------------- An HTML attachment was scrubbed... URL: From arigo at tunes.org Wed Jun 22 04:19:21 2016 From: arigo at tunes.org (Armin Rigo) Date: Wed, 22 Jun 2016 10:19:21 +0200 Subject: [pypy-dev] fix libpypy-c.so needing certein versions In-Reply-To: References: Message-ID: Hi, On 22 June 2016 at 03:08, Katelyn McNair wrote: > I have been trying to install PyPy locally for the last 3 days. I didn't > have Openssl, which install fine. But then PyPy started not liking the new > version (libssl.so.1.0.1), so I got the archive for 1.0.0, and installed. > > But then PyPy didn't like that the .so files didn't have to correct version > name in them. I'm unsure what solution you propose. This is a known issue with binary distributions on Linux. You're trying to use on Linux distribution X a binary file that was made for distribution Y. It can't work cleanly. 
http://pypy.org/download.html#linux-binaries-and-common-distributions A bient?t, Armin. From william.leslie.ttg at gmail.com Wed Jun 22 04:30:00 2016 From: william.leslie.ttg at gmail.com (William ML Leslie) Date: Wed, 22 Jun 2016 18:30:00 +1000 Subject: [pypy-dev] fix libpypy-c.so needing certein versions In-Reply-To: References: Message-ID: On 22 June 2016 at 18:19, Armin Rigo wrote: > http://pypy.org/download.html#linux-binaries-and-common-distributions The portable builds linked from that page work really well if you're on a gnu/linux (thanks, squeaky!) - try those out first. -- William Leslie Notice: Likely much of this email is, by the nature of copyright, covered under copyright law. You absolutely MAY reproduce any part of it in accordance with the copyright law of the nation you are reading this in. Any attempt to DENY YOU THOSE RIGHTS would be illegal without prior contractual agreement. From phyo.arkarlwin at gmail.com Wed Jun 22 08:21:21 2016 From: phyo.arkarlwin at gmail.com (Phyo Arkar) Date: Wed, 22 Jun 2016 12:21:21 +0000 Subject: [pypy-dev] fix libpypy-c.so needing certein versions In-Reply-To: References: Message-ID: Here is portable pypy. https://github.com/squeaky-pl/portable-pypy You can just use it by directly unzipping and running it. it comes with a virtual-env directly too. Regards, Phyo. On Wed, Jun 22, 2016 at 3:00 PM William ML Leslie < william.leslie.ttg at gmail.com> wrote: > On 22 June 2016 at 18:19, Armin Rigo wrote: > > > http://pypy.org/download.html#linux-binaries-and-common-distributions > > The portable builds linked from that page work really well if you're > on a gnu/linux (thanks, squeaky!) - try those out first. > > -- > William Leslie > > Notice: > Likely much of this email is, by the nature of copyright, covered > under copyright law. You absolutely MAY reproduce any part of it in > accordance with the copyright law of the nation you are reading this > in. Any attempt to DENY YOU THOSE RIGHTS would be illegal without > prior contractual agreement. > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > https://mail.python.org/mailman/listinfo/pypy-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: From gmludo at gmail.com Thu Jun 23 17:32:21 2016 From: gmludo at gmail.com (Ludovic Gasc) Date: Thu, 23 Jun 2016 23:32:21 +0200 Subject: [pypy-dev] Implementation of some Python 3.5 features for more PyPy 3.5 support In-Reply-To: References: Message-ID: Good luck for async/await ;-) In case of you'll go to EuroPython next month, don't hesitate to ping me. Have a nice week. -- Ludovic Gasc (GMLudo) http://www.gmludo.eu/ 2016-06-14 21:52 GMT+02:00 Raffael Tfirst : > Hi, > I am Raffael Tfirst, currently working on PyPy 3.5 for the Google Summer > of Code. > > Since it might be of interest to some, I want to give you a short summary > of what I do. > > The Python 3.5 features I implement in PyPy are the @ operator for matrix > multiplications (PEP 465), additional unpacking generalizations (PEP 448) > and coroutines with async and await syntax (PEP 492). A short description > of each feature can be found here (at New Syntax Features): > https://docs.python.org/3/whatsnew/3.5.html > It would be cool to have those features implemented in PyPy because they > will most likely be used a lot in future Python 3.5 programs, that is why I > want to close this gap and push PyPy a bit closer to Python 3.5 > functionality. > > I started working on this about a month ago. 
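
For readers who have not seen them yet, the three features look roughly like this in 3.5 syntax (illustrative snippets only; the names are placeholders):

# PEP 465: "@" matrix multiplication operator, dispatching to __matmul__
# c = a @ b          (e.g. with two numpy arrays a and b)

# PEP 448: additional unpacking generalizations
defaults = {'debug': False}
overrides = {'level': 2}
merged = {**defaults, **overrides}      # {'debug': False, 'level': 2}
combined = [0, *range(3), *(7, 8)]      # [0, 0, 1, 2, 7, 8]

# PEP 492: coroutines with async and await syntax
async def fetch(session, url):
    response = await session.get(url)   # session and url are placeholders
    return response
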
As things stand now, the @ > operator already works. The additional unpacking feature will be ready > soon, too, since a great part is already implemented. There's just some > nasty errors I need to fix before it can be tested. > In about two weeks progress will speed up a lot, because then I'm going to > have way more time for this project. > > For those who are interested, I work on the py3.5 branch. > > You can also check out my progress on my blog: > http://pypy35syntax.blogspot.co.at/ > > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > https://mail.python.org/mailman/listinfo/pypy-dev > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From wickedgrey at gmail.com Thu Jun 23 22:14:18 2016 From: wickedgrey at gmail.com (Eli Stevens (Gmail)) Date: Thu, 23 Jun 2016 19:14:18 -0700 Subject: [pypy-dev] pickle numpy array from pypy to cpython? Message-ID: I'm trying to construct some data that includes numpy arrays in pypy, pickle it, then unpickle it in cpython (to use some non-pypy-compatible libs). However, the actual class of the pickled array is _numpypy.multiarray, which cpython doesn't have. Any suggestions? Thanks, Eli From william.leslie.ttg at gmail.com Thu Jun 23 22:54:29 2016 From: william.leslie.ttg at gmail.com (William ML Leslie) Date: Fri, 24 Jun 2016 12:54:29 +1000 Subject: [pypy-dev] pickle numpy array from pypy to cpython? In-Reply-To: References: Message-ID: On 24 June 2016 at 12:14, Eli Stevens (Gmail) wrote: > I'm trying to construct some data that includes numpy arrays in pypy, > pickle it, then unpickle it in cpython (to use some > non-pypy-compatible libs). > > However, the actual class of the pickled array is _numpypy.multiarray, > which cpython doesn't have. > > Any suggestions? Have you considered the tofile method and fromfile function? http://docs.scipy.org/doc/numpy/reference/generated/numpy.ndarray.tofile.html#numpy.ndarray.tofile http://docs.scipy.org/doc/numpy/reference/generated/numpy.fromfile.html#numpy.fromfile -- William Leslie Notice: Likely much of this email is, by the nature of copyright, covered under copyright law. You absolutely MAY reproduce any part of it in accordance with the copyright law of the nation you are reading this in. Any attempt to DENY YOU THOSE RIGHTS would be illegal without prior contractual agreement. From wickedgrey at gmail.com Fri Jun 24 01:01:42 2016 From: wickedgrey at gmail.com (Eli Stevens (Gmail)) Date: Thu, 23 Jun 2016 22:01:42 -0700 Subject: [pypy-dev] pickle numpy array from pypy to cpython? In-Reply-To: References: Message-ID: No, since it's not *just* a numpy array I need to move around (dict with numpy values, in this case, more complicated objects in the future). Obviously I can kludge something manual together (assuming the tofile/fromfile functions work cross-interpreter, which I wouldn't take for granted at this point), but I'd rather be able to use pickle (easier to work with libraries that also expect pickles, etc.). Eli On Thu, Jun 23, 2016 at 7:54 PM, William ML Leslie wrote: > On 24 June 2016 at 12:14, Eli Stevens (Gmail) wrote: >> I'm trying to construct some data that includes numpy arrays in pypy, >> pickle it, then unpickle it in cpython (to use some >> non-pypy-compatible libs). >> >> However, the actual class of the pickled array is _numpypy.multiarray, >> which cpython doesn't have. >> >> Any suggestions? > > Have you considered the tofile method and fromfile function? 
> > http://docs.scipy.org/doc/numpy/reference/generated/numpy.ndarray.tofile.html#numpy.ndarray.tofile > > http://docs.scipy.org/doc/numpy/reference/generated/numpy.fromfile.html#numpy.fromfile > > -- > William Leslie > > Notice: > Likely much of this email is, by the nature of copyright, covered > under copyright law. You absolutely MAY reproduce any part of it in > accordance with the copyright law of the nation you are reading this > in. Any attempt to DENY YOU THOSE RIGHTS would be illegal without > prior contractual agreement. From david.brochart at gmail.com Fri Jun 24 01:14:58 2016 From: david.brochart at gmail.com (David Brochart) Date: Fri, 24 Jun 2016 07:14:58 +0200 Subject: [pypy-dev] pickle numpy array from pypy to cpython? In-Reply-To: References: Message-ID: Last time I tried tofile and fromfile in Numpypy it was not implemented. On Fri, Jun 24, 2016 at 7:01 AM, Eli Stevens (Gmail) wrote: > No, since it's not *just* a numpy array I need to move around (dict > with numpy values, in this case, more complicated objects in the > future). Obviously I can kludge something manual together (assuming > the tofile/fromfile functions work cross-interpreter, which I wouldn't > take for granted at this point), but I'd rather be able to use pickle > (easier to work with libraries that also expect pickles, etc.). > > Eli > > On Thu, Jun 23, 2016 at 7:54 PM, William ML Leslie > wrote: > > On 24 June 2016 at 12:14, Eli Stevens (Gmail) > wrote: > >> I'm trying to construct some data that includes numpy arrays in pypy, > >> pickle it, then unpickle it in cpython (to use some > >> non-pypy-compatible libs). > >> > >> However, the actual class of the pickled array is _numpypy.multiarray, > >> which cpython doesn't have. > >> > >> Any suggestions? > > > > Have you considered the tofile method and fromfile function? > > > > > http://docs.scipy.org/doc/numpy/reference/generated/numpy.ndarray.tofile.html#numpy.ndarray.tofile > > > > > http://docs.scipy.org/doc/numpy/reference/generated/numpy.fromfile.html#numpy.fromfile > > > > -- > > William Leslie > > > > Notice: > > Likely much of this email is, by the nature of copyright, covered > > under copyright law. You absolutely MAY reproduce any part of it in > > accordance with the copyright law of the nation you are reading this > > in. Any attempt to DENY YOU THOSE RIGHTS would be illegal without > > prior contractual agreement. > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > https://mail.python.org/mailman/listinfo/pypy-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: From wickedgrey at gmail.com Fri Jun 24 15:43:12 2016 From: wickedgrey at gmail.com (Eli Stevens (Gmail)) Date: Fri, 24 Jun 2016 12:43:12 -0700 Subject: [pypy-dev] pickle numpy array from pypy to cpython? In-Reply-To: References: Message-ID: Yeah, looks like that's still the case: >>>> z = np.zeros((2,3), dtype=np.float32) >>>> z.tofile Traceback (most recent call last): File "", line 1, in AttributeError: 'numpy.ndarray' object has no attribute 'tofile' What would it take to get cross-interpreter numpy array pickles working? Thanks, Eli On Thu, Jun 23, 2016 at 10:14 PM, David Brochart wrote: > Last time I tried tofile and fromfile in Numpypy it was not implemented. > > On Fri, Jun 24, 2016 at 7:01 AM, Eli Stevens (Gmail) > wrote: >> >> No, since it's not *just* a numpy array I need to move around (dict >> with numpy values, in this case, more complicated objects in the >> future). 
Obviously I can kludge something manual together (assuming >> the tofile/fromfile functions work cross-interpreter, which I wouldn't >> take for granted at this point), but I'd rather be able to use pickle >> (easier to work with libraries that also expect pickles, etc.). >> >> Eli >> >> On Thu, Jun 23, 2016 at 7:54 PM, William ML Leslie >> wrote: >> > On 24 June 2016 at 12:14, Eli Stevens (Gmail) >> > wrote: >> >> I'm trying to construct some data that includes numpy arrays in pypy, >> >> pickle it, then unpickle it in cpython (to use some >> >> non-pypy-compatible libs). >> >> >> >> However, the actual class of the pickled array is _numpypy.multiarray, >> >> which cpython doesn't have. >> >> >> >> Any suggestions? >> > >> > Have you considered the tofile method and fromfile function? >> > >> > >> > http://docs.scipy.org/doc/numpy/reference/generated/numpy.ndarray.tofile.html#numpy.ndarray.tofile >> > >> > >> > http://docs.scipy.org/doc/numpy/reference/generated/numpy.fromfile.html#numpy.fromfile >> > >> > -- >> > William Leslie >> > >> > Notice: >> > Likely much of this email is, by the nature of copyright, covered >> > under copyright law. You absolutely MAY reproduce any part of it in >> > accordance with the copyright law of the nation you are reading this >> > in. Any attempt to DENY YOU THOSE RIGHTS would be illegal without >> > prior contractual agreement. >> _______________________________________________ >> pypy-dev mailing list >> pypy-dev at python.org >> https://mail.python.org/mailman/listinfo/pypy-dev > > From matti.picus at gmail.com Fri Jun 24 16:21:26 2016 From: matti.picus at gmail.com (matti picus) Date: Fri, 24 Jun 2016 23:21:26 +0300 Subject: [pypy-dev] pickle numpy array from pypy to cpython? In-Reply-To: References: Message-ID: The first step would be to pickle the same dtype/shape/data ndarray once from numpy and again from _numpypy, and to compare the binary result. The only difference should be the class name, if the difference goes deeper that difference must be fixed. Then it it just a matter of patching pickle.py to use the desired class instead of the class name encoded into the pickled binary result. Matti > On 24 Jun 2016, at 10:43 PM, Eli Stevens (Gmail) wrote: > > Yeah, looks like that's still the case: > >>>>> z = np.zeros((2,3), dtype=np.float32) >>>>> z.tofile > Traceback (most recent call last): > File "", line 1, in > AttributeError: 'numpy.ndarray' object has no attribute 'tofile' > > What would it take to get cross-interpreter numpy array pickles working? > > Thanks, > Eli > > From wickedgrey at gmail.com Fri Jun 24 17:29:57 2016 From: wickedgrey at gmail.com (Eli Stevens (Gmail)) Date: Fri, 24 Jun 2016 14:29:57 -0700 Subject: [pypy-dev] pickle numpy array from pypy to cpython? In-Reply-To: References: Message-ID: Doesn't look like they are exactly the same: https://gist.github.com/elistevens/03e22f4684fb77d3edfe13ffcd406ef4 Certainly some similarities, though. I'm not familiar with the pickle format, and I haven't yet had time to dig in beyond this, though. Hoping I can tonight. Cheers, Eli On Fri, Jun 24, 2016 at 1:21 PM, matti picus wrote: > The first step would be to pickle the same dtype/shape/data ndarray once from numpy and again from _numpypy, and to compare the binary result. The only difference should be the class name, if the difference goes deeper that difference must be fixed. Then it it just a matter of patching pickle.py to use the desired class instead of the class name encoded into the pickled binary result. 
> Matti > >> On 24 Jun 2016, at 10:43 PM, Eli Stevens (Gmail) wrote: >> >> Yeah, looks like that's still the case: >> >>>>>> z = np.zeros((2,3), dtype=np.float32) >>>>>> z.tofile >> Traceback (most recent call last): >> File "", line 1, in >> AttributeError: 'numpy.ndarray' object has no attribute 'tofile' >> >> What would it take to get cross-interpreter numpy array pickles working? >> >> Thanks, >> Eli >> >> From omer.drow at gmail.com Fri Jun 24 18:45:53 2016 From: omer.drow at gmail.com (Omer Katz) Date: Fri, 24 Jun 2016 22:45:53 +0000 Subject: [pypy-dev] Implementation of some Python 3.5 features for more PyPy 3.5 support In-Reply-To: References: Message-ID: Great to hear that! Are you going to implement the coroutines, await & async functionality on our existing greenlets implementation? ??????? ??? ??, 24 ????? 2016 ?-0:33 ??? ?Ludovic Gasc?? :? > Good luck for async/await ;-) > > In case of you'll go to EuroPython next month, don't hesitate to ping me. > > Have a nice week. > > -- > Ludovic Gasc (GMLudo) > http://www.gmludo.eu/ > > 2016-06-14 21:52 GMT+02:00 Raffael Tfirst : > >> Hi, >> I am Raffael Tfirst, currently working on PyPy 3.5 for the Google Summer >> of Code. >> >> Since it might be of interest to some, I want to give you a short summary >> of what I do. >> >> The Python 3.5 features I implement in PyPy are the @ operator for matrix >> multiplications (PEP 465), additional unpacking generalizations (PEP 448) >> and coroutines with async and await syntax (PEP 492). A short description >> of each feature can be found here (at New Syntax Features): >> https://docs.python.org/3/whatsnew/3.5.html >> It would be cool to have those features implemented in PyPy because they >> will most likely be used a lot in future Python 3.5 programs, that is why I >> want to close this gap and push PyPy a bit closer to Python 3.5 >> functionality. >> >> I started working on this about a month ago. As things stand now, the @ >> operator already works. The additional unpacking feature will be ready >> soon, too, since a great part is already implemented. There's just some >> nasty errors I need to fix before it can be tested. >> In about two weeks progress will speed up a lot, because then I'm going >> to have way more time for this project. >> >> For those who are interested, I work on the py3.5 branch. >> >> You can also check out my progress on my blog: >> http://pypy35syntax.blogspot.co.at/ >> >> _______________________________________________ >> pypy-dev mailing list >> pypy-dev at python.org >> https://mail.python.org/mailman/listinfo/pypy-dev >> >> > _______________________________________________ > pypy-dev mailing list > pypy-dev at python.org > https://mail.python.org/mailman/listinfo/pypy-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: From wickedgrey at gmail.com Fri Jun 24 20:46:01 2016 From: wickedgrey at gmail.com (Eli Stevens (Gmail)) Date: Fri, 24 Jun 2016 17:46:01 -0700 Subject: [pypy-dev] pickle numpy array from pypy to cpython? In-Reply-To: References: Message-ID: Okay, if I pass the pickles through pickletools.optimize, they look identical except for the very first line (and a resulting systematic shift in offset): >>> pt.dis(pt.optimize(open('cp123.pkl').read())) 0: c GLOBAL 'numpy.core.multiarray _reconstruct' >>> pt.dis(pt.optimize(open('pp123.pkl').read())) 0: c GLOBAL '_numpypy.multiarray _reconstruct' So I suspect that simply lying about what class we just pickled would do the trick. 
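
For the load side, a minimal sketch of that kind of renaming (illustrative only, not the eventual fix): subclass pickle.Unpickler and remap the module name when loading the PyPy-written pickle under CPython.

import pickle

class RenamingUnpickler(pickle.Unpickler):
    # Map the module name PyPy records to the one CPython has.
    renames = {'_numpypy.multiarray': 'numpy.core.multiarray'}

    def find_class(self, module, name):
        module = self.renames.get(module, module)
        return pickle.Unpickler.find_class(self, module, name)

with open('pp123.pkl', 'rb') as f:
    obj = RenamingUnpickler(f).load()
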
I have no idea how acceptable that would be as a general solution, though. Thoughts? Eli On Fri, Jun 24, 2016 at 2:29 PM, Eli Stevens (Gmail) wrote: > Doesn't look like they are exactly the same: > > https://gist.github.com/elistevens/03e22f4684fb77d3edfe13ffcd406ef4 > > Certainly some similarities, though. > > I'm not familiar with the pickle format, and I haven't yet had time to > dig in beyond this, though. Hoping I can tonight. > > Cheers, > Eli > > > On Fri, Jun 24, 2016 at 1:21 PM, matti picus wrote: >> The first step would be to pickle the same dtype/shape/data ndarray once from numpy and again from _numpypy, and to compare the binary result. The only difference should be the class name, if the difference goes deeper that difference must be fixed. Then it it just a matter of patching pickle.py to use the desired class instead of the class name encoded into the pickled binary result. >> Matti >> >>> On 24 Jun 2016, at 10:43 PM, Eli Stevens (Gmail) wrote: >>> >>> Yeah, looks like that's still the case: >>> >>>>>>> z = np.zeros((2,3), dtype=np.float32) >>>>>>> z.tofile >>> Traceback (most recent call last): >>> File "", line 1, in >>> AttributeError: 'numpy.ndarray' object has no attribute 'tofile' >>> >>> What would it take to get cross-interpreter numpy array pickles working? >>> >>> Thanks, >>> Eli >>> >>> From wickedgrey at gmail.com Fri Jun 24 23:21:41 2016 From: wickedgrey at gmail.com (Eli Stevens (Gmail)) Date: Fri, 24 Jun 2016 20:21:41 -0700 Subject: [pypy-dev] pickle numpy array from pypy to cpython? In-Reply-To: References: Message-ID: Heh, interestingly, if I add the following to the local dir and files when trying to unpickle under cpython, it works (note that cpython to pypy actually works out of the box, which I hadn't realized): $ cat _numpypy/__init__.py from numpy.core import * $ cat _numpypy/multiarray.py from numpy.core.multiarray import * import numpy.core.multiarray as _ncm _reconstruct = _ncm._reconstruct This is obviously a total hack, and not one I'm comfortable with (since I need to use this codebase from both cpython and pypy), but it demonstrates that it's just bookkeeping that needs to change to get things to work. My first approach would be to add a wrapper around save_global here https://bitbucket.org/pypy/pypy/src/a0105e0d00dbd0f73d06fc704db704868a6c6ed2/lib-python/2.7/pickle.py?at=default&fileviewer=file-view-default#pickle.py-814 that special-cases the global '_numpypy.multiarray' to instead be 'numpy.core.multiarray'. That seem like a reasonable thing to do? Cheers, Eli On Fri, Jun 24, 2016 at 5:46 PM, Eli Stevens (Gmail) wrote: > Okay, if I pass the pickles through pickletools.optimize, they look > identical except for the very first line (and a resulting systematic > shift in offset): > >>>> pt.dis(pt.optimize(open('cp123.pkl').read())) > 0: c GLOBAL 'numpy.core.multiarray _reconstruct' > >>>> pt.dis(pt.optimize(open('pp123.pkl').read())) > 0: c GLOBAL '_numpypy.multiarray _reconstruct' > > So I suspect that simply lying about what class we just pickled would > do the trick. > > I have no idea how acceptable that would be as a general solution, > though. Thoughts? > > Eli > > On Fri, Jun 24, 2016 at 2:29 PM, Eli Stevens (Gmail) > wrote: >> Doesn't look like they are exactly the same: >> >> https://gist.github.com/elistevens/03e22f4684fb77d3edfe13ffcd406ef4 >> >> Certainly some similarities, though. >> >> I'm not familiar with the pickle format, and I haven't yet had time to >> dig in beyond this, though. Hoping I can tonight. 
>> >> Cheers, >> Eli >> >> >> On Fri, Jun 24, 2016 at 1:21 PM, matti picus wrote: >>> The first step would be to pickle the same dtype/shape/data ndarray once from numpy and again from _numpypy, and to compare the binary result. The only difference should be the class name, if the difference goes deeper that difference must be fixed. Then it it just a matter of patching pickle.py to use the desired class instead of the class name encoded into the pickled binary result. >>> Matti >>> >>>> On 24 Jun 2016, at 10:43 PM, Eli Stevens (Gmail) wrote: >>>> >>>> Yeah, looks like that's still the case: >>>> >>>>>>>> z = np.zeros((2,3), dtype=np.float32) >>>>>>>> z.tofile >>>> Traceback (most recent call last): >>>> File "", line 1, in >>>> AttributeError: 'numpy.ndarray' object has no attribute 'tofile' >>>> >>>> What would it take to get cross-interpreter numpy array pickles working? >>>> >>>> Thanks, >>>> Eli >>>> >>>> From matti.picus at gmail.com Sat Jun 25 00:19:43 2016 From: matti.picus at gmail.com (matti picus) Date: Sat, 25 Jun 2016 07:19:43 +0300 Subject: [pypy-dev] pickle numpy array from pypy to cpython? In-Reply-To: References: Message-ID: Sounds reasonable. You might want to generalize it a bit by trying to import _numpypy /numpy, and setting up the replacement by whichever fails to import. Matti On Saturday, 25 June 2016, Eli Stevens (Gmail) wrote: > Heh, interestingly, if I add the following to the local dir and files > when trying to unpickle under cpython, it works (note that cpython to > pypy actually works out of the box, which I hadn't realized): > > $ cat _numpypy/__init__.py > from numpy.core import * > > $ cat _numpypy/multiarray.py > from numpy.core.multiarray import * > import numpy.core.multiarray as _ncm > _reconstruct = _ncm._reconstruct > > This is obviously a total hack, and not one I'm comfortable with > (since I need to use this codebase from both cpython and pypy), but it > demonstrates that it's just bookkeeping that needs to change to get > things to work. > > My first approach would be to add a wrapper around save_global here > > https://bitbucket.org/pypy/pypy/src/a0105e0d00dbd0f73d06fc704db704868a6c6ed2/lib-python/2.7/pickle.py?at=default&fileviewer=file-view-default#pickle.py-814 > that special-cases the global '_numpypy.multiarray' to instead be > 'numpy.core.multiarray'. That seem like a reasonable thing to do? > > Cheers, > Eli > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Sat Jun 25 00:56:33 2016 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 24 Jun 2016 21:56:33 -0700 Subject: [pypy-dev] Any core devs want to come to the python compilers workshop next month? Message-ID: Hi all, This is super short notice but... you may be aware that I'm organizing a workshop next month (July 11-12) in Austin, for some of the different Python compiler projects to get together and compare notes on how best to achieve world domination... or at least make the Python ecosystem more friendly to non-CPython implementations: https://python-compilers-workshop.github.io/ Originally Maciej and Matti were going to represent PyPy, but unfortunately Maciej had to cancel. Matti will still be there, so that's awesome, but I find myself with a chunk of unused sponsorship $$, and given how PyPy is currently by far the most successful of these projects, I feel like it'd be even more awesome if we had even more representation from the PyPy core devs :-). 
So if any of you are interested in a free last minute trip to Austin + a chance to maybe help influence future plans and funding in a more PyPy-friendly direction, please get in touch ASAP... -n -- Nathaniel J. Smith -- https://vorpus.org From wickedgrey at gmail.com Sat Jun 25 02:01:14 2016 From: wickedgrey at gmail.com (Eli Stevens (Gmail)) Date: Fri, 24 Jun 2016 23:01:14 -0700 Subject: [pypy-dev] pickle numpy array from pypy to cpython? In-Reply-To: References: Message-ID: I was thinking about doing it on import of the micronumpy module (pypy/module/micronumpy/app_numpy.py). Right now, when I try and import pickle during the tests: $ cat pypy/module/micronumpy/test/test_pickling_app.py import sys import py from pypy.module.micronumpy.test.test_base import BaseNumpyAppTest from pypy.conftest import option class AppTestPicklingNumpy(BaseNumpyAppTest): def setup_class(cls): if option.runappdirect and '__pypy__' not in sys.builtin_module_names: py.test.skip("pypy only test") BaseNumpyAppTest.setup_class.im_func(cls) def test_pickle_module(self): import pickle ... # more code I get this error: > import struct lib-python/2.7/pickle.py:34: _ _ _ _ _ _ > from _struct import * E (application-level) ImportError: No module named _struct lib-python/2.7/struct.py:1: ImportError But everything seems fine with struct: $ ./pytest.py pypy/module/struct/test/test_struct.py ==== test session starts ==== platform linux2 -- Python 2.7.11 -- py-1.4.20 -- pytest-2.5.2 pytest-2.5.2 from /home/elis/edit/play/pypy/pytest.pyc collected 30 items pypy/module/struct/test/test_struct.py .............................. ==== 30 passed in 11.95 seconds ==== Any idea what's going on here? Thanks, Eli On Fri, Jun 24, 2016 at 9:19 PM, matti picus wrote: > Sounds reasonable. You might want to generalize it a bit by trying to import > _numpypy /numpy, and setting up the replacement by whichever fails to > import. > Matti > > On Saturday, 25 June 2016, Eli Stevens (Gmail) wrote: >> >> Heh, interestingly, if I add the following to the local dir and files >> when trying to unpickle under cpython, it works (note that cpython to >> pypy actually works out of the box, which I hadn't realized): >> >> $ cat _numpypy/__init__.py >> from numpy.core import * >> >> $ cat _numpypy/multiarray.py >> from numpy.core.multiarray import * >> import numpy.core.multiarray as _ncm >> _reconstruct = _ncm._reconstruct >> >> This is obviously a total hack, and not one I'm comfortable with >> (since I need to use this codebase from both cpython and pypy), but it >> demonstrates that it's just bookkeeping that needs to change to get >> things to work. >> >> My first approach would be to add a wrapper around save_global here >> >> https://bitbucket.org/pypy/pypy/src/a0105e0d00dbd0f73d06fc704db704868a6c6ed2/lib-python/2.7/pickle.py?at=default&fileviewer=file-view-default#pickle.py-814 >> that special-cases the global '_numpypy.multiarray' to instead be >> 'numpy.core.multiarray'. That seem like a reasonable thing to do? >> >> Cheers, >> Eli >> > From matti.picus at gmail.com Sat Jun 25 13:26:17 2016 From: matti.picus at gmail.com (Matti Picus) Date: Sat, 25 Jun 2016 20:26:17 +0300 Subject: [pypy-dev] pickle numpy array from pypy to cpython? 
In-Reply-To: References: Message-ID: <576EBEB9.9070804@gmail.com> You need to add the modules to those that the class-local space is built with using a spaceconfig, so something like class AppTestPicklingNumpy(BaseNumpyAppTest): spaceconfig = dict(usemodules=["micronumpy", "struct", "binascii"]) def setup_class(cls): if option.runappdirect and '__pypy__' not in sys.builtin_module_names: py.test.skip("pypy only test") BaseNumpyAppTest.setup_class.im_func(cls) def test_pickle_module(self): import pickle On 25/06/16 09:01, Eli Stevens (Gmail) wrote: > I was thinking about doing it on import of the micronumpy module > (pypy/module/micronumpy/app_numpy.py). > > Right now, when I try and import pickle during the tests: > > $ cat pypy/module/micronumpy/test/test_pickling_app.py > import sys > import py > > from pypy.module.micronumpy.test.test_base import BaseNumpyAppTest > from pypy.conftest import option > > class AppTestPicklingNumpy(BaseNumpyAppTest): > def setup_class(cls): > if option.runappdirect and '__pypy__' not in sys.builtin_module_names: > py.test.skip("pypy only test") > BaseNumpyAppTest.setup_class.im_func(cls) > > def test_pickle_module(self): > import pickle > ... # more code > > > I get this error: > >> import struct > lib-python/2.7/pickle.py:34: > _ _ _ _ _ _ > >> from _struct import * > E (application-level) ImportError: No module named _struct > > lib-python/2.7/struct.py:1: ImportError > > > But everything seems fine with struct: > > $ ./pytest.py pypy/module/struct/test/test_struct.py > ==== test session starts ==== > platform linux2 -- Python 2.7.11 -- py-1.4.20 -- pytest-2.5.2 > pytest-2.5.2 from /home/elis/edit/play/pypy/pytest.pyc > collected 30 items > > pypy/module/struct/test/test_struct.py .............................. > > ==== 30 passed in 11.95 seconds ==== > > Any idea what's going on here? > > Thanks, > Eli > > > On Fri, Jun 24, 2016 at 9:19 PM, matti picus wrote: >> Sounds reasonable. You might want to generalize it a bit by trying to import >> _numpypy /numpy, and setting up the replacement by whichever fails to >> import. >> Matti >> >> On Saturday, 25 June 2016, Eli Stevens (Gmail) wrote: >>> Heh, interestingly, if I add the following to the local dir and files >>> when trying to unpickle under cpython, it works (note that cpython to >>> pypy actually works out of the box, which I hadn't realized): >>> >>> $ cat _numpypy/__init__.py >>> from numpy.core import * >>> >>> $ cat _numpypy/multiarray.py >>> from numpy.core.multiarray import * >>> import numpy.core.multiarray as _ncm >>> _reconstruct = _ncm._reconstruct >>> >>> This is obviously a total hack, and not one I'm comfortable with >>> (since I need to use this codebase from both cpython and pypy), but it >>> demonstrates that it's just bookkeeping that needs to change to get >>> things to work. >>> >>> My first approach would be to add a wrapper around save_global here >>> >>> https://bitbucket.org/pypy/pypy/src/a0105e0d00dbd0f73d06fc704db704868a6c6ed2/lib-python/2.7/pickle.py?at=default&fileviewer=file-view-default#pickle.py-814 >>> that special-cases the global '_numpypy.multiarray' to instead be >>> 'numpy.core.multiarray'. That seem like a reasonable thing to do? >>> >>> Cheers, >>> Eli >>> From wickedgrey at gmail.com Sat Jun 25 23:19:13 2016 From: wickedgrey at gmail.com (Eli Stevens (Gmail)) Date: Sat, 25 Jun 2016 20:19:13 -0700 Subject: [pypy-dev] pickle numpy array from pypy to cpython? 
In-Reply-To: <576EBEB9.9070804@gmail.com> References: <576EBEB9.9070804@gmail.com> Message-ID: That did the trick. Pull request here: https://bitbucket.org/pypy/pypy/pull-requests/460/changes-reported-location-of-_reconstruct/diff Please let me know if there are changes that should be made. As noted, I'm not super happy with the tests, but am unsure what direction I should go with them. Cheers, Eli On Sat, Jun 25, 2016 at 10:26 AM, Matti Picus wrote: > You need to add the modules to those that the class-local space is built > with using a spaceconfig, so something like > > class AppTestPicklingNumpy(BaseNumpyAppTest): > spaceconfig = dict(usemodules=["micronumpy", "struct", "binascii"]) > > def setup_class(cls): > if option.runappdirect and '__pypy__' not in > sys.builtin_module_names: > py.test.skip("pypy only test") > BaseNumpyAppTest.setup_class.im_func(cls) > > > def test_pickle_module(self): > import pickle > > > > > On 25/06/16 09:01, Eli Stevens (Gmail) wrote: >> >> I was thinking about doing it on import of the micronumpy module >> (pypy/module/micronumpy/app_numpy.py). >> >> Right now, when I try and import pickle during the tests: >> >> $ cat pypy/module/micronumpy/test/test_pickling_app.py >> import sys >> import py >> >> from pypy.module.micronumpy.test.test_base import BaseNumpyAppTest >> from pypy.conftest import option >> >> class AppTestPicklingNumpy(BaseNumpyAppTest): >> def setup_class(cls): >> if option.runappdirect and '__pypy__' not in >> sys.builtin_module_names: >> py.test.skip("pypy only test") >> BaseNumpyAppTest.setup_class.im_func(cls) >> >> def test_pickle_module(self): >> import pickle >> ... # more code >> >> >> I get this error: >> >>> import struct >> >> lib-python/2.7/pickle.py:34: >> _ _ _ _ _ _ >> >>> from _struct import * >> >> E (application-level) ImportError: No module named _struct >> >> lib-python/2.7/struct.py:1: ImportError >> >> >> But everything seems fine with struct: >> >> $ ./pytest.py pypy/module/struct/test/test_struct.py >> ==== test session starts ==== >> platform linux2 -- Python 2.7.11 -- py-1.4.20 -- pytest-2.5.2 >> pytest-2.5.2 from /home/elis/edit/play/pypy/pytest.pyc >> collected 30 items >> >> pypy/module/struct/test/test_struct.py .............................. >> >> ==== 30 passed in 11.95 seconds ==== >> >> Any idea what's going on here? >> >> Thanks, >> Eli >> >> >> On Fri, Jun 24, 2016 at 9:19 PM, matti picus >> wrote: >>> >>> Sounds reasonable. You might want to generalize it a bit by trying to >>> import >>> _numpypy /numpy, and setting up the replacement by whichever fails to >>> import. >>> Matti >>> >>> On Saturday, 25 June 2016, Eli Stevens (Gmail) >>> wrote: >>>> >>>> Heh, interestingly, if I add the following to the local dir and files >>>> when trying to unpickle under cpython, it works (note that cpython to >>>> pypy actually works out of the box, which I hadn't realized): >>>> >>>> $ cat _numpypy/__init__.py >>>> from numpy.core import * >>>> >>>> $ cat _numpypy/multiarray.py >>>> from numpy.core.multiarray import * >>>> import numpy.core.multiarray as _ncm >>>> _reconstruct = _ncm._reconstruct >>>> >>>> This is obviously a total hack, and not one I'm comfortable with >>>> (since I need to use this codebase from both cpython and pypy), but it >>>> demonstrates that it's just bookkeeping that needs to change to get >>>> things to work. 
>>>> >>>> My first approach would be to add a wrapper around save_global here >>>> >>>> >>>> https://bitbucket.org/pypy/pypy/src/a0105e0d00dbd0f73d06fc704db704868a6c6ed2/lib-python/2.7/pickle.py?at=default&fileviewer=file-view-default#pickle.py-814 >>>> that special-cases the global '_numpypy.multiarray' to instead be >>>> 'numpy.core.multiarray'. That seem like a reasonable thing to do? >>>> >>>> Cheers, >>>> Eli >>>> > From planrichi at gmail.com Sun Jun 26 02:33:57 2016 From: planrichi at gmail.com (Richard Plangger) Date: Sun, 26 Jun 2016 08:33:57 +0200 Subject: [pypy-dev] Micro NumPy & VecOpt 2.0 Message-ID: <0836502c-fb7b-4ada-8108-daef65eddb01@gmail.com> Hi, in case you have not heard: I'm currently working on the PPC and S390X port for micro numpy. Thanks to IBM for funding this work. I'm ~50% through the ppc operations to implement. The goal is to turn this optimization on (by default) in the micro numpy module. I recently had the idea to enhance the jit driver by giving it more information about parallel execution. I'm *not* talking about the main interp. loop. Having a vectorized loop that executes parallel in threads would certainly push micronumpy performance. Has somebody already tried something similar? I think it is a challenge, but it should be possible (with a reasonable amount of work) to get a simple thread fork/join model such as OpenMP provides. Cheers, Richard -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: OpenPGP digital signature URL: From planrichi at gmail.com Sun Jun 26 06:36:53 2016 From: planrichi at gmail.com (Richard Plangger) Date: Sun, 26 Jun 2016 12:36:53 +0200 Subject: [pypy-dev] Any core devs want to come to the python compilers workshop next month? In-Reply-To: References: Message-ID: <9dee5ba1-45b4-f6d0-42ad-ba38c824845d@gmail.com> Hi, > ... So if any of you are > interested in a free last minute trip to Austin + a chance to maybe > help influence future plans and funding in a more PyPy-friendly > direction, please get in touch ASAP... I'm interested and would find some time to join the workshop. Cheers, Richard -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: OpenPGP digital signature URL: From njs at pobox.com Sun Jun 26 23:35:49 2016 From: njs at pobox.com (Nathaniel Smith) Date: Sun, 26 Jun 2016 20:35:49 -0700 Subject: [pypy-dev] Any core devs want to come to the python compilers workshop next month? In-Reply-To: <9dee5ba1-45b4-f6d0-42ad-ba38c824845d@gmail.com> References: <9dee5ba1-45b4-f6d0-42ad-ba38c824845d@gmail.com> Message-ID: On Sun, Jun 26, 2016 at 3:36 AM, Richard Plangger wrote: > Hi, > >> ... So if any of you are >> interested in a free last minute trip to Austin + a chance to maybe >> help influence future plans and funding in a more PyPy-friendly >> direction, please get in touch ASAP... > > I'm interested and would find some time to join the workshop. Excellent, I'll follow-up off-list with details. -n -- Nathaniel J. Smith -- https://vorpus.org From John.Zhang at anu.edu.au Sun Jun 26 22:31:51 2016 From: John.Zhang at anu.edu.au (John Zhang) Date: Mon, 27 Jun 2016 02:31:51 +0000 Subject: [pypy-dev] Recursive struct definition in rffi? Message-ID: Hi guys, I?m using RFFI to reflect a C API interface that my colleague has developed. 
During this process I encountered the problem of not knowing how to define a recursive struct definition. e.g. How do I define a struct like the following: typedef struct A A; struct A { ? void (*fn) (A*); } Help appreciated. Regards, John Zhang ------------------------------------------------------ John Zhang Research Assistant Programming Languages, Design & Implementation Division Computer Systems Group ANU College of Engineering & Computer Science 108 North Rd The Australian National University Acton ACT 2601 john.zhang at anu.edu.au -------------- next part -------------- An HTML attachment was scrubbed... URL: From arigo at tunes.org Mon Jun 27 02:51:36 2016 From: arigo at tunes.org (Armin Rigo) Date: Mon, 27 Jun 2016 08:51:36 +0200 Subject: [pypy-dev] Recursive struct definition in rffi? In-Reply-To: References: Message-ID: Hi John, On 27 June 2016 at 04:31, John Zhang wrote: > I?m using RFFI to reflect a C API interface that my colleague has developed. > During this process I encountered the problem of not knowing how to define a > recursive struct definition. > e.g. How do I define a struct like the following: > typedef struct A A; > struct A { > ? > void (*fn) (A*); > } See lltype.ForwardReference(), e.g. in translator/c/test/test_genc.py . A bient?t, Armin. From John.Zhang at anu.edu.au Tue Jun 28 03:23:32 2016 From: John.Zhang at anu.edu.au (John Zhang) Date: Tue, 28 Jun 2016 07:23:32 +0000 Subject: [pypy-dev] Passing an array to C in RFFI Message-ID: <1CE09768-C86B-43A0-954A-F7F8219F05C9@anu.edu.au> Hi all, Thanks for the help previously. I have another question regarding passing an array value to C function using RFFI. Suppose I have a function in C that?s like this: void fn(void** array_of_voips, int length) How do I in RPython call the function using RFFI passing an array value for it? it seems that I can?t just do fn([a, b, c], 3) Help appreciated. John Zhang ------------------------------------------------------ John Zhang Research Assistant Programming Languages, Design & Implementation Division Computer Systems Group ANU College of Engineering & Computer Science 108 North Rd The Australian National University Acton ACT 2601 john.zhang at anu.edu.au -------------- next part -------------- An HTML attachment was scrubbed... URL: From wickedgrey at gmail.com Wed Jun 29 12:59:19 2016 From: wickedgrey at gmail.com (Eli Stevens (Gmail)) Date: Wed, 29 Jun 2016 09:59:19 -0700 Subject: [pypy-dev] pickle numpy array from pypy to cpython? In-Reply-To: References: <576EBEB9.9070804@gmail.com> Message-ID: Any thoughts on if this approach is acceptable? Happy to incorporate feedback. I wouldn't be surprised if there are more functions than just _reconstruct that will need to be special cased, but without a concrete use case I wasn't going to complicate things. Thanks, Eli On Sat, Jun 25, 2016 at 8:19 PM, Eli Stevens (Gmail) wrote: > That did the trick. > > Pull request here: > https://bitbucket.org/pypy/pypy/pull-requests/460/changes-reported-location-of-_reconstruct/diff > > Please let me know if there are changes that should be made. As noted, > I'm not super happy with the tests, but am unsure what direction I > should go with them. 
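
Returning to the two RFFI questions above (the recursive struct and the void** array), here is a rough, untested sketch using plain lltype/rffi; `eci`, `fn` and `items` are placeholders for your ExternalCompilationInfo, the C function, and the list of pointers you want to pass:

from rpython.rtyper.lltypesystem import lltype, rffi

# Recursive struct via a forward reference, as Armin suggests:
A_FORWARD = lltype.ForwardReference()
A_PTR = lltype.Ptr(A_FORWARD)
A = lltype.Struct('A',
                  ('fn', lltype.Ptr(lltype.FuncType([A_PTR], lltype.Void))))
A_FORWARD.become(A)

# Passing a void** plus a length: allocate a raw array, fill it, call, free.
VOIDP_ARRAY = rffi.CArray(rffi.VOIDP)          # C type: void*[]
fn = rffi.llexternal('fn', [lltype.Ptr(VOIDP_ARRAY), rffi.INT], lltype.Void,
                     compilation_info=eci)     # eci: your ExternalCompilationInfo

def call_fn(items):
    n = len(items)
    buf = lltype.malloc(VOIDP_ARRAY, n, flavor='raw')
    try:
        for i in range(n):
            buf[i] = rffi.cast(rffi.VOIDP, items[i])
        fn(buf, rffi.cast(rffi.INT, n))
    finally:
        lltype.free(buf, flavor='raw')
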
A bientôt,
Armin.

From John.Zhang at anu.edu.au  Tue Jun 28 03:23:32 2016
From: John.Zhang at anu.edu.au (John Zhang)
Date: Tue, 28 Jun 2016 07:23:32 +0000
Subject: [pypy-dev] Passing an array to C in RFFI
Message-ID: <1CE09768-C86B-43A0-954A-F7F8219F05C9@anu.edu.au>

Hi all,

Thanks for the help previously. I have another question, about passing
an array value to a C function using RFFI. Suppose I have a C function
like this:

    void fn(void** array_of_voips, int length)

How do I call this function from RPython using RFFI, passing an array
value to it? It seems that I can't just do

    fn([a, b, c], 3)

Help appreciated.

John Zhang

------------------------------------------------------
John Zhang
Research Assistant
Programming Languages, Design & Implementation Division
Computer Systems Group
ANU College of Engineering & Computer Science
108 North Rd
The Australian National University
Acton ACT 2601
john.zhang at anu.edu.au

From wickedgrey at gmail.com  Wed Jun 29 12:59:19 2016
From: wickedgrey at gmail.com (Eli Stevens (Gmail))
Date: Wed, 29 Jun 2016 09:59:19 -0700
Subject: [pypy-dev] pickle numpy array from pypy to cpython?
References: <576EBEB9.9070804@gmail.com>

Any thoughts on whether this approach is acceptable? Happy to
incorporate feedback.

I wouldn't be surprised if there are more functions than just
_reconstruct that will need to be special-cased, but without a concrete
use case I wasn't going to complicate things.

Thanks,
Eli

On Sat, Jun 25, 2016 at 8:19 PM, Eli Stevens (Gmail) wrote:
> That did the trick.
>
> Pull request here:
> https://bitbucket.org/pypy/pypy/pull-requests/460/changes-reported-location-of-_reconstruct/diff
>
> Please let me know if there are changes that should be made. As noted,
> I'm not super happy with the tests, but am unsure what direction I
> should go with them.
>
> Cheers,
> Eli
>
> On Sat, Jun 25, 2016 at 10:26 AM, Matti Picus wrote:
>> You need to add the modules to those that the class-local space is built
>> with using a spaceconfig, so something like
>>
>>     class AppTestPicklingNumpy(BaseNumpyAppTest):
>>         spaceconfig = dict(usemodules=["micronumpy", "struct", "binascii"])
>>
>>         def setup_class(cls):
>>             if option.runappdirect and '__pypy__' not in sys.builtin_module_names:
>>                 py.test.skip("pypy only test")
>>             BaseNumpyAppTest.setup_class.im_func(cls)
>>
>>         def test_pickle_module(self):
>>             import pickle
>>
>> On 25/06/16 09:01, Eli Stevens (Gmail) wrote:
>>> I was thinking about doing it on import of the micronumpy module
>>> (pypy/module/micronumpy/app_numpy.py).
>>>
>>> Right now, when I try to import pickle during the tests:
>>>
>>>     $ cat pypy/module/micronumpy/test/test_pickling_app.py
>>>     import sys
>>>     import py
>>>
>>>     from pypy.module.micronumpy.test.test_base import BaseNumpyAppTest
>>>     from pypy.conftest import option
>>>
>>>     class AppTestPicklingNumpy(BaseNumpyAppTest):
>>>         def setup_class(cls):
>>>             if option.runappdirect and '__pypy__' not in sys.builtin_module_names:
>>>                 py.test.skip("pypy only test")
>>>             BaseNumpyAppTest.setup_class.im_func(cls)
>>>
>>>         def test_pickle_module(self):
>>>             import pickle
>>>             ...  # more code
>>>
>>> I get this error:
>>>
>>>     >   import struct
>>>     lib-python/2.7/pickle.py:34:
>>>     _ _ _ _ _ _
>>>     >   from _struct import *
>>>     E   (application-level) ImportError: No module named _struct
>>>     lib-python/2.7/struct.py:1: ImportError
>>>
>>> But everything seems fine with struct:
>>>
>>>     $ ./pytest.py pypy/module/struct/test/test_struct.py
>>>     ==== test session starts ====
>>>     platform linux2 -- Python 2.7.11 -- py-1.4.20 -- pytest-2.5.2
>>>     pytest-2.5.2 from /home/elis/edit/play/pypy/pytest.pyc
>>>     collected 30 items
>>>
>>>     pypy/module/struct/test/test_struct.py ..............................
>>>
>>>     ==== 30 passed in 11.95 seconds ====
>>>
>>> Any idea what's going on here?
>>>
>>> Thanks,
>>> Eli
>>>
>>> On Fri, Jun 24, 2016 at 9:19 PM, matti picus wrote:
>>>> Sounds reasonable. You might want to generalize it a bit by trying to
>>>> import _numpypy / numpy, and setting up the replacement by whichever
>>>> fails to import.
>>>> Matti
>>>>
>>>> On Saturday, 25 June 2016, Eli Stevens (Gmail) wrote:
>>>>> Heh, interestingly, if I add the following to the local dir and files
>>>>> when trying to unpickle under cpython, it works (note that cpython to
>>>>> pypy actually works out of the box, which I hadn't realized):
>>>>>
>>>>>     $ cat _numpypy/__init__.py
>>>>>     from numpy.core import *
>>>>>
>>>>>     $ cat _numpypy/multiarray.py
>>>>>     from numpy.core.multiarray import *
>>>>>     import numpy.core.multiarray as _ncm
>>>>>     _reconstruct = _ncm._reconstruct
>>>>>
>>>>> This is obviously a total hack, and not one I'm comfortable with
>>>>> (since I need to use this codebase from both cpython and pypy), but it
>>>>> demonstrates that it's just bookkeeping that needs to change to get
>>>>> things to work.
>>>>>
>>>>> My first approach would be to add a wrapper around save_global here
>>>>>
>>>>> https://bitbucket.org/pypy/pypy/src/a0105e0d00dbd0f73d06fc704db704868a6c6ed2/lib-python/2.7/pickle.py?at=default&fileviewer=file-view-default#pickle.py-814
>>>>>
>>>>> that special-cases the global '_numpypy.multiarray' to instead be
>>>>> 'numpy.core.multiarray'. Does that seem like a reasonable thing to do?
>>>>>
>>>>> Cheers,
>>>>> Eli

From matti.picus at gmail.com  Wed Jun 29 14:31:12 2016
From: matti.picus at gmail.com (Matti Picus)
Date: Wed, 29 Jun 2016 21:31:12 +0300
Subject: [pypy-dev] pickle numpy array from pypy to cpython?
References: <576EBEB9.9070804@gmail.com>
Message-ID: <577413F0.90909@gmail.com>

I think I would prefer this be done in upstream numpy (which is 95%
supported by PyPy's cpyext layer) rather than by changing the class
name when saving a _numpypy ndarray. In both cases, a warning should be
emitted when loading the "wrong" object, to tell the user that subtle
problems may occur, for instance with complicated record dtypes or with
arrays of objects.

Your pull request seems OK; it needs tests of more complicated numpy
types like scalars and record arrays. Again, I would be happier if it
spat out some kind of warning when overriding the object name.
Maybe we should merge it until we can fix upstream numpy?
Does anyone else have an opinion?
Matti

On 29/06/16 19:59, Eli Stevens (Gmail) wrote:
> Any thoughts on whether this approach is acceptable? Happy to
> incorporate feedback.
>
> I wouldn't be surprised if there are more functions than just
> _reconstruct that will need to be special-cased, but without a
> concrete use case I wasn't going to complicate things.
>
> Thanks,
> Eli

From wickedgrey at gmail.com  Wed Jun 29 15:37:41 2016
From: wickedgrey at gmail.com (Eli Stevens (Gmail))
Date: Wed, 29 Jun 2016 12:37:41 -0700
Subject: [pypy-dev] pickle numpy array from pypy to cpython?
In-Reply-To: <577413F0.90909@gmail.com>
References: <576EBEB9.9070804@gmail.com> <577413F0.90909@gmail.com>

To make sure I'm understanding: are you saying that upstream/cpython
numpy should pick up an alternate way to import multiarray (via
_numpypy.multiarray, instead of numpy.core.multiarray), similar to how
one can `import numpy` under pypy, even though the real implementation
is in `_numpypy`?

Thanks,
Eli

On Wed, Jun 29, 2016 at 11:31 AM, Matti Picus wrote:
> I think I would prefer this be done in upstream numpy (which is 95%
> supported by PyPy's cpyext layer) rather than by changing the class
> name when saving a _numpypy ndarray. In both cases, a warning should be
> emitted when loading the "wrong" object, to tell the user that subtle
> problems may occur, for instance with complicated record dtypes or with
> arrays of objects.
>
> Your pull request seems OK; it needs tests of more complicated numpy
> types like scalars and record arrays. Again, I would be happier if it
> spat out some kind of warning when overriding the object name.
> Maybe we should merge it until we can fix upstream numpy?
> Does anyone else have an opinion?
> Matti

From arigo at tunes.org  Thu Jun 30 01:58:10 2016
From: arigo at tunes.org (Armin Rigo)
Date: Thu, 30 Jun 2016 07:58:10 +0200
Subject: [pypy-dev] pickle numpy array from pypy to cpython?
References: <576EBEB9.9070804@gmail.com> <577413F0.90909@gmail.com>

Hi Eli, hi Matti,

On 29 June 2016 at 21:37, Eli Stevens (Gmail) wrote:
> To make sure I'm understanding: are you saying that upstream/cpython
> numpy should pick up an alternate way to import multiarray (via
> _numpypy.multiarray, instead of numpy.core.multiarray)

Hum, in my opinion we should always pickle/unpickle arrays by
reproducing, and expecting, the exact same format as CPython's numpy,
with no warnings. Any difference (e.g. with complicated dtypes) is a
bug that should eventually be fixed.
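The reference format is simply whatever CPython's numpy writes; a
quick, illustrative way to see which global name ends up in the stream
(run on CPython, untested here) is:

    import pickle, pickletools
    import numpy as np

    # The GLOBAL opcode in the disassembly shows
    # 'numpy.core.multiarray _reconstruct'.
    pickletools.dis(pickle.dumps(np.arange(3), protocol=2))

PyPy should record that same global, so pickles load on either side
without any remapping.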
A bientôt,
Armin.

From arigo at tunes.org  Thu Jun 30 02:05:36 2016
From: arigo at tunes.org (Armin Rigo)
Date: Thu, 30 Jun 2016 08:05:36 +0200
Subject: [pypy-dev] Passing an array to C in RFFI
In-Reply-To: <1CE09768-C86B-43A0-954A-F7F8219F05C9@anu.edu.au>

Hi John,

On 28 June 2016 at 09:23, John Zhang wrote:
> I have another question, about passing an array value to a C function
> using RFFI. Suppose I have a C function like this:
>     void fn(void** array_of_voips, int length)
> How do I call this function from RPython using RFFI, passing an array
> value to it? It seems that I can't just do
>     fn([a, b, c], 3)

Indeed, this works in CFFI, but RFFI is by far not as nice as that.
You need to build the array value manually with
"ar = lltype.malloc(ARRAY_OF_VOIDP.TO, length, flavor='raw'); ar[0] = a; "
etc., remembering to do "lltype.free(ar, flavor='raw')" at the end.
Or use "with lltype.scoped_alloc(...)", which replaces both the malloc
at the start and the free at the end.
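Put together, it would look something like this (untested sketch; the
exact array type and the external declaration of fn are assumptions
here, not taken from your code):

    from rpython.rtyper.lltypesystem import lltype, rffi

    ARRAY_OF_VOIDP = rffi.CArrayPtr(rffi.VOIDP)

    # Real code also passes compilation_info=ExternalCompilationInfo(...)
    # pointing at the header that declares fn().
    fn = rffi.llexternal('fn', [ARRAY_OF_VOIDP, rffi.INT], lltype.Void)

    def call_fn(a, b, c):
        # Raw-allocate the void* array, fill it, call, and free it
        # automatically when the 'with' block exits (even on exceptions).
        with lltype.scoped_alloc(ARRAY_OF_VOIDP.TO, 3) as ar:
            ar[0] = a
            ar[1] = b
            ar[2] = c
            fn(ar, rffi.cast(rffi.INT, 3))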
A bientôt,
Armin.

From John.Zhang at anu.edu.au  Thu Jun 30 02:33:09 2016
From: John.Zhang at anu.edu.au (John Zhang)
Date: Thu, 30 Jun 2016 06:33:09 +0000
Subject: [pypy-dev] Passing an array to C in RFFI
References: <1CE09768-C86B-43A0-954A-F7F8219F05C9@anu.edu.au>
Message-ID: <3F8AAFAD-8ECC-4F89-9453-6ECE9D86A2CC@anu.edu.au>

Hi Armin,

Thanks for the help! The scoped_alloc really is quite handy in these
situations.

Cheers,
John Zhang

------------------------------------------------------
John Zhang
Research Assistant
Programming Languages, Design & Implementation Division
Computer Systems Group
ANU College of Engineering & Computer Science
108 North Rd
The Australian National University
Acton ACT 2601
john.zhang at anu.edu.au

On 30 Jun 2016, at 16:05, Armin Rigo wrote:
> Or use "with lltype.scoped_alloc(...)", which replaces both the malloc
> at the start and the free at the end.

From paulbaker8 at gmail.com  Thu Jun 30 05:42:51 2016
From: paulbaker8 at gmail.com (Paul Baker)
Date: Thu, 30 Jun 2016 21:42:51 +1200
Subject: [pypy-dev] Is it possible to make PyPy smaller?

I have a small command-line tool, written in Python, which I would like
to give to a few people to use on their own machines. For performance
reasons, I would like to distribute the tool with PyPy rather than
CPython. For this particular tool, PyPy runs 60 times faster than
CPython (and only 3 times slower than a less-capable C++ implementation).

My concern with distributing PyPy to end-users is size: Python27.dll is
2.3MB, while libpypy-c.dll is 41.9MB.

Why is PyPy so much bigger than CPython? Is it possible to build a
version of PyPy which is closer to CPython in size?

From yury at shurup.com  Thu Jun 30 07:15:25 2016
From: yury at shurup.com (Yury V. Zaytsev)
Date: Thu, 30 Jun 2016 13:15:25 +0200 (CEST)
Subject: [pypy-dev] Is it possible to make PyPy smaller?

On Thu, 30 Jun 2016, Paul Baker wrote:

> My concern with distributing PyPy to end-users is size: Python27.dll is
> 2.3MB, while libpypy-c.dll is 41.9MB.
>
> Why is PyPy so much bigger than CPython? Is it possible to build a
> version of PyPy which is closer to CPython in size?

It may very well be possible with some compilation-flag juggling, but
building PyPy on Windows is not a walk in the park, so you really need
to consider whether you want to get into it or not...

What is your main concern, however? I would assume that this DLL
compresses really well with something like 7z / ultra, does it not? If
it's the unpacked size that bothers you, you could, for instance, treat
the DLL with UPX.

-- 
Sincerely yours,
Yury V. Zaytsev

From arigo at tunes.org  Thu Jun 30 16:41:55 2016
From: arigo at tunes.org (Armin Rigo)
Date: Thu, 30 Jun 2016 22:41:55 +0200
Subject: [pypy-dev] Is it possible to make PyPy smaller?

Hi,

On 30 June 2016 at 13:15, Yury V. Zaytsev wrote:
> On Thu, 30 Jun 2016, Paul Baker wrote:
>> Why is PyPy so much bigger than CPython? Is it possible to build a
>> version of PyPy which is closer to CPython in size?
>
> It may very well be possible with some compilation-flag juggling, but
> building PyPy on Windows is not a walk in the park, so you really need
> to consider whether you want to get into it or not...
>
> What is your main concern, however? I would assume that this DLL
> compresses really well with something like 7z / ultra, does it not? If
> it's the unpacked size that bothers you, you could, for instance, treat
> the DLL with UPX.

Thanks Yury for the answer. Yes, technically, the PyPy DLL is much
larger than CPython's for a combination of reasons, each adding its own
factor -- even if, as you should, you compare libpypy-c.dll with the
total size of python27.dll plus all the other .pyd files.
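(To total those up, something along these lines works; the paths are
only examples for a default Windows install, adjust them to your setup:

    import glob, os

    parts = [r"C:\Python27\python27.dll"] + glob.glob(r"C:\Python27\DLLs\*.pyd")
    print("CPython core + stdlib extensions: %.1f MB"
          % (sum(os.path.getsize(p) for p in parts) / 1e6))

The same snippet pointed at libpypy-c.dll gives the other side of the
comparison.)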
In the end the libpypy-c.dll is probably more compressible than
CPython's, so by compressing it you regain some of that factor. You
don't regain all of it by far, though. You don't get a complex JIT
compiler for free, for example.

A bientôt,
Armin.

From yury at shurup.com  Thu Jun 30 17:09:45 2016
From: yury at shurup.com (Yury V. Zaytsev)
Date: Thu, 30 Jun 2016 23:09:45 +0200 (CEST)
Subject: [pypy-dev] Is it possible to make PyPy smaller?

On Thu, 30 Jun 2016, Armin Rigo wrote:

> In the end the libpypy-c.dll is probably more compressible than
> CPython's, so by compressing it you regain some of that factor. You
> don't regain all of it by far, though. You don't get a complex JIT
> compiler for free, for example.

So, just out of curiosity, with a very old and outdated version of upx:

        File size          Ratio     Format      Name
   --------------------    ------    ---------   -------------
   40565248 ->   7155712   17.64%    win32/pe    libpypy-c.dll

That's 7M! Surely the latest version must be able to do a bit better.

Now, the complete PyPy distribution, compressed with a very old and
outdated xz --best, is around 14M, and it's even a bit smaller if you
compress the libraries with upx first. I think that's a fair price to
pay for a powerful JIT, isn't it?

On top of that, as I said, one might have some luck playing with the
compiler flags, but I think it's hardly worth it.

-- 
Sincerely yours,
Yury V. Zaytsev