From emre.yavuz169 at gmail.com Sat May 1 16:50:10 2021
From: emre.yavuz169 at gmail.com (Emre Yavuz)
Date: Sat, 1 May 2021 22:50:10 +0200
Subject: [pypy-dev] Syscall Interface slowness

Hello,

Today I was doing some experiments with CPython and PyPy. I was very
impressed by the performance of PyPy: when it's doing operations in user
space, it was almost 20 times faster than CPython.

Then I decided to switch our Python CLI to PyPy. I ran one of the major
commands in our CLI and the results were worse than with CPython. It got
slower! Then I started to research it more. Our CLI's characteristic is
that it calls multiple other programs, reads a lot of configuration data
and creates many files, which means all of those operations are related
to syscalls.

Then I ran some simple test cases: I tried to read and write millions of
lines to a file, or to create and kill multiple processes. All of these
operations were almost 5 times slower than with CPython. I ran my tests
on both macOS and RHEL with the latest version of PyPy3.7.

My question is: is that something known? Or could it be an improvement
area that can be contributed to?

Best,
Emre Yavuz

From strombrg at gmail.com Sat May 1 16:55:54 2021
From: strombrg at gmail.com (Dan Stromberg)
Date: Sat, 1 May 2021 13:55:54 -0700
Subject: [pypy-dev] Syscall Interface slowness

I have a system call-heavy program
(https://stromberg.dnsalias.org/~strombrg/backshift/) that is faster
with pypy on one machine, and faster with CPython+Cython on another.
Same code, different machines, different relative speeds for the two
implementations.

For a long time, I thought pypy was just faster at CPU and slower at I/O,
but it turns out that's not always true.

HTH.

--
Dan Stromberg

From cfbolz at gmx.de Sat May 1 17:12:05 2021
From: cfbolz at gmx.de (Carl Friedrich Bolz-Tereick)
Date: Sat, 1 May 2021 23:12:05 +0200
Subject: [pypy-dev] Syscall Interface slowness
Message-ID: <53264227-3a78-21c8-890c-72776f40fafa@gmx.de>

Hi Emre,

IO can definitely be slower than CPython, but not in all cases. A few of
them are known and we try to improve them, e.g. readline operations on
files:

https://foss.heptapod.net/pypy/pypy/-/issues/3126

5x is definitely too much of a difference. If you can minimize that to a
small self-contained example, that would be a very valuable bug report
and we would definitely try to fix it.

Cheers,

Carl Friedrich

From emre.yavuz169 at gmail.com Sat May 1 18:08:29 2021
From: emre.yavuz169 at gmail.com (Emre Yavuz)
Date: Sun, 2 May 2021 00:08:29 +0200
Subject: [pypy-dev] Syscall Interface slowness

That's really interesting! I also started to think it's slower in I/O.
I also ran my tests on Ubuntu and Debian, with the same result.

What type of system runs PyPy faster?

Have you or someone else experienced that and had a chance to look at
what's making it slower? Because the difference is huge when it comes to
syscalls.

I am planning to dive into the code to find out more if it's not a known
fact.

Best,
Emre Yavuz

From strombrg at gmail.com Sat May 1 18:25:21 2021
From: strombrg at gmail.com (Dan Stromberg)
Date: Sat, 1 May 2021 15:25:21 -0700
Subject: [pypy-dev] Syscall Interface slowness

The system that's faster with Pypy is an AMD FX(tm)-4300 Quad-Core
Processor.

The system that's faster with CPython+Cython is an AMD Athlon(tm) II X3
455 Processor.

--
Dan Stromberg

From emre.yavuz169 at gmail.com Sat May 1 18:49:27 2021
From: emre.yavuz169 at gmail.com (Emre Yavuz)
Date: Sun, 2 May 2021 00:49:27 +0200
Subject: [pypy-dev] Syscall Interface slowness

Hi Carl,

Sorry, I didn't receive your message (probably something is wrong in my
mail configuration), but I saw it in the digest.

I created a self-contained program for this example. I am using
"writelines".

That's the pastebin link to the program:
https://pastebin.com/D6auMcwN

Command line:
(devpy) emreyavuz:tmp emreyavuz$ python3 test-writer
Time has been spend for copying data 28 secs
(devpy) emreyavuz:tmp emreyavuz$ pypy3 test-writer
Time has been spend for copying data 118 secs

Best,
Emre Yavuz

From cfbolz at gmx.de Sun May 2 09:36:47 2021
From: cfbolz at gmx.de (Carl Friedrich Bolz-Tereick)
Date: Sun, 2 May 2021 15:36:47 +0200
Subject: [pypy-dev] Syscall Interface slowness
Message-ID: <65bbaa2f-eff8-4a4d-7d42-172acee3eebb@gmx.de>

Hi Emre,

thanks for that, it's great! There's nothing deeply wrong in PyPy here,
just plain optimization work needed. We're tracking it here, in the
issue I already linked:

https://foss.heptapod.net/pypy/pypy/-/issues/3126

If you find more like this, please let us know!

Cheers,

CF

From cfbolz at gmx.de Sun May 2 10:06:12 2021
From: cfbolz at gmx.de (Carl Friedrich Bolz-Tereick)
Date: Sun, 2 May 2021 16:06:12 +0200
Subject: [pypy-dev] Syscall Interface slowness
Message-ID: <1e183cd6-0996-8de2-1cc4-d41fe95a248d@gmx.de>

Ah, I found one problem: your script uses .writelines with a string as
the argument. writelines usually takes an iterable (like a list), but it
will also work with a string argument, and then do:

    for char in s:
        w.write(char)

If I replace the writelines(...) with a .write(...) it becomes much
faster, both on cpython and pypy.

CF

From emre.yavuz169 at gmail.com Sun May 2 10:16:38 2021
From: emre.yavuz169 at gmail.com (Emre Yavuz)
Date: Sun, 2 May 2021 16:16:38 +0200
Subject: [pypy-dev] Syscall Interface slowness
In-Reply-To: <1e183cd6-0996-8de2-1cc4-d41fe95a248d@gmx.de>
References: <1e183cd6-0996-8de2-1cc4-d41fe95a248d@gmx.de>
Message-ID: <061A026F-3E73-4E2E-A9D3-34AC21310AE0@gmail.com>

Hi Carl,

Thank you for informing about the issue link, I'll be following that.
Yes, "writelines" took much longer in that case, I suppose. I would
expect them to be the same, but it's good to know that with a few tweaks
it can actually get faster.

I'll take a closer look at the other things that made the CLI slower,
and if I find something interesting, I'll let you know for sure!

Best,
Emre Yavuz

From matti.picus at gmail.com Sun May 2 16:17:16 2021
From: matti.picus at gmail.com (Matti Picus)
Date: Sun, 2 May 2021 23:17:16 +0300
Subject: [pypy-dev] release candidates for bugfix v7.3.5 are available

In order to fix some problems with the 7.3.4 release, I am releasing a
7.3.5 bugfix.
The rc1 candidates are available at
https://downloads.python.org/pypy/ and the checksums can be found in the
PR to pypy.org
https://608f07a183d23c00083f6115--keen-mestorf-442210.netlify.app/download_advanced.html#checksums

The changes are (as can be seen in the release note
https://doc.pypy.org/en/latest/release-v7.3.5.html):

- The new windows 64-bit builds improperly named c-extension modules
  with the same extension as the 32-bit build (issue 3443)

- A change to the python 3.7 sysconfig.get_config_var('LIBDIR') was
  wrong, leading to problems finding libpypy3-c.so for embedded PyPy
  (issue 3442).

- Two upstream (CPython) security patches were applied: BPO 42988 to
  remove pydoc.getfile and BPO 43285 to not trust the PASV response in
  ftplib.

- When assigning the full slice of a list, evaluate the rhs before
  clearing the list (issue 3440)

- On Python2, PyUnicode_Contains accepts bytes as well as unicode.

- Update the packaged sqlite3 to 3.35.5 on windows. While not a bugfix,
  this seems like an easy win.

Please try out the release candidates soon so I can finish the release.

Matti

From mgorny at gentoo.org Sun May 2 16:26:21 2021
From: mgorny at gentoo.org (Michał Górny)
Date: Sun, 02 May 2021 22:26:21 +0200
Subject: [pypy-dev] release candidates for bugfix v7.3.5 are available
Message-ID: <354281cbb298dc7818bc170c19fbb90705fdf47a.camel@gentoo.org>

On Sun, 2021-05-02 at 23:17 +0300, Matti Picus wrote:
> - Two upstream (CPython) security patches were applied: BPO 42988 to
> remove pydoc.getfile and BPO 43285 to not trust the PASV response in
> ftplib.
>

Hope it's not too late for a few more. I've been working on security
backports today, and I have a few patches you could apply:

For PyPy3.7:

https://gitweb.gentoo.org/fork/cpython.git/commit/?h=gentoo-3.7.10_p3&id=f80b05c6d0dbad28453a824d10cc7a5336dd090f
https://gitweb.gentoo.org/fork/cpython.git/commit/?h=gentoo-3.7.10_p3&id=7c53e5864fddc139a85a674583e0c6fb825b6378
https://gitweb.gentoo.org/fork/cpython.git/commit/?h=gentoo-3.7.10_p3&id=b84a761c292f6f04a1c238908c1ebc42af52fd53

(note that the last patch is actually a fixup to the previous one)

For PyPy2.7:

https://gitweb.gentoo.org/fork/cpython.git/commit/?h=gentoo-2.7-vanilla&id=f9e5d7ae605bd9739b1cf3423f03daa909ee4a0c
https://gitweb.gentoo.org/fork/cpython.git/commit/?h=gentoo-2.7-vanilla&id=41823e6111ee1c9b4f114764548f4323ada85dd1

--
Best regards,
Michał Górny

From matti.picus at gmail.com Wed May 5 17:39:22 2021
From: matti.picus at gmail.com (Matti Picus)
Date: Thu, 6 May 2021 00:39:22 +0300
Subject: [pypy-dev] 7.3.5 rc2 tarballs are up.

Following the rc1 I have uploaded rc2 candidates to

https://downloads.python.org/pypy/ and the checksums can be found in the
PR to pypy.org

https://608f07a183d23c00083f6115--keen-mestorf-442210.netlify.app/download_advanced.html#checksums

The changes are in the release note.

https://doc.pypy.org/en/latest/release-v7.3.5.html:

- The new windows 64-bit builds improperly named c-extension modules
  with the same extension as the 32-bit build (issue 3443)

- Use the windows-specific PC/pyconfig.h rather than the posix one

- A change to the python 3.7 sysconfig.get_config_var('LIBDIR') was
  wrong, leading to problems finding libpypy3-c.so for embedded PyPy
  (issue 3442).

- Instantiate distutils.command.install schema for PyPy-specific
  implementation_lower

- Four upstream (CPython) security patches were applied

- When assigning the full slice of a list, evaluate the rhs before
  clearing the list (issue 3440)

- On Python2, PyUnicode_Contains accepts bytes as well as unicode.

- Update the packaged sqlite3 to 3.35.5 on windows. While not a bugfix,
  this seems like an easy win.

Please try them out

Matti

From metst13 at gmail.com Thu May 6 03:36:03 2021
From: metst13 at gmail.com (Yaroslav Nikitenko)
Date: Thu, 6 May 2021 10:36:03 +0300
Subject: [pypy-dev] Allow PyPy to use ROOT (pyroot)

I posted this at https://foss.heptapod.net/pypy/pypy/-/issues/3455, and
they recommended to write to this mailing list.

ROOT data analysis framework (https://root.cern) is written in C++
(with cppyy Python bindings). I tried to import that in PyPy, and got
an error

>>>> import ROOT
Traceback (most recent call last):
  File "", line 1, in
  File "/opt/root/cur/lib/ROOT/__init__.py", line 22, in
    import cppyy
  File "/opt/root/cur/lib/cppyy/__init__.py", line 64, in
    libcppyy_mod_name, major, minor))
ImportError: Failed to import libcppyy2_7. Please check that ROOT has
been built for Python 2.7

However, I allowed support for Python 2 in ROOT, and python2 imports
ROOT fine.
That library is in LD_LIBRARY_PATH.

Could you please add support of ROOT in PyPy?

From wlavrijsen at lbl.gov Thu May 6 09:52:50 2021
From: wlavrijsen at lbl.gov (wlavrijsen at lbl.gov)
Date: Thu, 6 May 2021 06:52:50 -0700 (PDT)
Subject: [pypy-dev] Allow PyPy to use ROOT (pyroot)

Yaroslav,

> ROOT data analysis framework (https://root.cern) is written in C++
> (with cppyy Python bindings).

not exactly: ROOT uses a fork of cppyy with several local modifications
(and it's also very much behind cppyy master). One of them is precisely
this:

> ImportError: Failed to import libcppyy2_7. Please check that ROOT has
> been built for Python 2.7

Which is referring to their "multi-python" build, a local "invention."
But also, libcppyy is not used in PyPy and should never be loaded
directly for portable use. (If they had stayed with standard Python
platform tags, they would not have to load it explicitly.)

> Could you please add support of ROOT in PyPy?

This really is on the ROOT folk to make their fork compatible with the
Python ecosphere, so file a bug report with them.

Best regards,
           Wim
--
WLavrijsen at lbl.gov -- +1 (510) 486 6411 -- www.lavrijsen.net

From metst13 at gmail.com Thu May 6 11:29:31 2021
From: metst13 at gmail.com (Yaroslav Nikitenko)
Date: Thu, 6 May 2021 18:29:31 +0300
Subject: [pypy-dev] Allow PyPy to use ROOT (pyroot)

Thanks, Wim!

I filed a feature request to ROOT developers
(https://github.com/root-project/root/issues/8110).

Best regards,
Yaroslav

--
Yaroslav Nikitenko
+7 916 743 3759

From matti.picus at gmail.com Fri May 7 05:19:45 2021
From: matti.picus at gmail.com (Matti Picus)
Date: Fri, 7 May 2021 12:19:45 +0300
Subject: [pypy-dev] 7.3.5 rc2 tarballs are up.
Message-ID: <1258cc11-8db9-063a-583e-7fdee7563481@gmail.com>

On 7/5/21 11:59 am, Oliver Margetts wrote:
> Hi, did this fix https://foss.heptapod.net/pypy/pypy/-/issues/3441
> make it into the RC?
>
> From my testing, it still seems to be a problem.
>
> Best,
> Oliver
>

Thanks, I missed that but put it in now. It will be part of rc3, which
will come out once I figure out what is going wrong with cpyext and
win64.

Matti

From mgorny at gentoo.org Fri May 14 16:08:32 2021
From: mgorny at gentoo.org (Michał Górny)
Date: Fri, 14 May 2021 22:08:32 +0200
Subject: [pypy-dev] 7.3.5 rc2 tarballs are up.
Message-ID: <46e731e476008b1e1a48789fade622d39b947530.camel@gentoo.org>

I think the fix for:

https://foss.heptapod.net/pypy/pypy/-/issues/3432

is missing too.

--
Best regards,
Michał Górny

From matti.picus at gmail.com Sat May 15 14:37:01 2021
From: matti.picus at gmail.com (Matti Picus)
Date: Sat, 15 May 2021 21:37:01 +0300
Subject: [pypy-dev] 7.3.5 rc2 tarballs are up.
In-Reply-To: <46e731e476008b1e1a48789fade622d39b947530.camel@gentoo.org>
References: <46e731e476008b1e1a48789fade622d39b947530.camel@gentoo.org>
Message-ID: <097ff1c1-115e-3779-95b3-4ceece17ff42@gmail.com>

On 14/5/21 11:08 pm, Michał Górny wrote:
> I think the fix for:
>
> https://foss.heptapod.net/pypy/pypy/-/issues/3432
>
> is missing too.
>

Thanks. Done in 77787b8f4c49

Matti

From yury at shurup.com Sun May 16 04:36:18 2021
From: yury at shurup.com (Yury V. Zaytsev)
Date: Sun, 16 May 2021 10:36:18 +0200 (CEST)
Subject: [pypy-dev] Koka benchmark paper
Message-ID: <9a857569-bae5-7ad1-b263-9931a1a55063@shurup.com>

Hi folks,

You must have already seen the TR by MSR covering the new reference
counting garbage collector for Koka?

https://www.microsoft.com/en-us/research/publication/perceus-garbage-free-reference-counting-with-reuse/

I really wonder how warmed up PyPy (including vs. CPython ;-) ) would do
on those benchmarks if translated rather close to Koka?

--
Sincerely yours,
Yury V. Zaytsev

From cfbolz at gmx.de Sun May 16 14:21:23 2021
From: cfbolz at gmx.de (Carl Friedrich Bolz-Tereick)
Date: Sun, 16 May 2021 20:21:23 +0200
Subject: [pypy-dev] Koka benchmark paper
In-Reply-To: <9a857569-bae5-7ad1-b263-9931a1a55063@shurup.com>
References: <9a857569-bae5-7ad1-b263-9931a1a55063@shurup.com>

Hi Yury!

yeah, it's on my to-read list! Did you find the source code of the
benchmark set they used in the paper? Might be worth mailing the
authors. Would be a fun experiment!

Cheers,

Carl Friedrich

From yury at shurup.com Mon May 17 02:33:15 2021
From: yury at shurup.com (Yury V. Zaytsev)
Date: Mon, 17 May 2021 08:33:15 +0200 (CEST)
Subject: [pypy-dev] Koka benchmark paper
References: <9a857569-bae5-7ad1-b263-9931a1a55063@shurup.com>

Hi Carl,

On Sun, 16 May 2021, Carl Friedrich Bolz-Tereick wrote:
>
> yeah, it's on my to-read list! did you find the source code of the
> benchmark set they used in the paper? might be worth mailing the
> authors. Would be a fun experiment!

Nope, unfortunately I've only seen snippets provided in the paper,
although I have to admit that I didn't put hours in looking for them...
I guess emailing would be easiest.

--
Sincerely yours,
Yury V. Zaytsev

From cfbolz at gmx.de Mon May 17 15:27:43 2021
From: cfbolz at gmx.de (Carl Friedrich Bolz-Tereick)
Date: Mon, 17 May 2021 21:27:43 +0200
Subject: [pypy-dev] Koka benchmark paper
References: <9a857569-bae5-7ad1-b263-9931a1a55063@shurup.com>

Hi Yury,

they're here:

https://github.com/koka-lang/koka/tree/master/test/bench

(I asked on Twitter).

Cheers,

Carl Friedrich

From matti.picus at gmail.com Wed May 19 09:31:23 2021
From: matti.picus at gmail.com (Matti Picus)
Date: Wed, 19 May 2021 16:31:23 +0300
Subject: [pypy-dev] 7.3.5 rc3 tarballs are up

I have uploaded rc3 candidates to

https://downloads.python.org/pypy/ and the checksums can be found in the
PR to pypy.org

https://608f07a183d23c00083f6115--keen-mestorf-442210.netlify.app/download_advanced.html#checksums

The changes are in the release note.

https://doc.pypy.org/en/latest/release-v7.3.5.html:

- The new windows 64-bit builds improperly named c-extension modules
  with the same extension as the 32-bit build (issue 3443)

- Use the windows-specific PC/pyconfig.h rather than the posix one

- A change to the python 3.7 sysconfig.get_config_var('LIBDIR') was
  wrong, leading to problems finding libpypy3-c.so for embedded PyPy
  (issue 3442).

- Instantiate distutils.command.install schema for PyPy-specific
  implementation_lower

- Four upstream (CPython) security patches were applied

- When assigning the full slice of a list, evaluate the rhs before
  clearing the list (issue 3440)

- On Python2, PyUnicode_Contains accepts bytes as well as unicode.

- Update the packaged sqlite3 to 3.35.5 on windows. While not a bugfix,
  this seems like an easy win.

Added in this rc3:
-----------------------

- Fix the return type for _Py_HashDouble which impacts 64-bit windows

- Delay thread-checking logic in greenlets until the thread is actually
  started (continuation of issue 3441)

- Fix for json-specialized dicts (issue 3460)

- Specialize ByteBuffer.setslice which speeds up binary file reading by
  a factor of 3

- Test and fix a broken code path in _sqlite3 (issue 3432)

Please try them out

Matti

From thomas.moulton at us.qinetiq.com Thu May 20 12:45:10 2021
From: thomas.moulton at us.qinetiq.com (Moulton, Tom)
Date: Thu, 20 May 2021 12:45:10 -0400
Subject: [pypy-dev] [External] pypy3 Multiprocessing Socket Issue

I'm using pypy3.7-v7.3.4-linux64 (Python v3.7.10) but have also tried
pypy3.7-v7.3.5rc1-linux64 & pypy-c-jit-102427-77787b8f4c49-linux64. I'm
seeing an issue that does NOT occur when using regular Python 3.7
(CPython v3.7.10).

My application uses Python's multiprocessing library to take advantage
of multiple cores on the CPU and as such I have shared resources that
get utilized between these processes (sockets, queues, etc.), arbitrated
through appropriate locking mechanisms to address resource contention.

The issue seems to occur with respect to the sharing of sockets between
these processes. That is, after 30 or more seconds I start to
continually receive the following output, with NO stack trace to where
it is occurring in my application...

Traceback (most recent call last):
  File "/opt/pypy/lib-python/3/multiprocessing/resource_sharer.py", line 142, in _serve
  File "/opt/pypy/lib-python/3/multiprocessing/connection.py", line 453, in accept
  File "/opt/pypy/lib-python/3/multiprocessing/connection.py", line 599, in accept
  File "/opt/pypy/lib-python/3/socket.py", line 212, in accept
OSError: [Errno 24] Too many open files

Clearly, under pypy3, my application is exceeding the number of open
files supported by the operating system (OS). I can see this occur
during the ~30 second lead up: if I run "lsof -i" (in another shell), I
can see the number of TCP/IP connections associated with my application
increasing steadily up to the OS limit. I can only assume this happens
whenever one of my processes accesses this shared socket resource.

Again, this does NOT occur when running my application under regular
Python (CPython) and I verify this using the same "lsof -i" method.

Thoughts on how I might overcome this issue so I can see if using pypy3
over CPython 3 garners better performance for my specific application?

Regards,

Tom Moulton

From matti.picus at gmail.com Thu May 20 13:41:44 2021
From: matti.picus at gmail.com (Matti Picus)
Date: Thu, 20 May 2021 20:41:44 +0300
Subject: [pypy-dev] [External] pypy3 Multiprocessing Socket Issue
In-Reply-To:
References:
Message-ID: <519ada6b-3146-1ff3-752e-ed96f4bc2e03@gmail.com>

On 20/5/21 7:45 pm, Moulton, Tom via pypy-dev wrote:
> I'm using pypy3.7-v7.3.4-linux64 (Python v3.7.10) but have also
> tried pypy3.7-v7.3.5rc1-linux64
> & pypy-c-jit-102427-77787b8f4c49-linux64. I'm seeing an issue that
> does NOT occur when using regular Python 3.7 (CPython v3.7.10).
> ...
> OSError: [Errno 24] Too many open files
>
> ...
> Thoughts on how I might overcome this issue so I can see if using
> pypy3 over CPython 3 garners better performance for my specific
> application?
>
> Regards,
>
> Tom Moulton

Without seeing your code it is difficult to be specific. Typically this is caused by not closing allocated resources. CPython uses reference counting, and so it implicitly releases resources when they go out of scope. PyPy does not; its garbage collector only releases them at some later point. Are you calling socket.close() or fid.close() when you are done with them?

Also see this section of our documentation
https://doc.pypy.org/en/latest/cpython_differences.html#differences-related-to-garbage-collection-strategies

Matti

From apurbo97 at gmail.com Sat May 22 03:04:35 2021
From: apurbo97 at gmail.com (Raihan Rasheed Apurbo)
Date: Sat, 22 May 2021 13:04:35 +0600
Subject: [pypy-dev] seeking pypy gc configuring guideline
Message-ID:

I was trying to run PyPy using the semispace GC, but as it stands I can only compile with incminimark and boehm. What changes do I have to make to be able to test PyPy with any existing GC implementation? Moreover, in the future I want to write my own GC and test it with PyPy. Can anyone point me to where I have to make changes to make an arbitrary GC implementation work? So far I was able to compile an edited version of the incminimark GC.
Any suggestions or links to any particular documentation would be appreciated.

P.S. I ran pypy with the following command and got this error. The same command works if I use incminimark instead of semispace.

[image: image.png]

From armin.rigo at gmail.com Sat May 22 04:00:06 2021
From: armin.rigo at gmail.com (Armin Rigo)
Date: Sat, 22 May 2021 10:00:06 +0200
Subject: [pypy-dev] seeking pypy gc configuring guideline
In-Reply-To:
References:
Message-ID:

Hi Raihan,

On Sat, 22 May 2021 at 09:05, Raihan Rasheed Apurbo wrote:
> I was trying to run pypy using semispace gc. But as it stands currently I
> can only compile using incminimark and boehm. What changes I have to make
> so that I would be able to test pypy with any existing gc implementation?
> Moreover I want to write my own version of gc and test with pypy. In
> future. Can anyone point me where do I have to make changes to make my
> implementation of any arbitrary gc work? So far I was able to compile an
> edited version of incminimark gc.
> Any suggestions or link to any particular documentation would be
> appreciated.
>
> p.s. I ran pypy with the following command and got this error. I can run
> same command just replacing incminimark in place of semispace.
>

This screenshot contains the whole explanation in bold red text: some components of PyPy have evolved over time to depend on special features of the GC. These features have been implemented only in the 'incminimark' and the 'boehm' GC. For experimentation with other GCs, you can disable these components---like the 'cpyext' module of PyPy---by following the instructions in the bold red text.

Note also that if you want to play with the GC, it makes little sense to translate a complete PyPy every time.
You should at least have a small RPython program, much simpler than the whole PyPy, and use that for testing. But ideally you should look at the way we write various tests in rpython/memory/.

À bientôt,

Armin

From apurbo97 at gmail.com Sat May 22 08:40:09 2021
From: apurbo97 at gmail.com (Raihan Rasheed Apurbo)
Date: Sat, 22 May 2021 18:40:09 +0600
Subject: [pypy-dev] seeking pypy gc configuring guideline
In-Reply-To:
References:
Message-ID:

Hello Armin,

Thanks for helping me out. I was actually looking for a general solution, and the last paragraph of your answer covers that. What I understand from it is that, if I want to understand how PyPy's GC works and write my own GC, I first have to understand all the tests written in rpython/memory. I will now look into those extensively.

I tried to understand some of the test code earlier, but I ran into a problem. For example, gc_test_base.py in rpython/memory/test uses not only modules from rpython.memory but also modules like llinterp from rpython.rtyper. As I don't know how those modules work, how do I figure out what they do in the test code? Do you have any suggestions for someone facing this issue?

From armin.rigo at gmail.com Sun May 23 02:22:48 2021
From: armin.rigo at gmail.com (Armin Rigo)
Date: Sun, 23 May 2021 08:22:48 +0200
Subject: [pypy-dev] seeking pypy gc configuring guideline
In-Reply-To:
References:
Message-ID:

Hi Raihan,

On Sat, 22 May 2021 at 14:40, Raihan Rasheed Apurbo wrote:
> Thanks for helping me out. I was actually looking for a generalized solution. The last paragraph of your answer covers that. What I understand from that is, if I want to understand how pypy gc works and if I want to write my own version of GC, at first I have to understand all the tests written on rpython/memory. I will now look extensively into that.
>
> I have tried to understand some of the test codes earlier but there is a problem that I faced. Suppose gc_test_base.py written in rpython/memory/test not only uses modules written in rpython.memory but also uses modules like llinterp from rpython.rtyper.
> As I don't know how those modules work how do I figure out their function written in the test codes? Do you have any suggestions for someone who is facing this issue?

Modules from rpython.rtyper.lltypesystem are essential. You need to learn about them to the extent that they are used. They have test files in "test" subdirectories, like almost every file in PyPy and RPython.

Armin

From cfbolz at gmx.de Sun May 23 03:27:17 2021
From: cfbolz at gmx.de (Carl Friedrich Bolz-Tereick)
Date: Sun, 23 May 2021 09:27:17 +0200
Subject: [pypy-dev] seeking pypy gc configuring guideline
In-Reply-To:
References:
Message-ID: <3F433451-FB21-4105-A342-31114BD08C5F@gmx.de>

Hi Raihan,

Could you please describe a bit what you are trying to implement? That way we could give you a bit more targeted advice.

Cheers,

Carl Friedrich

From apurbo97 at gmail.com Sun May 23 10:58:59 2021
From: apurbo97 at gmail.com (Raihan Rasheed Apurbo)
Date: Sun, 23 May 2021 20:58:59 +0600
Subject: [pypy-dev] seeking pypy gc configuring guideline
In-Reply-To: <3F433451-FB21-4105-A342-31114BD08C5F@gmx.de>
References: <3F433451-FB21-4105-A342-31114BD08C5F@gmx.de>
Message-ID:

Hello Carl,

Thanks for your response. I am currently working on my thesis. My long-term goal is to create a binding between PyPy and MMTk (https://github.com/mmtk/mmtk-core) so that I can test some of the memory managers written for the MMTk platform on PyPy.

To do that, I first have to be able to write a PyPy memory manager, like incminimark, that can communicate with the rest of the compiler and extract all the information a memory manager needs to function well. Then I will write an API for MMTk and pass that information to the MMTk platform so that it can use its memory managers to perform garbage collection for PyPy. That way I would be able to test how new garbage collection algorithms behave in PyPy, and if I am lucky I will find some possible improvements.

Any suggestions will be much appreciated.

Thanks,
Raihan

From apurbo97 at gmail.com Sun May 23 11:00:50 2021
From: apurbo97 at gmail.com (Raihan Rasheed Apurbo)
Date: Sun, 23 May 2021 21:00:50 +0600
Subject: [pypy-dev] seeking pypy gc configuring guideline
In-Reply-To: <3F433451-FB21-4105-A342-31114BD08C5F@gmx.de>
References: <3F433451-FB21-4105-A342-31114BD08C5F@gmx.de>
Message-ID:

Hello Armin,

Thanks for pointing that out. I will definitely take a look at the test code of that module first.

-Raihan

From brad.kish at arcticwolf.com Wed May 26 10:31:29 2021
From: brad.kish at arcticwolf.com (Brad Kish)
Date: Wed, 26 May 2021 14:31:29 +0000
Subject: [pypy-dev] Memory leak in 7.3.4
Message-ID:

We have a streaming application that processes JSON messages. Our system has been mostly stable running PyPy2.7 6.0 for a couple of years.

Recently we tried upgrading to 7.3.4 and saw our processes leak memory until the container was killed. This happened with both PyPy2.7 7.3.4 and PyPy3.7 7.3.4. The only difference was that the CPU usage with PyPy3.7 was almost double. :(

My question is: could this be from the new JSON parser introduced in 7.2? According to https://morepypy.blogspot.com/2019/10/pypys-new-json-parser.html, the parser now caches values, not just keys.
Is there any way to disable the value caching? Or are there other things we can try?

Thanks,

Brad.

From brad.kish at arcticwolf.com Wed May 26 10:39:19 2021
From: brad.kish at arcticwolf.com (Brad Kish)
Date: Wed, 26 May 2021 14:39:19 +0000
Subject: [pypy-dev] Memory leak in 7.3.4
In-Reply-To:
References:
Message-ID: <3D7F7E82-F585-4A83-9907-D9068D112EFF@arcticwolf.com>

Sorry, the containers are not dying. Our processes are exiting due to gc_max settings.

Brad.

From cfbolz at gmx.de Thu May 27 05:10:36 2021
From: cfbolz at gmx.de (Carl Friedrich Bolz-Tereick)
Date: Thu, 27 May 2021 11:10:36 +0200
Subject: [pypy-dev] Memory leak in 7.3.4
In-Reply-To: <3D7F7E82-F585-4A83-9907-D9068D112EFF@arcticwolf.com>
References: <3D7F7E82-F585-4A83-9907-D9068D112EFF@arcticwolf.com>
Message-ID:

Hi Brad,

The new json decoder caches values, but not across json.load calls, i.e. if you don't process a single gigantic message but many tiny messages, the caching of (string) values cannot be the problem.

What *could* be the problem is the key caching though, in theory. The key cache is partially persisted across calls. Are there arbitrarily many different keys in your messages? The key cache shouldn't grow without bounds, however.

What do you set your max heap size to?

To debug further, I would first try to see whether it is indeed the json module or something else. Maybe you could try another json decoder and see whether the problem persists with that? E.g. ujson works on PyPy (slowly), or you could even use the pure Python builtin one that you get by importing json.decoder.JSONDecoder.

Cheers,

Carl Friedrich

From apurbo97 at gmail.com Thu May 27 11:43:54 2021
From: apurbo97 at gmail.com (Raihan Rasheed Apurbo)
Date: Thu, 27 May 2021 21:43:54 +0600
Subject: [pypy-dev] seeking pypy gc configuring guideline
In-Reply-To:
References: <3F433451-FB21-4105-A342-31114BD08C5F@gmx.de>
Message-ID:

Hello Armin,

I have read the test code in test_llinterp.py and llinterp.py. I am writing a summary below; please point out any mistakes and help me understand this.

Basically, llinterp is an object that can interpret a given Python function together with its arguments. It is used to test any component of PyPy.
If we plug a component that we want to test into llinterp, for example a heap that uses the semispace GC, the interpreter will use the semispace GC whenever it needs GC functionality. So if I get the desired output with the semispace GC plugged into llinterp, I can assume that a PyPy properly built with the semispace GC would behave the same way given the same test input.

Please do me a favor by commenting on this summary. It will help me know whether my understanding is right.

Moreover, can you tell me why some of the methods of class LLFrame (inside llinterp.py) only raise a NotImplementedError exception? Methods such as op_gc_heap_stats, op_gc_obtain_free_space, and many more just raise this exception without doing anything else.

Thanks a lot!

-Raihan

From anto.cuni at gmail.com Sun May 30 15:46:51 2021
From: anto.cuni at gmail.com (Antonio Cuni)
Date: Sun, 30 May 2021 21:46:51 +0200
Subject: [pypy-dev] New official IRC channel
Message-ID:

Following the example of many other FOSS projects, the PyPy team has decided to move its official #pypy IRC channel from Freenode to Libera.Chat: irc://irc.libera.chat/pypy

The core devs will no longer be present on the Freenode channel, so we recommend joining the new channel as soon as possible.

https://www.pypy.org/posts/2021/05/pypy-irc-moves-to-libera-chat.html

From armin.rigo at gmail.com Mon May 31 03:54:18 2021
From: armin.rigo at gmail.com (Armin Rigo)
Date: Mon, 31 May 2021 09:54:18 +0200
Subject: [pypy-dev] seeking pypy gc configuring guideline
In-Reply-To:
References: <3F433451-FB21-4105-A342-31114BD08C5F@gmx.de>
Message-ID:

Hi Raihan,

On Thu, 27 May 2021 at 17:44, Raihan Rasheed Apurbo wrote:
> I have read the test codes of test_llinterp.py and llinterp.py. I am writing a summary below. I just want you to point out my mistakes if I am wrong. Please help me to understand this.

There are many tests doing various checks at various levels. I think you should focus on the "final" tests that try to translate the GC to C and run it: they are in rpython/translator/c/test/test_newgc.py. The reason I'm suggesting to focus on these tests instead of looking at all intermediate levels is that, if I'm understanding it correctly, you mostly want to plug in an existing library to do the GC. You're thus going to fight problems of integration more than problems of algorithmically correct code. It's OK if you ignore the other tests because they are likely to not work out of the box in your case. For example they're half-emulating the actual memory layout of the objects, but your external library cannot work with such emulation. So I'd recommend looking at test_newgc instead of, say, rpython/rtyper/llinterp.py, which is here only to make this emulation work.

test_newgc contains tests of the form of a top-level RPython function (typically called f()) that does random RPython things that need memory allocation (which is almost anything that makes lists, tuples, instances of classes, etc.). These RPython things are automatically translated to calls to lltype.malloc(). Some of the tests in test_newgc are directly calling lltype.malloc() too. For the purpose of this test, the two kinds of approaches are equivalent.

Note also that in the past, during development of the branch "stmgc-c8", we made a different GC in C directly and integrated it a bit differently, in a way that is simpler and makes more sense for external GCs. Maybe you want to look at this old branch instead. But it can also be confusing because this STM GC added many kinds of unusual barriers (rpython/memory/gctransform/stmframework.py in this branch).

À bientôt,

Armin.

From brad.kish at arcticwolf.com Mon May 31 12:35:00 2021
From: brad.kish at arcticwolf.com (Brad Kish)
Date: Mon, 31 May 2021 16:35:00 +0000
Subject: [pypy-dev] Memory leak in 7.3.4
Message-ID:

Thanks Carl.
Appreciate the feedback and suggestions. We will try this out.

Brad.

Hi Brad,

The new json decoder caches values, but not across json.load calls, i.e. if you don't process a single gigantic message but many tiny messages, the caching of (string) values cannot be the problem.

What *could* be the problem is the key caching though, in theory. The key cache is partially persisted across calls. Are there arbitrarily many different keys in your messages? The key cache shouldn't grow without bounds, however.

What do you set your max heap size to?

To debug further, I would first try to see whether it is indeed the json module or something else. Maybe you could try another json decoder and see whether the problem persists with that? E.g. ujson works on PyPy (slowly), or you could even use the pure Python builtin one that you get by importing json.decoder.JSONDecoder.

Cheers,

Carl Friedrich
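
The decoder swap suggested above could be tried with something like the following. This is only a rough sketch under assumptions: the function names (decode_default, decode_via_class, process_stream) are hypothetical and nothing here comes from the application in question. Per the suggestion in the thread, going through json.decoder.JSONDecoder on PyPy should select the pure-Python decoder; on CPython the same class may still use the C accelerator, so the comparison is only meaningful on PyPy. The key-counting loop is one way to check whether the messages really contain arbitrarily many different keys.

```python
import json
from json.decoder import JSONDecoder

# One explicit decoder instance, reused for every message.
_class_decoder = JSONDecoder()


def decode_default(message):
    """Decode with json.loads, i.e. the interpreter's default decoder."""
    return json.loads(message)


def decode_via_class(message):
    """Decode through an explicit JSONDecoder instance instead of json.loads."""
    return _class_decoder.decode(message)


def process_stream(messages, decode):
    """Decode every message; return the number of distinct top-level keys."""
    seen_keys = set()
    for message in messages:
        # Updating a set from a dict adds the dict's keys.
        seen_keys.update(decode(message))
    return len(seen_keys)


if __name__ == "__main__":
    # Made-up sample messages standing in for the real stream.
    sample = ['{"id": %d, "k%d": "v"}' % (i, i % 3) for i in range(10)]
    print(process_stream(sample, decode_default))
    print(process_stream(sample, decode_via_class))
```

Running the real workload once with each decode function, while watching process memory, would indicate whether the growth follows the default decoder or persists regardless of it.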