From matti.picus at gmail.com  Thu Mar 7 02:26:27 2019
From: matti.picus at gmail.com (Matti Picus)
Date: Thu, 7 Mar 2019 09:26:27 +0200
Subject: [pypy-dev] Updating packages.pypy.org
Message-ID: <442a9497-4263-cf3b-e256-1c375d4f40e8@gmail.com>

I have updated the repo behind http://packages.pypy.org (forked to
https://github.com/pypy/pypy.packages):

- use the latest pypy3.6 (v7.1 from two days ago), which has the utf8
  improvements
- PyPI changed its interface, so I had to manually create a download.csv
  with the top 1000 packages from the last 12 months
- updated the docker image to use utf-8 as the filesystem encoding so we
  don't get strange ASCII encoding errors when reading files with unicode

With these changes, we can successfully install almost all the packages.
OpenCV, Tensorflow and apache-beam, which all require complicated install
procedures that seem to be beyond the ability of a simple "pip install",
fail, as do some python2-only packages that are still in the top 1000.

Next steps:

- Release 7.1 once the newmemoryview-app-level branch lands (it allows
  creating memoryviews of ctypes structures, passing more tests on
  numpy's latest version)
- Rerun the repo with an official build
- Update the web site.

From matti.picus at gmail.com  Fri Mar 15 07:24:46 2019
From: matti.picus at gmail.com (Matti Picus)
Date: Fri, 15 Mar 2019 13:24:46 +0200
Subject: [pypy-dev] Downloads for release v7.1 of pypy2.7 and pypy3.6 are on bitbucket
Message-ID: 

An HTML attachment was scrubbed...
URL: 

From stuaxo2 at yahoo.com  Sat Mar 16 11:05:03 2019
From: stuaxo2 at yahoo.com (Stuart Axon)
Date: Sat, 16 Mar 2019 15:05:03 +0000 (UTC)
Subject: [pypy-dev] Pypy friendly to get pointer enclosed in python object ?
References: <1292564908.5449758.1552748703265.ref@mail.yahoo.com>
Message-ID: <1292564908.5449758.1552748703265@mail.yahoo.com>

PyCairo has a struct like

    typedef struct {
        PyObject_HEAD
        cairo_t *ctx;
        PyObject *base;   /* base object used to create context, or NULL */
    } PycairoContext;

And a #define that you can use to get a pointer to ctx:

    #define PycairoContext_GET(obj) (((PycairoContext *)(obj))->ctx)

How can I get hold of the pointer *ctx in PyPy?  I had a bit of a look at
CFFI, but couldn't find anything using structures with PyObject_HEAD in
them.

S++

From stuaxo2 at yahoo.com  Sun Mar 17 08:04:29 2019
From: stuaxo2 at yahoo.com (Stuart Axon)
Date: Sun, 17 Mar 2019 12:04:29 +0000 (UTC)
Subject: [pypy-dev] Pypy friendly to get pointer enclosed in python object ?
In-Reply-To: <1292564908.5449758.1552748703265@mail.yahoo.com>
References: <1292564908.5449758.1552748703265.ref@mail.yahoo.com>
 <1292564908.5449758.1552748703265@mail.yahoo.com>
Message-ID: <599374158.5656903.1552824269323@mail.yahoo.com>

Realised I should ask this on the CFFI list, apologies for the noise.

On Saturday, March 16, 2019, 3:05:03 PM GMT, Stuart Axon wrote:
> [...]
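The macro above only works where a Python object reference can be treated
as the address of the C-level struct; on CPython, id(obj) happens to be
exactly that address. A minimal CFFI sketch of the trick, with the
PyObject_HEAD fields written out by hand. Both the layout and the id()
behaviour are CPython-specific assumptions, and their absence on PyPy is
precisely why the question is hard there:

    import cffi

    ffi = cffi.FFI()
    ffi.cdef("""
        typedef struct {
            ssize_t ob_refcnt;   /* PyObject_HEAD spelled out by hand:    */
            void   *ob_type;     /* refcount + type (CPython layout only) */
            void   *ctx;         /* the cairo_t* we are after             */
            void   *base;
        } PycairoContext;
    """)

    def cairo_ctx_pointer(pycairo_context):
        # CPython only: id() returns the PyObject's address, so the
        # struct can be overlaid on the live object.  PyPy makes no such
        # guarantee, so this cast is meaningless there.
        return ffi.cast("PycairoContext *", id(pycairo_context)).ctx

On PyPy the workable routes are a tiny cpyext extension module that
applies PycairoContext_GET on the C side, or a cairo binding built on
cffi from the start (cairocffi, for instance, keeps the underlying
cairo_t* available as cffi data).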
From sunil.guglani at gmail.com  Mon Mar 18 08:31:48 2019
From: sunil.guglani at gmail.com (Sunil Guglani)
Date: Mon, 18 Mar 2019 18:01:48 +0530
Subject: [pypy-dev] pypy configured Cloud host
Message-ID: 

Hi,

I am facing issues installing and configuring pypy with other libraries.
Can you please suggest a cloud hosting service provider which gives a
preconfigured pypy environment?

Thanks,
Sunil

From mark.melvin at gmail.com  Tue Mar 19 07:45:44 2019
From: mark.melvin at gmail.com (Mark Melvin)
Date: Tue, 19 Mar 2019 07:45:44 -0400
Subject: [pypy-dev] PyPy as micropython Replacement
Message-ID: 

Hi All,

I was wondering if PyPy would be a good candidate to create an embedded
distribution of Python similar to micropython. Do you think it would be
feasible to create an installation (stripped of things like Tkinter and
other unneeded modules) that could execute with as little as, say, 2-4 MB
of heap on an ARM7 running Linux?

Thanks,
Mark

From armin.rigo at gmail.com  Tue Mar 19 12:00:41 2019
From: armin.rigo at gmail.com (Armin Rigo)
Date: Tue, 19 Mar 2019 17:00:41 +0100
Subject: [pypy-dev] PyPy as micropython Replacement
In-Reply-To: 
References: 
Message-ID: 

Hi Mark,

On Tue, 19 Mar 2019 at 12:46, Mark Melvin wrote:
> I was wondering if PyPy would be a good candidate to create an embedded
> distribution of Python similar to micropython.

No, PyPy's minimal memory requirements are larger than CPython's.

A bientôt,

Armin.

From mark.melvin at gmail.com  Tue Mar 19 12:02:42 2019
From: mark.melvin at gmail.com (Mark Melvin)
Date: Tue, 19 Mar 2019 12:02:42 -0400
Subject: [pypy-dev] PyPy as micropython Replacement
In-Reply-To: 
References: 
Message-ID: 

OK. Thanks for the quick reply, Armin.

Mark

On Tue., Mar. 19, 2019, 12:01 p.m. Armin Rigo, wrote:
> Hi Mark,
>
> On Tue, 19 Mar 2019 at 12:46, Mark Melvin wrote:
> > I was wondering if PyPy would be a good candidate to create an embedded
> > distribution of Python similar to micropython.
>
> No, PyPy's minimal memory requirements are larger than CPython's.
>
> A bientôt,
>
> Armin.

From matti.picus at gmail.com  Fri Mar 22 04:01:59 2019
From: matti.picus at gmail.com (Matti Picus)
Date: Fri, 22 Mar 2019 10:01:59 +0200
Subject: [pypy-dev] Downloads for release v7.1 of pypy2.7 and pypy3.6 are on bitbucket
In-Reply-To: 
References: 
Message-ID: 

I updated the release candidates:

- fixes for issue 2968 and for inheriting from tuple in C
- changed the py3.6 label to "beta"
- linux builds on xenial, not stretch, for Ubuntu 16.04 compatibility

Please try them out

Matti

On 15/3/19 1:24 pm, Matti Picus wrote:
>
> I have uploaded release bundles of pypy2.7 and pypy3.6 v7.1
>
> The sha256 checksums are at
> https://bitbucket.org/pypy/pypy.org/src/916dbc49d94493ce97263f6a4924f491275ee23b/source/download.txt#lines-457
>
> The release notes are at
> http://doc.pypy.org/en/latest/release-v7.1.0.html
>
> Please try out the release bundles and suggest improvements to the
> release notes.
>
> Matti
From matthewmatthewfl at gmail.com  Fri Mar 22 21:23:35 2019
From: matthewmatthewfl at gmail.com (Matthew Francis-Landau)
Date: Fri, 22 Mar 2019 21:23:35 -0400
Subject: [pypy-dev] Some questions about using rpython
Message-ID: <2ad7209c-c4f1-20fb-6a96-3e8f2a673f44@gmail.com>

Hello,

I am considering using rpython to optimize some interpreter-like code, and
I have a few questions that I couldn't quite figure out from pypy's source.

1. Is it possible to call something written in rpython from "normal
python" (either running on CPython or PyPy)?  Ideally, I would like
something where objects could be shared between the two environments.
Can I compile something using rpython and then turn that into a
C-compiled python module that can share objects with the interpreter?

2. How much control can I have over the tracing compiler?  E.g. is it
possible to force something to be traced the first time it is run?

3. If I am inside of a trace, can I add my own guards or conditionally
abort the trace and return to an interpreted version of the code?  I am
going to have some code like:

    if self.method is None:
        # this branch is only taken at most once for this object
        # Abort the trace???
        self.method = build_method()  # marked as an elidable method
    self.method(....)

4. How can I use threads from inside rpython?  What considerations do I
need to make when interacting with the JitDriver?  What sorts of atomic
operations are available when interfacing with my data structures?  What
considerations would I need to make to work with rpython's GC?  Can I
release "normal python's" GIL when writing threaded code from rpython?
Also, is software transactional memory usable by things that are not
PyPy?  It looks like there were a few branches that haven't been merged.

5. Inside of a trace, I would like to be able to look at mutable objects
(namely the entire current stack frame) from an elidable method and have
the result be cached without a need for any guards.  This is something
that I know will be safe due to how the program was written, but not
something I would expect rpython to be able to deduce from the context of
a trace.  How can I annotate a method to be able to do this?

Thanks,
Matthew

-- 
Matthew Francis-Landau
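Questions 3 and 5 touch a pattern rpython has names for: a quasi-immutable
field combined with an elidable helper. A minimal sketch of how the snippet
in question 3 is usually expressed, with illustrative names (the real
declarations live in rpython.rlib.jit):

    from rpython.rlib import jit

    class W_Obj(object):
        # the trailing '?' marks the field quasi-immutable: the JIT may
        # constant-fold reads of it, and the (rare) write below throws
        # away machine code that depended on the old value instead of
        # adding a guard to every trace
        _immutable_fields_ = ['method?']

        def __init__(self):
            self.method = None

        def get_method(self):
            if self.method is None:
                # taken at most once per object, as in the snippet above
                self.method = build_method(self)
            return self.method

    @jit.elidable
    def build_method(w_obj):
        # elidable: the JIT may treat a call as a pure function of its
        # arguments and reuse a previous result inside a trace
        return object()  # placeholder body for the sketch

Whether this also covers question 5 is less clear: elidable only promises
purity with respect to the arguments passed in, so reading the whole stack
frame from inside an elidable function is only sound if the frame really
cannot change between calls with the same arguments.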
From cfbolz at gmx.de  Sat Mar 23 11:13:13 2019
From: cfbolz at gmx.de (Carl Friedrich Bolz-Tereick)
Date: Sat, 23 Mar 2019 16:13:13 +0100
Subject: [pypy-dev] [CfP] DLS 2019 - 15th Dynamic Languages Symposium, co-located with SPLASH 2019 Call for Papers
Message-ID: <651CEC5E-2D17-40E9-952A-65E73BD2195C@gmx.de>

DLS 2019 - 15th Dynamic Languages Symposium
Co-located with SPLASH 2019, October 22, Athens, Greece
https://conf.researchr.org/home/dls-2019
Follow us @dynlangsym

Dynamic Languages play a fundamental role in today's world of software,
from the perspective of research and practice. Languages such as
JavaScript, R, and Python are vehicles for cutting-edge research as well
as for building widely used products and computational tools.

The 15th Dynamic Languages Symposium (DLS) at SPLASH 2019 is the premier
forum for researchers and practitioners to share research and experience
on all aspects of dynamic languages. DLS 2019 invites high quality papers
reporting original research and experience related to the design,
implementation, and applications of dynamic languages.

Areas of interest are generally empirical studies, language design,
implementation, and runtimes, which includes but is not limited to:

- innovative language features
- innovative implementation techniques
- innovative applications
- development environments and tools
- experience reports and case studies
- domain-oriented programming
- late binding, dynamic composition, and run-time adaptation
- reflection and meta-programming
- software evolution
- language symbiosis and multi-paradigm languages
- dynamic optimization
- interpretation, just-in-time and ahead-of-time compilation
- soft/optional/gradual typing
- hardware support
- educational approaches and perspectives
- semantics of dynamic languages
- frameworks and languages for the Cloud and the IoT

Submission Details

Submissions must neither be previously published nor under review at
other events. DLS 2019 uses a single-blind, two-phase reviewing process.

Papers are assumed to be in one of the following categories:

Research Papers: describe work that advances the current state of the art
Experience Papers: describe insights gained from substantive practical
applications that should be of a broad interest
Dynamic Pearls: describe a known idea in an appealing way to remind the
community and capture a reader's interest

The program committee will evaluate each paper based on its relevance,
significance, clarity, and originality. The paper category needs to be
indicated during submission, and papers are judged accordingly.

Papers are to be submitted electronically at https://dls19.hotcrp.com/ in
PDF format. Submissions must be in the ACM SIGPLAN conference acmart
format, 10 point font, and should not exceed 12 pages. Please see full
details in the instructions for authors available at:
https://conf.researchr.org/home/dls-2019#Instructions-for-Authors

DLS 2019 will run a two-phase reviewing process to help authors make
their final papers the best that they can be. Accepted papers will be
published in the ACM Digital Library and will be freely available for one
month, starting two weeks before the event.

Important Deadlines

Abstract Submission: May 29, 2019
Paper Submission: June 5, 2019
First Phase Notification: July 3, 2019
Final Notifications: August 14, 2019
Camera Ready: August 28, 2019

All deadlines are 23:59 AoE (UTC-12h).

AUTHORS TAKE NOTE: The official publication date is the date the
proceedings are made available in the ACM Digital Library. This date may
be up to two weeks prior to the first day of your conference. The
official publication date affects the deadline for any patent filings
related to published work.

Program Committee

Alexandre Bergel, University of Chile
Carl Friedrich Bolz-Tereick, Heinrich-Heine-Universität Düsseldorf
Chris Seaton, Oracle Labs
David Chisnall, Microsoft Research
Elisa Gonzalez Boix, Vrije Universiteit Brussel
Gregor Richards, University of Waterloo
Guido Chari, Czech Technical University
Hannes Payer, Google
James Noble, Victoria University of Wellington
Jeremy Singer, University of Glasgow
Joe Gibbs Politz, University of California San Diego
Juan Fumero, The University of Manchester
Julien Ponge, Red Hat
Mandana Vaziri, IBM Research
Manuel Serrano, Inria
Marc Feeley, Université de Montréal
Mark Marron, Microsoft Research
Na Meng, Virginia Tech
Nick Papoulias, Université Grenoble Alpes
Richard Roberts, Victoria University of Wellington
Sam Tobin-Hochstadt, Indiana University
Sarah Mount, Aston University
Sébastien Doeraene, École polytechnique fédérale de Lausanne
William Cook, University of Texas at Austin

Program Chair

Stefan Marr, University of Kent, United Kingdom

From robert.whitcher at rubrik.com  Wed Mar 27 18:33:39 2019
From: robert.whitcher at rubrik.com (Robert Whitcher)
Date: Wed, 27 Mar 2019 17:33:39 -0500
Subject: [pypy-dev] Continual Increase In Memory Utilization Using PyPy 6.0 (python 2.7) -- Help!?
Message-ID: 

So I have a process that uses PyPy and pymongo in a loop. It does
basically the same thing every loop: query a table via pymongo, do a few
non-save calculations, and then wait and loop again.

The RSS of the process continually increases (PYPY_GC_MAX is set pretty
high). So I hooked in the GC stats output per
http://doc.pypy.org/en/latest/gc_info.html and also made sure that
gc.collect() was called at least every 3 minutes.

What I see is that... the memory, while high, is fairly constant for a
long time:

2019-03-27 00:04:10.033-0600 [-] main_thread(29621)log (async_worker_process.py:304): INFO_FLUSH: RSS: 144244736
...
2019-03-27 01:01:46.841-0600 [-] main_thread(29621)log (async_worker_process.py:304): INFO_FLUSH: RSS: 144420864
2019-03-27 01:02:36.943-0600 [-] main_thread(29621)log (async_worker_process.py:304): INFO_FLUSH: RSS: 144269312

Then it decides (and the exact per-loop behavior is the same each time)
to chew up much more memory:

2019-03-27 01:04:17.184-0600 [-] main_thread(29621)log (async_worker_process.py:304): INFO_FLUSH: RSS: 145469440
2019-03-27 01:05:07.305-0600 [-] main_thread(29621)log (async_worker_process.py:304): INFO_FLUSH: RSS: 158175232
2019-03-27 01:05:57.401-0600 [-] main_thread(29621)log (async_worker_process.py:304): INFO_FLUSH: RSS: 173191168
2019-03-27 01:06:47.490-0600 [-] main_thread(29621)log (async_worker_process.py:304): INFO_FLUSH: RSS: 196943872
2019-03-27 01:07:37.575-0600 [-] main_thread(29621)log (async_worker_process.py:304): INFO_FLUSH: RSS: 205406208
2019-03-27 01:08:27.659-0600 [-] main_thread(29621)log (async_worker_process.py:304): INFO_FLUSH: RSS: 254562304
2019-03-27 01:09:17.770-0600 [-] main_thread(29621)log (async_worker_process.py:304): INFO_FLUSH: RSS: 256020480
2019-03-27 01:10:07.866-0600 [-] main_thread(29621)log (async_worker_process.py:304): INFO_FLUSH: RSS: 289779712

That's 140 MB .... Where is all that memory going...
What's more, the PyPy GC stats do not show anything different.

Here are the GC stats from GC-Complete when we were at *144MB*:

2019-03-26 23:55:49.127-0600 [-] main_thread(29621)log (async_worker_process.py:304): INFO_FLUSH: RSS: 140632064
2019-03-26 23:55:49.133-0600 [-] main_thread(29621)log (async_worker_process.py:308): DBG0: Total memory consumed:
    GC used:            56.8MB (peak: 69.6MB)
       in arenas:            39.3MB
       rawmalloced:          14.5MB
       nursery:              3.0MB
    raw assembler used: 521.6kB
    -----------------------------
    Total:              57.4MB

Total memory allocated:
    GC allocated:            63.0MB (peak: 71.2MB)
       in arenas:            43.9MB
       rawmalloced:          22.7MB
       nursery:              3.0MB
    raw assembler allocated: 1.0MB
    -----------------------------
    Total:                   64.0MB

Here are the GC stats from GC-Complete when we are at *285MB*:

2019-03-27 01:42:41.751-0600 [-] main_thread(29621)log (async_worker_process.py:304): INFO_FLUSH: RSS: 285147136
2019-03-27 01:42:41.751-0600 [-] main_thread(29621)log (async_worker_process.py:308): DBG0: Total memory consumed:
    GC used:            57.5MB (peak: 69.6MB)
       in arenas:            39.9MB
       rawmalloced:          14.6MB
       nursery:              3.0MB
    raw assembler used: 1.5MB
    -----------------------------
    Total:              58.9MB

Total memory allocated:
    GC allocated:            63.1MB (peak: 71.2MB)
       in arenas:            43.9MB
       rawmalloced:          22.7MB
       nursery:              3.0MB
    raw assembler allocated: 2.0MB
    -----------------------------
    Total:                   65.1MB

How is this possible?

I am measuring RSS with:

    import os
    import psutil

    def get_rss_mem_usage():
        '''
        Get the RSS memory usage in bytes
        @return: memory size in bytes; -1 if error occurs
        '''
        try:
            process = psutil.Process(os.getpid())
            return process.get_memory_info().rss
        except Exception:
            return -1

and cross-referencing with "ps -orss -p <pid>"; the RSS values reported
are correct....

I cannot figure out where to go from here, as it appears that PyPy is
leaking this memory somehow, and I have no idea how to proceed. I end up
having memory problems and getting Memory Warnings for a process that
just loops and queries via pymongo.

Pymongo version is 3.7.1

This is:

Python 2.7.13 (ab0b9caf307db6592905a80b8faffd69b39005b8, Apr 30 2018, 08:21:35)
[PyPy 6.0.0 with GCC 7.2.0]
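A small probe can bracket exactly which step of the loop the jump is
attributable to. A sketch, assuming psutil is available (note that current
psutil spells the call memory_info(); get_memory_info() above is the older
name):

    import gc
    import os

    import psutil

    _proc = psutil.Process(os.getpid())
    _last = [0]

    def rss_probe(tag):
        # force a major collection first, so reported growth is memory
        # that is actually retained, not garbage not yet reclaimed
        gc.collect()
        rss = _proc.memory_info().rss
        print("%s RSS=%d delta=%+d" % (tag, rss, rss - _last[0]))
        _last[0] = rss

Calling rss_probe() before and after each step of the loop body narrows
the ~140 MB jump down to a single statement within a handful of
iterations.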
From anto.cuni at gmail.com  Thu Mar 28 11:55:49 2019
From: anto.cuni at gmail.com (Antonio Cuni)
Date: Thu, 28 Mar 2019 16:55:49 +0100
Subject: [pypy-dev] Continual Increase In Memory Utilization Using PyPy 6.0 (python 2.7) -- Help!?
In-Reply-To: 
References: 
Message-ID: 

Hi Robert,
are you using any package which relies on cpyext? I.e., modules written
in C and/or with Cython (cffi is fine). IIRC, at the moment PyPy doesn't
detect GC cycles which involve cpyext objects. So if you have a cycle
such as

    Py_foo -> C_bar -> Py_foo

(where Py_foo is a pure-python object and C_bar a cpyext object), the
objects will never be collected unless you break the cycle manually.

Other than that: have you tried running it with PyPy 7.0 and/or 7.1?

On Thu, Mar 28, 2019 at 8:35 AM Robert Whitcher wrote:
> [...]
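A toy illustration of the cycle described above and of breaking it by
hand; C_bar here is a plain-Python stand-in for a cpyext-backed object:

    class C_bar(object):
        """Stand-in for a cpyext object (really implemented in C)."""
        def __init__(self, owner):
            self.owner = owner      # reference back: completes the cycle

    class PyFoo(object):
        def __init__(self):
            self.bar = C_bar(owner=self)   # Py_foo -> C_bar -> Py_foo

        def close(self):
            # PyPy's GC does not detect cycles through cpyext objects,
            # so break the edge manually before dropping the last
            # reference to the PyFoo
            self.bar = None

With real cpyext objects, forgetting the close() call means neither
object is ever freed, which shows up as exactly this kind of slow RSS
growth.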
From robert.whitcher at rubrik.com  Thu Mar 28 13:55:57 2019
From: robert.whitcher at rubrik.com (Robert Whitcher)
Date: Thu, 28 Mar 2019 12:55:57 -0500
Subject: [pypy-dev] Continual Increase In Memory Utilization Using PyPy 6.0 (python 2.7) -- Help!?
In-Reply-To: 
References: 
Message-ID: 

Thanks Antonio...

Relative to cpyext... I am not sure. We are not using it directly, but
who knows what is being used by the incorporated modules (pymongo, etc.).
Trying to create a stripped-down test case... I filed a bug here with
some updates:
https://bitbucket.org/pypy/pypy/issues/2982/continual-increase-in-memory-utilization

It reproduces in pypy, and in pypy without the JIT, but *does not*
reproduce in CPython, so that is some indication there.

Once I have a stripped-down test case that is not tied to our entire
codebase, I will publish it in the bug and try other PyPy versions as
well.

Rob

On Thu, Mar 28, 2019 at 10:56 AM Antonio Cuni wrote:
> [...]
From renesd at gmail.com  Thu Mar 28 14:27:55 2019
From: renesd at gmail.com (René Dudfield)
Date: Thu, 28 Mar 2019 19:27:55 +0100
Subject: [pypy-dev] Continual Increase In Memory Utilization Using PyPy 6.0 (python 2.7) -- Help!?
In-Reply-To: 
References: 
Message-ID: 

Tried deleting the pymongo._cmessage module? It's a cpyext one, so it
seems like a good place to start. I think pymongo works without it.

A general bisection technique for finding a memory leak, which works in
11 runs or under... Run the code under coverage, so you have the lines
that are actually called. Now comment out half the code so that it still
runs. Is there a memory leak still? Look at the uncommented lines. No
leak? Look at the commented lines. Keep doing it until you spot the
problem, or you only have 1 line left.

A common resource issue with pypy is how it does cleanup differently.
Some code relies on destructors and reference counting to clean things
up. Maybe check things like the number of open sockets and files?

objgraph is reeeeeealy good IMHO (haven't tried it with pypy, not sure it
works): https://mg.pov.lt/objgraph/

On Thursday, March 28, 2019, Robert Whitcher wrote:
> [...]
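If objgraph does run under PyPy, its growth report is the quickest first
check. A minimal sketch of wiring it into the loop (show_growth prints
the object types whose instance counts grew since the previous call):

    import gc

    import objgraph

    def probe():
        # collect first, so the report shows retained objects rather
        # than garbage that simply has not been reclaimed yet
        gc.collect()
        objgraph.show_growth(limit=10)

Calling probe() once per loop iteration makes the leaking types stand
out: whatever keeps growing iteration after iteration is what the loop is
retaining.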