python gc performance in large apps

Mon Oct 24 13:46:40 EDT 2005

A couple of strategic gc.collect() calls can be useful.  You can also tweak
how often the garbage collector runs by changing the thresholds in the gc
module (see gc.set_threshold()).
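
As a rough sketch of both suggestions (not code from the application under
discussion): the Node class and the helper names below are made up for
illustration, and the snippet is written against a current CPython rather
than the 2.3.5 mentioned in the quoted message. It forces a collection at a
point you choose and raises the generation-0 threshold so automatic
collections run less often. The gc functions used (collect, get_threshold,
set_threshold) are the standard ones.

```python
import gc


class Node:
    """Hypothetical object that refers to itself, forming a reference cycle."""

    def __init__(self):
        self.me = self  # reference counting alone can never free this


def make_cycles(count):
    """Create `count` unreachable cycles that only the cyclic gc can reclaim."""
    for _ in range(count):
        Node()


def collect_now():
    """Force a full collection; returns the number of unreachable objects found."""
    return gc.collect()


def raise_gen0_threshold(new_threshold0):
    """Make automatic collections rarer by raising the generation-0 threshold.

    Returns the previous thresholds so the caller can restore them later.
    """
    previous = gc.get_threshold()
    gc.set_threshold(new_threshold0, previous[1], previous[2])
    return previous
```

A long-running server could call collect_now() at quiet moments (for example
from a periodic timer or after a burst of work) and log the return value to
see whether cycles are piling up. Whether that helps with growth like that
described below is something to measure, not assume.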

-Chris
On Fri, Oct 21, 2005 at 04:13:09PM -0400, Robby Dermody wrote:
> 
> Hey guys (thus begins a book of a post :),
> 
> I'm in the process of writing a commercial VoIP call monitoring and 
> recording application suite in python and pyrex. Basically, this 
> software sits in a VoIP callcenter-type environment (complete with agent 
> phones and VoIP servers), sniffs voice data off of the network, and 
> allows users to listen into calls. It can record calls as well. The 
> project is about a year and 3 months in the making and lately the 
> codebase has stabilized enough to where it can be used by some of our 
> clients. The entire project has about 37,000 lines of python and pyrex 
> code (along with 1-2K lines of unrelated java code).
> 
> Now, some disjointed rambling about the architecture of this software. 
> This software has two long-running server-type components. One 
> component, the "director" application, is written in pure python and 
> makes use of the twisted, nevow, and kinterbasdb libraries (which I 
> realize link to some C extensions). The other component, the 
> "harvester", is a mixture of python and pyrex, and makes use of the 
> twisted library, along with using the C libs libpcap and glib on the 
> pyrex end. Basically, the director is the "master" component. A single 
> director process interacts with users of the system through a web and/or 
> pygtk client application interface and can coordinate 1 to n harvesters 
> spread about the world. The harvester is the "heavy lifter" component 
> that sniffs the network traffic and sifts out the voice and signalling 
> data. It then notifies the director of call status changes, and can 
> provide users of the system access to the data. It records the data to 
> disk as well. The scalability of this thing is really cool: given a 
> single director sitting somewhere coordinating the list of agents, 
> multiple harvesters can be placed anywhere there is voice traffic. A user 
> who logs into the director can end up seeing the activity of all of 
> these separate voice networks presented like a single giant mesh.
> 
> Overall, I have been very pleased with python and the 3rd party 
> libraries that I use (twisted, nevow, kinterbasdb and pygtk). It is a 
> joy to program with, and I think the python community has done a fine 
> job. However, as I have been running the software lately and profiling 
> it, the one and only Big Problem I have seen is its memory usage. 
> Ideally, the server application(s) should be able to 
> run indefinitely, but from the results I'm seeing I will end up 
> exhausting the memory on a 2 GB machine in 2 to 3 days of heavy load.
> 
> Now normally I would not raise an issue like this on this list, but 
> based on the conversations held on this list lately, and the work done 
> by Evan Jones (http://evanjones.ca/python-memory.html), I am led to 
> believe that this memory usage -- while partially due to some probable 
> leaks in my program -- is largely due to the current python gc. I have 
> some graphs I made to show the extent of this memory usage growth:
> 
> http://public.robbyd.fastmail.fm/iq-graph1.gif
> 
> http://public.robbyd.fastmail.fm/iq-graph-director-rss.gif
> 
> http://public.robbyd.fastmail.fm/iq-graph-harv-rss.gif
> 
> The preceding three diagrams are the result of running the 1 director 
> process and 1 harvester process on the same machine for about 48 hours. 
> This is the most basic configuration of this software. I was running 
> this application through /usr/bin/python (CPython) on a Debian 'testing' 
> box running Linux 2.4 with 2GB of memory and Python version 2.3.5. 
> During that time, I gathered the resident and virtual memory size of 
> each component at 120 second intervals. I then imported this data into 
> MINITAB and did some plots. The first one is a graph of the resident 
> (RSS) and virtual memory usage of the two applications. The second one 
> is a zoomed-in graph of the director's resident memory usage (complete 
> with a best-fit quadratic), and the third one is a zoomed-in graph of the 
> harvester's resident memory usage.
> 
> To give you an idea of the network load these apps were undergoing 
> during this sampling time, by the time 48 hours had passed, the 
> harvester had gathered and parsed about 900 million packets. During the 
> day there will be 50-70 agents talking. This number goes to 10-30 at night.
> 
> In the diagrams above, one can see the night-day separation clearly. At 
> night, the memory usage growth seemed to all but stop, but with the 
> increased call volume of the day, it started shooting off again. When I 
> first started gathering this data, I was hoping for a logarithmic curve, 
> but at least after 48 hours, it looks like the usage increase is almost 
> linear. (Although logarithmic may still be the case after it exceeds a 
> gig or two of used memory. :) I'm not sure if this is something that I 
> should expect from the current gc, and when it would stop.
> 
> Now, as I stated above, I am certain that at least some of this 
> increased memory usage is due to either un-collectable objects in the 
> python code, or memory leaks in the pyrex code (where I make some use of 
> malloc/free). I am working on finding and removing these issues, but 
> from what I've seen with the help of gc UNCOLLECTABLE traces, there are 
> not many un-collectable reference issues at least. Yes, there are some, 
> but definitely not enough to justify growth like I am seeing. The pyrex 
> side should not be leaking too much; I'm very good about freeing what I 
> allocate in pyrex/C land. I will be running it linked against a memory 
> leak finding library in the next few days. Past the code reviews I've done, 
> what makes me think that I don't have any *wild* leaks going on at least 
> with the pyrex code is that I am seeing the same type of growth patterns 
> in both apps, and I don't use any pyrex with the director. Yes, the 
> harvester is consuming much more memory, but it also does the majority 
> of the heavy lifting.
> 
> I am alright with the app not freeing all the memory it can between high 
> and low activity times, but what puzzles me is how the memory usage just 
> keeps on growing and growing. Will it ever stop?
> 
> What I would like to know is whether others on this list have had similar 
> problems with python's gc in long running, larger python applications. 
> Am I crazy or is this a real problem with python's gc itself? If it's a 
> python gc issue, then it's my opinion that we will need to enhance the 
> gc before python can really gain leverage as a language suitable for 
> "enterprise-class" applications. I have surprised many other programmers 
> that I'm writing an application like this in python/pyrex that works 
> just as well as, and even more efficiently than, the C/C++/Java competitors. 
> The only thing I have left to show is that the app lasts as long between 
> restarts. ;)
> 
> 
> Robby