Self healthcheck

Wed Jan 22 03:18:16 EST 2014

On Wednesday, January 22, 2014 5:08:25 AM UTC+2, Chris Angelico wrote:
> I assume you're talking about pure Python code, running under CPython.
> (If you're writing an extension module, say in C, there are completely
> different ways to detect reference leaks; and other Pythons will
> behave slightly differently.) There's no way to detect truly
> unreferenced objects, because they simply won't exist - not after a
> garbage collection run, and usually sooner than that. But if you want
> to find objects that you're somehow not using and yet still have live
> references to, you'll need to define "using" in a way that makes
> sense. Generally there aren't many ways that that can happen, so those
> few places are candidates for a weak reference system (maybe you map a
> name to the "master object" representing that thing, and you can
> recreate the master object from the disk, so when nothing else is
> referring to it, you can happily flush it out - that mapping is a good
> candidate for weak references).
> 
> But for most programs, don't bother. CPython is pretty good at keeping
> track of its own references, so chances are you don't need to - and if
> you're seeing the process's memory usage going up, it's entirely
> possible you can neither detect nor correct the problem in Python code
> (eg heap fragmentation).
> ChrisA

Hi Chris

Yes, the question was about CPython. I am not after leaks in CPython
itself, though detecting those would be good, but after my own mistakes
that lead to data accumulating in mutable structures.
There will be a few processes running Python code standalone and
communicating across servers. Every activity will be spread over time,
so I have to keep a persistent record of each activity and remove it
later, when the activity is finished. In addition to checking objects
directly, I would also like to analyze app health indirectly by checking
the amount of data the app holds. Let's say there are steadily 100
activities per second and the typical object count is 1000 (in abstract
units, averaged over a long enough time window). I would then check
throughput against memory to see whether my program is healthy in terms
of leaking resources, and generate a log entry if it is not.
The input to such a module will be traffic events (whatever events are
significant to object creation).
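
To make the idea concrete, here is a rough sketch of the check I have in
mind. All names (HealthMonitor, expected_ratio, tolerance, min_rate) and
the numbers are my own placeholders, not an existing library. It counts
traffic events in a sliding window and compares the number of records
currently held against what that throughput would normally justify.

```python
import logging
import time
from collections import deque

log = logging.getLogger("healthcheck")  # hypothetical logger name


class HealthMonitor:
    """Compare the number of held records against recent throughput."""

    def __init__(self, expected_ratio=10.0, tolerance=2.0,
                 window=60.0, min_rate=1.0):
        # expected_ratio: typical held objects per (event per second),
        # e.g. 1000 objects at 100 events/s gives 10.0 (assumed figures).
        self.expected_ratio = expected_ratio
        self.tolerance = tolerance    # allowed multiple of the typical count
        self.window = window          # sliding window length in seconds
        self.min_rate = min_rate      # floor so idle periods do not alarm
        self._events = deque()        # timestamps of recent traffic events

    def _trim(self, now):
        # Drop timestamps that have fallen out of the window.
        cutoff = now - self.window
        while self._events and self._events[0] <= cutoff:
            self._events.popleft()

    def record_event(self, now=None):
        now = time.time() if now is None else now
        self._events.append(now)
        self._trim(now)

    def throughput(self, now=None):
        """Events per second averaged over the window."""
        now = time.time() if now is None else now
        self._trim(now)
        return len(self._events) / self.window

    def is_healthy(self, held_count, now=None):
        rate = max(self.throughput(now), self.min_rate)
        return held_count <= rate * self.expected_ratio * self.tolerance

    def check(self, held_count, now=None):
        """Log a warning when more is held than the traffic explains."""
        healthy = self.is_healthy(held_count, now)
        if not healthy:
            log.warning("possible leak: %d records held at %.1f events/s",
                        held_count, self.throughput(now))
        return healthy
```

The application would call record_event() for every significant event
and check(len(my_activity_table)) periodically. Note that right after
startup the window is not yet full, so the measured rate is too low and
the check may warn spuriously until one full window has passed.
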
So I am looking for a proper way to detect the memory held by a CPython
app. It would also be good if the memory could be broken down by
object/class name, so that the one to blame can be identified and
reported.
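
The closest thing I have found so far is counting live objects per class
with gc.get_objects() and comparing two snapshots. This is only a sketch
under assumptions: the helper names count_by_type and growth are mine,
and gc.get_objects() lists only objects tracked by the garbage collector
(containers and most class instances), so plain ints and strings are not
counted and this gives object counts, not bytes.

```python
import gc
from collections import Counter


def count_by_type():
    """Count GC-tracked objects, keyed by 'module.ClassName'."""
    counts = Counter()
    for obj in gc.get_objects():
        t = type(obj)
        counts["%s.%s" % (t.__module__, t.__name__)] += 1
    return counts


def growth(before, after, top=10):
    """Return (name, increase) pairs for the types that grew the most."""
    diff = Counter(after)
    diff.subtract(before)
    return [(name, n) for name, n in diff.most_common(top) if n > 0]
```

Taking one snapshot at a quiet moment and another later, growth(before,
after) would name the classes whose instance counts keep rising, which
is the "blamed one" I would like to report.
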

Thanks 

Asaf