Is this a bug? Python intermittently stops dead for seconds

Steve Holden steve at holdenweb.com
Sun Oct 1 10:48:12 EDT 2006


charlie strauss wrote:
> Steve and other good folks who replied:
> 
> I want to clarify that, on my computer, the first instance of the gap occurs well before the memory is filled (at about 20% of physical RAM).  Additionally, the process monitor shows no page faults.
> 
> Yes, if you let the demo program as written run to completion (all 20,000 iterations), then on many computers it would not be surprising if the machine eventually went into forced page swapping at some point.  That would be expected and is not the issue I am concerned with.
> 
> In my case it starts glitching at around iteration 1000.
> 
> 1000(bars) x 100(foos)x(10 integers in array)
> 
> is nominally 
> 100,000 class objects and
> 1,000,000 array elements.
> 
> (Note that the array is filled as [1]*10, so there is actually only one "integer", but 10 array elements referring to it, per foo class.)
> 
> 
> However, Steve may have put his finger on the reason why the duration grows with time.  Here is my current hypothesis.  The design of the program does not have any points where significant amounts of memory are released: all objects have held references till the end.  But perhaps there are some implicitly created objects of the same size created along the way?  For example, when I write
> 
> me.memory = [1]*nfoo
> 
> perhaps internally, Python is allocating an array of size foo and then __copying__ it into me.memory?  Since there is no reference to the intermediate, it would then be marked for future garbage collection.
> 
> If that were true then the memory would have interleaved entities of things to GC and things with references held in me.memory.
> 
> Then to remove these would require GC to scan the entire set of existing objects, which is growing.
> 
> Turning off GC would prevent this.
> 
> 
> In any case, I don't think what I'm doing is very unusual.  The original program that triggered my investigation of the bug was doing this:
> 
> foo was an election ballot holding 10 outcomes, and bar was a set of 100 ballots from 100 voting machines, and the total array was holding the ballot sets from a few thousand voting machines.  
> 
> Almost any inventory program is likely to have a similar set of nested arrays, so it hardly seems unusual.
> 
> 
> 
> 
> 
I think the point you are missing is that the garbage collector is 
triggered from time to time to ensure that no cyclical garbage remains 
uncollected, IIRC. The more data that's been allocated, the longer it 
takes the collector to scan all of memory to do its job.
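
To illustrate the point above, here is a minimal sketch that times a full collection while a growing set of objects is held alive. It assumes a modern CPython 3 interpreter rather than the version discussed in this thread, and the names Foo, time_full_collection and measure are made up for the example. Actual timings depend on the machine and interpreter version, so none are shown.

```python
import gc
import time


class Foo:
    """Hypothetical stand-in for the 'foo' objects in the demo program."""

    def __init__(self, n):
        self.memory = [1] * n


def time_full_collection():
    """Return the seconds taken by one explicit full collection."""
    start = time.perf_counter()
    gc.collect()
    return time.perf_counter() - start


def measure(sizes=(10_000, 100_000)):
    """Hold ever more live objects and time a full collection at each size."""
    held = []  # every object stays referenced, so nothing is ever freed
    results = []
    for target in sizes:
        while len(held) < target:
            held.append(Foo(10))
        results.append((len(held), time_full_collection()))
    return results


if __name__ == "__main__":
    # Allocation-count thresholds that trigger automatic collections.
    print("thresholds:", gc.get_threshold())
    for count, seconds in measure():
        print(count, "live objects:", round(seconds, 4), "s per full collection")
```

Nothing in this sketch is garbage, yet the collector still has to visit every tracked container to establish that, which is why the pause would be expected to grow with the amount of live data.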

If you can find a way to avoid the behaviour I'm sure the development 
team would be interested to hear it :-)

I think you'll find that most programs that eat through memory in this 
way will exhibit pretty much the same behaviour. If you *know* your 
program isn't creating data cycles, just turn the GC off and rely on 
reference counting. That won't save you from paging when you eventually 
exhaust physical memory. Nothing can.
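
A minimal sketch of that suggestion, assuming the build phase creates no reference cycles. The function name build_without_gc and its parameters are invented for illustration; only gc.isenabled(), gc.disable() and gc.enable() come from the standard library.

```python
import gc


def build_without_gc(nbars=1000, nfoos=100, nints=10):
    """Build the nested lists with the cyclic collector switched off.

    Reference counting still frees acyclic garbage immediately; only
    the periodic cycle detection is suspended during the build.
    """
    was_enabled = gc.isenabled()
    gc.disable()
    try:
        return [[[1] * nints for _ in range(nfoos)] for _ in range(nbars)]
    finally:
        # Restore the previous state even if the build raises.
        if was_enabled:
            gc.enable()


if __name__ == "__main__":
    data = build_without_gc()
    print(len(data), "bars,", len(data[0]), "foos per bar")
```

Restoring the collector afterwards keeps the rest of the program protected against cycles created elsewhere.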

regards
  Steve
-- 
Steve Holden       +44 150 684 7255  +1 800 494 3119
Holden Web LLC/Ltd          http://www.holdenweb.com
Skype: holdenweb       http://holdenweb.blogspot.com
Recent Ramblings     http://del.icio.us/steve.holden



