[Patches] [ python-Patches-1022910 ] Conserve memory with list.pop()

SourceForge.net noreply at sourceforge.net
Sun Sep 12 21:57:45 CEST 2004


Patches item #1022910, was opened at 2004-09-06 02:16
Message generated for change (Comment added) made by rhettinger
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=1022910&group_id=5470

Category: Core (C code)
Group: Python 2.4
>Status: Closed
Resolution: None
Priority: 6
Submitted By: Raymond Hettinger (rhettinger)
Assigned to: Raymond Hettinger (rhettinger)
Summary: Conserve memory with list.pop()

Initial Comment:
The current list resizing scheme only downsizes when
more than 16 elements are removed in a single step: 
del a[100:120].

When popping elements off of a list one at a time, the
current resizing approach never shrinks.

This patch makes it shrink whenever more than half of
the  space is unused.

There may be a better approach.  This one is simple and
avoids thrashing in stack applications that hover up
and down around a nearly steady state.

----------------------------------------------------------------------

>Comment By: Raymond Hettinger (rhettinger)
Date: 2004-09-12 14:57

Message:
Logged In: YES 
user_id=80475

Revised and applies as Objects/listobject.c  2.223

* Changed the var name as requested.  Great idea.

* Changed the second "if" to only check for zero  (so that
lists can shrink all the way down).  As requested, this
change leaves non-zero length lists in an over-allocated state.

* Restated the overallocation formula so that it depends
only on the requested size instead of both requested and
previous size.  Doesn't change the growth schedule but does
make the algorithm easier to analyze.

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2004-09-11 23:28

Message:
Logged In: YES 
user_id=31435

+1 on the idea, and maybe <wink> on the code.  Reading this 
routine always gave me a headache, and I finally figured out 
why:  _new_size should be name new_allocated.  It's the 
new value of self->allocated, after all, and has much less to 
do with newsize (which _new_size looks most like) or ob_size.

If I rename the code like that, I find it much easier to follow.  
Then the tail end of the patched code looks like

new_allocated = (newsize >> 3) + (self->ob_size < 8 ? 3 : 6) 
+ newsize;
if (newsize < (allocated >> 1))
	new_allocated = newsize;

Then I ask "why?" about the last "if" clause.  That is, why 
change new_allocated away from the value it was given on 
the preceding line?  IOW, if that was a good value for 
new_allocated when the list was growing, why not also when 
the list is shrinking?  If we reset new_allocated to *exactly* 
the smaller requested new size, then if they add an element 
next time we have to endure a realloc right away again.  I 
think it's good to leave some room for growth.  If it turns out 
the list isn't yo-yo'ing, but going straight down to 0, that's 
fine, we'll just shrink it at a slightly slower pace then.

For example, suppose the list has room for 500 slots, 
currently holds 250, and the last element is popped.  
Unconditionally accepting the first new_allocated computed 
would leave it with 249 + 3 + (249 >> 3) = 283 allocated 
slots instead of with 249.  The relative difference is small, but 
the former is friendlier if the next operation is an append, and 
is faster to compute (skips the "if").

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=1022910&group_id=5470


More information about the Patches mailing list