[Python-Dev] Re: Speed of test_sort.py

Tim Peters tim.one@comcast.net
Thu, 01 Aug 2002 15:05:16 -0400


[Guido pins the blame on PyFrame_New -- cool!]
> ...
> Suggestion: doesn't test_longexp create some frames with a very large
> number of local variables?  Then PyFrame_New could spend a lot of time
> in this loop:
>
> 	while (--extras >= 0)
> 		f->f_localsplus[extras] = NULL;

In my poor man's profiling <wink>, I ran the self-contained test case posted
earlier under the debugger with REPS=120000; since the "sort" part then takes
20 seconds, there was lots of opportunity to break at random times (the
MSVC debugger lets you do that, i.e. click a button that means "I don't care
where you are, break *now*").  It was always in that loop when it broke, and
extras always started life at 120000 before that loop.  Yikes!

> There's a free list of frames, and PyFrame_New picks the first frame
> on the free list.  It grows the space for locals if necessary, but it
> never shrinks it.
>
> Back to Tim -- does this make sense?  Should we attempt to fix it?

I can't make sufficient time to think about this, but I suspect a principled
fix is simply to delete these two lines:

		else
			extras = f->ob_size;

The number of extras the code object actually needs was already computed
correctly earlier, via

	extras = code->co_stacksize + code->co_nlocals + ncells + nfrees;

and there's no point clearing any more than that original value.  IOW, I
don't think it hurts to have a big old frame left on the freelist; the pain
comes from clearing out more slots in it than the *current* code object
uses.
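
For anyone who wants the shape of the change without digging into
frameobject.c, here's a small self-contained model of the free-list
behaviour as I understand it.  It is not the real CPython code -- the names
(Frame, new_frame, release_frame, allocated, slots) are made up for the
sketch, with "allocated" standing in for f->ob_size and "needed" for the
extras value computed from co_stacksize + co_nlocals + ncells + nfrees:

    /*
     * Not the real frameobject.c -- just a self-contained model of the
     * frame free list, using made-up names.
     */
    #include <stdio.h>
    #include <stdlib.h>

    typedef struct Frame {
        int allocated;          /* slots this frame was allocated with */
        void **slots;           /* stand-in for f_localsplus */
        struct Frame *next;     /* free-list link (f_back in the real code) */
    } Frame;

    static Frame *free_list = NULL;

    static Frame *new_frame(int needed)
    {
        Frame *f;
        int extras = needed;

        if (free_list == NULL) {
            f = (Frame *)malloc(sizeof(Frame));
            f->slots = (void **)malloc(needed * sizeof(void *));
            f->allocated = needed;
        }
        else {
            f = free_list;
            free_list = f->next;
            if (f->allocated < needed) {
                f->slots = (void **)realloc(f->slots,
                                            needed * sizeof(void *));
                f->allocated = needed;
            }
            /* The two lines proposed for deletion correspond to:
             *     else
             *         extras = f->allocated;
             * With them, a reused frame clears *all* of its slots; without
             * them, only the slots the current code object needs. */
        }
        while (--extras >= 0)   /* the loop the profiler kept landing in */
            f->slots[extras] = NULL;
        return f;
    }

    static void release_frame(Frame *f)
    {
        f->next = free_list;    /* back on the list; it never shrinks */
        free_list = f;
    }

    int main(void)
    {
        int i;

        /* One huge frame (a la test_longexp) lands on the free list ... */
        release_frame(new_frame(120000));

        /* ... then every tiny frame (a la a sort comparison call) reuses
         * it.  With the "else" lines, each reuse clears 120000 slots;
         * with the fix, each reuse clears only 5. */
        for (i = 0; i < 1000000; i++)
            release_frame(new_frame(5));
        printf("done\n");
        return 0;
    }

The point of the model is the same as above: the big frame can sit on the
free list as long as it likes, and the cost only shows up if reuse clears
every allocated slot instead of just the ones the current code object needs.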

A quick test of this showed it cured the test_longexp + test_sort speed
problem, and the regression suite ran without problems.

If someone understands this code well enough to finish thinking about
whether that's a correct thing to do, please do!