[Python-Dev] Optimization targets
Michael Hudson
mwh at python.net
Thu Apr 15 07:05:37 EDT 2004
Mike Pall <mikepy-0404 at mike.de> writes:
> Hi,
>
> Raymond wrote:
>> It looks like the best bet is to try to speed up the code without
>> changing the multiplier.
>
> Indeed. And while you are at it, there are other optimizations that
> seem more promising:
>
> I compiled a recent CVS Python with profiling and here is a list of the
> top CPU hogs (on a Pentium III, your mileage may vary):
I played this game recently, but on a G3 iBook, whose processor is
probably much more boring from a scheduling point of view.
> pystone:
>
>  CPU%  Function Name
> ----------------------------
> 55.44  eval_frame
>  7.30  lookdict_string
>  4.34  PyFrame_New
>  3.73  frame_dealloc
>  1.73  vgetargs1
>  1.65  PyDict_SetItem
>  1.42  string_richcompare
>  1.15  PyObject_GC_UnTrack
>  1.11  PyObject_RichCompare
>  1.08  PyInt_FromLong
>  1.08  tupledealloc
>  1.04  insertdict
I saw similar results to this, tho' I don't remember lookdict_string
being so high on the list.
> parrotbench:
>
>  CPU%  Function Name
> ----------------------------
> 23.65  eval_frame
>  8.68  l_divmod
>  4.43  lookdict_string
>  2.95  k_mul
>  2.27  PyType_IsSubtype
>  2.23  PyObject_Malloc
>  2.09  x_add
>  2.05  PyObject_Free
>  2.05  tupledealloc
>
> Arguably parrotbench is a bit unrepresentative here. And beware: due to
> automatic inlining of static functions the real offender may be hidden
> (x_divmod is the hog, not l_divmod).
Probably a fine candidate function for rewriting in assembly too...
> Anyway, this just confirms that the most important optimization targets are:
> 1. eval_frame
> 2. string keyed dictionaries
> 3. frame handling
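A rough tally of the pystone figures quoted above is consistent with
that ranking. The sketch below only sums the listed percentages per
category. The grouping of functions into categories is an assumption
made for illustration, not something stated in the profile:
PyDict_SetItem and insertdict serve all dictionaries, not only
string-keyed ones.

```python
# pystone CPU% figures copied from the profile quoted above.
PYSTONE = {
    "eval_frame": 55.44,
    "lookdict_string": 7.30,
    "PyFrame_New": 4.34,
    "frame_dealloc": 3.73,
    "vgetargs1": 1.73,
    "PyDict_SetItem": 1.65,
    "string_richcompare": 1.42,
    "PyObject_GC_UnTrack": 1.15,
    "PyObject_RichCompare": 1.11,
    "PyInt_FromLong": 1.08,
    "tupledealloc": 1.08,
    "insertdict": 1.04,
}

# ASSUMPTION: this mapping of functions to categories is illustrative only.
GROUPS = {
    "eval_frame": ["eval_frame"],
    "dictionaries": ["lookdict_string", "PyDict_SetItem", "insertdict"],
    "frame handling": ["PyFrame_New", "frame_dealloc"],
}


def group_totals(profile, groups):
    """Sum the CPU% of every function assigned to each category."""
    return {
        name: round(sum(profile[func] for func in funcs), 2)
        for name, funcs in groups.items()
    }


totals = group_totals(PYSTONE, GROUPS)
# Print categories from largest to smallest share of CPU time.
for name, pct in sorted(totals.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{pct:6.2f}  {name}")
```

Under that grouping the interpreter loop accounts for 55.44%, the
dictionary functions for 9.99% and frame handling for 8.07% of the
pystone run.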
[...]
> But GCC has more to offer: read the man page entries for -fprofile-arcs
> and -fbranch-probabilities. Here is a short recipe:
I tried this on the iBook and found that it made a small difference
*on the program you ran to generate the profile data* (e.g. pystone),
but made naff all difference for anything else. I can well believe
that it makes more difference on a P4 or G5.
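One way to check this kind of claim is to time both interpreter builds
on the training workload and on an unrelated one. The following is a
minimal sketch of such a harness, not the method used for the numbers
in this thread. The build paths and workload commands are hypothetical
placeholders; the demo at the bottom uses the running interpreter for
both "builds" just so that it executes anywhere.

```python
import subprocess
import sys
import time


def best_time(cmd, repeats=3):
    """Return the best wall-clock time in seconds over `repeats` runs of cmd."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
        best = min(best, time.perf_counter() - start)
    return best


def percent_faster(baseline, candidate):
    """Positive when the candidate run took less time than the baseline."""
    return 100.0 * (baseline - candidate) / baseline


def compare(baseline_python, tuned_python, workloads, repeats=3):
    """Time each workload (a list of interpreter arguments) under both builds."""
    results = {}
    for name, args in workloads.items():
        base = best_time([baseline_python] + args, repeats)
        tuned = best_time([tuned_python] + args, repeats)
        results[name] = percent_faster(base, tuned)
    return results


if __name__ == "__main__":
    # HYPOTHETICAL setup: replace sys.executable with the paths of the plain
    # and the profile-tuned interpreter, and the "-c" snippets with the
    # training benchmark and some unrelated program.
    workloads = {
        "training workload": ["-c", "sum(range(10**6))"],
        "other workload": ["-c", "sorted(str(i) for i in range(10**5))"],
    }
    for name, gain in compare(sys.executable, sys.executable, workloads).items():
        print(f"{name}: {gain:+.1f}% faster")
```

Taking the best of several runs reduces noise from other processes. A
gain that shows up only for the training workload would match what I
saw on the iBook.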
[snippety]
> I bet there are more, but I'm running out of time right now --
> sorry.
I certainly don't want to discourage people from optimizing Python's
current implementation, but...
Some months ago (just after I played with -fprofile-arcs, most likely)
I wrote a rant about improving Python's performance, which I've
finally got around to uploading:
http://starship.python.net/crew/mwh/hacks/speeding-python.html
Tell me what you think!
Cheers,
mwh
--
Hmmm... it's Sunday afternoon: I could do my work, or I could do a
Fourier analysis of my computer's fan noise.
-- Amit Muthu, ucam.chat (from Owen Dunn's summary of the year)