unit-profiling, similar to unit-testing

Roy Smith roy at panix.com
Thu Nov 17 09:03:15 EST 2011


In article <kkuep8-nqd.ln1 at satorlaser.homedns.org>,
 Ulrich Eckhardt <ulrich.eckhardt at dominolaser.com> wrote:

> Yes, this is surely something that is necessary, in particular since 
> there are no clear success/failure outputs like for unit tests and they 
> require a human to interpret them.

As much as possible, you want to automate things so no human 
intervention is required.

For example, let's say you have a test which calls foo() and times how 
long it takes.  You've already mentioned that you run it N times and 
compute some basic (min, max, avg, sd) stats.  So far, so good.
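
A minimal sketch of that first step, using only the standard library.
The body of foo() here is a made-up stand-in for whatever you are
actually testing, and time_runs() and summarize() are names I picked
for illustration:

```python
import statistics
import time


def foo():
    # Hypothetical stand-in for the code under test.
    sum(i * i for i in range(10_000))


def time_runs(func, n=20):
    """Call func() n times; return per-call wall-clock times in seconds."""
    samples = []
    for _ in range(n):
        start = time.perf_counter()
        func()
        samples.append(time.perf_counter() - start)
    return samples


def summarize(samples):
    """Return the basic stats mentioned above: min, max, avg, sd."""
    return {
        "min": min(samples),
        "max": max(samples),
        "avg": statistics.mean(samples),
        "sd": statistics.stdev(samples),  # sample sd, needs >= 2 samples
    }


if __name__ == "__main__":
    print(summarize(time_runs(foo)))
```

Saving each day's summary somewhere (a CSV file would do) gives you the
historical data the later steps need.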

The next step is to do some kind of regression against past results.  
Once you've got a bunch of historical data, it should be possible to 
look at today's numbers and detect any significant change in performance.

Much as I loathe the bureaucracy and religious fervor which has grown up 
around Six Sigma, it does have some good tools.  You might want to look 
into control charts (http://en.wikipedia.org/wiki/Control_chart).  You 
think you've got the test environment under control, do you?  Try 
plotting a month's worth of run times for a particular test on a control 
chart and see what it shows.
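
As a rough illustration, here is a simplified control-chart style check.
It assumes you have a list of baseline run times you trust, and it puts
the limits at the baseline mean plus or minus 3 sample standard
deviations.  Real control charts often estimate sigma differently and
add further run rules, so treat this as a sketch, not the Six Sigma
recipe.  The function names are mine:

```python
import statistics


def control_limits(baseline, k=3.0):
    """Return (lower, center, upper) limits from baseline run times."""
    center = statistics.mean(baseline)
    sd = statistics.stdev(baseline)
    return center - k * sd, center, center + k * sd


def out_of_control(baseline, recent, k=3.0):
    """Return (index, time) pairs from recent that fall outside the limits."""
    lower, _, upper = control_limits(baseline, k)
    return [(i, t) for i, t in enumerate(recent) if t < lower or t > upper]


if __name__ == "__main__":
    baseline = [1.00, 1.02, 0.98, 1.01, 0.99]  # made-up numbers
    recent = [1.01, 1.20, 0.97]
    print(control_limits(baseline))
    print(out_of_control(baseline, recent))
```

If points land outside the limits on days when nothing in the code
changed, the test environment is not as controlled as you thought.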

Assuming your process really is under control, I would write scripts 
that did the following kinds of analysis:

1) For a given test, do a linear regression of run time vs date.  If the 
line has any significant positive slope, you want to investigate why.

2) You already mentioned, "I would even wonder if you can't verify the 
behaviour against an expected Big-O complexity somehow".  Of course you 
can.  Run your test a bunch of times with different input sizes.  I 
would try something like a 1-2-5 progression over several decades (i.e. 
input sizes of 10, 20, 50, 100, 200, 500, 1000, etc.).  You will have to 
figure out what an appropriate range is, and how to generate useful 
input sets.  Now, fit your performance numbers to curves of various 
shapes and see what correlation coefficient you get for each.
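
Both kinds of analysis can be sketched in plain Python.  Everything
below is illustrative: the function names, the set of candidate shapes,
and the idea of treating a t statistic above about 2 as "significant"
are my assumptions (a rule of thumb; a real p-value needs the t
distribution, e.g. from a stats package).

```python
import math


def slope_and_t(days, times):
    """Least-squares slope of run time vs day number, and its t statistic.

    Needs at least 3 points and days that are not all equal.
    """
    n = len(days)
    mx, my = sum(days) / n, sum(times) / n
    sxx = sum((x - mx) ** 2 for x in days)
    slope = sum((x - mx) * (y - my) for x, y in zip(days, times)) / sxx
    intercept = my - slope * mx
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(days, times))
    std_err = math.sqrt(sse / (n - 2) / sxx)
    if std_err == 0:
        # Perfect fit: no noise to compare the slope against.
        t = math.copysign(math.inf, slope) if slope else 0.0
    else:
        t = slope / std_err
    return slope, t


def pearson_r(xs, ys):
    """Pearson correlation coefficient; inputs must not be constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Candidate growth curves to compare against (an arbitrary selection).
SHAPES = {
    "O(log n)": lambda n: math.log(n),
    "O(n)": lambda n: float(n),
    "O(n log n)": lambda n: n * math.log(n),
    "O(n^2)": lambda n: float(n) ** 2,
}


def rank_shapes(sizes, times):
    """Return (shape name, r) pairs, best correlation first."""
    scores = {
        name: pearson_r([f(n) for n in sizes], times)
        for name, f in SHAPES.items()
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    # Made-up data: run time creeping up by about 0.1 s per day.
    print(slope_and_t([0, 1, 2, 3, 4, 5],
                      [1.00, 1.12, 1.19, 1.31, 1.38, 1.52]))
    # Made-up data: quadratic growth over a 1-2-5 progression.
    sizes = [10, 20, 50, 100, 200, 500, 1000]
    print(rank_shapes(sizes, [0.001 + 2e-6 * n * n for n in sizes]))
```

Over a narrow range of input sizes, neighbouring shapes such as O(n) and
O(n log n) correlate almost equally well, which is one more reason to
span several decades.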

All that being said, in my experience, nothing beats plotting your data 
and looking at it.


