[Speed] Median +- MAD or Mean +- std dev?

Victor Stinner victor.stinner at gmail.com
Wed Mar 15 14:11:15 EDT 2017


2017-03-15 18:11 GMT+01:00 Antoine Pitrou <solipsis at pitrou.net>:
> I would say keep it simple.  mean/stddev is informative enough, no need
> to add or maintain options of dubious utility.

OK. I added a message suggesting the use of perf stats to analyze results.

Example of the warnings shown for a benchmark result considered unstable
(Python startup time, measured with the new bench_command() function):
---
$ python3 -m perf show startup1.json
WARNING: the benchmark result may be unstable
* the standard deviation (6.08 ms) is 16% of the mean (39.1 ms)
* the minimum (23.6 ms) is 40% smaller than the mean (39.1 ms)

Try to rerun the benchmark with more runs, values and/or loops.
Run 'python3 -m perf system tune' command to reduce the system jitter.
Use perf stats to analyze results, or --quiet to hide warnings.

Median +- MAD: 40.7 ms +- 3.9 ms
---
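
The two percentages in the warning can be reproduced from the mean,
standard deviation and minimum. The following is only a rough sketch of
that kind of check, not perf's actual code: the function name
stability_warnings() and both thresholds are assumptions chosen for
illustration.

```python
def stability_warnings(mean, stdev, minimum,
                       max_stdev_percent=10.0, max_min_percent=30.0):
    """Return warning strings for a possibly unstable result.

    Both thresholds are hypothetical defaults, not values taken from perf.
    All inputs are expected in the same unit (milliseconds here).
    """
    warnings = []

    # Relative spread: standard deviation as a percentage of the mean.
    stdev_percent = stdev / mean * 100.0
    if stdev_percent >= max_stdev_percent:
        warnings.append(
            "the standard deviation (%.2f ms) is %.0f%% of the mean (%.1f ms)"
            % (stdev, stdev_percent, mean))

    # How far the minimum lies below the mean, as a percentage of the mean.
    min_percent = (mean - minimum) / mean * 100.0
    if min_percent >= max_min_percent:
        warnings.append(
            "the minimum (%.1f ms) is %.0f%% smaller than the mean (%.1f ms)"
            % (minimum, min_percent, mean))

    return warnings


# Values taken from the output above.
for line in stability_warnings(mean=39.1, stdev=6.08, minimum=23.6):
    print("* " + line)
```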

Statistics of this result:
---
$ python3 -m perf stats startup1.json -q
Total duration: 37.2 sec
Start date: 2017-03-15 18:02:46
End date: 2017-03-15 18:03:27
Raw value minimum: 189 ms
Raw value maximum: 390 ms

Number of runs: 25
Total number of values: 75
Number of values per run: 3
Number of warmups per run: 1
Loop iterations per value: 8

Minimum: 23.6 ms (-42% of the median)
Median +- MAD: 40.7 ms +- 3.9 ms
Mean +- std dev: 39.1 ms +- 6.1 ms
Maximum: 48.7 ms (+20% of the median)
---
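
For comparison, the two summaries can be computed with Python's
statistics module. This is a minimal sketch under some assumptions: MAD
is taken here as the unscaled median of absolute deviations from the
median (whether perf applies a scaling factor is not stated in this
message), the helper names are hypothetical, and the sample list is
made up, not data from startup1.json.

```python
import statistics


def median_abs_dev(values):
    """Unscaled median absolute deviation (assumed definition)."""
    median = statistics.median(values)
    return statistics.median([abs(value - median) for value in values])


def percent_of_median(value, median):
    """Signed distance of value from the median, in percent of the median."""
    return (value - median) / median * 100.0


# Made-up sample with one large outlier, to show how the summaries differ.
sample = [1.0, 2.0, 3.0, 4.0, 100.0]

print("Median +- MAD: %.1f +- %.1f"
      % (statistics.median(sample), median_abs_dev(sample)))
print("Mean +- std dev: %.1f +- %.1f"
      % (statistics.mean(sample), statistics.stdev(sample)))

# Relative positions of the minimum and maximum reported above.
print("Minimum: %+.0f%% of the median" % percent_of_median(23.6, 40.7))
print("Maximum: %+.0f%% of the median" % percent_of_median(48.7, 40.7))
```

The outlier moves the mean and standard deviation a lot, while the
median and MAD barely change.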

Victor

