[py-dev] Performance tests with py.test

Mon Jun 4 11:10:55 CEST 2012

Hi Bogdan,

sorry for taking a while ...

On Thu, May 03, 2012 at 11:18 +1000, Bogdan Opanchuk wrote:
> For some reason this message has not shown up in the list archives, so
> I'll resend it again.
> 
> On Sun, Apr 1, 2012 at 12:21 PM, holger krekel <holger at merlinux.eu> wrote:
> 
> > do you want to consider performance regressions as a failure?
> 
> Not really. I just need some table with performance results that I
> could get for different systems/versions and compare them. Besides,
> performance regressions can be implemented using existing
> functionality, because they do not have some continuous result
> associated with them — only pass/fail.
> 
> > Could you maybe provide a simple example test file and make up
> > some example output that you'd like to see?
> 
> Sure. Consider the following test file:
> 
> -----
> import pytest
> 
> def test_matrixmul():
>    pass
> 
> @pytest.mark.perf('seconds')
> def test_reduce():
>    # <some lengthy preparation that I do not want to time>
>    # <actual work>
>    return 1.0
> 
> @pytest.mark.perf('GFLOPS')
> def test_fft():
>    # <again, some lengthy preparation that I do not want to time>
>    # <actual work>
>    return 1.0, 1e10
> -----
> 
> Here "test_matrixmul()" is a normal pass/fail test, "test_reduce()" is
> marked as performance test that returns execution time, and
> "test_fft()" is marked as a test that returns execution time + the
> number of operations (thus allowing us to calculate GFLOPS value).
> 
> I have put together a clunky solution (see the end of this letter)
> using existing hooks that gives me more or less what I want to see:
> 
> $ py.test -v
> ...
> test_test.py:3: test_matrixmul PASSED
> test_test.py:6: test_reduce 1.0 s
> test_test.py:10: test_fft 10.0 GFLOPS
> ...
> 
> The only problem here is that I have to explicitly increase verbosity
> level. I'd prefer 'perf' marked tests show their result even for
> default verbosity, but I haven't found a way to do it yet.

not sure it helps but are you aware that you can put 

    [pytest]
    addopts = -v

in a pytest.ini file?

> > Meanwhile, if you haven't already you might want to look at the output
> > of "py.test --durations=10" and see about its implementation (mostly
> > contained in _pytest/runner.py, grep for 'duration').
> 
> Yes, I know about it, but it is not quite what I need:
> - it measures the time of the whole testcase, while I usually need to
> time only specific part

right.  It differentiates between setup and runtest phases for a test, though.

> - it does not allow me to measure anything more complicated (e.g.
> GFLOPS, as another variant I may want to see the error value)
> - it prints its report after all the tests are finished, while it is
> much more convenient to see testcase result as soon as it is finished
> (my performance tests may run for quite a long time)

right.

> So, the solution I have now is shown below. "pytest_pyfunc_call()"
> implementation annoys me the most, because I had to copy-paste it from
> python.py, so it exposes some py.test internals and can easily break
> when something (seemingly hidden) inside the library is changed.

That's true.  If you want to open an issue about this (pytest_pyfunc_call
to return test function return value), i can see to care about it.

> -----
> def pytest_configure(config):
>    config.pluginmanager.register(PerfPlugin(config), '_perf')
> 
> class PerfPlugin(object):
> 
>    def __init__(self, config):
>        pass
> 
>    def pytest_pyfunc_call(self, __multicall__, pyfuncitem):
>        # collect testcase return result
>        testfunction = pyfuncitem.obj
>        if pyfuncitem._isyieldedfunction():
>            res = testfunction(*pyfuncitem._args)
>        else:
>            funcargs = pyfuncitem.funcargs
>            res = testfunction(**funcargs)
>        pyfuncitem.result = res
> 
>    def pytest_report_teststatus(self, __multicall__, report):
>        outcome, letter, msg = __multicall__.execute()
> 
>        # if we have some result attached to the testcase, print it
> instead of 'PASSED'
>        if hasattr(report, 'result'):
>            msg = report.result
> 
>        return outcome, letter, msg
> 
>    def pytest_runtest_makereport(self, __multicall__, item, call):
>        report = __multicall__.execute()
> 
>        # if the testcase has passed, and has 'perf' marker, process its results
>        if call.when == 'call' and report.passed and
> hasattr(item.function, 'perf'):
>            perf = item.function.perf
>            perftype = perf.args[0]
>            if perftype == 'seconds':
>                report.result = str(item.result) + " s"
>            else:
>                seconds, operations = item.result
>                report.result = str(operations / seconds / 1e9) + " GFLOPS"
> 
>        return report
> -----

We could also think about a convention that would allow setting
the short/longform (letter,msg) on the report object so that could
also get rid of the report_teststatus hook.  I am slightly less
inclined to go for this because the above teststatus bit seems
nice enough.

best,
holger