[python-committers] Codecov and PR
Brett Cannon
brett at python.org
Thu Apr 27 15:44:08 EDT 2017
On Wed, 26 Apr 2017 at 22:36 Terry Reedy <tjreedy at udel.edu> wrote:
> On 4/26/2017 1:45 PM, Brett Cannon wrote:
> >
> > On Tue, 25 Apr 2017 at 17:00 Terry Reedy <tjreedy at udel.edu
> > <mailto:tjreedy at udel.edu>> wrote:
>
> > While I use code coverage to improve automated unittesting, I am
> opposed
> > to turning a usable but limited and sometime faulty tool into a blind
> > robotic master that blocks improvements. The prospect of this being
> > done has discouraged me from learning the new system. (More on
> 'faulty
> > tool' later.)
> >
> > It should be stated that code coverage is not a blocking status check
> > for merging from our perspective (the **only** required check is that
> > Travis pass with it's test run).
>
> I have the impression that at one time you hoped to make it blocking.
> If that was wrong, I apologize for misunderstanding. If you have
> changed your mind, then I am happy.
>
"Hope", sure; "currently plan to", no. IOW I have said a lot of things over
the 2.5 years I've been leading this workflow shift and I don't expect all
of them to pan out.
>
> I am otherwise in favor of both the measurement and report of coverage
> being improved.
>
> > The temptation to write artificial tests to satisfy an artificial
> goal
> > is real. Doing so can eat valuable time better used for something
> else.
> > For instance:
> >
> > def meth(self, arg):
> > mod.inst.meth(arg, True, ob=self, kw='cut')
> >
> > Mocking mod.class.meth, calling meth, and checking that the mock is
> > called will satisfy the robot, but does not contribute much to the
> goal
> > of providing a language that people can use to solve problems.
>
> > My assumption is that there will be a test that meth() does the right
> > thing, mock or not. If we provide an API there should be some test for
> > it; using a mock or something else to do the test is an implementation
> > detail.
>
> My impression is that default mocks have a generic signature, so that
> merely checking that the mock is called will not catch an invalid call.
> I presume that one can do better with mocks, and I have with custom
> mocks I have written, but my point above was that coverage does not care.
>
> > >> If it's not important enough to require tests >> it's not
> > important enough to be in Python. ;)
> >
> > Modules in the test package are mostly not tested. ;)
> >
> >
> > :) But they are at least executed which is what we're really measuring
> > here and I think all Ethan and I are advocating for.
>
> I thought Ethan was advocating for more -- a specific unittest for each
> line.
>
> > E.g. I don't expect
> > test_importlib to be directly responsible for exercising all code in
> > importlib, just that Python's entire test suite exercise importlib as
> > much as possible as a whole.
>
> The advantage for importlib in this respect is that import statements
> cannot be mocked; only the objects imported, after importlib is finished.
>
Oh, you can mock import statements. :)
>
> There is lots of interaction between idlelib modules, but I would still
> like direct tests of each idlelib.xyz with a test_xyz.py. Three years
> ago, there was no test.test_idle. There now is, and it runs 35
> idlelib.idle_test.text_* files. (There are 60 idlelib modules.)
>
> > The problem I have with just doing manual testing for things that can
> > easily be covered by a unit test -- directly or indirectly -- is there
> > are technically 85 people who can change CPython today. That means
> > there's potentially 85 different people who can screw up the code ;) .
>
> At the moment, I am the only one pushing idlelib patches, except when it
> gets included in one of Serhiy's multi-module refactoring patches (and
> he always nosies me). That skews my view a bit. However, with most of
> the critical issues fixed, I am anxious to automate what I can of what I
> now do by hand.
>
> > Making sure code is exercised somehow by tests at least minimizes how
> > badly someone like me might mess something thanks to me not realizing I
> > broke the code.
>
> I had not thought about the issue that way. I should add a test_module
> for each remaining module, import the module, and at least create an
> instance of every tkinter widget defined therein, and see what other
> classes could be easily instantiated and what functions easily run.
>
That seems like a good starting point. Kind of like test_sundry but with
class instantiation on top of it.
> > Some practical issues with coverage and CodeCov:
>
> > 2. Some statements are only intended to run on certain systems,
> making
> > 100% coverage impossible unless one carefully puts all
> system-specific
> > code in "if system == 'xyz'" statements and uses system-specific
> > .coveragerc files to exclude code for 'other' systems.
>
> > True. We could have a discussion as to whether we want to use
> > Coverage.py's pragmas ... I'm sure we could discuss things with Ned
> > Batchelder if we needed some functionality in coverage.py for our needs).
>
> Let's skip this for now.
>
> > 3. Some tests required extended resources. Statements that are only
> > covered by such tests will be seen as uncovered when coverage is run
> on
> > a system lacking the resources. As far as I know, all non-Windows
> > buildbots and CodeCov are run on systems lacking the 'gui'
> resource. So
> > patches to gui code will be seen as uncovered.
> >
> > I view 100% coverage as aspirational, not attainable. But if we want an
> > attainable goal, what should we aim for? We're at 83.44% now
>
> On what system?
Travis, where the Codecov run is driven from.
> I suspect that Tkinter, ttk, turtle, and IDLE
> GUI-dependent tests make at least a 2% difference on GUI Windows versus
> no-GUI *nix.
> we could
> > say that 80% is something we never want to drop below and be done with
> > it. We could up it to 85% or 90% in recognizing that there is more
> > testing to do but that we will never cover all Python code (all of this
> > is configurable in Codecov, hence why I'm asking).
>
> Since I think we actually are at 85%, and certainly will be when I add
> minimal easy tests for the rest of IDLELIB, I think 90% would be a
> reasonable goal.
>
Seems reasonable to me. Opened
https://github.com/python/core-workflow/issues/75 to remind me to tweak
Codecov's config so that we are aiming for 90% overall.
-Brett
>
> One way to increase coverage is to push a bit harder on fulfilling the
> 'test needed' stage. Theoretically, every substantive
> (behavior-changing) patch should start with a test that fails. Since
> PRs are separate from the main repository and can be patched separately,
> a PR could start with a test that should immediately fail but should
> pass before merging. It would be nice if the test runner knew to only
> run the new test and not the entire suite. It would be even nicer if it
> know that initial failure is success. Is there at least a 'New Test'
> label on PRs?
>
> > 4. As I explained in a post on the core-workflow list, IDLE needs the
> > following added to the 'exclude_lines' item of .coveragerc.
> > .*# htest #
> > if not _utest:
>
> These additions would remove, I think, at least 400 lines from the
> uncovered category. Both only occur in idlelib.
>
> --
> Terry Jan Reedy
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-committers/attachments/20170427/e8ccfa6f/attachment.html>
More information about the python-committers
mailing list