[Python-checkins] peps: Incorporate PEP 462 feedback

nick.coghlan python-checkins at python.org
Mon Jan 27 04:22:31 CET 2014


http://hg.python.org/peps/rev/6bb993cca12c
changeset:   5357:6bb993cca12c
user:        Nick Coghlan <ncoghlan at gmail.com>
date:        Mon Jan 27 13:22:21 2014 +1000
summary:
  Incorporate PEP 462 feedback

files:
  pep-0462.txt |  136 +++++++++++++++++++++++++++++---------
  1 files changed, 102 insertions(+), 34 deletions(-)


diff --git a/pep-0462.txt b/pep-0462.txt
--- a/pep-0462.txt
+++ b/pep-0462.txt
@@ -7,6 +7,7 @@
 Type: Process
 Content-Type: text/x-rst
 Created: 23-Jan-2014
+Post-History: 25-Jan-2014, 27-Jan-2014
 
 
 Abstract
@@ -139,7 +140,7 @@
   failures prior to a release) and for the developers themselves (since
   it creates significant pressure to fix any failures we inadvertently
   introduce right *now*, rather than at a more convenient time).
-* For new contributors, a core developer spending time actually getting
+* For other contributors, a core developer spending time actually getting
   changes merged is a developer that isn't reviewing and discussing patches
   on the issue tracker or otherwise helping others to contribute effectively.
   It is especially frustrating for contributors that are accustomed to the
@@ -221,9 +222,10 @@
 that will be a very good day).
 
 However, the merge queue itself is a very powerful concept that should
-directly address several of the issues described above.
+directly address several of the issues described in the Rationale above.
 
 .. _Zuul: http://ci.openstack.org/zuul/
+.. _Elastic recheck: http://status.openstack.org/elastic-recheck/
 
 
 Deferred Proposals
@@ -234,15 +236,39 @@
 * Running preliminary "check" tests against patches posted to Gerrit.
 * Creation of updated release artefacts and republishing documentation when
   changes are merged
-* Using ElasticSearch in conjunction with a spam filter to monitor test
-  output and suggest the specific intermittent failure that may have
-  caused a test to fail, rather than requiring users to search logs manually
+* The `Elastic recheck`_ feature that uses ElasticSearch in conjunction with
+  a spam filter to monitor test output and suggest the specific intermittent
+  failure that may have caused a test to fail, rather than requiring users
+  to search logs manually
 
 While these are possibilities worth exploring in the future (and one of the
 possible benefits I see to seeking closer coordination with the OpenStack
 Infrastructure team), I don't see them as offering quite the same kind of
 fundamental workflow improvement that merge gating appears to provide.
 
+However, it may be that the last of these (the `Elastic recheck`_ feature)
+is needed to handle intermittent test failures in the gate effectively, in
+which case it may need to be considered as part of the initial deployment.
+
+
+Suggested Variants
+==================
+
+Terry Reedy has suggested starting with an initial filter that specifically looks
+for approved documentation-only patches (~700 of the 4000+ open CPython
+issues are pure documentation updates). This approach would avoid several
+of the issues related to flaky tests and cross-platform testing, while
+still allowing the rest of the automation flows to be worked out (such as
+how to push a patch into the merge queue).
+
+The one downside to this approach is that Zuul wouldn't have the complete
+control of the merge process that it usually expects, so some additional
+coordination would likely be needed around the merges it doesn't handle.
+
+It may be worth keeping this approach as a fallback option if the initial
+deployment proves to have more trouble with test reliability than is
+anticipated.
+
 
 Perceived Benefits
 ==================
@@ -261,8 +287,8 @@
 
 With the bulk of the time investment moved to the review process, this
 also encourages "development for reviewability" - smaller, easier to review
-patches, since the overhead of running the tests five times rather than once
-will be incurred by Zuul rather than by the core developers.
+patches, since the overhead of running the tests multiple times will be
+incurred by Zuul rather than by the core developers.
 
 However, removing this time sink from the core development team should also
 improve the experience of CPython development for other contributors, as it
@@ -282,6 +308,13 @@
 the other sprint participants than it is on things that are sufficiently
 mechanical that a computer can (and should) handle them.
 
+Finally, with most of the ways to make a mistake when committing a change
+automated out of existence, there are substantially fewer new things to
+learn when a contributor is nominated to become a core developer. This
+should have a dual benefit, both in making the existing core developers more
+comfortable with granting that additional level of responsibility, and in
+making new contributors more comfortable with exercising it.
+
 
 Technical Challenges
 ====================
@@ -292,21 +325,39 @@
 in some of our existing tools.
 
 
-Rietveld vs Gerrit
-------------------
+Rietveld/Roundup vs Gerrit
+--------------------------
 
 Rietveld does not currently include a voting/approval feature that is
 equivalent to Gerrit's. For CPython, we wouldn't need anything as
 sophisticated as Gerrit's voting system - a simple core-developer-only
 "Approved" marker to trigger action from Zuul should suffice. The
-core-developer-or-not flag is available in Roundup, which may require
-further additions to the existing integration between the two tools.
+core-developer-or-not flag is available in Roundup, as is the flag
+indicating whether or not the uploader of a patch has signed a PSF
+Contributor Licensing Agreement, which may require further additions to
+the existing integration between the two tools.
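+
+As a purely illustrative sketch (the attribute names below are hypothetical
+stand-ins rather than Roundup's actual schema), the check that allows a
+change to enter the merge queue might look something like::
+
+    def ready_for_merge_queue(patch, approver):
+        # Hypothetical flags: the approver must be a core developer, the
+        # uploader must have a CLA on record, and the patch must carry
+        # the "Approved" marker.
+        return (approver.is_core_developer
+                and patch.uploader_signed_cla
+                and patch.approved)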
 
 Rietveld may also require some changes to allow the uploader of a patch
 to indicate which branch it is intended for.
 
-There would also be an additional Zuul trigger plugin needed to monitor
-Rietveld activity rather than Gerrit.
+We would likely also want to improve the existing patch handling,
+in particular looking at how the Roundup/Rietveld integration handles cases
+where it can't figure out a suitable base version to use when generating
+the review (if Rietveld gains the ability to nominate a particular target
+repository and branch for a patch, then this may be relatively easy to
+resolve).
+
+Some of the existing Zuul triggers work by monitoring for particular comments
+(in particular, recheck/reverify comments to ask Zuul to try merging a
+change again if it was previously rejected due to an unrelated intermittent
+failure). We will likely also want similar explicit triggers for Rietveld.
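+
+As an illustrative example (the exact pattern is an assumption, loosely
+modelled on the comment filters in OpenStack's Zuul layout rather than
+anything Rietveld provides today), a bare "recheck" or "reverify" comment
+could be recognised with a simple regular expression::
+
+    import re
+
+    # A comment consisting solely of "recheck" or "reverify" asks Zuul to
+    # try merging the change again after an unrelated intermittent failure.
+    RETRY_REQUEST = re.compile(r'(?i)^\s*(recheck|reverify)\s*$')
+
+    def is_retry_request(comment_text):
+        return any(RETRY_REQUEST.match(line)
+                   for line in comment_text.splitlines())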
+
+The current Zuul plugins for Gerrit work by monitoring the Gerrit activity
+stream for particular events. If Rietveld has no equivalent, we will need
+to add something suitable for the events we would like to trigger on.
+
+There would also be development effort needed to create a Zuul plugin
+that monitors Rietveld activity rather than Gerrit.
 
 
 Mercurial vs Gerrit/git
@@ -332,9 +383,14 @@
 Buildbot vs Jenkins
 -------------------
 
-As far as I am aware, Zuul's interaction with the CI system is also
-pluggable, so this should only require creating a Buildbot plugin to use
-instead of the Jenkins one.
+Zuul's interaction with the CI system is also pluggable, using Gearman
+as the `preferred interface <http://ci.openstack.org/zuul/launchers.html>`__.
+Accordingly, adapting the CI jobs to run in Buildbot rather than Jenkins
+should just be a matter of writing a Gearman client that can process the
+requests from Zuul and pass them on to the Buildbot master. Zuul uses the
+pure Python `gear client library <https://pypi.python.org/pypi/gear>`__ to
+communicate with Gearman, and this library should also be useful to handle
+the Buildbot side of things.
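+
+A minimal sketch of such a Gearman client, assuming the ``gear`` Worker API
+(the job name, server address and payload format below are illustrative
+assumptions rather than Zuul's actual conventions)::
+
+    import json
+
+    import gear
+
+    def forward_to_buildbot(params):
+        # Placeholder: a real gateway would ask the Buildbot master to run
+        # the requested job and collect the result.
+        return {'result': 'SUCCESS'}
+
+    worker = gear.Worker('buildbot-gateway')
+    worker.addServer('zuul.example.org')
+    worker.registerFunction('build:cpython-gate')
+
+    while True:
+        job = worker.getJob()  # blocks until Zuul submits a build request
+        params = json.loads(job.arguments.decode('utf-8'))
+        result = forward_to_buildbot(params)
+        job.sendWorkComplete(json.dumps(result).encode('utf-8'))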
 
 Note that, in the initial iteration, I am proposing that we *do not*
 attempt to pipeline test execution. This means Zuul would be running in
@@ -345,28 +401,26 @@
 the result to come back before moving on to the second patch in the queue.
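+
+Conceptually (the helper names here are hypothetical, not part of Zuul),
+that serial gating behaviour amounts to something like::
+
+    def run_serial_gate(queue, repo, run_full_test_suite):
+        # One approved change at a time: apply it to the current tip, run
+        # the complete test suite, and only publish the result if the run
+        # comes back green.
+        while queue:
+            change = queue.pop(0)
+            candidate = repo.apply_to_tip(change)
+            if run_full_test_suite(candidate):
+                repo.publish(candidate)    # becomes the new tip
+            else:
+                change.report_failure()    # back to the tracker for rework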
 
 If we ultimately decide that this is not sufficient, and we need to start
-using the CI pipelining features of Zuul, then we may need to look at using
-Jenkins test runs to control the gating process. Due to the differences
-in the client/server architectures between Jenkins and Buildbot, the
-initial feedback from the OpenStack infrastructure team is that it is likely
-to be difficult to adapt the way Zuul controls the CI pipelining process in
-Jenkins to control Buildbot instead.
+using the CI pipelining features of Zuul, then we may need to look at moving
+the test execution to dynamically provisioned cloud images, rather than
+relying on volunteer maintained statically provisioned systems as we do
+currently. The OpenStack CI infrastructure team are exploring the idea of
+replacing their current use of Jenkins masters with a simpler pure Python
+test runner, so if we find that we can't get Buildbot to effectively
+support the pipelined testing model, we'd likely participate in that
+effort rather than setting up a Jenkins instance for CPython.
 
-If that latter step occurs, it would likely make sense to look at moving the
-test execution to dynamically provisioned cloud images, rather than relying
-on volunteer maintained statically provisioned systems as we do currently.
-
-In this case, the main technical risk would become Zuul's handling of testing
-on platforms other than Linux (our stable buildbots currently cover Windows,
-Mac OS X, FreeBSD and OpenIndiana in addition to a couple of different Linux
-variants).
+In this case, the main technical risk would be ensuring that we still
+support testing on platforms other than Linux (as our stable buildbots
+currently cover Windows, Mac OS X, FreeBSD and OpenIndiana in addition to a
+couple of different Linux variants).
 
 In such a scenario, the Buildbot fleet would still have a place in doing
 "check" runs against the master repository (either periodically or for
 every commit), even if it did not play a part in the merge gating process.
 More unusual configurations (such as building without threads, or without
 SSL/TLS support) would likely still be handled that way rather than being
-included in the gate criteria.
+included in the gate criteria (at least initially).
 
 
 Handling of maintenance branches
@@ -426,7 +480,19 @@
 
 Some tests, especially timing tests, exhibit intermittent failures on the
 existing Buildbot fleet. In particular, test systems running as VMs may
-sometimes exhibit timing failures
+sometimes exhibit timing failures when the VM host is under higher than
+normal load.
+
+The OpenStack CI infrastructure includes a number of additional features to
+help deal with intermittent failures, the most basic of which is simply
+allowing developers to request that merging a patch be tried again when the
+original failure appears to be due to a known intermittent problem (whether
+in OpenStack itself or just in a flaky test).
+
+The more sophisticated `Elastic recheck`_ feature may be worth considering,
+especially since the output of the CPython test suite is substantially
+simpler than that from OpenStack's more complex multi-service testing, and
+hence likely even more amenable to automated analysis.
 
 
 Social Challenges
@@ -437,7 +503,7 @@
 automated by the proposal should create a strong incentive for the
 existing developers to go along with the idea.
 
-I believe two specific features may be needed to assure existing
+I believe three specific features may be needed to assure existing
 developers that there are no downsides to the automation of this workflow:
 
 * Only requiring approval from a single core developer to incorporate a
@@ -489,7 +555,9 @@
 ================
 
 Thanks to Jesse Noller, Alex Gaynor and James Blair for providing valuable
-feedback on a preliminary draft of this proposal.
+feedback on a preliminary draft of this proposal, and to James and Monty
+Taylor for additional technical feedback following publication of the
+initial draft.
 
 
 Copyright

-- 
Repository URL: http://hg.python.org/peps

