[Python-Dev] Move selected documentation repos to PSF BitBucket account?

Donald Stufft donald at stufft.io
Mon Nov 24 13:54:44 CET 2014


> On Nov 24, 2014, at 2:25 AM, Nick Coghlan <ncoghlan at gmail.com> wrote:
> 
> On 24 November 2014 at 02:55, Brett Cannon <brett at python.org> wrote:
>> On Sun Nov 23 2014 at 6:18:46 AM Nick Coghlan <ncoghlan at gmail.com> wrote:
>>> Those features are readily accessible without changing the underlying
>>> version control system (whether self-hosted through Kallithea or externally
>>> hosted through BitBucket or RhodeCode). Thus the folks that want to change
>>> the version control system need to make the case that doing so will provide
>>> additional benefits that *can't* be obtained in a less disruptive way.
>> 
>> I guess my question is who and what is going to be disrupted if we go with
>> Guido's suggestion of switching to GitHub for code hosting? Contributors
>> won't be disrupted at all since most people are more familiar with GitHub
>> vs. Bitbucket (how many times have we all heard the fact someone has even
>> learned Mercurial just to contribute to Python?). Core developers might be
>> based on some learned workflow, but I'm willing to bet we all know git at
>> this point (and for those of us who still don't like it, myself included,
>> there are GUI apps to paper over it or hg-git for those that prefer a CLI).
>> Our infrastructure will need to be updated, but how much of it is that
>> hg-specific short of the command to check out the repo? Obviously
>> Bitbucket is much more minor by simply updating just a URL, but changing `hg
>> clone` to `git clone` isn't crazy either. Georg, Antoine, or Benjamin can
>> point out if I'm wrong on this, maybe Donald or someone in the
>> infrastructure committee.
> 
> Are you volunteering to write a competing PEP for a migration to git and GitHub?
> 
> I won't be updating PEP 474 to recommend moving to either, as I don't
> think that would be a good outcome for the Python ecosystem as a
> whole. It massively undercuts any possible confidence anyone else
> might have in Mercurial, BitBucket, Rhodecode, Kallithea & Allura (all
> Python based version control, or version control hosting, systems). If
> we as the Python core development team don't think any of those are
> good enough to meet the modest version control needs of our support
> repos, why on earth would anyone else choose them?

If those projects depend on CPython to pick them then they are doomed
because we can only pick one. If we pick Bitbucket then the core development
team must not think Rhodecode, Kallithea, & Allura are good enough, so why
on earth would anyone choose them? Ditto for any of the other options.

> 
> In reality, I think most of these services are pretty interchangeable
> - GitHub's just been the most effective at the venture capital powered
> mindshare grab business model (note how many of the arguments here
> stem from the fact folks like *other* things that only interoperate
> with GitHub, and no other repository hosting providers - that's the
> core of the a16z funded approach to breaking the "D" in DVCS and
> ensuring that GitHub's investors are in a position to clip the ticket
> when GitHub eventually turns around and takes advantage of its
> dominant market position to increase profit margins).

You’ll see those arguments because the other argument is “softer”: Github
just plain works better than Bitbucket does.

> 
> That's why I consider it legitimate to treat supporting fellow Python
> community members as the determining factor - a number of the
> available options meet the "good enough" bar from a technical
> perspective, so it's reasonable to take other commercial and community
> factors into account when making a final decision.

Sure, it’s completely reasonable to take that into account; I don’t think
“Not written in Python” should be a disqualifying statement though. We should
pick the best tool for the job.

> 
>> Probably the biggest thing I can think of that would need updating is our
>> commit hooks. Once again Georg, Antoine, or Benjamin could say how difficult
>> it would be to update those hooks.
> 
> If CPython eventually followed suit in migrating to git (as seems
> inevitable if all the other repos were to switch), then every buildbot
> will also need to be updated to have git installed (and Mercurial
> removed).
> 
>>> From my perspective, swapping out Mercurial for git achieves exactly
>>> nothing in terms of alleviating the review bottleneck (since the core
>>> developers that strongly prefer the git UI will already be using an
>>> adapter), and is in fact likely to make it worse by putting the greatest
>>> burden in adapting to the change on the folks that are already under the
>>> greatest time pressure.
>> 
>> That's not entirely true. If you are pushing a PR shift in our patch
>> acceptance workflow then Bitbucket vs. GitHub isn't fundamentally any
>> different in terms of benefit, and I would honestly argue that GitHub's PR
>> experience is better. IOW either platform is of equal benefit.
> 
> Yes, I agree any real benefit comes from the PR workflow, not from
> git. This is why I consider "written in Python" to be a valid
> determining factor - multiple services meet the "good enough" bar from
> a practical perspective, allowing other considerations to come to the
> fore.
> 
> (Also note that this proposal does NOT currently cover CPython itself.
> Neither GitHub nor BitBucket is set up to handle maintenance branches
> well, and any server side merge based workflow improvements for
> CPython are gated on fixing the NEWS file maintenance issue. However,
> once you contemplate moving CPython, then the ripple effects on other
> systems become much larger)
> 
>>> It's also worth keeping in mind that changing the underlying VCS means
>>> changing *all* the automation scripts, rather than just updating the
>>> configuration settings to reflect a new hosting URL.
>> 
>> What automation scripts are there that would require updating? I
>> would like to see a list and to have the difficulty of moving them mentioned to
>> know what the impact would be.
> 
> For the documentation repos, just the devguide and PEP update scripts
> come to mind. As noted above, the implications get more significant if
> the main CPython repo eventually follows suit and the buildbot
> infrastructure all needs to be updated.
> 
>>> Orchestrating this kind of infrastructure enhancement for Red Hat *is* my
>>> day job, and you almost always want to go for the lowest impact, lowest risk
>>> approach that will alleviate the bottleneck you're worried about while
>>> changing the smallest possible number of elements in the overall workflow
>>> management system.
>> 
>> Sure, but I would never compare our infrastructure needs to Red Hat. =) You
>> also have to be conservative in order to minimize downtime and impact for
>> cost reasons. As an open source project we don't have those kinds of worries;
>> we just have to worry about keeping everyone happy.
> 
> Switching to a proprietary hosting service written in a mixture of
> Ruby, C & bash wouldn't make me happy.
> 
> If that's the end result of this thread, I'll be sorry I even
> suggested the idea of reverting to external hosting at all. That
> outcome would be the antithesis of the PSF's overall mission, whereas
> I started this thread at least in part due to a discussion about ways
> the PSF board might be able to help resolve some of the current
> CPython workflow issues. Offering the use of the PSF's existing
> BitBucket org account as hosting location for Mercurial repos was an
> idea I first brought up in that PSF board thread, and then moved over
> here since it seemed worthwhile to at least make the suggestion and
> see what people thought.
> 
>>> That underlying calculation doesn't really change much even when the units
>>> shift from budget dollars to volunteers' time and energy.
>> 
>> So here is what I want to know to focus this discussion:
>> 
>> First, what new workflow are you proposing regardless of repo hosting
>> provider? Are you proposing we maintain just mirrors and update the devguide
>> to tell people to fork on the hosting provider, make their changes, generate
>> a patch (which can be as simple as telling people how to find the raw diff on
>> the hosting provider), and then upload the patch to the issue tracker just like
>> today? Are you going farther and saying we have people send PRs on the
>> hosting site, have them point to their PR in the issue tracker, and then we
>> accept PRs (I'm going on the assumption we are not dropping our issue
>> tracker)?
> 
> I am proposing that we switch at least some documentation-only repos
> to a full PR based workflow, including support for online editing (to
> make it easy to fix simple typos and the like without even leaving the
> browser). CPython itself would remain completely unaffected.
> 
> The proposal in PEP 474 is that we do that by setting up Kallithea as
> forge.python.org. This thread was about considering BitBucket as an
> alternative approach. RhodeCode would be a third option that still didn't
> involve switching away from Mercurial.
> 
>> Second, to properly gauge the impact of switching from hg to git from an
>> infrastructure perspective, what automation scripts do we have and how
>> difficult would it be to update them to use git instead of hg? This is
>> necessary simply to know where we would need to update URLs, let alone
>> change in DVCS.
> 
> The problems with changing version control systems don't really become
> significant until we start talking about switching CPython itself,
> rather than the support repos.
> 
>> Third, do our release managers care about hg vs. git strongly? They probably
>> use the DVCS the most directly and at a lower level by necessity compared to
>> anyone else.
>> 
>> Fourth, do any core developers feel strongly about not using GitHub? Now
>> please notice I said "GitHub" and not "git"; I think the proper way to frame
>> this whole discussion is we are deciding if we want to switch to Bitbucket
>> or GitHub who provide a low-level API for their version control storage
>> service through hg or git, respectively. I personally dislike git, but I
>> really like GitHub and I don't even notice git there since I use GitHub's OS
>> X app; as I said, I view this as choosing a platform and not the underlying
>> DVCS as I have happily chosen to access the GitHub hosting service through
>> an app that is not git (it's like accessing a web app through its web page
>> or its REST API).
> 
> Yes, I object strongly to the use of GitHub when there are
> commercially supported services written in Python like BitBucket and
> RhodeCode available if we want to go the external hosting route, and
> other options like the RhodeCode derived Kallithea if we want to run a
> self-hosted forge. RhodeCode are even PSF sponsors - I'm sure they'd
> be willing to discuss the possibility of hosting core development
> repos on their service.
> 
> If I was doing a full risk management breakdown, then RhodeCode would
> be the obvious winner, as not only are they PSF sponsors, but
> reverting to self-hosting on Kallithea would remain available as an
> exit strategy.
> 
> I only suggested BitBucket in this thread because the PSF already has
> some repos set up there, so that seemed easier than establishing a new
> set of repos on a RhodeCode hosted instance.

The PSF account on Bitbucket has 3 repositories. One hasn’t been touched
in two years, another in a year, and the third isn’t in use either.

Compare that to Github which has:

- The Chef Cookbooks
- The Salt states
- The Python.org website
- The Log Archiver for PyPI/Python.org
- The documentation building scripts

Which are all actively being used/developed.

Beyond that, I don’t have hard numbers (though I could probably attempt to
get them) but my gut instinct is that the Python community is primarily
there as well.

> 
>> At least for me, until we get a clear understanding of what workflow changes
>> we are asking for both contributors and core developers and exactly what
>> work would be necessary to update our infrastructure for either Bitbucket or
>> GitHub we can't really have a reasonable discussion that isn't going to be
>> full of guessing.
> 
> All repos that migrated away from hg.python.org would move to a PR
> based workflow, rather than manual patch management on the issue
> tracker. The migrated repos would likely also use their integrated
> issue tracker rather than the main CPython one at bugs.python.org.
> 
> Externally hosted repos would likely retain a regularly updated mirror
> on hg.python.org to ensure the source remains available even in the
> event of problems affecting the external hosting provider.
> 
>> And I'm still in support no matter what of breaking out the HOWTOs and the
>> tutorial into their own repos for easier updating (having to update the
>> Python porting HOWTO in three branches is a pain when it should be
>> consistent across Python releases).
> 
> Agreed.
> 
> Cheers,
> Nick.
> 
> -- 
> Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

---
Donald Stufft
PGP: 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA


