[Python-Dev] Looking for VCS usage scenarios

Brett Cannon brett at python.org
Thu Nov 6 06:18:36 CET 2008


On Wed, Nov 5, 2008 at 17:36, Stephen J. Turnbull <stephen at xemacs.org> wrote:
> In what follows, caveat IANB (I am not Brett, and neither is
> Cosmin<wink>), but there is some experience with these systems, and my
> recommendations are based on that.
>

Wow, I'm part of an acronym! That's a first.

> Cosmin Stejerean writes:
>  > On Nov 5, 2008, at 12:16 PM, skip at pobox.com wrote:
>
>  > > What DVCS fits my poor brain best?  I feel I'm like a dinosaur
>  > > not being able to figure out how I'm supposed to contribute
>  > > changes to the system.
>
> You need not feel that way.  It's not you---the flexibility of dVCS
> means that until the Powers That Be promulgate a Workflow, this will
> be ambiguous.
>

It also took me quite a while to finally grasp exactly how the typical
workflow could go with a DVCS.

> This is part of the purpose of the PEP.  We[1] will be presenting the
> 5-finger exercises required to accomplish typical (and perhaps some
> not-so-typical) tasks, as well as benchmarks for the various systems.
>
>  > > Do I:
>  > >
>  > >    * commit my changes to some central branch?
>
> Call this the "record && commit to authoritative" workflow.
>
>  > Not exactly. If you had commit access to the central repository you
>  > could commit then push, which would be the DVCS equivalent of
>  > committing to a central branch.
>
> The workflow where general contributors commit directly to the trunk
> surely won't be used in Python, because of the instability it would
> cause.  It would be possible to have a staging branch for this
> purpose, but IMO that's not a very effective use of a dVCS.[2]
>

I assume by "general contributor" you mean "anybody" and not "core developer".

> It is useful to avoid the term "commit" here because its semantics
> vary across systems.  As Cosmin points out, in a dVCS, what is
> accomplished by "vc commit" in CVS is done as "vc commit; vc push".  I
> use the terminology "record" for the action of adding a workspace-
> based patch or snapshot to a repository.  "push" (and "pull") move
> content between repositories.  Unfortunately "commit" is the name of
> the record command in most dVCSes, so this terminology probably won't
> catch on.
>

This is why there is a Terminology section in the PEP; people have not
fully agreed on terms yet.

> Also, when talking about "where to commit" in terms of communication
> among developers, you should probably refer to storage locations as
> "repositories".  "Branch" is another term that has varying semantics
> in different VCSes.  In some systems (git) it is reasonable to think
> of repositories containing more than one branch, and branches as
> existing in more than one repository (but this isn't quite robust in
> git because branch names are just names, not first-class objects).  In
> others (Darcs is the extreme) repository == branch == workspace.
>
> (I'm trying to get permission to publish a 3rd party's draft document
> that goes into these issues in detail; here I just want to raise
> awareness that the intuitions that go with CVS/Subversion usage of
> various terms are *not* always going to carry over to dVCSes.)
>
>  > >    * commit my changes locally then create diffs I then submit to the
>  > >      tracker?
>
> "Record && patch" workflow.
>
>  > Possible.
>
> But again not very effective.  Under a dVCS I believe these patches
> will languish in the tracker as they do today, unless tools are
> written to automatically pull them into a repo somewhere.
>
>  > >    * commit locally then push them somewhere?
>
> "Record && push to candidate" workflow.
>
> If we go with Bazaar, this is very likely to occur, especially if
> Canonical's Launchpad is the host.  This is what the Linux kernel does on
> git.kernel.org as well, if I understand their workflow correctly, and
> what github helps to support.  I imagine Mercurial has an equivalent
> but I'm not familiar with it.
>

http://www.bitbucket.org has free Mercurial hosting for open source projects.

As for how git.kernel.org works, I believe that won't work for Python
without a cultural shift in how Python development happens. Linus has
subsystem maintainers who are in charge of certain subsections of the
Linux kernel. They are the ones that accept the various patches from
people contributing. I believe Linus then pulls from these maintainers into
his tree, which is basically the authoritative tree for the kernel.

For Python development we don't really have subsystem maintainers. We
have unofficial ones (e.g. Georg with the docs), but I honestly think
that this has not worked for us over the years as people come and go.
I think part of the reason the Linux kernel can get away with the
structure it has is that people get paid to help maintain it, while
we don't. Plus I don't think Guido wants to act as BDFL on every
potential patch into Python with his 50% time.

>  > >    * commit locally then ask someone to pull?
>
> "Record && request pull" workflow.
>
>  > Often preferred way to submit patches, as you can continue to maintain
>  > the patch locally against newer versions of trunk so that the patch is
>  > not obsolete by the time people finally get around to it.
>
> I disagree.  This doesn't scale to Python size.  For distributed VC to
> work, somebody has to maintain a repo 24x7.  Python has to do this for
> the trunk; the additional burden for contributed patches is not great.
> There is no real advantage to having contributors do so, too.[3]
> Integrators and interested third parties also must keep track of
> contributors' repo URLs.  (Cf. Skip's question about discovering repos.)
> Not happy stuff.
>
> The "record && push" workflow scales much better for numbers of
> contributors, as each contributor needs only to maintain one "push"
> URL, and integrators only one "pull" base URL.
>
>  > >    * Not commit anything anywhere but just submit patches to the
>  > > tracker?
>
> "Patch from workspace" workflow.
>
>  > Likely possible, but it's good to have the patch committed locally so
>  > you can modify it or continue to build upon it until it gets accepted.
>
> The same considerations as "record && patch" also apply here.
>
>  > > In addition:
>  > >
>  > >    * Will there be a central repository?
>  >
>  > Generally there should be a central authoritative repository that
>  > people can turn to for the official version.
>
> Ie, "yes".  There's no point in a PEP unless there's going to be a
> central repo and a defined workflow for getting contributions into it.
>
> Note that you can always maintain your own local repo with dVCS.
>
>  > >    * How will I know which of possibly many repos is "authoritative"?
>  >
>  > The authoritative repo should generally be linked to from the website
>  > so that people can easily find it.
>
> That depends.  The notion of "authoritative" gets weakened in a
> distributed system, and probably more important is "which repo will be
> used to make the next official release".
>
> However, although I can't say what the mechanism will be, be sure you
> will not have a problem learning which is authoritative for the trunk
> or where to find RCs and releases.  (If you do, it's a doc problem and
> it will be fixed quickly.)
>

Right. The concept of the "trunk" will continue to exist, as will the
official maintenance repos, and the locations of all of them will be
written down.

> You may have more trouble with third-party patches gotten from
> third-party repos.  GNU Arch has a system for handling this (patch
> names contain the originating repo).  That was one of the first things
> the Bazaar people discarded from Arch, though.  Darcs has something
> similar, but again Darcs is not a candidate here.  I think for such
> "maverick" contributions there will never really be a substitute for
> watching the ML and tracker like a hawk.
>
>  > >    * How will I discover other repos?  For example, if the
>  > >      safethread stuff is sitting somewhere in a repository can I
>  > >      find it on my own somehow?
>  >
>  > I'm not aware of any decentralized system for discovering
>  > repositories. Something like github or bitbucket, which helps you
>  > discover repositories, could be used, but a wiki page with a list of
>  > alternative repositories and their purpose should suffice.
>
> Most likely the repos you care about will be hosted in a central
> location, and will be browsable from a single base URL.  See
> http://git.kernel.org/ for the git version of a browser.  Mercurial
> and Bazaar support similar facilities either in the VCS itself or in
> an easily available add-on.  Fancier support is available via systems
> like github and Launchpad.
>

We have http://code.python.org/ for this. And yes, you kind of just
have to know since any random branches that might be out there will
not be in the branches/ directory like in svn. But honestly how often
does anyone just browse the branches/ directory anyway?

-Brett

