[Python-Dev] On distributed vs centralised SCM for Python

Bryan O'Sullivan bos at serpentine.com
Mon Aug 15 23:04:53 CEST 2005


Pardon me for coming a little late to the SCM discussion, but I thought
I would throw a few comments in.

A little background: I've used Perforce, CVS, Subversion and BitKeeper
for a number of years.  Currently, I hack on Mercurial
<URL:http://www.selenic.com/mercurial>.

However, I'm not here to try and specifically push Mercurial, but rather
to bring up a few points that I haven't seen made in the earlier
discussions.

The biggest distinguishing factor between centralised and decentralised
SCMs is the kinds of interaction they permit between the core developer
community and outsiders.

The centralised SCM tools all create a wall between core developers
(i.e. people with commit access to the central repository) and people
who are on the fringes.  Outsiders may be able to get anonymous
read-only access, but they are left up to their own devices if they want
to make changes that they would like to contribute back to the project.

With centralised tools, any revision control that outsiders do must be
ad-hoc in nature, and they cannot share their changes in a natural way
(i.e. preserving revision history) with anyone else.

I do not follow Python development closely, so I have no idea how open
Python typically is to contributions from people outside the core CVS
committers.

However, it's worth pointing out that with a distributed SCM - it
doesn't really matter which one you use - it is simple to put together a
workflow that operates in the same way as a centralised SCM.  You lose
nothing in the translation.  What you gain is several-fold:

      * Outsiders get to work according to the same terms, and with the
        same tools, as core developers.
      * Everyone can perform whatever work they want (branch, commit,
        diff, undo, etc) without being connected to the main repository
        in any way.
      * Peer-level sharing of changes, for testing or evaluation, is
        easy and doesn't clutter up the central server with short-lived
        branches.
      * Speculative branching: it is cheap to create a local private
        branch that contains some half-baked changes.  If they work out,
        fold them back and commit them to the main repository.  If not,
        blow the branch away and forget about it.

Regardless of what you may think of the Linux development model, it is
teling that there have been about 80 people able to commit changes to
Python since 1990 (I just checked the cvsroot tarball), whereas my
estimate is that about ten times as many used BitKeeper to contribute
changes to the Linux kernel just since the 2.5 tree began in 2002.  (The
total number of users who contributed changes was about 1600, 1300 of
whom used BK, while the remainder emailed plain old patches that someone
applied.)

It is, of course, not possible for me to tell which CVS commits were
really patches that originated with someone else, but my intent is to
show how the choice of tools affects the ability of people to contribute
in "natural" ways.  How much of the difference in numbers is due to the
respective popularity or accessibility of the projects is anyone's
guess.

With any luck, there's some food for thought above.

Regards,

	<b

-- 
Bryan O'Sullivan <bos at serpentine.com>



More information about the Python-Dev mailing list