[Distutils] Version numbers for module distributions

Greg Ward gward@cnri.reston.va.us
Thu, 10 Dec 1998 11:14:36 -0500


There have been a few rumbles about module version numbering on and off
the list, and I thought I should bring it all out into the open.  First,
I think I tossed off a random opinion along these lines:

   Module version numbers should follow the GNU standard: X.Y.Z, where
   X is the major version, Y is the minor version, and Z is the
   patch level.  Alpha and beta versions can be noted by appending 
   "aN" or "bN", where N is any integer.

This is still my opinion, although John Skaller raised a bunch of issues 
on the list, and Greg Stein noted a minor wrinkle in private email.
I'll address those momentarily.

First, let me elucidate the position above.  This was (roughly) my
opinion going into the Developer's Day session back at the Python
Conference, and when Eric Raymond said roughly the same thing (minus the
bit about alpha/beta versions) based on his experience running the giant
sunsite archive, I was glad to hear I wasn't the only one who likes this
scheme.  (And after all, the GNU folks have a hell of a lot of
experience in these matters, and they have been right before!)  Finally,
this seems to be the version numbering system that Guido uses for Python
itself, which is another good precedent.

To be formal for a moment: a version number must match the regular
expression (ignore whitespace in the regex):

     (\d+) \. (\d+) \. (\d+) ([ab](\d+))?

Two possible changes I would consider open to debate:

  * replace [ab] with [a-z], to allow test versions beyond
    "beta"... although any programmer who goes up to "zeta" (or should
    that be omega?) really needs to look at his quality control... ;-)
  * make the patch number optional (and treat 1.0 and 1.0.0 as the
    same version number)
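
For illustration, here is a minimal Python sketch of the strict regex
and of a relaxed variant with both changes applied.  The names (STRICT,
RELAXED, is_strict, parse_relaxed) are made up for this example and are
not part of any existing Distutils code; reading a missing patch number
as 0 is an assumption about how "1.0" and "1.0.0" would be treated as
the same.

```python
import re

# Strict form: X.Y.Z with an optional aN/bN suffix.  re.VERBOSE lets the
# pattern keep the whitespace used when the regex is written out in prose.
STRICT = re.compile(r"(\d+) \. (\d+) \. (\d+) ([ab](\d+))?", re.VERBOSE)

# Relaxed form (hypothetical): any lowercase letter as the test-version
# tag, and an optional patch number.
RELAXED = re.compile(
    r"(\d+) \. (\d+) (?: \. (\d+))? ([a-z](\d+))?", re.VERBOSE)


def is_strict(s):
    """True if the whole string is a strict X.Y.Z[aN|bN] version."""
    return STRICT.fullmatch(s) is not None


def parse_relaxed(s):
    """Return (major, minor, patch, tag, n), or None if s doesn't match.

    A missing patch number is read as 0, so "1.0" and "1.0.0" parse
    to the same tuple.
    """
    m = RELAXED.fullmatch(s)
    if m is None:
        return None
    major, minor, patch, pre, n = m.groups()
    tag = pre[0] if pre else None
    return (int(major), int(minor), int(patch or 0),
            tag, int(n) if n else None)
```

With these definitions, is_strict("1.0") is false while
parse_relaxed("1.0") and parse_relaxed("1.0.0") give the same result.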

Yes, this is very rigid.  But it also makes it pretty easy to parse,
split up, and compare version numbers.  Making alpha/beta numbers a
formal part is, IMHO, necessary, because intuitively 1.0.0a1 should be
less than 1.0.0 -- but no lexical or numeric comparison will draw this
conclusion!  
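
To make the comparison point concrete, here is a small sketch of a sort
key for strict version numbers.  The function name version_key and the
choice to rank alpha below beta below the final release are assumptions
made for the example, not an agreed-upon rule.

```python
import re

_VERSION = re.compile(r"(\d+)\.(\d+)\.(\d+)(?:([ab])(\d+))?")


def version_key(s):
    """Turn a strict version string into a tuple that sorts correctly."""
    m = _VERSION.fullmatch(s)
    if m is None:
        raise ValueError("not a valid version number: %r" % s)
    major, minor, patch, tag, n = m.groups()
    if tag is None:
        # A final release sorts after every alpha and beta of itself.
        pre = (2, 0)
    else:
        # "a" -> 0, "b" -> 1, then the numeric part of the suffix.
        pre = ("ab".index(tag), int(n))
    return (int(major), int(minor), int(patch), pre)


versions = ["1.0.0", "1.0.0b1", "0.9.12", "1.0.0a2", "1.0.0a10"]
ordered = sorted(versions, key=version_key)
```

Here ordered comes out as 0.9.12, 1.0.0a2, 1.0.0a10, 1.0.0b1, 1.0.0,
whereas a plain string comparison puts "1.0.0a1" after "1.0.0".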

Now, there should also be semi-formal guidelines on what the different
components of a version number *for modules* might mean:

  * a major number of 0 means the module is still being actively 
    developed and, while it may be ready for evaluation, nobody
    should depend on stability of interface or implementation
  * a change in the major number beyond 1 means a major change in the
    interface, possibly not backwards compatible with previous
    major releases
  * a change in the minor number means some new functionality has been
    added in a way that doesn't break the interface
  * a change in the patch level means bugs have been fixed without
    adding new functionality
  * an alpha version (X.Y.ZaN) is an advance preview of a future
    release where the interface has not yet stabilized
  * a beta version (X.Y.ZbN) is an advance preview of a future
    release where the interface has stabilized, but known bugs
    exist and are still being fixed
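
As a rough sketch of how a tool might read these guidelines, the
following classifies the step between two releases.  Both function
names are hypothetical, and versions are assumed to be already parsed
into (major, minor, patch) tuples.

```python
def is_unstable(version):
    """A major number of 0 promises nothing about interface stability."""
    return version[0] == 0


def kind_of_change(old, new):
    """Classify the step from old to new as 'major', 'minor' or 'patch'.

    Both arguments are (major, minor, patch) tuples of integers.
    """
    if new <= old:
        raise ValueError("new version must be greater than old version")
    if new[0] != old[0]:
        return "major"   # interface may have changed incompatibly
    if new[1] != old[1]:
        return "minor"   # new functionality, interface not broken
    return "patch"       # bug fixes only
```

For example, kind_of_change((1, 3, 2), (1, 4, 0)) returns "minor", so
code written against 1.3.2 could reasonably expect to keep working.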

These rules have a number of implications that hadn't occurred to me
until just now, when I wrote them down.  In particular, whole classes of
version numbers make no sense at all.  Having an alpha version of
something with major number 0 (0.1a3) is silly, because that leading
zero already implies an unstable interface.  Likewise, having an alpha
version of a release with a non-zero patchlevel (1.3.4a2) is silly for
the opposite reason; a change in the patchlevel means only bugs are
being fixed, so there's no new interface to be unstable.  This of course
directly contradicts the semantics of Python version numbers; witness
1.5.2a2.  But Guido's allowed to make up his own semantics; I'm just
suggesting rules for Python module distributions.
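
The two "silly" cases above are easy to check mechanically.  This is
only a sketch with a made-up function name; it flags alpha versions
alone, since those are the only cases argued here.

```python
def questionable(major, minor, patch, tag=None):
    """Return a list of complaints about a parsed version number.

    tag is "a" for an alpha, "b" for a beta, None for a final release.
    """
    problems = []
    if tag == "a" and major == 0:
        problems.append(
            "alpha of a 0.x release: major number 0 already implies "
            "an unstable interface")
    if tag == "a" and patch != 0:
        problems.append(
            "alpha of a patch release: a patch-level change only fixes "
            "bugs, so there is no new interface to be unstable")
    return problems
```

Under these rules questionable(1, 5, 2, "a") returns one complaint,
which is exactly the conflict with Python's own 1.5.2a2.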

Oh yeah, let me stress that: *module distributions*.  If you distribute
a bunch of modules together, they might each have a version number,
*but* there should also be a version number for the whole distribution.
(And the individual modules in the distribution could just go in
lockstep with the distribution version, if you want.)

Now I'll try to address John Skaller's points about the
multidimensionality of software versions...

Quoth John Skaller (2 December 1998):
>         No, that isn't enough. Consider:
> 
>         MacVersion
>         UnixVersion
>         NTVersion

I think that platform differences can be handled by the naming scheme
for built distributions.  For instance, version 1.3.2 of module 'foo'
might be distributed under the following names:

   foo-1.3.2.tar.gz                       # source distribution
   foo-1.3.2-sparc-sun-solaris2.tar.gz    # 'dumb' built distribution
                                          # for a certain Unix platform
   foo-1.3.2-i386-linux.rpm               # 'smart' built distribution
                                          # for one popular PC platform
   foo-1.3.2-win32.???                    # 'smart' built distribution for
                                          # some obscure PC platform

This of course assumes that the same source distribution applies to all
platforms.  This is a lot easier with modern programming languages like
Python, Java, or Perl than with crufty old beasts like C or C++.  (In
the immortal words of Larry Wall, "C isn't so much portable as it is
ubiquitous".)  But lots of C and C++ programmers go to *great* effort to
ensure that their source distribution is the same for all platforms,
even if they need to resort to conditional compilation, complicated
configuration scripts, etc. to make it so.  I think for the restricted
domain of Python modules (even those written in C or C++), we can assume
that people will write cross-platform source distributions (or at least
have them as a goal).
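
The naming scheme above amounts to joining the pieces with hyphens.  A
minimal sketch, with a hypothetical function name and the platform
strings taken on trust from whoever builds the distribution:

```python
def dist_filename(name, version, platform=None, ext="tar.gz"):
    """Build a distribution filename: name-version[-platform].ext

    platform is None for a source distribution, which is assumed to be
    the same for every platform.
    """
    parts = [name, version]
    if platform is not None:
        parts.append(platform)
    return "-".join(parts) + "." + ext
```

So dist_filename("foo", "1.3.2") gives "foo-1.3.2.tar.gz", and
dist_filename("foo", "1.3.2", "i386-linux", "rpm") gives
"foo-1.3.2-i386-linux.rpm".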

Next issue:

> Now consider:
> 
>         FreeVersion
>         Commercial Version
> 
> and then:
> 
>         BareBones version
>         Delux Version
>         Everything including the Kitchen Sink Version

I see these as basically the same: different releases of essentially
the same product but with varying degrees of functionality.  (Where
"functionality" might be defined as things like a license server for the
commercial version.  It could be argued that these *detract* from
functionality, but let's not go there...)

I'm not sure how to accommodate this one.  A hack would be to give
different names to the distributions: foo-1.3.2 for the standard free
distribution, foo-commercial-1.3.2, foo-barebones-1.3.2, foo-delux-1.3.2
etc. for varying levels of functionality (and price!).  However, again
we come back to the fact that the product being distributed is *not*
arbitrary software, but collections of Python modules together with
their documentation and test suites.  If you want different levels of
functionality, you should split up your modules accordingly and make
several distributions.  Give away foo-base-1.3.2 to anyone who wants it,
and charge extra for foo-bronze-1.3.2, foo-silver-1.3.2, and
foo-gold-1.3.2, each of which contains a couple of modules that add still
more functionality to the whole system.

I think a lot of "version" problems can be solved by carefully
organizing your modules so that blocks of functionality (barebones vs
deluxe, informix vs sybase, etc.) correspond to modules or blocks of
modules.  Hell, this is just good solid software engineering practice,
so you should be doing it that way anyways.  That answer should also
suffice where you have non-cross-platform source code (which is about
the only reason I can think of for having platform-specific source
releases).

Finally, Greg Stein raised the following wrinkle in private email:

> I wanted to briefly let you know that Apache modules are typically
> versioned as: 0.9.3-1.3.3. The first set refers to the module itself.
> The second set refers to Apache (which is currently at 1.3.3).

Good point!  I certainly hope we don't have to deal with this, i.e. have 
module developers worry whether their module will work with Python 1.5.2 
or 1.6 or 1.6.1 (etc.).

There is, however, the distinct possibility of API breakage with Python
2.0, which could affect potentially all extension modules.  Umm, I think 
I'll let Guido handle this one, if he's still reading this SIG...