[Distutils] recollections of Pycon distutils versioning discussion (part 2)

Ben Finney ben+python at benfinney.id.au
Sun Jun 14 01:55:18 CEST 2009


Paul Moore <p.f.moore at gmail.com> writes:

> Here's an alternative suggestion:
> 
> * Versions are treated as dot-separated tuples
> * Comparison is component-by-component, exactly as Python tuples
> compare

Agreed so far (unsurprisingly, because so far it matches the algorithm I
outlined).
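
For concreteness, here is a minimal sketch of that high-level rule, assuming every component is a plain integer. The name parse_version is only illustrative; it is not taken from distutils or from any existing proposal.

```python
def parse_version(text):
    """Split a dotted version string into a tuple of integers."""
    return tuple(int(part) for part in text.split("."))


# Tuples compare element by element, left to right, so components
# are compared as numbers rather than as text.
assert parse_version("1.10") > parse_version("1.9")
assert parse_version("1.2.10") > parse_version("1.2.9")

# A shorter tuple that is a prefix of a longer one sorts first.
assert parse_version("1.2") < parse_version("1.2.0")
```

Everything beyond this point is about what to do when a component is not a plain integer.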

> * Components must have the form [a-z]*[0-9]+([a-z][0-9]+)? (ie,
> optional leading alphas, an integer, and an optional "letter-integer"
> suffix)
> * Call the 3 parts "prefix" ([a-z]*), "number" ([0-9]+), "suffix" ([a-z][0-9]+)
> * Components compare as follows:
>   - Components with differing prefixes are incomparable[1]. Otherwise,
> ignore the prefix.
>   - Within this, sort by the number part (as a number, not as text)
>   - Within this, components with a suffix sort BEFORE those without,
> in the obvious letter-then-number order.
> 
> That's a little messy

More than a little. I wouldn't expect people to keep that in their
heads; they would need to look at the specification frequently or,
worse, guess and often get it wrong.
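
To make the number of moving parts visible, here is a rough sketch of the quoted component rules as I read them, in Python 3 syntax. The function names, and the choice to raise ValueError for "incomparable", are my own assumptions; the proposal does not say what should happen in that case.

```python
import re

# Pattern quoted from the proposal: prefix, number, optional suffix.
COMPONENT = re.compile(r"([a-z]*)([0-9]+)(?:([a-z])([0-9]+))?")


def parse_component(text):
    """Return (prefix, number, suffix); suffix is None when absent."""
    match = COMPONENT.fullmatch(text)
    if match is None:
        raise ValueError("not a valid component: %r" % text)
    prefix, number, letter, suffix_number = match.groups()
    suffix = None if letter is None else (letter, int(suffix_number))
    return prefix, int(number), suffix


def compare_components(left, right):
    """Return -1, 0 or 1; raise ValueError if the prefixes differ."""
    l_prefix, l_number, l_suffix = parse_component(left)
    r_prefix, r_number, r_suffix = parse_component(right)
    if l_prefix != r_prefix:
        # Assumption: "incomparable" is modelled as an error.
        raise ValueError("incomparable: %r and %r" % (left, right))
    # The "has no suffix" flag is False for suffixed components, and
    # False sorts before True, so suffixed components come first.
    l_key = (l_number, l_suffix is None, l_suffix or ())
    r_key = (r_number, r_suffix is None, r_suffix or ())
    return (l_key > r_key) - (l_key < r_key)


assert compare_components("10", "9") == 1       # numeric, not textual
assert compare_components("3a1", "3") == -1     # suffix sorts before none
assert compare_components("3a2", "3a10") == -1  # suffix number is numeric
```

That is three sub-fields and three different ordering rules inside what is supposed to be a single component.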

> but I think it follows people's intuition,

I think it's far from obvious that this represents an intuitive
comparison scheme. As far as I can see, it's yet another set of special
cases for certain tokens, which leads us back to the earlier point: as
soon as we get into those, there's far less consensus about how they
should work.

> allows for most of the variations people want, and most importantly
> (to my mind) isolates the complexity to how *components* sort against
> each other (the high-level rule is "like tuples", which is simple).

Yes, I've no disagreement about version strings being sorted like tuples
at the component level. I don't see how that distinguishes this
suggestion from the others, though.

> [1] Note that I see the "prefix" as cosmetic. I would expect real
> projects to use a fixed prefix on a component-by-component basis -
> 1.2.r34567 or 1.2.dev5 or whatever, but never a mix of 1.2.3,
> 1.2.r1234 and 1.2.dev5.

Why not? We've already had people talking about a mix of ‘a123’,
‘post123’, ‘b123’, ‘r123’, ‘dev123’, etc. I think any version comparison
scheme needs to allow a definite statement to be made about the
sequencing of *any* possible version strings, with only the answers
{equal, less-than, greater-than} possible.

> Hence, I have said that mixed prefixes are incomparable. If this
> causes an outcry,

The cry is a simple “What am I supposed to do with two version strings
that are incomparable?”
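
To illustrate the practical problem, here is a small sketch in Python 3 syntax. The Component class is hypothetical and models only the prefix and the number; it returns NotImplemented when the prefixes differ, which is one plausible way to express "incomparable".

```python
class Component:
    """Hypothetical component that refuses to compare across prefixes."""

    def __init__(self, prefix, number):
        self.prefix = prefix
        self.number = number

    def __lt__(self, other):
        if self.prefix != other.prefix:
            return NotImplemented  # "incomparable"
        return self.number < other.number


# Same prefix: sorting works as expected.
same = sorted([Component("r", 20), Component("r", 3)])
assert [c.number for c in same] == [3, 20]

# Mixed prefixes: there is no answer, so sorting cannot proceed.
try:
    sorted([Component("r", 1234), Component("dev", 5)])
except TypeError:
    outcome = "TypeError"
else:
    outcome = "sorted"
assert outcome == "TypeError"
```

Any tool that needs to pick the newest of several releases has to decide what to do at that point, and the scheme gives it no guidance.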

> the following rule could be used instead:
> 
>   - Components with a prefix sort before components without, in
> alphabetic order of prefix
> 
> but in my view it adds unnecessary complexity (and hence I'd like to
> see real-world, justified use cases).

Actually, that seems *simpler*. It's such a simplification, in fact,
that it makes this rule redundant: it's covered already by the existing
rules (AFAICT).
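
As a sketch of why the replacement rule is easier to work with: under one possible reading of it, every component maps to a plain tuple key and no special "incomparable" case is left. The names component_key and version_key are illustrative, and sorting by prefix before number is my assumption about how the pieces combine.

```python
import re

# Same component pattern as in the quoted proposal.
COMPONENT = re.compile(r"([a-z]*)([0-9]+)(?:([a-z])([0-9]+))?")


def component_key(text):
    """Map one component to a tuple that always compares cleanly."""
    match = COMPONENT.fullmatch(text)
    if match is None:
        raise ValueError("not a valid component: %r" % text)
    prefix, number, letter, suffix_number = match.groups()
    has_no_suffix = letter is None
    suffix = () if has_no_suffix else (letter, int(suffix_number))
    # prefix == "" is False for prefixed components, so they sort first,
    # then alphabetically by prefix, then by number, then by suffix.
    return (prefix == "", prefix, int(number), has_no_suffix, suffix)


def version_key(text):
    """Key for a whole dotted version string."""
    return tuple(component_key(part) for part in text.split("."))


versions = ["1.2.3", "1.2.r1234", "1.2.dev5", "1.2.3a1"]
ordered = sorted(versions, key=version_key)
assert ordered == ["1.2.dev5", "1.2.r1234", "1.2.3a1", "1.2.3"]
```

Whether that particular order is the one people want is a separate question, but at least any two version strings now have a definite sequence.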


You've essentially got components within components, but some components
sort differently from others by non-obvious rules, and worst of all some
components “are incomparable”; what does *that* mean when I need to
compare them for sequence?

I think “can keep the whole specification in one's head easily” is an
important criterion for any version comparison scheme that we promote
for the standard library. People should be able to learn it once, then
be able to look at any two version strings in the future and quickly
know what sequence Python will put them in, without going back to the
specification again for special cases and differing comparison rules.

-- 
 \             “I have never imputed to Nature a purpose or a goal, or |
  `\    anything that could be understood as anthropomorphic.” —Albert |
_o__)                                    Einstein, unsent letter, 1955 |
Ben Finney


