[Python-Dev] Keyword meanings [was: Accept just PEP-0426]

PJ Eby pje at telecommunity.com
Fri Dec 7 23:02:26 CET 2012


On Fri, Dec 7, 2012 at 12:01 PM, Toshio Kuratomi <a.badger at gmail.com> wrote:
> On Fri, Dec 07, 2012 at 01:18:40AM -0500, PJ Eby wrote:
>> On Thu, Dec 6, 2012 at 1:49 AM, Toshio Kuratomi <a.badger at gmail.com> wrote:
>> > On Wed, Dec 05, 2012 at 07:34:41PM -0500, PJ Eby wrote:
>> >> Nobody has actually proposed a better one, outside of package renaming
>> >> -- and that example featured an author who could just as easily have
>> >> used an obsoleted-by field.
>> >>
>> > How about pexpect and pexpect-u as a better example?
>>
>> Perhaps you could explain?  I'm not familiar with those projects.
>>
>
> pexpect was last released in 2008.  Upstream went silent with unanswered
> bugs in its tracker and no mailing list.  A fork of pexpect was created that
> addressed the issue of unicode type in python2, a python3 port, and has
> slowly evolved since then.
>
> I see that the original upstream has made some commits to their source
> repository since the fork was created although there has still been no new
> release.

And what problem are you saying these fields would have solved (or
what benefits they would have provided), and for whom?

If the packages have files in conflict, they won't be both installed.
If they don't have files in conflict, there's nothing important to be
informed of.  If one is installing pexpect-u, then one does not need
to discover that it is a successor of pexpect.  If one is installing
pexpect, it might be useful to know that pexpect-u exists, but one
can't simply discover that from an Obsoletes field on pexpect-u.
However, even if one did discover it, this would merely constitute an
*advertisement* of pexpect-u's existence, not a *requirement* that it
be used in its place.  A tool cannot know, without other affirmative user
action, that it is actually a good assumption to use the advertised
replacement.

In the distro world, a user has *already* taken this affirmative
action by choosing which repository to source packages from, on an
implicit contract that this source is up to the job of managing his
needs across multiple packages.  Or, if they choose to source an
off-brand or upstream package, they are taking affirmative action to
risk it.

In the Python world, there is no notion of a "repository", aside from
a handful of managed Python distros, which have their own, distinct
packaging methods and distribution tools.  So there is no affirmative
contract of trust regarding *inter-project* relationships.

It is precisely this lack that explains why the metadata spec has gone
mostly unused since its inception about a decade ago.  Nobody really
knows what to "provide" or "require", or in what context they would
actually be "obsoleting" anything that isn't their own package, or a
package they've forked.

But if you live mainly in the distro world, this concept seems absurd,
and the fields *obviously* useful.  But that's because you're swimming
in an ocean of context that doesn't exist on dry land.  You're saying
that *of course* swimming fins are useful...  if you live in the
ocean.

And I, living on dry land, am saying that *sure* they are...  but only
in a swimming pool or a pond, and we don't have very many of those
here in dry Python-land.  And the people who run the swimming pools
have thoughtfully already provided their own.  Do we need to
standardize swim fin sizes for people who mostly live on dry land?

The flip side of this, btw, is that there's an implicit contract in
the Python world that there is generally only "the" package - not "the
package as patched and re-packaged by vendors X, Y, and Z".  If I
install python project foo, version 1.2, I expect it to be the *same*
foo-1.2, with the *same metadata*, *no matter where I got it from*.

And so, this assumption is our "air" to your "water".  We know that
pools and ponds (curated Python distros) are different, as an
exception to this rule, just as you know that reefs and islands
(uncurated repositories, search engines, and upstream-built packages)
are different, as an exception to your assumption that "the package I
get is intended to play well with everything else in my system."

(This of course is why many distro managers are suspicious of
language-specific or other sorts of vertical package management tools
- they seem as pointless as wheels in the water, solving problems you
don't have, and creating new problems for you at the same time.
Unfortunately, people on land will keep inventing them, because they
have a different set of problems to solve -- some of which are
actually created by the ocean-oriented tools.  For example, virtualenv
and its predecessors were developed to solve the "problem" of a single
integrated environment, even though that integrated environment is the
"solution" from a distro perspective.)


> *) Not all packages build on top of that system.  There are rpm
> packages provided by upstreams that users attempt (to greater and lesser
> degrees of success) to install on SuSE, RHEL, Fedora, Mandriva, etc.  There
> are debs built for Ubuntu that people attempt to install onto Debian.

Sure.  But the reference points still exist, and there is a layer of
indirection between "packager" and "developer", even in the case where
the packager and developer are the same person or organization.  In
the Python case, there is usually no such indirection, outside of
curated systems like SciPy et al.  (Even there, most of what
third-party packaging is about in the Python world is taking care of
binary builds.)

Again, it's islands in the ocean vs. pools on land.


> *) PPAs and rpmfusion may both build on top of an existing system but they
> can change the underlying structure, replacing components that other pieces
> of the base system depend on.  You talk about the setuptools and distribute
> problem on pypi.... there's absolutely nothing that prevents someone from
> building a PPA or a package in a third-party rpm repository that packages
> a setuptools that Obsoletes: distribute or a distribute package that
> Obsoletes: setuptools.

At the *same time*?  That is, are you saying that there are
repositories that contain *self-contained* "Obsoletes"-cycles?
(Presumably, there are no end-user sites containing such cycles, if
the install tool responds by refusing to install one or by removing
the other.)
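
To make the question concrete, here is a minimal sketch of how one
might check a single repository's declarations for such a cycle.  The
function name and the dict-of-sets layout are hypothetical, chosen
only for illustration; they are not the API of any existing packaging
tool.

```python
def find_obsoletes_cycle(obsoletes):
    """Return a list of package names forming an Obsoletes cycle, or None.

    `obsoletes` maps a package name to the set of package names it
    declares as obsoleted (a hypothetical, simplified view of one
    repository's metadata).
    """
    state = {}  # package -> "visiting" or "done"
    path = []   # current depth-first search path

    def visit(pkg):
        state[pkg] = "visiting"
        path.append(pkg)
        for target in sorted(obsoletes.get(pkg, ())):
            if state.get(target) == "visiting":
                # Found a back edge: the cycle runs from target to pkg.
                return path[path.index(target):] + [target]
            if target not in state:
                cycle = visit(target)
                if cycle:
                    return cycle
        path.pop()
        state[pkg] = "done"
        return None

    for pkg in sorted(obsoletes):
        if pkg not in state:
            cycle = visit(pkg)
            if cycle:
                return cycle
    return None


# A repository carrying both declarations at once is self-contradictory.
repo = {"setuptools": {"distribute"}, "distribute": {"setuptools"}}
print(find_obsoletes_cycle(repo))
```

A repository with only a one-way declaration (say, a fork obsoleting
its parent) yields None here; only the "at the same time" case is
flagged.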


> If you constantly forget why the fields are useful, then I suppose you'll
> always believe that :-)

I've stated many times that they're useful...  in the context of a
larger system.  Within the distro packaging ecosystem, a package
"conflicts", "obsoletes", or "provides" things *relative* to some
notion of an installation -- however vague -- that has been selected
by an explicit user action (such as choice of basic distro, package
manager, and repository).

So, despite their framing as binary relationships -- e.g.
Obsoletes(predecessor, successor) -- the *actual* relationship is
three-valued: Obsoletes(predecessor, successor, integration-context).
The third player in the relationship is whoever *packaged* the
project(s) in question...  and in the Python world (outside of curated
repositories), that packager is *always the original author*.

Now, in the case where the packager and author are different, we can
talk about such relationships in the same way: binary relationships
with an implied third.  For example, if SciPy decided at some point to
replace NumPy with NumPyPy, it would be more than reasonable to state
that Obsoletes(NumPy, NumPyPy, SciPy), even as at the same time,
perhaps Enthought has already tried this and decided to go the other
way, so that Obsoletes(NumPyPy, NumPy, EnthoughtPD).  They use
different tools and repositories and thus can imply the third
position.

In neither case, however, is SciPy or Enthought (nor the authors of
NumPy or NumPyPy) entitled to declare an Obsoletes relationship with
a *true* wildcard for the third position.
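
A rough sketch of the three-valued relationship, using the
hypothetical SciPy/Enthought example above.  The record type, the
lookup function, and the context names are all assumptions made for
illustration; nothing like this exists in the metadata spec.

```python
from typing import NamedTuple, Optional


class Obsoletes(NamedTuple):
    predecessor: str
    successor: str
    context: str  # the integration context (who packaged/curated it)


# Hypothetical declarations: each one is only meaningful in its context.
DECLARATIONS = [
    Obsoletes("NumPy", "NumPyPy", "SciPy"),
    Obsoletes("NumPyPy", "NumPy", "EnthoughtPD"),
]


def replacement_for(project: str, context: str,
                    declarations=DECLARATIONS) -> Optional[str]:
    """Return the successor of `project` within `context`, if one is declared."""
    for decl in declarations:
        if decl.predecessor == project and decl.context == context:
            return decl.successor
    return None


print(replacement_for("NumPy", "SciPy"))        # NumPyPy
print(replacement_for("NumPy", "EnthoughtPD"))  # None
print(replacement_for("NumPy", "PyPI"))         # None
```

The two declarations contradict each other only if the third position
is dropped; with the context included, each lookup has one answer, and
a context that has made no declaration (such as a bare index) gets
none.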

And so the key distinction between PyPI and the distro world is that
*PyPI is not an integration context*.  Packages provided by authors do
not usually include this type of metadata, unless the author of the
package has a specific integration context in mind.  So the burden
falls to either the repository manager or the user to define these
higher-level relationships *within their intended integration context*.

(Or to put it another way, *somebody* has to be the "packager", not
just the "developer".)

Currently, Python distribution tools, culture, and methodology do not
have any precedent for the metadata spec contents to be overridden by
a third-party packager, curator or repository manager, in the way that
is normal and common in the distro world.  (Try to imagine a Linux
distro where this kind of information was *always* put in "upstream",
because *there is no such thing* as "downstream".  That's what it's
like "on land".)

This is why I keep saying that blind copying is an invitation to
trouble, and that clear thinking about the actual requirements is
needed.  I would not object to explicitly three-way versions of these
fields (requires, provides, conflicts, obsoletes) that define a
specific integration context in which the statement applies.
(Although defining how to name integration contexts would present a
*new* challenge for discussion!)

Likewise, I would not object to discussion of how to manage metadata
for *repackaging* of Python projects by third-party curators (e.g.
SciPy et al), and ways to keep that separate from the author's
declarations.  Or discussion of what should constitute a "repository"
in the Python world, as opposed to what we have now (which apart from
curated distributions, consists mainly of indexes, not true
repositories in the distro sense).

Today, however, there is no separation in the metadata spec (or tools)
between "packaging" (in the sense understood by distros) and
"distributing" (in the sense normally applied to Python packages
distributed via PyPI and similar channels).

And "packaging" in the distro sense is all about *integrating*
packages, not merely making them *available* for others to integrate.
That's the critical difference between the two, and in the resulting
use cases for the metadata spec.

