[Distutils] First steps with distutils...

Mark W. Alexander mwa@gate.net
Sat Sep 9 20:25:01 2000


On Sat, 9 Sep 2000, Greg Ward wrote:
> 
> On 05 September 2000, Harry Henry Gebel said:
> > The contents of the spec file are included in the source RPM, that is why I
> > think that using sys.executable should be the option and not the default.
> 
> Dammit, I'm still waiting for a light to flash on so I can understand
> where you guys are coming from on this.  Please indulge me as I think
> out loud...

Making packages is not that hard. Making GOOD packages is very subtly
tricky. My comments below may help your understanding. They may not
help find a solution....

> Assertions (please tell me if I'm misunderstanding something):
> 
>   * this isn't really all *that* important, as the build instructions in 
>     the .spec file only apply to people building the source RPM -- IOW,
>     the build instructions in the .spec file in no way affect the
>     majority of people who use the RPM, ie. those who just install the
>     "built" RPM

Correct.

>   * in fact, the only person who *must* build the source RPM is
>     the person who creates the RPM in the first place (although
>     anyone creating binary RPMs for other distributions or other
>     architectures would probably start from the source RPM)

Not exactly. No one ever *must* build a source RPM. If you have
a .spec and a source tar/directory you can build a binary RPM
without creating a source RPM. Distribution of the binary RPM
satisfies 99.99% of the people who want to install an RPM.
Those who want to tweak how the binary is created could use
the source RPM, but even then (especially with pure Python
packages) it's not required. I've "rebuilt" RPMs by getting
a source tarball and recreating most of the .spec from an
rpm -qi query.

> I've been thinking hard about this -- hey, my brain is slow this weekend
> -- and I think I understand it a little better.
> 
> Case 1 (the status quo): put "python setup.py ..." in the build/install/
>   clean instructions in the .spec file.  This is bad when the packager,
>   P, uses a "non-standard" python (anything other than /usr/bin/python)
>   to create an RPM that is intended to go in the non-standard location.
>   This is mainly a problem when P immediately creates a built RPM from
>   his new source RPM (the usual case, in fact); if a builder, B, turns
>   source RPM -> built RPM on a separate system, then using the first
>   python in the path -- most likely /usr/bin/python -- might well be the
>   right thing to do.

Close. It really has nothing to do with the source RPM. It's just
the usual RPM way of package building. I prefer to "force" my 
rpm installs (the fake install done by rpm -b during the install
step) into a directory OTHER than where it will install on the target
machine. Benefits are 1) I can create a binary RPM on a system where
I don't have root access (as if I'd work there...), and 2) you
don't step on your production files until you are satisfied with
the RPM. This has the added advantage that you can now make the
RPM relocatable with a little more effort in the .spec file.

To make the package relocatable, you map the library path
where you want it to go (presumably the site-packages directory
of the default python, but it doesn't have to be) to the location
where you just did the fake RPM install. RPM then works its
magic by creating the package from the files in the fake tree
so that they will be installed by default into the real
site-packages directory. Since the package is relocatable now,
the installer could override it to go anywhere. This is really
cool, in that now the installer doesn't have to be root. 
Instead, they can relocate the package anywhere they can
write and just add that path to their PYTHONPATH.
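
The path bookkeeping behind that can be sketched in a few lines of
Python. This is only an illustration of the idea, not what RPM or
bdist_rpm actually runs; the function names and every path below are
hypothetical.

```python
import posixpath


def packaged_path(staged_file, build_root):
    """Strip the fake-install root to get the path stored in the package."""
    rel = posixpath.relpath(staged_file, build_root)
    if rel == ".." or rel.startswith("../"):
        raise ValueError("%s is not under %s" % (staged_file, build_root))
    return "/" + rel


def relocated_path(path, default_prefix, new_prefix):
    """Swap the package's default prefix for one chosen by the installer."""
    rel = posixpath.relpath(path, default_prefix)
    if rel == ".." or rel.startswith("../"):
        raise ValueError("%s is not under %s" % (path, default_prefix))
    return posixpath.join(new_prefix, rel)


# Hypothetical locations, for illustration only.
staged = "/tmp/fake-root/usr/lib/python1.5/site-packages/foo.py"
default = packaged_path(staged, "/tmp/fake-root")
moved = relocated_path(default,
                       "/usr/lib/python1.5/site-packages",
                       "/home/user/pylib")
print(default)
print(moved)
```

A non-root installer who relocated the package this way would then
add the new directory (here /home/user/pylib) to PYTHONPATH.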

> Case 2: put sys.executable + " setup.py ..." in the .spec file.
>   This fixes the above problem, but is bad in the case where P
>   accidentally uses a non-standard python to create an RPM that
>   is supposed to go to the standard python installation (/usr).
>   Eg. if I forget that /usr/local/bin/python is first in my path,
>   then any source RPMs I create will refer to /usr/local/bin/python
>   in the .spec file, and building those source RPMs will either
>   fail (on systems that don't have /usr/local/bin/python, probably
>   the vast majority of installed Linux boxen out there) or will
>   generate an RPM with the "non-standard" destination of /usr/local.

There's a little more confusion here, and I'm not exactly sure
how it works with bdist_rpm. RPM does some additional hacking
beyond what's in the .spec file. It looks especially for
#! magic in package files but also for referenced libraries
and anything else it thinks the package is going to need. 
RPM then (theoretically, in a helpful fashion) adds these
things to the "Requires:" tag in the binary package. So,
if it identifies /usr/local/bin/python as a requirement for
the package, it will not install (without --force) on a 
target system that does not have /usr/local/bin/python.
(Yes, I have done this...the original pygresql package
went out this way...sigh). I don't think RPM deduces
requires from processors used in the setup, config, build, 
install steps, but if the packager is using a /usr/local/bin
python, it's very likely to be referenced somewhere in the
package, too.
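
To see how a stray #! line turns into a requirement, here is a rough
Python imitation of that scan. It is an assumption-laden sketch: RPM's
real dependency finder does much more (libraries, for one), and the
helper names and the demo tree are made up for illustration.

```python
import os
import tempfile


def shebang_interpreter(path):
    """Return the interpreter named on a file's #! line, or None."""
    with open(path, "rb") as f:
        first = f.readline()
    if not first.startswith(b"#!"):
        return None
    parts = first[2:].decode("utf-8", "replace").split()
    return parts[0] if parts else None


def implied_requires(root):
    """Collect every distinct #! interpreter found under root."""
    found = set()
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            interp = shebang_interpreter(os.path.join(dirpath, name))
            if interp:
                found.add(interp)
    return sorted(found)


# Demo on a throwaway tree: one script with a #! line, one plain module.
with tempfile.TemporaryDirectory() as root:
    with open(os.path.join(root, "tool.py"), "w") as f:
        f.write("#!/usr/local/bin/python\nprint('hi')\n")
    with open(os.path.join(root, "module.py"), "w") as f:
        f.write("x = 1\n")
    requires = implied_requires(root)

print(requires)
```

Running something like this over the fake install tree before
packaging would flag a /usr/local/bin/python that slipped in.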

> In the latter case -- accidentally using the wrong python -- it would be
> best if the Distutil issued a warning.  I don't see an easy way to
> detect this, especially if someone is deliberately creating an RPM to
> install modules to a non-standard location.  (Eg. Andrew's case of
> having python in /www/python/bin/python: he might want to make RPMs of
> common modules to install on all the web developers' workstations in
> /www/python.)

This is why I'm struggling with pkgtool (and even worse, sdux).
If the bdist_whatevers make the package relocatable, it doesn't
matter what python was used in the build (location-wise anyway).
Then the issue shifts to installation time. How do you determine
the TARGET system's "preferred" python? This could be retrieved
either by executing python or querying RPM in a preinstall step,
then automatically relocating the package where it should go.
Except you also have to be prepared to skip the auto-relocation
if the installer is relocating it with RPM options.
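
The "executing python" variant might look roughly like the sketch
below. It assumes a present-day Python on the target (the sysconfig
module did not exist when this was written), and the function names
and the example prefix are hypothetical. Querying RPM instead is not
shown.

```python
import subprocess
import sys


def site_packages_of(python_exe):
    """Ask the given interpreter where its pure-Python modules go."""
    code = "import sysconfig; print(sysconfig.get_path('purelib'))"
    out = subprocess.check_output([python_exe, "-c", code])
    return out.decode().strip()


def choose_install_dir(python_exe, installer_prefix=None):
    """Honor an explicit relocation; otherwise auto-relocate."""
    if installer_prefix:
        # The installer already relocated the package: skip the query.
        return installer_prefix
    return site_packages_of(python_exe)


# sys.executable stands in for whatever python the target prefers.
auto_dir = choose_install_dir(sys.executable)
forced_dir = choose_install_dir(sys.executable, "/home/user/pylib")
print(auto_dir)
print(forced_dir)
```

The early return is the "skip the auto-relocation" case: an explicit
choice by the installer always wins over what the interpreter reports.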

> Both situations are subtle errors on the packager's part, and neither
> seem to have obvious automatic solutions.  The "fix" is social
> engineering: let the packager decide what he wants to do with options to 
> the bdist_rpm command, and make sure the rationale for each option is
> carefully documented.

Like I said...Packaging is easy. Making easy packages is hard. I'd
vote for making bdist_rpm make things relocatable with a default
pre-install to do auto-relocation IF I still had any sanity from
actually trying to do this with other package types. Maybe I'm
too anal about my packages, but good ones make it sooooo much
easier to administer large numbers of machines.

> The question is: which failure is more obvious and happens closest to
> the packager, the one who made the mistake (failed to RTFM, etc.)?  That
> one should be the default.  It seems to me that creating a source RPM
> (spec file) that immediately fails because of an explicit
> /path/to/python is better than one that works, but possibly wrongly,
> because of implicitly using the first python on the path.

Agreed, except that the source RPM is still only going to be
used by people who are already prepared to get in and hack at it
anyway.
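
For reference, the two defaults being weighed in Case 1 and Case 2
boil down to something like this. It is a sketch of the choice only,
not the actual bdist_rpm code; the function name and its flag are
made up.

```python
import sys


def spec_build_command(use_sys_executable, script="setup.py"):
    """Build the command line that would be written into the .spec file."""
    # Case 2 pins the interpreter that is running right now;
    # Case 1 leaves it to whatever "python" is first on the builder's PATH.
    python = sys.executable if use_sys_executable else "python"
    return "%s %s build" % (python, script)


implicit = spec_build_command(False)
explicit = spec_build_command(True)
print(implicit)
print(explicit)
```

The explicit form fails loudly on a machine without that exact path;
the implicit form keeps working but may pick a different python.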

> IOW, "Explicit is better than implicit", even in snippets of shell code
> included in a .spec file bundled in a source RPM.

Hey, you're not trying to suggest that maybe, possibly, ONE thing
from bdist_pkgtool was almost, kinda, maybe on the right track ;-)

Anyway, I said this probably wouldn't be a solution, but I hope I
clarified the problem a little....


mwa