[Distutils] setuptools in a cross-compilation packaging environment

M.-A. Lemburg mal at egenix.com
Fri Oct 7 14:01:25 CEST 2005


Phillip J. Eby wrote:
> At 10:27 AM 10/5/2005 +0200, M.-A. Lemburg wrote:
> 
>>[Some comments on your strategy...]
>>
>>Phillip J. Eby wrote:
>>
>>>>The new setuptools is all nice and easy for end user, but as a package
>>>>maintainer, I'd like to have the option of building a binary package
>>>>without all the dependencies.
>>>
>>>In the long run, this should be done by packaging the result of bdist_egg,
>>>and by default doing bdist_rpm will do this now.  In the short term, unless
>>>you're switching to an all-egg distribution, you'll probably want to use
>>>legacy/unmanaged mode.
>>
>>I think you are missing his point here:
>>
>>As package maintainer you *have* to be able to build a distribution
>>package without all the dependency checks being applied - how else
>>would you be able to bootstrap the package in case you have circular
>>dependencies ?
> 
> In legacy/unmanaged mode, setuptools' "install" command behaves the way the 
> standard distutils "install" does today, without creating an egg or 
> searching for dependencies.

Sorry, maybe I wasn't clear: a package builder needs
to *build* a package (rpm, egg, .tar.gz drop-in-place
archive, etc.) without the dependency checks.

Users also often need an option to turn off the dependency
checks when installing an egg. With rpm this is frequently
necessary when you want to install packages in a different
order, in automated installs, or because different packages
name their dependencies differently. I guess eggs will
exhibit the same problems over time.

>>I don't think that eggs are the solution to everything, so
>>you should at least extend the dependency checking code to
>>have it detect already installed packages (by trying import
>>and looking at __version__ strings) or having an option
>>to tell the system: "this dependency is satisfied, trust me".
> 
> There are plans to have a feature like that, and in fact setuptools already 
> has code to hunt down __version__ strings and the like, without even 
> needing to import the packages.  It isn't integrated with the rest of the 
> system yet, though.
> 
> One reason for that is that early feedback suggests that package developers 
> and users would rather have the assurance of having the exact version 
> required by something, as long as the installation process doesn't impose 
> any additional burden on them.  Local detection hacks have been primarily 
> requested by packagers, who (quite reasonably) do not want to have to 
> repackage everything as eggs.
> 
> There is a simple trick that packagers can use to make their legacy 
> packages work as eggs: build .egg-info directories for them in the sys.path 
> directory where the package resides, so that the necessary metadata is 
> present.  This does not require the use of .pth files, but it does slow 
> down the process of package discovery for things that do use pkg_resources 
> to locate their dependencies.  It also still requires them to repackage 
> existing packages, but doesn't require changing the layout. 

Where would you have to put these directories and what
do they contain ?

> Also, such 
> packages will currently cause easy_install to warn about conflicting 
> packages if you try to install a different version of the same package, but 
> this will be alleviated soon, as I'm working on a better conflict 
> management mechanism that will allow egg directories on PYTHONPATH to 
> override things in the standard directories.  (Currently, eggs are only 
> ever added to the end of sys.path, so if the local packaging system puts 
> .egg-info directories in site-packages, there would be no way to locally 
> override that for an individual user's packages.  A future version of 
> setuptools will resolve that issue soon, hopefully in the next few weeks.)

I must admit that I haven't followed the discussions about
these .egg-info directories. Is there a good reason not to
use the already existing PKG-INFO files that distutils builds
and which are used by PyPI (aka cheeseshop) ?

> As for eggs being the "solution to everything", I would like to point out 
> that what precisely constitutes an egg is an extensible concept.  See e.g.:
> 
>      http://mail.python.org/pipermail/distutils-sig/2005-June/004652.html
> 
> which shows that there are actually three formats that are "eggs" at the 
> moment:
> 
>   1. .egg zipfiles
>   2. .egg directories
>   3. .egg-info marker directories
> 
> The key requirements for a format to be a pluggable distribution or "egg" are:
> 
>   * Adding it to sys.path must make it importable
>   * It must be possible to discover its PyPI project name (and preferably 
> version and platform) from the filename
>   * It must allow arbitrary data files and directories to be included 
> within packages, and allow arbitrary metadata files and directories to be 
> included for the project as a whole
>   * It must include the standard PKG-INFO metadata
> 
> These are the absolute minimums, but there are additional specific metadata 
> files and directories that easy_install requires in order to detect 
> possible conflicts, create scripts, etc.
> 
> Anyway, the point is that what constitutes an "egg" is flexible, but the 
> "add to sys.path and make it importable" requirement certainly limits what 
> formats are practically meaningful.  Nonetheless, further extensibility is 
> certainly possible if there's need.

Hmm, you seem to be making things unnecessarily complicated.

Why not just rely on the import mechanism and put all
eggs into a common package, e.g. pythoneggs ?!

Your EasyInstall script could then modify a file in that
package called e.g. database.py which includes all the
necessary information about all the installed packages
in the form of a dictionary.

This would have the great advantage of allowing introspection
without too much fuss, and it reduces the need to search paths,
directories and so on, which causes a lot of I/O overhead
and slows down startup for applications that need
to check dependency requirements a lot.
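
As a rough sketch of the idea, assuming a layout that does not exist
anywhere yet: the module name database.py is from the proposal above,
while the INSTALLED dictionary, its keys and the is_installed() helper
are made up for illustration.

```python
# Hypothetical contents of pythoneggs/database.py, as an installer
# might rewrite it after each install or uninstall.
INSTALLED = {
    'egenix-mxodbc-zopeda': {
        'version': '1.0.9',
        'platform': 'linux-i686',
        'requires': [],   # illustrative; no real dependency data
    },
}


def is_installed(name, version=None, database=INSTALLED):
    """Answer a dependency question with a single dictionary lookup."""
    entry = database.get(name)
    if entry is None:
        return False
    return version is None or entry['version'] == version
```

A check like is_installed('egenix-mxodbc-zopeda', '1.0.9') then costs
one import of the database module instead of a scan over sys.path.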

>>Please make sure that your eggs catch all possible
>>Python binary build dimensions:
>>
>>* Python version
>>* Python Unicode variant (UCS2, UCS4)
>>* OS name
>>* OS version
>>* Platform architecture (e.g. 32-bit vs. 64-bit)
> 
> 
> As far as I know, all of this except the Unicode variant is captured in 
> distutils' get_platform().  And if it's not, it should be, since it affects 
> any other kind of bdist mechanism.

Agreed.

So you use get_platform() for the egg names ?
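
For reference, a small sketch of what such a name component could look
like. The function in this thread is distutils.util.get_platform(); the
sketch uses sysconfig.get_platform() from the standard library, which
returns a comparable string on current Pythons. The platform_tag() name
and the 'pyX.Y-platform' layout are assumptions for illustration, not a
statement of what setuptools does.

```python
import sys
import sysconfig


def platform_tag():
    """Combine Python version and platform string into one tag."""
    # sysconfig.get_platform() yields strings such as 'linux-x86_64'.
    platform = sysconfig.get_platform()
    py = 'py%d.%d' % sys.version_info[:2]
    return '%s-%s' % (py, platform)
```

Note that nothing in this tag distinguishes Unicode variants, which is
the dimension missing from the list above.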

>>and please also make this scheme extendable, so that
>>it is easy to add more dimensions should they become
>>necessary in the future.
> 
> 
> It's extensible by changing the get_platform() and compatible_platforms() 
> functions in pkg_resources.

Ah, that's monkey patching. Isn't there some better way ?

> By the way, I've issued requests on this list at least twice over the past 
> year for people to provide input about how the platform strings should 
> work; I got no response to either call, so I gave up.  Later, when an OS X 
> upgrade created a compatibility problem, somebody finally chipped in with 
> info about what good OS X platform strings might be.  I suspect that 
> basically we'll get good platform strings once there are enough people 
> encountering problems with the current ones to suggest a better scheme.  :(
> 
> If you have suggestions, please make them known, and let's get them into 
> the distutils in general, not just our own offshoots thereof.

This is what we use:

import sys

def py_version(unicode_aware=1, include_patchlevel=0):

    """ Return the Python version as a short string.

        If unicode_aware is true (default), the function also tests
        whether a UCS2 or UCS4 build is running and modifies the
        version accordingly.

        If include_patchlevel is true (default is false), the patch
        level is also included in the version string.

    """
    if include_patchlevel:
        version = sys.version[:5]
    else:
        version = sys.version[:3]
    if unicode_aware and version > '2.0':
        # UCS4 builds were introduced in Python 2.1; Note: RPM doesn't
        # like hyphens to be used in the Python version string which is
        # why we append the UCS information using an underscore.
        try:
            unichr(100000)
        except ValueError:
            # UCS2 build (standard)
            version = version + '_ucs2'
        else:
            # UCS4 build (most recent Linux distros)
            version = version + '_ucs4'
    return version

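and then patch the various commands in distutils, e.g.:

import os
from distutils.command.build import build
from distutils.command.bdist import bdist
from distutils.util import get_platform

class mx_build(build):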

    """ build command which knows about our distutils extensions.

        This build command builds extensions in properly separated
        directories (which includes building different Unicode
        variants in different directories).

    """

    ...

    def finalize_options(self):

        # Make sure different Python versions are built in separate
        # directories
        python_platform = '.%s-%s' % (get_platform(), py_version())
        if self.build_platlib is None:
            self.build_platlib = os.path.join(self.build_base,
                                              'lib' + python_platform)
        if self.build_temp is None:
            self.build_temp = os.path.join(self.build_base,
                                           'temp' + python_platform)

        # Call the base method
        build.finalize_options(self)


class mx_bdist(bdist):

    """ Generic binary distribution command.

    """

    def finalize_options(self):

        # Default to <platform>-<pyversion> on all platforms
        if self.plat_name is None:
            self.plat_name = '%s-py%s' % (get_platform(), py_version())
        bdist.finalize_options(self)


The result is a build system that can build all binaries
for a single platform without conflicts, and binaries
whose file names include a proper platform string, e.g.

egenix-mxodbc-zopeda-1.0.9.darwin-8.2.0-Power_Macintosh-py2.3_ucs2.zip
egenix-mxodbc-zopeda-1.0.9.linux-i686-py2.3_ucs2.zip
egenix-mxodbc-zopeda-1.0.9.linux-i686-py2.3_ucs4.zip
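
The py_version() helper above slices sys.version and probes unichr(),
which ties it to Python 2 and to single-digit minor versions
(sys.version[:3] gives '3.1' on Python 3.10). A minimal sketch of the
same idea built on sys.version_info and sys.maxunicode follows; the name
py_version_tag is hypothetical and this is not the code we ship.

```python
import sys


def py_version_tag(unicode_aware=True, include_patchlevel=False):
    """Variant of py_version() that avoids string slicing."""
    if include_patchlevel:
        version = '%d.%d.%d' % sys.version_info[:3]
    else:
        version = '%d.%d' % sys.version_info[:2]
    if unicode_aware:
        # sys.maxunicode is 0xFFFF on narrow (UCS2) builds and
        # 0x10FFFF on wide (UCS4) builds; Python 3.3 and later
        # always report 0x10FFFF.
        if sys.maxunicode > 0xFFFF:
            version += '_ucs4'
        else:
            version += '_ucs2'
    return version
```

The underscore before the UCS marker is kept for the same reason as in
the original: RPM does not like hyphens in the version string.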

>>To make things easier for the user, the install system
>>should be capable of detecting all these dimensions
>>and use appropriate defaults when looking for an egg.
> 
> 
> That's done for those dimensions currently handled by get_platform(), and 
> can be changed by changes to get_platform() and compatible_platforms() in 
> pkg_resources.
> 
> 
> 
>>Please reconsider your use of .pth files - these cause the
>>Python interpreter startup time to increase significantly.
>>If you just have one of those files pointing to your
>>managed installation path used for eggs, that should
>>be fine (although adding that path to PYTHONPATH still
>>beats having a .pth to parse every time the interpreter
>>fires up).
> 
> 
> EasyInstall uses at most one .pth file, to allow packages to be on the path 
> at runtime without needing an explicit 'require()'.  However, a vendor 
> creating packages probably doesn't want to have to edit that .pth file, so 
> a trivial alternative is to install a .pth for each package.  The tradeoff 
> is startup time versus packager convenience in that case.  Having a tool to 
> edit a single .pth file would be good, but not all packaging systems have 
> the ability to run a program at install or uninstall time.  If they do, 
> then editing easy-install.pth to add or remove eggs is a better option.
> 
> Eggs can of course be installed in multi-version mode, in which case no 
> .pth is necessary, but then an explicit require() or a dependency 
> declaration in a setup script is necessary in order to use the package.
> 
> 
> 
>>If you however install a .pth file for every
>>egg, you'll soon end up with an unreasonable startup time
>>which slows down your whole Python installation - including
>>applications that don't use setuptools or any of the eggs.
> 
> 
> A single .pth file is certainly an option, and it's what easy_install 
> itself uses.

Fair enough.

Could this be enforced and maybe also removed
completely by telling people to add the egg directory to
PYTHONPATH ?

Note that the pythoneggs package approach would pretty much
remove the need for these .pth files.
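
To see how much a given installation pays for this, a small sketch that
lists what the .pth files in a directory contribute. The site module
reads every .pth file in the site directories at interpreter startup;
the pth_entries() helper below is hypothetical and only mimics the
line-skipping rules, it does not touch sys.path.

```python
import os


def pth_entries(site_dir):
    """Map each .pth file in site_dir to the path lines it contains."""
    result = {}
    for name in sorted(os.listdir(site_dir)):
        if not name.endswith('.pth'):
            continue
        with open(os.path.join(site_dir, name)) as f:
            lines = [line.strip() for line in f]
        # Like site, ignore blank lines and comments. Lines starting
        # with "import" are executed by site rather than added to
        # sys.path, so they are left out here as well.
        result[name] = [
            line for line in lines
            if line and not line.startswith(('#', 'import ', 'import\t'))
        ]
    return result
```

Running it on the directory returned by sysconfig.get_paths()['purelib']
shows how many files and entries are parsed on every startup.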

-- 
Marc-Andre Lemburg
eGenix.com
