[Distutils] A Modest Proposal for "A Database of Installed Packages"
Phillip J. Eby
pje at telecommunity.com
Sun Apr 6 01:50:19 CEST 2008
At 10:07 PM 4/5/2008 +0100, Floris Bruynooghe wrote:
>This proposal has been here about a week now, with no comments on it.
>I take that as positive as no one has had major objections. :-)
It's more that there are some holes and handwaving; I haven't really
had the mental bandwidth to offer comments on the original proposal as yet.
(One comment, though: I really don't like the idea of extending
PKG-INFO to include installation data; it's only incidentally related
and there are other contexts in which we use PKG-INFO where having
that data included would make no sense. Plus, it's really not an
ideal file format for including data about a potentially rather large
number of files.)
>Secondly I'm not sure how
>useful it is for the version number to be encoded in the filename.
It's very useful for setuptools, as it avoids the need to open and
parse the file when searching for a suitable version of a desired package.
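
To illustrate the point about not needing to open the file, here is a minimal sketch that recovers a project name and version from the filename alone. The helper names (parse_egg_info_name, find_versions) are hypothetical, and the parsing assumes the simple "Name-Version[-pyX.Y].egg-info" layout with no hyphens inside the name or version; it is not setuptools' actual implementation.

```python
import os

def parse_egg_info_name(filename):
    """Split 'Project-1.0[-pyX.Y].egg-info' into (project, version).

    Returns None when the filename does not look like an .egg-info entry.
    Assumes hyphens appear only as field separators.
    """
    base, ext = os.path.splitext(filename)
    if ext != ".egg-info":
        return None
    parts = base.split("-")
    if len(parts) < 2:
        return None
    return parts[0], parts[1]

def find_versions(filenames, project):
    """Collect the versions of `project` from directory entries, by name only."""
    found = []
    for filename in filenames:
        parsed = parse_egg_info_name(filename)
        if parsed and parsed[0].lower() == project.lower():
            found.append(parsed[1])
    return found
```

A directory listing (for example from os.listdir on site-packages) can be passed straight to find_versions, so no metadata file is read until a suitable candidate has been picked.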
>It seems the .egg-info file does get installed in the site-packages
>root currently. This will likely give conflicts when we're starting
>to use namespace packages.
This doesn't make sense. Namespace packages and project names are
not in the same namespace and have nothing to do with each
other. For example, I have a project called DecoratorTools that
installs a module in the peak.util namespace package. Its egg-info
would be something like DecoratorTools-1.6.egg-info. So I think you
are confused about something here.
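
A small sketch of the distinction, using the DecoratorTools example: the metadata filename is derived from the project name and version only, while the installed files follow the package path. The record layout and the module name peak.util.decorators are assumptions made for illustration.

```python
# Hypothetical install record: the project name keys the metadata file,
# while installed modules live wherever their dotted package path puts them.
installed = {
    "DecoratorTools": {
        "version": "1.6",
        "modules": ["peak.util.decorators"],  # assumed module name
    },
}

def egg_info_filename(project, version):
    """Metadata filename: depends only on project name and version."""
    return "%s-%s.egg-info" % (project, version)

def module_path(dotted_name):
    """Path of a module relative to the install directory."""
    return dotted_name.replace(".", "/") + ".py"

record = installed["DecoratorTools"]
metadata_file = egg_info_filename("DecoratorTools", record["version"])
module_files = [module_path(name) for name in record["modules"]]
```

Nothing in metadata_file mentions "peak" or "util", which is why a namespace package cannot collide with it.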
> We can't put the .pyi *in* the package
>since then we lose support for simple modules, so we have to place it
>*next* to the package.
No, it just goes to the --install-lib directory, which in the default
case is site-packages. (But could be a PYTHONPATH or other directory.)
> So if "bar" is a namespace package inside
>"foo" then we would have:
>
>site-packages/foo/bar.pyi
>site-packages/foo/bar/__init__.py
Ah, I see... you are definitely confusing package names and project names.
>This means any package tool will need to recursively scan the
>site-packages directory to find the files, but that doesn't seem like
>too much of a penalty? The alternative is to have a separate directory
>for the installdb files:
>
>site-packages/foo/bar/__init__.py
>site-packages/install.db/foo/bar.pyi
>
>This could significantly reduce the scanning time since there are far
>fewer files to walk. I chose a name with a "." for install.db so
>we're not stealing a possible module or package name. Other than that
>the name of the directory can be anything we manage to agree on. :-)
>Using this approach might create confusion about relative paths
>mentioned in .pyi files though (is the root the current directory or
>do we pretend the .pyi was actually next to the package/module?).
>
>Distributions not providing a package/module or with a different
>distribution name than the package(s)/module(s) provided would end up
>in the top-level of the database (in both scenarios), effectively
>stealing package/module names but that seems to be the current
>behaviour of distutils already anyway. Namespace sub-distributions
>(bar in the example above) with a distribution name different from the
>package/module name would steal names from its namespace.
All of this is moot, since project/distribution names are unrelated
to package names.
>Namespace packages are not fully handled yet, ...
>
>AFAIK this should cover namespace packages.
Unfortunately, this doesn't fix the problem, since either *some*
package has to own the __init__.py, or there has to be a way for
Python to treat the directory as a package without one. And for
system package managers (esp. on Linux), some *one* system package
must own the file - it can't be owned by multiple system packages.
My guess is that this is true, *even if* the file is automatically
generated. Some system packaging folks will need to chime in here.
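
The ownership problem can be sketched as a check over per-package file manifests: any path claimed by more than one package is a conflict. The manifest format and the package names below are hypothetical, and real system package managers have their own rules for this.

```python
from collections import defaultdict

def find_shared_files(manifests):
    """Return {path: [owners]} for every path claimed by more than one package.

    `manifests` maps a package name to the list of files it installs.
    """
    owners = defaultdict(list)
    for package, files in manifests.items():
        for path in files:
            owners[path].append(package)
    return {path: sorted(pkgs) for path, pkgs in owners.items() if len(pkgs) > 1}

# Two hypothetical system packages that both ship the namespace __init__.py.
manifests = {
    "python-foo-bar": ["site-packages/foo/__init__.py",
                       "site-packages/foo/bar/__init__.py"],
    "python-foo-baz": ["site-packages/foo/__init__.py",
                       "site-packages/foo/baz/__init__.py"],
}
conflicts = find_shared_files(manifests)
```

Here the only conflict is site-packages/foo/__init__.py, which is exactly the file that some single package would have to own.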
>Lastly --and I'm not sure how happy I am about this, should have
>thought of this earlier-- the python packaging tools need to support
>giving away ownership at install time! Since Debian and Redhat etc
>just call setup.py that would mean each package they install would be
>owned by distutils/setuptools/... That's bad.
>
>I propose that setup.py needs to honour an environment variable:
>PYI_OWNER so that distros can set this to their custom name (dpkg,
>rpm, ...).
A command-line option to 'install' that's inherited by
'install_egg_info' would handle this; I don't think an environment
variable is a good idea for this -- too implicit. Note that
bdist_rpm, for example, would need to encode this as a command-line
option in the .spec file, anyway.
>Phew, thanks for reading this far! I hope this is useful, if it is we
>should probably start writing the text for the new PEP262 on a wiki
>somewhere while we discuss details.
The major issues at the moment are that 1) your spec is confused
about packages vs. projects or distributions (and thus needs to be
revamped with that in mind), and 2) PKG-INFO is a really lousy place
to put this, from a formatting perspective. It's one thing to
include the PKG-INFO in the install DB, and another thing entirely to
include the install DB in the PKG-INFO! I think PEP 262 had the
right idea, even though I'm not overjoyed by its proposed format, either.