[Distutils] thoughts on distutils 1 & 2
has
hengist.podd at virgin.net
Fri May 14 10:16:31 EDT 2004
Hello List,
Very quiet here, so thought I would toss in some notes I've been
making regarding Python's module system, the current DistUtils 1.x
and some of the proposals I've seen for Distutils 2. These notes are
very rough so I dunno how much sense they'll make to anyone else in
their current state, but I figure it's better to pitch them in to
find out if there's any interest in discussing them further than
spend time polishing them if there isn't.
Let us know what you think, and we can take it from there if folk are
interested.
Regards,
has
-------
Issues:
http://www.python.org/cgi-bin/moinmoin/DistUtils20 states:
"The ultimate goal: Must be backwards-compatible with
existing setup.py scripts."
This is both a red herring and likely recipe for DU2 becoming a big
ball of mud before it's even out the door...
- Compatibility for existing setup.py scripts can easily be ensured
by retaining DU1. DU1 should be declared at end of its development
life. DU1 API may eventually be re-implemented on top of DU2,
allowing DU1 core to be ditched to reduce maintenance cost. Deprecate
DU1 API.
- DU1 doesn't scale down as well as it could/should. Doesn't scale up
as well as it could/should. Current DU2 proposals don't seem to
address these points, seeking only to add new material on top rather
than reexamine/reevaluate existing architecture. Some current DU2
proposals smack of rampant architecture astronomy, lacking sufficient
evaluation of their potential cost or whether the same goals could be
achieved through other, simpler means.
- DU2 provides an opportunity to review everything learnt over course
of DU1 development and do it better. DU1 development has stagnated
under its own weight. DU1 architecture is a rat's nest. Not a good
base to build DU2 on. Better to design afresh: assemble a list of a
representative range of use cases and their relative frequencies in
real-world use, determine "ideal" solution, determine "practical"
solution. "Practical" solution = "ideal" solution minus anything that
would prove too disruptive to Python, or too expensive for the
benefits it'd provide, or where existing material from DU1 could be
leveraged in at less cost than reimplementing from scratch.
-------
Recommend:
- Before adding new features/complexity, refactor current _design_ to
simplify it as much as possible. Philosophy here is much more
hands-off than DU1; less is more; power and flexibility through
simplicity: make others (filesystem, generic tools, etc.) do as much
of the work as possible; don't create dependencies.
-- e.g. compare typical OS X application installation procedure (mount
disk image and copy single application package to Applications
folder; no special tools/actions required) versus typical Windows
installation procedure (run InstallShield to put lots of bits into
various locations, update Registry, etc.) or typical Unix
installation procedure (build everything from source, then move into
location). Avoiding overreliance on rigid semi-complex procedures
will allow DU2 to scale down very well and provide more flexibility
in how it scales up.
- Eliminate DU1's "Swiss Army" tendencies. Separate the build,
install and register procedures for higher cohesion and lower
coupling. This will make it much easier to refactor design of each in
turn.
- Every Python module should be distributed, managed and used as a
single folder containing ALL resources relating to that module:
sub-modules, extensions, documentation (bundled, generated, etc.),
tests, examples, etc. (Note: this can be done without affecting
backwards-compatibility, which is important.) Similar idea to OS X's
package scheme, where all resources for [e.g.] an application are
bundled in a single folder, but less formal (no need to hide package
contents from user).
- Question: is there any reason why modules should not be installable
via simple drag-n-drop (GUI) or mv (CLI)? A standard policy of "the
package IS the module" (see above) would allow a good chunk of both
existing and proposed DU "features" to be gotten rid of completely
without any loss of "functionality", greatly simplifying both build
and install procedures.
-- Replace current system where user must explicitly state what they
want included with one where user need only state what they want
excluded. Simpler and less error-prone; fits better with user
expectations (meeting the most common requirement should require
least amount of work, ideally none). Manifest system would no longer
be needed (good riddance). Most distributions could be created simply
by zipping/tar.gzipping the module folder and all its contents, minus
any .pyc and [for source-only extension distributions] .so files.
-- In particular, removing most DU involvement from build procedures
would allow developers to use their own development/build systems
much more easily.
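
A rough sketch of the exclude-only approach described above, in Python.
The function name, the default exclusion patterns and the archive layout
are assumptions made for illustration, not part of any DU1 or DU2 API:

```python
import fnmatch
import os
import zipfile

# Assumed defaults: the notes only mention .pyc and (optionally) .so files.
DEFAULT_EXCLUDES = ("*.pyc", "*.so")


def make_distribution(package_dir, archive_path, excludes=DEFAULT_EXCLUDES):
    """Zip package_dir and all its contents, skipping excluded file names.

    Hypothetical helper: everything is included unless it matches one of
    the patterns in excludes. Returns the archive-relative paths written.
    """
    package_dir = os.path.abspath(package_dir)
    parent = os.path.dirname(package_dir)
    included = []
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for root, dirs, files in os.walk(package_dir):
            dirs.sort()  # walk subfolders in a stable order
            for name in sorted(files):
                if any(fnmatch.fnmatch(name, pattern) for pattern in excludes):
                    continue
                full_path = os.path.join(root, name)
                # Paths are stored relative to the parent folder, so the
                # archive unpacks into a single top-level package folder.
                arcname = os.path.relpath(full_path, parent)
                archive.write(full_path, arcname)
                included.append(arcname)
    return included
```

Something like make_distribution("roundup", "/tmp/roundup-0.1.0.zip")
would then be the whole "build" step for a pure-Python package. The
archive should be written outside the package folder, otherwise the walk
would pick up the archive itself.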
- Installation and compilation should be separate procedures. Python
already compiles .py files to .pyc on demand; is there any reason why
.c/.so files couldn't be treated the same? Have a standard 'src'
folder containing source files, and have Python's module mechanism
look in/for that as part of its search operation when looking for a
missing module; cf. Python's automatic rebuilding of .pyc files from
.py files when former isn't found. (Q. How would this folder's
contents need to be represented to Python?)
- What else may setup.py scripts do apart from install modules (2)
and build extensions (3)?
-- Most packages should not require a setup.py script to install.
Users can, of course, employ their own generic shell
script/executable to [e.g.] unzip downloaded packages and mv them to
their site-packages folder.
-- Extensions distributed as source will presumably require some kind
of setup script in 'src' folder. Would this need to be a dedicated
Python script or would something like a standard makefile be
sufficient?
-- Build operations should be handled by separate dedicated scripts
when necessary. Most packages should only require a generic shell
script/executable to zip up package folder and its entire contents
(minus .pyc and, optionally, .so files).
- Remove metadata from setup.py and modules. All metadata should
appear in a single location: meta.txt file included in every package
folder. Use a single metadata scheme in simple structured nested
machine-readable plaintext format (modified Trove); example:
------------------------------------------------------------------
Name
    roundup
Version
    0.1.0
Intended Audience
    End Users/Desktop
    Developers
    System Administrators
License
    OSI Approved
        Python Software Foundation License
Topic
    Communications
        Email
    Office/Business
    Software Development
        Bug Tracking
Dependencies
etc...
------------------------------------------------------------------
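
To show that such a format would be cheap to read, here is a minimal
parsing sketch in Python. It assumes nesting is expressed purely by
leading spaces (four per level, which is a guess on my part) and that
every line is just a label; parse_meta, leaf_paths and SAMPLE are
made-up names for illustration:

```python
SAMPLE = """\
Name
    roundup
Version
    0.1.0
License
    OSI Approved
        Python Software Foundation License
Topic
    Communications
        Email
    Office/Business
"""


def parse_meta(text, indent=4):
    """Parse indentation-nested metadata into a dict of dicts.

    Each line becomes a key; lines indented beneath it become its
    children. A line with no children maps to an empty dict.
    """
    root = {}
    stack = [root]  # stack[d] is the dict holding entries at depth d
    for raw in text.splitlines():
        if not raw.strip():
            continue
        stripped = raw.lstrip(" ")
        depth = (len(raw) - len(stripped)) // indent
        if depth > len(stack) - 1:
            raise ValueError("line indented too deeply: %r" % raw)
        del stack[depth + 1:]  # drop deeper levels left by earlier lines
        node = {}
        stack[depth][stripped.rstrip()] = node
        stack.append(node)
    return root


def leaf_paths(tree, prefix=()):
    """Flatten the tree into Trove-style 'A :: B :: C' strings."""
    paths = []
    for label, children in tree.items():
        path = prefix + (label,)
        if children:
            paths.extend(leaf_paths(children, path))
        else:
            paths.append(" :: ".join(path))
    return paths
```

parse_meta(SAMPLE)["Name"] gives a dict whose only key is "roundup", and
leaf_paths() turns the nested form back into flat Trove-style strings,
so existing classifier-based tools would not lose anything.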
- Improve version control. Junk current "operators" scheme (=,
<, >, >=, <=) as both unnecessarily complex and inadequate (i.e.
stating module X requires module Y (>= 1.0) is useless in practice as
it's impossible to predict _future_ compatibility). Metadata should
support 'Backwards Compatibility' (optional) value indicating
earliest version of the module that current version is
backwards-compatible with. Dependencies list should declare name and
version of each required package (specifically, the version used as
package was developed and released). Version control system can then
use both values to determine compatibility. Example: if module X is
at v1.0 and is backwards-compatible to v0.5, then if module Y lists
module X v0.8 as a dependency then X 1.0 will be deemed acceptable,
whereas if module Z lists X 0.4.5 as a dependency then X 1.0 will be
deemed unacceptable and system should start looking for an older
version of X.
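
The compatibility rule above can be sketched in a few lines of Python.
The function names are hypothetical, versions are assumed to be plain
dotted integers, and two details are my own assumptions rather than
anything stated above: a missing 'Backwards Compatibility' value is
treated as "compatible only with itself", and an installed version older
than the one the dependent was developed against is rejected:

```python
def parse_version(text):
    """Turn '0.4.5' into (0, 4, 5); trailing zeros are dropped so that
    '1.0' and '1.0.0' compare equal."""
    parts = [int(piece) for piece in text.split(".")]
    while len(parts) > 1 and parts[-1] == 0:
        parts.pop()
    return tuple(parts)


def is_acceptable(installed, compatible_back_to, required):
    """Decide whether an installed module version satisfies a dependency.

    installed          -- version of the module that is available
    compatible_back_to -- its 'Backwards Compatibility' value, or None
    required           -- version the dependent was developed against
    """
    installed_v = parse_version(installed)
    required_v = parse_version(required)
    if compatible_back_to is None:
        floor_v = installed_v  # assumption: no declared compatibility
    else:
        floor_v = parse_version(compatible_back_to)
    return floor_v <= required_v <= installed_v
```

With the figures from the example, is_acceptable("1.0", "0.5", "0.8") is
True (module Y is satisfied by X 1.0) and
is_acceptable("1.0", "0.5", "0.4.5") is False (module Z needs an older X).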
- Make it easier to have multiple installed versions of a module.
Ideally this would require including both name and version in each
module name so that multiple modules may coexist in same
site-packages folder. Note that this naming scheme would require
alterations to Python's module import mechanism and would not be
directly compatible with older Python versions (users could still use
modules with older Pythons, but would need to strip version from
module name when installing).
- Reject PEP 262 (installed packages database). Complex, fragile,
duplication of information, single point of failure reminiscent of
Windows Registry. Exploit the filesystem instead - any info a
separate db system would provide should already be available from
each module's metadata.
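
As a sketch of what "exploit the filesystem" could look like in Python:
the list of installed packages is derived on demand by scanning a folder
for meta.txt files. It assumes the meta.txt layout shown earlier (a
"Name" or "Version" label followed by its value on the next non-blank
line); both function names are invented for illustration:

```python
import os


def read_name_and_version(meta_path):
    """Return (name, version) from a meta.txt file, or None for a
    missing field. Leading whitespace is ignored, so the values may or
    may not be indented under their labels."""
    fields = {}
    current = None
    with open(meta_path) as handle:
        for raw in handle:
            line = raw.strip()
            if not line:
                continue
            if current is not None:
                fields[current] = line  # first line after the label
                current = None
            elif line in ("Name", "Version"):
                current = line
    return fields.get("Name"), fields.get("Version")


def installed_packages(site_dir):
    """Map each package folder in site_dir to its (name, version).

    Folders without a usable meta.txt are skipped. Nothing is cached;
    the filesystem is the only source of truth.
    """
    found = {}
    for entry in sorted(os.listdir(site_dir)):
        meta_path = os.path.join(site_dir, entry, "meta.txt")
        if os.path.isfile(meta_path):
            name, version = read_name_and_version(meta_path)
            if name and version:
                found[entry] = (name, version)
    return found
```

Deleting or moving a package folder updates the "database"
automatically, since there is no separate record to fall out of sync.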
--
http://freespace.virgin.net/hamish.sanderson/