[Distutils] Request for Input re Packaging

David Cournapeau david at ar.media.kyoto-u.ac.jp
Thu Mar 20 06:35:38 CET 2008


Jeff Rush wrote:
> In researching the state of packaging, I've been reading the archives and all 
> the bug reports filed against distutils.
>
> I'd like though to get some examples of particularly troublesome uses of 
> setup.py, to pull together and propose some changes to make their use case a 
> bit easier.  So far such cases I've been made aware of are Twisted, numpy and 
> SciPy.  If you know of a tough case where the developer had to jump through 
> hoops to make it work, please point me to it.
>   

Hi,

    My name is David Cournapeau, and I am one of the developers of numpy 
(I am not one of the core developers, but I have been heavily involved 
with a new build system for both numpy and scipy in the last few months, 
so I think I have one or two things to say in this respect).

    My first contact with distutils came when I wanted to add some 
functionality to numpy.distutils, which is numpy's own extension to 
distutils for numpy's needs (things like fortran support). I wanted to 
add support for building ctypes extensions (.so on linux, .dylib on 
mac os X, .dll on windows, etc.). I quickly gave up because of the 
complexity of distutils, and took a different approach (using scons 
within distutils to build all our compiled code, with distutils still 
doing the packaging).

Here are some things which I find frustrating with distutils:

1 Extending distutils is not documented at all. Sure, you have a few 
words on distutils commands, but once you want to use compilers in your 
own commands, you are on your own. For example, a working example of how 
to extend distutils with a new command to build something from C would 
be a really good addition.

The relationship between Distribution classes, Command classes and 
Compiler classes should be documented somewhere, as should the 
relationship between the different distutils commands. I wanted to do 
something as simple as adding a distutils command to add a whole 
directory of files: making it work with sdist, install, distutils and 
setuptools turned out to be impossible, and I found it easier to 
regenerate MANIFEST.in with a shell script. That's something that should 
be doable in an hour or two by anyone who does not know anything about 
distutils; today, I am not sure it is doable by anyone without a deep 
knowledge of distutils.
 
2 The only way to understand how distutils works is to run the code, 
because a lot of it relies on adding attributes at run-time. Basically, 
a lot of distutils feels like magic to me. For example:
    - in numpy, we want to have tight control of compiler flags: this is 
extremely complicated to do with distutils, because flags are added from 
everywhere in the code, and understanding it well enough to change it 
without breaking anything is nearly impossible. Removing the magic would 
be great (for example, all the configuration in separate configuration 
files, and the customization at runtime in one clearly separated 
module). But this is a difficult problem: I don't see how to change this 
(in distutils) without breaking someone else's code. Ideally, it should 
be easy to customize compiler flags from the command line (I bet this is 
one of the rpm/deb maintainers' complaints); every few days, someone 
complains on the numpy/scipy ML because they set CFLAGS, it does not 
work as they expect, and it breaks the build.
    - compiler usage is not documented. Some functions (initialized) 
have to be called in some order on compiler instances to get some of 
their characteristics; of course, neither the order nor which function to 
call for which characteristic is documented anywhere; worse, it depends 
on the compiler (unix vs windows). I don't understand the point of 
adding attributes at runtime, differently, in different cases. Maybe I 
am missing something here.
    - why is msvc different from everything else? In particular, why is 
it not possible to have access to msvc flags in the same manner as on 
all other platforms? Instead, it is buried in the MSVCCompiler code...
    - generally, it is not specified what is public interface and what 
is not. Everything is leaking everywhere, there is no specification.
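
To make the compiler flags point above concrete, here is a minimal sketch of what "customization in one clearly separated module" could look like. This is not distutils code: DEFAULT_CFLAGS, resolve_cflags and the replace switch are hypothetical names chosen for illustration, and the default flag values are placeholders. The only point is that the rule for combining the defaults with a user-supplied CFLAGS lives in one place and is stated explicitly.

```python
import shlex

# Hypothetical defaults; real values would depend on compiler and platform.
DEFAULT_CFLAGS = ["-O2", "-fPIC"]


def resolve_cflags(environ, defaults=DEFAULT_CFLAGS, replace=False):
    """Combine the default flags with a user-supplied CFLAGS value.

    environ is any mapping (for example os.environ). With replace=False the
    user flags are appended to the defaults; with replace=True a non-empty
    CFLAGS replaces the defaults entirely.
    """
    user_flags = shlex.split(environ.get("CFLAGS", ""))
    if replace:
        return user_flags if user_flags else list(defaults)
    return list(defaults) + user_flags
```

Whether appending or replacing is the right policy is debatable; what matters for users is that the policy is documented and applied in a single function rather than scattered across the code.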

3 Some code to detect libraries would be good. For example, when you 
write code which depends on libfoo: we have our own code for this in 
numpy.distutils, but that's something which I think many people would 
like to be able to do. A helper tool to parse pkg-config would be good, 
too.
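
As a rough sketch of what such a pkg-config helper could look like (an illustration only, not code from distutils or numpy.distutils): the function names and the "extra" key are hypothetical, while include_dirs, library_dirs and libraries are chosen to match the keyword arguments that distutils Extension already accepts.

```python
import shlex
import subprocess


def parse_pkg_config(output):
    """Sort the tokens of `pkg-config --cflags --libs` output into lists."""
    info = {"include_dirs": [], "library_dirs": [], "libraries": [], "extra": []}
    for token in shlex.split(output):
        if token.startswith("-I"):
            info["include_dirs"].append(token[2:])
        elif token.startswith("-L"):
            info["library_dirs"].append(token[2:])
        elif token.startswith("-l"):
            info["libraries"].append(token[2:])
        else:
            # Anything else (e.g. -pthread, -D...) is kept as-is.
            info["extra"].append(token)
    return info


def query_pkg_config(package):
    """Run pkg-config for `package` and parse the result.

    Raises an exception if pkg-config is not installed or does not know
    the package.
    """
    out = subprocess.check_output(["pkg-config", "--cflags", "--libs", package])
    return parse_pkg_config(out.decode())
```

With something like this, a setup.py depending on libfoo could call query_pkg_config("foo") and pass the resulting lists on to its extension definition, assuming libfoo ships a foo.pc file.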

The magic behavior + lack of documentation really is the main problem 
for me: if there were a small core of functionality that we could 
extend, the situation would be better. It is difficult to say that one 
particular thing is broken, because almost any distutils functionality 
is linked to something else; I cannot find a more precise description 
than magic, and the above points are the first that come to mind (I 
can find other ones if necessary, but they are all linked to this magic 
thing and the lack of precise interfaces).

But changing this in a backward-compatible manner may be extremely 
difficult, maybe even impossible. To be frank, I was secretly hoping 
something would be done on this front for python 3k... I would certainly 
be happy to help if there was some work on a distutils2.

cheers,

David

