[SciPy-dev] problems with numpy.setuptools...

David M. Cooke cookedm at physics.mcmaster.ca
Sat Sep 29 18:50:01 EDT 2007


David Cournapeau <david at ar.media.kyoto-u.ac.jp> writes:

> Pearu Peterson wrote:
>> On Fri, September 28, 2007 11:40 pm, Stefan van der Walt wrote:
>>> I would prefer if numpy were *always* in a release-ready state.
>>
>> Most of the time it actually is.
>>
>>> Why can't we instrument tests for distutils?  If the code is so confusing
>>> that we can't test it (or "practical corner cases"), should we be
>>> using it as a core ingredient in the first place?
>>
>> The code is not so confusing - it has well-defined (though undocumented)
>> structure that can be extended after one gets an idea how things
>> work in distutils.
>>
> Before I start to make my points, and because I feel this discussion is
> heating up a bit, let me say that I never intend to criticize any numpy
> developer's work/code. I understand that distutils is complicated to
> extend, that numpy's needs go far beyond the usual needs of a python
> package, and I really think that numpy.distutils has already achieved
> quite a lot. All other impressions are surely due to the fact that
> English is not my native language :)
>
> Now, although I certainly do not have as much experience as you with 
> distutils, I have used autotools for non trivial projects, I have used 
> scons as well, and I have some experience with complicated gnu make 
> based projects (eg my numpy.gar script, which can build numpy and all
> its dependencies with different configurations/compilers on linux/unix).
> This is not to say I am knowledgeable, but at least I think I am aware 
> of the difficulties of a cross platform build system.
>
>> The variety of different compilers and platforms makes developing
>> distutils difficult. I guess none of the developers who dare to
>> touch distutils have access to all platforms and compilers that
>> we are trying to support. This fact will not change after switching
>> to some other tool such as scons.

> That's where I think we fundamentally disagree. I for one think 
> distutils (here I mean the official, python.distutils, and only to build 
> C/Fortran extensions; I will never talk about any other capabilities of 
> distutils) is fundamentally flawed wrt several key points, and I am
> not even talking about its implementation. Since you are much more
> familiar with distutils than me, I would really appreciate being told
> where I am wrong:

>      1 - the way it tries to detect the platform capability. When you 
> try to adapt to a platform in a build tool, you have at least two 
> possibilities: either you hardcode every new platform, or you try to get 
> the information from the platform. Distutils does the former, autoconf 
> does the latter. For me, there is absolutely no question which one is 
> better in this respect. This alone explains a lot of distutils' fragility
> IMHO. For example, numpy.distutils defines the "f2c" library for each new
> tool, but autoconf (more precisely, the autoconf macro
> AC_F77_LIBRARY_LDFLAGS) finds it automatically from compiler output.
> This is a fundamental point, maybe the single most important one.

When you mention the 'f2c' library, you mean the Fortran intrinsic and
runtime libraries? We try to use the Fortran compiler for linking, so
knowing those tends not to be needed.

For comparison, we've got 2348 lines of source in
numpy/distutils/fcompiler/, and fortran.m4 included with autoconf is
1234 lines. However, fortran.m4 mostly only deals with finding a Fortran
compiler, finding flags for the libraries mentioned above, and name
mangling (and nothing about optimisation flags). SCons also does its
own handling of individual tools in much the same way as
numpy.distutils (with, again, no optimisations).

Now, one design point that I think distutils gets wrong is to use
lists for describing compiler options: the description of a compiler
(or any of the command line tools used) should be an object, that
'knows' how to turn, say, a .c file into a .o file (something like
scons' Builders, I think). We have some tools where the pattern
EXECUTABLE_NAME + SOME_OPTIONS + file.c + MORE_OPTIONS + file.o +
EVEN_MORE_OPTIONS doesn't work (most notably, linking shared libraries
on AIX must be handled specially). The CCompiler and FCompiler classes
are too high-level for this, as they contain knowledge about multiple
tools.
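
As a rough illustration of the tool-object idea above, here is a minimal
Python sketch. It is not distutils or scons code: the names Tool,
conventional and target_first are hypothetical, and the "oddcc" tool is
made up to stand in for a tool whose command line does not follow the
usual pattern.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Tool:
    """Hypothetical description of ONE command-line tool (not a distutils API)."""
    executable: str
    # Function that builds the full argument list for one source/target pair.
    make_command: Callable[["Tool", str, str], List[str]]
    options: List[str] = field(default_factory=list)

    def command(self, source: str, target: str) -> List[str]:
        # The tool itself decides how its command line is laid out.
        return self.make_command(self, source, target)


def conventional(tool, source, target):
    # The usual layout: executable, options, source, then "-o target".
    return [tool.executable, *tool.options, "-c", source, "-o", target]


def target_first(tool, source, target):
    # A made-up tool that wants the output named before everything else.
    return [tool.executable, "-o", target, *tool.options, source]


cc = Tool("cc", conventional, ["-O2"])
odd = Tool("oddcc", target_first, ["-O2"])

print(cc.command("a.c", "a.o"))
print(odd.command("a.c", "a.o"))
```

The point of the design is that the caller only asks a tool for the
command that turns a.c into a.o; a tool with an unusual layout supplies
its own function instead of being forced into a fixed list pattern.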

>     2 - numpy.distutils is difficult to extend. On this, I think we all
> agree. This is also significant, because it means that when
> numpy.distutils fails, only a few people can change the situation. This 
> should not be the case. If the problem is one compiler flag, it should 
> not require anyone's involvement but the ones who experience the problem.

Fiddling with compiler options shouldn't be too hard, especially for the
Fortran compilers; it's usually pretty obvious where to fiddle for
those. But other things (C compilers, linking), I agree, could be
easier.

>     3 - Customizing compilation options: this is extremely difficult 
> right now, and this again by design.

My comment about using objects to represent tools applies here;
creating a custom tool object should be easy.

>     4 - Detecting libraries is extremely difficult: the first time I 
> used system_info, it was totally wrong, and I had to modify it for each 
> new platform. This is not good: I needed more time to make it right than 
> I ever did for autotools based projects of mine, which really says a lot 
> in my book.

Could you detail your problems? system_info.py needs to be cleaned up
(removal of repetitive code, for instance) and finding libraries
standardised (for instance, there should always be an environment
variable and/or distutils key for things system_info wants to find,
that will override system_info).
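
The override rule suggested above could look roughly like the following
Python sketch. This is not how system_info.py works today: the function
find_library_dirs, the NAME_LIBDIR variable naming scheme, and the
default directories are all assumptions made for illustration.

```python
import os


def find_library_dirs(name, config=None, environ=None,
                      defaults=("/usr/lib", "/usr/local/lib")):
    """Return the directories to search for library `name`.

    Precedence: environment variable, then configuration key, then the
    built-in defaults. The NAME_LIBDIR naming scheme is hypothetical.
    """
    environ = os.environ if environ is None else environ
    config = {} if config is None else config
    key = name.upper() + "_LIBDIR"
    if key in environ:
        # An explicit user setting always wins over any detection.
        return environ[key].split(os.pathsep)
    if name in config:
        return list(config[name])
    return list(defaults)
```

With this rule, a user whose library lives in an unusual place sets one
variable (or one configuration key) and never has to edit the detection
code.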

>     5 - By design, scons does not depend on the environment. That is, it 
> does NOT use PATH information, LD_LIBRARY_FLAGS, etc... which again 
> makes it much more reliable. Of course, you can add command line 
> arguments for configuration, but by default, nothing will stab you in 
> the back.

Blech, I hate the fact that scons doesn't respect my PATH. That means
the compiler it may find is not necessarily the one that I would find
at my command line. Makes bugs more obscure. Other variables (from my
environment) that *should* be used as-is and not second-guessed by a
build tool are things like TMPDIR, TMP, LANG, and PKG_CONFIG_PATH. Any
variables like PATH or PKG_CONFIG_PATH are probably set for a reason.
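
For illustration, a build tool that respects PATH can locate the
compiler the same way the shell would. The sketch below uses Python's
standard shutil.which; the function name find_compiler and the
candidate list are assumptions, not part of scons or numpy.distutils.

```python
import os
import shutil


def find_compiler(candidates=("gcc", "cc"), environ=None):
    """Return the first candidate found on the user's PATH, or None."""
    environ = os.environ if environ is None else environ
    # Use the PATH exactly as given instead of a hardcoded directory list.
    path = environ.get("PATH", os.defpath)
    for name in candidates:
        found = shutil.which(name, path=path)
        if found:
            return found
    return None
```

Because the search order comes from the user's own PATH, the compiler
found here is the one a plain command-line invocation would find.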

> Now, you would rightly argue that doing all this in scons/whatever would 
> take a lot of time just to replicate distutils, and you would be right. 
> But I think it would require *less* time than implementing the things 
> stated eg in
> http://projects.scipy.org/scipy/numpy/wiki/DistutilsRevamp.

Not if I steal stuff from scons ;-)

(Or, for that matter, if I actually *had* any time to hack on it,
which I don't right now.)

Speaking of which, if the scons developers would make their code an
actual Python package (for one thing, install it by default in
site-packages/, and try to keep interfaces constantish), I would
support using scons modules in numpy.distutils. From the looks of it,
it's nicely laid out, with a good set of abstractions. Instead,
they've set it up so that scons must be in control, through SConstruct
files. Those have always had a faux-python feel to them for me: there
are names injected into them (i.e. Environment) that don't come from
an import statement, and the sharing of variables between SConscript
files using the Export function is quite unpythonic.

> But I don't ask you to take my word on it: I am more than willing to 
> code a prototype of what I have in mind. But then, I need to know what 
> is required for a prototype, to make a plan of what needs to be implemented:
>     - Is a prototype required to support all platforms supported now ?
>     - Is a prototype which can build numpy on Mac OS X/Linux/Win32 
> enough (keeping in mind that the key point is that adding new platforms 
> would be much easier once the prototype is done). In which configuration ?

Those are the big three. If it builds on those, we're probably 80% of
the way there.

Of course, it's the other 20% that'll take 80% of the work, as usual.

>     - If required, I could try to make it possible to have two top-level
> setup scripts, one which uses the current distutils, one which uses the
> scons-based one.
>
> As said earlier, I have started hacking on a numpy branch which, right now:
>     - implements a scons command
>     - a support library, which makes it possible to build/run ctypes
> extensions on many platforms (all the ones I managed to use on the buildbot)
>     - a support library to find libraries and headers in standard
> locations on all platforms supported by scons (of which numpy.distutils
> is a subset AFAIK), and which can be customized using site.cfg.
>
> One problem I did not foresee is how difficult it is to be sure that
> scons uses exactly the same tools as distutils (because distutils has
> no common api to get things such as compiler, compiler path, etc...);
> that's why, contrary to what I thought at first, using scons to build all
> the compiled extensions may actually be easier than making them coexist,
> because scons and distutils have fundamentally different ways of looking
> for tools.

We can see what we can do to make it easier to get at those
properties. For instance, it would be possible to separate the Fortran
compiler detection/option-setting more from the distutils parts.

> To start building numpy itself with it would require more work (mostly 
> tests for fortran / C abi), but with the help of people willing to help, 
> I don't see major problems on this side, since scons provides a framework
> for autoconf-like testing. Also, this will certainly use some facilities 
> of numpy.distutils (basically, for a first prototype, scons would mostly 
> replace system_info.py, command/build_clib.py, command/build_ext.py).

If you have the time, I think it's worth a try. Unfortunately, I don't
have much time to contribute to it.

Basically, my opinion is not to replace numpy.distutils with scons
(simply because we still need to interact nicely with distutils). However,
I'm not opposed to chucking parts of numpy.distutils and replacing
them with scons-like concepts.

-- 
|>|\/|<
/------------------------------------------------------------------\
|David M. Cooke              http://arbutus.physics.mcmaster.ca/dmc/
|cookedm at physics.mcmaster.ca


