[Numpy-discussion] parallel compilation of numpy

Michael Abshoff michael.abshoff at googlemail.com
Thu Feb 19 00:26:59 EST 2009


David Cournapeau wrote:
> Christian Heimes wrote:
>> David Cournapeau wrote:

Hi,

>> You may call me naive and ignorant. Is it really that hard to achieve
>> some kind of poor man's concurrency? You don't have to parallelize
>> everything to get a speedup on multi-core machines. Usually, compiling
>> the C/C++ files to object files takes up most of the time.
>>
>> How about
>>
>> * assemble a list of all C/C++ source files of all extensions.
>> * compile all source files in parallel
>> * do the rest (linking etc.) in serial
>>   
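
The three steps quoted above can be sketched roughly as follows. This is only an illustration of the idea, not distutils code: `compile_then_link`, `compile_one` and `link_one` are hypothetical names, and the two callables are assumed to wrap whatever actually runs the compiler and the linker (for example via `subprocess`). Threads are used on the assumption that each compile job spends its time waiting on an external compiler process.

```python
from concurrent.futures import ThreadPoolExecutor


def compile_then_link(extensions, compile_one, link_one, jobs=4):
    """Compile every source file in parallel, then link each extension serially.

    extensions:  dict mapping extension name -> list of source paths
    compile_one: hypothetical callable, source path -> object file path
    link_one:    hypothetical callable, (name, object paths) -> built artifact
    """
    # Step 1: assemble one flat list of (extension, source) compile tasks.
    tasks = [(name, src) for name, srcs in extensions.items() for src in srcs]

    # Step 2: compile all sources in parallel; pool.map preserves task order.
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        objects = list(pool.map(lambda task: compile_one(task[1]), tasks))

    # Group the resulting object files by the extension they belong to.
    per_extension = {name: [] for name in extensions}
    for (name, _src), obj in zip(tasks, objects):
        per_extension[name].append(obj)

    # Step 3: do the rest (linking) serially, one extension at a time.
    return {name: link_one(name, per_extension[name]) for name in extensions}
```

With stand-in callables, `compile_then_link({"a": ["x.c"]}, lambda s: s[:-2] + ".o", lambda n, objs: objs)` runs the whole pipeline without invoking a real compiler, which is handy for checking the scheduling on its own.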

With Sage we do the cythonization in parallel and, for now, build the 
extensions serially, but we have code to build those in parallel, too. 
Given that we are building 180 extensions or so, the speedup is linear. I 
often do this using 24 cores, and it seems robust: I work on Sage daily, 
often build from scratch to test, and have never had any problems with 
that code.

We use pyprocessing to launch the jobs, and the changes to distutils are 
surprisingly small. The original version of the patch broke the build of 
numpy/scipy, but I believe the author already has a fix for that, too. He 
is just busy finishing his PhD thesis next month and will then be back to 
work on Sage. The plan is definitely to push things back into Python so 
that everyone building extensions can benefit.

> That's more or less how make works - it does not work very well IMHO.
> And doing the above correctly in distutils may be harder than it seems:
> both scons and waf had numerous problems with calling subtasks because
> of race conditions in subprocess for example (both have their own module
> for that).
> 
> More fundamentally though, I have no interest in working on distutils.

:)

> Not working on a DAG is fundamentally and hopelessly broken for a build
> tool, and this is unfixable in distutils. Everything is wrong, from the
> concepts to the UI through the implementation, to paraphrase a famous
> saying. There is nothing to save IMHO. Of course, someone else can work
> on it. I prefer working on a sane solution myself,
> 
> cheers,
> 
> David
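
For readers unfamiliar with the DAG point in the quoted text: a build tool that models targets and their prerequisites as a directed acyclic graph can run every target whose prerequisites are finished, in parallel, without a hand-written phase ordering. A minimal sketch of that idea, assuming Python 3.9+ for the standard-library `graphlib` module; `run_dag` and `action` are hypothetical names, and this is not how scons, waf or distutils are implemented:

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait
from graphlib import TopologicalSorter


def run_dag(deps, action, jobs=4):
    """Run action(target) for every node, respecting prerequisites.

    deps:   dict mapping target -> set of targets it depends on
    action: hypothetical callable that builds one target
    Returns the targets in the order they finished.
    """
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    finished = []
    pending = {}  # future -> target currently being built

    with ThreadPoolExecutor(max_workers=jobs) as pool:
        while sorter.is_active():
            # Submit every target whose prerequisites are all done.
            for target in sorter.get_ready():
                pending[pool.submit(action, target)] = target

            # Block until at least one running target completes.
            done, _ = wait(pending, return_when=FIRST_COMPLETED)
            for future in done:
                future.result()  # re-raise any build error
                target = pending.pop(future)
                sorter.done(target)  # may unblock dependent targets
                finished.append(target)
    return finished
```

For example, with `deps = {"ext.so": {"a.o", "b.o"}, "a.o": {"a.c"}, "b.o": {"b.c"}}`, the two object files can be built concurrently, and `ext.so` is only started once both are done.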

To taunt Ondrej: A one-minute build isn't forever. numpy is tiny, and I 
understand why its build might seem long compared to SymPy, but just wait 
until you add Cython extensions by default: those build times will go up 
substantially ;).

Cheers,

Michael




