[Numpy-discussion] Should I use pip install numpy in linux?

David Cournapeau cournape at gmail.com
Sat Jan 9 08:48:40 EST 2016


On Sat, Jan 9, 2016 at 12:12 PM, Julian Taylor
<jtaylor.debian at googlemail.com> wrote:

> On 09.01.2016 04:38, Nathaniel Smith wrote:
> > On Fri, Jan 8, 2016 at 7:17 PM, Nathan Goldbaum
> > <nathan12343 at gmail.com> wrote:
> >> Doesn't building on CentOS 5 also mean using a quite old version of gcc?
> >
> > Yes. IIRC CentOS 5 ships with gcc 4.4, and you can bump that up to gcc
> > 4.8 by using the Redhat Developer Toolset release (which is gcc +
> > special backport libraries to let it generate RHEL5/CentOS5-compatible
> > binaries). (I might have one or both of those version numbers slightly
> > wrong.)
> >
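[For reference, the devtoolset route mentioned above usually looks
something like the following on a CentOS 5 build box. This is a sketch
only: repository setup and exact package names varied between devtoolset
releases, devtoolset-2 being the one that shipped gcc 4.8.]

```shell
# Install the devtoolset compilers (assumes the SCL/devtoolset yum
# repository is already configured; package names are illustrative).
yum install -y devtoolset-2-gcc devtoolset-2-gcc-c++ devtoolset-2-binutils

# Start a shell whose PATH puts the newer toolchain first; the binaries
# it produces still link against the old system glibc interfaces.
scl enable devtoolset-2 bash
gcc --version   # should now report 4.8.x instead of the system gcc
```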
> >> I've never tested this, but I've seen claims on the anaconda mailing
> >> list of ~25% slowdowns compared to building from source or using
> >> system packages, which was attributed to building using an older gcc
> >> that doesn't optimize as well as newer versions.
> >
> > I'd be very surprised if that were a 25% slowdown in general, as
> > opposed to a 25% slowdown on some particular inner loop that happened
> > to neatly match some new feature in a new gcc (e.g. something where
> > the new autovectorizer kicked in). But yeah, in general this is just
> > an inevitable trade-off when it comes to distributing binaries: you're
> > always going to pay some penalty for achieving broad compatibility as
> > compared to artisanally hand-tuned binaries specialized for your
> > machine's exact OS version, processor, etc. Not much to be done,
> > really. At some point the baseline for compatibility will switch to
> > "compile everything on CentOS 6", and that will be better but it will
> > still be worse than binaries that target CentOS 7, and so on and so
> > forth.
> >
>
> I have over the years put in one gcc-specific optimization after
> another, so yes, using an ancient version will make many parts
> significantly slower. That is not really a problem, though: updating a
> compiler is easy even without Red Hat's devtoolset.
>
> At least as far as numpy is concerned, Linux binaries should not be a
> very big problem. The only dependency whose version matters is glibc,
> which has updated the interfaces we use (in a backward-compatible way)
> many times.
> But if we use an old enough baseline glibc (e.g. CentOS 5 or Ubuntu
> 10.04) we are fine, at a reasonable performance cost: basically only a
> slower memcpy.
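[One quick sanity check for such a baseline, sketched below with an
illustrative .so path, is to list the versioned glibc symbols a built
extension actually references: the highest version printed is the
oldest glibc the binary can run on.]

```shell
# List the glibc symbol versions referenced by a compiled extension.
# If the newest one is e.g. GLIBC_2.5, any system with glibc >= 2.5
# (CentOS 5 and later) can load it. The path below is illustrative.
objdump -T build/lib/numpy/core/multiarray.so \
    | grep -o 'GLIBC_[0-9.]*' | sort -u -V
```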
>
> Scipy, on the other hand, is a larger problem, as it contains C++ code.
> Linux systems are now transitioning to C++11, which is binary
> incompatible in parts with the old standard, so a lot of testing is
> necessary to check whether we are affected.
> How does Anaconda deal with C++11?
>

For Canopy packages, we use the RH devtoolset with gcc 4.8.x, and
statically link the C++ standard library.
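[A minimal sketch of that approach; the file names are illustrative,
not Canopy's actual build scripts. The idea is to compile with the
devtoolset g++ but fold the C++ runtime into the extension itself, so
nothing on the target system needs a C++11-era libstdc++.]

```shell
# (run inside an `scl enable devtoolset-2 bash` shell, so g++ is 4.8)
# Build a C++11 extension, linking libstdc++ and libgcc statically so
# the result has no dynamic C++ runtime dependency.
g++ -std=c++11 -O2 -shared -fPIC \
    -static-libstdc++ -static-libgcc \
    mymodule.cpp -o mymodule.so

# Confirm libstdc++ is no longer a dynamic dependency:
ldd mymodule.so | grep libstdc++ || echo "no dynamic libstdc++"
```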

It has worked so far for the few packages requiring C++11 and gcc > 4.4
(llvm/llvmlite/dynd), but that's not a solution I am a fan of myself, as
the implications are not always very clear.

David