[Numpy-discussion] Linking other libm-Implementation

Mon Feb 8 12:36:33 EST 2016

On Feb 8, 2016 3:04 AM, "Nils Becker" <nilsc.becker at gmail.com> wrote:
>
[...]
> Very superficial benchmarks (see below) seem devastating for GNU libm. It
> seems that openlibm (compiled with gcc -mtune=native -O3) performs really
> well and Intel's libm implementation is the best (on my Intel CPU). I did
> not check the accuracy of the functions, though.
>
> My own code uses a lot of trigonometric and complex functions (optics
> calculations). I'd guess it could go 25% faster by just using a better libm
> implementation. Therefore, I have an interest in getting sane linking to a
> defined libm implementation to work.

On further thought: I guess that to do this we actually will need to change
the names of the functions in openlibm and then use those names when
calling from numpy. So long as we're using the regular libm symbol names,
it doesn't matter what library the python extensions themselves are linked
to; the way ELF symbol lookup works, the libm that the python interpreter
is linked to will be checked *before* checking the libm that numpy is
linked to, so the symbols will all get shadowed.
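
A minimal sketch of the difference between global symbol lookup and lookup
through an explicit library handle, using Python's ctypes. It assumes a
Unix-like system where ctypes.util.find_library("m") can locate the C math
library; the fallback name "libm.so.6" is an assumption that holds on
glibc-based Linux. It only illustrates the dlopen/dlsym-style lookup and is
not how numpy itself calls libm.

```python
import ctypes
import ctypes.util
import math

# Locate the system math library; "libm.so.6" is an assumed fallback
# (glibc-based Linux) for when find_library cannot locate it.
libm_path = ctypes.util.find_library("m") or "libm.so.6"

# dlopen() the library and look up `sin` through this specific handle,
# rather than through the process-wide global symbol scope.
libm = ctypes.CDLL(libm_path)
libm_sin = libm.sin
libm_sin.argtypes = [ctypes.c_double]
libm_sin.restype = ctypes.c_double

handle_result = libm_sin(1.0)
print(libm_path, handle_result, math.sin(1.0))
```

Pointing libm_path at a different implementation (for example, a local
openlibm build) should call that library's sin instead, because the lookup
goes through the handle and not through the regular symbol names that the
interpreter's libm already provides.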

I guess statically linking openlibm would also work, but I'm not sure that's
a great idea, since we'd need it in multiple places.

> Apparently openlibm seems quite a good choice for numpy, at least
> performance-wise. However, I did not find any documentation or tests of the
> accuracy of its functions. Benchmarking and accuracy-testing code
> for libms would probably be a good starting point for a discussion. I could
> maybe help with that - but apparently not with any linking/building stuff
> (I just don't get it).
>
> Benchmark:
>
> gnu libm.so
> 3000 x sin(double[100000]):  6.68215647800389 s
> 3000 x log(double[100000]):  8.86350397899514 s
> 3000 x exp(double[100000]):  6.560557693999726 s
>
> openlibm.so
> 3000 x sin(double[100000]):  4.5058218560006935 s
> 3000 x log(double[100000]):  4.106520485998772 s
> 3000 x exp(double[100000]):  4.597905882001214 s
>
> Intel libimf.so
> 3000 x sin(double[100000]):  4.282402812998043 s
> 3000 x log(double[100000]):  4.008453270995233 s
> 3000 x exp(double[100000]):  3.301279639999848 s

I would be highly suspicious that this speed comes at the expense of
accuracy... My impression is that there's a lot of room to make
speed/accuracy tradeoffs in these functions, and modern glibc's libm has
seen a fair amount of scrutiny by people who have access to the same code
that openlibm is based off of. But then again, maybe not :-).
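
One way to check the accuracy side is to measure the error in ULPs against a
high-precision reference. The sketch below does this for exp and log using
only the Python standard library: decimal's exp() and ln() serve as the
reference at 50 digits, and math.exp/math.log stand in for whatever libm the
interpreter is linked to. The helper name max_ulp_error and the sample points
are hypothetical, and math.ulp requires Python 3.9 or later.

```python
import math
from decimal import Decimal, localcontext


def max_ulp_error(func, reference, inputs):
    """Return the largest error of func, in ULPs, over inputs.

    reference(Decimal) must return a high-precision Decimal result.
    """
    worst = Decimal(0)
    with localcontext() as ctx:
        ctx.prec = 50  # far more digits than a double carries
        for x in inputs:
            got = func(x)
            exact = reference(Decimal(x))  # Decimal(float) converts exactly
            ulp = Decimal(math.ulp(got))   # spacing of doubles near got
            worst = max(worst, abs(Decimal(got) - exact) / ulp)
    return float(worst)


# Hypothetical test points; a real test would cover many more inputs.
points = [0.1 * k for k in range(1, 200)]
exp_err = max_ulp_error(math.exp, lambda d: d.exp(), points)
log_err = max_ulp_error(math.log, lambda d: d.ln(), points)
print(f"exp: {exp_err:.3f} ULP, log: {log_err:.3f} ULP")
```

Any callable taking and returning a float can be passed as func, so the same
helper could compare several libm implementations on identical inputs.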

If these are the operations that you care about optimizing, an even better
approach might be to figure out how to integrate a vector math library here
like yeppp (BSD licensed) or MKL. Libm tries to optimize log(scalar); these
are libraries that specifically try to optimize log(vector). Adding this
would require changing numpy's code to use these new APIs though. (Very new
gcc can also try to do this in some cases but I don't know how good at it
it is... Julian might.)

-n