[Python-Dev] Compiling Python on Linux with Intel's icc

Alex Leach albl500 at york.ac.uk
Thu Mar 1 19:39:19 CET 2012


Dear Python Devs,

I've been attempting to compile a fully functional version of Python 2.7 using 
Intel's C compiler, having built supposedly optimal versions of numpy and 
scipy, using Intel Composer XE and Intel's Math Kernel Library. I can build a 
working Python binary, but I'd really appreciate it if someone could check my 
compile options, and perhaps suggest ways I could further optimise the build.

*** COMPILE FAILURE - ffi64.c ***

I've managed to compile everything in the python distribution except for 
Modules/_ctypes/libffi/src/x86/ffi64.c. So to get the compilation to actually 
work, I've had to use the config option '--with-system-ffi'. If someone could 
suggest a patch for ffi64.c, I'd happily test it, as I've been unable to fix the 
code myself! The problem is with register_args, which uses GCC's __int128_t, 
but this doesn't exist when using icc.

The preprocessor guard to use could be:
#ifdef __INTEL_COMPILER
...
#else
...
#endif

I've tried putting this guard around the register_args struct at the top of 
ffi64.c, and around the places where register_args is used (roughly lines 
592-616), following the suggestion at 
http://software.intel.com/en-us/forums/showthread.php?t=56652 
but have been unable to get a working solution. A patch would be appreciated!
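
Since the build relies on --with-system-ffi, it may be worth confirming that 
the resulting _ctypes module can actually make a foreign call. A minimal 
sketch, assuming a POSIX system; the helper name libc_strlen is hypothetical 
and only there for illustration:

```python
import ctypes
import ctypes.util


def libc_strlen(text):
    """Call the C library's strlen() through ctypes (and so through libffi)."""
    # find_library() may return None; CDLL(None) then uses the symbols
    # already loaded into the running process (POSIX behaviour).
    libc = ctypes.CDLL(ctypes.util.find_library("c"))
    libc.strlen.argtypes = [ctypes.c_char_p]
    libc.strlen.restype = ctypes.c_size_t
    return libc.strlen(text)


if __name__ == "__main__":
    # A working _ctypes/libffi pair should report the length of the string.
    print(libc_strlen(b"hello"))
```

If the import of ctypes itself fails, the problem is in the _ctypes build 
rather than in libffi's call path.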

*** Tests ***

After compilation, a few tests fail consistently, mainly ones concerned with 
floating-point precision: test_cmath, test_math and test_float. 
I also wrote a very short script to time for-loop execution and integer 
multiplication. This script (below) has nearly always completed faster under 
the default Ubuntu Python than under my own build.

Obviously, I was hoping to get a faster python, but the size of the final 
binary is almost twice the size of the default Ubuntu version (5.2MB cf. 
2.7MB), which I thought might cause a startup overhead that leads to slower 
execution times when running such a basic script.


*** TEST SCRIPT ***
$ cat ~/bin/timetest.py

RANGE = 10000

print "running {0}^2 = {1} for loop iterations".format(RANGE, RANGE**2)

for i in xrange(RANGE):
    for j in xrange(RANGE):
        i * j


*** TIMES ***

## ICC-compiled python ##
$ time ./python ~/bin/timetest.py
running 10000^2 = 100000000 for loop iterations

real    0m2.767s
user    0m2.720s
sys     0m0.008s


## System python ##
$ time python ~/bin/timetest.py
running 10000^2 = 100000000 for loop iterations

real    0m2.781s
user    0m2.776s
sys     0m0.000s

Oh... My python appears to run faster than gcc's now - checked this a few 
times now, mine's staying faster... :) I've compiled and re-compiled python 
dozens of times now, but it's still failing some tests...
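
The two wall-clock figures above differ by about 14 ms, and `time` also counts 
interpreter startup. A rough sketch of an in-process measurement that excludes 
startup and keeps the best of several runs; the names nested_loop and 
best_time, and the smaller RANGE, are illustrative choices rather than part of 
the original script:

```python
import timeit

RANGE = 1000  # smaller than the 10000 used above so repeated runs stay quick


def nested_loop(n=RANGE):
    """Run the same nested multiply loop as timetest.py: n*n iterations."""
    for i in range(n):
        for j in range(n):
            i * j


def best_time(repeats=5):
    """Return the fastest of `repeats` single runs of nested_loop, in seconds."""
    # Taking the minimum reduces the influence of other load on the machine.
    return min(timeit.repeat(nested_loop, number=1, repeat=repeats))


if __name__ == "__main__":
    print("best of 5: {0:.3f}s".format(best_time()))
```

Running the same file under both interpreters gives numbers that compare the 
bytecode loop only, which is the part the compiler flags are meant to affect.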


*** Build Environment ***

Ubuntu 10.10 server kernel (`uname -r`=3.0.0-16-server) with KDE 4.7.4

$ tail ~/.bashrc

#### Custom Commands
export PATH=$PATH:/usr/local/cuda/bin:$HOME/bin
export PYTHONPATH=$HOME/bin:/usr/lib/pymodules/python2.7
export PYTHONSTARTUP=$HOME/.pystartup
export LD_LIBRARY_PATH=/lib64:/usr/lib64:/usr/local/lib:/usr/local/cuda/lib64:/usr/local/cuda/lib
# Load Intel compiler and library variables.
source /usr/intel/bin/compilervars.sh intel64
source /usr/intel/impi/4.0.3/bin/mpivars.sh intel64
source /usr/intel/tbb/bin/tbbvars.sh intel64

$ env | grep 'PATH\|FLAGS'
MANPATH=/usr/intel/impi/4.0.3.008/man:/usr/intel/composer_xe_2011_sp1.9.293/man/en_US:/usr/intel/composer_xe_2011_sp1.9.293/man/en_US:/usr/intel/impi/4.0.3.008/man:/usr/intel/composer_xe_2011_sp1.9.293/man/en_US:/usr/intel/composer_xe_2011_sp1.9.293/man/en_US:/usr/intel/impi/4.0.3.008/man:/usr/intel/composer_xe_2011_sp1.9.293/man/en_US:/usr/intel/composer_xe_2011_sp1.9.293/man/en_US:/usr/local/man:/usr/local/share/man:/usr/share/man:/usr/intel/man:::
LIBRARY_PATH=/usr/intel/composer_xe_2011_sp1.9.293/tbb/lib/intel64//cc4.1.0_libc2.4_kernel2.6.16.21:/usr/intel/composer_xe_2011_sp1.9.293/compiler/lib/intel64:/usr/intel/composer_xe_2011_sp1.9.293/ipp/../compiler/lib/intel64:/usr/intel/composer_xe_2011_sp1.9.293/ipp/lib/intel64:/usr/intel/composer_xe_2011_sp1.9.293/compiler/lib/intel64:/usr/intel/composer_xe_2011_sp1.9.293/mkl/lib/intel64:/usr/intel/composer_xe_2011_sp1.9.293/tbb/lib/intel64//cc4.1.0_libc2.4_kernel2.6.16.21
FPATH=/usr/intel/composer_xe_2011_sp1.9.293/mkl/include:/usr/intel/composer_xe_2011_sp1.9.293/mkl/include
LD_LIBRARY_PATH=/usr/intel/composer_xe_2011_sp1.9.293/tbb/lib/intel64//cc4.1.0_libc2.4_kernel2.6.16.21:/usr/intel/impi/4.0.3.008/ia32/lib:/usr/intel/composer_xe_2011_sp1.9.293/compiler/lib/intel64:/usr/intel/composer_xe_2011_sp1.9.293/ipp/../compiler/lib/intel64:/usr/intel/composer_xe_2011_sp1.9.293/ipp/lib/intel64:/usr/intel/composer_xe_2011_sp1.9.293/compiler/lib/intel64:/usr/intel/composer_xe_2011_sp1.9.293/mkl/lib/intel64:/usr/intel/composer_xe_2011_sp1.9.293/tbb/lib/intel64//cc4.1.0_libc2.4_kernel2.6.16.21:/biol/arb/lib:/lib64:/usr/lib64:/usr/local/lib:/usr/local/cuda/lib64:/usr/local/cuda/lib:/usr/intel/composer_xe_2011_sp1.9.293/debugger/lib/intel64:/usr/intel/composer_xe_2011_sp1.9.293/mpirt/lib/intel64
CPATH=/usr/intel/composer_xe_2011_sp1.9.293/tbb/include:/usr/intel/composer_xe_2011_sp1.9.293/mkl/include:/usr/intel/composer_xe_2011_sp1.9.293/tbb/include
NLSPATH=/usr/intel/composer_xe_2011_sp1.9.293/compiler/lib/intel64/locale/%l_%t/%N:/usr/intel/composer_xe_2011_sp1.9.293/ipp/lib/intel64/locale/%l_%t/%N:/usr/intel/composer_xe_2011_sp1.9.293/mkl/lib/intel64/locale/%l_%t/%N:/usr/intel/composer_xe_2011_sp1.9.293/debugger/intel64/locale/%l_%t/%N
PATH=/usr/intel/impi/4.0.3.008/ia32/bin:/usr/intel/composer_xe_2011_sp1.9.293/bin/intel64:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/intel/bin:/usr/local/cuda/bin:/usr/local/cuda/bin:/usr/intel/composer_xe_2011_sp1.9.293/mpirt/bin/intel64
PYTHONPATH=/usr/lib/pymodules/python2.7/
WINDOWPATH=7
QT_PLUGIN_PATH=$HOME/.kde/lib/kde4/plugins/:/usr/lib/kde4/plugins/

*** Download, configure and Build instructions ***
$ hg clone -r 2.7 http://hg.python.org/cpython
Since...
$ hg update -r 2.7

*** Generate Profile-Guided Optimisation stuff with first build ***
$ make distclean && mkdir PGO
$ CC=icc AR=xiar LD=xild CXX=icpc \
   CPPFLAGS+="-I/usr/include \
	-I/usr/include/x86_64-linux-gnu \
	-I/usr/src/linux-headers-3.0.0-16-server/include/" \
   CFLAGS+="-O3 \
	-fomit-frame-pointer \
	-shared-intel \
	-fpic \
	-prof-gen \
	-prof-dir $PWD/PGO \
	-fp-model precise \
	-fp-model source \
	-xHost \
	-ftz" \
   ./configure --with-system-ffi --with-libc="-lirc" --with-libm="-limf"
$ make -j9

*** Use the PGO-generated information in new build ***
$ make clean 
$ CC=icc AR=xiar LD=xild CXX=icpc \
   CPPFLAGS+="-I/usr/include \
	-I/usr/include/x86_64-linux-gnu \
	-I/usr/src/linux-headers-3.0.0-16-server/include/" \
   CFLAGS+="-O3 \
	-fomit-frame-pointer \
	-shared-intel \
	-fpic \
	-prof-use \
	-prof-dir $PWD/PGO \
	-fp-model precise \
	-fp-model source \
	-xHost \
	-ftz" \
   ./configure --with-system-ffi --with-libc="-lirc" --with-libm="-limf"
$ make -j9
...

$ make test
building dbm using gdbm

Python build finished, but the necessary bits to build these modules were not 
found:
_bsddb             bsddb185           dl              
imageop            sunaudiodev                        
To find the necessary bits, look in setup.py in detect_modules() for the 
module's name.

find ./Lib -name '*.py[co]' -print | xargs rm -f
./python -Wd -3 -E -tt  ./Lib/test/regrtest.py -l 
/usr/local/src/pysrc/cpython/Lib/unittest/util.py:2: ImportWarning: Not 
importing directory '/usr/local/src/pysrc/cpython/Lib/collections': missing 
__init__.py
  from collections import namedtuple, OrderedDict
== CPython 2.7.3rc1 (2.7:5c52e7c6d868+, Feb 29 2012, 22:10:22) [GCC Intel(R) 
C++ gcc 4.6 mode]
==   Linux-3.0.0-16-server-x86_64-with-debian-wheezy-sid little-endian
==   /usr/local/src/pysrc/cpython/build/test_python_16278
Testing with flags: sys.flags(debug=0, py3k_warning=1, division_warning=1, 
division_new=0, inspect=0, interactive=0, optimize=0, dont_write_bytecode=0, 
no_user_site=0, no_site=0, ignore_environment=1, tabcheck=2, verbose=0, 
unicode=0, bytes_warning=0, hash_randomization=0)

.........

test_cmath
test test_cmath failed -- Traceback (most recent call last):
  File "/usr/local/src/pysrc/cpython/Lib/test/test_cmath.py", line 352, in 
test_specific_values
    msg=error_message)
  File "/usr/local/src/pysrc/cpython/Lib/test/test_cmath.py", line 94, in 
rAssertAlmostEqual
    'got {!r}'.format(a, b))
AssertionError: acos0000: acos(complex(0.0, 0.0))
Expected: complex(1.5707963267948966, -0.0)
Received: complex(1.5707963267948966, 0.0)
Received value insufficiently close to expected value.
...
test_curses skipped -- Use of the `curses' resource not enabled
...
test_float
test test_float failed -- Traceback (most recent call last):
  File "/usr/local/src/pysrc/cpython/Lib/test/test_float.py", line 1273, in 
test_from_hex
    self.identical(fromHex('0x0.ffffffffffffd6p-1022'), MIN-3*TINY)
  File "/usr/local/src/pysrc/cpython/Lib/test/test_float.py", line 983, in 
identical
    self.fail('%r not identical to %r' % (x, y))
AssertionError: 0.0 not identical to 2.2250738585072014e-308
.....
test test_strtod failed -- multiple errors occurred; run in verbose mode for 
details
......


347 tests OK.
5 tests failed:
    test_cmath test_float test_gdb test_math test_strtod
1 test altered the execution environment:
    test_distutils
37 tests skipped:
    test_aepack test_al test_applesingle test_bsddb test_bsddb185
    test_bsddb3 test_cd test_cl test_codecmaps_cn test_codecmaps_hk
    test_codecmaps_jp test_codecmaps_kr test_codecmaps_tw test_curses
    test_dl test_gl test_imageop test_imgfile test_kqueue
    test_linuxaudiodev test_macos test_macostools test_msilib
    test_ossaudiodev test_scriptpackages test_smtpnet
    test_socketserver test_startfile test_sunaudiodev test_timeout
    test_tk test_ttk_guionly test_urllib2net test_urllibnet
    test_winreg test_winsound test_zipfile64
4 skips unexpected on linux2:
    test_bsddb test_bsddb3 test_tk test_ttk_guionly
make: *** [test] Error 1
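
The test_cmath failure above comes down to the sign of a zero: the expected 
imaginary part is -0.0 and the received one is 0.0, which compare equal with 
==. A small sketch for checking this directly in the interpreter; zero_sign 
and describe_acos_zero are hypothetical helper names:

```python
import cmath
import math


def zero_sign(x):
    """Return '-' or '+' for a float, telling -0.0 apart from 0.0."""
    # copysign transfers the sign bit, which == on zeros cannot see.
    return "-" if math.copysign(1.0, x) < 0 else "+"


def describe_acos_zero():
    """Return the real part and the imaginary sign of acos(0+0j)."""
    z = cmath.acos(complex(0.0, 0.0))
    return z.real, zero_sign(z.imag)


if __name__ == "__main__":
    real, sign = describe_acos_zero()
    # The test suite expects the imaginary part to be negative zero.
    print("acos(0+0j): real={0!r}, imaginary zero sign={1}".format(real, sign))
```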







*** Drill down to test_strtod error ***

$ ./python
Python 2.7.3rc1 (2.7:5c52e7c6d868+, Feb 29 2012, 22:10:22) 
[GCC Intel(R) C++ gcc 4.6 mode] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from test import test_strtod
>>> test_strtod.test_main()
test_bigcomp (test.test_strtod.StrtodTests) ... FAIL
test_boundaries (test.test_strtod.StrtodTests) ... FAIL
test_halfway_cases (test.test_strtod.StrtodTests) ... ok
test_parsing (test.test_strtod.StrtodTests) ... FAIL
test_particular (test.test_strtod.StrtodTests) ... FAIL
test_short_halfway_cases (test.test_strtod.StrtodTests) ... ok
test_underflow_boundary (test.test_strtod.StrtodTests) ... FAIL

======================================================================
FAIL: test_bigcomp (test.test_strtod.StrtodTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/local/src/pysrc/cpython/Lib/test/test_strtod.py", line 214, in 
test_bigcomp
    self.check_strtod(s)
  File "/usr/local/src/pysrc/cpython/Lib/test/test_strtod.py", line 105, in 
check_strtod
    "expected {}, got {}".format(s, expected, got))
AssertionError: Incorrectly rounded str->float conversion for 81608e-328: 
expected 0x0.0000000000002p-1022, got 0x0.0p+0

======================================================================
FAIL: test_boundaries (test.test_strtod.StrtodTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/local/src/pysrc/cpython/Lib/test/test_strtod.py", line 191, in 
test_boundaries
    self.check_strtod(s)
  File "/usr/local/src/pysrc/cpython/Lib/test/test_strtod.py", line 105, in 
check_strtod
    "expected {}, got {}".format(s, expected, got))
AssertionError: Incorrectly rounded str->float conversion for 
22250738585072002149149e-330: expected 0x0.ffffffffffffep-1022, got 0x0.0p+0

======================================================================
FAIL: test_parsing (test.test_strtod.StrtodTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/local/src/pysrc/cpython/Lib/test/test_strtod.py", line 243, in 
test_parsing
    self.check_strtod(s)
  File "/usr/local/src/pysrc/cpython/Lib/test/test_strtod.py", line 105, in 
check_strtod
    "expected {}, got {}".format(s, expected, got))
AssertionError: Incorrectly rounded str->float conversion for -6.E-310: 
expected -0x0.06e7344a56502p-1022, got -0x0.0p+0

======================================================================
FAIL: test_particular (test.test_strtod.StrtodTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/local/src/pysrc/cpython/Lib/test/test_strtod.py", line 393, in 
test_particular
    self.check_strtod(s)
  File "/usr/local/src/pysrc/cpython/Lib/test/test_strtod.py", line 105, in 
check_strtod
    "expected {}, got {}".format(s, expected, got))
AssertionError: Incorrectly rounded str->float conversion for 
12579816049008305546974391768996369464963024663104e-357: expected 
0x0.90bbd7412d19fp-1022, got 0x0.0p+0

======================================================================
FAIL: test_underflow_boundary (test.test_strtod.StrtodTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/local/src/pysrc/cpython/Lib/test/test_strtod.py", line 205, in 
test_underflow_boundary
    self.check_strtod(s)
  File "/usr/local/src/pysrc/cpython/Lib/test/test_strtod.py", line 105, in 
check_strtod
    "expected {}, got {}".format(s, expected, got))
AssertionError: Incorrectly rounded str->float conversion for 
24703282292062327208828439643411068618252990130716238221279284125033775363572e-400: 
expected 0x0.0000000000001p-1022, got 0x0.0p+0

----------------------------------------------------------------------
Ran 7 tests in 0.280s

FAILED (failures=5)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/src/pysrc/cpython/Lib/test/test_strtod.py", line 396, in 
test_main
    test_support.run_unittest(StrtodTests)
  File "/usr/local/src/pysrc/cpython/Lib/test/test_support.py", line 1094, in 
run_unittest
    _run_suite(suite)
  File "/usr/local/src/pysrc/cpython/Lib/test/test_support.py", line 1077, in 
_run_suite
    raise TestFailed(err)
test.test_support.TestFailed: multiple errors occurred
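
All five failures above have the same shape: a subnormal (denormal) result was 
expected and 0x0.0p+0 was returned. That is what a flush-to-zero mode would 
produce, and the CFLAGS above do include -ftz, but treat that link as an 
assumption that has not been confirmed here. A minimal probe that can be run 
under any build; the function name subnormals_preserved is hypothetical:

```python
import sys


def subnormals_preserved():
    """Report whether subnormal doubles survive parsing and arithmetic."""
    smallest_normal = sys.float_info.min  # 2.2250738585072014e-308
    # Parsing: the smallest subnormal, written in hex as in the log above.
    parsed = float.fromhex("0x0.0000000000001p-1022")
    # Arithmetic: halving the smallest normal double gives a subnormal.
    halved = smallest_normal / 2.0
    return parsed != 0.0 and halved != 0.0


if __name__ == "__main__":
    if subnormals_preserved():
        print("subnormals are preserved")
    else:
        print("subnormals are flushed to zero")
```

If this reports flushing under the icc build but not under the system Python, 
rebuilding without -ftz would be a cheap experiment before looking at the 
individual tests.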


*** Binary size and linked libraries ***
## My Intel build ##
$ ls -lh ./python && ldd ./python
-rwxrwxr-x 1 user user 5.2M 2012-02-29 22:10 ./python
        linux-vdso.so.1 =>  (0x00007fffde1ec000)
        libirc.so => 
/usr/intel/composer_xe_2011_sp1.9.293/compiler/lib/intel64/libirc.so 
(0x00007fe5f0f30000)
        libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 
(0x00007fe5f0cde000)
        libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007fe5f0ada000)
        libutil.so.1 => /lib/x86_64-linux-gnu/libutil.so.1 
(0x00007fe5f08d7000)
        libimf.so => 
/usr/intel/composer_xe_2011_sp1.9.293/compiler/lib/intel64/libimf.so 
(0x00007fe5f050b000)
        libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007fe5f0287000)
        libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 
(0x00007fe5f0071000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fe5efcd1000)
        /lib64/ld-linux-x86-64.so.2 (0x00007fe5f107e000)
        libintlc.so.5 => 
/usr/intel/composer_xe_2011_sp1.9.293/compiler/lib/intel64/libintlc.so.5 
(0x00007fe5efb85000)

## System build ##
$ ls -lhH /usr/bin/python && ldd /usr/bin/python
-rwxr-xr-x 1 root root 2.7M 2011-10-04 22:26 /usr/bin/python
        linux-vdso.so.1 =>  (0x00007fff509ff000)
        libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 
(0x00007f3e339b0000)
        libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f3e337ab000)
        libutil.so.1 => /lib/x86_64-linux-gnu/libutil.so.1 
(0x00007f3e335a8000)
        libssl.so.1.0.0 => /lib/x86_64-linux-gnu/libssl.so.1.0.0 
(0x00007f3e33357000)
        libcrypto.so.1.0.0 => /lib/x86_64-linux-gnu/libcrypto.so.1.0.0 
(0x00007f3e32fa7000)
        libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007f3e32d8f000)
        libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f3e32b0b000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f3e3276b000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f3e33c03000)


*** Conclusion (finally!) ***
The Intel Python build looks very promising, but I don't yet trust it to the 
extent that I'd go ahead and install it or use it in place of the system 
build. None of the errors look too alarming though, so I'm confident that I 
could actually get this to work, with the right help.

If someone could help me pass these final tests and compile the ffi64.c module, 
that'd be amazing!

I hope to hear back from you,
Kind regards,
Alex

P.S. Sorry for how long this email turned out!
P.P.S. I'd be happy to write up the fully working solution on a wiki or 
somewhere similar, if anyone can suggest where.
