Building Python 2.4 with icc and processor-specific optimizations

Michael Hoffman cam.ac.uk at mh391.invalid
Mon Mar 14 18:42:44 EST 2005


Martin v. Löwis wrote:

> OTOH, it could also be Python's failure to follow C's aliasing rules
> correctly; Python casts between C pointers which, in strict C, causes
> undefined behaviour. So if your compiler has something similar to GCC's
> -fno-strict-aliasing, you could see whether this helps.

There's nothing like that specifically. There is an -falias option,
which the manual describes only as "assume aliasing."

> If not, just try comparing the assembler output of either code, on
> a function-by-function basis.

Oh boy, it's a 10,000 line diff. The joys of interprocedural
optimization. I think I'll quit while I'm ahead...

> Alternatively, try to annotate the
> calls that go out of the sorting (e.g. to RichCompareBool) so that
> you get tracing, and then see where the traces differ.

Well, they go wrong almost right away:

non-optimized:

PyObject_RichCompareBool('signal', 'thread', 0)
PyObject_RichCompareBool('posix', 'signal', 0)
PyObject_RichCompareBool('errno', 'posix', 0)
PyObject_RichCompareBool('_sre', 'errno', 0)
PyObject_RichCompareBool('_codecs', '_sre', 0)
PyObject_RichCompareBool('zipimport', '_codecs', 0)
PyObject_RichCompareBool('zipimport', 'posix', 0)
PyObject_RichCompareBool('zipimport', 'thread', 0)
PyObject_RichCompareBool('_symtable', 'posix', 0)

optimized:

PyObject_RichCompareBool('signal', 'thread', 0)
PyObject_RichCompareBool('posix', 'errno', 0)  # hmmm, comparing in the wrong direction
PyObject_RichCompareBool('posix', 'thread', 0)
PyObject_RichCompareBool('posix', 'signal', 0)
PyObject_RichCompareBool('errno', 'errno', 0) # totally bogus!
PyObject_RichCompareBool('errno', 'errno', 0) # and repeating it twice for good measure!
PyObject_RichCompareBool('_sre', 'errno', 0)
PyObject_RichCompareBool('_sre', 'errno', 0)
PyObject_RichCompareBool('_sre', 'posix', 0)
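
For anyone who wants to repeat this kind of comparison without reading
the traces by eye, here is a minimal sketch in Python. It is not the
C-level annotation used above: the names Traced, trace_sort and
first_divergence are hypothetical helpers made up for illustration.
Traced records every "<" comparison a sort makes (op 0 in the traces
above is Py_LT), and first_divergence reports the first position at
which two recorded traces differ. The sample data is the first three
calls from each trace above.

```python
from itertools import zip_longest


class Traced:
    """Hypothetical wrapper: logs every '<' comparison made on a value."""

    def __init__(self, value, log):
        self.value = value
        self.log = log

    def __lt__(self, other):
        # Record the operands in call order, like the traces above.
        self.log.append((self.value, other.value))
        return self.value < other.value


def trace_sort(names):
    """Sort names and return the list of (left, right) comparisons made."""
    log = []
    sorted(Traced(name, log) for name in names)
    return log


def first_divergence(trace_a, trace_b):
    """Return (index, entry_a, entry_b) for the first mismatch, else None."""
    for index, (a, b) in enumerate(zip_longest(trace_a, trace_b)):
        if a != b:
            return index, a, b
    return None


# First three comparisons from each build, copied from the traces above.
non_optimized = [("signal", "thread"), ("posix", "signal"), ("errno", "posix")]
optimized = [("signal", "thread"), ("posix", "errno"), ("posix", "thread")]

print(first_divergence(non_optimized, optimized))
# -> (1, ('posix', 'signal'), ('posix', 'errno'))
```

Running trace_sort on the same input under two builds and feeding both
logs to first_divergence should point at the first comparison that
differs, which is where the two traces above part ways (the second
call).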

Well, I have probably spent too much time on this already. To top
things off, Python compiled with -O3 and without -xN actually runs
faster, so I shouldn't even be going down this road.
-- 
Michael Hoffman


