Building Python 2.4 with icc and processor-specific optimizations
Michael Hoffman
cam.ac.uk at mh391.invalid
Mon Mar 14 18:42:44 EST 2005
Martin v. Löwis wrote:
> OTOH, it could also be Python's failure to follow C's aliasing rules
> correctly; Python casts between C pointers which, in strict C, causes
> undefined behaviour. So if your compiler has something similar to GCC's
> -fno-strict-aliasing, you could see whether this helps.
There's nothing exactly like that. There is an -falias option, which
the manual describes only as "assume aliasing."
> If not, just try comparing the assembler output of either code, on
> a function-by-function basis.
Oh boy, it's a 10,000-line diff. The joys of interprocedural
optimization. I think I'll quit while I'm ahead...
> Alternatively, try to annotate the
> calls that go out of the sorting (e.g. to RichCompareBool) so that
> you get tracing, and then see where the traces differ.
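For comparison, the same kind of trace can be approximated from the Python side, without touching the C source. This is only a sketch: Traced and trace_sort are hypothetical names, the input order is a guess reconstructed from the traces in this post, and a different Python version may compare elements in a different order than a 2.4 build does.

```python
class Traced(str):
    """str subclass that records every '<' comparison made on it."""
    log = []  # (left, right) pairs, in the order the sort asked for them

    def __lt__(self, other):
        Traced.log.append((str(self), str(other)))
        return str.__lt__(self, other)


def trace_sort(names):
    """Sort a copy of names; return (sorted names, comparison trace)."""
    Traced.log = []
    items = [Traced(n) for n in names]
    items.sort()  # list.sort() asks for '<' on each pair it examines
    return [str(i) for i in items], list(Traced.log)


# Assumed input order, inferred from the traces in this post.
names = ['thread', 'signal', 'posix', 'errno', '_sre',
         '_codecs', 'zipimport', '_symtable']
ordered, trace = trace_sort(names)
for left, right in trace:
    print("lt(%r, %r)" % (left, right))
```

Running this under a known-good interpreter gives a reference sequence to hold the instrumented C output against.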
Well, they go wrong almost right away:
non-optimized:
PyObject_RichCompareBool('signal', 'thread', 0)
PyObject_RichCompareBool('posix', 'signal', 0)
PyObject_RichCompareBool('errno', 'posix', 0)
PyObject_RichCompareBool('_sre', 'errno', 0)
PyObject_RichCompareBool('_codecs', '_sre', 0)
PyObject_RichCompareBool('zipimport', '_codecs', 0)
PyObject_RichCompareBool('zipimport', 'posix', 0)
PyObject_RichCompareBool('zipimport', 'thread', 0)
PyObject_RichCompareBool('_symtable', 'posix', 0)
optimized:
PyObject_RichCompareBool('signal', 'thread', 0)
PyObject_RichCompareBool('posix', 'errno', 0) # hmmm, comparing in the wrong direction
PyObject_RichCompareBool('posix', 'thread', 0)
PyObject_RichCompareBool('posix', 'signal', 0)
PyObject_RichCompareBool('errno', 'errno', 0) # totally bogus!
PyObject_RichCompareBool('errno', 'errno', 0) # and repeating it twice for good measure!
PyObject_RichCompareBool('_sre', 'errno', 0)
PyObject_RichCompareBool('_sre', 'errno', 0)
PyObject_RichCompareBool('_sre', 'posix', 0)
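With longer traces it is easier to let a script find the first mismatch. A minimal sketch, using the first few calls from the two traces above as data; first_divergence is a hypothetical helper, not part of any tool used here.

```python
def first_divergence(expected, actual):
    """Return the index of the first differing entry, or None if identical."""
    for i, (e, a) in enumerate(zip(expected, actual)):
        if e != a:
            return i
    if len(expected) != len(actual):
        # One trace is a strict prefix of the other.
        return min(len(expected), len(actual))
    return None


# (left, right) arguments of the first calls in each trace above.
non_optimized = [('signal', 'thread'), ('posix', 'signal'),
                 ('errno', 'posix'), ('_sre', 'errno')]
optimized = [('signal', 'thread'), ('posix', 'errno'),
             ('posix', 'thread'), ('posix', 'signal')]

i = first_divergence(non_optimized, optimized)
if i is not None:
    print("first difference at call %d: %r vs %r"
          % (i, non_optimized[i], optimized[i]))
```

On this data the traces part ways at the second call (index 1), which matches what the eye picks out above.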
Well, I have probably spent too much time on this already. To top things off, Python
compiled with -O3 and without -xN actually runs faster, so I shouldn't even be going
down this road.
--
Michael Hoffman