[issue27725] Use Py_SIZE(x) instead of x->ob_size

REIX Tony report at bugs.python.org
Thu Aug 11 04:14:03 EDT 2016


REIX Tony added the comment:

Hi Raymond

I've had several email exchanges with the IBM XLC expert. From his own study of my issue, his conclusion is that this kind of Python v2 code is not ANSI-aliasing safe. As I understand it, the C standard forbids certain coding patterns so that a compiler's optimizer can do its best.

The issue was not there with XLC v12.1.0.14 and -O2.
It appeared with XLC v13.1.3.2 and -O2, since the XLC v13 optimizer is more aggressive.

About GCC, I've not experimented with it yet (I hope to do so later today), but the impact should be the same, depending on the optimization level and optimizer improvements.

Here is what IBMer Steven said:

"I found the problem.
It is not a problem with the compiler, but a problem with the source code/option set.
It is an ansi aliasing violation. I'll try to provide as much detail as I can to explain it.

At line 2512 of Objects/longobject.c, we have the following code:

if (sign < 0)
    z->ob_size = -(z->ob_size);
return long_normalize(z);

Note that we use z->ob_size to access size, and the type of z is "PyLongObject *".
This value is loaded in long_normalize.
After we inline this function call, the compiler moves the load done in long_normalize above the if statement (past the store that writes to it), which is why we end up with the wrong sign.

Now the question is: why does the compiler think that this is legal?

In long_normalize, the size is obtained using a macro Py_SIZE(v) (line 47).
This macro expands to:

(((PyVarObject*)(v))->ob_size)

Notice that the pointer is cast to something of type PyVarObject*.
PyVarObject and PyLongObject are not compatible types, and, because ANSI aliasing is assumed, the compiler believes they do not reference the same memory. It therefore concludes that moving the load is safe.
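
A minimal sketch of the pattern described above. The types VarLike and LongLike, the macro SIZE_OF and both functions are hypothetical stand-ins, not the real CPython headers, and the field types are assumptions made for illustration:

```c
#include <stddef.h>

/* Hypothetical stand-in for PyVarObject (assumed field types). */
typedef struct {
    ptrdiff_t ob_refcnt;
    void *ob_type;
    ptrdiff_t ob_size;
} VarLike;

/* Hypothetical stand-in for PyLongObject: same leading fields, plus digits. */
typedef struct {
    ptrdiff_t ob_refcnt;
    void *ob_type;
    ptrdiff_t ob_size;
    unsigned short ob_digit[1];
} LongLike;

/* Mirrors the shape of the Py_SIZE expansion: the pointer is cast first. */
#define SIZE_OF(v) (((VarLike *)(v))->ob_size)

/* Loads the size through VarLike, as long_normalize does via Py_SIZE. */
ptrdiff_t read_size_via_macro(LongLike *v)
{
    return SIZE_OF(v);
}

/* Mixed access: the store goes through LongLike, the load through VarLike.
 * Under ANSI aliasing assumptions an optimizer may treat the two accesses
 * as unrelated and reorder them once read_size_via_macro is inlined. */
ptrdiff_t negate_mixed(LongLike *z)
{
    z->ob_size = -(z->ob_size);
    return read_size_via_macro(z);
}
```

The two structs place ob_size at the same offset, so both accesses hit the same memory even though the compiler is entitled to assume otherwise. Whether a given compiler actually reorders them depends on its version and optimization level.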

A simple solution is to use "-qalias=noansi" when compiling. That will work, but could also hurt performance.

The other solution is to either always use Py_SIZE to access the memory, or never use it.
Do not mix and match. This will require some code changes.
I'll leave it to you to figure out how to handle it, but my guess is that Py_SIZE is supposed to always be used.
The comments in "object.h" lines 11-17 include this phrase: "they must be accessed through special macros and functions only.""
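
A minimal sketch of the "always use the macro" option, assuming the same hypothetical stand-ins (VarLike, LongLike, SIZE_OF). The functions normalize and negate_and_normalize are illustrative only and are not the actual longobject.c code:

```c
#include <stddef.h>

/* Hypothetical stand-ins for PyVarObject and PyLongObject. */
typedef struct {
    ptrdiff_t ob_refcnt;
    void *ob_type;
    ptrdiff_t ob_size;
} VarLike;

typedef struct {
    ptrdiff_t ob_refcnt;
    void *ob_type;
    ptrdiff_t ob_size;
    unsigned short ob_digit[1];
} LongLike;

/* Mirrors the shape of Py_SIZE; every size access below goes through it. */
#define SIZE_OF(v) (((VarLike *)(v))->ob_size)

/* Drops leading zero digits while keeping the sign of the size. */
LongLike *normalize(LongLike *v)
{
    ptrdiff_t j = SIZE_OF(v) < 0 ? -SIZE_OF(v) : SIZE_OF(v);
    ptrdiff_t i = j;

    while (i > 0 && v->ob_digit[i - 1] == 0)
        --i;
    if (i != j)
        SIZE_OF(v) = (SIZE_OF(v) < 0) ? -i : i;
    return v;
}

/* The store and the later loads use the same access path, so the
 * compiler has no type-based reason to reorder them. */
LongLike *negate_and_normalize(LongLike *z, int sign)
{
    if (sign < 0)
        SIZE_OF(z) = -(SIZE_OF(z));
    return normalize(z);
}
```

With a one-digit value whose size is 1, calling negate_and_normalize(z, -1) leaves SIZE_OF(z) at -1, because the negation and the read inside normalize now agree on how the field is accessed.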

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue27725>
_______________________________________

