Simple question about how the optimizer works
Tim Peters
tim.one at comcast.net
Fri May 10 11:52:42 EDT 2002
[Andrew Dalke, on "optimization"]
> ...
> Given the limited resources for Python development, how much time
> should be spent on this? I think very little. As Tim Peters once
> pointed out, there haven't been problems with Python's optimization
> code <wink>.
That was before we tried any <wink>. One of my coworkers decided they were
tired of seeing LOAD_CONST followed by UNARY_NEGATIVE whenever they had a
negative literal in the source (like -4 or -1.23), and changed the compiler
to store the negation of the literal instead, leaving just LOAD_CONST at run
time.
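
For anyone who wants to see the difference spelled out, here is a toy sketch in C. It is not CPython's bytecode format or eval loop; the two-opcode machine, `struct instr`, and `run` are all made up for illustration. It only shows that folding the minus into the constant leaves one instruction to execute instead of two, with the same result.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical two-opcode machine, not CPython's real bytecode format. */
enum { OP_LOAD_CONST, OP_UNARY_NEGATIVE };

struct instr {
    int op;
    double arg;   /* constant operand; ignored by OP_UNARY_NEGATIVE */
};

/* Runs n instructions and returns the value left on top of the stack. */
double
run(const struct instr *code, size_t n)
{
    double stack[8];
    size_t sp = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        switch (code[i].op) {
        case OP_LOAD_CONST:
            assert(sp < 8);
            stack[sp++] = code[i].arg;
            break;
        case OP_UNARY_NEGATIVE:
            assert(sp > 0);
            stack[sp - 1] = -stack[sp - 1];
            break;
        }
    }
    assert(sp > 0);
    return stack[sp - 1];
}

/* "-4" compiled naively: load 4, then negate it at run time. */
const struct instr unfolded[] = {
    { OP_LOAD_CONST, 4.0 },
    { OP_UNARY_NEGATIVE, 0.0 }
};

/* "-4" with the negation folded into the stored constant. */
const struct instr folded[] = {
    { OP_LOAD_CONST, -4.0 }
};
```

Calling `run(unfolded, 2)` and `run(folded, 1)` yields the same value; the folded form just gets there with one fewer dispatch.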
We do that now, but it introduced several bugs, and the process of stumbling
into them stretched over almost a year. At the parsing end, it wound up
breaking mixtures of unary minus with exponentiation, and at the semantic
end it screwed up on negative float 0 literals (like -0.0). There was also
a bug in memory management, due to indirect mixing of the PyObject_xyz
memory API with raw platform malloc, and that bug went uncaught until very
recently because it could only matter if pymalloc was enabled.
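
To make the first two hazards concrete, here is a small C sketch (mine, not from the compiler; `square`, `same_value`, and `same_bits` are hypothetical names). In Python, -2**2 means -(2**2), so gluing the minus onto the literal before the power is applied would change the answer. And -0.0 compares equal to 0.0 while carrying a different bit pattern on IEEE 754 hardware, so any step that treats equal constants as interchangeable can lose the sign. Whether that is exactly how the original bug arose is an assumption on my part.

```c
#include <string.h>

/* Stand-in for "x ** 2". */
double
square(double x)
{
    return x * x;
}

/* 1 if a and b compare equal as doubles. */
int
same_value(double a, double b)
{
    return a == b;
}

/* 1 if a and b have identical bit patterns.
   Assumes IEEE 754 doubles, where 0.0 and -0.0 differ in the sign bit. */
int
same_bits(double a, double b)
{
    return memcmp(&a, &b, sizeof a) == 0;
}
```

With these, `-square(2.0)` and `square(-2.0)` differ in sign, and `same_value(0.0, -0.0)` holds while `same_bits(0.0, -0.0)` does not.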
The code today looks like this:
    if ((childtype == PLUS || childtype == MINUS || childtype == TILDE)
        && NCH(n) == 2
        && TYPE((pfactor = CHILD(n, 1))) == factor
        && NCH(pfactor) == 1
        && TYPE((ppower = CHILD(pfactor, 0))) == power
        && NCH(ppower) == 1
        && TYPE((patom = CHILD(ppower, 0))) == atom
        && TYPE((pnum = CHILD(patom, 0))) == NUMBER
        && !(childtype == MINUS && is_float_zero(STR(pnum)))) {
        if (childtype == TILDE) {
            com_invert_constant(c, pnum);
            return;
        }
        if (childtype == MINUS) {
            char *s = PyMem_Malloc(strlen(STR(pnum)) + 2);
            if (s == NULL) {
                com_error(c, PyExc_MemoryError, "");
                com_addbyte(c, 255);
                return;
            }
            s[0] = '-';
            strcpy(s + 1, STR(pnum));
            PyMem_Free(STR(pnum));
            STR(pnum) = s;
        }
        com_atom(c, patom);
    }
    else if (childtype == PLUS) {
        com_factor(c, CHILD(n, 1));
        com_addbyte(c, UNARY_POSITIVE);
    }
    else if (childtype == MINUS) {
        com_factor(c, CHILD(n, 1));
        com_addbyte(c, UNARY_NEGATIVE);
    }
    else if (childtype == TILDE) {
        com_factor(c, CHILD(n, 1));
        com_addbyte(c, UNARY_INVERT);
    }
    else {
        com_power(c, CHILD(n, 0));
    }
As that strongly hints, CPython's intermediate code representation is a
concrete syntax tree, and so very difficult for any sort of semantic
analysis to process. This is Good, because it inhibits my coworkers from
doing more of this <wink>.
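
For readers who want to poke at the MINUS branch outside the compiler, here is a standalone sketch of its two moving parts. Both functions are hypothetical stand-ins: `float_zero_sketch` guesses at what a check like is_float_zero() has to do (the excerpt does not show its body), and `prepend_minus` mirrors the string surgery using plain malloc()/free() rather than the PyMem_ calls. The point of the second one is that the buffer is allocated and released by the same allocator family, which is the property the memory-management bug violated.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the is_float_zero() call in the excerpt:
   returns 1 if the literal text denotes a float (or imaginary) zero. */
int
float_zero_sketch(const char *p)
{
    int found_radix_point = 0;

    assert(p != NULL);
    for (; *p != '\0'; p++) {
        switch (*p) {
        case '0':
            break;                  /* still could be zero */
        case '.':
            found_radix_point = 1;
            break;
        case 'e': case 'E': case 'j': case 'J':
            /* Only zeros so far, so an exponent or imaginary
               suffix cannot make the value nonzero. */
            return 1;
        default:
            return 0;               /* nonzero digit, hex marker, suffix */
        }
    }
    return found_radix_point;       /* "0.0" is a float zero, "0" is not */
}

/* Returns a newly malloc()ed copy of literal with '-' prepended, or NULL
   if allocation fails. The caller releases it with free(), so allocation
   and release stay within one allocator family. */
char *
prepend_minus(const char *literal)
{
    char *s;

    assert(literal != NULL);
    s = malloc(strlen(literal) + 2);    /* sign + text + terminating NUL */
    if (s == NULL)
        return NULL;
    s[0] = '-';
    strcpy(s + 1, literal);
    return s;
}
```

A caller would check `float_zero_sketch(text)` first and only call `prepend_minus(text)` when it returns 0, leaving float zeros to the ordinary UNARY_NEGATIVE path.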