[issue32758] Stack overflow when parse long expression to AST

Serhiy Storchaka report at bugs.python.org
Sat Feb 10 04:00:06 EST 2018


Serhiy Storchaka <storchaka+cpython at gmail.com> added the comment:

On Linux the limit is much larger that 2**15. This depends on the stack size, it is smaller on Windows.

The stack is overflowed by recursive call of ast2obj_expr in Python/Python-ast.c. The same problem exists in other recursive AST processing code: optimizing, unparsing. This is why 3.7 can crash in cases when 3.6 was not crashed.

I don't see an easy way of fixing this. The common way is surrounding recursive calls with Py_EnterRecursiveCall()/Py_LeaveRecursiveCall(). But this make the limit too small (1000 by default). In other cases it is enough. The data structures with 1000 levels of nesting are rare. In any case the Python compiler has lower limit on nesting indented blocks. But in this case the linear sequence of binary operators is compiled to deeply nested structure.

I see two hard ways of properly fixing this.

1. Rewrite all AST expression processing code with using an explicit stack instead of function stack. This will complicate a lot of C code too much. Python code should be rewritten too if you don't want to get a RecursionError in corner cases, but for common cases it can be left unmodified.

2. Change binary operations representation, so that a+b+c+d will be represented as ('+', (a, b, c, d)) instead of ('+', ('+', ('+', a, b), c), d). This is backward incompatible change and needs to rewrite any AST processing code (in C and Python). This solution can't be backported. But the resulting code will be simpler than when rewrite them for the first approach.

----------

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue32758>
_______________________________________


More information about the Python-bugs-list mailing list