[issue39159] Ideas for making ast.literal_eval() usable

Fri Feb 14 14:37:52 EST 2020

Batuhan Taskaya <batuhanosmantaskaya at gmail.com> added the comment:

> 1) We should document possible exceptions that need to be caught.  So far, I've found TypeError, MemoryError, SyntaxError, ValueError.

Maybe we should wrap all of these in something like a LiteralEvalError so that they are easy to catch together. LiteralEvalError could be a subclass of those four, but I suspect that in some cases this change might break existing code.
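
A rough sketch of what I mean, only to illustrate the idea. Neither LiteralEvalError nor safe_literal_eval exists in the ast module; both names are hypothetical, and the wrapper simply re-raises the four exceptions listed above as a single type:

```python
import ast


class LiteralEvalError(TypeError, MemoryError, SyntaxError, ValueError):
    """Hypothetical umbrella exception; not part of the stdlib."""


def safe_literal_eval(source):
    """Hypothetical helper: call ast.literal_eval() and funnel the four
    known failure modes into LiteralEvalError."""
    try:
        return ast.literal_eval(source)
    except (TypeError, MemoryError, SyntaxError, ValueError) as exc:
        # Chain the original exception so no detail is lost.
        raise LiteralEvalError(str(exc)) from exc
```

Because the class inherits from all four, code that already does `except ValueError:` or `except SyntaxError:` would still catch it, while new code could catch LiteralEvalError alone. The original exception stays reachable through `__cause__`.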

> 2) Define a size limit guaranteed not to give a MemoryError.  The smallest unsafe size I've found so far is 301 characters:

>>> import ast
>>> s = "(" * 101 + ")" * 101
>>> len(s)
202
>>> ast.literal_eval(s)
s_push: parser stack overflow
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.9/ast.py", line 61, in literal_eval
    node_or_string = parse(node_or_string, mode='eval')
  File "/usr/local/lib/python3.9/ast.py", line 49, in parse
    return compile(source, filename, mode, flags,
MemoryError
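
Until a limit is defined, a caller could pre-check the nesting depth before handing the string to literal_eval(). This is only a sketch under assumptions: max_bracket_depth is a hypothetical helper, the scan is deliberately naive (it also counts brackets inside string literals, so it can only overestimate), and any threshold chosen from the failure above is an observation, not a guaranteed-safe bound:

```python
def max_bracket_depth(source):
    """Hypothetical helper: return the deepest bracket nesting in source.

    Naive character scan; brackets inside string literals are counted
    too, so the result may overestimate the real nesting depth.
    """
    depth = 0
    deepest = 0
    for ch in source:
        if ch in "([{":
            depth += 1
            deepest = max(deepest, depth)
        elif ch in ")]}":
            # Clamp at zero so unbalanced input cannot go negative.
            depth = max(depth - 1, 0)
    return deepest
```

For the string used above, max_bracket_depth(s) is 101, so a caller could reject it before parsing.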

> 3) Consider writing a standalone expression compiler that doesn't have the same small limits as our usual compile() function.  This would make literal_eval() usable for evaluating tainted inputs with bigger datasets. (Imagine if the json module could only be safely used with inputs under 301 characters).

Can you explain this in a bit more detail? How should this standalone expression compiler work?

----------
components: +Library (Lib)
nosy: +BTaskaya
type:  -> enhancement
versions: +Python 3.9

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue39159>
_______________________________________