[issue39159] Ideas for making ast.literal_eval() usable

Sun Dec 29 17:22:50 EST 2019

New submission from Raymond Hettinger <raymond.hettinger at gmail.com>:

A primary goal for ast.literal_eval() is to "Safely evaluate an expression node or a string".

For a real application to rely on that guarantee, we need to do several things:

1) We should document possible exceptions that need to be caught.  So far, I've found TypeError, MemoryError, SyntaxError, ValueError.

2) Define an input size limit below which a MemoryError is guaranteed not to occur.  The smallest unsafe size I've found so far is 301 characters:

     from ast import literal_eval

     s = '(' * 100 + '0' + ',)' * 100   # 100 nested one-element tuples, 301 characters
     literal_eval(s)                    # Raises a MemoryError

3) Consider writing a standalone expression compiler that doesn't have the same small limits as our usual compile() function.  This would make literal_eval() usable for evaluating tainted inputs with bigger datasets.  (Imagine if the json module could only be safely used with inputs under 301 characters.)

4) Perhaps document an example of how we suggest that someone process tainted input:

     from ast import literal_eval

     # error() is a placeholder for the application's own error reporting
     expr = input('Enter a dataset in Python format: ')
     if len(expr) > 300:
         error(f'Maximum supported size is 300, not {len(expr)}')
     try:
         data = literal_eval(expr)
     except (TypeError, MemoryError, SyntaxError, ValueError):
         error('Input cannot be evaluated')
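
The same idea could be packaged as a reusable helper. The following is only a sketch under stated assumptions: the names parse_tainted, UnsafeInput and MAX_LEN are hypothetical, the 300-character limit is the empirical figure from item 2 rather than a documented guarantee, and RecursionError is caught as an extra precaution even though it is not in the list from item 1.

```python
from ast import literal_eval

MAX_LEN = 300  # assumption: empirical limit from item 2, not a documented guarantee


class UnsafeInput(Exception):
    """Hypothetical error raised when tainted input is rejected."""


def parse_tainted(expr, max_len=MAX_LEN):
    """Evaluate an untrusted Python literal, or raise UnsafeInput."""
    # Reject oversized input before it ever reaches the compiler.
    if len(expr) > max_len:
        raise UnsafeInput(f'Maximum supported size is {max_len}, not {len(expr)}')
    try:
        return literal_eval(expr)
    # The first four exceptions are the ones listed in item 1;
    # RecursionError is an added precaution, not part of that list.
    except (TypeError, MemoryError, SyntaxError, ValueError, RecursionError) as exc:
        raise UnsafeInput('Input cannot be evaluated') from exc
```

A caller would then only need to handle one exception type, for example by wrapping parse_tainted(input(...)) in a single try/except UnsafeInput block.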

----------
messages: 359011
nosy: rhettinger
priority: normal
severity: normal
status: open
title: Ideas for making ast.literal_eval() usable

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue39159>
_______________________________________