eval to dict problems NEWB going crazy !

Fri Jul 7 22:19:37 EDT 2006

On Fri, 07 Jul 2006 09:39:38 -0700, Ant wrote:

> 
>> [('recId', 3), ('parse', {'pos': u'np', 'gen': u'm'})]
>> [('recId', 5), ('parse', {'pos': u'np', 'gen': u'm'})]
>> # line injected by a malicious user
>> "__import__('os').system('echo if I were bad I could do worse')"
>> [('recId', 7 ), ('parse', {'pos': u'np', 'gen': u'm'})]
> 
> I'm curious, if you disabled import, could you make eval safe?

Safer, but possibly not safe.

> For example:
> 
>>>> eval("__import__('os').system('echo if I were bad I could do worse')")
> if I were bad I could do worse
> 0
>>>> eval("__import__('os').system('echo if I were bad I could do worse')", {'__import__': lambda x:None})
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
>   File "<string>", line 0, in ?
> AttributeError: 'NoneType' object has no attribute 'system'
> 
> So, it seems to be possible to disable access to imports, but is this
> enough? Are there other ways to access modules, or do damage via
> built-in commands?

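Does the namespace you pass to eval already contain os (say, because your
code did "import os" and you hand eval your module's globals)? Then there
is no need for the import at all. With only '__import__' in the dict the
name os would just raise NameError, but once os is visible:

eval("os.system('echo BOOM!')", {'os': os, '__import__': lambda x:None})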

Or, we can do this:

bomb = """eval("__import__('os').system('echo BOOM!')", __builtins__)"""
eval(bomb, {'__import__': None})

The obvious response is to block eval:

eval(bomb, {'__import__': None, 'eval': None})

Does this make it safe now? I don't know -- I've hunted around for ten
minutes trying to break it, and haven't, but that might just mean I'm not
enough of a hacker or thinking deviously enough. Possibly eval() is more
limited, and therefore "safer", than exec, but I wouldn't want to risk
real data on that assumption.
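
One thing worth checking in that last experiment is whether the expression can still get at the real built-ins. The sketch below is only a harmless probe, assuming CPython's documented behaviour that eval() inserts a '__builtins__' entry into a globals dict that lacks one; the names namespace, probe and recovered are mine, purely for illustration.

```python
# Hypothetical probe: with 'eval' shadowed by None, can the expression
# still reach the genuine built-in eval?
namespace = {'__import__': None, 'eval': None}

# eval() adds '__builtins__' to the namespace when the key is missing.
# It may be a dict or a module depending on context, so handle both.
probe = ("__builtins__['eval'] if isinstance(__builtins__, dict) "
         "else __builtins__.eval")

recovered = eval(probe, namespace)
print(recovered is eval)  # compare with the real built-in
```

If the comparison comes out true, shadowing individual names in the dict is not enough by itself, because the originals are still reachable through '__builtins__'.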

Of course, this approach only protects against one class of attacks.
Suppose Evil J. Cracker has write access to your file, and is happy enough
with just a denial of service attack:

[('recId', 3), ('parse', {'pos': u'np'*1024**4, 'gen': u'm'})]

Do you have a couple of terabytes of free memory on your system?
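
The arithmetic behind that figure can be checked without building the string. This is a rough sketch: one byte per character is an assumption (the real cost depends on how your Python build stores unicode strings, and can be two or four times higher).

```python
# Size of u'np' * 1024**4, computed without allocating it.
chars = len(u'np') * 1024**4        # characters in the resulting string

# Assumption: one byte per character; wider builds need 2x or 4x this.
tebibytes = chars / 1024.0**4

print(chars, tebibytes)
```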

Of course, if your code is only going to be used by *trusted* users, then
you don't have to worry about malicious attacks. You do have to worry
about accidental bugs though. What if one of the lines is missing a
delimiter or otherwise malformed? The call to eval() will fail, and your
code will halt. Is that what you want, or is it better to skip over the
bad data and continue? (A try...except... block could be useful here.)
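
A minimal sketch of the skip-and-continue approach, for *trusted* input only (it still calls eval, so it does nothing about the attacks above). The helper name load_records and the sample lines are made up for illustration; the second line is deliberately missing its closing bracket.

```python
def load_records(lines):
    """Eval each line; collect failures instead of halting on the first."""
    records, skipped = [], []
    for lineno, line in enumerate(lines, 1):
        try:
            records.append(eval(line))
        except Exception as err:       # SyntaxError, NameError, ...
            skipped.append((lineno, err))
    return records, skipped

lines = [
    "[('recId', 3), ('parse', {'pos': u'np', 'gen': u'm'})]",
    "[('recId', 5), ('parse', {'pos': u'np', 'gen': u'm'})",   # malformed
    "[('recId', 7), ('parse', {'pos': u'np', 'gen': u'm'})]",
]

records, skipped = load_records(lines)
print(len(records), "parsed; skipped lines:", [n for n, _ in skipped])
```

Keeping the line number alongside the exception makes it easy to report which records were dropped, rather than silently losing them.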

Anyway, eval is a legitimate tool to use, although it is often overkill
for the tasks people use it for. In the Original Poster's example, he
doesn't really want to evaluate an arbitrary Python expression, he wants
to evaluate a specific data structure. 

> It seems that there must be a way to use eval safely, 

"Must" does not mean "I wish there was".

> as there are
> plenty of apps that embed python as a scripting language - 

As Fredrik points out, embedded Python isn't the same as running
untrusted code. The reality is, Python was not designed for running
untrusted code safely. There was an attempt at a restricted-execution
module (rexec), but Guido decided to remove it -- see this thread for his
reasoning:

http://mail.python.org/pipermail/python-dev/2002-December/031160.html

> and what's
> the point of an eval function if impossible to use safely, and you have
> to write your own Python parser!!

As for eval, it's a sledge-hammer. Sledge-hammers are legitimate tools,
for when you need one. eval is for evaluating arbitrary Python
expressions -- my rule of thumb (yours may be different) is that any time
I expect arbitrary data, eval is the right tool for the job, but if I
expect *specific* data, I use something else.

Imagine if the only way to get an integer was by calling eval on the
string -- I think we'd all agree that would be a bad move. Instead we have
a function which does nothing but convert strings (well, any object
really) to integers: int. It would be great if Python included tools to do
the same for dicts and lists, reducing the need for people to use a
sledge-hammer.
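
For what it's worth, later Python releases (2.6 onwards) do ship something along these lines: ast.literal_eval, which accepts only literals (strings, numbers, tuples, lists, dicts and a few others) and rejects function calls and most operators. A minimal sketch, assuming records shaped like the lines quoted at the top of this thread; parse_record is a hypothetical helper name, and this is not a complete defence against hostile input, since very large or deeply nested literals can still exhaust memory.

```python
import ast

def parse_record(line):
    """Return the record as a dict, or None if the line is not a plain literal."""
    try:
        pairs = ast.literal_eval(line)   # literals only: no calls, no names
        return dict(pairs)
    except (ValueError, SyntaxError, TypeError):
        return None

good = "[('recId', 3), ('parse', {'pos': u'np', 'gen': u'm'})]"
evil = "__import__('os').system('echo if I were bad I could do worse')"
huge = "[('recId', 3), ('parse', {'pos': u'np'*1024**4, 'gen': u'm'})]"

for line in (good, evil, huge):
    print(parse_record(line))
```

Both the injected call and the giant-string multiplication are rejected as non-literals here, while the ordinary record comes back as a dict.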

Anyway, my point was that you, the developer, have to weigh up the costs
and benefits of eval over a custom parser. The benefit is that eval is
already there, built-in and debugged. The costs are that it can be
insecure, and that it doesn't give you fine control over what data you
parse or how forgiving the parser is.

After that, the decision is yours.

-- 
Steven.