eval [was Re: dict to boolean expression, how to?]

Fri Aug 1 22:59:20 EDT 2014

On Fri, 01 Aug 2014 17:44:27 +0200, Peter Otten wrote:

> Alex van der Spek wrote:
> 
> 
>> I do know eval() lends itself to code injection but can't say I am
>> fully aware of its dangers. It seemed like a handy tool to me.
> 
> In a lab if you don't need to protect your script against attacks from
> outside eval() (and exec()) is fine. If the data fed to eval() is
> completely under your control (think collections.namedtuple) eval() is
> also fine.

I'm not entirely happy with describing eval as "fine", even when you're 
not concerned with security. There are at least two problems with eval 
that should make it a tool of last resort:

* it's less direct, typically more convoluted, which makes it 
  hard to read, harder to write, and harder to reason about;

* it's slower than running code directly (in my experience, 
  about ten times slower).

If you have the choice between running code directly, or running eval() 
or exec() that runs the same code, you should nearly always prefer to run 
the code directly.

y = x + 1  # Yes.
y = eval("x + 1")  # No.
y = eval("eval('x + 1')")  # Good grief what are you thinking???

Now obviously nobody sensible is going to use eval in production code for 
such simple expressions as "x+1", but the same principle applies even for 
more complex examples. If you can write the code once, as source code, 
it's usually better than generating the source code at runtime and 
running it with eval or exec.
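
To put a rough number on the speed difference mentioned above, here is a 
minimal timing sketch using the standard timeit module. The variable 
names and the repeat count are arbitrary choices of mine, and the ratio 
will vary with the Python version and the machine, so treat the output 
as indicative only:

```python
import timeit

# Hypothetical setup: a single global name for both snippets to use.
x = 41
namespace = {"x": x}

# Time the expression run directly, then the same expression via eval().
direct = timeit.timeit("x + 1", globals=namespace, number=20000)
evaled = timeit.timeit("eval('x + 1')", globals=namespace, number=20000)

ratio = evaled / direct
print("direct: %.4fs  eval: %.4fs  ratio: about %.0fx" % (direct, evaled, ratio))
```

Most of the extra cost comes from eval having to parse and compile the 
string every time it is called.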

Consider the namedtuple implementation in the standard library. There's a 
lot of criticism of it, some of it justified. It uses exec extensively, 
which means the code is dominated by a giant string template. This 
defeats your editor's syntax colouring, makes refactoring harder, and 
makes how the namedtuple works rather less understandable. It seems to me 
that only the generation of the __new__ method genuinely needs exec; the 
rest of the namedtuple could and should be just an ordinary class 
(although I concede that some of this is a matter of personal taste).

Raymond Hettinger's original, using exec for the entire inner class:

http://code.activestate.com/recipes/500261-named-tuples/

My refactoring, with the bare minimum use of exec necessary:

https://code.activestate.com/recipes/578918-yet-another-namedtuple/
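
To illustrate the general idea (this is NOT the code from either recipe, 
just a stripped-down sketch), the following hypothetical make_record 
helper uses exec only to build __new__, since that is the one piece that 
needs a signature containing the field names. Everything else is 
ordinary Python. The helper name and the error messages are my own 
inventions, and it omits most of what the real namedtuple provides:

```python
import keyword
from operator import itemgetter


def make_record(typename, field_names):
    """Hypothetical helper: build a tuple subclass with named fields."""
    fields = tuple(field_names)
    if not fields:
        raise ValueError("at least one field name is required")
    for name in (typename,) + fields:
        if not name.isidentifier() or keyword.iskeyword(name):
            raise ValueError("invalid identifier: %r" % name)

    # The only generated source code: __new__ needs the field names
    # as real parameters, which cannot be spelled without exec.
    args = ", ".join(fields)
    source = (
        "def __new__(cls, {0}):\n"
        "    return tuple.__new__(cls, ({0},))\n"
    ).format(args)
    namespace = {}
    exec(source, namespace)

    # Everything below is plain code, visible to editors and linters.
    def __repr__(self):
        body = ", ".join("%s=%r" % pair for pair in zip(fields, self))
        return "%s(%s)" % (typename, body)

    members = {
        "__slots__": (),
        "_fields": fields,
        "__new__": namespace["__new__"],
        "__repr__": __repr__,
    }
    # One read-only property per field, indexing into the tuple.
    for index, name in enumerate(fields):
        members[name] = property(itemgetter(index))
    return type(typename, (tuple,), members)


Point = make_record("Point", ["x", "y"])
p = Point(1, y=2)
print(p, p.x, p.y)
```

Because the field names are validated as identifiers before being 
substituted into the template, the string passed to exec stays under the 
caller's control.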

[...]
>> bool = ((df['a'] == 1) & (df['A'] == 0) |
>>          (df['b'] == 1) & (df['B'] == 0) |
>>          (df['c'] == 1) & (df['C'] == 0))
>  
> This is how it might look without eval():
> 
> #untested
> result = functools.reduce(operator.or_, ((v == 1) & (df[k.upper()] == 0)
> for k, v in df.items() if k.islower()))

For those who agree with Guido that reduce makes code unreadable:

result = False
for key in df:
    if key.islower():
        result = result or (df[key] == 1 and df[key.upper()] == 0)

Or if you insist on a single expression:

result = any(df[k] == 1 and df[k.upper()] == 0 for k in df if k.islower())
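
As a quick check that the loop and the any() version agree, here is a 
small sketch. It assumes df is a plain dict mapping keys to scalar 
values, with made-up sample data; if the values are array-like (pandas 
columns, say), "and" and "or" will not work element by element and this 
does not apply:

```python
# Hypothetical data: a plain dict of scalars, not a DataFrame.
df = {"a": 1, "A": 0, "b": 0, "B": 0, "c": 1, "C": 1}

# Loop form: start from False so that "or" can only switch the result on.
result_loop = False
for key in df:
    if key.islower():
        result_loop = result_loop or (df[key] == 1 and df[key.upper()] == 0)

# Single-expression form.
result_any = any(df[k] == 1 and df[k.upper()] == 0 for k in df if k.islower())

print(result_loop, result_any)  # True True, because a == 1 and A == 0
```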

> And here is an eval-based solution:
> 
> # untested
> expr = "|".join(
>     "((df[{!r}] == 1) & (df[{!r}] == 0))".format(c, c.upper()) for c in df
>     if c.islower())
> result = eval(expr)

I really don't believe that there is any benefit to that in readability, 
power, flexibility, or performance. Also, you're using bitwise operators 
instead of the short-circuit boolean operators. Any reason why?
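
For comparison, here is a rough sketch that runs the eval-based 
expression, the reduce version and the any() version on the same data. 
It again assumes a plain dict of scalars with invented values, and it 
uses {!r} in the template so that the keys end up quoted inside the 
generated source:

```python
import functools
import operator

# Hypothetical data: a plain dict of scalars.
df = {"a": 0, "A": 0, "b": 1, "B": 1, "c": 1, "C": 0}

# eval-based: build the source text at runtime, then evaluate it.
expr = " | ".join(
    "((df[{!r}] == 1) & (df[{!r}] == 0))".format(c, c.upper())
    for c in df if c.islower())
via_eval = eval(expr, {"df": df})

# Direct, using reduce and the bitwise operators.
via_reduce = functools.reduce(
    operator.or_,
    ((v == 1) & (df[k.upper()] == 0) for k, v in df.items() if k.islower()))

# Direct, using any() and the short-circuit operators.
via_any = any(df[k] == 1 and df[k.upper()] == 0 for k in df if k.islower())

print(via_eval, via_reduce, via_any)  # True True True, from the c/C pair
```

All three give the same answer here; the eval version just takes a 
detour through a string to get there.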

-- 
Steven