'_[1]' in .co_names using builtin compile() in Python 2.6

Wed Nov 27 15:09:32 EST 2013

On 11/27/13 2:40 PM, magnus.lycka at gmail.com wrote:
> When I run e.g. compile('sin(5) * cos(6)', '<string>', 'eval').co_names, I get ('sin', 'cos'), which is just what I expected.
>
> But when I have a list comprehension in the expression, I get a little surprise:
> >>> compile('[x*x for x in y]', '<string>', 'eval').co_names
> ('_[1]', 'y', 'x')
> >>>
>
> This happens in Python 2.6.6 on Red Hat Linux, but not when I run Python 2.7.3 in Windows. Unfortunately I'm stuck with 2.6.
>
> * Are there more surprises similar to this one that I can expect from compile(...).co_names? Is this "behaviour" documented somewhere?
>

That name is the name of the list being built by the comprehension, 
which I found out by disassembling the code object to see the bytecodes:

     >>> co = compile("[x*x for x in y]", "<s>", "eval")
     >>> co.co_names
     ('_[1]', 'y', 'x')
     >>> import dis
     >>> dis.dis(co)
       1           0 BUILD_LIST               0
                   3 DUP_TOP
                   4 STORE_NAME               0 (_[1])
                   7 LOAD_NAME                1 (y)
                  10 GET_ITER
             >>   11 FOR_ITER                17 (to 31)
                  14 STORE_NAME               2 (x)
                  17 LOAD_NAME                0 (_[1])
                  20 LOAD_NAME                2 (x)
                  23 LOAD_NAME                2 (x)
                  26 BINARY_MULTIPLY
                  27 LIST_APPEND
                  28 JUMP_ABSOLUTE           11
             >>   31 DELETE_NAME              0 (_[1])
                  34 RETURN_VALUE

The same list comprehension in 2.7 uses an unnamed list on the stack:

       1           0 BUILD_LIST               0
                   3 LOAD_NAME                0 (y)
                   6 GET_ITER
             >>    7 FOR_ITER                16 (to 26)
                  10 STORE_NAME               1 (x)
                  13 LOAD_NAME                1 (x)
                  16 LOAD_NAME                1 (x)
                  19 BINARY_MULTIPLY
                  20 LIST_APPEND              2
                  23 JUMP_ABSOLUTE            7
             >>   26 RETURN_VALUE

I don't know whether such facts are documented.  They are deep 
implementation details, and change from version to version, as you've seen.
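
One related surprise I'm fairly sure of: co_names only covers the
top-level code object.  A generator expression compiles to a nested code
object stored in co_consts, so names used only inside it don't show up at
the top level.  Here is a minimal sketch that walks the nested code
objects too.  The helper name all_co_names is made up for this
illustration, and exactly which constructs get their own code object also
varies by version:

```python
import types

def all_co_names(code):
    """Collect co_names from a code object and any code objects nested in it.

    Hypothetical helper for illustration; not part of the standard library.
    """
    names = set(code.co_names)
    for const in code.co_consts:
        # Nested functions, lambdas and generator expressions are stored
        # as code objects among the constants of the enclosing code.
        if isinstance(const, types.CodeType):
            names |= all_co_names(const)
    return names

co = compile("sum(f(x) for x in y)", "<string>", "eval")
top_level = set(co.co_names)      # names visible on the outer code object
everything = all_co_names(co)     # also includes names used in the genexp
```

Here 'f' is only referenced inside the generator expression, so it is
missing from top_level but present in everything.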

> * Is there perhaps a better way to achieve what I'm trying to do?
>
> What I'm really after is to check that Python expressions embedded in text files:
> - are well behaved (no syntax errors etc.)
> - don't accidentally access anything they shouldn't
> - are served the values they need on execution

I hope you aren't trying to prevent malice this way: you cannot examine 
a piece of Python code to prove that it's safe to execute.  For an 
extreme example, see: Eval Really Is Dangerous: 
http://nedbatchelder.com/blog/201206/eval_really_is_dangerous.html

In your environment it looks like you have a whitelist of identifiers, 
so you're probably ok.

>
> So, in the case of "a.b + x" I'm really just interested in a and x, not b. So the (almost) whole story is that I do:
>
>      # Find names not starting with ".", i.e a & b in "a.c + b"
>      abbr_expr = re.sub(r"\.\w+", "", expr)
>      names = compile(abbr_expr, '<string>', 'eval').co_names
>      # Python 2.6 returns '_[1]' in co_names for list comprehension. Bug?
>      names = [name for name in names if re.match(r'\w+$', name)]
>
>      for name in names:
>          if name not in allowed_names:
>              raise NameError('Name: %s not permitted in expression: %s' % (name, expr))
>

I don't know of a better way to determine the real names in the 
expression.  I doubt Python will ever insert a valid identifier into the 
namespace, since it doesn't want to step on real user names.  The 
simplest way to avoid collisions is to autogenerate names that aren't 
valid identifiers, like "_[1]" (I wonder why it isn't "_[0]"?)
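
One alternative that might be worth trying, offered as a sketch rather
than a tested recommendation: the ast module (new in 2.6) exposes names as
they are written in the source, so compiler temporaries like "_[1]" never
appear, and an attribute such as b in "a.b" is not a Name node, which
would make the regex step unnecessary.  The function name free_names is
hypothetical, and the handling of comprehension variables is a simplifying
assumption: any name that is assigned anywhere in the expression is
dropped, even if it is also read outside the comprehension.

```python
import ast

def free_names(expr):
    """Return the set of names an expression reads but never assigns.

    Hypothetical helper; a sketch, not a security check.
    """
    tree = ast.parse(expr, "<string>", "eval")
    loaded = set()
    stored = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Name):
            if isinstance(node.ctx, ast.Load):
                loaded.add(node.id)
            else:
                # Loop variables of comprehensions have a Store context.
                stored.add(node.id)
    # Simplification: treat every assigned name as local to the expression.
    return loaded - stored

def check_names(expr, allowed_names):
    """Raise NameError if expr reads a name outside allowed_names."""
    for name in sorted(free_names(expr)):
        if name not in allowed_names:
            raise NameError('Name: %s not permitted in expression: %s'
                            % (name, expr))
```

With this, check_names("a.b + x", set(["a", "x"])) passes without
stripping ".b" first, and the loop variable in "[x*x for x in y]" is not
reported as a name the caller has to supply.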

--Ned.