securing a python execution environment...

miller.paul.w at gmail.com
Wed Nov 28 05:43:26 EST 2007


Here's some proof of concept code I wrote a while back for this very
purpose.  What I do is use compiler.parse to take a code string and
turn it into an abstract syntax tree.  Then, using a custom visitor
object that raises an exception if it comes across something it
doesn't like, I use compiler.visitor.walk to walk the tree and check
for Bad Stuff (tm) such as:

  * Importing a module not on the whitelist of safe modules.
  * Accessing a name that begins with "__" (double underscore).
  * Using the exec statement.
  * Expecting the Spanish Inquisition.

Hehe, ok, just kidding about that last one. :-)

If Bad Stuff(tm) is not detected, then I just compile it with the
built-in compile function and return the resulting code object.

Here it is:

---CUT HERE---

import compiler

# 'this' is the only module I was absolutely sure was safe. :P

allowed_imports = ['this', ]

class __Visitor (compiler.visitor.ExampleASTVisitor):

    def visitImport (self, node):

        # Each entry in the 'names' attribute of an Import or a From
        # node is a (name, asname) pair, where asname is None unless the
        # import has an "as" clause.  (Hence the 'name[0]' in the
        # following list comprehension rather than simply 'name'.)
        # Someone(tm) should definitely work on the documentation for
        # the compiler module.

        bad_names = [name[0] for name in node.names
                     if name[0] not in allowed_imports]
        if bad_names:
            if len (bad_names) > 1:
                raise ImportError ("The modules %s could not be imported." %
                                   ", ".join (bad_names))
            else:
                raise ImportError ("The module %s could not be imported." %
                                   bad_names[0])

    def visitFrom (self, node):
        modname = node.modname
        if modname not in allowed_imports:
            # Join unconditionally so a single name is not printed as a list.
            names = ", ".join ([name[0] for name in node.names])
            raise ImportError ("Could not import %s from module %s." %
                               (names, modname))

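    def visitName (self, node):
        # Only bare names (Name nodes) are seen here.  Attribute access
        # such as obj.__class__ produces a Getattr node, which this
        # visitor does not check.
        if node.name.startswith ('__'):
            raise AttributeError ('Cannot access attribute %s.\n' %
                                  node.name)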

    def visitExec(self, node):
        raise SyntaxError, "Use of the exec statement is not allowed."

# Save the builtin compile function for later.

__compile = compile

def compile (source, filename, mode, *args):
    v = __Visitor()
    ast = compiler.parse (source, mode=mode)
    compiler.visitor.walk (ast, v)

    # If we got here without an exception, then we ought to be OK.
    # To be truly nice, we might want to fix up the traceback in case
    # we deliberately raised the exception, so it's easy to tell the
    # difference between intentional and unintentional exceptions.

    return __compile (source, filename, mode, *args)

---CUT HERE---
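
The compiler module only exists in Python 2, so for anyone reading this on Python 3, here is a rough sketch of the same idea using the standard ast module instead. This is a swapped-in technique, not the code above: the names Checker, checked_compile, ALLOWED_IMPORTS and FORBIDDEN_NAMES are made up for illustration. Since exec is an ordinary function in Python 3, the sketch rejects the bare names exec and eval rather than an exec statement, and it also checks attribute access (obj.__attr__), which the listing above does not. Both of those are assumptions on my part.

```python
import ast

ALLOWED_IMPORTS = {"this"}           # same whitelist as the listing above
FORBIDDEN_NAMES = {"exec", "eval"}   # assumption: stands in for the exec statement


class Checker(ast.NodeVisitor):
    """Raise an exception on the constructs the original visitor rejects."""

    def visit_Import(self, node):
        bad = [alias.name for alias in node.names
               if alias.name not in ALLOWED_IMPORTS]
        if bad:
            raise ImportError("Disallowed import(s): %s" % ", ".join(bad))

    def visit_ImportFrom(self, node):
        # node.module is None for relative imports such as "from . import x".
        if node.module not in ALLOWED_IMPORTS:
            raise ImportError("Could not import from module %s." % node.module)

    def visit_Name(self, node):
        if node.id.startswith("__"):
            raise NameError("Cannot access name %s." % node.id)
        if node.id in FORBIDDEN_NAMES:
            raise SyntaxError("Use of %s is not allowed." % node.id)

    def visit_Attribute(self, node):
        # Not in the original: also reject dunder attributes.
        if node.attr.startswith("__"):
            raise AttributeError("Cannot access attribute %s." % node.attr)
        self.generic_visit(node)  # keep walking into the object expression


def checked_compile(source, filename, mode):
    """Parse, check the tree, and compile only if nothing was flagged."""
    tree = ast.parse(source, filename=filename, mode=mode)
    Checker().visit(tree)
    return compile(tree, filename, mode)
```

Something like checked_compile("import os", "<untrusted>", "exec") should raise ImportError, while checked_compile("x = 1 + 2", "<untrusted>", "exec") should hand back a code object. It has exactly the same weakness as the original: it only stops what I thought to check for.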

As neat as this code is, in the end, I discarded the idea of running
"untrusted" Python code this way.  It's just too easy to screw up and
miss something (cf. the many SQL injection vulnerabilities we've seen
over the years).  And, once an "untrusted" individual has the ability
to run Python code on your box, you're basically screwed, anyway.  For
example, if I wanted to eat a lot of CPU and exhaust memory on
someone's machine, I could just have Python calculate Ackermann's
function or the factorial of some giant number.
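
To make that concrete, here is a small sketch (Python 3, using the ast module rather than compiler, which is my substitution) of source that contains none of the constructs on the blacklist above and still gets expensive very quickly as its arguments grow. The names SOURCE, flagged and namespace are just for illustration.

```python
import ast

# "Untrusted" source: no imports, no double-underscore names, no exec.
SOURCE = '''
def ackermann(m, n):
    if m == 0:
        return n + 1
    if n == 0:
        return ackermann(m - 1, 1)
    return ackermann(m - 1, ackermann(m, n - 1))
'''

tree = ast.parse(SOURCE)

# Collect every node that the blacklist described above would object to.
flagged = [
    node for node in ast.walk(tree)
    if isinstance(node, (ast.Import, ast.ImportFrom))
    or (isinstance(node, ast.Name)
        and (node.id.startswith("__") or node.id == "exec"))
]

# Nothing is flagged, so a checker of this kind would let the code run.
namespace = {}
exec(compile(tree, "<untrusted>", "exec"), namespace)
ackermann = namespace["ackermann"]

small = ackermann(2, 3)  # cheap for small arguments; cost explodes as m grows
```

The syntax check has nothing to say about how much work the code does, so limiting CPU time and memory has to happen somewhere else.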

Once you've sorted out every such possible problem by developing a
comprehensive threat model and defending against every attack you can
come up with, you'll either have

  A) Missed something, or
  B) Written a Python interpreter in Python.

Obviously, A is bad, and B is most likely too slow to be usable,
unless you base your efforts on PyPy. :-)

Nonetheless, the compiler module is a fun toy to have around.  You can
use it for lots of cool stuff, and I heartily recommend playing around
with it.  I just wish it had better documentation. :-)


