Untrusted code execution

Thu Apr 7 08:13:02 EDT 2016

On 2016-04-06, Steven D'Aprano <steve at pearwood.info> wrote:
> On Wed, 6 Apr 2016 03:48 am, Chris Angelico wrote:
>> On Wed, Apr 6, 2016 at 3:26 AM, Jon Ribbens
>> <jon+usenet at unequivocal.co.uk> wrote:
>>> The received wisdom is that restricted code execution in Python is
>>> an insolubly hard problem, but it looks a bit like my 7-line example
>>> above disproves this theory, 
>
> Jon's 7-line example doesn't come even close to providing restricted code
> execution in Python. What it provides is a restricted subset of expression
> evaluation, which is *much* easier.

It's true that I was using eval(), but I don't think that actually
fundamentally changes the game. Almost exactly the same sanitisation
method can be used to make exec() code safe. ("import" for example
does not work because there is no "__import__" in the provided
builtins, but even if it did work it could be trivially disallowed by
searching for ast.Import and ast.ImportFrom nodes. "with" must be
disallowed because otherwise __exit__ can be used to get a frame
object.)
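
As a rough illustration of the sanitisation described above (a sketch
only, not the 7-line example from earlier in the thread): the names
check_source, run_restricted and FORBIDDEN_NODES are hypothetical, and
the rule refusing names and attributes that start with "_" is an
assumption added for the sketch. It is not claimed to be a complete
sandbox.

```python
import ast

# Hypothetical deny-list; the post itself names only imports and "with".
FORBIDDEN_NODES = (ast.Import, ast.ImportFrom, ast.With, ast.AsyncWith)


def check_source(source):
    """Parse source and raise ValueError on a disallowed construct."""
    tree = ast.parse(source, mode="exec")
    for node in ast.walk(tree):
        if isinstance(node, FORBIDDEN_NODES):
            raise ValueError("disallowed statement: " + type(node).__name__)
        # Assumption: anything reached through a "_" name is refused.
        if isinstance(node, ast.Attribute) and node.attr.startswith("_"):
            raise ValueError("disallowed attribute: " + node.attr)
        if isinstance(node, ast.Name) and node.id.startswith("_"):
            raise ValueError("disallowed name: " + node.id)
    return tree


def run_restricted(source, allowed_builtins=None):
    """Check the source, then exec it with only the given builtins."""
    tree = check_source(source)
    namespace = {"__builtins__": dict(allowed_builtins or {})}
    exec(compile(tree, "<untrusted>", "exec"), namespace)
    return namespace
```

Walking the tree before compiling means a rejected script never runs
at all, which is the property the eval() version relies on as well.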

> It's barely more powerful than the ast.safe_eval function.

I think you mean ast.literal_eval(), and you're misremembering.
That function isn't even a calculator; it won't even work out
"2*2" for you. It (almost) literally just parses literals ;-)
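
A quick way to see this for yourself; is_literal is a hypothetical
helper written for this sketch, not part of the ast module:

```python
import ast


def is_literal(text):
    """Return True if ast.literal_eval() accepts text."""
    try:
        ast.literal_eval(text)
    except (ValueError, SyntaxError):
        return False
    return True


print(is_literal("[1, 'two', (3.0, None)]"))  # containers of literals
print(is_literal("-1"))                       # a signed number
print(is_literal("2*2"))                      # arithmetic on literals
```

The first two are accepted and the third is refused with ValueError.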

> [Jon again]
>>> provided you choose carefully what you
>>> provide in your restricted __builtins__ - but people who know more
>>> than me about Python seem to have thought about this problem for
>>> longer than I have and come up with the opposite conclusion so I'm
>>> curious what I'm missing.
>
> You're missing that they're trying to allow enough Python functionality to
> run useful scripts (not just evaluate a few arithmetic expressions), but
> without allowing the script to break out of the restricted environment and
> do things which aren't permitted.

Hmm, I'm not missing that; I even explicitly mentioned it previously.
I think you're also missing that eval() can do a very great deal more
than just "arithmetic expressions".

> For example, check out Tav's admirable work some years ago on trying to
> allow Python code to read but not write files:
>
> http://tav.espians.com/a-challenge-to-break-python-security.html

Indeed, I have read that and the follow-ups. He was again making it
hard for himself by trying to allow execution of completely arbitrary
code, and still almost every way to escape relied on "_" attributes
(or him missing the obvious point that you can't check a string is
safe by doing "if foo == 'blah'" if "foo" might be an instance of a
str subclass with a malicious __eq__ method).
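
To make the __eq__ point concrete, here is a small sketch. SneakyStr,
naive_check and stricter_check are hypothetical names invented for the
illustration, and the "r"/"w" file-mode values are an assumption, not
code from Tav's challenge.

```python
class SneakyStr(str):
    """Attacker-supplied str subclass that claims to equal anything."""

    def __eq__(self, other):
        return True

    # Defining __eq__ would otherwise make the class unhashable.
    __hash__ = str.__hash__


def naive_check(mode):
    # Trusts whatever __eq__ the argument brings with it.
    return mode == "r"


def stricter_check(mode):
    # Insists on an exact str before comparing.
    return type(mode) is str and mode == "r"


mode = SneakyStr("w")
print(naive_check(mode), stricter_check(mode))
```

Writing the comparison the other way round, as "r" == mode, does not
help: Python tries the reflected method of the right-hand operand
first when its type is a subclass of the left-hand operand's type.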

> You should also read Guido's comments on capabilities:
>
> http://neopythonic.blogspot.com.au/2009/03/capabilities-for-python.html

Thanks, that's interesting.

> As Zooko says, Guido's "best argument is that reducing usability (in terms
> of forbidding language features, especially module import) and reducing the
> usefulness of extant library code" would make the resulting interpreter too
> feeble to be useful.

Well, no. It makes it too feeble to be used as a generic programming
language. But there is a whole other class of uses for which it would
still be very useful - making very configurable or dynamic systems,
for example. I don't know, imagine GitHub allowed you to upload
restricted-Python code that could be used as a server-side commit
hook, to take a completely random example, or you could upload code
that would generate reports or data for graphing.

> Look at what you've done: you've restricted the entire world of
> Python down to, effectively, a calculator and a few string methods.

Again, no, not really. You have tuples, sets, lists, dictionaries,
lambdas, generator expressions and list comprehensions, etc. And
although I made my example __builtins__ very restricted indeed, that
was just because
I'm asking about the basic principle of the idea. If the idea is
ok then the builtins can be gone through one by one and added if
they're safe.
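
For instance, with a small allow-list of builtins (SAFE_BUILTINS and
restricted_eval are hypothetical names, and the three functions chosen
are only an assumption for the sketch), eval() still handles
containers, lambdas and comprehensions. This shows only the
__builtins__ half of the idea; on its own, without the checks on the
source text, it is not a sandbox.

```python
# Hypothetical allow-list; each entry would need vetting one by one.
SAFE_BUILTINS = {"len": len, "sorted": sorted, "sum": sum}


def restricted_eval(expr):
    """Evaluate expr with only SAFE_BUILTINS visible as builtins."""
    return eval(expr, {"__builtins__": SAFE_BUILTINS})


# A set comprehension fed to an allowed builtin.
print(restricted_eval("sorted({x * x for x in (3, 1, 2)})"))

# A lambda applied to a dictionary.
print(restricted_eval("(lambda d: sum(d.values()))({'a': 1, 'b': 2})"))

# Anything not on the allow-list is simply an unknown name.
try:
    restricted_eval("open('/etc/passwd')")
except NameError as exc:
    print("refused:", exc)
```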

> All the obvious, and even not-so-obvious, attack tools are gone:
> eval, exec, getattr, type, __import__.

Indeed. The fundamental point is that we must not allow the attacker
to have access to any of those things, or to gain access by using any
of the tools which we have provided. I think this is not an impossible
problem.
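
To show why emptying __builtins__ is not enough by itself, here is the
well-known route through "_" attributes. This is a sketch of the
problem, not of the proposed fix; the variable names are arbitrary.

```python
# No builtins at all are provided to the expression...
expr = "().__class__.__base__.__subclasses__()"
classes = eval(expr, {"__builtins__": {}})

# ...yet it still reaches object and every class derived from it.
print(len(classes), "classes reachable from an empty tuple")
```

This is why the source has to be checked for such attribute access
before it is evaluated, in addition to restricting the builtins.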

> I think this approach is promising enough that Jon should take it to a few
> other places for comments, to try to get more eyeballs. Stackoverflow and
> Reddit's /r/python, perhaps. 

I'll post some example code on GitHub in a bit and see what people
think.