[Python-Dev] Challenge: Please break this! (a.k.a restricted mode revisited)

Mon Apr 11 05:40:05 EDT 2016

2016-04-10 18:43 GMT+02:00 Jon Ribbens <jon+python-dev at unequivocal.co.uk>:
> On Sat, Apr 09, 2016 at 02:43:19PM +0200, Victor Stinner wrote:
>>    Please don't loose time trying yet another sandbox inside CPython. It's
>>    just a waste of time. It's broken by design.
>>
>>    Please read my email about my attempt (pysandbox):
>>    https://lwn.net/Articles/574323/
>>
>>    And the LWN article:
>>    https://lwn.net/Articles/574215/
>>
>>    There are a lot of safe ways to run CPython inside a sandbox (and not rhe
>>    opposite).
>>
>>    I started as you, add more and more things to a blacklist, but it doesn't
>>    work.
>
> That's the opposite of my approach though - I'm starting small and
> adding things, not starting with everything and removing stuff. Even
> if what we end up with is an extremely restricted subset of Python,
> there are still cases where that could be a useful tool to have.

You design rely on the assumption that CPython is only pure Python.
That's wrong. A *lot* of Python features are implemented in C and
"ignore" your sandboxing code. Quick reminder: 50% of CPython is
written in the C language.

It means that your protections like hiding builtin functions from the
Python scope don't work. If an attacker gets access to a C function
giving access to the hidden builtin, the game is over.

pysandbox is based on the idea of tav (his project safelite.py):
remove features in the dictionary of builtin C types like FrameType,
CodeObject, etc. See sandbox/attributes.py. It's not enough to be 100%
safe, a C function can still access fields of the C structure
directly, but it was enough to protect "most" C functions.

It's hard to list all features of the C code which are indirectly
accessible from the Python scope. Some examples: warnings and
tracebacks. These features killed the pysandbox project because they
open directly files on the filesystem, it's not possible to control
these features from the Python scope.

Another example which exposes a vulnerability of your sandbox:
str.format() gets directly object attributes without the getattr()
builtin function, so it's possible to escape your sandbox. Example:
"{0.__class__}".format(obj) shows the type of an object.

Think also about the new f-string which allows arbitrary Python code: f"{code}".

> However on the other hand, nobody has tried before to do what I am
> doing (static code analysis),

You're wrong.

Zope Security ("RestrictedPython") has a similar design. Analyzing AST
is a common design to build a sanbox. But it's not safe.

The "See also" section of my pysandbox has a long list of Python
sandboxes without various design.

> so it's not necessarily a safe
> assumption that the idea is doomed. For example, as far as I can see,
> none of the methods used to break out of your pysandbox would work to
> break out of my experiment.

What I see is that you asked to break your sandbox, and less than 1
hour later, a first vulnerability was found (exec called with two
parameters). A few hours later, a second vulnerability was found
(async generator and cr_frame). By the way, are you sure that you
fixed the vulnerability? You blacklisted "cb_frame", not cr_frame ;-)

You should look closer, pysandbox is very close to you project. It
also uses whitelists for some protections (ex: builtins) and blacklist
for other protections (ex: hide sensitive attributes). You are using a
blacklist for attributes. By the way, you hide cr_frame but not
cr_code. I'm quite sure that it's possible to execute arbitrary
bytecode in your sandbox, I just don't have enough time to dig into
the code. Your sandbox is not fully based on whitelists.

Victor