[Python-Dev] Challenge: Please break this! (a.k.a restricted mode revisited)

Tue Apr 12 08:05:06 EDT 2016

2016-04-12 13:10 GMT+02:00 Jon Ribbens <jon+python-dev at unequivocal.co.uk>:
> No, it's a matter of reducing the whitelist. I must admit that
> I don't understand in what way this is not already clear. Look:
>
>   >>> len(unsafe._SAFE_MODULES)
>   23

What you are missing is that even if the visible "Python scope" or
"Python namespace" (call it what you want: the code that is reachable
from your sandbox) looks tiny, the code that actually runs is HUGE.
For example, you give full access to the str type, which is
implemented in about 20K lines of C code:

haypo@smithers$ wc -l Objects/unicodeobject.c Objects/unicodectype.c Objects/stringlib/*h
 15670 Objects/unicodeobject.c
   297 Objects/unicodectype.c
    29 Objects/stringlib/asciilib.h
   827 Objects/stringlib/codecs.h
    27 Objects/stringlib/count.h
   109 Objects/stringlib/ctype.h
    25 Objects/stringlib/eq.h
   250 Objects/stringlib/fastsearch.h
   201 Objects/stringlib/find.h
   133 Objects/stringlib/find_max_char.h
   140 Objects/stringlib/join.h
   180 Objects/stringlib/localeutil.h
   116 Objects/stringlib/partition.h
    53 Objects/stringlib/replace.h
   390 Objects/stringlib/split.h
    28 Objects/stringlib/stringdefs.h
   266 Objects/stringlib/transmogrify.h
    30 Objects/stringlib/ucs1lib.h
    29 Objects/stringlib/ucs2lib.h
    29 Objects/stringlib/ucs4lib.h
    11 Objects/stringlib/undef.h
    32 Objects/stringlib/unicodedefs.h
  1284 Objects/stringlib/unicode_format.h
 20156 total

Did you carefully review *all* of these lines? If a single line of C
gives access to the real Python namespace, the game is over.

In a few minutes, I found "{0.__class__}".format(obj). It is not a
full escape from the sandbox, just one example. With more time, I'm
sure that a line can be found in the str type that escapes your
sandbox.
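
To make the str.format() point concrete, here is a minimal sketch. The
Account class and the render() helper are hypothetical names used only
for illustration; they are not part of the sandbox under discussion.
The point is that the format mini-language resolves attribute lookups
by itself, so the template string alone is enough to walk from an
object to its class and base classes:

```python
# Hypothetical object handed to "sandboxed" code.
class Account:
    def __init__(self, owner):
        self.owner = owner


def render(template, obj):
    # The caller only controls the template string. str.format()
    # performs the attribute lookups named in the replacement field.
    return template.format(obj)


acct = Account("alice")

# Reaches the class of the object with no getattr() and no builtins.
leaked_class = render("{0.__class__}", acct)

# Keeps walking: from the class to its method resolution order.
leaked_mro = render("{0.__class__.__mro__}", acct)

print(leaked_class)
print(leaked_mro)
```

Nothing here is an escape on its own. It only shows that attribute
access happens inside the C implementation of str, outside whatever
checks are applied to the Python source.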

> I could "mathematically prove" that there are no more security holes
> in that list by reducing its length to zero.

You only see a very tiny portion of the real attack surface.

> The "minimum viable set" in my view would be: no builtins at all,
> only allowing eval() not exec(), and disallowing yield [from],
> lambdas and generator expressions.

IMHO it's a waste of time to try to reduce the great "batteries
included" Python to a simple calculator that computes 1+2. You will
never be able to plug all the holes; there are too many of them in
your sandbox.

It's very easy to implement your own calculator in pure Python, from
the parser to the code that evaluates the operators. If you write the
whole code yourself, it's much easier to control what is allowed and
to enforce limits. For example, with your own code, you can cap the
size of numbers, whereas your sandbox will exhaust your CPU and memory
if you try 2**(2**100) (no builtin function is required for this
"exploit").
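
As a rough sketch of such a calculator (one possible design, not a
reviewed implementation): a hand-written tokenizer and recursive
descent parser for integers, +, -, *, ** and parentheses. The names
calc, CalcError, MAX_BITS and MAX_INPUT are made up for this example,
and the limits are arbitrary. The size of a power is estimated before
it is computed, so 2**(2**100) is rejected immediately:

```python
import re

MAX_BITS = 4096   # arbitrary cap on the bit length of any value
MAX_INPUT = 100   # arbitrary cap on expression length (bounds recursion)

TOKEN_RE = re.compile(r"\s*(\d+|\*\*|[-+*()])")


class CalcError(Exception):
    pass


def tokenize(text):
    tokens, pos, text = [], 0, text.rstrip()
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise CalcError("unexpected input at position %d" % pos)
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def check(value):
    if value.bit_length() > MAX_BITS:
        raise CalcError("number too large")
    return value


def power(base, exponent):
    if exponent < 0:
        raise CalcError("negative exponent not supported")
    # Conservative upper bound on the result size, checked *before*
    # computing anything: |base| ** exponent needs at most
    # exponent * bit_length(|base|) bits.
    if abs(base) > 1 and exponent * abs(base).bit_length() > MAX_BITS:
        raise CalcError("number too large")
    return check(base ** exponent)


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self):
        token = self.peek()
        self.pos += 1
        return token

    def expr(self):      # expr := term (('+' | '-') term)*
        value = self.term()
        while self.peek() in ("+", "-"):
            if self.take() == "+":
                value = check(value + self.term())
            else:
                value = check(value - self.term())
        return value

    def term(self):      # term := factor ('*' factor)*
        value = self.factor()
        while self.peek() == "*":
            self.take()
            value = check(value * self.factor())
        return value

    def factor(self):    # factor := '-' factor | atom ('**' factor)?
        if self.peek() == "-":
            self.take()
            return -self.factor()
        base = self.atom()
        if self.peek() == "**":
            self.take()
            return power(base, self.factor())
        return base

    def atom(self):      # atom := NUMBER | '(' expr ')'
        token = self.take()
        if token is None:
            raise CalcError("unexpected end of expression")
        if token.isdigit():
            return check(int(token))
        if token == "(":
            value = self.expr()
            if self.take() != ")":
                raise CalcError("expected ')'")
            return value
        raise CalcError("unexpected token %r" % token)


def calc(text):
    if len(text) > MAX_INPUT:
        raise CalcError("expression too long")
    parser = Parser(tokenize(text))
    value = parser.expr()
    if parser.peek() is not None:
        raise CalcError("unexpected token %r" % parser.peek())
    return value
```

With this, calc("1+2") returns 3, while calc("2**(2**100)") raises
CalcError without burning CPU or memory. Names, attribute access and
calls are simply not in the grammar, so there is nothing to whitelist.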

Victor