Creating a reliable sandboxed Python environment

Sat May 30 16:00:12 EDT 2015

Steven D'Aprano <steve at pearwood.info> writes:
> I wouldn't have imagined that the claim "it's easier to secure a small
> language with a few features than a big language with lots of features"
> would have been so controversial.

Consider that if the small language is Turing-complete, you can use it
to implement the big language.  If the small language is also secure (in
the sense of not being able to escape a sandbox), the big language
implemented in it can't escape the sandbox either.  Therefore the size
of the language doesn't inherently affect the sandbox security.

Implementing Python in Lua (with LuaJIT) might even have tolerable
performance, possibly beating CPython.

> I wonder if this claim will be equally as controversial?  There is a
> rough correlation between the number of lines of code in a code base,
> and the number of potential security holes that need to be guarded
> against.

Bigger programs are more likely to have bugs, sure, and Lua might have
those already.  But that's not the issue Python faces regarding
sandboxing, where it's insecure by design.
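
One commonly cited illustration of this (offered as an example, not
necessarily the specific problem meant above): emptying the builtins
table before calling eval() does not cut untrusted code off from the
rest of the interpreter, because ordinary attribute access on any
literal still reaches the whole class hierarchy.  The names
restricted_globals, untrusted and reachable are made up for the sketch.

```python
# Evaluate untrusted text with an empty builtins table, a common first
# attempt at restricting eval().
restricted_globals = {"__builtins__": {}}

# No builtin names are used here, only attribute access on a tuple literal:
# tuple -> object -> every class currently loaded that derives from object.
untrusted = "().__class__.__base__.__subclasses__()"

reachable = eval(untrusted, restricted_globals)

# The untrusted expression now holds references to interpreter internals
# that the restricted namespace was supposed to hide.
print(len(reachable), "classes reachable from an 'empty' namespace")
```

The language's introspection features are doing exactly what they were
designed to do here, which is why this is not the same kind of problem
as an ordinary implementation bug.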

>> Stuff like bignums and unicode in themselves wouldn't have 
>> affected security. 
>
> Do you consider a Denial of Service or Memory Exhaustion attack to be a
> security issue? 

It's less of an issue on the client side, where you don't mind too much if
an attacker DoSes his own machine.  Otherwise you have to consider
memory allocation and CPU cycles to be controlled system resources,
which is not rocket science (every operating system does that).  I'm not
sure where Lua sits with regard to this.
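
As a rough sketch of treating memory and CPU as controlled resources at
the operating-system level, the following uses the Unix-only resource
module to cap a child Python process.  The helper names run_limited and
_cap and the default limits are assumptions made for the example, and
this only limits resource use; it is not a sandbox in any other sense.

```python
import resource
import subprocess
import sys


def _cap(which, wanted):
    # Lower both the soft and hard limit, never exceeding the current
    # hard limit (an unprivileged process may only lower it).
    _soft, hard = resource.getrlimit(which)
    if hard != resource.RLIM_INFINITY:
        wanted = min(wanted, hard)
    resource.setrlimit(which, (wanted, wanted))


def run_limited(source, cpu_seconds=1, memory_bytes=512 * 1024 * 1024):
    """Run Python source text in a child process under OS-enforced limits."""

    def apply_limits():
        # Runs in the child just before exec, so the parent is unaffected.
        _cap(resource.RLIMIT_CPU, cpu_seconds)   # CPU time, in seconds
        _cap(resource.RLIMIT_AS, memory_bytes)   # address space, in bytes

    return subprocess.run(
        [sys.executable, "-c", source],
        preexec_fn=apply_limits,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )


if __name__ == "__main__":
    print(run_limited("print(6 * 7)").stdout)
    # A busy loop is killed by the kernel once it uses up its CPU budget.
    print(run_limited("while True: pass").returncode)
```

How strictly RLIMIT_AS is enforced varies between platforms, so the
memory cap in particular should be checked on the target system.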

> If not, try running this in Python:
> 100**100**100

That's not an issue with bignums in themselves, but rather it's an
artifact of CPython's implementation.  Exponentiation works by repeated
squaring, and each squaring step at most doubles the size of its input
and takes a predictable number of cycles, so a sandboxed implementation
could get by with just checking input sizes before every multiplication.
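
A minimal sketch of that idea, assuming a made-up bit budget: before each
multiplication, bound the size of the product from the sizes of the
operands and refuse if it could exceed the budget.  The names
checked_mul, checked_pow, ResourceLimitExceeded and max_bits are
hypothetical, not anything CPython provides.

```python
class ResourceLimitExceeded(Exception):
    """Raised when an intermediate result could exceed the size budget."""


def checked_mul(a, b, max_bits):
    # The product of an m-bit and an n-bit integer has at most m + n bits,
    # so the result size can be bounded before doing any work.
    if a.bit_length() + b.bit_length() > max_bits:
        raise ResourceLimitExceeded("product could exceed %d bits" % max_bits)
    return a * b


def checked_pow(base, exponent, max_bits=10**6):
    """Integer power by repeated squaring, checking sizes at every step."""
    if exponent < 0:
        raise ValueError("negative exponents are not handled in this sketch")
    result = 1
    while exponent:
        if exponent & 1:
            result = checked_mul(result, base, max_bits)
        exponent >>= 1
        if exponent:
            # Square only if another bit of the exponent remains.
            base = checked_mul(base, base, max_bits)
    return result


if __name__ == "__main__":
    print(checked_pow(3, 13) == 3 ** 13)
    try:
        checked_pow(100, 100 ** 100)   # the inner 100**100 is small
    except ResourceLimitExceeded as exc:
        print("refused:", exc)
```

The check is conservative, since it uses an upper bound on the product
size, but it rejects the oversized case after a handful of squarings
instead of running until memory is gone.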

> (Perhaps not a great idea.) How about defeating cryptographic protection
> mechanisms?...
> Or using Unicode to bypass data validation?...
> https://capec.mitre.org/data/definitions/71.html
> Unicode encoding attacks?... ... ...

None of the stuff you listed appears to be an issue inherent in
supporting some feature in a language.  They are mostly application and
library bugs.  I got bored enough that I didn't look at all of them, so
maybe I missed something.