[IPython-dev] Notebook kernels + LXC

Brian Granger ellisonbg at gmail.com
Wed Oct 24 15:43:52 EDT 2012


Thomas,

LXC containers are an amazing technology and are something I have
thought a lot about.

What you are asking about is 1) sharing and 2) security.  Here is my
short answer:

We want to promote *sharing* models that don't suffer from *security*
holes that require measures such as LXC containers.

Think about shell access.  What would you say if I developed a way of
sharing code, data, and programs in the shell that required an LXC
container?  You would say, "That is insane.  If you trust a user, give
them a shell account and use groups and shared directories; if you
don't, keep them out and share things on the web, public github repos,
etc."

Now a few more details...

On Wed, Oct 24, 2012 at 9:48 AM, Thomas Kluyver <takowl at gmail.com> wrote:
> A question on SO [1] got me thinking again about security in
> multi-user cases. I've read recently about LXC [2], which provides
> lightweight isolated environments for a set of processes.

# More on security

I have spent *a lot* of time thinking about the security model for the
notebook and I will try to summarize my current thinking:

* There are only two sets of users: trusted (those you would give a
unix shell account to on that system) and malicious (those you would
not give a shell account to).  There are no such things as "malicious
users that are not too bad".

Trusted users:

* For trusted users, the main thing you need to do is provide good
isolation.  You can provide most of that using regular unix accounts.
When we implement the multiuser support, we will implement a variety
of ways for trusted users to share notebooks.  This will avoid the
need for one trusted user to run a kernel as another user.
* If two trusted users *really* want to share a notebook, we will
allow that, but that will be akin to giving them your shell logon
credentials.  If a user is not OK with that they should just give the
notebook to the other user to run on their own kernels.
* For trusted users, the only situation where LXC helps is in limiting
system resources.  But I am a bit hesitant to add this support as it
would create a false sense of security.
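
As a rough illustration of what "limiting system resources" means for
a kernel process, here is a minimal sketch that uses plain POSIX
rlimits from Python's standard `resource` module rather than LXC.  The
helper name `run_limited_python` and the default of 5 CPU seconds are
assumptions made for this example; nothing here is an IPython API.  It
only caps resource usage and isolates nothing, which is why a feature
like this can give a false sense of security.

```python
import resource
import subprocess
import sys


def run_limited_python(code, cpu_seconds=5):
    """Run `code` in a child Python process with a CPU-time cap (Unix only).

    Hypothetical helper for illustration; this is not an IPython API.
    """
    def apply_limits():
        # Runs in the child after fork() and before exec(), so the limit
        # applies to the child only; an unprivileged child cannot raise it.
        _, hard = resource.getrlimit(resource.RLIMIT_CPU)
        limit = cpu_seconds
        if hard != resource.RLIM_INFINITY:
            # An unprivileged process cannot go above the existing hard limit.
            limit = min(limit, hard)
        resource.setrlimit(resource.RLIMIT_CPU, (limit, limit))

    return subprocess.run(
        [sys.executable, "-c", code],
        preexec_fn=apply_limits,
        capture_output=True,
        text=True,
    )
```

For example, `run_limited_python("while True: pass", cpu_seconds=1)`
should return once the operating system terminates the child for
exceeding its CPU budget.  The child can still read files and open
sockets with the full rights of the account it runs under.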

Malicious users:

* Securing this case is extremely difficult and LXC alone is not sufficient.
* You have to build a custom version of Python with some libraries
removed, so malicious users cannot launch attacks from the LXC
container on other hosts using things like socket.
* You have to protect the rest of your backend infrastructure from the
kernels running inside the LXC containers.  This is extremely
difficult.  The main problem here is that the backend needs to be able
to talk to the kernel in the LXC container using ZeroMQ sockets.  BUT,
that means the user's code can import pyzmq and use that to attack the
rest of the backend infrastructure.  You need a sandbox that has holes
in it.
* These problems are solvable, but have to be tackled with a system
level approach that involves LXC containers, multiple hosts running
different parts of the backend infrastructure, dynamic firewall rules,
careful monitoring, an intricately designed network architecture that
isolates potentially hostile traffic from trusted backend traffic,
etc.
* In short, it is not possible to "write some python code" that uses
LXC containers and magically get security.
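
To make the "sandbox that has holes in it" point concrete, here is a
small sketch.  It uses a plain TCP socket from the standard library as
a stand-in for the ZeroMQ channel (an assumption, to keep the example
self-contained), and `start_backend`, `run_user_code` and `USER_CODE`
are made-up names.  The point is only that any network path the
backend needs in order to reach the kernel is equally available to the
code the kernel executes.

```python
import socket
import threading


def start_backend():
    """Hypothetical stand-in for a backend service listening on loopback.

    Returns (port, received, thread); `received` collects what one client sent.
    """
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]
    received = []

    def serve():
        conn, _ = server.accept()
        with conn:
            chunks = []
            while True:
                data = conn.recv(1024)
                if not data:  # client closed the connection
                    break
                chunks.append(data)
        received.append(b"".join(chunks))
        server.close()

    thread = threading.Thread(target=serve, daemon=True)
    thread.start()
    return port, received, thread


# Code a notebook user might type into a cell.  Nothing stops it from
# talking to the backend over the same route the kernel itself uses.
USER_CODE = """
import socket
s = socket.create_connection(("127.0.0.1", backend_port))
s.sendall(b"not a kernel message")
s.close()
"""


def run_user_code(port):
    """Execute the user's cell the way a kernel would: with exec()."""
    exec(USER_CODE, {"backend_port": port})
```

Closing this hole means filtering at the network level, outside the
process that runs the user's code, which is where the firewall rules
and network architecture mentioned above come in.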

Our plan is to develop these things in a commercial, cloud-based
offering where you can afford to pay people to develop and run the
needed infrastructure.

Our goal with the open source project is to cover the use case of
groups of trusted users that have multiple easy ways of sharing
notebooks that don't require running code as someone else.

# Back to sharing

Again, my main idea is that we want easy ways of sharing that don't
have intrinsically difficult security problems.

Example 1: nbviewer is a perfect example of the type of sharing model
we want to support.  By adding a "Publish to gist/nbviewer" button it
will become extremely easy for people to share in this manner.
Absolutely no security issues to worry about.

Example 2: We should add a button in the NB UI to "Import notebook
from the web" by its URL, gist id, github URL, etc.
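
A minimal sketch of how such an import field might tell its inputs
apart.  Everything here is an assumption for illustration: the
function name `classify_notebook_source`, the rule that a bare string
of hex digits is a gist id, and the rule that any host under
github.com counts as a github URL.  It only classifies the text; it
does not fetch anything.

```python
import re
from urllib.parse import urlparse


def classify_notebook_source(text):
    """Return 'github_url', 'url' or 'gist_id' for an import-field entry.

    Hypothetical helper; the matching rules are assumptions, not a spec.
    """
    text = text.strip()
    parsed = urlparse(text)
    if parsed.scheme in ("http", "https") and parsed.hostname:
        host = parsed.hostname  # already lowercased by urlparse
        if host == "github.com" or host.endswith(".github.com"):
            return "github_url"
        return "url"
    # Assumed: a gist id is a bare run of hexadecimal digits.
    if re.fullmatch(r"[0-9a-fA-F]+", text):
        return "gist_id"
    raise ValueError("unrecognized notebook source: %r" % text)
```

Importing only copies the notebook file; whatever it contains runs
later in the importing user's own kernel, under their own account.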

Example 3: In our multiuser notebook, we will add the ability for
users to publish nbviewer-like static views of their notebooks that
have "download notebook" links that can be pasted into the "Import
notebook from the web" field.

Example 4: People should be putting their notebooks on github/gists
where they can be easily cloned.  I strongly feel that we want to
encourage people to use git/github as the primary means of sharing a
notebook.  We should integrate git into the notebook UI to make this
even easier, such as "Create a new Notebook Project from the following
git repo".

In all of these cases, there is a clear and easy path for sharing and
there is simply no reason for a user to run kernels on another person's
system.

I would claim that this model is exactly the sharing model of git.
There are probably oodles of git servers running all over the world.
Each group running a git server will only give trusted people R+W
access, but can choose which repos the public has R access to.  But no
one would ever give an unknown, possibly malicious person R+W access
to a trusted group's git server or repo.  This architecture doesn't
prevent sharing of git repos - in fact, I would say git enables sharing
through a combination of these mini security zones along with super easy
ways of moving code between the security zones.

For the IPython Notebook, allowing a user to run code is the security
equivalent of giving a user W access to a repo - that is what enables
them to do harm.

Hope this clarifies the situation wrt LXC.

Cheers,

Brian

> Is there mileage in an option for the notebook server to start each
> kernel in a new LXC container? That would give OS-level limitations on
> what a remote user can do, without the overhead of running full
> virtual machines. I imagine this could be paired with a way to share
> access to a particular notebook or session, so a malicious user
> getting access can only damage files in that project. It could
> probably also be set up so that file access is read-only.
>
> Of course, I may be on completely the wrong track. But the notebook is
> clearly going to be used in cases where the 'all or nothing' access to
> the underlying system is too coarse. Maybe this is one way to offer
> finer-grained control.
>
> [1] http://stackoverflow.com/questions/13044921/prevent-user-del-files-in-ipython-notebook-environment/13053501#13053501
> [2] http://lxc.sourceforge.net/
>
> Thomas



-- 
Brian E. Granger
Cal Poly State University, San Luis Obispo
bgranger at calpoly.edu and ellisonbg at gmail.com


