[IPython-dev] Some Thoughts on Notebook Security

Tue Dec 11 10:10:04 EST 2012

Hi Brian

> * In markdown cells users can put arbitrary HTML and JS.
>
> To fix this, we need to enable the HTML sanitizer that comes with the
> JS Markdown renderer that we are using.  This is what StackOverflow
> uses to sanitize their markdown and should completely remove any
> security risks coming from within markdown cells.

StackOverflow don't want any user-provided JavaScript to be rendered
inside their forums. Obviously it would create the same security
issues we're discussing here, but it's not a feature of StackOverflow.
Users only provide rich text, so sanitising posts makes perfect
sense. I think we should bear in mind that we're trying to support
arbitrary JS, while almost everyone else is trying to prevent it, not
for security reasons, but to prevent their site from becoming Myspace.
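
To make the sanitising idea concrete, here is a rough sketch of allowlist
sanitising in Python, using only the standard library. It is not the
sanitizer that ships with the JS Markdown renderer; the allowed tags,
attributes and URL schemes are assumptions picked for illustration, and a
real deployment would want a properly reviewed library.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical allowlist, chosen for illustration only.
ALLOWED_TAGS = {"p", "em", "strong", "code", "pre", "ul", "ol", "li", "a"}
ALLOWED_ATTRS = {"a": {"href"}}
SAFE_SCHEMES = ("http://", "https://")


class AllowlistSanitizer(HTMLParser):
    """Re-emit only allowlisted tags and attributes; escape all text."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self.skip_depth += 1
            return
        if self.skip_depth or tag not in ALLOWED_TAGS:
            return
        kept = []
        for name, value in attrs:
            if value is None or name not in ALLOWED_ATTRS.get(tag, set()):
                continue  # drops onclick, style, and anything unlisted
            if name == "href" and not value.strip().lower().startswith(SAFE_SCHEMES):
                continue  # drops javascript: and other schemes
            kept.append(' %s="%s"' % (name, escape(value, quote=True)))
        self.out.append("<%s%s>" % (tag, "".join(kept)))

    def handle_endtag(self, tag):
        if tag in ("script", "style"):
            if self.skip_depth:
                self.skip_depth -= 1
            return
        if not self.skip_depth and tag in ALLOWED_TAGS:
            self.out.append("</%s>" % tag)

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(escape(data, quote=False))


def sanitize(html_text):
    parser = AllowlistSanitizer()
    parser.feed(html_text)
    parser.close()  # flush any buffered text
    return "".join(parser.out)
```

The point of the sketch is that everything not explicitly allowed is
dropped, which is exactly why this approach can't coexist with arbitrary
user JS.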

> * In CodeCell output, the Javascript repr is dynamically passed
> into eval.  This only happens when code is run, not when the notebook
> is loaded, so it is less critical, but still needs to be fixed.
>
> To fix this, we need to disable the Javascript representation of
> objects altogether.
>
> Will these two things not completely fix the security problems we
> currently have?

> Now the question is how to enable all of the nice things you can do
> with Javascript.  I think the answer is Javascript plugins, JSON reprs
> and JSON handlers:
>
> https://github.com/ipython/ipython/pull/2518
>
> The idea is that the extra Javascript cool-stuff will be installed by
> the person who runs the notebook server once and for all notebooks on
> that server.  Similar to how python packages are installed: you do
> this before you start python.  To get data from python to the
> Javascript plugins we will use JSON objects and trigger the callbacks
> to handle them.

This seems to be dependent on a kernel, which static notebooks don't
have? If I generate a static notebook, which is just a web page, then
post that page to a hosting service, or email it to someone, how would
the plugins work? Maybe we're looking at two slightly different
scenarios. I'm focussed on static views only. The host should not have
to allow anything more than posting and getting HTML documents.

When creating statics, I see IPython Notebook as essentially a webpage
editor. It should be able to just use JS as freely as any page, then
be distributed like any page.

All XSS attacks, as I understand it, depend on the baddy being able to
have their malicious JS served from another domain that is trusted,
circumventing the Same Origin Policy. Simple solution: don't serve
users' notebooks from trusted domains.
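
As a rough illustration of what the Same Origin Policy compares, here is a
small Python sketch. An origin is the (scheme, host, port) triple of a URL;
the function names and the example domains are mine, made up for this
email, and the sketch ignores special cases such as file: URLs.

```python
from urllib.parse import urlsplit

# Default ports, so that https://host and https://host:443 compare equal.
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url):
    """Return the (scheme, host, port) triple for a URL."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(scheme)
    return (scheme, host, port)


def same_origin(a, b):
    """True if both URLs share scheme, host and port."""
    return origin(a) == origin(b)


# Hypothetical domains: the trusted site and a separate notebook host.
print(same_origin("https://trusted.example/login",
                  "https://trusted.example:443/account"))
print(same_origin("https://trusted.example/login",
                  "https://hosting.example/nb/demo.html"))
```

A notebook served from the second, separate domain gets a different origin,
so its JS is not treated as belonging to the trusted site.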

================

Hi Matthias

> Static notebooks, served from a different domain, could be rendered
> inside iframes, enabling us to embed them inside other webpages and
> applications. These notebooks would still be superficially served by
> our own servers, so the UX wouldn't be affected.
>
>
> Keep in mind that iframes are not sandboxed, and you can inject JS into the
> parent frame, unless you use the sandbox attribute, which is part of HTML5
> but not implemented in every browser… And it is not yet infallible; it is
> more a "we'll help you embed other pages by providing a separate JS
> namespace, but we don't guarantee yet that the VM is unbreachable".

I'm pretty sure iframes are sandboxed in the sense that a parent page
and an iframe cannot communicate unless they have the same origin,
and this is an old feature. The new sandbox attribute in HTML5 is for
a different purpose.

> Responsible disclosure means I don't want to say much more, but having a
> statically displayed notebook is often linked to having a "sharing/import"
> button, which is dangerous. It could lead to a self-propagating notebook
> that, through an account, can infect other notebooks, or share itself on
> Twitter...

Any buttons, like for importing a notebook, would live in the parent
page and would have no access to, nor allow access from, the iframe.
The parent page would know which static notebook it embeds in the
iframe though, so it could provide buttons that connect to the actual
notebook in question, which is a totally different file to the static
notebook being rendered in the iframe anyway.
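
Roughly, the parent page could be generated like this. It's only a sketch
in Python: the two domains, the URL layout (/import/..., /static/...) and
the function name are all hypothetical, not anything that exists in IPython.

```python
from html import escape
from urllib.parse import quote

# Hypothetical domains, for illustration only.
TRUSTED_ORIGIN = "https://trusted.example"
HOSTING_ORIGIN = "https://hosting.example"


def render_parent_page(notebook_id):
    """Return HTML for a trusted-domain page embedding a static notebook."""
    # Percent-encode the id for use in a URL path, then HTML-escape it
    # for use inside an attribute value.
    path_id = escape(quote(notebook_id, safe=""), quote=True)
    return (
        "<!DOCTYPE html>\n"
        "<title>Notebook %(title)s</title>\n"
        # The import button lives in the parent page, on the trusted domain.
        '<a href="%(trusted)s/import/%(id)s">Import this notebook</a>\n'
        # The static notebook is loaded from the separate hosting domain.
        '<iframe src="%(hosting)s/static/%(id)s.html"></iframe>\n'
    ) % {
        "title": escape(notebook_id),
        "id": path_id,
        "trusted": TRUSTED_ORIGIN,
        "hosting": HOSTING_ORIGIN,
    }
```

The button and the iframe point at different origins, so the notebook's JS
runs in the hosting domain's context while the import action stays with the
trusted one.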

> Multi domain is a really good idea. I have a clear view in my head of how we
> could use that in a way close to OAuth to allow JavaScript while still having
> "logged-in" users.
> It wouldn't be as seamless as something like GitHub, but close.

I think we're looking at things differently: You seem to be
considering static views as something generated on the fly and on
demand, nbviewer style. I'm thinking about running nbconvert on a
notebook, then keeping the output as a webpage to be copied and passed
around freely. Once the static notebook exists, it's a done deal.
There's no chance of any changes to IPython breaking it. It's an
independent webpage. Updating it would amount to deleting it and
replacing it with a new version.

> The **big** question is:
> Are viewers logged in (in any way) to the given server, and if so do they
> have the right to do anything else with those credentials?
> If it is just a public notebook viewer, then it's fine.
>
> If you want something more "interactive" (sharing, permissions, etc., and the
> display of any JS) you won't have much choice.
> Or you will have a painful multi-login.

I'm very much against hosting user submitted notebooks on any domain
with cookie-based authentication. It needs to be divided into 'trusted
domain', where no user's JS will ever be served, and 'hosting domain'
that has no account system of its own. The trusted domain would
control the hosting domain, as a kind of slave.
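
One detail worth keeping in mind with this split: a cookie set with a
Domain attribute is also sent to subdomains of that domain, so the hosting
domain shouldn't be a subdomain of the trusted one. Below is a simplified
Python sketch of that domain-matching rule (as I understand it from RFC
6265). The function name and domains are made up, and it skips the
IP-address special case.

```python
def domain_matches(host, cookie_domain):
    """Simplified cookie domain-match: exact host or any subdomain.

    Assumption: both arguments are host names, not IP addresses.
    """
    host = host.lower().rstrip(".")
    cookie_domain = cookie_domain.lower().lstrip(".")
    return host == cookie_domain or host.endswith("." + cookie_domain)


# Hypothetical session cookie scoped with Domain=trusted.example
for host in ("trusted.example", "nb.trusted.example", "hosting.example"):
    print(host, domain_matches(host, "trusted.example"))
```

Under this rule a notebook host like nb.trusted.example would still receive
the trusted domain's cookies, whereas a completely separate hosting.example
would not.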

That's just my take on all this.

Cheers

Carl