[IPython-dev] Some Thoughts on Notebook Security

Matthias BUSSONNIER bussonniermatthias at gmail.com
Fri Dec 14 03:15:54 EST 2012


> 
> I understand that multiple domains allow a fine grained approach to
> authentication.
> 
> Most people running notebooks are just going to throw the notebook
> server up on a port and point people to it.  Many will not even create
> an actual domain, just use a raw IP address.  This will be true even
> when the notebook server gains multiuser capabilities.  I don't want
> to have to  tell people - to run the notebook server, you have to run
> servers on two separate domains with an infrastructure that allows
> those servers to work together.  We want the multiuser support to be
> as simple as:
> 
> ipython notebook --multiuser ...
> 
> on a single machine.
> 
> But even if we forced everyone to use a multiple domain notebook
> server I don't understand how it would work.  Here is my
> understanding:
> 
> * There would be a secure domain that would not allow any dynamic
> Javascript in notebooks.  We would simply clean it out every time in
> the notebook app.
> * Because this domain is "safe" this is where people would want to
> author and keep all of their notebooks.
> * Because dynamic Javascript is cleaned out every time, this domain
> could not be used to create Javascript containing notebooks.
> * In my mind, life on this domain is exactly as I propose above: we
> simply don't allow JS because it could do bad things.
> * Notebooks on this domain can safely be given access to the user's
> regular kernels - IOW, these notebooks are completely equivalent to
> having shell access to the user's files.
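
The "clean it out every time" step described for the secure domain could be sketched roughly as follows. This is only an illustration: the notebook layout used here (a `cells` list whose `outputs` carry a `data` mapping from MIME type to content) and the names `strip_javascript_outputs` and `SCRIPT_MIME_TYPES` are assumptions made for the sketch, not the actual ipynb format or an IPython API. It also ignores `<script>` tags embedded in HTML output, which a real sanitizer would have to handle.

```python
import copy

# MIME types treated as "dynamic Javascript" in this sketch (an assumption).
SCRIPT_MIME_TYPES = {"application/javascript", "text/javascript"}


def strip_javascript_outputs(notebook):
    """Return a copy of `notebook` with Javascript output representations removed.

    `notebook` is a plain dict in a simplified, hypothetical layout:
    {"cells": [{"outputs": [{"data": {mime_type: content}}]}]}
    """
    cleaned = copy.deepcopy(notebook)  # never mutate the caller's notebook
    for cell in cleaned.get("cells", []):
        if "outputs" not in cell:
            continue
        kept = []
        for output in cell["outputs"]:
            data = output.get("data", {})
            # Remove every script representation from this output.
            for mime in SCRIPT_MIME_TYPES & set(data):
                del data[mime]
            # Keep outputs that still have something to show, and outputs
            # that never had a "data" mapping in the first place.
            if data or "data" not in output:
                kept.append(output)
        cell["outputs"] = kept
    return cleaned
```

Running this on every load and save is what makes the secure domain unable to author Javascript-containing notebooks, which is the trade-off the list above describes.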
> 
> * When people want to create or view notebooks containing Javascript
> code, they have to go to a separate insecure domain.  This domain has
> to be completely sandboxed.
> * The insecure domain can't be allowed to edit/remove/etc. notebooks
> from the secure domain, otherwise hostile Javascript code could
> completely destroy the safe notebooks.
> * Notebooks authored on this domain have to stay on this domain, so
> users will have to manage two sets of notebooks.
> * Kernels on this domain have to be completely sandboxed as
> potentially hostile Javascript code could use Kernel.execute to do
> anything on the kernel.  These kernels could not even be allowed
> network connectivity as users could inject JS code that writes Python
> code that launches DoS attacks on the server or other hosts on the
> net.
> * Notebooks that users share with each other have to remain on this
> domain as they might contain hostile JS code.
> * Hostile JS code on this domain would have the ability to completely
> screw up notebooks on this domain.  So I could write a notebook that
> will delete all of your other JS containing notebooks if you open it?
> 
> Is this a correct summary of how the multiple domains would work?  Am
> I missing or misunderstanding things?  Would we give notebooks on the
> insecure domain more capabilities or access to the safe domain side of
> things?  Is dynamic Javascript less risky than I am understanding so
> we wouldn't have to isolate kernels?
> 

I think this is a correct summary. 
You just have to consider that, since the server(s) control both domains,
you can move things from one to the other under the hood and have seamless links between the two.
-- 
Matthias


> Cheers,
> 
> Brian
> 
> 
> 
>> It does not prevent collaboration in any way.
>> But if you share a kernel you share a filesystem, so there is no point in really using subdomains at this stage.
>> Still, it gives another layer of security: if there is a way to inject javascript, that javascript will not find anything
>> malicious to do.
>> 
>> Or it allows you to "test" untrusted plugins.
>> 
>> I'll let you imagine other uses.
>> --
>> Matthias
>> 
>> 
>> 
>>> * I don't see any fundamental difference between "static notebooks" and
>>> "notebooks with kernels" - both can have Javascript and the new
>>> Javascript plugins will work on both.  Both have the same overall
>>> security issues and I don't think it makes sense to try and handle
>>> them separately.
>>> 
>>> Cheers,
>>> 
>>> Brian
>>> 
>>> 
>>> 
>>> 
>>> On Tue, Dec 11, 2012 at 9:15 AM, Matthias BUSSONNIER
>>> <bussonniermatthias at gmail.com> wrote:
>>>> 
>>>> On 11 Dec 2012, at 16:10, Carl Smith wrote:
>>>> 
>>>>> Hi Brian
>>>>>> 
>>>>>> The idea is that the extra Javascript cool-stuff will be installed by
>>>>>> the person who runs the notebook server once and for all notebooks on
>>>>>> that server.  Similar to how python packages are installed = you do
>>>>>> this before you start python.  To get data from python to the
>>>>>> Javascript plugins we will use JSON objects and trigger the callbacks
>>>>>> to handle them.
>>>>> 
>>>>> This seems to be dependent on a kernel, which static notebooks don't
>>>>> have? If I generate a static notebook, which is just a web page, then
>>>>> post that page to a hosting service, or email it to someone, how would
>>>>> the plugins work? Maybe we're looking at two slightly different
>>>>> scenarios. I'm focussed on static views only. The host should not have
>>>>> to allow anything more than posting and getting HTML documents.
>>>> 
>>>> IIRC, you can embed several reprs in the ipynb file.
>>>> So you could provide a plugin that can "render" objects in a static view
>>>> (like a d3.js graph; you don't need the kernel to do that).
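
As a rough illustration of how a static view could choose among several embedded reprs without a kernel, here is a small sketch. It assumes each output is available as a mapping from MIME type to content; that mapping, the priority order, and the name `pick_static_repr` are hypothetical and are not taken from the ipynb format or from IPython.

```python
# Preferred representations for a static page, richest first (an assumption).
DISPLAY_PRIORITY = ["text/html", "image/png", "text/plain"]


def pick_static_repr(output_data, priority=DISPLAY_PRIORITY):
    """Return (mime_type, content) for the richest available repr, or None.

    `output_data` is a hypothetical mapping of MIME type -> content holding
    every representation that was stored for one output.
    """
    for mime in priority:
        if mime in output_data:
            return mime, output_data[mime]
    return None  # nothing this static view knows how to render
```

A plugin for something like a d3.js graph would amount to adding its own MIME type to the priority list together with a renderer for it.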
>>>> 
>>>>> ================
>>>>> 
>>>>> Hi Matthias
>>>>> 
>>>>>> Static notebooks, served from a different domain, could be rendered
>>>>>> inside iframes, enabling us to embed them inside other webpages and
>>>>>> applications. These notebooks would still be superficially served by
>>>>>> our own servers, so the UX wouldn't be affected.
>>>>>> 
>>>>>> Keep in mind that iframes are not sandboxed, and you can inject js into the parent
>>>>>> frame.
>>>>>> Unless you use the sandbox attribute, which is part of html5 but not
>>>>>> implemented in every
>>>>>> browser… And it is not yet infallible; it is more a "we'll help you embed other
>>>>>> pages by providing a separate
>>>>>> js namespace, but we don't guarantee yet that the VM is unbreachable"
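
For reference, a server that embeds a static notebook with the HTML5 `sandbox` attribute might generate the markup along these lines. The helper name `sandboxed_iframe` and the example URL are made up for illustration. An empty `sandbox` value applies all restrictions, and tokens such as `allow-scripts` lift individual ones.

```python
from html import escape


def sandboxed_iframe(src, allow=()):
    """Build an <iframe> tag with a sandbox attribute.

    `allow` is an iterable of sandbox tokens (for example "allow-scripts").
    Leaving it empty yields sandbox="", the most restrictive setting.
    """
    tokens = " ".join(allow)
    return '<iframe src="{}" sandbox="{}"></iframe>'.format(
        escape(src, quote=True), escape(tokens, quote=True)
    )


# Hypothetical usage: embed a static notebook and let its own scripts run.
print(sandboxed_iframe("https://static.example.org/nb.html", allow=("allow-scripts",)))
```

As noted above, this only helps in browsers that implement the attribute, so it is an extra layer rather than a guarantee.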
>>>>> 
>>>> 
>>>>> I'm pretty sure iframes are sandboxed in the sense that a parent page
>>>>> and an iframe can not communicate unless they have the same origin,
>>>>> and this is an old feature. The new sandbox attribute in HTML5 is for
>>>>> a different purpose.
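
The "same origin" test mentioned here compares scheme, host, and port. A minimal sketch of that comparison follows; the helper names `origin` and `same_origin` and the example hosts are illustrative only, and browsers apply further rules that this does not model.

```python
from urllib.parse import urlsplit

# Default ports assumed when a URL does not spell one out.
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url):
    """Return the (scheme, host, port) triple that identifies a URL's origin."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(scheme)
    return (scheme, host, port)


def same_origin(url_a, url_b):
    """True when both URLs share scheme, host, and port."""
    return origin(url_a) == origin(url_b)
```

Under this rule two notebook servers on the same raw IP address but different ports already count as different origins, while a static view served from the same host and port as the rest of the site does not.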
>>>> 
>>>> But this is still kind of a problem, as usually the static view will be served from the "same origin"
>>>> as the rest of the website.
>>>> 
>>>> If you don't want to make a notebook fully public, you have to have some kind of authentication
>>>> that allows you to load it.
>>>> 
>>>> I still have some doubts about what frames are supposed to do and what they actually do.
>>>> I'm not an expert on that, but it is still worth digging into.
>>>> 
>>>>>> Responsible disclosure: I don't want to say much more, but having a
>>>>>> statically displayed
>>>>>> notebook is often linked to having a "sharing/import" button, which is
>>>>>> dangerous.
>>>>>> It could lead to a self-propagating notebook that spreads through accounts, infects
>>>>>> other
>>>>>> notebooks, or shares itself on twitter...
>>>>> 
>>>>> Any buttons, like for importing a notebook, would live in the parent
>>>>> page and would have no access to, nor allow access from, the iframe.
>>>>> The parent page would know which static notebook it embeds in the
>>>>> iframe though, so it could provide buttons that connect to the actual
>>>>> notebook in question, which is a totally different file to the static
>>>>> notebook being rendered in the iframe anyway.
>>>> 
>>>> I understand what you want to do; I guess the definition of "static" is blurry.
>>>> If you want a perfectly static version (does that make sense in html?) you can go with an iframe.
>>>> If you want the ability to comment on a particular cell, then you have to build an iframe for
>>>> every cell,
>>>> and you lose the ability to comment "inline" as github does.
>>>> 
>>>> 
>>>>> 
>>>>>> Multi domain is a really good idea. I have a clear view in my head of how we
>>>>>> could use it, in a way close to OAuth, to allow javascript while still having
>>>>>> "logged-in" users.
>>>>>> It wouldn't be as seamless as something like github, but close.
>>>>> 
>>>>> I think we're looking at things differently: You seem to be
>>>>> considering static views as something generated on the fly and on
>>>>> demand, nbviewer style. I'm thinking about running nbconvert on a
>>>>> notebook, then keeping the output as a webpage to be copied and passed
>>>>> around freely. Once the static notebook exists, it's a done deal.
>>>>> There's no chance of any changes to IPython breaking it. It's an
>>>>> independent webpage. Updating it would amount to deleting it and
>>>>> replacing it with a new version.
>>>> 
>>>> I don't think those are that different.
>>>> You can have a "perfectly static" version that embeds bad js and requires some kind of authentication to be seen.
>>>> The line between "on the fly" and static is thin.
>>>> 
>>>>> 
>>>>>> The **big** question is:
>>>>>> Are viewers logged in (in any way) to the given server, and if so do they
>>>>>> have the right to do anything else with those credentials?
>>>>>> If it is just a public notebook viewer, then it's fine.
>>>>>> 
>>>>>> If you want something more "interactive" (sharing, permissions, etc., and the
>>>>>> display of any JS) you won't have much choice.
>>>>>> Or you will have a painful multi-login.
>>>>> 
>>>>> I'm very much against hosting user submitted notebooks on any domain
>>>>> with cookie based authentication. It needs to be divided into 'trusted
>>>>> domain', where no user's JS will ever be served, and 'hosting domain'
>>>>> that has no account system of its own. The trusted domain would
>>>>> control the hosting domain, as a kind of slave.
>>>> 
>>>> Yep, kind of what I have in mind.
>>>> The hosting domain can have "tokens":
>>>> "Publish this comment on this notebook on behalf of ..."
>>>> Then you have to "validate" those actions on the "trusted domain".
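
One possible reading of the token idea, sketched under stated assumptions: the trusted domain signs a description of a single action with a secret only it knows, the hosting domain merely carries that token, and the trusted domain checks the signature before acting. The function names, the payload fields, and the choice of HMAC-SHA256 are illustrative and not part of any proposal in this thread. The sketch also omits expiry and replay protection.

```python
import base64
import hashlib
import hmac
import json


def issue_action_token(secret, user, action, notebook_id):
    """Trusted domain: sign one specific action on behalf of `user`."""
    payload = json.dumps(
        {"user": user, "action": action, "notebook": notebook_id},
        sort_keys=True,
    ).encode("utf-8")
    signature = hmac.new(secret, payload, hashlib.sha256).digest()
    return "{}.{}".format(
        base64.urlsafe_b64encode(payload).decode("ascii"),
        base64.urlsafe_b64encode(signature).decode("ascii"),
    )


def validate_action_token(secret, token):
    """Trusted domain: return the signed action, or None if the token is invalid."""
    try:
        payload_b64, signature_b64 = token.split(".")
        payload = base64.urlsafe_b64decode(payload_b64)
        signature = base64.urlsafe_b64decode(signature_b64)
    except ValueError:  # wrong number of parts or malformed base64
        return None
    expected = hmac.new(secret, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(signature, expected):
        return None
    return json.loads(payload)
```

Because the secret never leaves the trusted domain, hostile JS on the hosting domain can at most replay a token for the one action it describes, which is why a real design would also need an expiry or a one-time nonce.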
>>>> --
>>>> Matthias
>>>> 
>>>> 
>>>> 
>>>>> 
>>>>> That's just my take on all this.
>>>>> 
>>>>> Cheers
>>>>> 
>>>>> Carl
>>>>> _______________________________________________
>>>>> IPython-dev mailing list
>>>>> IPython-dev at scipy.org
>>>>> http://mail.scipy.org/mailman/listinfo/ipython-dev
>>>> 
>>> 
>>> 
>>> 
>>> --
>>> Brian E. Granger
>>> Cal Poly State University, San Luis Obispo
>>> bgranger at calpoly.edu and ellisonbg at gmail.com
>> 
> 
> 
> 
> --
> Brian E. Granger
> Cal Poly State University, San Luis Obispo
> bgranger at calpoly.edu and ellisonbg at gmail.com



