[IPython-dev] Starting to plan for 0.11 (this time for real)

Sat Oct 30 03:10:44 EDT 2010

On Fri, Oct 29, 2010 at 23:25, Brian Granger <ellisonbg at gmail.com> wrote:

> Min,
>
> On Fri, Oct 29, 2010 at 10:55 PM, MinRK <benjaminrk at gmail.com> wrote:
> > This is more aggressive than I expected, I'll have to get the new parallel
> > stuff in gear.
>
> If you stopped writing great code, we wouldn't be tempted to do crazy
> things like this ;-)
>
> > The main roadblock for me is merging work into the Kernels.  I plan to
> > spend tomorrow working on getting the new parallel code ready for review,
> > and identifying what needs to happen with code in master in order for this
> > to go into 0.11.  The only work that needs a merge rather than a drop-in is
> > in Kernels and Session.  I expect that using the new Session will be fine
> > after a review, but getting the existing Kernels to provide what is
> > necessary for the parallel code will be some work, and I'll try to
> > identify exactly what that will look like.
>
> Are you thinking of having only one top-level kernel script that
> handles both the parallel computing stuff and the interactive IPython?
>  I think the idea of that is fantastic, but I am not sure we need to
> have all of that working to merge your stuff.  I am not opposed to
> attempting this before/during the merge, but I don't view it as
> absolutely needed.  Also, it may make sense to review your code
> standalone first and then discuss merging the kernel and session stuff
> with what we already have.
>

I was thinking that we already have a remote execution object, and the only
difference between the two is the connection patterns. New features/bugfixes
will likely want to be shared by both.  My StreamKernel was derived from the
original pykernel, but I kept working on it while you were developing on it,
so they diverged.  I think they can be merged, as long as we do a few
things, mostly to do with abstracting the connections:

     * allow Kernels to connect, not just bind
     * use action-based, not socket-type names
     * allow execution requests to come from a *list* of connections, not
       just one
     * use sessions/ioloop instead of direct send/recv_json
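
As a rough sketch of those four points (every name below is hypothetical and
is not taken from pykernel or StreamKernel; FakeSocket stands in for a real
zmq socket so the example runs without pyzmq, and the Session layer is left
out entirely):

```python
from dataclasses import dataclass


@dataclass
class Endpoint:
    """One connection for an action: a URL plus bind-or-connect."""
    url: str
    bind: bool = False  # False means connect to a peer instead of binding


class FakeSocket:
    """Stand-in for a zmq socket; it only records what was asked of it."""

    def __init__(self):
        self.calls = []

    def bind(self, url):
        self.calls.append(("bind", url))

    def connect(self, url):
        self.calls.append(("connect", url))


class SketchKernel:
    """Hypothetical kernel whose sockets are keyed by action, not socket type."""

    def __init__(self, endpoints, socket_factory=FakeSocket):
        # endpoints maps an action name to a *list* of Endpoint objects,
        # so one action may be served by several connections.
        self.sockets = {}
        self.handlers = {}
        for action, eps in endpoints.items():
            self.sockets[action] = []
            for ep in eps:
                sock = socket_factory()
                if ep.bind:
                    sock.bind(ep.url)
                else:
                    sock.connect(ep.url)
                self.sockets[action].append(sock)

    def on(self, msg_type, handler):
        """Register a callback for one message type."""
        self.handlers[msg_type] = handler

    def dispatch(self, action, msg):
        """Called by an event loop when a message arrives for `action`,
        rather than the kernel blocking in its own recv loop."""
        if action not in self.sockets:
            raise KeyError("no connections configured for %r" % action)
        handler = self.handlers.get(msg["msg_type"])
        return handler(msg) if handler else None


kernel = SketchKernel({
    "execute": [
        Endpoint("tcp://127.0.0.1:5555", bind=True),  # classic bound socket
        Endpoint("tcp://127.0.0.1:6000"),             # connect out to a peer
    ],
    "publish": [Endpoint("tcp://127.0.0.1:5556")],
})
kernel.on("execute_request",
          lambda msg: {"status": "ok", "code": msg["content"]["code"]})
reply = kernel.dispatch(
    "execute",
    {"msg_type": "execute_request", "content": {"code": "a = 1"}},
)
```

The point of keying everything by action is that the same kernel class could
be wired up for the interactive case (one bound socket per action) or the
parallel case (several connections per action) by changing only the mapping
that is passed in.
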

I also think using a KernelManager would be good, because it gets nice
process management (restart the kernel, etc.), and I can't really do that
without a Kernel, but I could subclass.

Related question:

Why is ipkernel not a subclass of pykernel?  There's lots of identical code
there.

>
> > The main things I know already:
> > * Names should change (GH-178). It's really a coincidence that we had
> > just one action per socket type, and the parallel code has several sockets
> > of the same type, and some actions that can be on different socket types,
> > depending on the scheduler.
>
> Yep.
>
> > * Use IOLoop/ZMQStream - this isn't necessarily critical, and I can
> > probably do it with a subclass if we don't want it in the main kernels.
>
> At this point I think that zmqstream has stabilized enough that we
> *should* be using it in the kernel and kernel manager code anyway.  I
> am completely fine with this.
>
> > * apply_request. This should be all new code, and shouldn't collide with
> > anything.
>
> Ok.
>
> One other point that Fernando and I talked about is actually shipping
> the rest of tornado with pyzmq.  I have been thinking more about the
> architecture of the html notebook that James has been working on and
> it is an absolutely perfect fit for implementing the server using our
> zmq enabled Tornado event loop with tornado's regular http handling.
> It would also give us ssl support, authentication and lots of other
> web server goodies like websockets.  If we did this, I think it would
> be possible to have a decent prototype of James' html notebook in
> 0.11.  What do you think about this, Min?  We are already shipping a
> good portion of tornado with pyzmq and the rest is just a
> dozen or so .py files (there is one .c file that we don't need for
> python 2.6 and up).
> Eventually I would like to contribute our ioloop.py and zmqstream to
> tornado itself, but I don't think we have to worry about that yet.
>

I'm not very familiar with Tornado other than our use in pyzmq.  If we can
use it for authentication without significant performance penalty, then
that's a pretty big deal, and well worth it.

It sounds like it would definitely provide a good toolkit for web backends,
so using it is probably a good idea.

I'm not sure that it should be *shipped* with pyzmq, though.  I think it
would be fine to ship with IPython if we use it there, but I don't see a
need to include it inside pyzmq.  If we depend on it, then depend on it in
PyPI, but if it's only for some extended functionality, I don't see any
problem with asking people to install it, since it is easy_installable (and
apt-installable on Ubuntu).  PyZMQ is a pretty low-level library - I don't
think shipping someone else's project inside it is a good idea unless there
are significant benefits.

>
> Also, moving tornado into pyzmq would allow us to do secure https
> connections for the parallel computing client - controller connection.
>

Secure connections would be *great* if the performance is good enough.

>
> Cheers,
>
> Brian
>
> > Let me know what I can do to help things along.
> > -MinRK
> >
> > On Fri, Oct 29, 2010 at 20:28, Fernando Perez <fperez.net at gmail.com> wrote:
> >>
> >> On Fri, Oct 29, 2010 at 11:23 AM, Brian Granger <ellisonbg at gmail.com>
> >> wrote:
> >> > Remove all of the twisted stuff from 0.11 and put the new zmq stuff in
> >> > place as a prototype.
> >> >
> >> > Here is my logic:
> >> >
> >> > * The Twisted parallel stuff is *already* broken in 0.11 and if anyone
> >> > has stable code running on it, they should be using 0.10.
> >> > * If someone is happy to run non-production ready code, there is no
> >> > reason they should be using the Twisted stuff, they should use the
> >> > pyzmq stuff.
> >> > * Twisted is a *massive* burden on our code base:
> >> >  - For package managers, it brings in Twisted, Foolscap and
> >> > zope.interface.
> >> >  - It makes our test suite unstable and fragile because we have to
> >> > run tests in subprocesses and use trial sometimes and nose other
> >> > times.
> >> >  - It is a huge # of LOC.
> >> >  - Removing it would mean that most of our codebase is Python 3 ready.
> >> >
> >> > There are lots of cons to this proposal:
> >> >
> >> > * That is really quick to drop support for the Twisted stuff.
> >> > * We may piss some people off.
> >> > * It possibly means maintaining the 0.10 series longer than we
> >> > imagined.
> >> > * We don't have a security story for the pyzmq parallel stuff yet.
> >>
> >> I have to say that I simply didn't have Brian's boldness to propose
> >> this, but I think it's the right thing to do, ultimately.  It *is*
> >> painful in the short term, but it's also the honest approach.  I keep
> >> forgetting but Brian reminded me that even the Twisted-based code in
> >> 0.11 has serious regressions re. the 0.10.x series, since in the big
> >> refactoring for 0.11 not quite everything made it through.
> >>
> >> The 0.10 maintenance doesn't worry me a whole lot: as long as we limit
> >> it to small changes, by now merging them as self-contained pull
> >> requests is really easy (as I just did recently with the ones Paul and
> >> Tom sent).  And rolling out a new release when the total delta is
> >> small is actually not that much work.
> >>
> >> So I'm totally +1 on this radical, but I think ultimately beneficial,
> >> approach.  It's important to keep in mind that doing this will lift a
> >> big load off our shoulders, and we're a small enough team that this
> >> benefit is significant.  It will let us concentrate on moving the new
> >> machinery forward quickly without having to worry about the large
> >> Twisted code.  It will also help Thomas with his py3 efforts, as it's
> >> one less thing he has to keep getting out of his way.
> >>
> >> Concrete plan:
> >>
> >> - Wait a week or two for feedback.
> >> - If we decide to move ahead, make a shared branch on the main repo
> >> where we can do this work and review it, with all having the chance to
> >> contribute while it happens.
> >> - Move all twisted-using code (IPython/kernel and some code in
> >> IPython/testing) into IPython/deathrow.  This will let anyone who
> >> really wants it find it easily, without having to dig through version
> >> control history.  Note that deathrow does *not* make it into official
> >> release tarballs.
> >>
> >> Cheers,
> >>
> >> f
> >
> >
>
>
>
> --
> Brian E. Granger, Ph.D.
> Assistant Professor of Physics
> Cal Poly State University, San Luis Obispo
> bgranger at calpoly.edu
> ellisonbg at gmail.com
>