[IPython-dev] Status of IPython+GUI+Threads+PyOS_InputHook

Sun Feb 8 16:38:52 EST 2009

Copying to the list, which you probably forgot.

On Sun, Feb 08, 2009 at 01:31:27PM -0800, Brian Granger wrote:
> On Sun, Feb 8, 2009 at 12:53 PM, Gael Varoquaux
> <gael.varoquaux at normalesup.org> wrote:
> > On Sun, Feb 08, 2009 at 12:15:53PM -0800, Brian Granger wrote:
> >> The things you have done to get this working (in my mind) are simply
> >> welding the Core to wx.  If we have to do things like this for each GUI
> >> toolkit, we are back to having a core that is non-reusable and
> >> difficult to maintain, test, extend, debug, etc.

> > I am actually not too worried about this. I believe the problems can be
> > abstracted to a small set of well-defined methods to be implemented by a
> > frontend.

> I agree that we should be able to abstract these things out.  But, if
> you believe that all abstractions are leaky (I do) then what you
> really end up with is 1) nice clean abstractions and 2) duct tape to
> plug the leaks.  The question is how much do they leak.

Yup, agreed.

> >> To make matters worse, these types of hacks are notoriously fragile.

> > Indeed, this worries me a lot.

> It is this fragility that leads me to think that the above
> abstractions will have leaks and require duct taping.

Yup.

> > I have given a lot of thought to this, and I believe that to avoid
> > freezing, there is no other option than multiple processes. If you think
> > of all the interactive environments that are user friendly, they all run
> > code in a sandbox that has nothing to do with the GUI (sage included). If
> > you don't go multi-processes, you are bound to expose some of the
> > oddities of the event loop, or worse, threads, to the user.

> Yes, as far as I know the commercial products (Mathematica, Matlab,
> etc.) all do this as well.  What is interesting is that they also do
> interactive plotting.  I wonder if they share data?  I too think that
> a multiprocess solution is key.

> > The problem with multiple processes is that if I want to do interactive
> > data visualization or debugging, I end up copying my data between
> > processes, and this is not good at all. Maybe the shared array that we
> > have been discussing on the scipy-users mailing list is an answer to this
> > problem, but it is not there yet, and it seems like a sledge hammer
> > approach.

> Visualization:  yes, this could require data movement if the
> visualization GUI is running in a process where the data isn't.

I would really like not to have to copy data. In some of the
work that I have been doing currently, I have been able to achieve
results that others have not simply because I was more clever with memory
(hint: 64-bit + a lot of memmapping). My raw data takes more than 2 GB. I
can't copy it many times before I kill my 8 GB box.
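
To make the memmapping hint concrete, here is a minimal sketch using only
the Python standard library. It assumes the data is a flat file of native
float64 values; the helper names (write_samples, mean_of_window, demo) are
made up for illustration and are not taken from any existing code. The
point is that only the requested window is read, not the whole file.

```python
import mmap
import os
import struct
import tempfile
from array import array

ITEM_SIZE = struct.calcsize("d")  # bytes per native float64


def write_samples(path, count):
    """Write the values 0.0, 1.0, ..., count - 1 as raw native float64."""
    with open(path, "wb") as handle:
        array("d", (float(i) for i in range(count))).tofile(handle)


def mean_of_window(path, start, stop):
    """Average samples [start, stop), touching only that part of the file."""
    if stop <= start:
        raise ValueError("window must contain at least one sample")
    with open(path, "rb") as handle:
        # Map the file read-only; the OS loads pages lazily on access.
        with mmap.mmap(handle.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            # Only the requested window is unpacked into Python objects.
            values = struct.unpack_from(
                f"{stop - start}d", mapped, start * ITEM_SIZE)
    return sum(values) / len(values)


def demo():
    """Build a small file and average a ten-sample window of it."""
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "samples.bin")
        write_samples(path, 1000)
        return mean_of_window(path, 10, 20)
```

Two processes mapping the same file this way would share the pages through
the OS cache rather than each holding a private copy, which is roughly the
property I care about here.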

> Debugging:  I think we could make a debugger work across processes.
> After all, an interactive debugger just takes input and writes output.
> That IO can be sent over the wire.

Yes, but that is not the scientific debugger that I am dreaming of, in which
I could do visualization.
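
For reference, the "IO over the wire" part quoted above can be sketched
with the standard pdb module, which accepts arbitrary stdin and stdout
streams. This is only an illustration under assumptions: in-memory streams
stand in for the two sides of a socket, and debug_over_streams and target
are hypothetical names. It does nothing for the visualization side.

```python
import io
import pdb


def debug_over_streams(func, commands, *args):
    """Run func under pdb, feeding it commands and returning the transcript.

    `commands` should end with a "c" (continue) line; if the stream runs
    out first, pdb treats end-of-file as a quit request.
    """
    incoming = io.StringIO(commands)  # stands in for the socket's read side
    outgoing = io.StringIO()          # stands in for the socket's write side
    debugger = pdb.Pdb(stdin=incoming, stdout=outgoing,
                       nosigint=True, readrc=False)
    # pdb stops at the first line of func and then reads from `incoming`.
    debugger.runcall(func, *args)
    return outgoing.getvalue()


def target(value):
    """Toy function to step into."""
    doubled = value * 2
    return doubled
```

Calling debug_over_streams(target, "p value\nc\n", 21) returns the text a
remote front end would have received, including the "(Pdb)" prompts and
the printed value.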

Thinking these things over, it seems that we should have visualization at
both ends. To use frontend visualization, we would have data copying, but
no freezing during calculation and debugging. A good UI paradigm would be
needed to make it clear what happens on which end.

> > If you only want a friendly, interactive terminal that does not freeze
> > when you run computations, i.e. an IPython++, I believe something like the
> > sage approach is excellent. In other words, you have a canvas where you
> > lay out commands and their output, that lives in a totally different
> > process than the execution engine. Not too much data flows between the
> > canvas and the execution engine, so there is no shared memory problem.
> > Interactive visualization can be done by running a GUI loop in the
> > execution engine, and having things like matplotlib or Mayavi living in
> > the same process. I believe the best approach so far would be to build
> > upon something like knoboo (http://knoboo.com/, which is unfortunately
> > GPL) for the front, and an execution engine with an eventloop built upon
> > IPython1. Maybe the frontend could be made to look cooler by building a
> > GUI around an embedded Webkit, but that's probably the easiest part.

> This is the type of design that I have in mind.  We just need to think
> about how things like print, raw_input will be handled.  But it should
> be possible.  I will try to sketch out some of these ideas.
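
A rough sketch of the split discussed above, assuming a line-based JSON
protocol over pipes: the front end sends source strings, and a separate
engine process executes them in a persistent namespace and sends back
whatever print produced. Engine and ENGINE_SOURCE are hypothetical names,
not the IPython1 API, and raw_input handling is deliberately left out.

```python
import json
import subprocess
import sys

# Code run inside the engine process: read one JSON-encoded request per
# line, execute it in a persistent namespace, reply with captured output.
ENGINE_SOURCE = r'''
import contextlib, io, json, sys, traceback
namespace = {}
for line in sys.stdin:
    source = json.loads(line)
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        try:
            exec(source, namespace)
        except Exception:
            traceback.print_exc(file=buffer)
    sys.stdout.write(json.dumps(buffer.getvalue()) + "\n")
    sys.stdout.flush()
'''


class Engine:
    """Front-end handle on an execution engine living in another process."""

    def __init__(self):
        self.proc = subprocess.Popen(
            [sys.executable, "-c", ENGINE_SOURCE],
            stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True)

    def execute(self, source):
        """Send source to the engine and return what it printed."""
        self.proc.stdin.write(json.dumps(source) + "\n")
        self.proc.stdin.flush()
        return json.loads(self.proc.stdout.readline())

    def close(self):
        """Close the request pipe so the engine loop ends, then wait."""
        self.proc.stdin.close()
        self.proc.wait()
        self.proc.stdout.close()
```

With this, Engine().execute("x = 21") returns an empty string and a later
execute("print(x * 2)") on the same engine returns the printed text, so
state stays in the engine while only small strings cross the process
boundary.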

> > Interactive visualization of very large datasets is an important feature
> > for my current work. Mayavi + ipython is fantastic in this regard (sorry
> > for my lack of humility). It is a great debug tool to be able to
> > visualize the very exact huge datasets that you are working with, and the
> > larger the dataset, the more important it is to have interactivity, to be
> > able to explore it better.

> I agree that interactive viz is an absolute requirement.  It has to
> work or we can't get science done!!!

Thanks for working on that, Brian, these are tricky and important
questions.

Good luck,

Gaël