[IPython-dev] Extensible pretty-printing

Thu Oct 28 21:31:15 EDT 2010

I am in the middle of lab (they are taking a quiz), so I don't have
time to dig into the full thread ATM, but I do have a few comments:

* The main thing that I am concerned about is how we answer the
question "How do I (a developer of foo) make my class Foo print
nice HTML/SVG?"  IOW, what does the public API for all of this look
like?

* In the current IPython, displayhook is only triggered once per block.
Thus, you can't use displayhook to get the str/html/svg/png
representation of an object inside a block or loop.  This is a
serious limitation, though Fernando and I feel it is a good thing in
the end.  But this means that we will also need top-level functions
that users can put in their code to trigger all of this logic
independent of displayhook.

Like this:

for t in times:
    a = compute_thing(t)
    # This should use the APIs that we are designing and the payload
    # system to deliver the html to the frontend.
    print_html(a)

We should also have functions like print_png, print_svg, print_latex
that we inject into builtins.

What this means is that we need to design an implementation that is
independent from displayhook and that is cleanly integrated with the
payload system.

Cheers,

Brian

On Thu, Oct 28, 2010 at 6:17 PM, Fernando Perez <fperez.net at gmail.com> wrote:
> On Thu, Oct 28, 2010 at 5:13 PM, Robert Kern <robert.kern at gmail.com> wrote:
>
>>> OK, so how do you want to proceed: do you want to reopen your pull
>>> request (possibly rebasing it if necessary) as it was, or do you want
>>> to go ahead and implement the above approach right away?
>>
>> I'd rather implement this approach right away. We just need to decide what the
>> keys should be and what they should mean. I originally used the ID of the
>> DisplayFormatter. This would allow both a "normal" representation and an
>> enhanced one, both of the same type (plain text, HTML, PNG image), to coexist.
>> Then the frontend could pick which one to display and let the user flip back and
>> forth as desired even for old Out[] entries without reexecuting code. This may
>> be a case of YAGNI.
>
> Actually I don't think it's YAGNI, and I have a specific use case in
> mind, with a practical example.  Lyx shows displayed equations, but if
> you copy one, it's nice enough to actually feed the clipboard with the
> raw Latex for the equation.  This is very convenient, and I often use
> it to edit complex formulas in lyx that I then paste into reST docs.
>
> We could similarly have pretty display of e.g. sympy output, but where
> one could copy the raw latex for the output cell.  The ui could
> expose this via a context menu that offers 'copy image, copy latex,
> copy string' for example.
>
> So this does strike me as genuinely useful and valuable functionality.
>
>> However, that means that the frontend needs to know about the IDs of the
>> DisplayFormatters. It needs to know that 'my-tweaked-html' formatter is HTML. I
>> might propose this as the fully-general solution:
>>
>> Each DisplayFormatter has a unique ID and a non-unique type. The type string
>> determines how a frontend would actually interpret the data for display. If a
>> frontend can display a particular type, it can display it for any
>> DisplayFormatter of that type. There will be a few predefined type strings with
>> meanings, but implementors can define new ones as long as they pick new names.
>>
>>   text -- monospaced plain text (unicode)
>>   html -- snippet of HTML (anything one can slap inside of a <div>)
>>   image -- bytes of an image file (anything loadable by PIL, so no need to have
>>            different PNG and JPEG type strings)
>>   mathtext -- just the TeX-lite text (the frontend can render it itself)
>>
>> When given an object for display, the DisplayHook will give it to each of the
>> DisplayFormatters in turn. If the formatter can handle the object, it will
>> return some JSONable object[1]. The DisplayHook will append a 3-tuple
>>
>>   (formatter.id, formatter.type, data)
>>
>> to a list. The DisplayHook will give this to whatever is forming the response
>> message.
>>
>> Most likely, there won't be too many of these formatters for the same type
>> active at any time and there should always be the (id='default', type='text')
>> formatter. A simple frontend can just look for that. A more complicated GUI
>> frontend may prefer a type='html' response and only fall back to a type='text'
>> format. It may have an ordered list of formatter IDs that it will try to display
>> before falling back in order. It might allow the user to flip through the
>> different representations for each cell. For example, if I have a
>> type='mathtext' formatter showing sympy expressions, I might wish to go back to
>> a simple repr so I know what to type to reproduce the expression.
>>
>> I'm certain this is overengineered, but I think we have use cases for all of the
>> features in it. I think most of the complexity is optional. The basic in-process
>> terminal frontend doesn't even bother with most of this and just uses the
>> default formatter to get the text and prints it.
>>
>> [1] Why a general JSONable object instead of just bytes? It would be nice to be
>> able to define a formatter that could give some structured information about the
>> object. For example, we could define an ArrayMetadataFormatter that gives a dict
>> with shape, dtype, etc. A GUI frontend could display this information nicely
>> formatted along with one of the other representations.
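
A minimal sketch of the design quoted above, as I read it: each formatter carries a unique id and a non-unique type, the hook collects (id, type, data) 3-tuples, and the frontend picks by type. The class layout, the `handles` argument, and the `format_all`/`pick` helpers are my own assumptions for illustration, not code from the pull request.

```python
import json


class DisplayFormatter:
    """One way of turning an object into JSON-able display data."""

    def __init__(self, formatter_id, type_name, func, handles=object):
        self.id = formatter_id    # unique per formatter
        self.type = type_name     # shared; tells the frontend how to render
        self.func = func
        self.handles = handles    # classes this formatter knows how to format

    def __call__(self, obj):
        # Return None when this formatter cannot handle the object.
        return self.func(obj) if isinstance(obj, self.handles) else None


def format_all(obj, formatters):
    """Offer obj to every formatter and collect (id, type, data) tuples."""
    results = []
    for formatter in formatters:
        data = formatter(obj)
        if data is not None:
            results.append((formatter.id, formatter.type, data))
    return results


def pick(results, preferred_types):
    """Frontend side: return the first result matching the preference order."""
    for wanted in preferred_types:
        for formatter_id, type_name, data in results:
            if type_name == wanted:
                return formatter_id, type_name, data
    return None


formatters = [
    DisplayFormatter("default", "text", repr),
    DisplayFormatter("complex-html", "html",
                     lambda z: "<i>%g + %gi</i>" % (z.real, z.imag),
                     handles=complex),
]

results = format_all(1 + 2j, formatters)
json.dumps(results)  # raises if the collected data is not JSON-able
print(pick(results, ["html", "text"]))
print(pick(format_all("abc", formatters), ["html", "text"]))
```

A terminal frontend would call pick(results, ["text"]) and never look at anything else, which matches the "most of the complexity is optional" remark.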
>
> Most of this I agree with.  Just one question: why not use real mime
> types for the type info?  I keep thinking that for our payloads and
> perhaps also for this, we might as well encode type metadata as
> mimetypes: they're reasonably standardized, python has a mime library,
> and browsers are wired to do something sensible with mime-tagged data
> already.  Am I missing something?
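
For what it's worth, the standard library's mimetypes module already covers the common cases. The sketch below shows one possible way to translate the short type strings into MIME types; the `MIME_FOR_TYPE` table and the `mime_for` helper are hypothetical names invented for this example.

```python
import mimetypes

# Hypothetical mapping from the proposal's short type strings to MIME types.
MIME_FOR_TYPE = {"text": "text/plain", "html": "text/html"}


def mime_for(type_name, filename=None):
    """Return a MIME type for a type string, or None if it is unknown."""
    if type_name in MIME_FOR_TYPE:
        return MIME_FOR_TYPE[type_name]
    if filename is not None:
        # guess_type returns (type, encoding); only the type is needed here.
        return mimetypes.guess_type(filename)[0]
    return None


print(mime_for("html"))
print(mime_for("image", "plot.png"))
print(mime_for("image", "photo.jpeg"))
```

One difference this exposes: a generic 'image' type has no single MIME equivalent, so with real MIME types the formatter would have to say whether it produced PNG or JPEG.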
>
>>> If the latter, I'm not sure I like the approach of passing a dict
>>> through and letting each formatter modify it.  State that mutates
>>> as-it-goes tends to produce harder to understand code, at least in my
>>> experience.  Instead, we can call all the formatters in sequence and
>>> get from each a pair of key, value.  We can then insert the keys into
>>> a dict as they come on our side (so if the storage structure ever
>>> changes from a dict to anything else, likely the formatters can stay
>>> unmodified).  Does that sound reasonable to you?
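
The non-mutating variant described above could look roughly like this. It is only a sketch under my own assumptions: the `collect` function and the two example formatters are hypothetical, and a formatter signals "can't handle this" by returning None.

```python
def collect(obj, formatters):
    """Call each formatter in turn; only the caller touches the storage."""
    out = {}
    for formatter in formatters:
        pair = formatter(obj)
        if pair is not None:
            key, value = pair
            out[key] = value
    return out


def text_formatter(obj):
    return ("text", repr(obj))


def html_formatter(obj):
    # Only numbers get an HTML form in this toy example.
    if isinstance(obj, (int, float)):
        return ("html", "<b>%r</b>" % (obj,))
    return None


print(collect(3, [text_formatter, html_formatter]))
print(collect("x", [text_formatter, html_formatter]))
```

Since the formatters never see the dict, swapping it for another structure later would only change collect.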
>>
>> That's actually how I would have implemented it [my original ipwx code
>> notwithstanding ;-)].
>
> OK.  It seems we're converging design wise to the point where code can
> continue the conversation :)
>
> Thanks!
>
> Cheers,
>
> f

-- 
Brian E. Granger, Ph.D.
Assistant Professor of Physics
Cal Poly State University, San Luis Obispo
bgranger at calpoly.edu
ellisonbg at gmail.com