[IPython-dev] magics and metadata

Brian Granger ellisonbg at gmail.com
Tue Jun 19 19:20:16 EDT 2012


When the metadata PR came up, I was originally going to vote -1 on it
because of this issue.  I sat on it for a while and in the end decided
that it was OK because I think the need for metadata is already upon
us even though we don't have an actual usage case in our own code base
(for example, we don't have a metadata UI in the notebook web app).

There is a fine line to walk here.  On one hand, I completely agree
with you that we should try to future-proof the notebook format to
minimize disruptive format changes.  On the other hand, adding things
too soon leads to even more potential disruption for the following
reason.  As I developed the notebook format and notebook UI last
summer, there were multiple situations where I added something to the
notebook format before I actually used it in the UI.  In many of these
cases, when I did get around to developing the UI for it, I realized
that my original thoughts on that element were incomplete.  It wasn't
until I wrote the UI that used the data that I realized exactly what
the format of that data needed to be.  As a result, I had to go back
and modify the notebook format.  After a few iterations of this, I
realized that this approach was broken and started to enforce the
following simple rule on myself: don't add it to the notebook format
until I am ready to write the UI code that uses it.  That rule served
me very well last summer.

This is why, for example, the notebook and cells do not currently have
any timestamp information (even though I think we will eventually want
it).  The one notebook feature that lacks a UI, and which I regret
adding to the format, is multiple worksheets.  We absolutely want
that as a feature; I just wish I had waited to add it to the notebook
format.  When we do implement the multiple worksheet UI, it is likely
we will want to go back and make changes to the notebook format to
better reflect the UI (for example, we will probably want to persist
which worksheet is active/open).

For the cell and worksheet metadata, I knew we would eventually need
it and I didn't want to hold up the beta release any longer.  But
there are still unanswered questions related to it:

* What types of things go in the metadata?
* Is this an area for us to write data to, or for advanced users to
write data to?
* Is it entirely unstructured, or will we require a discussion for
each new key/value entry into it?

It is not at all clear that the current metadata design will hold up
to our answers to these questions.  But in the end, I wanted to add
the metadata as it is now, so we could begin to see how we and others
start to use it.  Just because we added the metadata to the notebook
format definitely doesn't mean that this part of the format is
future-proof.
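
To make the open questions above concrete, here is a minimal sketch of
one possible answer: a free-form metadata dict where each tool writes
under its own namespace.  Everything in it is an assumption made for
illustration.  The cell layout is only loosely modeled on a notebook
code cell, and the helper set_metadata and the keys "collapsed" and
"reviewed_by" are hypothetical, not part of the nbformat.

```python
import json

# Hypothetical cell, loosely modeled on a notebook code cell.
# Nothing inside "metadata" is required or defined by the format.
cell = {
    "cell_type": "code",
    "input": "print('hello')",
    "outputs": [],
    "metadata": {},
}


def set_metadata(cell, namespace, key, value):
    """Store value under metadata[namespace][key].

    Namespacing is one (assumed) way to let IPython and third-party
    tools write metadata without clobbering each other's keys.
    """
    cell["metadata"].setdefault(namespace, {})[key] = value


# One key written by "us", one written by an advanced user's tool.
set_metadata(cell, "ipython", "collapsed", True)
set_metadata(cell, "my_tool", "reviewed_by", "alice")

# The metadata must survive a JSON round trip to be saved in a notebook.
serialized = json.dumps(cell, sort_keys=True)
restored = json.loads(serialized)
```

Whether a scheme like this is right depends on the answers to the
questions above; it is only meant to show what "unstructured but
namespaced" could look like in practice.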

Hope this clarifies things a bit.

Back to the question of output-level metadata.  When a bit of code
remains unused for almost a year, I start to question whether we
really need it.  I am not convinced we don't need it; I am just not
sure.  In light of this, I don't think that adding it to the notebook
format makes sense.  When one of us finds a good purpose for this
metadata, let's add it to the nbformat then.
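
For reference, the kind of use case floated in the quoted thread below
(an object hinting at its preferred display order) might look roughly
like the following sketch.  It is purely illustrative: the "priority"
metadata key, the choose_representation function, and the default
order are all hypothetical, and nothing like this exists in the
current code.

```python
# Assumed default preference of a frontend when no hint is given.
DEFAULT_ORDER = ("text/html", "image/png", "text/plain")


def choose_representation(data, metadata=None, supported=DEFAULT_ORDER):
    """Return the mimetype a frontend would display, or None.

    data      -- dict mapping mimetype -> representation
    metadata  -- optional dict; a hypothetical "priority" list names
                 the mimetypes the object itself prefers, best first
    supported -- mimetypes this frontend can render, in default order
    """
    hint = (metadata or {}).get("priority", [])
    # Hinted types come first, then the frontend's own default order.
    order = list(hint) + [m for m in supported if m not in hint]
    for mimetype in order:
        if mimetype in data and mimetype in supported:
            return mimetype
    return None


data = {
    "text/plain": "<Plot object>",
    "text/html": "<b>plot</b>",
    "image/png": "iVBORw0KGgo=",
}

no_hint = choose_representation(data)
with_hint = choose_representation(data, {"priority": ["image/png"]})
```

Without a hint the frontend falls back to its own order; with the hint
the object's preference wins, as long as the frontend supports that
type.  Even in this toy version the exact shape of the hint is a guess,
which is the reason to wait for a real use case.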

The other philosophical line of reasoning that I am being guided by
here is simplicity.  It would be very easy to over-design the notebook
format and add all sorts of features that we might need.  I think this
is the wrong direction to go.  We want a notebook format that is as
compact and minimal as possible, where each and every bit of data is
there for a well-defined and justified reason.

Cheers,

Brian



On Tue, Jun 19, 2012 at 3:25 PM, MinRK <benjaminrk at gmail.com> wrote:
>
>
> On Tue, Jun 19, 2012 at 3:23 PM, Brian Granger <ellisonbg at gmail.com> wrote:
>>
>> On Tue, Jun 19, 2012 at 3:19 PM, MinRK <benjaminrk at gmail.com> wrote:
>> >
>> >
>> > On Tue, Jun 19, 2012 at 3:18 PM, Brian Granger <ellisonbg at gmail.com>
>> > wrote:
>> >>
>> >> On Tue, Jun 19, 2012 at 2:59 PM, Fernando Perez <fperez.net at gmail.com>
>> >> wrote:
>> >> > On Tue, Jun 19, 2012 at 1:17 PM, MinRK <benjaminrk at gmail.com> wrote:
>> >> >> Yes - we put metadata on outputs for a reason, presumably.  If this
>> >> >> shouldn't be saved, it should probably be removed from the API.
>> >> >
>> >> > I can't recall precisely what we had in mind when we put it in, but
>> >> > something that springs to mind as potentially useful, for example,
>> >> > would be to specify a desired priority order for the various types of
>> >> > outputs. Right now when a client can display several kinds of output
>> >> > it just makes a choice, but we could let objects provide a hint of
>> >> > the preferred order, based on what they know about the relative
>> >> > quality of each.
>> >>
>> >> I originally put it there to allow objects to provide hints to the
>> >> frontend on how it should display a representation.  This is similar
>> >> to how the payloads can indicate where they came from.
>> >>
>> >> > So I'd vote for not removing this, as it may prove useful...
>> >>
>> >> I also think it could be useful, although it seems a bit excessive to
>> >> store metadata for each output.  Here is what I propose.  We simply
>> >> leave it alone until we have an actual use case that will help us
>> >> figure out exactly what this should look like.  Without a concrete
>> >> usage case, it is difficult to know what is needed.
>> >
>> >
>> > But this doesn't answer the immediate question: Should this metadata
>> > dict be included in the nbformat?
>>
>> I would vote no - not until we have a real usage case.  I don't like
>> to add things to the notebook format until we are actually using them.
>
>
> Then should we remove all of the metadata stuff we just added?  The whole
> point was to prepare the nbformat for future changes so we don't have to
> update the nbformat, which is incredibly painful and should be done as
> rarely as possible.
>
> -MinRK
>
>>
>>
>> >>
>> >>
>> >> > f
>> >> > _______________________________________________
>> >> > IPython-dev mailing list
>> >> > IPython-dev at scipy.org
>> >> > http://mail.scipy.org/mailman/listinfo/ipython-dev
>> >>
>> >>
>> >>
>> >> --
>> >> Brian E. Granger
>> >> Cal Poly State University, San Luis Obispo
>> >> bgranger at calpoly.edu and ellisonbg at gmail.com
>> >
>> >
>> >
>> >
>>
>>
>>
>
>
>
>



-- 
Brian E. Granger
Cal Poly State University, San Luis Obispo
bgranger at calpoly.edu and ellisonbg at gmail.com


