[Matplotlib-devel] Units discussion...

Antony Lee antony.lee at berkeley.edu
Thu Feb 8 11:09:32 EST 2018


I'm momentarily a bit away from Matplotlib development due to real life
piling up, so I'll just keep this short.

One major point (already mentioned by others) that led, I think, to some
devs (including myself) being relatively dismissive about unit support is
the lack of a well-defined use case, other than "it'd be nice if we supported
units" (i.e., especially from the point of view of devs who *don't* use
units themselves, it ends up being an ever moving target).  In particular,
tests on unit support ("unit unit tests"? :-)) currently only rely on the
old JPL unit code that ended up integrated into Matplotlib's test suite,
but do not test integration with the two major unit packages I am aware
of (pint and astropy.units).

From Ted's email it appears that these are not sufficient to represent
all kinds of relevant units.  In particular, I was at some point hoping to
completely work in deunitized data internally, *including the plotting*,
and rely on the fact that the deunitized and the unitized data are
usually linked by an affine transform, so the plotting part doesn't need to
convert back to unitized data and we only need to place and label the ticks
accordingly; however Ted mentioned relativistic units, which imply the use
of a non-affine transform.  So I think it would also be really helpful if
JPL could release some reasonably documented unit library with their actual
use cases (and how it differs from pint & astropy.units), so that we know
better what is actually needed (I believe carrying the JPL unit code in our
own code base is a mistake).
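
A minimal sketch of the affine case, in plain Python with no Matplotlib
involved. `AffineUnit`, `tick_labels`, and the unit constants are
hypothetical names for illustration only, not an existing API. Data is
stored as bare floats (kelvin here), and only the tick labels are computed
in the display unit. A non-affine mapping, such as the relativistic one Ted
mentions, would not fit this scheme.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AffineUnit:
    """Hypothetical unit with internal_float = scale * value + offset."""
    name: str
    scale: float
    offset: float = 0.0

    def to_float(self, value):
        # Unitized value -> bare internal float.
        return self.scale * value + self.offset

    def from_float(self, x):
        # Bare internal float -> value expressed in this unit.
        return (x - self.offset) / self.scale


# Assumed internal representation: kelvin floats.
CELSIUS = AffineUnit("degC", 1.0, 273.15)
FAHRENHEIT = AffineUnit("degF", 5 / 9, 273.15 - 32 * 5 / 9)


def tick_labels(tick_floats, display_unit):
    # Only the labels depend on the display unit; tick positions stay floats.
    return [
        "{:.1f} {}".format(display_unit.from_float(x), display_unit.name)
        for x in tick_floats
    ]


# Data arrives in Celsius and is deunitized once on entry.
internal = [CELSIUS.to_float(v) for v in [0.0, 25.0, 100.0]]

# The same float positions can be labelled in another affine unit.
labels = tick_labels(internal, FAHRENHEIT)
print(labels)
```

Switching the displayed unit only swaps the `display_unit` passed to the
labelling step; the stored floats are never converted back.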

As for the public vs private, or rather unitized vs deunitized API
discussion, I believe a relatively simple and consistent line would be to
make Axes methods unitized and everything else deunitized (but with clear
ways to convert to and from unitized data when not using Axes methods).
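
As a rough sketch of that split (hypothetical classes, not Matplotlib's
actual API): `UnitizedAxes` stands in for the unit-aware Axes layer,
`FloatLine` for a lower-level object that only ever sees floats, and
`convert`/`unconvert` are the assumed explicit ways across the boundary.
The unit table and the choice of meters as the internal representation are
assumptions as well.

```python
class FloatLine:
    """Stand-in for a low-level artist: stores bare floats only."""

    def __init__(self, xdata):
        self.xdata = list(xdata)


class UnitizedAxes:
    """Hypothetical unit-aware facade over float-only internals."""

    # Assumed internal representation: meters.
    _TO_METERS = {"m": 1.0, "cm": 0.01, "inch": 0.0254}

    def __init__(self):
        self.lines = []

    def convert(self, value, unit):
        # Unitized -> float, applied as data enters the Axes method.
        return value * self._TO_METERS[unit]

    def unconvert(self, x, unit):
        # Float -> unitized, applied as late as possible on the way out.
        return x / self._TO_METERS[unit]

    def plot(self, values, unit):
        line = FloatLine(self.convert(v, unit) for v in values)
        self.lines.append(line)
        return line

    def get_xlim(self, unit="m"):
        xs = [x for line in self.lines for x in line.xdata]
        return (self.unconvert(min(xs), unit), self.unconvert(max(xs), unit))


ax = UnitizedAxes()
line = ax.plot([0, 8], "inch")   # unitized input at the Axes level
print(line.xdata)                # floats in meters on the low-level object
lo, hi = ax.get_xlim("cm")       # unitized output, in a unit the caller picks
print(lo, hi)
```

Code that works directly with the low-level object would call `convert`
and `unconvert` itself instead of expecting the object to understand units.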

Antony

2018-02-07 16:33 GMT+01:00 Drain, Theodore R (392P) <
theodore.r.drain at jpl.nasa.gov>:

> That sounds fine to me.  Our original unit prototype API actually had
> conversions for both directions but I think the float->unit version was
> removed (or really moved) when the ticker/formatter portion of the unit API
> was settled on.
>
> Using floats/numpy arrays internally is going to be easier and faster so I
> think that's a plus.  The biggest issue we're going to run in to is what's
> defined as "internal" vs part of the unit API.  Some things are easy like
> the Axes/Axis API.  But we also use low level API's like the patches.  Are
> those unitized?  This is the pro and con of using something like Python
> where basically everything is public.  It makes it possible to do lots of
> things, but it's much harder to define a clear library with a specific
> public API.
>
> Somewhere in the process we should write a proposal that outlines which
> classes/methods are part of the unit api and which are going to be
> considered internal.  I'm sure we can help with that effort.
>
> That also might help clarify/influence code structure - if internal
> implementation classes are placed in a sub-package inside MPL 3.0, it
> becomes clearer to people later on what the "official" public API is vs what
> can be optimized to just use floats.  Obviously the dev's would need to
> decide if that kind of restructuring is worth it or not.
>
> Ted
>
> ________________________________________
> From: David Stansby <dstansby at gmail.com>
> Sent: Wednesday, February 7, 2018 3:42 AM
> To: Jody Klymak
> Cc: Drain, Theodore R (392P); matplotlib development list
> Subject: Re: [Matplotlib-devel] Units discussion...
>
> Practically, I think what we are proposing is that for unit support the
> user must supply two functions for each axis:
>
>   *   A mapping from your unit objects to floating point numbers
>   *   A mapping from those floats back to your unit objects
>
> As far as I know function 2 is new, and doesn't need to be supplied at the
> moment. Doing this would mean we can convert units as soon as they enter
> Matplotlib, only ever have to deal with floating point numbers internally,
> and then use the second function as late as possible when the user requests
> stuff like e.g. the axis limits.
>
> Also worth noting that any major change like this will go in to Matplotlib
> 3.0 at the earliest, so will be Python 3 only.
>
> David
>
> On 7 February 2018 at 06:06, Jody Klymak <jklymak at uvic.ca> wrote:
> Dear Ted,
>
> Thanks so much for engaging on this.
>
> Don’t worry, nothing at all is changing w/o substantial back and forth,
> and OK from downstream users.   I actually don’t think it’ll be a huge
> change, probably just some clean up and better documentation.
>
> FWIW, I’ve not personally done much programming w/ units, just been a bit
> perplexed by their inconsistent and (to my simple mind) convoluted
> application in the codebase.  Having experience from people who try to use
> them everyday will be absolutely key.
>
> Cheers,   Jody
>
> > On Feb 6, 2018, at 14:17 PM, Drain, Theodore R (392P)
> > <theodore.r.drain at jpl.nasa.gov> wrote:
> >
> > We use units for everything in our system (in fact, we funded John
> Hunter originally to add in a unit system so we could use MPL) so it's a
> crucial system for us.  In our system, we have our own time classes (which
> handle relativistic time frames as well as much higher precision
> representations) and a custom unit system for floating point values.
> >
> > I think it's important to talk about these changes in concrete terms.  I
> understand the words you're using,  but I'm not really clear on what the
> real proposed changes are.  For example, the current unit API returns a
> units.AxisInfo object so the converter can set the formatter and locators
> to use.  Is that what you mean in the 2nd paragraph about ticks and
> labels?  Or is that changing?
> >
> > The current unit api is pretty simple and in units.ConversionInterface.
> Are any of these changes going to change the conversion API?  (note - I'm
> not against changing it - I'm just not sure if there are any changes or
> not).
> >
> > Another thing to consider:  many of the examples people use are scripts
> which make a plot and stop.  But there are other use cases which are more
> complicated and stress the system in different ways.  We write several GUI
> applications (in PyQt) that use MPL for plotting.  In these cases, the user
> is interacting with the plot to add and remove artists, change styles,
> modify data, etc etc.  So having a good object oriented API for modifying
> things after construction is important for this to work.  So when units are
> involved, it can't be a "convert once at construction" and never touch
> units again.   We are constantly adjusting limits, moving artists, etc in
> unitized space after the plot is created.
> >
> > So in addition to the ConversionInterface API, I think there are other
> items that would be useful to explicitly spell out.  Things like which
> API's in MPL should accept units and which won't and which methods return
> unitized data and which don't.   It would be nice if there was a clear
> policy on this.  Maybe one exists and I'm not aware of it - it would be
> helpful to repeat it in a discussion on changing the unit system.
> Obviously I would love to have every method accept and return unitized data
> :-).
> >
> > I bring this up because I was just working on a hover/annotation class
> that needed to move a single annotation artist with the mouse.  To move the
> annotation box the way I needed to, I had to set one private member
> variable, call two set methods, use attribute assignment for one value, and
> set one semi-public member variable - some of which worked with units and
> some of which didn't.  I think having a clear "this kind of method
> accepts/returns units" policy would help when people are adding new
> accessors/methods/variables to make it more clear what kind of data is
> acceptable in each.
> >
> > Ted
> > ps: I may be able to help with some resources to work on any unit
> upgrades, but to make that happen I need to get a clear statement of what
> problem is being solved and the scope of the work so I can explain to our
> management why it's important.
> >
> > ________________________________________
> > From: Matplotlib-devel <matplotlib-devel-bounces+ted.drain=jpl.nasa.gov at python.org>
> > on behalf of Jody Klymak <jklymak at uvic.ca>
> > Sent: Saturday, February 3, 2018 9:25 PM
> > To: matplotlib development list
> > Subject: [Matplotlib-devel] Units discussion...
> >
> > Hi all,
> >
> > To carry on the gitter discussion about unit handling, hopefully to lead
> to a more stringent documentation and implementation….
> >
> > In response to @anntzer I thought about the units support a bit - it
> seems that rather than a transform, a more straightforward approach is to
> have the converter map to float arrays in a unique way.  This float mapping
> would be completely analogous to `date2num` in `dates`, in that it doesn’t
> change and is perfectly invertible without matplotlib ever knowing about
> the unit information, though the axis could store it for the tick
> locators and formatters.  It would also have an inverse that would supply
> data back to the user in unit-aware data (though not necessarily in the
> unit that the user supplied.  e.g. if they supply 8*in, and the
> converter converts everything to meter floats, then the returned unitized
> inverse would be 0.203*m, or whatever convention the converter wants to
> supply.).
> >
> > User “unit” control, i.e. making the plot in inches instead of m, would
> be accomplished with tick locators and formatters.  Matplotlib would never
> directly convert between cm and inches (any more than it converts from days
> to hours for dates), the downstream-supplied tick formatter and labeller
> would do it.
> >
> > Each axis would only get one converter, set by the first call to the
> axis. Subsequent calls to the axis would pass all data (including bare
> floats) to the converter.  If the converter wants to pass bare floats then
> it can do so.  If it wants to accept other data types then it can do so.
> It should be possible for the user to clear or set the converter, but then
> they should know what they are doing and why.
> >
> > What’s missing?  I don’t think this is wildly different than what we
> have, but maybe a bit more clear.
> >
> > Cheers,   Jody
> >
> >
> >
> >
> > _______________________________________________
> > Matplotlib-devel mailing list
> > Matplotlib-devel at python.org
> > https://mail.python.org/mailman/listinfo/matplotlib-devel