[SciPy-dev] design of a physical quantities package: seeking comments

Sun Aug 3 09:22:33 EDT 2008

Hi Anne,

On Sunday 03 August 2008 3:46:21 am Anne Archibald wrote:
> 2008/8/2 Darren Dale <dsdale24 at gmail.com>:
> > On Saturday 02 August 2008 4:23:43 pm Anne Archibald wrote:
> >> 2008/8/2 Darren Dale <dsdale24 at gmail.com>:
> >> I think it's a good idea to try to keep things in the units they are
> >> provided in; this should reduce the occurrences of unexpected
> >> overflows when (for example) cubing a distance in megaparsecs (put
> >> this in centimetres and use single precision and you might overflow).
> >> But this does mean you need to decide, when adding (say) a distance in
> >> feet to a distance in metres, which unit the result should be in.
> >
> > Yes, a decision will have to be made as to whether A*B+C will yield a
> > result in units of A or C. I am not worried at this point about overflows
> > or loss of precision when attempting to convert a quantity that is some
> > integer dtype.
>
> I don't think it's worth worrying about integers. But if you take,
> say, 10 megaparsecs, and express it in metres, you get 3e23 m. For the
> volume of a cube 10 Mpc on a side, you get about 3e70 - a number too
> big to fit in a single-precision float. So the cosmologists will be
> forced to use doubles if you internally represent everything using SI
> units. For the same reason, you may want to make the conversion rules
> very clear. ("Always convert to the left-hand unit" would work.)

I guess I should have been more clear that I intend to internally represent 
each dimension in the unit specified. As you point out later, converting back 
and forth between some standard internal representation like SI would be 
costly.

[...]
>
> Here's a simple example: I have a quantity A I think is in metres. I
> write "A=convert(A,'ft')". My program continues, combining A with
> various other quantities. At the end I obtain a meaningless number
> with bizarre units and I have to trace back through my code to find
> out that these weird units were attached to A, which in fact was in
> units of energy density.

You thought A had different dimensions than it did, and when you get to the 
end of your calculation, you have weird units and you have to go back and see
what went wrong. That would have happened whether or not you changed units 
from meters to feet.

> More, what happens when you have a quantity in Newton-metres (say) and
> you ask it to convert to feet? Do you get Newton-feet, or do all the
> occurrences of "feet" in the Newtons get converted?

They all get converted.

> What about conversions like ergs -> joules? Each of these is a
> composite unit, so the code has to have some kind of procedure to find
> the right number of powers of ergs to convert, leaving behind
> whatever's left. Combine this with fractional exponents and you have a
> real nightmare.

Compound units would be internally represented by their components. I think if 
you set your units to joules, all length dimensions would be expressed in m, 
mass in kg, time in s. In order to print a units representation in a compound
unit like J, I will probably have to inspect the units and power of each 
dimension and reconstruct the compound unit in some semi-intelligent manner. 
But I don't plan on dealing with representing compound units, for now.
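
A rough sketch of what "represented by their components" could look like, and of why every occurrence of a length gets converted. The tables and the functions decompose and convert are hypothetical names for illustration only; nothing here is an existing API.

```python
# Hypothetical table: base unit -> (dimension, factor relative to the SI unit).
BASE_UNITS = {
    "m": ("length", 1.0),
    "ft": ("length", 0.3048),
    "kg": ("mass", 1.0),
    "s": ("time", 1.0),
}

# Hypothetical table: compound unit -> powers of base units.
COMPOUND_UNITS = {
    "J": {"kg": 1, "m": 2, "s": -2},
    "N": {"kg": 1, "m": 1, "s": -2},
}


def decompose(unit_powers):
    """Expand compound units into base units and merge the exponents."""
    out = {}
    for unit, power in unit_powers.items():
        parts = COMPOUND_UNITS.get(unit, {unit: 1})
        for base, p in parts.items():
            out[base] = out.get(base, 0) + p * power
    return {u: p for u, p in out.items() if p != 0}


def convert(value, unit_powers, target):
    """Re-express every unit sharing target's dimension in the target unit."""
    dim, target_factor = BASE_UNITS[target]
    new_units = {}
    for unit, power in unit_powers.items():
        unit_dim, factor = BASE_UNITS[unit]
        if unit_dim == dim and unit != target:
            value *= (factor / target_factor) ** power
            unit = target
        new_units[unit] = new_units.get(unit, 0) + power
    return value, new_units


units = decompose({"N": 1, "m": 1})        # Newton-metres become kg m^2 s^-2
value, units = convert(1.0, units, "ft")   # both powers of length become feet
print(value, units)
```

Because the Newton is already stored as kg m s^-2, there is no separate "feet inside the Newton" to worry about: the quantity carries a single exponent for length.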

I just downloaded the java webstart version of Frink to have a look at its 
behavior. It looks like my approach is very similar to Frink's. From the 
Frink website: "All units are standardized and normalized into combinations of a
small number of several "Fundamental Dimensions" that cannot be reduced any
further." Here are a few examples:

b=32 J
32 m^2 s^-2 kg (energy)

c=10 s
10 s (time)

b*c
320 m^2 s^-1 kg (angular_momentum)

b/c
16/5 (exactly 3.2) m^2 s^-3 kg (power)

AA=12 pc/cm^3
3.7028130975668747443e+23 m^-2 (unknown unit type)

DD=b->ft
 Conformance error
   Left side is: 24384/125 (exactly 195.072) m^3 s^-2 kg (unknown unit type)
  Right side is: 381/1250 (exactly 0.3048) m (length)
     Suggestion: divide left side by energy

 For help, type: units[energy]
                   to list known units with these dimensions.

I had previously looked for Frink's source and when I didn't find it, I didn't
give it more thought until now. Having seen the way Frink behaves, I feel 
more confident that I am on the right track in terms of abstraction.
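
The behaviour in the Frink session above can be imitated in a few lines of Python. This is only a sketch under my own assumptions: the class Quantity and its method convert_to are hypothetical, and the values are held as exact fractions to mirror Frink's "16/5" output.

```python
from fractions import Fraction


class Quantity:
    """A value plus a dict of base-unit exponents, e.g. {"m": 2, "s": -2}."""

    def __init__(self, value, dims):
        self.value = Fraction(value)
        self.dims = {u: p for u, p in dims.items() if p != 0}

    def _combine(self, other, sign):
        # Multiplying adds exponents (sign=+1), dividing subtracts (sign=-1).
        dims = dict(self.dims)
        for unit, power in other.dims.items():
            dims[unit] = dims.get(unit, 0) + sign * power
        return dims

    def __mul__(self, other):
        return Quantity(self.value * other.value, self._combine(other, +1))

    def __truediv__(self, other):
        return Quantity(self.value / other.value, self._combine(other, -1))

    def convert_to(self, unit):
        """Express self as a multiple of unit; the dimensions must match."""
        if self.dims != unit.dims:
            raise ValueError(
                "conformance error: %r vs %r" % (self.dims, unit.dims)
            )
        return self.value / unit.value


b = Quantity(32, {"m": 2, "s": -2, "kg": 1})   # 32 J
c = Quantity(10, {"s": 1})                     # 10 s
ft = Quantity(Fraction(381, 1250), {"m": 1})   # 1 ft = 0.3048 m

product = b * c     # exponents add
quotient = b / c    # exponents subtract
```

Calling b.convert_to(ft) raises ValueError, which plays the role of Frink's conformance error.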

> >> On a related topic, how are you going to handle ufuncs? Addition and
> >> subtraction should require commensurable units, multiplication should
> >> multiply the units, and I think all other standard ufuncs should
> >> require something with no units. Well, except for mean, std, and var
> >> maybe. And "pow" is tricky. And, well, you see what I'm getting at.
> >
> > Well, I already laid out a strategy for dealing with multiplication and
> > addition, but I am really not that familiar with ufuncs and there are
> > probably some problems lurking that I am not aware of. Maybe I will have
> > to rely on object methods to wrap the incompatible ufuncs and return
> > Quantities with the appropriate units.
>
> I don't really have any idea how this is to be accomplished, but (for
> example) the function "exp" really needs to check that its argument
> has no units. It would also be nice to be able to use numpy.exp rather
> than scipy.units.exp.

I agree it would be nice, but let me get a working implementation together and 
if the project eventually receives the community's blessing, let the numpy or 
scipy folks decide if this can and should be supported. Seems a ways off to 
me.
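
In the meantime, the wrapping approach could be as simple as the following sketch. It uses math.exp on scalars rather than a numpy ufunc, and both the Quantity class and the decorator dimensionless_only are hypothetical names of my own.

```python
import math


class Quantity:
    """Minimal stand-in: a value plus a dict of unit exponents."""

    def __init__(self, value, dims=None):
        self.value = value
        self.dims = {u: p for u, p in (dims or {}).items() if p != 0}


def dimensionless_only(func):
    """Wrap func so that it refuses any Quantity that still carries units."""

    def wrapper(q):
        if isinstance(q, Quantity):
            if q.dims:
                raise ValueError(
                    "%s requires a dimensionless argument, got %r"
                    % (func.__name__, q.dims)
                )
            return func(q.value)
        return func(q)  # plain numbers pass straight through

    return wrapper


exp = dimensionless_only(math.exp)

ratio = Quantity(0.0)               # no units: accepted
length = Quantity(2.0, {"m": 1})    # metres: rejected by exp
print(exp(ratio))
```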

[...]
>
> > I would like to make clear: my concern is to get the abstractions right
> > so it will be flexible enough that others can build on it to provide
> > their desired functionality. If anyone has ideas on how the abstractions
> > need to be improved, I would like to hear them.
>
> I think the hard part will be getting the ufuncs to behave correctly.
> As you say, if you can get addition, multiplication, and fractional
> powers working, you're pretty much there.
>
> The key question for me is what units quantities are kept in
> internally. Keep in mind that some arrays are large, so conversions of
> quantities between units can be expensive. If I'm doing many
> calculations with quantities expressed in pc/cm^3, I would hope that
> they are not constantly being converted to m^{-2} on input and back to
> pc/cm^3 on output. In fact I can imagine cases where both these
> conversions happen inside a loop. That could get expensive.

The values would not be stored in some standard internal representation like 
SI, but rather in the units specified, decomposed into fundamental 
dimensions. I do not know how a compound unit like pc/cm^3 would work, but 
Frink doesn't know how to do it either. There is an example from my field too:
How much surface area can you pack into a given volume? This is often
expressed in m^2/cm^3. What a headache. Maybe I could come up with a method
that returns an alternate string representation of the quantity:
q.in_units_of("pc/cm^3"). But it would not affect the internal
representation.
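
Something along these lines, as a sketch only. I am assuming here that the value ends up stored in m^-2, as in the Frink output above; the DISPLAY_UNITS table and the Quantity class are hypothetical, and only the method name in_units_of comes from the idea described in the paragraph above.

```python
PARSEC_IN_M = 3.0856775814913673e16  # metres per parsec
CM3_IN_M3 = 1e-6                     # cubic metres per cubic centimetre

# Hypothetical table: display unit -> (factor in stored units, exponents).
DISPLAY_UNITS = {
    "pc/cm^3": (PARSEC_IN_M / CM3_IN_M3, {"m": -2}),
    "m^2/cm^3": (1.0 / CM3_IN_M3, {"m": -1}),
}


class Quantity:
    def __init__(self, value, dims):
        self.value = value
        self.dims = dict(dims)

    def in_units_of(self, name):
        """Return a string in the named unit; self is left unchanged."""
        factor, dims = DISPLAY_UNITS[name]
        if dims != self.dims:
            raise ValueError("cannot express %r as %s" % (self.dims, name))
        return "%g %s" % (self.value / factor, name)


stored = 12 * PARSEC_IN_M / CM3_IN_M3   # 12 pc/cm^3 expressed in m^-2
q = Quantity(stored, {"m": -2})
text = q.in_units_of("pc/cm^3")
print(text)
```

The conversion happens only when the string is built, so nothing is converted back and forth inside a loop unless the string is asked for there.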

Darren