[Python-ideas] Units in type hints

Steven D'Aprano steve at pearwood.info
Fri May 15 19:14:07 CEST 2015


On Fri, May 15, 2015 at 05:00:26PM +0300, Koos Zevenhoven wrote:
> On 14.5.2015 14:59, Steven D'Aprano wrote:
[...]
> >>For instance,
> >>
> >>     def sleep(duration : Float['s']):
> >>         ...
> >>
> >>Now the type checker could catch the error of trying to pass the sleep
> >>duration in milliseconds, Float['ms'].
> >
> >But that's not an error. Calling sleep(weight_in_kilograms) is an error.
> 
> In the example I gave, it is clearly an error. And it would be an error 
> with time.sleep. But you are obviously right, sleeping for kilograms is 
> also an error, although a very bizarre one.

Calling sleep(x) where x is a millisecond unit should not be an error, 
because millisecond is just a constant times second. To be precise, 
milliseconds and seconds both have the same dimension, T (time) and so 
differ only by a fixed conversion constant. Hence:

sleep( 1000 millisecond )
sleep( 1 second )
sleep( 0.016666667 minute )
sleep( 8.2671958e-07 fortnight )
sleep( 0.91134442 feet/kph )

etc. should all have exactly the same result, namely, to sleep for one 
second.

(Obviously "1 second" is not valid Python syntax. I'm just using it as 
shorthand for whatever syntax is used, possibly a function call.)


> >But calling sleep(milliseconds(1000)) should be the same as calling
> >sleep(seconds(1)).
> 
> Yes, something like that would be nice. What would sleep(1) do?

That depends on the sleep function. If we're talking about the actual 
time.sleep function that exists today, it will sleep for one second. But 
that's because it's not aware of units. A unit-aware function could:

- assume you know what you are doing and assign a default unit 
  to scalar quantities, e.g. treat 1 as "1 second";

- treat 1 as a dimensionless quantity and raise an exception
  ("no dimension" is not compatible with "time dimension").

Of the two, the Zen suggests the second is the right thing to do. ("In 
the face of ambiguity, refuse the temptation to guess.") But perhaps 
backwards-compatibility requires the first.

You could, I suppose, use a static type checker to get a really poor 
unit checker:

def sleep(t: Second):
    time.sleep(t)

class Second(float):
    pass


sleep(1.0)  # type checker flags this as wrong
sleep(Second(1.0))  # type checker allows this

But it's a really poor one, because it utter fails to enforce 
conversion factors, not even a little bit:

class Minute(float):
    pass

one_hour = Minute(60.0)
sleep(one_hour)  # flagged as wrong
sleep(Second(one_hour))  # allowed

but of course that will sleep for 60 seconds, not one hour. The problem 
here is that we've satisfied the type checker with meaningless types 
that don't do any conversions, and left all the conversions up to the 
user.



> >If the user has to do the conversion themselves,
> >that's a source of error:
> >
> >sleep(time_in_milliseconds / 1000)  # convert to seconds
> >
> >If you think that's too obvious an error for anyone to make,
> 
> You lost me now. There does not seem to be an error in the line of code 
> you provided, especially not when using Python 3, which has true 
> division by default.

D'oh!

Well, I demonstrated my point that unit conversions are prone to human 
error, only not the way I intended to. 

I *intended* to write the conversion the wrong way around, except I got 
it wrong myself. I *wrongly* convinced myself that the conversion factor 
was milliseconds * 1000 -> seconds, hence /1000 would get it wrong. Only 
it isn't.

Believe me, I didn't intend to make my point in such a convoluted way. 
This was a genuine screw-up on my part.


> However, in what I proposed, the type checker would 
> complain because you made a manual conversion without changing the unit 
> hint (which is also potential source of error, and you seem to agree). 
> According to my preliminary sketch, the correct way (which you did not 
> quote) would be
> 
>     sleep(convert(time_in_milliseconds, 'ms', 's'))

That can't work, because how does the static type checker know that 

    convert(time_in_milliseconds, 'ms', 's')

returns seconds rather than milliseconds or minutes or days? What sort 
of annotations can you give convert() that will be known at compile 
time? Maybe you can see something I haven't thought of, but I cannot 
think of any possible static declaration which would allow a type 
checker to correctly reason that

    convert(x, 'ms', 's')

returns Second or Float['s'], and so does this:

    new_unit = get_unit_from_config()
    convert(y, 'minute', new_unit)

but not this:

    convert(z, 's', 'hour')

Let alone more complex cases involving units multiplied, divided, and 
raised to powers.


-- 
Steve


More information about the Python-ideas mailing list