Pre-Pre-PEP: The datetime.timedeltacal class

Chris Angelico rosuav at gmail.com
Sat Apr 16 16:08:54 EDT 2022


On Sun, 17 Apr 2022 at 03:37, Peter J. Holzer <hjp-python at hjp.at> wrote:
> Datetime arithmetic in the real world is typically not done in seconds,
> but in calendaric units: Hours, days, weeks, months, years, ...
> The problem is that several of these have varying lengths:
>
> * 1 minute may be 60 or 61 seconds (theoretically also 59, but that
>   hasn't happened yet).
> * 1 day can be 23, 24 or 25 hours (unless you are in Troll, Antarctica,
>   where it's even weirder).

I think Troll still only has days that consist of 23-25 hours; the
weird part is that they move their clocks forward for Oslo's summer,
which is their winter.

> * 1 month may be 28, 29, 30 or 31 days (let's stick to the Gregorian
>   calendar)
>
> The standard library has a datetime.timedelta class which does store
> days and seconds separately, so somebody seems to have had the right
> idea, but the normalization rules make it impossible to distinguish
> between "1 day plus 1 hour" and "25 hours", and it doesn't deal with
> months at all.
>
> Technically it shouldn't be too hard to "fix" timedelta, but that
> wouldn't be backward compatible and would very likely break existing
> code.

Almost certainly, yes; but I would say that that's because you're not
"fixing" timedelta, you're making a completely different concept. The
existing timedelta measures a difference in time; your proposal
represents a difference between two civil calendar points. So I agree
with your suggestion to make it a new and independent class.
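
For reference, the normalization described in the quoted text is easy
to see in Python. This is a minimal sketch using only the standard
library; the variable names are just for illustration:

```python
from datetime import timedelta

# Both spellings normalize to the same (days, seconds, microseconds) triple,
# so the distinction between "1 day plus 1 hour" and "25 hours" is lost.
day_plus_hour = timedelta(days=1, hours=1)
twenty_five_hours = timedelta(hours=25)

print(day_plus_hour == twenty_five_hours)
print(twenty_five_hours.days, twenty_five_hours.seconds)
```

Since the two objects compare equal, nothing downstream can tell which
one the caller originally meant.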

> Therefore a new class (provisionally called timedeltacal, because it is
> calendaric, not absolute) should be added to datetime:
>
> Internally it stores months, days, seconds and microseconds as ints.
>
> The seconds and microseconds split is mostly for compatibility with
> datetime and timedelta. We could store seconds as a float instead.
>
> We don't store minutes since leap seconds aren't usually represented in
> "computer time", so they are unlikely to be useful in a timedeltacal
> object.
>
> Days are stored since they aren't a fixed multiple of any smaller unit.
> Months are stored since they aren't a fixed multiple of any smaller unit.
>
> Hours, weeks and years aren't stored since they are always 60 minutes, 7
> days and 12 months respectively.

It sounds like you're planning for annual DST changes, but what about
other shifts? When a location adopts standard time, its UTC offset can
change by an arbitrary amount, even by a matter of minutes. (Yes, I'm
aware that most places adopted standard time before UTC was a thing,
but we still usually call it a UTC offset rather than messing with the
GMT-UTC changeover.)

It might be cleaner to simply have all of the arguments that datetime
has: year, month, day, hour, minute, second, microsecond (with the
possibility of merging second/usec into a single float).
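
To make the storage question concrete, here is a rough sketch of the
four-field scheme from the quoted proposal, with the larger units
accepted as keyword arguments. The class name TimeDeltaCal and the
constructor name "of" are hypothetical, nothing like this exists in
the standard library, and folding hours and minutes into seconds is
the proposal's assumption, not a settled design:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TimeDeltaCal:
    """Hypothetical calendaric delta: months, days, seconds, microseconds."""

    months: int = 0
    days: int = 0
    seconds: int = 0
    microseconds: int = 0

    @classmethod
    def of(cls, *, years=0, months=0, weeks=0, days=0,
           hours=0, minutes=0, seconds=0, microseconds=0):
        # Years and weeks are fixed multiples of months and days.
        # Hours and minutes are folded into seconds (no leap seconds).
        # Seconds are deliberately NOT carried over into days.
        carry, micros = divmod(microseconds, 1_000_000)
        return cls(
            months=years * 12 + months,
            days=weeks * 7 + days,
            seconds=hours * 3600 + minutes * 60 + seconds + carry,
            microseconds=micros,
        )


print(TimeDeltaCal.of(days=1, hours=1))
print(TimeDeltaCal.of(hours=25))
print(TimeDeltaCal.of(days=1, hours=1) == TimeDeltaCal.of(hours=25))
```

Unlike timedelta, this keeps "1 day plus 1 hour" and "25 hours" as
distinct values, because seconds never overflow into days.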

> When adding a timedeltacal object to a datetime, the fields are added
> from most to least significant: First a new date is computed by
> advancing the number of months specified [TODO: Research how other
> systems handle overflow (e.g. 2022-01-31 + 1 month: 2022-02-31 doesn't
> exist)]

Quick test in Pike:

Pike v8.1 release 15 running Hilfe v3.5 (Incremental Pike Frontend)
> import Calendar.ISO;
> object base = now();
> base;
(1) Result: Fraction(Sun 17 Apr 2022 5:42:15.703571 AEST)
> (base - Day() * 17) + Month();
(2) Result: Fraction(Sat 30 Apr 2022 5:42:15.703571 AEST)
> (base - Day() * 18) + Month();
(3) Result: Fraction(Sat 30 Apr 2022 5:42:15.703571 AEST)
> (base - Day() * 19) + Month();
(4) Result: Fraction(Fri 29 Apr 2022 5:42:15.703571 AEST)
> (base - Day() * 16) + Month();
(5) Result: Fraction(Sun 1 May 2022 5:42:15.703571 AEST)
> (base - Day() * 15) + Month();
(6) Result: Fraction(Mon 2 May 2022 5:42:15.703571 AEST)

Subtracting seventeen days from today gets us to the 31st of March,
and adding one month to that gives us the 30th of April. Subtracting
eighteen days gets us to the 30th of March, and adding a month to that
_also_ gives us the 30th of April. Other nearby dates given for
reference.
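
The same "clamp to the last day of the month" behaviour can be
sketched in Python. The add_months helper below is hypothetical (it is
not part of datetime), and clamping is only one of several possible
overflow policies:

```python
import calendar
from datetime import date


def add_months(d: date, months: int) -> date:
    """Hypothetical helper: shift d by whole months, clamping the day."""
    # Work with a zero-based month index so divmod handles year rollover,
    # including negative month counts.
    index = d.year * 12 + (d.month - 1) + months
    year, month0 = divmod(index, 12)
    month = month0 + 1
    last_day = calendar.monthrange(year, month)[1]
    return d.replace(year=year, month=month, day=min(d.day, last_day))


for day in (29, 30, 31):
    start = date(2022, 3, day)
    print(start, "+ 1 month ->", add_months(start, 1))
```

With this policy, the 30th and the 31st of March both land on the 30th
of April, matching the Pike results above.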

> then advance the number of days. Finally add the number of
> seconds and microseconds, taking into account daylight saving time
> switches if the datetime is time zone aware.

Here's the local DST switchover:

> base - Day() * 15;
(7) Result: Fraction(Sat 2 Apr 2022 5:42:15.703571 AEDT)
> base - Day() * 14;
(8) Result: Fraction(Sun 3 Apr 2022 5:42:15.703571 AEST)
> base - Day() * 14 - Hour() * 4;
(9) Result: Fraction(Sun 3 Apr 2022 2:42:15.703571 AEDT)
> base - Day() * 14 - Hour() * 3;
(10) Result: Fraction(Sun 3 Apr 2022 2:42:15.703571 AEST)
> base - Day() * 14 - Hour() * 2;
(11) Result: Fraction(Sun 3 Apr 2022 3:42:15.703571 AEST)
> base - Day() * 14 - Hour() * 1;
(12) Result: Fraction(Sun 3 Apr 2022 4:42:15.703571 AEST)

BTW, even though the "object display" notation (the repr, if you like)
shows some of these in AEST and some in AEDT, internally, they all
have the same timezone: Australia/Melbourne.
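
For comparison, here is roughly the same switchover using Python's
zoneinfo. This is a sketch that assumes Python 3.9+ and an available
time zone database; the variable names are just for illustration:

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo  # needs a system tz database or the tzdata package

MEL = ZoneInfo("Australia/Melbourne")

# Adding a timedelta to an aware datetime is wall-clock arithmetic:
# the local time of day is preserved across the DST change.
before = datetime(2022, 4, 2, 5, 42, tzinfo=MEL)
after = before + timedelta(days=1)
print(before.tzname(), "->", after.tzname())

# Same-zone subtraction ignores the offset change; going via UTC does not.
wall = after - before
elapsed = after.astimezone(timezone.utc) - before.astimezone(timezone.utc)
print("wall clock:", wall, "elapsed:", elapsed)

# 2:42 on the 3rd happens twice; the fold attribute selects which one.
first = datetime(2022, 4, 3, 2, 42, tzinfo=MEL, fold=0)
second = datetime(2022, 4, 3, 2, 42, tzinfo=MEL, fold=1)
print(first.tzname(), second.tzname())
```

So today's datetime already behaves "calendarically" for whole days in
a single zone; it is the timedelta itself that cannot express the
difference between a day and 24 hours.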

(Side point: I've used consistent syntax here, but there are
shorthands for common cases - I could write "base + Day" to add one
day.)

> Subtracting a timedeltacal object from a datetime is the same, just in
> the opposite direction.
>
> Note that t + d - d is in general not equal to t.

By "in general", do you mean "there will be some odd exceptions", or
"this is usually going to be unequal"? For instance, going back to the
month boundary case, since it's possible for "add one month" from
two different dates to give the same result, obviously subtracting a
month from that result can't give both the originals. But for the most
part, I would expect t + d - d to be equal to t, modulo rounding error
and possible DST corrections. Crossing a DST boundary shouldn't break
this pattern; only landing in the actual gap/fold should cause issues.
Is that the intention?
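
A small Python sketch of the month-end exception, assuming the
clamping policy seen in the Pike session. The add_months helper is
hypothetical, not a datetime API:

```python
import calendar
from datetime import date


def add_months(d: date, months: int) -> date:
    """Hypothetical helper: shift d by whole months, clamping the day."""
    index = d.year * 12 + (d.month - 1) + months
    year, month0 = divmod(index, 12)
    last_day = calendar.monthrange(year, month0 + 1)[1]
    return d.replace(year=year, month=month0 + 1, day=min(d.day, last_day))


# Month end: the clamp on the way forward loses a day on the way back.
t = date(2022, 3, 31)
round_trip = add_months(add_months(t, 1), -1)
print(t, "->", round_trip)

# Mid-month: t + d - d == t, as one would normally expect.
mid = date(2022, 3, 15)
mid_round_trip = add_months(add_months(mid, 1), -1)
print(mid, "->", mid_round_trip)
```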

> We can't change the semantics of datetime - datetime, so there must be a
> function to compute the difference between two datetimes as a
> timedeltacal. It could be a method on datetime (maybe t.sub(u) for t-u
> like in Go) or a constructor which takes two datetime objects.
>
> In any case I think that u + (t - u) == t should hold. [TODO: Check that
> this is possible]
>

Isn't that the exact same thing as saying that t + d - d == t? Or are
you saying that, when you subtract two timestamps, you can never get
one of the odd exceptions that would cause problems? Because I'd
believe that; the net result would be that u+(t-u) == t, but
u+((t+d)-(u+d)) might not equal t. Putting it another way: Shift
invariance should *normally* hold, but there may be some exceptions at
month ends (basically rounding error when you add/subtract a month)
and DST switches.
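
Here is one way the difference function could be sketched for plain
dates, so that u + (t - u) == t holds by construction. The names
add_months, cal_diff and apply_diff are hypothetical, the sketch
assumes t >= u and the clamping policy from the Pike session, and it
ignores times and DST entirely:

```python
import calendar
from datetime import date, timedelta


def add_months(d: date, months: int) -> date:
    """Hypothetical helper: shift d by whole months, clamping the day."""
    index = d.year * 12 + (d.month - 1) + months
    year, month0 = divmod(index, 12)
    last_day = calendar.monthrange(year, month0 + 1)[1]
    return d.replace(year=year, month=month0 + 1, day=min(d.day, last_day))


def cal_diff(u: date, t: date) -> tuple[int, int]:
    """Return (months, days) with u + months + days == t. Assumes t >= u."""
    months = (t.year - u.year) * 12 + (t.month - u.month)
    if add_months(u, months) > t:
        months -= 1  # overshot within t's month, so back off one month
    days = (t - add_months(u, months)).days
    return months, days


def apply_diff(u: date, diff: tuple[int, int]) -> date:
    """Add months first, then days (most to least significant)."""
    months, days = diff
    return add_months(u, months) + timedelta(days=days)


u, t = date(2022, 1, 30), date(2022, 1, 31)
print(cal_diff(u, t), apply_diff(u, cal_diff(u, t)) == t)

# Shift both ends by one month: both clamp to the 28th of February,
# so the difference collapses and no longer carries u back to t.
shifted = cal_diff(add_months(u, 1), add_months(t, 1))
print(shifted, apply_diff(u, shifted) == t)
```

The first identity holds because the day count is computed after the
month step has already been applied; the shifted case shows the kind of
month-end rounding error described above.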

When the time comes, if you need a hand putting together a PEP, feel
free to reach out to me.

ChrisA

