[Python-Dev] Status on PEP-431 Timezones

Wed Apr 15 21:53:16 CEST 2015

Lennart Regebro <regebro at gmail.com> writes:

> OK, so I realized another thing today, and that is that arithmetic
> doesn't necessarily round trip.
>
> For example, 2002-10-27 01:00 US/Eastern comes both in DST and STD.
>
> But 2002-10-27 01:00 US/Eastern STD minus two days is 2002-10-25 01:00
> US/Eastern DST

"two days" is ambiguous here. It is incorrect if you mean 48 hours (the
difference is 49 hours):

  #!/usr/bin/env python3
  from datetime import datetime, timedelta
  import pytz

  tz = pytz.timezone('US/Eastern')
  then_isdst = False # STD
  then = tz.localize(datetime(2002, 10, 27, 1), is_dst=then_isdst)
  now =  tz.localize(datetime(2002, 10, 25, 1), is_dst=None) # no utc transition
  print((then - now) // timedelta(hours=1))
  # -> 49

> However, 2002-10-25 01:00 US/Eastern DST plus two days is 2002-10-27
> 01:00 US/Eastern, but it is ambiguous if you want DST or not DST.

It is not ambiguous if you know what "two days" *in your particular
application* should mean (`day+2` vs. +48h exactly):

  print(tz.localize(now.replace(tzinfo=None) + timedelta(2), is_dst=then_isdst))
  # -> 2002-10-27 01:00:00-05:00 # +49h
  print(tz.normalize(now + timedelta(2))) # +48h
  # -> 2002-10-27 01:00:00-04:00

Here's a simple mental model that can be used for date arithmetics:

- naive datetime + timedelta(2) == "same time, elapsed hours unknown"
- aware utc datetime + timedelta(2) == "same time, +48h"
- aware datetime with timezone that may have different utc offsets at
different times + timedelta(2) == "unknown time, +48h"

"unknown" means that you can't tell without knowning the specific
timezone.

It ignores leap seconds.

The 3rd case behaves *as if* the calculations are performed using these
steps (the actual implementation may be different):

1. convert an aware datetime
object to utc (dt.astimezone(pytz.utc))
2. do the simple arithmetics using utc time
3. convert the result to the original pytz timezone (utc_dt.astimezone(tz))

you don't need `.localize()`, `.normalize()` calls here.

> And you can't pass in a is_dst flag to __add__, so the arithmatic must
> just pick one, and the sensible one is to keep to the same DST.
>
> That means that:
>
> tz = get_timezone('US/Eastern')
> dt = datetimedatetime(2002, 10, 27, 1, 0, tz=tz, is_dst=False)
> dt2 = dt - 420 + 420
> assert dt == dt2
>
> Will fail, which will be unexpected for most people.
>
> I think there is no way around this, but I thought I should flag for
> it. This is a good reason to do all your date time arithmetic in UTC.
>
> //Lennart

It won't fail:

  from datetime import datetime, timedelta
  import pytz

  tz = pytz.timezone('US/Eastern')
  dt = tz.localize(datetime(2002, 10, 27, 1), is_dst=False)
  delta = timedelta(seconds=420)

  assert dt == tz.normalize(tz.normalize(dt - delta) + delta)

The only reason `tz.normalize()` is used so that tzinfo would be correct
for the resulting datetime object; it does not affect the comparison otherwise:

  assert dt == (dt - delta + delta) #XXX tzinfo may be incorrect
  assert dt == tz.normalize(dt - delta + delta) # correct tzinfo for the final result