[Datetime-SIG] Computing .dst() as a timedelta

Sun Sep 20 00:01:59 CEST 2015

[Alexander Belopolsky <alexander.belopolsky at gmail.com>]
> The datetime.dst() and its namesake tzinfo.dst() [1] methods are required to
> return a timedelta object that represents a quantity added to standard time
> in a spring-forward transition.
>
> As explained in documentation, the dst() value is already incorporated in
> the value returned by utcoffset() and is not needed in typical calculations.
> Therefore, it is not surprising that both dateutil and pytz get it wrong in
> some cases. [2,3]

Ya, the docs over-promised here ;-)  I think the only "important"
invariant to maintain is that _some_ kind of DST is in effect if and
only if .dst() != timedelta(0).

> While pytz does slightly better than dateutil, it looks like it may not be
> possible to derive the correct value of dst() from the compiled binary
> tzfiles alone in all cases.

You're right, it can't, but for a more general reason than what you
give next:  at base, it's impossible to always know what a zone's
"standard offset" is from what a tzfile stores, even though the
zoneinfo source (text) files do spell that out.

> The problematic cases are transitions that involve a simultaneous change in
> standard time and a DST transition.  For example, Portugal switching from
> CET to WEST in 1996. [2]

Specifically, on 1996-03-31 that simultaneously switched from CET
(standard time) to WEST (daylight time), yes?  The total UTC offset
was 1:00:00 both before and after.

In cases "like this", you can search either backward or forward in the
transition list, to find a closest _different_ DST switch, and
calculate a change of 1 hour either way.  So it's "almost certain"
that the DST offset is an hour in this case too.

A case where that doesn't work, unless squinting:  that place in
Antarctica with two kinds of DST each year.  The total UTC offset
increases by 1 when the first DST kicks in, and by 1 again when the
second kicks in.  So, in the second case, the delta between adjacent
total UTC offsets is just 1, despite that the (total) DST offset is
actually 2.
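
To make the arithmetic concrete, here is a small sketch.  The offsets
(0, 1 and 2 hours) are made-up stand-ins for "standard, first DST,
second DST", not values read from any real tzfile:

```python
from datetime import timedelta

# Hypothetical total UTC offsets, in transition order:
# standard time, first DST step, second DST step.
offsets = [timedelta(hours=0), timedelta(hours=1), timedelta(hours=2)]

# Naive guess: dst() is the difference between adjacent total offsets.
adjacent = [new - old for old, new in zip(offsets, offsets[1:])]

# What is actually wanted: the difference from the standard offset.
standard = offsets[0]
from_standard = [off - standard for off in offsets[1:]]

print(adjacent)        # one entry per DST step
print(from_standard)   # one entry per DST step
```

The two lists agree for the first step and disagree for the second,
which is exactly the "just 1" versus "actually 2" mismatch.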

Which suggests a more general "good guess":

    If the transition record says DST is not in effect,
        dst() should return timedelta(0).
    Else it says DST is in effect.
    If the prior transition record says it was not in effect
        and the total UTC offsets differ,
        .dst() should return their difference.
    Else the total offsets are the same, or
        DST is in effect for both.
    Search back to find the closest preceding time DST switched.
    Use the total UTC offset from the "not DST" half of that switch instead.
    If none can be found going backward, go forward instead.
    And if both searches fail, return timedelta(hours=1).
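
A rough Python sketch of that guess follows.  The name guess_dst and
the list-of-(utcoffset, isdst)-pairs layout are invented here for
illustration; they are not pytz's or dateutil's API.  One liberty is
taken beyond the outline above: a candidate "not DST" offset is skipped
when it equals the current total offset, on the assumption that a
nonzero result is wanted whenever DST is in effect.

```python
from datetime import timedelta

ZERO = timedelta(0)

def guess_dst(records, i):
    """Guess dst() for records[i].

    records is a hypothetical list of (utcoffset, isdst) pairs in
    transition order, mirroring what a compiled tzfile keeps.
    """
    offset, isdst = records[i]
    if not isdst:
        return ZERO

    # Easy case: the prior record is standard time with a different offset.
    if i > 0:
        prev_offset, prev_isdst = records[i - 1]
        if not prev_isdst and prev_offset != offset:
            return offset - prev_offset

    def standard_side(j):
        # If records[j-1] -> records[j] flips isdst, return the total
        # offset of the "not DST" half; otherwise return None.
        (a_off, a_dst), (b_off, b_dst) = records[j - 1], records[j]
        if a_dst == b_dst:
            return None
        return b_off if a_dst else a_off

    backward = range(i - 1, 0, -1)          # closest preceding switch first
    forward = range(i + 1, len(records))    # then closest following switch
    for j in list(backward) + list(forward):
        std = standard_side(j)
        # Assumption: ignore a switch that would yield a zero DST offset.
        if std is not None and std != offset:
            return offset - std

    # Both searches failed.
    return timedelta(hours=1)

H = timedelta(hours=1)

# Shaped like the Portugal case: DST at +2, standard at +1,
# DST at +1 (standard changed at the same instant), standard at 0.
portugal_like = [(2 * H, True), (1 * H, False), (1 * H, True), (ZERO, False)]
print(guess_dst(portugal_like, 2))

# Shaped like the two-kinds-of-DST case: standard, first DST, second DST.
double_dst = [(ZERO, False), (1 * H, True), (2 * H, True)]
print(guess_dst(double_dst, 2))
```

In the first example the backward search finds a switch whose "not DST"
offset equals the current total offset, so it is skipped and the
forward search supplies the answer.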

> While the "SAVE" amount can be found in the raw tzdist files, this
> information is lost when the raw files are compiled.  The transition
> information includes only the full new UTC offset and a boolean isdst flag.
> If the transition is a pure DST transition, then dst() is just the
> difference between the new UTC offset and the old, but if the standard time
> offset changes at the time of the DST transition, there is no information in
> the binary tzfile to split the full difference into standard time change and
> DST adjustment.
>
> Unless I miss something, it looks like a high-quality tzinfo implementation
> should extract the "SAVE" information from the raw files.

I will continue to draw a distinction between "high quality" and
"timezone wonk" quality ;-)

> [1]: https://docs.python.org/3/library/datetime.html#datetime.tzinfo.dst
> [2]: https://github.com/dateutil/dateutil/issues/128
> [3]: https://bugs.launchpad.net/pytz/+bug/1497619