[Datetime-SIG] Implementing tzinfo for all valid datetimes (was Re: PEP-431/495)

Tim Peters tim.peters at gmail.com
Mon Aug 24 09:05:57 CEST 2015


[Tim]
> Oops!  Somewhere around 2037-2038 it apparently lost all knowledge of
> US/Eastern daylight time.  I expect this is why:
>
>     >>> ez._utc_transition_times[-1]
>     datetime.datetime(2037, 11, 1, 6, 0)
>
> That is, the last transition it knows about is the end of daylight time in 2037.
...
>
> Digging deeper, I don't think I can pin this on tzfile.  The docs say
> that, if possible, a tzfile also contains a POSIX-TZ-style rule to be
> used for times beyond the last explicit transition instant.  In the
> US/Eastern tzfile shipped with this version of pytz, that's:
>
>     EST5EDT,M3.2.0,M11.1.0
>
> So a "complete" wrapping of zoneinfo also requires implementing such
> rules when present.

This appears to be the scoop, although I may be wrong about some of it:
when tzfile was first invented, like most other stuff at the time it
assumed the world would end before 2038 (the first year a signed
32-bit int is too narrow to hold a UNIX(tm) seconds-since-1970
timestamp).  Values in a tzfile were all at most 4 bytes, zic
generated all transitions explicitly through the end of 2037, and that
was that.
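
As a quick check on that 2038 limit, here is a minimal sketch using only the standard library.  The names `EPOCH`, `LAST_INT32` and `fits_in_int32` are made up for illustration; nothing here comes from tzfile or pytz.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)      # naive datetime, taken to be UTC
INT32_MIN = -2**31
INT32_MAX = 2**31 - 1

# The last instant a signed 32-bit seconds-since-1970 count can express.
LAST_INT32 = EPOCH + timedelta(seconds=INT32_MAX)

def fits_in_int32(utc_dt):
    """True if the naive-UTC datetime is representable as a signed 32-bit timestamp."""
    seconds = (utc_dt - EPOCH) // timedelta(seconds=1)
    return INT32_MIN <= seconds <= INT32_MAX

print(LAST_INT32)                                   # 2038-01-19 03:14:07
print(fits_in_int32(datetime(2037, 11, 1, 6, 0)))   # True
```

The last transition pytz reported, 2037-11-01 06:00 UTC, still fits; any transition in March 2038 would not.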

Sometime later, but further back than the current NEWS file goes, version 2
of tzfile was introduced.  This added a new section allowing for
8-byte data, and with that came the realization that generating all
transitions explicitly was a doomed approach.  So version 2 also added
the POSIX-TZ gimmick:  so long as the most recent behavior was regular
enough to use a TZ rule, there was no need to generate any explicit
transitions covered by that rule.
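
To make the rule concrete, here is a rough sketch of evaluating the `M3.2.0` and `M11.1.0` parts of `EST5EDT,M3.2.0,M11.1.0` for a given year.  It assumes the POSIX defaults (switch at 02:00 local time, daylight time one hour ahead of standard) and hard-codes the Eastern offsets rather than parsing the whole TZ string; `mwd_date` and `eastern_utc_transitions` are hypothetical helper names, not part of any library.

```python
from datetime import date, datetime, time, timedelta

def mwd_date(year, spec):
    """Resolve a POSIX 'Mm.w.d' spec such as 'M3.2.0' to a date in `year`.

    m is the month, w the week (1-5, where 5 means "last"), and d the
    weekday with 0 = Sunday.
    """
    month, week, weekday = (int(part) for part in spec[1:].split("."))
    first = date(year, month, 1)
    # date.weekday() has Monday = 0; POSIX has Sunday = 0.
    first_posix = (first.weekday() + 1) % 7
    d = first + timedelta(days=(weekday - first_posix) % 7 + 7 * (week - 1))
    if d.month != month:          # week 5 ran past the end of the month
        d -= timedelta(days=7)
    return d

def eastern_utc_transitions(year):
    """(DST start, DST end) as naive UTC datetimes for EST5EDT,M3.2.0,M11.1.0."""
    est = timedelta(hours=-5)     # EST5: standard time is UTC-5
    edt = timedelta(hours=-4)     # EDT defaults to one hour ahead of EST
    two_am = time(2, 0)           # POSIX default transition time
    start_local = datetime.combine(mwd_date(year, "M3.2.0"), two_am)
    end_local = datetime.combine(mwd_date(year, "M11.1.0"), two_am)
    # Start is stated in standard time, end in daylight time.
    return start_local - est, end_local - edt

print(eastern_utc_transitions(2037)[1])   # 2037-11-01 06:00:00
```

For 2037 the computed end of daylight time matches the last explicit transition pytz knew about, and the same function keeps working for 2038 and later, which is the whole point of the rule.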

But what about old clients, who used version 1?  Would updates to
zones become useless to them because they couldn't deal with version 2
yet?  A comment in zic.c's `outzone()` function answers that:

/*
** For the benefit of older systems,
** generate data from 1900 through 2037.
*/

So that's why they still generate everything explicitly through 2037:
the first piece of a version 2 (and version 3) tzfile _is_ a version 1
tzfile, and ancient software expecting version 1 can still use current
version 3 tzfiles without problems.  But "modern" software is expected
to use the TZ rule - they're never going to generate explicit
transitions beyond 2037 except when a TZ rule is inadequate to express
them.  They only generate them now for the benefit of legacy systems.
Python is aging, but it's not _that_ old yet ;-)
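
For anyone who wants to look at that layout directly, here is a sketch of skipping over the version 1 piece and the 8-byte piece to reach the trailing TZ rule.  It follows my understanding of the published tzfile layout (a 44-byte header, then a data block whose size is determined by six counts, with the whole thing repeated using 8-byte times in version 2+); `block_size` and `tz_rule` are names invented for this example, and it does no validation beyond the magic number.

```python
import struct

# Magic "TZif", one version byte, 15 reserved bytes, six big-endian 4-byte counts.
HEADER = struct.Struct(">4sc15x6L")

def block_size(counts, time_size):
    """Size in bytes of the data block that follows a header."""
    isutcnt, isstdcnt, leapcnt, timecnt, typecnt, charcnt = counts
    return (timecnt * time_size           # transition times
            + timecnt                     # one type index per transition
            + typecnt * 6                 # local time type records
            + charcnt                     # abbreviation characters
            + leapcnt * (time_size + 4)   # leap-second records
            + isstdcnt + isutcnt)         # standard/wall and UT/local flags

def tz_rule(data):
    """Return (version byte, TZ rule string or None) for raw tzfile bytes."""
    magic, version, *counts = HEADER.unpack_from(data, 0)
    if magic != b"TZif":
        raise ValueError("not a tzfile")
    if version == b"\0":
        return version, None              # version 1: nothing after the 4-byte data
    # Skip the version 1 piece, then the second header and its 8-byte data.
    pos = HEADER.size + block_size(counts, 4)
    _, _, *counts2 = HEADER.unpack_from(data, pos)
    pos += HEADER.size + block_size(counts2, 8)
    # What remains is the footer: newline, TZ string, newline.
    rule = data[pos:].strip(b"\n").decode("ascii")
    return version, rule or None

# Typical use, assuming a zoneinfo file lives at this (system-dependent) path:
#     with open("/usr/share/zoneinfo/US/Eastern", "rb") as f:
#         print(tz_rule(f.read()))
```

An empty footer means no TZ rule could express the zone's most recent behavior, so the function returns None for the rule in that case too.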

