[Python-Dev] Aware datetime from naive local time Was: Status on PEP-431 Timezones

Alexander Belopolsky alexander.belopolsky at gmail.com
Sat Apr 11 01:32:00 CEST 2015


On Fri, Apr 10, 2015 at 6:38 AM, Nick Coghlan <ncoghlan at gmail.com> wrote:

> It actually took me a long time to understand that the "isdst" flag in
> this context related to the following chain of reasoning:
>
> 1. Due to various reasons, local time offsets relative to UTC may change,
> thus repeating certain subsets of local time
> 2. Repeated local times usually relate to winding clocks back an hour at
> the end of a DST period
> 3. "isdst=True" thus refers to "before the local time change winds the
> clocks back", while "isdst=False" refers to *after* the clocks are wound
> back
>
> As Alexander says, you can reduce the amount of assumed knowledge needed to
> understand the API by focusing on the ambiguity resolution directly without
> assuming that the *reason* for the ambiguity is "end of DST period".
>
This is an excellent summary of my original post.  (It is in fact better
than the post itself, which I therefore did not include in the quote.)  For
the mathematically inclined, I can reformulate the problem as follows.  For
any given geographical location, loc, and a moment in time t expressed
as UTC time, one can tell what time was shown on a "local clock-tower."
This defines a function  wall(loc, t).   This function is a piece-wise
linear function which may have regular or irregular discontinuities.
Because of these discontinuities, an equation wall(loc, t) = lt may have 0,
1, or 2 solutions.
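
To make the 0, 1, or 2 solutions concrete, here is a minimal sketch of wall()
for a single toy location.  The transition table, the helper names
(utc_offset, solutions), and the use of naive datetime objects to stand for
UTC are all assumptions made for illustration; the offsets are modeled
loosely on US Eastern time in 2015 and are not read from any real zoneinfo
database.

```python
from datetime import datetime, timedelta

# Hypothetical toy zone: the UTC offset in force before any transition,
# followed by (UTC instant, new offset) pairs in chronological order.
INITIAL_OFFSET = timedelta(hours=-5)
TRANSITIONS = [
    (datetime(2015, 3, 8, 7, 0), timedelta(hours=-4)),   # clocks jump forward
    (datetime(2015, 11, 1, 6, 0), timedelta(hours=-5)),  # clocks wound back
]


def utc_offset(t):
    """Return the offset in force at UTC time t (a naive datetime)."""
    offset = INITIAL_OFFSET
    for start, new_offset in TRANSITIONS:
        if t >= start:
            offset = new_offset
    return offset


def wall(t):
    """Time shown on the local clock tower at UTC time t."""
    return t + utc_offset(t)


def solutions(lt):
    """Return every UTC time t with wall(t) == lt, in increasing order."""
    offsets = {INITIAL_OFFSET} | {off for _, off in TRANSITIONS}
    # Each known offset yields one candidate; keep those that round-trip.
    return sorted(t for t in (lt - off for off in offsets) if wall(t) == lt)


print(len(solutions(datetime(2015, 7, 1, 12, 0))))   # ordinary local time
print(len(solutions(datetime(2015, 3, 8, 2, 30))))   # skipped local time
print(len(solutions(datetime(2015, 11, 1, 1, 30))))  # repeated local time
```

Trying one candidate per known offset is enough here because the toy zone
has only two distinct offsets; a real implementation would consult the
transitions nearest to lt.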

The DST switchovers are an example of regular discontinuities.  In most
locations, they follow a somewhat predictable pattern with two
discontinuities per year.  Irregular discontinuities happen in locations
with activist governments and don't follow any general rules.

For most world locations past discontinuities are fairly well documented
for at least a century and future changes are published with at least 6
months lead time.

> From a pedagogical point of view, having a separate API that returned 0, 1,
> or 2 results for a local time lookup could thus help make it clear that
> local time to absolute time conversions are effectively a database lookup
> problem, and that timezone offset changes (whether historical or cyclical)
> mean that the mapping isn't 1:1 - some expressible local times never
> actually happen, while others happen more than once.
>
The downside of this API is that naively written code is prone to crashes.
Someone unaware of the invalid local times and not caring about the choice
between ambiguities may write code like

t = utc_times_from_local(lt)[0]

which may work fine for many years before someone gets an IndexError and a
backtrace in her server log.


> For the normal APIs, NonExistentTimeError would then correspond with the
> case where the record lookup API returned no results, while the suggested
> "which" index would handle the two results case without assuming the
> repeated local time was specifically due to the end of a DST period.
>

NonExistentTimeError has the same problem as an API that returns an empty
list.  Seeing NonExistentTimeError in a server log is not a big improvement
over seeing an IndexError.

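Moreover, a program that rejects invalid times on input but stores the
accepted local times for a long time may see its database silently
corrupted after a zoneinfo update: a local time that was valid when it was
stored may no longer be valid under the new rules.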
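
The zoneinfo-update hazard can be sketched with two versions of the rules
for one hypothetical location.  Everything below (the offsets, the
transition date, and the helper names wall and is_valid_local) is invented
for illustration and does not describe any real time zone or library API.

```python
from datetime import datetime, timedelta


def wall(t, base_offset, transitions):
    """Local clock reading at UTC time t under the given rules."""
    offset = base_offset
    for start, new_offset in transitions:
        if t >= start:
            offset = new_offset
    return t + offset


def is_valid_local(lt, base_offset, transitions):
    """True if some UTC time t satisfies wall(t) == lt under the rules."""
    offsets = {base_offset} | {off for _, off in transitions}
    return any(wall(lt - off, base_offset, transitions) == lt
               for off in offsets)


BASE = timedelta(hours=3)
OLD_RULES = []  # rules when the value was stored: no transitions at all
# Hypothetical update: clocks now jump from 02:00 to 03:00 local on 27 March.
NEW_RULES = [(datetime(2016, 3, 26, 23, 0), timedelta(hours=4))]

stored = datetime(2016, 3, 27, 2, 30)  # accepted on input under OLD_RULES

print(is_valid_local(stored, BASE, OLD_RULES))  # validity check at input time
print(is_valid_local(stored, BASE, NEW_RULES))  # same check after the update
```

The stored value never changes, yet the second check disagrees with the
first, so validating only at input time does not protect the database.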

Now it is time to make a specific proposal.  I would like to extend the
datetime.astimezone() method to work on naive datetime instances.  Such
instances will be assumed to be in local time and discontinuities will be
handled as follows:


1. wall(t) == lt has a single solution.  This is the trivial case and
lt.astimezone(utc) and lt.astimezone(utc, which=i)  for i=0,1 should return
that solution.

2. wall(t) == lt has two solutions t1 and t2 such that t1 < t2. In this
case lt.astimezone(utc) == lt.astimezone(utc, which=0) == t1 and
 lt.astimezone(utc, which=1) == t2.

3. wall(t) == lt has no solution.  This happens when there is a UTC time t0
such that wall(t0) < lt and wall(t0+eps) > lt (a positive discontinuity
at time t0). In this case lt.astimezone(utc) should return t0 + lt -
wall(t0).  I.e., we ignore the discontinuity and extend wall(t) linearly
past t0.  Obviously, in this case the invariant wall(lt.astimezone(utc)) ==
lt won't hold.   The "which" flag should be handled as follows:
lt.astimezone(utc) == lt.astimezone(utc, which=0) and lt.astimezone(utc,
which=1) == t0 + lt - wall(t0+eps).
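
The three cases above can be sketched as a standalone function.  This is
only an illustration under stated assumptions: to_utc() stands in for the
proposed lt.astimezone(utc, which=...), which is not an existing datetime
API; the toy transition table is hypothetical (loosely modeled on US Eastern
time in 2015); and naive datetime objects are used for UTC values.

```python
from datetime import datetime, timedelta

# Hypothetical toy zone: initial offset, then (UTC instant, new offset)
# pairs in chronological order.
INITIAL_OFFSET = timedelta(hours=-5)
TRANSITIONS = [
    (datetime(2015, 3, 8, 7, 0), timedelta(hours=-4)),   # positive jump
    (datetime(2015, 11, 1, 6, 0), timedelta(hours=-5)),  # negative jump
]


def wall(t):
    """Local clock reading at UTC time t."""
    offset = INITIAL_OFFSET
    for start, new_offset in TRANSITIONS:
        if t >= start:
            offset = new_offset
    return t + offset


def to_utc(lt, which=0):
    """Stand-in for the proposed lt.astimezone(utc, which=which)."""
    offsets = {INITIAL_OFFSET} | {off for _, off in TRANSITIONS}
    sols = sorted(t for t in (lt - off for off in offsets) if wall(t) == lt)
    if len(sols) == 1:   # case 1: unique solution, "which" is ignored
        return sols[0]
    if len(sols) == 2:   # case 2: which=0 gives t1, which=1 gives t2
        return sols[which]
    # Case 3: lt falls in the gap opened by a positive discontinuity at t0.
    before = INITIAL_OFFSET
    for t0, after in TRANSITIONS:
        if t0 + before <= lt < t0 + after:
            # which=0: t0 + lt - wall(t0)     == lt - before
            # which=1: t0 + lt - wall(t0+eps) == lt - after
            return lt - (before if which == 0 else after)
        before = after
    raise AssertionError("lt has no solution but lies in no gap")


gap = datetime(2015, 3, 8, 2, 30)
print(to_utc(gap, which=0), to_utc(gap, which=1))
```

For a local time in the gap, which=0 yields the later UTC value and which=1
the earlier one, which is what lets a caller tell an invalid time (t2 < t1)
apart from an ambiguous one (t2 > t1).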

With the proposed features in place, one can use the naive code

t =  lt.astimezone(utc)

and get predictable behavior in all cases and no crashes.

A more sophisticated program can be written like this:

t1 = lt.astimezone(utc, which=0)
t2 = lt.astimezone(utc, which=1)
if t1 == t2:
    t = t1
elif t2 > t1:
    # ask the user to pick between t1 and t2 or raise AmbiguousLocalTimeError
else:
    t = t1
    # warn the user that time was invalid and changed or raise InvalidLocalTimeError