[Python-Dev] TZ-aware local time

Alexander Belopolsky alexander.belopolsky at gmail.com
Thu Jun 14 02:09:23 CEST 2012


On Tue, Jun 12, 2012 at 1:14 AM, Ben Finney <ben+python at benfinney.id.au> wrote:
>> To the contrary, without the POSIX timestamp model to define the
>> equivalency between the same point in time expressed using different
>> timezones, sane comparisons and arithmetic on timestamps would be
>> impossible.
>
> Why is the POSIX timestamp model the only possible model? To the
> contrary, there are many representations with different tradeoffs but
> with the common properties you name (“equivalency between the same point
> in time expressed using different timezones”).

Here is my take on this.  If datetime objects did not support
fractions of a second, the difference between naive datetime and
POSIX timestamp would be just a matter of tradeoff between human
readability and efficient storage and fast calculations.  This is very
similar to a more familiar tradeoff between decimal and binary
representation of numbers.  Binary arithmetic is more efficient in
terms of both storage and processing, so it is common to convert from
decimal to binary on input and convert back on output.  In some
applications (e.g. hand-held calculators), I/O dominates internal
processing, so implementing decimal arithmetic directly is not
uncommon.  The original designers of the datetime module chose the
"decimal" internal format.  Equivalent functionality can be
implemented using a "binary" format, and in fact the popular
mxDateTime library is implemented that way.

(Since we do support fractions of a second, there may be small
differences between calculations performed using datetime types and
float timestamps, but this issue has nothing to do with time zones,
local time or UTC.)
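
A minimal sketch of the kind of small difference meant here, assuming
nothing beyond the standard library: timedelta keeps whole
microseconds, while float seconds are subject to binary rounding.  The
values are picked only for illustration.

```python
from datetime import datetime, timedelta

# datetime/timedelta arithmetic is exact at microsecond resolution.
start = datetime(2012, 6, 14, 0, 0, 0)
exact = start + timedelta(milliseconds=100) + timedelta(milliseconds=200)
exact_matches = (exact - start) == timedelta(milliseconds=300)

# The same two steps in float seconds pick up a binary rounding error.
float_sum = 0.1 + 0.2
float_matches = float_sum == 0.3

print(exact_matches, float_matches)
```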

It is a common misconception that POSIX timestamps are somehow more
closely tied to UTC than broken down time values.  The opposite is
true.  At the end of this month, UTC clocks (e.g.
http://tycho.usno.navy.mil/simpletime.html) will show 2012-06-30
23:59:59, 2012-06-30 23:59:60, 2012-07-01 00:00:00, while the
corresponding POSIX timestamps are 1341100799, 1341100800, 1341100800.
Most POSIX systems will not freeze their clocks for a whole second or
move them back; instead they will run one second ahead until
someone (or something like the NTP daemon) causes the adjustment.
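
The three readings above can be checked with a short sketch.  It
assumes that calendar.timegm does plain field arithmetic and therefore
accepts a seconds value of 60, which is how the CPython implementation
behaves.

```python
import calendar

# (year, month, day, hour, minute, second) as shown by a UTC clock.
utc_readings = [
    (2012, 6, 30, 23, 59, 59),
    (2012, 6, 30, 23, 59, 60),   # the leap second
    (2012, 7, 1, 0, 0, 0),
]

# The leap second and the following midnight pack to the same integer.
stamps = [calendar.timegm(fields) for fields in utc_readings]
print(stamps)
```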

In an earlier message, Guido criticized the practice of converting
local broken down time to an integer using the POSIX algorithm for
calculations.  While keeping time in the local timezone is often not
the best choice, there is nothing wrong with implementing local time
arithmetic using timegm/gmtime conversion to integers.  In fact, the
results will be exactly the same as if the calculations were performed
using the datetime module with tzinfo=LocalTimezone.
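
A hedged sketch of that equivalence follows.  It uses a naive datetime
as a stand-in for tzinfo=LocalTimezone, on the assumption that adding
a timedelta to a datetime never consults tzinfo; the wall-clock
reading and the delta are made up for illustration.

```python
import calendar
import time
from datetime import datetime, timedelta

local_fields = (2012, 6, 14, 2, 9, 23)   # a local wall-clock reading
delta_seconds = 36 * 3600 + 15           # add 36 hours and 15 seconds

# "Binary" route: pack with timegm, add, unpack with gmtime.
packed = calendar.timegm(local_fields) + delta_seconds
via_integer = tuple(time.gmtime(packed)[:6])

# "Decimal" route: datetime arithmetic on the same fields.
via_datetime = datetime(*local_fields) + timedelta(seconds=delta_seconds)
via_datetime_fields = tuple(via_datetime.timetuple()[:6])

print(via_integer, via_datetime_fields)
```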

I think many users are confused by references to "Seconds Since the
Epoch" and think that time_t should contain the actual number of SI
seconds elapsed since a world-wide radio broadcast of
"1970-01-01T00:00:00+0000".  First of all, there was no such broadcast
in 1970.  Second, the instant known as UTC 1970-01-01T00:00:00+0000
lies 35 or so seconds further in the past than time.time() seconds ago.

The POSIX standard gets away with hiding the true nature of UTC by
defining the integral timestamp as "a value that *approximates* the
number of seconds that have elapsed since the Epoch."
<http://pubs.opengroup.org/onlinepubs/009604599/basedefs/xbd_chap04.html#tag_04_14>
The choice of approximation is specified by an explicit formula:

tm_sec + tm_min*60 + tm_hour*3600 + tm_yday*86400 +
    (tm_year-70)*31536000 + ((tm_year-69)/4)*86400 -
    ((tm_year-1)/100)*86400 + ((tm_year+299)/400)*86400
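
For readers who want to try it, here is a direct Python transcription
of the formula.  posix_seconds is a name made up for this sketch; it
adjusts for the fact that Python's struct_time uses full years and a
1-based tm_yday, and it is only meant for years from 1970 on.

```python
import calendar
import time

def posix_seconds(tm):
    """Evaluate the POSIX formula on a struct_time (years >= 1970 only)."""
    tm_year = tm.tm_year - 1900   # C's tm_year counts from 1900
    tm_yday = tm.tm_yday - 1      # C's tm_yday is 0-based, Python's is 1-based
    return (tm.tm_sec + tm.tm_min * 60 + tm.tm_hour * 3600 + tm_yday * 86400
            + (tm_year - 70) * 31536000 + ((tm_year - 69) // 4) * 86400
            - ((tm_year - 1) // 100) * 86400
            + ((tm_year + 299) // 400) * 86400)

sample = time.gmtime(1341100800)   # 2012-07-01 00:00:00
packed = posix_seconds(sample)
print(packed, packed == calendar.timegm(sample))
```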

This formula can be used to pack any broken down time value in an
integer.  As long as one does not care about leap seconds, any broken
down time value can be converted to an integer and back using this
formula.  If the broken down value was in local time, the values
stored in an integer will represent local time.  If the broken down
value was in UTC, the values stored in an integer will represent UTC.
Either value will "approximate the number of seconds that have elapsed
since the Epoch" given the right definition of the Epoch and
sufficient tolerance for the approximation.
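
As a small illustration, the sketch below packs the same instant twice,
once from UTC fields and once from local fields.  The UTC+2 offset is
an assumption chosen for the example; the packing step itself knows
nothing about time zones.

```python
import calendar
import time

utc_fields = (2012, 6, 14, 0, 9, 23)
local_fields = (2012, 6, 14, 2, 9, 23)   # same instant, assuming UTC+2

packed_utc = calendar.timegm(utc_fields)
packed_local = calendar.timegm(local_fields)

# Unpacking returns whatever kind of fields went in.
roundtrip_utc = tuple(time.gmtime(packed_utc)[:6])
roundtrip_local = tuple(time.gmtime(packed_local)[:6])

# The two integers differ by exactly the assumed offset.
offset_seconds = packed_local - packed_utc
print(roundtrip_utc, roundtrip_local, offset_seconds)
```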

The bottom line: POSIX timestamps and datetime objects (interpreted
as UTC) implement the same timescale.  The only difference is that
POSIX timestamps can be stored using fewer bytes at the expense of not
having an obvious meaning.  While conversion from UTC to local time
may require using POSIX timestamps internally, there is no need to
expose them in the datetime module interfaces.

