[Datetime-SIG] Calendar vs timespan calculations...

Chris Barker chris.barker at noaa.gov
Tue Jul 28 22:55:02 CEST 2015


Moved from python-dev to the datetime SIG

TL;DR:

There are two types of operations on datetimes: adding and subtracting
actual time units (multiples of seconds), and calendar operations: things
like "this time the next day".

I think the datetime module should only support the former -- i.e. the
same old timedelta we have. In that context, storing datetimes in UTC,
doing all operations there, and converting back makes the most sense.

But calendar manipulations need to be done in a time-zone aware
way.

If you're really interested, I got quite carried away below....


On Mon, Jul 27, 2015 at 10:03 PM, Lennart Regebro <regebro at gmail.com> wrote:

> On Tue, Jul 28, 2015 at 12:03 AM, Tim Peters <tim.peters at gmail.com> wrote:
> > timedelta objects only store days, seconds, and microseconds,
>
> Except that they don't actually store days. They store 24 hour
> periods, which, because of timezones changing, is not the same thing.
>

yes, it is ;-)

> And as you have repeated
> many times now, the datetime module's arithmetic is "naive" ie, it
> assumes that one day is always 24 hours. The problem with that
> assumption is that it isn't true.
>

it's not an assumption, it's a definition.

We have a serious semantic problem here -- and I _think_ it's the source of
almost all the discussion in this thread -- I'm not sure what to do about
it though.

Tim understands this stuff, Lennart understands this stuff, I'm pretty sure
I understand this stuff -- I'm not sure about everyone else on this
discussion, but they probably all understand it, too -- the only thing any
of us doesn't understand is what the heck anyone is talking about!

Maybe use Tim's approach of "naive" -- so a naive day is exactly 24 hours,
period, end of story -- this is the definition.

But what do we call what I've been trying to refer to as "calendar"
operations?

A summary for (maybe) some clarity:

We have this nifty model of a continuous time axis -- moving along at a
steady rate from the beginning of the Universe to the end of the Universe.
Modulo relativity, it works pretty well.

Then we have units of time spans: in SI units it's seconds, then a bunch of
other units that are clearly defined in terms of seconds: minutes (60 s),
hours (60 min), days (24 hrs). And of course milliseconds and microseconds.

Then we have Calendars: this is the year, month, day, etc. we are all
familiar with (and the hours in that day: 4:00 pm, etc.) -- what is
confusing here is that we use the same words, "day" and "hour", both as
part of a calendar description AND as timespan units -- but they are NOT
the same thing (yes, they are related).

Calendars are how we map a nice human-understandable (and historically
based) name onto the theoretical time axis. Being both human-oriented and
loaded with historical baggage, calendar naming is designed to fit more
or less with the relationship between the earth and the sun. So we want the
solstices to fall around the same date every year, for instance, and we
want 1200 hrs to be around the middle of the day. This is why it all gets
ugly, because the various celestial phenomena aren't nice integer
multiples of each other (hence the need for leap years) or even constant
(hence the need for leap seconds).

And, of course, the earth is round, and the sun's relative position to each
point on earth is different, hence the need for time zones (then add
political differences for it to get really ugly).

So: I think there are more or less two types of manipulations one might
need:

What is currently supported by the datetime module, and what I think Tim is
referring to as "naive" time operations: adding and subtracting units of
time along the theoretical time axis. This is actually really simple math --
as Tim points out, all the timedelta object really is is a fancy integer.
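
To make the "fancy integer" point concrete, here is a minimal sketch using
only the standard library. The variable names are made up for illustration;
the point is just that every timedelta normalizes to one exact count of
microseconds, and a "day" in this sense is 24 hours by definition.

```python
from datetime import timedelta

# A timedelta normalizes everything to (days, seconds, microseconds).
one_day = timedelta(days=1)
assert one_day == timedelta(hours=24) == timedelta(seconds=86_400)

span = timedelta(hours=49, minutes=30)
print(span.days, span.seconds, span.microseconds)  # 2 5400 0

# Dividing by the smallest unit gives the single integer behind it.
as_microseconds = span // timedelta(microseconds=1)
print(as_microseconds)  # 178200000000
```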

Then there are what I am calling "calendar" operations -- these are
operations that only make sense with a calendar (and, in fact, a timezone, I
think). These are operations like: "the same time two days later" -- this is
not the same as moving two days (48 hrs) along the time axis -- it simply
is not. It is a shame that we use the name "day" to refer to both 24 hours
along the time axis and the enumeration of sunrises in a month -- or, the
thing we use on calendars.
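
A rough sketch of the difference, assuming a wall clock in US Eastern time
around the 2015-03-08 switch to daylight time. The two fixed offsets are
hard-coded by hand for this one example (a real program would get them from
a time zone database), and the "calendar" result is simply written out
rather than computed, since computing it is exactly the part the datetime
module does not do for you.

```python
from datetime import datetime, timedelta, timezone

# Hard-coded offsets, for illustration only: US Eastern time moved from
# EST (UTC-5) to EDT (UTC-4) on 2015-03-08.
EST = timezone(timedelta(hours=-5), "EST")
EDT = timezone(timedelta(hours=-4), "EDT")

start = datetime(2015, 3, 7, 12, 0, tzinfo=EST)

# Time-axis operation: exactly 48 hours later, shown on the post-switch clock.
plus_48h = (start + timedelta(hours=48)).astimezone(EDT)

# Calendar operation: the same wall-clock time two days later.
same_time_two_days_later = datetime(2015, 3, 9, 12, 0, tzinfo=EDT)

print(plus_48h.isoformat())              # 2015-03-09T13:00:00-04:00
print(same_time_two_days_later - start)  # 1 day, 23:00:00
```

So "48 hours later" lands at 13:00 on the wall clock, while "the same time
two days later" is only 47 hours down the time axis.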

There has been a lot of chatter about "tomorrow" or "adding three days", as
these bring up the ugly DST issues, but once you add one calendaring
operation, people will want (and they should) more: next month, next year,
or even uglier, next business day, etc. These are very useful things, but I
argue they belong, as a unit, in a separate package -- maybe for potential
inclusion in the standard library, but I don't think that's on the table
now. (and doesn't dateutil support many of these?)
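
As a taste of why these want their own package, here is a hedged sketch of
"next month". The helper name add_months and the clamp-to-end-of-month rule
are my own assumptions for illustration; other policies (raise an error,
roll over into the following month) are just as defensible, which is rather
the point.

```python
import calendar
from datetime import date

def add_months(d: date, months: int) -> date:
    """Hypothetical helper: same day-of-month `months` later, clamped to month end."""
    # Count months from year 0 so that year rollover is plain integer math.
    month_index = d.year * 12 + (d.month - 1) + months
    year, month0 = divmod(month_index, 12)
    last_day = calendar.monthrange(year, month0 + 1)[1]
    return d.replace(year=year, month=month0 + 1, day=min(d.day, last_day))

print(add_months(date(2015, 1, 31), 1))   # 2015-02-28
print(add_months(date(2015, 12, 15), 1))  # 2016-01-15
```

Note that no fixed timedelta could give both of those answers.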


Now on to time zones: the datetime package is useful because it not only
supports time-axis arithmetic (really pretty trivial -- it's just integer
arithmetic), but it supports translating from time-axis units (microseconds
since some epoch) to Calendar units:

year, month, day, hour, minute, seconds, microseconds

(Using the (proleptic?) Gregorian Calendar -- note that different Calendars
are a whole other ball of wax!)
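
A small sketch of that translation, in both directions. The 1970-01-01
epoch and the two function names are assumptions chosen for this example
("some epoch" could be any fixed instant); everything else is plain
standard-library datetime arithmetic on naive values.

```python
from datetime import datetime, timedelta

# Assumed epoch for this sketch; any fixed reference instant would do.
EPOCH = datetime(1970, 1, 1)

def to_calendar(microseconds: int) -> datetime:
    """Time-axis position -> naive calendar fields."""
    return EPOCH + timedelta(microseconds=microseconds)

def to_time_axis(dt: datetime) -> int:
    """Naive calendar fields -> time-axis position (integer microseconds)."""
    return (dt - EPOCH) // timedelta(microseconds=1)

stamp = to_calendar(1_438_116_902_000_000)
print(stamp)  # 2015-07-28 20:55:02
print(stamp.year, stamp.month, stamp.day, stamp.hour, stamp.minute, stamp.second)
print(to_time_axis(stamp))  # 1438116902000000
```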

This is where the magic is (or really where the ugly code is), and where
time zones come in.

UTC time is the calendar time at one longitude on the earth. Ignoring leap
seconds for the moment, it is relatively simple: continuous, no DST,
no changing definitions, etc. It is a useful reference system for this
reason. Math, and all that, is easier in this zone.

By default, Python datetime objects are "naive" -- meaning they know
nothing of timezones or daylight saving -- they can convert back and forth
between the internal representation (time span since an epoch) and human
calendar times (Gregorian, anyway). It turns out that you can use naive
datetimes as UTC time -- there really is no difference, until you want to
convert to a different time zone. With UTC, you may be able to do that;
with naive, you can't unless you specify what time zone the time is in,
and then it's no longer naive ;-)
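
In code, that looks roughly like the sketch below. The CEST object is a
fixed +02:00 offset I made up as a stand-in for Central European Summer
Time; it knows nothing about DST rules, it is only there to show the
conversion step.

```python
from datetime import datetime, timedelta, timezone

naive = datetime(2015, 7, 28, 20, 55, 2)
print(naive.tzinfo)  # None

# Declaring the zone is what turns a naive value into an aware one.
as_utc = naive.replace(tzinfo=timezone.utc)

# Fixed +02:00 offset standing in for Central European Summer Time.
CEST = timezone(timedelta(hours=2), "CEST")
in_cest = as_utc.astimezone(CEST)
print(in_cest.isoformat())  # 2015-07-28T22:55:02+02:00

# Same point on the time axis, different calendar fields.
assert in_cest == as_utc
```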

But how should Python handle time zones?

Given all the ugliness of DST and changing time zones, and all that, UTC is
the lingua franca of time -- time zones are defined by how they are offset
from UTC, and in UTC (as in naive), math is relatively easy. So the best
way to handle time zones is to store and manipulate everything in UTC, and
then convert to/from the calendar representation using the time zone when
the user needs (or provides) it.

I've been trying to figure out what all the confusion and discussion has
been about, and I think it's this:

If you want to do a Calendar operation, like "this time tomorrow", then
THAT is best done in a timezone aware way -- in particular, a DST-aware
way. i.e. from 12:00 one day to 12:00 the next day will generally be 24
hours, but might be 23 or 25 hours if crossing a DST transition.

But if we aren't supporting those operations, we don't need to worry about
that now.

-Chris

-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov