[Datetime-SIG] Timeline arithmetic?

Mon Sep 7 21:38:41 CEST 2015

On 09/07/2015 12:43 PM, Tim Peters wrote:
> [Carl]
>>>> In Model A, an aware datetime (in any timezone) is nothing more than an
>>>> alternate (somewhat complexified for human use) spelling of a Unix
>>>> timestamp, much like a timedelta is just a complexified spelling of some
>>>> number of microseconds.
> 
> [Tim]
>>> A Python datetime is also just a complexified spelling of some number
>>> of microseconds (since the start of 1 January 1 of the proleptic
>>> Gregorian calendar).
> 
> [Carl]
>> Which is a "naive time" concept, which is a pretty good sign that Python
>> datetime wasn't intended to implement Model A. I thought it was already
>> pretty clear that I'd figured that out by now :-)
> 
> So:
> 
> - You tell me that in model A an aware datetime is a spelling of a
> Unix timestamp.
> 
> - I tell you that a Python datetime is a spelling of a different
> flavor of timestamp.
> 
> - You tell me that "means" Python is using a naive time concept, and wasn't
>   intended to implement model A.
> 
> Can you see why I'm baffled?  If it needs to be explained, it's even more
> baffling to me.  So here goes anyway:  Model A uses a very similar
> concept.  Not identical, because:
> 
> - The Unix timestamp takes 1970-1-1 as its epoch, while Python's takes 1-1-1.
>   They nevertheless use exactly the same proleptic calendar system.
> 
> - The Unix timestamp counts seconds, but Python's counts microseconds (on
>   a platform where time_t is a floating type, a Unix timestamp can approximate
>   decimal microseconds too, as fractions of a second).
> 
> - The resolution and range of a Unix timestamp vary across platforms, but Python
>   defines both.

Right, but (as you know) those are all incidental to the actual
distinction I was trying to make.

> Where's a theoretically _significant_ difference?  It's simply not
> true that viewing datetimes as timestamps has anything to do with
> drawing a distinction between your models A and B.

The key difference is that a Unix timestamp defines a single instant in
"real time" (or the UTC approximation of "real time," which is good
enough), because the Unix epoch is defined to be in UTC. The point of
even _having_ representations in other timezones (under Model A) is
never to change that basic "real monotonic time" model; it's solely to
get or parse a representation for the sake of a human (or some other
computer system) living naively in that timezone.
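
To make that concrete, here is a minimal sketch using only the stdlib
datetime module. The fixed offsets and their labels are stand-ins I picked
for illustration, not real zone definitions; the instant is the send time
shown in this message's header.

```python
from datetime import datetime, timedelta, timezone

# The send time of this message, expressed in UTC (21:38:41 CEST).
utc_view = datetime(2015, 9, 7, 19, 38, 41, tzinfo=timezone.utc)

# Fixed offsets standing in for real zones (illustrative assumption).
cest = timezone(timedelta(hours=2), "CEST")
cdt = timezone(timedelta(hours=-5), "CDT")

# Representations for humans living in those zones.
cest_view = utc_view.astimezone(cest)   # wall clock reads 21:38:41
cdt_view = utc_view.astimezone(cdt)     # wall clock reads 14:38:41

# Three spellings, one instant: the Unix timestamp never changes.
unix_ts = utc_view.timestamp()
same_instant = cest_view.timestamp() == unix_ts == cdt_view.timestamp()
```

Under Model A, cest_view and cdt_view carry no information beyond unix_ts
except which zone to render it in.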

A Python datetime "timestamp," on the other hand, is "naive" or
"timezone-relative." It doesn't define a single instant in real time
until you pair it with an offset. The timestamp itself is
timezone-relative (it's "the number of microseconds since datetime(1, 1,
1) in naive local time in whatever timezone we're currently in"). That's
why doing integer arithmetic on this kind of timestamp does classic
arithmetic instead of timeline arithmetic. That's a Model B
understanding of what a non-UTC aware datetime represents.
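
A rough sketch of what I mean by "timezone-relative": naive_microseconds
below is a hypothetical helper of my own (datetime exposes no such method),
and the two fixed offsets are arbitrary. The same microsecond count shows up
for two datetimes that are seven hours apart in real time.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1, 1, 1)  # naive start of the proleptic Gregorian calendar

def naive_microseconds(dt):
    """Hypothetical helper: microseconds since datetime(1, 1, 1), ignoring tzinfo."""
    return (dt.replace(tzinfo=None) - EPOCH) // timedelta(microseconds=1)

plus2 = timezone(timedelta(hours=2))
minus5 = timezone(timedelta(hours=-5))

# Same wall-clock reading, different offsets.
a = datetime(2015, 9, 7, 21, 38, 41, tzinfo=plus2)
b = datetime(2015, 9, 7, 21, 38, 41, tzinfo=minus5)

# The "timestamp" part is identical...
same_count = naive_microseconds(a) == naive_microseconds(b)

# ...but once the offsets are taken into account, the instants differ.
gap = b - a   # subtraction across different tzinfo objects goes through UTC
```

Here same_count is True while gap is seven hours, so the count alone can't
be the instant.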

> An implementation of model A may or may not explicitly store the Unix
> timestamp it has in mind.  From your statement that under model A it's
> a "complexified" spelling of a Unix timestamp, I have to assume you
> have in mind implementations where it's not explicitly stored.  In
> which case it's exactly the same as in Python today:  to _find_ that
> Unix timestamp, you need to convert your complexified spelling to UTC
> first.

I intentionally didn't specify any implementation. In outlining the
difference between Model A and Model B, I'm not concerned about
implementation details; I'm concerned about the mental model of what an
"aware datetime" represents (and thus what invariants you can expect it
to keep once you grasp the model). I think Model A and Model B do
represent clear alternative mental models in that respect (regardless of
how they are implemented, and what e.g. speed/size tradeoffs that may
involve).

In Model A, an aware datetime is always a single unambiguous instant in
time (that is, isomorphic to UTC), and that alone tells you a lot about
how to expect it to behave in terms of arithmetic, equality, etc. (or
even in "being stored across a zoneinfo update").

In Model B, an aware datetime is a local-clock time annotated with a
timezone, and that gives you a different set of consistent expectations
about how it should behave.

> Perhaps the distinction you have in mind is that, under Model A, it's
> impossible to think of an aware datetime as being anything _other_
> than a Unix timestamp? 

Yes, that's basically right. If you're working in Model A and you want
to work in "local clock time", you strip off the timezone information
and use an object representing simple naive clock time, with no timezone
awareness at all.
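
In today's datetime, the nearest spelling of that "strip it off" step is
probably replace(tzinfo=None). This is only a sketch of the idea, with an
arbitrary fixed offset, not a claim about how a Model A library would
expose it.

```python
from datetime import datetime, timedelta, timezone

aware = datetime(2015, 9, 7, 21, 38, 41, tzinfo=timezone(timedelta(hours=2)))

# Drop the zone to get a plain wall-clock value.
wall = aware.replace(tzinfo=None)

# Arithmetic on it is unambiguously clock arithmetic.
next_day_same_time = wall + timedelta(days=1)

# The two kinds of value stay apart: ordering a naive datetime against an
# aware one is refused outright.
try:
    wall < aware
    mixing_allowed = True
except TypeError:
    mixing_allowed = False
```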

> That may have been what your "nothing more"
> meant.  Then, yes, there is that difference:  Python doesn't intend to
> force any specific interpretation of its timestamps beyond that
> they're instants in the proleptic Gregorian calendar. 

According to my use of the term (which I borrowed from Joda-Time/NodaTime)
datetime's "timestamps" aren't really "instants" at all, in the sense
that they don't (alone) tell you when something occurred in the real
world (which is another way of saying that they don't map isomorphically
to UTC, or any other monotonic representation of time).  They represent
a point in the (abstract) proleptic Gregorian calendar, which only
represents an instant once paired with a UTC offset.

> Model A also
> views them as instants in the proleptic Gregorian calendar, but tacks
> on "and that calendar must always be viewed as being in (a proleptic
> extension of an approximation to) UTC".

I think I understand what you mean here. I would say that both Model A
and Model B have an equally opinionated interpretation of what an aware
datetime represents, but it's true that Model A's interpretation
requires it to carry enough information (in some form) to always be
isomorphic to UTC, whereas Model B doesn't require it to carry that much
information.

What Python actually _does_ is a bit more muddled, as we've both said
many times, because sometimes it acts like Model B (intra-zone) and
sometimes like Model A (inter-zone). I think that's unfortunate, because
it results in arithmetic and ordering inconsistencies, and headaches
like the ones you're having with PEP 495.
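
Here's a small sketch of that inconsistency. ToyEastern is a made-up tzinfo
(roughly US Eastern, with the DST rule hard-coded for 2015 only) so the
example doesn't depend on any zone database; it is an assumption for
illustration, not a real zone implementation.

```python
from datetime import datetime, timedelta, timezone, tzinfo

class ToyEastern(tzinfo):
    """Hypothetical zone: UTC-5, with UTC-4 daylight time during part of 2015."""
    DST_START = datetime(2015, 3, 8, 2)   # wall-clock bounds of daylight time
    DST_END = datetime(2015, 11, 1, 2)

    def dst(self, dt):
        wall = dt.replace(tzinfo=None)
        in_dst = self.DST_START <= wall < self.DST_END
        return timedelta(hours=1) if in_dst else timedelta(0)

    def utcoffset(self, dt):
        return timedelta(hours=-5) + self.dst(dt)

    def tzname(self, dt):
        return "EDT" if self.dst(dt) else "EST"

eastern = ToyEastern()
start = datetime(2015, 3, 7, 12, 0, tzinfo=eastern)   # noon, day before the switch

# Intra-zone (acts like Model B): classic arithmetic on the wall clock.
classic = start + timedelta(days=1)          # noon again on March 8
intra_zone_gap = classic - start             # same tzinfo object: offsets ignored

# Inter-zone (acts like Model A): a different tzinfo forces a trip through UTC.
inter_zone_gap = classic - start.astimezone(timezone.utc)

# Timeline arithmetic done by hand: add in UTC, then convert back.
timeline = (start.astimezone(timezone.utc) + timedelta(days=1)).astimezone(eastern)
```

The same pair of datetimes is 24 hours apart when compared intra-zone and 23
hours apart when one side is first converted to UTC, and adding a day "on
the timeline" lands at 13:00 rather than noon.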

But I've accepted that Python wants _more_ to be Model B than Model A,
so it's best to just discourage use of the "magic" interzone operations
and be consistently Model B everywhere else, rather than finding a way
(like my earlier "strict tzinfo" proposal tried to) to arrive at an
implementation that's consistently Model A.

> So maybe I understand you now after all.  But, if so, are these kinds
> of seeming disagreements really worth resolving?  It requires a
> seemingly unreasonable amount of time & effort to arrive at the
> obvious ;-)

Well, perhaps all of this was always obvious to you, in which case I do
apologize for wasting so much of your time! But it _seemed_ to me that
we had proponents of both Model A and Model B in this mailing list,
almost entirely talking past each other, and that trying to outline how
each one is a consistent and usable model on its own terms might help
proponents of both to at least understand the other better.

It helped me understand the benefits of Model B, anyway. I'm curious if
it made any sense to Chris, if he's still following this thread. I'm
still hopeful of leveraging that understanding into something useful for
the docs. Sorry if it didn't help you :/ I certainly don't want to keep
wasting your time, so I'm happy to leave it here. Thanks for the
discussion; it's been useful to me, and I appreciate your time.

Carl
