From tim.peters at gmail.com Tue Sep 1 01:55:10 2015 From: tim.peters at gmail.com (Tim Peters) Date: Mon, 31 Aug 2015 18:55:10 -0500 Subject: [Datetime-SIG] Another round on error-checking In-Reply-To: References: Message-ID: [Alex] >> After some thought, I believe the way to fix the implementation is what I >> suggested at first: reset fold to 0 before calling utcoffset() in __hash__. >> A rare hash collision is a small price to pay for having datetimes with >> different timezones in the same dictionary. [Tim] > Ya, I can live with that. In effect, we give up on converting to UTC > correctly for purposes of computing hash(), but only in rare cases. > hash() doesn't really care, and it remains true that datetime equality > (which does care) still implies hash equality. The later and earlier > of ambiguous times will simply land on the same hash chain. Nope, you wore me out prematurely ;-) Consider datetimes dt1 and dt2 representing the earlier & later of an ambiguous time in their common zone (whatever it may be - doesn't matter). Then all fields are identical except for `fold`. Assume __hash__ forces `fold` to 0 before obtaining the UTC offset. Then we have: dt1 == dt2 hash(dt1) == hash(dt2) Fine so far as it goes. Now do: u1 = dt1.astimezone(timezone.utc) u2 = dt2.astimezone(timezone.utc) At this point we have: u1 == dt1 == dt2 == u2 and u1 < u2 hash(dt1) == hash(dt2) == hash(u1) (Parenthetically, note that despite the chain of equalities in the first of those lines, we do _not_ have u1 == u2 - transitivity fails, which is a bit of a wart by itself.) Since u1 == dt1, and hash(u1) == hash(dt1), no problem there either. But u1 isn't at all the same as u2, so hash(u2) can be the same as hash(u1) only by (unlikely) accident. hash(u2) is off in a world of its own. Therefore hash(dt2) can be the same as hash(u2) only by (the same unlikely) accident, despite that dt2 == u2. So, in all, __hash__ forcing fold=0 at the start hides the problem for ambiguous times in the same zone, but doesn't really touch the problem for cross-zone equivalent spellings of such times (not even if one of the zones is UTC, which is likely the most important case). One way to fix that is to have datetime.__hash__() _always_ return, say, 0 ;-) From alexander.belopolsky at gmail.com Tue Sep 1 02:16:29 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Mon, 31 Aug 2015 20:16:29 -0400 Subject: [Datetime-SIG] Another round on error-checking In-Reply-To: References: Message-ID: On Mon, Aug 31, 2015 at 7:55 PM, Tim Peters wrote: > > [Alex] > >> After some thought, I believe the way to fix the implementation is what I > >> suggested at first: reset fold to 0 before calling utcoffset() in __hash__. > >> A rare hash collision is a small price to pay for having datetimes with > >> different timezones in the same dictionary. > > [Tim] > > Ya, I can live with that. In effect, we give up on converting to UTC > > correctly for purposes of computing hash(), but only in rare cases. > > hash() doesn't really care, and it remains true that datetime equality > > (which does care) still implies hash equality. The later and earlier > > of ambiguous times will simply land on the same hash chain. > > Nope, you wore me out prematurely ;-) > It's getting late in my TZ, but what you are saying below sounds like a complaint that if you put t=second 01:30 as a key in the dictionary, you cannot later retrieve it by looking up t.astimezone(timezone.utc). 
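Tim's dt2/u2 pair above is exactly the "equal but differently hashed" situation that breaks dictionaries. A minimal, self-contained sketch of why (the Stamp class below is hypothetical, not datetime code - it just mimics "equality ignores fold, hashing doesn't"):

    class Stamp:
        """Hypothetical stand-in for an ambiguous local time."""
        def __init__(self, minute, fold):
            self.minute = minute
            self.fold = fold

        def __eq__(self, other):
            return self.minute == other.minute       # equality ignores fold ...

        def __hash__(self):
            return hash((self.minute, self.fold))    # ... but hashing does not

    earlier, later = Stamp(30, fold=0), Stamp(30, fold=1)
    print(earlier == later)              # True
    print(hash(earlier) == hash(later))  # almost certainly False

With the invariant broken, whether `later in {earlier: "x"}` finds the key depends on which hash bucket the probe happens to start in, so dict and set lookups can silently fail.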
Sorry, but PEP 495 has never promised you that: "instances that differ only by the value of fold will compare as equal. Applications that need to differentiate between such instances should check the value of fold or convert them to a timezone that does not have ambiguous times." Maybe if we decide to do something with the arithmetic, we will be able to fix this wart as well. > > Consider datetimes dt1 and dt2 representing the earlier & later of an > ambiguous time in their common zone (whatever it may be - doesn't > matter). Then all fields are identical except for `fold`. Assume > __hash__ forces `fold` to 0 before obtaining the UTC offset. Then we > have: > > dt1 == dt2 > hash(dt1) == hash(dt2) > > Fine so far as it goes. Now do: > > u1 = dt1.astimezone(timezone.utc) > u2 = dt2.astimezone(timezone.utc) > > At this point we have: > > u1 == dt1 == dt2 == u2 and u1 < u2 > hash(dt1) == hash(dt2) == hash(u1) > > (Parenthetically, note that despite the chain of equalities in the > first of those lines, we do _not_ have u1 == u2 - transitivity fails, > which is a bit of a wart by itself.) > > Since u1 == dt1, and hash(u1) == hash(dt1), no problem there either. > > But u1 isn't at all the same as u2, so hash(u2) can be the same as > hash(u1) only by (unlikely) accident. hash(u2) is off in a world of > its own. Therefore hash(dt2) can be the same as hash(u2) only by (the > same unlikely) accident, despite that dt2 == u2. > > So, in all, __hash__ forcing fold=0 at the start hides the problem for > ambiguous times in the same zone, but doesn't really touch the problem > for cross-zone equivalent spellings of such times (not even if one of > the zones is UTC, which is likely the most important case). > > One way to fix that is to have datetime.__hash__() _always_ return, say, 0 ;-) -------------- next part -------------- An HTML attachment was scrubbed... URL: From alexander.belopolsky at gmail.com Tue Sep 1 02:25:36 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Mon, 31 Aug 2015 20:25:36 -0400 Subject: [Datetime-SIG] Another round on error-checking In-Reply-To: References: Message-ID: On Mon, Aug 31, 2015 at 8:16 PM, Alexander Belopolsky < alexander.belopolsky at gmail.com> wrote: > Sorry, but PEP 495 has never promised you that: "instances that differ > only by the value of fold will compare as equal. Applications that need to > differentiate between such instances should check the value of fold or > convert them to a timezone that does not have ambiguous times." When I was writing some early drafts, I thought about advising users to use (local_datetime, local_datetime.fold) pairs as dictionary keys, but decided not to because using local_datetime.astimezone(timezone.utc) is a much better option. Now I think such advise may be relevant if a user truly has a need to sort out timestamps that come from many different timezones and for some reason wants to avoid conversion to UTC, but I don't think it will belong to the PEP. -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.peters at gmail.com Tue Sep 1 03:01:18 2015 From: tim.peters at gmail.com (Tim Peters) Date: Mon, 31 Aug 2015 20:01:18 -0500 Subject: [Datetime-SIG] Another round on error-checking In-Reply-To: References: Message-ID: [Alex] > It's getting late in my TZ, but what you are saying below sounds like a > complaint that if you put t=second 01:30 as a key in the dictionary, you > cannot later retrieve it by looking up t.astimezone(timezone.utc). I don't grasp that. 
What I am saying should be very clear: under all schemes so far, there are datetimes x and y such that x == y but hash(x) != hash(y). You see that or you don't. If you don't, I'll keep trying until you do ;-) So do you see that? > Sorry, but PEP 495 has never promised you that: "instances that differ > only by the value of fold will compare as equal. Applications that need to > differentiate between such instances should check the value of fold or > convert them to a timezone that does not have ambiguous times." Oh, come on. That's in the "Temporal Arithmetic" section: > There isn't a single instance of any kind of arithmetic in the example I gave, except for comparison, where I assumed only that comparison would behave the way the PEP _said_ it behaves. I'm not fighting the PEP here - I'm trying to illustrate a _consequence_ of what the PEP says. It's simply impossible to deduce from the paragraph above.that the fundamental invariant required for dict key types may fail. Here from the __hash__ docs: https://docs.python.org/3/reference/datamodel.html#object.__hash__ ... The only required property is that objects which compare equal have the same hash value; It's a violation of __hash__'s _only_ requirement, so even if there's no intent to fix it, the PEP needs to spell that out clearly. Code slinging dicts can fail in bizarre ways when the invariant is violated. > Maybe if we decide to do something with the arithmetic, we will be able to > fix this wart as well. Doubt it - this has nothing to do with arithmetic I can see. It's a consequence of wanting to ignore `fold` in contexts where it really does make a difference. __hash__() is one such place. Like I said at the start, it's a puzzle. From ethan at stoneleaf.us Tue Sep 1 03:12:15 2015 From: ethan at stoneleaf.us (Ethan Furman) Date: Mon, 31 Aug 2015 18:12:15 -0700 Subject: [Datetime-SIG] Another round on error-checking In-Reply-To: References: Message-ID: <55E4FB6F.4080504@stoneleaf.us> On 08/31/2015 04:55 PM, Tim Peters wrote: [...] > At this point we have: > > u1 == dt1 == dt2 == u2 and u1 < u2 > hash(dt1) == hash(dt2) == hash(u1) > > (Parenthetically, note that despite the chain of equalities in the > first of those lines, we do _not_ have u1 == u2 - transitivity fails, > which is a bit of a wart by itself.) At this point are there any other cases in the stdlib where transitivity fails? I was under the impression that such cases are to be considered bugs. I know it was a driving concern in the implementation of the enum module. -- ~Ethan~ From tim.peters at gmail.com Tue Sep 1 04:59:22 2015 From: tim.peters at gmail.com (Tim Peters) Date: Mon, 31 Aug 2015 21:59:22 -0500 Subject: [Datetime-SIG] Another round on error-checking In-Reply-To: <55E4FB6F.4080504@stoneleaf.us> References: <55E4FB6F.4080504@stoneleaf.us> Message-ID: [Tim] >> [...] >> At this point we have: >> >> u1 == dt1 == dt2 == u2 and u1 < u2 >> hash(dt1) == hash(dt2) == hash(u1) >> >> (Parenthetically, note that despite the chain of equalities in the >> first of those lines, we do _not_ have u1 == u2 - transitivity fails, >> which is a bit of a wart by itself.) [Ethan Furman ] > At this point are there any other cases in the stdlib where transitivity > fails? I don't know. Python grew more features than I needed some time ago, so I'm not up to date. Did we ever implement the long-awaited RockScissorsPaper type? ;-) If not, there are none that I know of. > I was under the impression that such cases are to be considered > bugs. 
I know it was a driving concern in the implementation of the enum > module. Sorry, I just had to laugh at the notion that enums _could_ be implemented in such a convoluted way that there'd ever be the slightest possibility of transitivity failing ;-) Anyway, sure, they're considered bugs, unless there's some darned good reason for it. In this case, I'm not entirely sure. Having comparison ignore `fold` seems aimed at backward compatibility - but it's another case where a non-zero fold can't appear unless a user forces it to, until 495-compliant tzinfos appear (in which case .fromutc() may create fold=1 by magic). When that happens, it will seem strange that fold is ignored by comparisons. At a higher level, I'd say that a datetime with fold=1 is veritably _screaming_ "I'm no longer following the naive time model". But there are consequences too from following that intuition ... From rosuav at gmail.com Tue Sep 1 06:03:46 2015 From: rosuav at gmail.com (Chris Angelico) Date: Tue, 1 Sep 2015 14:03:46 +1000 Subject: [Datetime-SIG] Another round on error-checking In-Reply-To: References: <55E4FB6F.4080504@stoneleaf.us> Message-ID: On Tue, Sep 1, 2015 at 12:59 PM, Tim Peters wrote: >> I was under the impression that such cases are to be considered >> bugs. I know it was a driving concern in the implementation of the enum >> module. > > Sorry, I just had to laugh at the notion that enums _could_ be > implemented in such a convoluted way that there'd ever be the > slightest possibility of transitivity failing ;-) >

Easy: you just declare that different enumerations are not comparable, but that all are comparable to their base type.

class Color(IntEnum):
    RED=1
    GREEN=2
    BLUE=3

class Permission(IntEnum):
    READ=1
    WRITE=2
    EXECUTE=3

What should Color.RED==Permission.READ give? True, because they're both 1? False, because they're completely different things? TypeError, because you can't logically compare colors and permissions? If you allow that Color.RED==1 (which you need to if it's going to be possible to backwardly-compatibly change raw numbers into an enum), then transitivity demands that the otherwise-illogical comparison above succeed, and be True. As a design decision, it could viably be taken either way, but once taken, it has to be maintained forever. ChrisA From tim.peters at gmail.com Tue Sep 1 07:23:46 2015 From: tim.peters at gmail.com (Tim Peters) Date: Tue, 1 Sep 2015 00:23:46 -0500 Subject: [Datetime-SIG] Another round on error-checking In-Reply-To: References: <55E49D28.8030204@oddbird.net> <55E49F27.9000906@oddbird.net> Message-ID: [Alex] > I think the main difference between Tim's current proposal and what was > previously discussed is that all older proposals somehow required a third > value for fold. Note that there is a third variant suggested by Guido > off-list and discussed in the PEP: have fold=-1 by default, ignore it > unless it is nonnegative and design whatever you want for fold=0/1 without > concerns for backward compatibility. This effectively will give two > different datetime classes: classic and new. Both are perfectly consistent, > but if you think interoperation between naive and aware is confusing, try to > explain how new naive instances will interoperate with classic aware! It's worth some thought. I don't think interoperation between naive and aware now is confusing at all.
It's usually just plain forbidden; e.g.,

>>> import datetime
>>> x = datetime.datetime.now() # naive
>>> y = x.replace(tzinfo=datetime.timezone.utc) # aware
>>> x < y
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
    x < y
TypeError: can't compare offset-naive and offset-aware datetimes
>>> x - y
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
    x - y
TypeError: can't subtract offset-naive and offset-aware datetimes
>>> x == y
False

Only that last one may be surprising, but it's really just another way of saying "naive and aware don't mix, period". Do you have other kinds of interoperation in mind? Presumably a similarly high wall would be erected between fold < 0 and fold >= 0 instances. If this were pursued then, e.g., the seemingly intractable problem with __hash__() would go away (no more reason to _try_ to ignore fold >= 0), and, e.g., for an aware dt then dt.replace(fold=1) - dt.replace(fold=0) could return the expected result when dt specified an ambiguous time (ditto: no more reason to try to ignore fold==1), and likewise for comparing those values. I can see one kind of annoyance that would remain:

    dt2 = dt1 + a_timedelta

is currently specified to force dt2.fold==0 even if dt1.fold==1. But that may not make good sense. There's no way to know whether adding `a_timedelta` takes dt1 out of a fold without doing timeline arithmetic. The conceptual mess in my head is that "fold=1" screams "I'm no longer in naive time", but "fold=0" does not (where "in naive time" means classic arithmetic is appropriate, and "not in naive time" means timeline arithmetic is appropriate - while fold < 0 would be an explicit way to say "in naive time", it's unclear that "fold >= 0" should always mean "not in naive time", despite that fold=1 makes no sense in naive time). At least it's all clear now ;-) From alexander.belopolsky at gmail.com Tue Sep 1 16:34:45 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Tue, 1 Sep 2015 10:34:45 -0400 Subject: [Datetime-SIG] Another round on error-checking In-Reply-To: References: <55E49D28.8030204@oddbird.net> <55E49F27.9000906@oddbird.net> Message-ID: On Tue, Sep 1, 2015 at 1:23 AM, Tim Peters wrote: > >>> x == y > False > > Only that last one may be surprising, but it's really just another way > of saying "naive and aware don't mix, period". > This is a relatively recent feature. [1, 2, 3] Changed in version 3.3. [1]: http://mail.python.org/pipermail/python-dev/2012-June/119933.html [2]: http://bugs.python.org/issue15006 [3]: https://hg.python.org/cpython/rev/8272699973cb -------------- next part -------------- An HTML attachment was scrubbed... URL: From alexander.belopolsky at gmail.com Tue Sep 1 16:41:28 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Tue, 1 Sep 2015 10:41:28 -0400 Subject: [Datetime-SIG] Another round on error-checking In-Reply-To: References: <55E49D28.8030204@oddbird.net> <55E49F27.9000906@oddbird.net> Message-ID: On Tue, Sep 1, 2015 at 1:23 AM, Tim Peters wrote: > I can see one kind of annoyance that would remain: > > dt2 = dt1 + a_timedelta > > is currently specified to force dt2.fold==0 even if dt1.fold==1. But > that may not make good sense. > Note that dt2.fold==0 even if dt1.fold==1 *and* a_timedelta==timedelta(0). This is what I call "fold-unaware" arithmetic. It is consistent with dt2==dt1. -------------- next part -------------- An HTML attachment was scrubbed...
URL: From alexander.belopolsky at gmail.com Tue Sep 1 17:59:15 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Tue, 1 Sep 2015 11:59:15 -0400 Subject: [Datetime-SIG] Another round on error-checking In-Reply-To: References: Message-ID: On Mon, Aug 31, 2015 at 9:01 PM, Tim Peters wrote: > [Alex] > > It's getting late in my TZ, but what you are saying below sounds like a > > complaint that if you put t=second 01:30 as a key in the dictionary, you > > cannot later retrieve it by looking up t.astimezone(timezone.utc). > > I don't grasp that. What I am saying should be very clear: under all > schemes so far, there are datetimes x and y such that x == y but > hash(x) != hash(y). You see that or you don't. If you don't, I'll > keep trying until you do ;-) So do you see that? > > I do (in the morning.) > .. > [Alex] > > > Maybe if we decide to do something with the arithmetic, we will be able > to > > fix this wart as well. > [Tim] > Doubt it - this has nothing to do with arithmetic I can see. It's a > consequence of wanting to ignore `fold` in contexts where it really > does make a difference. __hash__() is one such place. > Arithmetic and comparisons are intertwined as long as you require that not(a - b) ? a==b. The main problem for hash as I see it is that x == y may or may not call x.utcoffset() depending on the value of y. This is a problem for hash(x) which should decide whether > > Like I said at the start, it's a puzzle. > Let's formulate the puzzle: Define datetime.__hash__ so that given PEP 495 rules for datetime.__eq__, x == y implies hash(x) == hash(y). First (trivial) observation: a solution exists, e.g. hash(x) == 0. This is not a very practical solution, but shows that the puzzle is not a logical impossibility. Second observation: We cannot improve on hash(x) == 0 without some knowledge of what timezones are known to the system. Proof: let u1 < u2 be two arbitrary UTC times. We can always construct a timezone (FoldZone) where u1 and u2 map to the same local time. All we need to do is to create a fold of size u2 - u1 at some time u between u1 and u2. Let t1 = u1.astimezone(FoldZone) and t2 = u2.astimezone(FoldZone). By construction, t1 == t2, t1.fold = 0 and t2.fold = 1. If x == y implies hash(x) == hash(y), then u1 == t1 implies hash(u1) == hash(t1) and similarly u2 == t2 implies hash(u2) == hash(t2) and t1 == t2 implies hash(t1) == hash(t2). Since hash values are integers and == is transitive for integers, from a chain hash(u1) == hash(t1), hash(t1) == hash(t2), hash(t2) == hash(u2 )we conclude that hash(u1) == hash(u2) and therefore the only solution is hash(x) == const. This sounds discouraging, but note that the FoldZone that we constructed is rather unrealistic because depending on the values of u1 and u2, the size of the fold can range from microseconds to centuries. Third observation: If we have only one variable offset timezone (Local), then we can solve the problem by defining datetime.__hash__(self) as for example, hash(self.astimezone(Local).replace(fold=0) - datetime(1, 1, 1, tzinfo=Local)). Note that in the last expression, hash is taken of a timedelta object, so the definition is not circular. (A proof that x == y implies hash(x) == hash(y) in this case is left as an exercise for the reader.:-) Fourth observation: A solution for one variable offset timezone generalizes to the case of an arbitrary number of such timezones. 
A theoretical construction is to simply iterate x = x.astimezone(Zone).replace(fold=0) over all the zones known to the system, but certainly a more efficient algorithm can be devised to to achieve the same result in a single lookup in a specially crafted table. So the puzzle is not unsolvable, but how much of it has to be solved in PEP 495? I would say not much. I agree with Tim that non-transitivity of == and the violation of the hash invariant need to be mentioned in the PEP. However, since PEP 495 by itself does not introduce any new tzinfo implementations and the existing fixed offset timezones don't suffer from this problem, I think we can leave the final resolution to the timezone.local or the zoneinfo PEP. An important lesson is in the second observation. To solve the hash puzzle, we need to have a global view of the totality of timezones that will be supported by the system. -------------- next part -------------- An HTML attachment was scrubbed... URL: From alexander.belopolsky at gmail.com Tue Sep 1 18:06:36 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Tue, 1 Sep 2015 12:06:36 -0400 Subject: [Datetime-SIG] Another round on error-checking In-Reply-To: References: Message-ID: On Tue, Sep 1, 2015 at 11:59 AM, Alexander Belopolsky < alexander.belopolsky at gmail.com> wrote: > Third observation: If we have only one variable offset timezone (Local), > then we can solve the problem by defining datetime.__hash__(self) as for > example, hash(self.astimezone(Local).replace(fold=0) - datetime(1, 1, 1, > tzinfo=Local)). > Note that given classic arithmetic, .replace(fold=0) is redundant. -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Tue Sep 1 18:12:53 2015 From: guido at python.org (Guido van Rossum) Date: Tue, 1 Sep 2015 09:12:53 -0700 Subject: [Datetime-SIG] Another round on error-checking In-Reply-To: References: Message-ID: I could not accept a PEP that leads to different datetime being considered == but having a different hash (*unless* due to a buggy tzinfo subclass implementation -- however no historical timezone data should ever depend on such a bug). I'm much less concerned about < being intransitive in edge cases. I also don't particularly care about == following from the difference being zero. Still, unless we're constrained by backward compatibility, I would rather not add equivalence between *any* two datetimes whose tzinfo is not the same object -- even if we can infer that they both must refer to the same instant. On Tue, Sep 1, 2015 at 8:59 AM, Alexander Belopolsky < alexander.belopolsky at gmail.com> wrote: > > > On Mon, Aug 31, 2015 at 9:01 PM, Tim Peters wrote: > >> [Alex] >> > It's getting late in my TZ, but what you are saying below sounds like a >> > complaint that if you put t=second 01:30 as a key in the dictionary, you >> > cannot later retrieve it by looking up t.astimezone(timezone.utc). >> >> I don't grasp that. What I am saying should be very clear: under all >> schemes so far, there are datetimes x and y such that x == y but >> hash(x) != hash(y). You see that or you don't. If you don't, I'll >> keep trying until you do ;-) So do you see that? >> >> > I do (in the morning.) > >> .. >> > [Alex] > >> >> > Maybe if we decide to do something with the arithmetic, we will be able >> to >> > fix this wart as well. >> > > [Tim] > >> Doubt it - this has nothing to do with arithmetic I can see. 
It's a >> consequence of wanting to ignore `fold` in contexts where it really >> does make a difference. __hash__() is one such place. >> > > Arithmetic and comparisons are intertwined as long as you require that > not(a - b) ? a==b. The main problem for hash as I see it is that x == y > may or may not call x.utcoffset() depending on the value of y. This is a > problem for hash(x) which should decide whether > >> >> Like I said at the start, it's a puzzle. >> > > Let's formulate the puzzle: Define datetime.__hash__ so that given PEP 495 > rules for datetime.__eq__, x == y implies hash(x) == hash(y). > > First (trivial) observation: a solution exists, e.g. hash(x) == 0. This > is not a very practical solution, but shows that the puzzle is not a > logical impossibility. > > Second observation: We cannot improve on hash(x) == 0 without some > knowledge of what timezones are known to the system. Proof: let u1 < u2 be > two arbitrary UTC times. We can always construct a timezone (FoldZone) > where u1 and u2 map to the same local time. All we need to do is to create > a fold of size u2 - u1 at some time u between u1 and u2. Let t1 = > u1.astimezone(FoldZone) > and t2 = u2.astimezone(FoldZone). By construction, t1 == t2, t1.fold = 0 > and t2.fold = 1. If x == y implies hash(x) == hash(y), then u1 == t1 > implies hash(u1) == hash(t1) and similarly u2 == t2 implies hash(u2) == > hash(t2) and t1 == t2 implies hash(t1) == hash(t2). Since hash values are > integers and == is transitive for integers, from a chain hash(u1) == > hash(t1), hash(t1) == hash(t2), hash(t2) == hash(u2 )we conclude that > hash(u1) == hash(u2) and therefore the only solution is hash(x) == const. > > This sounds discouraging, but note that the FoldZone that we constructed > is rather unrealistic because depending on the values of u1 and u2, the > size of the fold can range from microseconds to centuries. > > Third observation: If we have only one variable offset timezone (Local), > then we can solve the problem by defining datetime.__hash__(self) as for > example, hash(self.astimezone(Local).replace(fold=0) - datetime(1, 1, 1, > tzinfo=Local)). Note that in the last expression, hash is taken of a > timedelta object, so the definition is not circular. (A proof that x == y > implies hash(x) == hash(y) in this case is left as an exercise for the > reader.:-) > > Fourth observation: A solution for one variable offset timezone > generalizes to the case of an arbitrary number of such timezones. A > theoretical construction is to simply iterate x = > x.astimezone(Zone).replace(fold=0) over all the zones known to the system, > but certainly a more efficient algorithm can be devised to to achieve the > same result in a single lookup in a specially crafted table. > > So the puzzle is not unsolvable, but how much of it has to be solved in > PEP 495? I would say not much. I agree with Tim that non-transitivity of > == and the violation of the hash invariant need to be mentioned in the > PEP. However, since PEP 495 by itself does not introduce any new tzinfo > implementations and the existing fixed offset timezones don't suffer from > this problem, I think we can leave the final resolution to the > timezone.local or the zoneinfo PEP. > > An important lesson is in the second observation. To solve the hash > puzzle, we need to have a global view of the totality of timezones that > will be supported by the system. 
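To make the quoted "third observation" concrete, here is a rough sketch of the proposed hash (assuming Python 3.6-style fold support and a hypothetical variable-offset tzinfo called Local; this is one reading of the formula quoted above, not code from the PEP):

    from datetime import datetime

    def fold_insensitive_hash(dt, local_tz):
        # Pin the operand to the single variable-offset zone ("Local"),
        # discard fold, and hash the classic-arithmetic distance from an
        # arbitrary fixed reference carrying the same tzinfo.
        pinned = dt.astimezone(local_tz).replace(fold=0)
        return hash(pinned - datetime(1, 1, 1, tzinfo=local_tz))

Under the one-variable-zone assumption, two aware datetimes that compare equal land on the same timedelta here, so the dict invariant survives; the fourth observation is about lifting that assumption.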
> > _______________________________________________ > Datetime-SIG mailing list > Datetime-SIG at python.org > https://mail.python.org/mailman/listinfo/datetime-sig > The PSF Code of Conduct applies to this mailing list: > https://www.python.org/psf/codeofconduct/ > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From carl at oddbird.net Tue Sep 1 18:36:05 2015 From: carl at oddbird.net (Carl Meyer) Date: Tue, 1 Sep 2015 10:36:05 -0600 Subject: [Datetime-SIG] Another round on error-checking In-Reply-To: References: Message-ID: <55E5D3F5.40600@oddbird.net> On 09/01/2015 10:12 AM, Guido van Rossum wrote: > I'm much less concerned about < being intransitive in edge cases. I also > don't particularly care about == following from the difference being > zero. Still, unless we're constrained by backward compatibility, I would > rather not add equivalence between *any* two datetimes whose tzinfo is > not the same object -- even if we can infer that they both must refer to > the same instant. I think the latter is certainly a backwards-compatibility requirement, since that equivalence is already very much present in the current implementation of datetime.__eq__ (well, datetime._cmp). If two datetimes have different tzinfo objects, they are converted to UTC and compared as instants. Following the same model would certainly imply that a fold=0 and fold=1 datetime that are otherwise identical should not be considered equal, because they clearly represent different instants. I guess Alex's opposition to that is the (very small) chance of backwards-incompatibility, since currently it is possible to take two non-equal UTC datetimes an hour apart at a fold, convert them to local time, and then have them compare equal (since pre PEP 495 the conversion to local time during a fold loses information). Personally I think that latter backwards-incompatibility would be a reasonable bugfix to make the existing semantics of datetime equality consistent in folds. Though I suppose it's possible someone somewhere is relying on that as a very strange way of detecting a fold? Carl -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: OpenPGP digital signature URL: From alexander.belopolsky at gmail.com Tue Sep 1 18:37:05 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Tue, 1 Sep 2015 12:37:05 -0400 Subject: [Datetime-SIG] Another round on error-checking In-Reply-To: References: Message-ID: On Tue, Sep 1, 2015 at 12:12 PM, Guido van Rossum wrote: > I could not accept a PEP that leads to different datetime being considered > == but having a different hash (*unless* due to a buggy tzinfo subclass > implementation -- however no historical timezone data should ever depend on > such a bug). > I agree, but my analysis demonstrates that we cannot fix hash to make an arbitrary tzinfo work. ("Arbitrary" includes tzinfos with leap microseconds and leap centuries.) We can probably come up with a good enough hash if we restrict fold sizes to multiples of 15 min up to 1 hour and locations to a hour boundaries. My preferred solution would be to delegate hash calculation to tzinfo and make it someone else's headache, but I know you don't like this solution. > I'm much less concerned about < being intransitive in edge cases. I also > don't particularly care about == following from the difference being zero. 
> I believe Tim does care about this. I did consider divorcing comparison and arithmetic, but I think that led to problems with the total ordering. Maybe we can make == differentiate between fold=0 and fold=1 at the expense of not(a > b) and not(b Still, unless we're constrained by backward compatibility, I would rather > not add equivalence between *any* two datetimes whose tzinfo is not the > same object -- even if we can infer that they both must refer to the same > instant. > Not even for fixed offset timezones? I am afraid this will break too many programs. -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Tue Sep 1 18:58:34 2015 From: guido at python.org (Guido van Rossum) Date: Tue, 1 Sep 2015 09:58:34 -0700 Subject: [Datetime-SIG] Another round on error-checking In-Reply-To: References: Message-ID: On Tue, Sep 1, 2015 at 9:37 AM, Alexander Belopolsky < alexander.belopolsky at gmail.com> wrote: > > On Tue, Sep 1, 2015 at 12:12 PM, Guido van Rossum > wrote: > >> I could not accept a PEP that leads to different datetime being >> considered == but having a different hash (*unless* due to a buggy tzinfo >> subclass implementation -- however no historical timezone data should ever >> depend on such a bug). >> > > I agree, but my analysis demonstrates that we cannot fix hash to make an > arbitrary tzinfo work. ("Arbitrary" includes tzinfos with leap > microseconds and leap centuries.) We can probably come up with a good > enough hash if we restrict fold sizes to multiples of 15 min up to 1 hour > and locations to a hour boundaries. > That's bizarre. I suspect this came from assuming too much about how == must work. > My preferred solution would be to delegate hash calculation to tzinfo and > make it someone else's headache, but I know you don't like this solution. > > > >> I'm much less concerned about < being intransitive in edge cases. I also >> don't particularly care about == following from the difference being zero. >> > > I believe Tim does care about this. I did consider divorcing comparison > and arithmetic, but I think that led to problems with the total ordering. > Maybe we can make == differentiate between fold=0 and fold=1 at the expense > of not(a > b) and not(b I am not too hopeful. Messing with total ordering axioms is just as fatal > for binary searches as messing with hash invariants is for dictionary > lookups. > I think it's better to have some values that are neither < nor == nor > each other, than to have two values that are == but differ in hash. > Still, unless we're constrained by backward compatibility, I would rather >> not add equivalence between *any* two datetimes whose tzinfo is not the >> same object -- even if we can infer that they both must refer to the same >> instant. >> > > Not even for fixed offset timezones? I am afraid this will break too many > programs. > Oh, it looks like we currently allow < and > if the utcoffset() of both arguments are the same. I presume that's really a proxy for "both tzinfos have the same fixed offset" which we can't detect directly. But this is already pretty broken -- for tzinfos that don't have fixed offsets, the comparison succeeds if both datetimes happen to fall in a period where the offsets *are* the same. In any case, a broken total ordering doesn't bother me that much, except when the tzinfo is the same object. I wonder if we could cache the built-in fixed-offset timezone instances? (Currently a new instance is created each time you call astimezone(None).) 
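A quick illustration of that parenthetical, using only stdlib fixed-offset timezone objects (the identity check reflects typical CPython behavior, not anything the docs promise):

    from datetime import datetime, timezone

    a = datetime.now(timezone.utc).astimezone()   # attach the local fixed offset
    b = datetime.now(timezone.utc).astimezone()

    print(a.tzinfo == b.tzinfo)   # True: equal fixed offsets compare equal
    print(a.tzinfo is b.tzinfo)   # typically False: a fresh instance per call

So even without caching, code that compares such datetimes is unaffected; only identity-based checks on the tzinfo would notice.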
Does pytz reuse its fixed-offset objects? And given that we already have total ordering problems, from that perspective I could live with declaring that two datetimes that differ only in the fold are unequal. (Hm, aren't they already unequal because their utcoffset() differs?) -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From alexander.belopolsky at gmail.com Tue Sep 1 19:00:19 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Tue, 1 Sep 2015 13:00:19 -0400 Subject: [Datetime-SIG] Another round on error-checking In-Reply-To: <55E5D3F5.40600@oddbird.net> References: <55E5D3F5.40600@oddbird.net> Message-ID: On Tue, Sep 1, 2015 at 12:36 PM, Carl Meyer wrote: > On 09/01/2015 10:12 AM, Guido van Rossum wrote: > > I'm much less concerned about < being intransitive in edge cases. I also > > don't particularly care about == following from the difference being > > zero. Still, unless we're constrained by backward compatibility, I would > > rather not add equivalence between *any* two datetimes whose tzinfo is > > not the same object -- even if we can infer that they both must refer to > > the same instant. > > I think the latter is certainly a backwards-compatibility requirement, > since that equivalence is already very much present in the current > implementation of datetime.__eq__ (well, datetime._cmp). If two > datetimes have different tzinfo objects, they are converted to UTC and > compared as instants. > > Following the same model would certainly imply that a fold=0 and fold=1 > datetime that are otherwise identical should not be considered equal, > because they clearly represent different instants. I guess Alex's > opposition to that is the (very small) chance of > backwards-incompatibility, since currently it is possible to take two > non-equal UTC datetimes an hour apart at a fold, convert them to local > time, and then have them compare equal (since pre PEP 495 the conversion > to local time during a fold loses information). Here is an idea that I think may work: let's consider fold=1 instances as if they have a different tzinfo instance from the other side in both datetime subtractions and comparisons. This will be consistent with the current stdlib and pytz work-arounds of representing "second" times using fictitious fixed-offset timezones. -------------- next part -------------- An HTML attachment was scrubbed... URL: From carl at oddbird.net Tue Sep 1 19:01:45 2015 From: carl at oddbird.net (Carl Meyer) Date: Tue, 1 Sep 2015 11:01:45 -0600 Subject: [Datetime-SIG] Another round on error-checking In-Reply-To: References: <55E5D3F5.40600@oddbird.net> Message-ID: <55E5D9F9.4050700@oddbird.net> On 09/01/2015 11:00 AM, Alexander Belopolsky wrote: > Here is an idea that I think may work: let's consider fold=1 instances > as if they have a different tzinfo instance from the other side in both > datetime subtractions and comparisons. This will be consistent with the > current stdlib and pytz work-arounds of representing "second" times > using fictitious fixed-offset timezones. +1 Carl -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: OpenPGP digital signature URL: From alexander.belopolsky at gmail.com Tue Sep 1 19:13:04 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Tue, 1 Sep 2015 13:13:04 -0400 Subject: [Datetime-SIG] Another round on error-checking In-Reply-To: References: Message-ID: On Tue, Sep 1, 2015 at 12:58 PM, Guido van Rossum wrote: > And given that we already have total ordering problems, from that > perspective I could live with declaring that two datetimes that differ only > in the fold are unequal. (Hm, aren't they already unequal because their > utcoffset() differs?) They are not unequal because their tzinfos are the same. In this case __sub__ (and as a consequence __eq__) does not call utcoffset() to follow the rules of classic arithmetic. My new suggestion is to use timeline arithmetic whenever fold=1 datetime instance is involved. This should not break any programs that don't encounter fold=1 instances and in effect will make fold=1 instances behave similar to how their timezone.utc equivalents behave now. -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.peters at gmail.com Tue Sep 1 19:26:51 2015 From: tim.peters at gmail.com (Tim Peters) Date: Tue, 1 Sep 2015 12:26:51 -0500 Subject: [Datetime-SIG] Another round on error-checking In-Reply-To: References: Message-ID: [Guido] > I could not accept a PEP that leads to different datetime being considered > == but having a different hash (*unless* due to a buggy tzinfo subclass > implementation -- however no historical timezone data should ever depend on > such a bug). > > I'm much less concerned about < being intransitive in edge cases. Offhand I don't know whether it can be (probably). The case I stumbled into yesterday showed that equality ("==") could be intransitive: assert a == b == c == d and a < d While initially jarring, I called it a "minor wart", because the middle "==" there is working in classic arithmetic but the other two are working in timeline arithmetic. But _a_ wart all the same, since transitivity doesn't fail today. > I also don't particularly care about == following from the difference being zero. > Still, unless we're constrained by backward compatibility, I would rather > not add equivalence between *any* two datetimes whose tzinfo is not the same > object -- even if we can infer that they both must refer to the same > instant. Assuming "equivalent" means "compare equal", we're highly constrained. For datetimes x and y with distinct non-None tzinfos, it's always been the case that: 1. x-y effectively converted both to UTC before subtraction. 2. comparison effectively interpreted x-y as a __cmp__ result 2a. various comparison transitivities essentially followed from that 3. Because of #2, to maintain __hash__'s contract datetime.__hash__ also effectively converted to UTC before hashing All of that would (well, "should") continue to work fine, except that fold=1 is being ignored in intrazone arithmetic (subtraction and comparisons) and by hash(). Maybe there are other surprises. I just happened to notice the hash() problem, and equality intransitivity, both yesterday. via thought experiments. On the face of it, it's a conceptual mess to try to make fold=1 "mean something" in some contexts but not in others. In particular, arithmetic, comparison, and hashing are usually deeply interrelated, and have been in datetime so far. 
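The numbered points above are easy to see with nothing but fixed-offset zones, whose behavior is not in dispute here; a small sketch:

    from datetime import datetime, timedelta, timezone

    est = timezone(timedelta(hours=-5), "EST")
    cet = timezone(timedelta(hours=+1), "CET")

    x = datetime(2015, 9, 1, 7, 0, tzinfo=est)    # 12:00 UTC
    y = datetime(2015, 9, 1, 13, 0, tzinfo=cet)   # also 12:00 UTC

    print(x - y)                 # 0:00:00 -- interzone subtraction works in UTC
    print(x == y)                # True    -- comparison agrees with the subtraction
    print(hash(x) == hash(y))    # True    -- so hash() must, and does, follow suit

The quandary is precisely that a 495-style ambiguous time has two defensible UTC conversions, so there is no longer one obvious value for that last line to be computed from.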
Ignoring `fold` in single-zone arithmetic, comparisons and hashing works fine (in "naive time", where `fold` is senseless), but when going across zones `fold` cannot be ignored. That's a huge problem for hash(), because it can have no idea whether the pattern of later equality comparisons relying on hash results _will_ be using classic or timeline rules (or a mix of both). That didn't matter before, because _a_ unique UTC equivalent always existed (the possibility of ambiguous times was effectively ignored). Now it does matter, because the UTC equivalent can differ depending on the `fold` value. Ignoring it sometimes but not others leads to the current quandary. From tim.peters at gmail.com Tue Sep 1 19:35:23 2015 From: tim.peters at gmail.com (Tim Peters) Date: Tue, 1 Sep 2015 12:35:23 -0500 Subject: [Datetime-SIG] Another round on error-checking In-Reply-To: References: <55E49D28.8030204@oddbird.net> <55E49F27.9000906@oddbird.net> Message-ID: [Tim] >> I can see one kind of annoyance that would remain: >> >> dt2 = dt1 + a_timedelta >> >> is currently specified to force dt2.fold==0 even if dt1.fold==1. But >> that may not make good sense. [Alex] > Note that dt2.fold==0 even if dt1.fold==1 *and* a_timedelta==timedelta(0). Yup. > This is what I call "fold-unaware" arithmetic. It is consistent with > dt2==dt1. Heh - setting dt2.fold = random.randrange(2) would also be consistent with dt2 == dt1. That is, "==" ignores both `fold`s entirely in this case. From tim.peters at gmail.com Tue Sep 1 19:44:47 2015 From: tim.peters at gmail.com (Tim Peters) Date: Tue, 1 Sep 2015 12:44:47 -0500 Subject: [Datetime-SIG] Another round on error-checking In-Reply-To: References: <55E5D3F5.40600@oddbird.net> Message-ID: [Alex] > Here is an idea that I think may work: let's consider fold=1 instances as if > they have a different tzinfo instance from the other side in both datetime > subtractions and comparisons. This will be consistent with the current > stdlib and pytz work-arounds of representing "second" times using fictitious > fixed-offset timezones. That's what I was getting at by saying "fold=1 veritably _screams_ 'I'm no longer working in naive time'". Which implies "I need timeline arithmetic", and everything else follows from that, including hash() not ignoring fold=1 either. But then the concept of "naive time" gets muddier: sometimes, e.g., dt1 - dt2 in a common zone (same tzinfo) will use classic arithmetic, but in other cases (fold=1 in at least one) timeline arithmetic. And there's also that, after d = dt1 - dt2 I suspect it may no longer always be the case that dt1 == dt2 + d (unsure, but can't make time for it now) From alexander.belopolsky at gmail.com Tue Sep 1 20:00:31 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Tue, 1 Sep 2015 14:00:31 -0400 Subject: [Datetime-SIG] Another round on error-checking In-Reply-To: References: <55E5D3F5.40600@oddbird.net> Message-ID: On Tue, Sep 1, 2015 at 1:44 PM, Tim Peters wrote: > [Alex] > > Here is an idea that I think may work: let's consider fold=1 instances > as if > > they have a different tzinfo instance from the other side in both > datetime > > subtractions and comparisons. This will be consistent with the current > > stdlib and pytz work-arounds of representing "second" times using > fictitious > > fixed-offset timezones. > > That's what I was getting at by saying "fold=1 veritably _screams_ > 'I'm no longer working in naive time'". 
Which implies "I need > timeline arithmetic", and everything else follows from that, including > hash() not ignoring fold=1 either. > > But then the concept of "naive time" gets muddier: sometimes, e.g., > > dt1 - dt2 > > in a common zone (same tzinfo) will use classic arithmetic, but in > other cases (fold=1 in at least one) timeline arithmetic. > I don't think this is a problem as long as we disallow mixing naive and aware instances in arithmetic and ordering and keep naive ? aware always rule. > > And there's also that, after > > d = dt1 - dt2 > > I suspect it may no longer always be the case that > > dt1 == dt2 + d > > (unsure, but can't make time for it now) > That's the price we pay for classic arithmetic anyways. I am not even sure we want to trigger timeline arithmetics in dt + delta expressions when dt.fold=1. If you do, dt - hour + hour will still not take you back because the seconds + hour will be classic. I don't think we can ever get rid of all paradoxes here. Once you let your time go back, all bets are off. What we can do is to shift them from one place to another so that you only see odd behavior when a fold=1 instance is involved. -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Tue Sep 1 21:03:59 2015 From: guido at python.org (Guido van Rossum) Date: Tue, 1 Sep 2015 12:03:59 -0700 Subject: [Datetime-SIG] Another round on error-checking In-Reply-To: References: <55E5D3F5.40600@oddbird.net> Message-ID: As a point of order, I don't have time today (nor probably this week) to keep up with this discussion. :-( -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From alexander.belopolsky at gmail.com Tue Sep 1 21:50:41 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Tue, 1 Sep 2015 15:50:41 -0400 Subject: [Datetime-SIG] Another round on error-checking In-Reply-To: References: <55E5D3F5.40600@oddbird.net> Message-ID: On Tue, Sep 1, 2015 at 2:00 PM, Alexander Belopolsky < alexander.belopolsky at gmail.com> wrote: > And there's also that, after >> >> d = dt1 - dt2 >> >> I suspect it may no longer always be the case that >> >> dt1 == dt2 + d >> >> (unsure, but can't make time for it now) >> > > That's the price we pay for classic arithmetic anyways. > Let me clarify what I mean by that: >>> from datetime import * >>> exec(open("Doc/includes/tzinfo-examples.py").read()) >>> t1 = datetime(2015, 10, 31, 12, tzinfo=Eastern) >>> t2 = datetime(2015, 11, 1, 12, tzinfo=Eastern) >>> u = datetime(2000, 1, 1, tzinfo=timezone.utc) >>> (t1 - u) - (t2 - u) == t2 - t1 False This is a fundamental property of classic arithmetic and the only way to prevent something like this from happening is (as Guido mentioned previously) to disallow cross-zone arithmetic. This would be quite justifiable from the relativity POV: whether or not two events occur simultaneously at two different places depends on the speed of the observer. This fact will be important when ordinary computers get clocks with nanosecond precision. Meanwhile, our governments let us enjoy the effects relativity and time travel twice a year at pedestrian speeds: if you add a day in New York, go to Paris, subtract a day there and come back to New York you may not find yourself at the same time. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From alexander.belopolsky at gmail.com Tue Sep 1 21:55:22 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Tue, 1 Sep 2015 15:55:22 -0400 Subject: [Datetime-SIG] Another round on error-checking In-Reply-To: References: <55E5D3F5.40600@oddbird.net> Message-ID: On Tue, Sep 1, 2015 at 3:50 PM, Alexander Belopolsky < alexander.belopolsky at gmail.com> wrote: > >>> (t1 - u) - (t2 - u) == t2 - t1 > False > I messed up the order. the above should have been >>> (t1 - u) - (t2 - u) == t1 - t2 False -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.peters at gmail.com Tue Sep 1 23:56:23 2015 From: tim.peters at gmail.com (Tim Peters) Date: Tue, 1 Sep 2015 16:56:23 -0500 Subject: [Datetime-SIG] Another round on error-checking In-Reply-To: References: <55E5D3F5.40600@oddbird.net> Message-ID: [Alex] >>> Here is an idea that I think may work: let's consider fold=1 instances >>> as if they have a different tzinfo instance from the other side in both >>> datetime subtractions and comparisons. This will be consistent with >>> the current stdlib and pytz work-arounds of representing "second" >>> times using fictitious fixed-offset timezones. [Tim] >> That's what I was getting at by saying "fold=1 veritably _screams_ >> 'I'm no longer working in naive time'". Which implies "I need >> timeline arithmetic", and everything else follows from that, including >> hash() not ignoring fold=1 either. >> >> But then the concept of "naive time" gets muddier: sometimes, e.g., >> >> dt1 - dt2 >> >> in a common zone (same tzinfo) will use classic arithmetic, but in >> other cases (fold=1 in at least one) timeline arithmetic. [Alex] > I don't think this is a problem as long as we disallow mixing naive and > aware instances in arithmetic and ordering and keep naive ? aware always > rule. "Concept gets muddier" isn't about the code, it's about the concept getting muddier ;-) That is, the number of brain cells needed for a human to grasp the model, and the number of words in the docs needed to explain it all. Paying attention to fold=1 in naive time does muddy the naive-time concept. A little. But it should hardly ever matter: even using a 495 tzinfo, there is nothing a user working _in_ naive time can do to see a fold=1 value. They have to force it by hand, or use an operation _outside_ of naive time (like .astimezone()) to get one. Doesn't really bother me. >> And there's also that, after >> >> d = dt1 - dt2 >> >> I suspect it may no longer always be the case that >> >> dt1 == dt2 + d >> >> (unsure, but can't make time for it now) > That's the price we pay for classic arithmetic anyways. Not so. Classic arithmetic obeys all the same friendly identities as do, e.g., timedelta and integer arithmetic. You gave an example in a later message, but that didn't stick to classic arithmetic. As soon as you mixed timezones, you went outside of naive time, and timeline arithmetic was used in the instances of cross-zone subtraction. Of course the classic arithmetic identities won't (can't always) apply to a _mix_ of classic and timeline arithmetic. The proposed behavior will be the first time timeline arithmetic can be used sticking to what sure looks like "naive time" operations (staying within a single zone). It's the invisible fold=1 in this case that says "not in naive time - I really want timeline arithmetic". I have little problem with that. I'm just not going to pretend it isn't _a_ change, or not _a_ muddying. 
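For concreteness, the "friendly identities" of classic arithmetic are easy to check with plain naive datetimes - no tzinfo and no fold in play; this only illustrates the status quo, not any proposed rule:

    from datetime import datetime, timedelta

    dt1 = datetime(2015, 11, 1, 1, 30)     # a wall-clock reading
    dt2 = datetime(2015, 10, 31, 12, 0)    # another one, nominally a day earlier

    d = dt1 - dt2
    print(dt2 + d == dt1)                                          # True
    print((dt1 + timedelta(hours=1)) - timedelta(hours=1) == dt1)  # True

It is only when an operation reaches outside naive time - a cross-zone subtraction, or (under the proposal) a fold=1 operand - that these identities stop being guaranteed.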
> I am not even sure we want to trigger timeline arithmetics in dt + delta expressions > when dt.fold=1. I am ;--) Leaving aside that there's no sane reason to refuse to believe a datetime _means_ fold=1 when we see it, haven't we had enough of "unintended consequences" from trying to ignore it in other contexts? And carving out an exception for "oh - except fold is ignored in datetime + timedelta, and datetime - timedelta" would be another muddying of the newly-muddied model. If there's isn't a solid reason in favor of ignoring it, that would be a gratuitous muddying. > If you do, dt - hour + hour will still not take you back because > the seconds + hour will be classic. > > I don't think we can ever get rid of all paradoxes here. Once you let your > time go back, all bets are off. What we can do is to shift them from one > place to another so that you only see odd behavior when a fold=1 instance is > involved. I agree. Where I go beyond is that they should _always_ see potentially odd (to naive-time eyes) behavior when fold=1. That's understandable. "Sometime yes, sometimes no" is unexplainable beyond exhaustive listing of the "sometimes yes" and "sometimes no" cases. Unless there's a strong reason for the distinction. From tim.peters at gmail.com Wed Sep 2 03:41:05 2015 From: tim.peters at gmail.com (Tim Peters) Date: Tue, 1 Sep 2015 20:41:05 -0500 Subject: [Datetime-SIG] Another round on error-checking In-Reply-To: References: <55E5D3F5.40600@oddbird.net> Message-ID: [Guido] > As a point of order, I don't have time today (nor probably this week) to > keep up with this discussion. :-( So. short & sweet, the higher-order bit of the hash problem is easy enough to sketch. Suppose x and y represent the earlier and later of an ambiguous time in their common zone. All fields are identical except for `fold`. If intrazone comparison ignores `fold`, then x == y is true. Implying their hashes must be equal. Implying that (any non-insanely-convoluted) hash() must also ignore `fold`, to get the same UTC offset for both. All fine so far. But screws up when x and y are (for example) converted to their _real_ UTC equivalents, ux and uy. Those _aren't_ equal. hash(x) == hash(y) == hash(ux) then, but hash(uy) is almost certainly different. But y == uy is true, so we're left with two equal datetimes whose hashes are almost certainly different. Note "y == uy is true" must be so for backward compatibility (interzone comparisons have always been supported). The high-order bit of the proposed solution (to this,and to the loss of total ordering, and ..) is to stop ignoring fold=1. End of problems. Start of other problems. For why the latter are thought (so far) to be infinitely easier to live with, you would have to follow the discussion. By the time you do, there will be no problems left - or at least none we'll admit to ;-) From guido at python.org Wed Sep 2 04:30:56 2015 From: guido at python.org (Guido van Rossum) Date: Tue, 1 Sep 2015 19:30:56 -0700 Subject: [Datetime-SIG] Another round on error-checking In-Reply-To: References: <55E5D3F5.40600@oddbird.net> Message-ID: On Tue, Sep 1, 2015 at 6:41 PM, Tim Peters wrote: > [Guido] > > As a point of order, I don't have time today (nor probably this week) to > > keep up with this discussion. :-( > > So. short & sweet, the higher-order bit of the hash problem is easy > enough to sketch. Suppose x and y represent the earlier and later of > an ambiguous time in their common zone. All fields are identical > except for `fold`. 
> > If intrazone comparison ignores `fold`, then x == y is true. Implying > their hashes must be equal. Implying that (any > non-insanely-convoluted) hash() must also ignore `fold`, to get the > same UTC offset for both. All fine so far. > > But screws up when x and y are (for example) converted to their _real_ > UTC equivalents, ux and uy. Those _aren't_ equal. hash(x) == hash(y) > == hash(ux) then, but hash(uy) is almost certainly different. But y > == uy is true, so we're left with two equal datetimes whose hashes are > almost certainly different. Note "y == uy is true" must be so for > backward compatibility (interzone comparisons have always been > supported). > Ah, now I understand why someone in desperation proposed to do make some kind of assumption about the size of DST offsets. > The high-order bit of the proposed solution (to this,and to the loss > of total ordering, and ..) is to stop ignoring fold=1. End of > problems. > > Start of other problems. For why the latter are thought (so far) to > be infinitely easier to live with, you would have to follow the > discussion. By the time you do, there will be no problems left - or > at least none we'll admit to ;-) > OK, looks like the PEP has some evolving to do! -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From stuart at stuartbishop.net Wed Sep 2 09:42:07 2015 From: stuart at stuartbishop.net (Stuart Bishop) Date: Wed, 2 Sep 2015 14:42:07 +0700 Subject: [Datetime-SIG] Another round on error-checking In-Reply-To: References: Message-ID: On 1 September 2015 at 04:42, Alexander Belopolsky wrote: > This forum may not be inclusive enough for this. People in this group know > too much! Not all of us. I claim ignorance from not being able to follow this complete thread :) My naive assumptions would be that dt1 == dt2 implies that dt1.utctimetuple() == dt2.utctimetuple(). Which means the hash implementation can just be hash(dt.utctimetuple()). datetime.utctimetuple() already defines dst flag munging, which seems very similar to the fold munging suggestions I skimmed past. -- Stuart Bishop http://www.stuartbishop.net/ From tim.peters at gmail.com Wed Sep 2 18:33:59 2015 From: tim.peters at gmail.com (Tim Peters) Date: Wed, 2 Sep 2015 11:33:59 -0500 Subject: [Datetime-SIG] Another round on error-checking In-Reply-To: References: Message-ID: [Alex] >> This forum may not be inclusive enough for this. People in this group know >> too much! [Stuart] > Not all of us. I claim ignorance from not being able to follow this > complete thread :) Despite the Subject line, it's mostly been about consequences of PEP 495 ignoring `fold` altogether in some contexts, so as to have no visible effect whatsoever in "naive time" (even for a datetime with a zone). Your brain cells have worked in the opposite direction so far, to fight "naive time" tooth & nail inside pytz for aware datetimes. > My naive assumptions would be that dt1 == dt2 implies that > dt1.utctimetuple() == dt2.utctimetuple(). Yup! Which is another reasonable expectation that could fail under the current 495, when dt1 and dt2 share a zone. If dt1 and dt2 are the earlier and later of an ambiguous time in a common zone, they differ only in their `fold` value. Under 495, dt1 == dt2 would be true anyway, but anything related to zone _conversion_ would see the difference. So .utctimetuple() would differ. At a more basic level, utcoffset() would also differ. The proposed solution is to "simply" stop ignoring fold=1. 
Then dt1 != dt2 from the start, so no reasonable expectations are violated. Except for someone working in naive time who somehow manages to force `fold` to 1 anyway. They may be surprised to see dt1 != dt2 in the case above. But only the first time they see it ;-) > Which means the hash implementation can just be hash(dt.utctimetuple()). Yup, it could be, provided 495 is changed to stop ignoring fold=1 for intrazone comparisons. It isn't (and won't be), because that would be a poorer-quality hash implementation (nothing about the current __hash__ should need to change): - .utctimetuple() throws away dt.microsecond, so hash() would produce massive collisions in some cases of regular inputs. - There are faster ways of getting the effect of converting to UTC (including microseconds). The actual implementation isn't documented, because it doesn't need to be ;-) From alexander.belopolsky at gmail.com Wed Sep 2 18:54:56 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Wed, 2 Sep 2015 12:54:56 -0400 Subject: [Datetime-SIG] Another round on error-checking In-Reply-To: References: <55E5D3F5.40600@oddbird.net> Message-ID: On Tue, Sep 1, 2015 at 5:56 PM, Tim Peters wrote: > Paying attention to fold=1 in naive time does muddy the naive-time > concept. A little. But it should hardly ever matter: even using a > 495 tzinfo, there is nothing a user working _in_ naive time can do to > see a fold=1 value. They have to force it by hand, or use an > operation _outside_ of naive time (like .astimezone()) to get one. > There are two more cases: (1) datetime.now() will return fold=1 instances during one hour each year; (2) datetime.fromtimestamp(s) will return fold=1 instances for some values of s. -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.peters at gmail.com Wed Sep 2 19:20:14 2015 From: tim.peters at gmail.com (Tim Peters) Date: Wed, 2 Sep 2015 12:20:14 -0500 Subject: [Datetime-SIG] Another round on error-checking In-Reply-To: References: <55E5D3F5.40600@oddbird.net> Message-ID: [Tim] >> Paying attention to fold=1 in naive time does muddy the naive-time >> concept. A little. But it should hardly ever matter: even using a >> 495 tzinfo, there is nothing a user working _in_ naive time can do to >> see a fold=1 value. They have to force it by hand, or use an >> operation _outside_ of naive time (like .astimezone()) to get one. [Alex] > There are two more cases: > > (1) datetime.now() will return fold=1 instances during one hour each year; > (2) datetime.fromtimestamp(s) will return fold=1 instances for some values > of s. Sure - but anything reflecting how a local clock actually behaves is outside of "naive time". Clocks in naive time never jump forward or backward. Specifically, .now() and .fromtimestamp() are also operations outside of naive time. It might, of course, have helped had the docs said a word about any of this ;-) From alexander.belopolsky at gmail.com Wed Sep 2 19:40:15 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Wed, 2 Sep 2015 13:40:15 -0400 Subject: [Datetime-SIG] Another round on error-checking In-Reply-To: References: <55E5D3F5.40600@oddbird.net> Message-ID: On Wed, Sep 2, 2015 at 1:20 PM, Tim Peters wrote: > [Alex] > > There are two more cases: > > > > (1) datetime.now() will return fold=1 instances during one hour each > year; > > (2) datetime.fromtimestamp(s) will return fold=1 instances for some > values > > of s. 
> > Sure - but anything reflecting how a local clock actually behaves is > outside of "naive time". Clocks in naive time never jump forward or > backward. Specifically, .now() and .fromtimestamp() are also > operations outside of naive time. > I agree, but the worst thing we can do to our users is to plant a time bomb that will go off once a year. Suppose someone has a program that uses naive local times and relies on t < prev_t test to detect the fall-back fold and do something about it. If we don't ignore fold in naive datetime comparisons - this program will start producing incorrect results. Fortunately, we don't need to do anything about naive times. The hash invariant is only violated by aware instances. I think what you are really fighting against is the notion that for regular times, fold=1 is just an alternative spelling for fold=0 times. It looks like you would rather see fold=1 as some different (and invalid) time. Think of the German A and B hours: are regular hours A or B? The German standard say that they are neither, but PEP 495 say that they are both: 2A is the same as 2B unless "2" in the fold and that allows you not to display A/B in those cases. Folds do not exist in naive time, so all times are regular and therefore time(h, m, s, us, fold=0) == time(h, m, s, us, fold=1) always. -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.peters at gmail.com Wed Sep 2 21:59:03 2015 From: tim.peters at gmail.com (Tim Peters) Date: Wed, 2 Sep 2015 14:59:03 -0500 Subject: [Datetime-SIG] Another round on error-checking In-Reply-To: References: <55E5D3F5.40600@oddbird.net> Message-ID: [Alex] >>> There are two more cases: >>> >>> (1) datetime.now() will return fold=1 instances during one hour each >>> year; >>> (2) datetime.fromtimestamp(s) will return fold=1 instances for some >>> values of s. [Tim] >> Sure - but anything reflecting how a local clock actually behaves is >> outside of "naive time". Clocks in naive time never jump forward or >> backward. Specifically, .now() and .fromtimestamp() are also >> operations outside of naive time. [Alex] > I agree, but the worst thing we can do to our users is to plant a time bomb > that will go off once a year. Suppose someone has a program that uses naive > local times and relies on t < prev_t test to detect the fall-back fold and > do something about it. If we don't ignore fold in naive datetime > comparisons - this program will start producing incorrect results. Yes, but I believe it's worse: that it's impossible for PEP 495 to be wholly backward compatible regardless of whether intrazone comparison ignores `fold`. It's not just "stare at one line of code" that counts for compatibility, breaking former invariants also counts. Like Stewart mentioned just before, anyone in their right mind ;-) _implicitly_ assumed all along that x == y implies x.utctimetuple() == y.utctimetuple() and, indeed, x.astimezone(SOMETZINFO) == y.astimezone(SOMETZINFO) too for any value of SOMETZINFO. PEP 495's original form breaks those (among others) - it's not credible to claim that no existing code could possibly be relying on those (or relying on total datetime ordering, etc). That may not be reflected in any single line of code, but only in what code _didn't_ do to worm around "a problem" it reasonably - perhaps not even consciously - assumed could never happen. The only way I see to be wholly backward compatible is to default to fold = -1, where fold < 0 is wholly ignored by everything, always. 
That's the only way to be sure no code breaks, because no behaviors whatsoever change, in any context, except possibly for the datetime.__repr_() string produced. Not just in single lines of code, but no invariants break either. But that also means .now() and .fromtimestamp() and .fromutc() must set set fold = -1, lest a fold=1 sneak in (your "time bomb once a year" scenario). Then we either need different fold-aware versions of all such functions, or new optional foldaware=False arguments on all such functions. But then it's so annoying and error-prone to use, who would bother? Whoever responds with "global flag" will be shot ;-) > Fortunately, we don't need to do anything about naive times. The hash > invariant is only violated by aware instances. Proving yet again that naive time is the only way to go ;-) > I think what you are really fighting against is the notion that for regular > times, fold=1 is just an alternative spelling for fold=0 times. It looks > like you would rather see fold=1 as some different (and invalid) time. In naive time, `fold=1` is simply senseless. It "should be" ignored in naive time. But there is no wall between "naive time" and "timeline time" in datetime's design - indeed, there is no _explicit_ way to say which you have in mind. Something has to give, because an aware datetime can be _viewed_ as being either in naive time or as in timeline time. That's in the programmer's head. Since fold=1 makes no sense in naive time, the sanest thing is to take it as meaning the datetime can _only_ be viewed as being in timeline time. We already know that solves a world of problems. But it will create others. Alas, best I can see, nothing short of fold < 0 can create _no_ problems (except for making it all kinds of pain to get fold-aware behaviors instead). > Think of the German A and B hours: are regular hours A or B? The German > standard say that they are neither, but PEP 495 say that they are both: > 2A is the same as 2B unless "2" in the fold and that allows you not to display > A/B in those cases. I'm not sure appealing to German A and B hours really clarifies it ;-) > Folds do not exist in naive time, so all times are regular and therefore > time(h, m, s, us, fold=0) == time(h, m, s, us, fold=1) always. As above, we can have no real idea whether the programmer _intends_ that an aware datetime lives in naive time or timeline time. fold=1 screams "timeline". From carl at oddbird.net Wed Sep 2 22:26:10 2015 From: carl at oddbird.net (Carl Meyer) Date: Wed, 2 Sep 2015 14:26:10 -0600 Subject: [Datetime-SIG] Another round on error-checking In-Reply-To: References: <55E5D3F5.40600@oddbird.net> Message-ID: <55E75B62.2060905@oddbird.net> On 09/02/2015 01:59 PM, Tim Peters wrote: [snip] > Yes, but I believe it's worse: that it's impossible for PEP 495 to be > wholly backward compatible regardless of whether intrazone comparison > ignores `fold`. It's not just "stare at one line of code" that counts > for compatibility, breaking former invariants also counts. Like > Stewart mentioned just before, anyone in their right mind ;-) > _implicitly_ assumed all along that > > x == y > > implies > > x.utctimetuple() == y.utctimetuple() > > and, indeed, > > x.astimezone(SOMETZINFO) == y.astimezone(SOMETZINFO) > > too for any value of SOMETZINFO. > > PEP 495's original form breaks those (among others) - it's not > credible to claim that no existing code could possibly be relying on > those (or relying on total datetime ordering, etc). 
That may not be > reflected in any single line of code, but only in what code _didn't_ > do to worm around "a problem" it reasonably - perhaps not even > consciously - assumed could never happen. > > The only way I see to be wholly backward compatible is to default to > fold = -1, [...] > > In naive time, `fold=1` is simply senseless. It "should be" ignored > in naive time. But there is no wall between "naive time" and > "timeline time" in datetime's design - indeed, there is no _explicit_ > way to say which you have in mind. Something has to give, because an > aware datetime can be _viewed_ as being either in naive time or as in > timeline time. That's in the programmer's head. Since fold=1 makes > no sense in naive time, the sanest thing is to take it as meaning the > datetime can _only_ be viewed as being in timeline time. We already > know that solves a world of problems. Totally in agreement with everything above. To summarize: trying to disambiguate folds leads to contradiction if the implementation doesn't fully accept a "timeline" view of tz-aware datetimes, because in a "naive" view, the two overlapping times in a fold are the _same time_. The very idea of disambiguation itself is a "timeline view" concept; it's not consistent with naive time. > But it will create others. Can we enumerate the specific problems this would create? Let's hypothesize the following proposal: * As discussed in earlier threads, datetime is taught to respect a new `strict` flag on tzinfo objects, treating aware datetimes as fully in "timeline time," including for arithmetic, (only) if it is set. If it is not set, no behavior changes from what we have today. * The `fold` flag is respected in any way (and ever set to anything other than -1 by built-in methods) _only_ if the attached tzinfo has `strict=True`. Now what problems would this cause? * Backwards compatibility is not a problem. There are no tzinfo classes currently in existence with `strict=True`. * All of PEP 495's problems with hashes, equality, and ordering that have been discussed in this thread are solved; `fold` is entirely unused with non-strict tzinfo, and entirely consistent with strict tzinfo. * Ability to work with timezone-annotated datetimes (I can't say "timezone-aware" with a straight face for datetimes that operate in naive time) in naive time, which is a use case that some people have, is preserved; just use a tzinfo with `strict=False`. * Working with a "timeline view" of tz-aware datetimes (which is also a valid use case that some people have) becomes much simpler than it is today; much simpler even than with pytz. It looks like all wins to me. Carl -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: OpenPGP digital signature URL: From tim.peters at gmail.com Thu Sep 3 00:26:41 2015 From: tim.peters at gmail.com (Tim Peters) Date: Wed, 2 Sep 2015 17:26:41 -0500 Subject: [Datetime-SIG] Another round on error-checking In-Reply-To: <55E75B62.2060905@oddbird.net> References: <55E5D3F5.40600@oddbird.net> <55E75B62.2060905@oddbird.net> Message-ID: [Carl Meyer ] > ... > To summarize: trying to disambiguate folds leads to contradiction if the > implementation doesn't fully accept a "timeline" view of tz-aware > datetimes, because in a "naive" view, the two overlapping times in a > fold are the _same time_. The very idea of disambiguation itself is a > "timeline view" concept; it's not consistent with naive time. Fun, isn't it? 
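In code, Carl's point is just this (again sketched with the fold/zoneinfo
spelling from a later Python, plus a made-up fixed-offset EST zone - none
of it normative):

    from datetime import datetime, timedelta, timezone
    from zoneinfo import ZoneInfo  # illustrative spelling only

    # Three ways to pin down the *second* 1:30 of 2015-11-01 in US/Eastern.
    eastern = ZoneInfo("America/New_York")
    as_fold = datetime(2015, 11, 1, 1, 30, fold=1, tzinfo=eastern)
    as_est = datetime(2015, 11, 1, 1, 30, tzinfo=timezone(timedelta(hours=-5), "EST"))
    as_utc = datetime(2015, 11, 1, 6, 30, tzinfo=timezone.utc)

    # They all name the same instant - but saying so is a timeline statement:
    print(as_fold.timestamp() == as_est.timestamp() == as_utc.timestamp())  # True
    # while naive time can't even tell the fold=1 spelling from fold=0:
    print(as_fold == as_fold.replace(fold=0))                               # True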
[Tim] >> But it will create others. > Can we enumerate the specific problems this would create? That use of "we" appears to mean "anyone but Carl" ;-) The problems it could create depend on the contexts in which the PEP says fold would not be ignored. While nobody has mentioned it, it _could_ be that someone working in naive time would be annoyed even by dt.utcoffset() returning different results depending on `fold`. While it's a goal of the PEP to _make_ them differ in some cases, that in itself isn't wholly backward compatible in all conceivable cases. Then why would the use a 495-compliant tzinfo to begin with? Because they'er a user, and they don't understand any of this stuff ;-) > Let's hypothesize the following proposal: > > * As discussed in earlier threads, datetime is taught to respect a new > `strict` flag on tzinfo objects, treating aware datetimes as fully in > "timeline time," including for arithmetic, (only) if it is set. If it is > not set, no behavior changes from what we have today. Why conflate this with arithmetic? It's. e.g., quite possible someone wants correct interzone conversion in all cases without getting sucked into way-slower arithmetic too. For the purposes of 495, I'm going to pretend that using fold is controlled by the presence of a new tzinfo __fold__ attribute (we can't use a flag, because _existing_ tzinfos don't already have it). Arithmetic is a different issue. Presumably a `strict` tzinfo would be required to say "fold-aware" too, but also say more than just that. > * The `fold` flag is respected in any way (and ever set to anything > other than -1 by built-in methods) _only_ if the attached tzinfo has > `strict=True`. Since there's now a way to spell "ignore fold" versus "respect fold", there's no longer any point to fold < 0. "Ignore fold" is now the default, and "respect fold" has to be explicitly requested. For simplicity, any function that knows how to set fold correctly should be _allowed_ to do so regardless. > Now what problems would this cause? > > * Backwards compatibility is not a problem. There are no tzinfo classes > currently in existence with `strict=True`. It does appear to be wholly backward compatible, and that would be great. > * All of PEP 495's problems with hashes, equality, and ordering that > have been discussed in this thread are solved; `fold` is entirely unused > with non-strict tzinfo, and entirely consistent with strict tzinfo. There are still questions, like, e.g., what fold_aware_datetime + timedelta should do when fold=1, but only in my variation of what you proposed. You proposed mixing "pay attention to fold" with "timeline arithmetic", which leaves no choice. Alex and I seem to disagree about what to do when "only pay attention to fold" is meant instead. I think it makes a difference now that they're explicitly asking to respect fold - but I'm not yet sure _what_ difference it makes ;-) > * Ability to work with timezone-annotated datetimes (I can't say > "timezone-aware" with a straight face for datetimes that operate in > naive time) in naive time, which is a use case that some people have, is > preserved; just use a tzinfo with `strict=False`. "timezone-annotated" is a winner! LOL - what a frickin ' mess ;-) > * Working with a "timeline view" of tz-aware datetimes (which is also a > valid use case that some people have) becomes much simpler than it is > today; much simpler even than with pytz. I'm still to keen to push timeline arithmetic off to a later PEP. It doesn't have to be addressed to solve 495's problems. 
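And deferring it leaves nobody stranded: anyone who wants duration-style
results can already get them by hopping to UTC, doing plain (and fast)
classic arithmetic there, and hopping back for display.  A sketch, with
dates picked to straddle the 2015 US spring-forward and a zoneinfo-style
zone used only to have something concrete to point at:

    from datetime import datetime, timedelta, timezone
    from zoneinfo import ZoneInfo  # illustrative spelling only

    zone = ZoneInfo("America/New_York")
    bedtime = datetime(2015, 3, 7, 23, 30, tzinfo=zone)   # 11:30pm, night of spring-forward

    # Classic arithmetic moves the local clock hands by 8 hours:
    print(bedtime + timedelta(hours=8))
    # 2015-03-08 07:30:00-04:00  - only 7 real hours after bedtime

    # Timeline arithmetic, spelled as "do it in UTC":
    utc = timezone.utc
    print((bedtime.astimezone(utc) + timedelta(hours=8)).astimezone(zone))
    # 2015-03-08 08:30:00-04:00  - a true 8 hours later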
> It looks like all wins to me. Good food for thought. Thanks! From carl at oddbird.net Thu Sep 3 01:01:32 2015 From: carl at oddbird.net (Carl Meyer) Date: Wed, 2 Sep 2015 17:01:32 -0600 Subject: [Datetime-SIG] Another round on error-checking In-Reply-To: References: <55E5D3F5.40600@oddbird.net> <55E75B62.2060905@oddbird.net> Message-ID: <55E77FCC.9040507@oddbird.net> [Tim] > [Carl Meyer ] >> ... >> To summarize: trying to disambiguate folds leads to contradiction if the >> implementation doesn't fully accept a "timeline" view of tz-aware >> datetimes, because in a "naive" view, the two overlapping times in a >> fold are the _same time_. The very idea of disambiguation itself is a >> "timeline view" concept; it's not consistent with naive time. > > Fun, isn't it? I think the number of people for whom this qualifies as "fun" approaches the number of people who have ever implemented a tzinfo ;) > [Tim] >>> But it will create others. > >> Can we enumerate the specific problems this would create? > > That use of "we" appears to mean "anyone but Carl" ;-) Right - I mis-read the referent of "it" above. You were talking about the proposal to make fold=1 (only) force "timeline view". I understand the problems that causes. I mis-read and thought you were suggesting the possibility of "use timeline view always," and saying _that_ "creates other problems." So I was trying to think of what problems those would be, and not thinking of any. [Carl] >> Let's hypothesize the following proposal: >> >> * As discussed in earlier threads, datetime is taught to respect a new >> `strict` flag on tzinfo objects, treating aware datetimes as fully in >> "timeline time," including for arithmetic, (only) if it is set. If it is >> not set, no behavior changes from what we have today. [Tim] > Why conflate this with arithmetic? It's. e.g., quite possible someone > wants correct interzone conversion in all cases without getting sucked > into way-slower arithmetic too. One reason to conflate with arithmetic is to limit the number of mental models people have to comprehend. If we conflate, there would be two models: "naive model" and "timeline model", and the choice between them would be controlled by one flag. I think that's already more than enough complexity for most people, but it's simplicity itself compared to the possibility that we could end up with three models: "naive model", "timeline model for conversions but still naive for arithmetic", and "timeline model". ISTM the second is too confusing and inconsistent in its view of the world to be featured as a primary mode; if someone really needs it, it'd be easy enough to write functions to do fast naive arithmetic on strict-aware datetimes (strip the tzinfo, then add it back). (The write-your-own-function argument can go both ways! ;) > For the purposes of 495, I'm going to > pretend that using fold is controlled by the presence of a new tzinfo > __fold__ attribute (we can't use a flag, because _existing_ tzinfos > don't already have it). As an API choice I think "boolean flag with default if not present" is preferable to "mere existence of an attribute causes a switch in behavior, regardless of its value." But this is definitely a low-order bit here. >> * The `fold` flag is respected in any way (and ever set to anything >> other than -1 by built-in methods) _only_ if the attached tzinfo has >> `strict=True`. > > Since there's now a way to spell "ignore fold" versus "respect fold", > there's no longer any point to fold < 0. 
"Ignore fold" is now the > default, and "respect fold" has to be explicitly requested. > > For simplicity, any function that knows how to set fold correctly > should be _allowed_ to do so regardless. Yes, good point. >> * All of PEP 495's problems with hashes, equality, and ordering that >> have been discussed in this thread are solved; `fold` is entirely unused >> with non-strict tzinfo, and entirely consistent with strict tzinfo. > > There are still questions, like, e.g., what > > fold_aware_datetime + timedelta > > should do when fold=1, but only in my variation of what you proposed. > You proposed mixing "pay attention to fold" with "timeline > arithmetic", which leaves no choice. Yes. Point for my proposal :-) The fact that that even has to be a question illustrates how "timeline-mode conversions with fold disambiguation, but naive model for arithmetic" remains a problematic split-brain model that leads to inconsistencies. [Tim] > I'm still to keen to push timeline arithmetic off to a later PEP. It > doesn't have to be addressed to solve 495's problems. I think you've convincingly demonstrated in this thread that conversions, equality, comparisons, and arithmetic _are_ all fundamentally linked. If you try to cut them apart and handle some with a timeline model and some with a naive model, you'll have to violate a reasonably-expected invariant _somewhere_. Carl -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: OpenPGP digital signature URL: From alexander.belopolsky at gmail.com Thu Sep 3 01:05:04 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Wed, 2 Sep 2015 19:05:04 -0400 Subject: [Datetime-SIG] Another round on error-checking In-Reply-To: References: <55E5D3F5.40600@oddbird.net> <55E75B62.2060905@oddbird.net> Message-ID: On Wed, Sep 2, 2015 at 6:26 PM, Tim Peters wrote: > There are still questions, like, e.g., what > > fold_aware_datetime + timedelta > > should do when fold=1, but only in my variation of what you proposed. > You proposed mixing "pay attention to fold" with "timeline > arithmetic", which leaves no choice. Alex and I seem to disagree > about what to do when "only pay attention to fold" is meant instead. > This is one of those cases where I don't have a strong opinion. Unlike the datetime - datetime case where we have a strong argument to do timeline arithmetic in the presence of fold=1 (namely to preserve the hash invariant), any choice here will lead to surprises. What should [01:30/fold=1] - (1 hour) yield? Given that [01:30/fold=0] + (1 hour) = [02:30/fold=0] and [00:30/fold=0] + (1 hour) = [01:30/fold=0], both answers [01:30/fold=0] and [00:30/fold=0] are equally wrong. The third possibility, [00:30/fold=1] is probably more wrong than the first two. Whatever logic we will end up implementing will likely need to be modified by the applications to fit their needs. In this case, I think we need to provide the faster to compute option so that applications don't end up undoing some expensive operations. As Guido said, arithmetic is a way to move the hands of the clock. It does not need to be a way to mess with the fold attribute. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From tim.peters at gmail.com Thu Sep 3 02:39:41 2015 From: tim.peters at gmail.com (Tim Peters) Date: Wed, 2 Sep 2015 19:39:41 -0500 Subject: [Datetime-SIG] Another round on error-checking In-Reply-To: <55E77FCC.9040507@oddbird.net> References: <55E5D3F5.40600@oddbird.net> <55E75B62.2060905@oddbird.net> <55E77FCC.9040507@oddbird.net> Message-ID: We should be clearer about something up front: there are _two_ notions of "backward compatibility" here: 1. After PEP 495 is implemented but before any 495-compliant tzinfos exist. 2. After PEP 495 is implemented and the user explicitly employs a 495-compliant tzinfo. There are few potential issues under #1, but - alas - not none, and - double alas - none of them would be altered by any kind of flag or inheritance pattern on tzinfos. Under #1, no tzinfos know anything about `fold`, so little _could_ possibly change. Users can explicitly set fold=1, and if so they deserve whatever they get ;-) But, as I read the PEP, there are 3 places Python may force fold=1 all on its own, all based on Python's own (independent of any tzinfo) idea of the local timezone: A. datetime.fromtimestamp() without a trzinfo argument. B. datetime.now(). C. datetime.today() [The PEP doesn't mention #C, but it probably should (it's defined to be equivalent to passing time.time() to #A)] Since none of these consult a tzinfo, they can start producing fold=1 immediately, and nothing can stop that. So I take almost everything back ;-) 1. No trick with tzinfos can make a lick of difference to what A/B/C will do from the start. 2. Because of #1, the idea of explicitly saying "I want a fold-aware tzinfo" is the same thing as using a 495-compliant tzinfo. Keep using pre-495 tzinfos, and A/B/C remain your only worries. But they're minor worries at worst, since in #1 no tzinfos pay any attention to `fold` - the fundamental .utcoffset() in a pre-495 tzinfo is oblivious to `fold`. BTW, it may be useful to add a standardized (by the PEP) way for a tzinfo to _say_ "I implement 495". Like a magic new attribute. Then code that cares could use hasattr() to refuse or require using 495-compliant tzinfos. [Carl] > ... > One reason to conflate with arithmetic is to limit the number of mental > models people have to comprehend. If we conflate, there would be two > models: "naive model" and "timeline model", and the choice between them > would be controlled by one flag. > > I think that's already more than enough complexity for most people, but > it's simplicity itself compared to the possibility that we could end up > with three models: "naive model", "timeline model for conversions but > still naive for arithmetic", and "timeline model". > > ISTM the second is too confusing and inconsistent in its view of the > world to be featured as a primary mode; if someone really needs it, it'd > be easy enough to write functions to do fast naive arithmetic on > strict-aware datetimes (strip the tzinfo, then add it back). (The > write-your-own-function argument can go both ways! ;) It was always intended that users who wanted timeline arithmetic work in UTC instead. Everyone agrees that's best practice for many reasons. "Even Stuart" ;-) will agree with the latter. As to using functions, they're not symmetric situations: classic arithmetic is very fast, so fast that the overheads of calling a function and mucking around with stripping/reattaching tzinfos would be a major speed hit. timeline arithmetic is so slow that hardly matters. 
But work in UTC, as intended, and timeline arithmetic is the same thing as classic arithmetic, so is also very fast when performed the intended way. > ... > The fact that that even has to be a question illustrates how > "timeline-mode conversions with fold disambiguation, but naive model for > arithmetic" remains a problematic split-brain model that leads to > inconsistencies. I'm more inclined now to see it as an illustration that Alex's view is right: datetime +/- timedelta should indeed ignore fold. If I wanted timeline arithmetic, I should have been working in UTC from the start ;-) ... >> I'm still to keen to push timeline arithmetic off to a later PEP. It >> doesn't have to be addressed to solve 495's problems. > I think you've convincingly demonstrated in this thread that > conversions, equality, comparisons, and arithmetic _are_ all > fundamentally linked. If you try to cut them apart and handle some with > a timeline model and some with a naive model, you'll have to violate a > reasonably-expected invariant _somewhere_. Python already did, using timeline arithmetic for cross-zone subtraction and comparisons, and (necessarily so) for timezone conversions, but classic arithmetic for all other intrazone computations. Mucking with that old model really does belong in a different PEP. We're having quite enough pain already just figuring out what can go wrong with a single new bit ;-) From tim.peters at gmail.com Thu Sep 3 02:48:54 2015 From: tim.peters at gmail.com (Tim Peters) Date: Wed, 2 Sep 2015 19:48:54 -0500 Subject: [Datetime-SIG] Another round on error-checking In-Reply-To: References: <55E5D3F5.40600@oddbird.net> <55E75B62.2060905@oddbird.net> Message-ID: [Tim] >> There are still questions, like, e.g., what >> >> fold_aware_datetime + timedelta >> >> should do when fold=1, but only in my variation of what you proposed. >> You proposed mixing "pay attention to fold" with "timeline >> arithmetic", which leaves no choice. Alex and I seem to disagree >> about what to do when "only pay attention to fold" is meant instead. [Alex] > This is one of those cases where I don't have a strong opinion. I do: it should ignore fold=1. Precisely the opposite of what you _thought_ I've been saying ;-) > Unlike the datetime - datetime case where we have a strong argument > to do timeline arithmetic in the presence of fold=1 (namely to preserve > the hash invariant), And total ordering, and equivalence between comparison outcomes and subtraction results. There are any number of "common sense" invariants that rely on this. > any choice here will lead to surprises. Indeed so. So screw it ;-) > ... From carl at oddbird.net Thu Sep 3 05:24:25 2015 From: carl at oddbird.net (Carl Meyer) Date: Wed, 2 Sep 2015 21:24:25 -0600 Subject: [Datetime-SIG] Another round on error-checking In-Reply-To: References: <55E5D3F5.40600@oddbird.net> <55E75B62.2060905@oddbird.net> <55E77FCC.9040507@oddbird.net> Message-ID: <55E7BD69.3060905@oddbird.net> [Tim] >>> I'm still to keen to push timeline arithmetic off to a later PEP. It >>> doesn't have to be addressed to solve 495's problems. [Carl] >> I think you've convincingly demonstrated in this thread that >> conversions, equality, comparisons, and arithmetic _are_ all >> fundamentally linked. If you try to cut them apart and handle some with >> a timeline model and some with a naive model, you'll have to violate a >> reasonably-expected invariant _somewhere_. 
[Tim] > Python already did, using timeline arithmetic for cross-zone > subtraction and comparisons, and (necessarily so) for timezone > conversions, but classic arithmetic for all other intrazone > computations. I know :( > Mucking with that old model really does belong in a > different PEP. We're having quite enough pain already just figuring > out what can go wrong with a single new bit ;-) But the point is that changing that model (in a backwards-compatible way, by means of a tzinfo flag) to draw a clear line between timeline-mode and naive-mode, _eliminates_ almost all of that pain. All these puzzles about arithmetic, ordering, equality, and hashing go away entirely (that is, they have obvious and unsurprising answers). So doing these two things together doesn't add to the net pain; it reduces it considerably. Carl -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: OpenPGP digital signature URL: From carl at oddbird.net Thu Sep 3 05:36:21 2015 From: carl at oddbird.net (Carl Meyer) Date: Wed, 2 Sep 2015 21:36:21 -0600 Subject: [Datetime-SIG] Another round on error-checking In-Reply-To: References: <55E5D3F5.40600@oddbird.net> <55E75B62.2060905@oddbird.net> Message-ID: <55E7C035.7080707@oddbird.net> [Tim] >>> There are still questions, like, e.g., what >>> >>> fold_aware_datetime + timedelta >>> >>> should do when fold=1, but only in my variation of what you proposed. >>> You proposed mixing "pay attention to fold" with "timeline >>> arithmetic", which leaves no choice. Alex and I seem to disagree >>> about what to do when "only pay attention to fold" is meant instead. [Alex] >> This is one of those cases where I don't have a strong opinion. [Tim] > I do: it should ignore fold=1. Precisely the opposite of what you > _thought_ I've been saying ;-) [Alex] >> Unlike the datetime - datetime case where we have a strong argument >> to do timeline arithmetic in the presence of fold=1 (namely to preserve >> the hash invariant), > > And total ordering, and equivalence between comparison outcomes and > subtraction results. There are any number of "common sense" > invariants that rely on this. IIUC, choosing this combination of behavior means that it is possible to have a datetime `dt1` (with fold=1) such that: dt1 - dt2 => delta where `fold` is respected in this case, but dt2 + delta != dt1 because fold is ignored for timedelta arithmetic (but is respected for equality-checking, because that's necessary to maintain the hashing invariant). Are we really so wedded to maintaining an unpredictable hybrid naive/aware model for timezone-annotated datetimes that we're willing to break basic invariants of arithmetic and equality to preserve it? >> any choice here will lead to surprises. > > Indeed so. So screw it ;-) An alternative to "so screw it" in the face of this puzzle would be to choose the option that preserves all the invariants and behaves predictably in all cases. But I suppose that would make things too easy... Carl -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: OpenPGP digital signature URL: From carl at oddbird.net Thu Sep 3 05:47:26 2015 From: carl at oddbird.net (Carl Meyer) Date: Wed, 2 Sep 2015 21:47:26 -0600 Subject: [Datetime-SIG] Another round on error-checking In-Reply-To: References: <55E5D3F5.40600@oddbird.net> <55E75B62.2060905@oddbird.net> Message-ID: <55E7C2CE.5030407@oddbird.net> [Tim] > [Carl Meyer ] >> ... >> To summarize: trying to disambiguate folds leads to contradiction if the >> implementation doesn't fully accept a "timeline" view of tz-aware >> datetimes, because in a "naive" view, the two overlapping times in a >> fold are the _same time_. The very idea of disambiguation itself is a >> "timeline view" concept; it's not consistent with naive time. > > Fun, isn't it? If this is a fair summary, then why are we still trying to both keep a "naive" model for aware datetimes and also disambiguate folds, when we've just accepted that the two concepts are inherently contradictory and combining them inevitably will lead to surprises? If timezone-annotated datetimes in Python are really just supposed to represent naive clock time with an associated timezone, then there is no point in trying to disambiguate at a fold; both sides of the fold are the same naive clock time in the same timezone. If timezone-annotated datetimes in Python represent an unambiguously UTC-convertible instant, then why shouldn't they consistently behave that way (and happily eliminate all the surprising corner cases from PEP 495)? If they are supposed to represent some quantum hybrid of the two, where in some situations they behave like one and in some situations like the other (that is the status quo, of course), is there a concisely-stated consistent rule by which one can predict when they will behave like one and when they will behave like the other? Will that rule still apply post-PEP-495? Carl -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: OpenPGP digital signature URL: From carl at oddbird.net Thu Sep 3 06:02:39 2015 From: carl at oddbird.net (Carl Meyer) Date: Wed, 2 Sep 2015 22:02:39 -0600 Subject: [Datetime-SIG] Another round on error-checking In-Reply-To: References: <55E5D3F5.40600@oddbird.net> <55E75B62.2060905@oddbird.net> <55E77FCC.9040507@oddbird.net> Message-ID: <55E7C65F.8050106@oddbird.net> On 09/02/2015 06:39 PM, Tim Peters wrote: > We should be clearer about something up front: there are _two_ > notions of "backward compatibility" here: > > 1. After PEP 495 is implemented but before any 495-compliant tzinfos exist. > > 2. After PEP 495 is implemented and the user explicitly employs a > 495-compliant tzinfo. > > There are few potential issues under #1, but - alas - not none, and - > double alas - none of them would be altered by any kind of flag or > inheritance pattern on tzinfos. If the `fold` attribute is entirely ignored, in all cases, unless a new-style tzinfo is present, then there are no issues under #1. > [Carl] >> ... >> One reason to conflate with arithmetic is to limit the number of mental >> models people have to comprehend. If we conflate, there would be two >> models: "naive model" and "timeline model", and the choice between them >> would be controlled by one flag. 
>> >> I think that's already more than enough complexity for most people, but >> it's simplicity itself compared to the possibility that we could end up >> with three models: "naive model", "timeline model for conversions but >> still naive for arithmetic", and "timeline model". >> >> ISTM the second is too confusing and inconsistent in its view of the >> world to be featured as a primary mode; if someone really needs it, it'd >> be easy enough to write functions to do fast naive arithmetic on >> strict-aware datetimes (strip the tzinfo, then add it back). (The >> write-your-own-function argument can go both ways! ;) > > It was always intended that users who wanted timeline arithmetic work > in UTC instead. Everyone agrees that's best practice for many > reasons. "Even Stuart" ;-) will agree with the latter. For apps doing heavy datetime arithmetic, I agree that working in UTC is best (and that's what I do). It would also be reasonable to say that if you want naive arithmetic with an implied timezone shared by all instances, the best practice is to use naive datetimes and track the implied timezone separately. But given that we're not proposing to raise an exception on all arithmetic with tz-annotated datetimes, it has to behave _somehow_, and it should behave in the least-surprising and most-consistent way possible. In a post-PEP-495 world, it is abundantly clear that consistent timeline arithmetic would require fewer (that is, zero) surprising violations of invariants. If a new Python user is trying to calculate how long they slept when they went to bed at 10pm on March 2 and got up at 6am on March 3, Python should give them the right answer. Telling them "you should convert to UTC first if you want your tz-aware datetime to actually be aware of the tz transition" is going to sound a bit silly to them; they neither went to sleep in UTC nor awoke in UTC; they did both in their own timezone. > As to using functions, they're not symmetric situations: classic > arithmetic is very fast, so fast that the overheads of calling a > function and mucking around with stripping/reattaching tzinfos would > be a major speed hit. timeline arithmetic is so slow that hardly > matters. But work in UTC, as intended, and timeline arithmetic is the > same thing as classic arithmetic, so is also very fast when performed > the intended way. Ok, continue using an old-style tzinfo (without the new `strict` attribute) and you can continue to have fast classic arithmetic on tz-annotated datetimes forever. Or you can use a strict tzinfo and have tz-aware datetimes that unambiguously represent a UTC-convertible instant. But how many contortions and surprising behaviors is it worth to try to provide both of those at once, in the same object? Carl -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: OpenPGP digital signature URL: From tim.peters at gmail.com Thu Sep 3 06:07:14 2015 From: tim.peters at gmail.com (Tim Peters) Date: Wed, 2 Sep 2015 23:07:14 -0500 Subject: [Datetime-SIG] Another round on error-checking In-Reply-To: <55E7BD69.3060905@oddbird.net> References: <55E5D3F5.40600@oddbird.net> <55E75B62.2060905@oddbird.net> <55E77FCC.9040507@oddbird.net> <55E7BD69.3060905@oddbird.net> Message-ID: [Carl Meyer] > But the point is that changing that model (in a backwards-compatible > way, by means of a tzinfo flag) to draw a clear line between > timeline-mode and naive-mode, _eliminates_ almost all of that pain. 
All > these puzzles about arithmetic, ordering, equality, and hashing go away > entirely (that is, they have obvious and unsurprising answers). The puzzles about arithmetic, ordering, equality and hashing have already been resolved. The problems were all due to a single cause: ignoring fold=1 where it really matters. There remain no significant backward-compatibility issues until 495-compliant tzinfos exist. Then people can choose to use them, or not. Up to them. > So doing these two things together doesn't add to the net pain; it > reduces it considerably. You're trying to retroactively change datetime's original design. It simply won't fly. Classic arithmetic was intentional in Python. It's unreasonable to ask people to settle for arithmetic at best 10x slower just to get correct timezone conversions (your idea of "backward compatible": get both or neither, and only "neither" is _really_ backward-compatible - more below). pytz users are certainly free to chose that, but we can't inflict it on everyone. Worse for your view, Guido wouldn't _want_ to regardless. Under 495's view, you can get fast timeline arithmetic _and_ correct conversions just by working in UTC. Stop fighting the intent, and life is easy. I also need to mention that your idea requires a lot more changes to the core Python code, from implementing timeline arithmetic internally to slowing down everything all the time with "is this the right kind of tzinfo?" conditional branches. then doing entirely different things depending on the outcome. Layers of complication do not generally increase robustness ;-) Even then, it's certain to be backward _incompatible_ with mounds of code if they choose to use the "fold and timeline" option. I have, for example, previously shown pieces of Python's own datetime implementation, and of my own code, that _require_ using classic arithmetic. Python's own datetime implementation would fail in sundry miserable ways under your option. All such places could be changed to live with timeline arithmetic, but they can't find and fix themselves by magic. Neither can any other user code implicitly or explicitly relying on classic arithmetic. Since classic _has_ been used forever, it's certain that lots of code does. 495 triggers no such problems. So I await the patch ;-) In its absence, we'll likely continue taking one useful, small step at a time. From carl at oddbird.net Thu Sep 3 06:16:22 2015 From: carl at oddbird.net (Carl Meyer) Date: Wed, 2 Sep 2015 22:16:22 -0600 Subject: [Datetime-SIG] Another round on error-checking In-Reply-To: References: <55E5D3F5.40600@oddbird.net> <55E75B62.2060905@oddbird.net> <55E77FCC.9040507@oddbird.net> <55E7BD69.3060905@oddbird.net> Message-ID: <55E7C996.5060009@oddbird.net> On 09/02/2015 10:07 PM, Tim Peters wrote: > [Carl Meyer] >> But the point is that changing that model (in a backwards-compatible >> way, by means of a tzinfo flag) to draw a clear line between >> timeline-mode and naive-mode, _eliminates_ almost all of that pain. All >> these puzzles about arithmetic, ordering, equality, and hashing go away >> entirely (that is, they have obvious and unsurprising answers). > > The puzzles about arithmetic, ordering, equality and hashing have > already been resolved. The problems were all due to a single cause: > ignoring fold=1 where it really matters. But aren't we still left with arithmetic that violates basic invariants in the presence of a fold=1 datetime? 
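To make the question concrete, a sketch (written with the fold spelling and
a zoneinfo-style zone, and showing the variant where intrazone operations
ignore fold entirely):

    from datetime import datetime, timezone
    from zoneinfo import ZoneInfo  # illustrative spelling only

    zone = ZoneInfo("America/New_York")
    early = datetime(2015, 11, 1, 1, 30, tzinfo=zone)   # first 1:30 (EDT)
    late = early.replace(fold=1)                        # second 1:30 (EST)

    print(late == early)    # True     - inside the zone, fold is invisible
    print(late - early)     # 0:00:00  - ditto for subtraction
    print(late.astimezone(timezone.utc) - early.astimezone(timezone.utc))
    # 1:00:00 - but the "equal" pair is a full hour apart once UTC gets involved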
[Tim] > It's unreasonable to ask people to settle for arithmetic at best 10x > slower just to get correct timezone conversions If the intended meaning of a tz-annotated datetime is "naive clock time with an associated timezone", then we don't need PEP 495; timezone conversions are already as correct as the model allows. PEP 495 just worsens the existing "naive or aware?" identity crisis of tz-annotated datetimes. > So I await the patch ;-) Fair! I'll work on one :-) > In its absence, we'll likely continue taking one useful, small step at a time. It's no longer clear to me that PEP 495 is a useful step. Carl -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: OpenPGP digital signature URL: From tim.peters at gmail.com Thu Sep 3 08:17:12 2015 From: tim.peters at gmail.com (Tim Peters) Date: Thu, 3 Sep 2015 01:17:12 -0500 Subject: [Datetime-SIG] Another round on error-checking In-Reply-To: <55E7C65F.8050106@oddbird.net> References: <55E5D3F5.40600@oddbird.net> <55E75B62.2060905@oddbird.net> <55E77FCC.9040507@oddbird.net> <55E7C65F.8050106@oddbird.net> Message-ID: I'm out of time for tonight, but will try to make more tomorrow. Just one for now, because I think it cuts to the _real_ heart of this batch of messages: [Carl Meyer ] > ... > If a new Python user is trying to calculate how long they slept when > they went to bed at 10pm on March 2 and got up at 6am on March 3, Python > should give them the right answer. Telling them "you should convert > to UTC first if you want your tz-aware datetime to actually be aware of > the tz transition" is going to sound a bit silly to them; they neither went > to sleep in UTC nor awoke in UTC; they did both in their own timezone. That's the heart: you simply despise classic arithmetic. This example has nothing to with PEP 495 - it's a complaint about classic arithmetic, period. It's more likely that a new user will want to set an alarm to get up at 6am, then add timedelta(days=1) to set a new alarm for "same time next day". They'd be surprised and annoyed if that ended up at 7am or 9am just because DST switched. Their stupid alarm-setting code works fine today, and will continue to work fine with a 495-compliant tzinfo when they're available. It doesn't help to point out that "period arithmetic" _could_ be done in some other way. These particular kinds of uses already work, and always have. Different purposes require different kinds of arithmetic. Python picked one. That timeline arithmetic wasn't the choice doesn't mean Python despises it. It was just judged "probably less useful overall - and there are other, better ways to get it". You can disagree with that choice, but it can't be changed now. I know, you're not proposing to change it: you're proposing to leave it exactly the way it is, but exploit the desire for correct timezone conversions to sneak timeline arithmetic into the core - because that's "the only sane way" to do it. Tricky ;-) Once your new user understood the _potential_ problems when dealing with pseudo-real-world durations in classic arithmetic, no, for something this trivial I wouldn't advise converting to UTC explicitly. Instead I'd give them a 1-line Python function implementing timeline datetime-datetime subtraction, which they can use forever after. It can't always work right today, because conversions alone can't always work right today. 
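Something on the order of this (the helper and its name are made up on the
spot; it leans on nothing but .astimezone(), and the dates below straddle
the real 2015 spring-forward so the two answers differ):

    from datetime import datetime, timezone
    from zoneinfo import ZoneInfo  # any correct tzinfo works; zoneinfo is just for show

    def timeline_sub(a, b):
        # "How much real time elapsed?" - compare the two moments in UTC.
        return a.astimezone(timezone.utc) - b.astimezone(timezone.utc)

    zone = ZoneInfo("America/New_York")
    bed = datetime(2015, 3, 7, 22, 0, tzinfo=zone)   # 10pm Saturday
    up = datetime(2015, 3, 8, 6, 0, tzinfo=zone)     # 6am Sunday, after the clocks jumped

    print(up - bed)               # 8:00:00 - classic: the clock-face difference
    print(timeline_sub(up, bed))  # 7:00:00 - timeline: the night really lasted 7 hours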
When the user obtains a 495-compliant tzinfo , the same function will always work right, by magic. But _only_ under 495. Under your view, timezone conversion would continue to fail in some cases, because users who didn't want to drink the timeline-all-the-time Kool-Aid would be left out. Also left out would be users who usually want classic arithmetic but _do_ convert to UTC for fancier stuff: conversion to UTC would continue to give rare wrong results for them too. So you're not really looking to do anything for anyone, _except_ for those who want the whole timeline enchilada. That's a legitimate view, but in particular it wouldn't help me a bit ;-) I _want_ what 495 is offering. I usually want classic arithmetic. When I want timeline arithmetic, I switch to UTC, or use a 1-liner, and I'd sleep a tad better if the latter two always did work correctly. BTW, if your new user is also a physicist, we'[ll _both_ need to give them a much more annoying function, in case they ask the question near the end of June or December, and need to account for a leap second that may have occurred while they were sleeping. From carl at oddbird.net Thu Sep 3 12:59:48 2015 From: carl at oddbird.net (Carl Meyer) Date: Thu, 3 Sep 2015 04:59:48 -0600 Subject: [Datetime-SIG] Another round on error-checking In-Reply-To: References: <55E5D3F5.40600@oddbird.net> <55E75B62.2060905@oddbird.net> <55E77FCC.9040507@oddbird.net> <55E7C65F.8050106@oddbird.net> Message-ID: <55E82824.7020607@oddbird.net> [Tim] > I'm out of time for tonight, but will try to make more tomorrow. Just > one for now, because I think it cuts to the _real_ heart of this batch > of messages: I don't think this cuts to the heart of anything :/ I think it avoids the main point I've made (several times) to latch instead onto a tangent I should have left out. [Tim] > That's the heart: you simply despise classic arithmetic. Sorry, but no. I have nothing at all against naive arithmetic. I think both naive arithmetic and timeline arithmetic have good use cases. What I have trouble with is a tz-annotated datetime object that fundamentally can't decide whether it's living in a naive or timeline model, and thus behaves unpredictably. This is a problem today, but at least the behavior can be explained fairly simply: the model is naive when operating within the same timezone, and aware anytime you're converting between timezones or interoperating between timezones. PEP 495, AFAICS, makes the problem worse, because it introduces another bit of information that only makes sense in a timeline view. That new bit now allows round-tripping from UTC, which is great (no problem, because conversions are an area where tz-annotated datetimes already tried to behave as tz-aware instants in time). But then it can't quite decide how to rationalize that new bit of information with its naive internal view of time, so it settles on a mish-mash of inconsistent behavior that violates basic arithmetic identities we all learned in elementary school and only makes any sense if you've followed this entire thread. If you want to cut to the heart of the matter, tell me how you would write the documentation for how arithmetic works on a tz-annotated datetime post-PEP-495. Does it work on a naive "move the hands of the clock" model? (No, because I can subtract 1:30AM from 2:30AM and get "2 hours" in some cases.) Does it work on a UTC timeline model? (No, clearly not.) So what is the model, stated precisely and concisely? 
And is it actually backwards-compatible with current code that converts from UTC to local time and then does arithmetic on those local times, or compares them to each other? (Not around a DST transition, no.) Carl -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: OpenPGP digital signature URL: From alexander.belopolsky at gmail.com Thu Sep 3 16:27:48 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Thu, 3 Sep 2015 10:27:48 -0400 Subject: [Datetime-SIG] Another round on error-checking In-Reply-To: <55E82824.7020607@oddbird.net> References: <55E5D3F5.40600@oddbird.net> <55E75B62.2060905@oddbird.net> <55E77FCC.9040507@oddbird.net> <55E7C65F.8050106@oddbird.net> <55E82824.7020607@oddbird.net> Message-ID: On Thu, Sep 3, 2015 at 6:59 AM, Carl Meyer wrote: > But then it can't quite > decide how to rationalize that new bit of information with its naive > internal view of time, so it settles on a mish-mash of inconsistent > behavior that violates basic arithmetic identities we all learned in > elementary school and only makes any sense if you've followed this > entire thread. > It is actually easier to understand if you *don't* read this thread because some of the earlier posts (including my own) are quite confusing. The rule we settled on is quite simple and consistent with the status quo. First, you need to realize that aware fold=1 times *can* be represented in the current version of datetime, but you must use a different tzinfo for that. Popular choices are timezone.utc or the fictitious fixed offset standard time zone. (I call these zones fictitious because they represent a possibly non-existing time zone which does not observe DST changes.) For example, in US/Eastern, if you want to represent [01:30/fold=1], you can either use [01:30/tzinfo=EST] or [06:30/tzinfo=UTC] which conveniently compare as equal. What 495 gives you is the third way to spell the same time: [01:30/fold=1,tzinfo=Eastern]. It is quite natural that this third spelling will have exactly the same properties as the first two: [01:30/fold=0] < [01:59/fold=0] < [06:30/tzinfo=UTC] == [01:30/tzinfo=EST] == [01:30/fold=1] < [02:00/fold=0] The only "basic arithmetic identities" that are being violated here are the ones that are already violated by aware datetimes. For example (t1 - u) - (t2 - u) is not equal to t1 - t2 if u is a tzinfo=UTC instance and t1 and t2 are two tzinfo=Eastern instances on the different sides of the gap. -------------- next part -------------- An HTML attachment was scrubbed... URL: From alexander.belopolsky at gmail.com Thu Sep 3 16:43:30 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Thu, 3 Sep 2015 10:43:30 -0400 Subject: [Datetime-SIG] Another round on error-checking In-Reply-To: References: <55E5D3F5.40600@oddbird.net> <55E75B62.2060905@oddbird.net> <55E77FCC.9040507@oddbird.net> <55E7C65F.8050106@oddbird.net> <55E82824.7020607@oddbird.net> Message-ID: On Thu, Sep 3, 2015 at 10:27 AM, Alexander Belopolsky < alexander.belopolsky at gmail.com> wrote: > .. are two tzinfo=Eastern instances on the different sides of the gap. s/gap/fold/ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From carl at oddbird.net Thu Sep 3 16:52:06 2015 From: carl at oddbird.net (Carl Meyer) Date: Thu, 3 Sep 2015 08:52:06 -0600 Subject: [Datetime-SIG] Another round on error-checking In-Reply-To: References: <55E75B62.2060905@oddbird.net> <55E77FCC.9040507@oddbird.net> <55E7C65F.8050106@oddbird.net> <55E82824.7020607@oddbird.net> Message-ID: <55E85E96.5050500@oddbird.net> [Alex] > The only "basic arithmetic identities" that are being violated here are > the ones that are already violated by aware datetimes. For example (t1 > - u) - (t2 - u) is not equal to t1 - t2 if u is a tzinfo=UTC instance > and t1 and t2 are two tzinfo=Eastern instances on the different sides of > the gap. Yes, you can already get such results, because aware datetimes are already sometimes aware and sometimes naive depending on context. That's a problem for learning the API, but it's at least an easily-explained problem: arithmetic within a timezone is always naive, arithmetic between timezones is always aware, if you mix the two (as your example does) you may get surprising results. I don't see any such easily comprehensible explanation for the new proposed PEP 495 behavior. It is no longer true that "arithmetic within a timezone is always naive." Now "arithmetic within a timezone is naive, unless you happen to have a particular kind of special time in a single hour once per year, in which case some kinds of arithmetic (dt/dt) are aware of the DST transition, but other kinds (dt/delta) still ignore it." Is that roughly what you propose to put in the documentation? Currently you only get results that violate arithmetic identities if you mix arithmetic within a timezone and arithmetic between timezones. Again, a simple rule. Under PEP 495, you can get such results even if you always stay within a single timezone. Carl -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: OpenPGP digital signature URL: From alexander.belopolsky at gmail.com Thu Sep 3 17:02:21 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Thu, 3 Sep 2015 11:02:21 -0400 Subject: [Datetime-SIG] Another round on error-checking In-Reply-To: <55E85E96.5050500@oddbird.net> References: <55E75B62.2060905@oddbird.net> <55E77FCC.9040507@oddbird.net> <55E7C65F.8050106@oddbird.net> <55E82824.7020607@oddbird.net> <55E85E96.5050500@oddbird.net> Message-ID: On Thu, Sep 3, 2015 at 10:52 AM, Carl Meyer wrote: > It is no longer true that "arithmetic within a timezone is always naive." > If you like this rule, you can keep it. :-) Just note that fold=1 instances are in a different timezone. This is unavoidable because within the same timezone fold=1 instances don't exist: 01:59 is followed by 02:00 with no room for "second 01:30" in between. -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.peters at gmail.com Thu Sep 3 17:05:40 2015 From: tim.peters at gmail.com (Tim Peters) Date: Thu, 3 Sep 2015 10:05:40 -0500 Subject: [Datetime-SIG] Another round on error-checking In-Reply-To: References: <55E5D3F5.40600@oddbird.net> <55E75B62.2060905@oddbird.net> <55E77FCC.9040507@oddbird.net> <55E7C65F.8050106@oddbird.net> <55E82824.7020607@oddbird.net> Message-ID: [Alex] >> The only "basic arithmetic identities" that are being violated here are >> the ones that are already violated by aware datetimes. 
For example >> (t1 - u) - (t2 - u) is not equal to t1 - t2 >> if u is a tzinfo=UTC instance and t1 and t2 are two tzinfo=Eastern >> instances on the different sides of the gap. [Alex] > s/gap/fold/ What you said is true either way (fold or gap); the sign of the hour difference (between the two expressions) just differs. Although _sometimes_ the expressions can be equal, if you move t1 and/or t2 far enough away from the gap/fold to encompass some number of _additional_ gaps/folds, so as to just cancel out overall. As an obvious example, pick d1 = 2000-01-01 and d2 = 2001-01-01. They're on different sides of one gap, but also on different sides of one fold. Then you get 366 days (2000 is a leap year) via either way of computing the difference. The conceptual muddying here is that this kind of stuff wasn't possible before when sticking within a _single_ zone. We are introducing oddball cases of timeline arithmetic into what used to be "surprise-free" classic arithmetic. I don't like that, but I'm not scared to death of it either. Yet ;-) From alexander.belopolsky at gmail.com Thu Sep 3 17:19:16 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Thu, 3 Sep 2015 11:19:16 -0400 Subject: [Datetime-SIG] Another round on error-checking In-Reply-To: References: <55E5D3F5.40600@oddbird.net> <55E75B62.2060905@oddbird.net> <55E77FCC.9040507@oddbird.net> <55E7C65F.8050106@oddbird.net> <55E82824.7020607@oddbird.net> Message-ID: On Thu, Sep 3, 2015 at 11:05 AM, Tim Peters wrote: > The conceptual muddying here is that this kind of stuff wasn't > possible before when sticking within a _single_ zone. > This is what Carl is complaining about, but once you realize that fold=1 on an ambiguous datetime instance effectively modifies the zone (changes the value returned by utcoffset()), it becomes quite natural. > We are introducing oddball cases of timeline arithmetic into what used > to be > "surprise-free" classic arithmetic. I don't like that, but I'm not > scared to death of it either. Yet ;-) > Wait for the next PEP update. :-) I am adding a section titled "An Overview of the Current State of Aware Arithmetic and Comparisons." A reader who will survive that won't be impressed by the additional PEP 495 rules. -------------- next part -------------- An HTML attachment was scrubbed... URL: From carl at oddbird.net Thu Sep 3 17:19:22 2015 From: carl at oddbird.net (Carl Meyer) Date: Thu, 3 Sep 2015 09:19:22 -0600 Subject: [Datetime-SIG] Another round on error-checking In-Reply-To: References: <55E75B62.2060905@oddbird.net> <55E77FCC.9040507@oddbird.net> <55E7C65F.8050106@oddbird.net> <55E82824.7020607@oddbird.net> <55E85E96.5050500@oddbird.net> Message-ID: <55E864FA.90704@oddbird.net> On 09/03/2015 09:02 AM, Alexander Belopolsky wrote: > On Thu, Sep 3, 2015 at 10:52 AM, Carl Meyer > wrote: > > It is no longer true that "arithmetic within a timezone is always > naive." > > If you like this rule, you can keep it. :-) Just note that fold=1 > instances are in a different timezone. Ok, so for most of the year when I do utctime.astimezone(Eastern), I get a result in Eastern, but during one hour of the year I get a result in "some other timezone that isn't quite Eastern" (but its tzinfo is still the same object as all the others). That's your proposal for a _less_ surprising interpretation? ;-) > This is unavoidable because > within the same timezone fold=1 instances don't exist: 01:59 is followed > by 02:00 with no room for "second 01:30" in between. Right. 
That's an excellent statement of why respecting `fold` at all is inconsistent with how tz-annotated datetimes are designed to behave in Python (they operate internally in naive time, in which the "fold" time does not even exist). Carl -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: OpenPGP digital signature URL: From chris.barker at noaa.gov Thu Sep 3 17:19:31 2015 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Thu, 3 Sep 2015 08:19:31 -0700 Subject: [Datetime-SIG] Another round on error-checking In-Reply-To: References: <55E5D3F5.40600@oddbird.net> <55E75B62.2060905@oddbird.net> <55E77FCC.9040507@oddbird.net> <55E7BD69.3060905@oddbird.net> Message-ID: <-7656714205635425925@unknownmsgid> > It's unreasonable to ask people to settle for arithmetic at best 10x > slower just to get correct timezone conversions I'm not sure. As has been pointed out, best practice is to use UTC or naive time anyway. So if the casual user wants to compute how long s/he slept last night, it can be slow. It's easier to document "computations are much faster in UTC" than to document all the surprising inconsistencies. And as for original intent -- my understanding of the entire architecture was designed NOT to be about fast arithmetic. If you want that, use tics or numpy.datetime64. And intentional or not, "classic" arithmetic may be easy to implement and fast, but it is hard to explain, surprising, and not very useful. -Chris From alexander.belopolsky at gmail.com Thu Sep 3 17:23:44 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Thu, 3 Sep 2015 11:23:44 -0400 Subject: [Datetime-SIG] Another round on error-checking In-Reply-To: <-7656714205635425925@unknownmsgid> References: <55E5D3F5.40600@oddbird.net> <55E75B62.2060905@oddbird.net> <55E77FCC.9040507@oddbird.net> <55E7BD69.3060905@oddbird.net> <-7656714205635425925@unknownmsgid> Message-ID: On Thu, Sep 3, 2015 at 11:19 AM, Chris Barker - NOAA Federal < chris.barker at noaa.gov> wrote: > If you want that, use tics or numpy.datetime64. > Chris, please stop promoting numpy.datetime64 here. It is definitely not a positive example of how a date/time manipulation library should be designed. -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.peters at gmail.com Thu Sep 3 17:30:12 2015 From: tim.peters at gmail.com (Tim Peters) Date: Thu, 3 Sep 2015 10:30:12 -0500 Subject: [Datetime-SIG] Another round on error-checking In-Reply-To: <55E85E96.5050500@oddbird.net> References: <55E75B62.2060905@oddbird.net> <55E77FCC.9040507@oddbird.net> <55E7C65F.8050106@oddbird.net> <55E82824.7020607@oddbird.net> <55E85E96.5050500@oddbird.net> Message-ID: [Alex] >> The only "basic arithmetic identities" that are being violated here are >> the ones that are already violated by aware datetimes. For example (t1 >> - u) - (t2 - u) is not equal to t1 - t2 if u is a tzinfo=UTC instance >> and t1 and t2 are two tzinfo=Eastern instances on the different sides of >> the gap. [Carl] > Yes, you can already get such results, because aware datetimes are > already sometimes aware and sometimes naive depending on context. That's > a problem for learning the API, but it's at least an easily-explained > problem: arithmetic within a timezone is always naive, arithmetic > between timezones is always aware, if you mix the two (as your example > does) you may get surprising results. 
> > I don't see any such easily comprehensible explanation for the new > proposed PEP 495 behavior. It can't possibly require more confusing words than _already_ exist trying to explain the subtleties behind why timezone conversion can fail in rare cases ;-) People _expect_ the obvious roundtrip identities there too. It's a tradeoff. The doc problem here seems much simpler: in arithmetic involving two datetimes, the operands will be treated as having distinct tzinfos if at least one has fold=1. It reduces to a prior case. The equally rare conversion problems require paragraph after paragraph to explain. > .. > Currently you only get results that violate arithmetic identities if you > mix arithmetic within a timezone and arithmetic between timezones. And we currently have timeline conversions that can violate basic identities in _that_ space. It is trading one for the other. From carl at oddbird.net Thu Sep 3 17:37:11 2015 From: carl at oddbird.net (Carl Meyer) Date: Thu, 3 Sep 2015 09:37:11 -0600 Subject: [Datetime-SIG] Another round on error-checking In-Reply-To: References: <55E75B62.2060905@oddbird.net> <55E77FCC.9040507@oddbird.net> <55E7C65F.8050106@oddbird.net> <55E82824.7020607@oddbird.net> <55E85E96.5050500@oddbird.net> Message-ID: <55E86927.1000103@oddbird.net> [Carl] >> Currently you only get results that violate arithmetic identities if you >> mix arithmetic within a timezone and arithmetic between timezones. [Tim] > And we currently have timeline conversions that can violate basic > identities in _that_ space. It is trading one for the other. Yes. The new proposed behavior for PEP 495 abandons the assertion that it can be "independent of arithmetic", recognizing that instead we're trading consistency of arithmetic within a timezone for consistency of round-trips between timezones. So PEP 495 is already breaking the design of datetime, that tz-annotated datetimes operate internally on a naive time model. It _has_ to break that design, because it must introduce times that don't exist in that model. But it's choosing to change that design piecemeal and inconsistently instead of thoroughly and consistently. Carl -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: OpenPGP digital signature URL: From alexander.belopolsky at gmail.com Thu Sep 3 17:38:52 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Thu, 3 Sep 2015 11:38:52 -0400 Subject: [Datetime-SIG] Another round on error-checking In-Reply-To: <55E864FA.90704@oddbird.net> References: <55E75B62.2060905@oddbird.net> <55E77FCC.9040507@oddbird.net> <55E7C65F.8050106@oddbird.net> <55E82824.7020607@oddbird.net> <55E85E96.5050500@oddbird.net> <55E864FA.90704@oddbird.net> Message-ID: On Thu, Sep 3, 2015 at 11:19 AM, Carl Meyer wrote: > > This is unavoidable because > > within the same timezone fold=1 instances don't exist: 01:59 is followed > > by 02:00 with no room for "second 01:30" in between. > > Right. That's an excellent statement of why respecting `fold` at all is > inconsistent with how tz-annotated datetimes are designed to behave in > Python (they operate internally in naive time, in which the "fold" time > does not even exist). I wish we could have a design where fold is always ignored when you have a single tzinfo. The reason we cannot has been explained several times in this thread. The core reason is possibly a mistake in the original design that permitted cross-zone arithmetic and comparison. 
If == was defined so that no two instances with different tzinfo ever compare equal and <, - and friends are only defined for datetimes sharing the tzinfo, we would not have this problem. Recall that datetime was designed at the time when it was thought that mixing bytes and unicode was a good idea. We all know what it took to fix that wart. I don't think cross-zone datetime arithmetic is an issue of the same scale or impact. -------------- next part -------------- An HTML attachment was scrubbed... URL: From carl at oddbird.net Thu Sep 3 17:42:02 2015 From: carl at oddbird.net (Carl Meyer) Date: Thu, 3 Sep 2015 09:42:02 -0600 Subject: [Datetime-SIG] Another round on error-checking In-Reply-To: References: <55E75B62.2060905@oddbird.net> <55E77FCC.9040507@oddbird.net> <55E7C65F.8050106@oddbird.net> <55E82824.7020607@oddbird.net> <55E85E96.5050500@oddbird.net> <55E864FA.90704@oddbird.net> Message-ID: <55E86A4A.6010503@oddbird.net> On 09/03/2015 09:38 AM, Alexander Belopolsky wrote: > On Thu, Sep 3, 2015 at 11:19 AM, Carl Meyer > wrote: > > This is unavoidable because > > within the same timezone fold=1 instances don't exist: 01:59 is followed > > by 02:00 with no room for "second 01:30" in between. > > Right. That's an excellent statement of why respecting `fold` at all is > inconsistent with how tz-annotated datetimes are designed to behave in > Python (they operate internally in naive time, in which the "fold" time > does not even exist). > > I wish we could have a design where fold is always ignored when you have > a single tzinfo. The reason we cannot has been explained several times > in this thread. The core reason is possibly a mistake in the original > design that permitted cross-zone arithmetic and comparison. If == was > defined so that no two instances with different tzinfo ever compare > equal and <, - and friends are only defined for datetimes sharing the > tzinfo, we would not have this problem. Yes, I understand why that doesn't work. There is an alternative solution available that avoids this problem, and all other inconsistencies. > Recall that datetime was > designed at the time when it was thought that mixing bytes and unicode > was a good idea. We all know what it took to fix that wart. I don't > think cross-zone datetime arithmetic is an issue of the same scale or > impact. True. Which should make it more feasible to fix. Carl -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: OpenPGP digital signature URL: From alexander.belopolsky at gmail.com Thu Sep 3 17:57:14 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Thu, 3 Sep 2015 11:57:14 -0400 Subject: [Datetime-SIG] Another round on error-checking In-Reply-To: <55E86A4A.6010503@oddbird.net> References: <55E75B62.2060905@oddbird.net> <55E77FCC.9040507@oddbird.net> <55E7C65F.8050106@oddbird.net> <55E82824.7020607@oddbird.net> <55E85E96.5050500@oddbird.net> <55E864FA.90704@oddbird.net> <55E86A4A.6010503@oddbird.net> Message-ID: On Thu, Sep 3, 2015 at 11:42 AM, Carl Meyer wrote: > There is an alternative solution available that avoids this problem, and > all other inconsistencies. > Really? PEP 495 has a more or less complete reference implementation in my github fork [1] of cpython. I have recently added the hash invariant preservation rule which required a change to the grand total of two lines in datetime.py. Something that is that easy to implement cannot be too hard to explain and document. 
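In test form, the invariant in question is just this (a sketch, not a line from the patch or the test suite; "Eastern" below is a stand-in for any PEP 495-aware US/Eastern tzinfo, and replace(fold=1) is the reference implementation's spelling):

    from datetime import datetime

    # "Eastern" is a stand-in for a PEP 495-aware US/Eastern tzinfo.
    dt1 = datetime(2015, 11, 1, 1, 30, tzinfo=Eastern)   # first 01:30 (EDT)
    dt2 = dt1.replace(fold=1)                            # second 01:30 (EST)
    assert dt1 == dt2                  # fold is ignored by == within one zone
    assert hash(dt1) == hash(dt2)      # so equal values must hash equal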
I would like to specifically point out that the only existing unit test that my patch has to modify is the one which checks that astimezone() method raises an exception on a naive datetime. I have not seen any "alternative solution" implemented anywhere. If you have not tried it yourself - trust me - keeping 4000+ lines of unit tests intact while adding features to the datetime module is not an easy task. [1]: https://github.com/abalkin/cpython/tree/issue24773 -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.peters at gmail.com Thu Sep 3 17:56:47 2015 From: tim.peters at gmail.com (Tim Peters) Date: Thu, 3 Sep 2015 10:56:47 -0500 Subject: [Datetime-SIG] Another round on error-checking In-Reply-To: <-7656714205635425925@unknownmsgid> References: <55E5D3F5.40600@oddbird.net> <55E75B62.2060905@oddbird.net> <55E77FCC.9040507@oddbird.net> <55E7BD69.3060905@oddbird.net> <-7656714205635425925@unknownmsgid> Message-ID: [Tim] >> It's unreasonable to ask people to settle for arithmetic at best 10x >> slower just to get correct timezone conversions [Chris Barker] > I'm not sure. As has been pointed out, best practice is to use UTC or > naive time anyway. We're not designing a new language here. Python already has more users than an instance of numpy.datetime64 has bits ;-) As is, working in UTC does nothing to help you get correct conversions in all cases. That problem has nothing to do with arithmetic. It has entirely to do with what PEP 495 is addressing: the current inability of a local time to record _which_ UTC time it corresponds to in ambiguous cases. timeline vs classic arithmetic is irrelevant to that "in theory". In practice, it seems to be unfortunately true that resolving it in a way that plays nice with everything else requires muddying the classic arithmetic rules in some rare cases. > So if the casual user wants to compute how long s/he slept last night, > it can be slow. It's easier to document "computations are much faster > in UTC" than to document all the surprising inconsistencies. Ditto. > And as for original intent -- my understanding of the entire > architecture was designed NOT to be about fast arithmetic. Quite so. But it's been in the field for over a decade, and relatively fast arithmetic happens to a property that's been maintained all along. That's another kind of "backward compatibility" we have to respect. > If you want that, use tics or numpy.datetime64. Or just leave your already-working Python datetime "fast enough" code alone. > And intentional or not, "classic" arithmetic may be easy to implement > and fast, but it is hard to explain, surprising, and not very useful. I find it very useful. So does Guido. As to being hard to explain, you must be joking: classic arithmetic has the same semantics as doing integer arithmetic on integer POSIX timestamps (although extended to support microseconds). They're different representations of the same thing. I would have _preferred_ that an aware datetime followed timeline rules instead (or didn't support builtin arithmetic at all), but too late for that. 
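Spelled out in code, the whole rule fits in a couple of lines (a sketch; "classic_add" is a made-up name, and the fixed-offset zone is there only so the snippet stands on its own - any tzinfo behaves the same way under + and -):

    from datetime import datetime, timedelta, timezone

    def classic_add(dt, delta):
        # Strip the tzinfo, do naive arithmetic on the fields, reattach the tzinfo.
        return (dt.replace(tzinfo=None) + delta).replace(tzinfo=dt.tzinfo)

    EST = timezone(timedelta(hours=-5), "EST")
    dt = datetime(2015, 11, 1, 1, 30, tzinfo=EST)
    assert dt + timedelta(hours=1) == classic_add(dt, timedelta(hours=1))

The builtin "+" and "-" on aware datetimes sharing a tzinfo do exactly that, which is why the UTC offset never enters into it.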
From tim.peters at gmail.com Thu Sep 3 18:39:56 2015 From: tim.peters at gmail.com (Tim Peters) Date: Thu, 3 Sep 2015 11:39:56 -0500 Subject: [Datetime-SIG] Another round on error-checking In-Reply-To: <55E86927.1000103@oddbird.net> References: <55E75B62.2060905@oddbird.net> <55E77FCC.9040507@oddbird.net> <55E7C65F.8050106@oddbird.net> <55E82824.7020607@oddbird.net> <55E85E96.5050500@oddbird.net> <55E86927.1000103@oddbird.net> Message-ID: [Carl] > ... > So PEP 495 is already breaking the design of datetime, that tz-annotated > datetimes operate internally on a naive time model. It _has_ to break > that design, because it must introduce times that don't exist in that > model. But it's choosing to change that design piecemeal and > inconsistently instead of thoroughly and consistently. It was never consistent for all possible uses: as has been gone over many times before, an aware datetime _can_ be viewed as being an instant in "naive time", _or_ as an instant in civil time. That's solely in the programmer's head. They may even view a single datetime in both ways in different lines of code (I know I do - indeed, that's the norm for me). Python has no way to know which the programmer has in mind; there is no way to _spell_ "I mean naive time" versus "I mean civil time" for aware datetimes. I believe Guido thinks that's "a feature". I think it's just "good enough" ;-) Since the concept of "timezone conversion" doesn't exist in naive time, a programmer asking for a timezone conversion can only have "instant in civil time" in mind at the instant they ask for that conversion (or invoke any other tzinfo method). We're aiming to accommodate that use, in a design that never put a wall between the concepts from the start. It's not ideal, but that's not really news ;-) From tim.peters at gmail.com Thu Sep 3 18:58:22 2015 From: tim.peters at gmail.com (Tim Peters) Date: Thu, 3 Sep 2015 11:58:22 -0500 Subject: [Datetime-SIG] Another round on error-checking In-Reply-To: References: <55E75B62.2060905@oddbird.net> <55E77FCC.9040507@oddbird.net> <55E7C65F.8050106@oddbird.net> <55E82824.7020607@oddbird.net> <55E85E96.5050500@oddbird.net> <55E864FA.90704@oddbird.net> Message-ID: [Alex] > I wish we could have a design where fold is always ignored when you have a > single tzinfo. Me too - and you tried very hard to make that so. Valiant effort! > The reason we cannot has been explained several times in > this thread. The core reason is possibly a mistake in the original design > that permitted cross-zone arithmetic and comparison. If == was defined so > that no two instances with different tzinfo ever compare equal and <, - and > friends are only defined for datetimes sharing the tzinfo, we would not have > this problem. Recall that datetime was designed at the time when it was > thought that mixing bytes and unicode was a good idea. We all know what it > took to fix that wart. It was also designed at a time when Python was just starting to stop ;-) allowing comparisons between _any_ two objects. Things like 1 < "1" {10: 20} < [None] were true near that time. Why? "Because" in senseless cases (both comparands said "not implemented"), sometimes the string names of the types were compared instead, and "int" < "str" and "dict" < "list" are true. 
Compared to stuff like that, doing timeline arithmetic for interzone comparisons seemed to be a welcome case of principled sanity ;-) But there's no question (in my mind) that if datetime had been designed today, interzone comparisons would be disallowed (except for "==" always saying False and "!=" always True). From tim.peters at gmail.com Thu Sep 3 19:18:09 2015 From: tim.peters at gmail.com (Tim Peters) Date: Thu, 3 Sep 2015 12:18:09 -0500 Subject: [Datetime-SIG] Another round on error-checking In-Reply-To: References: <55E75B62.2060905@oddbird.net> <55E77FCC.9040507@oddbird.net> <55E7C65F.8050106@oddbird.net> <55E82824.7020607@oddbird.net> <55E85E96.5050500@oddbird.net> <55E864FA.90704@oddbird.net> <55E86A4A.6010503@oddbird.net> Message-ID: [Carl] >> There is an alternative solution available that avoids this problem, and >> all other inconsistencies. [Alex] > Really? Carl means ignoring `fold` everywhere, all the time, unless a datetime's tzinfo is of a new "strict" flavor that implements PEP 495 _and_ forces the datetime to use timeline arithmetic all the time. > ... > I have not seen any "alternative solution" implemented anywhere. In a sense, pytz kinda does this already (but not all by magic). > If you have not tried it yourself - trust me - keeping 4000+ lines of unit tests > intact while adding features to the datetime module is not an easy task. They would continue to pass, _until_ you used one of the new "strict" tzinfos. Then they'd barf all over the place. Indeed, it would fatally confuse Python's _implementation_ of datetime (which, as you know, currently exploits that arithmetic on aware datetimes is classic - which could be changed, but won't change itself by magic). So, assuming many changes to Python itself, this is "backward compatible" even to the extent of leaving conversions broken forever for code that wants to use classic arithmetic. From alexander.belopolsky at gmail.com Thu Sep 3 19:31:23 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Thu, 3 Sep 2015 13:31:23 -0400 Subject: [Datetime-SIG] Another round on error-checking In-Reply-To: References: <55E75B62.2060905@oddbird.net> <55E77FCC.9040507@oddbird.net> <55E7C65F.8050106@oddbird.net> <55E82824.7020607@oddbird.net> <55E85E96.5050500@oddbird.net> <55E864FA.90704@oddbird.net> <55E86A4A.6010503@oddbird.net> Message-ID: On Thu, Sep 3, 2015 at 1:18 PM, Tim Peters wrote: > So, assuming many changes to Python itself, this is "backward > compatible" even to the extent of leaving conversions broken forever > for code that wants to use classic arithmetic. > On top of this, I think any operations that mix strict and classic datetimes will be prohibited as well. Effectively a new class is proposed. The only thing I don't understand is why would you want to call it "datetime"? mxDateTime will be a much better name. :-) -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From chris.barker at noaa.gov Fri Sep 4 00:16:33 2015 From: chris.barker at noaa.gov (Chris Barker) Date: Thu, 3 Sep 2015 15:16:33 -0700 Subject: [Datetime-SIG] Another round on error-checking In-Reply-To: References: <55E5D3F5.40600@oddbird.net> <55E75B62.2060905@oddbird.net> <55E77FCC.9040507@oddbird.net> <55E7BD69.3060905@oddbird.net> <-7656714205635425925@unknownmsgid> Message-ID: On Thu, Sep 3, 2015 at 8:23 AM, Alexander Belopolsky < alexander.belopolsky at gmail.com> wrote: > > On Thu, Sep 3, 2015 at 11:19 AM, Chris Barker - NOAA Federal < > chris.barker at noaa.gov> wrote: > >> If you want that, use tics or numpy.datetime64. >> > > Chris, please stop promoting numpy.datetime64 here. It is definitely not > a positive example of how a date/time manipulation library should be > designed. > Sorry -- didn't mean to promote -- and yes, it's actually really horrible, particularly for anything to do with time zones. The point was that there are other ways to get performance for datetime arithmetic if that's what you need. That's all. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Fri Sep 4 00:51:30 2015 From: chris.barker at noaa.gov (Chris Barker) Date: Thu, 3 Sep 2015 15:51:30 -0700 Subject: [Datetime-SIG] Another round on error-checking In-Reply-To: References: <55E5D3F5.40600@oddbird.net> <55E75B62.2060905@oddbird.net> <55E77FCC.9040507@oddbird.net> <55E7BD69.3060905@oddbird.net> <-7656714205635425925@unknownmsgid> Message-ID: On Thu, Sep 3, 2015 at 8:56 AM, Tim Peters wrote: > > And intentional or not, "classic" arithmetic may be easy to implement > > and fast, but it is hard to explain, surprising, and not very useful. > > > > As to being hard to explain, > you must be joking: Sigh. Look at the length of this stinking thread! And at how much confusion there was at the beginning about what the heck the current datetime implementation actually did. Classic arithmetic may well be the best possible solution given the constraints, but it is not obvious, clear, lacking in surprises, or well documented (and no one reads docs until they run into a problem). I know I only got it when someone explained the implementation: "remove the tzinfo object, do the math, tack the tzinfo back on" Simple, elegant, and now I get it. And I get why things go wonky with datetimes with two different tzinfo objects. By the way, something like that should be in the docs. Anyway, clearly timeline math is an important use case for folks -- just as much as (maybe more than?) classic math. It would be nice to support it one way or another. Which can have nothing to do with this PEP -- so carry on. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed...
URL: From alexander.belopolsky at gmail.com Fri Sep 4 01:55:05 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Thu, 3 Sep 2015 19:55:05 -0400 Subject: [Datetime-SIG] Another round on error-checking In-Reply-To: References: <55E5D3F5.40600@oddbird.net> <55E75B62.2060905@oddbird.net> <55E77FCC.9040507@oddbird.net> <55E7BD69.3060905@oddbird.net> <-7656714205635425925@unknownmsgid> Message-ID: On Thu, Sep 3, 2015 at 6:51 PM, Chris Barker wrote: > > I know I only got it when someone explained the implementation: > > "remove the tzinfo object, do the math, tack the tzinfo back on" > > Simple elegant, and now I get it. And get why things go wonky with datetimes with two different tzinfo objects. > > By the way, something like that should be in the docs. Doc patches from good writers are always welcome, but in this case, I don't see what needs to be added to what the reference manual already says: """ Subtraction of a datetime from a datetime is defined only if both operands are naive, or if both are aware. If one is aware and the other is naive, TypeError is raised. If both are naive, or both are aware and have the same tzinfo attribute, the tzinfo attributes are ignored, and the result is a timedelta object t such that datetime2 + t == datetime1. No time zone adjustments are done in this case. If both are aware and have different tzinfo attributes, a-b acts as if a and b were first converted to naive UTC datetimes first. The result is (a.replace(tzinfo=None) - a.utcoffset()) -(b.replace(tzinfo=None) - b.utcoffset()) except that the implementation never overflows. """ https://docs.python.org/3/library/datetime.html#datetime.datetime The only improvement that comes to mind is to make "Supported operations:" a linkable section. -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.peters at gmail.com Fri Sep 4 02:03:02 2015 From: tim.peters at gmail.com (Tim Peters) Date: Thu, 3 Sep 2015 19:03:02 -0500 Subject: [Datetime-SIG] Another round on error-checking In-Reply-To: References: <55E5D3F5.40600@oddbird.net> <55E75B62.2060905@oddbird.net> <55E77FCC.9040507@oddbird.net> <55E7BD69.3060905@oddbird.net> <-7656714205635425925@unknownmsgid> Message-ID: [Chris Barker] >>> And intentional or not, "classic" arithmetic may be easy to implement >>> and fast, but it is hard to explain, surprising, and not very useful. [Tim] >> As to being hard to explain, you must be joking: [Chris] > sigh. Look at the length of this stinking thread! I don't recall any confusions in this thread about what classic arithmetic does. Do you? > and how much confusion there was at the beginning about what > the heck the current datetime implementation actually did. Which covered a world of issues. > Classic arithmetic may well be the best possible solution given > the constraints, It's impossible that this - or any other - PEP could succeed at changing the default arithmetic. > but it is not obvious, clear, lacking in surprises or well documented > ( and no one reads docs until they run into a problem) Maybe they should ;-) But, yup, the docs could be clearer. > I know I only got it when someone explained the implementation: > > "remove the tzinfo object, do the math, tack the tzinfo back on" > > Simple elegant, and now I get it. So, you start with "hard to explain", and end with "simple elegant, and now I get it" after a one-sentence explanation - yet wonder why I said "you must be joking"? I don't see how it could be all of those simultaneously. It's easy to explain. 
It just took you a while to find the simple explanation. Some people people get it instantly; others don't. For the latter, that's a doc problem, not a "hard to explain" problem. > And get why things go wonky with datetimes with two different tzinfo objects. > > By the way, something like that should be in the docs. I agree. Patches welcome ;-) > Anyway, clearly timeline math is an important use case for folks -- just as > many (more?) than classic math. It would be nice to support it one way or > another. > > Which can have nothing to do with this PEP -- so carry one. In the meantime, use UTC - you'll be much happier with that in the end (simpler, clearer, cleaner, faster, ...). That was the intent from the start, and will likely always be the best way to get timeline arithmetic (regardless of programming language too), From tim.peters at gmail.com Fri Sep 4 02:32:11 2015 From: tim.peters at gmail.com (Tim Peters) Date: Thu, 3 Sep 2015 19:32:11 -0500 Subject: [Datetime-SIG] Another round on error-checking Message-ID: [Alex] > Doc patches from good writers are always welcome, but in this case, I don't > see what needs to be added to what the reference manual already says: The docs lack a coherent, friendly overview. For example, I don't think they even mention "naive time". The docs you quote here are "buried" in a footnote on a table of datetime operations. They're accurate, but provide no context, motivation, or exposition of the _model_. Chris's "remove the tzinfo object, do the math, tack the tzinfo back on" explains a whole lot about classic arithmetic in one brief & comprehensible sentence. > """ > Subtraction of a datetime from a datetime is defined only if both operands > are naive, or if both are aware. If one is aware and the other is naive, > TypeError is raised. I wrote almost all this stuff to begin with, but right now even I'm already half asleep ;-) > If both are naive, or both are aware and have the same tzinfo attribute, the > tzinfo attributes are ignored, and the result is a timedelta object t such > that datetime2 + t == datetime1. Assuming the reader already digested the similarly legalistic footnote just above about what "datetime + timedelta" does. In reference-manual style, you can't jump in just anywhere, because the details are too numerous and involved to keep repeating them. > No time zone adjustments are done in this case. > > If both are aware and have different tzinfo attributes, a-b acts as if a and > b were first converted to naive UTC datetimes first. The result is > (a.replace(tzinfo=None) - a.utcoffset()) -(b.replace(tzinfo=None) - > b.utcoffset()) except that the implementation never overflows. > """ And stuff like "except that the implementation never overflows" is important in a spec (it's a constraint on allowable implementations of the spec), but of approximately no interest to 99.997% of users. > https://docs.python.org/3/library/datetime.html#datetime.datetime > > The only improvement that comes to mind is to make "Supported operations:" a > linkable section. As above, it's not that the docs lack sufficient detail - they're _buried_ in detail. Something more akin to the ever-popular "binary floating-point" tutorial appendix would probably be more useful to most users. Just the high-order bits, with pragmatic advice (like "if you need timeline arithmetic, use UTC - don't be a sucker" ;-) ). 
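For instance, the kind of example such an overview could lead with (a sketch; fixed-offset zones are used only to keep it self-contained):

    from datetime import datetime, timedelta, timezone

    est = timezone(timedelta(hours=-5), "EST")
    a = datetime(2015, 9, 3, 12, 0, tzinfo=est)            # 12:00 EST == 17:00 UTC
    b = datetime(2015, 9, 3, 12, 0, tzinfo=timezone.utc)

    # Same tzinfo: the tzinfos are ignored - plain naive arithmetic.
    assert a - datetime(2015, 9, 3, 7, 0, tzinfo=est) == timedelta(hours=5)

    # Different tzinfos: both operands are converted to UTC first, which is
    # all the footnote's (x.replace(tzinfo=None) - x.utcoffset()) recipe says.
    assert a - b == timedelta(hours=5)

One example like that up front, plus the "use UTC" advice, would cover what most users ever need to know.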
From tim.peters at gmail.com Fri Sep 4 06:11:30 2015 From: tim.peters at gmail.com (Tim Peters) Date: Thu, 3 Sep 2015 23:11:30 -0500 Subject: [Datetime-SIG] Another round on error-checking In-Reply-To: <55E82824.7020607@oddbird.net> References: <55E5D3F5.40600@oddbird.net> <55E75B62.2060905@oddbird.net> <55E77FCC.9040507@oddbird.net> <55E7C65F.8050106@oddbird.net> <55E82824.7020607@oddbird.net> Message-ID: [Tim] >> I'm out of time for tonight, but will try to make more tomorrow. Just >> one for now, because I think it cuts to the _real_ heart of this batch >> of messages: [Carl] > I don't think this cuts to the heart of anything :/ I think it avoids > the main point I've made (several times) to latch instead onto a tangent > I should have left out. Fair enough. I only had time for one, so latched on to the lamest one ;-) >> That's the heart: you simply despise classic arithmetic. > Sorry, but no. I have nothing at all against naive arithmetic. I think > both naive arithmetic and timeline arithmetic have good use cases. > > What I have trouble with is a tz-annotated datetime object that > fundamentally can't decide whether it's living in a naive or timeline > model, and thus behaves unpredictably. > > This is a problem today, but at least the behavior can be explained > fairly simply: the model is naive when operating within the same > timezone, and aware anytime you're converting between timezones or > interoperating between timezones. > > PEP 495, AFAICS, makes the problem worse, because it introduces another > bit of information that only makes sense in a timeline view. That new > bit now allows round-tripping from UTC, which is great (no problem, > because conversions are an area where tz-annotated datetimes already > tried to behave as tz-aware instants in time). But then it can't quite > decide how to rationalize that new bit of information with its naive > internal view of time, so it settles on a mish-mash of inconsistent > behavior that violates basic arithmetic identities we all learned in > elementary school and only makes any sense if you've followed this > entire thread. Eh. It's not perfect, but I don't know that anyone (present company excepted) will care much. It matters only for the later of ambiguous times in at worst (in common zones) one hour per year, and then only for someone using classic datetime-datetime subtraction or comparison starting in _some_ (not all) cases in such a fold. Perhaps this makes it wholly unusable. I doubt most would reach that conclusion, but it's possible. > If you want to cut to the heart of the matter, tell me how you would > write the documentation for how arithmetic works on a tz-annotated > datetime post-PEP-495. Already did in a different message ("if at least one operand has fold=1, acts as if the tzinfos were distinct" - reduced to a prior case). Of course that doesn't make _sense_ in the naive time model. Repeating that point isn't really needed ;-) > Does it work on a naive "move the hands of the > clock" model? (No, because I can subtract 1:30AM from 2:30AM and get "2 > hours" in some cases.); Assuming DST is ending and moves the clock back 1 hour, then: 1. Assuming a post-495 tzinfo: A. If 2:30AM is the later of ambiguous times with fold=1, 2 hours. B. If 2:30AM is the earlier of ambiguous times with fold=0, 1 hour. C. If 1:30AM is the later of ambiguous times with fold=1, 1 hour. D. If 1:30AM is the earlier of ambiguous times with fold=0, 1 hour. In all other cases, 1 hour. In all cases, 1:30AM will compare "less than" 2:30AM.. 
Note that classic arithmetic is still used if both operands have fold=0; so nothing _could_ change in cases B and D. Note that using US rules, it's 1 hour in all cases (2:30AM isn't ambiguous under US rules, so A and B can't apply). Switch to, e.g., 1:30AM - 12:30AM to get an "interesting" case for US rules. 2. Assuming a pre-495 tzinfo: What they see will depend on what their fold-blind tzinfo makes up for times in a fold. The choice recommended in the docs is to treat an ambiguous time as being the later. If so, cases 1A & 1C still apply, and all cases return the same results. If the tzinfo makes the opposite choice, then case 1A returns 1 hour and case 1C returns 2 hours. So after 495 is implemented, they will see a difference of 2 hours in some cases when the "real world" difference really is 2 hours, and regardless of whether they're using a pre- or post-495 tzinfo. That's not particularly surprising: nobody thinks _wholly_ in "naive time" ;-) Of course nobody will (or should even try to) remember all those cases. An app that really cares (if any exist - none of my code cares) will need to "do something" about it. Or we'll need to add code to ignore `fold` if a pre-495 tzinfo is in use (in which case nothing will change if they stick to pre-495 tzinfos). Yes, it would be better if nobody had to do anything. No, I'm not appalled, just mildly annoyed so far. > Does it work on a UTC timeline model? (No, clearly not.) So what is the > model, stated precisely and concisely? This part isn't driven by a model; it's driven by pragmatism ("practicality beats purity"). The sanest model is "it's classic unless you're near a fold, and if you care anything about what happens then when doing classic arithmetic you're wasting your time: e.g., force it out of a fold if you need to care". I've never written an app that needs to worry about this. Classic arithmetic in naive time is a simple (but highly useful) form of "period arithmetic", and things like "same time next week" are rarely (never, for me) concerned with hours near a transition time. They're usually about interacting with other people or businesses. > And is it actually backwards-compatible with current code that converts > from UTC to local time and then does arithmetic on those local times, or > compares them to each other? (Not around a DST transition, no.) You don't need any of that - the 2:30AM - 1:30AM example above already sufficed to show it's not always backward compatible. That's not surprising (as said before, I don't think anything _useful_ to existing code can be wholly backward compatible). From chris.barker at noaa.gov Fri Sep 4 17:45:07 2015 From: chris.barker at noaa.gov (Chris Barker) Date: Fri, 4 Sep 2015 08:45:07 -0700 Subject: [Datetime-SIG] Another round on error-checking In-Reply-To: References: <55E5D3F5.40600@oddbird.net> <55E75B62.2060905@oddbird.net> <55E77FCC.9040507@oddbird.net> <55E7BD69.3060905@oddbird.net> <-7656714205635425925@unknownmsgid> Message-ID: On Thu, Sep 3, 2015 at 4:55 PM, Alexander Belopolsky < alexander.belopolsky at gmail.com> wrote: > Doc patches from good writers are always welcome, but in this case, I > don't see what needs to be added to what the reference manual already says: > Wow -- I did not find that when I went looking early in this thread -- so maybe not missing, but not in an easy-to-find place. The trick with docs is that: a) people don't read them ;-) b) OK, they do when they can't figure out how to do something -- in which case, they read as little as they can to solve their problem.
The trick with datetime arithmetic is that people come to it with an expectation of how it works, so we want to make sure they won't go away with that expectation (if it's wrong) after a quick read of the docs. This is particularly a problem because datetime arithmetic behaves like both Period and Duration arithmetic if you stay away from DST -- so folks can come in with either expectation and their code could work fine, and not find the issue until it fails next fall. And yes, I could have written some nice doc patches with the time I've spent on this thread -- but that's less fun! And maybe we should wait for post-PEP, so we know what to write (if anything different). -Chris > https://docs.python.org/3/library/datetime.html#datetime.datetime > > The only improvement that comes to mind is to make "Supported operations:" > a linkable section. > -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Fri Sep 4 18:01:19 2015 From: chris.barker at noaa.gov (Chris Barker) Date: Fri, 4 Sep 2015 09:01:19 -0700 Subject: [Datetime-SIG] Timeline arithmetic? Message-ID: Folks, It seems to me that it's clear that timeline arithmetic will not get implemented in concert with PEP 495. So -- is the door open to a PEP that DOES implement timeline arithmetic with tz-aware datetimes in the standard lib? I would like a flag on datetime, but it seems it might be better to put that flag on a tzinfo object. But the implementation is something to argue about only if there is any chance of doing it at all. Also, particularly as PEP 495 will introduce changes to tzinfo, which will presumably lead to changes in tzinfo implementations (like pytz, etc), it seems that if other changes are afoot, now is a good time to map out how they should be done. Stuart, if you are listening: IIUC, you want "timeline" arithmetic to work with pytz tzinfo-aware datetimes, and the current implementation achieves this in a maybe "hacky", and at least inconvenient, way. So you are an obvious person to say what we might put in the stdlib that would facilitate cleaning all that up. If anything. BTW: I'll at least take it as a given that we're not breaking backward compatibility, and that arithmetic needs to stay as fast as it currently is -- at least in the cases where it currently works. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From alexander.belopolsky at gmail.com Fri Sep 4 18:23:40 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Fri, 4 Sep 2015 12:23:40 -0400 Subject: [Datetime-SIG] Another round on error-checking In-Reply-To: References: Message-ID: On Thu, Sep 3, 2015 at 8:32 PM, Tim Peters wrote: > I wrote almost all this stuff to begin with, but right now even I'm > already half asleep ;-) > I agree that the datetime documentation is showing its age and could benefit from a face-lift, but note that being an entertaining read is not a primary goal of the reference documentation, if it is a goal at all.
The datetime documentation has evolved through a series of local patches as new features have been added to the module. At each turn, the primary goal was to have a complete and accurate documentation for each method and not as much on having the overall document well-organized. Some of the complaints expressed in this thread can be better addressed in a tutorial-style document rather than the reference documentation. -------------- next part -------------- An HTML attachment was scrubbed... URL: From alexander.belopolsky at gmail.com Fri Sep 4 18:31:38 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Fri, 4 Sep 2015 12:31:38 -0400 Subject: [Datetime-SIG] Timeline arithmetic? In-Reply-To: References: Message-ID: On Fri, Sep 4, 2015 at 12:01 PM, Chris Barker wrote: > It seems to me that it's clear that timeline arithmetic will not get > implemented in concert with PEP 495. > > So -- is the door open to a PEP that DOES implement timeline arithmetic > with tz-aware datetimes in the standard lib? > The door is always open to good ideas! PEP 500 was my failed attempt to bring timeline arithmetic to aware datetime objects. I will not make another attempt before PEP 495 is finalized. Please don't interpret this as a lack of interest in the subject. I just want to focus on one issue at a time. -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.peters at gmail.com Fri Sep 4 18:39:38 2015 From: tim.peters at gmail.com (Tim Peters) Date: Fri, 4 Sep 2015 11:39:38 -0500 Subject: [Datetime-SIG] Timeline arithmetic? In-Reply-To: References: Message-ID: [Chris Barker ] > It seems to me that it's clear that timeline arithmetic will not get > implemented in concert with PEP 495. It's certainly not _part_ of 495. 495 aims to fix timezone conversions in all cases for code that's already working fine in all other respects. Tying that to timeline arithmetic would wholly miss the "for code that's already working fine" goal. Carl's scheme would tie fixing conversions _to_ using a brand new builtin implementation of timeline arithmetic, so would do nothing for existing code (would neither hurt nor help it, although all code currently doing arithmetic on aware datetimes could fail in subtle or gross ways _if_ it tried using one of Carl's new tzinfos). > So -- is the door open to a PEP that DOES implement timeline > arithmetic with tz-aware datetimes in the standard lib? I would say instead the door isn't shut ;-) Note that Guido already rejected PEP 500, which proposed one way to allow it. He didn't like its generality. A PEP concerned with timeline arithmetic alone would overcome that objection. But you have to know by now that datetime always intended that apps needing timeline arithmetic use UTC instead (or timestamps), and there's scarcely an experienced voice on the planet that would _recommend_ doing it any other way. Building in "by magic" timeline arithmetic would be fighting both datetime's design and universally recognized best practice. So I dare to say it will never be _attractive_ to Guido. At best it could get grudging acceptance. Which is possible! Just want to make clear that it's likely to be an uphill fight. Note that PEP 495 may also be rejected. "Grudging acceptance" is the best 495 can do too (always-correct conversions are an interest of mine, not particularly of Guido's - but, to be fair, at least Guido doesn't hate the idea of fixing conversions ;-) ). > ... 
> Also, particularly as PEP 495 will introduce changes to tzinfo, that will > presumable lead to changes in tzinfo implementations (like pytz, etc), it > seems that if other changes are afoot, now is a good time to map out how > they should be done. It seems 495 really doesn't do anything for pytz, so I'm not sure Stuart would bother to implement 495-conforming tzinfos. _Someone_ will, though. Eventually ;-) > Stuart, if you are listening: > > IIUC, you want "timeline" arithmetic to work with pytz tzinfo-aware > datetimes. To the extent that the current implementation functions in a > maybe "hacky", and at least inconvenient, way to achieve this. > > So you are an obvious person to say what we might put in the stdlib that > would facilitate cleaning all that up. If anything. > > BTW: I'll at least take it as a given that we're not breaking backward > compatibility, and that arithmetic needs to stay as fast as it currently is > -- at least in the cases where it currently works. A timeline arithmetic PEP would have to ensure that timeline arithmetic is never used unless a programmer explicitly asks for it. PEP 500 met that goal, and so does Carl's scheme (both via the same basic mechanism: by the user asking for a new flavor of tzinfo). From carl at oddbird.net Fri Sep 4 19:37:47 2015 From: carl at oddbird.net (Carl Meyer) Date: Fri, 4 Sep 2015 11:37:47 -0600 Subject: [Datetime-SIG] Timeline arithmetic? In-Reply-To: References: Message-ID: <55E9D6EB.2090108@oddbird.net> [Tim] > But you have to know by now that datetime always intended that apps > needing timeline arithmetic use UTC instead (or timestamps), and > there's scarcely an experienced voice on the planet that would > _recommend_ doing it any other way. Building in "by magic" timeline > arithmetic would be fighting both datetime's design and universally > recognized best practice. I find this argument a bit disingenuous - though it depends what exactly you are arguing, which isn't clear to me. All else being equal, designing a green-field datetime library, "universally recognized best practice" does not provide any argument for naive arithmetic over aware arithmetic on aware datetimes. Making the choice to implement aware arithmetic is not "fighting" a best practice, it's just providing a reasonable and fully consistent convenience for simple cases. You could perhaps argue that implementing _any_ kind of arithmetic on aware non-UTC datetimes is unnecessary and likely to give someone, at some point, results they didn't expect, and that it should instead just raise an exception telling you to convert to UTC first. The fact that best practice is to manipulate datetimes internally in UTC (meaning the use case already has a usually-better alternative) can certainly _weaken_ the argument for bothering to _change_ the behavior of arithmetic on aware datetimes, once it's been implemented otherwise for many years. That may be all you're trying to say here, in which case I fully agree. The core arguments _for_ aware arithmetic on aware datetimes are: 1) Conceptual coherence. Naive is naive, aware is aware, both models are fully internally consistent. Mixing them, as datetime does, will never be fully consistent. You may call this "purity" if you like, but the issues with PEP 495 do reveal a lack of coherence in datetime's design (that is, that it lacks a consistently-applied notion of what a tz-annotated datetime means). 
I think you've admitted this much yourself, though you suggested (in passing) that it could/should have achieved coherence in the opposite direction, by disallowing all comparisons and aware arithmetic (that is, all implicit conversions to UTC) between datetimes in different timezones. 2) Principle of least surprise for casual users. On this question, "you should use UTC for arithmetic" is equivalent to "you should use a period recurrence library for period arithmetic." Both arguments are true in principle, neither one is relevant to the question of casual users getting the results they expect. There may of course be legitimate disagreement on which behavior is less surprising for casual users. Unfortunately I don't think datetime.py (even in its many years of existence) has given us useful data on that, since it never included a timezone database and most people who need one use pytz. It's often unclear to me when you're trying to justify datetime's design choices, and when you're just pointing out that the bar is really high for changing established "good enough" behavior. If you want me to shut up and stop arguing with you (which would be an eminently reasonable desire!) clarifying that it's the latter more than the former would help tremendously, because on the latter point I agree completely. Carl -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: OpenPGP digital signature URL: From carl at oddbird.net Fri Sep 4 19:50:23 2015 From: carl at oddbird.net (Carl Meyer) Date: Fri, 4 Sep 2015 11:50:23 -0600 Subject: [Datetime-SIG] Another round on error-checking In-Reply-To: References: <55E75B62.2060905@oddbird.net> <55E77FCC.9040507@oddbird.net> <55E7C65F.8050106@oddbird.net> <55E82824.7020607@oddbird.net> Message-ID: <55E9D9DF.4000309@oddbird.net> [Tim] > Eh. It's not perfect, but I don't know that anyone (present company > excepted) will care much. It matters only for the later of ambiguous > times in at worst (in common zones) one hour per year, and then only > for someone using classic datetime-datetime subtraction or comparison > starting in _some_ (not all) cases in such a fold. > > Perhaps this makes it wholly unusable. I doubt most would reach that > conclusion, but it's possible. It's certainly not wholly unusable; I'd never claim that. We can have reasonable disagreement about whether it's the best option available. I think it's reasonable (in principle; pending working code) to tie fully-consistent timezone conversions to full consistency in general, and make it a migration choice, leaving existing working code entirely alone. You (and Alex and PEP 495) think consistent timezone-conversion round-trips are a valuable enough addition (even for existing code that's already working) to be worth pragmatically trading off some consistency and backwards-compatibility in other edge cases. I can see your point of view, and I think it's a reasonable disagreement to have. And I don't have much leg to stand on until I provide a working patch for my point of view, since Alex already has one for yours :-) [Tim] > Of course nobody will (or should even try to) remember all those > cases. An app that really cares (if any exist - none of my code > cares) will need to "do something" about it. Or we'll need to add > code to ignore `fold` if a pre-495 tzinfo is in use (in which case > nothing will change if they stick to pre-495 tzinfos). 
I'm actually quite curious how many homegrown tzinfo implementations exist in the wild, or if we're really just talking about "dateutil.tz users" vs "pytz users". When you talk about "your code", which bucket does it fall into? Clearly not the latter - are you a "homegrown tzinfo" user, or a dateutil.tz user? [Tim] > This part isn't driven a model; it's driven by pragmatism > ("practicality beats purity"). The sanest model is "it's classic > unless you're near a fold, and if you care anything about what happens > then when doing classic arithmetic you're wasting your time: e.g., > force it out of a fold if you need to care". Yep. Carl -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: OpenPGP digital signature URL: From alexander.belopolsky at gmail.com Fri Sep 4 20:11:46 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Fri, 4 Sep 2015 14:11:46 -0400 Subject: [Datetime-SIG] Timeline arithmetic? In-Reply-To: <55E9D6EB.2090108@oddbird.net> References: <55E9D6EB.2090108@oddbird.net> Message-ID: On Fri, Sep 4, 2015 at 1:37 PM, Carl Meyer wrote: > Principle of least surprise for casual users. On this question, "you > should use UTC for arithmetic" is equivalent to "you should use a period > recurrence library for period arithmetic." > Keep in mind that the standard library should not only support "casual users", but also those who will write a "period recurrence library" for those "casual users." This is where classic arithmetic is indispensable. -------------- next part -------------- An HTML attachment was scrubbed... URL: From carl at oddbird.net Fri Sep 4 20:19:47 2015 From: carl at oddbird.net (Carl Meyer) Date: Fri, 4 Sep 2015 12:19:47 -0600 Subject: [Datetime-SIG] Timeline arithmetic? In-Reply-To: References: <55E9D6EB.2090108@oddbird.net> Message-ID: <55E9E0C3.7070003@oddbird.net> On 09/04/2015 12:11 PM, Alexander Belopolsky wrote: > Keep in mind that the standard library should not only support "casual > users", but also those who will write a "period > recurrence library" for those "casual users." This is where classic > arithmetic is indispensable. Oh, I'm well aware. But naive arithmetic is always available - on naive datetimes. Btw, I have a minor objection to the term "classic arithmetic." It's a made-up term from this mailing list, and I don't think it describes a real distinct thing, it's just a euphemism for "naive arithmetic." I'm not sure why the euphemism arose; I _think_ it arose because it sounds wrong to say that aware datetimes perform naive arithmetic. I think that sounds wrong to roughly the same extent that it is wrong, so I don't see any point in using a made-up euphemism to hide it :-) Carl -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: OpenPGP digital signature URL: From guido at python.org Fri Sep 4 20:25:21 2015 From: guido at python.org (Guido van Rossum) Date: Fri, 4 Sep 2015 11:25:21 -0700 Subject: [Datetime-SIG] Timeline arithmetic? In-Reply-To: <55E9E0C3.7070003@oddbird.net> References: <55E9D6EB.2090108@oddbird.net> <55E9E0C3.7070003@oddbird.net> Message-ID: I made it up, in analogy to "classic classes" in Python 2. 
I did this not as a euphemism, but to avoid confusion, since in the existing docs "naive" is only ever applied to objects (meaning tzinfo-less) and I wanted to have a term that couldn't confuse anyone into thinking we were only talking about arithmetic of naive objects. On Fri, Sep 4, 2015 at 11:19 AM, Carl Meyer wrote: > On 09/04/2015 12:11 PM, Alexander Belopolsky wrote: > > Keep in mind that the standard library should not only support "casual > > users", but also those who will write a "period > > recurrence library" for those "casual users." This is where classic > > arithmetic is indispensable. > > Oh, I'm well aware. But naive arithmetic is always available - on naive > datetimes. > > Btw, I have a minor objection to the term "classic arithmetic." It's a > made-up term from this mailing list, and I don't think it describes a > real distinct thing, it's just a euphemism for "naive arithmetic." > > I'm not sure why the euphemism arose; I _think_ it arose because it > sounds wrong to say that aware datetimes perform naive arithmetic. I > think that sounds wrong to roughly the same extent that it is wrong, so > I don't see any point in using a made-up euphemism to hide it :-) > > Carl > > > _______________________________________________ > Datetime-SIG mailing list > Datetime-SIG at python.org > https://mail.python.org/mailman/listinfo/datetime-sig > The PSF Code of Conduct applies to this mailing list: > https://www.python.org/psf/codeofconduct/ > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Fri Sep 4 20:38:58 2015 From: chris.barker at noaa.gov (Chris Barker) Date: Fri, 4 Sep 2015 11:38:58 -0700 Subject: [Datetime-SIG] Timeline arithmetic? In-Reply-To: <55E9E0C3.7070003@oddbird.net> References: <55E9D6EB.2090108@oddbird.net> <55E9E0C3.7070003@oddbird.net> Message-ID: On Fri, Sep 4, 2015 at 11:19 AM, Carl Meyer wrote: > On 09/04/2015 12:11 PM, Alexander Belopolsky wrote: > > Keep in mind that the standard library should not only support "casual > > users", but also those who will write a "period > > recurrence library" for those "casual users." This is where classic > > arithmetic is indispensable. > I dont get that at all -- a Period recurrence lib needs to know all sorts of stuff about the timezone, and other things, like days of the week. And it needs to be able to do "timeline arithmetic", but it would presumable be able to remove and tack back on a tzinfo object all on it's own -- i.e. so the arithmetic it wants. But maybe if I tried to implement one (which I will never do) , I'd see you point. Bu tin any case, doesn't dateutils already provide this? Btw, I have a minor objection to the term "classic arithmetic." It's a > made-up term from this mailing list, and I don't think it describes a > real distinct thing, it's just a euphemism for "naive arithmetic." > well, naive arithmetic is a made-up term too. there was a lot of bandying about about terminology early on, and this seems to be what we've settled on. And unlike "Period arithmetic" or "Duration arithmetic", I haven't seen any other reference to this type of arithmetic anywhere. > I'm not sure why the euphemism arose; I _think_ it arose because it > sounds wrong to say that aware datetimes perform naive arithmetic. 
yes -- I know I, and probably other thought "naive arithmetic" meant arithmetic on naive datetimes -- there was much confusion ;-) I don't see any point in using a made-up euphemism to hide it :-) unless you can find another reference, we need to make up something. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.peters at gmail.com Fri Sep 4 21:08:02 2015 From: tim.peters at gmail.com (Tim Peters) Date: Fri, 4 Sep 2015 14:08:02 -0500 Subject: [Datetime-SIG] Timeline arithmetic? In-Reply-To: <55E9D6EB.2090108@oddbird.net> References: <55E9D6EB.2090108@oddbird.net> Message-ID: [Tim] >> But you have to know by now that datetime always intended that apps >> needing timeline arithmetic use UTC instead (or timestamps), and >> there's scarcely an experienced voice on the planet that would >> _recommend_ doing it any other way. Building in "by magic" timeline >> arithmetic would be fighting both datetime's design and universally >> recognized best practice. [Carl Meyer ] > I find this argument a bit disingenuous - though it depends what exactly > you are arguing, which isn't clear to me. In the above, I'm not arguing at all. I'm trying to tell Chris in advance what the most likely fundamental objections to any "timeline arithmetic PEP" are likely to be when it comes to the one vote that matters the most: Guido's. Here I'm wearing my "attempt to channel Guido in his absence" hat. Forewarned is forearmed. In this case, it happens to be much the same as I'd say wearing several of my other hats ;-) In other contexts, I wear my "Tim as a Python user hat", "Tim as a computer `scientist'" hat, "Tim as an explainer of past decisions" hat, "Tim as an advocate for a particular change" hat, "Tim as a Python developer" hat, "Tim thinking out loud" hat, and so on. It's absurd to expect consistency among _all_ those roles. In human communication, context is necessary to distinguish, but sometimes fails. > All else being equal, designing a green-field datetime library, > "universally recognized best practice" does not provide any argument for > naive arithmetic over aware arithmetic on aware datetimes. Making the > choice to implement aware arithmetic is not "fighting" a best practice, > it's just providing a reasonable and fully consistent convenience for > simple cases. It would create an "attractive nuisance", yes ;-) It's for much the same reason, e.g., that Guido never gave a moment's serious consideration to magically making 1 + "123" return "1123" or 124. Make a dubious thing dead easy to spell, and that _implicitly_ encourages its use. That's where "best practice" comes in. Best practice when mixing ints and strings is to explicitly force the choice you intend. Best practice for timeline arithmetic in goofy timezones is to explicitly convert to a non-goofy zone first. In which case the distinction between "timeline" and "classic" arithmetic is non-existent. For that reason ("explainer of past decisions" hat), timeline arithmetic was never really on the table - it was never needed for any "best practice" use case, and Python never _intends_ to encourage poor practices. 
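To make "convert to a non-goofy zone first" concrete, here is a minimal sketch. It borrows the zoneinfo module purely for illustration (that module postdates this thread; any tzinfo with a DST transition would do), and the 2015-11-01 US fall-back is just a convenient example:

    from datetime import datetime, timedelta, timezone
    from zoneinfo import ZoneInfo   # illustrative only; not available in 2015

    eastern = ZoneInfo("America/New_York")
    start = datetime(2015, 10, 31, 12, 0, tzinfo=eastern)   # noon EDT (UTC-4)

    # Classic arithmetic: same clock reading tomorrow, the transition ignored.
    classic = start + timedelta(days=1)
    print(classic)    # 2015-11-01 12:00:00-05:00

    # Timeline arithmetic, best-practice style: do it in UTC, convert back.
    timeline = (start.astimezone(timezone.utc) + timedelta(days=1)).astimezone(eastern)
    print(timeline)   # 2015-11-01 11:00:00-05:00, i.e. 24 real hours later

In a fixed-offset zone like UTC the two spellings give the same answer, which is exactly the sense in which the timeline/classic distinction evaporates once you convert first.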
> You could perhaps argue that implementing _any_ kind of arithmetic on > aware non-UTC datetimes is unnecessary and likely to give someone, at > some point, results they didn't expect, and that it should instead just > raise an exception telling you to convert to UTC first. Wearing my "Tim as computer 'scientist'" hat, that's what I would have preferred. As a plain old Python user, I'm happy enough with the status quo. It's been useful to me! > The fact that best practice is to manipulate datetimes internally in UTC > (meaning the use case already has a usually-better alternative) can > certainly _weaken_ the argument for bothering to _change_ the behavior > of arithmetic on aware datetimes, There is no argument that can possibly succeed for changing arithmetic on aware datetimes: "Tim as Python developer hat" there. That would be massively backward-incompatible. No chance whatsoever. Not even if there were 100% agreement from everyone that classic arithmetic is utterly useless for all purposes and that allowing it at all was a horrible mistake. That kind of change could only be made in Python 4. > once it's been implemented otherwise for many years. That may be all > you're trying to say here, in which case I fully agree. I wasn't saying any of that. I was telling Chris where a timeline arithmetic PEP would most likely face deepest resistance from Guido. > The core arguments _for_ aware arithmetic on aware datetimes are: > > 1) Conceptual coherence. Naive is naive, aware is aware, both models are > fully internally consistent. Mixing them, as datetime does, will never > be fully consistent. You may call this "purity" if you like, but the > issues with PEP 495 do reveal a lack of coherence in datetime's design I think making no distinction between "naive time" and "civil time" is the core of coherence glitches. An aware datetime is purely neither in the implementation, and different operations treat it in different ways. Wearing many hats, I don't like that. Wearing my "real life Python user" hat, though - eh, I can't really say it's caused me problems. > (that is, that it lacks a consistently-applied notion of what a > tz-annotated datetime means). I think you've admitted this much > yourself, though you suggested (in passing) that it could/should have > achieved coherence in the opposite direction, by disallowing all > comparisons and aware arithmetic (that is, all implicit conversions to > UTC) between datetimes in different timezones. When wearing several different hats, yes, _that's_ more appealing. But kinda pointless, since that's not what's actually done, and PEPs have to move on from what _is_ the case. > 2) Principle of least surprise for casual users. On this question, "you > should use UTC for arithmetic" is equivalent to "you should use a period > recurrence library for period arithmetic." Both arguments are true in > principle, neither one is relevant to the question of casual users > getting the results they expect. That last wasn't ever really a _driving_ force in Python's design. >From the earlier example, a great many users have complained a great many times that 1 + "123" _doesn't_ return 124. That _is_ what most casual users expect. Tough luck - Python's not for the terminally lazy. That said, Guido's belief was that "adding 24 hours" _should_ return "same clock time tomorrow" in all cases. There was extensive public review at the time, and I don't recall anyone disagreeing. 
> There may of course be legitimate disagreement on which behavior is > less surprising for casual users. > Unfortunately I don't think datetime.py (even in its many years of > existence) has given us useful data on that, since it never included a > timezone database and most people who need one use pytz. I agree, except that I'm not sure we can deduce much from pytz's experience either. Stuart has said that his _primary_ goal was to fix conversion in all cases, not really to "fix arithmetic". To fix the former, fixed-offset classes always get used (to supply the "missing bit" in a wonderfully convoluted way), and "timeline arithmetic" was the _natural_ result of doing so (because timeline and classic arithmetic are exactly the same thing in any fixed-offset zone). So, in pytz, assuming they always remember to call .normalize(), timeline arithmetic is forced. > It's often unclear to me when you're trying to justify datetime's design > choices, I'm not sure I ever try to justify them. Why bother? I do often try to explain them, and sometimes express an opinion _about_ them when wearing one hat or another. It doesn't really matter whether anyone (including me) agrees or disagrees with decisions made a decade ago - with my Python developer hat on, it's only what we do tomorrow that matters. The past can only be a constraint on, or inspiration for, future decisions. > and when you're just pointing out that the bar is really high > for changing established "good enough" behavior. If you want me to shut > up and stop arguing with you (which would be an eminently reasonable > desire!) clarifying that it's the latter more than the former would help > tremendously, because on the latter point I agree completely. Well, you can't see me, but I really do have a collection of 42 hats on the table next to me, and every time I write a reply, sentence by sentence I put on the hat most appropriate to what the current sentence intends ;-) From tim.peters at gmail.com Fri Sep 4 21:29:34 2015 From: tim.peters at gmail.com (Tim Peters) Date: Fri, 4 Sep 2015 14:29:34 -0500 Subject: [Datetime-SIG] Timeline arithmetic? In-Reply-To: <55E9E0C3.7070003@oddbird.net> References: <55E9D6EB.2090108@oddbird.net> <55E9E0C3.7070003@oddbird.net> Message-ID: [Carl] > Btw, I have a minor objection to the term "classic arithmetic." It's a > made-up term from this mailing list, and I don't think it describes a > real distinct thing, it's just a euphemism for "naive arithmetic." "Naive arithmetic" is also a made-up term from this mailing list (perhaps from one of the related messages choking some other mailing list before this list was created). I know, because I'm the one who made it up :-) > I'm not sure why the euphemism arose; I _think_ it arose because it > sounds wrong to say that aware datetimes perform naive arithmetic. I > think that sounds wrong to roughly the same extent that it is wrong, so > I don't see any point in using a made-up euphemism to hide it :-) Guido made up "classic arithmetic" to replace my made-up "naive arithmetic". I think it's a good change. "naive arithmetic" currently to both "naive" and "aware" datetimes, but from "naive arithmetic" alone it's too easy to _assume_ it only applies to naive datetimes. There was also agreement that it was unfortunate the docs ever used the word "naive" anywhere for any purpose. The term "timeline arithmetic" (aka "strict arithmetic") was also made up on this mailing list, but isn't needed to describe anything Python does. 
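As a footnote to the pytz discussion earlier in this exchange, the localize/normalize dance Tim describes can be sketched in a few lines (assuming pytz is installed; the date is again the 2015 US fall-back, purely for illustration):

    from datetime import datetime, timedelta
    import pytz

    eastern = pytz.timezone("US/Eastern")
    start = eastern.localize(datetime(2015, 10, 31, 12, 0))   # pinned to a fixed EDT offset

    bumped = start + timedelta(days=1)     # plain addition; still claims EDT, now a lie
    fixed = eastern.normalize(bumped)      # repairs the offset: 2015-11-01 11:00 EST

pytz pins a fixed-offset tzinfo onto the datetime, the addition itself is ordinary classic arithmetic on that fixed-offset value, and normalize() re-localizes the result, so the net effect is timeline arithmetic - the "natural result" described above.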
From carl at oddbird.net Fri Sep 4 21:51:02 2015 From: carl at oddbird.net (Carl Meyer) Date: Fri, 4 Sep 2015 13:51:02 -0600 Subject: [Datetime-SIG] Timeline arithmetic? In-Reply-To: References: <55E9D6EB.2090108@oddbird.net> Message-ID: <55E9F626.1080906@oddbird.net> [Tim] > In other contexts, I wear my "Tim as a Python user hat", "Tim as a > computer `scientist'" hat, "Tim as an explainer of past decisions" > hat, "Tim as an advocate for a particular change" hat, "Tim as a > Python developer" hat, "Tim thinking out loud" hat, and so on. It's > absurd to expect consistency among _all_ those roles. In human > communication, context is necessary to distinguish, but sometimes > fails. I don't expect consistency from humans, it's just that my hat-intuiter doesn't always work right :-) [Carl] >> All else being equal, designing a green-field datetime library, >> "universally recognized best practice" does not provide any argument for >> naive arithmetic over aware arithmetic on aware datetimes. Making the >> choice to implement aware arithmetic is not "fighting" a best practice, >> it's just providing a reasonable and fully consistent convenience for >> simple cases. [Tim] > It would create an "attractive nuisance", yes ;-) I think that either choice of arithmetic might be an attractive nuisance; what matters is consistency with the rest of the choices in the library. If datetime did naive arithmetic on tz-annotated datetimes, and also refused to ever implicitly convert them to UTC for purposes of cross-timezone comparison or arithmetic, and included a `fold` parameter not on the datetime object itself but only as an additional input argument when you explicitly convert from some other timezone to UTC, that would be a consistent view of the meaning of a tz-annotated datetime, and I wouldn't have any problem with that. It would be a view consistent with what Guido described a few days ago, that "noon Eastern on June 3 2020" is not necessarily equivalent to a UTC instant; it means nothing more than "noon Eastern on June 3 2020" until you choose to explicitly convert it to UTC, providing a full zoneinfo definition of "Eastern" (and possibly a `fold` argument too, though it's not needed for "noon Eastern June 3 2020" unless something changes) at that moment. But that isn't datetime's view, at least not consistently. The problem isn't datetime's choice of arithmetic; it's just that sometimes it wants to treat a tz-annotated datetime as one thing, and sometimes as another. (The fact that a _person_ might also want to have one sometimes and another sometimes is not a reason for an implementation to try to guess when they want one and when they want another. It could be a reason for two different types.) [Tim] > There is no argument that can possibly succeed for changing arithmetic > on aware datetimes: "Tim as Python developer hat" there. That would > be massively backward-incompatible. No chance whatsoever. Of course! That's abundantly clear, and I'd be every bit as opposed as you are to a backwards-incompatible change. Can we just assume that if I refer to "changing arithmetic" it's short-hand for "provide an option for full consistency in a way that only occurs with an opt-in choice by the user, leaving existing code behaving identically." The latter is the only thing I've ever proposed, so your choice to assume here that I meant the former feels a bit like an intentional misunderstanding so as to provide an opportunity for unnecessary hyperbole. 
Or maybe your intuiter is just fallible too ;-) > I think making no distinction between "naive time" and "civil time" is > the core of coherence glitches. An aware datetime is purely neither > in the implementation, and different operations treat it in different > ways. Wearing many hats, I don't like that. Yes! > Wearing my "real life > Python user" hat, though - eh, I can't really say it's caused me > problems. Fair enough. I am also not sure that the consistency glitches are enough of a problem to be worth fixing. I still think it's useful to clearly identify them and understand their source. "What is the root issue" and "is the root issue practically worth fixing today" are separable questions. I'm still trying to figure out the former (but I think we're finally getting there); I'm not at all sure what I think of the latter (and won't be until I try an implementation). [Carl] >> (that is, that it lacks a consistently-applied notion of what a >> tz-annotated datetime means). I think you've admitted this much >> yourself, though you suggested (in passing) that it could/should have >> achieved coherence in the opposite direction, by disallowing all >> comparisons and aware arithmetic (that is, all implicit conversions to >> UTC) between datetimes in different timezones. [Tim] > When wearing several different hats, yes, _that's_ more appealing. > But kinda pointless, since that's not what's actually done, and PEPs > have to move on from what _is_ the case. Of course. But I don't believe at all that understanding the core issues clearly, and identifying what we'd ideally have chosen initially, is pointless. It can be very useful (even a precondition) for deciding _how_ to move on from what is the case. >> 2) Principle of least surprise for casual users. On this question, "you >> should use UTC for arithmetic" is equivalent to "you should use a period >> recurrence library for period arithmetic." Both arguments are true in >> principle, neither one is relevant to the question of casual users >> getting the results they expect. > > That last wasn't ever really a _driving_ force in Python's design. > From the earlier example, a great many users have complained a great > many times that > > 1 + "123" > > _doesn't_ return 124. That _is_ what most casual users expect. Tough > luck - Python's not for the terminally lazy. This example is a false equivalence. Clearly, trying to guess what a casual user expects to result from an ambiguous operation is a bad idea. I don't think datetime arithmetic (even on non-UTC datetimes) is an ambiguous operation, given an implementation that consistently treats all timezone-aware datetimes as unambiguous instants, or an implementation that consistently treats them as naive datetimes with a timezone annotation. Given an implementation like datetime that isn't sure what they are, _either_ choice of arithmetic is an attractive nuisance. > Well, you can't see me, but I really do have a collection of 42 hats > on the table next to me, and every time I write a reply, sentence by > sentence I put on the hat most appropriate to what the current > sentence intends ;-) That's an excellent image, and I'll keep it in mind :-) Carl -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: OpenPGP digital signature URL: From alexander.belopolsky at gmail.com Fri Sep 4 22:50:10 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Fri, 4 Sep 2015 16:50:10 -0400 Subject: [Datetime-SIG] Timeline arithmetic? In-Reply-To: References: <55E9D6EB.2090108@oddbird.net> <55E9E0C3.7070003@oddbird.net> Message-ID: On Fri, Sep 4, 2015 at 2:38 PM, Chris Barker wrote: > On Fri, Sep 4, 2015 at 11:19 AM, Carl Meyer wrote: > >> On 09/04/2015 12:11 PM, Alexander Belopolsky wrote: >> > Keep in mind that the standard library should not only support "casual >> > users", but also those who will write a "period >> > recurrence library" for those "casual users." This is where classic >> > arithmetic is indispensable. >> > > I dont get that at all -- a Period recurrence lib needs to know all sorts > of stuff about the timezone, and other things, like days of the week. And > it needs to be able to do "timeline arithmetic", but it would presumable be > able to remove and tack back on a tzinfo object all on it's own -- i.e. so > the arithmetic it wants. > Let me try again. In my view, datetime class is a fancy way to encode 315537897600000000 integers: >>> 1 + (datetime.max - datetime.min) // datetime.resolution 315537897600000000 A timedelta class is a slightly less fancy way to encode some other 172799999913600000000 integers. The *natural* arithmetic on datetime and timedelta objects stems from the bijection between them and long integers. >>> t = datetime.now() >>> i = (t - datetime.min) // datetime.resolution >>> t == datetime.min + i * datetime.resolution True >>> d = timedelta(0, random()) >>> j = (d - timedelta.min) // timedelta.resolution >>> d == timedelta.min + j * timedelta.resolution True The "arithmetic" that datetime module implements is an efficient way to do addition and subtraction of datetime/timedelta objects without an explicit round trip to long integers (even though at the implementation level a round trip may take place). This arithmetic forms the basis for anything that you may want to do with datetimes: compute the number of business days in a year, compute the number of seconds in a century with or without the leap seconds, compute the angle in radians between the long hand and short hand of the Big Ben at 17:45:33.01 New York time. Timeline arithmetic is one of the simpler applications of the *natural* arithmetic provided by the datetime module: the timeline difference between t1 and t2 (assuming t1.tzinfo is t2.tzinfo) is just (t1 - t1.utcoffset()) - (t2 - t2.utcoffset()). Since "naive" difference between t1 and t2 that don't share tzinfo does not make sense, it was defined as a timeline difference. I think that was a mistake. I believe both Tim and Guido expressed a similar sentiment at various times. If datetime was designed today, t1 - t2 where t1.tzinfo is not t2.tzinfo would be an error and the user would have to choose between t1 - t2.astimezone(t1.tzinfo), t1.astimezone(t2.tzinfo) - t2 or t1.astimezone(utc) - t2.astimezone(utc) depending on the application need and with a full understanding that these three expressions can produce different results. -------------- next part -------------- An HTML attachment was scrubbed... URL: From carl at oddbird.net Fri Sep 4 23:54:41 2015 From: carl at oddbird.net (Carl Meyer) Date: Fri, 4 Sep 2015 15:54:41 -0600 Subject: [Datetime-SIG] Timeline arithmetic? 
In-Reply-To: References: <55E9D6EB.2090108@oddbird.net> <55E9E0C3.7070003@oddbird.net> Message-ID: <55EA1321.8030805@oddbird.net> [Tim] > The term "timeline arithmetic" (aka "strict arithmetic") was also made > up on this mailing list, but isn't needed to describe anything Python > does. Not even the thing that Python does when you subtract two datetimes whose tzinfo differs? Carl -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: OpenPGP digital signature URL: From alexander.belopolsky at gmail.com Sat Sep 5 00:02:17 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Fri, 4 Sep 2015 18:02:17 -0400 Subject: [Datetime-SIG] Timeline arithmetic? In-Reply-To: <55EA1321.8030805@oddbird.net> References: <55E9D6EB.2090108@oddbird.net> <55E9E0C3.7070003@oddbird.net> <55EA1321.8030805@oddbird.net> Message-ID: On Fri, Sep 4, 2015 at 5:54 PM, Carl Meyer wrote: > [Tim] > > The term "timeline arithmetic" (aka "strict arithmetic") was also made > > up on this mailing list, but isn't needed to describe anything Python > > does. > > Not even the thing that Python does when you subtract two datetimes > whose tzinfo differs? > No, because in this case there is no sensible alternative other than what is implemented and making it an error. The only case where two options make sense is the t1 - t2 case where t1.tzinfo is t2.tzinfo. In this case "timeline arithmetic" is not used, so it does not need a name. -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.peters at gmail.com Sat Sep 5 00:10:18 2015 From: tim.peters at gmail.com (Tim Peters) Date: Fri, 4 Sep 2015 17:10:18 -0500 Subject: [Datetime-SIG] Timeline arithmetic? In-Reply-To: <55EA1321.8030805@oddbird.net> References: <55E9D6EB.2090108@oddbird.net> <55E9E0C3.7070003@oddbird.net> <55EA1321.8030805@oddbird.net> Message-ID: [Tim] >> The term "timeline arithmetic" (aka "strict arithmetic") was also made >> up on this mailing list, but isn't needed to describe anything Python >> does. [Carl] > Not even the thing that Python does when you subtract two datetimes > whose tzinfo differs? It's reasonable to call that "timeline arithmetic". "Need" is much stronger ;-) The docs don't give a name to it at all - they just provide a mathematical expression defining the result. Because that, and interzone comparison (which is really just a way of squashing most of the bits out of interzone subtraction), are the only instances of what's being called "timeline arithmetic" in this mailing list, the docs are better off not naming it. The docs don't give a name to what's being called "classic arithmetic" here either, but for the opposite reason: that's so _much_ the norm, there's no need to give a name to a thing with just a few exceptions explicitly defined to do their own thing. That's all about what "Python does". For talking about what some future Python _may_ do, the terms can be indispensable. From tim.peters at gmail.com Sat Sep 5 04:02:36 2015 From: tim.peters at gmail.com (Tim Peters) Date: Fri, 4 Sep 2015 21:02:36 -0500 Subject: [Datetime-SIG] Timeline arithmetic? In-Reply-To: References: <55E9D6EB.2090108@oddbird.net> <55E9E0C3.7070003@oddbird.net> Message-ID: [Chris Barker] > I dont get that at all -- a Period recurrence lib needs to know all sorts of > stuff about the timezone, and other things, like days of the week. 
And it > needs to be able to do "timeline arithmetic", but it would presumable be > able to remove and tack back on a tzinfo object all on it's own -- i.e. so > the arithmetic it wants. Chris, I think you must mean something quite different by "period recurrence" than others mean. I like the term "calendar operation" better. Things like "2pm the 3rd Monday of every 5th month". Nobody ever means, for example, "but change it to 1pm or maybe 3pm if daylight time starts or ends". They always mean "2pm on the local clock, regardless of how often or by how much politicians change the local clock". timeline arithmetic is horrid for this kind of thing. It's only if you _do_ use timeline arithmetic for calendar operations that you need to know about timezone rules, in order to _undo_ the damage timeline arithmetic did. Ignore the timezone entirely (classic arithmetic), and it's much easier. Indeed, if you added a dateutil relativedelta to a datetime with a tzinfo that _did_ force timeline arithmetic, nothing would blow up but the result could be dead wrong, and _would_ most likely be dead wrong whenever the input and result had an odd number of DST transitions between them. You can, of course, look at its source. While it could be rewritten to force classic arithmetic, it doesn't bother now. The relativedelta type's implementation never even checks to see whether a datetime input _has_ a tzinfo. It doesn't need to care now. It builds the result out of a mix of replacing some fields in the datetime (like the year and/or month, if required), and leaves the rest to one or more uses of Python's datetime + timedelta arithmetic. For example, "3rd Monday of the month" reduces to dateutil figuring out when the first Monday of the month is, then adding a Python timedelta with days=[the number needed to get to the first Monday of the month] and weeks=2. dateutil has lots of its own logic to implement, but it currently relies on that classic arithmetic is always in effect, and is spared from needing to duplicate the logic already implemented by Python's timedelta arithmetic. The latter is a very useful building block for these kinds of applications, directly handling all (& only) the units needed in "calendar operations" for which there is no argument about "the best" meaning. > But maybe if I tried to implement one (which I will never do) , I'd see you > point. Bu tin any case, doesn't dateutils already provide this? What a coincidence! I must have read your mind ;-) From tim.peters at gmail.com Sat Sep 5 10:06:37 2015 From: tim.peters at gmail.com (Tim Peters) Date: Sat, 5 Sep 2015 03:06:37 -0500 Subject: [Datetime-SIG] Timeline arithmetic? In-Reply-To: <55E9F626.1080906@oddbird.net> References: <55E9D6EB.2090108@oddbird.net> <55E9F626.1080906@oddbird.net> Message-ID: [Tim, on hats] >> ... [Carl] > I don't expect consistency from humans, it's just that my hat-intuiter > doesn't always work right :-) Nor my hat-signaler! [Carl] >>> All else being equal, designing a green-field datetime library, >>> "universally recognized best practice" does not provide any argument for >>> naive arithmetic over aware arithmetic on aware datetimes. Making the >>> choice to implement aware arithmetic is not "fighting" a best practice, >>> it's just providing a reasonable and fully consistent convenience for >>> simple cases. 
>> It would create an "attractive nuisance", yes ;-) > I think that either choice of arithmetic might be an attractive > nuisance; what matters is consistency with the rest of the choices in > the library. I went on to explain why the specific case of default timeline arithmetic is an "attractive nuisance": making it dead easy to spell a poor practice. That remains poor practice forever after. "Easy to spell" makes it attractive. "Poor practice forever after" makes it a nuisance. Classic arithmetic is equivalent to doing integer arithmetic on integer POSIX timestamps (although with wider range the same across all platforms, and extended to microsecond precision). That's hardly novel - there's a deep and long history of doing exactly that in the Unix(tm) world. Which is Guido's world. There "shouldn't be" anything controversial about that. The direct predecessor was already best practice in its world. How that could be considered a nuisance seems a real strain to me. Where it gets muddy is extending classic arithmetic to aware datetimes too. Then compounding the conceptual confusion by adding timeline interzone subtraction and comparison. > If datetime did naive arithmetic on tz-annotated datetimes, and also > refused to ever implicitly convert them to UTC for purposes of > cross-timezone comparison or arithmetic, and included a `fold` parameter > not on the datetime object itself but only as an additional input > argument when you explicitly convert from some other timezone to UTC, > that would be a consistent view of the meaning of a tz-annotated > datetime, and I wouldn't have any problem with that. I would. Pure or not, it sounds unusable: when I convert _from_ UTC to a local zone, I have no idea whether I'll end up in a gap, a fold, or neither. And so I'll have no idea either what to pass _to_ .utcoffset() when I need to convert back to UTC. It doesn't solve the conversion problem. It's a do-it-yourself kit missing the most important piece. "But .fromutc() could return the right flag to pass back later" isn't attractive either. Then the user ends up needing to maintain their own (datetime, convert_back_flag) pairs. In which case, why not just store the flag _in_ the datetime? Only tzinfo methods would ever need to look at it. But note it's still not theoretically ideal: it would mean timezone conversion is not a wholly order-preserving function in all cases.. I'd much rather be drinking that poison, though :-( > It would be a view consistent with what Guido described a few days ago, > that "noon Eastern on June 3 2020" is not necessarily equivalent to a > UTC instant; it means nothing more than "noon Eastern on June 3 2020" If it wasn't obvious, "noon Eastern on June 3 2020" _is_ a "naive time" in Guido's head. One that will eventually become a civil time, but not before civil time gets close to 2020. > until you choose to explicitly convert it to UTC, providing a full > zoneinfo definition of "Eastern" (and possibly a `fold` argument too, > though it's not needed for "noon Eastern June 3 2020" unless something > changes) at that moment. > > But that isn't datetime's view, at least not consistently. The problem > isn't datetime's choice of arithmetic; it's just that sometimes it wants > to treat a tz-annotated datetime as one thing, and sometimes as another. How many times do we need to agree on this? ;-) Although the conceptual fog has not really been an impediment to using the module in my experience. In yours? Do you use datetime? If so, do you trip over this? 
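Tim's remark above that classic arithmetic "is equivalent to doing integer arithmetic on integer POSIX timestamps" is easy to check directly. A tiny sketch - EPOCH, to_us and from_us are illustrative helpers here, not anything the datetime module provides:

    from datetime import datetime, timedelta

    EPOCH = datetime(1970, 1, 1)                    # a naive epoch; the zone plays no part

    def to_us(dt):                                  # datetime -> integer microseconds
        return (dt - EPOCH) // timedelta(microseconds=1)

    def from_us(us):                                # integer microseconds -> datetime
        return EPOCH + timedelta(microseconds=us)

    dt = datetime(2015, 11, 1, 1, 30)
    assert dt + timedelta(hours=24) == from_us(to_us(dt) + 24 * 3600 * 1000000)

For aware datetimes the same thing happens: classic arithmetic manipulates the wall-clock fields exactly as above and simply carries the tzinfo along untouched.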
> (The fact that a _person_ might also want to have one sometimes and > another sometimes is not a reason for an implementation to try to guess > when they want one and when they want another. It could be a reason for > two different types.) Or three, or four, or ... but, in practice, one type has worked OK for me. Guido's "noon Eastern on June 3 2020" won't actually create any problems for him either. >> There is no argument that can possibly succeed for changing arithmetic >> on aware datetimes: "Tim as Python developer hat" there. That would >> be massively backward-incompatible. No chance whatsoever. > Of course! That's abundantly clear, and I'd be every bit as opposed as > you are to a backwards-incompatible change. Can we just assume that if I > refer to "changing arithmetic" it's short-hand for "provide an option > for full consistency in a way that only occurs with an opt-in choice by > the user, leaving existing code behaving identically." > > The latter is the only thing I've ever proposed, so your choice to > assume here that I meant the former feels a bit like an intentional > misunderstanding so as to provide an opportunity for unnecessary > hyperbole. Or maybe your intuiter is just fallible too ;-) You missed that I had my jester hat on ;-) That was intended to be comic relief, a dogmatic & rigid over-the-top rant from "a Python developer". It's a shame that you chopped part of it, because the fragment that remains doesn't do it full justice. Next time I'll try to sound even more insanely enraged ;-) > ... > "What is the root issue" and "is the root issue practically worth fixing > today" are separable questions. I'm still trying to figure out the > former (but I think we're finally getting there); I'm not at all sure > what I think of the latter (and won't be until I try an implementation). I think the root problem is that "civil time" is a frickin' mess. If you want purity on all counts, then you need an object that solely represents civil time, even to the extent of _requiring_ a non-None, fully functional tzinfo. Else you're leaving "but _whose_ civil time?" ambiguous, and your object no longer represents a single instant in UTC, and you can only possibly support classic arithmetic (if you support any arithmetic at all). But so much baggage is required to specify one of those, lots of apps will look elsewhere. So types will multiply. Maybe that's the best that can be done. > ... > Of course. But I don't believe at all that understanding the core issues > clearly, and identifying what we'd ideally have chosen initially, is > pointless. It can be very useful (even a precondition) for deciding > _how_ to move on from what is the case. Except PEPs yearn to get beyond this stage ;-) That is, there's always an early stage where everyone wants to debate every design decision that was ever made leading up to the PEP (sometimes even just vaguely related to something the PEP mentions). That's fine, but the PEP author(s) eventually tune out. They're not free to redesign anything, and are usually trying to solve a more-or-less specific problem. Like here, we're just trying to add one stinking bit ;-) If that inspires someone else to create a grander solution, that's great. I'm not sure it's ever happened, but it _could_ be great :-) >>> 2) Principle of least surprise for casual users. On this question, "you >>> should use UTC for arithmetic" is equivalent to "you should use a period >>> recurrence library for period arithmetic." 
Both arguments are true in >>> principle, neither one is relevant to the question of casual users >>> getting the results they expect. >> That last wasn't ever really a _driving_ force in Python's design. >> From the earlier example, a great many users have complained a great >> many times that >> >> 1 + "123" >> >> _doesn't_ return 124. That _is_ what most casual users expect. Tough >> luck - Python's not for the terminally lazy. > This example is a false equivalence. All equivalences are false, yes? I remain happy enough with the high-order bits of this one. > Clearly, trying to guess what a casual user expects to result from > an ambiguous operation is a bad idea. We're not guessing at all: we know darned well what most casual users expect in this case. They've been _screaming_ 124 from the start. The high-order bit is, as I said earlier, that catering to what casual users expect has never been a primary driver in Python's design. It' may be a consideration, but perhaps never at the top of the list. > I don't think datetime arithmetic (even on non-UTC datetimes) is an > ambiguous operation, The analogy wasn't about ambiguity; it was intended to be about "what a casual user expects" not being a strong argument in the context of Python's design history. > given an implementation that consistently treats all timezone-aware > datetimes as unambiguous instants, or an implementation that > consistently treats them as naive datetimes with a timezone annotation. > Given an implementation like datetime that isn't sure what they are, > _either_ choice of arithmetic is an attractive nuisance. But only if I also assume the user is terminally dense. It's like PEP 20 says: There should be one-- and preferably only one --obvious way to do it. Although that way may not be obvious at first unless you're Dutch. datetime does Dutch arithmetic. Once a user figures that out, it's obvious _then_. And then also the only obvious way to do classic arithmetic. Guido thought using UTC for timeline arithmetic was the one obvious way to do that; The first time a user encounters datetime, they may well _think_ "OK, I'll add a tzinfo, and now I'll get timeline arithmetic!". That's why this general rule of Python design required two entire lines in PEP 20 - their thinking is flawed because they're not Dutch. But they can learn to be. Then there is indeed one - and only one - obvious way to do each flavor of arithmetic, and each way is consistent with best practices appropriate for that way. I couldn't care less whether they "get it" at once. I would care if they _never_ got it. But Guido still wouldn't care - he will always be more profoundly Dutch than me ;-) >> Well, you can't see me, but I really do have a collection of 42 hats >> on the table next to me, and every time I write a reply, sentence by >> sentence I put on the hat most appropriate to what the current >> sentence intends ;-) > That's an excellent image, and I'll keep it in mind :-) If you picture me wearing a nightcap now, you have it nailed ;-) From tim.peters at gmail.com Sun Sep 6 03:22:04 2015 From: tim.peters at gmail.com (Tim Peters) Date: Sat, 5 Sep 2015 20:22:04 -0500 Subject: [Datetime-SIG] Another approach to 495's glitches Message-ID: Thinking out loud. Right now, we're making interzone arithmetic consistent at the expense of making intrazone operations baffling in some fold edge cases. I'd like to see if we could reverse that. Partly because datetime "shouldn't have" supported by-magic interzone arithmetic to begin with. 
But mostly because, outside of Python's test suite, I've never seen an instance of by-magic interzone comparison or subtraction (it's certain none of my code ever used it, and I've never seen it elsewhere in real code I can recall). So, compared to what Python does today: 1. Intrazone. Go back to what the first 495 stab did: ignore fold entirely (act as if it were always 0), including in hash(). 2. Interzone. A. Subtraction. Change nothing. B. Comparison. B1. __eq__. If either operand has fold=1, return False. B2. __ne__. If either operand has fold=1, return True. B3. The others. Change nothing. The hash problem goes away, because equality transitivity is restored in the cases it matters for the hash problem (under 2B1 a datetime with fold=1 never compares equal to any datetime in a different zone). Before (first 495 stab) we had, where `early` and `late` are the same except for `fold`: uearly = early.astimezone(utc) ulate = late.astimezone(utc) and then: uearly == early == late == ulate uearly < ulate hash(uearly) == hash(early) == hash(late) hash(ulate) almost certainly != to those, despite late == ulate That made a high-quality & correct hash() exceedingly painful. Now (current 495 stab) we have: uearly == early < late == ulate hash(uearly) == hash(early) hash(ulate) == hash(late) No problem there, but "early < late" within the zone is so at odds with "naive time" that various kinds of endcase backwards incompatibilty snuck in (some of which explained in great detail in messages between Carl and me). It "looks nice" because we _are_ favoring by-magic intrazone consistency at the expense of everything else. In endcases sticking within the zone, it doesn't always "look nice" at all. Under 2B1 and 2B2: uearly == early == late != ulate uearly < ulate hash(uearly) == hash(early) == hash(late) hash(ulate) almost certainly != to those, but that's fine since late != ulate, early != ulate, and uearly != ulate What we lose is: A. trichotomy in interzone comparison in rare cases. Right above, we have late != ulate, but we do _not_ have late < ulate or late > ulate either. We're forcing __eq__ to say they're not equal, despite that otherwise comparison logic would say they are equal. B. equivalence between interzone comparison and interzone subtraction in rare cases. Right above, we have late - ulate == 0 despite that late != ulate. C. equality transitivity in rare cases that don't affect the hash problem. Right above, `late` has fold=1 so 2B2 says it's not equal to `uearly` or `ulate` (it's "not equal" to _any_ datetime in UTC). However, we also have uearly == early == late, from which we could normally infer uearly == late. D. zone conversion isn't wholly order-preserving. Right above, the ambiguous times compare equal in their own zone, but map to != values in UTC. `early` and `late` are equal in their own zone but not in any other zone where neither ends up with fold=1. So, until I find something I missed ;-) , all the rare endcase surprises are pushed into interzone operations I doubt are used much (if at all). Seems better than putting them in routinely used intrazone operations. For the docs, the spiel would be along the lines that fold=1 is a new case, and for technical reasons an aware datetime with fold=1 can't compare equal to any datetime in any other zone. That's "really" all this amounts to. Apps that need interzone comparison or subtraction should convert to UTC instead. Then everything will work fine. 
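For anyone trying to keep the `early`/`late`/`uearly`/`ulate` names straight, here is how such a quartet could be built once a 495-conforming tzinfo exists. The zoneinfo module below is only a stand-in (it did not exist at the time), and the date is the 2015 US fall-back:

    from datetime import datetime, timezone
    from zoneinfo import ZoneInfo    # stand-in for any 495-conforming tzinfo

    eastern = ZoneInfo("America/New_York")
    early = datetime(2015, 11, 1, 1, 30, tzinfo=eastern)   # first 1:30, EDT (UTC-4)
    late = early.replace(fold=1)                           # second 1:30, EST (UTC-5)

    uearly = early.astimezone(timezone.utc)    # 2015-11-01 05:30:00+00:00
    ulate = late.astimezone(timezone.utc)      # 2015-11-01 06:30:00+00:00

Whether `late` should compare equal to `ulate`, and what hash(late) should be, is precisely what the proposal above is trying to settle.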
I'd also say that by-magic interzone comparison and subtraction may be deprecated someday. Something to discourage its use. Especially because, in fact, I bet it's barely (if ever) used now. Someone else's turn now ;-) PS: not quite yet. All the examples above assumed PEP 495-compliant tzinfos were in use. As detailed in a message with Carl, there are also "backward compatibility" issues to consider after 495 is implement but pre-495 tzinfos are used. Making early < late can cause endcase surprises there too. Under the idea here, as in the first 495 stab, those surprises go away again, because _nothing_ within a zone will "see fold=1", not even the tzinfo (remember, it's a pre-495 tzinfo in this case). From alexander.belopolsky at gmail.com Sun Sep 6 03:33:59 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Sat, 5 Sep 2015 21:33:59 -0400 Subject: [Datetime-SIG] Another approach to 495's glitches In-Reply-To: References: Message-ID: On Sat, Sep 5, 2015 at 9:22 PM, Tim Peters wrote: > B. Comparison. > B1. __eq__. If either operand has fold=1, return False. > Congratulations, you've just reinvented a NAN. Sorry, but I won't sacrifice the reflexivity of == for any other invariant. -------------- next part -------------- An HTML attachment was scrubbed... URL: From alexander.belopolsky at gmail.com Sun Sep 6 03:38:44 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Sat, 5 Sep 2015 21:38:44 -0400 Subject: [Datetime-SIG] Another approach to 495's glitches In-Reply-To: References: Message-ID: Sorry, I missed the " Interzone" part. Maybe you are on to something ... On Sat, Sep 5, 2015 at 9:33 PM, Alexander Belopolsky < alexander.belopolsky at gmail.com> wrote: > > On Sat, Sep 5, 2015 at 9:22 PM, Tim Peters wrote: > >> B. Comparison. >> B1. __eq__. If either operand has fold=1, return False. >> > > Congratulations, you've just reinvented a NAN. Sorry, but I won't > sacrifice the reflexivity of == for any other invariant. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.peters at gmail.com Sun Sep 6 03:38:47 2015 From: tim.peters at gmail.com (Tim Peters) Date: Sat, 5 Sep 2015 20:38:47 -0500 Subject: [Datetime-SIG] Another approach to 495's glitches In-Reply-To: References: Message-ID: [Tim] >> B. Comparison. >> B1. __eq__. If either operand has fold=1, return False. [Alex] > Congratulations, you've just reinvented a NAN. Sorry, but I won't > sacrifice the reflexivity of == for any other invariant. Neither would I. I suspect you read hastily and missed that this quote, in context, is in the "interzone" section. It only applies to comparisons between _different_ zones. Of course x == x would return True for any datetime x. That case was in the earlier "intrazone" section, where it just said "do what the first stab at 495 did" (ignore fold entirely for intrazone comparisons). From tim.peters at gmail.com Sun Sep 6 03:41:57 2015 From: tim.peters at gmail.com (Tim Peters) Date: Sat, 5 Sep 2015 20:41:57 -0500 Subject: [Datetime-SIG] Another approach to 495's glitches In-Reply-To: References: Message-ID: [Alex] > Sorry, I missed the " Interzone" part. Maybe you are on to something ... That's OK - I find it hard _not_ to be punch-drunk by now ;-) But if you see anything that in any way complicates intrazone behavior, I'll be appalled. The entire point here is to restore intrazone relative sanity at the expense of pushing garbage into the (possibly never used in real life) interzone by-magic operations. 
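Spelled out as code, the rule discussed in the last few messages (intrazone comparisons ignore `fold` entirely; interzone equality fails whenever either operand has fold=1) might look like the sketch below. This is hypothetical, not anything datetime actually does, and it assumes aware operands carrying a post-495 `fold` attribute:

    from datetime import timezone

    def wall_fields(dt):
        # The naive clock reading, with `fold` deliberately dropped.
        return (dt.year, dt.month, dt.day,
                dt.hour, dt.minute, dt.second, dt.microsecond)

    def proposed_eq(a, b):
        if a.tzinfo is b.tzinfo:
            # 1. Intrazone: act as if fold were always 0.
            return wall_fields(a) == wall_fields(b)
        if a.fold or b.fold:
            # 2.B1. Interzone: a fold=1 datetime equals nothing in another zone.
            return False
        # Otherwise compare UTC equivalents as usual.
        return a.astimezone(timezone.utc) == b.astimezone(timezone.utc)

Hashing can then force fold=0 unconditionally: under this rule the only equal pairs that can disagree about `fold` are intrazone pairs, and those hash identically anyway.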
From alexander.belopolsky at gmail.com Sun Sep 6 03:53:56 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Sat, 5 Sep 2015 21:53:56 -0400 Subject: [Datetime-SIG] Another approach to 495's glitches In-Reply-To: References: Message-ID: On Sat, Sep 5, 2015 at 9:22 PM, Tim Peters wrote: > 1. Intrazone. > > Go back to what the first 495 stab did: ignore fold entirely (act as > if it were always 0), including in hash(). > > 2. Interzone. > > A. Subtraction. Change nothing. > > B. Comparison. > B1. __eq__. If either operand has fold=1, return False. > B2. __ne__. If either operand has fold=1, return True. > B3. The others. Change nothing. > I really like this solution. The reason I was procrastinating with updating the PEP to reflect the previous solution was that I really did not like the fact that it would make fold=1 times in the gap equal to the times right before the gap. Now this problem will go away with many others. Let me sleep on this, but I really think this may work. -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.peters at gmail.com Sun Sep 6 04:02:33 2015 From: tim.peters at gmail.com (Tim Peters) Date: Sat, 5 Sep 2015 21:02:33 -0500 Subject: [Datetime-SIG] Another approach to 495's glitches In-Reply-To: References: Message-ID: [Alex] > I really like this solution. The reason I was procrastinating with updating > the PEP to reflect the previous solution was that I really did not like the > fact that it would make fold=1 times in the gap equal to the times right > before the gap. Now this problem will go away with many others. > > Let me sleep on this, but I really think this may work. Good! I need to think on it more too. It's much more like your first 495 stab, which indeed had many nice properties. Nobody here gives a shit about the interzone by-magic operations, so I'm happy to sacrifice damn near anything in those ;-) FYI, I'm most concerned about how glibly I "sold" the idea that it really does solve the hash problem. It seems obvious to me that it does, but ... hash problems have a way of popping up in unexpected ways in unconsidered contexts :-( From tim.peters at gmail.com Sun Sep 6 08:11:33 2015 From: tim.peters at gmail.com (Tim Peters) Date: Sun, 6 Sep 2015 01:11:33 -0500 Subject: [Datetime-SIG] Another approach to 495's glitches In-Reply-To: References: Message-ID: [Tim] > ... > FYI, I'm most concerned about how glibly I "sold" the idea that it > really does solve the hash problem. It seems obvious to me that it > does, but ... hash problems have a way of popping up in unexpected > ways in unconsidered contexts :-( So, after thinking about this for a few days, it's obvious after all ;-) Consider two aware datetimes that compare equal. The task is to prove they have the same hash. The subtlety is that while __eq__ and __hash__ both use a notion of "UTC equivalent", they're not always the same notion. __eq__ always uses the given values of `fold`, while __hash__ always forces fold=0. 1. Same zone. .utcoffset() isn't used for equality in this case; it's only used by hash. Equality implies they differ at most in `fold`. Since hash() forces fold=0, hash's calls to .utcoffset() see exactly the same stuff for both, so hash's force-fold-to-0 UTC equivalents are the same. Same UTC equivalents, same hashes. 2. Different zones. Equality implies fold=0 for both, and that both map to the same UTC time. 
Since we know fold=0 for both, we know __eq__ and __hash__ use the same notion of UTC equivalent for both, so __hash__ sees the same UTC equivalents __eq__ already saw and judged equal. Same UTC equivalents, same hashes. Where it failed before: `later` is the later of an ambiguous time, so has fold=1. `ulater` is its UTC equivalent (with fold=0). They compared equal before. But hash(later) computed the hash based on the force-fold-to-0 UTC equivalent, which is not the same as the fold=1 UTC equivalent `ulater`. hash(ulater) and hash(later) had no more in common than hash(math.pi) and hash("hash"). And they still won't. But in the new world later != ulater (at least one has fold=1 in a cross-zone comparison), so it no longer matters that the hashes differ. From tim.peters at gmail.com Sun Sep 6 20:58:56 2015 From: tim.peters at gmail.com (Tim Peters) Date: Sun, 6 Sep 2015 13:58:56 -0500 Subject: [Datetime-SIG] Another approach to 495's glitches In-Reply-To: References: Message-ID: [Tim] > ... > Consider two aware datetimes that compare equal. The task is to prove > they have the same hash. The subtlety is that while __eq__ and > __hash__ both use a notion of "UTC equivalent", they're not always the > same notion. __eq__ always uses the given values of `fold`, while > __hash__ always forces fold=0. Which obviously ;-) suggests yet another, possibly cleaner, approach: have interzone subtraction, and all interzone comparisons, _also_ force fold to 0 (instead of having only interzone __eq__ and __ne__ special-case fold=1). There are many details about consequences for me to work out, but it sounds promising on the face of it. "The story" gets a lot more uniform then: fold=1 is simply ignored (acts as if 0) by virtually everything, except for 495 tzinfo operations, where `fold` is essential. Then we'd again have, e.g.,

    uearly == early == late != ulate
    uearly < ulate

but "late != ulate" in this variant not because __ne__ is special casing fold=1, but because all cross-zone comparisons use the force-fold-to-0 UTC equivalents for both `late` and `ulate`, and they're simply not equal (assuming 495-conforming tzinfo; for a pre-495 tzinfo, they would be equal, but in that case uearly==ulate too). We'd also have late - ulate != timedelta(0) for the same reason, and consistency between interzone comparison and subtraction would be restored. Trichotomy for cross-zone comparison would also be restored (for x and y in different zones, exactly one of x < y, x == y, x > y would be true). That zone conversion isn't always order-preserving would remain so, but it's impossible for any scheme to always preserve order so long as early == late in the source zone, and it's highly desirable that they do compare equal. The only remaining obvious glitch is that interzone by-magic subtraction and comparison would act as if fold=0 all the time, so may return wrong results in cases where fold=1, although wrong results consistent between interzone subtraction and comparison. I don't care much, for reasons explained before. Convert to UTC first if you need to care about cross-zone comparison or subtraction in cases of ambiguous times - that will always get the right answers (assuming 495-conforming tzinfos are in use). From alexander.belopolsky at gmail.com Sun Sep 6 22:53:41 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Sun, 6 Sep 2015 16:53:41 -0400 Subject: [Datetime-SIG] Another approach to 495's glitches In-Reply-To: References: Message-ID: On Sun, Sep 6, 2015 at 2:58 PM, Tim Peters wrote: > [Tim] > > ...
> > Consider two aware datetimes that compare equal. The task is to prove > > they have the same hash. The subtlety is that while __eq__ and > > __hash__ both use a notion of "UTC equivalent", they're not always the > > same notion. __eq__ always uses the given values of `fold`, while > > __hash__ always forces fold=0. > > Which obviously ;-) suggests yet another, possibly cleaner, approach: > have interzone subtraction, and all interzone comparisons, _also_ > force fold to 0 (instead of having only interzone __eq__ and __ne__ > special-case fold=1) . > I would not go that far. While interzone subtraction between arbitrary zones is a rarely needed overkill, I find it useful to have subtraction work between a local zone and UTC. For me, subtraction in this case is similar to conversion. Fix the EPOCH and d = t - EPOCH together with t = EPOCH + d gives you a bijection between times and timedeltas. From that, you are one step away from various numeric time scales. For example (t - datetime(1, 1, 1, tzinfo=timezone.utc)) // timedelta.resolution will give you a bijection between datetimes and some range of integers. Thus if we are going to "sell" fold as a way to implement conversions that "always work", I think we should include these types of conversions as well. -------------- next part -------------- An HTML attachment was scrubbed... URL: From carl at oddbird.net Mon Sep 7 02:19:58 2015 From: carl at oddbird.net (Carl Meyer) Date: Sun, 6 Sep 2015 18:19:58 -0600 Subject: [Datetime-SIG] Timeline arithmetic? In-Reply-To: References: <55E9D6EB.2090108@oddbird.net> <55E9F626.1080906@oddbird.net> Message-ID: <55ECD82E.9070305@oddbird.net> Hi Tim, (tl;dr I think your latest proposal re PEP 495 is great.) I think we're still mis-communicating somewhat. Before replying point by point, let me just try to explain what I'm saying as clearly as I can. Please tell me precisely where we part ways in this analysis. Consider two models for the meaning of a "timezone-aware datetime object". Let's just call them Model A and Model B: In Model A, an aware datetime (in any timezone) is nothing more than an alternate (somewhat complexified for human use) spelling of a Unix timestamp, much like a timedelta is just a complexified spelling of some number of microseconds. In this model, there's a bijection between aware datetimes in any two timezones. (This model requires the PEP 495 flag, or some equivalent. Technically, this model _could_ be implemented by simply storing a Unix timestamp and a timezone name, and doing all date/time calculations at display time.) In this model, "Nov 2 2014 1:30am US/Eastern fold=1" and "Nov 2 2014 6:30am UTC" are just alternate spellings of the _same_ underlying timestamp. Characteristics of Model A: * There's no issue with comparisons or arithmetic involving datetimes in different timezones; they're all just Unix timestamps under the hood anyway, so ordering and arithmetic is always obvious and consistent: it's always equivalent to simple integer arithmetic with Unix timestamps. * Conversions between timezones are always unambiguous and lossless: they're just alternate spellings of the same integer, after all. * In this model, timeline arithmetic everywhere is the only option. Every non-UTC aware datetime is just an alternate spelling of an equivalent UTC datetime / Unix timestamp, so in a certain sense you're always doing "arithmetic in UTC" (or "arithmetic with Unix timestamps"), but you can spell it in whichever timezone you like. 
In this model, there's very little reason to consider arithmetic in non-UTC timezones problematic; it's always consistent and predictable and gives exactly the same results as converting to UTC first. For sizable systems it may still be good practice to do everything internally in UTC and convert at the edges, but the reasons are not strong; mostly just avoiding interoperability issues with databases or other systems that don't implement the same model, or have poor timezone handling. * In this model, "classic" arithmetic doesn't even rise to the level of "attractive nuisance," it's simply "wrong arithmetic," because you get different results if working with the "same time" represented in different timezones, which violates the core axiom of the model; it's no longer simply arithmetic with Unix timestamps. I don't believe there's anything wrong with Model A. It's not the right model for _all_ tasks, but it's simple, easy to understand, fully consistent, and useful for many tasks. On the whole, it's still the model I find most intuitive and would prefer for most of the timezone code I personally write (and it's the one I actually use today in practice, because it's the model of pytz). Now Model B. In Model B, an "aware datetime" is a "clock face" or "naive" datetime with an annotation of which timezone it's in. A non-UTC aware datetime in model B doesn't inherently know what POSIX timestamp it corresponds to; that depends on concepts that are outside of its naive model of local time, in which time never jumps or goes backwards. Model B is what Guido was describing in his email about an aware datetime in 2020: he wants an aware datetime to mean "the calendar says June 3, the clock face says noon, and I'm located in US/Eastern" and nothing more. Characteristics of Model B: * Naive (or "classic", or "move the clock hands") arithmetic is the only kind that makes sense under Model B. * As Guido described, if you store an aware datetime and then your tz database is updated before you load it again, Model A and Model B aware datetimes preserve different invariants. A Model A aware datetime will preserve the timestamp it represents, even if that means it now represents a different local time than before the zoneinfo change. A Model B aware datetime will preserve the local clock time, even though it now corresponds to a different timestamp. * You can't compare or do arithmetic between datetimes in different timezones under Model B; you need to convert them to the same time zone first (which may require resolving an ambiguity). * Maintaining a `fold` attribute on datetimes at all is a departure from Model B, because it represents a bit of information that's simply nonsense/doesn't exist within Model B's naive-clock-time model. * Under Model B, conversions between timezones are lossy during a fold in the target timezone, because two different UTC times map to the same Model B local time. These models aren't chosen arbitrarily; they're the two models I'm aware of for what a "timezone-aware datetime" could possibly mean that preserve consistent arithmetic and total ordering in their allowed domains (in Model A, all aware datetimes in any timezone can interoperate as a single domain; in Model B, each timezone is a separate domain). 
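To make the two readings concrete, here is a small sketch in today's stdlib (it assumes Python 3.9+, zoneinfo, and PEP 495's fold, all of which post-date this thread; pytz users would spell the conversions differently):

    from datetime import datetime, timedelta, timezone
    from zoneinfo import ZoneInfo  # assumption: Python 3.9+, newer than this thread

    eastern = ZoneInfo("America/New_York")

    # Model A: an aware datetime is just a spelling of a POSIX timestamp, so
    # "Nov 2 2014 1:30am US/Eastern fold=1" and "Nov 2 2014 6:30am UTC" name
    # the same instant.
    local = datetime(2014, 11, 2, 1, 30, fold=1, tzinfo=eastern)
    utc = datetime(2014, 11, 2, 6, 30, tzinfo=timezone.utc)
    assert local.timestamp() == utc.timestamp()

    # Model B: an aware datetime is a clock face plus a zone name, and "+"
    # moves the clock hands (classic arithmetic).
    start = datetime(2014, 11, 1, 12, 0, tzinfo=eastern)          # noon EDT, the day before fall-back
    later = start + timedelta(days=1)
    assert later == datetime(2014, 11, 2, 12, 0, tzinfo=eastern)  # clock advanced exactly 24 hours
    assert later.timestamp() - start.timestamp() == 25 * 3600     # but 25 real hours elapsed

Model A cares about the last line; Model B treats it as out of scope.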
A great deal of this thread (including most of my earlier messages and, I think, even parts your last message here that I'm replying to) has consisted of proponents of one of these two models arguing that behavior from the other model is wrong or inferior or buggy (or an "attractive nuisance"). I now think these assertions are all wrong :-) Both models are reasonable and useful, and in fact both are capable enough to handle all operations, it's just a question of which operations they make simple. Model B people say "just do all your arithmetic and comparisons in UTC"; Model A people say "if you want Model B, just use naive datetimes and track the implied timezone separately." I came into this discussion assuming that Model A was the only sensible way for a datetime library to behave. Now (thanks mostly to Guido's note about dates in 2020), I've been convinced that Model B is also reasonable, and preferable for some uses. I've also been convinced that Model B is the dominant influence and intended model in datetime's design, and that's very unlikely to change (even in a backwards-compatible way), so I'm no longer advocating that. Datetime.py, unfortunately, has always mixed behavior from the two models (interzone operations are all implemented from a Model A viewpoint; intrazone are Model B). Part of the problem with this is that it results in a system that looks like it ought to have total ordering and consistent arithmetic, but doesn't. The bigger problem is that it has allowed people to come to the library from either a Model A or Model B viewpoint and find enough behavior confirming their mental model to assume they were right, and assume any behavior that doesn't match their model is a bug. That's what happened to Stuart, and that's why pytz implements Model A, and has thus encouraged large swathes of Python developers to even more confidently presume that Model A is the intended model. I think your latest proposal for PEP 495 (always ignore `fold` in all intra-zone operations, and push the inconsistency into inter-zone comparisons - which were already inconsistent - instead) is by far the best option for bringing loss-less timezone-conversion round-trips to Model B. Instead of saying (as earlier revisions of PEP 495 did) "we claim we're really Model B, but we're going to introduce even more Model A behaviors, breaking the consistency of Model B in some cases - good luck keeping it straight!" it says "we're sticking with Model B, in which `fold` is meaningless when you're working within a timezone, but in the name of practical usability we'll still track `fold` internally after a conversion, so you don't have to do it yourself in case you want to convert to another timezone later." If the above analysis makes any sense at all to anyone, and you think something along these lines (but shorter and more carefully edited) would make a useful addition to the datetime docs (either as a tutorial-style "intro to how datetime works and how to think about aware datetimes" or as an FAQ), I'd be very happy to write that patch. Now on to your message: [Tim] > Classic arithmetic is equivalent to doing integer arithmetic on > integer POSIX timestamps (although with wider range the same across > all platforms, and extended to microsecond precision). That's hardly > novel - there's a deep and long history of doing exactly that in the > Unix(tm) world. Which is Guido's world. There "shouldn't be" > anything controversial about that. The direct predecessor was already > best practice in its world. 
How that could be considered a nuisance > seems a real strain to me. Unless I'm misunderstanding what you are saying (always likely!), I think this is just wrong. POSIX timestamps are a representation of an instant in time (a number of seconds since the epoch _in UTC_). If you are doing any kind of "integer arithmetic on POSIX timestamps", you are _always_ doing timeline arithmetic. Classic arithmetic may be many things, but the one thing it definitively is _not_ is "arithmetic on POSIX timestamps." This is easy to demonstrate: take one POSIX timestamp, convert it to some timezone with DST, add 86400 seconds to it (using "classic arithmetic") across a DST gap or fold, and then convert back to a POSIX timestamp, and note that you don't have a timestamp 86400 seconds away from the first timestamp. If you were doing simple "arithmetic on POSIX timestamps", such a result would not be possible. In Model A (the one that Lennart and myself and Stuart and Chris have all been advocating during all these threads), all datetimes (in any timezone) are unambiguous representations of a POSIX timestamp, and all arithmetic is "arithmetic on POSIX timestamps." That right there is the definition of timeline arithmetic. So yes, I agree with you that it's hard to consider "arithmetic on POSIX timestamps" an attractive nuisance :-) > Where it gets muddy is extending classic arithmetic to aware datetimes > too. If by "muddy" you mean "not in any way 'arithmetic on POSIX timestamps' anymore." :-) I don't even know what you mean by "extending to aware datetimes" here; the concept of "arithmetic on POSIX timestamps" has no meaning at all with naive datetimes (unless you're implicitly assuming some timezone), because naive datetimes don't correspond to any particular instant, whereas a POSIX timestamp does. > Then compounding the conceptual confusion by adding timeline > interzone subtraction and comparison. Yes, that addition (of Model A behavior into a Model B world) has caused plenty of confusion! It's the root cause for most of the content on this mailing list so far, I think :-) [Carl] >> If datetime did naive arithmetic on tz-annotated datetimes, and also >> refused to ever implicitly convert them to UTC for purposes of >> cross-timezone comparison or arithmetic, and included a `fold` parameter >> not on the datetime object itself but only as an additional input >> argument when you explicitly convert from some other timezone to UTC, >> that would be a consistent view of the meaning of a tz-annotated >> datetime, and I wouldn't have any problem with that. [Tim] > I would. Pure or not, it sounds unusable: when I convert _from_ UTC > to a local zone, I have no idea whether I'll end up in a gap, a fold, > or neither. And so I'll have no idea either what to pass _to_ > .utcoffset() when I need to convert back to UTC. It doesn't solve the > conversion problem. It's a do-it-yourself kit missing the most > important piece. "But .fromutc() could return the right flag to pass > back later" isn't attractive either. Then the user ends up needing to > maintain their own (datetime, convert_back_flag) pairs. In which > case, why not just store the flag _in_ the datetime? Only tzinfo > methods would ever need to look at it. Yes, I agree with you here. I think your latest proposal for PEP 495 does a great job of providing this additional convenience for the user without killing the intra-timezone Model B consistency. 
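What that convenience buys can be sketched with a 495-style zone (the sketch assumes Python 3.9+ and zoneinfo, which post-date this thread; the point is only that the flag rides along inside the datetime, so the round trip needs no side channel):

    from datetime import datetime, timezone
    from zoneinfo import ZoneInfo  # assumption: Python 3.9+, newer than this thread

    eastern = ZoneInfo("America/New_York")

    u1 = datetime(2014, 11, 2, 5, 30, tzinfo=timezone.utc)  # maps to the first 1:30 (EDT)
    u2 = datetime(2014, 11, 2, 6, 30, tzinfo=timezone.utc)  # maps to the second 1:30 (EST)

    l1 = u1.astimezone(eastern)  # 01:30 with fold=0
    l2 = u2.astimezone(eastern)  # 01:30 with fold=1 -- the flag lives in the datetime itself
    assert (l1.fold, l2.fold) == (0, 1)

    # Converting back is lossless; no separate (datetime, flag) pair to carry around.
    assert l1.astimezone(timezone.utc) == u1
    assert l2.astimezone(timezone.utc) == u2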
I just wish that the inconsistent inter-timezone operations weren't supported at all, but I know it's about twelve years too late to do anything about that other than document some variant of "you shouldn't compare or do arithmetic with datetimes in different timezones; if you do you'll get inconsistent results in some cases around DST transitions. Convert to the same timezone first instead." [Tim] >> But that isn't datetime's view, at least not consistently. The problem >> isn't datetime's choice of arithmetic; it's just that sometimes it wants >> to treat a tz-annotated datetime as one thing, and sometimes as another. > > How many times do we need to agree on this? ;-) Everybody all together now, one more time! :-) Until your latest proposal on PEP 495, I wasn't sure we really did agree on this, because it seemed you were still willing to break the consistency of Model B arithmetic in order to gain some of the benefits of Model A (that is, introduce _even more_ of this context-dependent ambiguity as to what a tz-annotated datetime means.) But your latest proposal fixes that in a way I'm quite happy with, given where we are. > Although the > conceptual fog has not really been an impediment to using the module > in my experience. > > In yours? Do you use datetime? If so, do you trip over this? No, because I use pytz, in which there is no conceptual fog, just strict Model A (and an unfortunate API). I didn't get to experience the joy of this conceptual fog until I started arguing with you on this mailing list! And now I finally feel like I'm seeing through that fog a bit. I hope I'm right :-) Carl -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: OpenPGP digital signature URL: From carl at oddbird.net Mon Sep 7 02:31:12 2015 From: carl at oddbird.net (Carl Meyer) Date: Sun, 6 Sep 2015 18:31:12 -0600 Subject: [Datetime-SIG] Timeline arithmetic? In-Reply-To: References: <55E9D6EB.2090108@oddbird.net> <55E9E0C3.7070003@oddbird.net> Message-ID: <55ECDAD0.3000505@oddbird.net> On 09/04/2015 12:25 PM, Guido van Rossum wrote: > I made it up, in analogy to "classic classes" in Python 2. I did this > not as a euphemism, but to avoid confusion, since in the existing docs > "naive" is only ever applied to objects (meaning tzinfo-less) and I > wanted to have a term that couldn't confuse anyone into thinking we were > only talking about arithmetic of naive objects. Thanks for the clarification; that's reasonable. I shouldn't have presumed a reason for the term. And, as others have pointed out, "naive arithmetic" is just as invented-here as "classic arithmetic" -- perhaps more meaningful to someone not already familiar with it, but also possibly leading them to the wrong meaning. Carl -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: OpenPGP digital signature URL: From carl at oddbird.net Mon Sep 7 02:36:51 2015 From: carl at oddbird.net (Carl Meyer) Date: Sun, 6 Sep 2015 18:36:51 -0600 Subject: [Datetime-SIG] Another approach to 495's glitches In-Reply-To: References: Message-ID: <55ECDC23.1080201@oddbird.net> On 09/06/2015 02:53 PM, Alexander Belopolsky wrote: > On Sun, Sep 6, 2015 at 2:58 PM, Tim Peters > wrote: > > [Tim] > > ... > > Consider two aware datetimes that compare equal. The task is to prove > > they have the same hash. 
The subtlety is that while __eq__ and > > __hash__ both use a notion of "UTC equivalent", they're not always the > > same notion. __eq__ always uses the given values of `fold`, while > > __hash__ always forces fold=0. > > Which obviously ;-) suggests yet another, possibly cleaner, approach: > have interzone subtraction, and all interzone comparisons, _also_ > force fold to 0 (instead of having only interzone __eq__ and __ne__ > special-case fold=1) . > > I would not go that far. While interzone subtraction between arbitrary > zones is a rarely needed overkill, I find it useful to have subtraction > work between a local zone and UTC. For me, subtraction in this case is > similar to conversion. Fix the EPOCH and d = t - EPOCH together with t > = EPOCH + d gives you a bijection between times and timedeltas. From > that, you are one step away from various numeric time scales. For > example (t - datetime(1, 1, 1, tzinfo=timezone.utc)) // > timedelta.resolution will give you a bijection between datetimes and > some range of integers. Thus if we are going to "sell" fold as a way to > implement conversions that "always work", I think we should include > these types of conversions as well. FWIW, Tim's latest proposal (either variant) resolves all my concerns with PEP 495 (as I explained at greater length in the "Timeline arithmetic" thread). Fundamentally I don't care between these two variants (because the difference between them only impacts interzone operations, and my general advice on those going forward would be "don't use them"). Carl -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: OpenPGP digital signature URL: From tim.peters at gmail.com Mon Sep 7 03:05:41 2015 From: tim.peters at gmail.com (Tim Peters) Date: Sun, 6 Sep 2015 20:05:41 -0500 Subject: [Datetime-SIG] Another approach to 495's glitches In-Reply-To: References: Message-ID: [Tim] >> ... >> Which obviously ;-) suggests yet another, possibly cleaner, approach: >> have interzone subtraction, and all interzone comparisons, _also_ >> force fold to 0 (instead of having only interzone __eq__ and __ne__ >> special-case fold=1) . [Alex] > I would not go that far. While interzone subtraction between arbitrary > zones is a rarely needed overkill, I find it useful to have subtraction work > between a local zone and UTC. Have you done so already in real life, or did it just occur to you that you _could_ find it useful? > For me, subtraction in this case is similar to conversion. Fix the EPOCH > and d = t - EPOCH together with t = EPOCH + d gives you a bijection between > times and timedeltas. Well, not without more words to clarify which operations are intended. For example, it's impossible to tell what "-" means there unless you spell out whether you're using classic or timeline arithmetic. In order to make your final claim true, I have to (I believe) reverse-engineer that the claim is restricted to naive EPOCH and `d`, or aware datetimes in a common fixed-offset zone. Otherwise your "-" uses timeline arithmetic and your "+" classic arithmetic, and they're different kinds of arithmetic in a non-fixed-offset zone. > From that, you are one step away from various numeric time scales. > For example (t - datetime(1, 1, 1, tzinfo=timezone.utc)) // timedelta.resolution will > give you a bijection between datetimes and some range of integers. 
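A quick check of the round trip being quoted, as a sketch with everything held in UTC (a fixed-offset zone, so none of the ambiguity discussed next arises):

    from datetime import datetime, timedelta, timezone

    EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
    t = datetime(2015, 9, 6, 16, 53, 41, tzinfo=timezone.utc)

    d = t - EPOCH                # a timedelta
    assert EPOCH + d == t        # t = EPOCH + d, the bijection Alex describes

    # The integer time scale from the quoted example: microseconds since 0001-01-01 UTC.
    origin = datetime(1, 1, 1, tzinfo=timezone.utc)
    n = (t - origin) // timedelta.resolution
    assert origin + n * timedelta.resolution == t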
In this case the ambiguity is whether, by `datetimes`, you mean `t` represents points in t.tzinfo's civil time, or points in a tzinfo-annotated naive time. I have to believe you mean the former, because converting to UTC irretrievably loses tzinfo-annotated naive times that correspond to "gap times" in that tzinfo's civil time (i.e., this code doesn't give a bijection of tzinfo-annotated naive datetimes if there are gaps in the tzinfo's civil time: more than one naive time can map to the same UTC time then, and so also to the same integer then). Replacing `t` with t.astimezone(utc) would make that obvious instead of a puzzle, making it utterly clear that you only have civil time in mind. All instances of by-magic timeline arithmetic are an "attractive nuisance" in datetime's current design :-( > Thus if we are going to "sell" fold as a way to implement conversions that > "always work", I think we should include these types of conversions as well. Unfortunately, I have to suspect _someone_ out there already has this kind of code, wrong-headed ;-) as it is. So that kills that. Unfortunately, that leaves the "special-case fold=1 in __eq__ and __ne__" idea violating enough formal properties in interzone arithmetic, albeit in rare cases, that I expect the best we can hope for this PEP is "grudging acceptance". I'll have to go back and read the "how about an insanely delicate hash() implementation instead?" messages again ;-) From tim.peters at gmail.com Mon Sep 7 03:19:08 2015 From: tim.peters at gmail.com (Tim Peters) Date: Sun, 6 Sep 2015 20:19:08 -0500 Subject: [Datetime-SIG] Another approach to 495's glitches In-Reply-To: References: Message-ID: [Alex] >> For me, subtraction in this case is similar to conversion. Fix the EPOCH >> and d = t - EPOCH together with t = EPOCH + d gives you a bijection between >> times and timedeltas. [Tim] > Well, not without more words to clarify which operations are intended. > For example, it's impossible to tell what "-" means there unless you > spell out whether you're using classic or timeline arithmetic. In > order to make your final claim true, I have to (I believe) > reverse-engineer that the claim is restricted to naive EPOCH and `d`, > or aware datetimes in a common fixed-offset zone. Otherwise your "-" > uses timeline arithmetic and your "+" classic arithmetic, and they're > different kinds of arithmetic in a non-fixed-offset zone. I'm missing a case there: common non-fixed-offset zone. That one doesn't fail because different kinds of arithmetic are used (classic is always used then), but because classic arithmetic ignores `fold` entirely - there's no bijection in that case if you're viewing `t` as civil time. So, your EPOCH and ` t` share a common (possibly None) tzinfo, and you're talking about a bijection in naive (not civil) time. From tim.peters at gmail.com Mon Sep 7 11:12:04 2015 From: tim.peters at gmail.com (Tim Peters) Date: Mon, 7 Sep 2015 04:12:04 -0500 Subject: [Datetime-SIG] Timeline arithmetic? In-Reply-To: <55ECD82E.9070305@oddbird.net> References: <55E9D6EB.2090108@oddbird.net> <55E9F626.1080906@oddbird.net> <55ECD82E.9070305@oddbird.net> Message-ID: [Carl Meyer ] > (tl;dr I think your latest proposal re PEP 495 is great.) I don't. The last two were less annoying, though ;-) > I think we're still mis-communicating somewhat. Before replying point by > point, Or it could be we have different goals here, and each keep trying to nudge the other to change the topic ;-) > let me just try to explain what I'm saying as clearly as I can. 
> Please tell me precisely where we part ways in this analysis. > > Consider two models for the meaning of a "timezone-aware datetime > object". Let's just call them Model A and Model B: In which context? Abstractly, or the context of Python's current datetime module, or in the context of some hypothetical future Python datetime module, or some datetime module that _might_ have existed instead, or ...? My only real interest here is moving the module that actually exists to one that can get conversions right in all cases, preferably in a wholly backward-compatible way. Models don't really matter to that, but specific behaviors do. > In Model A, an aware datetime (in any timezone) is nothing more than an > alternate (somewhat complexified for human use) spelling of a Unix > timestamp, much like a timedelta is just a complexified spelling of some > number of microseconds. A Python datetime is also just a complexified spelling of some number of microseconds (since the start of 1 January 1 of the proleptic Gregorian calendar). > In this model, there's a bijection between aware datetimes in any > two timezones. (This model requires the PEP 495 flag, > or some equivalent. Technically, this model _could_ be implemented by > simply storing a Unix timestamp and a timezone name, and doing all > date/time calculations at display time.) In this model, "Nov 2 2014 > 1:30am US/Eastern fold=1" and "Nov 2 2014 6:30am UTC" are just alternate > spellings of the _same_ underlying timestamp. > > Characteristics of Model A: > > * There's no issue with comparisons or arithmetic involving datetimes in > different timezones; they're all just Unix timestamps under the hood > anyway, so ordering and arithmetic is always obvious and consistent: > it's always equivalent to simple integer arithmetic with Unix timestamps. > > * Conversions between timezones are always unambiguous and lossless: > they're just alternate spellings of the same integer, after all. > > * In this model, timeline arithmetic everywhere is the only option. Why? The kind of arithmetic needed for a task depends on the task. There are no specific use cases given here, so who can say? Some tasks need to account for real-world durations; others need to overlook irregularities in real-world durations (across zone transitions) in order to maintain regularities between the before-and-after calendar notations. Timeline arithmetic is only directly useful for dealing with real-world durations as they affect civil calendar notations. Some tasks require that, other tasks can't tolerate that. That said, it would be cleanest to have distinct types for each purpose. Whether that would be more _usable_ I don't know. > Every non-UTC aware datetime is just an alternate spelling of an > equivalent UTC datetime / Unix timestamp, so in a certain sense you're > always doing "arithmetic in UTC" (or "arithmetic with Unix timestamps"), > but you can spell it in whichever timezone you like. In this model, > there's very little reason to consider arithmetic in non-UTC timezones > problematic; it's always consistent and predictable and gives exactly > the same results as converting to UTC first. For sizable systems it may > still be good practice to do everything internally in UTC and convert at > the edges, but the reasons are not strong; mostly just avoiding > interoperability issues with databases or other systems that don't > implement the same model, or have poor timezone handling. How do you think timeline arithmetic is implemented? 
datetime's motivating use cases overwhelmingly involved quick access to local calendar notation, so datetime stores local calendar notation (both in memory and in pickles) directly. Any non-toy implementation of timeline arithmetic would store time internally in UTC ticks instead, enduring expensive conversions to local calendar notation only when explicitly demanded. As is, the only way to get timeline arithmetic in datetime is to do some equivalent to converting to UTC first, doing dirt simple arithmetic in UTC, then converting back to local calendar notation. That's _horridly_ expensive in comparison. pytz doesn't avoid this. The arithmetic itself is fast, because it is in fact classic arithmetic. The expense is hidden in the .normalize() calls, which perform to-UTC-and-back "repair". Pragmatics are important here too. For many problem domains, you have to get results before the contract expires ;-) > * In this model, "classic" arithmetic doesn't even rise to the level of > "attractive nuisance," it's simply "wrong arithmetic," because you get > different results if working with the "same time" represented in > different timezones, which violates the core axiom of the model; it's no > longer simply arithmetic with Unix timestamps. Models are irrelevant to right or wrong; right or wrong can only be judged with respect to use cases (does a gimmick address the required task, or not? if so, "right"; if not, is it at least feasible to get the job done? if so, "grr - but OK"; if still not, "wrong"). Models can make _achieving_ "right" harder or easier, depending on what a use case requires. datetime's model and implementation made it relatively easy to address every use case collected across an extensive public design phase. None of them were about accounting for real-world duration delta as they affect, or are affected by, civil calendar notations. Of course those may not be _your_ use cases. > I don't believe there's anything wrong with Model A. It's not the right > model for _all_ tasks, but it's simple, easy to understand, fully > consistent, and useful for many tasks. Sure! Except for the "simple" and "easy to understand" parts ;-) People really do trip all the time over zone transitions, to the extent that no two distinct implementations of C mktime() can really be expected to set is_dst the same way in all cases, not even after decades of bug fixes. Your "poor timezone handling" is a real problem in edge cases across platforms. > On the whole, it's still the model I find most intuitive and would prefer > for most of the timezone code I personally write (and it's the one I actually > use today in practice, because it's the model of pytz). Do you do much datetime _arithmetic_ in pytz? If you don't, the kind of arithmetic you like is pretty much irrelevant ;-) But, if you do, take pytz's own docs to heart: The preferred way of dealing with times is to always work in UTC, converting to localtime only when generating output to be read by humans. Your arithmetic-intensive code would run much faster if you followed that advice, and you could throw out mountains of .normalize() calls. You're working in Python, and even the storage format of Python datetimes strongly favors classic arithmetic (as before, any serious implementation of timeline arithmetic would store UTC ticks directly instead). > Now Model B. In Model B, an "aware datetime" is a "clock face" or > "naive" datetime with an annotation of which timezone it's in. 
A non-UTC > aware datetime in model B doesn't inherently know what POSIX timestamp > it corresponds to; that depends on concepts that are outside of its > naive model of local time, in which time never jumps or goes backwards. > Model B is what Guido was describing in his email about an aware > datetime in 2020: he wants an aware datetime to mean "the calendar says > June 3, the clock face says noon, and I'm located in US/Eastern" and > nothing more. > > Characteristics of Model B: > > * Naive (or "classic", or "move the clock hands") arithmetic is the only > kind that makes sense under Model B. It again depends on which specific use cases you have in mind. Few people think inside a rigid model. Sometimes they want to break out of the model, especially when a use case requires it ;-) As you know all too well already, Python also intends to support a programmer changing their mind, to view their annotated naive datetime as a moment in civil time too, at least for zone conversion purposes. > * As Guido described, if you store an aware datetime and then your tz > database is updated before you load it again, Model A and Model B aware > datetimes preserve different invariants. A Model A aware datetime will > preserve the timestamp it represents, even if that means it now > represents a different local time than before the zoneinfo change. A > Model B aware datetime will preserve the local clock time, even though > it now corresponds to a different timestamp. > > * You can't compare or do arithmetic between datetimes in different > timezones under Model B; you need to convert them to the same time zone > first (which may require resolving an ambiguity). > > * Maintaining a `fold` attribute on datetimes at all is a departure from > Model B, because it represents a bit of information that's simply > nonsense/doesn't exist within Model B's naive-clock-time model. > > * Under Model B, conversions between timezones are lossy during a fold > in the target timezone, because two different UTC times map to the same > Model B local time. Should also note that Model B conversions to UTC can map two datetimes to the same UTC time (for times in a gap - they don't exist on the local civil clock, so have to map to the same UTC value as some other Model B time that _does_ exist on the local clock). > These models aren't chosen arbitrarily; they're the two models I'm aware > of for what a "timezone-aware datetime" could possibly mean that > preserve consistent arithmetic and total ordering in their allowed > domains (in Model A, all aware datetimes in any timezone can > interoperate as a single domain; in Model B, each timezone is a separate > domain). > > A great deal of this thread (including most of my earlier messages and, > I think, even parts your last message here that I'm replying to) has > consisted of proponents of one of these two models arguing that behavior > from the other model is wrong or inferior or buggy (or an "attractive > nuisance"). Direct overloaded-operator support for timeline arithmetic is an attractive nuisance _in datetime_, or any other Python module sharing datetime's data representation. I disagree with your "but the reasons are not strong" above. It requires relatively enormous complexity and expense to perform each lousy timeline addition, subtraction, and comparison in a non-eternally-fixed-offset zone. It's poor practice for that reason alone. Nevertheless, your code, your choice. 
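Spelled out, the expense Tim is pointing at looks something like this (a sketch assuming the modern zoneinfo module, which post-dates this thread; pytz spells the repair step normalize() instead):

    from datetime import datetime, timedelta, timezone
    from zoneinfo import ZoneInfo  # assumption: Python 3.9+, newer than this thread

    eastern = ZoneInfo("America/New_York")
    d = datetime(2015, 3, 7, 12, 0, tzinfo=eastern)  # noon EST, the day before spring-forward
    step = timedelta(days=1)

    classic = d + step  # what datetime's "+" does: one cheap move of the clock hands
    timeline = (d.astimezone(timezone.utc) + step).astimezone(eastern)  # the to-UTC-and-back dance

    print(classic)   # 2015-03-08 12:00:00-04:00  (same wall-clock time the next day)
    print(timeline)  # 2015-03-08 13:00:00-04:00  (24 real hours later; an hour was skipped overnight)

Every by-magic timeline operation has to pay for those two conversions somewhere; classic arithmetic pays for neither.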
> I now think these assertions are all wrong :-) Both models > are reasonable and useful, and in fact both are capable enough to handle > all operations, it's just a question of which operations they make > simple. Model B people say "just do all your arithmetic and comparisons > in UTC"; Model A people say "if you want Model B, just use naive > datetimes and track the implied timezone separately." Do note that my _only_ complaint against timeline arithmetic is making it seductively easy to spell in Python's datetime. It's dead easy to get the same results in the intended way (or, would be, in a post-495 world). > I came into this discussion assuming that Model A was the only sensible > way for a datetime library to behave. Now (thanks mostly to Guido's note > about dates in 2020), I've been convinced that Model B is also > reasonable, and preferable for some uses. For the use cases collected when datetime was being designed, it was often the clearly better model, and was never the worse model. Where "better" and "worse" are judged relative to the model's naturalness in addressing a use case. Alas, those were collected on a public Wiki that no longer appears to exist. > I've also been convinced that Model B is the dominant influence > and intended model in datetime's design, and that's very unlikely > to change (even in a backwards-compatible way), so I'm no > longer advocating that. That's good, because futility can become tiring as the decades drag on ;-) > Datetime.py, unfortunately, has always mixed behavior from the two > models (interzone operations are all implemented from a Model A > viewpoint; intrazone are Model B). Part of the problem with this is that > it results in a system that looks like it ought to have total ordering > and consistent arithmetic, but doesn't. The bigger problem is that it > has allowed people to come to the library from either a Model A or Model > B viewpoint and find enough behavior confirming their mental model to > assume they were right, and assume any behavior that doesn't match their > model is a bug. That's what happened to Stuart, and that's why pytz > implements Model A, and has thus encouraged large swathes of Python > developers to even more confidently presume that Model A is the intended > model. Stuart would have to address that. He said earlier that his primary concern was to fix conversions in all cases, not arithmetic. Explained before that timeline arithmetic was a natural consequence of the _way_ pytz repaired conversions. It's natural enough then to assume "oh, I just fixed _two_ bugs!" ;-) As is, as Isaac noted earlier, he's had a hellish time getting, e.g., pytz and dateutil to work together. dateutil requires classic arithmetic (which is by far the more convenient for implementing almost all forms of "calendar operations"). So, e.g., take a pytz aware datetime d, and do d += relativedelta(month=12, day=1, weekday=FR(+3)) where everything on the RHS is a dateutil way to spell "same time on the 3rd Friday of this December" when added to a datetime. That's not particularly contrived - it's, e.g., a way to spell the day monthly US equity options expire in December, and a user may well need to set an alarm "at the same wall clock time" then to check their expiring December contracts before the market closes. Being an hour off _could_ be a financial disaster to them. The result is fine, until you do a pytz .normalize(). 
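For concreteness, the clash looks roughly like this (a sketch assuming pytz and dateutil are installed; the offsets shown are US/Eastern's for 2015):

    from datetime import datetime

    import pytz
    from dateutil.relativedelta import FR, relativedelta

    eastern = pytz.timezone("US/Eastern")
    d = eastern.localize(datetime(2015, 6, 15, 15, 30))         # a June afternoon, EDT (UTC-4)

    alarm = d + relativedelta(month=12, day=1, weekday=FR(+3))  # same wall time, 3rd Friday of December
    print(alarm)                     # 2015-12-18 15:30:00-04:00  (right clock time, stale EDT offset)
    print(eastern.normalize(alarm))  # 2015-12-18 14:30:00-05:00  (the "repair" moves the alarm an hour early)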
If d, e.g., started in June, then in the US the hour _will_ magically become wrong "because" there was a DST transition between the original and final times. Far worse than useless. A similar fate awaits any attempt to make timeline arithmetic a default behavior (if it changed what datetime + timedelta did directly, the dateutil result would be wrong immediately, because dateutil's relativedelta.__add__ relies in part on what `datetime + timedelta` does). "Plays nice with others" is also important unless a module is content to live in a world of its own. > I think your latest proposal for PEP 495 (always ignore `fold` in all > intra-zone operations, and push the inconsistency into inter-zone > comparisons - which were already inconsistent - instead) is by far the > best option for bringing loss-less timezone-conversion round-trips to > Model B. Instead of saying (as earlier revisions of PEP 495 did) "we > claim we're really Model B, but we're going to introduce even more Model > A behaviors, breaking the consistency of Model B in some cases - good > luck keeping it straight!" it says "we're sticking with Model B, in > which `fold` is meaningless when you're working within a timezone, but > in the name of practical usability we'll still track `fold` internally > after a conversion, so you don't have to do it yourself in case you want > to convert to another timezone later." Alas, there's still no _good_ solution to this :-( > If the above analysis makes any sense at all to anyone, and you think > something along these lines (but shorter and more carefully edited) > would make a useful addition to the datetime docs (either as a > tutorial-style "intro to how datetime works and how to think about aware > datetimes" or as an FAQ), I'd be very happy to write that patch. I've mentioned a few times before that I'd welcome something more akin to the "floating-point surprises" appendix: https://docs.python.org/3/tutorial/floatingpoint.html Most users don't want to read anything about theory, but it needs to be discussed sometimes. So in that appendix, the approach is to introduce bite-sized chunks of theory to explain concrete, visible _behaviors_, along with practical advice. The goal is to get the reader unstuck, not to educate them _too_ much ;-) Anyway, that appendix appears to have been effective at getting many users unstuck, so I think it's a now-proven approach. >> Classic arithmetic is equivalent to doing integer arithmetic on >> integer POSIX timestamps (although with wider range the same across >> all platforms, and extended to microsecond precision). That's hardly >> novel - there's a deep and long history of doing exactly that in the >> Unix(tm) world. Which is Guido's world. There "shouldn't be" >> anything controversial about that. The direct predecessor was already >> best practice in its world. How that could be considered a nuisance >> seems a real strain to me. > Unless I'm misunderstanding what you are saying (always likely!), I > think this is just wrong. POSIX timestamps are a representation of an > instant in time (a number of seconds since the epoch _in UTC_). Well, in the POSIX approximation to UTC. Strict POSIX forbids using real-world UTC (which suffers leap seconds). But, below, I won't keep making this distinction. That should be a relief ;-) > If you are doing any kind of "integer arithmetic on POSIX timestamps", you > are _always_ doing timeline arithmetic. True. 
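That much is easy to demonstrate with the stdlib alone (a sketch assuming Python 3.9+ for zoneinfo):

    from datetime import datetime
    from zoneinfo import ZoneInfo  # assumption: Python 3.9+, newer than this thread

    eastern = ZoneInfo("America/New_York")
    ts = datetime(2014, 11, 1, 12, 0, tzinfo=eastern).timestamp()  # noon EDT, the day before fall-back

    print(datetime.fromtimestamp(ts, eastern))          # 2014-11-01 12:00:00-04:00
    print(datetime.fromtimestamp(ts + 86400, eastern))  # 2014-11-02 11:00:00-05:00
    # Adding 86400 seconds to the POSIX timestamp lands an hour earlier on the
    # local clock, because the zone gained an hour overnight: integer arithmetic
    # on timestamps is timeline arithmetic.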
> Classic arithmetic may be many things, but the one thing it definitively is > _not_ is "arithmetic on POSIX timestamps." False. UTC is an eternally-fixed-offset zone. There are no transitions to be accounted for in UTC. Classic and timeline arithmetic are exactly the same thing in any eternally-fixed-offset zone. Because POSIX timestamps _are_ "in UTC", any arithmetic performed on one is being done in UTC too. Your illustration next goes way beyond anything I could possibly read as doing arithmetic on POSIX timestamps: > This is easy to demonstrate: take one POSIX timestamp, convert it to > some timezone with DST, add 86400 seconds to it (using "classic > arithmetic") across a DST gap or fold, and then convert back to a POSIX > timestamp, and note that you don't have a timestamp 86400 seconds away > from the first timestamp. If you were doing simple "arithmetic on POSIX > timestamps", such a result would not be possible. But you're cheating there. It's clear as mud what you have in mind, concretely, for the _result_ of what you get from "convert it to some timezone with DST", but the result of that can't possibly be a POSIX timestamp: as you said at the start, a POSIX timestamp denotes a number of seconds from the epoch _in UTC_ You're no longer in UTC. You left the POSIX timestamp world at your very first step. So anything you do after that is irrelevant to how arithmetic on POSIX timestamps behaves. BTW, how do you intend to do that conversion to begin with? C's localtime() doesn't return time_t (a POSIX timestamp). The standard C library supports no way to perform the conversion you described, because that's not how times are intended to work in C, because in turn the Unix world has the same approach to this as Python's datetime: all timeline arithmetic is intended to be done in UTC (equivalent to POSIX timestamps), converting to UTC first (C's mktime()), then back when arithmetic is done (C's localtime()). The only difference is that datetime spells both C library functions via .astimezone(), and is about 1000 times easier to use ;-) If you're unfamiliar with how this stuff is done in C, here's a typically incomprehensible ;-) man page briefly describing all the main C time functions: http://linux.die.net/man/3/mktime Note that mktime ("convert from local to UTC") is the _only_ one returning a timestamp (time_t). The intent is you do all arithmetic on time_t's, staying in UTC for the duration. When you're done, _then_ localtime() converts your final time_t back to local calendar notation (fills a `struct tm` for output). Exactly the same dance datetime intends. Python stole almost all of this from C best practice, except for the spelling. If by "convert it to some timezone with DST", you intended to get a struct tm (local calendar notation), then add 86400 to the tm_sec member, then that doesn't even have an hallucinogenic resemblance to doing arithmetic on POSIX timestamps. > In Model A (the one that Lennart and myself and Stuart and Chris have > all been advocating during all these threads) timezone) are unambiguous > representations of a POSIX timestamp, and all arithmetic is "arithmetic > on POSIX timestamps." That right there is the definition of timeline arithmetic. Here's an example of arithmetic on POSIX timestamps: 1 + 2 returning 3. It's not some kind of equivalence relation or bijection, it's concretely adding two integers to get a third integer. That's all I mean by "arithmetic on POSIX timestamps". It's equally useful for implementing classic or timeline arithmetic. 
The difference between those isn't in the timestamp arithmetic, it's in how conversions between integers and calendar notations are defined. There does happen to be an obvious bijection between arithmetic on (wide enough) POSIX timestamps and naive datetime arithmetic, which is in turn trivially isomorphic to aware datetime arithmetic in UTC. Although the "obvious" there depends on knowing first that, at heart, a Python datetime is an integer count of microseconds since the start of 1 January 1. It's just an integer stored in a bizarre mixed-radix notation. > So yes, I agree with you that it's hard to consider "arithmetic on POSIX > timestamps" an attractive nuisance :-) >> Where it gets muddy is extending classic arithmetic to aware datetimes >> too. > If by "muddy" you mean "not in any way 'arithmetic on POSIX timestamps' > anymore." :-) > > I don't even know what you mean by "extending to aware datetimes" here; I meant what I said: extending classic arithmetic to aware datetimes muddied the waters. Because some people do expect aware datetimes to implement timeline arithmetic instead. That's all. > the concept of "arithmetic on POSIX timestamps" has no meaning at all > with naive datetimes (unless you're implicitly assuming some timezone), > because naive datetimes don't correspond to any particular instant, > whereas a POSIX timestamp does. If you need to, implicitly assume UTC. There are no surprises at all if you want to _think_ of naive datetimes as being in (the POSIX approximation of real-world) UTC. They're identical in all visible behaviors that don't require a tzinfo. Indeed, here's how to convert a naive datetime `dt` "by hand" to an integer POSIX timestamp, pretending `dt` is a UTC time: EPOCH = datetime(1970, 1, 1) ts = (dt - EPOCH) // timedelta(seconds=1) Try it! If you don't have Python 3, it's just as trivial, but you'll have to convert the 3 timedelta attributes (days, seconds, microseconds) to seconds by hand and add them. After, do EPOCH + timedelta(seconds=ts) to get back the original dt. To get a floating POSIX timestamp instead (including microseconds): ts = (dt - EPOCH).total_seconds() Please let's not argue about trivially easy bijections. datetime's natural EPOCH is datetime(1, 1, 1), and _all_ classic arithmetic is easily defined in terms of integer arithmetic on integer-count-of-microsecond timestamps starting from there. While it would be _possible_ to think of those as denoting UTC timestamps, it wouldn't really be helpful ;-) ... >>> If datetime did naive arithmetic on tz-annotated datetimes, and also >>> refused to ever implicitly convert them to UTC for purposes of >>> cross-timezone comparison or arithmetic, and included a `fold` parameter >>> not on the datetime object itself but only as an additional input >>> argument when you explicitly convert from some other timezone to UTC, >>> that would be a consistent view of the meaning of a tz-annotated >>> datetime, and I wouldn't have any problem with that. >> I would. Pure or not, it sounds unusable: when I convert _from_ UTC >> to a local zone, I have no idea whether I'll end up in a gap, a fold, >> or neither. And so I'll have no idea either what to pass _to_ >> .utcoffset() when I need to convert back to UTC. It doesn't solve the >> conversion problem. It's a do-it-yourself kit missing the most >> important piece. "But .fromutc() could return the right flag to pass >> back later" isn't attractive either. Then the user ends up needing to >> maintain their own (datetime, convert_back_flag) pairs. 
In which >> case, why not just store the flag _in_ the datetime? Only tzinfo >> methods would ever need to look at it. > Yes, I agree with you here. I think your latest proposal for PEP 495 > does a great job of providing this additional convenience for the user > without killing the intra-timezone Model B consistency. I just wish that > the inconsistent inter-timezone operations weren't supported at all, but > I know it's about twelve years too late to do anything about that other > than document some variant of "you shouldn't compare or do arithmetic > with datetimes in different timezones; if you do you'll get inconsistent > results in some cases around DST transitions. Convert to the same > timezone first instead." Alas, I'm afraid Alex is right that people may well be using interzone subtraction to do conversions already. For example, the timestamp snippets I gave above are easily extended to convert any aware datetime to a POSIX timestamp: just slap tzinfo=utc on the EPOCH constant, and then by-magic interzone subtraction converts `dt` to UTC automatically. For that to continue to work as intended in all cases post-495, we can't change anything about interzone subtraction. Which, for consistency between them, implies we "shouldn't" change anything about interzone comparisons either. > ... > Until your latest proposal on PEP 495, I wasn't sure we really did agree > on this, because it seemed you were still willing to break the > consistency of Model B arithmetic in order to gain some of the benefits > of Model A (that is, introduce _even more_ of this context-dependent > ambiguity as to what a tz-annotated datetime means.) But your latest > proposal fixes that in a way I'm quite happy with, given where we are. I'm still not sure it's a net win to change anything. Lots of tradeoffs. I do gratefully credit our exchanges for cementing my hatred of muddying Model B: the more I had to "defend" Model B, the more intense my determination to preserve its God-given honor at all costs ;-) >> Although the conceptual fog has not really been an impediment to >> using the module in my experience. >> In yours? Do you use datetime? If so, do you trip over this? > No, because I use pytz, in which there is no conceptual fog, just strict > Model A (and an unfortunate API). And applications that apparently require no use whatsoever of dateutil operations ;-) > I didn't get to experience the joy of this conceptual fog until I > started arguing with you on this mailing list! And now I finally feel > like I'm seeing through that fog a bit. I hope I'm right :-) I doubt we'll ever know for sure ;-) From alexander.belopolsky at gmail.com Mon Sep 7 14:50:20 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Mon, 7 Sep 2015 08:50:20 -0400 Subject: [Datetime-SIG] Timeline arithmetic? In-Reply-To: <55ECD82E.9070305@oddbird.net> References: <55E9D6EB.2090108@oddbird.net> <55E9F626.1080906@oddbird.net> <55ECD82E.9070305@oddbird.net> Message-ID: On Sun, Sep 6, 2015 at 8:19 PM, Carl Meyer wrote: > In this model, there's a bijection between aware > datetimes in any two timezones. (This model requires the PEP 495 flag, > or some equivalent. > A nitpick, but since I am also guilty of such loose usage of the term "bijection", it may be worth a clarification. We often say that there is a bijection between two sets when in fact there is only a bijection between a subset of one set and a subset of another. In a particular case of aware datetimes with tzinfo=UTC and tzinfo=Local, a set U = {u ∈ datetime | u.tzinfo is UTC, u.fold=0} maps to a subset of L = {t ∈ datetime | t.tzinfo is Local}. This map creates a bijection between U and its image under the map, but we are still ignoring the possibility that timezone correction may take you out of [datetime.min, datetime.max] range. To rigorously construct a mathematical bijection, you need to account for those edge effects as well. -------------- next part -------------- An HTML attachment was scrubbed... URL: From alexander.belopolsky at gmail.com Mon Sep 7 15:13:39 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Mon, 7 Sep 2015 09:13:39 -0400 Subject: [Datetime-SIG] Timeline arithmetic? In-Reply-To: References: <55E9D6EB.2090108@oddbird.net> <55E9F626.1080906@oddbird.net> <55ECD82E.9070305@oddbird.net> Message-ID: On Mon, Sep 7, 2015 at 5:12 AM, Tim Peters wrote: > For the use cases collected when datetime was being designed, it was > often the clearly better model, and was never the worse model. Where > "better" and "worse" are judged relative to the model's naturalness in > addressing a use case. Alas, those were collected on a public Wiki > that no longer appears to exist. > The Wayback Machine to the rescue! https://web.archive.org/web/20060504021923/http://www.zope.org/Members/fdrake/DateTimeWiki/FrontPage -------------- next part -------------- An HTML attachment was scrubbed... URL: From carl at oddbird.net Mon Sep 7 18:20:55 2015 From: carl at oddbird.net (Carl Meyer) Date: Mon, 7 Sep 2015 10:20:55 -0600 Subject: [Datetime-SIG] Timeline arithmetic? In-Reply-To: References: <55E9D6EB.2090108@oddbird.net> <55E9F626.1080906@oddbird.net> Message-ID: <55EDB967.2050108@oddbird.net> I'll offer another TL;DR: * You prefer Model B, and the use cases that drove the implementation of datetime favored Model B. Great! I have zero problem with that, and zero problem with datetime continuing to implement Model B (thus I agree with you completely that by-default -- operator overloaded -- timeline arithmetic in datetime would be wrong and break its model). As with any library I use, I just want its objects to implement a consistent and simple-as-possible (but no simpler!) mental model so that I can reliably predict its behavior. I understand that it's too late for datetime to do that fully, but we can still keep it in mind as a principle to help guide future changes. On 09/07/2015 03:12 AM, Tim Peters wrote: > [Carl Meyer ] >> (tl;dr I think your latest proposal re PEP 495 is great.) > > I don't. The last two were less annoying, though ;-) "Great" here is thoroughly in context of "where we are today, and where it's feasible to go from here." Isn't that the context you keep trying to get me to think in? Keep up with my hats already! ;-) More on the PEP 495 options later on. >> Consider two models for the meaning of a "timezone-aware datetime >> object". Let's just call them Model A and Model B: > > In which context? Abstractly, or the context of Python's current > datetime module, or in the context of some hypothetical future Python > datetime module, or some datetime module that _might_ have existed > instead, or ...? Any of 1, 3, or 4. But the exercise is illuminating for question 2, also. Per what you say below, it sounds like my insistence on discussing abstract mental models and their implications has already helped nudge you towards a proposal that maintains Model B consistency better.
My preference for model A vs B is negligible compared to my preference for _some_ consistently-applied mental model, so I think that's "great." > My only real interest here is moving the module that actually exists > to one that can get conversions right in all cases, preferably in a > wholly backward-compatible way. Models don't really matter to that, > but specific behaviors do. I think the two most important questions you can ask about the behavior of any library are a) Does it apply a consistent mental model of the problem domain? and b) is that mental model applicable to the problems you need to solve? (Or perhaps it may offer more than one mental model, but clearly split in the API so you can decide which one applies best to your use cases). I can't really fathom an approach to library design (even library design constrained by backwards compatibility) that honestly believes "models don't really matter, but specific behaviors do." Models are critical in order to present a consistent set of behaviors that the user of the library can successfully predict, once they understand the model. >> In Model A, an aware datetime (in any timezone) is nothing more than an >> alternate (somewhat complexified for human use) spelling of a Unix >> timestamp, much like a timedelta is just a complexified spelling of some >> number of microseconds. > > A Python datetime is also just a complexified spelling of some number > of microseconds (since the start of 1 January 1 of the proleptic > Gregorian calendar). Which is a "naive time" concept, which is a pretty good sign that Python datetime wasn't intended to implement Model A. I thought it was already pretty clear that I'd figured that out by now :-) >> In this model, there's a bijection between aware datetimes in any >> two timezones. (This model requires the PEP 495 flag, >> or some equivalent. Technically, this model _could_ be implemented by >> simply storing a Unix timestamp and a timezone name, and doing all >> date/time calculations at display time.) In this model, "Nov 2 2014 >> 1:30am US/Eastern fold=1" and "Nov 2 2014 6:30am UTC" are just alternate >> spellings of the _same_ underlying timestamp. >> >> Characteristics of Model A: >> >> * There's no issue with comparisons or arithmetic involving datetimes in >> different timezones; they're all just Unix timestamps under the hood >> anyway, so ordering and arithmetic is always obvious and consistent: >> it's always equivalent to simple integer arithmetic with Unix timestamps. >> >> * Conversions between timezones are always unambiguous and lossless: >> they're just alternate spellings of the same integer, after all. >> >> * In this model, timeline arithmetic everywhere is the only option. > > Why? Because it's the only choice that doesn't break the mental model. If "all datetimes in any timezone are really just alternate spellings of a Unix timestamp", then adding X seconds to a datetime in any timezone must result in a datetime that represents a Unix timestamp that's X seconds later. _If you're within this mental model_. You may not prefer this mental model; you may think is less useful, or slower, or whatever, and that's fine. But you have to at least acknowledge that it is internally consistent and conceptually simple; it's fundamentally nothing more than arithmetic on POSIX timestamps, all the time and everywhere. I don't know how to say this any more clearly. If you still can't acknowledge that much, I think I have to give up. > The kind of arithmetic needed for a task depends on the task. 
> There are no specific use cases given here, so who can say? Some > tasks need to account for real-world durations; others need to > overlook irregularities in real-world durations (across zone > transitions) in order to maintain regularities between the > before-and-after calendar notations. Timeline arithmetic is only > directly useful for dealing with real-world durations as they affect > civil calendar notations. Some tasks require that, other tasks can't > tolerate that. Of course! I'm describing the implications of a mental model here, not arguing that it's the best model for all tasks. >> Every non-UTC aware datetime is just an alternate spelling of an >> equivalent UTC datetime / Unix timestamp, so in a certain sense you're >> always doing "arithmetic in UTC" (or "arithmetic with Unix timestamps"), >> but you can spell it in whichever timezone you like. In this model, >> there's very little reason to consider arithmetic in non-UTC timezones >> problematic; it's always consistent and predictable and gives exactly >> the same results as converting to UTC first. For sizable systems it may >> still be good practice to do everything internally in UTC and convert at >> the edges, but the reasons are not strong; mostly just avoiding >> interoperability issues with databases or other systems that don't >> implement the same model, or have poor timezone handling. > > How do you think timeline arithmetic is implemented? datetime's > motivating use cases overwhelmingly involved quick access to local > calendar notation, so datetime stores local calendar notation (both in > memory and in pickles) directly. Any non-toy implementation of > timeline arithmetic would store time internally in UTC ticks instead, > enduring expensive conversions to local calendar notation only when > explicitly demanded. As is, the only way to get timeline arithmetic > in datetime is to do some equivalent to converting to UTC first, doing > dirt simple arithmetic in UTC, then converting back to local calendar > notation. That's _horridly_ expensive in comparison. pytz doesn't > avoid this. The arithmetic itself is fast, because it is in fact > classic arithmetic. The expense is hidden in the .normalize() calls, > which perform to-UTC-and-back "repair". Yes, of course. I know all this. In summary: "datetime wasn't intended as Model A." How many times do we need to agree on that? ;-) And I've also agreed that datetime shouldn't be converted to Model A. So what are you trying to convince me of, here? >> * In this model, "classic" arithmetic doesn't even rise to the level of >> "attractive nuisance," it's simply "wrong arithmetic," because you get >> different results if working with the "same time" represented in >> different timezones, which violates the core axiom of the model; it's no >> longer simply arithmetic with Unix timestamps. > > Models are irrelevant to right or wrong; right or wrong can only be > judged with respect to use cases (does a gimmick address the required > task, or not? if so, "right"; if not, is it at least feasible to get > the job done? if so, "grr - but OK"; if still not, "wrong"). Models > can make _achieving_ "right" harder or easier, depending on what a use > case requires. Once again, you seem to be trying to interpret every characterization of Model A as an argument that "Model A is right, other models are wrong, and datetime ought to be Model A." 
I'm not saying any of that; which model is best obviously depends on the use case (though both models are _capable_ of handling all use cases, it just may be slower and less convenient. That's a typical set of tradeoffs when choosing a model). All I'm saying is "if you accept Model A as your mental model, this is the behavior that must follow (the behavior that is _right_ _for the model_; which _is_ something that is possible to judge), else you've broken the model, and you're implementing some other model instead, or (worse) you're not implementing a consistent model at all." >> I don't believe there's anything wrong with Model A. It's not the right >> model for _all_ tasks, but it's simple, easy to understand, fully >> consistent, and useful for many tasks. > > Sure! Except for the "simple" and "easy to understand" parts ;-) Maybe not to you, I guess; though I have to suspect that you're playing a little dumb here for effect (is this the jester hat?). I think "everything is isomorphic to a Unix timestamp, just represented in different spellings, and all arithmetic is isomorphic to integer arithmetic on Unix timestamps" is pretty simple and easy to understand, personally. > People really do trip all the time over zone transitions, Of course they do, because timezones, and timezone transitions specifically, are terrible. And some will continue to trip over them, in different ways and in different scenarios, regardless of whether they work in Model A or Model B. They will trip over them _more_ if they are using a library that can't decide what mental model it implements, and tries to guess that they mean one for this operation and another for that operation, than if they are using a library that consistently implements one mental model. Do we still agree on that, or not anymore? ;-) >> On the whole, it's still the model I find most intuitive and would prefer >> for most of the timezone code I personally write (and it's the one I actually >> use today in practice, because it's the model of pytz). > > Do you do much datetime _arithmetic_ in pytz? If you don't, the kind > of arithmetic you like is pretty much irrelevant ;-) But, if you do, > take pytz's own docs to heart: > > The preferred way of dealing with times is to always work in UTC, > converting to localtime only when generating output to be read > by humans. > > Your arithmetic-intensive code would run much faster if you followed > that advice, and you could throw out mountains of .normalize() calls. > You're working in Python, and even the storage format of Python > datetimes strongly favors classic arithmetic (as before, any serious > implementation of timeline arithmetic would store UTC ticks directly > instead). I do follow that advice; I don't believe my latest heavy-datetime-using application does non-UTC timeline arithmetic anywhere. But unless a library outlaws arithmetic on non-UTC datetimes altogether, I'd like it to implement it in a way that's consistent with its mental model, whichever one it picks. Because not all little scripts need to follow the ideal best practice and squeeze out optimal performance, but they nonetheless deserve predictable behavior that consistently implements _some_ mental model of the problem domain. >> Now Model B. In Model B, an "aware datetime" is a "clock face" or >> "naive" datetime with an annotation of which timezone it's in. 
A non-UTC >> aware datetime in model B doesn't inherently know what POSIX timestamp >> it corresponds to; that depends on concepts that are outside of its >> naive model of local time, in which time never jumps or goes backwards. >> Model B is what Guido was describing in his email about an aware >> datetime in 2020: he wants an aware datetime to mean "the calendar says >> June 3, the clock face says noon, and I'm located in US/Eastern" and >> nothing more. >> >> Characteristics of Model B: >> >> * Naive (or "classic", or "move the clock hands") arithmetic is the only >> kind that makes sense under Model B. > > It again depends on which specific use cases you have in mind. Few > people think inside a rigid model. Sometimes they want to break out > of the model, especially when a use case requires it ;-) As you know > all too well already, Python also intends to support a programmer > changing their mind, to view their annotated naive datetime as a > moment in civil time too, at least for zone conversion purposes. I'm all in favor of Python supporting a programmer switching from one mental model to another. There are good ways to do that explicitly, e.g. by representing each mental model with its own type of object. See JodaTime/NodaTime for one example. I'm not in favor of Python guessing that the programmer "probably" has one mental model in mind when doing one operation, and another when doing another, on the very same object. That kind of thing leads to angry programmers who think the library is buggy. You may have seen a few of them on this mailing list ;-) I thought we agreed on this (I recall you saying "how many times do we have to agree on this?"), but then it seems like you keep waffling as to whether you actually do or not. I guess it depends which hat you're wearing at the time ;-) ... >> These models aren't chosen arbitrarily; they're the two models I'm aware >> of for what a "timezone-aware datetime" could possibly mean that >> preserve consistent arithmetic and total ordering in their allowed >> domains (in Model A, all aware datetimes in any timezone can >> interoperate as a single domain; in Model B, each timezone is a separate >> domain). >> >> A great deal of this thread (including most of my earlier messages and, >> I think, even parts your last message here that I'm replying to) has >> consisted of proponents of one of these two models arguing that behavior >> from the other model is wrong or inferior or buggy (or an "attractive >> nuisance"). > > Direct overloaded-operator support for timeline arithmetic is an > attractive nuisance _in datetime_, or any other Python module sharing > datetime's data representation. I 100% agree with you. Datetime is a Model B implementation (mostly); its data representation reflects that, and I absolutely don't think it should have operator-overloaded support for timeline arithmetic. Was I insufficiently clear about that? Actually, I think "attractive nuisance" is too weak here. I think operator-overloaded timeline arithmetic on aware datetimes in datetime would be simply wrong; it would break the mental model of what an aware datetime is, under Model B. > I disagree with your "but the reasons > are not strong" above. It requires relatively enormous complexity and > expense to perform each lousy timeline addition, subtraction, and > comparison in a non-eternally-fixed-offset zone. "In datetime or a a module sharing datetime's data representation," yes. 
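For what "representing each mental model with its own type of object," in the NodaTime spirit Carl mentions, might look like in Python, here is a rough, hypothetical sketch. The Instant class and its methods are invented for illustration and belong to no real library; it stores UTC internally, so its operator arithmetic is always timeline arithmetic, and local zones only ever affect how it is spelled:

    from datetime import datetime, timedelta, timezone

    class Instant:
        """Hypothetical Model-A value: a point on the UTC timeline."""

        def __init__(self, aware_dt):
            # Store UTC "ticks"; the zone the caller used is just a spelling.
            self._utc = aware_dt.astimezone(timezone.utc)

        def __add__(self, delta):
            # Always timeline arithmetic: shift the underlying UTC moment.
            return Instant(self._utc + delta)

        def __sub__(self, other):
            # Elapsed real time between two instants.
            return self._utc - other._utc

        def in_zone(self, tz):
            """Render this instant in some zone; a spelling, not a new value."""
            return self._utc.astimezone(tz)

    i = Instant(datetime(2021, 11, 6, 12, 0, tzinfo=timezone(timedelta(hours=-4))))
    print((i + timedelta(days=1)).in_zone(timezone.utc))   # exactly 24 real hours later

Model-B-style "clock hand" work would then be done on plain naive datetimes, with the zone carried separately, so neither model has to guess what the other meant.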
My "but the reasons are not strong" was clearly specific to Model A, which datetime is not. I tried very hard to set up a clear delineation between the two models, and be very clear that I understand datetime is Model B and should remain that way. But nonetheless, you seem very determined to blur that line and interpret all my comments about Model A as if I'm saying they should apply to datetime. Please don't do that ;-) >> I now think these assertions are all wrong :-) Both models >> are reasonable and useful, and in fact both are capable enough to handle >> all operations, it's just a question of which operations they make >> simple. Model B people say "just do all your arithmetic and comparisons >> in UTC"; Model A people say "if you want Model B, just use naive >> datetimes and track the implied timezone separately." > > Do note that my _only_ complaint against timeline arithmetic is making > it seductively easy to spell in Python's datetime. Great! Then we agree, so can we stop arguing about it? ;-) I thought I was already pretty clear that I no longer believed that timeline arithmetic should be made easy to spell in Python's datetime. I just _also_ think that there _is_ a reasonable alternative mental model in which only timeline arithmetic makes sense and classic arithmetic looks buggy, and I thought that trying to clearly outline that alternative mental model might help make sense of where the "classic arithmetic is wrong!" viewpoint originates. >> If the above analysis makes any sense at all to anyone, and you think >> something along these lines (but shorter and more carefully edited) >> would make a useful addition to the datetime docs (either as a >> tutorial-style "intro to how datetime works and how to think about aware >> datetimes" or as an FAQ), I'd be very happy to write that patch. > > I've mentioned a few times before that I'd welcome something more akin > to the "floating-point surprises" appendix: > > https://docs.python.org/3/tutorial/floatingpoint.html > > Most users don't want to read anything about theory, but it needs to > be discussed sometimes. So in that appendix, the approach is to > introduce bite-sized chunks of theory to explain concrete, visible > _behaviors_, along with practical advice. The goal is to get the > reader unstuck, not to educate them _too_ much ;-) Anyway, that > appendix appears to have been effective at getting many users unstuck, > so I think it's a now-proven approach. That's very similar to what I had in mind, actually. I'll work on a doc patch, and look forward to you tearing it apart ;-) > >>> Classic arithmetic is equivalent to doing integer arithmetic on >>> integer POSIX timestamps (although with wider range the same across >>> all platforms, and extended to microsecond precision). That's hardly >>> novel - there's a deep and long history of doing exactly that in the >>> Unix(tm) world. Which is Guido's world. There "shouldn't be" >>> anything controversial about that. The direct predecessor was already >>> best practice in its world. How that could be considered a nuisance >>> seems a real strain to me. > >> If you are doing any kind of "integer arithmetic on POSIX timestamps", you >> are _always_ doing timeline arithmetic. > > True. > >> Classic arithmetic may be many things, but the one thing it definitively is >> _not_ is "arithmetic on POSIX timestamps." > > False. UTC is an eternally-fixed-offset zone. There are no > transitions to be accounted for in UTC. 
Classic and timeline > arithmetic are exactly the same thing in any eternally-fixed-offset > zone. Because POSIX timestamps _are_ "in UTC", any arithmetic > performed on one is being done in UTC too. Your illustration next > goes way beyond anything I could possibly read as doing arithmetic on > POSIX timestamps: Translation: "I refuse to countenance the possibility of Model A." >> This is easy to demonstrate: take one POSIX timestamp, convert it to >> some timezone with DST, add 86400 seconds to it (using "classic >> arithmetic") across a DST gap or fold, and then convert back to a POSIX >> timestamp, and note that you don't have a timestamp 86400 seconds away >> from the first timestamp. If you were doing simple "arithmetic on POSIX >> timestamps", such a result would not be possible. > > But you're cheating there. It's clear as mud what you have in mind, > concretely, for the _result_ of what you get from "convert it to > some timezone with DST", but the result of that can't possibly be a > POSIX timestamp: as you said at the start, a POSIX timestamp denotes > a number of seconds from the epoch _in UTC_ You're no longer in UTC. > You left the POSIX timestamp world at your very first step. So > anything you do after that is irrelevant to how arithmetic on POSIX > timestamps behaves. Not if your mental model is that an aware datetime in some other timezone is isomorphic to a POSIX timestamp with a timezone annotation. In that case, the "timezone conversion" part is really easy and obvious; you just change the timezone annotation. > BTW, how do you intend to do that conversion to begin with? C's > localtime() doesn't return time_t (a POSIX timestamp). The standard C > library supports no way to perform the conversion you described, > because that's not how times are intended to work in C, because in > turn the Unix world has the same approach to this as Python's > datetime: all timeline arithmetic is intended to be done in UTC > (equivalent to POSIX timestamps), converting to UTC first (C's > mktime()), then back when arithmetic is done (C's localtime()). The > only difference is that datetime spells both C library functions via > .astimezone(), and is about 1000 times easier to use ;-) > > If you're unfamiliar with how this stuff is done in C, here's a > typically incomprehensible ;-) man page briefly describing all the > main C time functions: > > http://linux.die.net/man/3/mktime Thank you. In exchange, here's a reference to the ZonedDateTime object from NodaTime: http://nodatime.org/1.3.x/api/html/T_NodaTime_ZonedDateTime.htm I think (notably unlike the C libraries) NodaTime/JodaTime is an excellent example of a datetime library that maintains its mental models clearly, and provides the necessary set of objects to represent all the various concepts unambiguously and consistently. I think its usability is attested to by the fact that it's become the de facto standard in the Java world, and somebody went to the trouble of porting it to .NET, too, where it's also become quite popular. >> In Model A (the one that Lennart and myself and Stuart and Chris have >> all been advocating during all these threads) timezone) are unambiguous >> representations of a POSIX timestamp, and all arithmetic is "arithmetic >> on POSIX timestamps." That right there is the definition of timeline arithmetic. > > Here's an example of arithmetic on POSIX timestamps: > > 1 + 2 > > returning 3. It's not some kind of equivalence relation or bijection, > it's concretely adding two integers to get a third integer. 
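Carl's demonstration above can also be written out literally. This sketch uses zoneinfo (Python 3.9+) for illustration, with an arbitrary zone and a date straddling a spring-forward gap:

    from datetime import datetime, timedelta
    from zoneinfo import ZoneInfo  # Python 3.9+; illustrative only

    eastern = ZoneInfo("America/New_York")

    # A POSIX timestamp the day before the 2021 US spring-forward gap.
    ts1 = datetime(2021, 3, 13, 12, 0, tzinfo=eastern).timestamp()

    # Convert to an aware datetime, add 86400 seconds with classic arithmetic,
    # then convert back to a POSIX timestamp.
    local = datetime.fromtimestamp(ts1, eastern)
    ts2 = (local + timedelta(seconds=86400)).timestamp()

    print(ts2 - ts1)   # 82800.0, not 86400.0: the local clock skipped an hour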
That's > all I mean by "arithmetic on POSIX timestamps". It's equally useful > for implementing classic or timeline arithmetic. The difference > between those isn't in the timestamp arithmetic, it's in how > conversions between integers and calendar notations are defined. > There does happen to be an obvious bijection between arithmetic on > (wide enough) POSIX timestamps and naive datetime arithmetic, which is > in turn trivially isomorphic to aware datetime arithmetic in UTC. > Although the "obvious" there depends on knowing first that, at heart, > a Python datetime is an integer count of microseconds since the start > of 1 January 1. It's just an integer stored in a bizarre mixed-radix > notation. So, "timeline arithmetic is just arithmetic on POSIX timestamps" means viewing aware datetimes as isomorphic to POSIX timestamps. "Classic arithmetic is just arithmetic on POSIX timestamps" means viewing aware datetimes as naive datetimes which one can pretend are in a hypothetical (maybe UTC, if you like) fixed-offset timezone which is isomorphic to actual POSIX timestamps (even though their actual timezone may not be fixed-offset). I accept that those are both true and useful in the implementation of their respective model. I just don't think either one is inherently obvious or useful as a justification of their respective mental models; rather, which one you find "obvious" just reveals your preferred mental model. ... >> I think your latest proposal for PEP 495 >> does a great job of providing this additional convenience for the user >> without killing the intra-timezone Model B consistency. I just wish that >> the inconsistent inter-timezone operations weren't supported at all, but >> I know it's about twelve years too late to do anything about that other >> than document some variant of "you shouldn't compare or do arithmetic >> with datetimes in different timezones; if you do you'll get inconsistent >> results in some cases around DST transitions. Convert to the same >> timezone first instead." > > Alas, I'm afraid Alex is right that people may well be using interzone > subtraction to do conversions already. For example, the timestamp > snippets I gave above are easily extended to convert any aware > datetime to a POSIX timestamp: just slap tzinfo=utc on the EPOCH > constant, and then by-magic interzone subtraction converts `dt` to UTC > automatically. For that to continue to work as intended in all cases > post-495, we can't change anything about interzone subtraction. > Which, for consistency between them, implies we "shouldn't" change > anything about interzone comparisons either. Such code wouldn't be any _more_ broken after PEP 495 in a fold case than it is already. You can't maintain consistency everywhere, because datetime already wants to treat aware datetimes as two different things in different places. I thought we'd established that. The interzone timeline arithmetic combined with intrazone classic arithmetic already results in inconsistencies. So your choices are: a) don't do PEP 495, and leave timezone conversions lossy for everyone (except people using pytz). This effectively forces everyone who wants loss-less conversions (and doesn't want to roll their own solution) into the pytz model, which you don't like, and the pytz API, which nobody likes. 
b) add `fold` solely as an argument to `astimezone` (and maybe `combine` and the constructor too?), and maybe somehow allow users to get its value out of a conversion going the other way (no idea what API would work there) and make the user keep track of it themselves if they are working in "local" time but may want to convert back later. This option forces the inconsistency out of datetime by just making it the user's problem. Usability is pretty bad, but at least it doesn't change existing behavior, gives users _some_ way to be correct, and doesn't guess at their intentions in inconsistent cases. c) spike your intra-timezone classic arithmetic with a dash of timeline arithmetic, making datetime even more confused about its mental model than it is already. d) don't support PEP 495 in interzone operations at all, meaning code using interzone operations gains no benefit from PEP 495, but is no more broken than it is today (but code using explicit timezone conversions does benefit) e) make interzone equality weird in fold cases, but otherwise support PEP 495 in interzone operations as well as conversions. I think (d) and (e) are the best options of those, and I don't have a strong preference between them. They aren't ideal, but there is no ideal option, including the "do nothing" option. All of these cases introduce inconsistency somewhere, it's just a question of where you want to put it. I'm personally not that fussed if you decide to stick with (a) instead. > I'm still not sure it's a net win to change anything . Lots of > tradeoffs. I do gratefully credit our exchanges for cementing my > hatred of muddying Model B: the more I had to "defend" Model B, the > more intense my determination to preserve its God-given honor at all > costs ;-) My work here is done ;-) Funny how it had roughly the opposite result from what I thought I wanted when I entered the conversation, but I still think it's the right result. >>> Although the conceptual fog has not really been an impediment to >>> using the module in my experience. > >>> In yours? Do you use datetime? If so, do you trip over this? > >> No, because I use pytz, in which there is no conceptual fog, just strict >> Model A (and an unfortunate API). > > And applications that apparently require no use whatsoever of dateutil > operations ;-) Oh, I use dateutil.rrule frequently, I just separate the tzinfo from the datetime first, which makes perfect sense to me as a way to say "Ok, I want to operate in the naive time model now, please." It's really not that hard :-) Please don't take my "I use pytz, so I don't have _these_ problems" as "I use pytz, so I have _no_ problems." I fully accept that pytz is a god-awful (though very impressive!) hack to implement Model A on top of something that was always meant to be Model B, and that results in both a bad API and bad performance for some operations (though the latter really couldn't be less of an issue for my uses). I'm still not sure what's a _better_ option than pytz for someone who wants fully-correct and round-trippable timezone conversions and fully-consistent behavior from a Python datetime library _today_. Carl -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: OpenPGP digital signature URL: From guido at python.org Mon Sep 7 19:06:19 2015 From: guido at python.org (Guido van Rossum) Date: Mon, 7 Sep 2015 10:06:19 -0700 Subject: [Datetime-SIG] Timeline arithmetic? 
In-Reply-To: <55EDB967.2050108@oddbird.net> References: <55E9D6EB.2090108@oddbird.net> <55E9F626.1080906@oddbird.net> <55ECD82E.9070305@oddbird.net> <55EDB967.2050108@oddbird.net> Message-ID: FYI, I am still completely overwhelmed by this discussion. I will wait until Tim and Alexander tell me there's a PEP to review and then I'll read that. Carl: if you feel your position is not represented in that PEP (even under "rejected alternatives") I recommend that you write your own PEP. But I really hope that you all will come to an agreement without competing PEPs! -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From carl at oddbird.net Mon Sep 7 19:28:02 2015 From: carl at oddbird.net (Carl Meyer) Date: Mon, 7 Sep 2015 11:28:02 -0600 Subject: [Datetime-SIG] Timeline arithmetic? In-Reply-To: References: <55E9D6EB.2090108@oddbird.net> <55E9F626.1080906@oddbird.net> <55ECD82E.9070305@oddbird.net> <55EDB967.2050108@oddbird.net> Message-ID: <55EDC922.7050103@oddbird.net> On 09/07/2015 11:06 AM, Guido van Rossum wrote: > FYI, I am still completely overwhelmed by this discussion. I will wait > until Tim and Alexander tell me there's a PEP to review and then I'll > read that. Carl: if you feel your position is not represented in that > PEP (even under "rejected alternatives") I recommend that you write your > own PEP. But I really hope that you all will come to an agreement > without competing PEPs! Sure. At the moment I think PEP 495 is headed in a direction I support, relative to the other options available. So I don't have any plans for a competing PEP. My latest couple messages in this thread are more about figuring out the right framing for a documentation addition that might help people (like me) coming from a pytz-style model understand datetime's model (and specifically understand how "classic arithmetic" is not a bug). I think I finally understand it now, so I'd like to put that understanding to good use. Carl -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: OpenPGP digital signature URL: From tim.peters at gmail.com Mon Sep 7 19:28:25 2015 From: tim.peters at gmail.com (Tim Peters) Date: Mon, 7 Sep 2015 12:28:25 -0500 Subject: [Datetime-SIG] Timeline arithmetic? In-Reply-To: References: <55E9D6EB.2090108@oddbird.net> <55E9F626.1080906@oddbird.net> <55ECD82E.9070305@oddbird.net> <55EDB967.2050108@oddbird.net> Message-ID: [Guido] > FYI, I am still completely overwhelmed by this discussion. I recommend that you skip any message with "timeline" in the Subject line ;-) Nobody is actually arguing to make timeline arithmetic (beyond what already exists) any part of PEP 495. But this is a "datetime" SIG, not a "PEP 495" SIG, so it's fair game to discuss it here. > I will wait until Tim and Alexander tell me there's a PEP to review and > then I'll read that. Carl: if you feel your position is not represented in that > PEP (even under "rejected alternatives") I recommend that you write > your own PEP. But I really hope that you all will come to an agreement > without competing PEPs! Short course: Carl prefers timeline arithmetic, but is not trying to change anything about what Python's datetime does by default. 
He would like a new kind of tzinfo that simultaneously fixes the conversion endcases _and_ forces use of timeline arithmetic for all operations Current code would neither be hurt nor helped, only code using the new tzinfos would see any difference. But current code trying to use a new tzinfo could break anywhere it relied on classic arithmetic. While I'm not entirely sure, best guess is that Carl would also prefer that 495 not be implemented. But his new kind of tzinfo could be implemented regardless. They don't really compete, except in the eternal battle over theoretical purity ;-) From carl at oddbird.net Mon Sep 7 19:37:12 2015 From: carl at oddbird.net (Carl Meyer) Date: Mon, 7 Sep 2015 11:37:12 -0600 Subject: [Datetime-SIG] Timeline arithmetic? In-Reply-To: References: <55E9D6EB.2090108@oddbird.net> <55E9F626.1080906@oddbird.net> <55ECD82E.9070305@oddbird.net> <55EDB967.2050108@oddbird.net> Message-ID: <55EDCB48.9010900@oddbird.net> On 09/07/2015 11:28 AM, Tim Peters wrote: > Short course: Carl prefers timeline arithmetic, but is not trying to > change anything about what Python's datetime does by default. He > would like a new kind of tzinfo that simultaneously fixes the > conversion endcases _and_ forces use of timeline arithmetic for all > operations Current code would neither be hurt nor helped, only code > using the new tzinfos would see any difference. But current code > trying to use a new tzinfo could break anywhere it relied on classic > arithmetic. I did propose that a couple days ago, and found the exercise of proposing it enlightening :-) but I don't even think that's a good idea anymore (as of yesterday, when I finally got my head fully around the internal consistency of the "naive local time" model). Trying to have both mental models implemented within datetime using different types of tzinfo would just confuse matters further. Different types of datetime would be a better bet, but that can just be a different library altogether. Better to have datetime be as true to its model as it can, and improve the intro docs so people assuming a timeline-arithmetic model can also get their heads around the naive-local-time model and do things the right way for that model. > While I'm not entirely sure, best guess is that Carl would also prefer > that 495 not be implemented. But his new kind of tzinfo could be > implemented regardless. They don't really compete, except in the > eternal battle over theoretical purity ;-) No, I would (weakly) prefer for PEP 495 to be accepted, as long as it chooses to push the required inconsistency into inter-timezone operations instead of breaking the consistency of classic arithmetic. Carl -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: OpenPGP digital signature URL: From tim.peters at gmail.com Mon Sep 7 20:43:29 2015 From: tim.peters at gmail.com (Tim Peters) Date: Mon, 7 Sep 2015 13:43:29 -0500 Subject: [Datetime-SIG] Timeline arithmetic? In-Reply-To: <55EDB967.2050108@oddbird.net> References: <55E9D6EB.2090108@oddbird.net> <55E9F626.1080906@oddbird.net> <55ECD82E.9070305@oddbird.net> <55EDB967.2050108@oddbird.net> Message-ID: All this violent agreement ;-) is sucking away all the free time I have. So I'm going to try something else: focus on a single _seeming_ disagreement that makes no sense to me. 
[Carl] >>> In Model A, an aware datetime (in any timezone) is nothing more than an >>> alternate (somewhat complexified for human use) spelling of a Unix >>> timestamp, much like a timedelta is just a complexified spelling of some >>> number of microseconds. [Tim] >> A Python datetime is also just a complexified spelling of some number >> of microseconds (since the start of 1 January 1 of the proleptic >> Gregorian calendar). [Carl] > Which is a "naive time" concept, which is a pretty good sign that Python > datetime wasn't intended to implement Model A. I thought it was already > pretty clear that I'd figured that out by now :-) So: - You tell me that in model A an aware datetime is a spelling of a Unix timestamp. - I tell you that a Python datetime is a spelling of a different flavor of timestamp. - You tell me that "means" Python is using a naive time concept, and wasn't intended to implement model A. Can you see why I'm baffled? If it needs to explained, it's even more baffling to me. So here goes anyway: Model A uses a very similar concept. Not identical, because: - The Unix timestamp takes 1970-1-1 as its epoch, while Python's takes 1-1-1. They nevertheless use exactly the same proleptic calendar system. - The Unix timestamp counts seconds, but Python's counts microseconds (on a platform where time_t is a floating type, a Unix timestamp can approximate decimal microseconds too, as fractions of a second). - The resolution and range of a Unix timestamp vary across platforms, but Python defines both. Where's a theoretically _significant_ difference? It's simply not true that viewing datetimes as timestamps has anything to do with drawing a distinction between your models A and B. An implementation of model A may or may not explicitly store the Unix timestamp it has in mind. From your statement that under model A it's a "complexified" spelling of a Unix timestamp, I have to assume you have in mind implementations where it's not explicitly stored. In which case it's exactly the same as in Python today: to _find_ that Unix timestamp, you need to convert your complexified spelling to UTC first. Perhaps the distinction you have in mind is that, under Model A, it's impossible to think of an aware datetime as being anything _other_ than a Unix timestamp? That may have been what your "nothing more" meant. Then, yes, there is that difference: Python doesn't intend to force any specific interpretation of its timestamps beyond that they're instants in the proleptic Gregorian calendar. Model A also views them as instants in the proleptic Gregorian calendar, but tacks on "and that calendar must always be viewed as being in (a proleptic extension of an approximation to) UTC". So maybe I understand you now after all. But, if so, are these kinds of seeming disagreements really worth resolving? It requires a seemingly unreasonable amount of time & effort to arrive at the obvious ;-) From guido at python.org Mon Sep 7 21:04:42 2015 From: guido at python.org (Guido van Rossum) Date: Mon, 7 Sep 2015 12:04:42 -0700 Subject: [Datetime-SIG] Timeline arithmetic? In-Reply-To: References: <55E9D6EB.2090108@oddbird.net> <55E9F626.1080906@oddbird.net> <55ECD82E.9070305@oddbird.net> <55EDB967.2050108@oddbird.net> Message-ID: Again, I can't follow this because I don't recall the definition of model A. But here's a fundamental difference between a timezone-aware datetime and a POSIX stamp (apart from epoch, range and precision). 
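Tim's bullet points above are easy to verify numerically: the two flavors of timestamp differ only in origin and unit. A small check, treating both notations naively (zones ignored entirely, which is all that's needed to see the epoch/unit relationship); the example value is arbitrary:

    from datetime import datetime, timedelta

    PY_EPOCH = datetime(1, 1, 1)        # proleptic Gregorian origin of Python's count
    UNIX_EPOCH = datetime(1970, 1, 1)   # same calendar, different origin

    dt = datetime(2015, 9, 7, 20, 43, 29)   # naive, whole seconds

    py_us = (dt - PY_EPOCH) // timedelta(microseconds=1)     # microseconds since 0001-01-01
    unix_s = (dt - UNIX_EPOCH) // timedelta(seconds=1)        # seconds since 1970-01-01
    offset_s = (UNIX_EPOCH - PY_EPOCH) // timedelta(seconds=1)

    # The two "timestamps" differ only by a constant offset and a unit.
    assert py_us == (unix_s + offset_s) * 1_000_000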
The difference applies only to "political" timezones, which may change offsets or DST rules. The difference is that an aware datetime says "in timezone Z, when the local clock says T". If T is in the future, politicians may change the mapping of T to UTC in Z. However, politics can't change the meaning of a POSIX timestamp. Even for T in the (distant) past the mapping may still change, when research finds that the rules for Z were different at some year in the past than they were presumed. So, to me, an aware datetime *fundamentally* differs from a POSIX timestamp, and even from a pair composed of a POSIX timestamp plus a tzinfo object. (POSIX timestamps are however embeddable in datetimes by using a fixed-offset tzinfo.) -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From carl at oddbird.net Mon Sep 7 21:38:41 2015 From: carl at oddbird.net (Carl Meyer) Date: Mon, 7 Sep 2015 13:38:41 -0600 Subject: [Datetime-SIG] Timeline arithmetic? In-Reply-To: References: <55E9D6EB.2090108@oddbird.net> <55E9F626.1080906@oddbird.net> <55ECD82E.9070305@oddbird.net> <55EDB967.2050108@oddbird.net> Message-ID: <55EDE7C1.5010903@oddbird.net> On 09/07/2015 12:43 PM, Tim Peters wrote: > [Carl] >>>> In Model A, an aware datetime (in any timezone) is nothing more than an >>>> alternate (somewhat complexified for human use) spelling of a Unix >>>> timestamp, much like a timedelta is just a complexified spelling of some >>>> number of microseconds. > > [Tim] >>> A Python datetime is also just a complexified spelling of some number >>> of microseconds (since the start of 1 January 1 of the proleptic >>> Gregorian calendar). > > [Carl] >> Which is a "naive time" concept, which is a pretty good sign that Python >> datetime wasn't intended to implement Model A. I thought it was already >> pretty clear that I'd figured that out by now :-) > > So: > > - You tell me that in model A an aware datetime is a spelling of a > Unix timestamp. > > - I tell you that a Python datetime is a spelling of a different > flavor of timestamp. > > - You tell me that "means" Python is using a naive time concept, and wasn't > intended to implement model A. > > Can you see why I'm baffled? If it needs to explained, it's even more > baffling to me. So here goes anyway: Model A uses a very similar > concept. Not identical, because: > > - The Unix timestamp takes 1970-1-1 as its epoch, while Python's takes 1-1-1. > They nevertheless use exactly the same proleptic calendar system. > > - The Unix timestamp counts seconds, but Python's counts microseconds (on > a platform where time_t is a floating type, a Unix timestamp can approximate > decimal microseconds too, as fractions of a second). > > - The resolution and range of a Unix timestamp vary across platforms, but Python > defines both. Right, but (as you know) those are all incidental to the actual distinction I was trying to make. > Where's a theoretically _significant_ difference? It's simply not > true that viewing datetimes as timestamps has anything to do with > drawing a distinction between your models A and B. The key difference is that a Unix timestamp defines a single instant in "real time" (or the UTC approximation of "real time," which is good enough), because the Unix epoch is defined to be in UTC. 
The point of even _having_ representations in other timezones (under Model A) is never to change that basic "real monotonic time" model, it's solely to get or parse a representation for the sake of a human (or some other computer system) living naively in that timezone. A Python datetime "timestamp," on the other hand, is "naive" or "timezone-relative." It doesn't define a single instant in real time until you pair it with an offset. The timestamp itself is timezone-relative (it's "the number of microseconds since datetime(1, 1, 1) in naive local time in whatever timezone we're currently in"). That's why doing integer arithmetic on this kind of timestamp does classic arithmetic instead of timeline arithmetic. That's a Model B understanding of what a non-UTC aware datetime represents. > An implementation of model A may or may not explicitly store the Unix > timestamp it has in mind. From your statement that under model A it's > a "complexified" spelling of a Unix timestamp, I have to assume you > have in mind implementations where it's not explicitly stored. In > which case it's exactly the same as in Python today: to _find_ that > Unix timestamp, you need to convert your complexified spelling to UTC > first. I intentionally didn't specify any implementation. In outlining the difference between Model A and Model B, I'm not concerned about implementation details; I'm concerned about the mental model of what an "aware datetime" represents (and thus what invariants you can expect it to keep once you grasp the model.) I think Model A and Model B do represent clear alternative mental models in that respect (regardless of how they are implemented, and what e.g. speed/size tradeoffs that may involve). In Model A, an aware datetime is always a single unambiguous instant in time (that is, isomorphic to UTC), and that alone tells you a lot about how to expect it to behave in terms of arithmetic, equality, etc (or even in "being stored across a zoneinfo update"). In Model B, an aware datetime is a local-clock time annotated with a timezone, and that gives you a different set of consistent expectations about how it should behave. > Perhaps the distinction you have in mind is that, under Model A, it's > impossible to think of an aware datetime as being anything _other_ > than a Unix timestamp? Yes, that's basically right. If you're working in Model A and you want to work in "local clock time", you strip off the timezone information and use an object representing simple naive clock time, with no timezone awareness at all. > That may have been what your "nothing more" > meant. Then, yes, there is that difference: Python doesn't intend to > force any specific interpretation of its timestamps beyond that > they're instants in the proleptic Gregorian calendar. According to my use of the term (which I borrowed from J/NodaTime) datetime's "timestamps" aren't really "instants" at all, in the sense that they don't (alone) tell you when something occurred in the real world (which is another way of saying that they don't map isomorphically to UTC, or any other monotonic representation of time). They represent a point in the (abstract) proleptic Gregorian calendar, which only represents an instant once paired with a UTC offset. > Model A also > views them as instants in the proleptic Gregorian calendar, but tacks > on "and that calendar must always be viewed as being in (a proleptic > extension of an approximation to) UTC". I think I understand what you mean here. 
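Carl's "timezone-relative timestamp" description matches what the stdlib actually does: same-zone arithmetic acts only on the naive calendar fields and never consults the offset. A quick check (zoneinfo is used for illustration; any tzinfo behaves the same way here):

    from datetime import datetime, timedelta
    from zoneinfo import ZoneInfo  # illustrative only

    eastern = ZoneInfo("America/New_York")
    dt = datetime(2021, 11, 6, 12, 0, tzinfo=eastern)
    step = timedelta(days=1)

    # Classic addition is equivalent to stripping the tzinfo, adding to the
    # naive fields, and re-attaching the same tzinfo object.
    assert (dt + step).replace(tzinfo=None) == dt.replace(tzinfo=None) + step
    assert (dt + step).tzinfo is dt.tzinfo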
I would say that both Model A and Model B have an equally opinionated interpretation of what an aware datetime represents, but it's true that Model A's interpretation requires it to carry enough information (in some form) to always be isomorphic to UTC, whereas Model B doesn't require it to carry that much information. What Python actually _does_ is a bit more muddled, as we've both said many times, because sometimes it acts like Model B (intra-zone) and sometimes like Model A (inter-zone). I think that's unfortunate, because it results in arithmetic and ordering inconsistencies, and headaches like the ones you're having with PEP 495. But I've accepted that Python wants _more_ to be Model B than Model A, so it's best to just discourage use of the "magic" interzone operations and be consistently Model B everywhere else, rather than finding a way (like my earlier "strict tzinfo" proposal tried to) to arrive at an implementation that's consistently Model A. > So maybe I understand you now after all. But, if so, are these kinds > of seeming disagreements really worth resolving? It requires a > seemingly unreasonable amount of time & effort to arrive at the > obvious ;-) Well, perhaps all of this was always obvious to you, in which case I do apologize for wasting so much of your time! But it _seemed_ to me that we had proponents of both Model A and Model B in this mailing list, almost entirely talking past each other, and that trying to outline how each one is a consistent and usable model on its own terms might help proponents of both to at least understand the other better. It helped me understand the benefits of Model B, anyway. I'm curious if it made any sense to Chris, if he's still following this thread. I'm still hopeful of leveraging that understanding into something useful for the docs. Sorry if it didn't help you :/ I certainly don't want to keep wasting your time, so I'm happy to leave it here. Thanks for the discussion; it's been useful to me, and I appreciate your time. Carl -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: OpenPGP digital signature URL: From carl at oddbird.net Mon Sep 7 21:42:40 2015 From: carl at oddbird.net (Carl Meyer) Date: Mon, 7 Sep 2015 13:42:40 -0600 Subject: [Datetime-SIG] Timeline arithmetic? In-Reply-To: References: <55E9D6EB.2090108@oddbird.net> <55E9F626.1080906@oddbird.net> <55ECD82E.9070305@oddbird.net> <55EDB967.2050108@oddbird.net> Message-ID: <55EDE8B0.4020103@oddbird.net> On 09/07/2015 01:04 PM, Guido van Rossum wrote: > Again, I can't follow this because I don't recall the definition of > model A. But here's a fundamental difference between a timezone-aware > datetime and a POSIX stamp (apart from epoch, range and precision). The > difference applies only to "political" timezones, which may change > offsets or DST rules. The difference is that an aware datetime says "in > timezone Z, when the local clock says T". If T is in the future, > politicians may change the mapping of T to UTC in Z. However, politics > can't change the meaning of a POSIX timestamp. Even for T in the > (distant) past the mapping may still change, when research finds that > the rules for Z were different at some year in the past than they were > presumed. So, to me, an aware datetime *fundamentally* differs from a > POSIX timestamp, and even from a pair composed of a POSIX timestamp plus > a tzinfo object. 
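The "sometimes Model B, sometimes Model A" muddle Carl describes is directly observable. A sketch, again with zoneinfo and an arbitrary zone and date spanning a fall-back transition:

    from datetime import datetime, timedelta, timezone
    from zoneinfo import ZoneInfo  # illustrative only

    eastern = ZoneInfo("America/New_York")
    utc = timezone.utc

    a = datetime(2021, 11, 6, 12, 0, tzinfo=eastern)
    b = a + timedelta(days=1)          # classic: same wall clock, next day

    # Intra-zone subtraction is classic (Model B): exactly "one day".
    print(b - a)                       # 1 day, 0:00:00

    # Inter-zone subtraction goes through UTC (Model A): 25 real hours.
    print(b - a.astimezone(utc))       # 1 day, 1:00:00

    # Inter-zone equality and hashing are also UTC-based.
    assert a == a.astimezone(utc)
    assert hash(a) == hash(a.astimezone(utc))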
(POSIX timestamps are however embeddable in datetimes > by using a fixed-offset tzinfo.) Yes, that's a great description of the precise difference that I've been trying to describe. Thanks. (In an attempt to use totally value-neutral terms, I called the "POSIX timestamp" model "Model A" and the "clock time plus a timezone" -- what a Python aware datetime is -- "Model B". That probably just introduced even more confusion.) Carl -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: OpenPGP digital signature URL: From tim.peters at gmail.com Mon Sep 7 22:04:58 2015 From: tim.peters at gmail.com (Tim Peters) Date: Mon, 7 Sep 2015 15:04:58 -0500 Subject: [Datetime-SIG] Timeline arithmetic? In-Reply-To: References: <55E9D6EB.2090108@oddbird.net> <55E9F626.1080906@oddbird.net> <55ECD82E.9070305@oddbird.net> <55EDB967.2050108@oddbird.net> Message-ID: [Guido] > Again, I can't follow this because I don't recall the definition of model A. Pretty much that an aware datetime is exactly and only a spelling of a POSIX timestamp. Various things follow from that, such that timeline arithmetic is overwhelmingly most natural in that model. > But here's a fundamental difference between a timezone-aware datetime and a > POSIX stamp (apart from epoch, range and precision). The difference applies > only to "political" timezones, which may change offsets or DST rules. The > difference is that an aware datetime says "in timezone Z, when the local > clock says T". If T is in the future, politicians may change the mapping of > T to UTC in Z. However, politics can't change the meaning of a POSIX > timestamp. Even for T in the (distant) past the mapping may still change, > when research finds that the rules for Z were different at some year in the > past than they were presumed. So, to me, an aware datetime *fundamentally* > differs from a POSIX timestamp, and even from a pair composed of a POSIX > timestamp plus a tzinfo object. The last is unclear to me, unless it's a conceptual distinction with no visible consequences. An aware datetime _is_ a pair, and there's a natural bijection between naive datetimes and POSIX timestamps (across all instants both can represent). That a time_t is "in UTC" is as inconsequential for this purpose as that to compute 3+1 I happen to have 3 turtles in mind rather than the distance in meters to my refrigerator ;-) I do see that it's useless conceptual baggage (even potentially misleading) to drag UTC into it at all. > (POSIX timestamps are however embeddable in datetimes by using a fixed-offset tzinfo.) Or use a naive datetime, for all practical purposes. From carl at oddbird.net Mon Sep 7 22:50:42 2015 From: carl at oddbird.net (Carl Meyer) Date: Mon, 7 Sep 2015 14:50:42 -0600 Subject: [Datetime-SIG] Timeline arithmetic? In-Reply-To: References: <55E9D6EB.2090108@oddbird.net> <55E9F626.1080906@oddbird.net> <55ECD82E.9070305@oddbird.net> <55EDB967.2050108@oddbird.net> Message-ID: > On Sep 7, 2015, at 2:04 PM, Tim Peters wrote: > > [Guido] >> But here's a fundamental difference between a timezone-aware datetime and a >> POSIX stamp (apart from epoch, range and precision). The difference applies >> only to "political" timezones, which may change offsets or DST rules. The >> difference is that an aware datetime says "in timezone Z, when the local >> clock says T". If T is in the future, politicians may change the mapping of >> T to UTC in Z. 
However, politics can't change the meaning of a POSIX >> timestamp. Even for T in the (distant) past the mapping may still change, >> when research finds that the rules for Z were different at some year in the >> past than they were presumed. So, to me, an aware datetime *fundamentally* >> differs from a POSIX timestamp, and even from a pair composed of a POSIX >> timestamp plus a tzinfo object. > > The last is unclear to me, unless it's a conceptual distinction with > no visible consequences. A <POSIX timestamp, tzinfo> pair is what I've been calling a "model A aware datetime." A <naive datetime, tzinfo> pair is what I've been calling a "model B aware datetime." There are many visible differences if you assume that in both cases you do simple integer arithmetic and comparisons on the time stamp component. > An aware datetime _is_ a <naive datetime, tzinfo> pair, and there's a natural bijection between naive datetimes > and POSIX timestamps (across all instants both can represent). I don't understand this, and I suspect it's at the heart of our misunderstanding. I would say there are many possible bijections between naive datetimes and posix time stamps, one corresponding to every possible UTC offset. (Or if you allow that a naive datetime may represent a time in a zone with a non-fixed offset, there may not be a bijection to posix time stamps at all). How do you decide which one is "natural"? Without the offset, you don't know how to compare a naive datetime to an instant expressed as a posix time stamp, meaning you don't actually know what instant it represents. > That a > time_t is "in UTC" is as inconsequential for this purpose as that to > compute 3+1 I happen to have 3 turtles in mind rather than the > distance in meters to my refrigerator ;-) I do see that it's useless > conceptual baggage (even potentially misleading) to drag UTC into it > at all. > > >> (POSIX timestamps are however embeddable in datetimes by using a fixed-offset tzinfo.) > > Or use a naive datetime, for all practical purposes. > Conceptually, sure, if you're willing to assume an implied fixed offset timezone. "For all practical purposes," no, because the _practical_ purpose of a model A tz-aware datetime is to always be able to easily and unambiguously ask it "how do you spell yourself in timezone X." Carl From guido at python.org Mon Sep 7 23:04:19 2015 From: guido at python.org (Guido van Rossum) Date: Mon, 7 Sep 2015 14:04:19 -0700 Subject: [Datetime-SIG] Timeline arithmetic? In-Reply-To: References: <55E9D6EB.2090108@oddbird.net> <55E9F626.1080906@oddbird.net> <55ECD82E.9070305@oddbird.net> <55EDB967.2050108@oddbird.net> Message-ID: On Mon, Sep 7, 2015 at 1:04 PM, Tim Peters wrote: > [Guido] > > Again, I can't follow this because I don't recall the definition of > model A. > > Pretty much that an aware datetime is exactly and only a spelling of a > POSIX timestamp. Various things follow from that, such that timeline > arithmetic is overwhelmingly most natural in that model. > OK. I'll just remember "model A bad, model B good." :-) Or, perhaps more fairly, "model A is how pytz thinks, model B is how the stdlib thinks." > > But here's a fundamental difference between a timezone-aware datetime > and a > > POSIX stamp (apart from epoch, range and precision). The difference > applies > > only to "political" timezones, which may change offsets or DST rules. The > > difference is that an aware datetime says "in timezone Z, when the local > > clock says T". If T is in the future, politicians may change the mapping > of > > T to UTC in Z.
However, politics can't change the meaning of a POSIX > > timestamp. Even for T in the (distant) past the mapping may still change, > > when research finds that the rules for Z were different at some year in > the > > past than they were presumed. So, to me, an aware datetime > *fundamentally* > > differs from a POSIX timestamp, and even from a pair composed of a POSIX > > timestamp plus a tzinfo object. > > The last is unclear to me, unless it's a conceptual distinction with > no visible consequences. An aware datetime _is_ a tzinfo> pair, and there's a natural bijection between naive datetimes > and POSIX timestamps (across all instants both can represent). That a > time_t is "in UTC" is as inconsequential for this purpose as that to > compute 3+1 I happen to have 3 turtles in mind rather than the > distance in meters to my refrigerator ;-) I do see that it's useless > conceptual baggage (even potentially misleading) to drag UTC into it > at all. > OK, you nerd-sniped me. :-) In my view it *is* important that a time_t references UTC. Using a time_t to store a non-UTC timestamp feels as wrong to me as using it to store a number of turtles (even though I know there is code that does this). OTOH a naive timestamp does not have this prejudice towards UTC -- it *could* refer to UTC (e.g. when it's returned from utcnow()) or to local time (e.g. from now()) or to some other timezone that is only inferred from the context. (A struct tm also doesn't have this prejudice to me.) Anyways, when I say "a (POSIX timestamp, tzinfo) tuple", the way I think of it is that when I ask "what does the local clock say" this uses a mapping from POSIX timestamp to that tzinfo. But when I say "a (naive datetime, tzinfo) tuple", I assume the naive datetime to be what the local clock says, so the tzinfo is only needed when I ask "what time is it in another timezone". Next, whatever the future of UTC relative to TAI or other time standards, I expect that UTC will continue to approximate mean solar time somewhere in Greenwich(*), and I expect that the vast majority of other timezones will continue to be defined in terms of offsets from UTC (and typically in whole hours). But I expect that the exact definition of many local timezones will continue to be modified by local politicians, and as a consequence I cannot be *sure* what UTC will be at noon on June 3rd 2020 in the US/Eastern timezone. But I *can* be (tautologically) sure what the local clock will say: 12:00:00. And what I intend by all this is that when I pickle or otherwise persist that particular datetime, I want to be sure that it records the naive local time and the timezone, not the UTC time and the timezone. (Also, I want it to record the timezone in a way that if I unpickle it years from now, it will reference the US/Eastern timezone as it is defined at that time -- I don't want it to reference a copy of the timezone rules at the time I pickled it. This is similar to how globals such as classes and functions are pickled by reference.) I should also mention that this only matters when you persist an aware datetime and restore it later. I don't think we should worry about timezone definitions to be mutable within a process (though if processes were to have expected lifetimes measured in years you might have to worry about this -- but that worry is derived from more general worries about software upgrades over such timescales). > > (POSIX timestamps are however embeddable in datetimes by using a > fixed-offset tzinfo.) 
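Guido's pickling point above appears to be what you get today from the stdlib plus a by-key zone implementation: datetime pickles its wall-clock fields along with the tzinfo object, and zoneinfo objects (Python 3.9+, PEP 615) pickle by key, so the offset is recomputed under whatever rules the unpickling process has. A sketch; the zone and date mirror his example but are otherwise arbitrary:

    import pickle
    from datetime import datetime
    from zoneinfo import ZoneInfo  # Python 3.9+; pickles by key per PEP 615

    dt = datetime(2020, 6, 3, 12, 0, tzinfo=ZoneInfo("America/New_York"))
    rt = pickle.loads(pickle.dumps(dt))

    # The wall-clock fields and the zone *name* round-trip; the UTC offset is
    # recomputed from whatever rules the unpickling process has for that zone.
    assert rt.replace(tzinfo=None) == dt.replace(tzinfo=None)
    assert rt.tzinfo.key == "America/New_York"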
> > Or use a naive datetime, for all practical purposes. > As long as the naive datetime is specified in UTC. :-) (*) I visited the Royal Observatory this summer, and learned that there are a number of different competing meridians. It's fascinating to realize that as early as the 19th century astronomers cared about the location of their telescopes to within meters: https://en.wikipedia.org/wiki/United_Kingdom_Ordnance_Survey_Zero_Meridian . -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.peters at gmail.com Mon Sep 7 23:44:39 2015 From: tim.peters at gmail.com (Tim Peters) Date: Mon, 7 Sep 2015 16:44:39 -0500 Subject: [Datetime-SIG] Timeline arithmetic? In-Reply-To: References: <55E9D6EB.2090108@oddbird.net> <55E9F626.1080906@oddbird.net> <55ECD82E.9070305@oddbird.net> <55EDB967.2050108@oddbird.net> Message-ID: [Tim] >> An aware datetime _is_ a > tzinfo> pair, and there's a natural bijection between naive datetimes >> and POSIX timestamps (across all instants both can represent). [Carl] > I don't understand this, and I suspect it's at the heart of our > misunderstanding. I would say there are many possible bijections .... "Natural" bijection. I gave you very simple Python code implementing that bijection already. A naive datetime represents an instant in the proleptic Gregorian calendar. So does a POSIX timestamp. In POSIX, the relationship between a timestamp and calendar notation is defined by the C expression ("/" is truncating integer division): timestamp = tm_sec + tm_min*60 + tm_hour*3600 + tm_yday*86400 + (tm_year-70)*31536000 + ((tm_year-69)/4)*86400 - ((tm_year-1)/100)*86400 + ((tm_year+299)/400)*86400 The natural bijection, between naive datetimes and POSIX timestamps, is the bijection in which a naive datetime maps to/from the POSIX timestamp such that the naive datetime's calendar notation is exactly equal to the POSIX calendar notation corresponding to that POSIX timestamp as defined by the expression above. Any other bijection is strained in comparison, hence "unnatural". Natural doesn't necessarily mean unique (although it does in this specific case - there is only one bijection satisfying the above); "natural" is more related to Occam's Razor ;-) ... >>> (POSIX timestamps are however embeddable in datetimes by using a fixed-offset tzinfo.) >> Or use a naive datetime, for all practical purposes. > Conceptually, sure, if you're willing to assume an implied > fixed offset timezone. "For all practical purposes," no, because > the _practical_ purpose of a model A tz-aware datetime is > to always be able to easily and unambiguously ask it "how > do you spell yourself in timezone X." Guido wasn't talking about any of that, and neither was I. He was talking about "embedding". That's passive with respect to thing being embedded. Of course it's possible to "embed" a POSIX timestamp in a naive datetime - for the purpose of being embedded, it's just a frickin' integer ;-) From carl at oddbird.net Tue Sep 8 01:52:28 2015 From: carl at oddbird.net (Carl Meyer) Date: Mon, 7 Sep 2015 17:52:28 -0600 Subject: [Datetime-SIG] Timeline arithmetic? 
In-Reply-To: References: <55E9D6EB.2090108@oddbird.net> <55E9F626.1080906@oddbird.net> <55ECD82E.9070305@oddbird.net> <55EDB967.2050108@oddbird.net> Message-ID: <55EE233C.1020307@oddbird.net> On 09/07/2015 03:44 PM, Tim Peters wrote: > [Tim] >>> An aware datetime _is_ a >> tzinfo> pair, and there's a natural bijection between naive datetimes >>> and POSIX timestamps (across all instants both can represent). > [Carl] >> I don't understand this, and I suspect it's at the heart of our >> misunderstanding. I would say there are many possible bijections .... [Tim] > "Natural" bijection. I gave you very simple Python code implementing > that bijection already. A naive datetime represents an instant in the > proleptic Gregorian calendar. What is your definition of "instant" here? I don't think a naive datetime represents an instant at all; it represents a range of possible instants, depending which timezone that naive datetime is interpreted in. Without an offset, who knows which instant it might represent. > So does a POSIX timestamp. In POSIX, > the relationship between a timestamp and calendar notation is defined > by the C expression ("/" is truncating integer division): > > timestamp = tm_sec + tm_min*60 + tm_hour*3600 + tm_yday*86400 + > (tm_year-70)*31536000 + ((tm_year-69)/4)*86400 - > ((tm_year-1)/100)*86400 + ((tm_year+299)/400)*86400 > > The natural bijection, between naive datetimes and POSIX timestamps, > is the bijection in which a naive datetime maps to/from the POSIX > timestamp such that > > the naive datetime's calendar notation > is exactly equal to > the POSIX calendar notation > corresponding to that POSIX timestamp > as defined by the expression above. > > Any other bijection is strained in comparison, hence "unnatural". > Natural doesn't necessarily mean unique (although it does in this > specific case - there is only one bijection satisfying the above); > "natural" is more related to Occam's Razor ;-) Ok, sure, because POSIX is defined in terms of the Gregorian calendar in UTC, if you have(for some reason) _must_ compare a naive datetime to a POSIX timestamp, it's simplest to assume the naive datetime is also in UTC, so that their Gregorian calendars line up with no offset. I buy that's "most natural" of the available bijections in some sense, but I'm missing the "so what?" Under what circumstances is it reasonable to make that assumption about a naive datetime? Rather than saying "a naive datetime simply doesn't correspond to any particular POSIX timestamp; they aren't comparable at all unless you have additional information," which is what I'd say. I mean, I certainly hope you wouldn't want datetime to make `utcdt - naivedt` a defined operation where it's assumed the naive datetime is UTC. [Guido] >>>> (POSIX timestamps are however embeddable in datetimes by using a fixed-offset tzinfo.) [Tim] >>> Or use a naive datetime, for all practical purposes. [Carl] >> Conceptually, sure, if you're willing to assume an implied >> fixed offset timezone. "For all practical purposes," no, because >> the _practical_ purpose of a model A tz-aware datetime is >> to always be able to easily and unambiguously ask it "how >> do you spell yourself in timezone X." [Tim] > Guido wasn't talking about any of that, and neither was I. He was > talking about "embedding". That's passive with respect to thing being > embedded. Of course it's possible to "embed" a POSIX timestamp in a > naive datetime - for the purpose of being embedded, it's just a > frickin' integer ;-) Yes, of course. 
Sorry, I missed the context. Carl -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: OpenPGP digital signature URL: From carl at oddbird.net Tue Sep 8 02:00:01 2015 From: carl at oddbird.net (Carl Meyer) Date: Mon, 7 Sep 2015 18:00:01 -0600 Subject: [Datetime-SIG] Timeline arithmetic? In-Reply-To: References: <55E9D6EB.2090108@oddbird.net> <55E9F626.1080906@oddbird.net> <55ECD82E.9070305@oddbird.net> <55EDB967.2050108@oddbird.net> Message-ID: <55EE2501.6090901@oddbird.net> [Guido] > OK. I'll just remember "model A bad, model B good." :-) Fine by me. :-) > Or, perhaps more fairly, "model A is how pytz thinks, model B is how the > stdlib thinks." We'd be in better shape if it were that simple. pytz is strictly model A. Unfortunately the stdlib isn't consistent in how it thinks (short version: because __hash__ and cross-timezone equality and arithmetic implicitly treat aware datetimes as if they were unambiguous model A instants, when they aren't), and that's the root of all the difficulty with PEP 495. (I can give a longer explanation of _why_ that causes difficulty with PEP 495 if you want it, or you can go back and read the last few threads in detail, or you can just wait for the PEP :-) ). Carl -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: OpenPGP digital signature URL: From tim.peters at gmail.com Tue Sep 8 03:54:54 2015 From: tim.peters at gmail.com (Tim Peters) Date: Mon, 7 Sep 2015 20:54:54 -0500 Subject: [Datetime-SIG] Timeline arithmetic? In-Reply-To: <55EE233C.1020307@oddbird.net> References: <55E9D6EB.2090108@oddbird.net> <55E9F626.1080906@oddbird.net> <55ECD82E.9070305@oddbird.net> <55EDB967.2050108@oddbird.net> <55EE233C.1020307@oddbird.net> Message-ID: [Tim] >>>> An aware datetime _is_ a >>> tzinfo> pair, and there's a natural bijection between naive datetimes >>>> and POSIX timestamps (across all instants both can represent). [Carl] >>> I don't understand this, and I suspect it's at the heart of our >>> misunderstanding. I would say there are many possible bijections .... [Tim] >> "Natural" bijection. I gave you very simple Python code implementing >> that bijection already. A naive datetime represents an instant in the >> proleptic Gregorian calendar. [Carl] > What is your definition of "instant" here? I didn't need one - Occam's Razor again ;-) To establish a bijection, all that's required is to show that a proposed function meets all the formal requirements. I couldn't care less whether it does or doesn't fit in with anyone's mental model, including my own. "Represents an instance" was just vague English motivation for what followed. The bijection was wholly defined by the latter, and never mentioned "instant". If it meets what someone _wants_ to think "an instant" means. fine; if not, also fine. Whether a proposed function is in fact a bijection has nothing to do with anyone's opinion of what "an instant" means, should mean, or must not mean. But if you can't leave that alone, here: by "an instant in the proleptic Gregorian calendar", I mean any 5-tuple of integers that meets the defined (by POSIX) requirements for a valid struct tm's tm_sec, tm_min, tm_hour, tm_yday. and tm_year members. > I don't think a naive datetime represents an instant at all; Fine by me - and by Python. Also fine if you _never_ use a naive datetime. 
> it represents a range of possible instants, Heh - I see you haven't defined what _you_ mean by "instant". When you do, please be sure it's consistent with what POSIX says here too: The relationship between the actual time of day and the current value for seconds since the Epoch is unspecified. How any changes to the value of seconds since the Epoch are made to align to a desired relationship with the current actual time is implementation-defined. As represented in seconds since the Epoch, each and every day shall be accounted for by exactly 86400 seconds. While you're at it, define a clean model in which all that makes a lick of sense to a casual user ;-) > depending which timezone that naive datetime is interpreted > in. Without an offset, who knows which instant it might represent. I understand much of it is at odds with Model A. I also understand that some datetime libraries for other languages supply different types for different purposes. That's fine by me too But we're on a Python datetime mailing list, so in the absence of explicit statements to the contrary, it makes most sense here to assume Python's datetime is being discussed on its own terms. >> So does a POSIX timestamp. In POSIX, >> the relationship between a timestamp and calendar notation is defined >> by the C expression ("/" is truncating integer division): >> >> timestamp = tm_sec + tm_min*60 + tm_hour*3600 + tm_yday*86400 + >> (tm_year-70)*31536000 + ((tm_year-69)/4)*86400 - >> ((tm_year-1)/100)*86400 + ((tm_year+299)/400)*86400 >> >> The natural bijection, between naive datetimes and POSIX timestamps, >> is the bijection in which a naive datetime maps to/from the POSIX >> timestamp such that >> >> the naive datetime's calendar notation >> is exactly equal to >> the POSIX calendar notation >> corresponding to that POSIX timestamp >> as defined by the expression above. >> >> Any other bijection is strained in comparison, hence "unnatural". >> Natural doesn't necessarily mean unique (although it does in this >> specific case - there is only one bijection satisfying the above); >> "natural" is more related to Occam's Razor ;-) > Ok, sure, because POSIX is defined in terms of the Gregorian calendar in > UTC, if you have(for some reason) _must_ compare a naive datetime to a > POSIX timestamp, it's simplest to assume the naive datetime is also in > UTC, so that their Gregorian calendars line up with no offset. It does happen to be an order-preserving bijection. But I said nothing in the quote about comparing anything apart from comparing pairs of integers (not timestamps, and not datetimes - just the little integers in the two calendar notations) for equality. > I buy that's "most natural" of the available bijections in some sense, but I'm > missing the "so what?" The "so what?", in context, was to tweak Guido about saying an aware datetime is fundamentally different from a pair, despite that the space of such pairs is isomorphic to the space of aware datetimes (which _is_ the space of pairs) under the natural naive_datetime <-> timestamp bijection. Why is that setting _you_ off? Guido handled it just fine ;-) > Under what circumstances is it reasonable to make that assumption > about a naive datetime? Any use case where it's convenient That's up to the user. not me - or you. For example, before Python grew its builtin datetime.timezone.utc implementation of a UTC class, I routinely used naive datetimes I thought of as being in UTC. I was too lazy to remember where I hid my own UTC class. No problem. 
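(For concreteness, here is a rough sketch of the kind of "very simple
Python code" that natural bijection amounts to - not necessarily the exact
code referred to earlier, and sticking to whole seconds:

    from datetime import datetime, timedelta

    _EPOCH = datetime(1970, 1, 1)  # naive; fields read per the POSIX formula

    def naive_to_timestamp(dt):
        # naive datetime -> the POSIX timestamp with the same calendar notation
        return (dt - _EPOCH) // timedelta(seconds=1)

    def timestamp_to_naive(ts):
        # POSIX timestamp -> the naive datetime with that calendar notation
        return _EPOCH + timedelta(seconds=ts)

It round-trips in both directions, and order is preserved along the way.)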
> Rather than saying "a naive datetime simply doesn't correspond to
> any particular POSIX timestamp; they aren't comparable at all unless
> you have additional information," which is what I'd say.

I'm starting to suspect you didn't design datetime ;-)  In context, I
was replying to Guido, who was talking about Python.  In Python's
datetime, naive datetimes are comparable.  Naive time has no _concept_
of time zone.  Naive datetimes nevertheless have a notion of total
order, which is isomorphic to the POSIX timestamp notion of total
order under the natural bijection.  Likewise for arithmetic, etc.
There's nothing "wrong" about exploiting any of that when it's
convenient.

> I mean, I certainly hope you wouldn't want datetime to make `utcdt -
> naivedt` a defined operation where it's assumed the naive datetime is UTC.

Certainly not.  That _would_ be wrong ;-)

From alexander.belopolsky at gmail.com Tue Sep 8 03:57:12 2015
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Mon, 7 Sep 2015 21:57:12 -0400
Subject: [Datetime-SIG] PEP 495: What's left to resolve
Message-ID: 

The good news is that, other than a few editorial changes, there is only
one issue which keeps me from declaring PEP 495 complete.  The bad news is
that the remaining issue is subtle and, while several solutions have been
proposed, none stands out as obviously right.

The Problem
-----------

PEP 495 requires that the value of the fold attribute is ignored when two
aware datetime objects that share tzinfo are compared.  This is motivated
by backward compatibility: we want the value of fold to only matter in
conversions from one zone to another and not in arithmetic within a single
timezone.

As Tim pointed out, this rule is in conflict with the only requirement
that a hash function must satisfy: if two objects compare as equal, their
hashes should be equal as well.

Let t0 and t1 be two times in the fold that differ only by the value of
their fold attribute: t0.fold == 0, t1.fold == 1.  Let u0 =
t0.astimezone(utc) and u1 = t1.astimezone(utc).  PEP 495 requires that
u0 < u1.  (In fact, the main purpose of the PEP is to disambiguate between
t0 and t1 so that conversion to UTC is well defined.)  However, by the
current PEP 495 rules, t0 == t1 is True, and by the pre-PEP rule (and the
PEP rule that fold is ignored in comparisons) we also have t0 == u0 and
t1 == u1.  So, we have (a) a violation of the transitivity of ==:
u0 == t0 == t1 == u1 does not imply u0 == u1, which is bad enough by
itself, and (b) since hash(u0) can be equal to hash(u1) only by a lucky
coincidence, the rule "equality of objects implies equality of hashes"
leads to a contradiction: applying it to the chain u0 == t0 == t1 == u1,
we get hash(u0) == hash(t0) == hash(t1) == hash(u1), which is now a chain
of equalities of integers, and on integers == is transitive, so we have
hash(u0) == hash(u1), which as we said can only happen by a lucky
coincidence.

The Root of the Problem
-----------------------

The rules of arithmetic on aware datetime objects already cause some basic
mathematical identities to break.  The problem described above is avoided
by not having a way to represent u1 in the timezone where u0 and u1 map to
the same local time.
We still have the surprising situation that u0 < u1 while
u0.astimezone(local) == u1.astimezone(local), but it does not rise to the
level of a hash invariant violation because u0.astimezone(local) and
u1.astimezone(local) are not only equal: they are identical in all other
ways, and if we convert them back to UTC - they both convert to u0.

The root of the hash problem is not in the "t0 == t1 is True" rule.  It is
in u0 == t0.  The latter equality is just too fragile: if you add
timedelta(hours=1) to both sides of this equation, then (assuming an
ordinary 1 hour fall-back fold) you will get two datetime objects that are
no longer equal.  (Indeed, local to utc equality t == u is defined as
t - t.utcoffset() == u.replace(tzinfo=t.tzinfo), but when you add 1 hour
to t0, utcoffset() changes, so the equality that held for t0 and u0 will
no longer hold for t0 + timedelta(hours=1) and u0 + timedelta(hours=1).)

PEP 495 gives us a way to break the u0 == t0 equality by replacing t0 with
an "equal" object t1 and simultaneously have u0 == t0, t0 == t1 and
t1 != u0.

The Solutions
-------------

Tim suggested several solutions to this problem, but by his own admission
none is more than "grudgingly acceptable."  For completeness, I will also
present my "non-solution."

Solution 0: Ignore the problem.  Since PEP 495 does not by itself
introduce any tzinfo implementations with variable utcoffset(), it does
not create a hash invariant violation.  I call this a non-solution because
it would once again punt an unsolvable problem to tzinfo implementors.  It
is unsolvable for *them* because without some variant of the rejected PEP
500, they will have no control over datetime comparisons or hashing.

Solution 1: Make t1 > t0.

Solution 2: Leave t1 == t0, but make t1 != u1.

Request for Comments
--------------------

I will not discuss the pros and cons of the two solutions because my goal
here was only to state the problem, identify the root cause and indicate
the possible solutions.  Those interested in details can read Tim's
excellent explanations in the "Another round on error-checking" [1] and
"Another approach to 495's glitches" [2] threads.

I "bcc" python-dev in the hope that someone in the expanded forum will
either say "of course solution N is the right one and here is why" or
"here is an obviously right solution - how could you guys miss it."

[1]: https://mail.python.org/pipermail/datetime-sig/2015-September/000622.html
[2]: https://mail.python.org/pipermail/datetime-sig/2015-September/000716.html
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From tim.peters at gmail.com Tue Sep 8 04:25:28 2015
From: tim.peters at gmail.com (Tim Peters)
Date: Mon, 7 Sep 2015 21:25:28 -0500
Subject: [Datetime-SIG] Timeline arithmetic?
In-Reply-To: <55EE2501.6090901@oddbird.net>
References: <55E9D6EB.2090108@oddbird.net> <55E9F626.1080906@oddbird.net>
 <55ECD82E.9070305@oddbird.net> <55EDB967.2050108@oddbird.net>
 <55EE2501.6090901@oddbird.net>
Message-ID: 

[Guido]
>> OK. I'll just remember "model A bad, model B good." :-)

[Carl]
> Fine by me. :-)

That's the spirit!  We'll have you chugging Dutch Kool-Aid yet ;-)

>> Or, perhaps more fairly, "model A is how pytz thinks, model B is how the
>> stdlib thinks."

> We'd be in better shape if it were that simple. pytz is strictly model
> A.
Unfortunately the stdlib isn't consistent in how it thinks (short > version: because __hash__ and cross-timezone equality and arithmetic > implicitly treat aware datetimes as if they were unambiguous model A > instants, when they aren't), and that's the root of all the difficulty > with PEP 495. > > (I can give a longer explanation of _why_ that causes difficulty with > PEP 495 if you want it, or you can go back and read the last few threads > in detail, or you can just wait for the PEP :-) ). Time for just the "high-order bits" again (for Guido): Last time we left off with "End of problems. Start of new problems.". You can just repeat that now. The new problems turned out to be even uglier than the earlier problems. So after going from "ignore fold as much as possible" to "pay attention to it as much as possible", we're back to "ignore it as much as possible" again. The real pain remaining is that we'd love to ignore it in interzone by-magic subtraction and comparison too, but doing so would break a weak form of backward compatibility: interzone code that already works fine would continue to work fine, but after `fold` started showing up may well no longer compute the _intended_ results in fold=1 cases. Alex made a good case for why such code may actually exist, and for why this would be a real regression for such code's intended purposes. So the best idea now is to special-case the snot out of fold==1 only in interzone __eq__ and __ne__, to say that any datetime with fold=1 is "not equal" to any datetime in any other zone. That hackery is to squash the return of "the hash problem" (without needing an insanely delicate hash() implementation). This causes annoying special-case warts in current by-magic interzone operations. For example, cross-zone comparison trichotomy could fail: if x.fold==1 and y is in a different zone, none of xy would be true. Best guess is that's of little consequence, but it's ugly. So, if your time machine is gassed up and ready to go, just remove by-magic interzone comparison and subtraction before they were added. Thanks! PEP 495 could be a delight then :-) From guido at python.org Tue Sep 8 06:21:29 2015 From: guido at python.org (Guido van Rossum) Date: Mon, 7 Sep 2015 21:21:29 -0700 Subject: [Datetime-SIG] PEP 495: What's left to resolve In-Reply-To: References: Message-ID: Maybe I should just reject PEP 495 in disgust. :-) I think #2 is the only reasonable solution (of these three). Of all the existing semantics we're trying to preserve, I find interzone comparison the unholiest. (With the possible exceptions of the case where both zones are known to be forever-fixed-offset, such as datetime.timezone instances and pytz.utc, and even possibly the fixed-offset zones that pytz returns from localize(). How exactly we're going to recognize those is a different question, though I have an opinion there too.) On Mon, Sep 7, 2015 at 6:57 PM, Alexander Belopolsky < alexander.belopolsky at gmail.com> wrote: > The good news that other than a few editorial changes there is only one > issue which keeps me from declaring PEP 495 complete. The bad news is that > the remaining issue is subtle and while several solutions have been > proposed, neither stands out as an obviously right. > > The Problem > ----------- > > PEP 495 requires that the value of the fold attribute is ignored when two > aware datetime objects that share tzinfo are compared. 
This is motivated > by the reasons of backward compatibility: we want the value of fold to only > matter in conversions from one zone to another and not in arithmetic within > a single timezone. > > As Tim pointed out, this rule is in conflict with the only requirement > that a hash function must satisfy: if two objects compare as equal, their > hashes should be equal as well. > > Let t0 and t1 be two times in the fold that differ only by the value of > their fold attribute: t0.fold == 0, t1.fold == 1. Let u0 = > t0.astimezone(utc) and u1 = t1.astimezone(t1). PEP 495 requires that u0 < > u1. (In fact, this is the main purpose of the PEP to disambiguate between > t0 and t1 so that conversion to UTC is well defined.) However, by the > current PEP 495 rules, t0 == t1 is True, by the pre-PEP rule (and the PEP > rule that fold is ignored in comparisons) we also have t0 == u0 and t1 == > u1. So, we have (a) a violation of the transitivity of ==: u0 == t0 == t1 > == u1 does not imply u0 == u1 which is bad enough by itself, and (b) since > hash(u0) can be equal to hash(u1) only by a lucky coincidence, the rule > "equality of objects implies equality of hashes" leads to contradiction > because applying it to the chain u0 == t0 == t1 == u1, we get hash(u0) == > hash(t0) == hash(t1) == hash(u1) which is now a chain of equalities of > integers and on integers == is transitive, so we have hash(u0) == hash(u1) > which as we said can only happen by a lucky coincidence. > > > The Root of the Problem > ----------------------- > > The rules of arithmetic on aware datetime objects already cause some basic > mathematical identities to break. The problem described above is avoided > by not having a way to represent u1 in the timezone where u0 and u1 map to > the same local time. We still have a surprising u0 < u1, but > u0.astimezone(local) == u1.astimezone(local), but it does not rise to the > level of a hash invariant violation because u0.astimezone(local) and > u1.astimezone(local) are not only equal: they are identical in all other > ways and if we convert them back to UTC - they both convert to u0. > > The root of the hash problem is not in the t0 == t1 is True rule. It is > in u0 == t0. The later equality is just too fragile: if you add > timedelta(hour=1) to both sides to this equation, then (assuming an > ordinary 1 hour fall-back fold), you will get two datetime objects that are > no longer equal. (Indeed, local to utc equality t == u is defined as t - > t.utcoffset() == u.replace(tzinfo=t.tzinfo), but when you add 1 hour to t0, > utcoffset() changes so the equality that held for t0 and u0 will no longer > hold for t0 + timedelta(hour=1) and u0 + timedelta(hour=1).) > > PEP 495 gives us a way to break the u0 == t0 equality by replacing t0 with > an "equal" object t1 and simultaneously have u0 == t0, t0 == t1 and t1 != > u0. > > > The Solutions > ------------- > > Tim suggested several solutions to this problem, but by his own admission > neither is more than "grudgingly acceptable." For completeness, I will > also present my "non-solution." > > Solution 0: Ignore the problem. Since PEP 495 does not by itself > introduce any tzinfo implementations with variable utcoffset(), it does not > create a hash invariant violation. I call this a non-solution because it > would once again punt an unsolvable problem to tzinfo implementors. It is > unsolvable for *them* because without some variant of the rejected PEP 500, > they will have no control over datetime comparisons or hashing. 
> > Solution 1: Make t1 > t0. > > Solution 2: Leave t1 == t0, but make t1 != u1. > > > Request for Comments > -------------------- > > I will not discuss pros and cons on the two solutions because my goal here > was only to state the problem, identify the root case and indicate the > possible solutions. Those interested in details can read Tim's excellent > explanations in the "Another round on error-checking" [1] and "Another > approach to 495's glitches" [2] threads. > > I "bcc" python-dev in a hope that someone in the expanded forum will > either say "of course solution N is the right one and here is why" or "here > is an obviously right solution - how could you guys miss it." > > > [1]: > https://mail.python.org/pipermail/datetime-sig/2015-September/000622.html > [2]: > https://mail.python.org/pipermail/datetime-sig/2015-September/000716.html > > > _______________________________________________ > Datetime-SIG mailing list > Datetime-SIG at python.org > https://mail.python.org/mailman/listinfo/datetime-sig > The PSF Code of Conduct applies to this mailing list: > https://www.python.org/psf/codeofconduct/ > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Tue Sep 8 06:43:01 2015 From: guido at python.org (Guido van Rossum) Date: Mon, 7 Sep 2015 21:43:01 -0700 Subject: [Datetime-SIG] Timeline arithmetic? In-Reply-To: References: <55E9D6EB.2090108@oddbird.net> <55E9F626.1080906@oddbird.net> <55ECD82E.9070305@oddbird.net> <55EDB967.2050108@oddbird.net> <55EE2501.6090901@oddbird.net> Message-ID: A bit of levity: http://penny-arcade.com/comic/2015/09/07/the-twain --Guido -- --Guido van Rossum (on iPad) -------------- next part -------------- An HTML attachment was scrubbed... URL: From stuart at stuartbishop.net Tue Sep 8 06:48:53 2015 From: stuart at stuartbishop.net (Stuart Bishop) Date: Tue, 8 Sep 2015 11:48:53 +0700 Subject: [Datetime-SIG] Timeline arithmetic? In-Reply-To: References: Message-ID: On 4 September 2015 at 23:01, Chris Barker wrote: > I would like a flag on datetime, but it seems it might be better to put that > flag on a tzinfo object. But the implementation is the something to argue > about only if there is any chance of doing it at all. I would still lean towards a separate datetimetz class, but that is just semantics. > Also, particularly as PEP 495 will introduce changes to tzinfo, that will > presumable lead to changes in tzinfo implementations (like pytz, etc), it > seems that if other changes are afoot, now is a good time to map out how > they should be done. > > Stuart, if you are listening: > > IIUC, you want "timeline" arithmetic to work with pytz tzinfo-aware > datetimes. To the extent that the current implementation functions in a > maybe "hacky", and at least inconvenient, way to achieve this. > > So you are an obvious person to say what we might put in the stdlib that > would facilitate cleaning all that up. If anything. > > BTW: I'll at least take it as a given that we're not breaking backward > compatibility, and that arithmetic needs to stay as fast as it currently is > -- at least in the cases where it currently works. To clean up pytz's interface and allow it to easily bolt on timeline arithmetic to the existing datetime library, I need two hooks to replace calls to tzinfo.localize() and tzinfo.normalize(). 
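(For readers who don't live in pytz, the dance those two calls cover today
looks roughly like this - the zone and times are only examples:

    from datetime import datetime, timedelta
    import pytz

    eastern = pytz.timezone('US/Eastern')
    # localize() picks the tzinfo variant matching a wall time; here the
    # DST side of the 2002 fall-back fold:
    dt = eastern.localize(datetime(2002, 10, 27, 1, 30), is_dst=True)
    # normalize() repairs the offset after arithmetic crosses a transition;
    # 1:30 EDT + 1 hour comes back as 1:30 EST:
    later = eastern.normalize(dt + timedelta(hours=1))

The two hooks described next are aimed at making those explicit calls
unnecessary.)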
When a user does datetime.datetime(2000, 10, 9, 8, 7, 6, tzinfo=pytz.timezone('US/Eastern'), a method on the tzinfo needs to be invoked that returns the real tzinfo to be used for that datetime (ie. the tzinfo instance for Oct 2000, not the default one for January 1878). When arithmetic has been performed, a method on the resulting tzinfo needs to be invoked that returns a datetime containing the real, adjusted result. These hooks are entirely separate to PEP-495 AFAICT. PEP-495 doesn't help pytz the library much. It should help pytz users though, as most use cases can stop using pytz and switch to using stdlib. -- Stuart Bishop http://www.stuartbishop.net/ From tim.peters at gmail.com Tue Sep 8 06:50:04 2015 From: tim.peters at gmail.com (Tim Peters) Date: Mon, 7 Sep 2015 23:50:04 -0500 Subject: [Datetime-SIG] PEP 495: What's left to resolve In-Reply-To: References: Message-ID: [Guido] > Maybe I should just reject PEP 495 in disgust. :-) Maybe so :-) > I think #2 is the only reasonable solution (of these three). No argument there either. > Of all the > existing semantics we're trying to preserve, I find interzone comparison the > unholiest. (With the possible exceptions of the case where both zones are > known to be forever-fixed-offset, such as datetime.timezone instances and > pytz.utc, and even possibly the fixed-offset zones that pytz returns from > localize(). How exactly we're going to recognize those is a different > question, though I have an opinion there too.) No real worries about those: if 495 is implemented, there will be two kinds of tzinfos: 1. With pre-495 semantics. Those will never even look at `fold`, let alone set it to 1. 2. With post-495 semantics. .fromutc() is the only tzinfo method that will set `fold`. Any correct implementation of .fromutc() converting to a fixed-offset zone will always set `fold` to 0 in its result, since there are no ambiguous times in a fixed-offset zone. There are two flavors of "solution 2" (which differ in how much they muck with interzone subtraction and/or comparison), but neither of those flavors changes anything about what happens when neither operand has `fold=1`. So the only way by-magic cross-zone subtraction or comparison between fixed-offset zones could cease working exactly as they do today is if the user forces `fold=1` manually. And by "the only way", I mean the only way I just happened to think of ;-) But it's certain a correct 495 .fromutc() could not screw this up. Note that intrazone arithmetic ignores `fold` in the current proposal (classic arithmetic changes in no way, ever), but always forces it to `0` when there's a datetime result. So some stray fold=1 propagating through intrazone datatime arithmetic isn't a concern either. From stuart at stuartbishop.net Tue Sep 8 06:53:43 2015 From: stuart at stuartbishop.net (Stuart Bishop) Date: Tue, 8 Sep 2015 11:53:43 +0700 Subject: [Datetime-SIG] Timeline arithmetic? In-Reply-To: References: Message-ID: On 4 September 2015 at 23:39, Tim Peters wrote: > It seems 495 really doesn't do anything for pytz, so I'm not sure > Stuart would bother to implement 495-conforming tzinfos. _Someone_ > will, though. Eventually ;-) I'll do it, but more than happy for someone else to do it first. 3.6 I guess. More support in stdlib means fewer confused pytz users. I still worry that landing real timezones in stdlib will be dropping the pants on datetime, exposing its warts for all to see. 
-- Stuart Bishop http://www.stuartbishop.net/ From tim.peters at gmail.com Tue Sep 8 06:58:55 2015 From: tim.peters at gmail.com (Tim Peters) Date: Mon, 7 Sep 2015 23:58:55 -0500 Subject: [Datetime-SIG] PEP 495: What's left to resolve In-Reply-To: References: Message-ID: [Guido] > ... > (With the possible exceptions of the case where both zones are > known to be forever-fixed-offset, such as datetime.timezone instances and > pytz.utc, and even possibly the fixed-offset zones that pytz returns from > localize(). How exactly we're going to recognize those is a different > question, though I have an opinion there too.) BTW, I was looking at what it would take to do a 495-compliant wrapping of zoneinfo. That essentially hands us .fromutc(), but leaves .utcoffset() a puzzle (mktime() all over again). I found what I thought was a very happy solution: when loading the tzfile, it's easy to construct a list of every unique total UTC offset in the zone's history. Order them from most recent to least, and then .utcoffset() would typically need to try no more than the first two to find one where .fromutc() reproduced .utcoffset()'s input. In that scheme, "is this a fixed offset zone?" is the same as asking whether the zone's unique-offsets list is a singleton. That doesn't belong in 495, just noting that the recognition question you raised is dead easy to answer for the most important source of timezone info. From guido at python.org Tue Sep 8 07:44:24 2015 From: guido at python.org (Guido van Rossum) Date: Mon, 7 Sep 2015 22:44:24 -0700 Subject: [Datetime-SIG] PEP 495: What's left to resolve In-Reply-To: References: Message-ID: No, the question I care about is more like "could politicians change the utc offset", not whether they have done so in the past. So instances of datetime.timezone qualify, as do (I believe) lettered "military" zone names. On Monday, September 7, 2015, Tim Peters wrote: > [Guido] > > ... > > (With the possible exceptions of the case where both zones are > > known to be forever-fixed-offset, such as datetime.timezone instances and > > pytz.utc, and even possibly the fixed-offset zones that pytz returns from > > localize(). How exactly we're going to recognize those is a different > > question, though I have an opinion there too.) > > BTW, I was looking at what it would take to do a 495-compliant > wrapping of zoneinfo. That essentially hands us .fromutc(), but > leaves .utcoffset() a puzzle (mktime() all over again). > > I found what I thought was a very happy solution: when loading the > tzfile, it's easy to construct a list of every unique total UTC offset > in the zone's history. Order them from most recent to least, and then > .utcoffset() would typically need to try no more than the first two to > find one where .fromutc() reproduced .utcoffset()'s input. > > In that scheme, "is this a fixed offset zone?" is the same as asking > whether the zone's unique-offsets list is a singleton. > > That doesn't belong in 495, just noting that the recognition question > you raised is dead easy to answer for the most important source of > timezone info. > -- --Guido van Rossum (on iPad) -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From tim.peters at gmail.com Tue Sep 8 08:10:20 2015 From: tim.peters at gmail.com (Tim Peters) Date: Tue, 8 Sep 2015 01:10:20 -0500 Subject: [Datetime-SIG] PEP 495: What's left to resolve In-Reply-To: References: Message-ID: [Guido] > No, the question I care about is more like "could politicians change the utc > offset", not whether they have done so in the past. So instances of > datetime.timezone qualify, as do (I believe) lettered "military" zone names. Ah, got it now. No, that's impossible to determine from a tzfile. Yes, the 25 {"A", "B", ... ,"Z"} - {"J"} military zones do (one for each hour offset in -12 through +12 inclusive). The military "J" zone does not (that's whatever local civil zone is implied by context - good luck programming that one ;-) ). In any case, the message before still applies: interzone subtraction and comparison for such zones would continue to work fine after 495, because their .fromutc() would never set `fold` to 1. From alexander.belopolsky at gmail.com Tue Sep 8 09:59:15 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Tue, 8 Sep 2015 03:59:15 -0400 Subject: [Datetime-SIG] PEP 495: What's left to resolve In-Reply-To: References: Message-ID: On Mon, Sep 7, 2015 at 9:57 PM, Alexander Belopolsky < alexander.belopolsky at gmail.com> wrote: > Solution 1: Make t1 > t0. > > Solution 2: Leave t1 == t0, but make t1 != u1. > Solution 3: Leave t1 == t0, but make *both* t0 != u0 and t1 != u1 if t0.utcoffset() != t1.utcoffset(). In other words, def __eq__(self, other): n_self = self.replace(tzinfo=None) n_other = other.replace(tzinfo=None) if self.tzinfo is other.tzinfo: return n_self == n_other u_self = n_self - self.utcoffset() v_self = n_self - self.replace(fold=(1-self.fold)).utcoffset() u_other = n_other - other.utcoffset() v_other = n_other - other.replace(fold=(1-self.fold)).utcoffset() return u_self == u_other == v_self == v_other Before anyone complaints that this makes comparison 4x slower, I note that we can add obvious optimizations for the common tzinfo is datetime.timezone.utc and isinstance(tzinfo, datetime.timezone) cases. Users that truly want to compare aware datetime instances between two variable offset timezones, should realize that fold/gap detection in *both* r.h.s. and l.h.s. zones is part of the operation that they request. This solution has some nice properties compared to the solution 2: (1) it restores the transitivity - we no longer have u0 == t0 == t1 and t1 != u1; (2) it restores the symmetry between fold=0 and fold=1 while preserving a full backward compatibility. I also think this solution makes an intuitive sense: since we cannot decide which of the two UTC times u0 and u1 should belong in the equivalency class of t0 == t1 - neither should. "In the face of ambiguity" and all that. -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Tue Sep 8 17:09:59 2015 From: guido at python.org (Guido van Rossum) Date: Tue, 8 Sep 2015 08:09:59 -0700 Subject: [Datetime-SIG] PEP 495: What's left to resolve In-Reply-To: References: Message-ID: On Tue, Sep 8, 2015 at 12:59 AM, Alexander Belopolsky < alexander.belopolsky at gmail.com> wrote: > > On Mon, Sep 7, 2015 at 9:57 PM, Alexander Belopolsky < > alexander.belopolsky at gmail.com> wrote: > >> Solution 1: Make t1 > t0. >> >> Solution 2: Leave t1 == t0, but make t1 != u1. >> > > Solution 3: Leave t1 == t0, but make *both* t0 != u0 and t1 != u1 if > t0.utcoffset() != t1.utcoffset(). 
> > In other words, > > def __eq__(self, other): > n_self = self.replace(tzinfo=None) > n_other = other.replace(tzinfo=None) > if self.tzinfo is other.tzinfo: > return n_self == n_other > u_self = n_self - self.utcoffset() > v_self = n_self - self.replace(fold=(1-self.fold)).utcoffset() > u_other = n_other - other.utcoffset() > v_other = n_other - other.replace(fold=(1-self.fold)).utcoffset() > return u_self == u_other == v_self == v_other > > Before anyone complaints that this makes comparison 4x slower, I note that > we can add obvious optimizations for the common tzinfo is > datetime.timezone.utc and isinstance(tzinfo, datetime.timezone) cases. > Users that truly want to compare aware datetime instances between two > variable offset timezones, should realize that fold/gap detection in *both* > r.h.s. and l.h.s. zones is part of the operation that they request. > > This solution has some nice properties compared to the solution 2: (1) it > restores the transitivity - we no longer have u0 == t0 == t1 and t1 != u1; > (2) it restores the symmetry between fold=0 and fold=1 while preserving a > full backward compatibility. > > I also think this solution makes an intuitive sense: since we cannot > decide which of the two UTC times u0 and u1 should belong in the > equivalency class of t0 == t1 - neither should. "In the face of ambiguity" > and all that. > But it breaks compatibility: it breaks the rule that for fold=0 nothing changes. -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From alexander.belopolsky at gmail.com Tue Sep 8 17:46:58 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Tue, 8 Sep 2015 11:46:58 -0400 Subject: [Datetime-SIG] PEP 495: What's left to resolve In-Reply-To: References: Message-ID: On Tue, Sep 8, 2015 at 11:09 AM, Guido van Rossum wrote: > But it breaks compatibility: it breaks the rule that for fold=0 nothing > changes. It preserves a "weak form" of compatibility: nothing changes in the behavior of aware datetime objects unless they use a post-PEP tzinfo. Note that Solution 2 also breaks a "strong form" of compatibility (nothing changes unless fold=1) because pre-PEP tzinfos are supposed to interpret times in the fold as STD (fold=1). Note that in my experience very few tzinfo developers understand this requirement and with a run-of-the-mill tzinfo you have a 50/50 chance that it will interpret ambiguous times as fold=0 or fold=1. Note that PEP 495 in its present form does not promise a "strong form" of compatibility. This is something you wanted to have with fold=-1, but I thought I convinced you that it was not necessary. The current compatibility promise of PEP 495 is that fold attribute is ignored unless it is explicitly checked in tzinfo.utcoffset() and friends implementations. This stays under Solution 2 because u_ and v_ conversions are always the same if utcoffset() ignores the value of fold. Once you decide to use a post-PEP tzinfo, you have no choice but to test your software on the edge cases if you care about them. (And you probably do if you bother to switch to a post-PEP tzinfo.) If you don't care about edge cases, you can continue using pre-PEP tzinfos or switch and accept a more consistent but different edge case behavior. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From alexander.belopolsky at gmail.com Tue Sep 8 18:19:59 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Tue, 8 Sep 2015 12:19:59 -0400 Subject: [Datetime-SIG] PEP 495: What's left to resolve In-Reply-To: References: Message-ID: On Tue, Sep 8, 2015 at 11:46 AM, Alexander Belopolsky < alexander.belopolsky at gmail.com> wrote: > it breaks the rule that for fold=0 nothing changes. We may need a new section in the PEP explaining the differences between pre-PEP and post-PEP tzinfo implementations. For example, it is not true that post-PEP utcoffset() will return the same value on a fold=0 instance as a pre-PEP does. The pre-PEP rule is to treat both ambiguous (fold) and missing (gap) times as "standard time". In the typical DST observing timezone that alternated between STD and DST, this means that pre-PEP rule treats fold times as fold=1 and gap times as fold=0. For more complicated situations where you can see two folds or two gaps in a row or a time shift without a DST change (a change in STD offset), no rule is currently specified. The existing rules for fold/gap disambiguation are formulated for a single purpose: to make the generic fromutc() implementation work for the US-style timezones. Since PEP 495 requires that the new tzinfo implementations reimplement their own fromutc(), we decided that we are free to formulate new gap/fold disambiguation rules. The PEP 495 rules are formulated to be more rational than those that were dictated by the fromutc() implementation. For example, defaulting to the first time in the fold seems more natural and a wise choice: it the worst case you will have to kill an hour before the odd time meeting, but you won't miss it. -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.peters at gmail.com Tue Sep 8 18:41:02 2015 From: tim.peters at gmail.com (Tim Peters) Date: Tue, 8 Sep 2015 11:41:02 -0500 Subject: [Datetime-SIG] Version check (was Re: PEP 495: What's left to resolve) Message-ID: [Alex] > ... > Once you decide to use a post-PEP tzinfo, you have no choice but to test > your software on the edge cases if you care about them. Which reminds me: the PEP should add a way for a post-495 tzinfo to say it supplies post-495 semantics, so users can check whether they're getting a tzinfo they require (if they need fold disambiguation) or can't tolerate (if they need folds to be ignored for legacy reasons). It's not a change to the tzinfo API, but is a change to tzinfo semantics. I guess requiring a new `__version__ = 2` attribute would be OK. Or (preferably "and") add an optional `fold=None` argument to .utcoffset() (by default, use the datetime's .fold attribute, else use the passed value). Then an obscure form of version-checking could be done by seeing whether dt.utcoffset(fold=1) blows up. That's a poor way to spell "check the version", but would at least allow checking to see what would happen if `fold` changed without the expense of creating new short-lived datetime objects. Like: > v_self = n_self - self.replace(fold=(1-self.fold)).utcoffset() becoming: v_self = n_self - self.utcoffset(fold=1-self.fold) It seems the worst way to spell "check the version" is the status quo, where it seems a user would have to contrive a case where `fold` matters. While that's usually an excellent way ("check for the behavior you actually require"), in this case it means the user would have to know too much (e.g., how do they get a tzinfo representing a multi-offset zone to begin with? 
far as I know, there's no portable way to ask for that - then, even if they solve that, they need to know exactly where to find an ambiguous time in that zone). From tim.peters at gmail.com Tue Sep 8 19:06:00 2015 From: tim.peters at gmail.com (Tim Peters) Date: Tue, 8 Sep 2015 12:06:00 -0500 Subject: [Datetime-SIG] PEP 495: What's left to resolve In-Reply-To: References: Message-ID: [Alex] >> Solution 1: Make t1 > t0. >> >> Solution 2: Leave t1 == t0, but make t1 != u1. > > > Solution 3: Leave t1 == t0, but make *both* t0 != u0 and t1 != u1 if > t0.utcoffset() != t1.utcoffset(). > > In other words, > > def __eq__(self, other): > n_self = self.replace(tzinfo=None) > n_other = other.replace(tzinfo=None) > if self.tzinfo is other.tzinfo: > return n_self == n_other Well, that's infinite recursion - but I know what you mean ;-) > u_self = n_self - self.utcoffset() > v_self = n_self - self.replace(fold=(1-self.fold)).utcoffset() > u_other = n_other - other.utcoffset() > v_other = n_other - other.replace(fold=(1-self.fold)).utcoffset() > return u_self == u_other == v_self == v_other More infinite recursion. > Before anyone complaints that this makes comparison 4x slower, I don't care about the speed of by-magic interzone comparison, but if someone does I'd say it's only about 2x slower. .utcoffset() is the major expense, and this only doubles the number of those. > I note that we can add obvious optimizations for the common tzinfo is > datetime.timezone.utc and isinstance(tzinfo, datetime.timezone) cases. Please no. Comparison is almost certainly almost always intrazone, and .utcoffset() isn't called at all for intrazone comparisons. > Users that truly want to compare aware datetime instances between two > variable offset timezones, should realize that fold/gap detection in *both* > r.h.s. and l.h.s. zones is part of the operation that they request. > > This solution has some nice properties compared to the solution 2: (1) it > restores the transitivity - we no longer have u0 == t0 == t1 and t1 != u1; > (2) it restores the symmetry between fold=0 and fold=1 while preserving a > full backward compatibility. > > I also think this solution makes an intuitive sense: since we cannot decide > which of the two UTC times u0 and u1 should belong in the equivalency class > of t0 == t1 - neither should. "In the face of ambiguity" and all I do like that this "breaks" interzone comparison only in cases where `fold` actually makes a difference. Certainly more principled and focused than special-casing the snot out of all and only fold=1. But I can never decide whether something really "fixes the hash problem" without a lot more thought. So far, so good :-) From alexander.belopolsky at gmail.com Tue Sep 8 19:38:11 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Tue, 8 Sep 2015 13:38:11 -0400 Subject: [Datetime-SIG] Version check (was Re: PEP 495: What's left to resolve) In-Reply-To: References: Message-ID: On Tue, Sep 8, 2015 at 12:41 PM, Tim Peters wrote: > [Alex] > > ... > > Once you decide to use a post-PEP tzinfo, you have no choice but to test > > your software on the edge cases if you care about them. > > Which reminds me: the PEP should add a way for a post-495 tzinfo to > say it supplies post-495 semantics, so users can check whether they're > getting a tzinfo they require (if they need fold disambiguation) or > can't tolerate (if they need folds to be ignored for legacy reasons). 
> We may end up providing something like this, but I hope developing this mechanism can be left to the tzinfo implementers. (Which can as well be us, but in another PEP.) I am not sure a tzinfo object will need a persistent attribute rather than just a way to require specific capabilities at the construction time. For example, a hypothetical zoneinfo() constructor or a factory function can take a "fold_aware" boolean argument and let the user specify what kind of tzinfo is requested. It will then become a QOI issue of whether zoneinfo() supports both pre- and post-PEP semantics or not. Note that zoneinfo() providers may end up extending the tzinfo API to include queries such as give me all folds between year A and year B. The downside of a persistent run-time attribute that differentiate between pre-PEP and post-PEP tzinfos is that it may promote writing code that tries to cope with the presence of pre-PEP and post-PEP tzinfos in the same program. This is a recipe for a combinatorial disaster. Note that on top of pre-PEP/post-PEP distinction a good tzinfo() library will probably also supply a TZ database version. Imagine writing a simple "within(t, start, stop)" function that should account for the tree arguments possibly having different "fold_aware" attribute and different tzversion? > > It's not a change to the tzinfo API, but is a change to tzinfo semantics. > > I guess requiring a new `__version__ = 2` attribute would be OK. > I generally dislike "version" constants or attributes. My preferred solution would be to provide a generic PEP 495 compliant fromutc() in a tzinfo subclass and ask PEP 495 compliant implementations to derive from that. > > Or (preferably "and") add an optional `fold=None` argument to > .utcoffset() (by default, use the datetime's .fold attribute, else > use the passed value). I thought about this as an optimization. dt.utcoffset(fold=1) being an equivalent of dt.replace(fold=1).utcoffset() which avoids copying of the entire dt object into a temporary. I think this is a minor issue. I can go either way on this. > Then an obscure form of version-checking could > be done by seeing whether dt.utcoffset(fold=1) blows up. I would not add dt.utcoffset(fold=x) just for that and if we end up adding it for other reasons will probably consider such use a hack. > That's a > poor way to spell "check the version", but would at least allow > checking to see what would happen if `fold` changed without the > expense of creating new short-lived datetime objects. Yes, this is a good reason and since calling utcoffset() both ways will be typical for "careful" applications, I don't mind giving them some syntactic sugar for that. Yet again, this is not a "live or die" issue for PEP 495. -------------- next part -------------- An HTML attachment was scrubbed... URL: From alexander.belopolsky at gmail.com Tue Sep 8 19:43:51 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Tue, 8 Sep 2015 13:43:51 -0400 Subject: [Datetime-SIG] PEP 495: What's left to resolve In-Reply-To: References: Message-ID: On Tue, Sep 8, 2015 at 1:06 PM, Tim Peters wrote: > > def __eq__(self, other): > > n_self = self.replace(tzinfo=None) > > n_other = other.replace(tzinfo=None) > > if self.tzinfo is other.tzinfo: > > return n_self == n_other > > Well, that's infinite recursion - but I know what you mean ;-) > No. You've probably missed that n_ objects are naive and naive comparison is just your plain old fold-unaware compare-all-components -except-fold operation. 
> > > > u_self = n_self - self.utcoffset() > > v_self = n_self - self.replace(fold=(1-self.fold)).utcoffset() > > u_other = n_other - other.utcoffset() > > v_other = n_other - other.replace(fold=(1-self.fold)).utcoffset() > > return u_self == u_other == v_self == v_other > > More infinite recursion. ditto -------------- next part -------------- An HTML attachment was scrubbed... URL: From alexander.belopolsky at gmail.com Tue Sep 8 19:50:15 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Tue, 8 Sep 2015 13:50:15 -0400 Subject: [Datetime-SIG] PEP 495: What's left to resolve In-Reply-To: References: Message-ID: On Tue, Sep 8, 2015 at 1:06 PM, Tim Peters wrote: > > I note that we can add obvious optimizations for the common tzinfo is > > datetime.timezone.utc and isinstance(tzinfo, datetime.timezone) cases. > > Please no. Comparison is almost certainly almost always intrazone, > and .utcoffset() isn't called at all for intrazone comparisons. I don't understand this comment. Solution 3 does not change anything for the intrazone (self.tzinfo is other.tzinfo) comparisons. Are you just saying that a slowdown in interzone comparison is a welcome feature to discourage bad programming practices? Sorry, I have a few ideas on how to optimize Solution 3 __eq__ even without special-casing fixed-offset tzinfos. :-) -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.peters at gmail.com Tue Sep 8 19:50:22 2015 From: tim.peters at gmail.com (Tim Peters) Date: Tue, 8 Sep 2015 12:50:22 -0500 Subject: [Datetime-SIG] PEP 495: What's left to resolve In-Reply-To: References: Message-ID: [Alex] >>> def __eq__(self, other): >>> n_self = self.replace(tzinfo=None) >>> n_other = other.replace(tzinfo=None) >>> if self.tzinfo is other.tzinfo: >>> return n_self == n_other [Tim] >> Well, that's infinite recursion - but I know what you mean ;-) [Alox] > No. You've probably missed that n_ objects are naive and naive comparison > is just your plain old fold-unaware compare-all-components -except-fold > operation. I assumed you were showing an implementation of datetime.__eq__. Yes? In that case, `self` and `other` may both be naive on entry. Then the first two lines effectively make exactly copies of them. Since None is None, the `self.tzinfo is other.tzinfo` check succeeds, and so goes on to compare n_self to n_other - which are exact copies of the original inputs. Lather, rinse, repeat. >>> u_self = n_self - self.utcoffset() >>> v_self = n_self - self.replace(fold=(1-self.fold)).utcoffset() >>> u_other = n_other - other.utcoffset() >>> v_other = n_other - other.replace(fold=(1-self.fold)).utcoffset() >>> return u_self == u_other == v_self == v_other >> More infinite recursion. > ditto Ditto ;-) From alexander.belopolsky at gmail.com Tue Sep 8 19:55:06 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Tue, 8 Sep 2015 13:55:06 -0400 Subject: [Datetime-SIG] PEP 495: What's left to resolve In-Reply-To: References: Message-ID: On Tue, Sep 8, 2015 at 1:50 PM, Tim Peters wrote: > I assumed you were showing an implementation of datetime.__eq__. Yes? > In that case, `self` and `other` may both be naive on entry. Then > the first two lines effectively make exactly copies of them. Since > None is None, the `self.tzinfo is other.tzinfo` check succeeds, and so > goes on to compare n_self to n_other - which are exact copies of the > original inputs. Lather, rinse, repeat. > Got it. 
No, I was not concerned with the naive case - I assumed that it was magically fulfilled without calling this __eq__ method. If this idea passes a sniff test - I will implement it in my fork so that we can play with a working prototype. -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.peters at gmail.com Tue Sep 8 20:02:39 2015 From: tim.peters at gmail.com (Tim Peters) Date: Tue, 8 Sep 2015 13:02:39 -0500 Subject: [Datetime-SIG] PEP 495: What's left to resolve In-Reply-To: References: Message-ID: [Alex] >>> I note that we can add obvious optimizations for the common tzinfo is >>> datetime.timezone.utc and isinstance(tzinfo, datetime.timezone) cases. [Tim] >> Please no. Comparison is almost certainly almost always intrazone, >> and .utcoffset() isn't called at all for intrazone comparisons. [Alex] > I don't understand this comment. Solution 3 does not change anything for > the intrazone (self.tzinfo is other.tzinfo) comparisons. Right. The most important cases are already as fast as they were before. > Are you just saying that a slowdown in interzone comparison is a > welcome feature to discourage bad programming practices? Sorry, > I have a few ideas on how to optimize Solution 3 __eq__ even > without special-casing fixed-offset tzinfos. :-) Premature optimization is the root of all evil. You're proposing to add even more complication to the code _solely_ to speed up cases in which you're merely guessing it really will make a lick of difference to user code. And they'll "run slow" anyway, just not _as_ slow as possible. At the start it's always best to do the simplest thing that could possibly work without inflicting _obviously_ unreasonable pain. If it turns out it really does matter to someone, they'll file a report, and then's the time to think about semantically useless complications solely for speed. Every line of code is another chance for an error to sneak in, for maintainers to puzzle over after you're gone, etc. From alexander.belopolsky at gmail.com Tue Sep 8 20:19:08 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Tue, 8 Sep 2015 14:19:08 -0400 Subject: [Datetime-SIG] PEP 495: What's left to resolve In-Reply-To: References: Message-ID: On Tue, Sep 8, 2015 at 2:02 PM, Tim Peters wrote: > > Are you just saying that a slowdown in interzone comparison is a > > welcome feature to discourage bad programming practices? Sorry, > > I have a few ideas on how to optimize Solution 3 __eq__ even > > without special-casing fixed-offset tzinfos. :-) > > Premature optimization is the root of all evil. Agree 100%. > You're proposing to add even more complication to the code No, I actually think the code can be simpler (and without an infinite recursion.) In any case, it won't matter for the CPython users what we will ship in datetime.py, so I will write something that make the intent very clear. The type of optimization that I had in mind was that once you discover that self is in the fold/gap, you can return False without calling other.utcoffset(). The question is what is easier to understand: (a) t1 and t2 are equal if and only if t1 - t1.replace(fold=f1).utcoffset() == t2 - t2.replace(fold=f2).utcoffset() for all four possible pairs (f1, f2); or (b) t1 and t2 are equal if and only if they are unambiguous and valid in their respective zones and convert to the same UTC instant. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From tim.peters at gmail.com Tue Sep 8 21:13:45 2015 From: tim.peters at gmail.com (Tim Peters) Date: Tue, 8 Sep 2015 14:13:45 -0500 Subject: [Datetime-SIG] Version check (was Re: PEP 495: What's left to resolve) In-Reply-To: References: Message-ID: [Tim] >> Which reminds me: the PEP should add a way for a post-495 tzinfo to >> say it supplies post-495 semantics, so users can check whether they're >> getting a tzinfo they require (if they need fold disambiguation) or >> can't tolerate (if they need folds to be ignored for legacy reasons). [Alex] > We may end up providing something like this, but I hope developing this > mechanism can be left to the tzinfo implementers. Python defines the tzinfo API and minimal tzinfo semantics. If Python doesn't also resolve tzinfo discoverability issues created by its own new requirements. then tzinfo implementers will create a Tower of Babel. Far better for Python to define _the_ way to check whether a tzinfo implements 495 semantics. This should be nearly trivial, both to specify and for tzinfo authors to implement - unless we go out of our way to create complications that aren't _inherent_ to the problem at hand ("which version did I get?"). > (Which can as well be us but in another PEP.) Disagree. PEP 495 is _creating_ a new "discoverability" problem. So that's the place to fix it too, before it becomes a real problem. > I am not sure a tzinfo object will need a persistent attribute rather than > just a way to require specific capabilities at the construction time. "Tower of Babel" - Python has no business specifying how a tzinfo object "must be" obtained to begin with, and there are already multiple ways out in the field. But Python is requiring a change to semantics. Some tzinfo authors may choose to provide an explicit way to ask for PEP 495 semantics, while others may not, etc. User code needs a uniform way to ask whether what they get in the end meets their requirements. When their requirements depend only on things where Python itself changed its mind, it's Python's proper responsibility to give the user a way to tell which they got. > For example, a hypothetical zoneinfo() constructor or a > factory function can take a "fold_aware" boolean argument and let the user > specify what kind of tzinfo is requested. It will then become a QOI issue > of whether zoneinfo() supports both pre- and post-PEP semantics or not. Yes, Tower of Babel. There's no need to inflict this potential confusion on users. Just specify a way to check. that _all_ post-495 tzinfos must support. > Note that zoneinfo() providers may end up extending the tzinfo API to > include queries such as give me all folds between year A and year B. Different issue, because _Python_ isn't specifying anything about that. We can't do anything about Towers of Babel tzinfo authors choose to create on their own. We can do something about new semantics Python is forcing them to supply. BTW, I've never yet seen a tzinfo that supplied any functionality beyond the minimum required by the docs. > The downside of a persistent run-time attribute that differentiate between > pre-PEP and post-PEP tzinfos is that it may promote writing code that tries > to cope with the presence of pre-PEP and post-PEP tzinfos in the same > program. This is a recipe for a combinatorial disaster. If a user chooses to embrace that, that's on them. Far better to give them a uniform way to check the tzinfos they get so they can absolutely avoiding mixing pre-495 and post-495 tzinfos to begin with. 
> Note that on top of pre-PEP/post-PEP distinction a good tzinfo() library > will probably also supply a TZ database version. Imagine writing a > simple "within(t, start, stop)" function that should account for the > tree arguments possibly having different "fold_aware" attribute > and different tzversion? Again, how can a sane user ensure they're _not_ getting into a such a mess if they can't even ask "is this a pre- or post-495 tzinfo?" in a uniform way? Assume 495 is successful. Some general-purpose library code will be _passed_ datetimes with tzinfos it had nothing to do with creating, and general-purpose libraries can't assume more than the minimum the Python docs require. The library has no control at all over the tzinfos it sees, but may _need_ to know whether they're pre- or post-495. 495 can make that simple instead of nearly impossible. >> I guess requiring a new `__version__ = 2` attribute would be OK. > I generally dislike "version" constants or attributes. Me too, but far better than nothing. > My preferred solution would be to provide a generic PEP 495 compliant > fromutc() in a tzinfo subclass and ask PEP 495 compliant implementations > to derive from that. That would be fine, except it's no longer trivial - for us. It would be better to supply a new marker class in the stdlib a PEP 495 compliant tzinfo had to derive from, but whose .fromutc() _must_ be overridden. All the industrial-strength zone wrappings are dealing with databases for which overriding .fromutc() is by far the best approach anyway. So, if we wanted to be _useful_, it would do more good for more people if we supplied a horridly slow default .utcoffset() instead. But this is "creating complications that aren't _inherent_ to the problem at hand". And if this isn't the last change Python ever makes to tzinfo semantics, a plain integer version number is probably easier for most people to grasp and live with than a graph of marker classes anyway. >> Or (preferably "and") add an optional `fold=None` argument to >> .utcoffset() (by default, use the datetime's .fold attribute, else >> use the passed value). > I thought about this as an optimization. dt.utcoffset(fold=1) being an > equivalent of dt.replace(fold=1).utcoffset() which avoids copying of the > entire dt object into a temporary. I think this is a minor issue. I can go > either way on this. It's a poor way to do version-checking, so I shouldn't have mentioned it. Alas, Guido's time machine is tied up preventing by-magic interzone comparison from ever being implemented :-( From carl at oddbird.net Tue Sep 8 21:34:23 2015 From: carl at oddbird.net (Carl Meyer) Date: Tue, 8 Sep 2015 13:34:23 -0600 Subject: [Datetime-SIG] Timeline arithmetic? In-Reply-To: References: <55E9D6EB.2090108@oddbird.net> <55E9F626.1080906@oddbird.net> <55ECD82E.9070305@oddbird.net> <55EDB967.2050108@oddbird.net> <55EE233C.1020307@oddbird.net> Message-ID: <55EF383F.1020705@oddbird.net> > [Tim] >>>>> An aware datetime _is_ a >>>> tzinfo> pair, and there's a natural bijection between naive datetimes >>>>> and POSIX timestamps (across all instants both can represent). > > [Carl] >>>> I don't understand this, and I suspect it's at the heart of our >>>> misunderstanding. I would say there are many possible bijections .... > > [Tim] >>> "Natural" bijection. I gave you very simple Python code implementing >>> that bijection already. A naive datetime represents an instant in the >>> proleptic Gregorian calendar. > > [Carl] >> What is your definition of "instant" here? 
[Tim] > I didn't need one - Occam's Razor again ;-) To establish a bijection, > all that's required is to show that a proposed function meets all the > formal requirements... "Represents an > instance" was just vague English motivation for what followed. Of course. I never expressed any doubt that you had established _a_ bijection. It was the motivation I was trying to understand. >> I don't think a naive datetime represents an instant at all; > > Fine by me - and by Python. Also fine if you _never_ use a naive datetime. > >> it represents a range of possible instants, > > Heh - I see you haven't defined what _you_ mean by "instant". I already gave my definition earlier in this thread. It's borrowed from NodaTime/JodaTime: an instant is a unique and unambiguous point on a single global non-relativistic monotonic time line. Since I don't care about leap seconds, this definition is satisfied equally well for my purposes by a POSIX timestamp or a UTC datetime, among many other possible representations. I find this definition of instant _useful_ because it means that all instants, no matter their representations, are always convertible to integers on the same scale. That's not true of naive datetimes, without making an additional assumption of timezone. > When > you do, please be sure it's consistent with what POSIX says here too: > > The relationship between the actual time of day and the current > value for seconds since the Epoch is unspecified. > > How any changes to the value of seconds since the Epoch are > made to align to a desired relationship with the current actual time > is implementation-defined. As represented in seconds since the > Epoch, each and every day shall be accounted for by exactly > 86400 seconds. AFAICT that's just a bit of beating around the bush about not supporting leap seconds. I don't care :-) > While you're at it, define a clean model in which all that makes a > lick of sense to a casual user ;-) Actually, I think Model A _is_ such a clean model (if we can presume that the casual user in question also doesn't care about leap seconds or relativistic effects). I've taught many Python users how to use pytz, and my experience has been that the concept of a single global monotonic timeline, where all aware datetimes are simply variant spellings of some unambiguous point on that timeline, but (other than in their representation as a Gregorian date/time) behave the same no matter which timezone you spell them in, is quite easy to explain and grasp, even for people who've never worked with timezones before. Part of my dismay in this thread has been realizing now that I've mis-educated all these users about how datetime is really supposed to work :-) Like Stuart, I'm a bit concerned that a whole lot of pytz users are going to be very confused if or when they try to switch to PEP 495 style tzinfo's instead. I think in some ways Model B is really more powerful than Model A, because it lets you work in any number of different "local time" models, rather than requiring that you always work on the same single global timeline. And there are definitely cases where you need that. > The "so what?", in context, was to tweak Guido about saying an aware > datetime is fundamentally different from a pair, > despite that the space of such pairs is isomorphic to the space of > aware datetimes (which _is_ the space of > pairs) under the natural naive_datetime <-> timestamp bijection. > > Why is that setting _you_ off? Guido handled it just fine ;-) Heh. 
Just the urge to understand things, that's all. I'm just slower than Guido :-) but I think I get your point now; it was a narrower point than I'd realized. is only "fundamentally different" from in that they imply different mental models about what they are supposed to represent; mathematically they are no different. >> Under what circumstances is it reasonable to make that assumption >> about a naive datetime? > > Any use case where it's convenient That's up to the user. not me - > or you. For example, before Python grew its builtin > datetime.timezone.utc implementation of a UTC class, I routinely used > naive datetimes I thought of as being in UTC. I was too lazy to > remember where I hid my own UTC class. No problem. Sure, of course. As a pytz user, I'm forced to do the same thing (use naive datetimes and track an implied timezone separately) anytime I need to work in a "local clock time" model. >> Rather than saying "a naive datetime simply doesn't correspond to >> any particular POSIX timestamp; they aren't comparable at all unless >> you have additional information," which is what I'd say. > > I'm starting to suspect you didn't design datetime ;-) In context, I > was replying to Guido, who was talking about Python. In Python's > datetime, naive datetimes are comparable. Naive time has no > _concept_ of time zone. Naive datetimes nevertheless have a notion > of total order, which is isomorphic to the POSIX timestamp notion of > total order under the natural bijection. Likewise for arithmetic, > etc. There's nothing "wrong" about exploiting any of that when it's > convenient. This is simply a mis-understanding. I certainly do consider naive datetimes comparable to other naive datetimes, and I'm well aware (and glad) that Python does too. The referent of "they" above was "naive datetimes and POSIX timestamps." I don't consider those comparable _to each other_ unless you bring an additional assumption about the implied timezone of the naive datetime. And Python agrees. >> I mean, I certainly hope you wouldn't want datetime to make `utcdt - >> naivedt` a defined operation where it's assumed the naive datetime is UTC. > > Certainly not. That _would_ be wrong ;-) Violent agreement again in the end once again, then... Carl -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: OpenPGP digital signature URL: From alexander.belopolsky at gmail.com Tue Sep 8 21:45:16 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Tue, 8 Sep 2015 15:45:16 -0400 Subject: [Datetime-SIG] PEP 495: What's left to resolve In-Reply-To: References: Message-ID: On Tue, Sep 8, 2015 at 1:06 PM, Tim Peters wrote: > But I can never decide whether something really "fixes the hash > problem" without a lot more thought. > Let me try to outline a formal proof. Definitions: An aware datetime value t is called "regular" if t.utcoffset() does not depend on the value of the fold attribute. All other values are called "special". A binary relation "==" is defined by the following rules: (a) two special values s1 and s2 satisfy the "==" relation if they are the same (all component are equal) or they differ only by the value of fold; (b) for any special value s and regular value r, both r == s and s == r are False; and (c) for two regular values r1 and r2, r1 == r2 is equivalent to r1 - r1.utcoffset() and r2 - r2.utcoffset() having the same components. 
(Recall that according to PEP 495, dt - delta always has fold=0.) It will also be useful to define a "naive" equivalence: t1 ~ t2 if t1.tzinfo is t2.tzinfo and all their components except fold (year through microseconds) are equal. We will assume that ~ being an equivalence relation is well known. Lemma: The "==" relation defined above is an equivalence relation. Proof: We need to prove reflexivity (t == t for any t), symmetry (t1 == t2 => t2 == t1) and transitivity (t1 == t2 and t2 == t3 implies t1 == t3). Note that because of rule (b) it is enough to prove that == is equivalence separately for regular and special values. The complete proof is a rather tedious analysis of six propositions: three properties for each regular/special case. I'll present the two least trivial ones. 1. Let's show that == is transitive on the regular datetimes. Indeed, let r1, r2 and r3 are regular datetimes and o1, o2, and o3 are their utcoffset() values. Then r1 == r2 and r2 == r3 implies that r1 - o1 ~ r2 - o2 and r2 - o2 ~ r3 - o3, which in turn implies that r1 - o1 ~ r3 - o3 by transitivity of ~, which in turn implies r1 == r3 by transitivity of ~. QED. 2. Let's show that == is transitive on the special datetimes. This case is even simpler because s1 == s2 implies s1 ~ s2 (s1 and s2 differ only by fold), s2 == s3 implies s2 ~ s3 and thus s1 ~ s3 by transitivity of ~ and s1 == s3 by rule (a). Lemma: A function that is constant on equivalence classes satisfies the hash invariant. Proof: This is a tautology. Proposition: newhash(t) = oldhash(t.replace(fold=0)) satisfies the hash invariant. Proof: If t is special, its equivalence class consists of itself and a value with the complement value of fold. Since we force fold=0 before computing the hash values, it is trivially the same for both values in the same class. If t is regular, since oldhash is defined as a hash of t - t.utcoffset() components, the hash values of r1 and r2 are equal if r1 - r1.utcoffset() ~ r2 - r2.utcoffset() which follows from r1 == r2 by rule (c). > > So far, so good :-) > Except for headache. :-) -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.peters at gmail.com Tue Sep 8 23:22:28 2015 From: tim.peters at gmail.com (Tim Peters) Date: Tue, 8 Sep 2015 16:22:28 -0500 Subject: [Datetime-SIG] PEP 495: What's left to resolve In-Reply-To: References: Message-ID: [Guido] >> But it breaks compatibility: it breaks the rule that for fold=0 nothing >> changes. [Alex] > It preserves a "weak form" of compatibility: nothing changes in the behavior > of aware datetime objects unless they use a post-PEP tzinfo. In that specific way, it's "more backward compatible" than the "special-case-the-snot-out-of-fold=1 in interzone __eq__ and __ne__", but it's subtle: Some datetime class constructors, like .now() and today(), can set fold=1 even if no post-495 tzinfos exist, based on Python's own idea of what the system zone rules are. If one of those happens to be generated during a repeated time in the system zone, and a pre-495 tzinfo is attached, then special-casing fold=1 makes it "not equal" to anything in any other zone despite that a pre-495 tzinfo is in use. That's certainly breaking _some_ form of backward compatibility, however obscure. But under Alex's latest idea, that wouldn't break: the pre-495 tzinfo's .utcoffset() would return the same thing regardless of `fold`, so the new __eq__ wouldn't see any problem with it. 
The latest idea is based on determining whether a time is _really_ "a problem case", and to a pre-495 tzinfo nothing is. Just staring at `fold` without consulting the tzinfo is guessing at whether it _might_ be a real problem for the tzinfo in use, and in fact always guesses wrong when fold=1 and a pre-495 tzinfo is in use. > Note that Solution 2 also breaks a "strong form" of compatibility (nothing > changes unless fold=1) because pre-PEP tzinfos are supposed to interpret > times in the fold as STD (fold=1). Note that in my experience very few > tzinfo developers understand this requirement and with a run-of-the-mill > tzinfo you have a 50/50 chance that it will interpret ambiguous times as > fold=0 or fold=1. Well, if they copied the Python doc examples, they got this "right". If they're using dateutil's wrappings, they also got this right. And it's a non-issue in pytz, because that only ever uses fixed-offset classes. The three users who remain will just have to eat their own hasty cooking ;-) > ... > Once you decide to use a post-PEP tzinfo, you have no choice but to test > your software on the edge cases if you care about them. (And you probably > do if you bother to switch to a post-PEP tzinfo.) If you don't care about > edge cases, you can continue using pre-PEP tzinfos or switch and accept a > more consistent but different edge case behavior. Yup! The new idea is cleaner and clearer. But runs slower ;-) From rosuav at gmail.com Wed Sep 9 03:49:15 2015 From: rosuav at gmail.com (Chris Angelico) Date: Wed, 9 Sep 2015 11:49:15 +1000 Subject: [Datetime-SIG] PEP 495: What's left to resolve In-Reply-To: References: Message-ID: On Wed, Sep 9, 2015 at 5:45 AM, Alexander Belopolsky wrote: > Definitions: An aware datetime value t is called "regular" if t.utcoffset() > does not depend on the value of the fold attribute. One point to clarify here. Is the definition of "regular" based on the timezone alone (that is to say, a UTC datetime is regular, and an Australia/Brisbane datetime is regular, but anything in a region with DST is always special), or are "special" datetimes only those in the fold period? The former is easily identified. As the zoneinfo file is parsed, it'll be obvious which ones can ever have times that differ only in fold, and they get flagged as "special". The check is simple - ask the timezone object whether it's regular or special. The latter, perhaps not so much. Given a particular datetime, can you easily and reliably ascertain whether or not there is any other section of time which can "look like" this one? Maybe I've missed something, having been skimming rather than reading every post in detail. (There have been rather a lot of them, and here I am making that worse...) ChrisA From alexander.belopolsky at gmail.com Wed Sep 9 04:02:42 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Tue, 8 Sep 2015 22:02:42 -0400 Subject: [Datetime-SIG] PEP 495: What's left to resolve In-Reply-To: References: Message-ID: On Tue, Sep 8, 2015 at 9:49 PM, Chris Angelico wrote: > On Wed, Sep 9, 2015 at 5:45 AM, Alexander Belopolsky > wrote: > > Definitions: An aware datetime value t is called "regular" if > t.utcoffset() > > does not depend on the value of the fold attribute. > > One point to clarify here. 
Is the definition of "regular" based on the > timezone alone (that is to say, a UTC datetime is regular, and an > Australia/Brisbane datetime is regular, but anything in a region with > DST is always special), or are "special" datetimes only those in the > fold period? It is what the definition says. If you want to know whether t is regular you have to compare t.utcoffset() and t.replace(fold=1-t.fold).utcoffset(). If they are the same, t is regular. If not - t is special. If tzinfo is a fixed offset timezone, all times with such tzinfo are regular. If tzinfo is a typical DST observing timezone, then times in the fold and in the gap are special and the rest are regular. -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.peters at gmail.com Wed Sep 9 04:10:49 2015 From: tim.peters at gmail.com (Tim Peters) Date: Tue, 8 Sep 2015 21:10:49 -0500 Subject: [Datetime-SIG] PEP 495: What's left to resolve In-Reply-To: References: Message-ID: [Alex] >> Definitions: An aware datetime value t is called "regular" if t.utcoffset() >> does not depend on the value of the fold attribute. [Chris Angelico] > One point to clarify here. Is the definition of "regular" based on the > timezone alone (that is to say, a UTC datetime is regular, and an > Australia/Brisbane datetime is regular, but anything in a region with > DST is always special), or are "special" datetimes only those in the > fold period? It applies to "an aware datetime value t". That's clear already ;-) Everything about `t` matters. In plain English `t` is "regular" if and only if `t` is in neither a fold nor a gap. So, e.g., all `t` in UTC are regular. In most zones with a notion of DST, there are exactly 2 wall-clock hours per year that are not regular (in the gap at the start of DST, and in the fold at DST end). > The former is easily identified. As the zoneinfo file is parsed, it'll > be obvious which ones can ever have times that differ only in fold, > and they get flagged as "special". The check is simple - ask the > timezone object whether it's regular or special. What's actually needed isn't that simple. > The latter, perhaps not so much. Given a particular datetime, can you > easily and reliably ascertain whether or not there is any other > section of time which can "look like" this one? Impossible to answer "easily" without knowing all the details of a specific tzinfo's internal data representation. For, e.g., a timezone _defined_ by a POSIX TZ rule, it's trivial, since those explicitly spell out the "problem hours" as local wall-clock times. For a tzfile, I posted pseudo-code a while back showing how to determine whether a UTC time corresponds to a fold in the zone, using a few simple calculations after doing a binary search across the zone's transition list to locate where the input UTC time belongs. A fold exists if and only if the current total UTC offset is less than the previous transition's total UTC offset. The opposite for a gap. However, this may mishandle cases (if any exist - I don't know) where consecutive transitions have exactly the same total UTC offset. So there are details left to flesh out, but it's conceptually easy enough ;-) For other zone sources, who knows? 
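Pulling Alex's check and Tim's fold-versus-gap observation together, here
is a small sketch of the classification.  The function name and string
results are purely illustrative (no such method exists in datetime), and
it assumes a post-495 tzinfo whose utcoffset() honors fold:

def classify(t):
    # 'regular' if t's UTC offset does not depend on fold (Alex's test);
    # otherwise decide fold vs. gap from how the offset changes.
    off_this = t.utcoffset()
    off_other = t.replace(fold=1 - t.fold).utcoffset()
    if off_this == off_other:
        return 'regular'
    # Per PEP 495, fold=0 gives the offset in force before the
    # transition and fold=1 the offset in force after it.
    off_before = off_this if t.fold == 0 else off_other
    off_after = off_other if t.fold == 0 else off_this
    # The clock is set back in a fold (total offset shrinks) and set
    # ahead in a gap (total offset grows) - the same test as in the
    # tzfile pseudo-code mentioned above, phrased through utcoffset().
    return 'fold' if off_after < off_before else 'gap'

With a post-495 US/Eastern implementation, for example, 01:30 on the
morning the clocks fall back would classify as 'fold', and 02:30 on the
morning they spring forward as 'gap'.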
From rosuav at gmail.com Wed Sep 9 04:55:17 2015 From: rosuav at gmail.com (Chris Angelico) Date: Wed, 9 Sep 2015 12:55:17 +1000 Subject: [Datetime-SIG] PEP 495: What's left to resolve In-Reply-To: References: Message-ID: On Wed, Sep 9, 2015 at 12:10 PM, Tim Peters wrote: > [Alex] >>> Definitions: An aware datetime value t is called "regular" if t.utcoffset() >>> does not depend on the value of the fold attribute. > > [Chris Angelico] >> One point to clarify here. Is the definition of "regular" based on the >> timezone alone (that is to say, a UTC datetime is regular, and an >> Australia/Brisbane datetime is regular, but anything in a region with >> DST is always special), or are "special" datetimes only those in the >> fold period? > > It applies to "an aware datetime value t". That's clear already ;-) > Everything about `t` matters. In plain English `t` is "regular" if > and only if `t` is in neither a fold nor a gap. So, e.g., all `t` in > UTC are regular. In most zones with a notion of DST, there are > exactly 2 wall-clock hours per year that are not regular (in the gap > at the start of DST, and in the fold at DST end). Okay, that's what I thought it meant. And it's easy enough to see if two datetimes differ only in fold. The problem I was seeing was a difficulty in recognizing whether a single datetime is special or not, which is answered here: On Wed, Sep 9, 2015 at 12:02 PM, Alexander Belopolsky wrote: > If you want to know whether t is regular you have to compare t.utcoffset() > and t.replace(fold=1-t.fold).utcoffset(). If they are the same, t is > regular. If not - t is special. Thanks Alex! (I can imagine pushing this to the timezone object as a primitive, which will allow it to be optimized down to "t is regular" for timezones that are always regular, but that's an optimization only.) ChrisA From tim.peters at gmail.com Wed Sep 9 05:10:56 2015 From: tim.peters at gmail.com (Tim Peters) Date: Tue, 8 Sep 2015 22:10:56 -0500 Subject: [Datetime-SIG] PEP 495: What's left to resolve In-Reply-To: References: Message-ID: [Alex] >> If you want to know whether t is regular you have to compare t.utcoffset() >> and t.replace(fold=1-t.fold).utcoffset(). If they are the same, t is >> regular. If not - t is special. [Chris] > (I can imagine pushing this to the timezone object as a primitive, Hey, I'm listed as a PEP co-author, and even I can't get Alex to budge on adding my utterly sensible new ".classify()" tzinfo method ;-) Instead zone-wrapping tzinfo authors will likely write one anyway for their internal use, but not expose it (e.g., a tzinfo's .fromutc() needs to compute "is this in a fold? if so, earlier or later time?:" each time it's called - and .utcoffset() needs to worry about both folds and gaps on each call). > which will allow it to be optimized down to "t is regular" for > timezones that are always regular, but that's an optimization only.) Any sensible wrapping of a fixed-offset ("always regular") zone will have a 1-line .utcoffset() implementation, simply returning that zone's constant offset. It will be cheap enough. From tim.peters at gmail.com Wed Sep 9 06:25:08 2015 From: tim.peters at gmail.com (Tim Peters) Date: Tue, 8 Sep 2015 23:25:08 -0500 Subject: [Datetime-SIG] PEP 495: What's left to resolve In-Reply-To: References: Message-ID: [Alex] > ... 
> The question is what is easier to understand: (a) t1 and t2 are equal if and > only if t1 - t1.replace(fold=f1).utcoffset() == t2 - > t2.replace(fold=f2).utcoffset() for all four possible pairs (f1, f2); Infinite recursion again ;-) , but this time because interzone equality is being defined in terms of 4 more interzone equalities. > or (b) > t1 and t2 are equal if and only if they are unambiguous and valid in their > respective zones and convert to the same UTC instant. The docs generally do both, when feasible: an English description, followed by a Python expression to resolve the inherent imprecision of English. The intent is for the English to give the high-order bits, and for the Python expression to leave no possible misunderstanding. For this case, the clearest Python I can think of is: def toutc(t, fold): return (t - t.replace(fold=fold).utcoffset()).replace(tzinfo=None) Then t1 == t2, when t1 and t2 are aware datetimes in different zones, if and only if: toutc(t1, 0) == toutc(t1, 1) == toutc(t2, 0) == toutc(t2, 1) Then there's no English remaining to be misread. As a side benefit, the correctness of short-circuiting if t1 is a problem case becomes dead obvious on the face of it ;-) From ischwabacher at wisc.edu Wed Sep 9 07:18:33 2015 From: ischwabacher at wisc.edu (Isaac J Schwabacher) Date: Wed, 09 Sep 2015 05:18:33 +0000 Subject: [Datetime-SIG] Another round on error-checking In-Reply-To: References: Message-ID: I stop following for the week and the world goes mad. I've lost count of the number of times I've thought, "Are you out of your *mind*!?" while reading this thread. You actually considered breaking the __hash__ invariant? [Guido] > > I could not accept a PEP that leads to different datetime being considered > > == but having a different hash (*unless* due to a buggy tzinfo subclass > > implementation -- however no historical timezone data should ever depend on > > such a bug). > > > > I'm much less concerned about < being intransitive in edge cases. [Tim] > Offhand I don't know whether it can be (probably). The case I > stumbled into yesterday showed that equality ("==") could be > intransitive: > > assert a == b == c == d and a < d > > While initially jarring, I called it a "minor wart", because the > middle "==" there is working in classic arithmetic but the other two > are working in timeline arithmetic. But _a_ wart all the same, since > transitivity doesn't fail today. I'm assuming that the moment of temporary insanity has passed and you consider the __hash__ invariant to be sacrosanct. The problem here is that someone (Alexander, I think?) demonstrated a method of producing a tzinfo class and b and c to make this true, *given arbitrary a and d*. Equality may not be transitive, but equality of hashes is, which means that __hash__ must be constant over equivalence classes in the transitive closure of the relation defined by __eq__. In this case, this boils down to "if __hash__ ignores fold, all datetime objects must have the same hash". I imagine the performance implications of this are not acceptable. There is no satisfactory way of weaseling out of this; datetime equality is timeline equality now and forever, unless you're willing to give up one of backward compatibility, the __hash__ invariant, or the ability to implement new tzinfo classes. (The tzinfo in the example was contrived but not buggy.) > > I also don't particularly care about == following from the difference being zero. 
> > Still, unless we're constrained by backward compatibility, I would rather > > not add equivalence between *any* two datetimes whose tzinfo is not the same > > object -- even if we can infer that they both must refer to the same > > instant. > > Assuming "equivalent" means "compare equal", we're highly constrained. > For datetimes x and y with distinct non-None tzinfos, it's always been > the case that: > > 1. x-y effectively converted both to UTC before subtraction. > > 2. comparison effectively interpreted x-y as a __cmp__ result > 2a. various comparison transitivities essentially followed from that > > 3. Because of #2, to maintain __hash__'s contract datetime.__hash__ > also effectively converted to UTC before hashing > > All of that would (well, "should") continue to work fine, except that > fold=1 is being ignored in intrazone arithmetic (subtraction and > comparisons) and by hash(). Maybe there are other surprises. I just > happened to notice the hash() problem, and equality intransitivity, > both yesterday. via thought experiments. > > On the face of it, it's a conceptual mess to try to make fold=1 "mean > something" in some contexts but not in others. In particular, > arithmetic, comparison, and hashing are usually deeply interrelated, > and have been in datetime so far. Ignoring `fold` in single-zone > arithmetic, comparisons and hashing works fine (in "naive time", where > `fold` is senseless), but when going across zones `fold` cannot be > ignored. > > That's a huge problem for hash(), because it can have no idea whether > the pattern of later equality comparisons relying on hash results > _will_ be using classic or timeline rules (or a mix of both). > > That didn't matter before, because _a_ unique UTC equivalent always > existed (the possibility of ambiguous times was effectively ignored). > > Now it does matter, because the UTC equivalent can differ depending on > the `fold` value. Ignoring it sometimes but not others leads to the > current quandary. The last time I made an argument like this, Guido called me the *very loyal* opposition. :) ijs From tim.peters at gmail.com Wed Sep 9 08:34:48 2015 From: tim.peters at gmail.com (Tim Peters) Date: Wed, 9 Sep 2015 01:34:48 -0500 Subject: [Datetime-SIG] Another round on error-checking In-Reply-To: References: Message-ID: [ijs] > I stop following for the week and the world goes mad. I've > lost count of the number of times I've thought, "Are you > out of your *mind*!?" while reading this thread. You actually > considered breaking the __hash__ invariant? It went unnoticed for some time that the original PEP 495 _did_ break it. Not intentionally. "Unintended consequence." Alex resisted accepting that it was a fatal problem at first, but was converted to One Of Us after a single night's intense torture ;-) ... > I'm assuming that the moment of temporary insanity has > passed and you consider the __hash__ invariant to be sacrosanct. Of course! > The problem here is that someone (Alexander, I think?) > demonstrated a method of producing a tzinfo class and b > and c to make this true, *given arbitrary a and d*. Equality > may not be transitive, but equality of hashes is, which > means that __hash__ must be constant over equivalence > classes in the transitive closure of the relation defined by > __eq__. In this case, this boils down to "if __hash__ ignores > fold, all datetime objects must have the same hash". 
Alex also sketched an approach to constructing a far higher-quality hash (than a constant function), but it required having, in advance (of the first hash() call), all tzinfos that could possibly be used across a program's run. For example, if we knew in advance there was only one possible non-fixed-offset zone Z, hash(x) could convert x to zone Z. then convert the result of that (ignoring its `fold`) to a timestamp (as a timedelta object) relative to 0001-01-01 00:00:00 in Z, then hash the timestamp. Then all spellings in all zones of one of the times in a Z fold would have the same hash. It's clever, but can't see a way to make it practical. There's nothing, e.g., to stop code from building a brand new tzinfo as a big string containing Python code, and compiling the string at runtime. > I imagine the performance implications of this are not acceptable. Heh. We could try a constant hash function and see whether anyone noticed. That would be fun :-) > There is no satisfactory way of weaseling out of this; _Something_ has to give, yes. "Satisfactory" is Guido's call. Weaseling is our job. I already did a small test to convince myself people _would_ notice if we removed dicts from the language. They're the real source of this problem ;-) > datetime equality is timeline equality now and forever, unless > you're willing to give up one of backward compatibility, the > __hash__ invariant, or the ability to implement new tzinfo classes. > (The tzinfo in the example was contrived but not buggy.) No tzinfo contrivance is necessary. The hash problem in the original PEP could be provoked using any zone whatsoever in which there's a fold (like, say, US/Eastern). I think you have in mind part of Alex's sketch of a better-than-constant hash, where zones were indeed contrived just to illustrate how nasty it _could_ get. Guido is least fond of by-magic interzone comparison, and that's what we've been picking on. All worm-arounds so far would sacrifice trichotomy in some (or all) cases of "problem times", by declaring that some problem times wouldn't compare equal to any datetime in any other zone. In the latest version of that, there would be no change to comparison results so long as pre-495 tzinfos were used. If you started to use post-495 tzinfos, that's your choice: then you get by-magic `fold` set correctly in all cases, correct zone conversions in all cases, and correct by-magic interzone subtraction in all cases - at the cost of living with that all problem times (whether in a gap or a fold) would compare "not equal" to all datetimes in all other zones. My own code couldn't care less (I've never used an interzone comparison outside of lines in datetime's test suite). You _could_ still compare them, but you'd either have to convert to a zone in which they were not problem times (timezone.utc would always work for this) first, or use by-magic interzone subtraction and check the sign of the result. So, given that a user would have to "do something" to have even the possibility of suffering a surprise that will probably never happen in their life, "not satisfactory" isn't a slam dunk. Luckily, PEP 20 is crystal clear about the right decision in this case. 
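The two escape hatches mentioned above are easy to spell.  A minimal
sketch, with helper names invented here:

from datetime import timedelta, timezone

def same_instant(t1, t2):
    # Convert both to UTC first - UTC has no folds or gaps, so this
    # sidesteps the "problem time" special cases entirely.
    return t1.astimezone(timezone.utc) == t2.astimezone(timezone.utc)

def instant_cmp(t1, t2):
    # Or rely on by-magic interzone subtraction and look at the sign.
    diff = t1 - t2
    return (diff > timedelta(0)) - (diff < timedelta(0))  # -1, 0 or 1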
From alexander.belopolsky at gmail.com Wed Sep 9 17:44:40 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Wed, 9 Sep 2015 11:44:40 -0400 Subject: [Datetime-SIG] PEP 495: What's left to resolve In-Reply-To: References: Message-ID: On Tue, Sep 8, 2015 at 3:59 AM, Alexander Belopolsky < alexander.belopolsky at gmail.com> wrote: > On Mon, Sep 7, 2015 at 9:57 PM, Alexander Belopolsky < > alexander.belopolsky at gmail.com> wrote: > >> Solution 1: Make t1 > t0. >> >> Solution 2: Leave t1 == t0, but make t1 != u1. >> > > Solution 3: Leave t1 == t0, but make *both* t0 != u0 and t1 != u1 if > t0.utcoffset() != t1.utcoffset(). I've implemented [1] Solution 3 in my Github fork. [1]: https://github.com/abalkin/cpython/commit/aac301abe89cad2d65633df98764e5b5704f7629 -------------- next part -------------- An HTML attachment was scrubbed... URL: From berker.peksag at gmail.com Wed Sep 9 17:49:52 2015 From: berker.peksag at gmail.com (=?UTF-8?Q?Berker_Peksa=C4=9F?=) Date: Wed, 9 Sep 2015 18:49:52 +0300 Subject: [Datetime-SIG] Making dt parameter of timezone.tzname(dt) optional Message-ID: The idea was came up when I reviewed issue 22241 [1] and Alexander said "This is a reasonable request": http://bugs.python.org/review/22241/ Currently, we have tests like (see Lib/test/datetimetester.py) self.assertEqual('UTC', timezone.utc.tzname(None)) self.assertEqual('UTC', timezone(ZERO).tzname(None)) self.assertEqual('UTC-05:00', timezone(-5 * HOUR).tzname(None)) self.assertEqual('UTC+09:30', timezone(9.5 * HOUR).tzname(None)) Can we just make dt optional and set its default value to None in Python 3.6? So timezone.utc.tzname(None) and timezone.utc.tzname() will both return "UTC". It's a small change, but I think it will make the API cleaner. --Berker [1] http://bugs.python.org/issue22241 From guido at python.org Wed Sep 9 18:19:09 2015 From: guido at python.org (Guido van Rossum) Date: Wed, 9 Sep 2015 09:19:09 -0700 Subject: [Datetime-SIG] Making dt parameter of timezone.tzname(dt) optional In-Reply-To: References: Message-ID: +1, just submit a patch and mark it for 3.6. On Wed, Sep 9, 2015 at 8:49 AM, Berker Peksa? wrote: > The idea was came up when I reviewed issue 22241 [1] and Alexander > said "This is a reasonable request": > > http://bugs.python.org/review/22241/ > > Currently, we have tests like (see Lib/test/datetimetester.py) > > self.assertEqual('UTC', timezone.utc.tzname(None)) > self.assertEqual('UTC', timezone(ZERO).tzname(None)) > self.assertEqual('UTC-05:00', timezone(-5 * HOUR).tzname(None)) > self.assertEqual('UTC+09:30', timezone(9.5 * HOUR).tzname(None)) > > Can we just make dt optional and set its default value to None in Python > 3.6? So > > timezone.utc.tzname(None) and timezone.utc.tzname() > > will both return "UTC". It's a small change, but I think it will make > the API cleaner. > > --Berker > > [1] http://bugs.python.org/issue22241 > _______________________________________________ > Datetime-SIG mailing list > Datetime-SIG at python.org > https://mail.python.org/mailman/listinfo/datetime-sig > The PSF Code of Conduct applies to this mailing list: > https://www.python.org/psf/codeofconduct/ > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... 
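The change proposed above is small.  As a rough sketch of the idea - a toy
fixed-offset class standing in for datetime.timezone, not the actual
patch:

from datetime import timedelta, tzinfo

class FixedOffset(tzinfo):
    # Toy stand-in for datetime.timezone, showing dt defaulting to None.
    def __init__(self, offset, name):
        self._offset = offset
        self._name = name

    def utcoffset(self, dt=None):
        return self._offset

    def tzname(self, dt=None):
        # A fixed-offset zone's name never depends on dt, so requiring
        # the argument buys nothing here.
        return self._name

    def dst(self, dt=None):
        return timedelta(0)

# FixedOffset(timedelta(hours=-5), 'UTC-05:00').tzname() == 'UTC-05:00'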
URL: From tim.peters at gmail.com Wed Sep 9 18:33:28 2015 From: tim.peters at gmail.com (Tim Peters) Date: Wed, 9 Sep 2015 11:33:28 -0500 Subject: [Datetime-SIG] Making dt parameter of timezone.tzname(dt) optional In-Reply-To: References: Message-ID: [Berker Peksa? ] > The idea was came up when I reviewed issue 22241 [1] and Alexander > said "This is a reasonable request": > > http://bugs.python.org/review/22241/ > > Currently, we have tests like (see Lib/test/datetimetester.py) > > self.assertEqual('UTC', timezone.utc.tzname(None)) > self.assertEqual('UTC', timezone(ZERO).tzname(None)) > self.assertEqual('UTC-05:00', timezone(-5 * HOUR).tzname(None)) > self.assertEqual('UTC+09:30', timezone(9.5 * HOUR).tzname(None)) > > Can we just make dt optional and set its default value to None in Python 3.6? So > > timezone.utc.tzname(None) and timezone.utc.tzname() > > will both return "UTC". It's a small change, but I think it will make > the API cleaner. > > --Berker > > [1] http://bugs.python.org/issue22241 +0. The base (tzinfo) class requires the datetime argument because, in general, a zone's name depends on the datetime (like "is it in the zone's "daylight" time"?). A subclass (like `timezone`) is free to override it to remove that requirement, but then code relying on the simplification is also relying on that it will only ever see instances of that subclass. General code can't make that assumption and get away with it. The lines from the test suite can't possibly ever see anything except the exact instance each is testing, so can't ever suffer that kind of problem. But it's also no real burden to add a few "None"s in the test suite. Snippets from test suites rarely make compelling examples either way - they're so very specific to the tiny bit of behavior they're probing. From alexander.belopolsky at gmail.com Wed Sep 9 19:24:04 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Wed, 9 Sep 2015 13:24:04 -0400 Subject: [Datetime-SIG] Making dt parameter of timezone.tzname(dt) optional In-Reply-To: References: Message-ID: On Wed, Sep 9, 2015 at 12:33 PM, Tim Peters wrote: > +0. The base (tzinfo) class requires the datetime argument because, > in general, a zone's name depends on the datetime (like "is it in the > zone's "daylight" time"?). > I was thinking of returning the "zoneinfo" name such as America/New_York in this case. This would end the debate about what is the "proper" timezone name: if you know the date and time - you can get a specific EST/EDT abbreviation. If not - you'll just get whatever the zoneinfo calls itself. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ethan at stoneleaf.us Wed Sep 9 19:38:17 2015 From: ethan at stoneleaf.us (Ethan Furman) Date: Wed, 09 Sep 2015 10:38:17 -0700 Subject: [Datetime-SIG] Making dt parameter of timezone.tzname(dt) optional In-Reply-To: References: Message-ID: <55F06E89.7050805@stoneleaf.us> On 09/09/2015 10:24 AM, Alexander Belopolsky wrote: > On Wed, Sep 9, 2015 at 12:33 PM, Tim Peters wrote: >> >> +0. The base (tzinfo) class requires the datetime argument because, >> in general, a zone's name depends on the datetime (like "is it in the >> zone's "daylight" time"?). > > I was thinking of returning the "zoneinfo" name such as America/New_York > in this case. This would end the debate about what is the "proper" > timezone name: if you know the date and time - you can get a specific > EST/EDT abbreviation. If not - you'll just get whatever the zoneinfo > calls itself. 
+1 -- ~Ethan~ From guido at python.org Wed Sep 9 19:43:20 2015 From: guido at python.org (Guido van Rossum) Date: Wed, 9 Sep 2015 10:43:20 -0700 Subject: [Datetime-SIG] Making dt parameter of timezone.tzname(dt) optional In-Reply-To: References: Message-ID: On Wed, Sep 9, 2015 at 10:24 AM, Alexander Belopolsky < alexander.belopolsky at gmail.com> wrote: > > On Wed, Sep 9, 2015 at 12:33 PM, Tim Peters wrote: > >> +0. The base (tzinfo) class requires the datetime argument because, >> in general, a zone's name depends on the datetime (like "is it in the >> zone's "daylight" time"?). >> > > I was thinking of returning the "zoneinfo" name such as America/New_York > in this case. This would end the debate about what is the "proper" > timezone name: if you know the date and time - you can get a specific > EST/EDT abbreviation. If not - you'll just get whatever the zoneinfo calls > itself. > But that's not directly related to the proposal, is it? The proposal is to treat tz.tzname() the same as tz.tzname(None) -- not to give the former a different meaning. -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From alexander.belopolsky at gmail.com Wed Sep 9 19:51:24 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Wed, 9 Sep 2015 13:51:24 -0400 Subject: [Datetime-SIG] Making dt parameter of timezone.tzname(dt) optional In-Reply-To: References: Message-ID: On Wed, Sep 9, 2015 at 1:43 PM, Guido van Rossum wrote: > On Wed, Sep 9, 2015 at 10:24 AM, Alexander Belopolsky < > alexander.belopolsky at gmail.com> wrote: > >> >> On Wed, Sep 9, 2015 at 12:33 PM, Tim Peters wrote: >> >>> +0. The base (tzinfo) class requires the datetime argument because, >>> in general, a zone's name depends on the datetime (like "is it in the >>> zone's "daylight" time"?). >>> >> >> I was thinking of returning the "zoneinfo" name such as America/New_York >> in this case. This would end the debate about what is the "proper" >> timezone name: if you know the date and time - you can get a specific >> EST/EDT abbreviation. If not - you'll just get whatever the zoneinfo calls >> itself. >> > > But that's not directly related to the proposal, is it? The proposal is to > treat tz.tzname() the same as tz.tzname(None) -- not to give the former a > different meaning. > > Right. That's an independent proposal. I was mostly responding to Tim's comment. -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.peters at gmail.com Wed Sep 9 19:58:54 2015 From: tim.peters at gmail.com (Tim Peters) Date: Wed, 9 Sep 2015 12:58:54 -0500 Subject: [Datetime-SIG] Making dt parameter of timezone.tzname(dt) optional In-Reply-To: References: Message-ID: [Tim] >> +0. The base (tzinfo) class requires the datetime argument because, >> in general, a zone's name depends on the datetime (like "is it in the >> zone's "daylight" time"?). [Alex] > I was thinking of returning the "zoneinfo" name such as America/New_York in > this case. This would end the debate about what is the "proper" timezone > name: if you know the date and time - you can get a specific EST/EDT > abbreviation. If not - you'll just get whatever the zoneinfo calls itself. That's fine, and even desirable ;-) Just saying it's too late to change that the _base_ trzinfo class has always had a documented requirement for a datetime argument to tzinfo.tzname(). General code slinging trzinfos can only assume what's promised, and must supply what's required, by the base class. 
Subclasses are free to promise more (but not less) and/or require less (but not more), and code is free to rely on that, but such code is no longer general. Since that's just a _potential_ problem, and Python is for consenting adults, +0 on the original proposal (doesn't really matter to me either way, but I have a mild preference for allowing a simplification ("require less") in the `timezone` subclass). From alexander.belopolsky at gmail.com Wed Sep 9 20:30:21 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Wed, 9 Sep 2015 14:30:21 -0400 Subject: [Datetime-SIG] Making dt parameter of timezone.tzname(dt) optional In-Reply-To: References: Message-ID: On Wed, Sep 9, 2015 at 1:58 PM, Tim Peters wrote: > +0 on the original proposal (doesn't really matter to me > either way, but I have a mild preference for allowing a simplification > ("require less") in the `timezone` subclass). > What would you say for the following proposal: leave tzinfo.tzname() signature as is, but add def name(self, dt=None): return self.tzname(dt) to the base tzinfo class. Now `tzname()` is a hook for tzinfo implementers, but name() is the higher level function for the users. (Note that I never liked that datetime.tzname() and tzinfo.tzname() had the same method name, so my proposal may reflect a personal bias.) -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.peters at gmail.com Wed Sep 9 23:10:07 2015 From: tim.peters at gmail.com (Tim Peters) Date: Wed, 9 Sep 2015 16:10:07 -0500 Subject: [Datetime-SIG] Making dt parameter of timezone.tzname(dt) optional In-Reply-To: References: Message-ID: [Alex] > What would you say for the following proposal: leave tzinfo.tzname() > signature as is, but add > > def name(self, dt=None): > return self.tzname(dt) > > to the base tzinfo class. Now `tzname()` is a hook for tzinfo implementers, > but name() is the higher level function for the users. (Note that I never > liked that datetime.tzname() and tzinfo.tzname() had the same method name, > so my proposal may reflect a personal bias.) Only if PEP 495 adds an obviously needed tzifno.classify(self, dt) method so that ordinary users don't have to become implementation experts to answer questions about datetimes that aren't about implementation details ;-) Short of that, I think I'd be happier if we changed tzinfo.tzname's signature to `dt=None`. No existing code would be harmed. New code would have to realize that any exploitation of the relaxed requirement could fail if an older tzinfo object is used. Same as that new code would have to realize that any exploitation of a new tzinfo.name() method could fail if an older tzinfo object is used. That's what "Version changed in" notes are for. I don't much care because I can't believe anyone uses a mix of tzinfos obtained from dozens of suppliers. BTW, this would be another use for starting to require that a tzinfo reveal a "version number" (however it's spelled). PEP 495 is a good place to start that too: any time Python changes tzinfo semantics, Python is creating _potential_ problems for everyone. PEP 495 is the first time we're proposing to change anything. 
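If a plain integer version attribute (the `__version__ = 2` spelling
suggested earlier) were adopted, the uniform check could be as dull as the
sketch below.  The attribute name, value, and helper are illustrative
only; nothing like this is specified by PEP 495 today:

def supports_pep495(tz):
    # Hypothetical uniform capability check: a post-495 tzinfo would
    # advertise itself with a class attribute that pre-495 tzinfos
    # simply lack.
    return getattr(tz, '__version__', 1) >= 2

# Library code that needs fold disambiguation could then fail fast:
#
#     if not supports_pep495(dt.tzinfo):
#         raise TypeError("a PEP 495 aware tzinfo is required")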
From alexander.belopolsky at gmail.com Thu Sep 10 00:19:08 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Wed, 9 Sep 2015 18:19:08 -0400 Subject: [Datetime-SIG] PEP 495: The classify() method Message-ID: On Wed, Sep 9, 2015 at 5:10 PM, Tim Peters wrote: [in the "Making dt parameter of timezone.tzname(dt) optional" thread] > > Only if PEP 495 adds an obviously needed tzifno.classify(self, dt) > method so that ordinary users don't have to become implementation > experts to answer questions about datetimes that aren't about > implementation details ;-) Deal! But the return values of classify() should be -1 (for gap), 0 (for regular) and 1 (for fold). And while we are at it, let's bring back the builtin cmp() method because all these cryptic >, < and == are just too confusing. :-) Seriously, though, I have no objection to the classify() method, but someone else will have to design it and carry through the unavoidable bikeshedding rounds. My goal in PEP 495 is to draw a straight line between the current state of affairs and a lossless astimezone(). Niceties like classify() are just a little off that path. I had no illusions when I started PEP 495 that it would be as easy as it sounds (just add one measly bit!) Still, I did not anticipate all the subtle issues that would have to be resolved. So rather than proposing more features that are not strictly necessary, I would like to ask the group to start kicking the tires on the reference implementation. [1] [1]: https://github.com/abalkin/cpython/tree/issue24773-s3 -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Thu Sep 10 21:25:55 2015 From: chris.barker at noaa.gov (Chris Barker) Date: Thu, 10 Sep 2015 12:25:55 -0700 Subject: [Datetime-SIG] Timeline arithmetic? In-Reply-To: References: Message-ID: On Mon, Sep 7, 2015 at 9:48 PM, Stuart Bishop wrote: > On 4 September 2015 at 23:01, Chris Barker wrote: > > > I would like a flag on datetime, but it seems it might be better to put > that > > flag on a tzinfo object. But the implementation is the something to argue > > about only if there is any chance of doing it at all. > > I would still lean towards a separate datetimetz class, but that is > just semantics. As this conversation has progressed, it seems the way forward, if anyone wants to go there, is a new datetime class that conforms to Carls "Model A" -- is that what you mean? For my part, it would be cool if such a class could use the same tzinfo objects as datetime.datetime, and maybe the same timedelta. But as Carl suggested -- that would be a job for a new library anyway. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.peters at gmail.com Fri Sep 11 04:41:59 2015 From: tim.peters at gmail.com (Tim Peters) Date: Thu, 10 Sep 2015 21:41:59 -0500 Subject: [Datetime-SIG] Timeline arithmetic? In-Reply-To: <55EDB967.2050108@oddbird.net> References: <55E9D6EB.2090108@oddbird.net> <55E9F626.1080906@oddbird.net> <55ECD82E.9070305@oddbird.net> <55EDB967.2050108@oddbird.net> Message-ID: It's become beyond obvious that I'll never be able to make enough time to respond to all of these, so I'll address just this for now. 
because it's impossible to make progress on anything unless there's
agreement on what technical terms mean:

[Carl Meyer ]
>>> If you are doing any kind of "integer arithmetic on POSIX timestamps",
>>> you are _always_ doing timeline arithmetic.

[Tim]
>> True.

[Carl]
>>> Classic arithmetic may be many things, but the one thing it
>>> definitively is _not_ is "arithmetic on POSIX timestamps."

[Tim]
>> False.  UTC is an eternally-fixed-offset zone.  There are no
>> transitions to be accounted for in UTC.  Classic and timeline
>> arithmetic are exactly the same thing in any eternally-fixed-offset
>> zone.  Because POSIX timestamps _are_ "in UTC", any arithmetic
>> performed on one is being done in UTC too.  Your illustration next
>> goes way beyond anything I could possibly read as doing arithmetic on
>> POSIX timestamps:

[Carl]
> Translation: "I refuse to countenance the possibility of Model A."

Not at all.  I've tried several times to get it across in English, so this
time I'll try code instead:

def dt_add(dt, td, timeline=False):
    ofs = dt.utcoffset()
    as_utc = dt.replace(tzinfo=timezone.utc)
    # and the following is identical to converting to
    # a timestamp, "using POSIX timestamp arithmetic",
    # then converting back to calendar notation
    as_utc -= ofs
    as_utc += td
    if timeline:
        return as_utc.astimezone(dt.tzinfo)
    else:  # classic
        return (as_utc + ofs).replace(tzinfo=dt.tzinfo)

That adds an aware datetime to a timedelta, doing either classic or
timeline arithmetic depending on the optional flag.  If you want to claim
this doesn't do either kind of arithmetic correctly, prove it with a
specific example (of course cases where it's impossible to do
_conversions_ correctly today would be off-point).

Here's a variant of an earlier specific example:

from datetime import datetime, timedelta, timezone
from pytz.reference import Eastern

turkey_in = datetime(2004, 10, 30, 15, tzinfo=Eastern)
DAY = timedelta(days=1)
turkey_out1 = dt_add(turkey_in, DAY, timeline=True)
turkey_out2 = dt_add(turkey_in, DAY, timeline=False)
print(turkey_in)
print(turkey_out1)
print(turkey_out2)

and its output:

2004-10-30 15:00:00-04:00  # start
2004-10-31 14:00:00-05:00  # "a day later" in timeline
2004-10-31 15:00:00-05:00  # "a day later" in classic

"Timeline" arithmetic accounts for that an hour was inserted when DST
ended, and "classic" does not.  The "POSIX timestamp arithmetic" part is
identical across both cases.  The only difference is in how the POSIX
timestamp - which is always and only a count of seconds in UTC (which
isn't my definition - it's POSIX's) - is converted back to local calendar
notation at the very end.

I believe you have _pictured_ the POSIX timestamp number line annotated
with local calendar notations in your head, but those labels have nothing
to do with the timestamp arithmetic.  The labels have only to do with the
functions used to map local calendar notations to and from POSIX
timestamps.  Those labelings are the difference between "timeline" and
"classic" arithmetic at the higher level of aware datetime arithmetic.  At
the POSIX timestamp level, an integer is just an integer, with no defined
meaning of any kind beyond a count of seconds in UTC, and a POSIX-defined
mapping to and from proleptic Gregorian calendar notation.

That said, two things to note:

1. The "as_utc -= ofs" line is theoretically impure, because it's treating
a local time _as if_ it were a UTC time.  There's no real way around that.
We have to convert from local to UTC _somehow_, and POSIX dodges the issue by providing mktime() to do that "by magic". Here we're _inside_ the sausage factory, doing it ourselves. Some rat guts are visible at this level. If you look inside a C mktime() implementation, you'll find rat guts all over that too. But it's no problem for Guido ;-) We just set the hands on a UTC clock to match the local clock, then move the hands on the UTC clock by the amount the local clock is "ahead of" or "behind" UTC. In that way you can indeed picture the operation as being entirely "in UTC". 2. This would be a foolish _implementation_ of classic arithmetic, but not for semantic reasons. It's just grossly inefficient. Stare at the code, and in the classic case it subtracts the UTC offset at first only to add the same offset back later. Those cancel out, so there's no _semantic_ need to do either.. It's only excessive concern for theoretical purity that could stop one from spelling it as return dt + td from the start. That's technically absurd, since it's doing POSIX timestamp arithmetic on a timestamp that's _not_ a UTC seconds count. Its only virtue is that it gets the same answer far faster ;-) BTW, the same kind of reasoning shows why the value of the `timeline=` flag makes no difference in any case a fixed-offset zone is being used. Which is, concretely, what I mean by saying that timeline and classic arithmetic are exactly the same thing in any fixed-offset zone. From random832 at fastmail.com Sat Sep 12 20:23:12 2015 From: random832 at fastmail.com (Random832) Date: Sat, 12 Sep 2015 14:23:12 -0400 Subject: [Datetime-SIG] Are there any "correct" implementations of tzinfo? Message-ID: I was trying to find out how arithmetic on aware datetimes is "supposed to" work, and tested with pytz. When I posted asking why it behaves this way I was told that pytz doesn't behave correctly according to the way the API was designed. The tzlocal module, on the other hand, appears to simply defer to pytz on Unix systems. My question is, _are_ there any correct reference implementations that demonstrate the proper behavior in the presence of a timezone that has daylight saving time transitions? From tim.peters at gmail.com Sat Sep 12 20:53:06 2015 From: tim.peters at gmail.com (Tim Peters) Date: Sat, 12 Sep 2015 13:53:06 -0500 Subject: [Datetime-SIG] Are there any "correct" implementations of tzinfo? In-Reply-To: References: Message-ID: > I was trying to find out how arithmetic on aware datetimes is "supposed > to" work, and tested with pytz. When I posted asking why it behaves this > way I was told that pytz doesn't behave correctly according to the way > the API was designed. You were told (by me) that its implementation of tzinfos was not the _intended_ way. Which is another way of saying it was an unanticipated way. "Correctly" is a whole different kind of judgment. pytz users who faithfully follow the docs seem happy with it. > The tzlocal module, on the other hand, appears to > simply defer to pytz on Unix systems. > > My question is, _are_ there any correct reference implementations that > demonstrate the proper behavior in the presence of a timezone that has > daylight saving time transitions? Which specific "proper behaviors"? :"Hybrid" tzinfos following the recommendations in the Python docs, including the sample implementations in the docs, correctly mimic local clock behavior (skipping the clock ahead when DST starts, and moving the clock back when DST ends) when converting from UTC. 
It's impossible now to do local -> UTC conversions correctly in all cases, because it's impossible now to know which UTC time was intended for a local time in a fold.  For the same reason, it's impossible now to know whether a local time in a fold is intended to be viewed as being in daylight time or standard time.

But do note limitations of the default .fromutc() implementation: it only guarantees correct mimic-the-local-clock behavior when total-offset transitions are solely due to a notion of "daylight time" that strictly alternates between .dst() returning zero and non-zero values.  Transitions due to any other reason may or may not be reflected in .fromutc()'s treatment of the local clock.  Most importantly, a transition due to a zone changing its base ("standard") UTC offset is a possibility the default .fromutc() knows nothing about.

The wrapping of the IANA ("Olson") zoneinfo database in dateutil uses hybrid tzinfos (the intended way of wrapping zones with multiple UTC offsets), and inherits the default .fromutc(), so all the above applies to it.  Including all behaviors stemming from the impossibility of disambiguating local times in a fold.  That's not a bug in dateutil.  It's a gap in datetime's design.  It was an intentional gap at the time, but that pytz went to such heroic lengths to fill it suggests PEP 495 may well be overdue ;-)

From random832 at fastmail.com  Sat Sep 12 21:16:02 2015
From: random832 at fastmail.com (random832 at fastmail.com)
Date: Sat, 12 Sep 2015 15:16:02 -0400
Subject: [Datetime-SIG] Are there any "correct" implementations of tzinfo?
In-Reply-To: References: Message-ID: <1442085362.324875.381920729.5E7A6DCE@webmail.messagingengine.com>

Oops, pressed the wrong reply button and it didn't include the datetime list.

On Sat, Sep 12, 2015, at 14:53, Tim Peters wrote:
> > I was trying to find out how arithmetic on aware datetimes is
> > "supposed to" work, and tested with pytz.  When I posted asking why
> > it behaves this way I was told that pytz doesn't behave correctly
> > according to the way the API was designed.
>
> You were told (by me) that its implementation of tzinfos was not the
> _intended_ way.  Which is another way of saying it was an
> unanticipated way.  "Correctly" is a whole different kind of judgment.
> pytz users who faithfully follow the docs seem happy with it.

My context is that I am working on an idea to include UTC offsets in datetime objects (or on a similar object in a new module), as an alternative to something like a "fold" attribute, and since "classic arithmetic" is apparently so important, I'm trying to figure out how "classic arithmetic" _is actually supposed to work_ when adding a timedelta to a time lands it on the opposite side of a transition (or in the middle of a "spring forward" gap).

If there is a "fall back" transition tonight, then adding a day to a time of 12 noon today could end up as:

12 noon tomorrow, offset still DST.
12 noon tomorrow, offset in standard time, 25 hours from now in real time.
11 AM tomorrow, offset in standard time, 24 hours from now in real time.

Which one of these is "classic arithmetic"?  Pytz (if you don't explicitly call a "normalize" function) results in something that looks like the first.  In one of the models I've thought of, you can get the second by replacing the tzinfo again, or the third by doing astimezone, but the first preserves "exactly 24 hours in the future" in both the UTC moment and the naive interpretation by leaving the offset alone even if it is an "unnatural" offset.
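For concreteness, a small sketch of that first outcome, assuming pytz and US/Eastern around the 2015 fall-back (the dates and zone are only illustrative):

    import pytz
    from datetime import datetime, timedelta

    eastern = pytz.timezone('US/Eastern')
    noon_today = eastern.localize(datetime(2015, 10, 31, 12, 0))  # noon EDT; DST ends overnight
    a_day_later = noon_today + timedelta(days=1)   # plain datetime arithmetic, tzinfo untouched

    print(noon_today)    # 2015-10-31 12:00:00-04:00
    print(a_day_later)   # 2015-11-01 12:00:00-04:00  <- noon tomorrow, still wearing the EDT offset

The -04:00 attached to the result is "unnatural" in the sense that no Eastern wall clock actually reads noon at UTC-4 on November 1.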
The second one above is what you get when you call normalize. My question was whether there are any real implementations that work the intended way. If there are not, maybe the intended semantics should go by the wayside and be replaced by what pytz does. From tim.peters at gmail.com Sat Sep 12 21:41:15 2015 From: tim.peters at gmail.com (Tim Peters) Date: Sat, 12 Sep 2015 14:41:15 -0500 Subject: [Datetime-SIG] Are there any "correct" implementations of tzinfo? In-Reply-To: <1442085362.324875.381920729.5E7A6DCE@webmail.messagingengine.com> References: <1442085362.324875.381920729.5E7A6DCE@webmail.messagingengine.com> Message-ID: [] > My context is that I am working on an idea to include utc offsets in > datetime objects (or on a similar object in a new module), as an > alternative to something like a "fold" attribute. and since "classic > arithmetic" is apparently so important, Love it or hate it, it's flatly impossible to change anything about it now, for backward compatibility. > I'm trying to figure out how > "classic arithmetic" _is actually supposed to work_ when adding a > timedelta to a time lands it on the opposite side of a transition (or in > the middle of a "spring forward" gap). datetime arithmetic is defined in the Python docs. > If there is a "fall back" transition tonight, then adding a day to a > time of 12 noon today could end up as: > > 12 noon tomorrow, offset still DST. > 12 noon tomorrow, offset in standard time, 25 hours from now in real > time. > 11 AM tomorrow, offset in standard time, 24 hours from now in real time > > Which one of these is "classic arithmetic"? 12 noon tomorrow in every case, regardless of tzinfo and regardless of whether any kind of transition may or may not have occurred. Whether it is or isn't in DST in this specific case isn't defined by Python - that's entirely up to what the tzinfo implementation says. The _intended_ way of implementing tzinfos would say it was in standard time. > Pytz (if you don't > explicitly call a "normalize" function) results in something that looks > like the first. Yes, because pytz always uses a fixed-offset tzinfo. There is no difference between timeline arithmetic and classic arithmetic in any fixed-offset zone. > In one of the models I've thought of, you can get the > second by replacing the tzinfo again, or the third by doing astimezone, > but the first preserves "exactly 24 hours in the future" in both the UTC > moment and the naive interpretation by leaving the offset alone even if > it is an "unnatural" offset. > > The second one above is what you get when you call normalize. Yes. .normalize() effectively converts to UTC and back again In fact, this is all it does: def normalize(self, dt, is_dst=False): if dt.tzinfo is self: return dt if dt.tzinfo is None: raise ValueError('Naive time - no tzinfo set') return dt.astimezone(self) .fromutc() is called as the last step of .astimezone(), and .pytz overrides the default .fromutc() to plug "the appropriate" fixed-offset pytz tzinfo into the result. > My question was whether there are any real implementations that work the > intended way. dateutil, plus all implementations anyone may have written for themselves based on the Python doc examples. When datetime was originally released, there were no concrete tzinfo implementations in the world, so lots of people wrote their own for the zones they needed by copy/paste/edit of the doc examples. > If there are not, maybe the intended semantics should go > by the wayside and be replaced by what pytz does. 
Changing anything about default arithmetic behavior is not a possibility. This has been beaten to death multiple times on this mailing list already, and I'm not volunteering for another round of it ;-) From ethan at stoneleaf.us Sat Sep 12 21:38:10 2015 From: ethan at stoneleaf.us (Ethan Furman) Date: Sat, 12 Sep 2015 12:38:10 -0700 Subject: [Datetime-SIG] Are there any "correct" implementations of tzinfo? In-Reply-To: <1442085362.324875.381920729.5E7A6DCE@webmail.messagingengine.com> References: <1442085362.324875.381920729.5E7A6DCE@webmail.messagingengine.com> Message-ID: <55F47F22.5080802@stoneleaf.us> On 09/12/2015 12:16 PM, random832 at fastmail.com wrote: > If there is a "fall back" transition tonight, then adding a day to a > time of 12 noon today could end up as: > > 12 noon tomorrow, offset still DST. > 12 noon tomorrow, offset in standard time, 25 hours from now in real > time. > 11 AM tomorrow, offset in standard time, 24 hours from now in real time I believe option 2 is the intended semantics. -- ~Ethan~ From alexander.belopolsky at gmail.com Sat Sep 12 21:53:38 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Sat, 12 Sep 2015 15:53:38 -0400 Subject: [Datetime-SIG] Are there any "correct" implementations of tzinfo? In-Reply-To: References: <1442085362.324875.381920729.5E7A6DCE@webmail.messagingengine.com> Message-ID: On Sat, Sep 12, 2015 at 3:41 PM, Tim Peters wrote: > > If there are not, maybe the intended semantics should go > > by the wayside and be replaced by what pytz does. > > Changing anything about default arithmetic behavior is not a > possibility. This has been beaten to death multiple times on this > mailing list already, and I'm not volunteering for another round of it > ;-) Tim and Guido only grudgingly accept it, but datetime already gives you "the pytz way" and PEP 495 makes a small improvement to it. The localize/normalize functionality is provided by the .astimezone() method which when called without arguments will attach an appropriate fixed offset timezone to a datetime object. You can then add timedeltas to the result and stay within a "fictitious" fixed offset timezone that extends indefinitely in both directions. To get back to the actual civil time - you call .astimezone() again. This gives you what we call here a "timeline" arithmetic and occasionally it is preferable to doing arithmetic in UTC. (Effectively you do arithmetic in local standard time instead of UTC.) Using a fixed offset timezone other than UTC for timeline arithmetic is preferable in timezones that are far enough from UTC that business hours straddle UTC midnight. -------------- next part -------------- An HTML attachment was scrubbed... URL: From alexander.belopolsky at gmail.com Sat Sep 12 21:55:58 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Sat, 12 Sep 2015 15:55:58 -0400 Subject: [Datetime-SIG] Are there any "correct" implementations of tzinfo? In-Reply-To: <55F47F22.5080802@stoneleaf.us> References: <1442085362.324875.381920729.5E7A6DCE@webmail.messagingengine.com> <55F47F22.5080802@stoneleaf.us> Message-ID: On Sat, Sep 12, 2015 at 3:38 PM, Ethan Furman wrote: > On 09/12/2015 12:16 PM, random832 at fastmail.com wrote: > > If there is a "fall back" transition tonight, then adding a day to a >> time of 12 noon today could end up as: >> >> (1) 12 noon tomorrow, offset still DST. >> (2) 12 noon tomorrow, offset in standard time, 25 hours from now in real >> time. 
>> (3) 11 AM tomorrow, offset in standard time, 24 hours from now in real >> time >> > > I believe option 2 is the intended semantics. This is correct. We call this behavior "classic arithmetic" on this list. -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.peters at gmail.com Sat Sep 12 22:10:29 2015 From: tim.peters at gmail.com (Tim Peters) Date: Sat, 12 Sep 2015 15:10:29 -0500 Subject: [Datetime-SIG] Are there any "correct" implementations of tzinfo? In-Reply-To: References: <1442085362.324875.381920729.5E7A6DCE@webmail.messagingengine.com> Message-ID: >>> If there are not, maybe the intended semantics should go >> > by the wayside and be replaced by what pytz does. >> Changing anything about default arithmetic behavior is not a >> possibility. This has been beaten to death multiple times on this >> mailing list already, and I'm not volunteering for another round of it >> ;-) [Alex] > Tim and Guido only grudgingly accept it, but datetime already gives you "the > pytz way" and PEP 495 makes a small improvement to it. To be clear, "Tim and Guido" have nothing at all against timeline arithmetic. Sometimes it's exactly what you need. But the _intended_ way to get it was always to convert to UTC first, or to just use plain old timestamps. Classic arithmetic was very intentionally the default. The only "grudgingly accepted" part is that .astimezone() grew a special case later, to make the absence of an argument "mean something": > The localize/normalize functionality is provided by the .astimezone() > method which when called without arguments will attach an appropriate > fixed offset timezone to a datetime object. You can then add timedeltas > to the result and stay within a "fictitious" fixed offset timezone that extends > indefinitely in both directions. To get back to the actual civil time - you > call .astimezone() again. This gives you what we call here a "timeline" > arithmetic and occasionally it is preferable to doing arithmetic in UTC. > (Effectively you do arithmetic in local standard time instead of UTC.) > Using a fixed offset timezone other than UTC for timeline arithmetic is > preferable in timezones that are far enough from UTC that business hours > straddle UTC midnight. The distance from UTC can't make any difference to the end result, although if you're working in an interactive shell "it's nice" to see intermediate results near current wall-clock time. "A potential problem" with .astimezone()'s default is that it _does_ create a fixed-offset zone. It's not at all obvious that it should do so. First time I saw it, my initial _expectation_ was that it "obviously" created a hybrid tzinfo reflecting the system zone's actual daylight rules, as various "tzlocal" implementations outside of Python do. From alexander.belopolsky at gmail.com Sat Sep 12 23:24:37 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Sat, 12 Sep 2015 17:24:37 -0400 Subject: [Datetime-SIG] Are there any "correct" implementations of tzinfo? In-Reply-To: References: <1442085362.324875.381920729.5E7A6DCE@webmail.messagingengine.com> Message-ID: On Sat, Sep 12, 2015 at 4:10 PM, Tim Peters wrote: > "A potential problem" with .astimezone()'s default is that it _does_ > create a fixed-offset zone. It's not at all obvious that it should do > so. 
First time I saw it, my initial _expectation_ was that it > "obviously" created a hybrid tzinfo reflecting the system zone's > actual daylight rules, as various "tzlocal" implementations outside of > Python do. > The clue should have been that .astimezone() is an instance method and you don't need to know time to create a hybrid tzinfo. If a Local tzinfo was available, it could just be passed to the .astimezone() method as an argument. You would not need .astimezone() to both create a tzinfo and convert the datetime instance to it. Still, I agree that this was a hack and a very similar hack to the one implemented by pytz. Hopefully once PEP 495 is implemented we will shortly see "as intended" tzinfos to become more popular. -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Sun Sep 13 00:24:58 2015 From: guido at python.org (Guido van Rossum) Date: Sat, 12 Sep 2015 15:24:58 -0700 Subject: [Datetime-SIG] Are there any "correct" implementations of tzinfo? In-Reply-To: References: <1442085362.324875.381920729.5E7A6DCE@webmail.messagingengine.com> Message-ID: On Sat, Sep 12, 2015 at 2:24 PM, Alexander Belopolsky < alexander.belopolsky at gmail.com> wrote: > > On Sat, Sep 12, 2015 at 4:10 PM, Tim Peters wrote: > >> "A potential problem" with .astimezone()'s default is that it _does_ >> create a fixed-offset zone. It's not at all obvious that it should do >> so. First time I saw it, my initial _expectation_ was that it >> "obviously" created a hybrid tzinfo reflecting the system zone's >> actual daylight rules, as various "tzlocal" implementations outside of >> Python do. >> > > The clue should have been that .astimezone() is an instance method and > you don't need to know time to create a hybrid tzinfo. If a Local tzinfo > was available, it could just be passed to the .astimezone() method as an > argument. You would not need .astimezone() to both create a tzinfo and > convert the datetime instance to it. > > Still, I agree that this was a hack and a very similar hack to the one > implemented by pytz. Hopefully once PEP 495 is implemented we will > shortly see "as intended" tzinfos to become more popular. > The repeated claims (by Alexander?) that astimezone() has the power of pytz's localize() need to stop. Those pytz methods work for any (pytz) timezone -- astimezone() with a default argument only works for the local time zone. (And indeed what it does is surprising, except perhaps to pytz users.) -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From alexander.belopolsky at gmail.com Sun Sep 13 02:46:45 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Sat, 12 Sep 2015 20:46:45 -0400 Subject: [Datetime-SIG] Are there any "correct" implementations of tzinfo? In-Reply-To: References: <1442085362.324875.381920729.5E7A6DCE@webmail.messagingengine.com> Message-ID: On Sat, Sep 12, 2015 at 6:24 PM, Guido van Rossum wrote: > The repeated claims (by Alexander?) that astimezone() has the power of > pytz's localize() need to stop. Prove me wrong! :-) > Those pytz methods work for any (pytz) timezone -- astimezone() with a > default argument only works for the local time zone. That's what os.environ['TZ'] = zonename is for. The astimezone() method works for every timezone installed on your system. Try it - you won't even need to call time.tzset()! > (And indeed what it does is surprising, except perhaps to pytz users.) That I agree with. 
Which makes it even more surprising that I often find myself and pytz advocates on the opposite sides of the fence. Granted, setting TZ is a silly trick, but one simple way to bring a full TZ database to Python is to allow .astimezone() take a zonename string like 'Europe/Amsterdam' or 'America/Montevideo' as an argument and act as os.environ['TZ'] = zonename; t.astimezone() does now, but without messing with global state. I made this suggestion before, but I find it inferior to "as intended" tzinfos. The only real claim that I am making is that fictitious fixed offset timezones are useful and we already have some support for them in stdlib. The datetime.timezone instances that .astimezone() attaches as tzinfo are not that different from the instances that are attached by pytz's localize and normalize methods. In fact, the only major differences between datetime.timezone instances and those used by pytz is that pytz's EST and EDT instances know that they come from America/New_York, while datetime.timezone instances don't. That's why once you specify America/New_York in localize, your tzinfo.normalize knows it implicitely, while in the extended .astimezone() solution you will have to specify it again. This is not a problem when you only support one local timezone, but comes with a different set of tradeoffs when you have multiple timezones. One advantage of not carrying the memory of the parent zoneinfo in the fixed offset tzinfo instance is that pickling of datetime objects and their interchange between different systems becomes simpler. A pickle of a datetime.timezone instance is trivial - same as that of a tuple of timedelta and a short string, but if your fixed offset tzinfo carries a reference to a potentially large zoneinfo structure, you get all kinds of interesting problems when you share them between systems that have different TZ databases. In any case, there are three approaches to designing a TZ database interface in the datetime module: the "as intended" approach, the pytz approach and the astimezone(zonename:str) approach. The last two don't require a fold attribute to disambiguate end-of-dst times and the first one does. With respect to arithmetic, the last two approaches are equivalent: both timeline and classic arithmetics are possible, but neither is painless. The "as intended" approach comes with classic arithmetic that "just works" and encourages the best practice for timeline arithmetic: do it in UTC. That's why I believe PEP 495 followed by the implementation of fold-aware "as intended" tzinfos (either within stdlib or by third parties) is the right approach. -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.peters at gmail.com Sun Sep 13 03:58:48 2015 From: tim.peters at gmail.com (Tim Peters) Date: Sat, 12 Sep 2015 20:58:48 -0500 Subject: [Datetime-SIG] Are there any "correct" implementations of tzinfo? In-Reply-To: References: <1442085362.324875.381920729.5E7A6DCE@webmail.messagingengine.com> Message-ID: [Guido] >> Those pytz methods work for any (pytz) timezone -- astimezone() with a >> default argument only works for the local time zone. {Alex] > That's what os.environ['TZ'] = zonename is for. The astimezone() method > works for every timezone installed on your system. Try it - you won't even > need to call time.tzset()! I tried it. It makes no difference to anything for me. I stay on Windows to remind people that millions of Python users don't see any of the horrid nonsense Linuxish systems force on poor users ;-) > ... 
> In any case, there are three approaches to designing a TZ database interface > in the datetime module: the "as intended" approach, the pytz approach and > the astimezone(zonename:str) approach. Portability rules out #3, unless Python bundles its own zoneinfo wrapping. pytk's approach has many attractions, like no need for `fold` and no breakage of anything, and blazing fast .utcoffset(). Except at least arithmetic would have to be patched to do a `normalize` variant by magic (to attach the now-appropriate fixed-offset tzinfo, but without changing the clock in the process). Alas, that would be a huge speed hit for classic arithmetic. So, as always, the original intent is the only one that makes sense in the end ;-) > ... > That's why I believe PEP 495 followed by the implementation > of fold-aware "as intended" tzinfos (either within stdlib or by third > parties) is the right approach. Me too - except I think acceptance of 495 should be contingent upon someone first completing a fully functional (if not releasable) fold-aware zoneinfo wrapping. Details have a way of surprising, and we should learn from the last time we released a tzinfo spec in the absence of any industrial-strength wrappings using it. From guido at python.org Sun Sep 13 04:13:04 2015 From: guido at python.org (Guido van Rossum) Date: Sat, 12 Sep 2015 19:13:04 -0700 Subject: [Datetime-SIG] Are there any "correct" implementations of tzinfo? In-Reply-To: References: <1442085362.324875.381920729.5E7A6DCE@webmail.messagingengine.com> Message-ID: On Sat, Sep 12, 2015 at 5:46 PM, Alexander Belopolsky < alexander.belopolsky at gmail.com> wrote: > > On Sat, Sep 12, 2015 at 6:24 PM, Guido van Rossum > wrote: > >> The repeated claims (by Alexander?) that astimezone() has the power of >> pytz's localize() need to stop. > > > Prove me wrong! :-) > > >> Those pytz methods work for any (pytz) timezone -- astimezone() with a >> default argument only works for the local time zone. > > > That's what os.environ['TZ'] = zonename is for. The astimezone() method > works for every timezone installed on your system. Try it - you won't even > need to call time.tzset()! > That's global state. Doesn't count. > (And indeed what it does is surprising, except perhaps to pytz users.) > > > That I agree with. Which makes it even more surprising that I often find > myself and pytz advocates on the opposite sides of the fence. > > Granted, setting TZ is a silly trick, but one simple way to bring a full > TZ database to Python is to allow .astimezone() take a zonename string like > 'Europe/Amsterdam' or 'America/Montevideo' as an argument and act as > os.environ['TZ'] = zonename; t.astimezone() does now, but without messing > with global state. > It might as well be a different method then though. > I made this suggestion before, but I find it inferior to "as intended" > tzinfos. > > The only real claim that I am making is that fictitious fixed offset > timezones are useful and we already have some support for them in stdlib. > The datetime.timezone instances that .astimezone() attaches as tzinfo are > not that different from the instances that are attached by pytz's localize > and normalize methods. > And it has the same defect. > In fact, the only major differences between datetime.timezone instances > and those used by pytz is that pytz's EST and EDT instances know that they > come from America/New_York, while datetime.timezone instances don't. 
> That's why once you specify America/New_York in localize, your > tzinfo.normalize knows it implicitely, while in the extended .astimezone() > solution you will have to specify it again. This is not a problem when you > only support one local timezone, but comes with a different set of > tradeoffs when you have multiple timezones. > > One advantage of not carrying the memory of the parent zoneinfo in the > fixed offset tzinfo instance is that pickling of datetime objects and their > interchange between different systems becomes simpler. A pickle of a > datetime.timezone instance is trivial - same as that of a tuple of > timedelta and a short string, but if your fixed offset tzinfo carries a > reference to a potentially large zoneinfo structure, you get all kinds of > interesting problems when you share them between systems that have > different TZ databases. > The pickling should be careful to pickle by reference (on the timezone name). That its meaning depends on the tz database is a feature. > In any case, there are three approaches to designing a TZ database > interface in the datetime module: the "as intended" approach, the pytz > approach and the astimezone(zonename:str) approach. The last two don't > require a fold attribute to disambiguate end-of-dst times and the first one > does. With respect to arithmetic, the last two approaches are equivalent: > both timeline and classic arithmetics are possible, but neither is > painless. The "as intended" approach comes with classic arithmetic that > "just works" and encourages the best practice for timeline arithmetic: do > it in UTC. That's why I believe PEP 495 followed by the implementation of > fold-aware "as intended" tzinfos (either within stdlib or by third parties) > is the right approach. > Right. So please focus on this path and don't try to pretend to pytz users that hacks around astimezone() make pytz redundant, because they don't. There are other ways to fix the damage that pytz has done. -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From alexander.belopolsky at gmail.com Sun Sep 13 04:15:02 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Sat, 12 Sep 2015 22:15:02 -0400 Subject: [Datetime-SIG] Are there any "correct" implementations of tzinfo? In-Reply-To: References: <1442085362.324875.381920729.5E7A6DCE@webmail.messagingengine.com> Message-ID: On Sat, Sep 12, 2015 at 9:58 PM, Tim Peters wrote: > > That's why I believe PEP 495 followed by the implementation > > of fold-aware "as intended" tzinfos (either within stdlib or by third > > parties) is the right approach. > > Me too - except I think acceptance of 495 should be contingent upon > someone first completing a fully functional (if not releasable) > fold-aware zoneinfo wrapping. Good idea. How far are you from completing that? > Details have a way of surprising, and > we should learn from the last time we released a tzinfo spec in the > absence of any industrial-strength wrappings using it. I completely agree. That's why I am adding test cases like Lord Hope Island and Vilnius to datetimetester. I will try to create a zoneinfo wrapping prototype as well, but I will probably "cheat" and build it on top of pytz. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From tim.peters at gmail.com Sun Sep 13 04:25:19 2015 From: tim.peters at gmail.com (Tim Peters) Date: Sat, 12 Sep 2015 21:25:19 -0500 Subject: [Datetime-SIG] Are there any "correct" implementations of tzinfo? In-Reply-To: References: <1442085362.324875.381920729.5E7A6DCE@webmail.messagingengine.com> Message-ID: [Tim] >> Me too - except I think acceptance of 495 should be contingent upon >> someone first completing a fully functional (if not releasable) >> fold-aware zoneinfo wrapping. [Alex] > Good idea. How far are you from completing that? In my head, it was done last week ;-) In real life, I'm running out of spare time for much of anything anymore. I don't expect to be able to resume zoneinfo fiddling for at least 2 weeks. >> Details have a way of surprising, and >> we should learn from the last time we released a tzinfo spec in the >> absence of any industrial-strength wrappings using it. > I completely agree. That's why I am adding test cases like Lord Hope Island > and Vilnius to datetimetester. That helps a lot, but "industrial-strength" implies "by algorithm". There are far too many zones to deal with by crafting a hand-written class for each. > I will try to create a zoneinfo wrapping prototype as well, but I will > probably "cheat" and build it on top of pytz. It would be crazy not to ;-) Note that Stuart got to punt on "the hard part": .utcoffset(), since pytz only uses fixed-offset classes. For a prototype - and possibly forever after - I'd be inclined to create an exhaustive list of transition times in local time, parallel to the list of such times already there in UTC. An index into either list then gives an index into the other, and into the list of information about the transition (total offset, is_dst, etc). From alexander.belopolsky at gmail.com Sun Sep 13 04:40:33 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Sat, 12 Sep 2015 22:40:33 -0400 Subject: [Datetime-SIG] Are there any "correct" implementations of tzinfo? In-Reply-To: References: <1442085362.324875.381920729.5E7A6DCE@webmail.messagingengine.com> Message-ID: On Sat, Sep 12, 2015 at 10:25 PM, Tim Peters wrote: > > I will try to create a zoneinfo wrapping prototype as well, but I will > > probably "cheat" and build it on top of pytz. > > It would be crazy not to ;-) Note that Stuart got to punt on "the > hard part": .utcoffset(), since pytz only uses fixed-offset classes. > For a prototype - and possibly forever after - I'd be inclined to > create an exhaustive list of transition times in local time, parallel > to the list of such times already there in UTC. Yes. The only complication is that you need four transition points instead of two per year in a regular DST case: (1) start of gap; (2) end of gap; (3) start of fold; and (4) end of fold. Once you know where you are with respect to those points, figuring out utcoffset(), dst() and tzname() for either value of fold is trivial. > An index into either > list then gives an index into the other, and into the list of > information about the transition (total offset, is_dst, etc). Right. It's a shame though to work from a transitions in UTC list because most of DST rules are expressed in local times and then laboriously converted into UTC. I think I should also implement the POSIX TZ spec tzinfo. This is where the advantage of the "as intended" approach will be obvious. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From random832 at fastmail.com Sun Sep 13 04:54:09 2015 From: random832 at fastmail.com (random832 at fastmail.com) Date: Sat, 12 Sep 2015 22:54:09 -0400 Subject: [Datetime-SIG] Are there any "correct" implementations of tzinfo? In-Reply-To: References: <1442085362.324875.381920729.5E7A6DCE@webmail.messagingengine.com> Message-ID: <1442112849.1530082.382090617.0F961872@webmail.messagingengine.com> On Sat, Sep 12, 2015, at 22:25, Tim Peters wrote: > That helps a lot, but "industrial-strength" implies "by algorithm". > There are far too many zones to deal with by crafting a hand-written > class for each. It occurs to me that though it's written in C, the zdump utility included in the tz code is implementation-agnostic w.r.t. what algorithm is used by the localtime function being tested. It's algorithm could probably be adapted to python. From alexander.belopolsky at gmail.com Sun Sep 13 05:42:10 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Sat, 12 Sep 2015 23:42:10 -0400 Subject: [Datetime-SIG] PEP 495: What's left to resolve In-Reply-To: References: Message-ID: I have now rewritten the "Temporal Arithmetic" section of the PEP to reflect "Solution 3." Hg commit: https://hg.python.org/peps/rev/3dc0382326de Rendered PEP section: https://www.python.org/dev/peps/pep-0495/#temporal-arithmetic-and-comparison-operators In addition to a general review of the rewritten section, I would like to ask the group to comment on the following part specifically: "The result of addition (subtraction) of a timedelta to (from) a datetime will always have fold set to 0 even if the original datetime instance had fold=1." There are two "obvious" choices here: (t + d).fold == 0 and (t + d).fold == t.fold. My original motivation for the rule above was to minimize the chances that a user would ever see a fold=1 instance. However, I now think that preserving the value of fold may be a better option. For example, an application that needs to iterate over minutes in the repeated hour will not need to adjust the fold attribute after each addition. On the other hand, there is little harm from accidentally "leaking" fold=1 into the regular zone where fold value makes no difference. On Wed, Sep 9, 2015 at 11:44 AM, Alexander Belopolsky < alexander.belopolsky at gmail.com> wrote: >> > >Solution 1: Make t1 > t0. >> >> Solution 2: Leave t1 == t0, but make t1 != u1. >> >> >> Solution 3: Leave t1 == t0, but make *both* t0 != u0 and t1 != u1 if t0.utcoffset() != t1.utcoffset(). > > > I've implemented [1] Solution 3 in my Github fork. > > [1]: https://github.com/abalkin/cpython/commit/aac301abe89cad2d65633df98764e5b5704f7629 -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.peters at gmail.com Sun Sep 13 05:54:47 2015 From: tim.peters at gmail.com (Tim Peters) Date: Sat, 12 Sep 2015 22:54:47 -0500 Subject: [Datetime-SIG] Are there any "correct" implementations of tzinfo? In-Reply-To: References: <1442085362.324875.381920729.5E7A6DCE@webmail.messagingengine.com> Message-ID: [Alex] >>> I will try to create a zoneinfo wrapping prototype as well, but I will >>> probably "cheat" and build it on top of pytz. [Tim] >> It would be crazy not to ;-) Note that Stuart got to punt on "the >> hard part": .utcoffset(), since pytz only uses fixed-offset classes. >> For a prototype - and possibly forever after - I'd be inclined to >> create an exhaustive list of transition times in local time, parallel >> to the list of such times already there in UTC. [Alex] > Yes. 
The only complication is that you need four transition points instead > of two per year in a regular DST case: (1) start of gap; (2) end of gap; (3) > start of fold; and (4) end of fold. Once you know where you are with > respect to those points, figuring out utcoffset(), dst() and tzname() for > either value of fold is trivial. I wouldn't call those extras transitions - they're just warts hanging off of actual transitions. Earlier I showed Stuart how to determine everything about a possible fold from a UTC time using pytz's internal info, in PEP-431/495 Fri, 28 Aug 2015 01:01:06 -0500 He didn't reply that I saw, so it was either obvious or incomprehensible to him ;-) In any case, it just takes some very simple code once the transition record the UTC time belongs in is found. I'd be surprised if it weren't similarly easy to determine everything about a possible gap. At least in a zoneinfo wrapping, a hybrid tzinfo's .utcoffset() has to (at least internally) find "the transition record the UTC time belongs in" regardless. > ... > It's a shame though to work from a transitions in UTC list But that's what tzfiles store. It would be insane for a zoneinfo wrapping not to take advantage of that. For which reason, I do consider dateutil's zoneinfo wrapping to be insane ;-) (It inherits the default .fromutc()) Ah, BTW, I think dateutil's zoneinfo's wrapping also misunderstood some of what's actually in a tzfile. Specifically, a tzfile's " UTC/local indicators" and " standard/wall indicators" are 100% useless for anything we need, and shouldn't even be read from the file(*) (seek over 'em). > because most of DST rules are expressed in local times and then > laboriously converted into UTC. It's just a few lines of code in zoneinfo's zic.c. Nobody is doing it "by hand" there. > I think I should also implement the POSIX TZ spec tzinfo. For that you really should grab dateutil. It has a full implementation of POSIX TZ rules, as hybrid tzinfos; here from its docs: >>> tz1 = tzstr('EST+05EDT,M4.1.0,M10.5.0') >>> tz2 = tzstr('AEST-10AEDT-11,M10.5.0,M3.5.0') >>> dt = datetime(2003, 5, 8, 2, 7, 36, tzinfo=tz1) >>> dt.strftime('%X %x %Z') '02:07:36 05/08/03 EDT' >>> dt.astimezone(tz2).strftime('%X %x %Z') '16:07:36 05/08/03 AEST' Of course this implementation is tied into dateutil's rich supply of "calendar operations" too. > This is where the advantage of the "as intended" approach will be obvious. ? "As intended" is all about (to me) using hybrid tzinfos. And those are far richer in tzfiles than from POSIX rules. The latter only present us with simplest-possible DST transitions; tzfiles present us with every absurd political manipulation yet inflicted on humankind ;-) ------- (*) Long boring story. Short course: those indicators are only needed, on some systems, if a POSIZ TZ rule specifies a zone offset but gives no rules at all for daylight transitions, _and_ the system has a "posixrules" tzfile. Then an insane scheme is used to make up daylight rules "as if" the source file from which the posixrules tzfile was created had been for a zone with the TZ-specified standard offset instead, and these absurd indicators are used to figure out whether the posixrules source file specified _its_ daylight rules using UTC or local times, and if the later case then whether using standard time or wall-clock time instead. It's completely nuts. From carl at oddbird.net Sun Sep 13 08:16:23 2015 From: carl at oddbird.net (Carl Meyer) Date: Sun, 13 Sep 2015 00:16:23 -0600 Subject: [Datetime-SIG] Timeline arithmetic? 
In-Reply-To: References: <55E9D6EB.2090108@oddbird.net> <55E9F626.1080906@oddbird.net> <55ECD82E.9070305@oddbird.net> <55EDB967.2050108@oddbird.net> Message-ID: <55F514B7.60004@oddbird.net> Hi Tim, On 09/10/2015 08:41 PM, Tim Peters wrote: > It's become beyond obvious that I'll never be able to make enough time > to respond to all of these, so I'll address just this for now. because > it's impossible to make progress on anything unless there's agreement > on what technical terms mean: > > > [Carl Meyer ] >>>> If you are doing any kind of "integer arithmetic on POSIX timestamps", you >>>> are _always_ doing timeline arithmetic. > > [Tim] >>> True. > > [Carl] >>>> Classic arithmetic may be many things, but the one thing it definitively is >>>> _not_ is "arithmetic on POSIX timestamps." > > [Tim] >>> False. UTC is an eternally-fixed-offset zone. There are no >>> transitions to be accounted for in UTC. Classic and timeline >>> arithmetic are exactly the same thing in any eternally-fixed-offset >>> zone. Because POSIX timestamps _are_ "in UTC", any arithmetic >>> performed on one is being done in UTC too. Your illustration next >>> goes way beyond anything I could possibly read as doing arithmetic on >>> POSIX timestamps: > > [Carl] >> Translation: "I refuse to countenance the possibility of Model A." > > Not at all. I've tried several times to get it across in English, so > this time I'll try code instead: > > def dt_add(dt, td, timeline=False): > ofs = dt.utcoffset() > as_utc = dt.replace(tzinfo=timezone.utc) > > # and the following is identical to converting to > # a timestamp, "using POSIX timestamp arithmetic", > # then converting back to calendar notation > as_utc -= ofs > as_utc += td > > if timeline: > return as_utc.astimezone(dt.tzinfo) > else: # classic > return (as_utc + ofs).replace(tzinfo=dt.tzinfo) Well, sure. Of course it is possible to use "arithmetic on POSIX timestamps" within an implementation of either kind of arithmetic, if you try hard enough; I've never said anything to the contrary (that would be a provably silly thing to say). What your code does make clear is that if you convert from a DST-using timezone to a POSIX timestamp, do "arithmetic on POSIX timestamps" and then do a normal (what you would in any other context call a "correct") conversion back to the first timezone afterwards, the result you get is timeline arithmetic. Sure, if you do a specific sort of weird (what you would in any other context call "wrong") conversion from the POSIX timestamp back to the other timezone afterward, then you can get classic arithmetic instead. I'm not sure what you think that demonstrates. I think it demonstrates that both timeline and classic arithmetic _can_ be described in terms that include "arithmetic on POSIX timestamps," but timeline arithmetic is much more naturally seen that way. Your original assertion was that "Classic arithmetic is equivalent to doing integer arithmetic on integer POSIX timestamps" as a justification for why datetime chose classic arithmetic, implying that classic arithmetic is somehow _more_ or _more naturally_ seen as "equivalent to integer arithmetic on integer POSIX timestamps" than timeline arithmetic. I found that assertion puzzling, and I still do. I'd still conclude the same thing I already said in an earlier reply: """ So, "timeline arithmetic is just arithmetic on POSIX timestamps" means viewing all aware datetimes as isomorphic to POSIX timestamps. 
"Classic arithmetic is just arithmetic on POSIX timestamps" means viewing aware datetimes as naive datetimes which one can pretend are in a hypothetical (maybe UTC, if you like) fixed-offset timezone which is isomorphic to actual POSIX timestamps (even though their actual timezone may not be fixed-offset). I accept that those are both true and useful in the implementation of their respective model. I just don't think either one is inherently obvious or useful as a justification of their respective mental models; rather, which one you find "obvious" just reveals your preferred mental model. """ > That adds an aware datetime to a timedelta, doing either classic or > timeline arithmetic depending on the optional flag. If you want to > claim this doesn't do either kind of arithmetic correctly, prove it > with a specific example I'm not sure why you'd think I'd have any issue with that code, or any desire to prove it wrong. [...] > I believe you have _pictured_ the POSIX timestamp number line > annotated with local calendar notations in your head, but those labels > have nothing to do with the timestamp arithmetic. It would be more accurate to say that a Model A view pictures only a single timeline, which is physical (Newtonian) time. A point on that timeline is an instant. Any given instant is annotated with any number of labels, each one a unique and unambiguous description of that instant in some labeling system. A labeling system can be very simple (e.g. POSIX timestamps), less simple (proleptic Gregorian in UTC, or to a lesser extent any fixed-offset timezone), or slightly ridiculous (timezones with folds and gaps, where now we need a `fold` attribute or an explicit offset at each instant or something similar to keep each label unique and unambiguous). This mental model implies (and requires) that all of these labeling systems are isomorphic to each other and to the physical-time timeline, and that arithmetic in any of them is isomorphic to arithmetic in any other (and is thus obviously timeline arithmetic). Really my only point in this entire thread has been that this model (contrary to some of the denigration of it on this mailing list) is actually quite intuitive, not difficult to teach, and possible to do all sorts of useful work in (_even_ when you have to also teach pytz's unfortunate API for it). If you can agree with that - great, we're done here. If you don't agree with that, we may as well still be done, because I have too much personal experience suggesting it to be true for you to be likely able to convince me otherwise :-) I've also come to recognize, through this thread, that Model B (where the "local clock time in a given timezone" "timeline" is elevated to sort-of-equal status with the physical timeline, rather than just considered a weird complex labeling system for physical time) is also useful (more useful for some tasks) and makes intuitive sense too. [...] > 1. The "as_utc -= ofs" line is theoretically impure, because it's > treating a local time _as if_ it were a UTC time. There's no real way > around that. We have to convert from local to UTC _somehow_, and > POSIX dodges the issue by providing mktime() to do that "by magic". > Here we're _inside_ the sausage factory, doing it ourselves. Some rat > guts are visible at this level. If you look inside a C mktime() > implementation, you'll find rat guts all over that too. This seems like a really hand-wavy rationalization of an operation that can only really be described as an incorrect timezone conversion. 
Of course that incorrect timezone conversion operation is useful for implementing classic arithmetic in the way you've implemented it, but taken out of that context it's just an incorrect conversion. The reason you _need_ that incorrect conversion is because for some reason you're really wanting to do your arithmetic in terms of POSIX timestamps (which are defined as being in UTC), but you don't _really_ want correct conversion to UTC and back (because if you do that, you'll get timeline arithmetic). > But it's no problem for Guido ;-) We just set the hands on a UTC > clock to match the local clock, then move the hands on the UTC clock > by the amount the local clock is "ahead of" or "behind" UTC. In that > way you can indeed picture the operation as being entirely "in UTC". Sure, you can, if you're motivated enough :-) > 2. This would be a foolish _implementation_ of classic arithmetic, but > not for semantic reasons. It's just grossly inefficient. Stare at > the code, and in the classic case it subtracts the UTC offset at first > only to add the same offset back later. Those cancel out, so there's > no _semantic_ need to do either.. It's only excessive concern for > theoretical purity that could stop one from spelling it as > > return dt + td > > from the start. That's technically absurd, since it's doing POSIX > timestamp arithmetic on a timestamp that's _not_ a UTC seconds count. > Its only virtue is that it gets the same answer far faster ;-) I actually think this implementation would be _less_ technically absurd. I'm not sure why you'd insist that any arithmetic on a count of seconds must be "POSIX timestamp arithmetic." In this case you're just doing integer arithmetic on a naive count of seconds since some point in the local timezone clock, rather than on a count of seconds in UTC. That's a much more natural way to view classic arithmetic, and also happens to be the way datetime actually does it (where "some point" is datetime(1, 1, 1)). Carl -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: OpenPGP digital signature URL: From tim.peters at gmail.com Sun Sep 13 17:27:30 2015 From: tim.peters at gmail.com (Tim Peters) Date: Sun, 13 Sep 2015 10:27:30 -0500 Subject: [Datetime-SIG] Are there any "correct" implementations of tzinfo? In-Reply-To: <201509131224.t8DCOXHO004891@fido.openend.se> References: <1442085362.324875.381920729.5E7A6DCE@webmail.messagingengine.com> <201509131224.t8DCOXHO004891@fido.openend.se> Message-ID: [Alex] >>I will try to create a zoneinfo wrapping prototype as well, but I will >>probably "cheat" and build it on top of pytz. [Laura Creighton] > My question, is whether it will handle Creighton, Saskatchewan, Canada? > Creighton is an odd little place. Like all of Saskatchewan, it is > in the Central time zone, even though you would expect it to be > in the Mountain time zone based on its location on the globe. > The people of Saskatchewan have decided not to adopt Daylight > Savings time. Except for the people of Creighton (and > nearby Denare Beach) -- who _do_ observe Daylight savings time. > > makes for an interesting corner case, one that I remember for > personal (and not economic, or professional) reasons. Hi, Laura! By "zoneinfo" here, we mean the IANA (aka "Olson") time zone database, which is ubiquitous on (at least) Linux: https://www.iana.org/time-zones So "will a wrapping of zoneinfo handle XYZ?" 
isn't so much a question about the wrapping as about what's in the IANA database. Best guess is that Creighton's rules are covered by that database's America/Winnipeg entries. It's generally true that the database makes no attempt to name every location on the planet. Instead it uses names of the form "general/specific" where "general" limits the scope to some large area of the Earth (here "America" really means "North America"), and "specific" names a well-known city within that area. For example, I live in Ashland, Wisconsin (extreme far north in that state, on Lake Superior), but so far as IANA is concerned my time zone rules are called "America/Chicago" (some 460 air miles SSE, in a different state). Just for fun, I'll paste in the comments from the Saskatchewan section of IANA's "northamerica" data file (a plain text source file from which binary tzfiles like America/Chicago and America/Winnipeg are generated). You'll see Creighton mentioned if you stay alert ;-) # Saskatchewan # From Mark Brader (2003-07-26): # The first actual adoption of DST in Canada was at the municipal # level. As the [Toronto] Star put it (1912-06-07), "While people # elsewhere have long been talking of legislation to save daylight, # the city of Moose Jaw [Saskatchewan] has acted on its own hook." # DST in Moose Jaw began on Saturday, 1912-06-01 (no time mentioned: # presumably late evening, as below), and would run until "the end of # the summer". The discrepancy between municipal time and railroad # time was noted. # From Paul Eggert (2003-07-27): # Willett (1914-03) notes that DST "has been in operation ... in the # City of Moose Jaw, Saskatchewan, for one year." # From Paul Eggert (2006-03-22): # Shanks & Pottenger say that since 1970 this region has mostly been as Regina. # Some western towns (e.g. Swift Current) switched from MST/MDT to CST in 1972. # Other western towns (e.g. Lloydminster) are like Edmonton. # Matthews and Vincent (1998) write that Denare Beach and Creighton # are like Winnipeg, in violation of Saskatchewan law. # From W. Jones (1992-11-06): # The. . .below is based on information I got from our law library, the # provincial archives, and the provincial Community Services department. # A precise history would require digging through newspaper archives, and # since you didn't say what you wanted, I didn't bother. # # Saskatchewan is split by a time zone meridian (105W) and over the years # the boundary became pretty ragged as communities near it reevaluated # their affiliations in one direction or the other. In 1965 a provincial # referendum favoured legislating common time practices. # # On 15 April 1966 the Time Act (c. T-14, Revised Statutes of # Saskatchewan 1978) was proclaimed, and established that the eastern # part of Saskatchewan would use CST year round, that districts in # northwest Saskatchewan would by default follow CST but could opt to # follow Mountain Time rules (thus 1 hour difference in the winter and # zero in the summer), and that districts in southwest Saskatchewan would # by default follow MT but could opt to follow CST. # # It took a few years for the dust to settle (I know one story of a town # on one time zone having its school in another, such that a mom had to # serve her family lunch in two shifts), but presently it seems that only # a few towns on the border with Alberta (e.g. Lloydminster) follow MT # rules any more; all other districts appear to have used CST year round # since sometime in the 1960s. 
# From Chris Walton (2006-06-26): # The Saskatchewan time act which was last updated in 1996 is about 30 pages # long and rather painful to read. # http://www.qp.gov.sk.ca/documents/English/Statutes/Statutes/T14.pdf From tim.peters at gmail.com Sun Sep 13 19:25:35 2015 From: tim.peters at gmail.com (Tim Peters) Date: Sun, 13 Sep 2015 12:25:35 -0500 Subject: [Datetime-SIG] Timeline arithmetic? In-Reply-To: <55F514B7.60004@oddbird.net> References: <55E9D6EB.2090108@oddbird.net> <55E9F626.1080906@oddbird.net> <55ECD82E.9070305@oddbird.net> <55EDB967.2050108@oddbird.net> <55F514B7.60004@oddbird.net> Message-ID: [Carl Meyer] > Well, sure. Of course it is possible to use "arithmetic on POSIX > timestamps" within an implementation of either kind of arithmetic, if > you try hard enough; I've never said anything to the contrary (that > would be a provably silly thing to say). """ >>>> Classic arithmetic may be many things, but the one thing it definitively is >>>> _not_ is "arithmetic on POSIX timestamps." """ """ >> Translation: "I refuse to countenance the possibility of Model A." """ And for "try hard enough" here, "hard enough" amounted to "trivial" ;-) > What your code does make clear is that if you convert from a DST-using > timezone to a POSIX timestamp, do "arithmetic on POSIX timestamps" and > then do a normal (what you would in any other context call a "correct") > conversion back to the first timezone afterwards, the result you get is > timeline arithmetic. How else can you do timelime arithmetic? Zones are _defined_ as offsets from UTC now. > Sure, if you do a specific sort of weird (what you would in any other > context call "wrong") conversion from the POSIX timestamp There are only two contexts: Model A and Model B. So your "any other context" means simply "Model A", and, yes, a Model B conversion looks "wrong" to your Model A eyes. It's equally true that a Model A conversion looks "wrong" to Model B eyes. The code shows concretely how arbitrary this choice is. It's just a difference in how POSIX timestamps are _labelled_. It has nothing to do with the low-level arithmetic itself. > back to the other timezone afterward, then you can get classic > arithmetic instead. I'm not sure what you think that demonstrates """ >>>> Classic arithmetic may be many things, but the one thing it definitively is >>>> _not_ is "arithmetic on POSIX timestamps." """ """ >> Translation: "I refuse to countenance the possibility of Model A." """ > I think it demonstrates that both timeline and classic arithmetic _can_ be > described in terms that include "arithmetic on POSIX timestamps," but > timeline arithmetic is much more naturally seen that way. To you, obviously. But _on its own_ (devoid of any imposed labellings), POSIX timestamp arithmetic is _solely_ arithmetic on seconds-counts in UTC. There is no distinction between classic and timeline arithmetic in UTC (or in any other fixed-offset zone). Classic arithmetic is no more than "let's just pretend our clock is already showing UTC, do the arithmetic, then stop pretending". By Occam's Razor, that's as "natural" as anything gets ;-) > Your original assertion was that "Classic arithmetic is equivalent to > doing integer arithmetic on integer POSIX timestamps" It is. So is timeline arithmetic. The difference is in labeling, not in the arithmetic. 
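As a concrete sketch of that claim (assuming dateutil's America/New_York wrapping and the 2015 fall-back; the variable names are only illustrative), the timestamp addition is the same single line for both results - only the final relabeling back into local time differs:

    from datetime import datetime, timedelta, timezone
    from dateutil import tz

    eastern = tz.gettz('America/New_York')
    start = datetime(2015, 10, 31, 12, 0, tzinfo=eastern)   # noon EDT

    ts = start.timestamp() + 24 * 3600        # the POSIX timestamp arithmetic itself

    # Relabel the very same timestamp two different ways:
    timeline = datetime.fromtimestamp(ts, eastern)            # honest conversion back
    classic = (datetime.fromtimestamp(ts, timezone.utc)       # pretend local was UTC,
               + start.utcoffset()).replace(tzinfo=eastern)   # then stop pretending

    print(timeline)   # 2015-11-01 11:00:00-05:00
    print(classic)    # 2015-11-01 12:00:00-05:00, same as start + timedelta(days=1)

Both results come from the identical `ts` value; which one you end up with is decided entirely by the relabeling step, not by the arithmetic.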
> as a justification for why datetime chose classic arithmetic, Sorry, I don't recall trying to "justify" that choice beyond noting that there _was_ a choice, and one was overwhelmingly better suited to Guido's novel "naive time" model, while best practice for the other was already established in C via converting to UTC and back (whether spelled via a UTC tzinfo or via POSIX timestamps). There was no agonizing over that decision: the best way to proceed was obvious _given that_ "naive time" was the primary model in mind. > implying that classic arithmetic is somehow _more_ or _more naturally_ > seen as "equivalent to integer arithmetic on integer POSIX timestamps" than > timeline arithmetic. I found that assertion puzzling, and I still do. To me, it's dead easy to implement either kind of higher-level arithmetic via POSIX timestamp arithmetic, although it's easi_est_ to implement classic via the "just pretend at both ends" trick - no conversions are actually needed on either end. > I'd still conclude the same thing I already said in an earlier reply: > > """ > So, "timeline arithmetic is just arithmetic on POSIX timestamps" means > viewing all aware datetimes as isomorphic to POSIX timestamps. You're missing here that there isn't a _unique_ isomorphism. The code concretely showed that, at the higher level of datetime arithmetic, you can get either timeline or classic arithmetic depending on _which_ isomorphism you pick. The isomorphism is about the labeling, not about the POSIX timestamp arithmetic. > "Classic arithmetic is just arithmetic on POSIX timestamps" means > viewing aware datetimes as naive datetimes which one can pretend are in > a hypothetical (maybe UTC, if you like) fixed-offset timezone which is > isomorphic to actual POSIX timestamps (even though their actual timezone > may not be fixed-offset). That's why I wanted to show code ;-) The entire distinction is in the single if/else clause at the end. It doesn't require piles of words. > I accept that those are both true and useful in the implementation of > their respective model. I just don't think either one is inherently > obvious or useful as a justification of their respective mental models; > rather, which one you find "obvious" just reveals your preferred mental > model. > """ I'm not trying to "justify" anything. I'm trying to say that "POSIX timestamp arithmetic" on its own says nothing about which kind of higher-level arithmetic one sees. That's in the lableling. Which labeling you need _becomes_ obvious only after you identify the higher-level model you want. >> That adds an aware datetime to a timedelta, doing either classic or >> timeline arithmetic depending on the optional flag. If you want to >> claim this doesn't do either kind of arithmetic correctly, prove it >> with a specific example > I'm not sure why you'd think I'd have any issue with that code, or any > desire to prove it wrong. """ >>>> Classic arithmetic may be many things, but the one thing it definitively is >>>> _not_ is "arithmetic on POSIX timestamps." """ """ >> Translation: "I refuse to countenance the possibility of Model A." """ [...] >> I believe you have _pictured_ the POSIX timestamp number line >> annotated with local calendar notations in your head, but those labels >> have nothing to do with the timestamp arithmetic. > It would be more accurate to say that a Model A view pictures only a > single timeline, which is physical (Newtonian) time. A point on that > timeline is an instant. 
Any given instant is annotated with any number > of labels, each one a unique and unambiguous description of that instant > in some labeling system. A labeling system can be very simple (e.g. > POSIX timestamps), less simple (proleptic Gregorian in UTC, or to a > lesser extent any fixed-offset timezone), or slightly ridiculous > (timezones with folds and gaps, where now we need a `fold` attribute or > an explicit offset at each instant or something similar to keep each > label unique and unambiguous). This mental model implies (and requires) > that all of these labeling systems are isomorphic to each other and to > the physical-time timeline, and that arithmetic in any of them is > isomorphic to arithmetic in any other (and is thus obviously timeline > arithmetic). Regardless, the "labels have nothing to do with the timestamp arithmetic". > Really my only point in this entire thread has been that this model > (contrary to some of the denigration of it on this mailing list) is > actually quite intuitive, not difficult to teach, and possible to do all > sorts of useful work in (_even_ when you have to also teach pytz's > unfortunate API for it). If you can agree with that - great, we're done > here. If you don't agree with that, we may as well still be done, > because I have too much personal experience suggesting it to be true for > you to be likely able to convince me otherwise :-) If that was indeed your only point, then yes - there was again no need for any of this ;-) > I've also come to recognize, through this thread, that Model B (where > the "local clock time in a given timezone" "timeline" is elevated to > sort-of-equal status with the physical timeline, rather than just > considered a weird complex labeling system for physical time) is also > useful (more useful for some tasks) and makes intuitive sense too. It does suffer the drawback of not matching how clocks in the real world actually behave ;-) [...] >> 1. The "as_utc -= ofs" line is theoretically impure, because it's >> treating a local time _as if_ it were a UTC time. There's no real way >> around that. We have to convert from local to UTC _somehow_, and >> POSIX dodges the issue by providing mktime() to do that "by magic". >> Here we're _inside_ the sausage factory, doing it ourselves. Some rat >> guts are visible at this level. If you look inside a C mktime() >> implementation, you'll find rat guts all over that too. > This seems like a really hand-wavy rationalization of an operation that > can only really be described as an incorrect timezone conversion. Perhaps you missed that "as_utc -= ofs" is _also_ needed to implement timeline arithmetic? In fact, it's not _necessary_ to get the effect of classic arithmetic. It is necessary to implement timeline arithmetic: zones are defined as offsets from UTC, and doing POSIX timestamp arithmetic _requires_ converting to UTC first. How else are you going to do that, other than by subtracting the zone's UTC offset to convert to UTC? > Of course that incorrect timezone conversion operation is useful for > implementing classic arithmetic in the way you've implemented it, but > taken out of that context it's just an incorrect conversion. Nonsense: it is exactly the conversion "you" need at the start to correctly convert to UTC in Model A. Unless you do that first, you can't use "POSIX timestamp arithmetic" at all. 
> The reason you _need_ that incorrect conversion is because for some
> reason you're really wanting to do your arithmetic in terms of POSIX
> timestamps

I needed it for two reasons.  First, to implement timeline arithmetic
using POSIX timestamps (a problem you seem to wish away by viewing the
labels you want as being _inherently_ attached to the POSIX timestamp
number line - but they're not - the only labels defined by POSIX are to
and from the proleptic Gregorian calendar viewed in UTC).

Second, to address your:

"""
>>>> Classic arithmetic may be many things, but the one thing it definitively is
>>>> _not_ is "arithmetic on POSIX timestamps."
"""

> (which are defined as being in UTC), but you don't _really_
> want correct conversion to UTC and back (because if you do that, you'll
> get timeline arithmetic).

As above, it's really Model A that needs that conversion.  Model B can
live without it (and, in the actual Python implementation of classic
arithmetic, doesn't bother with conversion on either end).

As to "correct" conversion, that depends on which model you intend to
implement.  The "right" conversion at the end is "wrong" for the other
model.

>> But it's no problem for Guido ;-)  We just set the hands on a UTC
>> clock to match the local clock, then move the hands on the UTC clock
>> by the amount the local clock is "ahead of" or "behind" UTC.  In that
>> way you can indeed picture the operation as being entirely "in UTC".

> Sure, you can, if you're motivated enough :-)

>> 2. This would be a foolish _implementation_ of classic arithmetic, but
>> not for semantic reasons.  It's just grossly inefficient.  Stare at
>> the code, and in the classic case it subtracts the UTC offset at first
>> only to add the same offset back later.  Those cancel out, so there's
>> no _semantic_ need to do either.  It's only excessive concern for
>> theoretical purity that could stop one from spelling it as
>>
>> return dt + td
>>
>> from the start.  That's technically absurd, since it's doing POSIX
>> timestamp arithmetic on a timestamp that's _not_ a UTC seconds count.
>> Its only virtue is that it gets the same answer far faster ;-)

> I actually think this implementation would be _less_ technically absurd.
> I'm not sure why you'd insist that any arithmetic on a count of seconds
> must be "POSIX timestamp arithmetic."

Because I was addressing _your_ claims about POSIX timestamp arithmetic, like:

"""
>>>> Classic arithmetic may be many things, but the one thing it definitively is
>>>> _not_ is "arithmetic on POSIX timestamps."
"""

To address that specific claim, I stuck solely to "arithmetic on POSIX
timestamps".

> In this case you're just doing integer arithmetic on a naive count of seconds
> since some point in the local timezone clock, rather than on a count of
> seconds in UTC. That's a much more natural way to view classic arithmetic,
> and also happens to be the way datetime actually does it (where "some
> point" is datetime(1, 1, 1)).

It can be viewed either way.  A count of microseconds since
0001-01-01 00:00:00 is certainly more natural given knowledge of Python
internals, but it's just a linear transformation between that notion and
viewing it as a POSIX timestamp instead.  As shown before, that's why "by
hand" code to convert a UTC datetime to or from a POSIX timestamp (either
integer or floating) is so trivial to write.
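That "by hand" conversion is small enough to spell out; here is a sketch
(helper names invented for this example), assuming an aware UTC datetime
on one side and a POSIX timestamp on the other:

    from datetime import datetime, timedelta, timezone

    EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

    def utc_to_timestamp(dt_utc):
        # aware UTC datetime -> POSIX timestamp as a float
        # (use // timedelta(seconds=1) instead for an integer count)
        return (dt_utc - EPOCH) / timedelta(seconds=1)

    def timestamp_to_utc(ts):
        # POSIX timestamp (int or float seconds) -> aware UTC datetime
        return EPOCH + timedelta(seconds=ts)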
From tim.peters at gmail.com Sun Sep 13 21:00:33 2015 From: tim.peters at gmail.com (Tim Peters) Date: Sun, 13 Sep 2015 14:00:33 -0500 Subject: [Datetime-SIG] Are there any "correct" implementations of tzinfo? In-Reply-To: <201509131600.t8DG07e0025688@fido.openend.se> References: <1442085362.324875.381920729.5E7A6DCE@webmail.messagingengine.com> <201509131224.t8DCOXHO004891@fido.openend.se> <201509131600.t8DG07e0025688@fido.openend.se> Message-ID: [Tim] >> Hi, Laura! By "zoneinfo" here, we mean the IANA (aka "Olson") time >> zone database, which is ubiquitous on (at least) Linux: >> >> https://www.iana.org/time-zones >> >>So "will a wrapping of zoneinfo handle XYZ?" isn't so much a question >>about the wrapping as about what's in the IANA database. [Laura] > Then we had better be able to override it when it is wrong. Anyone can write their own tzinfo implementing any rules they like, and nobody is required to use anyone else's tzinfos. That said, zoneinfo is the most extensive collection of time zone info there is, so most people will probably use only that. And that said, zoneinfo is inordinately concerned with recording highly dubious guesses about what "the rules" were even over a century ago. Most people would probably be happy with a tzinfo that _only_ knew what "today's rules" are. POSIX TZ rules give a simple way to spell exactly that. Simple, but annoyingly cryptic. Gustavo's `dateutil` already supplies a way to magically build a tzinfo implementing a zone specified by a POSIX TZ rule string. More obvious ways to spell that are surely possible (like, for example, the obvious ways). Patches welcome ;-) >> Best guess is that Creighton's rules are covered by that database's >> America/Winnipeg entries. >> >> # Saskatchewan >> # Other western towns (e.g. Lloydminster) are like Edmonton. >> # Matthews and Vincent (1998) write that Denare Beach and Creighton >> # are like Winnipeg, in violation of Saskatchewan law. > I think that this will work. > Creighton is just across the border from Flin Flan, Manitoba. Indeed I think > the problem of 'drunken people from Manitoba trying to get one hours more > drinking done and being a menace on the highway' may have fueled the > 'we are going to have DST in violation of the law' movement in Creighton. :-) > But I am not sure how it is that a poor soul who just wants to print a > railway schedule 'in local time' is supposed to know that Creighton is > using Winnipeg time. I'm not sure how that poor soul would get a railway schedule manipulable in Python to begin with ;-) If it's "a problem" for "enough" users of a computer system, a Linux admin could simply make "America/Creighton" a link to the "America/Winnipeg" tzfile. But doing that for every nameable place on Earth might be considered annoying. To cover everyone, you may even need to specify a street address within "a city": http://www.quora.com/Are-there-any-major-cities-divided-by-two-time-zones Blame politicians for this. I can assure you Guido is not responsible for creating this mess ;-) From tim.peters at gmail.com Sun Sep 13 22:13:53 2015 From: tim.peters at gmail.com (Tim Peters) Date: Sun, 13 Sep 2015 15:13:53 -0500 Subject: [Datetime-SIG] Are there any "correct" implementations of tzinfo? 
In-Reply-To: <201509131940.t8DJe36w015280@fido.openend.se> References: <1442085362.324875.381920729.5E7A6DCE@webmail.messagingengine.com> <201509131224.t8DCOXHO004891@fido.openend.se> <201509131600.t8DG07e0025688@fido.openend.se> <201509131940.t8DJe36w015280@fido.openend.se> Message-ID: [Laura] >>> But I am not sure how it is that a poor soul who just wants to print a >>> railway schedule 'in local time' is supposed to know that Creighton is >>> using Winnipeg time. [Tim] >> I'm not sure how that poor soul would get a railway schedule >> manipulable in Python to begin with ;-) [Laura] > Via Rail will give you a schedule when you book your tickets. But I > am wrong, it gives it to you in local time, which you can scrape or > even use the via rail api. So it is the person getting off in > Creighton who wants to tell his relatives back in Halifax what > time he is arriving (in their time) (so they can call him and > avoid the hellish hotel surtax on long distance calls) who will > have the problem. Whatever time zone the traveler's railroad schedule uses, so long as it sticks to just one the traveler subtracts the departure time from the arrival time to determine how long the trip takes. They add that to the Halifax time at which they depart, and tell their Halifax relatives the result. They don't need to know anything about the destination's time zone to do this, unless a daylight transition occurs between departure and arrival, and the schedule itself remembered to account for it. In which case, pragmatically, they can just add an hour "to be safe" ;-) > And this is the sort of use case I think we will see a lot of. But there's nothing new here: datetime has been around for a dozen years already, and nobody is proposing to add any new basic functionality to tzinfos. PEP 495 is only about adding a flag to allow correct conversion of ambiguous local times (typically at the end of DST, when the local clock repeats a span of times) to UTC. So if this were a popular use case, I expect we would already have heard of it. Note that Python zoneinfo wrappings are already available via, at least, the pytz and dateutil packages. From tim.peters at gmail.com Sun Sep 13 23:58:09 2015 From: tim.peters at gmail.com (Tim Peters) Date: Sun, 13 Sep 2015 16:58:09 -0500 Subject: [Datetime-SIG] Are there any "correct" implementations of tzinfo? In-Reply-To: <201509132031.t8DKVTwJ028027@fido.openend.se> References: <1442085362.324875.381920729.5E7A6DCE@webmail.messagingengine.com> <201509131224.t8DCOXHO004891@fido.openend.se> <201509131600.t8DG07e0025688@fido.openend.se> <201509132031.t8DKVTwJ028027@fido.openend.se> Message-ID: [Tim] >> Whatever time zone the traveler's railroad schedule uses, so long as >> it sticks to just one [Laura] > This is what does not happen. Which is why I have written a python > app to perform conversions for my parents, in the past. So how did they get the right time zone rules for Creighton? >>But there's nothing new here: datetime has been around for a dozen >>years already, and nobody is proposing to add any new basic >>functionality to tzinfos. PEP 495 is only about adding a flag to >>allow correct conversion of ambiguous local times (typically at the >>end of DST, when the local clock repeats a span of times) to UTC. So >>if this were a popular use case, I expect we would already have heard >>of it. Note that Python zoneinfo wrappings are already available via, >>at least, the pytz and dateutil packages. > I am a happy user of pytz. 
> On the other hand, I think this means that
> my brain has gone through some sort of non-reversible transformation
> which makes me accurate, but not exactly sane on the issue.

pytz made some strange decisions, from the POV of datetime's intended
tzinfo design.  But it also solved a problem datetime left hanging: how
to disambiguate ambiguous local times.  The _intended_ way to model zones
with UTC offset transitions was via what the docs call a "hybrid" tzinfo:
a single object smart enough on its own to figure out, e.g., whether a
datetime's date and time are in "daylight" or "standard" time.  However,
there's currently no way for such a tzinfo to know whether an ambiguous
local time is intended to be the earlier or the later of repeated times.
PEP 495 aims to plug that hole.

pytz solves it by _never_ creating a hybrid tzinfo.  It only uses
eternally-fixed-offset tzinfos.  For example, for a conceptual zone with
two possible total UTC offsets (one for "daylight", one for "standard"),
there are two distinct eternally-fixed-offset tzinfo objects in pytz.
Then an ambiguous time is resolved by _which_ specific tzinfo object is
attached.  Typically the "daylight" tzinfo for the first time a repeated
local time appears, and the "standard" tzinfo for its second appearance.

In return, you have to use .localize() and .normalize() at various times,
because pytz's tzinfo objects themselves are completely blind to the
possibility of the total UTC offset changing.  .localize() and
.normalize() are needed to possibly _replace_ the tzinfo object in use,
depending on the then-current date and time.

OTOH, `dateutil` does create hybrid tzinfo objects.  No dances are ever
needed to possibly replace them.  But it's impossible for dateutil's
tzinfos to disambiguate times in a fold.  Incidentally, dateutil also
makes no attempt to account for transitions other than DST (e.g.,
sometimes a zone may change its _base_ ("standard") offset from UTC).

So, yup, if you're thoroughly indoctrinated in pytz behavior, you will be
accurate but appear insane to Guido ;-)  At a semantic level, a pytz
tzinfo doesn't capture the notion of a zone with offset changes - it
doesn't even try to.  All knowledge about offset changes is inside the
.localize() and .normalize() dances.

> I think I have misunderstood Alexander Belopolsky as saying that
> datetime had functionality which I don't think it has. Thus I thought
> we must be planning to add some functionality here. Sorry about this.

Guido told Alex to stop saying that ;-)

You can already get eternally-fixed-offset classes, like pytz does, on
(at least) Linux systems by setting os.environ['TZ'] and then exploiting
that .astimezone() without an argument magically synthesizes an
eternally-fixed-offset tzinfo for the current total UTC offset of "the
system zone" (which the TZ envar specifies).  That's not really
comparable to what pytz does, except at a level that makes a lot of sense
in theory but not much at all in practice ;-)

> However, people do need to be aware, if they are not already, that
> people with 3 times in 3 different tz will want to sort them. Telling
> them that they must convert them to UTC before they do so is, in my
> opinion, a very fine idea. Expecting them to work this out by themselves
> via an assertion that the comparison operator is not transitive, is,
> I think, asking a lot of them.

Of course.  Note that it's _not_ a problem in pytz, though: there are no
sorting (or transitivity) problems if the only tzinfos you ever use have
eternally fixed UTC offsets.
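For readers who haven't used pytz, a short sketch of the
.localize()/.normalize() dance described above (illustrative only; the
zone and times are chosen arbitrarily):

    import pytz
    from datetime import datetime, timedelta

    eastern = pytz.timezone('US/Eastern')

    # .localize() attaches the fixed-offset tzinfo (EDT here) appropriate
    # for this particular local clock reading.
    dt = eastern.localize(datetime(2015, 11, 1, 0, 30))

    # Plain datetime arithmetic leaves the old fixed-offset tzinfo attached,
    # even though the result has crossed the fall-back transition...
    later = dt + timedelta(hours=2)

    # ...so .normalize() is needed to swap in the now-correct fixed-offset
    # tzinfo (EST), adjusting the local clock reading to match.
    later = eastern.normalize(later)
    print(later)   # 2015-11-01 01:30:00-05:00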
There are no gaps or folds then, and everything works in an utterly obvious way - except that you have to keep _replacing_ tzinfos when they become inappropriate for the current dates and times in the datetimes they're attached to. From guido at python.org Mon Sep 14 00:21:45 2015 From: guido at python.org (Guido van Rossum) Date: Sun, 13 Sep 2015 15:21:45 -0700 Subject: [Datetime-SIG] Are there any "correct" implementations of tzinfo? In-Reply-To: <201509131224.t8DCOXHO004891@fido.openend.se> References: <1442085362.324875.381920729.5E7A6DCE@webmail.messagingengine.com> <201509131224.t8DCOXHO004891@fido.openend.se> Message-ID: On Sun, Sep 13, 2015 at 5:24 AM, Laura Creighton wrote: > My question, is whether it will handle Creighton, Saskatchewan, Canada? > Creighton is an odd little place. Like all of Saskatchewan, it is > in the Central time zone, even though you would expect it to be > in the Mountain time zone based on its location on the globe. > The people of Saskatchewan have decided not to adopt Daylight > Savings time. Except for the people of Creighton (and > nearby Denare Beach) -- who _do_ observe Daylight savings time. > > makes for an interesting corner case, one that I remember for > personal (and not economic, or professional) reasons. > Hi Laura! Wouldn't it be sufficient for people in Creighton to set their timezone to US/Central? IIUC the Canadian DST rules are the same as the US ones. Now, the question may remain how do people know what to set their timezone to. But neither pytz nor datetime can help with that -- it is up to the sysadmin. -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.peters at gmail.com Mon Sep 14 02:13:19 2015 From: tim.peters at gmail.com (Tim Peters) Date: Sun, 13 Sep 2015 19:13:19 -0500 Subject: [Datetime-SIG] Are there any "correct" implementations of tzinfo? In-Reply-To: References: <1442085362.324875.381920729.5E7A6DCE@webmail.messagingengine.com> <201509131224.t8DCOXHO004891@fido.openend.se> Message-ID: [Guido] > Wouldn't it be sufficient for people in Creighton to set their timezone to > US/Central? IIUC the Canadian DST rules are the same as the US ones. Now, > the question may remain how do people know what to set their timezone to. > But neither pytz nor datetime can help with that -- it is up to the > sysadmin. As Laura's use case evolved, it seems it was more that a train traveler from Halifax to Creighton wants to tell their Halifax relatives when they'll arrive in Creighton, but (of course) expressed in Halifax time. Nobody in this case knows anything about Creighton's rules, except the traveler may be staring at a train schedule giving arrival in Creighton time anyway. While this may be beyond pytz's wizardy, nothing is too hard for datetime ;-) datetime.timezone.setcontext("datetime-sig messages from mid-Sep 2015") arrivaltime = datetime.strptime(scraped_arrival_time, "") arrivaltime = datetime.replace(arrivaltime, tzinfo=gettz("Context/Creighton")) print(arrivaltime.astimezone(gettz("Context/Halifax")) From alexander.belopolsky at gmail.com Mon Sep 14 05:54:42 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Sun, 13 Sep 2015 23:54:42 -0400 Subject: [Datetime-SIG] Are there any "correct" implementations of tzinfo? 
In-Reply-To: References: <1442085362.324875.381920729.5E7A6DCE@webmail.messagingengine.com> <201509131224.t8DCOXHO004891@fido.openend.se> Message-ID: On Sun, Sep 13, 2015 at 6:21 PM, Guido van Rossum wrote: > > Now, the question may remain how do people know what to set their timezone to. But neither pytz nor datetime can help with that -- it is up to the sysadmin. Note that this question is also out of the scope of "tzdist", IETF Time Zone Data Distribution Service Working Group: """ The following are Out of scope for the working group: ... - Lookup protocols or APIs to map a location to a time zone. """ I am not aware of any effort to develop such service. On the other hand, stationary ISPs have means to distribute TZ information to the hosts. See for example, RFC 4833 ("Timezone Options for DHCP"). -------------- next part -------------- An HTML attachment was scrubbed... URL: From random832 at fastmail.com Mon Sep 14 21:13:16 2015 From: random832 at fastmail.com (Random832) Date: Mon, 14 Sep 2015 15:13:16 -0400 Subject: [Datetime-SIG] Are there any "correct" implementations of tzinfo? In-Reply-To: References: <1442085362.324875.381920729.5E7A6DCE@webmail.messagingengine.com> <201509131224.t8DCOXHO004891@fido.openend.se> <201509131600.t8DG07e0025688@fido.openend.se> <201509132031.t8DKVTwJ028027@fido.openend.se> <201509140827.t8E8RPqb001076@fido.openend.se> Message-ID: <1442257996.253100.383441705.7A0986C7@webmail.messagingengine.com> On Mon, Sep 14, 2015, at 14:53, Tim Peters wrote: > So, on your own machine, whenever daylight time starts or ends, you > manually change your TZ environment variable to specify the newly > appropriate eternally-fixed-offset zone? Of course not. No, but the hybrid zone isn't what gets attached to the individual struct tm value when you convert a time from utc (or from a POSIX timestamp) to a timezone local value. A single fixed utc offset is (along with the name and, yes, isdst flag). And pytz doesn't involve manually changing anything, it involves (as best it can) automatically applying the value to attach to each individual datetime value. > A datetime object is the Python spelling of a C struct tm, but never > included the tm_isdst flag. And no-one behind this proposal seems to be contemplating adding an equivalent to tm_gmtoff, despite that it would serve the same disambiguation purpose and make it much cheaper to maintain global invariants like a sort order according to the UTC value (No, I don't *care* how that's not how it's defined, it is *in fact* true for the UTC value that you will ever actually get from converting the values to UTC *today*, and it's the only total ordering that actually makes any sense) From alexander.belopolsky at gmail.com Mon Sep 14 21:25:55 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Mon, 14 Sep 2015 15:25:55 -0400 Subject: [Datetime-SIG] Are there any "correct" implementations of tzinfo? 
In-Reply-To: <1442257996.253100.383441705.7A0986C7@webmail.messagingengine.com> References: <1442085362.324875.381920729.5E7A6DCE@webmail.messagingengine.com> <201509131224.t8DCOXHO004891@fido.openend.se> <201509131600.t8DG07e0025688@fido.openend.se> <201509132031.t8DKVTwJ028027@fido.openend.se> <201509140827.t8E8RPqb001076@fido.openend.se> <1442257996.253100.383441705.7A0986C7@webmail.messagingengine.com> Message-ID: On Mon, Sep 14, 2015 at 3:13 PM, Random832 wrote: > (No, I don't > *care* how that's not how it's defined, it is *in fact* true for the UTC > value that you will ever actually get from converting the values to UTC > *today*, and it's the only total ordering that actually makes any sense) > This is a fine attitude when you implement your own brand new datetime library. As an author of a new library you have freedoms that developers of a 12 years old widely deployed code don't have. -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.peters at gmail.com Mon Sep 14 21:30:58 2015 From: tim.peters at gmail.com (Tim Peters) Date: Mon, 14 Sep 2015 14:30:58 -0500 Subject: [Datetime-SIG] Are there any "correct" implementations of tzinfo? In-Reply-To: <1442257996.253100.383441705.7A0986C7@webmail.messagingengine.com> References: <1442085362.324875.381920729.5E7A6DCE@webmail.messagingengine.com> <201509131224.t8DCOXHO004891@fido.openend.se> <201509131600.t8DG07e0025688@fido.openend.se> <201509132031.t8DKVTwJ028027@fido.openend.se> <201509140827.t8E8RPqb001076@fido.openend.se> <1442257996.253100.383441705.7A0986C7@webmail.messagingengine.com> Message-ID: [Tim] >> So, on your own machine, whenever daylight time starts or ends, you >> manually change your TZ environment variable to specify the newly >> appropriate eternally-fixed-offset zone? Of course not. [Random832 ] > No, but the hybrid zone isn't what gets attached to the individual > struct tm value when you convert a time from utc (or from a POSIX > timestamp) to a timezone local value. A single fixed utc offset is > (along with the name and, yes, isdst flag). You're assuming much more than POSIX - and the ISO C standard - requirs. My description was quite explicitly about how POSIX has done it all along. tm_gmtoff and tm_zone are extensions to the standards, introduced (IIRC) by BSD. Portable code (including Python's implementation) can't assume they're available. > And pytz doesn't involve manually changing anything, it involves (as > best it can) automatically applying the value to attach to each > individual datetime value. .normalize() is a manual step. It doesn't invoke itself by magic (although I believe Stuart would like Python to add internal hooks so pytz _could_ get it invoked by magic). >> A datetime object is the Python spelling of a C struct tm, but never >> included the tm_isdst flag. > And no-one behind this proposal seems to be contemplating adding an > equivalent to tm_gmtoff, It was off the table because, for backward compatibility, we need to mess with the pickle format as little as possible. It's vital that datetimes obtained from old pickles continue to work fine, and that pickles obtained from new datetime objects work fine when loaded by older Pythons unless they actually require the new fold=1 possibility. > despite that it would serve the same disambiguation purpose and > make it much cheaper to maintain global invariants like a sort order > according to the UTC value It would be nice to have! 
.utcoffset() is an expensive operation as-is, and being able to rely on tm_gmtoff would make that dirt-cheap instead. > (No, I don't *care* how that's not how it's defined, ? How what is defined?: > it is *in fact* true for the UTC value that you will ever actually get > from converting the values to UTC *today*, and it's the only total > ordering that actually makes any sense) Well, you lost me there. In a post-495 world, conversion to UTC will work correctly in all cases. It cannot today.; From alexander.belopolsky at gmail.com Mon Sep 14 21:44:25 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Mon, 14 Sep 2015 15:44:25 -0400 Subject: [Datetime-SIG] Are there any "correct" implementations of tzinfo? In-Reply-To: References: <1442085362.324875.381920729.5E7A6DCE@webmail.messagingengine.com> <201509131224.t8DCOXHO004891@fido.openend.se> <201509131600.t8DG07e0025688@fido.openend.se> <201509132031.t8DKVTwJ028027@fido.openend.se> <201509140827.t8E8RPqb001076@fido.openend.se> <1442257996.253100.383441705.7A0986C7@webmail.messagingengine.com> Message-ID: On Mon, Sep 14, 2015 at 3:30 PM, Tim Peters wrote: > > make it much cheaper to maintain global invariants like a sort order > > according to the UTC value > > It would be nice to have! .utcoffset() is an expensive operation > as-is, and being able to rely on tm_gmtoff would make that dirt-cheap > instead. If it is just a question of optimization, datetime objects can be extended to cache utcoffset. Note that PyPy have recently added caching of the hash values in datetime objects. I merged their changes in our datetime.py, but it did not look like C implementation would benefit from it as much as pure python did. I expect something similar from caching utcoffset: a measurable improvement for tzinfos implemented in Python and a wash for those implemented in C. (A more promising optimization approach is to define a C API for tzinfo interface.) -------------- next part -------------- An HTML attachment was scrubbed... URL: From alexander.belopolsky at gmail.com Mon Sep 14 21:48:08 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Mon, 14 Sep 2015 15:48:08 -0400 Subject: [Datetime-SIG] Are there any "correct" implementations of tzinfo? In-Reply-To: <1442259852.259192.383467881.5156BE88@webmail.messagingengine.com> References: <1442085362.324875.381920729.5E7A6DCE@webmail.messagingengine.com> <201509131224.t8DCOXHO004891@fido.openend.se> <201509131600.t8DG07e0025688@fido.openend.se> <201509132031.t8DKVTwJ028027@fido.openend.se> <201509140827.t8E8RPqb001076@fido.openend.se> <1442257996.253100.383441705.7A0986C7@webmail.messagingengine.com> <1442259852.259192.383467881.5156BE88@webmail.messagingengine.com> Message-ID: On Mon, Sep 14, 2015 at 3:44 PM, Random832 wrote: > It is an > invariant that is true today, and therefore which you can't rely on any > of the consumers of this 12 years old widely deployed code not to assume > will remain true. > Sorry, this sentence does not parse. You are missing a "not" somewhere. -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.peters at gmail.com Mon Sep 14 21:49:53 2015 From: tim.peters at gmail.com (Tim Peters) Date: Mon, 14 Sep 2015 14:49:53 -0500 Subject: [Datetime-SIG] Are there any "correct" implementations of tzinfo? 
In-Reply-To: References: <1442085362.324875.381920729.5E7A6DCE@webmail.messagingengine.com> <201509131224.t8DCOXHO004891@fido.openend.se> <201509131600.t8DG07e0025688@fido.openend.se> <201509132031.t8DKVTwJ028027@fido.openend.se> <201509140827.t8E8RPqb001076@fido.openend.se> <1442257996.253100.383441705.7A0986C7@webmail.messagingengine.com> Message-ID: [Tim] >> It would be nice to have! .utcoffset() is an expensive operation >> as-is, and being able to rely on tm_gmtoff would make that dirt-cheap >> instead. [Alex] > If it is just a question of optimization, Yes. If it's more than just that, then 495 doesn't actually solve the problem of getting the correct UTC offset in all cases. > datetime objects can be extended to cache utcoffset. Note that PyPy > have recently added caching of the hash values in datetime objects. I > merged their changes in our datetime.py, but it did not look like C > implementation would benefit from it as much as pure python did. I > expect something similar from caching utcoffset: a measurable > improvement for tzinfos implemented in Python and a wash for those > implemented in C. (A more promising optimization approach is to define a C > API for tzinfo interface.) There's no answer to this. It depends on how expensive .utcoffset() is, which in turn depends on how the tzinfo author implements it. I don't care now fast it is. But, even if I did, "premature optimization" applies at this time ;-) From random832 at fastmail.com Mon Sep 14 21:44:12 2015 From: random832 at fastmail.com (Random832) Date: Mon, 14 Sep 2015 15:44:12 -0400 Subject: [Datetime-SIG] Are there any "correct" implementations of tzinfo? In-Reply-To: References: <1442085362.324875.381920729.5E7A6DCE@webmail.messagingengine.com> <201509131224.t8DCOXHO004891@fido.openend.se> <201509131600.t8DG07e0025688@fido.openend.se> <201509132031.t8DKVTwJ028027@fido.openend.se> <201509140827.t8E8RPqb001076@fido.openend.se> <1442257996.253100.383441705.7A0986C7@webmail.messagingengine.com> Message-ID: <1442259852.259192.383467881.5156BE88@webmail.messagingengine.com> On Mon, Sep 14, 2015, at 15:25, Alexander Belopolsky wrote: > This is a fine attitude when you implement your own brand new datetime > library. As an author of a new library you have freedoms that developers > of a 12 years old widely deployed code don't have. I'm talking about the real behavior of datetime as it exists *today*, and has existed for the past 12 years, before any of this "add fold flag but sort 2:15 fold1 before 2:45 fold0" nonsense gets in. It is an invariant that is true today, and therefore which you can't rely on any of the consumers of this 12 years old widely deployed code not to assume will remain true. Enforcing an invariant that all ordering is done according to UTC timestamps would not break any backward compatibility, because there is not a *single* pair of timestamps that can be represented today with any *remotely* plausible tzinfo whose order is different from that. For that matter, a tzinfo where two possible values for fold aren't sufficient to disambiguate timestamps is *more* plausible than one where the naive ordering of any two non-fold timestamps is reversed from the UTC order, yet that case apparently isn't being considered. From random832 at fastmail.com Mon Sep 14 21:58:34 2015 From: random832 at fastmail.com (Random832) Date: Mon, 14 Sep 2015 15:58:34 -0400 Subject: [Datetime-SIG] Are there any "correct" implementations of tzinfo? 
In-Reply-To: References: <1442085362.324875.381920729.5E7A6DCE@webmail.messagingengine.com> <201509131224.t8DCOXHO004891@fido.openend.se> <201509131600.t8DG07e0025688@fido.openend.se> <201509132031.t8DKVTwJ028027@fido.openend.se> <201509140827.t8E8RPqb001076@fido.openend.se> <1442257996.253100.383441705.7A0986C7@webmail.messagingengine.com> Message-ID: <1442260714.263025.383475777.4728D768@webmail.messagingengine.com> On Mon, Sep 14, 2015, at 15:30, Tim Peters wrote: > You're assuming much more than POSIX - and the ISO C standard - > requirs. My description was quite explicitly about how POSIX has done > it all along. tm_gmtoff and tm_zone are extensions to the standards, > introduced (IIRC) by BSD. Portable code (including Python's > implementation) can't assume they're available. No, but that doesn't mean it's not in fact true (what was under discussion was "your own machine", not "a minimal POSIX implementation"). And it doesn't mean it's not a best practice that python can and should copy. I'm not talking about *using* it, I'm talking about working the same way independently, so this has nothing to do with assuming it's available. > It was off the table because, for backward compatibility, we need to > mess with the pickle format as little as possible. It's vital that > datetimes obtained from old pickles continue to work fine, and that > pickles obtained from new datetime objects work fine when loaded by > older Pythons unless they actually require the new fold=1 possibility. I don't see how this would prevent that. Aware datetimes have a tzinfo *right there* that can be asked for a value to populate utcoffset with if there isn't a pickled one. > > (No, I don't *care* how that's not how it's defined, > > ? How what is defined?: Just trying, unsuccessfully apparently, to head off the "no, it's defined as working the same as a naive datetime if the tzinfo values are the same" argument that got brought up the *last* time I made this claim. > > it is *in fact* true for the UTC value that you will ever actually get > > from converting the values to UTC *today*, and it's the only total > > ordering that actually makes any sense) > > Well, you lost me there. In a post-495 world, conversion to UTC will > work correctly in all cases. It cannot today.; It'll provide *a* value in all cases. The sort order today is equivalent to using that value in all cases unless you've got a pathological tzinfo specifically crafted to break it. I think that's an important enough invariant to be worth keeping, since it is the only possible way to provide a total order in the presence of interzone comparisons. From alexander.belopolsky at gmail.com Mon Sep 14 22:01:03 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Mon, 14 Sep 2015 16:01:03 -0400 Subject: [Datetime-SIG] Are there any "correct" implementations of tzinfo? In-Reply-To: References: <1442085362.324875.381920729.5E7A6DCE@webmail.messagingengine.com> <201509131224.t8DCOXHO004891@fido.openend.se> <201509131600.t8DG07e0025688@fido.openend.se> <201509132031.t8DKVTwJ028027@fido.openend.se> <201509140827.t8E8RPqb001076@fido.openend.se> <1442257996.253100.383441705.7A0986C7@webmail.messagingengine.com> Message-ID: On Mon, Sep 14, 2015 at 3:49 PM, Tim Peters wrote: > It depends on how expensive .utcoffset() > is, which in turn depends on how the tzinfo author implements it. > No, it does not. In most time zones, UTC offset in seconds can be computed by C code as a 4-byte integer faster than CPython can look up the .utcoffset method. 
(At least for times within a few years around now.) A programmer who makes it slower should be fired. Yet I agree, "'premature optimization' applies at this time." -------------- next part -------------- An HTML attachment was scrubbed... URL: From random832 at fastmail.com Mon Sep 14 22:08:50 2015 From: random832 at fastmail.com (Random832) Date: Mon, 14 Sep 2015 16:08:50 -0400 Subject: [Datetime-SIG] Are there any "correct" implementations of tzinfo? In-Reply-To: References: <1442085362.324875.381920729.5E7A6DCE@webmail.messagingengine.com> <201509131224.t8DCOXHO004891@fido.openend.se> <201509131600.t8DG07e0025688@fido.openend.se> <201509132031.t8DKVTwJ028027@fido.openend.se> <201509140827.t8E8RPqb001076@fido.openend.se> <1442257996.253100.383441705.7A0986C7@webmail.messagingengine.com> <1442259852.259192.383467881.5156BE88@webmail.messagingengine.com> Message-ID: <1442261330.265134.383487745.4E9C3005@webmail.messagingengine.com> On Mon, Sep 14, 2015, at 15:48, Alexander Belopolsky wrote: > On Mon, Sep 14, 2015 at 3:44 PM, Random832 > wrote: > > > It is an > > invariant that is true today, and therefore which you can't rely on any > > of the consumers of this 12 years old widely deployed code not to assume > > will remain true. > > > > Sorry, this sentence does not parse. You are missing a "not" somewhere. Nope. I am asserting that: This invariant is true today. Therefore, it is likely that at least some consumers of datetime will assume it is true. Therefore, you cannot rely on there not being any consumers which assume it will remain true. It's awkward, since when I go back to analyze it it turns out that the "not" after 'code' actually technically modifies "any" earlier in the sentence, but the number of negatives is correct. (Though, it actually works out even without that change, since the question of *which* consumers rely on the invariant is unknown.) From tim.peters at gmail.com Mon Sep 14 22:15:35 2015 From: tim.peters at gmail.com (Tim Peters) Date: Mon, 14 Sep 2015 15:15:35 -0500 Subject: [Datetime-SIG] Are there any "correct" implementations of tzinfo? In-Reply-To: <1442260714.263025.383475777.4728D768@webmail.messagingengine.com> References: <1442085362.324875.381920729.5E7A6DCE@webmail.messagingengine.com> <201509131224.t8DCOXHO004891@fido.openend.se> <201509131600.t8DG07e0025688@fido.openend.se> <201509132031.t8DKVTwJ028027@fido.openend.se> <201509140827.t8E8RPqb001076@fido.openend.se> <1442257996.253100.383441705.7A0986C7@webmail.messagingengine.com> <1442260714.263025.383475777.4728D768@webmail.messagingengine.com> Message-ID: [Random832 ] Whether or not datetimes stored tm_gmtoff and tm_zone workalikes has no effect on semantics I can see. If, in your view, they're purely an optimization, they're just a distraction for now. If you're proposing to add them _instead_ of adding `fold`, no, that can't work, for the pickle compatibility reasons already explained. Whether something is in a fold needs to preserved across pickling, but "almost all" pickles need to be readable by older Pythons too. This is doable adding one bit, but not doable at all if we need to add arbitrary timedelta and string objects _instead_ of that bit. ... >>> (No, I don't *care* how that's not how it's defined, >> ? How what is defined?: > Just trying, unsuccessfully apparently, to head off the "no, it's > defined as working the same as a naive datetime if the tzinfo values are > the same" argument that got brought up the *last* time I made this > claim. Sorry, I still don't know what this is about. 
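As an aside on the pickle point above, one way to picture "adding one
bit" without growing the fixed-size payload (an illustration only, not
necessarily the encoding PEP 495 ends up using): the month byte of the
packed payload only ever holds 1..12, so its high bit is free.

    def pack_month(month, fold):
        # fold=0 payloads stay byte-for-byte identical to what older
        # Pythons produce and accept; only the rare fold=1 payloads
        # become unreadable to them.
        return month + 128 if fold else month

    def unpack_month(byte):
        # returns (month, fold)
        return (byte - 128, 1) if byte > 127 else (byte, 0)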
>>> it is *in fact* true for the UTC value that you will ever actually get >>> from converting the values to UTC *today*, and it's the only total >>> ordering that actually makes any sense) >> Well, you lost me there. In a post-495 world, conversion to UTC will >> work correctly in all cases. It cannot today.; > It'll provide *a* value in all cases. It will provide the correct UTC offset in all cases. > The sort order today is equivalent to using that value in all > cases unless you've got a pathological tzinfo > specifically crafted to break it. I think that's an important enough > invariant to be worth keeping, since it is the only possible way to > provide a total order in the presence of interzone comparisons. Show some code? I don't know what you're talking about. It is true that the earlier and later of an ambiguous time in a fold will compare equal in their own zone, but compare not equal after conversion to UTC (or to any other zone in which they're not in one of the latter zone's folds). Is that what you're talking about? From tim.peters at gmail.com Mon Sep 14 22:22:50 2015 From: tim.peters at gmail.com (Tim Peters) Date: Mon, 14 Sep 2015 15:22:50 -0500 Subject: [Datetime-SIG] Are there any "correct" implementations of tzinfo? In-Reply-To: References: <1442085362.324875.381920729.5E7A6DCE@webmail.messagingengine.com> <201509131224.t8DCOXHO004891@fido.openend.se> <201509131600.t8DG07e0025688@fido.openend.se> <201509132031.t8DKVTwJ028027@fido.openend.se> <201509140827.t8E8RPqb001076@fido.openend.se> <1442257996.253100.383441705.7A0986C7@webmail.messagingengine.com> Message-ID: [Tim] >> It depends on how expensive .utcoffset() >> is, which in turn depends on how the tzinfo author implements it. [Alex] > No, it does not. In most time zones, UTC offset in seconds can be computed > by C code as a 4-byte integer Which is a specific implementation of .utcoffset(). Which likely has nothing to do with how most tzinfo authors will implement _their_ .utcoffset(). For example, look at any tzinfo.utcoffset() implementation that currently exists ;-) > faster > than CPython can look up the .utcoffset method. (At least for times > within a few years around now.) A programmer who makes it slower should > be fired. So any programmer who implements .utcoffset() in Python should be fired? That's the only way I can read that. > Yet I agree, "'premature optimization' applies at this time." I'm more worried now about premature firing ;-) From random832 at fastmail.com Mon Sep 14 22:27:05 2015 From: random832 at fastmail.com (Random832) Date: Mon, 14 Sep 2015 16:27:05 -0400 Subject: [Datetime-SIG] Are there any "correct" implementations of tzinfo? In-Reply-To: References: <1442085362.324875.381920729.5E7A6DCE@webmail.messagingengine.com> <201509131224.t8DCOXHO004891@fido.openend.se> <201509131600.t8DG07e0025688@fido.openend.se> <201509132031.t8DKVTwJ028027@fido.openend.se> <201509140827.t8E8RPqb001076@fido.openend.se> <1442257996.253100.383441705.7A0986C7@webmail.messagingengine.com> <1442260714.263025.383475777.4728D768@webmail.messagingengine.com> Message-ID: <1442262425.268793.383506657.0443601E@webmail.messagingengine.com> On Mon, Sep 14, 2015, at 16:15, Tim Peters wrote: > [Random832 ] > > Whether or not datetimes stored tm_gmtoff and tm_zone workalikes has > no effect on semantics I can see. If, in your view, they're purely an > optimization, they're just a distraction for now. 
If you're proposing > to add them _instead_ of adding `fold`, no, that can't work, for the > pickle compatibility reasons already explained. Whether something is > in a fold needs to preserved across pickling, but "almost all" pickles > need to be readable by older Pythons too. This is doable adding one > bit, but not doable at all if we need to add arbitrary timedelta and > string objects _instead_ of that bit. A) I'm still not sure why, but I was talking about adding an int, not a timedelta and a string. B) Older python versions can't make use of either utcoffset or fold, but can ignore either of them. I don't even see why they couldn't ignore a timedelta and a string if we felt like adding those. C) What value fold "should" have can be inferred from the time, the utcoffset, and the tzinfo. > >> Well, you lost me there. In a post-495 world, conversion to UTC will > >> work correctly in all cases. It cannot today.; > > > It'll provide *a* value in all cases. > > It will provide the correct UTC offset in all cases. I'm saying that *today*, even with no 495, it does provide *a* value in all cases (even if that's sometimes the "wrong" value for an ambiguous time). And that value is, for any plausible tzinfo, ordered the same for any given pair of datetimes with the same tzinfo as the datetimes considered as naive datetimes. There is, or appears to be, a faction that is proposing to change that by sorting fold=1 2:15 before fold=0 2:45 even though the former is *actually* 30 minutes later than the latter, and I am *utterly baffled* at why they think this is a good idea. > It is true that the earlier and later of an ambiguous time in a fold > will compare equal in their own zone, but compare not equal after > conversion to UTC (or to any other zone in which they're not in one of > the latter zone's folds). Is that what you're talking about? Yes. Or two different ambiguous times, where the properly earlier one compares greater and vice versa. I have no idea why anyone thinks this is reasonable or desirable behavior. From alexander.belopolsky at gmail.com Mon Sep 14 22:27:26 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Mon, 14 Sep 2015 16:27:26 -0400 Subject: [Datetime-SIG] Are there any "correct" implementations of tzinfo? In-Reply-To: <1442261330.265134.383487745.4E9C3005@webmail.messagingengine.com> References: <1442085362.324875.381920729.5E7A6DCE@webmail.messagingengine.com> <201509131224.t8DCOXHO004891@fido.openend.se> <201509131600.t8DG07e0025688@fido.openend.se> <201509132031.t8DKVTwJ028027@fido.openend.se> <201509140827.t8E8RPqb001076@fido.openend.se> <1442257996.253100.383441705.7A0986C7@webmail.messagingengine.com> <1442259852.259192.383467881.5156BE88@webmail.messagingengine.com> <1442261330.265134.383487745.4E9C3005@webmail.messagingengine.com> Message-ID: On Mon, Sep 14, 2015 at 4:08 PM, Random832 wrote: > On Mon, Sep 14, 2015, at 15:48, Alexander Belopolsky wrote: > > On Mon, Sep 14, 2015 at 3:44 PM, Random832 > > wrote: > > > > > It is an > > > invariant that is true today, and therefore which you can't rely on any > > > of the consumers of this 12 years old widely deployed code not to > assume > > > will remain true. > > > > > > > Sorry, this sentence does not parse. You are missing a "not" somewhere. > > Nope. I am asserting that: > > This invariant is true today. > You've never specified "this invariant", but I'll assume you are talking about "a < b implies a.astimezone(UTC) < b.astimezone(UTC)." 
This is *not* true today:

>>> from datetime import *
>>> from datetimetester import Eastern
>>> UTC = timezone.utc
>>> a = datetime(2002, 4, 7, 1, 40, tzinfo=Eastern)
>>> b = datetime(2002, 4, 7, 2, 20, tzinfo=Eastern)
>>> a < b
True
>>> a.astimezone(UTC) < b.astimezone(UTC)
False

> Therefore, it is likely that at least some consumers of datetime will
> assume it is true.

Obviously, if Random832 is a real person, the last statement is true.
This does not make the assumption true, just proves that at least one
user is confused about the current behavior. :-)

> Therefore, you cannot rely on there not being any consumers which assume
> it will remain true.

That's where we are now.  Some users make baseless assumptions.  This
will probably remain true. :-(

> It's awkward, since when I go back to analyze it it turns out that the
> "not" after 'code' actually technically modifies "any" earlier in the
> sentence, but the number of negatives is correct.

Writing in shorter sentences may help.

> (Though, it actually
> works out even without that change, since the question of *which*
> consumers rely on the invariant is unknown.)

True.  We will never know how many users rely on false assumptions.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From alexander.belopolsky at gmail.com  Mon Sep 14 22:39:14 2015
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Mon, 14 Sep 2015 16:39:14 -0400
Subject: [Datetime-SIG] Are there any "correct" implementations of tzinfo?
In-Reply-To: 
References: <1442085362.324875.381920729.5E7A6DCE@webmail.messagingengine.com>
 <201509131224.t8DCOXHO004891@fido.openend.se>
 <201509131600.t8DG07e0025688@fido.openend.se>
 <201509132031.t8DKVTwJ028027@fido.openend.se>
 <201509140827.t8E8RPqb001076@fido.openend.se>
 <1442257996.253100.383441705.7A0986C7@webmail.messagingengine.com>
Message-ID: 

On Mon, Sep 14, 2015 at 4:22 PM, Tim Peters wrote:

> > faster
> > than CPython can look up the .utcoffset method. (At least for times
> > within a few years around now.) A programmer who makes it slower should
> > be fired.
>
> So any programmer who implements .utcoffset() in Python should be
> fired?  That's the only way I can read that.

No, no!  I've already conceded that caching UTC offset will probably help
pure Python implementations.  PyPy folks have established this fact for
hash and I am willing to extrapolate their results to UTC offset.  I am
only trying to say that if we decide to bring a fast TZ database to
CPython, pure python tzinfo interface will likely become our main
bottleneck, not the speed with which C code can compute the offset value.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From tim.peters at gmail.com  Mon Sep 14 22:45:17 2015
From: tim.peters at gmail.com (Tim Peters)
Date: Mon, 14 Sep 2015 15:45:17 -0500
Subject: [Datetime-SIG] Are there any "correct" implementations of tzinfo?
In-Reply-To: <1442262425.268793.383506657.0443601E@webmail.messagingengine.com> References: <1442085362.324875.381920729.5E7A6DCE@webmail.messagingengine.com> <201509131224.t8DCOXHO004891@fido.openend.se> <201509131600.t8DG07e0025688@fido.openend.se> <201509132031.t8DKVTwJ028027@fido.openend.se> <201509140827.t8E8RPqb001076@fido.openend.se> <1442257996.253100.383441705.7A0986C7@webmail.messagingengine.com> <1442260714.263025.383475777.4728D768@webmail.messagingengine.com> <1442262425.268793.383506657.0443601E@webmail.messagingengine.com> Message-ID: [Random832 ] > A) I'm still not sure why, but I was talking about adding an int, not a > timedelta and a string. > > B) Older python versions can't make use of either utcoffset or fold, but > can ignore either of them. I don't even see why they couldn't ignore a > timedelta and a string if we felt like adding those. Because all versions of Python expect a very specific pickle layout for _every_ kind of pickled object (including datetimes).. Make any change to the pickle format of any object, and older Pythons will simply blow up (raise an exception) when trying to load the new pickle - or do something insane with the pickle bits. It's impossible for older Pythons to know anything about what "the new bits" are supposed to mean, and there is no way to spell, in the pickle engine, "but if you're an older version, skip over the next N bytes". > C) What value fold "should" have can be inferred from the time, the > utcoffset, and the tzinfo. So you are proposing to add ... something ... _instead_ of adding `fold`. Already addressed that. See above. > I'm saying that *today*, even with no 495, it [utcoffset] does provide > *a* value in all cases (even if that's sometimes the "wrong" value > for an ambiguous time). Sure. > And that value is, for any plausible tzinfo, ordered the same for > any given pair of datetimes with the same tzinfo as the datetimes > considered as naive datetimes. Yes. > There is, or appears to be, a faction that is proposing to change that > by sorting fold=1 2:15 before fold=0 2:45 even though the former is > *actually* 30 minutes later than the latter, and I am *utterly baffled* > at why they think this is a good idea. It's not so much a "good idea" as that it's the only idea consistent with Python's "naive time" model. Folds and gaps don't exist in naive time. Indeed, the _concept_ of "time zone" doesn't really exist in naive time. There's _inherent_ tension between the naive time model and the way multi-offset time zones actually behave. So it goes. >> It is true that the earlier and later of an ambiguous time in a fold >> will compare equal in their own zone, but compare not equal after >> conversion to UTC (or to any other zone in which they're not in one of >> the latter zone's folds). Is that what you're talking about? > Yes. Or two different ambiguous times, where the properly earlier one > compares greater and vice versa. I have no idea why anyone thinks this > is reasonable or desirable behavior. >From which I can guess, without asking, that you think "naive time" itself is unreasonable and undesirable ;-) From random832 at fastmail.com Mon Sep 14 22:54:56 2015 From: random832 at fastmail.com (Random832) Date: Mon, 14 Sep 2015 16:54:56 -0400 Subject: [Datetime-SIG] Are there any "correct" implementations of tzinfo? 
In-Reply-To: References: <1442085362.324875.381920729.5E7A6DCE@webmail.messagingengine.com> <201509131224.t8DCOXHO004891@fido.openend.se> <201509131600.t8DG07e0025688@fido.openend.se> <201509132031.t8DKVTwJ028027@fido.openend.se> <201509140827.t8E8RPqb001076@fido.openend.se> <1442257996.253100.383441705.7A0986C7@webmail.messagingengine.com> <1442259852.259192.383467881.5156BE88@webmail.messagingengine.com> <1442261330.265134.383487745.4E9C3005@webmail.messagingengine.com> Message-ID: <1442264096.274955.383520321.20C18E75@webmail.messagingengine.com> On Mon, Sep 14, 2015, at 16:27, Alexander Belopolsky wrote: > You've never specified "this invariant", I have specified it numerous times. > but I'll assume you are talking > about "a < b implies a.astimezone(UTC) < b.astimezone(UTC)." This is > *not* > true today: In my first few posts about the issue I did note "mid-spring-forward" times as an exception (and I assert that they are the *only* exception). But repetition and having to keep explaining this has worn me down. > >>> from datetime import * > >>> from datetimetester import Eastern > >>> UTC = timezone.utc > >>> a = datetime(2002, 4, 7, 1, 40, tzinfo=Eastern) > >>> b = datetime(2002, 4, 7, 2, 20, tzinfo=Eastern) > >>> a < b > True > >>> a.astimezone(UTC) < b.astimezone(UTC) > False I don't know how your datetimetester works, so this is a bit of a black box to me - correct me if any of the below is wrong: I assume that 2002-04-07 is the morning of the "spring forward" transition of that year. Therefore, it's worth noting, the time in "b" is one that doesn't actually exist. I actually did mention, in one of my messages on the subject, that "spring forward" times were an exception - the *only* exception, to the invariant, but that's been lost in a few of my repetitions. I'm going to assume that the interpretations that led to your results are: a = 2002-04-07 01:40:00 -0500 = 2002-04-07 06:40:00 Z b = 2002-04-07 02:20:00 -0400 = 2002-04-07 06:20:00 Z I don't think this is a reasonable value for b.astimezone(UTC) to have. But anyway, none of this is actually relevant to my claims about how the times near "fall back" transitions (i.e. with different fold values) should be sorted. I wasn't at any point proposing *actually* converting to UTC as part of the mechanism for comparing times. Just that having times near "fold" points ordered in any other way would be surprising and unreasonable. From alexander.belopolsky at gmail.com Mon Sep 14 23:01:15 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Mon, 14 Sep 2015 17:01:15 -0400 Subject: [Datetime-SIG] Are there any "correct" implementations of tzinfo? 
In-Reply-To: <1442264096.274955.383520321.20C18E75@webmail.messagingengine.com>
References: <1442085362.324875.381920729.5E7A6DCE@webmail.messagingengine.com>
 <201509131224.t8DCOXHO004891@fido.openend.se>
 <201509131600.t8DG07e0025688@fido.openend.se>
 <201509132031.t8DKVTwJ028027@fido.openend.se>
 <201509140827.t8E8RPqb001076@fido.openend.se>
 <1442257996.253100.383441705.7A0986C7@webmail.messagingengine.com>
 <1442259852.259192.383467881.5156BE88@webmail.messagingengine.com>
 <1442261330.265134.383487745.4E9C3005@webmail.messagingengine.com>
 <1442264096.274955.383520321.20C18E75@webmail.messagingengine.com>
Message-ID: 

On Mon, Sep 14, 2015 at 4:54 PM, Random832 wrote:

> I don't know how your datetimetester works

Please educate yourself:
https://hg.python.org/cpython/file/tip/Lib/test/datetimetester.py#l3539

Some familiarity with the CPython test suite is pretty much a
pre-requisite to make a meaningful contribution to PEP 495 discussions at
this stage.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From alexander.belopolsky at gmail.com  Mon Sep 14 23:10:47 2015
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Mon, 14 Sep 2015 17:10:47 -0400
Subject: [Datetime-SIG] Are there any "correct" implementations of tzinfo?
In-Reply-To: <1442264096.274955.383520321.20C18E75@webmail.messagingengine.com>
References: <1442085362.324875.381920729.5E7A6DCE@webmail.messagingengine.com>
 <201509131224.t8DCOXHO004891@fido.openend.se>
 <201509131600.t8DG07e0025688@fido.openend.se>
 <201509132031.t8DKVTwJ028027@fido.openend.se>
 <201509140827.t8E8RPqb001076@fido.openend.se>
 <1442257996.253100.383441705.7A0986C7@webmail.messagingengine.com>
 <1442259852.259192.383467881.5156BE88@webmail.messagingengine.com>
 <1442261330.265134.383487745.4E9C3005@webmail.messagingengine.com>
 <1442264096.274955.383520321.20C18E75@webmail.messagingengine.com>
Message-ID: 

On Mon, Sep 14, 2015 at 4:54 PM, Random832 wrote:

> I'm going to assume that the interpretations that led to your results
> are:
> a = 2002-04-07 01:40:00 -0500 = 2002-04-07 06:40:00 Z
> b = 2002-04-07 02:20:00 -0400 = 2002-04-07 06:20:00 Z

Looks right:

>>> print(a)
2002-04-07 01:40:00-05:00
>>> print(a.astimezone(UTC))
2002-04-07 06:40:00+00:00
>>> print(b)
2002-04-07 02:20:00-04:00
>>> print(b.astimezone(UTC))
2002-04-07 06:20:00+00:00

> I don't think this is a reasonable value for b.astimezone(UTC) to have.

You would have to go back in time to 2002-2003 and argue with Tim and
Guido about that.  Trust me - you would lose.  Arguing about it today is
even more futile.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From random832 at fastmail.com  Mon Sep 14 23:23:20 2015
From: random832 at fastmail.com (Random832)
Date: Mon, 14 Sep 2015 17:23:20 -0400
Subject: [Datetime-SIG] Are there any "correct" implementations of tzinfo?
In-Reply-To: References: <1442085362.324875.381920729.5E7A6DCE@webmail.messagingengine.com> <201509131224.t8DCOXHO004891@fido.openend.se> <201509131600.t8DG07e0025688@fido.openend.se> <201509132031.t8DKVTwJ028027@fido.openend.se> <201509140827.t8E8RPqb001076@fido.openend.se> <1442257996.253100.383441705.7A0986C7@webmail.messagingengine.com> <1442260714.263025.383475777.4728D768@webmail.messagingengine.com> <1442262425.268793.383506657.0443601E@webmail.messagingengine.com> Message-ID: <1442265800.280460.383547057.16B65298@webmail.messagingengine.com> On Mon, Sep 14, 2015, at 16:45, Tim Peters wrote: > Because all versions of Python expect a very specific pickle layout > for _every_ kind of pickled object (including datetimes).. Make any > change to the pickle format of any object, and older Pythons will > simply blow up (raise an exception) when trying to load the new pickle > - or do something insane with the pickle bits. It's impossible for > older Pythons to know anything about what "the new bits" are supposed > to mean, and there is no way to spell, in the pickle engine, "but if > you're an older version, skip over the next N bytes". Well, you could have put some reserved bits in the original pickle format for datetime back when it was first defined, or even just allowed passing in a longer string for future extension purposes. That you didn't makes me wonder just where you're finding the space to put the fold bit. > It's not so much a "good idea" as that it's the only idea consistent > with Python's "naive time" model. Folds and gaps don't exist in naive > time. Indeed, the _concept_ of "time zone" doesn't really exist in > naive time. There's _inherent_ tension between the naive time model > and the way multi-offset time zones actually behave. So it goes. But why does it need to be consistent? You can't compare naive datetimes with aware ones. If you want to sort/bisect a list of datetimes, they have to either all be naive or all be aware. So when we're talking about how ordering works, we're fundamentally talking about how it works for aware datetimes. From alexander.belopolsky at gmail.com Mon Sep 14 23:23:34 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Mon, 14 Sep 2015 17:23:34 -0400 Subject: [Datetime-SIG] Are there any "correct" implementations of tzinfo? In-Reply-To: <1442264096.274955.383520321.20C18E75@webmail.messagingengine.com> References: <1442085362.324875.381920729.5E7A6DCE@webmail.messagingengine.com> <201509131224.t8DCOXHO004891@fido.openend.se> <201509131600.t8DG07e0025688@fido.openend.se> <201509132031.t8DKVTwJ028027@fido.openend.se> <201509140827.t8E8RPqb001076@fido.openend.se> <1442257996.253100.383441705.7A0986C7@webmail.messagingengine.com> <1442259852.259192.383467881.5156BE88@webmail.messagingengine.com> <1442261330.265134.383487745.4E9C3005@webmail.messagingengine.com> <1442264096.274955.383520321.20C18E75@webmail.messagingengine.com> Message-ID: On Mon, Sep 14, 2015 at 4:54 PM, Random832 wrote: > But anyway, none of this is actually relevant to my claims about how the > times near "fall back" transitions (i.e. with different fold values) > should be sorted. > Current behavior for gap times is relevant because it shows that you do get surprising results when you step out of the naive time model. The gap times can be created now and they violate astimezone(utc) monotonicity. PEP 495 allows more times that are outside of the naive time model: fold=1 times in the fall-back fold. 
It is unavoidable that astimezone(utc) is non-monotonic in this case as well. After all, why does it concern you more than the non-monotonicity of astimezone(local)? > I wasn't at any point proposing *actually* converting > to UTC as part of the mechanism for comparing times. In this case what were you *actually* proposing? > Just that having > times near "fold" points ordered in any other way would be surprising > and unreasonable. "Other" than what? In the previous sentence you said that converting to UTC to compare was not your proposal. Please let us know what your proposal is rather than what it isn't. -------------- next part -------------- An HTML attachment was scrubbed... URL: From random832 at fastmail.com Mon Sep 14 23:30:29 2015 From: random832 at fastmail.com (Random832) Date: Mon, 14 Sep 2015 17:30:29 -0400 Subject: [Datetime-SIG] Are there any "correct" implementations of tzinfo? In-Reply-To: References: <1442085362.324875.381920729.5E7A6DCE@webmail.messagingengine.com> <201509131224.t8DCOXHO004891@fido.openend.se> <201509131600.t8DG07e0025688@fido.openend.se> <201509132031.t8DKVTwJ028027@fido.openend.se> <201509140827.t8E8RPqb001076@fido.openend.se> <1442257996.253100.383441705.7A0986C7@webmail.messagingengine.com> <1442259852.259192.383467881.5156BE88@webmail.messagingengine.com> <1442261330.265134.383487745.4E9C3005@webmail.messagingengine.com> <1442264096.274955.383520321.20C18E75@webmail.messagingengine.com> Message-ID: <1442266229.281937.383569841.7079F391@webmail.messagingengine.com> On Mon, Sep 14, 2015, at 17:23, Alexander Belopolsky wrote: > In this case what were you *actually* proposing? My point is that I'm not proposing a specific mechanism. Just saying that the order that other people are claiming is somehow necessary for consistency with naive datetimes (that you can't actually compare these values with) is not necessary *and* not reasonable, and whatever is implemented should put them in the right order by whatever mechanism is determined to be best. From tim.peters at gmail.com Mon Sep 14 23:34:07 2015 From: tim.peters at gmail.com (Tim Peters) Date: Mon, 14 Sep 2015 16:34:07 -0500 Subject: [Datetime-SIG] Are there any "correct" implementations of tzinfo? In-Reply-To: <1442265800.280460.383547057.16B65298@webmail.messagingengine.com> References: <1442085362.324875.381920729.5E7A6DCE@webmail.messagingengine.com> <201509131224.t8DCOXHO004891@fido.openend.se> <201509131600.t8DG07e0025688@fido.openend.se> <201509132031.t8DKVTwJ028027@fido.openend.se> <201509140827.t8E8RPqb001076@fido.openend.se> <1442257996.253100.383441705.7A0986C7@webmail.messagingengine.com> <1442260714.263025.383475777.4728D768@webmail.messagingengine.com> <1442262425.268793.383506657.0443601E@webmail.messagingengine.com> <1442265800.280460.383547057.16B65298@webmail.messagingengine.com> Message-ID: [Tim] >> Because all versions of Python expect a very specific pickle layout >> for _every_ kind of pickled object (including datetimes).. Make any >> change to the pickle format of any object, and older Pythons will >> simply blow up (raise an exception) when trying to load the new pickle >> - or do something insane with the pickle bits. It's impossible for >> older Pythons to know anything about what "the new bits" are supposed >> to mean, and there is no way to spell, in the pickle engine, "but if >> you're an older version, skip over the next N bytes".
[Random832 ] > Well, you could have put some reserved bits in the original pickle > format for datetime back when it was first defined, or even just allowed > passing in a longer string for future extension purposes. Yes, we "could have" done that for all pickle formats for all types. But why on Earth would we? Pickle size is important to many apps (e.g., Zope applications can store billions of pickles in databases. and it may not be purely coincidence ;-) that Zope Corp paid for datetime development), and there would have been loud screaming about any "wasted" bytes. > That you didn't makes me wonder just where you're finding the space to put the > fold bit. PEP 495 gives all the details. Short course: there are bits that are _always_ 0 now within some datetime pickle bytes. `fold` will abuse one of those always-0-now pickle bits. >> It's not so much a "good idea" as that it's the only idea consistent >> with Python's "naive time" model. Folds and gaps don't exist in naive >> time. Indeed, the _concept_ of "time zone" doesn't really exist in >> naive time. There's _inherent_ tension between the naive time model >> and the way multi-offset time zones actually behave. So it goes. > But why does it need to be consistent? You can't compare naive datetimes > with aware ones. If you want to sort/bisect a list of datetimes, they > have to either all be naive or all be aware. So when we're talking about > how ordering works, we're fundamentally talking about how it works for > aware datetimes. Aware datetimes _within_ a zone also follow the naive time model. It's unfortunate that they're nevertheless called "aware" datetimes. So, sorry, even when sorting a list of aware datetimes, if they share a common zone it is wholly intended that they all work in naive time. Apps that can't tolerate naive time should convert to UTC first. End of problems. From carl at oddbird.net Mon Sep 14 23:39:28 2015 From: carl at oddbird.net (Carl Meyer) Date: Mon, 14 Sep 2015 15:39:28 -0600 Subject: [Datetime-SIG] Are there any "correct" implementations of tzinfo? In-Reply-To: <1442265800.280460.383547057.16B65298@webmail.messagingengine.com> References: <201509131224.t8DCOXHO004891@fido.openend.se> <201509131600.t8DG07e0025688@fido.openend.se> <201509132031.t8DKVTwJ028027@fido.openend.se> <201509140827.t8E8RPqb001076@fido.openend.se> <1442257996.253100.383441705.7A0986C7@webmail.messagingengine.com> <1442260714.263025.383475777.4728D768@webmail.messagingengine.com> <1442262425.268793.383506657.0443601E@webmail.messagingengine.com> <1442265800.280460.383547057.16B65298@webmail.messagingengine.com> Message-ID: <55F73E90.2040600@oddbird.net> On 09/14/2015 03:23 PM, Random832 wrote: > Well, you could have put some reserved bits in the original pickle > format for datetime back when it was first defined, or even just allowed > passing in a longer string for future extension purposes. That you > didn't makes me wonder just where you're finding the space to put the > fold bit. By exploiting the currently-always-0 first bit in the "minutes" byte. See https://www.python.org/dev/peps/pep-0495/#pickles It might be useful to read PEP 495 before commenting on it ;-) >> It's not so much a "good idea" as that it's the only idea consistent >> with Python's "naive time" model. Folds and gaps don't exist in naive >> time. Indeed, the _concept_ of "time zone" doesn't really exist in >> naive time. There's _inherent_ tension between the naive time model >> and the way multi-offset time zones actually behave. So it goes. 
> > But why does it need to be consistent? You can't compare naive datetimes > with aware ones. If you want to sort/bisect a list of datetimes, they > have to either all be naive or all be aware. So when we're talking about > how ordering works, we're fundamentally talking about how it works for > aware datetimes. What you're missing (and I was missing too, before going around in some lengthy earlier threads on this mailing list, which you may -- or may not -- find it worth your time to read) is that even "aware datetimes" in Python's datetime library always operate in "naive local clock time" for whatever timezone they are in; they aren't just alternate notations for the corresponding UTC time. This is why if you add timedelta(hours=24) to datetime(2014, 11, 2, 12, tzinfo=Eastern), you get datetime(2014, 11, 3, 12, tzinfo=Eastern), even though the difference between those two datetimes in UTC is 25 hours, not 24. In order to stay consistent with that "naive local clock time" model, all operations within a time zone must ignore the `fold` value. The `fold` value really doesn't exist at all in the naive clock time model, it's only tracked as a convenience for correct round-tripping. This implies that 1:30am fold=0 and 1:30am fold=1 are equal, and also that 1:20am fold=1 is "earlier" than 1:40am fold=0 (as long as you stay within the naive clock time model -- if you don't want to, you should convert to UTC). You may want to rail against that model. I (and some others) already did. You can go back in the archives here and read our efforts. Perhaps you'll have better luck if you try; I doubt it. But given that model, this is the only approach that makes sense. And you can get the same work done in that model. If you want to operate on the physical-time timeline, just always operate in UTC internally and only translate to "aware datetimes" at display time. That's what you probably should be doing anyway. Carl -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: OpenPGP digital signature URL: From tim.peters at gmail.com Mon Sep 14 23:48:11 2015 From: tim.peters at gmail.com (Tim Peters) Date: Mon, 14 Sep 2015 16:48:11 -0500 Subject: [Datetime-SIG] Are there any "correct" implementations of tzinfo? In-Reply-To: <55F73E90.2040600@oddbird.net> References: <201509131224.t8DCOXHO004891@fido.openend.se> <201509131600.t8DG07e0025688@fido.openend.se> <201509132031.t8DKVTwJ028027@fido.openend.se> <201509140827.t8E8RPqb001076@fido.openend.se> <1442257996.253100.383441705.7A0986C7@webmail.messagingengine.com> <1442260714.263025.383475777.4728D768@webmail.messagingengine.com> <1442262425.268793.383506657.0443601E@webmail.messagingengine.com> <1442265800.280460.383547057.16B65298@webmail.messagingengine.com> <55F73E90.2040600@oddbird.net> Message-ID: [Carl Meyer , on "aware" datetimes following the "naive time" model] > ... > You may want to rail against that model. I (and some others) already > did. You can go back in the archives here and read our efforts. Perhaps > you'll have better luck if you try; I doubt it. There are two ways Random832 might have better luck: 1. Making Guido regret naive time. 2. Making datetime change what it's done for the last dozen years. I'd say the chance of #1 is one in a billion. But that's a lot better than the chance of #2 ;-) > But given that model, this is the only approach that makes sense. 
We should also note that we already _tried_ paying attention to fold within a single zone. Besides being even more of a conceptual mess, as you and I batted examples back & forth it became clear that it broke various other kinds of backward compatibility. > And you can get the same work done in that model. If you want to operate > on the physical-time timeline, just always operate in UTC internally and > only translate to "aware datetimes" at display time. That's what you > probably should be doing anyway. Alas, sanity is the last thing any good programmer will yield to ;-) From random832 at fastmail.com Mon Sep 14 23:53:55 2015 From: random832 at fastmail.com (Random832) Date: Mon, 14 Sep 2015 17:53:55 -0400 Subject: [Datetime-SIG] Are there any "correct" implementations of tzinfo? In-Reply-To: References: <1442085362.324875.381920729.5E7A6DCE@webmail.messagingengine.com> <201509131224.t8DCOXHO004891@fido.openend.se> <201509131600.t8DG07e0025688@fido.openend.se> <201509132031.t8DKVTwJ028027@fido.openend.se> <201509140827.t8E8RPqb001076@fido.openend.se> <1442257996.253100.383441705.7A0986C7@webmail.messagingengine.com> <1442260714.263025.383475777.4728D768@webmail.messagingengine.com> <1442262425.268793.383506657.0443601E@webmail.messagingengine.com> <1442265800.280460.383547057.16B65298@webmail.messagingengine.com> Message-ID: <1442267635.287083.383576201.0990DAA7@webmail.messagingengine.com> On Mon, Sep 14, 2015, at 17:34, Tim Peters wrote: > Yes, we "could have" done that for all pickle formats for all types. > But why on Earth would we? Pickle size is important to many apps > (e.g., Zope applications can store billions of pickles in databases. > and it may not be purely coincidence ;-) that Zope Corp paid for > datetime development), and there would have been loud screaming about > any "wasted" bytes. Would allowing a 16-byte string in the future have increased the storage occupied by a 10-byte string today? Would allowing a third argument in the future have increased the storage occupied by two arguments today? As far as I can tell the pickle format for non-primitive types isn't _that_ fixed-width. > > That you didn't makes me wonder just where you're finding the space to put the > > fold bit. > > PEP 495 gives all the details. Short course: there are bits that are > _always_ 0 now within some datetime pickle bytes. `fold` will abuse > one of those always-0-now pickle bits. And what happens to older implementations if that bit is 1? > Aware datetimes _within_ a zone also follow the naive time model. > It's unfortunate that they're nevertheless called "aware" datetimes. > > So, sorry, even when sorting a list of aware datetimes, if they share > a common zone it is wholly intended that they all work in naive time. And if some of them share a common zone, then some of them will work in naive time, and some of them will work in aware time, and some pairs (well, triples) of them will cause problems for sort/bisect algorithms. Maybe it'd be best to simply ban interzone comparisons. Or have some sort of context manager to determine how arithmetic and comparisons work. From carl at oddbird.net Tue Sep 15 00:00:32 2015 From: carl at oddbird.net (Carl Meyer) Date: Mon, 14 Sep 2015 16:00:32 -0600 Subject: [Datetime-SIG] Are there any "correct" implementations of tzinfo? 
In-Reply-To: <1442267635.287083.383576201.0990DAA7@webmail.messagingengine.com> References: <201509131600.t8DG07e0025688@fido.openend.se> <201509132031.t8DKVTwJ028027@fido.openend.se> <201509140827.t8E8RPqb001076@fido.openend.se> <1442257996.253100.383441705.7A0986C7@webmail.messagingengine.com> <1442260714.263025.383475777.4728D768@webmail.messagingengine.com> <1442262425.268793.383506657.0443601E@webmail.messagingengine.com> <1442265800.280460.383547057.16B65298@webmail.messagingengine.com> <1442267635.287083.383576201.0990DAA7@webmail.messagingengine.com> Message-ID: <55F74380.2030202@oddbird.net> On 09/14/2015 03:53 PM, Random832 wrote: > And if some of them share a common zone, then some of them will work in > naive time, and some of them will work in aware time, and some pairs > (well, triples) of them will cause problems for sort/bisect algorithms. Yep, if you're working with a heterogenous-tzinfo set of aware datetimes in Python, there may not be a total ordering (and you'll get violations of various other arithmetic identities, too). Best available option: don't do that. > Maybe it'd be best to simply ban interzone comparisons. Yes, you've got it. Interzone comparisons and arithmetic are the real wart in the datetime module, once you accept its intended mental model. If the time machine were in working order, they ought to be banned and require explicit conversion to the same timezone instead. > Or have some > sort of context manager to determine how arithmetic and comparisons > work. Ouch, please no. If there were a strong desire to support _both_ mental models of an aware datetime in the Python datetime library, there would be several better ways to do it (like two different classes for aware datetimes, or a flag on tzinfo classes, or the - rejected by Guido - PEP 500). But given the option to "just work in UTC" when you want timeline arithmetic, and the potential for just multiplying confusion by providing more mental models, I don't think there's sufficient desire for that. At least, I've lost such desire as I once may have had ;-) Carl -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: OpenPGP digital signature URL: From tim.peters at gmail.com Tue Sep 15 00:09:47 2015 From: tim.peters at gmail.com (Tim Peters) Date: Mon, 14 Sep 2015 17:09:47 -0500 Subject: [Datetime-SIG] Are there any "correct" implementations of tzinfo? In-Reply-To: <1442267635.287083.383576201.0990DAA7@webmail.messagingengine.com> References: <1442085362.324875.381920729.5E7A6DCE@webmail.messagingengine.com> <201509131224.t8DCOXHO004891@fido.openend.se> <201509131600.t8DG07e0025688@fido.openend.se> <201509132031.t8DKVTwJ028027@fido.openend.se> <201509140827.t8E8RPqb001076@fido.openend.se> <1442257996.253100.383441705.7A0986C7@webmail.messagingengine.com> <1442260714.263025.383475777.4728D768@webmail.messagingengine.com> <1442262425.268793.383506657.0443601E@webmail.messagingengine.com> <1442265800.280460.383547057.16B65298@webmail.messagingengine.com> <1442267635.287083.383576201.0990DAA7@webmail.messagingengine.com> Message-ID: [Random832 ] > Would allowing a 16-byte string in the future have increased the storage > occupied by a 10-byte string today? Would allowing a third argument in > the future have increased the storage occupied by two arguments today? > As far as I can tell the pickle format for non-primitive types isn't > _that_ fixed-width. Sorry, I'm not arguing about this any more. 
Pickle doesn't work at all at the level of "count of bytes followed by a string". If you want to make a pickle argument that makes sense, I'm afraid you'll need to become familiar with how pickle works first. This is not the place for a pickle tutorial. Start by learning what a datetime pickle actually is. pickletools.dis() will be very helpful. >>> That you didn't makes me wonder just where you're finding the space to put the >>> fold bit. >> PEP 495 gives all the details. Short course: there are bits that are >> _always_ 0 now within some datetime pickle bytes. `fold` will abuse >> one of those always-0-now pickle bits. > And what happens to older implementations if that bit is 1? Unpickling will raise an exception, complaining that the minute value is out of range. >> Aware datetimes _within_ a zone also follow the naive time model. >> It's unfortunate that they're nevertheless called "aware" datetimes. >> >> So, sorry, even when sorting a list of aware datetimes, if they share >> a common zone it is wholly intended that they all work in naive time. > And if some of them share a common zone, then some of them will work in > naive time, and some of them will work in aware time, and some pairs > (well, triples) of them will cause problems for sort/bisect algorithms. All sorts of things may happen, yes. As I said, if you need to care, convert to UTC first. Most apps do nothing like this. > Maybe it'd be best to simply ban interzone comparisons. We cannot. Backward compatibility. It would have been better had interzone comparisons and subtraction not been supported from the start. Too late to change that. > Or have some sort of context manager to determine how arithmetic and comparisons > work. Write a PEP ;-) From tim.peters at gmail.com Tue Sep 15 02:31:24 2015 From: tim.peters at gmail.com (Tim Peters) Date: Mon, 14 Sep 2015 19:31:24 -0500 Subject: [Datetime-SIG] Are there any "correct" implementations of tzinfo? In-Reply-To: <1442264096.274955.383520321.20C18E75@webmail.messagingengine.com> References: <1442085362.324875.381920729.5E7A6DCE@webmail.messagingengine.com> <201509131224.t8DCOXHO004891@fido.openend.se> <201509131600.t8DG07e0025688@fido.openend.se> <201509132031.t8DKVTwJ028027@fido.openend.se> <201509140827.t8E8RPqb001076@fido.openend.se> <1442257996.253100.383441705.7A0986C7@webmail.messagingengine.com> <1442259852.259192.383467881.5156BE88@webmail.messagingengine.com> <1442261330.265134.383487745.4E9C3005@webmail.messagingengine.com> <1442264096.274955.383520321.20C18E75@webmail.messagingengine.com> Message-ID: [Alexander Belopolsky] >> ... >> >>> from datetime import * >> >>> from datetimetester import Eastern >> >>> UTC = timezone.utc >> >>> a = datetime(2002, 4, 7, 1, 40, tzinfo=Eastern) >> >>> b = datetime(2002, 4, 7, 2, 20, tzinfo=Eastern) >> >>> a < b >> True >> >>> a.astimezone(UTC) < b.astimezone(UTC) >> False [Random832 ] > ... > I don't know how your datetimetester works, so this is a bit of a black > box to me - correct me if any of the below is wrong: > > I assume that 2002-04-07 is the morning of the "spring forward" > transition of that year. Therefore, it's worth noting, the time in "b" > is one that doesn't actually exist. I actually did mention, in one of my > messages on the subject, that "spring forward" times were an exception - > the *only* exception, to the invariant, but that's been lost in a few of > my repetitions.
> > I'm going to assume that the interpretations that led to your results > are: > a = 2002-04-07 01:40:00 -0500 = 2002-04-07 06:40:00 Z > b = 2002-04-07 02:20:00 -0400 = 2002-04-07 06:20:00 Z > > I don't think this is a reasonable value for b.astimezone(UTC) to have. I can explain the thinking here: in "naive time", there's no such thing as "missing time". Indeed, if you watch an old-fashioned mechanical clock near the time DST starts, you'll see it change from 1:59 to 2:00 to 2:01 ... to 2:20. Since it's now ">= 2:00" on the local clock, US rules say you're now in daylight time. So the only UTC offset that _does_ make sense is the US/Eastern daylight offset: -4. "But you forgot to set the clock ahead, so this should _really_ be considered as still being in standard time!" is an argument outside the naive time model. "Set the clock ahead? That's insane! My clock keeps perfect time - why would I break it?" ;-) The bottom-line lesson being the same as always: if you need to care about folds and gaps, in datetime it's intended that you work in UTC instead (or some other fixed-offset zone). From random832 at fastmail.com Tue Sep 15 03:19:56 2015 From: random832 at fastmail.com (Random832) Date: Mon, 14 Sep 2015 21:19:56 -0400 Subject: [Datetime-SIG] Are there any "correct" implementations of tzinfo? In-Reply-To: References: <1442085362.324875.381920729.5E7A6DCE@webmail.messagingengine.com> <201509131224.t8DCOXHO004891@fido.openend.se> <201509131600.t8DG07e0025688@fido.openend.se> <201509132031.t8DKVTwJ028027@fido.openend.se> <201509140827.t8E8RPqb001076@fido.openend.se> <1442257996.253100.383441705.7A0986C7@webmail.messagingengine.com> <1442260714.263025.383475777.4728D768@webmail.messagingengine.com> <1442262425.268793.383506657.0443601E@webmail.messagingengine.com> <1442265800.280460.383547057.16B65298@webmail.messagingengine.com> <1442267635.287083.383576201.0990DAA7@webmail.messagingengine.com> Message-ID: <1442279996.198469.383712497.36F9DE26@webmail.messagingengine.com> On Mon, Sep 14, 2015, at 18:09, Tim Peters wrote: > Sorry, I'm not arguing about this any more. Pickle doesn't work at > all at the level of "count of bytes followed by a string". The SHORT_BINBYTES opcode consists of the byte b'C', followed by *yes indeed* "count of bytes followed by a string". > If you > want to make a pickle argument that makes sense, I'm afraid you'll > need to become familiar with how pickle works first. This is not the > place for a pickle tutorial. > > Start by learning what a datetime pickle actually is. > pickletools.dis() will be very helpful. 0: \x80 PROTO 3 2: c GLOBAL 'datetime datetime' 21: q BINPUT 0 23: C SHORT_BINBYTES b'\x07\xdf\t\x0e\x15\x06*\x00\x00\x00' 35: q BINPUT 1 37: \x85 TUPLE1 38: q BINPUT 2 40: R REDUCE 41: q BINPUT 3 43: . STOP The payload is ten bytes, and the byte immediately before it is in fact 0x0a. If I pickle any byte string under 256 bytes long by itself, the byte immediately before the data is the length. This is how I initially came to the conclusion that "count of bytes followed by a string" was valid. I did, before writing my earlier post, look into the high-level aspects of how datetime pickle works - it uses __reduce__ to create up to two arguments, one of which is a 10-byte string, and the other is the tzinfo. Those arguments are passed into the date constructor and detected by that constructor - for example, I can call it directly with datetime(b'\x07\xdf\t\x0e\x15\x06*\x00\x00\x00') and get the same result as unpickling. 
At the low level, the part that represents that first argument does indeed appear to be "count of bytes followed by a string". I can add to the count, add more bytes, and it will call the constructor with the longer string. If I use pickletools.dis on my modified value the output looks the same except for, as expected, the offsets and the value of the argument to the SHORT_BINBYTES opcode. So, it appears that, as I was saying, "wasted space" would not have been an obstacle to having the "payload" accepted by the constructor (and produced by __reduce__ ultimately _getstate) consist of "a byte string of >= 10 bytes, the first 10 of which are used and the rest of which are ignored by python <= 3.5" instead of "a byte string of exactly 10 bytes", since it would have accepted and produced exactly the same pickle values, but been prepared to accept larger arguments pickled from future versions. For completeness: Protocol version 2 and 1 use BINUNICODE on a latin1-to-utf8 version of the byte string, with a similar "count of bytes followed by a string" (though the count of bytes is of UTF-8 bytes). Protocol version 0 uses UNICODE, terminated by \n, and a literal \n is represented by \\u000a. In all cases some extra data around the value sets it up to call "codecs.encode(..., 'latin1')" upon unpickling. So have I shown you that I know enough about the pickle format to know that permitting a longer string (and ignoring the extra bytes) would have had zero impact on the pickle representation of values that did not contain a longer string? I'd already figured out half of this before writing my earlier post; I just assumed *you* knew enough that I wouldn't have to show my work. Extra credit: 0: \x80 PROTO 3 2: c GLOBAL 'datetime datetime' 21: q BINPUT 0 23: ( MARK 24: M BININT2 2014 27: K BININT1 9 29: K BININT1 14 31: K BININT1 21 33: K BININT1 6 35: K BININT1 42 37: t TUPLE (MARK at 23) 38: q BINPUT 1 40: R REDUCE 41: q BINPUT 2 43: . STOP From alexander.belopolsky at gmail.com Tue Sep 15 03:42:00 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Mon, 14 Sep 2015 21:42:00 -0400 Subject: [Datetime-SIG] Are there any "correct" implementations of tzinfo? In-Reply-To: <1442279996.198469.383712497.36F9DE26@webmail.messagingengine.com> References: <1442085362.324875.381920729.5E7A6DCE@webmail.messagingengine.com> <201509131224.t8DCOXHO004891@fido.openend.se> <201509131600.t8DG07e0025688@fido.openend.se> <201509132031.t8DKVTwJ028027@fido.openend.se> <201509140827.t8E8RPqb001076@fido.openend.se> <1442257996.253100.383441705.7A0986C7@webmail.messagingengine.com> <1442260714.263025.383475777.4728D768@webmail.messagingengine.com> <1442262425.268793.383506657.0443601E@webmail.messagingengine.com> <1442265800.280460.383547057.16B65298@webmail.messagingengine.com> <1442267635.287083.383576201.0990DAA7@webmail.messagingengine.com> <1442279996.198469.383712497.36F9DE26@webmail.messagingengine.com> Message-ID: No credit for anything other than the "extra credit" section. Partial credit for that. Study that printout and you should understand what Tim was saying. > On Sep 14, 2015, at 9:19 PM, Random832 wrote: > >> On Mon, Sep 14, 2015, at 18:09, Tim Peters wrote: >> Sorry, I'm not arguing about this any more. Pickle doesn't work at >> all at the level of "count of bytes followed by a string". > > The SHORT_BINBYTES opcode consists of the byte b'C', followed by *yes > indeed* "count of bytes followed by a string". 
> >> If you >> want to make a pickle argument that makes sense, I'm afraid you'll >> need to become familiar with how pickle works first. This is not the >> place for a pickle tutorial. >> >> Start by learning what a datetime pickle actually is. >> pickletools.dis() will be very helpful. > > 0: \x80 PROTO 3 > 2: c GLOBAL 'datetime datetime' > 21: q BINPUT 0 > 23: C SHORT_BINBYTES b'\x07\xdf\t\x0e\x15\x06*\x00\x00\x00' > 35: q BINPUT 1 > 37: \x85 TUPLE1 > 38: q BINPUT 2 > 40: R REDUCE > 41: q BINPUT 3 > 43: . STOP > > The payload is ten bytes, and the byte immediately before it is in fact > 0x0a. If I pickle any byte string under 256 bytes long by itself, the > byte immediately before the data is the length. This is how I initially > came to the conclusion that "count of bytes followed by a string" was > valid. > > I did, before writing my earlier post, look into the high-level aspects > of how datetime pickle works - it uses __reduce__ to create up to two > arguments, one of which is a 10-byte string, and the other is the > tzinfo. Those arguments are passed into the date constructor and > detected by that constructor - for example, I can call it directly with > datetime(b'\x07\xdf\t\x0e\x15\x06*\x00\x00\x00') and get the same result > as unpickling. > > At the low level, the part that represents that first argument does > indeed appear to be "count of bytes followed by a string". I can add to > the count, add more bytes, and it will call the constructor with the > longer string. If I use pickletools.dis on my modified value the output > looks the same except for, as expected, the offsets and the value of the > argument to the SHORT_BINBYTES opcode. > > So, it appears that, as I was saying, "wasted space" would not have been > an obstacle to having the "payload" accepted by the constructor (and > produced by __reduce__ ultimately _getstate) consist of "a byte string > of >= 10 bytes, the first 10 of which are used and the rest of which are > ignored by python <= 3.5" instead of "a byte string of exactly 10 > bytes", since it would have accepted and produced exactly the same > pickle values, but been prepared to accept larger arguments pickled from > future versions. > > For completeness: Protocol version 2 and 1 use BINUNICODE on a > latin1-to-utf8 version of the byte string, with a similar "count of > bytes followed by a string" (though the count of bytes is of UTF-8 > bytes). Protocol version 0 uses UNICODE, terminated by \n, and a literal > \n is represented by \\u000a. In all cases some extra data around the > value sets it up to call "codecs.encode(..., 'latin1')" upon unpickling. > > So have I shown you that I know enough about the pickle format to know > that permitting a longer string (and ignoring the extra bytes) would > have had zero impact on the pickle representation of values that did not > contain a longer string? I'd already figured out half of this before > writing my earlier post; I just assumed *you* knew enough that I > wouldn't have to show my work. > > Extra credit: > 0: \x80 PROTO 3 > 2: c GLOBAL 'datetime datetime' > 21: q BINPUT 0 > 23: ( MARK > 24: M BININT2 2014 > 27: K BININT1 9 > 29: K BININT1 14 > 31: K BININT1 21 > 33: K BININT1 6 > 35: K BININT1 42 > 37: t TUPLE (MARK at 23) > 38: q BINPUT 1 > 40: R REDUCE > 41: q BINPUT 2 > 43: . 
STOP > _______________________________________________ > Datetime-SIG mailing list > Datetime-SIG at python.org > https://mail.python.org/mailman/listinfo/datetime-sig > The PSF Code of Conduct applies to this mailing list: https://www.python.org/psf/codeofconduct/ From tim.peters at gmail.com Tue Sep 15 03:56:47 2015 From: tim.peters at gmail.com (Tim Peters) Date: Mon, 14 Sep 2015 20:56:47 -0500 Subject: [Datetime-SIG] Are there any "correct" implementations of tzinfo? In-Reply-To: <1442279996.198469.383712497.36F9DE26@webmail.messagingengine.com> References: <1442085362.324875.381920729.5E7A6DCE@webmail.messagingengine.com> <201509131224.t8DCOXHO004891@fido.openend.se> <201509131600.t8DG07e0025688@fido.openend.se> <201509132031.t8DKVTwJ028027@fido.openend.se> <201509140827.t8E8RPqb001076@fido.openend.se> <1442257996.253100.383441705.7A0986C7@webmail.messagingengine.com> <1442260714.263025.383475777.4728D768@webmail.messagingengine.com> <1442262425.268793.383506657.0443601E@webmail.messagingengine.com> <1442265800.280460.383547057.16B65298@webmail.messagingengine.com> <1442267635.287083.383576201.0990DAA7@webmail.messagingengine.com> <1442279996.198469.383712497.36F9DE26@webmail.messagingengine.com> Message-ID: [Tim] >> Sorry, I'm not arguing about this any more. Pickle doesn't work at >> all at the level of "count of bytes followed by a string". [Random832 ] > The SHORT_BINBYTES opcode consists of the byte b'C', followed by *yes > indeed* "count of bytes followed by a string". Yes, some individual opcodes do work that way. >> If you >> want to make a pickle argument that makes sense, I'm afraid you'll >> need to become familiar with how pickle works first. This is not the >> place for a pickle tutorial. >> >> Start by learning what a datetime pickle actually is. >> pickletools.dis() will be very helpful. > 0: \x80 PROTO 3 > 2: c GLOBAL 'datetime datetime' > 21: q BINPUT 0 > 23: C SHORT_BINBYTES b'\x07\xdf\t\x0e\x15\x06*\x00\x00\x00' > 35: q BINPUT 1 > 37: \x85 TUPLE1 > 38: q BINPUT 2 > 40: R REDUCE > 41: q BINPUT 3 > 43: . STOP > > The payload is ten bytes, and the byte immediately before it is in fact > 0x0a. If I pickle any byte string under 256 bytes long by itself, the > byte immediately before the data is the length. This is how I initially > came to the conclusion that "count of bytes followed by a string" was > valid. Ditto. > I did, before writing my earlier post, look into the high-level aspects > of how datetime pickle works - it uses __reduce__ to create up to two > arguments, one of which is a 10-byte string, and the other is the > tzinfo. Those arguments are passed into the date constructor and > detected by that constructor - for example, I can call it directly with > datetime(b'\x07\xdf\t\x0e\x15\x06*\x00\x00\x00') and get the same result > as unpickling. Good job! That abuse of the constructor was supposed to remain a secret ;-) > At the low level, the part that represents that first argument does > indeed appear to be "count of bytes followed by a string". I can add to > the count, add more bytes, and it will call the constructor with the > longer string. If I use pickletools.dis on my modified value the output > looks the same except for, as expected, the offsets and the value of the > argument to the SHORT_BINBYTES opcode. 
> So, it appears that, as I was saying, "wasted space" would not have been > an obstacle to having the "payload" accepted by the constructor (and > produced by __reduce__ ultimately _getstate) consist of "a byte string > of >= 10 bytes, the first 10 of which are used and the rest of which are > ignored by python <= 3.5" instead of "a byte string of exactly 10 > bytes", since it would have accepted and produced exactly the same > pickle values, but been prepared to accept larger arguments pickled from > future versions. Yes, if we had done things differently from the start, things would work differently today. But what's the point? We have to live now with what _was_ done. A datetime pickle carrying a string payload with anything other than exactly 10 bytes will almost always blow up under older Pythons, and would be considered "a bug" if it didn't. Pickles are not at all intended to be forgiving (they're enough of a potential security hole without going out of their way to ignore random mysteries). It may be nicer if Python had a serialization format more deliberately designed for evolution of class structure - but it doesn't. Classes that need such a thing now typically store their own idea of a "version" number as part of their pickled state; datetime never did. > ... > So have I shown you that I know enough about the pickle format to know > that permitting a longer string (and ignoring the extra bytes) would > have had zero impact on the pickle representation of values that did not > contain a longer string? Yes. If we had a time machine, it might even have proved useful ;-) > I'd already figured out half of this before > writing my earlier post; I just assumed *you* knew enough that I > wouldn't have to show my work. It's always best to show your work on a public list. Thanks for finally ;-) doing so! From random832 at fastmail.com Tue Sep 15 04:08:58 2015 From: random832 at fastmail.com (Random832) Date: Mon, 14 Sep 2015 22:08:58 -0400 Subject: [Datetime-SIG] Are there any "correct" implementations of tzinfo?
In such a case, the actual pickle format would _still_ have consisted of __reduce__() == (datetime, (b"..........", [optional tzinfo])), just with the option of accepting (and ignoring) longer byte strings encoded by later versions of the datetime class. The pickle format is versatile enough to pass any (pickleable) value at all to a constructor (or to __setstate__). Designing the datetime constructor/setstate in the past to be able to accept a byte string of a length other than exactly 10 would have allowed the representation to be extended in the present, rather than smuggling a single extra bit into one of the existing bytes. But it would not have changed the actual representation that would have been produced by pickle back then, not one bit. And, now, to answer my own question from a previous message... >>> class C(): ... def __reduce__(self): ... return (datetime, (b"\x07\xdf\t\x0e\x155'\rA\xb2",)) ... >>> pickle.loads(pickle.dumps(C())) datetime.datetime(2015, 9, 14, 21, 53, 39, 868786) >>> class C(): ... def __reduce__(self): ... return (datetime, (b"\x07\xdf\t\x0e\x955'\rA\xb2",)) ... >>> pickle.loads(pickle.dumps(C())) datetime.datetime(2015, 9, 14, 149, 53, 39, 868786) >>> datetime.strftime(pickle.loads(pickle.dumps(C())), '%Y%m%d%H%M%S') Traceback (most recent call last): File "", line 1, in ValueError: hour out of range That was the bit we were talking about, right? From alexander.belopolsky at gmail.com Wed Sep 16 04:53:21 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Tue, 15 Sep 2015 22:53:21 -0400 Subject: [Datetime-SIG] Adding PEP 495 support to dateutil Message-ID: On Sat, Sep 12, 2015 at 9:58 PM, Tim Peters wrote: > I think acceptance of 495 should be contingent upon > someone first completing a fully functional (if not releasable) > fold-aware zoneinfo wrapping. > After studying both pytz and dateutil offerings, I decided that it is easier to add "fold-awareness" to the later. I created a fork [1] on Github and added [2] fold-awareness logic to the tzrange class that appears to be the base class for most other tzinfo implementations. I was surprised how few test cases had to be changed. It looks like dateutil test suit does not test questionable (in the absence of fold) behavior. I will need to beef up the test coverage. I am making all development public early on and hope to see code reviews and pull requests from interested parties. Pull requests with additional test cases are most welcome. [1]: https://github.com/abalkin/dateutil/tree/pep-0495 [2]: https://github.com/abalkin/dateutil/commit/57ecdbf481de7e21335ece8fcc5673d59252ec3f -------------- next part -------------- An HTML attachment was scrubbed... URL: From alexander.belopolsky at gmail.com Fri Sep 18 04:47:44 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Thu, 17 Sep 2015 22:47:44 -0400 Subject: [Datetime-SIG] Adding PEP 495 support to dateutil In-Reply-To: References: Message-ID: [Tim Peters] > > I think acceptance of 495 should be contingent upon > someone first completing a fully functional (if not releasable) > fold-aware zoneinfo wrapping. [Alexander Belopolsky] > > I am making all development public early on and hope to see code reviews and pull requests from interested parties. Pull requests with additional test cases are most welcome. I've made some additional progress in my dateutil fork [1]. The tzfile class is now fold-aware. The tzfile implementation of tzinfo takes the history of local time type changes from a binary zoneinfo file. 
These files are installed on the majority of UNIX platforms. More testing is needed, but I think my fork is now close to meeting Tim's challenge. Please note that you need to run the modified dateutil fork [1] code under PEP 495 fork of CPython. [2] [1]: https://github.com/abalkin/dateutil/tree/pep-0495 [2]: https://github.com/abalkin/cpython -------------- next part -------------- An HTML attachment was scrubbed... URL: From paul at ganssle.io Fri Sep 18 16:23:30 2015 From: paul at ganssle.io (Paul Ganssle) Date: Fri, 18 Sep 2015 10:23:30 -0400 Subject: [Datetime-SIG] Adding PEP 495 support to dateutil Message-ID: <55FC1E62.6060202@ganssle.io> > After studying both pytz and dateutil offerings, I decided that it is > easier to add "fold-awareness" to the later. I created a fork [1] on > Github and added [2] fold-awareness logic to the tzrange class that > appears to be the base class for most other tzinfo implementations. I > was surprised how few test cases had to be changed. It looks like > dateutil test suit does not test questionable (in the absence of fold) > behavior. I will need to beef up the test coverage. Just to clarify on the point of test coverage, I think one of the main reasons for this is that, at the moment, dateutil doesn't handle ambiguous times well (see Issue #57[1] and Issue #112[2]), so any such tests would likely be failing tests. At the moment, I can't comment on how easy this will be to implement in a release version of dateutil if PEP 495 is accepted because I haven't looked into it enough, but one thing to be aware of is that backwards-compatibility is a high priority here (we'll continue to support python 2.6+ for the foreseeable future), so any changes need to fall back to sane behavior. Preferably, they would fall back to the exact /same/ behavior, regardless of platform and python version. Of course, it doesn't seem like your goal right now is to build something that can roll out right away as soon as PEP 495 is integrated, so there's plenty of time to clean it up and possibly build in a compatibility module, I just thought I'd bring that up so you're aware. [1] https://github.com/dateutil/dateutil/issues/57 [2] https://github.com/dateutil/dateutil/issues/112 -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.peters at gmail.com Fri Sep 18 17:56:15 2015 From: tim.peters at gmail.com (Tim Peters) Date: Fri, 18 Sep 2015 10:56:15 -0500 Subject: [Datetime-SIG] Adding PEP 495 support to dateutil In-Reply-To: <55FC1E62.6060202@ganssle.io> References: <55FC1E62.6060202@ganssle.io> Message-ID: [Alex] >> After studying both pytz and dateutil offerings, I decided that it is >> easier to add "fold-awareness" to the later. I created a fork [1] on >> Github and added [2] fold-awareness logic to the tzrange class that appears >> to be the base class for most other tzinfo implementations. I was >> surprised how few test cases had to be changed. It looks like dateutil >> test suit does not test questionable (in the absence of fold) behavior. I >> will need to beef up the test coverage. [Paul Ganssle ] > Just to clarify on the point of test coverage, I think one of the main > reasons for this is that, at the moment, dateutil doesn't handle > ambiguous times well (see Issue #57[1] and Issue #112[2]), so any such > tests would likely be failing tests. Because dateutil inherits the default .fromutc(), it's all but certain it can't handle cases in the IANA database where a zone's base ("standard") offset changed either. 
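(For context on that last point: the default tzinfo.fromutc() that dateutil inherits is documented in the datetime docs as roughly equivalent to the pure-Python sketch below. It derives a presumed-constant "standard offset" as utcoffset() - dst() and only re-applies dst(), so a zone whose base offset changed over history falls outside what the algorithm can express. This is an illustrative sketch of the documented algorithm, not dateutil's actual code; the error messages are made up for the example.)

    def fromutc(self, dt):
        # dt is an aware datetime whose tzinfo is self, but whose
        # date/time fields hold the UTC value to be converted.
        if dt.tzinfo is not self:
            raise ValueError("dt.tzinfo is not self")
        dtoff = dt.utcoffset()
        dtdst = dt.dst()
        if dtoff is None or dtdst is None:
            raise ValueError("fromutc() requires non-None utcoffset() and dst()")
        delta = dtoff - dtdst      # presumed-constant "standard" offset
        if delta:
            dt += delta            # convert to "standard" local time
            dtdst = dt.dst()
        if dtdst:
            return dt + dtdst      # add the DST correction, if any
        return dt
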
But it's handling gaps & folds due to DST transitions as well as is _possible_ for a hybrid tzinfo given datetime's original design. There was no provision in datetime to make it possible for a hybrid tzinfo to know whether the earlier or later of an ambiguous local time is intended. That's not dateutil's fault, and not something any hybrid tzinfo can solve before PEP 495 is implemented. dateutil is following the doc's advice to consider an ambiguous time to be the later (in "standard time"), which in combination with inheriting the default .fromutc() is enough to ensure that UTC->local conversion at least mimics the hands on the local clock (skipping local times at DST start, and repeating some at DST end). So it's doing the best it _can_ do now in those respects. > At the moment, I can't comment on how easy this will be to implement in > a release version of dateutil if PEP 495 is accepted because I haven't > looked into it enough, but one thing to be aware of is that > backwards-compatibility is a high priority here (we'll continue to > support python 2.6+ for the foreseeable future), so any changes need to > fall back to sane behavior. Preferably, they would fall back to the > exact /same/ behavior, regardless of platform and python version. > > Of course, it doesn't seem like your goal right now is to build > something that can roll out right away as soon as PEP 495 is integrated, > so there's plenty of time to clean it up and possibly build in a > compatibility module, I just thought I'd bring that up so you're aware. The goal of PEP 495 is to make it possible for hybrid tzinfos to handle all cases of gaps and folds due to any cause whatsoever (provided that folds are never worse than 2-to-1), What Alex is really after here is to kick the tires on PEP 495, to make sure: 1. All cases in the IANA database are in fact solved (that database is the richest source of the goofiest zone changes to date). 2. That it's not only possible, but implementable with reasonable effort and performance. dateutil was "the obvious" base to start from, since it's the only widely used wrapping of the IANA database using hybrid tzinfos (pytz took a very different path). Whether dateutil can make _use_ of this experiment is up to you ;-) In cases where results differ from the current implementation, the latter results can only be called "wrong". Which you may well need to preserve. In which case, I'd suggest leaving the current implementation alone, and _adding_ a new wrapping of tzfiles based on Alex's code. dateutil's get-a-zone factory functions would need to grow some way to spell "I want a pre-495 tzinfo" or "I want a post-495 tzinfo". New functions, optional function flags, global setting ... whatever you think works best. Of course this would apply to wrappings of other sources of zone info too, but the IANA database must be by far the hardest (e.g., fold and gap times can be deduced directly from a POSIX TZ string rule, which are only subject to twice-a-year DST changes at worst). From paul at ganssle.io Fri Sep 18 19:05:40 2015 From: paul at ganssle.io (Paul Ganssle) Date: Fri, 18 Sep 2015 13:05:40 -0400 Subject: [Datetime-SIG] Adding PEP 495 support to dateutil In-Reply-To: References: <55FC1E62.6060202@ganssle.io> Message-ID: <55FC4464.9050201@ganssle.io> On 9/18/2015 11:56, Tim Peters wrote: > Because dateutil inherits the default .fromutc(), it's all but certain > it can't handle cases in the IANA database where a zone's base > ("standard") offset changed either. 
> > But it's handling gaps & folds due to DST transitions as well as is > _possible_ for a hybrid tzinfo given datetime's original design. There > was no provision in datetime to make it possible for a hybrid tzinfo > to know whether the earlier or later of an ambiguous local time is > intended. That's not dateutil's fault, and not something any hybrid > tzinfo can solve before PEP 495 is implemented. > > dateutil is following the doc's advice to consider an ambiguous time > to be the later (in "standard time"), which in combination with > inheriting the default .fromutc() is enough to ensure that UTC->local > conversion at least mimics the hands on the local clock (skipping > local times at DST start, and repeating some at DST end). So it's > doing the best it _can_ do now in those respects. This is quite possibly true, and is roughly in line with my thinking on the matter to date, but in my mind the behavior of dateutil with respect to ambiguous times is undefined, so I'm not going to add tests that enforce an arbitrary implementation choice as it's not behavior I want to lock down. It's a separate question as to whether it can or cannot do better in some cases. The issues I linked to are both cases where an unambiguously specified time ("now" or a time specified in UTC with an IANA time zone) is incorrectly converted into local time. It is almost certainly true that enough information is available to properly localize these datetimes, but at least in the case of localizing "now" the cost in doing so is additional complexity on the back-end. > > The goal of PEP 495 is to make it possible for hybrid tzinfos to > handle all cases of gaps and folds due to any cause whatsoever > (provided that folds are never worse than 2-to-1), What Alex is > really after here is to kick the tires on PEP 495, to make sure: > > 1. All cases in the IANA database are in fact solved (that database > is the richest source of the goofiest zone changes to date). > > 2. That it's not only possible, but implementable with reasonable effort > and performance. > > dateutil was "the obvious" base to start from, since it's the only > widely used wrapping of the IANA database using hybrid tzinfos (pytz > took a very different path). Yes, this was more or less my understanding. I just thought I'd put it out there in case the more complex nature of the actual implementation had some bearing on the thinking about the implementation. For example, these tests could be problematic from a backwards compatibility standpoint. I haven't had time to read the PEP or the discussion on the matter, so maybe this has already been considered, but it would make for a simpler interface if an unspecified value for fold left the old behavior intact. I'll definitely read these things when I have time, so if it's already been discussed no need to re-hash on my behalf. > Whether dateutil can make _use_ of this experiment is up to you ;-) > > In cases where results differ from the current implementation, the > latter results can only be called "wrong". Which you may well need to > preserve. > In which case, I'd suggest leaving the current implementation alone, > and _adding_ a new wrapping of tzfiles based on Alex's code. > dateutil's get-a-zone factory functions would need to grow some way to > spell "I want a pre-495 tzinfo" or "I want a post-495 tzinfo". New > functions, optional function flags, global setting ... whatever you > think works best.
> > Of course this would apply to wrappings of other sources of zone info > too, but the IANA database must be by far the hardest (e.g., fold and > gap times can be deduced directly from a POSIX TZ string rule, which > are only subject to twice-a-year DST changes at worst). I think it's likely premature (and the wrong forum) to discuss such downstream implementation details, but I imagine that it won't be difficult to devise some scheme that by default gives the right answer where possible, as long as there's a relatively straightforward way of wrapping datetimes such that it provides a consistent /interface/ across various platforms. As for the question of whether to preserve the "wrong" values for the sake of backwards compatibility, I'm not likely to sacrifice maximum /accuracy/ across platforms for maximum /consistency/ across platforms. But again, this is somewhat off-topic. -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 834 bytes Desc: OpenPGP digital signature URL: From guido at python.org Fri Sep 18 19:07:38 2015 From: guido at python.org (Guido van Rossum) Date: Fri, 18 Sep 2015 10:07:38 -0700 Subject: [Datetime-SIG] Adding PEP 495 support to dateutil In-Reply-To: <55FC4464.9050201@ganssle.io> References: <55FC1E62.6060202@ganssle.io> <55FC4464.9050201@ganssle.io> Message-ID: On Fri, Sep 18, 2015 at 10:05 AM, Paul Ganssle wrote: > On 9/18/2015 11:56, Tim Peters wrote: > > Because dateutil inherits the default .fromutc(), it's all but certain it > can't handle cases in the IANA database where a zone's base ("standard") > offset changed either. > > But it's handling gaps & folds due to DST transitions as well as is > _possible_ for a hybrid tzinfo given datetime's original design. There was > no provision in datetime to make it possible for a hybrid tzinfo to know > whether the earlier or later of an ambiguous local time is intended. That's > not dateutil's fault, and not something any hybrid tzinfo can solve before > PEP 495 is implemented. > > dateutil is following the doc's advice to consider an ambiguous time to be > the later (in "standard time"), which in combination with inheriting the > default .fromutc() is enough to ensure that UTC->local conversion at least > mimics the hands on the local clock (skipping local times at DST start, and > repeating some at DST end). So it's doing the best it _can_ do now in those > respects. > > This is quite possibly true, and is roughly in line with my thinking on > the matter to date, but in my mind the behavior of dateutil with respect to > ambiguous times is undefined, so I'm not going to add tests that enforce an > arbitrary implementation choice as it's not behavior I want to lock down. > Could you at least lock down that ambiguous times return *something* rather than raising an exception? Or perhaps even that they return one of two valid alternatives? > It's a separate question as to whether it can or cannot do better in some > cases. The issues I linked to are both cases where an unambiguously > specified time ("now" or a time specified in UTC with an IANA time zone) > are incorrectly converted into local time. It is almost certainly true > that enough information is available to properly localize these datetimes, > but at least in the case of localizing "now" the cost in doing so is > additional complexity on the back-end. 
> > > The goal of PEP 495 is to make it possible for hybrid tzinfos to > handle all cases of gaps and folds due to any cause whatsoever > (provided that folds are never worse than 2-to-1), What Alex is > really after here is to kick the tires on PEP 495, to make sure: > > 1. All cases in the IANA database are in fact solved (that database > is the richest source of the goofiest zone changes to date). > > 2. That it's not only possible, but implementable with reasonable effort > and performance. > > dateutil was "the obvious" base to start from, since it's the only > widely used wrapping of the IANA database using hybrid tzinfos (pytz > took a very different path). > > > Yes, this was more or less my understanding. I just thought I'd put it out > there in case the fact that the more complex nature of the actual > implementation had some bearing on the thinking about the implementation. > For example, these tests > > could be problematic from a backwards compatibility standpoint. I haven't > had time to read the PEP or the discussion on the matter, so maybe this has > already been considered, but would make for a simpler interface if an > unspecified value for fold left the old behavior intact. > > I'll definitely read these things when I have time, so if it's already > been discussed no need to re-hash on my behalf. > > Whether dateutil can make _use_ of this experiment is up to you ;-) > > In cases where results differ from the current implementation, the > latter results can only be called "wrong". Which you may well need to > preserve. > > In which case, I'd suggest leaving the current implementation alone, > and _adding_ a new wrapping of tzfiles based on Alex's code. > dateutil's get-a-zone factory functions would need to grow some way to > spell "I want a pre-495 tzinfo" or "I want a post-495 tzinfo". New > functions, optional function flags, global setting ... whatever you > think works best. > > Of course this would apply to wrappings of other sources of zone info > too, but the IANA database must be by far the hardest (e.g., fold and > gap times can be deduced directly from a POSIX TZ string rule, which > are only subject to twice-a-year DST changes at worst). > > I think it's likely premature (and the wrong forum) to discuss such > downstream implementation details, but I imagine that it won't be difficult > to devise some scheme that by default gives the right answer where > possible, as long as there's a relatively straightforward way of wrapping > datetimes such that it provides a consistent *interface* across various > platforms. > > As for the question of whether to preserve the "wrong" values for the sake > of backwards compatibility, I'm not likely to sacrifice maximum *accuracy* > across platforms for maximum *consistency* across platforms. But again, > this is somewhat off-topic. > > _______________________________________________ > Datetime-SIG mailing list > Datetime-SIG at python.org > https://mail.python.org/mailman/listinfo/datetime-sig > The PSF Code of Conduct applies to this mailing list: > https://www.python.org/psf/codeofconduct/ > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From alexander.belopolsky at gmail.com Fri Sep 18 19:36:30 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Fri, 18 Sep 2015 13:36:30 -0400 Subject: [Datetime-SIG] Adding PEP 495 support to dateutil In-Reply-To: <55FC4464.9050201@ganssle.io> References: <55FC1E62.6060202@ganssle.io> <55FC4464.9050201@ganssle.io> Message-ID: On Fri, Sep 18, 2015 at 1:05 PM, Paul Ganssle wrote: > I haven't had time to read the PEP or the discussion on the matter, so > maybe this has already been considered, but would make for a simpler > interface if an unspecified value for fold left the old behavior intact. > Yes this have been considered and there is a section [1] on this in the PEP. TL;DR: There will be no way to spell "fold=unspecified." We decided to change the current disambiguation rule (default to STD) because it does not work for roll-back transitions that don't change isdst. Furthermore, this rule is only needed to make default fromutc() work, but post-PEP tzinfos will have to override that method anyways. [1]: https://www.python.org/dev/peps/pep-0495/#backward-compatibility -------------- next part -------------- An HTML attachment was scrubbed... URL: From paul at ganssle.io Fri Sep 18 19:32:44 2015 From: paul at ganssle.io (Paul Ganssle) Date: Fri, 18 Sep 2015 13:32:44 -0400 Subject: [Datetime-SIG] Adding PEP 495 support to dateutil In-Reply-To: References: <55FC1E62.6060202@ganssle.io> <55FC4464.9050201@ganssle.io> Message-ID: <55FC4ABC.7020207@ganssle.io> This is a reasonable point that I'll have to mull over. I think by and large I'd prefer to be agnostic about these sorts of things if they don't have any bearing on my contribution to the interface. Everything below datetime uses native python modules, so making explicit guarantees about the results of what is essentially undefined behavior above and beyond the guarantees the interpreter / standard is already making seems unnecessary to me. That said, I'll have to go over the module in more detail and see places where the dateutil interface /should/ be making some guarantees about the behavior. On 9/18/2015 13:07, Guido van Rossum wrote: > On Fri, Sep 18, 2015 at 10:05 AM, Paul Ganssle > wrote: > > On 9/18/2015 11:56, Tim Peters wrote: >> Because dateutil inherits the default .fromutc(), it's all but >> certain it can't handle cases in the IANA database where a zone's >> base ("standard") offset changed either. >> >> But it's handling gaps & folds due to DST transitions as well as >> is _possible_ for a hybrid tzinfo given datetime's original >> design. There was no provision in datetime to make it possible >> for a hybrid tzinfo to know whether the earlier or later of an >> ambiguous local time is intended. That's not dateutil's fault, >> and not something any hybrid tzinfo can solve before PEP 495 is >> implemented. >> >> dateutil is following the doc's advice to consider an ambiguous >> time to be the later (in "standard time"), which in combination >> with inheriting the default .fromutc() is enough to ensure that >> UTC->local conversion at least mimics the hands on the local >> clock (skipping local times at DST start, and repeating some at >> DST end). So it's doing the best it _can_ do now in those respects. 
> This is quite possibly true, and is roughly in line with my > thinking on the matter to date, but in my mind the behavior of > dateutil with respect to ambiguous times is undefined, so I'm not > going to add tests that enforce an arbitrary implementation choice > as it's not behavior I want to lock down. > > > Could you at least lock down that ambiguous times return *something* > rather than raising an exception? Or perhaps even that they return one > of two valid alternatives? > > > It's a separate question as to whether it can or cannot do better > in some cases. The issues I linked to are both cases where an > unambiguously specified time ("now" or a time specified in UTC > with an IANA time zone) are incorrectly converted into local time. > It is//almost certainly true that enough information is available > to properly localize these datetimes, but at least in the case of > localizing "now" the cost in doing so is additional complexity on > the back-end. > >> >> The goal of PEP 495 is to make it possible for hybrid tzinfos to >> handle all cases of gaps and folds due to any cause whatsoever >> (provided that folds are never worse than 2-to-1), What Alex is >> really after here is to kick the tires on PEP 495, to make sure: >> >> 1. All cases in the IANA database are in fact solved (that database >> is the richest source of the goofiest zone changes to date). >> >> 2. That it's not only possible, but implementable with reasonable effort >> and performance. >> >> dateutil was "the obvious" base to start from, since it's the only >> widely used wrapping of the IANA database using hybrid tzinfos (pytz >> took a very different path). > > Yes, this was more or less my understanding. I just thought I'd > put it out there in case the fact that the more complex nature of > the actual implementation had some bearing on the thinking about > the implementation. For example, these tests > > could be problematic from a backwards compatibility standpoint. I > haven't had time to read the PEP or the discussion on the matter, > so maybe this has already been considered, but would make for a > simpler interface if an unspecified value for fold left the old > behavior intact. > > I'll definitely read these things when I have time, so if it's > already been discussed no need to re-hash on my behalf. > >> Whether dateutil can make _use_ of this experiment is up to you ;-) >> >> In cases where results differ from the current implementation, the >> latter results can only be called "wrong". Which you may well need to >> preserve. >> In which case, I'd suggest leaving the current implementation alone, >> and _adding_ a new wrapping of tzfiles based on Alex's code. >> dateutil's get-a-zone factory functions would need to grow some way to >> spell "I want a pre-495 tzinfo" or "I want a post-495 tzinfo". New >> functions, optional function flags, global setting ... whatever you >> think works best. >> >> Of course this would apply to wrappings of other sources of zone info >> too, but the IANA database must be by far the hardest (e.g., fold and >> gap times can be deduced directly from a POSIX TZ string rule, which >> are only subject to twice-a-year DST changes at worst). 
> I think it's likely premature (and the wrong forum) to discuss > such downstream implementation details, but I imagine that it > won't be difficult to devise some scheme that by default gives the > right answer where possible, as long as there's a relatively > straightforward way of wrapping datetimes such that it provides a > consistent /interface/ across various platforms. > > As for the question of whether to preserve the "wrong" values for > the sake of backwards compatibility, I'm not likely to sacrifice > maximum /accuracy/ across platforms for maximum /consistency/ > across platforms. But again, this is somewhat off-topic. > > _______________________________________________ > Datetime-SIG mailing list > Datetime-SIG at python.org > https://mail.python.org/mailman/listinfo/datetime-sig > The PSF Code of Conduct applies to this mailing list: > https://www.python.org/psf/codeofconduct/ > > > > > -- > --Guido van Rossum (python.org/~guido ) -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 834 bytes Desc: OpenPGP digital signature URL: From alexander.belopolsky at gmail.com Fri Sep 18 20:06:45 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Fri, 18 Sep 2015 14:06:45 -0400 Subject: [Datetime-SIG] Adding PEP 495 support to dateutil In-Reply-To: <55FC4ABC.7020207@ganssle.io> References: <55FC1E62.6060202@ganssle.io> <55FC4464.9050201@ganssle.io> <55FC4ABC.7020207@ganssle.io> Message-ID: On Fri, Sep 18, 2015 at 1:32 PM, Paul Ganssle wrote: > Everything below datetime uses native python modules, so making explicit > guarantees about the results of what is essentially undefined behavior > above and beyond the guarantees the interpreter / standard is already > making seems unnecessary to me. If you look at the standard library tests [1], you will see that we guarantee consistency in all but the most extreme edge cases. (E.g., conversion between timezones with overlapping but not equal folds is one such case. [2]) I've found that dateutil test coverage is very good for the utcoffset()/tzname()/dst() triad, but it is less thorough for anything that involves fromutc(). (I believe I had to adjust only one test case when I added fold-awareness to fromutc().) This is understandable because you rely on the stdlib version of fromutc(), but this is a problem in itself. We know that the default fromutc() is only adequate for tzrange and very simple tzfile cases. I suspect dateutil has problems that are not limited to ambiguous datetimes in some IANA time zones. [1]: https://hg.python.org/cpython/file/v3.5.0/Lib/test/datetimetester.py#l2818 [2]: https://hg.python.org/cpython/file/v3.5.0/Lib/test/datetimetester.py#l3640 -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.peters at gmail.com Fri Sep 18 20:42:59 2015 From: tim.peters at gmail.com (Tim Peters) Date: Fri, 18 Sep 2015 13:42:59 -0500 Subject: [Datetime-SIG] Adding PEP 495 support to dateutil In-Reply-To: <55FC4464.9050201@ganssle.io> References: <55FC1E62.6060202@ganssle.io> <55FC4464.9050201@ganssle.io> Message-ID: [Paul Ganssle ] > .. > It's a separate question as to whether it can or cannot do better in some > cases. The issues I linked to are both cases where an unambiguously > specified time ("now" or a time specified in UTC with an IANA time zone) are > incorrectly converted into local time. 
It is almost certainly true that > enough information is available to properly localize these datetimes, but at > least in the case of localizing "now" the cost in doing so is additional > complexity on the back-end. When converting from UTC to a local ambiguous time, you obviously know which UTC time you started with. The problem is that it's impossible to _record_ which UTC time you started with. The date and time attributes of the local datetimes are (must be) identical, so the only way you _could_ record it is by overriding .fromutc() to attach a different tzinfo object (the only bits of a datetime object that could possibly differ between the earlier and later of an ambiguous local time). Which is what pytz does. But then the semantics of arithmetic changes too, because datetime subtraction and comparison do different things depending on whether or not the datetimes' tzinfo objects are identical (same object). This is why POSIX has a tm_isdst flag in a struct tm (the POSIX spelling of a Python datetime), to record whether an ambiguous local time is intended to be the earlier or later. PEP 495's new `fold` flag is the same so far as DST transitions go, but is also clearly applicable to all possible causes of folds (including a zone's "standard" offset changing). From tim.peters at gmail.com Fri Sep 18 21:00:57 2015 From: tim.peters at gmail.com (Tim Peters) Date: Fri, 18 Sep 2015 14:00:57 -0500 Subject: [Datetime-SIG] Adding PEP 495 support to dateutil In-Reply-To: References: <55FC1E62.6060202@ganssle.io> <55FC4464.9050201@ganssle.io> <55FC4ABC.7020207@ganssle.io> Message-ID: [Alex] > ... > I suspect dateutil has problems that are not limited to ambiguous > datetimes in some IANA time zones. For pytz, Stuart said he ran zdump across all zones in the database, to drive exhaustive tests of all transition instants in every zone. That's an excellent idea :-) I strongly suspect dateutil will get some cases wrong simply because it's paying attention to the gmt/std/wall indicators in tzfiles. Those have no meaning for anything a tzinfo is trying to accomplish - it's an "attractive nuisance" that they're even stored in a tzfile. To convert transition times from UTC to local times (as dateutil appears to want to do), it should simply add the current total UTC offset, ignoring the gmt/std/wall indicators entirely. All transition times in tzfiles are recorded in UTC, regardless of what the gmt/std/wall indicators say. That won't make any difference for "most" zones because it just so happens that the "wall" indicator is set for most transitions and the "std" indicator is not (reflecting that most zoneinfo _source_ files record DST transition points in local wall-clock time). An exhaustive test would stumble into the exceptions. The way to fix broken cases discovered this way is to just ignore gmt/std/wall (better, seek over 'em when reading the file - they're useless). From alexander.belopolsky at gmail.com Fri Sep 18 21:01:16 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Fri, 18 Sep 2015 15:01:16 -0400 Subject: [Datetime-SIG] Adding PEP 495 support to dateutil In-Reply-To: References: <55FC1E62.6060202@ganssle.io> <55FC4464.9050201@ganssle.io> Message-ID: On Fri, Sep 18, 2015 at 2:42 PM, Tim Peters wrote: > > When converting from UTC to a local ambiguous time, you obviously know > which UTC time you started with. The problem is that it's impossible > to _record_ which UTC time you started with. 
The date and time > attributes of the local datetimes are (must be) identical, so the only > way you _could_ record it is by overriding .fromutc() to attach a > different tzinfo object (the only bits of a datetime object that could > possibly differ between the earlier and later of an ambiguous local > time). > > Which is what pytz does. The pytz hack is in violation of the strict reading of the reference manual [1] which says "The purpose of fromutc() is to adjust the date and time data ...". I think it is in the spirit, if not in the letter, of the datetime module design that fromutc(dt) should not change dt.tzinfo. In any case, I think we have concluded on this list that the pytz approach is not an example to be followed. I just wanted to mention for Paul's benefit that it is not just the arithmetic that is affected by the pytz hack. The changes in arithmetic are themselves consequences of the violation of the "fromutc(dt).tzinfo is dt.tzinfo" invariant. [1]: https://docs.python.org/3/library/datetime.html#datetime.tzinfo.fromutc -------------- next part -------------- An HTML attachment was scrubbed... URL: From paul at ganssle.io Fri Sep 18 21:16:20 2015 From: paul at ganssle.io (Paul Ganssle) Date: Fri, 18 Sep 2015 15:16:20 -0400 Subject: [Datetime-SIG] Adding PEP 495 support to dateutil In-Reply-To: References: <55FC1E62.6060202@ganssle.io> <55FC4464.9050201@ganssle.io> <55FC4ABC.7020207@ganssle.io> Message-ID: Appreciate the advice, I have to admit that these edge cases seem rare enough that they haven't been a priority for me (I'm still trying to wrap up a release that doesn't occasionally break the parser for certain strings on the 29th-31st of every month, for example). Much to think about here. I like that zdump idea as a method for test discovery. On Sep 18, 2015 3:01 PM, "Tim Peters" wrote: > [Alex] > > ... > > I suspect dateutil has problems that are not limited to ambiguous > > datetimes in some IANA time zones. > > For pytz, Stuart said he ran zdump across all zones in the database, > to drive exhaustive tests of all transition instants in every zone. > That's an excellent idea :-) > > I strongly suspect dateutil will get some cases wrong simply because > it's paying attention to the gmt/std/wall indicators in tzfiles. > Those have no meaning for anything a tzinfo is trying to accomplish - > it's an "attractive nuisance" that they're even stored in a tzfile. > To convert transition times from UTC to local times (as dateutil > appears to want to do), it should simply add the current total UTC > offset, ignoring the gmt/std/wall indicators entirely. All transition > times in tzfiles are recorded in UTC, regardless of what the > gmt/std/wall indicators say. > > That won't make any difference for "most" zones because it just so > happens that the "wall" indicator is set for most transitions and the > "std" indicator is not (reflecting that most zoneinfo _source_ files > record DST transition points in local wall-clock time). An exhaustive > test would stumble into the exceptions. The way to fix broken cases > discovered this way is to just ignore gmt/std/wall (better, seek over > 'em when reading the file - they're useless). > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From 4kir4.1i at gmail.com Fri Sep 18 22:59:50 2015 From: 4kir4.1i at gmail.com (Akira Li) Date: Fri, 18 Sep 2015 23:59:50 +0300 Subject: [Datetime-SIG] Adding PEP 495 support to dateutil In-Reply-To: (Alexander Belopolsky's message of "Fri, 18 Sep 2015 15:01:16 -0400") References: <55FC1E62.6060202@ganssle.io> <55FC4464.9050201@ganssle.io> Message-ID: <87a8sj4f5l.fsf@gmail.com> Alexander Belopolsky writes: > On Fri, Sep 18, 2015 at 2:42 PM, Tim Peters wrote: >> >> When converting from UTC to a local ambiguous time, you obviously know >> which UTC time you started with. The problem is that it's impossible >> to _record_ which UTC time you started with. The date and time >> attributes of the local datetimes are (must be) identical, so the only >> way you _could_ record it is by overriding .fromutc() to attach a >> different tzinfo object (the only bits of a datetime object that could >> possibly differ between the earlier and later of an ambiguous local >> time). >> >> Which is what pytz does. > > The pytz hack is in violation of the strict reading of the reference manual > [1] which says "The purpose of fromutc() is to adjust the date and time > data ...". I think it is in the spirit if not in the letter of datetime > module design that fromutc(dt) should not change dt.tzinfo. pytz's fromutc() returns the correct* result. dateutil can't do it (at the moment) https://github.com/dateutil/dateutil/issues/112 * The word "correct" here does not depend on the programming language specification and/or its implementation, e.g., from the bug description: Input: 2011-11-06 05:30:00 UTC+0000, America/Toronto Expected: 2011-11-06 01:30:00 EDT-0400 If the reference manual mandates a different result then it is wrong. > In any case, I think we have concluded on this list that pytz approach is > not an example to be followed. I just wanted to mention for Paul's benefit > that it is not just the arithmetic that is affected by the pytz hack. The > changes in arithmetic are themselves consequences of the violation of the > "fromutc(dt).tzinfo is dt.tzinfo" invariant. Consider the following (natural) equality: tz.fromutc(utc_time) == utc_time.replace(tzinfo=utc_tz).astimezone(tz) The right side allows *utc_time.tzinfo* to be None, or *utc_time.tzinfo* may be some equivalent of *timezone.utc*. It is confusing that the method named *fromutc()* (its stdlib implementation) rejects *utc_time* if it is in utc timezone. stdlib's behavior that mandates utc_time.tzinfo == tz where tz may have non-zero utc offset is weird (mind-bending -- input time must be utc but tzinfo is not utc -- wtf). There is no need to attach *tz* before calling *tz.fromutc()* -- tz is passed as *self* anyway. > [1]: https://docs.python.org/3/library/datetime.html#datetime.tzinfo.fromutc From alexander.belopolsky at gmail.com Fri Sep 18 23:03:35 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Fri, 18 Sep 2015 17:03:35 -0400 Subject: [Datetime-SIG] Adding PEP 495 support to dateutil In-Reply-To: <87a8sj4f5l.fsf@gmail.com> References: <55FC1E62.6060202@ganssle.io> <55FC4464.9050201@ganssle.io> <87a8sj4f5l.fsf@gmail.com> Message-ID: On Fri, Sep 18, 2015 at 4:59 PM, Akira Li <4kir4.1i at gmail.com> wrote: > stdlib's behavior that mandates utc_time.tzinfo == tz where tz may have > non-zero utc offset is weird (mind-bending -- input time must be utc but > tzinfo is not utc -- wtf). > Please stop fighting decisions that were made 12 years ago. You cannot win regardless of the merits of your arguments. 
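For readers trying to follow the convention being argued over, here is a small stdlib-only illustration; a fixed-offset zone stands in for a real hybrid tzinfo, and the variable names are arbitrary:

    from datetime import datetime, timedelta, timezone

    est = timezone(timedelta(hours=-5), "EST")   # fixed offset, illustration only
    u = datetime(2015, 9, 18, 17, 0, tzinfo=timezone.utc)

    # The intended, convenient spelling:
    local = u.astimezone(est)

    # What astimezone() does under the hood: the UTC date/time is first
    # re-labelled with the *target* tzinfo, and fromutc() then adjusts
    # the date and time members.
    local2 = est.fromutc(u.replace(tzinfo=est))

    assert local == local2   # both say 12:00 EST

Calling est.fromutc(u) directly, without the replace(), raises ValueError because u.tzinfo is not est -- that is the calling convention the stdlib documentation requires.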
-------------- next part -------------- An HTML attachment was scrubbed... URL: From random832 at fastmail.com Fri Sep 18 23:29:20 2015 From: random832 at fastmail.com (Random832) Date: Fri, 18 Sep 2015 17:29:20 -0400 Subject: [Datetime-SIG] Adding PEP 495 support to dateutil In-Reply-To: References: <55FC1E62.6060202@ganssle.io> <55FC4464.9050201@ganssle.io> <55FC4ABC.7020207@ganssle.io> Message-ID: <1442611760.2536451.387678633.3DAFF41C@webmail.messagingengine.com> On Fri, Sep 18, 2015, at 15:00, Tim Peters wrote: > Those have no meaning for anything a tzinfo is trying to accomplish - > it's an "attractive nuisance" that they're even stored in a tzfile. For background information: the purpose of storing them in a tzfile is to allow that tzfile to be used as a template for dynamically creating timezones with the same rules but other offsets. This is used for the timezone named "posixrules" - which is a US timezone (America/New_York) by default - to generate timezones for POSIX timezone strings that don't explicitly specify their daylight rules. They should not be used for normal interpretation of a timezone. From tim.peters at gmail.com Sat Sep 19 03:02:21 2015 From: tim.peters at gmail.com (Tim Peters) Date: Fri, 18 Sep 2015 20:02:21 -0500 Subject: [Datetime-SIG] Adding PEP 495 support to dateutil In-Reply-To: References: <55FC1E62.6060202@ganssle.io> <55FC4464.9050201@ganssle.io> Message-ID: [Tim] >> When converting from UTC to a local ambiguous time, you obviously know >> which UTC time you started with. The problem is that it's impossible >> to _record_ which UTC time you started with. The date and time >> attributes of the local datetimes are (must be) identical, so the only >> way you _could_ record it is by overriding .fromutc() to attach a >> different tzinfo object (the only bits of a datetime object that could >> possibly differ between the earlier and later of an ambiguous local >> time). >> >> Which is what pytz does. [Alex] > The pytz hack is in violation of the strict reading of the reference manual > [1] which says "The purpose of fromutc() is to adjust the date and time data > ...". I think it is in the spirit if not in the letter of datetime module > design that fromutc(dt) should not change dt.tzinfo. It's certainly "in the spirit" not to change it. I wrote that part of the docs, and it never occurred to me that anyone would even _consider_ changing it ;-) > In any case, I think we have concluded on this list that pytz approach is > not an example to be followed. Well, it was dead easy to establish it wasn't Guido's intent as the primary original designer, or my intent as the primary original implementer & doc author - all anyone ever had to do to establish _that_ was to ask us ;-) I happen to still believe that a "hybrid" tzinfo is the best approach, but appreciate that pytz solved a world of problems with its approach (while creating others). I really can't tell if a consensus has been reached among the relative handful of datetime-SIG participants. Which means there is no consensus. > I just wanted to mention for Paul's benefit > that it is not just the arithmetic that is affected by the pytz hack. The > changes in arithmetic are themselves consequences of the violation of the > "fromutc(dt).tzinfo is dt.tzinfo" invariant. Paul, something else you should know: you don't _have_ to change anything if PEP 495 is implemented. That alone shouldn't change any results dateutil computes in any case. 
dateutil will simply ignore `fold` then, and compute the same results it computes today. The intent is to make it _possible_ for dateutil to get conversions exactly right in every case, which it cannot do today. From tim.peters at gmail.com Sat Sep 19 03:10:26 2015 From: tim.peters at gmail.com (Tim Peters) Date: Fri, 18 Sep 2015 20:10:26 -0500 Subject: [Datetime-SIG] Adding PEP 495 support to dateutil In-Reply-To: References: <55FC1E62.6060202@ganssle.io> <55FC4464.9050201@ganssle.io> <55FC4ABC.7020207@ganssle.io> Message-ID: [Paul Ganssle ] > Appreciate the advice, I have to admit that these edge cases seem rare > enough that they haven't been a priority for me On a third look, I think you can ignore my rant about the gmt/std/wall indicators: those don't appear to be _used_ at all in the current dateutil code. I was either hallucinating, or (mis)remembering some older version of the code. But since they're not used, you could save some memory space & cycles by not bothering to read them from the tzfile to begin with. About edge cases, as before it's simply not possible to get them all right today, nor to get as many right as _is_ possible for IANA zones today without overriding .fromutc(). If I were you I'd wait to see PEP 495's fate. Then "always right all the time" could become possible. > (I'm still trying to wrap up a release that doesn't occasionally break the > parser for certain strings on the 29th-31st of every month, for example). Just fix the 30th, and call it progress ;-) From tim.peters at gmail.com Sat Sep 19 03:33:48 2015 From: tim.peters at gmail.com (Tim Peters) Date: Fri, 18 Sep 2015 20:33:48 -0500 Subject: [Datetime-SIG] Adding PEP 495 support to dateutil In-Reply-To: <87a8sj4f5l.fsf@gmail.com> References: <55FC1E62.6060202@ganssle.io> <55FC4464.9050201@ganssle.io> <87a8sj4f5l.fsf@gmail.com> Message-ID: [Akira Li <4kir4.1i at gmail.com>] > pytz's fromutc() returns the correct* result. dateutil can't do it (at > the moment) https://github.com/dateutil/dateutil/issues/112 Do you understand why PEP 495 is being proposed? > ... > Consider the following (natural) equality: > > tz.fromutc(utc_time) == utc_time.replace(tzinfo=utc_tz).astimezone(tz) Clear as mud to me ;-) > The right side allows *utc_time.tzinfo* being None or *utc_time.tzinfo* > may be some equivalent of *timezone.utc*. Since the RHS replaces utc_time.tzinfo before using utc_time, the RHS "allows" utc_time.tzinfo to be anything whatsoever at the start. > It is confusing that the method named *fromutc()* (its stdlib implementation) > rejects *utc_time* if it is in utc timezone. But your use, despite your claim of being "natural", is highly _un_natural. The natural use of .astimezone() is to invoke it _from_ a datetime object:

    a_datetime.astimezone(tz)

.fromutc() was rarely intended to be invoked directly, except perhaps by tzinfo authors. In that context, its real use is to help implement .astimezone(). And its calling conventions are natural in that context:

    def datetime.astimezone(self, tz):
        myoffset = self.utcoffset()
        utc = (self - myoffset).replace(tzinfo=tz)
        return tz.fromutc(utc)

> stdlib's behavior that mandates utc_time.tzinfo == tz Not "==", "is". > where tz may have non-zero utc offset is weird (mind-bending -- > input time must be utc but tzinfo is not utc -- wtf). There is no > need to attach *tz* before calling *tz.fromutc()* -- tz is passed > as *self* anyway. Redundancy helps catch programming errors. 
I know darned well this check helped catch errors I made when implementing this stuff to begin with. There's always potential confusion when one object delegates operations to operations of the same names implemented by a contained object. If you don't like it, tough ;-) Stick to using astimezone() and leave the internals alone. If you are going to play with the internals, follow the rules, It's not like they weren't documented ;-) From 4kir4.1i at gmail.com Sat Sep 19 05:19:33 2015 From: 4kir4.1i at gmail.com (Akira Li) Date: Sat, 19 Sep 2015 06:19:33 +0300 Subject: [Datetime-SIG] Adding PEP 495 support to dateutil In-Reply-To: (Tim Peters's message of "Fri, 18 Sep 2015 20:33:48 -0500") References: <55FC1E62.6060202@ganssle.io> <55FC4464.9050201@ganssle.io> <87a8sj4f5l.fsf@gmail.com> Message-ID: <87wpvn2j0a.fsf@gmail.com> Tim Peters writes: > [Akira Li <4kir4.1i at gmail.com>] >> pytz's fromutc() returns the correct* result. dateutil can't do it (at >> the moment) https://github.com/dateutil/dateutil/issues/112 > > Do you understand why PEP 495 is being proposed? > Yes, that is why I said "at the moment" https://www.python.org/dev/peps/pep-0495/#rationale https://github.com/python/peps/blob/70c78c6c48f9f025f0485f4a756b313d414b5786/pep-0495.txt#L31-L54 >> ... >> Consider the following (natural) equality: >> >> tz.fromutc(utc_time) == utc_time.replace(tzinfo=utc_tz).astimezone(tz) > > Clear as mud to me ;-) > > >> The right side allows *utc_time.tzinfo* being None or *utc_time.tzinfo* >> may be some equivalent of *timezone.utc*. > > Since the RHS replaces utc_time.tzinfo before using utc_time, the RHS > "allows" utc_time.tzinfo to be anything whatsoever at the start. "anything whatsover" would conflict with the _name_ *utc_time*. If *utc_time* is a naive datetime object then it may be interpreted as utc time in a given program. If *utc_time* is timezone-aware then utc_time.tzinfo being an equivalent of timezone.utc is not surprising too. >> It is confusing that the method named *fromutc()* (its stdlib implementation) >> rejects *utc_time* if it is in utc timezone. > > But your use, despite your claim of being "natural", is highly > _un_natural. The natural use of .astimezone() is to invoke it _from_ > a datetime object: > > a_datetime.astimezone(tz) > > .fromutc() was rarely intended to be invoked directly, except perhaps > by tzinfo authors. In that context, its real use is to help implement > .astimezone(), And its calling conventions are natural in that > context: > > def datetime.astimezone(self, tz): > myoffset = self.utcoffset() > utc = (self - myoffset).replace(tzinfo=tz) > return tz.fromutc(utc) > > > >> stdlib's behavior that mandates utc_time.tzinfo == tz > > Not "==", "is". Yes, it was an error. Though it does not change the meaning of the sentence i.e., any value except None or timezone.utc analog is surprising for utc_time.tzinfo >> where tz may have non-zero utc offset is weird (mind-bending -- >> input time must be utc but tzinfo is not utc -- wtf). There is no >> need to attach *tz* before calling *tz.fromutc()* -- tz is passed >> as *self* anyway. > > Redundancy helps catch programming errors. I know darned well this > check helped catch errors I made when implementing this stuff to begin > with. There's always potential confusion when one object delegates > operations to operations of the same names implemented by a contained > object. > > If you don't like it, tough ;-) Stick to using astimezone() and leave > the internals alone. 
If you are going to play with the internals, > follow the rules, It's not like they weren't documented ;-) To be clear, it is not a suggestion to change anything in stdlib. It was a reaction to the earlier message in this thread, to point out why stdlib's fromutc() API is not the example that should be followed. Thank you for providing the explicit reasons for the specific choices in the API design: "redundency helps" and fromutc() is semi-private. I can't remember when I've used fromutc() directly (It is used indirectly via datetime.now(tz), datetime.fromtimestamp(ts, tz), d.astimezone(tz), tz.normalize()). From tim.peters at gmail.com Sat Sep 19 05:25:29 2015 From: tim.peters at gmail.com (Tim Peters) Date: Fri, 18 Sep 2015 22:25:29 -0500 Subject: [Datetime-SIG] Adding PEP 495 support to dateutil In-Reply-To: <87wpvn2j0a.fsf@gmail.com> References: <55FC1E62.6060202@ganssle.io> <55FC4464.9050201@ganssle.io> <87a8sj4f5l.fsf@gmail.com> <87wpvn2j0a.fsf@gmail.com> Message-ID: [Akira Li <4kir4.1i at gmail.com>] > ... > To be clear, it is not a suggestion to change anything in stdlib. It was > a reaction to the earlier message in this thread, to point out why > stdlib's fromutc() API is not the example that should be followed. Thank > you for providing the explicit reasons for the specific choices in the > API design: "redundency helps" and fromutc() is semi-private. I can't > remember when I've used fromutc() directly (It is used indirectly via > datetime.now(tz), datetime.fromtimestamp(ts, tz), d.astimezone(tz), Which are part of Python. > tz.normalize()). Which is unique to pytz. So, yes, it's used as intended, by _implementations_ of higher-level methods. In those contexts, "convenience" is of no importance, but the value of catching errors (by implementers!) is of supreme importance. From alexander.belopolsky at gmail.com Sat Sep 19 07:09:01 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Sat, 19 Sep 2015 01:09:01 -0400 Subject: [Datetime-SIG] Adding PEP 495 support to dateutil In-Reply-To: <87wpvn2j0a.fsf@gmail.com> References: <55FC1E62.6060202@ganssle.io> <55FC4464.9050201@ganssle.io> <87a8sj4f5l.fsf@gmail.com> <87wpvn2j0a.fsf@gmail.com> Message-ID: On Fri, Sep 18, 2015 at 11:19 PM, Akira Li <4kir4.1i at gmail.com> wrote: > It was > a reaction to the earlier message in this thread, to point out why > stdlib's fromutc() API is not the example that should be followed. > You don't have to "follow" it but you must understand what datetime module expects from you as a tzinfo implementer if you decide to override the default fromutc() implementation. What Stuart did in pytz was a hack that the authors of the original design did not expect. I think you find fromutc() design unnatural because you have a different view of what datetime instances are. I believe for you, datetime instances are labels on a time line, but they are not. They are more like clock faces. Aware datetimes are clocks with stickers that say "New York", "Madrid", etc. The label tells you how to interpret the time that the clock shows, but that time does not have to be "current" or "accurate" time at the location written on the label. You can take a "Madrid" clock and set it to show "current" New York time. Nothing in datetime module will stop you even if you set the time that falls in Madrid "gap" and makes no sense there. The fromutc() method helps you to set your New York clock if you know "current" UTC time. 
The instructions are simple: set "current" UTC time on your New York clock and call fromutc(). If you adopt this mental picture, then the idea of replacing tzinfo on a datetime becomes absurd. Why would you want ruin a perfectly good "New York" clock simply because it comes from Geneva showing time that is 5 hours ahead? You don't rip off the "New York" label - you just wind the clock back 5 hours. -------------- next part -------------- An HTML attachment was scrubbed... URL: From alexander.belopolsky at gmail.com Sat Sep 19 22:16:01 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Sat, 19 Sep 2015 16:16:01 -0400 Subject: [Datetime-SIG] Computing .dst() as a timedelta Message-ID: The datetime.dst() and its namesake tzinfo.dst() [1] methods are required to return a timedelta object that represents a quantity added to standard time in a spring-forward transition. As explained in documentation, the dst() value is already incorporated in the value returned by utcoffset() and is not needed in typical calculations. Therefore, it is not surprising that both dateutil and pytz get it wrong in some cases. [2,3] While pytz does slightly better than dateutil, it looks like it may not be possible to derive the correct value of dst() from the compiled binary tzfiles alone in all cases. The problematic cases are transitions that involve a simultaneous change in standard time and a DST transition. For example, Portugal switching from CET to WEST in 1996. [2] While the "SAVE" amount can be found in the raw tzdist files, this information is lost when the raw files are compiled. The transition information includes only the full new UTC offset and a boolean isdst flag. If the transition is a pure DST transition, then dst() is just the difference between the new UTC offset and the old, but if the standard time offset changes at the time of the DST transition, there is no information in the binary tzfile to split the full difference into standard time change and DST adjustment. Unless I miss something, it looks like a high-quality tzinfo implementation should extract the "SAVE" information from the raw files. [1]: https://docs.python.org/3/library/datetime.html#datetime.tzinfo.dst [2]: https://github.com/dateutil/dateutil/issues/128 [3]: https://bugs.launchpad.net/pytz/+bug/1497619 -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.peters at gmail.com Sun Sep 20 00:01:59 2015 From: tim.peters at gmail.com (Tim Peters) Date: Sat, 19 Sep 2015 17:01:59 -0500 Subject: [Datetime-SIG] Computing .dst() as a timedelta In-Reply-To: References: Message-ID: [Alexander Belopolsky ] > The datetime.dst() and its namesake tzinfo.dst() [1] methods are required to > return a timedelta object that represents a quantity added to standard time > in a spring-forward transition. > > As explained in documentation, the dst() value is already incorporated in > the value returned by utcoffset() and is not needed in typical calculations. > Therefore, it is not surprising that both dateutil and pytz get it wrong in > some cases. [2,3] Ya, the docs over-promised here ;-) I think the only "important" invariant to maintain is that _some_ kind of DST is in effect if and only if .dst() != timedelta(0). > While pytz does slightly better than dateutil, it looks like it may not be > possible to derive the correct value of dst() from the compiled binary > tzfiles alone in all cases. 
You're right, it can't, but for a more general reason than what you give next: at base, it's impossible to always know what a zone's "standard offset" is from what a tzfile stores, even though the zoneinfo source (text) files do spell that out. > The problematic cases are transitions that involve a simultaneous change in > standard time and a DST transition. For example, Portugal switching from > CET to WEST in 1996. [2] Specifically, on 1996-03-31 that simultaneously switched from CET (standard time) to WEST (daylight time), yes? The total UTC offset was 1:00:00 both before and after. In cases "like this", you can search either backward or forward in the transition list, to find a closest _different_ DST switch, and calculate a change of 1 hour either way. So it's "almost certain" that the DST offset is an hour in this case too. A case where that doesn't work, unless squinting: that place in Antarctica with two kinds of DST each year. The total UTC offset increases by 1 when the first DST kicks in, and by 1 again when the second kicks in. So, in the second case, the delta between adjacent total UTC offsets is just 1, despite that the (total) DST offset is actually 2. Which suggests a more general "good guess": If the transition record says DST is not in effect, dst() should return timedelta(0). Else it says DST is in effect. If the prior transition record says it was not in effect and the total UTC offsets differ, .dst() should return their difference. Else the total offsets are the same, or DST is in effect for both. Search back to find the closest preceding time DST switched. Use the total UTC offset from the "not DST" half of that switch instead. If none can be found going backward, go forward instead. And if both searches fail, return timedelta(hours=1). > While the "SAVE" amount can be found in the raw tzdist files, this > information is lost when the raw files are compiled. The transition > information includes only the full new UTC offset and a boolean isdst flag. > If the transition is a pure DST transition, then dst() is just the > difference between the new UTC offset and the old, but if the standard time > offset changes at the time of the DST transition, there is no information in > the binary tzfile to split the full difference into standard time change and > DST adjustment. > > Unless I miss something, it looks like a high-quality tzinfo implementation > should extract the "SAVE" information from the raw files. I will continue to draw a distinction between "high quality" and "timezone wonk" quality ;-) > [1]: https://docs.python.org/3/library/datetime.html#datetime.tzinfo.dst > [2]: https://github.com/dateutil/dateutil/issues/128 > [3]: https://bugs.launchpad.net/pytz/+bug/1497619 From tim.peters at gmail.com Sun Sep 20 02:11:11 2015 From: tim.peters at gmail.com (Tim Peters) Date: Sat, 19 Sep 2015 19:11:11 -0500 Subject: [Datetime-SIG] Computing .dst() as a timedelta In-Reply-To: References: Message-ID: [Alex] > I wonder, what's the point of saving daylight at the place where sun does > not set? (or does not rise depending on the time of the year?) What's the point of DST _anywhere_? Politics :-) But in Antarctica, the base notion of "time zone" itself is essentially senseless: https://en.wikipedia.org/wiki/Time_in_Antarctica """ Antarctica sits on every line of longitude, due to the South Pole being situated near the middle of the continent. 
Theoretically Antarctica would be located in all time zones; however, areas south of the Antarctic Circle experience extreme day-night cycles near the times of the June and December solstices, making it difficult to determine which time zone would be appropriate. For practical purposes time zones are usually based on territorial claims; however, many stations use the time of the country they are owned by or the time zone of their supply base (e.g. McMurdo Station and Amundsen?Scott South Pole Station use New Zealand time due to their main supply base beingChristchurch, New Zealand).[1] Nearby stations can have different time zones, due to their belonging to different countries. Many areas have no time zone since nothing is decided and there are not even any temporary settlements that have any clocks. They are simply labeled with UTC time.[2] """ Then there's a list of "standard" UTC offsets for various Antarctica locations, varying from -4 to +12. From alexander.belopolsky at gmail.com Sun Sep 20 02:14:56 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Sat, 19 Sep 2015 20:14:56 -0400 Subject: [Datetime-SIG] Computing .dst() as a timedelta In-Reply-To: References: Message-ID: On Sat, Sep 19, 2015 at 8:04 PM, Alexander Belopolsky < alexander.belopolsky at gmail.com> wrote: > > > On Sat, Sep 19, 2015 at 6:01 PM, Tim Peters wrote: >> >> that place in Antarctica with two kinds of DST each year. > > > I wonder, what's the point of saving daylight at the place where sun does not set? (or does not rise depending on the time of the year?) Tim, are you referring to the "Troll" rule? [1] That's a strange beast indeed and a comment above it says: # The CET-switching Troll rules require zic from tzcode 2014b or later, so as # suggested by Bengt-Inge Larsson comment them out for now, and approximate # with only UTC and CEST. Uncomment them when 2014b is more prevalent. On the other hand, I don't see any challenges to PEP 495 there other than finding means to extract the relevant information. Maybe I should hand-code this rule as demo/test case. [1]: https://github.com/eggert/tz/blob/master/antarctica#L217 -------------- next part -------------- An HTML attachment was scrubbed... URL: From alexander.belopolsky at gmail.com Sun Sep 20 02:04:47 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Sat, 19 Sep 2015 20:04:47 -0400 Subject: [Datetime-SIG] Computing .dst() as a timedelta In-Reply-To: References: Message-ID: On Sat, Sep 19, 2015 at 6:01 PM, Tim Peters wrote: > that place in Antarctica with two kinds of DST each year. > I wonder, what's the point of saving daylight at the place where sun does not set? (or does not rise depending on the time of the year?) -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.peters at gmail.com Sun Sep 20 02:30:52 2015 From: tim.peters at gmail.com (Tim Peters) Date: Sat, 19 Sep 2015 19:30:52 -0500 Subject: [Datetime-SIG] Computing .dst() as a timedelta In-Reply-To: References: Message-ID: [Alex] > Tim, are you referring to the "Troll" rule? [1] That's a strange beast > indeed and a comment above it says: > > # The CET-switching Troll rules require zic from tzcode 2014b or later, so > as > # suggested by Bengt-Inge Larsson comment them out for now, and approximate > # with only UTC and CEST. Uncomment them when 2014b is more prevalent. Yes, and this was pointed out some time ago. 
These really are the rules they use: http://www.timeanddate.com/time/zone/antarctica/troll > On the other hand, I don't see any challenges to PEP 495 there other than > finding means to extract the relevant information. The only problem is figuring out how to handle .dst() - which is a problem regardless of whether 495 is implemented. I remain unclear as to why it broke zic.c, though! > Maybe I should hand-code this rule as demo/test case. This is soooo sad - you're clearly becoming a timezone wonk ;-) > [1]: https://github.com/eggert/tz/blob/master/antarctica#L217 From alexander.belopolsky at gmail.com Sun Sep 20 02:51:00 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Sat, 19 Sep 2015 20:51:00 -0400 Subject: [Datetime-SIG] Computing .dst() as a timedelta In-Reply-To: References: Message-ID: On Sat, Sep 19, 2015 at 8:30 PM, Tim Peters wrote: > I remain unclear as to why it broke zic.c, though! > http://mm.icann.org/pipermail/tz/2014-March/020758.html -------------- next part -------------- An HTML attachment was scrubbed... URL: From alexander.belopolsky at gmail.com Sun Sep 20 02:43:17 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Sat, 19 Sep 2015 20:43:17 -0400 Subject: [Datetime-SIG] Computing .dst() as a timedelta In-Reply-To: References: Message-ID: On Sat, Sep 19, 2015 at 8:30 PM, Tim Peters wrote: > > Maybe I should hand-code this rule as demo/test case. > > This is soooo sad - you're clearly becoming a timezone wonk ;-) No, I just want to close the timezone issue in Python once and for all. Despite its reputation, the issue is trivial: its all about a bunch of piecewise constant functions and very simple expressions like x + f(x). BTW, how do you like my new algorithm for inverting x + f(x)? https://github.com/abalkin/cpython/blob/7c30620c1789ee6ecead945513e2b34ce0c24d26/Lib/test/datetimetester.py#L4328 -------------- next part -------------- An HTML attachment was scrubbed... URL: From alexander.belopolsky at gmail.com Mon Sep 21 05:49:22 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Sun, 20 Sep 2015 23:49:22 -0400 Subject: [Datetime-SIG] PEP 495 (Local Time Disambiguation) is ready for pronouncement In-Reply-To: References: Message-ID: On Sat, Aug 15, 2015 at 8:49 PM, Alexander Belopolsky < alexander.belopolsky at gmail.com> wrote: > > PEP 495 [1] is a deliberately minimalistic proposal to remove an > ambiguity in representing some local times as datetime.datetime > objects. A major issue has come up since my announcement above. Tim Peters have noticed that PEP 495 would violate the "hash invariant" unless the fold attribute is accounted for in inter-zone comparisons. See [2] for details. This issue has been resolved by modifying the definition [3] of the "==" operator for aware datetimes with post-PEP tzinfo. Note that no program will be affected by this change unless it uses a post-PEP tzinfo implementation. I made some smaller changes [4] to the PEP as well and it should finally be ready for pronouncement. [1]: https://www.python.org/dev/peps/pep-0495 [2]: https://mail.python.org/pipermail/datetime-sig/2015-September/000625.html [3]: https://www.python.org/dev/peps/pep-0495/#aware-datetime-equality-comparison [4]: https://hg.python.org/peps/log/39b7c1da05a2/pep-0495.txt -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From alexander.belopolsky at gmail.com Mon Sep 21 06:55:16 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Mon, 21 Sep 2015 00:55:16 -0400 Subject: [Datetime-SIG] Adding PEP 495 support to dateutil In-Reply-To: References: <55FC1E62.6060202@ganssle.io> <55FC4464.9050201@ganssle.io> Message-ID: On Fri, Sep 18, 2015 at 9:02 PM, Tim Peters wrote: > > I happen to still believe that a "hybrid" tzinfo is the best approach, > but appreciate that pytz solved a world of problems with its approach > (while creating others). I really can't tell if a consensus has been > reached among the relative handful of datetime-SIG participants. > Which means there is no consensus. If "consensus" means "absence of sustained opposition" [1], it looks like we either weared out or intimidated the "opposition" enough for it not to be "sustained" anymore. :-) Luckily, PEP 495 solves at least one problem that has nothing to do with a choice of tzinfo: it makes the result of datetime.now() unambiguous. Also, the PEP does not take a position on what approach is better - it just makes both equally feasible. [1] https://lehors.wordpress.com/2008/08/07/what-consensus-means/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From mal at egenix.com Mon Sep 21 14:44:23 2015 From: mal at egenix.com (M.-A. Lemburg) Date: Mon, 21 Sep 2015 14:44:23 +0200 Subject: [Datetime-SIG] Computing .dst() as a timedelta In-Reply-To: References: Message-ID: <55FFFBA7.80905@egenix.com> On 20.09.2015 02:30, Tim Peters wrote: > [Alex] >> Tim, are you referring to the "Troll" rule? [1] That's a strange beast >> indeed and a comment above it says: >> >> # The CET-switching Troll rules require zic from tzcode 2014b or later, so >> as >> # suggested by Bengt-Inge Larsson comment them out for now, and approximate >> # with only UTC and CEST. Uncomment them when 2014b is more prevalent. > > Yes, and this was pointed out some time ago. These really are the > rules they use: > > http://www.timeanddate.com/time/zone/antarctica/troll Interesting. It lists two "DST"s per year: they first go from GMT to CET, then to CEST, and then back to CET and GMT. I guess they switched to CET when the station was used and to GMT for the instruments during winter when it was not used. But this is not consistent with what the Norwegians report on their Troll station website: http://www.npolar.no/en/about-us/stations-vessels/troll/index.html """ The time zone Troll is located in, UTC +0, is 1 hour behind Norwegian time (2 hours during summer time in Norway). Photo: Norwegian Polar Institute """ The webcam confirms this: ftp://ftp.npolar.no/Out/TrollWebCam/TrollPublic.jpg -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Experts (#1, Sep 21 2015) >>> Python Projects, Coaching and Consulting ... http://www.egenix.com/ >>> Python Database Interfaces ... http://products.egenix.com/ >>> Plone/Zope Database Interfaces ... http://zope.egenix.com/ ________________________________________________________________________ 2015-09-14: Released mxODBC Plone/Zope DA 2.2.3 http://egenix.com/go84 2015-09-26: Python Meeting Duesseldorf Sprint 2015 5 days to go 2015-10-21: Python Meeting Duesseldorf ... 30 days to go eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. 
Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From alexander.belopolsky at gmail.com Mon Sep 21 17:01:20 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Mon, 21 Sep 2015 11:01:20 -0400 Subject: [Datetime-SIG] Computing .dst() as a timedelta In-Reply-To: <55FFFBA7.80905@egenix.com> References: <55FFFBA7.80905@egenix.com> Message-ID: On Mon, Sep 21, 2015 at 8:44 AM, M.-A. Lemburg wrote: > Interesting. It lists two "DST"s per year: they first > go from GMT to CET, then to CEST, and then back to CET and GMT. > I guess they switched to CET when the station was used and > to GMT for the instruments during winter when it was not used. > > But this is not consistent with what the Norwegians report on their > Troll station website: > > http://www.npolar.no/en/about-us/stations-vessels/troll/index.html > > """ > The time zone Troll is located in, UTC +0, is 1 hour behind Norwegian > time (2 hours during summer time in Norway). Photo: Norwegian Polar > Institute > """ > > The webcam confirms this: > ftp://ftp.npolar.no/Out/TrollWebCam/TrollPublic.jpg > For those who care about this kind of timezone trivia, apparently [1] Troll station is inhabited by Norwegians only during the colder months (March?October) who use Norwegian time (TZ=Europe/Oslo). During the warmer ("busy") season, the station switches to UTC as an option that is equally annoying to all international inhabitants. The specific rules that appear in some versions of the IANA zone info files are pure fantasy. [1]: http://mm.icann.org/pipermail/tz/2014-March/020705.html -------------- next part -------------- An HTML attachment was scrubbed... URL: From mal at egenix.com Mon Sep 21 17:49:45 2015 From: mal at egenix.com (M.-A. Lemburg) Date: Mon, 21 Sep 2015 17:49:45 +0200 Subject: [Datetime-SIG] Computing .dst() as a timedelta In-Reply-To: References: <55FFFBA7.80905@egenix.com> Message-ID: <56002719.8090404@egenix.com> On 21.09.2015 17:01, Alexander Belopolsky wrote: > On Mon, Sep 21, 2015 at 8:44 AM, M.-A. Lemburg wrote: > >> Interesting. It lists two "DST"s per year: they first >> go from GMT to CET, then to CEST, and then back to CET and GMT. >> I guess they switched to CET when the station was used and >> to GMT for the instruments during winter when it was not used. >> >> But this is not consistent with what the Norwegians report on their >> Troll station website: >> >> http://www.npolar.no/en/about-us/stations-vessels/troll/index.html >> >> """ >> The time zone Troll is located in, UTC +0, is 1 hour behind Norwegian >> time (2 hours during summer time in Norway). Photo: Norwegian Polar >> Institute >> """ >> >> The webcam confirms this: >> ftp://ftp.npolar.no/Out/TrollWebCam/TrollPublic.jpg >> > > For those who care about this kind of timezone trivia, apparently [1] Troll > station is inhabited by Norwegians only during the colder months > (March?October) > who use Norwegian time (TZ=Europe/Oslo). During the warmer ("busy") > season, the station switches to UTC as an option that is equally annoying > to all international inhabitants. > > The specific rules that appear in some versions of the IANA zone info files > are pure fantasy. > > [1]: http://mm.icann.org/pipermail/tz/2014-March/020705.html Looks like assigning a "time zone" to the place is simply conceptually wrong and was just done to make some tz folks happy. 
Anyway, the main takeaway for me is that it is obviously possible to have more than two DST switches during the year, which is something I wasn't aware of before seeing this example. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Experts (#1, Sep 21 2015) >>> Python Projects, Coaching and Consulting ... http://www.egenix.com/ >>> Python Database Interfaces ... http://products.egenix.com/ >>> Plone/Zope Database Interfaces ... http://zope.egenix.com/ ________________________________________________________________________ 2015-09-14: Released mxODBC Plone/Zope DA 2.2.3 http://egenix.com/go84 2015-09-26: Python Meeting Duesseldorf Sprint 2015 5 days to go 2015-10-21: Python Meeting Duesseldorf ... 30 days to go eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From tim.peters at gmail.com Mon Sep 21 18:04:21 2015 From: tim.peters at gmail.com (Tim Peters) Date: Mon, 21 Sep 2015 11:04:21 -0500 Subject: [Datetime-SIG] Computing .dst() as a timedelta In-Reply-To: <56002719.8090404@egenix.com> References: <55FFFBA7.80905@egenix.com> <56002719.8090404@egenix.com> Message-ID: [Marc-Andre, on Antarctica/Troll] > Looks like assigning a "time zone" to the place is simply conceptually > wrong and was just done to make some tz folks happy. There are many "time zones" in Antarctica (theoretically, it's in all time zones). They're all senseless ;-) https://en.wikipedia.org/wiki/Time_in_Antarctica > Anyway, the main takeaway for me is that it is obviously possible > to have more than two DST switches during the year, which is > something I wasn't aware of before seeing this example. The Brits beat 'em to it, but a long time ago: https://en.wikipedia.org/wiki/British_Summer_Time In 1940, during the Second World War, the clocks in Britain were not put back by an hour at the end of Summer Time. In subsequent years, clocks continued to be advanced by one hour each spring and put back by an hour each autumn until July 1945. During these summers, therefore, Britain was two hours ahead of GMT and operating on British Double Summer Time (BDST). The clocks were brought back in line with GMT at the end of summer in 1945. In 1947, due to severe fuel shortages, clocks were advanced by one hour on two occasions during the spring, and put back by one hour on two occasions during the autumn, meaning that Britain was back on BDST during that summer. These may be the corresponding lines in IANA's "europe" file: Rule GB-Eire 1947 only - Mar 16 2:00s 1:00 BST Rule GB-Eire 1947 only - Apr 13 1:00s 2:00 BDST Rule GB-Eire 1947 only - Aug 10 1:00s 1:00 BST Rule GB-Eire 1947 only - Nov 2 2:00s 0 GMT From alexander.belopolsky at gmail.com Mon Sep 21 18:20:23 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Mon, 21 Sep 2015 12:20:23 -0400 Subject: [Datetime-SIG] Computing .dst() as a timedelta In-Reply-To: <56002719.8090404@egenix.com> References: <55FFFBA7.80905@egenix.com> <56002719.8090404@egenix.com> Message-ID: On Mon, Sep 21, 2015 at 11:49 AM, M.-A. Lemburg wrote: > > Looks like assigning a "time zone" to the place is simply conceptually > wrong and was just done to make some tz folks happy. 
Yes, the best definition of "time zone" in computing contexts is the one given by the tzdist group: "A description of the past and predicted future timekeeping practices of a collection of clocks that are intended to agree." Apparently, there is no concerted effort at the Troll station to have a station-specific set of timekeeping rules. They just use either UTC or Europe/Oslo depending on the needs of the current expedition. > > Anyway, the main takeaway for me is that it is obviously possible > to have more than two DST switches during the year, which is > something I wasn't aware of before seeing this example. The March 1st switch at Troll from UTC to CET is not really a DST transition. It is a transition that changes the standard time. (The value of isdst does not change in the transition.) Even more exotic things can happen if one would try to model a ship's clock using a tzinfo instance. By convention, ships use the time of the closest port or whatever the captain feels appropriate in international waters. Since ship logs are usually reliable and ship speed is low, such specialty application will probably work in most cases. Note that faster vehicles such as the ISS use UTC these days, but I think the Apollo program used the Houston time. -------------- next part -------------- An HTML attachment was scrubbed... URL: From rosuav at gmail.com Mon Sep 21 19:02:56 2015 From: rosuav at gmail.com (Chris Angelico) Date: Tue, 22 Sep 2015 03:02:56 +1000 Subject: [Datetime-SIG] Computing .dst() as a timedelta In-Reply-To: References: <55FFFBA7.80905@egenix.com> <56002719.8090404@egenix.com> Message-ID: On Tue, Sep 22, 2015 at 2:20 AM, Alexander Belopolsky wrote: > Note that faster vehicles such as the ISS use UTC these days... Isn't the ISS fast enough that relativity starts getting in the way? I am *so* glad the Python datetime module doesn't have to concern itself with that... ChrisA From alexander.belopolsky at gmail.com Mon Sep 21 19:10:53 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Mon, 21 Sep 2015 13:10:53 -0400 Subject: [Datetime-SIG] Computing .dst() as a timedelta In-Reply-To: References: <55FFFBA7.80905@egenix.com> <56002719.8090404@egenix.com> Message-ID: On Mon, Sep 21, 2015 at 1:02 PM, Chris Angelico wrote: > On Tue, Sep 22, 2015 at 2:20 AM, Alexander Belopolsky > wrote: > > Note that faster vehicles such as the ISS use UTC these days... > > Isn't the ISS fast enough that relativity starts getting in the way? > Not fast enough for an astronaut to miss a wedding anniversary. I am *so* glad the Python datetime module doesn't have to concern itself > with that... > Neither do the astronauts when they schedule a video conference with a family at home and that's the only time they need to worry about civil time zones on the Earth. -------------- next part -------------- An HTML attachment was scrubbed... URL: From alexander.belopolsky at gmail.com Mon Sep 21 19:23:29 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Mon, 21 Sep 2015 13:23:29 -0400 Subject: [Datetime-SIG] PEP 495 (Local Time Disambiguation) is ready for pronouncement In-Reply-To: References: Message-ID: For those who prefer using Github's review tools, I have republished the PEP at . Comments and pull requests are welcome. 
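To make the fold attribute concrete before the quoted announcement below: with a PEP-495-aware tzinfo, the repeated local hour is told apart by fold alone. The sketch uses the zoneinfo module that only arrived later (Python 3.9, with a system tz database or the tzdata package available); any tzinfo that implements the PEP, such as the reference implementation discussed in this thread, would behave the same way.

    from datetime import datetime, timezone
    from zoneinfo import ZoneInfo  # Python 3.9+; illustrative only

    tz = ZoneInfo("America/New_York")
    # 01:30 on 2015-11-01 occurs twice in this zone; fold selects the repeat.
    earlier = datetime(2015, 11, 1, 1, 30, tzinfo=tz)   # fold=0 is the default
    later = earlier.replace(fold=1)

    print(earlier.utcoffset(), later.utcoffset())   # EDT (-4) for fold=0, EST (-5) for fold=1
    print(earlier == later)                         # True: same zone, fold is ignored
    print(earlier.astimezone(timezone.utc))         # 2015-11-01 05:30:00+00:00
    print(later.astimezone(timezone.utc))           # 2015-11-01 06:30:00+00:00

Conversion to UTC is where the two instances finally diverge.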
On Sun, Sep 20, 2015 at 11:49 PM, Alexander Belopolsky < alexander.belopolsky at gmail.com> wrote: > On Sat, Aug 15, 2015 at 8:49 PM, Alexander Belopolsky < > alexander.belopolsky at gmail.com> wrote: > > > > PEP 495 [1] is a deliberately minimalistic proposal to remove an > > ambiguity in representing some local times as datetime.datetime > > objects. > > A major issue has come up since my announcement above. Tim Peters have > noticed that PEP 495 would violate the "hash invariant" unless the fold > attribute is accounted for in inter-zone comparisons. > See [2] for details. This issue has been resolved by modifying the > definition [3] of the "==" operator for aware datetimes with post-PEP > tzinfo. Note that no program will be affected by this change unless it > uses a post-PEP tzinfo implementation. > > I made some smaller changes [4] to the PEP as well and it should finally > be ready for pronouncement. > > [1]: https://www.python.org/dev/peps/pep-0495 > [2]: > https://mail.python.org/pipermail/datetime-sig/2015-September/000625.html > [3]: > https://www.python.org/dev/peps/pep-0495/#aware-datetime-equality-comparison > [4]: https://hg.python.org/peps/log/39b7c1da05a2/pep-0495.txt > -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Mon Sep 21 23:54:41 2015 From: guido at python.org (Guido van Rossum) Date: Mon, 21 Sep 2015 14:54:41 -0700 Subject: [Datetime-SIG] PEP 495 (Local Time Disambiguation) is ready for pronouncement In-Reply-To: References: Message-ID: On Sun, Sep 20, 2015 at 8:49 PM, Alexander Belopolsky < alexander.belopolsky at gmail.com> wrote: > On Sat, Aug 15, 2015 at 8:49 PM, Alexander Belopolsky < > alexander.belopolsky at gmail.com> wrote: > > > > PEP 495 [1] is a deliberately minimalistic proposal to remove an > > ambiguity in representing some local times as datetime.datetime > > objects. > > A major issue has come up since my announcement above. Tim Peters have > noticed that PEP 495 would violate the "hash invariant" unless the fold > attribute is accounted for in inter-zone comparisons. > See [2] for details. This issue has been resolved by modifying the > definition [3] of the "==" operator for aware datetimes with post-PEP > tzinfo. Note that no program will be affected by this change unless it > uses a post-PEP tzinfo implementation. > > I made some smaller changes [4] to the PEP as well and it should finally > be ready for pronouncement. > > [1]: https://www.python.org/dev/peps/pep-0495 > [2]: > https://mail.python.org/pipermail/datetime-sig/2015-September/000625.html > [3]: > https://www.python.org/dev/peps/pep-0495/#aware-datetime-equality-comparison > [4]: https://hg.python.org/peps/log/39b7c1da05a2/pep-0495.txt > I've reviewed this latest version and I am hereby accepting it. The topic is both controversial and yawn-inducing, so I think it's better not to give the usual one-day warning on python-dev -- I'll just post my decision there. Alexander and Tim, thank for all your work on this! It's been a wild, wild ride. (And no, I am not going to make a joke about leap seconds here. :-) -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From mal at egenix.com Tue Sep 22 18:12:20 2015 From: mal at egenix.com (M.-A. 
Lemburg) Date: Tue, 22 Sep 2015 18:12:20 +0200 Subject: [Datetime-SIG] Computing .dst() as a timedelta In-Reply-To: References: <55FFFBA7.80905@egenix.com> <56002719.8090404@egenix.com> Message-ID: <56017DE4.9030806@egenix.com> On 21.09.2015 18:20, Alexander Belopolsky wrote: > On Mon, Sep 21, 2015 at 11:49 AM, M.-A. Lemburg wrote: >> >> Looks like assigning a "time zone" to the place is simply conceptually >> wrong and was just done to make some tz folks happy. > > > Yes, the best definition of "time zone" in computing contexts is the one > given by the tzdist group: "A description of the past and predicted future > timekeeping practices of a collection of clocks that are intended to > agree." Apparently, there is no concerted effort at the Troll station to > have a station-specific set of timekeeping rules. They just use either UTC > or Europe/Oslo depending on the needs of the current expedition. Ah, the joys of freedom of choice :-) >> Anyway, the main takeaway for me is that it is obviously possible >> to have more than two DST switches during the year, which is >> something I wasn't aware of before seeing this example. > > The March 1st switch at Troll from UTC to CET is not really a DST > transition. It is a transition that changes the standard time. (The value > of isdst does not change in the transition.) > > Even more exotic things can happen if one would try to model a ship's clock > using a tzinfo instance. By convention, ships use the time of the closest > port or whatever the captain feels appropriate in international waters. > Since ship logs are usually reliable and ship speed is low, such specialty > application will probably work in most cases. Note that faster vehicles > such as the ISS use UTC these days, but I think the Apollo program used the > Houston time. Time on ships seems to depend on what the captain and company think is the right way: http://travel.stackexchange.com/questions/43245/what-time-is-used-on-board-a-cruise-ship even though there is a standard called "Nautical time" for this: https://en.wikipedia.org/wiki/Nautical_time """ In practice, nautical times are used only for radio communication, etc. Aboard the ship, e.g. for scheduling work and meal times, the ship may use a suitable time of its own choosing. The captain is permitted to change his or her clocks at a chosen time following the ship's entry into another time zone, typically at midnight. Ships on long-distance passages change time zone on board in this fashion. On short passages the captain may not adjust clocks at all, even if they pass through different time zones, for example between the UK and continental Europe. Passenger ships often use both nautical and on-board time zones on signs. When referring to time tables and when communicating with land, the land time zone must be employed. """ On planes, the situation seems to be similar. I've not been on a flight yet where the captain announces new time zones midway :-) I guess even though the approach to use location names for time zones creates a more or less sane system on the ground, it doesn't really address the changes in authority when things start moving. Perhaps we should just standardize on UTC world-wide and then instead have the work day begin at different times depending on location. Crazy idea, but then it'd safe us all a lot of work :-) -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Experts (#1, Sep 22 2015) >>> Python Projects, Coaching and Consulting ... 
http://www.egenix.com/ >>> Python Database Interfaces ... http://products.egenix.com/ >>> Plone/Zope Database Interfaces ... http://zope.egenix.com/ ________________________________________________________________________ 2015-09-14: Released mxODBC Plone/Zope DA 2.2.3 http://egenix.com/go84 2015-09-26: Python Meeting Duesseldorf Sprint 2015 4 days to go 2015-10-21: Python Meeting Duesseldorf ... 29 days to go eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From alexander.belopolsky at gmail.com Tue Sep 22 18:32:11 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Tue, 22 Sep 2015 12:32:11 -0400 Subject: [Datetime-SIG] Computing .dst() as a timedelta In-Reply-To: <56017DE4.9030806@egenix.com> References: <55FFFBA7.80905@egenix.com> <56002719.8090404@egenix.com> <56017DE4.9030806@egenix.com> Message-ID: On Tue, Sep 22, 2015 at 12:12 PM, M.-A. Lemburg wrote: > Perhaps we should just standardize on UTC world-wide and then > instead have the work day begin at different times depending > on location. Crazy idea, but then it'd safe us all a lot of > work :-) > Publishers of daily planners should lobby for this. They will be able to sell planners customized for each location. -------------- next part -------------- An HTML attachment was scrubbed... URL: From rosuav at gmail.com Tue Sep 22 18:44:38 2015 From: rosuav at gmail.com (Chris Angelico) Date: Wed, 23 Sep 2015 02:44:38 +1000 Subject: [Datetime-SIG] Computing .dst() as a timedelta In-Reply-To: <56017DE4.9030806@egenix.com> References: <55FFFBA7.80905@egenix.com> <56002719.8090404@egenix.com> <56017DE4.9030806@egenix.com> Message-ID: On Wed, Sep 23, 2015 at 2:12 AM, M.-A. Lemburg wrote: > On planes, the situation seems to be similar. I've not been on a flight > yet where the captain announces new time zones midway :-) Me neither. Usually what I see is "Time at origin" and "Time at destination", and occasionally a few other time points, but nobody really cares about "time right underneath us". > Perhaps we should just standardize on UTC world-wide and then > instead have the work day begin at different times depending > on location. Crazy idea, but then it'd safe us all a lot of > work :-) > Yes! Yes, a hundred times yes! Whenever possible, I try to synchronize on UTC with everyone. Our Dungeons & Dragons campaigns are all scheduled that way - eg I run one at 2AM UTC every Sunday. It's fair on everyone, that way; nobody has to cope with more than one timezone's DST changes, and only ever their own. ChrisA From alexander.belopolsky at gmail.com Tue Sep 22 19:13:30 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Tue, 22 Sep 2015 13:13:30 -0400 Subject: [Datetime-SIG] Computing .dst() as a timedelta In-Reply-To: References: <55FFFBA7.80905@egenix.com> <56002719.8090404@egenix.com> <56017DE4.9030806@egenix.com> Message-ID: On Tue, Sep 22, 2015 at 12:44 PM, Chris Angelico wrote: > > Yes! Yes, a hundred times yes! Whenever possible, I try to synchronize > on UTC with everyone. Our Dungeons & Dragons campaigns are all > scheduled that way - eg I run one at 2AM UTC every Sunday. It's fair > on everyone, that way; nobody has to cope with more than one > timezone's DST changes, and only ever their own. I call UTC "make it equally annoying to everyone choice." 
It is tolerable within (Western) Europe, but when your team is more geographically diverse, 2AM UTC may still be Saturday for some. Our job as programmers is to teach computers how to understand humans, not the other way around. A good scheduling application should allow you to set up the schedule in any way you want including your local time, your standard time or any other time zone that is "special" for your particular case (Dragonlance Mean Time, perhaps:-). Once scheduled, the times should be displayed to your team members in their local time. If you make a schedule relative to a time zone with DST transitions, some of your team members may be surprised by the apparent changes of the schedule. That's a human problem - whatever compromise you come up with - your computer should be helpful in implementing. -------------- next part -------------- An HTML attachment was scrubbed... URL: From rosuav at gmail.com Tue Sep 22 19:40:06 2015 From: rosuav at gmail.com (Chris Angelico) Date: Wed, 23 Sep 2015 03:40:06 +1000 Subject: [Datetime-SIG] Computing .dst() as a timedelta In-Reply-To: References: <55FFFBA7.80905@egenix.com> <56002719.8090404@egenix.com> <56017DE4.9030806@egenix.com> Message-ID: On Wed, Sep 23, 2015 at 3:13 AM, Alexander Belopolsky wrote: > I call UTC "make it equally annoying to everyone choice." It is tolerable > within (Western) Europe, but when your team is more geographically diverse, > 2AM UTC may still be Saturday for some. It may indeed - that's actually a feature (most of our American players like it being a Saturday evening). > Our job as programmers is to teach computers how to understand humans, not > the other way around. A good scheduling application should allow you to set > up the schedule in any way you want including your local time, your standard > time or any other time zone that is "special" for your particular case > (Dragonlance Mean Time, perhaps:-). Once scheduled, the times should be > displayed to your team members in their local time. If you make a schedule > relative to a time zone with DST transitions, some of your team members may > be surprised by the apparent changes of the schedule. That's a human > problem - whatever compromise you come up with - your computer should be > helpful in implementing. The trouble is the exact same thing that we were discussing with the beginning of this mailing list. Allow me to spin you a few scenarios. "Hi folks! We're starting a D&D campaign, and we'll be meeting up every week." 1) "It'll be at 9PM every Saturday in your time zone (Chicago)." -- different for everyone 2) "It'll be at noon every Sunday for the Dungeon Master (Melbourne)." 3) "It'll be at 2AM every Sunday in UTC." I can easily write a program that does the conversions - in fact, I have one built into the MUD client that we use for actually playing the game. The trouble is, people will expect these recurring events to repeat on a cycle based on the displayed time - what's been referred to as "classic" or "naive" arithmetic. The result is: 1) Game time is every Saturday when my clock shows 9PM. 2) Game time is ... uhh ... I dunno, I'll just show up some time and hope. 3) Game time is 168 hours after the previous game started. #1 fundamentally can't work, because we have to sync up around the globe. Either that, or the program has to recalculate "it'll be 9PM this week, but 8PM next week" every time, and it would have to do that on the basis of #2 or #3. 
Even so, it's confusing to have to go and check it every time; the clock time for the game might change unpredictably, depending on the fundamental timezone. #2 inflicts double DST confusion on everyone that isn't in the same time zone as the Dungeon Master. This is how Threshold RPG works - the official timezone is EST (though I prefer to describe it by its tzdata name, America/New_York), so anyone in the US east coast states has it easy, and other people in contiguous USA are doing reasonably alright; folks in Australia [1] have to cope with two hour DST changes each year, and folks in Europe have to worry about temporary desynchronizations each year as DST stabilizes. #3 works for everyone. Again, papering over the difference slightly can help (which is why the MUD client has a time converter in it), but it's much easier to explain: when you go onto Daylight Saving Time, your clock moves forward, which means the next session will happen at 9PM on your clock instead of 8PM. There's exactly one clock shift for every DST transition, and all you have to do is explain to people that DST doesn't change anything except your clock. That's why we schedule things in UTC. Showing the time in your local timezone as an abstraction over a UTC fundamental is nice and safe. We *know* we can always do that unambiguously, and it's easy to explain what's going on. It's still a leaky abstraction, though, and I prefer to explicitly tell people that it's scheduled in UTC, but that they can see what UTC time translates to what local time. That's why it's safest to be clear about UTC usage, and honestly, this has nothing to do with what a computer can and can't be taught to do - it's all about what humans can get their heads around. ChrisA [1] Except Queensland, where they're smart. From alexander.belopolsky at gmail.com Tue Sep 22 20:02:39 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Tue, 22 Sep 2015 14:02:39 -0400 Subject: [Datetime-SIG] Computing .dst() as a timedelta In-Reply-To: References: <55FFFBA7.80905@egenix.com> <56002719.8090404@egenix.com> <56017DE4.9030806@egenix.com> Message-ID: On Tue, Sep 22, 2015 at 1:40 PM, Chris Angelico wrote: > > That's why it's safest to be clear about UTC > usage, and honestly, this has nothing to do with what a computer can > and can't be taught to do - it's all about what humans can get their > heads around. UTC is no better than DMT (Dragonlance Mean Time). In fact, I think I will have easier time explaining DMT to a ten year old than explaining UTC. If your team can agree on a natural language, they can agree on a timescale. It does not matter what it is. If uniformity was a universal virtue, we would all be speaking Esperanto by now. -------------- next part -------------- An HTML attachment was scrubbed... URL: From 4kir4.1i at gmail.com Tue Sep 22 20:35:18 2015 From: 4kir4.1i at gmail.com (Akira Li) Date: Tue, 22 Sep 2015 21:35:18 +0300 Subject: [Datetime-SIG] Computing .dst() as a timedelta In-Reply-To: (Alexander Belopolsky's message of "Tue, 22 Sep 2015 14:02:39 -0400") References: <55FFFBA7.80905@egenix.com> <56002719.8090404@egenix.com> <56017DE4.9030806@egenix.com> Message-ID: <87d1xa2tg9.fsf@gmail.com> Alexander Belopolsky writes: > On Tue, Sep 22, 2015 at 1:40 PM, Chris Angelico wrote: >> >> That's why it's safest to be clear about UTC >> usage, and honestly, this has nothing to do with what a computer can >> and can't be taught to do - it's all about what humans can get their >> heads around. 
> > UTC is no better than DMT (Dragonlance Mean Time). In fact, I think I will > have easier time explaining DMT to a ten year old than explaining UTC. If > your team can agree on a natural language, they can agree on a timescale. > It does not matter what it is. If uniformity was a universal virtue, we > would all be speaking Esperanto by now. We are already speaking Esperanto. It is just called English. If we want the same time moment in real life then scheduling in UTC is the default. Particular time you could display using any label you like as long as the corresponding UTC time is easily available. It is trivial to find out UTC time on the computer. From alexander.belopolsky at gmail.com Tue Sep 22 20:40:53 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Tue, 22 Sep 2015 14:40:53 -0400 Subject: [Datetime-SIG] Computing .dst() as a timedelta In-Reply-To: <87d1xa2tg9.fsf@gmail.com> References: <55FFFBA7.80905@egenix.com> <56002719.8090404@egenix.com> <56017DE4.9030806@egenix.com> <87d1xa2tg9.fsf@gmail.com> Message-ID: On Tue, Sep 22, 2015 at 2:35 PM, Akira Li <4kir4.1i at gmail.com> wrote: [Alexander Belopolsky] > If uniformity was a universal virtue, we would all be speaking Esperanto > by now. > > We are already speaking Esperanto. It is just called English. I should have said "exclusively." -------------- next part -------------- An HTML attachment was scrubbed... URL: From alexander.belopolsky at gmail.com Tue Sep 22 20:43:30 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Tue, 22 Sep 2015 14:43:30 -0400 Subject: [Datetime-SIG] Computing .dst() as a timedelta In-Reply-To: <87d1xa2tg9.fsf@gmail.com> References: <55FFFBA7.80905@egenix.com> <56002719.8090404@egenix.com> <56017DE4.9030806@egenix.com> <87d1xa2tg9.fsf@gmail.com> Message-ID: On Tue, Sep 22, 2015 at 2:35 PM, Akira Li <4kir4.1i at gmail.com> wrote: > It is trivial to find out UTC time on the computer. Unless it is shut down in an anticipation of a leap second. :-) -------------- next part -------------- An HTML attachment was scrubbed... URL: From rosuav at gmail.com Wed Sep 23 01:05:07 2015 From: rosuav at gmail.com (Chris Angelico) Date: Wed, 23 Sep 2015 09:05:07 +1000 Subject: [Datetime-SIG] Computing .dst() as a timedelta In-Reply-To: References: <55FFFBA7.80905@egenix.com> <56002719.8090404@egenix.com> <56017DE4.9030806@egenix.com> Message-ID: On Wed, Sep 23, 2015 at 4:02 AM, Alexander Belopolsky wrote: > On Tue, Sep 22, 2015 at 1:40 PM, Chris Angelico wrote: >> >> That's why it's safest to be clear about UTC >> usage, and honestly, this has nothing to do with what a computer can >> and can't be taught to do - it's all about what humans can get their >> heads around. > > UTC is no better than DMT (Dragonlance Mean Time). In fact, I think I will > have easier time explaining DMT to a ten year old than explaining UTC. If > your team can agree on a natural language, they can agree on a timescale. > It does not matter what it is. If uniformity was a universal virtue, we > would all be speaking Esperanto by now. If I were creating my own standard out of thin air, then yes, it wouldn't make a lot of difference, and I could pick anywhere. (There are a few invariants that I'd maintain, such as that it should "tick" the same way our civil clocks do - one second equals one civil second, and they're packaged up into hours and days the same way - but it doesn't matter what the exact offset is.) But UTC already exists, and that gives it an inherent advantage. 
I've never tried to explain DMT to anyone, but explaining a simplified form of GMT/UTC (ignore leap seconds, ignore relativity, ignore UT0/UT1 etc) is pretty easy - it's just a well-known time zone that has no DST. ChrisA From alexander.belopolsky at gmail.com Wed Sep 23 02:12:10 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Tue, 22 Sep 2015 20:12:10 -0400 Subject: [Datetime-SIG] Computing .dst() as a timedelta In-Reply-To: References: <55FFFBA7.80905@egenix.com> <56002719.8090404@egenix.com> <56017DE4.9030806@egenix.com> Message-ID: On Tue, Sep 22, 2015 at 7:05 PM, Chris Angelico wrote: > But UTC already exists, and > that gives it an inherent advantage. I've never tried to explain DMT > to anyone, but explaining a simplified form of GMT/UTC (ignore leap > seconds, ignore relativity, ignore UT0/UT1 etc) is pretty easy - it's > just a well-known time zone that has no DST. > It is not as well-known as you might think. I, for one, don't even know how to translate it in my native Russian. I bet people in Russia who know what Moscow time is outnumber those who know what UTC is at least 100 to 1. I bet you will get a similar ratio in California between UTC and say Eastern Standard Time. No TV station in Russia or in the US will ever announce its schedule in UTC. They will use Moscow time in Russia and EST in the US. Occasionally, national TV networks in the US will announce the show time in two or three major time zones, but never in UTC. In Russia, time zones are identified as Moscow+HH much more often than UTC+HH. The only place you will see clocks showing UTC time in Russia is the space command center. GMT is popular in Western Europe because of geographical proximity and the ubiquitous BBC broadcasts, but it is not as well-known elsewhere. Let's have a show of hands here: how many people know what "C" stands for in UTC and what "M" stands in GMT and what is the significance of these letters? -------------- next part -------------- An HTML attachment was scrubbed... URL: From rosuav at gmail.com Wed Sep 23 02:57:05 2015 From: rosuav at gmail.com (Chris Angelico) Date: Wed, 23 Sep 2015 10:57:05 +1000 Subject: [Datetime-SIG] Computing .dst() as a timedelta In-Reply-To: References: <55FFFBA7.80905@egenix.com> <56002719.8090404@egenix.com> <56017DE4.9030806@egenix.com> Message-ID: On Wed, Sep 23, 2015 at 10:12 AM, Alexander Belopolsky wrote: > It is not as well-known as you might think. I, for one, don't even know how > to translate it in my native Russian. I bet people in Russia who know what > Moscow time is outnumber those who know what UTC is at least 100 to 1. I > bet you will get a similar ratio in California between UTC and say Eastern > Standard Time. Of course. Local time is always better known than UTC. But any given local time is only going to be known in its own locality. I would bet that the people in Russia who know Eastern Standard Time, or the people in California who know Moscow time, would be quite low. > Let's have a show of hands here: how many people know what "C" stands for in > UTC and what "M" stands in GMT and what is the significance of these > letters? I know, on both counts, because I'm a wonk. But those specifics are part of what I would elide, along with leap seconds and relativity, when explaining a scheduling system. (Let's face it - nobody's going to schedule a meeting to such accuracy that any of it will matter.) Time is a lot messier than most people need to care about. 
ChrisA From alexander.belopolsky at gmail.com Wed Sep 23 04:16:07 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Tue, 22 Sep 2015 22:16:07 -0400 Subject: [Datetime-SIG] Computing .dst() as a timedelta In-Reply-To: References: <55FFFBA7.80905@egenix.com> <56002719.8090404@egenix.com> <56017DE4.9030806@egenix.com> Message-ID: On Tue, Sep 22, 2015 at 8:57 PM, Chris Angelico wrote: > [ Alexander Belopolsky] I bet people in Russia who know what > Moscow time is outnumber those who know what UTC is at least 100 to 1. I > > bet you will get a similar ratio in California between UTC and say > Eastern > > Standard Time. > > Of course. Local time is always better known than UTC. Moscow Time is hardly local for Russian Anadyr or Petropavlovsk-Kamchatsky, but people still use Moscow Time for train schedules there. In fact, those places are closer to California than they are to Moscow. > But any given local time is only going to be known in its own locality. Depends on a locality. Local time at the village of Greenwich is fairly well-known. :-) > I would bet > that the people in Russia who know Eastern Standard Time, or the > people in California who know Moscow time, would be quite low. > I suspect that anyone who knows about UTC would know about both Moscow and New York. > > Let's have a show of hands here: how many people know what "C" stands > for in > > UTC and what "M" stands in GMT and what is the significance of these > > letters? > > I know, on both counts, because I'm a wonk. Well, in this case you know more than I do. I know that "M" stands for "mean" (I've heard that on BBC:-) and that it has something to do with the solar time, but I cannot tell you "mean" of what it is or whether BBC's fifth beep comes on a UTC or GMT second. > But those specifics are > part of what I would elide, along with leap seconds and relativity, > when explaining a scheduling system. Right, but most people (myself included) only learn about UTC when they learn about those complications. I would say in New York, Eastern Time is for most people, EST is for nerds and UTC is for wonks. (Let's face it - nobody's going > to schedule a meeting to such accuracy that any of it will matter.) > Time is a lot messier than most people need to care about. Right. So let them use the time that their wall clocks are showing. When a New Yorker calls Cupertino, they have three options: Eastern, Pacific and UTC. The first two are a slight inconvenience for one of them and the third is a major annoyance for both. -------------- next part -------------- An HTML attachment was scrubbed... URL: From rosuav at gmail.com Wed Sep 23 04:27:52 2015 From: rosuav at gmail.com (Chris Angelico) Date: Wed, 23 Sep 2015 12:27:52 +1000 Subject: [Datetime-SIG] Computing .dst() as a timedelta In-Reply-To: References: <55FFFBA7.80905@egenix.com> <56002719.8090404@egenix.com> <56017DE4.9030806@egenix.com> Message-ID: On Wed, Sep 23, 2015 at 12:16 PM, Alexander Belopolsky wrote: > > On Tue, Sep 22, 2015 at 8:57 PM, Chris Angelico wrote: >> >> [ Alexander Belopolsky] I bet people in Russia who know what >> >> > Moscow time is outnumber those who know what UTC is at least 100 to 1. >> > I >> > bet you will get a similar ratio in California between UTC and say >> > Eastern >> > Standard Time. >> >> Of course. Local time is always better known than UTC. > > > Moscow Time is hardly local for Russian Anadyr or Petropavlovsk-Kamchatsky, > but people still use Moscow Time for train schedules there. 
In fact, those > places are closer to California than they are to Moscow. "Close" doesn't necessarily have anything to do with geographic location. I'm fairly sure Troll Research Station isn't physically close to Norway, but when it's being operated solely by Norwegians, it's politically very close. I've no idea how the trains operate, but it's a lot more likely that they're politically near Moscow than California. >> I would bet >> that the people in Russia who know Eastern Standard Time, or the >> people in California who know Moscow time, would be quite low. > > I suspect that anyone who knows about UTC would know about both Moscow and > New York. Know about, yes, but they won't necessarily know the DST rules etc. >> > Let's have a show of hands here: how many people know what "C" stands >> > for in >> > UTC and what "M" stands in GMT and what is the significance of these >> > letters? >> >> I know, on both counts, because I'm a wonk. > > Well, in this case you know more than I do. I know that "M" stands for > "mean" (I've heard that on BBC:-) and that it has something to do with the > solar time, but I cannot tell you "mean" of what it is or whether BBC's > fifth beep comes on a UTC or GMT second. Yes, it's because GMT is based on the average solar noon. If you have an actual sundial, you can observe actual solar noon, but to convert that to civil time, you need a table of translations that takes seasonal variation into account. In theory, Greenwich Time would show noon when the sun is directly overhead, but that would mean that successive days vary in length; Greenwich Mean Time averages it all out so you get a consistent 86400-second day. UTC is defined by the coordination of a bunch of clocks around the world. There are a few different forms, most of which never go more than one second away from each other. GMT is usually defined as being equal to one or other of them, but which one is not entirely standardized, so if you need subsecond accuracy, don't use GMT at all. For scheduling events, though, GMT == UTC == TIA == Unix time. >> But those specifics are >> part of what I would elide, along with leap seconds and relativity, >> when explaining a scheduling system. > > > Right, but most people (myself included) only learn about UTC when they > learn about those complications. I would say in New York, Eastern Time is > for most people, EST is for nerds and UTC is for wonks. > >> (Let's face it - nobody's going >> to schedule a meeting to such accuracy that any of it will matter.) >> Time is a lot messier than most people need to care about. > > > Right. So let them use the time that their wall clocks are showing. When a > New Yorker calls Cupertino, they have three options: Eastern, Pacific and > UTC. The first two are a slight inconvenience for one of them and the third > is a major annoyance for both. Sure. If you're scheduling a one-off event, that's no problem. But when you schedule a recurring event, suddenly the first two become major annoyances and the third becomes much more minor. (With the possible exception that different states of the US can probably cheat, since there's federal US legislation about DST. But if your examples were New York and Sydney, then my point stands.) 
ChrisA From alexander.belopolsky at gmail.com Wed Sep 23 04:45:19 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Tue, 22 Sep 2015 22:45:19 -0400 Subject: [Datetime-SIG] Computing .dst() as a timedelta In-Reply-To: References: <55FFFBA7.80905@egenix.com> <56002719.8090404@egenix.com> <56017DE4.9030806@egenix.com> Message-ID: On Tue, Sep 22, 2015 at 10:27 PM, Chris Angelico wrote: > [ Alexander Belopolsky] but I cannot tell you "mean" of what it is or > whether BBC's > > fifth beep comes on a UTC or GMT second. > > Yes, it's because GMT is based on the average solar noon. If you have > an actual sundial, you can observe actual solar noon, but to convert > that to civil time, you need a table of translations that takes > seasonal variation into account. In theory, Greenwich Time would show > noon when the sun is directly overhead, but that would mean that > successive days vary in length; Greenwich Mean Time averages it all > out so you get a consistent 86400-second day. > > UTC is defined by the coordination of a bunch of clocks around the > world. There are a few different forms, most of which never go more > than one second away from each other. GMT is usually defined as being > equal to one or other of them, but which one is not entirely > standardized, so if you need subsecond accuracy, don't use GMT at all. > For scheduling events, though, GMT == UTC == TIA == Unix time. > Thanks for the lecture, but I still don't know what BBC broadcasts. :-) -------------- next part -------------- An HTML attachment was scrubbed... URL: From alexander.belopolsky at gmail.com Wed Sep 23 04:53:23 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Tue, 22 Sep 2015 22:53:23 -0400 Subject: [Datetime-SIG] Computing .dst() as a timedelta In-Reply-To: References: <55FFFBA7.80905@egenix.com> <56002719.8090404@egenix.com> <56017DE4.9030806@egenix.com> Message-ID: On Tue, Sep 22, 2015 at 10:27 PM, Chris Angelico wrote: > (With the > possible exception that different states of the US can probably cheat, > since there's federal US legislation about DST. But if your examples > were New York and Sydney, then my point stands.) > What would be your guess for the ratio between the number of calls between New York and say San Francisco to that between New York and Sydney? For the latter, I'll concede: UTC makes sense because it is somewhere in the middle. :-) -------------- next part -------------- An HTML attachment was scrubbed... URL: From rosuav at gmail.com Wed Sep 23 04:58:28 2015 From: rosuav at gmail.com (Chris Angelico) Date: Wed, 23 Sep 2015 12:58:28 +1000 Subject: [Datetime-SIG] Computing .dst() as a timedelta In-Reply-To: References: <55FFFBA7.80905@egenix.com> <56002719.8090404@egenix.com> <56017DE4.9030806@egenix.com> Message-ID: On Wed, Sep 23, 2015 at 12:53 PM, Alexander Belopolsky wrote: > On Tue, Sep 22, 2015 at 10:27 PM, Chris Angelico wrote: >> >> (With the >> possible exception that different states of the US can probably cheat, >> since there's federal US legislation about DST. But if your examples >> were New York and Sydney, then my point stands.) > > > What would be your guess for the ratio between the number of calls between > New York and say San Francisco to that between New York and Sydney? For > the latter, I'll concede: UTC makes sense because it is somewhere in the > middle. :-) Heh. It's not really a matter of being in the middle, though - I would advocate UTC for any recurring event that involves different DST rules. 
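Sketching that advice in code may help here: keep the recurrence anchored in UTC and convert only for display. The zone choices and the zoneinfo module (Python 3.9+) are mine, so treat this as an illustration of the idea rather than anyone's actual scheduler.

    from datetime import datetime, timedelta, timezone
    from zoneinfo import ZoneInfo  # Python 3.9+; illustrative only

    # Weekly game anchored at 02:00 UTC on Sundays; the UTC anchor never
    # moves, only the displayed local wall times drift across DST changes.
    anchor = datetime(2015, 9, 27, 2, 0, tzinfo=timezone.utc)
    zones = [ZoneInfo("Australia/Melbourne"), ZoneInfo("America/Chicago")]

    for week in range(8):
        occurrence = anchor + timedelta(weeks=week)
        shown = ", ".join(f"{occurrence.astimezone(z):%a %H:%M %Z}" for z in zones)
        print(f"{occurrence:%Y-%m-%d %H:%M} UTC -> {shown}")

The recurring rule lives in UTC; DST only changes what gets printed.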
UTC isn't mid-way between, say, Sydney and Warsaw, but if you want to phone someone in the opposite hemisphere every week, it'd be best to schedule it in UTC so you don't have to worry about four different offsets (you could both be on DST, or either of you could, or neither). Of course, if DST were abolished world-wide, then everything would be easy, and we could happily schedule things in each other's timezones without any confusion. I could key in "11PM" and the program would interpret that as being UTC+10, and then my friend in Florida could see it as "9AM", and nobody would be confused at all. Alas, I fear 'tis a vain hope... ChrisA From alexander.belopolsky at gmail.com Wed Sep 23 05:15:57 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Tue, 22 Sep 2015 23:15:57 -0400 Subject: [Datetime-SIG] Computing .dst() as a timedelta In-Reply-To: References: <55FFFBA7.80905@egenix.com> <56002719.8090404@egenix.com> <56017DE4.9030806@egenix.com> Message-ID: On Tue, Sep 22, 2015 at 10:58 PM, Chris Angelico wrote: > Of course, if DST were abolished world-wide, then everything would be > easy, and we could happily schedule things in each other's timezones > without any confusion. I could key in "11PM" and the program would > interpret that as being UTC+10, and then my friend in Florida could > see it as "9AM", and nobody would be confused at all. Alas, I fear > 'tis a vain hope... > I think all these DST-related scheduling problems are highly exaggerated. My kids go to a school in New York with a European curriculum. Apparently, schoolchildren in Europe study six days a week, so the program is organized on a 6-days cycle. This means that the first Monday is day 1, the second is day 6, the third is ... I am lost already. Guess what: the kids don't complain. Figuring out timezone difference between New York and Sydney is easy. Try to match up the school holiday schedules between New York and New Jersey! -------------- next part -------------- An HTML attachment was scrubbed... URL: From mal at egenix.com Wed Sep 23 09:43:52 2015 From: mal at egenix.com (M.-A. Lemburg) Date: Wed, 23 Sep 2015 09:43:52 +0200 Subject: [Datetime-SIG] Computing .dst() as a timedelta In-Reply-To: References: <55FFFBA7.80905@egenix.com> <56002719.8090404@egenix.com> <56017DE4.9030806@egenix.com> Message-ID: <56025838.1010702@egenix.com> I think I only got part of my tongue-in-cheek suggestion across :-) The idea was to drop local time altogether and instead use UTC everywhere. Wall clocks would all show UTC. Instead of switching time zones, you'd adapt your schedule as needed and this could be as flexible as you want. People would just have to get used to having dinner at e.g. 03:00 UTC instead of 8pm [add some timezone here] and you would be able to enjoy sunset at 10:00 UTC in some places. Ain't going to happen, but it would allow people to gain back some more freedom in scheduling their lives. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Experts (#1, Sep 23 2015) >>> Python Projects, Coaching and Consulting ... http://www.egenix.com/ >>> Python Database Interfaces ... http://products.egenix.com/ >>> Plone/Zope Database Interfaces ... http://zope.egenix.com/ ________________________________________________________________________ 2015-09-14: Released mxODBC Plone/Zope DA 2.2.3 http://egenix.com/go84 2015-09-26: Python Meeting Duesseldorf Sprint 2015 3 days to go 2015-10-21: Python Meeting Duesseldorf ... 
28 days to go eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From rosuav at gmail.com Wed Sep 23 13:33:50 2015 From: rosuav at gmail.com (Chris Angelico) Date: Wed, 23 Sep 2015 21:33:50 +1000 Subject: [Datetime-SIG] Computing .dst() as a timedelta In-Reply-To: <56025838.1010702@egenix.com> References: <55FFFBA7.80905@egenix.com> <56002719.8090404@egenix.com> <56017DE4.9030806@egenix.com> <56025838.1010702@egenix.com> Message-ID: On Wed, Sep 23, 2015 at 5:43 PM, M.-A. Lemburg wrote: > I think I only got part of my tongue-in-cheek suggestion across :-) > > The idea was to drop local time altogether and instead use UTC > everywhere. Wall clocks would all show UTC. Instead of switching > time zones, you'd adapt your schedule as needed and this could > be as flexible as you want. > > People would just have to get used to having dinner at e.g. > 03:00 UTC instead of 8pm [add some timezone here] and you > would be able to enjoy sunset at 10:00 UTC in some places. > > Ain't going to happen, but it would allow people to gain back > some more freedom in scheduling their lives. *looks at left wrist* Current civil time is 9:33PM. *looks at right wrist* Current UTC time is 11:33. I wear two watches for that exact reason. Bring it on! ChrisA From alexander.belopolsky at gmail.com Wed Sep 23 23:00:32 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Wed, 23 Sep 2015 17:00:32 -0400 Subject: [Datetime-SIG] IANA TZ database statistics Message-ID: I added a method to datetimetester [1] to compute some overall statistics on tzfiles. My code ignores "version 2" data, so I include only transitions that fall within 32-bit time_t (1900 to 2038 range). Here are the results for the default Mac OSX system files and the most recent Github version of tz [2]: >>> from datetimetester import * >>> ZoneInfo.stats() Number of zones: 584 Number of transitions: 38510 = 19058 (gaps) + 19008 (folds) + 444 (zeros) Min gap: 0:00:16 at 1935-01-01 03:40:52 in America/Paramaribo Max gap: 1 day, 0:00:00 at 2011-12-30 10:00:00 in Pacific/Apia Min fold: 0:01:31 at 1932-01-01 03:58:29 in America/Barbados Max fold: 10:00:00 at 1952-01-13 14:00:00 in Antarctica/DumontDUrville >>> ZoneInfo.zoneroot = '/usr/local/etc/zoneinfo' >>> ZoneInfo.stats() Number of zones: 585 Number of transitions: 39018 = 19434 (gaps) + 19131 (folds) + 453 (zeros) Min gap: 0:00:04 at 1914-01-01 04:00:04 in America/Manaus Max gap: 1 day, 0:00:00 at 2011-12-30 11:00:00 in Pacific/Fakaofo Min fold: 0:00:10 at 1906-06-30 16:53:20 in Asia/Ho_Chi_Minh Max fold: 23:00:00 at 1969-09-30 13:00:00 in Kwajalein [1]: https://github.com/abalkin/cpython/commit/fa4f8055ac6723d4d0940ea141e05f931c718a2c [2]: https://github.com/eggert/tz -------------- next part -------------- An HTML attachment was scrubbed... 
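For anyone who wants to reproduce rough counts like these without the test-suite helper, the offset changes can be found by brute force with the stdlib zoneinfo module that appeared later in Python 3.9 (no relation to the ZoneInfo test class used above). A coarse, hour-resolution sketch:

    from datetime import datetime, timedelta, timezone
    from zoneinfo import ZoneInfo  # Python 3.9+ stdlib module, not datetimetester's helper

    def transitions(zone_name, start_year, end_year):
        """Yield (utc_instant, change) for each UTC-offset change found by an
        hour-by-hour scan; change > 0 is a gap, change < 0 is a fold."""
        tz = ZoneInfo(zone_name)
        t = datetime(start_year, 1, 1, tzinfo=timezone.utc)
        end = datetime(end_year, 1, 1, tzinfo=timezone.utc)
        prev = t.astimezone(tz).utcoffset()
        while t < end:
            t += timedelta(hours=1)
            off = t.astimezone(tz).utcoffset()
            if off != prev:
                yield t, off - prev
                prev = off

    for when, change in transitions("Pacific/Apia", 2011, 2013):
        kind = "gap" if change > timedelta(0) else "fold"
        print(when, kind, abs(change))   # the 24-hour gap of 2011-12-30 shows up here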
URL: From rosuav at gmail.com Thu Sep 24 00:16:44 2015 From: rosuav at gmail.com (Chris Angelico) Date: Thu, 24 Sep 2015 08:16:44 +1000 Subject: [Datetime-SIG] IANA TZ database statistics In-Reply-To: References: Message-ID: On Thu, Sep 24, 2015 at 7:00 AM, Alexander Belopolsky wrote: > Here are the results for the default Mac OSX system files and the most > recent Github version of tz [2]: > >>>> from datetimetester import * >>>> ZoneInfo.stats() > Number of zones: 584 > Number of transitions: 38510 = 19058 (gaps) + 19008 (folds) + 444 (zeros) > Min gap: 0:00:16 at 1935-01-01 03:40:52 in > America/Paramaribo > Max gap: 1 day, 0:00:00 at 2011-12-30 10:00:00 in Pacific/Apia > Min fold: 0:01:31 at 1932-01-01 03:58:29 in America/Barbados > Max fold: 10:00:00 at 1952-01-13 14:00:00 in > Antarctica/DumontDUrville >>>> ZoneInfo.zoneroot = '/usr/local/etc/zoneinfo' >>>> ZoneInfo.stats() > Number of zones: 585 > Number of transitions: 39018 = 19434 (gaps) + 19131 (folds) + 453 (zeros) > Min gap: 0:00:04 at 1914-01-01 04:00:04 in America/Manaus > Max gap: 1 day, 0:00:00 at 2011-12-30 11:00:00 in Pacific/Fakaofo > Min fold: 0:00:10 at 1906-06-30 16:53:20 in Asia/Ho_Chi_Minh > Max fold: 23:00:00 at 1969-09-30 13:00:00 in Kwajalein Neat! (Is that meant to be "from test.datetimetester import *", or was I loading this up the wrong way? Anyway, not significant.) A lot of the small numbers are going to be when different places adopted standard time, and such. To get a better handle on what's happening _now_, I added an option [1] to your stats function for a starting year: >>> ZoneInfo.stats(start_year=1970) Number of zones: 1790 = 46266 (gaps) + 46130 (folds) + 843 (zeros) Min gap: 0:15:00 at 1985-12-31 18:30:13 in right/Asia/Kathmandu Max gap: 1 day, 0:00:00 at 2011-12-30 10:00:24 in right/Pacific/Apia Min fold: 0:30:00 at 2037-04-04 15:00:26 in right/Australia/Lord_Howe Max fold: 3:00:00 at 2012-02-21 17:00:24 in right/Antarctica/Casey >>> ZoneInfo.stats() Number of zones: 1790 = 58914 (gaps) + 58777 (folds) + 1363 (zeros) Min gap: 0:00:16 at 1935-01-01 03:40:52 in posix/America/Paramaribo Max gap: 1 day, 0:00:00 at 2011-12-30 10:00:24 in right/Pacific/Apia Min fold: 0:01:31 at 1932-01-01 03:58:29 in posix/America/Barbados Max fold: 10:00:00 at 1952-01-13 14:00:00 in posix/Antarctica/DumontDUrville I'm not sure whether this actually helps anything or not, but hey, cool stats :) ChrisA [1] https://github.com/Rosuav/cpython/commit/ed51575f7ffe7ba98bfad58a43602cb8f74cfe2a From alexander.belopolsky at gmail.com Thu Sep 24 00:44:26 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Wed, 23 Sep 2015 18:44:26 -0400 Subject: [Datetime-SIG] IANA TZ database statistics In-Reply-To: References: Message-ID: On Wed, Sep 23, 2015 at 6:16 PM, Chris Angelico wrote: > A lot of the small numbers are going to be when different places > adopted standard time, and such. 
To get a better handle on what's > happening _now_, I added an option [1] to your stats function for a > starting year: > > >>> ZoneInfo.stats(start_year=1970) > Number of zones: 1790 = 46266 (gaps) + 46130 (folds) + 843 (zeros) > Min gap: 0:15:00 at 1985-12-31 18:30:13 in > right/Asia/Kathmandu > Max gap: 1 day, 0:00:00 at 2011-12-30 10:00:24 in > right/Pacific/Apia > Min fold: 0:30:00 at 2037-04-04 15:00:26 > in right/Australia/Lord_Howe > Max fold: 3:00:00 at 2012-02-21 17:00:24 > in right/Antarctica/Casey > >>> ZoneInfo.stats() > Number of zones: 1790 = 58914 (gaps) + 58777 (folds) + 1363 (zeros) > Min gap: 0:00:16 at 1935-01-01 03:40:52 in > posix/America/Paramaribo > Max gap: 1 day, 0:00:00 at 2011-12-30 10:00:24 in > right/Pacific/Apia > Min fold: 0:01:31 at 1932-01-01 03:58:29 in > posix/America/Barbados > Max fold: 10:00:00 at 1952-01-13 14:00:00 in > posix/Antarctica/DumontDUrville > It looks like you've got double counts because you included both "posix" and "right" tzfiles in the search. (I don't think the data that I actually read is different between the two sets.) > I'm not sure whether this actually helps anything or not, but hey, cool > stats :) > If we can make some simplified assumptions about transition locations and sizes, we can avoid a binary search over seconds to locate the transitions via POSIX localtime/mktime APIs. I am considering making the cut-off at 1970 and assume 1970 standard time for all times before that year. I think this is best we can do on Windows where (IIRC) mktime does not work for times before epoch. (What about localtime?) In any case, TZ data before 1970 is highly suspect so we will probably do our users a favor by assuming standard time and letting those with historical timeseries figure out the transitions by themselves. -------------- next part -------------- An HTML attachment was scrubbed... URL: From rosuav at gmail.com Thu Sep 24 00:49:15 2015 From: rosuav at gmail.com (Chris Angelico) Date: Thu, 24 Sep 2015 08:49:15 +1000 Subject: [Datetime-SIG] IANA TZ database statistics In-Reply-To: References: Message-ID: On Thu, Sep 24, 2015 at 8:44 AM, Alexander Belopolsky wrote: > In any case, TZ data before 1970 is highly suspect so we will probably do > our users a favor by assuming standard time and letting those with > historical timeseries figure out the transitions by themselves. Yeah. Originally I made a boolean to suppress pre-1970 data, before settling on the arbitrary starting year option. I expect that 1970 will be the most common year to use as the base. FWIW: https://github.com/abalkin/cpython/pull/1 ChrisA From tim.peters at gmail.com Thu Sep 24 02:47:28 2015 From: tim.peters at gmail.com (Tim Peters) Date: Wed, 23 Sep 2015 19:47:28 -0500 Subject: [Datetime-SIG] IANA TZ database statistics In-Reply-To: References: Message-ID: [Alex] > I added a method to datetimetester [1] to compute some overall statistics on > tzfiles. My code ignores "version 2" data, so I include only transitions > that fall within 32-bit time_t (1900 to 2038 range). >From staring at zic.c, looks like (so far) the data in the version 2 section is identical to that in the version 1 section, except written out in wider data formats. The pretty clear intent is that they never intend to generate explicit transitions beyond 2037 in any version, until it's after 2037 in the real world and they need to do so because a POSIX TZ rule can't handle some new goofy exception (and version 2 also contains a POSIX TZ rule at the end, when possible). 
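That trailing POSIX TZ rule is easy to peek at, because a version-2+ TZif file simply ends with the rule on a line of its own. A sketch (the zoneinfo path is an assumption for a typical Linux or macOS box, and the zone must actually ship a rule):

    from pathlib import Path

    # Version-2+ TZif files end with b"\n<POSIX TZ string>\n" describing what
    # happens after the last explicit transition in the data block.
    raw = Path("/usr/share/zoneinfo/America/New_York").read_bytes()
    posix_rule = raw.rsplit(b"\n", 2)[-2].decode()
    print(posix_rule)   # e.g. EST5EDT,M3.2.0,M11.1.0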
Then they'll need to add new transitions in the version 2 section only (version 1 data formats are too narrow to record them). > I am considering making the cut-off at 1970 and assume 1970 standard > time for all times before that year. Is there a real need for a "high performance" tzinfo? That is, who cares? ;-) It would sure be _surprising_ if a Python wrapping of zoneinfo returned different results than native Linux tools wrapping the same thing. > I think this is best we can do on Windows Of course not, if by "best" we mean "gets the same answers everyone else gets". In that case, "best" is returning what the IANA database says should be returned in all cases. > where (IIRC) mktime does not work for times before epoch. (What about > localtime?) Python 3.4.3 (v3.4.3:9b73f1c3e601, Feb 24 2015, 22:44:40) [MSC v.1600 64 bit (AMD64)] on win32 >>> import time >>> time.localtime(0) time.struct_time(tm_year=1969, tm_mon=12, tm_mday=31, tm_hour=18, tm_min=0, tm_sec=0, tm_wday=2, tm_yday=365, tm_isdst=0) >>> time.localtime(-1) Traceback (most recent call last): File "", line 1, in time.localtime(-1) OSError: [Errno 22] Invalid argument Which is another meaning for "best": avoid flaky C library functions altogether. >>> epoch = datetime(1970, 1, 1) >>> epoch + timedelta(seconds=1e11) datetime.datetime(5138, 11, 16, 9, 46, 40) >>> import time >>> time.localtime(1e11) Traceback (most recent call last): File "", line 1, in time.localtime(1e11) OSError: [Errno 22] Invalid argument From random832 at fastmail.com Thu Sep 24 03:12:59 2015 From: random832 at fastmail.com (Random832) Date: Wed, 23 Sep 2015 21:12:59 -0400 Subject: [Datetime-SIG] IANA TZ database statistics In-Reply-To: References: Message-ID: <1443057179.975535.392060185.41F67F5C@webmail.messagingengine.com> On Wed, Sep 23, 2015, at 20:47, Tim Peters wrote: > [Alex] > > I added a method to datetimetester [1] to compute some overall statistics on > > tzfiles. My code ignores "version 2" data, so I include only transitions > > that fall within 32-bit time_t (1900 to 2038 range). > > From staring at zic.c, looks like (so far) the data in the version 2 > section is identical to that in the version 1 section, except written > out in wider data formats. The pretty clear intent is that they never > intend to generate explicit transitions beyond 2037 in any version, They do have transitions for before 1901, though. > > I think this is best we can do on Windows > > where (IIRC) mktime does not work for times before epoch. (What about > > localtime?) Windows has its own mechanism for storing timezone information, not tzdata, but no, none of the MSVCRT functions work for times before 1970. From tim.peters at gmail.com Thu Sep 24 03:22:20 2015 From: tim.peters at gmail.com (Tim Peters) Date: Wed, 23 Sep 2015 20:22:20 -0500 Subject: [Datetime-SIG] IANA TZ database statistics In-Reply-To: References: Message-ID: [Alex] > ... > If we can make some simplified assumptions about transition locations and > sizes, we can avoid a binary search over seconds to locate the transitions > via POSIX localtime/mktime APIs. BTW, "the obvious" way to almost always avoid binary search is for a tzinfo to remember the index of the last transition it had to use, then next time start a linear search from there. It should usually succeed in 1 or 2 tries. Programs in real life don't jump around across all possible times at random. To be truly insane, it could meld linear search with binary search, like Python's listsort.c's "gallop" functions. 
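Leaving the gallop refinement aside (Tim's verdict on it follows below), the plain remember-the-last-index lookup is only a few lines. A sketch against a hypothetical sorted list of transition instants; the names are invented here, not taken from any existing tzinfo class:

    from bisect import bisect_right

    class TransitionIndex:
        """Search-finger sketch: remember the interval used last time and try a
        short linear walk from there before falling back to binary search.
        `transitions` is a sorted list of transition instants (e.g. POSIX
        timestamps); index i means transitions[i] <= t < transitions[i + 1],
        with index 0 also covering times before the first transition."""

        def __init__(self, transitions):
            self.transitions = transitions
            self.finger = 0

        def lookup(self, t):
            trans, i = self.transitions, self.finger
            for _ in range(2):                      # usually succeeds in 1-2 tries
                if i + 1 < len(trans) and trans[i + 1] <= t:
                    i += 1
                elif i > 0 and trans[i] > t:
                    i -= 1
                else:
                    break
            if not (trans[i] <= t and (i + 1 == len(trans) or t < trans[i + 1])):
                i = bisect_right(trans, t) - 1      # rare: caller jumped far away
            self.finger = max(i, 0)
            return self.finger

    finger = TransitionIndex([0, 100, 200, 300])
    print(finger.lookup(150), finger.lookup(160), finger.lookup(450))   # 1 1 3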
I'd say melding in galloping is far more trouble than it's worth in this
context, though. Simple linear search with a search finger (index saved
across searches) should do fine.

From alexander.belopolsky at gmail.com Thu Sep 24 05:58:18 2015
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Wed, 23 Sep 2015 23:58:18 -0400
Subject: [Datetime-SIG] IANA TZ database statistics
In-Reply-To: 
References: 
Message-ID: 

> [Alex]
> > My code ignores "version 2" data, so I include only transitions
> > that fall within 32-bit time_t (1900 to 2038 range).

> [Tim]
> From staring at zic.c, ..

I get a pounding headache. :-(

> looks like (so far) the data in the version 2
> section is identical to that in the version 1 section, except written
> out in wider data formats. The pretty clear intent is that they never
> intend to generate explicit transitions beyond 2037 in any version,
>

I compiled the latest Github version on my Mac and I get

$ /usr/local/etc/zdump -V 'America/New_York'| tail -4
America/New_York Sun Mar 8 06:59:59 2499 UT = Sun Mar 8 01:59:59 2499 EST isdst=0 gmtoff=-18000
America/New_York Sun Mar 8 07:00:00 2499 UT = Sun Mar 8 03:00:00 2499 EDT isdst=1 gmtoff=-14400
America/New_York Sun Nov 1 05:59:59 2499 UT = Sun Nov 1 01:59:59 2499 EDT isdst=1 gmtoff=-14400
America/New_York Sun Nov 1 06:00:00 2499 UT = Sun Nov 1 01:00:00 2499 EST isdst=0 gmtoff=-18000

> until it's after 2037 in the real world and they need to do so because
> a POSIX TZ rule can't handle some new goofy exception (and version 2
> also contains a POSIX TZ rule at the end, when possible).

What they do is a so-called 400-year hack: since the Gregorian calendar
repeats itself every 400 years, any regular calendar-based rule will
generate transitions with a 400-year period. This observation allows them
to generate 400+ years of explicit transitions through 2499 and extend
that through eternity by periodicity.

> Then they'll need to add new transitions in the version 2 section only
> (version 1 data formats are too narrow to record them).
>
>
They already do that for transitions both before EPOCH - 2**31 seconds and
after EPOCH + 2**31 seconds.

$ /usr/local/etc/zdump -V 'America/New_York'| head -2
America/New_York Sun Nov 18 16:59:59 1883 UT = Sun Nov 18 12:03:57 1883 LMT isdst=0 gmtoff=-17762
America/New_York Sun Nov 18 17:00:00 1883 UT = Sun Nov 18 12:00:00 1883 EST isdst=0 gmtoff=-18000

[Alex]
> I am considering making the cut-off at 1970 and assume 1970 standard
> time for all times before that year.
>
[Tim]
> Is there a real need for a "high performance" tzinfo?

This is not about new tzinfos. This is about implementing PEP 495's
.astimezone().

> That is, who cares? ;-)

I do. :-)

> It would sure be _surprising_ if a Python wrapping of
> zoneinfo returned different results than native Linux tools wrapping
> the same thing.
>
This is not about wrapping IANA's tzdist. This is about implementing PEP
495 features using POSIX APIs.

> [Alex]
> I think this is best we can do on Windows
>
[Tim]
> Of course not, if by "best" we mean "gets the same answers everyone
> else gets". In that case, "best" is returning what the IANA database
> says should be returned in all cases.
>
Which version of IANA database?

> [Alex]
> where (IIRC) mktime does not work for times before epoch. (What about
> > localtime?)
>
[Tim]
> >>> time.localtime(-1) .. OSError: [Errno 22] Invalid argument
>
That's what I thought.

>
> Which is another meaning for "best": avoid flaky C library functions
> altogether.
>
> >>> time.localtime(1e11) ..
> OSError: [Errno 22] Invalid argument > I don't want to try to figure out how to access tzfiles in a portable way. We need another PEP for this because I don't see any better solution than to repackage IANA files as a pip-installable package. Such PEP should probably be discussed on distutils-sig first. -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.peters at gmail.com Thu Sep 24 07:37:48 2015 From: tim.peters at gmail.com (Tim Peters) Date: Thu, 24 Sep 2015 00:37:48 -0500 Subject: [Datetime-SIG] IANA TZ database statistics In-Reply-To: References: Message-ID: [Alex] > ... > This is not about new tzinfos. This is about implementing PEP 495's > .astimezone(). Ah. You realize that's the first time that's been mentioned in this thread? It's been a total mystery until now ;-) > ... > This is not about wrapping IANA's tzdist. This is about implementing PEP > 495 features using POSIX APIs. Specifically which features? Do you just mean .astimezone() treating a naive datetime as being in the system zone, and the absence of any argument implying the system zone? Or more than just that? >> ... >> In that case, "best" is returning what the IANA database >> says should be returned in all cases. > Which version of IANA database? If it's still relevant, the only version any user cares about: the one that happens to be installed on their machine ;-) > ... > I don't want to try to figure out how to access tzfiles in a portable way. > We need another PEP for this because I don't see any better solution than to > repackage IANA files as a pip-installable package. Such PEP should probably > be discussed on distutils-sig first. Sorry, since this thread started by presenting statistics about the contents of the IANA database, I three-quarters assumed that _was_ what this was about. I agree that needs a whole different PEP. I also agree figuring out the system zone's rules is a puzzle using POSIX. Note that Gustavo gave up on trying to use mktime() in dateutil's tzlocal class. You could say time.timezone and time.altzone define the only two (or one, if time.daylight is 0) possible total UTC offsets, and assume that's always been, and always will be, the case. But I don't think even `altzone` is actually required by POSIX - it's of little help :-( From alexander.belopolsky at gmail.com Thu Sep 24 17:11:48 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Thu, 24 Sep 2015 11:11:48 -0400 Subject: [Datetime-SIG] IANA TZ database statistics In-Reply-To: References: Message-ID: > [Alex] > This is not about wrapping IANA's tzdist. This is about implementing PEP > > 495 features using POSIX APIs. > [Tim] > Specifically which features? Do you just mean .astimezone() treating > a naive datetime as being in the system zone, and the absence of any > argument implying the system zone? Or more than just that? > Also, .timestamp() respecting the fold attribute and datetime.now() and datetime.fromtimestamp() setting the fold attribute appropriately. In all these cases one needs to know how far the transition point is from a given time. > > >> [Tim] > >> In that case, "best" is returning what the IANA database > >> says should be returned in all cases. > The database itself does not say anything about what should be returned by various tools, but I would interpret that as "whatever zdump returns." > > > [Alex] > Which version of IANA database? 
[Tim] > If it's still relevant, the only version any user cares about: the > one that happens to be installed on their machine ;-) I don't think Windows comes with any, but I know close to nothing about Windows. -------------- next part -------------- An HTML attachment was scrubbed... URL: From alexander.belopolsky at gmail.com Thu Sep 24 17:26:34 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Thu, 24 Sep 2015 11:26:34 -0400 Subject: [Datetime-SIG] IANA TZ database statistics In-Reply-To: References: Message-ID: On Thu, Sep 24, 2015 at 1:37 AM, Tim Peters wrote: > I also agree figuring out the system zone's rules is a puzzle using > POSIX. Note that Gustavo gave up on trying to use mktime() in > dateutil's tzlocal class. > I think he was bitten by the flaky behavior of mktime() when tm_isdst is passed as -1. I intend calling mktime twice with tm_isdst=0 and tm_isdst=1 and detect fold/gap by what mktime that does to the tm structure. If we discover that some systems misbehave even in tm_isdst>=0 cases, we can roll out our own mktime() that probes localtime() multiple times. > You could say time.timezone and > time.altzone define the only two (or one, if time.daylight is 0) > possible total UTC offsets, and assume that's always been, and always > will be, the case. > Linux (glibc) updates timezone, altzone and tzname whenever localtime() is called. I think this is a horrible hack, but it does not seem to be in violation of POSIX. > But I don't think even `altzone` is actually > required by POSIX - it's of little help :-( > I don't want to rely on any of these variables. For offsets, I would just compute the timestamp on the output of localtime_r (the reentrant version does not mess with globals) and compare that to the input timestamp. For tzname, I would use strftime("%Z") which seems to be rather portable. Of course, on platforms where localtime and mktime fill in tm_gmtoff and tm_zone, I can just use those. -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.peters at gmail.com Thu Sep 24 19:06:48 2015 From: tim.peters at gmail.com (Tim Peters) Date: Thu, 24 Sep 2015 12:06:48 -0500 Subject: [Datetime-SIG] IANA TZ database statistics In-Reply-To: References: Message-ID: [Tim] >> ... >> Specifically which features? Do you just mean .astimezone() treating >> a naive datetime as being in the system zone, and the absence of any >> argument implying the system zone? Or more than just that? [Alex] > Also, .timestamp() respecting the fold attribute and datetime.now() and > datetime.fromtimestamp() setting the fold attribute appropriately. In all > these cases one needs to know how far the transition point is from a given > time. Got it. I should have known that the first time - sorry ;-) >> In that case, "best" is returning what the IANA database >> says should be returned in all cases. > The database itself does not say anything about what should be returned by > various tools, but I would interpret that as "whatever zdump returns." Gimme a break. > ... > I don't think Windows comes with any, but I know close to nothing about > Windows. Windows has minimal (compared to IANA) time zone info stored in the registry. You can look at dateutil's tzwin.py for code accessing it. Zones generally store no historical info, and assume a zone switches DST zero or two times per year.. 
In the latter case, the registry essentially stores a compiled version of
the "n'th weekday of the month" flavor of POSIX TZ string rules, so code
can compute when DST starts and ends each year. tzwin.py's tzwinlocal
class implements a hybrid tzinfo appropriate for the current system zone
(although it never worked for me :-( ). So, ironically enough, this could
all be relatively straightforward on Windows: in return for sticking to
regular rules, you get to know the rules up front.

For portable code, think of Windows as implementing as little as POSIX
requires of localtime() and mktime(). While it uses a 64-bit type for
time_t, values must be >= 0 and are documented as working only through
31 December 3000 23:59:59 UTC. On my Windows 10 box, it actually goes 21
whole hours ;-) beyond that:

Python 3.4.3 (v3.4.3:9b73f1c3e601, Feb 24 2015, 22:44:40) [MSC v.1600 64 bit (AMD64)] on win32
...
>>> datetime.utcfromtimestamp(32535215999)
datetime.datetime(3000, 12, 31, 23, 59, 59)
>>> datetime.utcfromtimestamp(32535215999 + 21 * 3600)
datetime.datetime(3001, 1, 1, 20, 59, 59)
>>> datetime.utcfromtimestamp(32535215999 + 21 * 3600 + 1)
Traceback (most recent call last):
  File "", line 1, in 
    datetime.utcfromtimestamp(32535215999 + 21 * 3600 + 1)
OSError: [Errno 22] Invalid argument

>> I also agree figuring out the system zone's rules is a puzzle using
>> POSIX. Note that Gustavo gave up on trying to use mktime() in
>> dateutil's tzlocal class.

> I think he was bitten by the flaky behavior of mktime() when tm_isdst
> is passed as -1.

Good point!

> I intend calling mktime twice with tm_isdst=0 and tm_isdst=1 and detect
> fold/gap by what mktime that does to the tm structure. If we discover
> that some systems misbehave even in tm_isdst>=0 cases, we can roll
> out our own mktime() that probes localtime() multiple times.

Just one suggestion: force the year/timestamp into a 400-year span
starting at 1971 first (via adding/subtracting multiples of 400
years). Then not even Windows will blow up ;-)

> ...

From alexander.belopolsky at gmail.com Thu Sep 24 19:38:17 2015
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Thu, 24 Sep 2015 13:38:17 -0400
Subject: [Datetime-SIG] IANA TZ database statistics
In-Reply-To: 
References: 
Message-ID: 

On Thu, Sep 24, 2015 at 1:06 PM, Tim Peters wrote:

> Just one suggestion: force the year/timestamp into a 400-year span
> starting at 1971 first (via adding/subtracting multiples of 400
> years). Then not even Windows will blow up ;-)
>

This will work for the future dates (and I think I should use 2100 through
2399 range to avoid extending not-regular rules into the far future). For
the far in the past dates, I still think the earliest transition to
standard time should be used as the "big bang" transition. Note that the
400 year hack does not work for systems with 32-bit time_t. I think it is
ok to just raise OverflowError on those whenever a timezone operation is
requested on a date outside of EPOCH ± 2**31 range. That's about 140 years.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From tim.peters at gmail.com Thu Sep 24 20:16:59 2015
From: tim.peters at gmail.com (Tim Peters)
Date: Thu, 24 Sep 2015 13:16:59 -0500
Subject: [Datetime-SIG] IANA TZ database statistics
In-Reply-To: 
References: 
Message-ID: 

[Tim]
>> Just one suggestion: force the year/timestamp into a 400-year span
>> starting at 1971 first (via adding/subtracting multiples of 400
>> years). Then not even Windows will blow up ;-)

[Alex]
> This will work for the future dates (and I think I should use 2100 through
> 2399 range to avoid extending not-regular rules into the far future).

That's fine.

> For the far in the past dates, I still think the earliest transition to standard
> time should be used as the "big bang" transition.

I'm not sure exactly what that means - I'm just trying to worm around
that time_t values less than 0 aren't supported on all systems.

> Note that the 400 year
> hack does not work for systems with 32-bit time_t. I think it is ok to just
> raise OverflowError on those whenever a timezone operation is requested on a
> date outside of EPOCH ± 2**31 range. That's about 140 years.

The 400-year hack is just mindlessly simple. It's possible to do far
better, since there are only 14 possible yearly calendars (which day of
the week is January first, and is it a leap year? 7*2 = 14). So a table
with 14 entries, mapping (weekday_of_1_Jan, is_leap) -> fixed canonical
year is sufficient. Nothing in that depends on the time zone - it can be
precomputed as a static table equally applicable to all time zones (for
years in which "normalization" is desired).

In general, most (*) 28-year spans contain at least one of each possible
yearly calendar. So a 32-bit time_t isn't a real problem here. For
example, any system capable of representing the years from 1972 through
1996 inclusive covers all possible yearly calendars.

(*) Exceptions can occur when the span crosses a century.

From alexander.belopolsky at gmail.com Thu Sep 24 20:29:58 2015
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Thu, 24 Sep 2015 14:29:58 -0400
Subject: [Datetime-SIG] IANA TZ database statistics
In-Reply-To: 
References: 
Message-ID: 

On Thu, Sep 24, 2015 at 2:16 PM, Tim Peters wrote:

> > For the far in the past dates, I still think the earliest transition to
> standard
> > time should be used as the "big bang" transition.
>
> I'm not sure exactly what that means - I'm just trying to worm around
> that time_t values less than 0 aren't supported on all systems.

I should have said "earliest *discoverable* transition." For systems with
non-negative time_t, that would be somewhere in the 1970s. The key
decision here is that regular DST transitions are extended into the future
but not into the past. For the far past, utcoffset will be fixed at the
earliest standard time offset that can be fished out from localtime/mktime
calls.

Does Python support any systems with 32-bit time_t?
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From tim.peters at gmail.com Thu Sep 24 20:40:43 2015
From: tim.peters at gmail.com (Tim Peters)
Date: Thu, 24 Sep 2015 13:40:43 -0500
Subject: [Datetime-SIG] IANA TZ database statistics
In-Reply-To: 
References: 
Message-ID: 

[Alex]
> Does Python support any systems with 32-bit time_t?

Not that I use ;-) But I'm sure there are many - pick some random 32-bit
box. The move to 64-bit time_t appears to be relatively recent even on
Linux systems. I don't know when it happened on Windows, but 32-bit
Windows XP boxes definitely use 32 bits (and there are still flags to
allow switching back to that).
http://stackoverflow.com/questions/14361651/is-there-any-way-to-get-64-bit-time-t-in-32-bit-program-in-linux From random832 at fastmail.com Thu Sep 24 20:42:18 2015 From: random832 at fastmail.com (Random832) Date: Thu, 24 Sep 2015 14:42:18 -0400 Subject: [Datetime-SIG] IANA TZ database statistics In-Reply-To: References: Message-ID: <1443120138.1452862.392770233.43DB6E32@webmail.messagingengine.com> On Thu, Sep 24, 2015, at 14:29, Alexander Belopolsky wrote: > Does Python support any systems with 32-bit time_t? Uh... Linux/i386 comes to mind. I still don't see the logic in doing any of this rather than parsing zoneinfo files directly on systems that use it; Get[Dynamic]TimeZoneInformation[ForYear] on Windows, etc. From tim.peters at gmail.com Thu Sep 24 20:48:19 2015 From: tim.peters at gmail.com (Tim Peters) Date: Thu, 24 Sep 2015 13:48:19 -0500 Subject: [Datetime-SIG] IANA TZ database statistics In-Reply-To: <1443120138.1452862.392770233.43DB6E32@webmail.messagingengine.com> References: <1443120138.1452862.392770233.43DB6E32@webmail.messagingengine.com> Message-ID: [Random832 ] > ... > I still don't see the logic in doing any of this rather than parsing > zoneinfo files directly on systems that use it; > Get[Dynamic]TimeZoneInformation[ForYear] on Windows, etc. Presumably Alex doesn't want to devote his life to fleshing out "etc" on endless platforms he doesn't use. If you think it's simple, _you_ write the code. Start by writing code to answer the question "well, _does_ this system use zoneinfo files?" on all possible Python platforms. Thanks ;-) From alexander.belopolsky at gmail.com Thu Sep 24 20:50:39 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Thu, 24 Sep 2015 14:50:39 -0400 Subject: [Datetime-SIG] IANA TZ database statistics In-Reply-To: <1443120138.1452862.392770233.43DB6E32@webmail.messagingengine.com> References: <1443120138.1452862.392770233.43DB6E32@webmail.messagingengine.com> Message-ID: On Thu, Sep 24, 2015 at 2:42 PM, Random832 wrote: > I still don't see the logic in doing any of this rather than parsing > zoneinfo files directly on systems that use it; > There is no portable way to even discover the location of the zoneinfo files. (The default location when installing from the source is the incredible /usr/local/etc/zoneinfo! ) If some location is guessed by searching some likely candidates, there is no guarantee that this is what system tzset() is using. -------------- next part -------------- An HTML attachment was scrubbed... URL: From alexander.belopolsky at gmail.com Thu Sep 24 20:57:53 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Thu, 24 Sep 2015 14:57:53 -0400 Subject: [Datetime-SIG] IANA TZ database statistics In-Reply-To: References: <1443120138.1452862.392770233.43DB6E32@webmail.messagingengine.com> Message-ID: On Thu, Sep 24, 2015 at 2:48 PM, Tim Peters wrote: > > Get[Dynamic]TimeZoneInformation[ForYear] on Windows, etc. > > Presumably Alex doesn't want to devote his life to fleshing out "etc" > on endless platforms he doesn't use. It's worse than that. I have no desire to learn even what "Get[Dynamic]TimeZoneInformation[ForYear]" is. :-( -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From 4kir4.1i at gmail.com Fri Sep 25 01:51:55 2015 From: 4kir4.1i at gmail.com (Akira Li) Date: Fri, 25 Sep 2015 02:51:55 +0300 Subject: [Datetime-SIG] IANA TZ database statistics In-Reply-To: (Alexander Belopolsky's message of "Thu, 24 Sep 2015 14:50:39 -0400") References: <1443120138.1452862.392770233.43DB6E32@webmail.messagingengine.com> Message-ID: <87oagr1ilg.fsf@gmail.com> Alexander Belopolsky writes: > On Thu, Sep 24, 2015 at 2:42 PM, Random832 wrote: > >> I still don't see the logic in doing any of this rather than parsing >> zoneinfo files directly on systems that use it; >> > > There is no portable way to even discover the location of the zoneinfo > files. (The default location when installing from the source is the > incredible /usr/local/etc/zoneinfo! ) If some location is guessed by > searching some likely candidates, there is no guarantee that this is what > system tzset() is using. tzlocal module by Lennart Regebro might be good enough in practice https://github.com/regebro/tzlocal From alexander.belopolsky at gmail.com Fri Sep 25 02:18:35 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Thu, 24 Sep 2015 20:18:35 -0400 Subject: [Datetime-SIG] IANA TZ database statistics In-Reply-To: <87oagr1ilg.fsf@gmail.com> References: <1443120138.1452862.392770233.43DB6E32@webmail.messagingengine.com> <87oagr1ilg.fsf@gmail.com> Message-ID: On Thu, Sep 24, 2015 at 7:51 PM, Akira Li <4kir4.1i at gmail.com> wrote: > tzlocal module by Lennart Regebro might be good enough in practice > https://github.com/regebro/tzlocal > How well has it been tested on say FreeBSD or Solaris? -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.peters at gmail.com Fri Sep 25 03:11:08 2015 From: tim.peters at gmail.com (Tim Peters) Date: Thu, 24 Sep 2015 20:11:08 -0500 Subject: [Datetime-SIG] IANA TZ database statistics In-Reply-To: References: <1443120138.1452862.392770233.43DB6E32@webmail.messagingengine.com> <87oagr1ilg.fsf@gmail.com> Message-ID: [Akira Li <4kir4.1i at gmail.com>] >> tzlocal module by Lennart Regebro might be good enough in practice >> https://github.com/regebro/tzlocal [Alex] > How well has it been tested on say FreeBSD or Solaris? I'm not sure it's relevant to what you're trying to do now. Lennart's tzlocal is intended to work with pytz, and its unix.py just searches all over creation for "the local" IANA tzfile to pass to pytz. My guess is that unless/until Python ships IANA files itself, we're best off sticking to standard C functions. Those have _some_ scant chance of working everywhere ;-) From alexander.belopolsky at gmail.com Fri Sep 25 21:57:36 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Fri, 25 Sep 2015 15:57:36 -0400 Subject: [Datetime-SIG] IANA TZ database statistics In-Reply-To: References: Message-ID: On Wed, Sep 23, 2015 at 6:16 PM, Chris Angelico wrote: > I'm not sure whether this actually helps anything or not, ... Based on the fold size statistic, I have implemented [1] a more robust fold-detection algorithm that passes the exhaustive test. The only assumption that it requires is that no fold is bigger than 24 hours. [1]: https://github.com/abalkin/cpython/commit/54d3596b0180512c68c91e8308665c0a9e61c9eb -------------- next part -------------- An HTML attachment was scrubbed... 
URL: 

From tim.peters at gmail.com Fri Sep 25 23:32:50 2015
From: tim.peters at gmail.com (Tim Peters)
Date: Fri, 25 Sep 2015 16:32:50 -0500
Subject: [Datetime-SIG] IANA TZ database statistics
In-Reply-To: 
References: 
Message-ID: 

[Alex]
> Based on the fold size statistic, I have implemented [1] a more robust
> fold-detection algorithm that passes the exhaustive test. The only
> assumption that it requires is that no fold is bigger than 24 hours.
>
> [1]:
> https://github.com/abalkin/cpython/commit/54d3596b0180512c68c91e8308665c0a9e61c9eb

Wondering whether this line:

    if probe2 != result + trans:

could be replaced with:

    if probe2 == result:

I'm not sure what the first line is saying ;-) The second line says to me
"this is the later time in a fold if and only if subtracting the width of
the fold from the starting timestamp converts to the same local time -
that's what 'the later time in a fold' means".

From alexander.belopolsky at gmail.com Sat Sep 26 02:25:19 2015
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Fri, 25 Sep 2015 20:25:19 -0400
Subject: [Datetime-SIG] IANA TZ database statistics
In-Reply-To: 
References: 
Message-ID: 

On Fri, Sep 25, 2015 at 5:32 PM, Tim Peters wrote:
>
> >
> https://github.com/abalkin/cpython/commit/54d3596b0180512c68c91e8308665c0a9e61c9eb
>
> Wondering whether this line:
>
>     if probe2 != result + trans:
>
> could be replaced with:
>
>     if probe2 == result:
>

Yes, it can. Thanks for the suggestion.

>
> I'm not sure what the first line is saying ;-)

It says that probe2 and result are on the opposite sides of the
transition, but your test is simpler and easier to understand.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From tim.peters at gmail.com Sat Sep 26 04:28:20 2015
From: tim.peters at gmail.com (Tim Peters)
Date: Fri, 25 Sep 2015 21:28:20 -0500
Subject: [Datetime-SIG] IANA TZ database statistics
In-Reply-To: 
References: 
Message-ID: 

>>> https://github.com/abalkin/cpython/commit/54d3596b0180512c68c91e8308665c0a9e61c9eb

[Tim]
>> Wondering whether this line:
>>
>>     if probe2 != result + trans:
>>
>> could be replaced with:
>>
>>     if probe2 == result:

[Alex]
> Yes, it can. Thanks for the suggestion.

Good! So you have a simple, cross-platform solution now, at least for
timestamps localtime() doesn't barf on.

> The only assumption that it requires is that no fold is bigger than 24 hours.

Well, it does rely on more than just that. For example, if there's a gap
where the clock jumps from 2 to 3, followed soon after by a fold of an
hour repeating times of the form 4:MM, then the second occurrence of 4:30
won't be detected as such - the fold and the gap "cancel out" with respect
to subtracting 24 hours in either naive time or timestamp time.

So it seems a sufficient condition is that there's at most one UTC offset
change in the last 24 hours. I wouldn't be surprised if that's always
true now - or that it won't be after Kim Jong-un learns he could annoy us
by making it false ;-)

From alexander.belopolsky at gmail.com Sat Sep 26 04:50:19 2015
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Fri, 25 Sep 2015 22:50:19 -0400
Subject: [Datetime-SIG] IANA TZ database statistics
In-Reply-To: 
References: 
Message-ID: 

On Fri, Sep 25, 2015 at 10:28 PM, Tim Peters wrote:
>
> So it seems a sufficient condition is that there's at most one UTC
> offset change in the last 24 hours.

Yes. That's the condition I've been talking for months about.
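(For the record, the whole check boils down to something like the sketch
below - illustrative only, with made-up helper names, built directly on
time.localtime and leaning on exactly that assumption of at most one
transition in the preceding 24 hours:)

    import time
    from datetime import datetime, timedelta

    def local_naive(ts):
        # Naive local datetime for a POSIX timestamp, via the C library.
        return datetime(*time.localtime(ts)[:6])

    def fold_of_timestamp(ts, window=24 * 3600):
        # 1 if ts maps to the later of two repeated local times, else 0.
        result = local_naive(ts)
        probe1 = local_naive(ts - window)
        # If the clock was set back by d seconds somewhere in the window,
        # only window - d seconds of local time elapsed across it.
        shift = timedelta(seconds=window) - (result - probe1)
        if shift <= timedelta(0):
            return 0  # no fold ended in the window (perhaps a gap instead)
        # The later of two repeated readings is the one where stepping back
        # by the width of the fold lands on the same local time.
        probe2 = local_naive(ts - int(shift.total_seconds()))
        return 1 if probe2 == result else 0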
If you have a chance, please take a look at https://github.com/abalkin/cpython/commit/d146830e70a1fda22380c5ba0d9592c16acd23de It fails on Europe/Tallinn which seems to have transitions separated by 22 hours with the *same* utcoffset. I don't understand why zic would ever produce something like this. -------------- next part -------------- An HTML attachment was scrubbed... URL: From alexander.belopolsky at gmail.com Sat Sep 26 05:03:29 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Fri, 25 Sep 2015 23:03:29 -0400 Subject: [Datetime-SIG] IANA TZ database statistics In-Reply-To: References: Message-ID: On Fri, Sep 25, 2015 at 10:50 PM, Alexander Belopolsky < alexander.belopolsky at gmail.com> wrote: > It fails on Europe/Tallinn which seems to have transitions separated by 22 > hours with the *same* utcoffset. > > I don't understand why zic would ever produce something like this. > Interestingly, system zdump misses the problem transition: $ zdump -v Europe/Tallinn | grep 1999 Europe/Tallinn Sun Mar 28 00:59:59 1999 UTC = Sun Mar 28 02:59:59 1999 EET isdst=0 Europe/Tallinn Sun Mar 28 01:00:00 1999 UTC = Sun Mar 28 04:00:00 1999 EEST isdst=1 Europe/Tallinn Sun Oct 31 00:59:59 1999 UTC = Sun Oct 31 03:59:59 1999 EEST isdst=1 Europe/Tallinn Sun Oct 31 01:00:00 1999 UTC = Sun Oct 31 03:00:00 1999 EET isdst=0 You need to use my zdump.py tool [1] to see it: $ ./python.exe Tools/tz/zdump.py Europe/Tallinn | grep 1999 1999-03-28 01:00:00 UTC = 1999-03-28 04:00:00 EEST isdst=1 +1 1999-10-31 01:00:00 UTC = 1999-10-31 03:00:00 EET isdst=0 -1 1999-10-31 22:00:00 UTC = 1999-11-01 00:00:00 EET isdst=0 +0 [1] : https://github.com/abalkin/cpython/blob/issue24773-s3/Tools/tz/zdump.py -------------- next part -------------- An HTML attachment was scrubbed... URL: From alexander.belopolsky at gmail.com Sat Sep 26 05:12:26 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Fri, 25 Sep 2015 23:12:26 -0400 Subject: [Datetime-SIG] IANA TZ database statistics In-Reply-To: References: Message-ID: On Fri, Sep 25, 2015 at 11:03 PM, Alexander Belopolsky < alexander.belopolsky at gmail.com> wrote: > You need to use my zdump.py tool [1] to see it: > > $ ./python.exe Tools/tz/zdump.py Europe/Tallinn | grep 1999 > 1999-03-28 01:00:00 UTC = 1999-03-28 04:00:00 EEST isdst=1 +1 > 1999-10-31 01:00:00 UTC = 1999-10-31 03:00:00 EET isdst=0 -1 > 1999-10-31 22:00:00 UTC = 1999-11-01 00:00:00 EET isdst=0 +0 > > [1] : > https://github.com/abalkin/cpython/blob/issue24773-s3/Tools/tz/zdump.py > It looks like this problem has been fixed [2] in the 2015f release: $ ./python.exe Tools/tz/zdump.py /usr/local/etc/zoneinfo/Europe/Tallinn | grep 1999 1999-03-28 01:00:00 UTC = 1999-03-28 04:00:00 EEST isdst=1 +1 1999-10-31 01:00:00 UTC = 1999-10-31 03:00:00 EET isdst=0 -1 [2]: https://github.com/eggert/tz/commit/cf8df34364ffc9bd4eaddc5ff0d6bcdbd699893b -------------- next part -------------- An HTML attachment was scrubbed... URL: From alexander.belopolsky at gmail.com Sat Sep 26 05:36:45 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Fri, 25 Sep 2015 23:36:45 -0400 Subject: [Datetime-SIG] IANA TZ database statistics In-Reply-To: References: Message-ID: On Fri, Sep 25, 2015 at 10:28 PM, Tim Peters wrote: > So it seems a sufficient condition is that there's at most one UTC > offset change in the last 24 hours. > Apparently [1] we are not alone in wanting this condition. 
[1]: http://mm.icann.org/pipermail/tz/2015-June/022309.html -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.peters at gmail.com Sat Sep 26 05:36:52 2015 From: tim.peters at gmail.com (Tim Peters) Date: Fri, 25 Sep 2015 22:36:52 -0500 Subject: [Datetime-SIG] IANA TZ database statistics In-Reply-To: References: Message-ID: [Tim] >> So it seems a sufficient condition is that there's at most one UTC >> offset change in the last 24 hours. [Alex] > Yes. That's the condition I've been talking for months about. ? > If you have a chance, please take a look at > > https://github.com/abalkin/cpython/commit/d146830e70a1fda22380c5ba0d9592c16acd23de > > It fails on Europe/Tallinn which seems to have transitions separated by 22 > hours with the *same* utcoffset. > > I don't understand why zic would ever produce something like this. Well, the Tallinn source rules I see include: 2:00 EU EE%sT 1999 Nov 1 2:00 - EET 2002 Feb 21 That is, they decided to stop messing with DST at all effective the start of November, 1999. But until then, they were following "EU" daylight rules. Which ends DST on the last Sunday of October, which in 1999 happened to be Oct 31. So the first switch to EET local Sunday morning was due to EU daylight time ending, and then the second "switch" to EET at local midnight was due to Tallinn opting out of DST rules altogether. Which didn't change the zone name, DST status, or UTC offset. The output of zic doesn't appear nearly well defined enough to say whether that's "a bug" or "a feature", though :-( From alexander.belopolsky at gmail.com Sat Sep 26 05:44:09 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Fri, 25 Sep 2015 23:44:09 -0400 Subject: [Datetime-SIG] IANA TZ database statistics In-Reply-To: References: Message-ID: On Fri, Sep 25, 2015 at 11:36 PM, Tim Peters wrote: > > [Alex] > > Yes. That's the condition I've been talking for months about. > > ? "if we (generously) allow utcoffset to vary from -24h to +24h, then a "sane" zone can be defined as the one where utcoffset changes at most once in any 48 hour period." https://mail.python.org/pipermail/python-dev/2015-April/139171.html -------------- next part -------------- An HTML attachment was scrubbed... URL: From alexander.belopolsky at gmail.com Sat Sep 26 05:46:47 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Fri, 25 Sep 2015 23:46:47 -0400 Subject: [Datetime-SIG] IANA TZ database statistics In-Reply-To: References: Message-ID: On Fri, Sep 25, 2015 at 11:36 PM, Tim Peters wrote: > Well, the Tallinn source rules I see include: > > 2:00 EU EE%sT 1999 Nov 1 > 2:00 - EET 2002 Feb 21 > That's a bug that has been fixed in 2015f. See < http://mm.icann.org/pipermail/tz/2015-June/022309.html> for details. -------------- next part -------------- An HTML attachment was scrubbed... URL: From alexander.belopolsky at gmail.com Tue Sep 29 03:21:51 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Mon, 28 Sep 2015 21:21:51 -0400 Subject: [Datetime-SIG] Making tm_gmtoff and tm_zone available on all platforms Message-ID: Most UNIX platforms extend struct tm to include tm_gmtoff and tm_zone fields that contain the current UTC offset in seconds and the zone abbreviation. Python has been making these fields available as attributes of time.struct_time [1] since version 3.3, but only on platforms that support them in the C library. 
>>> import time >>> t = time.localtime() >>> t.tm_gmtoff -14400 >>> t.tm_zone 'EDT' I propose that we make these attributes available on all platforms by computing their values when they are not available in struct tm. The tm_gmtoff value is easy to compute by comparing localtime() to gmtime(): >>> u = time.gmtime(time.mktime(t)) >>> from calendar import timegm >>> timegm(t) - timegm(u) -14400 and tm_zone can be computed by calling strftime() with a '%Z' directive. >>> time.strftime('%Z', t) 'EDT' What does the group think? [1]: https://docs.python.org/3/library/time.html#time.struct_time -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Tue Sep 29 05:04:31 2015 From: guido at python.org (Guido van Rossum) Date: Mon, 28 Sep 2015 20:04:31 -0700 Subject: [Datetime-SIG] Making tm_gmtoff and tm_zone available on all platforms In-Reply-To: References: Message-ID: I had been wondering about that myself. But your implementation proposal sounds kind of expensive, doesn't it? On Mon, Sep 28, 2015 at 6:21 PM, Alexander Belopolsky < alexander.belopolsky at gmail.com> wrote: > Most UNIX platforms extend struct tm to include tm_gmtoff and tm_zone > fields that contain the current UTC offset in seconds and the zone > abbreviation. > > Python has been making these fields available as attributes of > time.struct_time [1] since version 3.3, but only on platforms that support > them in the C library. > > >>> import time > >>> t = time.localtime() > >>> t.tm_gmtoff > -14400 > >>> t.tm_zone > 'EDT' > > I propose that we make these attributes available on all platforms by > computing their values when they are not available in struct tm. > > The tm_gmtoff value is easy to compute by comparing localtime() to > gmtime(): > > >>> u = time.gmtime(time.mktime(t)) > >>> from calendar import timegm > >>> timegm(t) - timegm(u) > -14400 > > and tm_zone can be computed by calling strftime() with a '%Z' directive. > > >>> time.strftime('%Z', t) > 'EDT' > > What does the group think? > > [1]: https://docs.python.org/3/library/time.html#time.struct_time > > _______________________________________________ > Datetime-SIG mailing list > Datetime-SIG at python.org > https://mail.python.org/mailman/listinfo/datetime-sig > The PSF Code of Conduct applies to this mailing list: > https://www.python.org/psf/codeofconduct/ > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From alexander.belopolsky at gmail.com Tue Sep 29 05:29:59 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Mon, 28 Sep 2015 23:29:59 -0400 Subject: [Datetime-SIG] Making tm_gmtoff and tm_zone available on all platforms In-Reply-To: References: Message-ID: On Mon, Sep 28, 2015 at 11:04 PM, Guido van Rossum wrote: > > I had been wondering about that myself. But your implementation proposal sounds kind of expensive, doesn't it? It could be with a naive implementation that would simply fill additional fields in the existing time.struct_time object, but we can also modify the struct_time class to compute the additional attributes only when they are requested. (I believe struct_time is currently implemented as PyStructSequence, so we will probably need to subclass that somehow.) On the other hand, I would start with a naive implementation and worry about the optimizations later. 
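(For concreteness, the lazy variant could eventually look something like
this rough pure-Python sketch - the wrapper class is invented for
illustration, assumes the wrapped value came from time.localtime(), and a
real implementation would extend the C struct sequence instead:)

    import calendar
    import time

    class StructTimeWithZone:
        """Fill tm_gmtoff/tm_zone only when they are first requested."""

        def __init__(self, st):
            self._st = st  # a plain time.struct_time from localtime()

        def __getattr__(self, name):  # called only for missing attributes
            st = self._st
            if name == 'tm_gmtoff':
                # Same arithmetic as in the proposal: local reading minus
                # UTC reading for the same moment.
                u = time.gmtime(time.mktime(st))
                return calendar.timegm(st) - calendar.timegm(u)
            if name == 'tm_zone':
                return time.strftime('%Z', st)
            return getattr(st, name)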
As far as I know, the POSIX layer on Windows (which is the main platform that will be affected) is already very slow, so the price of cross-platform portability may be within user expectations in this case. -------------- next part -------------- An HTML attachment was scrubbed... URL: From alexander.belopolsky at gmail.com Tue Sep 29 06:08:21 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Tue, 29 Sep 2015 00:08:21 -0400 Subject: [Datetime-SIG] PEP 495 implementation Message-ID: I have completed a pure python implementation of PEP 495 and the patch is ready for review. [1] If you prefer the Github interface, please review the pull request from my cpython clone. [2] Finally, please add yourself as "nosy" to issue #24773 [3] if you would like to follow future developments. [1]: http://bugs.python.org/review/24773/#ps15654 [2]: https://github.com/python/cpython/pull/20 [3]: http://bugs.python.org/issue24773 -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Tue Sep 29 23:03:14 2015 From: guido at python.org (Guido van Rossum) Date: Tue, 29 Sep 2015 14:03:14 -0700 Subject: [Datetime-SIG] Making tm_gmtoff and tm_zone available on all platforms In-Reply-To: References: Message-ID: OK, I think this is fine then. On Mon, Sep 28, 2015 at 8:29 PM, Alexander Belopolsky < alexander.belopolsky at gmail.com> wrote: > > On Mon, Sep 28, 2015 at 11:04 PM, Guido van Rossum > wrote: > > > > I had been wondering about that myself. But your implementation proposal > sounds kind of expensive, doesn't it? > > > It could be with a naive implementation that would simply fill additional > fields in the existing time.struct_time object, but we can also modify the > struct_time class to compute the additional attributes only when they are > requested. (I believe struct_time is currently implemented as > PyStructSequence, so we will probably need to subclass that somehow.) > > On the other hand, I would start with a naive implementation and worry > about the optimizations later. As far as I know, the POSIX layer on > Windows (which is the main platform that will be affected) is already very > slow, so the price of cross-platform portability may be within user > expectations in this case. > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From alexander.belopolsky at gmail.com Wed Sep 30 18:54:56 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Wed, 30 Sep 2015 12:54:56 -0400 Subject: [Datetime-SIG] Making tm_gmtoff and tm_zone available on all platforms In-Reply-To: References: Message-ID: On Tue, Sep 29, 2015 at 5:03 PM, Guido van Rossum wrote: > OK, I think this is fine then. Implementation will be tracked at . -------------- next part -------------- An HTML attachment was scrubbed... URL: