[Datetime-SIG] Another round on error-checking

Tue Sep 1 18:58:34 CEST 2015

On Tue, Sep 1, 2015 at 9:37 AM, Alexander Belopolsky <
alexander.belopolsky at gmail.com> wrote:

>
> On Tue, Sep 1, 2015 at 12:12 PM, Guido van Rossum <guido at python.org>
> wrote:
>
>> I could not accept a PEP that leads to different datetime being
>> considered == but having a different hash (*unless* due to a buggy tzinfo
>> subclass implementation -- however no historical timezone data should ever
>> depend on such a bug).
>>
>
> I agree, but my analysis demonstrates that we cannot fix hash to make an
> arbitrary tzinfo work.  ("Arbitrary" includes tzinfos with leap
> microseconds and leap centuries.)   We can probably come up with a good
> enough hash if we restrict fold sizes to multiples of 15 min up to 1 hour
> and locations to hour boundaries.
>

That's bizarre. I suspect this came from assuming too much about how ==
must work.

> My preferred solution would be to delegate hash calculation to tzinfo and
> make it someone else's headache, but I know you don't like this solution.
>
>
>
>> I'm much less concerned about < being intransitive in edge cases. I also
>> don't particularly care about == following from the difference being zero.
>>
>
> I believe Tim does care about this.  I did consider divorcing comparison
> and arithmetic, but I think that led to problems with the total ordering.
> Maybe we can make == differentiate between fold=0 and fold=1 at the expense
> of not(a < b) and not(b < a) implying a == b?
> I am not too hopeful.  Messing with total ordering axioms is just as fatal
> for binary searches as messing with hash invariants is for dictionary
> lookups.
>

I think it's better to have some values that are neither < nor == nor >
each other, than to have two values that are == but differ in hash.
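
To make the trade-off concrete, here is a minimal sketch (not the datetime
implementation). `Stamp` is a hypothetical toy class whose == ignores a
`fold` flag while its hash does not. The set comparison at the end shows
that Python already has values that are neither <, == nor > each other.

```python
class Stamp:
    """Hypothetical toy type: == ignores `fold`, but the hash does not."""

    def __init__(self, minutes, fold=0):
        self.minutes = minutes
        self.fold = fold

    def __eq__(self, other):
        if not isinstance(other, Stamp):
            return NotImplemented
        return self.minutes == other.minutes

    def __hash__(self):
        # Deliberately inconsistent with __eq__ to show the failure mode.
        return hash((self.minutes, self.fold))


a = Stamp(90, fold=0)
b = Stamp(90, fold=1)
table = {a: "first"}

# a == b holds, yet looking up b misses the entry stored under a.
equal_but_lost = (a == b) and (b not in table)

# Sets are only partially ordered: disjoint sets are unordered and unequal.
x, y = {1}, {2}
unordered = not (x < y) and not (x > y) and x != y
```

The first case silently corrupts dict and set lookups; the second only
means that sorting and bisecting need care.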

>> Still, unless we're constrained by backward compatibility, I would rather
>> not add equivalence between *any* two datetimes whose tzinfo is not the
>> same object -- even if we can infer that they both must refer to the same
>> instant.
>>
>
> Not even for fixed offset timezones?  I am afraid this will break too many
> programs.
>

Oh, it looks like we currently allow < and > if the utcoffset() of both
arguments is the same. I presume that's really a proxy for "both tzinfos
have the same fixed offset" which we can't detect directly. But this is
already pretty broken -- for tzinfos that don't have fixed offsets, the
comparison succeeds if both datetimes happen to fall in a period where the
offsets *are* the same.
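
As a hedged illustration of how such comparisons behave with the stdlib's
fixed-offset `timezone` class (the dates and offsets are made up for the
example): two aware datetimes whose tzinfo are separately constructed
objects can still be ordered, and equal instants hash alike.

```python
from datetime import datetime, timedelta, timezone

# Two separately constructed tzinfo objects with the same fixed offset.
plus2_a = timezone(timedelta(hours=2))
plus2_b = timezone(timedelta(hours=2))

early = datetime(2015, 9, 1, 18, 0, tzinfo=plus2_a)
late = datetime(2015, 9, 1, 18, 58, tzinfo=plus2_b)

# Ordering works across the two tzinfo objects.
ordered = early < late

# The same instant expressed in UTC compares equal and hashes the same.
utc_view = datetime(2015, 9, 1, 16, 0, tzinfo=timezone.utc)
same_instant = early == utc_view
same_hash = hash(early) == hash(utc_view)
```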

In any case, a broken total ordering doesn't bother me that much, except
when the tzinfo is the same object. I wonder if we could cache the built-in
fixed-offset timezone instances? (Currently a new instance is created each
time you call astimezone(None).) Does pytz reuse its fixed-offset objects?
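
One way such caching could look, as a rough sketch only:
`cached_fixed_offset` is a hypothetical helper, not an existing datetime
API, and it simply memoizes `timezone` instances by offset in minutes.

```python
from datetime import timedelta, timezone
from functools import lru_cache


@lru_cache(maxsize=None)
def cached_fixed_offset(minutes):
    """Return one shared timezone instance per offset (hypothetical helper)."""
    return timezone(timedelta(minutes=minutes))


tz1 = cached_fixed_offset(120)
tz2 = cached_fixed_offset(120)
shared = tz1 is tz2  # the same object is handed out for the same offset
```

With shared instances, an identity test on tzinfo would be enough to
recognize two datetimes as belonging to the same fixed-offset zone.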

And given that we already have total ordering problems, from that
perspective I could live with declaring that two datetimes that differ only
in the fold are unequal. (Hm, aren't they already unequal because their
utcoffset() differs?)
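
For reference, the `fold` attribute that later shipped with PEP 495
(Python 3.6 and newer) did not take this route for naive datetimes or for
datetimes sharing a tzinfo: there, fold is ignored by == and by hash. A
small check, assuming a Python version that has `fold`; the date is an
arbitrary example:

```python
from datetime import datetime

first = datetime(2015, 11, 1, 1, 30, fold=0)
second = first.replace(fold=1)

# The two values differ only in fold ...
differ_only_in_fold = first.fold != second.fold

# ... yet they compare equal and hash identically, so the
# "equal implies same hash" invariant is preserved.
still_equal = first == second
same_hash = hash(first) == hash(second)
```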

-- 
--Guido van Rossum (python.org/~guido)