[Datetime-SIG] PEP 495 (Local Time Disambiguation) is ready for pronouncement

Alexander Belopolsky alexander.belopolsky at gmail.com
Sun Aug 16 23:45:20 CEST 2015


On Sun, Aug 16, 2015 at 3:23 PM, Guido van Rossum <guido at python.org> wrote:
> I think that a courtesy message to python-dev is appropriate, with a link to
> the PEP and an invitation to discuss its merits on datetime-sig.

Will do.  (Does anyone know how to set Reply-To: header in Gmail?)

..
> - I'm surprised the name of the proposed flag doesn't occur in the abstract.
>

That's because I wanted people to get to the proposal section before
starting to bikeshed on the name of the flag.   More on that below.

> - The rationale might explicitly mention the two cases we're thinking about:
> DST transitions and adjustments to the timezone's base offset -- noting that
> the latter may be an arbitrary interval (not just an hour).
>

Actually, in either case the adjustment can be a fraction of an hour.
I'll add this to the rationale.
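
A minimal sketch of a fractional-hour fold, assuming Lord Howe Island's
offsets of UTC+10:30 (standard) and UTC+11:00 (daylight). The names LHST
and LHDT and the wall-clock reading are illustrative only:

```python
from datetime import datetime, timedelta, timezone

# Assumed offsets: UTC+10:30 standard time, UTC+11:00 daylight time.
LHST = timezone(timedelta(hours=10, minutes=30), "LHST")
LHDT = timezone(timedelta(hours=11), "LHDT")

# Illustrative wall-clock reading that occurs twice when clocks go back.
wall = datetime(2015, 4, 5, 1, 45)
earlier = wall.replace(tzinfo=LHDT)   # first occurrence, still on daylight time
later = wall.replace(tzinfo=LHST)     # second occurrence, back on standard time

fold_size = later - earlier
print(fold_size)                      # 0:30:00
```

The two readings are only half an hour apart, so nothing in the design can
assume a one-hour fold.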

> - The sidebar doesn't show up as a sidebar, but as somewhat mysterious text,
> on https://www.python.org/dev/peps/pep-0495/ (it does on legacy.python.org,
> but we're trying to avoid that site). Maybe you should file a bug with the
> pydotorg project on GitHub (if you haven't already).

I did: <https://github.com/python/pythondotorg/issues/808>.

> (While I like the
> artwork, it's a bit un-PEP-like, and maybe not worth it given the problems
> making the image appear properly.)

If we don't fix the layout issues before the pronouncement, I'll
remove the graphic.

> - Conversely, on legacy.python.org there are some error messages about
> "Unknown directive type "code"" (lines 112, 118).

I'll look into this.  I've never had problems with ReStructuredText
rendering on docs.p.o, but the peps site seems to be more restrictive.

>
> - "a fold is created in the fabric of time" sounds a bit like
> science-fiction. I'd just say "a time fold is created", or "a fold is
> created in time".
>

Agree.  After all, a "fold" already suggests some kind of fabric.

> - Despite having read the section about the naming, I'm still not wild about
> the name 'first'. This is in part because this requires True as the default,
> in part because without knowing the background its meaning is somewhat
> mysterious.

I agree.  My top candidate is "repeated=False", but an invitation to
bikeshed, <https://mail.python.org/pipermail/datetime-sig/2015-August/000241.html>,
was not met with the usual enthusiasm.  To defend the "True means
earlier" choice, I would mention that it matches "isdst=1 means
earlier" in the fold.

> I'm not wild about the alternatives either, so perhaps this
> requires more bikeshedding. :-( (FWIW I agree that the name should not
> reference DST, since time folds may appear for other reasons.) Hmm... Maybe
> "fold=True" to select the second occurrence?

I really want something that disambiguates two times based on their
most natural characteristic: do you want the earlier or the later of
the two choices?  Anything else, in my view, would require additional
knowledge.

>
> - I'm a bit surprised that this flag doesn't have three values (e.g. None,
> True, False) -- in C, the tm_isdst flag in struct tm can be -1, 0 and 1,
> where -1 means "figure it out" or "don't care".

With the proposed functionality, one can easily implement any of the
C-style isdst logic.  The problem, however, is that while most C
libraries agree in their treatment of 0 and 1, the behavior on
tm_isdst=-1 ranges from bad to absurd.  For example, the value
returned by mktime in the ambiguous case may depend on the arguments
passed to the previous call to mktime.
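
A rough sketch of C-style isdst selection layered on the flag.  It uses the
"fold" spelling floated above (fold=1 selects the second occurrence), which
is what Python 3.6 and later expose.  ToyEastern is a hypothetical tzinfo
that hard-codes the 2015 US Eastern transitions, and raising on isdst=-1 is
just one possible policy:

```python
from datetime import datetime, timedelta, tzinfo

HOUR = timedelta(hours=1)
ZERO = timedelta(0)

class ToyEastern(tzinfo):
    """Hypothetical tzinfo: US Eastern rules hard-coded for 2015 only."""
    GAP = datetime(2015, 3, 8, 2)     # wall times 02:00-03:00 are skipped
    FOLD = datetime(2015, 11, 1, 1)   # wall times 01:00-02:00 occur twice

    def dst(self, dt):
        wall = dt.replace(tzinfo=None)
        if self.GAP <= wall < self.GAP + HOUR:
            return HOUR if dt.fold else ZERO   # gap: fold=0 keeps the old offset
        if self.FOLD <= wall < self.FOLD + HOUR:
            return ZERO if dt.fold else HOUR   # fold: fold=0 is still daylight time
        return HOUR if self.GAP + HOUR <= wall < self.FOLD else ZERO

    def utcoffset(self, dt):
        return timedelta(hours=-5) + self.dst(dt)

    def tzname(self, dt):
        return "EDT" if self.dst(dt) else "EST"

def resolve(naive, tz, isdst):
    """Attach tz to naive, choosing between candidates by C-style isdst."""
    candidates = [naive.replace(tzinfo=tz, fold=f) for f in (0, 1)]
    if candidates[0].utcoffset() == candidates[1].utcoffset():
        return candidates[0]                   # unambiguous: isdst is irrelevant
    if isdst < 0:
        raise ValueError("ambiguous or missing time: %s" % naive)
    for candidate in candidates:
        if bool(candidate.dst()) == bool(isdst):
            return candidate

tz = ToyEastern()
print(resolve(datetime(2015, 11, 1, 1, 30), tz, isdst=1))
print(resolve(datetime(2015, 11, 1, 1, 30), tz, isdst=0))
```

Here isdst=1 picks the earlier of the two times in the fold, matching the
"isdst=1 means earlier" convention mentioned above.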

> The "don't care" case should allow stricter backward compatibility.

I am not sure we want to maintain the behavior described in
<http://bugs.python.org/issue22627> (Calling timestamp() on a datetime
object modifies the timestamp of a different datetime object.)

>
> - "[1] An instance that has first=False in a non-ambiguous case is said to
> represent an invalid time ..." Could you quickly elaborate here whether such
> an invalid time is considered an hour later than the valid corresponding
> time with first=True, given a reasonable timezone with and without DST?

Such an instance is just *invalid* as in "February 29, 2015."  In a
non-ambiguous case,  first=False means "the second of one", which does
not make sense.  Such instances should never be produced except for a
narrow purpose of probing the astimezone() or timestamp() to determine
whether a given datetime is ambiguous or not.

>
> - "In CPython, a non-boolean value of first will raise a TypeError , but
> other implementations may allow the value None to behave the same as when
> first is not given." This is surprisingly lenient. Why allow the second
> behavior at all?

Because it is currently allowed for the other arguments of replace()
in the pure Python datetime implementation that we ship.  I will be
happy to change that, starting with "first".

> (Surely all Python implementations can distinguish between
> a value equal to None and a missing value, even if some kind of hack is
> needed.) Also, why this clause for replace() but not for other methods?

What other methods?  replace() is fairly unique in its treatment of arguments.
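
A sketch of the kind of "hack" that tells a missing argument apart from an
explicit None.  The sentinel _MISSING and the helper replace_flag are
hypothetical names, not part of datetime:

```python
_MISSING = object()  # hypothetical sentinel; no caller can pass it by accident

def replace_flag(current, first=_MISSING):
    """Return the new flag value, the way replace() could treat 'first'."""
    if first is _MISSING:
        return current                  # argument omitted: keep the old value
    if not isinstance(first, bool):
        # An explicit None lands here instead of silently meaning "missing".
        raise TypeError("first must be a bool, not %s" % type(first).__name__)
    return first

print(replace_flag(True))               # True
print(replace_flag(True, False))        # False
```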

>
> - I'm disappointed that there are now some APIs that explicitly treat a
> naive datetime as local (using the system timezone). I carefully avoided
> such interpretation in the original design, since a naive datetime can also
> be used to represent a point in UTC, or in some timezone that's implicit.
> But I guess this cat is out of the bag since it's already assumed by
> timestamp() and fromtimestamp(). :-(

I held that siege as long as I could.

>
> - "Conversion from POSIX seconds from EPOCH" I'd move this section before
> the opposite clause, since it is simpler and the other clause references
> fromtimestamp(). The behavior of fromtimestamp() may also be considered
> motivational for having only the values True and False for the flag.
>

Will do.

> - "New guidelines will be published for implementing concrete timezones with
> variable UTC offset." Where?

In the official datetime documentation.  I'll clarify that.

> (Is this just a forward reference to the next section? Then I'd drop it.)

No, I expect that section to be incorporated in the official datetime
library documentation.


>
> - "... must follow these guidelines." Here "must" is very strong (it is the
> strongest word in "standards speak", stronger than "should", "ought to",
> "may"). I recommend "should", that's strong enough.

OK.  This is a remnant of the idea to include a first-aware fromutc()
implementation, which after some private discussions with Tim we
decided to abandon.  In light of that idea, "must" made sense as in
"in order for unmodified fromutc() to work correctly with your tzinfo
implementation, it *must* ..."

..
> - "We chose the minute byte to store the "first" bit because this choice
> preserves the natural ordering." This only works with folds of exactly one
> hour. Also, is the natural ordering (of the pickles, apparently) used
> anywhere? I would hope not. Finally, given that two times that differ only
> in their 'first' flag compare equal, the natural ordering (if relevant :-)
> would be to store/compare the 'first' bit last.
>

I'll remove the rationale.  The ordering is a red herring anyway.  I
needed a place to stick one bit in the 10-byte payload, and the minute
byte looked like a natural place.  I made up the ordering rationale to
justify an arbitrary choice a posteriori.
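
For illustration only, here is one way a spare bit of the minute byte could
carry the flag.  The bit position (0x80) and both helper names are
assumptions of this sketch, not the actual pickle layout:

```python
FLAG_BIT = 0x80  # assumed position; minutes 0..59 only need the low 6 bits

def pack_minute(minute, first=True):
    """Return the minute byte with the flag bit set when first is False."""
    if not 0 <= minute < 60:
        raise ValueError("minute must be in 0..59")
    return minute if first else minute | FLAG_BIT

def unpack_minute(byte):
    """Split a minute byte back into (minute, first)."""
    return byte & 0x7F, not (byte & FLAG_BIT)

print(unpack_minute(pack_minute(30, first=False)))   # (30, False)
```

With the default first=True stored as a clear bit, a byte written without
the flag still reads back as first=True.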


> - Temporal Arithmetic (probably shouldn't have an "s" at the end):

Wikipedia is of no help here: "Arithmetic or arithmetics (from the
Greek ἀριθμός arithmos, "number")  ..." I'll check what we use in the
library docs.  (For some reason, I thought that arithmetic is a branch
of mathematics while arithmetics is a set of rules.)

> this probably needs some motivation. I think it's inevitable (since we don't know
> the size of the time fold), but it still feels weird.
>

It's what you say and backward compatibility considerations.  We want
existing programs to produce the same results even if they
occasionally encounter first=False instances from say datetime.now().
I'll add a footnote.
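
A small sketch of what "the same results" means in practice.  It uses the
"fold" spelling as exposed by Python 3.6 and later (fold=1 corresponds to
first=False); the variable names are illustrative:

```python
from datetime import datetime, timedelta

first = datetime(2015, 11, 1, 1, 30)     # fold=0, the default
second = first.replace(fold=1)           # same wall time, second occurrence

# Comparison of naive instances ignores the flag.
same = first == second

# Arithmetic ignores the flag too, and its result never carries it.
shifted = second + timedelta(hours=1)

print(same, shifted, shifted.fold)
```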

> - "[2] As of Python 3.5, tzinfo is ignored whenever timedelta is added or
> subtracted ..." I don't see a reason for this footnote to discuss possible
> future changes to datetime arithmetic; leave that up to the respective PEP.

I'll remove the discussion of the future changes to datetime arithmetic.

> (OTOH you may have a specific follow-up PEP in mind, and it may be better to
> review this one in the light of the follow-up PEP.)

Yes, there is a PEP-0500, but it is nowhere near as ready as this one.

> - "This proposal will have little effect on the programs that do not read
> the first flag explicitly or use tzinfo implementations that do." This seems
> ambiguous -- if I use a tzinfo implementation that reads the first flag, am
> I affected or not? Also, "the programs" should be just "programs", and I'm
> kind of curious why the hedging of "little effect" (rather than "no effect")

We are changing the behavior of datetime.timestamp on naive instances.
This is really what the "hedging" is about.

> is needed. Also, you might give some examples of changes that programs that
> *do* use the first flag may experience.

I don't understand.  The programs that  *do* use the first flag now
experience an AttributeError, and that will surely change.  Perhaps
you want to see some examples of how the programs can start using the
first flag?

>
> - In a reply to this thread, you wrote "The rule for the missing time is the
> opposite to that for the ambiguous time. This allows a program that probes
> the TZ database by calling timestamp with two different values of the
> "first" flag to avoid any additional calls to differentiate between the gap
> and the fold." Can you clarify this (I'm not sure how this works, though I
> intuitively agree that the two rules should be each other's opposite) and
> add it to the PEP?
>

Yes, I posted something like this before, but will include it in the PEP.
A first-aware program can do something like the following when it gets
a naive instance dt that it wants to decorate with a timezone.

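from warnings import warn

def localize(dt):  # hypothetical wrapper name
    # Probe the local timezone with both values of the proposed flag.
    dt1 = dt.replace(first=True).astimezone()
    dt2 = dt.replace(first=False).astimezone()

    if dt1 == dt2:
        # Regular time: the flag makes no difference.
        return dt1

    if dt1 < dt2:
        # Fold: both readings are valid, pick the earlier one.
        warn("ambiguous time: picked %s but it could be %s" % (dt1, dt2))
        return dt1

    # dt1 > dt2, gap: the wall time does not exist.
    raise ValueError("invalid time", dt, dt1, dt2)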
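
The probe above can be tried in Python 3.6 or later, where the flag is
spelled "fold" with the opposite sense (fold=0 corresponds to first=True).
This is a sketch under assumptions: ToyEastern is a hypothetical tzinfo
hard-coding the 2015 US Eastern transitions, and it is passed explicitly
instead of relying on the system timezone:

```python
import warnings
from datetime import datetime, timedelta, timezone, tzinfo

HOUR = timedelta(hours=1)
ZERO = timedelta(0)

class ToyEastern(tzinfo):
    """Hypothetical tzinfo: US Eastern rules hard-coded for 2015 only."""
    GAP = datetime(2015, 3, 8, 2)     # wall times 02:00-03:00 are skipped
    FOLD = datetime(2015, 11, 1, 1)   # wall times 01:00-02:00 occur twice

    def dst(self, dt):
        wall = dt.replace(tzinfo=None)
        if self.GAP <= wall < self.GAP + HOUR:
            return HOUR if dt.fold else ZERO   # rule in the gap ...
        if self.FOLD <= wall < self.FOLD + HOUR:
            return ZERO if dt.fold else HOUR   # ... is the opposite of the fold
        return HOUR if self.GAP + HOUR <= wall < self.FOLD else ZERO

    def utcoffset(self, dt):
        return timedelta(hours=-5) + self.dst(dt)

    def tzname(self, dt):
        return "EDT" if self.dst(dt) else "EST"

def localize_with_fold(naive, tz):
    """Attach tz to naive; warn in a fold, raise in a gap."""
    first = naive.replace(tzinfo=tz, fold=0)
    second = naive.replace(tzinfo=tz, fold=1)
    utc1 = first.astimezone(timezone.utc)
    utc2 = second.astimezone(timezone.utc)
    if utc1 == utc2:
        return first                           # regular time
    if utc1 < utc2:
        warnings.warn("ambiguous time: picked %s but it could be %s"
                      % (first, second))
        return first                           # fold: keep the earlier reading
    raise ValueError("invalid time", naive)    # gap: no such wall time

tz = ToyEastern()
print(localize_with_fold(datetime(2015, 7, 1, 12, 0), tz))
```

Two conversions are enough to tell the three cases apart, because the gap
and the fold order the two results in opposite ways.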


> - Would there be any merit in proposing, together with the idea of a
> three-value flag, that datetime arithmetic should use "timeline arithmetic"
> if the flag is defined and a tzinfo is present?

To add a third value, you will need a full additional bit anyways, so
why not just have a separate flag that controls the choice of
arithmetic and leave "first" a pure fold disambiguation flag?  I
consider the problem of local time disambiguation and that of the
"timeline arithmetic" to be two orthogonal problems.  Yes, "timeline
arithmetic" can benefit from the first flag, but it is possible
without it.  Similarly, the problem of round-tripping the times
between timezones can benefit from "timeline arithmetic", but PEP 495
solves it without introducing the new arithmetic.

In my view PEP 495 solves a long-standing problem for which there is
no adequate workaround within stdlib and third-party workarounds are
cumbersome.  The alternative datetime arithmetic PEP (PEP-0500)
enables some nice to have features, but does not enable anything that
cannot be achieved  by other means.  I would like to avoid mixing the
two proposals.

