[Python-Dev] Status on PEP-431 Timezones

Fri Apr 10 18:43:06 CEST 2015

On 15-04-10, Stuart Bishop wrote:
> 
> On 10 April 2015 at 17:12, Nick Coghlan <ncoghlan at gmail.com> wrote:
> 
> > The question of "store the DST flag" vs "store the offset" is essentially a
> > data normalisation one - there's only a single bit of additional information
> > actually needed (whether the time is DST or not in the annual hour of
> > ambiguity), which can then be combined with the local time, the location and
> > the zone info database to get the *actual* offset.

One thing that hasn't been mentioned is that is_dst and offset are not parallel on the datetime.time class, since you have no information about when in the history of the time zone a time was recorded. If you only record the time zone and is_dst flag, then updating zoneinfo can change the UTC time corresponding to existing aware time objects. Whereas if you only store offsets, it can change whether a time is interpreted as being DST or not (or whether the offset is considered valid at all). Which is mostly to say that aware time objects that don't also store dates are probably just a bad idea in general.

> The dst flag must be stored in the datetime, either as a boolean or
> encoded in the timezone structure. If you only have the offset, you
> lose interoperability with the time module (and the standard IANA
> zoneinfo library, posix etc., which it wraps). (dt +
> timedelta(hours=1)).ctime() will give an incorrect answer when you
> cross a DST transition.

From local time + offset you can compute UTC time, and from there you can look up in the tzinfo whether it's DST or not. But yes, I'm starting to be less enamored of this idea. I get that it's basically the same as doing everything in UTC and caching local time, but I wasn't thinking about the fact that so many existing APIs need (local time, is_dst) that the friction is a problem.
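
A minimal sketch of that lookup, to make the round trip concrete. It assumes US Central rules for 2013 only, hard-coded as two UTC instants; `is_dst_from_offset` is a hypothetical helper, not anything from the PEP or pytz, and it does not check that the offset is actually valid for the zone.

```python
from datetime import datetime, timedelta

# Assumption: US Central, 2013 only, transitions expressed as naive UTC.
DST_START_UTC = datetime(2013, 3, 10, 8, 0)   # 02:00 CST on March 10
DST_END_UTC = datetime(2013, 11, 3, 7, 0)     # 02:00 CDT on November 3


def is_dst_from_offset(local, offset):
    """Hypothetical helper: derive the DST flag from (local time, offset)."""
    utc = local - offset                      # local + offset -> UTC
    return DST_START_UTC <= utc < DST_END_UTC  # UTC -> DST lookup


# 01:30 on November 3 happens twice; the offset tells the two apart.
ambiguous = datetime(2013, 11, 3, 1, 30)
first_pass = is_dst_from_offset(ambiguous, timedelta(hours=-5))
second_pass = is_dst_from_offset(ambiguous, timedelta(hours=-6))
```

Here `first_pass` comes out as DST and `second_pass` as standard time, so the offset carries the same one bit as the flag, just in a form the time module doesn't accept.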

> > A question the PEP perhaps *should* consider is whether or not to offer an
> > API allowing datetime objects to be built from a naive datetime, a fixed
> > offset and a location, throwing NonExistentTimeError if the given date, time
> > and offset doesn't match either the DST or non-DST times at that location.
> 
> I don't think you need a specific API, apart from being able to
> construct a tzinfo using nothing but an offset (lots of people require
> this for things like parsing email headers, which is why pytz has the
> FixedOffset class).
> 
> datetime.now().astimezone(FixedOffset(-1200)).astimezone(timezone('Melbourne/Australia',
> is_dst=None)

This doesn't work:

```
>>> from datetime import *
>>> datetime.now().astimezone(timezone(-timedelta(hours=2)))
ValueError: astimezone() cannot be applied to a naive datetime
```

But also, how is this different from needing to know the offset in order to construct an aware datetime?

I know that you can do datetime.now(tz), and you can do datetime(2013, 11, 3, 1, 30, tzinfo=zoneinfo('America/Chicago')), but not being able to add a time zone to an existing naive datetime is painful (and strptime doesn't even let you pass in a time zone). Some of the people who designed the instruments that we use to collect data understood why we would care when it was collected, and some of them didn't, and I need to be able to handle their data regardless. (The makers of one device with subsecond resolution opted for maximum compatibility with Microsoft Excel by writing a CSV with times as naive days since Dec. 30, 1899 to five decimal digits, but I doubt there's anything to be done for them.)
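
For the fixed-offset case, here is roughly what handling that kind of data looks like with the stdlib as it stands. The serial value and the claim that the instrument's clock was on CDT are assumptions made up for illustration; `from_excel_serial` is a hypothetical helper.

```python
from datetime import datetime, timedelta, timezone

# Day zero of the "naive days since Dec. 30, 1899" convention.
EXCEL_EPOCH = datetime(1899, 12, 30)


def from_excel_serial(days):
    """Hypothetical helper: fractional days since the epoch -> naive datetime."""
    return EXCEL_EPOCH + timedelta(days=days)


# Assumed sample reading; the file itself says nothing about the zone.
naive = from_excel_serial(41581.0625)

# Assumption: we know out of band that the clock was on CDT (UTC-5).
cdt = timezone(timedelta(hours=-5), "CDT")

# replace() attaches the tzinfo without shifting the clock fields.
aware = naive.replace(tzinfo=cdt)
as_utc = aware.astimezone(timezone.utc)
```

Five decimal digits of a day is 0.864 seconds per step, so the subsecond resolution is already gone before any time zone question comes up.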

In addition, it would be nice to be able to say that going back and forth between aware and naive datetimes is evil and you should never do it, but it's a necessity if you want to be able to implement relative timedeltas in some form. Unless the stdlib wants to step up with a be-all, end-all implementation of "this time tomorrow" and friends, there shouldn't be unnecessary friction here.
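
A sketch of the aware -> naive -> aware round trip I mean. `ToyCentral2013` is a toy tzinfo that assumes US Central rules for 2013 only, and `this_time_tomorrow` is a hypothetical helper; neither is a proposal for the PEP.

```python
from datetime import datetime, timedelta, timezone, tzinfo

HOUR = timedelta(hours=1)


class ToyCentral2013(tzinfo):
    """Toy zone: US Central wall-clock rules for 2013 only (an assumption)."""

    DST_START = datetime(2013, 3, 10, 2)   # local wall time
    DST_END = datetime(2013, 11, 3, 2)     # ambiguous hour treated as DST

    def dst(self, dt):
        wall = dt.replace(tzinfo=None)
        in_dst = self.DST_START <= wall < self.DST_END
        return HOUR if in_dst else timedelta(0)

    def utcoffset(self, dt):
        return timedelta(hours=-6) + self.dst(dt)

    def tzname(self, dt):
        return "CDT" if self.dst(dt) else "CST"


def this_time_tomorrow(aware):
    """Hypothetical helper: same wall-clock time on the next calendar day."""
    zone = aware.tzinfo
    naive = aware.replace(tzinfo=None)                       # aware -> naive
    return (naive + timedelta(days=1)).replace(tzinfo=zone)  # naive -> aware


central = ToyCentral2013()
start = datetime(2013, 11, 2, 12, 0, tzinfo=central)   # noon CDT
end = this_time_tomorrow(start)                        # noon CST

# Subtracting datetimes that share a tzinfo ignores the offsets,
# so convert to UTC to measure the real elapsed time.
elapsed = end.astimezone(timezone.utc) - start.astimezone(timezone.utc)
```

Across the fall-back transition `elapsed` is 25 hours even though the wall clock reads the same, which is the whole reason "tomorrow" can't be a plain timedelta on UTC.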

> > P.S. The description of NonExistentTimeError in the PEP doesn't seem quite
> > right, as it currently says it will only be thrown if "is_dst=None", which
> > seems like a copy and paste error from the definition of AmbiguousTimeError.
> 
> The PEP is correct here. If you explicitly specify is_dst, you know
> which side of the transition you are on and which offset to use. You
> can calculate datetime(2000, 12, 31, 2, 0, 0, 0).astimezone(foo,
> is_dst=False) using the non-DST offset and get an answer. It might not
> have ever appeared on the clock on your wall, but it is better than a
> punch in the face. If you want a punch in the face, is_dst=None
> refuses to guess and you get the exception.

Returning a DST time precisely when is_dst=False is passed isn't a punch in the face? I mean, I get what you're saying and I agree that "given that this time is skipped, which offset do I want to interpret it in?" is the right question to ask, but this is horribly counterintuitive to anyone who hasn't spent a lot of time pondering it. If nonexistent times are going to behave that way, the API needs to have a better solution to Naming Things.
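
To spell out why the naming is surprising, here is a sketch of the "pick the offset, then renormalise" interpretation for a skipped time. It assumes the 2013 US Central spring-forward only, and `resolve_skipped` is a hypothetical helper, not the PEP's API.

```python
from datetime import datetime, timedelta

STD = timedelta(hours=-6)   # CST
DST = timedelta(hours=-5)   # CDT

# Assumption: only the 2013 spring-forward matters (02:00 CST, as naive UTC).
DST_START_UTC = datetime(2013, 3, 10, 8, 0)


def resolve_skipped(naive, is_dst):
    """Hypothetical helper: read a wall time with the offset that is_dst names,
    then re-express it with the offset really in force at that instant."""
    utc = naive - (DST if is_dst else STD)
    actual = DST if utc >= DST_START_UTC else STD
    return utc + actual, actual == DST


# 02:30 on March 10, 2013 never appeared on a Central wall clock.
skipped = datetime(2013, 3, 10, 2, 30)
wall_false, dst_false = resolve_skipped(skipped, is_dst=False)
wall_true, dst_true = resolve_skipped(skipped, is_dst=True)
```

With is_dst=False the result is 03:30 in DST, and with is_dst=True it is 01:30 in standard time: the flag picks the offset used for interpretation, which is the opposite of the offset the answer ends up with.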

> (Except for those cases where the timezone offset is changed without a
> DST transition, but that is rare enough everyone pretends they don't
> exist)

As far as method parameters go, you might as well just say the first time is DST and the second is STD and explain in the docs that "is_dst" is just a mnemonic. But if you have an is_dst flag on your datetimes, suddenly this is an issue.

ijs